smarter tokenization pool #2

michaelfeil · 2025-11-26T07:36:43Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
Did you write any new necessary tests? If applicable, did you include or update the insta snapshots?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

michaelfeil · 2025-11-26T07:37:52Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-11-26T07:39:54Z

router/src/lib.rs

+    // fall-back to num_cpus - 1 to leave some CPU for the backend, and at most 64 workers.
+    let tokenization_workers =
+        tokenization_workers.unwrap_or_else(|| std::cmp::min(std::cmp::max(2, num_cpus::get() - 1), 64));


Fallback workers oversubscribe small hosts

When tokenization_workers is not set, the new fallback min(max(2, num_cpus::get() - 1), 64) still spawns at least two tokenization threads even on 1–2 CPU hosts, leaving no core free for the backend despite the intent expressed in the preceding comment. On such small instances this change can regress inference throughput by competing with the backend for all available CPU, unlike the previous default of matching the core count. Consider leaving one core free without forcing a minimum of two workers.

Fair, I'll change it to a minimum of 1.
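To make the difference concrete, here is a minimal sketch comparing the fallback from the reviewed diff with a minimum-of-one variant. The function names are hypothetical, and `cpus` is passed in as a parameter in place of `num_cpus::get()` so the logic can be checked without the crate. The minimum-of-one version is an assumption about the follow-up change, not a copy of the actual commit.

```rust
/// Fallback as written in the reviewed diff: `min(max(2, cpus - 1), 64)`.
/// `cpus` stands in for `num_cpus::get()`; the name is illustrative only.
fn fallback_workers_min_two(cpus: usize) -> usize {
    // saturating_sub guards against underflow if a caller passes 0.
    cpus.saturating_sub(1).clamp(2, 64)
}

/// Assumed revision: leave one core free where possible, but never drop
/// below a single worker and never exceed 64.
fn fallback_workers_min_one(cpus: usize) -> usize {
    cpus.saturating_sub(1).clamp(1, 64)
}

fn main() {
    for cpus in [1usize, 2, 4, 128] {
        println!(
            "{} CPUs: min-two = {}, min-one = {}",
            cpus,
            fallback_workers_min_two(cpus),
            fallback_workers_min_one(cpus)
        );
    }
}
```

With a minimum of one, a 2-CPU host gets a single tokenization worker and keeps one core for the backend. A 1-CPU host still has to share its only core, since at least one worker is required.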

michaelfeil · 2025-11-26T08:29:25Z

@codex review

chatgpt-codex-connector · 2025-11-26T08:29:35Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

smarter

1c99226

chatgpt-codex-connector bot reviewed Nov 26, 2025

max 1

7b4d4d2

This comment was marked as off-topic.
