Move getAdjustedTokenCountFromModel from CLI to core #9968
base: main
Conversation
- Moved getAdjustedTokenCountFromModel function from CLI tokenizer to core/llm/countTokens.ts
- Updated core countTokens() to use getAdjustedTokenCountFromModel for all models
- Added comprehensive tests for getAdjustedTokenCountFromModel in core
- Updated CLI tokenizer to import getAdjustedTokenCountFromModel from core
- Removed duplicate implementation and constants from CLI

This consolidates the token adjustment logic in one place and ensures consistent token counting across CLI and core. The llama tokenizer is now used for all models, with appropriate multipliers for Claude (1.23x), Gemini (1.18x), and Mistral family (1.26x) models.

Generated with [Continue](https://continue.dev)

Co-Authored-By: Continue <noreply@continue.dev>
Co-authored-by: dallin <dallin@continue.dev>
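For orientation, here is a minimal sketch of what the consolidated core helper could look like, assuming the multipliers and name matching described above. The constant names, argument order, and rounding step are illustrative guesses, not the repository's literal code:

```typescript
// Sketch only: constants and structure inferred from the PR description and diff.
const CLAUDE_TOKEN_MULTIPLIER = 1.23;
const GEMINI_TOKEN_MULTIPLIER = 1.18;
const MISTRAL_TOKEN_MULTIPLIER = 1.26;

export function getAdjustedTokenCountFromModel(
  tokenCount: number,
  modelName: string,
): number {
  const lowerModelName = modelName.toLowerCase();
  let multiplier = 1;

  if (lowerModelName.includes("claude")) {
    multiplier = CLAUDE_TOKEN_MULTIPLIER;
  } else if (lowerModelName.includes("gemini")) {
    multiplier = GEMINI_TOKEN_MULTIPLIER;
  } else if (
    lowerModelName.includes("mistral") ||
    lowerModelName.includes("mixtral") ||
    lowerModelName.includes("codestral")
  ) {
    multiplier = MISTRAL_TOKEN_MULTIPLIER;
  }

  // Round up so the adjusted estimate never undercounts against a context limit.
  return Math.ceil(tokenCount * multiplier);
}
```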
✅ Review Complete: Code Review Summary
2 issues found across 3 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="extensions/cli/src/util/tokenizer.ts">
<violation number="1" location="extensions/cli/src/util/tokenizer.ts:3">
P2: CLI now imports getAdjustedTokenCountFromModel from core/llm/countTokens.js, which pulls in heavy tokenizer dependencies at module load time, likely regressing CLI startup/memory for a simple multiplier helper.</violation>
</file>
<file name="core/llm/countTokens.ts">
<violation number="1" location="core/llm/countTokens.ts:166">
P2: countTokensAsync does not apply model-specific token multipliers, so async token counts can be lower than countTokens for Claude/Gemini/Mistral models, affecting limit checks like readFileLimit.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
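To make the second issue above concrete, here is a hedged sketch of how the async path could route through the same adjustment so it stays consistent with `countTokens()`. The import path, argument order, and the `asyncEncode` tokenizer call are placeholders assumed for illustration; the real `countTokensAsync` signature may differ:

```typescript
import { getAdjustedTokenCountFromModel } from "./getAdjustedTokenCount";

// `asyncEncode` stands in for whatever async tokenizer core actually uses.
declare function asyncEncode(content: string): Promise<number>;

async function countTokensAsync(
  content: string,
  modelName: string,
): Promise<number> {
  const rawCount = await asyncEncode(content);
  // Without this adjustment, async counts for Claude/Gemini/Mistral-family
  // models come out lower than countTokens(), skewing checks like readFileLimit.
  return getAdjustedTokenCountFromModel(rawCount, modelName);
}
```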
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="core/llm/getAdjustedTokenCount.ts">
<violation number="1" location="core/llm/getAdjustedTokenCount.ts:31">
P2: Regression: Mistral-family models like “Ministral”/“Mathstral” no longer match the multiplier because the broad `includes("stral")` check was replaced with a strict allowlist.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
} else if (lowerModelName.includes("gemini")) {
  multiplier = GEMINI_TOKEN_MULTIPLIER;
} else if (
  lowerModelName.includes("mistral") ||
P2: Regression: Mistral-family models like “Ministral”/“Mathstral” no longer match the multiplier because the broad includes("stral") check was replaced with a strict allowlist.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At core/llm/getAdjustedTokenCount.ts, line 31:
<comment>Regression: Mistral-family models like “Ministral”/“Mathstral” no longer match the multiplier because the broad `includes("stral")` check was replaced with a strict allowlist.</comment>
<file context>
@@ -27,8 +27,13 @@ export function getAdjustedTokenCountFromModel(
- } else if (lowerModelName.includes("stral")) {
- // devstral, mixtral, mistral, etc
+ } else if (
+ lowerModelName.includes("mistral") ||
+ lowerModelName.includes("mixtral") ||
+ lowerModelName.includes("codestral") ||
</file context>
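A small sketch of the difference the comment describes: the previous broad substring check matches any model name containing "stral", while the allowlist in this commit misses names like "Ministral" or "Mathstral". The model names below are only examples:

```typescript
// Previous behavior: any "stral" model gets the Mistral-family multiplier.
const broadMatch = (name: string) => name.toLowerCase().includes("stral");

// This commit: a strict allowlist of known family names.
const allowlistMatch = (name: string) => {
  const lower = name.toLowerCase();
  return (
    lower.includes("mistral") ||
    lower.includes("mixtral") ||
    lower.includes("codestral")
  );
};

console.log(broadMatch("Ministral-8B"), allowlistMatch("Ministral-8B")); // true false
console.log(broadMatch("Mathstral-7B"), allowlistMatch("Mathstral-7B")); // true false
```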
Summary
This PR moves the `getAdjustedTokenCountFromModel` function from the CLI tokenizer to `core/llm/countTokens.ts` and integrates it with core's token counting logic.
Changes
- Moved `getAdjustedTokenCountFromModel` from CLI to core
- Updated `countTokens()` to apply model-specific multipliers
Why?
This consolidates the token adjustment logic in one place so the CLI and core produce consistent token counts.
Model Multipliers
The function applies these multipliers based on empirical data:

| Model family | Multiplier |
| --- | --- |
| Claude | 1.23x |
| Gemini | 1.18x |
| Mistral family | 1.26x |
Testing
Added unit tests for `getAdjustedTokenCountFromModel` in core.
Breaking Changes
None. This is backward compatible: the function signature changed slightly (it now takes a model name string instead of a ModelConfig), but all call sites have been updated.
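For reference, a hypothetical before/after of a call site under that signature change. The argument order and the old ModelConfig-taking form are assumptions based on the prose above, not verified against the CLI code:

```typescript
// Declared here only to keep the snippet self-contained; see the sketch earlier in this PR.
declare function getAdjustedTokenCountFromModel(
  tokenCount: number,
  modelName: string,
): number;

const rawCount = 1000; // raw tokenizer count for some content

// Old CLI-local call shape (roughly): passed a ModelConfig object, e.g.
//   getAdjustedTokenCountFromModel(rawCount, modelConfig);
// New core call shape: pass the model name string directly.
const adjusted = getAdjustedTokenCountFromModel(rawCount, "claude-3-5-sonnet");
```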
Summary by cubic
Moves getAdjustedTokenCountFromModel to core and applies model-specific multipliers inside core token counting. This unifies token estimates across core and CLI.
Written for commit 7c32e1e. Summary will update on new commits.