[Feat] Allow tokenizer_name override#321
Conversation
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
There was a problem hiding this comment.
Code Review
This pull request introduces the ability to override the default tokenizer by adding a tokenizer_name field to the ModelParams configuration. The benchmark execution logic now checks for this override, and corresponding updates have been made to the configuration templates and unit tests. Feedback suggests adding a CLI alias for the new field to maintain consistency with other parameters and improve command-line usability.
| tokenizer_name: str | None = Field( | ||
| None, | ||
| description="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.", | ||
| ) |
There was a problem hiding this comment.
To maintain consistency with other fields in ModelParams (such as name and max_new_tokens) and to improve usability when running benchmarks from the command line, consider adding a CLI alias for tokenizer_name using cyclopts.Parameter. This allows users to easily override the tokenizer repo ID via the --tokenizer flag.
| tokenizer_name: str | None = Field( | |
| None, | |
| description="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.", | |
| ) | |
| tokenizer_name: Annotated[ | |
| str | None, | |
| cyclopts.Parameter( | |
| alias="--tokenizer", | |
| help="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.", | |
| ), | |
| ] = None |
There was a problem hiding this comment.
Pull request overview
Adds support for configuring a tokenizer identifier independently of the configured model name, enabling token-metrics/ISL token counting to use a HuggingFace tokenizer repo/path even when the endpoint’s model naming differs (e.g., OpenRouter/SambaNova naming conventions).
Changes:
- Extend
ModelParamsschema with an optionaltokenizer_nameoverride. - Update benchmark setup to prefer
tokenizer_namewhen probing for tokenizer availability. - Add unit coverage for the new schema field and update full YAML templates to include it.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
tests/unit/config/test_schema.py |
Verifies tokenizer_name default and explicit override behavior in ModelParams. |
src/inference_endpoint/config/templates/online_template_full.yaml |
Documents the new model_params.tokenizer_name config option in the online full template. |
src/inference_endpoint/config/templates/offline_template_full.yaml |
Documents the new model_params.tokenizer_name config option in the offline full template. |
src/inference_endpoint/config/templates/concurrency_template_full.yaml |
Documents the new model_params.tokenizer_name config option in the concurrency full template. |
src/inference_endpoint/config/schema.py |
Adds tokenizer_name to ModelParams. |
src/inference_endpoint/commands/benchmark/execute.py |
Uses the override (when set) as the source for tokenizer existence probing and subsequent token-metrics enablement. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| tokenizer_source = config.model_params.tokenizer_name or model_name | ||
| tokenizer_name = ( | ||
| tokenizer_source if _check_tokenizer_exists(tokenizer_source) else None | ||
| ) |
| ] = StreamingMode.AUTO | ||
| tokenizer_name: str | None = Field( | ||
| None, | ||
| description="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.", |
What does this PR do?
Allow specifying a tokenizer name that's different from model name in the yaml file.
Example: OpenRouter keeps all model names lower case, so Qwen/Qwen3.6-35B-A3B becomes qwen/qwen3.6-35b. SambaNova strips org name, so it becomes Qwen3.6-35B-A3B. tokenizer name override allows using those endpoints.
Type of change
Related issues
Testing
Checklist