Skip to content

[Feat] Allow tokenizer_name override#321

Open
tianmu-li wants to merge 3 commits into
mlcommons:mainfrom
tianmu-li:feat/separate_tokenizer_name
Open

[Feat] Allow tokenizer_name override#321
tianmu-li wants to merge 3 commits into
mlcommons:mainfrom
tianmu-li:feat/separate_tokenizer_name

Conversation

@tianmu-li
Copy link
Copy Markdown
Collaborator

What does this PR do?

Allow specifying a tokenizer name that's different from model name in the yaml file.
Example: OpenRouter keeps all model names lower case, so Qwen/Qwen3.6-35B-A3B becomes qwen/qwen3.6-35b. SambaNova strips org name, so it becomes Qwen3.6-35B-A3B. tokenizer name override allows using those endpoints.

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
@tianmu-li tianmu-li requested review from a team and Copilot May 20, 2026 18:38
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 20, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the ability to override the default tokenizer by adding a tokenizer_name field to the ModelParams configuration. The benchmark execution logic now checks for this override, and corresponding updates have been made to the configuration templates and unit tests. Feedback suggests adding a CLI alias for the new field to maintain consistency with other parameters and improve command-line usability.

Comment on lines +200 to +203
tokenizer_name: str | None = Field(
None,
description="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To maintain consistency with other fields in ModelParams (such as name and max_new_tokens) and to improve usability when running benchmarks from the command line, consider adding a CLI alias for tokenizer_name using cyclopts.Parameter. This allows users to easily override the tokenizer repo ID via the --tokenizer flag.

Suggested change
tokenizer_name: str | None = Field(
None,
description="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.",
)
tokenizer_name: Annotated[
str | None,
cyclopts.Parameter(
alias="--tokenizer",
help="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.",
),
] = None

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for configuring a tokenizer identifier independently of the configured model name, enabling token-metrics/ISL token counting to use a HuggingFace tokenizer repo/path even when the endpoint’s model naming differs (e.g., OpenRouter/SambaNova naming conventions).

Changes:

  • Extend ModelParams schema with an optional tokenizer_name override.
  • Update benchmark setup to prefer tokenizer_name when probing for tokenizer availability.
  • Add unit coverage for the new schema field and update full YAML templates to include it.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/config/test_schema.py Verifies tokenizer_name default and explicit override behavior in ModelParams.
src/inference_endpoint/config/templates/online_template_full.yaml Documents the new model_params.tokenizer_name config option in the online full template.
src/inference_endpoint/config/templates/offline_template_full.yaml Documents the new model_params.tokenizer_name config option in the offline full template.
src/inference_endpoint/config/templates/concurrency_template_full.yaml Documents the new model_params.tokenizer_name config option in the concurrency full template.
src/inference_endpoint/config/schema.py Adds tokenizer_name to ModelParams.
src/inference_endpoint/commands/benchmark/execute.py Uses the override (when set) as the source for tokenizer existence probing and subsequent token-metrics enablement.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +391 to +394
tokenizer_source = config.model_params.tokenizer_name or model_name
tokenizer_name = (
tokenizer_source if _check_tokenizer_exists(tokenizer_source) else None
)
] = StreamingMode.AUTO
tokenizer_name: str | None = Field(
None,
description="HuggingFace tokenizer repo ID. Overrides model name for tokenizer loading.",
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants