Skip to content

[#11037][fix] Fix proto-to-SamplingParams conversion bugs and add gRPC tests#11292

Merged
QiJune merged 1 commit intoNVIDIA:mainfrom
CatherineSue:grpc-tests-fix
Feb 5, 2026
Merged

[#11037][fix] Fix proto-to-SamplingParams conversion bugs and add gRPC tests#11292
QiJune merged 1 commit intoNVIDIA:mainfrom
CatherineSue:grpc-tests-fix

Conversation

@CatherineSue
Copy link
Contributor

@CatherineSue CatherineSue commented Feb 4, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Fixed beam search parameter handling to correctly enable and configure beam search functionality.
    • Corrected seed parameter assignment for proper randomization control.
    • Improved stop words and bad words processing for more reliable decoding.
  • Tests

    • Expanded test coverage for sampling parameter conversion and gRPC service functionality.
    • Added comprehensive end-to-end tests for generate flows, health checks, and model information retrieval.

Description

Fix 5 bugs in create_sampling_params_from_proto() where kwargs didn't match SamplingParams dataclass fields (slots=True, kw_only=True), causing TypeError at runtime:

  • beam_widthuse_beam_search=True + best_of=beam_width
  • random_seedseed
  • stop_words/bad_words as kwargs → set _stop_word_ids/_bad_word_ids post-construction
  • guided_decoding_paramsguided_decoding
  • GuidedDecodingParams(json_schema=...)GuidedDecodingParams(json=...)

Add comprehensive proto conversion tests and e2e gRPC service tests as requested in #11037.

Test Coverage

  • TestComprehensiveSamplingParamsConversion (4 tests, no GPU):
    • test_all_sampling_config_fields — sets every proto field, verifies every SamplingParams field
    • test_end_id_minus_one_sets_ignore_eos
    • test_defaults_when_fields_unset
    • test_guided_decoding_all_types — JSON, JSON Schema, Regex, EBNF Grammar
  • TestGrpcServiceEndToEnd (6 tests, GPU required, TinyLlama-1.1B):
    • test_generate_non_streaming — verifies complete response with token IDs
    • test_generate_streaming — verifies delta chunks reassemble to match complete output
    • test_health_check
    • test_abort_nonexistent_request
    • test_get_model_info
    • test_get_server_info

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

The changes refactor SamplingParams creation in grpc_request_manager.py to use pre-tokenized word IDs for stop/bad words, update beam search handling via use_beam_search and best_of parameters, and rename guided decoding configurations. Tests are expanded with comprehensive SamplingParams conversion coverage and end-to-end gRPC service tests including fixtures and async helpers.

Changes

Cohort / File(s) Summary
SamplingParams Mapping
tensorrt_llm/grpc/grpc_request_manager.py
Refactored SamplingParams construction: beam_width now maps to use_beam_search + best_of; seed parameter renamed; stop/bad words converted to pre-tokenized word IDs injected after object construction; guided_decoding_params renamed to guided_decoding with schema key adjustments.
Test Coverage Expansion
tests/unittest/llmapi/test_grpc.py
Updated SamplingParams assertions to reflect new beam search and guided decoding mappings. Added TestComprehensiveSamplingParamsConversion class covering all SamplingConfig fields, embedding bias conversion, and end_id handling. Introduced TestGrpcServiceEndToEnd with fixtures (grpc_service), async helpers, and end-to-end tests for generate flows, health checks, and model info retrieval.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: fixing bugs in proto-to-SamplingParams conversion and adding comprehensive test coverage.
Description check ✅ Passed The description comprehensively covers the five specific bugs fixed, test coverage details with test counts and scopes, and confirms PR checklist completion.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

…dd gRPC tests

Fix 5 bugs in create_sampling_params_from_proto() where kwargs didn't
match SamplingParams dataclass fields (slots=True), causing TypeError
at runtime: beam_width -> use_beam_search+best_of, random_seed -> seed,
stop_words/bad_words -> _stop_word_ids/_bad_word_ids post-construction,
guided_decoding_params -> guided_decoding, json_schema -> json.

Add comprehensive proto conversion tests covering all fields, defaults,
and guided decoding types. Add e2e gRPC service tests using real
TinyLlama-1.1B model covering Generate (streaming + non-streaming),
HealthCheck, Abort, GetModelInfo, and GetServerInfo RPCs.

Signed-off-by: Chang Su <chang.s.su@oracle.com>
@svc-trtllm-gh-bot svc-trtllm-gh-bot added the Community want to contribute PRs initiated from Community label Feb 4, 2026
@juney-nvidia juney-nvidia requested a review from QiJune February 5, 2026 00:59
@juney-nvidia
Copy link
Collaborator

@QiJune Hi Jun, can you help review this PR?

Thanks
June

@QiJune
Copy link
Collaborator

QiJune commented Feb 5, 2026

/bot run

Copy link
Collaborator

@QiJune QiJune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tensorrt-cicd
Copy link
Collaborator

PR_Github #34863 [ run ] triggered by Bot. Commit: 2fd36b3

@tensorrt-cicd
Copy link
Collaborator

PR_Github #34863 [ run ] completed with state SUCCESS. Commit: 2fd36b3
/LLM/main/L0_MergeRequest_PR pipeline #26897 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

@QiJune
Copy link
Collaborator

QiJune commented Feb 5, 2026

/bot run

@QiJune QiJune enabled auto-merge (squash) February 5, 2026 07:06
@tensorrt-cicd
Copy link
Collaborator

PR_Github #34932 [ run ] triggered by Bot. Commit: 2fd36b3

@tensorrt-cicd
Copy link
Collaborator

PR_Github #34932 [ run ] completed with state SUCCESS. Commit: 2fd36b3
/LLM/main/L0_MergeRequest_PR pipeline #26949 completed with status: 'SUCCESS'

@QiJune QiJune merged commit 9601b17 into NVIDIA:main Feb 5, 2026
5 checks passed
SchumiDing pushed a commit to SchumiDing/TensorRT-LLM that referenced this pull request Feb 6, 2026
…dd gRPC tests (NVIDIA#11292)

Signed-off-by: Chang Su <chang.s.su@oracle.com>
inciaf pushed a commit to inciaf/trtllm-energy-monitoring that referenced this pull request Feb 18, 2026
…dd gRPC tests (NVIDIA#11292)

Signed-off-by: Chang Su <chang.s.su@oracle.com>
Signed-off-by: Ahmet Inci <ainci@nvidia.com>
CatherineSue added a commit to CatherineSue/TensorRT-LLM that referenced this pull request Feb 18, 2026
…tions

Replace local proto file, generated stubs, and compile script with the
smg-grpc-proto PyPI package. Proto definitions are now owned by SMG and
published as a versioned package, providing a single source of truth
across SGLang, vLLM, and TensorRT-LLM.

- Add smg-grpc-proto>=0.3.3 to requirements.txt
- Remove local trtllm_service.proto, compile_protos.py
- Update __init__.py to import from smg_grpc_proto.generated
- Remove BuildPyWithProtoCompile from setup.py

Signed-off-by: Chang Su <chang.s.su@oracle.com>
juney-nvidia pushed a commit that referenced this pull request Feb 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community want to contribute PRs initiated from Community

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

Comments