[#11037][fix] Fix proto-to-SamplingParams conversion bugs and add gRPC tests#11292
[#11037][fix] Fix proto-to-SamplingParams conversion bugs and add gRPC tests#11292QiJune merged 1 commit intoNVIDIA:mainfrom
Conversation
📝 WalkthroughWalkthroughThe changes refactor SamplingParams creation in grpc_request_manager.py to use pre-tokenized word IDs for stop/bad words, update beam search handling via use_beam_search and best_of parameters, and rename guided decoding configurations. Tests are expanded with comprehensive SamplingParams conversion coverage and end-to-end gRPC service tests including fixtures and async helpers. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
c5771e8 to
d7be137
Compare
…dd gRPC tests Fix 5 bugs in create_sampling_params_from_proto() where kwargs didn't match SamplingParams dataclass fields (slots=True), causing TypeError at runtime: beam_width -> use_beam_search+best_of, random_seed -> seed, stop_words/bad_words -> _stop_word_ids/_bad_word_ids post-construction, guided_decoding_params -> guided_decoding, json_schema -> json. Add comprehensive proto conversion tests covering all fields, defaults, and guided decoding types. Add e2e gRPC service tests using real TinyLlama-1.1B model covering Generate (streaming + non-streaming), HealthCheck, Abort, GetModelInfo, and GetServerInfo RPCs. Signed-off-by: Chang Su <chang.s.su@oracle.com>
d7be137 to
2fd36b3
Compare
|
@QiJune Hi Jun, can you help review this PR? Thanks |
|
/bot run |
|
PR_Github #34863 [ run ] triggered by Bot. Commit: |
|
PR_Github #34863 [ run ] completed with state
|
|
/bot run |
|
PR_Github #34932 [ run ] triggered by Bot. Commit: |
|
PR_Github #34932 [ run ] completed with state |
…dd gRPC tests (NVIDIA#11292) Signed-off-by: Chang Su <chang.s.su@oracle.com>
…dd gRPC tests (NVIDIA#11292) Signed-off-by: Chang Su <chang.s.su@oracle.com> Signed-off-by: Ahmet Inci <ainci@nvidia.com>
…tions Replace local proto file, generated stubs, and compile script with the smg-grpc-proto PyPI package. Proto definitions are now owned by SMG and published as a versioned package, providing a single source of truth across SGLang, vLLM, and TensorRT-LLM. - Add smg-grpc-proto>=0.3.3 to requirements.txt - Remove local trtllm_service.proto, compile_protos.py - Update __init__.py to import from smg_grpc_proto.generated - Remove BuildPyWithProtoCompile from setup.py Signed-off-by: Chang Su <chang.s.su@oracle.com>
Summary by CodeRabbit
Bug Fixes
Tests
Description
Fix 5 bugs in
create_sampling_params_from_proto()where kwargs didn't matchSamplingParamsdataclass fields (slots=True, kw_only=True), causingTypeErrorat runtime:beam_width→use_beam_search=True+best_of=beam_widthrandom_seed→seedstop_words/bad_wordsas kwargs → set_stop_word_ids/_bad_word_idspost-constructionguided_decoding_params→guided_decodingGuidedDecodingParams(json_schema=...)→GuidedDecodingParams(json=...)Add comprehensive proto conversion tests and e2e gRPC service tests as requested in #11037.
Test Coverage
TestComprehensiveSamplingParamsConversion(4 tests, no GPU):test_all_sampling_config_fields— sets every proto field, verifies every SamplingParams fieldtest_end_id_minus_one_sets_ignore_eostest_defaults_when_fields_unsettest_guided_decoding_all_types— JSON, JSON Schema, Regex, EBNF GrammarTestGrpcServiceEndToEnd(6 tests, GPU required, TinyLlama-1.1B):test_generate_non_streaming— verifies complete response with token IDstest_generate_streaming— verifies delta chunks reassemble to match complete outputtest_health_checktest_abort_nonexistent_requesttest_get_model_infotest_get_server_infoPR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.