[#11292][feat] use smg-grpc-proto package for gRPC proto definitions by CatherineSue · Pull Request #11578 · NVIDIA/TensorRT-LLM

CatherineSue · 2026-02-18T22:49:13Z

Summary by CodeRabbit

Release Notes

Refactor
- Migrated gRPC protobuf definitions to an external dependency package for improved maintainability and consistency.
- Simplified build process by removing custom protobuf compilation steps during installation.
- Updated imports to use centralized protobuf definitions from the dependency package.

Description

Replace local proto file, generated stubs, and compile script with the smg-grpc-proto PyPI package.

Proto definitions are now owned by SMG (Shepherd Model Gateway) and published as a versioned package, providing a single source of truth across SGLang, vLLM, and TensorRT-LLM. This is the same migration pattern already applied to SGLang and vLLM upstream.

Changes:

- Add `smg-grpc-proto>=0.3.3` to `requirements.txt` (requires `grpcio>=1.78.0`, `protobuf>=5.26.0`)
- Remove local `trtllm_service.proto` and `compile_protos.py` from `tensorrt_llm/grpc/`
- Update `tensorrt_llm/grpc/__init__.py` to import pb2 modules from `smg_grpc_proto.generated` (all downstream files continue using `from tensorrt_llm.grpc import ...` unchanged)
- Remove the `BuildPyWithProtoCompile` custom build hook from `setup.py`
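
The version floors in the list above could be checked at runtime with a sketch like the following. The minimums are the ones quoted in this description; the helper names (`release_tuple`, `meets_minimum`, `unmet_requirements`) are hypothetical, and the parser only understands plain dotted releases such as `1.78.0`. For pre-release or local version strings, `packaging.version` would be the more robust choice.

```python
from importlib import metadata

# Minimum versions as quoted in the PR description.
MINIMUMS = {
    "smg-grpc-proto": "0.3.3",
    "grpcio": "1.78.0",
    "protobuf": "5.26.0",
}


def release_tuple(version: str) -> tuple:
    """Parse a plain dotted release such as '1.78.0' into (1, 78, 0)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted releases, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet_requirements(minimums: dict = MINIMUMS) -> list:
    """Return human-readable problems; an empty list means all floors are met."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >={minimum})")
            continue
        try:
            ok = meets_minimum(installed, minimum)
        except ValueError:
            # Version string is not a plain dotted release (e.g. '1.78.0rc1').
            problems.append(f"{name} {installed} could not be compared to {minimum}")
            continue
        if not ok:
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems
```

Calling `unmet_requirements()` in an environment where the three packages are installed at or above these versions should return an empty list.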

What stays the same:

- `grpc_servicer.py`, `grpc_request_manager.py`, `serve.py`, `test_grpc.py`: zero code changes (imports resolve through the `__init__.py` re-export)
- All gRPC server behavior is unchanged
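
The re-export arrangement described above can be sketched as a small loader. This is an illustration only, not the code in `tensorrt_llm/grpc/__init__.py`: the package path `smg_grpc_proto.generated` comes from this description, while the stub names `trtllm_service_pb2` and `trtllm_service_pb2_grpc` and the helper `load_stubs` are assumptions made for the example.

```python
import importlib

# Package path taken from the PR description.
PROTO_PACKAGE = "smg_grpc_proto.generated"
# Assumed stub module names, following the usual protoc naming convention.
STUB_NAMES = ("trtllm_service_pb2", "trtllm_service_pb2_grpc")


def load_stubs(package: str = PROTO_PACKAGE, names: tuple = STUB_NAMES) -> dict:
    """Import each stub module from `package` and return them keyed by name."""
    stubs = {}
    for name in names:
        try:
            stubs[name] = importlib.import_module(f"{package}.{name}")
        except ImportError as exc:
            # Point the user at the missing dependency instead of a bare traceback.
            raise ImportError(
                f"Could not import {package}.{name}; is smg-grpc-proto installed?"
            ) from exc
    return stubs
```

In a package `__init__.py`, the returned modules could be bound to module-level names so that downstream `from tensorrt_llm.grpc import ...` statements keep resolving without any change to the importing files.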

Context: Follow-up to #11292 which added the gRPC server. See smg-grpc-proto on PyPI and the ownership plan.

Test Coverage

- Existing `tests/unittest/llmapi/test_grpc.py` covers proto message construction, sampling params conversion, and end-to-end gRPC service flow; all unchanged and passing
- Pre-commit hooks pass on all modified files

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

…tions

Replace local proto file, generated stubs, and compile script with the smg-grpc-proto PyPI package.

Proto definitions are now owned by SMG and published as a versioned package, providing a single source of truth across SGLang, vLLM, and TensorRT-LLM.

- Add smg-grpc-proto>=0.3.3 to requirements.txt
- Remove local trtllm_service.proto, compile_protos.py
- Update __init__.py to import from smg_grpc_proto.generated
- Remove BuildPyWithProtoCompile from setup.py

Signed-off-by: Chang Su <chang.s.su@oracle.com>

coderabbitai · 2026-02-18T22:54:03Z

Walkthrough

The changes migrate gRPC protobuf definitions from locally-managed source files to an external dependency package (smg-grpc-proto). The custom build process that compiled protobufs during setup is removed, and local proto definitions are replaced with imports from the external package.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Build and Dependency Management: `requirements.txt`, `setup.py` | Added smg-grpc-proto>=0.3.3 as a runtime dependency; removed custom setuptools build_py command (BuildPyWithProtoCompile class) that previously compiled gRPC protobufs during package builds. |
| Proto Module Imports: `tensorrt_llm/grpc/__init__.py` | Switched protobuf imports from locally-generated modules to smg_grpc_proto.generated package; removed local proto file path constants (GRPC_MODULE_DIR, PROTO_FILE) and runtime proto compilation/verification functions (compile_protos, ensure_protos_available). |
| Removed Local Proto Infrastructure: `tensorrt_llm/grpc/compile_protos.py`, `tensorrt_llm/grpc/trtllm_service.proto` | Deleted proto compilation script and original proto definitions file, as these are now provided by the external smg-grpc-proto package. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: migrating from local gRPC proto definitions to using the smg-grpc-proto package. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | PR description comprehensively covers all required template sections with clear explanations of changes, rationale, test coverage, and checklist completion. |

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tensorrt_llm/grpc/__init__.py (1)
1-1: ⚠️ Potential issue | 🟡 Minor

Update copyright year to 2026.

The file is meaningfully modified by this PR. Per coding guidelines, the NVIDIA copyright header must reflect the year of latest meaningful modification.
🛠️ Proposed fix
-# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
As per coding guidelines: "All source files must contain an NVIDIA copyright header with the year of latest meaningful modification."
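
The header change proposed above could be applied mechanically with a sketch like this one. The function name `bump_copyright_year` and the regular expression are assumptions made for illustration; they are not part of the repository's tooling, and the pattern only covers headers written as `Copyright (c) YYYY` or `Copyright (c) YYYY-YYYY`.

```python
import re

# Matches "Copyright (c) 2024" or "Copyright (c) 2022-2024" style year fields.
_YEAR_FIELD = re.compile(r"(Copyright \(c\) )(\d{4})(?:-(\d{4}))?")


def bump_copyright_year(line: str, year: int) -> str:
    """Extend the year (or year range) in a copyright line up to `year`."""

    def _replace(match):
        prefix, first, last = match.group(1), int(match.group(2)), match.group(3)
        latest = int(last) if last else first
        if year <= latest:
            return match.group(0)  # already current; leave unchanged
        return f"{prefix}{first}-{year}"

    return _YEAR_FIELD.sub(_replace, line, count=1)
```

Applied to the old header line with `year=2026`, this yields the `2024-2026` form shown in the proposed fix; applying it a second time leaves the line unchanged.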
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/grpc/__init__.py` at line 1, Update the SPDX copyright header in
tensorrt_llm/grpc/__init__.py to reflect the latest meaningful modification year
2026 by changing the year in the existing header line (the SPDX/ copyright
comment at the top of the file).

juney-nvidia

LGTM

juney-nvidia · 2026-02-19T08:57:56Z

/bot run

tensorrt-cicd · 2026-02-19T09:03:52Z

PR_Github #36225 [ run ] triggered by Bot. Commit: 92a6645 Link to invocation

tensorrt-cicd · 2026-02-19T11:39:17Z

PR_Github #36225 [ run ] completed with state SUCCESS. Commit: 92a6645
/LLM/main/L0_MergeRequest_PR pipeline #28005 completed with status: 'FAILURE'

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

tburt-nv · 2026-02-19T15:48:16Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-02-19T15:54:13Z

PR_Github #36253 [ run ] triggered by Bot. Commit: 92a6645 Link to invocation

tensorrt-cicd · 2026-02-19T19:29:06Z

PR_Github #36253 [ run ] completed with state SUCCESS. Commit: 92a6645
/LLM/main/L0_MergeRequest_PR pipeline #28030 completed with status: 'FAILURE'

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

juney-nvidia · 2026-02-20T02:45:39Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-02-20T02:52:12Z

PR_Github #36309 [ run ] triggered by Bot. Commit: 92a6645 Link to invocation

tensorrt-cicd · 2026-02-20T03:55:49Z

PR_Github #36309 [ run ] completed with state SUCCESS. Commit: 92a6645
/LLM/main/L0_MergeRequest_PR pipeline #28082 completed with status: 'SUCCESS'

Link to invocation

Extract raw image bytes from the MultimodalInput proto message, decode them as PIL images, and pass them to the LLM API via the multi_modal_data dict. The LLM API decodes token IDs back to text and re-processes through the model's input processor with images. Signed-off-by: Chang Su <chang.s.su@oracle.com>

CatherineSue requested a review from a team as a code owner February 18, 2026 22:49

coderabbitai Bot reviewed Feb 18, 2026

svc-trtllm-gh-bot added the "Community want to contribute" (PRs initiated from Community) label Feb 18, 2026

juney-nvidia reviewed Feb 19, 2026

Comment thread requirements.txt

juney-nvidia approved these changes Feb 19, 2026

juney-nvidia enabled auto-merge (squash) February 19, 2026 08:59

tburt-nv approved these changes Feb 19, 2026

juney-nvidia merged commit c172acf into NVIDIA:main Feb 20, 2026
7 checks passed

This was referenced Mar 4, 2026

[#11578][fix] Use string stop/bad words in gRPC proto instead of pre-tokenized TokenSequence #11888

Merged

feat(grpc): extract gRPC servicer into smg-grpc-servicer package, add --grpc flag to vllm serve vllm-project/vllm#36169

Merged

QiJune pushed a commit that referenced this pull request Mar 6, 2026

[#11578][fix] Use string stop/bad words in gRPC proto instead of pre-…

5f1fb7c

…tokenized TokenSequence (#11888) Signed-off-by: Chang Su <chang.s.su@oracle.com>

venkywonka pushed a commit that referenced this pull request Mar 6, 2026

[#11578][feat] support multimodal image input in gRPC server (#11800)

5918348

Signed-off-by: Chang Su <chang.s.su@oracle.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Mar 9, 2026

[NVIDIA#11578][fix] Use string stop/bad words in gRPC proto instead o…

44f109a

…f pre-tokenized TokenSequence (NVIDIA#11888) Signed-off-by: Chang Su <chang.s.su@oracle.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Mar 9, 2026

[NVIDIA#11578][feat] support multimodal image input in gRPC server (N…

3e1a4e6

…VIDIA#11800) Signed-off-by: Chang Su <chang.s.su@oracle.com>

tianyuz-nv pushed a commit to wanqian-nv/TensorRT-LLM that referenced this pull request Mar 19, 2026

[NVIDIA#11578][fix] Use string stop/bad words in gRPC proto instead o…

c67e9b6

…f pre-tokenized TokenSequence (NVIDIA#11888) Signed-off-by: Chang Su <chang.s.su@oracle.com>

tianyuz-nv pushed a commit to wanqian-nv/TensorRT-LLM that referenced this pull request Mar 19, 2026

[NVIDIA#11578][feat] support multimodal image input in gRPC server (N…

06e2bd2

…VIDIA#11800) Signed-off-by: Chang Su <chang.s.su@oracle.com>

limin2021 pushed a commit to limin2021/TensorRT-LLM that referenced this pull request Mar 19, 2026

[NVIDIA#11578][fix] Use string stop/bad words in gRPC proto instead o…

4084c45

…f pre-tokenized TokenSequence (NVIDIA#11888) Signed-off-by: Chang Su <chang.s.su@oracle.com>

limin2021 pushed a commit to limin2021/TensorRT-LLM that referenced this pull request Mar 19, 2026

[NVIDIA#11578][feat] support multimodal image input in gRPC server (N…

3ac9329

…VIDIA#11800) Signed-off-by: Chang Su <chang.s.su@oracle.com>

5 participants
