docs(advance): add Add a New Speculative Decoding Method guide by SuperMarioYL · Pull Request #4589 · InternLM/lmdeploy

SuperMarioYL · 2026-05-17T19:17:27Z

Motivation

The PyTorch engine has a clean plug-in surface for speculative decoding
(BaseSpecProposer + SPEC_PROPOSERS registry in
lmdeploy/pytorch/spec_decode/proposers/base.py), and four shipped
methods register against it: eagle, eagle3, deepseek_mtp,
qwen3_5_mtp. The user-facing docs/en/advance/spec_decoding.md
teaches usage of those four names but never explains how to add a
fifth, so users have asked for this in the issue tracker:

#4530: "[Feature] support DFlash: Block Diffusion for Flash Speculative Decoding"
#1738: "[Feature] Speculative Decoding"

Both are open. A short extension-contract page closes the gap without
locking the engine into anything new.

Modification

Add docs/en/advance/spec_decoding_new_method.md and a toctree entry
for it in docs/en/index.rst, right next to spec_decoding.md.

The page mirrors the shape of the existing
docs/en/advance/pytorch_new_model.md (which documents the model-patch
extension contract):

The registry / base class / method string triad.
The build_specdecode_proposer entry point and why proposers/__init__.py must import the new class.
What BaseSpecProposer already provides so contributors don't re-implement weight loading, draft forward, decoding-input update, or fallbacks.
A minimal MyMethod(BaseSpecProposer) skeleton with @SPEC_PROPOSERS.register_module(name='my_method').
The 3-tuple return contract for get_outputs (draft token ids, model_metas, target_hidden_states).
When to override build_model, illustrated with the two in-tree precedents (Qwen3_5MTP shares the target embeddings; Eagle3 swaps embeddings conditionally and widens get_target_hidden_size).
A 5-item shipping checklist.

No code changes. All snippets and references point to symbols that
exist in lmdeploy/pytorch/spec_decode/proposers/.
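
For readers skimming this PR, the registry / base class / method string triad can be sketched in isolation. The snippet below is a toy model of the pattern, not lmdeploy's code: Registry, the BaseSpecProposer stand-in, build_proposer, and the get_outputs argument list are all hypothetical and exist only so the example runs on its own. Only the names SPEC_PROPOSERS, register_module(name=...), the build_model signature, and the 3-tuple return shape are taken from this PR.

```python
from typing import Callable, Dict


class Registry:
    """Toy name -> class registry (a stand-in, not lmdeploy's implementation)."""

    def __init__(self) -> None:
        self._modules: Dict[str, type] = {}

    def register_module(self, name: str) -> Callable[[type], type]:
        def decorator(cls: type) -> type:
            if name in self._modules:
                raise KeyError(f'{name!r} is already registered')
            self._modules[name] = cls
            return cls  # return the class unchanged so it stays importable

        return decorator

    def get(self, name: str) -> type:
        return self._modules[name]


SPEC_PROPOSERS = Registry()  # same name as the registry referenced in the PR


class BaseSpecProposer:
    """Toy base class; the real one also covers weight loading and more."""

    def __init__(self, num_speculative_tokens: int) -> None:
        self.num_speculative_tokens = num_speculative_tokens
        self.model = None

    def build_model(self, empty_init, target_model=None, build_model_ctx=None):
        # Hypothetical default: the draft model owns its embeddings.
        self.model = {'embed_tokens': 'draft-embeddings'}

    def get_outputs(self, hidden_states, model_metas=None):
        raise NotImplementedError


# Registration happens as a side effect of importing the module, which is
# why the package __init__ has to import every proposer class.
@SPEC_PROPOSERS.register_module(name='my_method')
class MyMethod(BaseSpecProposer):

    def build_model(self, empty_init, target_model=None, build_model_ctx=None):
        super().build_model(empty_init, target_model, build_model_ctx)
        if target_model is not None:
            # Share the target embeddings instead of keeping a second copy.
            self.model['embed_tokens'] = target_model['embed_tokens']

    def get_outputs(self, hidden_states, model_metas=None):
        # Toy argmax per row; a real proposer would run its draft head here.
        draft_token_ids = [
            max(range(len(row)), key=row.__getitem__) for row in hidden_states
        ]
        # 3-tuple: (draft token ids, model_metas, target_hidden_states)
        return draft_token_ids, model_metas, hidden_states


def build_proposer(method: str, **kwargs) -> BaseSpecProposer:
    """Hypothetical helper: look up the method string and instantiate."""
    return SPEC_PROPOSERS.get(method)(**kwargs)
```

In this sketch the method string 'my_method' is the only coupling between the caller and the class, which is the property that lets a new proposer be added without touching the lookup code.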

BC-breaking

None — docs only.

Use cases

Anyone wanting to add a new draft-token proposer (e.g. the DFlash
method requested in #4530) can now read one page and know which class
to subclass, which method to implement, what to return, and where to
register.

Checklist

pre-commit run --files docs/en/advance/spec_decoding_new_method.md docs/en/index.rst passes (mdformat, codespell, trailing whitespace, end-of-file, copyright check).
Documentation only; no code touched, no new tests needed.
Existing supported versions unaffected.
Doc cross-links to spec_decoding.md and explicitly names the four shipped methods so the new page does not drift from them.

Closes (partially) the docs side of #1738 and #4530.

Document the BaseSpecProposer + SPEC_PROPOSERS extension contract so that third parties can add a draft-token proposer without reverse engineering the engine. The existing spec_decoding.md teaches usage for the four shipped methods (eagle, eagle3, deepseek_mtp, qwen3_5_mtp) but does not explain the plug-in surface; users have asked for this in InternLM#1738 and InternLM#4530. Contents follow the same shape as docs/en/advance/pytorch_new_model.md: the registry / base-class / method-string triad, what BaseSpecProposer already implements, a minimal new proposer, the get_outputs contract, when to override build_model (with the in-tree Qwen3_5MTP and Eagle3 examples), and a 5-item shipping checklist. Add the page to docs/en/index.rst under the Advance section right next to spec_decoding.md.

Copilot

Pull request overview

Adds a new documentation page that explains how to extend the PyTorch engine's speculative decoding pipeline with a new proposer, and wires it into the docs toctree. This addresses the docs gap referenced by issues #1738 and #4530.

Changes:

Adds docs/en/advance/spec_decoding_new_method.md walking through the SPEC_PROPOSERS registry, BaseSpecProposer contract, get_outputs return tuple, when to override build_model, and a contributor checklist.
Registers the new page in docs/en/index.rst next to spec_decoding.md.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
docs/en/advance/spec_decoding_new_method.md	New guide describing the proposer plug-in contract, with examples mirrored from the in-tree `deepseek_mtp`, `eagle3`, and `qwen3_5_mtp` proposers.
docs/en/index.rst	Adds the new doc to the advance toctree.

Verified against lmdeploy/pytorch/spec_decode/proposers/{base,deepseek_mtp,eagle3,qwen3_5_mtp}.py: registry name, build_specdecode_proposer signature, BaseSpecProposer API surface, the get_outputs 3-tuple, and the Eagle3/Qwen3_5MTP build_model overrides quoted in the doc all match the current code.


Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

RunningLeon · 2026-05-18T08:21:39Z

+@SPEC_PROPOSERS.register_module(name='qwen3_5_mtp')
+class Qwen3_5MTP(DeepseekMTP):
+
+    def build_model(self, empty_init, target_model=None, build_model_ctx=None):


One may also need to make changes in lmdeploy/pytorch/configurations and add a model definition in lmdeploy/pytorch/models.

SuperMarioYL · May 23, 2026

Thanks for the pointer @RunningLeon! Pushed 0f0fcd6, which adds a new "Wire up the draft model architecture" section right after the build_model discussion, covering both touch-points you flagged:

lmdeploy/pytorch/configurations/ — how to add an AutoModelConfigBuilder (auto-registered via configurations/__init__.py walking the package), with references to deepseek_v2.py, qwen3_5.py, and llama.py for the deepseek_mtp / qwen3_5_mtp / eagle patterns.

lmdeploy/pytorch/models/ — how to add the draft model class and register the architecture string in module_map.py, with deepseek_mtp.py, qwen3_5_mtp.py, llama_eagle.py / llama_eagle3.py, and glm4moe_mtp.py cited as templates.

The checklist at the bottom was also extended with the two new items. PTAL when you have a moment.
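
As a rough illustration of the two touch-points, here is a self-contained toy, assuming a builder base class that registers subclasses on definition and a dict that maps architecture strings to dotted import paths. The AutoModelConfigBuilder stand-in, build_from, MyDraftConfigBuilder, MODULE_MAP, resolve_architecture, and the 'my_draft' / 'MyDraftForCausalLM' strings are all hypothetical and are not copied from lmdeploy. The map points at collections.OrderedDict purely so the lookup resolves to something importable.

```python
import importlib
from types import SimpleNamespace
from typing import Dict, List


class AutoModelConfigBuilder:
    """Toy stand-in: subclasses self-register when their module is imported."""

    _sub_classes: List[type] = []

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        AutoModelConfigBuilder._sub_classes.append(cls)

    @classmethod
    def condition(cls, hf_config) -> bool:
        raise NotImplementedError

    @classmethod
    def build(cls, hf_config) -> dict:
        raise NotImplementedError

    @classmethod
    def build_from(cls, hf_config) -> dict:
        # Hypothetical dispatcher: first builder whose condition matches wins.
        for builder in cls._sub_classes:
            if builder.condition(hf_config):
                return builder.build(hf_config)
        raise ValueError('no config builder matches this hf_config')


class MyDraftConfigBuilder(AutoModelConfigBuilder):

    @classmethod
    def condition(cls, hf_config) -> bool:
        return getattr(hf_config, 'model_type', None) == 'my_draft'

    @classmethod
    def build(cls, hf_config) -> dict:
        # Flip the paradigm so the engine treats this as a spec-decode draft.
        return {'hidden_size': hf_config.hidden_size, 'model_paradigm': 'ar_spec'}


# Architecture string -> dotted path of the class implementing it.
# The target here is a placeholder class from the standard library.
MODULE_MAP: Dict[str, str] = {'MyDraftForCausalLM': 'collections.OrderedDict'}


def resolve_architecture(arch: str) -> type:
    """Import the module lazily and return the class named in MODULE_MAP."""
    module_path, _, class_name = MODULE_MAP[arch].rpartition('.')
    return getattr(importlib.import_module(module_path), class_name)


hf_config = SimpleNamespace(model_type='my_draft', hidden_size=64)
model_config = AutoModelConfigBuilder.build_from(hf_config)
draft_cls = resolve_architecture('MyDraftForCausalLM')
```

The sketch keeps the two lookups separate on purpose: the builder decides how a draft hf_config is interpreted, while the map decides which class gets instantiated for an architecture string.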

… for new spec-decoding method

Address review feedback on PR InternLM#4589: a new speculative-decoding method typically also needs (1) an AutoModelConfigBuilder under lmdeploy/pytorch/configurations/ to recognise the draft hf_config and flip model_paradigm to 'ar_spec', and (2) a draft model class under lmdeploy/pytorch/models/ registered in module_map.py so the engine patcher can resolve the draft architecture string. Add a new section covering both touch-points with references to existing implementations (deepseek_mtp, qwen3_5_mtp, llama_eagle/eagle3, glm4moe_mtp), and extend the checklist accordingly.

lvhan028 requested review from RunningLeon and Copilot May 18, 2026 03:14

lvhan028 added the documentation Improvements or additions to documentation label May 18, 2026

Copilot started reviewing on behalf of lvhan028 May 18, 2026 03:14

Copilot AI reviewed May 18, 2026

RunningLeon requested a review from Copilot May 18, 2026 08:18

Copilot started reviewing on behalf of RunningLeon May 18, 2026 08:19

Copilot AI reviewed May 18, 2026

RunningLeon reviewed May 18, 2026

SuperMarioYL wants to merge 2 commits into InternLM:main from SuperMarioYL:docs/spec-decoding-new-method
