Add OpenAI Responses-compatible endpoint by CUHKSZzxy · Pull Request #4582 · InternLM/lmdeploy

CUHKSZzxy · 2026-05-13T03:14:45Z

Summary

Add a text-first OpenAI Responses-compatible POST /v1/responses endpoint.
Support string/message input, instructions/developer-role normalization, function tools, tool choice validation, and Responses SSE events.
Add focused tests, Responses API docs, and Codex integration docs.
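The input handling described above can be illustrated with a minimal sketch. This is not the PR's implementation: the function name `normalize_input`, the choice to map `instructions` to a leading system message, and the folding of the `developer` role into `system` are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional, Union


def normalize_input(
    input_value: Union[str, List[Dict[str, Any]]],
    instructions: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Turn a Responses-style `input` into chat-style messages (illustrative only)."""
    messages: List[Dict[str, Any]] = []
    if instructions:
        # Assumption: `instructions` becomes a leading system message.
        messages.append({'role': 'system', 'content': instructions})
    if isinstance(input_value, str):
        # A bare string is treated as a single user turn.
        messages.append({'role': 'user', 'content': input_value})
        return messages
    for item in input_value:
        role = item.get('role', 'user')
        # Assumption: the `developer` role is folded into `system`.
        if role == 'developer':
            role = 'system'
        messages.append({'role': role, 'content': item.get('content', '')})
    return messages
```

Under these assumptions, a plain string and a one-element message list carrying the same user text normalize to the same message sequence.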

Validation

pytest tests/test_lmdeploy/serve/openai/test_responses.py -q (18 passed)
git diff --check upstream/main...HEAD
Local Codex smoke tests against LMDeploy for no-tool, read, edit, multi-step, and project workflows.

Codex Demo

Assistance

Assisted with Codex + GPT-5.5 High

Copilot

Pull request overview

This PR adds a text-first, OpenAI Responses API–compatible endpoint (POST /v1/responses) to LMDeploy’s OpenAI server, including request normalization (string/messages/instructions/developer role), function tool mapping/tool-choice validation, and an SSE streaming event surface. It also updates middleware route protection, integrates the new router into api_server, and adds tests + documentation (including Codex integration docs).

Changes:

Add lmdeploy/serve/openai/responses.py implementing POST /v1/responses (non-stream + SSE streaming) and related request/response models.
Wire the new endpoint into the OpenAI API server and protect it under engine-sleep middleware.
Add focused unit tests plus English/Chinese documentation and integration guides (Codex / Claude Code).
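For readers unfamiliar with SSE streaming, here is a rough sketch of how a text-only stream might be framed. The event names follow the public OpenAI Responses API; the helper names `format_sse_event` and `stream_text` are hypothetical, and the payloads are deliberately reduced to a sequence number and a delta, so this should not be read as the event surface the PR actually emits.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def format_sse_event(event_type: str, payload: Dict[str, Any]) -> str:
    """Serialize one SSE frame: an `event:` line, a `data:` line, then a blank line."""
    body = dict(payload, type=event_type)
    return f'event: {event_type}\ndata: {json.dumps(body)}\n\n'


def stream_text(chunks: Iterable[str]) -> Iterator[str]:
    """Yield SSE frames for a text-only response (simplified payloads)."""
    seq = 0
    yield format_sse_event('response.created', {'sequence_number': seq})
    for chunk in chunks:
        seq += 1
        yield format_sse_event(
            'response.output_text.delta',
            {'sequence_number': seq, 'delta': chunk},
        )
    seq += 1
    yield format_sse_event('response.completed', {'sequence_number': seq})
```

A client can split the stream on blank lines and decode each `data:` line as JSON to recover the events in order.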

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| tests/test_lmdeploy/serve/openai/test_responses.py | Adds unit coverage for input normalization, tools/tool_choice validation, response shapes, and SSE event shapes. |
| lmdeploy/serve/utils/server_utils.py | Adds `/v1/responses` to sleeping-engine protected inference routes. |
| lmdeploy/serve/openai/responses.py | Implements the Responses-compatible router, request parsing, tool conversion, non-stream response construction, and SSE streaming events. |
| lmdeploy/serve/openai/api_server.py | Registers the new Responses router on the FastAPI app. |
| docs/zh_cn/llm/api_server.md | Links to the new Responses endpoint documentation. |
| docs/zh_cn/llm/api_server_responses.md | Documents the `/v1/responses` endpoint (Text V1 subset), tools, SSE events, and Codex setup notes. |
| docs/zh_cn/index.rst | Adds the Responses doc page to the Chinese toctree. |
| docs/en/llm/api_server.md | Links to the new Responses endpoint documentation. |
| docs/en/llm/api_server_responses.md | Documents the `/v1/responses` endpoint and points to Codex integration docs. |
| docs/en/integration/codex.md | Adds a Codex → LMDeploy `/v1/responses` integration guide. |
| docs/en/integration/claude_code.md | Adds a Claude Code → LMDeploy `/v1/messages` integration guide. |
| docs/en/index.rst | Adds the Responses doc page and a new Integrations toctree (Codex/Claude Code). |


lvhan028 · 2026-06-01T10:15:38Z

+    ):
+        if getattr(request, field_name) is not None:
+            ignored_fields.append(field_name)
+    if request.parallel_tool_calls is not None and request.parallel_tool_calls is not True:


What's the expected behavior if "request.parallel_tool_calls is True"?

Fixed, currently aligned with vLLM-style behavior. True/default keeps all parsed tool calls; False filters to the first final tool call or streaming tool-call index 0.

The behavior is model-specific; if the model supports parallel tool calls, the output can contain multiple tool calls per response.
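The filtering rule described in this thread can be sketched as follows. The helper names `filter_tool_calls` and `keep_stream_delta` are hypothetical and only mirror the stated behavior (True or unset keeps everything, False keeps the first final tool call or streaming index 0); the PR's actual code may be structured differently.

```python
from typing import Any, Dict, List, Optional


def filter_tool_calls(
    tool_calls: List[Dict[str, Any]],
    parallel_tool_calls: Optional[bool],
) -> List[Dict[str, Any]]:
    """Non-streaming case: keep every parsed call unless the flag is explicitly False."""
    if parallel_tool_calls is False:
        return tool_calls[:1]
    return list(tool_calls)


def keep_stream_delta(tool_call_index: int, parallel_tool_calls: Optional[bool]) -> bool:
    """Streaming case: with the flag set to False, only tool-call index 0 passes through."""
    return parallel_tool_calls is not False or tool_call_index == 0
```

Note that `None` (the field omitted by the client) is treated the same as `True`, which matches the "True/default" wording in the reply.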

lvhan028 · 2026-06-01T10:16:59Z

+            'user',
+            'presence_penalty',
+            'frequency_penalty',
+            'repetition_penalty',


We support "repetition_penalty", don't we?

Fixed. ResponsesRequest now accepts repetition_penalty and forwards it to GenerationConfig.
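As a rough illustration of what forwarding an optional sampling field can look like, the sketch below uses stand-in dataclasses. `SketchRequest`, `SketchGenerationConfig`, their fields, and their default values are hypothetical and are not LMDeploy's `ResponsesRequest` or `GenerationConfig`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchRequest:
    """Stand-in for a request model; None means the client did not set the field."""
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    repetition_penalty: Optional[float] = None


@dataclass
class SketchGenerationConfig:
    """Stand-in for an engine config; the defaults here are illustrative."""
    temperature: float = 1.0
    top_p: float = 1.0
    repetition_penalty: float = 1.0


def to_generation_config(request: SketchRequest) -> SketchGenerationConfig:
    # Forward only the fields the client actually set, so engine defaults survive.
    overrides = {name: value for name, value in vars(request).items() if value is not None}
    return SketchGenerationConfig(**overrides)
```

Passing only the non-None fields keeps the engine-side defaults in one place instead of duplicating them in the request model.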

lvhan028 · 2026-06-01T10:18:59Z

+            'stream_options',
+            'top_logprobs',


Are "stream_options" and "top_logprobs" different from the ones defined in openai's v1/chat/completions?

lvhan028 · 2026-06-01T10:22:12Z

+        except ValueError as err:
+            return _error_response(HTTPStatus.BAD_REQUEST, str(err), param='tool_choice')
+
+        parser_cls = getattr(server_context, 'response_parser_cls', None)


parser_cls = server_context.response_parser_cls
We don't need to check whether parser_cls is None, since it is always set by api_server's set_parsers.
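If `response_parser_cls` is guaranteed to be populated at server startup, the lookup and any later tool-parser check reduce to plain attribute access. The sketch below uses `SimpleNamespace` stand-ins for the server context and parser class; the function name `tool_parser_missing` is hypothetical, and the `bool(...)` coercion is an addition made here so the helper always returns a boolean.

```python
from types import SimpleNamespace
from typing import Any, List, Optional


def tool_parser_missing(server_context: Any, tools: Optional[List[Any]], tool_choice: Any) -> bool:
    """Return True when tools are requested but no tool parser is configured."""
    # Direct attribute access: assumes the parser class is always set beforehand.
    parser_cls = server_context.response_parser_cls
    tools_enabled = bool(tools) and tool_choice != 'none'
    return tools_enabled and parser_cls.tool_parser_cls is None


# Stand-in objects for illustration only.
without_tool_parser = SimpleNamespace(response_parser_cls=SimpleNamespace(tool_parser_cls=None))
with_tool_parser = SimpleNamespace(response_parser_cls=SimpleNamespace(tool_parser_cls=object))
```

With this shape, a misconfigured server fails loudly with an `AttributeError` instead of silently taking the "no parser" branch.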

lvhan028 · 2026-06-01T10:25:48Z

+
+        parser_cls = getattr(server_context, 'response_parser_cls', None)
+        tools_enabled = tools and tool_choice != 'none'
+        if tools_enabled and (parser_cls is None or parser_cls.tool_parser_cls is None):


"parser_cls is None" can be removed safely

lvhan028 · 2026-06-01T10:28:04Z

+                tools=parser_tools,
+                tool_choice=tool_choice,
+            )
+            response_parser = parser_cls(request=openai_request, tokenizer=tokenizer)


Consider rebasing onto the main branch, since parser initialization no longer requires "tokenizer".

lvhan028 · 2026-06-01T12:04:03Z

readthedocs build error:

  | from .protocol import (
  | File "/home/docs/checkouts/readthedocs.org/user_builds/lmdeploy/checkouts/4582/lmdeploy/serve/openai/responses/protocol.py", line 9, in <module>
  | from openai.types.responses import (
  | ModuleNotFoundError: No module named 'openai'

Consider adding "openai" to docs.txt.


CUHKSZzxy marked this pull request as ready for review May 13, 2026 03:30

Copilot AI review requested due to automatic review settings May 13, 2026 03:30

Copilot started reviewing on behalf of CUHKSZzxy May 13, 2026 03:30

Copilot AI reviewed May 13, 2026


Comment thread lmdeploy/serve/openai/responses/serving.py Outdated

Comment thread lmdeploy/serve/openai/responses.py Outdated

Comment thread lmdeploy/serve/openai/responses/serving.py Outdated

lvhan028 added the enhancement New feature or request label May 13, 2026

lvhan028 self-requested a review May 23, 2026 09:46

lvhan028 reviewed Jun 1, 2026


CUHKSZzxy added 17 commits June 1, 2026 20:21

feat: add text v1 responses endpoint

35d77a4

feat: support responses function tools

eb57702

docs: add responses api server guide

50c09b3

fix: harden responses tool handling

159168f

docs: update responses example model

e9d6df4

docs: add codex integration guide

45d62be

fix: address responses api review comments

e84235f

Refactor Responses endpoint protocol

cb86c92

Improve Responses protocol compatibility

a2c6859

fix: support responses repetition penalty

50ba8e0

fix: align responses parallel tool call filtering

f7a7fa5

fix: use responses stream options schema

89e1a05

fix: simplify responses parser setup

b47847a

docs: add openai to docs requirements

3c3839d

fix: address responses review feedback

145fcc6

fix: use openai responses sdk types

2a5c63f

fix: align parallel tool call filtering

1e29354

CUHKSZzxy force-pushed the feat/responses-api-text-v1 branch from 17455dc to 1e29354 Compare June 2, 2026 08:44

CUHKSZzxy added 5 commits June 2, 2026 17:18

docs: skip responses openapi examples

45dc674

test: trim redundant responses checks

541e867

fix: tighten responses review handling

0fb4d90

Merge remote-tracking branch 'origin/main' into feat/responses-api-te…

353fd0d

…xt-v1

Merge remote-tracking branch 'origin/main' into feat/responses-api-te…

4c6171b

…xt-v1

CUHKSZzxy wants to merge 22 commits into InternLM:main from CUHKSZzxy:feat/responses-api-text-v1

3 participants
