feat: Responses API background mode (background=True + adaptive polling) by dtlics · Pull Request #3472 · openai/openai-agents-python

dtlics · 2026-05-20T14:46:45Z

Draft. Design questions in #3471 are still open — posting this PR alongside the issue to make the design concrete, not to request review. Will mark ready for review after maintainer feedback on the four open questions.

What this branch does

Appends background: bool | None and background_poll_interval_seconds: float | None to ModelSettings (preserves positional ordering per AGENTS.md).
Adds a submit-and-adaptive-poll loop to OpenAIResponsesModel.get_response via a new private _poll_background_response_until_terminal. Honors openai-poll-after-ms response headers; explicit background_poll_interval_seconds overrides; falls back to 1.0s. On asyncio.CancelledError or terminal failure, schedules client.responses.cancel(id) fire-and-forget so server-side work doesn't leak.
stream_response is unchanged at the call-site level — background=True flows through _build_response_create_kwargs into the streaming responses.create() call, giving server-side durability without client-side resume logic.
OpenAIChatCompletionsModel and OpenAIResponsesWSModel raise UserError when background=True is set, so users don't silently lose the durability guarantee they opted into.
Docs page at docs/background.md registered in mkdocs.yml; runnable example at examples/background_mode/main.py.
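The submit-and-poll behavior described above can be sketched roughly as follows. This is a minimal illustration, not the branch's code: `FakeResponse`, `FakeResponses`, `poll_until_terminal`, and `demo` are hypothetical names, the client is an in-memory stand-in rather than `AsyncOpenAI`, and only the `CancelledError` path of the cancel behavior is shown.

```python
import asyncio
from dataclasses import dataclass

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}

# Keep references to fire-and-forget tasks so they are not garbage collected.
_pending_cancels = set()


@dataclass
class FakeResponse:
    """Stand-in for a Responses API object; only the fields the loop reads."""
    id: str
    status: str


class FakeResponses:
    """Hypothetical in-memory client that replays a scripted list of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.cancelled_ids = []

    async def retrieve(self, response_id):
        # Pop statuses in order; keep returning the last one once exhausted.
        status = self._statuses.pop(0) if len(self._statuses) > 1 else self._statuses[0]
        return FakeResponse(id=response_id, status=status)

    async def cancel(self, response_id):
        self.cancelled_ids.append(response_id)


async def poll_until_terminal(responses, response, interval_seconds=1.0):
    """Poll retrieve() until the response reaches a terminal status."""
    try:
        while response.status not in TERMINAL_STATUSES:
            await asyncio.sleep(interval_seconds)
            response = await responses.retrieve(response.id)
    except asyncio.CancelledError:
        # Schedule the server-side cancel without awaiting it, then re-raise.
        task = asyncio.get_running_loop().create_task(responses.cancel(response.id))
        _pending_cancels.add(task)
        task.add_done_callback(_pending_cancels.discard)
        raise
    if response.status != "completed":
        raise RuntimeError(f"response ended with status {response.status!r}")
    return response


async def demo():
    client = FakeResponses(["in_progress", "completed"])
    first = FakeResponse(id="resp_123", status="queued")
    return await poll_until_terminal(client, first, interval_seconds=0.0)


final = asyncio.run(demo())
print(final.status)  # completed
```

The sketch raises a plain `RuntimeError` for non-completed terminal states; the PR instead reuses an existing SDK helper for that case.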

Diff: 791 insertions, 1 deletion across 10 files. No changes to run.py, anything under run_internal/, or run_state.py — CURRENT_SCHEMA_VERSION is not bumped.

Design choices currently reflected in the code

These map to the four open questions in #3471. All are easy to flip — happy to restructure based on maintainer preference.

Surface — ModelSettings, not RunConfig. Joins the existing family of model-call toggles (store, reasoning, prompt_cache_retention, context_management) and gives per-agent granularity in multi-agent runs.
Non-Responses backends: UserError, not silent no-op. Loud failure, on the reasoning that users opted into a server-side durability guarantee that those transports can't provide. Counterpoint (raised in #3471): reasoning / verbosity already silently no-op on those backends, so picking the opposite policy here is an inconsistency. Easy to switch to silent no-op if preferred.
Streaming — server-side durability only, no client-side starting_after auto-resume. Mirrors plain openai-python's behavior (it exposes responses.stream(response_id=..., starting_after=N) as a primitive but doesn't auto-resume on disconnect).
Retrieve-call retries — AsyncOpenAI.max_retries only (partial). Each retrieve call uses the client's built-in max_retries for transient HTTP failures. Honest gap: the plan's intent was that an exhausted-retry failure during polling should not propagate up to get_response_with_retry and trigger a fresh submit+poll cycle (which would discard a possibly-minutes-long in-flight reasoning response). The current code does not mark such exceptions as non-retriable, so outer-envelope replay can still happen subject to retry policy. The cancel-on-CancelledError half of Q4 is implemented; the suppress-outer-retry half is not. Will fix once Q4 lands one way or the other.

Test plan

make format && make lint && make typecheck && make tests all pass on the branch:

make format — ruff format: 778 files left unchanged. ruff check --fix: All checks passed.
make lint — ruff check: All checks passed.
make typecheck — mypy: Success, no issues found in 772 source files. pyright: 0 errors, 0 warnings, 0 informations.
make tests — parallel: 4589 passed, 2 skipped in 16.61s. Serial: 27 passed, 5 skipped, 4590 deselected in 4.82s.

New tests (in tests/models/test_openai_responses.py and tests/models/test_openai_chatcompletions.py) cover: terminal-on-first-response fast path, queued→in_progress→completed multi-poll, terminal failed / cancelled / incomplete raises, openai-poll-after-ms header honored, explicit interval overrides header, CancelledError during poll schedules responses.cancel, extra_args={"background": True} conflict raises TypeError, streaming pass-through, and Chat Completions / WS UserError rejections.

Not verified by me: make build-docs / make build-full-docs were not run. The new docs page is registered in all four language nav sections in mkdocs.yml (en/ja/ko/zh) on the assumption that the existing translation pipeline fills in non-English content; this is worth a reviewer's eye.

Out of scope (proposed follow-ups, mirroring #3471)

Stream auto-resume via responses.stream(response_id=..., starting_after=N) on transport errors.
Cross-process Runner resume / continuation tokens — would need RunState schema bump.
ZDR enforcement — background mode is not ZDR-compatible and retains response data for ~10 minutes server-side.
Auto-detection of when to use background mode based on model / payload.
OpenAIResponsesModel.retrieve_response() helper — users can call client.responses.retrieve(id) directly on the underlying client, so this would add public API surface without adding capability.

…onds fields

Append two optional fields to ModelSettings to opt into Responses API background mode. background=True submits via responses.create(background=True) and adaptively polls responses.retrieve(id) until terminal. The optional background_poll_interval_seconds pins the polling cadence; when it is unset, polling defers to the openai-poll-after-ms response header.

Fields are appended at the end of the dataclass per AGENTS.md's positional compatibility rule. background is added to _TRACEABLE_MODEL_SETTING_FIELDS so the flag is captured in spans; the interval is operational metadata and is intentionally excluded.
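To illustrate why appending matters for positional compatibility, here is a small sketch. `SettingsSketch` is a hypothetical two-field stand-in, not the real ModelSettings, and its pre-existing fields are chosen only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in for ModelSettings with two pre-existing fields."""

    temperature: Optional[float] = None
    store: Optional[bool] = None
    # New fields go last so existing positional calls keep their meaning.
    background: Optional[bool] = None
    background_poll_interval_seconds: Optional[float] = None


# A call written before the new fields existed still binds the same way.
legacy = SettingsSketch(0.2, True)
field_names = [f.name for f in fields(SettingsSketch)]
```

Had the new fields been inserted before `store`, the positional `True` above would have bound to `background` instead.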

…und mode

When ModelSettings.background is True, OpenAIResponsesModel.get_response now submits via responses.create(background=True), then polls responses.retrieve(id) until the response reaches a terminal status (completed | failed | cancelled | incomplete). Streaming pass-through is unchanged: stream_response forwards background=True to responses.create(stream=True, background=True) for server-side durability without client-side auto-resume.

Polling honors the openai-poll-after-ms response header for adaptive intervals (matches openai-python's create_and_poll pattern); an explicit background_poll_interval_seconds overrides the header; the fallback is 1.0s.

On asyncio.CancelledError or a non-recoverable error mid-poll, the SDK schedules a fire-and-forget responses.cancel(id) so server-side compute is not leaked, then re-raises. Non-completed terminal states raise the existing response_terminal_failure_error helper.

background is plumbed through _build_response_create_kwargs alongside store and prompt_cache_retention, so the existing extra_args duplicate-key check catches accidental double-spec.
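The interval precedence described here (explicit setting, then header, then 1.0s fallback) could be expressed as a small pure function. This is a sketch under assumptions: `resolve_poll_interval` is a hypothetical name, and falling back to the default on a malformed header is an assumption rather than behavior stated in the PR.

```python
from typing import Mapping, Optional

DEFAULT_POLL_INTERVAL_SECONDS = 1.0


def resolve_poll_interval(
    explicit_seconds: Optional[float], headers: Mapping[str, str]
) -> float:
    """Pick the next sleep: explicit setting, then header, then the default."""
    if explicit_seconds is not None:
        return explicit_seconds
    raw = headers.get("openai-poll-after-ms")
    if raw is not None:
        try:
            return int(raw) / 1000.0  # header value is in milliseconds
        except ValueError:
            pass  # assumption: a malformed header falls through to the default
    return DEFAULT_POLL_INTERVAL_SECONDS


from_header = resolve_poll_interval(None, {"openai-poll-after-ms": "2500"})
from_setting = resolve_poll_interval(0.25, {"openai-poll-after-ms": "2500"})
from_default = resolve_poll_interval(None, {})
```

A plain dict lookup is case-sensitive, so a real implementation would likely read the header from a case-insensitive headers object.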

…ompletions adapters

Setting ModelSettings.background=True on an adapter that cannot honor it must fail loudly rather than silently drop the durability guarantee the caller opted into:

- OpenAIResponsesWSModel: the WebSocket transport always streams and cannot decouple submit from poll. Raise UserError in the overridden _fetch_response so both get_response and stream_response paths are covered.
- OpenAIChatCompletionsModel: the Chat Completions API has no background parameter. Add _handle_unsupported_background and call it at the top of get_response and stream_response, mirroring the existing _handle_unsupported_prompt pattern.
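A fail-loudly guard of this kind might look like the sketch below. The `UserError` class here is a local stand-in for the SDK's exception, and `handle_unsupported_background` with its `backend_name` parameter is a hypothetical shape, not the signature used in the branch.

```python
from typing import Optional


class UserError(Exception):
    """Local stand-in for the SDK's UserError, defined here for illustration."""


def handle_unsupported_background(background: Optional[bool], backend_name: str) -> None:
    """Raise if the caller asked for background mode on a backend without it."""
    if background:
        raise UserError(
            f"background=True is not supported by {backend_name}; "
            "use the Responses HTTP model instead."
        )


# None and False mean the caller did not opt in, so the guard is a no-op.
handle_unsupported_background(None, "OpenAIChatCompletionsModel")
handle_unsupported_background(False, "OpenAIChatCompletionsModel")
```

Calling the guard at the top of both the non-streaming and streaming entry points keeps the two paths from drifting apart.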

… and rejections

Add 15 tests for the new background mode:

- terminal-on-first-response (no poll triggered)
- multi-poll until completed
- terminal failures (failed | cancelled | incomplete) raise ModelBehaviorError
- openai-poll-after-ms header drives the next sleep interval
- explicit background_poll_interval_seconds overrides the header
- asyncio.CancelledError mid-poll schedules a fire-and-forget responses.cancel(id) and re-raises (uses a real-sleep handle captured pre-monkeypatch to avoid re-tripping the cancel after the test undoes the patch)
- background=True is plumbed into the responses.create() kwargs
- extra_args={"background": True} + ModelSettings.background=True surfaces the existing duplicate-key TypeError
- streaming + background passes through unchanged
- OpenAIResponsesWSModel rejects background=True from both get_response and stream_response
- OpenAIChatCompletionsModel rejects background=True from both get_response and stream_response

Update test_all_fields_serialization to set the two new ModelSettings fields so the "every field non-None" invariant still holds.

New docs/background.md describes the transparent use through Runner, the streaming pass-through, retrieving a response by id via the underlying AsyncOpenAI client, the cancel-on-CancelledError behavior, supported backends (Responses HTTP only — WS and Chat Completions raise UserError), and the platform limits (~10-minute retention, not ZDR-compatible). Registered under "Background mode" in all four language nav sections in mkdocs.yml. Translated content for ja/ko/zh will be generated by the existing docs translation pipeline.

examples/background_mode/main.py runs the same prompt twice — once synchronously, once with ModelSettings(background=True) — to demonstrate that opting into background mode is a one-field change at the Agent level and produces equivalent final output, with the durability win coming from the underlying submit + poll transport rather than from the SDK API.

seratch · 2026-05-20T23:15:31Z

Closing this for this reason: #3471 (comment). Thanks for your interest here.

dtlics added 6 commits May 20, 2026 15:18

dtlics mentioned this pull request May 20, 2026

Support Responses API background mode in Runner (background=True + adaptive polling) #3471

Closed

seratch closed this May 20, 2026

dtlics wants to merge 6 commits into openai:main from dtlics:feat/responses-background-mode
