feat: add conversation history to ToolContext #2745
HuxleyHu98 wants to merge 1 commit into openai:main
Conversation
Force-pushed bd2a003 to 5c7d961
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bd2a003792
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Force-pushed 5c7d961 to f23f1ec
💡 Codex Review
Reviewed commit: f23f1ec8c8
Force-pushed f23f1ec to 6a81776
💡 Codex Review
Reviewed commit: 6a817768a1
| This intentionally includes only items that can be represented as model input.
| Internal bookkeeping items such as approval placeholders are skipped.
| """
| history = ItemHelpers.input_to_new_input_list(original_input)
Build tool history from filtered turn input
The history snapshot is seeded from original_input, but model calls may first transform the input via call_model_input_filter in get_new_response. As a result, ToolContext.conversation_history can include unfiltered or redacted-away items that the model never saw. The new field then no longer reliably represents the visible context that led to the tool call, and tools that reuse the snapshot can unintentionally surface filtered content.
Fixed. The history snapshot now seeds from the actual filtered turn input sent to the model rather than the raw original_input. get_new_response() stores the deduplicated post-call_model_input_filter input on the run context, and turn_resolution builds conversation_history from that snapshot.
I also added a regression test that applies call_model_input_filter and verifies the tool-visible history contains the redacted/filtered input rather than the original unfiltered value.
Separately, I pulled in and formatted the newly-added .agents/skills/runtime-behavior-probe/templates/python_probe.py from main so make lint is green again on the current merge base.
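The fix described above can be sketched in isolation. A minimal, hypothetical illustration follows; RunSnapshot, record_model_input, and build_conversation_history are illustrative stand-ins, not the PR's actual names (the real change lives in get_new_response and turn_resolution):

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class RunSnapshot:
    # Illustrative stand-in for the run context: holds the deduplicated,
    # post-call_model_input_filter input actually sent to the model.
    filtered_turn_input: list[dict[str, Any]] = field(default_factory=list)

def record_model_input(
    snapshot: RunSnapshot, filtered_items: list[dict[str, Any]]
) -> None:
    # Called after the input filter runs, just before the model call.
    snapshot.filtered_turn_input = list(filtered_items)

def build_conversation_history(snapshot: RunSnapshot) -> list[dict[str, Any]]:
    # Tools see exactly what the model saw, never the raw original_input.
    return list(snapshot.filtered_turn_input)

# The redacted item reaches the tool; the unfiltered original never does.
snapshot = RunSnapshot()
record_model_input(snapshot, [{"role": "user", "content": "[REDACTED]"}])
assert build_conversation_history(snapshot) == [
    {"role": "user", "content": "[REDACTED]"}
]
```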
Force-pushed 05d5ee9 to 871cd79
💡 Codex Review
Reviewed commit: 871cd7929d
Force-pushed 871cd79 to afdf550
💡 Codex Review
Reviewed commit: afdf5509e5
Force-pushed afdf550 to 5adf0e5
💡 Codex Review
Reviewed commit: 5adf0e594f
Force-pushed 251ec48 to adb7877
💡 Codex Review
Reviewed commit: adb78779cf
| filtered_model_input = await maybe_filter_model_input(
|     agent=agent,
|     run_config=run_config,
|     context_wrapper=context_wrapper,
Preserve original input-filtered snapshot on resume
Resume rebuilding now re-runs maybe_filter_model_input using the current run_config rather than the filter output captured when the interrupted model call originally ran. If a run started with call_model_input_filter and later resumes via Runner.run(agent, state) without re-supplying the same filter, conversation_history for approved function tools is reconstructed from unfiltered or changed input, which can reintroduce redacted content and diverge from replay-safe behavior.
Force-pushed adb7877 to 7c1d78d
💡 Codex Review
Reviewed commit: 7c1d78d2f1
Force-pushed 7e86e99 to 8c353f9
💡 Codex Review
Reviewed commit: 8c353f91bb
src/agents/run_context.py
Outdated
| """ | ||
|
|
||
| turn_input: list[TResponseInputItem] = field(default_factory=list) | ||
| _tool_history_input: list[TResponseInputItem] = field(default_factory=list, repr=False) |
Preserve RunContextWrapper positional constructor order
Adding _tool_history_input between turn_input and _approvals changes the dataclass-generated RunContextWrapper.__init__ positional binding: existing calls like RunContextWrapper(ctx, usage, turn_input, approvals) now pass approvals into the history field and leave _approvals at its default. This is a source-compatibility regression for exported API consumers and can silently break approval behavior on resumed/tool-approval flows. The new field should be appended in a compatibility-safe way (for example init=False, or a custom constructor preserving the prior positional slots).
Fixed. _tool_history_input is now a private, init=False field, so RunContextWrapper keeps its existing positional constructor order. I also added a source-compat regression that exercises the legacy positional constructor shape (RunContextWrapper(ctx, usage, turn_input, approvals)) and verifies approvals still bind to _approvals rather than the new history storage.
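The compatibility issue and the init=False fix can be reproduced with plain dataclasses. This standalone sketch mirrors the PR's field names but uses an illustrative Wrapper class, not the SDK's real RunContextWrapper:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Wrapper:
    context: Any
    usage: Any
    turn_input: list = field(default_factory=list)
    # init=False keeps this field out of the generated __init__, so it does
    # not occupy a positional slot and legacy call shapes keep working.
    _tool_history_input: list = field(default_factory=list, init=False, repr=False)
    _approvals: dict = field(default_factory=dict)

# Legacy positional call shape still binds approvals to _approvals,
# not to the newly added history storage.
w = Wrapper(None, None, [], {"tool_a": True})
assert w._approvals == {"tool_a": True}
assert w._tool_history_input == []
```

Without init=False, the fourth positional argument would have bound to _tool_history_input instead, silently leaving _approvals empty.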
Force-pushed 8c353f9 to f64bc96
💡 Codex Review
Reviewed commit: f64bc9639e
Force-pushed eb1f152 to 91fb87f
💡 Codex Review
Reviewed commit: 91fb87f8a0
Prepared a minimal formatting-only fix for failing Tests run 23375138860 on head SHA eb1f152. Patch commit: discopops@aa9531f
This commit only formats:
Validation on that exact SHA context:
Cherry-pick command: git cherry-pick aa9531f
Force-pushed 91fb87f to 7302f1f
💡 Codex Review
Reviewed commit: 7302f1f01c
| preserve_tool_history = (
|     conversation_id is not None
|     and context_wrapper._tool_history_conversation_id == conversation_id
| )
Preserve history for previous_response_id server runs
This reset condition preserves _tool_history_input only when an explicit conversation_id matches, so reused RunContextWrapper instances in previous_response_id/auto_previous_response_id flows are always cleared at the start of each new Runner.run. In run_internal/run_loop.py, _tool_history_conversation_id is populated from server_conversation_tracker.conversation_id, which is None in those flows. Follow-up tool calls therefore receive only the latest delta while the model still has the full server-managed history, making ToolContext.conversation_history diverge from the context that actually produced the tool call.
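One way to address the concern above can be sketched as a standalone predicate. The function, its parameters, and the exact rule are illustrative assumptions, not the PR's actual code:

```python
from typing import Optional

def should_preserve_tool_history(
    run_conversation_id: Optional[str],
    tracked_conversation_id: Optional[str],
    previous_response_id: Optional[str],
) -> bool:
    # Explicit conversation id: preserve only when it matches the tracked id.
    if run_conversation_id is not None:
        return tracked_conversation_id == run_conversation_id
    # previous_response_id flows carry no conversation id, but the server
    # still holds the full history, so the local snapshot should carry forward.
    return previous_response_id is not None

assert should_preserve_tool_history("conv_1", "conv_1", None) is True
assert should_preserve_tool_history("conv_1", "conv_2", None) is False
assert should_preserve_tool_history(None, None, "resp_123") is True
assert should_preserve_tool_history(None, None, None) is False
```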
| def _build_function_tool_conversation_history(
|     turn_input: Sequence[TResponseInputItem],
| ) -> list[TResponseInputItem]:
|     """Build the visible history snapshot for a local function tool invocation."""
|     return list(turn_input)
Deep-copy conversation history before passing to tools
This helper builds conversation_history with only a shallow list copy, so each history item object is still shared with context_wrapper._tool_history_input. If a tool mutates a history item (for example, editing content or IDs), it mutates the runner's internal snapshot that is later reused for interruption state and server-history accumulation. That can corrupt subsequent model input and resume behavior even though the field is documented as a snapshot.
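A defensive-copy variant of the helper can be sketched standalone. This uses copy.deepcopy on plain dicts for illustration; the real items are TResponseInputItem values, and the function name here mirrors but does not reproduce the PR's helper:

```python
import copy
from typing import Any

def build_function_tool_conversation_history(
    turn_input: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    # Deep-copy so a tool mutating a history item cannot corrupt the
    # runner's internal snapshot reused for resume and server accumulation.
    return copy.deepcopy(list(turn_input))

internal = [{"role": "user", "content": "original"}]
snapshot = build_function_tool_conversation_history(internal)
snapshot[0]["content"] = "mutated by tool"
# The runner's internal state is untouched by the tool-side mutation.
assert internal[0]["content"] == "original"
```

The trade-off is per-invocation copy cost proportional to history size, which may matter for long conversations.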
Hi, thanks for sharing this idea. I just wanted to let you know that, since this PR brings lots of changes to run internals that could conflict with our ongoing priorities, we won't consider it in the short term, even if it's good to go.

Thanks for the clarification — that makes sense.
[P2][Owner:openai-agents-python maintainers] Escalation for CI-OPENAI-AGENTS-001: this finding has been open since 2026-03-11 (>72h). A validated format fix is published on branch discopops:ci/fix-format-eb1f152c (commit aa9531f). Please cherry-pick/apply and re-run the Tests workflow to close the carry-forward blocker.
Summary
This change adds conversation_history to ToolContext so local function tools can access the visible conversation history at invocation time in non-streaming runs.

Previously, tools could access the current tool call metadata and turn_input, but not the accumulated visible history that led up to the tool invocation. This made some tool patterns awkward in non-streaming mode, especially for tools that need to inspect parent-agent context or delegate to another model.

This PR keeps the scope intentionally narrow:
- conversation_history: list[TResponseInputItem] added to ToolContext
- ToolContext.conversation_history field documented in docs/context.md

The implementation threads the history snapshot through the current runtime pipeline rather than reviving the old _run_impl.py path.

Test plan
Ran focused verification for the changed runtime path and related ToolContext behavior:
- uv run ruff check src/agents/tool_context.py src/agents/run_internal/tool_execution.py src/agents/run_internal/tool_planning.py src/agents/run_internal/turn_resolution.py tests/test_tool_context.py tests/test_agent_runner.py
- uv run mypy src/agents/tool_context.py src/agents/run_internal/tool_execution.py src/agents/run_internal/tool_planning.py src/agents/run_internal/turn_resolution.py tests/test_tool_context.py tests/test_agent_runner.py
- uv run pytest -q tests/test_tool_context.py tests/test_agent_runner.py tests/test_agent_as_tool.py

Also verified:
- conversation_history

Issue number
Closes #904
Checks
- make lint and make format
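To illustrate the tool-side usage the summary describes, here is a standalone sketch. FakeToolContext and summarize_visible_history are illustrative stand-ins, not the SDK's real ToolContext API:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class FakeToolContext:
    # Stand-in for ToolContext with the conversation_history field this PR adds.
    conversation_history: list[dict[str, Any]] = field(default_factory=list)

def summarize_visible_history(ctx: FakeToolContext) -> str:
    # A local function tool can now inspect the visible items that led up to
    # its invocation, e.g. to forward parent-agent context to another model.
    user_turns = sum(
        1 for item in ctx.conversation_history if item.get("role") == "user"
    )
    return f"{len(ctx.conversation_history)} items ({user_turns} user turns)"

ctx = FakeToolContext(
    conversation_history=[
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
    ]
)
assert summarize_visible_history(ctx) == "2 items (1 user turns)"
```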