
Python: Fix WorkflowAgent not persisting response messages to session history #4319

Merged
moonbox3 merged 1 commit into microsoft:main from moonbox3:agent/fix-1694-1
Feb 26, 2026


@moonbox3
Contributor

Motivation and Context

When using WorkflowAgent with multi-turn sessions, assistant responses were never stored in session history. This meant that on subsequent turns, the workflow only received prior user inputs—not prior assistant responses—breaking conversational context across turns (issue #1694).

Fixes #1694

Description

The root cause was that session_context._response was never set before calling _run_after_providers, so InMemoryHistoryProvider (and other after-run providers) had no response to persist. The fix sets session_context._response to the completed AgentResponse in both the non-streaming and streaming paths of WorkflowAgent—in the streaming case, by collecting all yielded AgentResponseUpdates and reconstructing the final response via AgentResponse.from_updates. Three new tests verify multi-turn history persistence for non-streaming, streaming, and serialize/deserialize round-trip scenarios.
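The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not the framework's code: `ToyResponse`, `ToySessionContext`, `ToyHistoryProvider`, and `run_turn` are hypothetical stand-ins, and the only behavior assumed from the PR is that an after-run provider stores response messages only when the context carries a response.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyResponse:
    """Hypothetical stand-in for an agent response: a list of (role, text)."""
    messages: list


@dataclass
class ToySessionContext:
    """Hypothetical stand-in for the context handed to after-run providers."""
    input_messages: list
    _response: Optional[ToyResponse] = None

    @property
    def response(self) -> Optional[ToyResponse]:
        return self._response


@dataclass
class ToyHistoryProvider:
    """Stores inputs always, and response messages only if a response is set."""
    history: list = field(default_factory=list)

    def after_run(self, context: ToySessionContext) -> None:
        self.history.extend(context.input_messages)
        if context.response is not None:
            self.history.extend(context.response.messages)


def run_turn(provider: ToyHistoryProvider, user_text: str, set_response: bool) -> ToyResponse:
    context = ToySessionContext(input_messages=[("user", user_text)])
    result = ToyResponse(messages=[("assistant", f"echo: {user_text}")])
    if set_response:
        # The step the fix adds: expose the result before providers run.
        context._response = result
    provider.after_run(context)
    return result


buggy = ToyHistoryProvider()
run_turn(buggy, "hi", set_response=False)

fixed = ToyHistoryProvider()
run_turn(fixed, "hi", set_response=True)

buggy_roles = [role for role, _ in buggy.history]
fixed_roles = [role for role, _ in fixed.history]
```

In this sketch `buggy_roles` contains only the user turn, while `fixed_roles` contains both the user and assistant turns, which mirrors the symptom reported in the issue.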

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by moonbox3's agent

… history (microsoft#1694)

WorkflowAgent._run_impl() and _run_stream_impl() did not set
session_context._response before calling _run_after_providers().
This caused InMemoryHistoryProvider.after_run() to see context.response
as None, so response messages were never stored in the session.

On subsequent runs, the workflow only received prior user inputs without
assistant responses, breaking multi-turn conversations.

Fix: Set session_context._response to the workflow result before running
after_run providers, matching the behavior of the regular Agent class.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 26, 2026 12:37
Contributor Author

@moonbox3 moonbox3 left a comment


Automated Code Review

Reviewers: 3 | Confidence: 83%

✓ Correctness

The diff correctly fixes a bug where session_context._response was not set before running after_run providers, causing InMemoryHistoryProvider to miss persisting assistant responses in multi-turn conversations. Both the non-streaming and streaming paths are handled. The non-streaming path straightforwardly assigns the already-computed result. The streaming path collects all updates and reconstructs the response via AgentResponse.from_updates. The tests are well-structured and cover non-streaming, streaming, and serialization roundtrip scenarios. No correctness issues found.

✓ Security Reliability

This diff fixes multi-turn conversation history by setting session_context._response before running after_run providers. The change is straightforward and well-tested. There are no injection risks, secrets, or unsafe deserialization issues. The main reliability concern is an asymmetry between the streaming and non-streaming paths: when the streaming path yields zero updates, _response is never set, yet _run_after_providers still executes, which could cause downstream providers to silently skip persistence or raise an AttributeError if they expect _response to exist. This is a minor edge-case reliability gap, not a blocking issue.

✓ Test Coverage

The three new tests cover the core non-streaming and streaming multi-turn history persistence, plus a serialization roundtrip, which directly exercises the production-code changes. However, the streaming path has an explicit if all_updates: guard for the empty-stream edge case that is never tested, the streaming test omits text-content assertions (only checking roles), and none of the tests verify that the _response object set on session_context actually matches the AgentResponse returned to the caller.

Suggestions

  • In the streaming path, consider whether the if all_updates: guard could silently hide issues where the workflow produces no output. If an empty response is unexpected, logging a warning might help with debugging.
  • The # type: ignore[assignment] comments suggest _response may not be part of the public typed interface of session_context. Consider adding a proper setter or typed attribute to avoid relying on private attribute assignment.
  • In the streaming path, consider setting session_context._response to a sentinel or empty AgentResponse even when all_updates is empty, so that after_run providers behave consistently with the non-streaming path and don't encounter a missing _response attribute.
  • Add a test for the empty-stream edge case (stream that yields zero updates) to cover the if all_updates: guard on the streaming path and verify that _response is not set / remains None.
  • The streaming test (test_multi_turn_session_stores_responses_streaming) only asserts on roles. Add text-content assertions (similar to the non-streaming variant) to confirm the actual message payloads survive the round-trip.
  • Consider adding an assertion that the AgentResponse returned by agent.run() matches what ends up persisted in the session history, to ensure session_context._response is set to the correct value and not a stale or mismatched object.
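The empty-stream asymmetry raised in these suggestions can be sketched in isolation. The helpers below are hypothetical and do not use the framework's types; they assume only that the streaming path skips setting a response when no updates were collected, and they contrast that with the suggested alternative of always setting an (empty) response.

```python
from typing import Optional


def finalize_response(all_updates: list, always_set: bool) -> Optional[list]:
    """Hypothetical finalizer: returns what an after-run provider would see."""
    if all_updates:
        return list(all_updates)
    # Guarded variant leaves the response unset; the alternative sets an empty one.
    return [] if always_set else None


def provider_persists(response: Optional[list]) -> str:
    """Hypothetical provider behavior: skips persistence when nothing is set."""
    if response is None:
        return "skipped"
    return f"stored {len(response)} message(s)"


guarded = provider_persists(finalize_response([], always_set=False))
consistent = provider_persists(finalize_response([], always_set=True))
normal = provider_persists(finalize_response(["hi"], always_set=False))
```

A regression test for the empty-stream case could assert on whichever of these two behaviors the project decides is intended.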

Automated review by moonbox3's agents

@markwallace-microsoft
Member

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/core/agent_framework/_workflows | | | | |
| _agent.py | 359 | 70 | 80% | 66, 74–80, 116–117, 207, 253, 255, 318, 320, 374–375, 381–382, 388, 390, 395, 455–456, 465, 472, 498, 531–533, 535, 537, 539, 544, 549, 596, 626, 643, 682–685, 691, 697, 701–702, 705–711, 715–716, 724, 785, 792, 798–799, 810, 842, 849, 870, 879, 883, 885–887, 894 |
| TOTAL | 22179 | 2762 | 87% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 4675 | 247 💤 | 0 ❌ | 0 🔥 | 1m 18s ⏱️ |

Contributor

Copilot AI left a comment


Pull request overview

Fixes WorkflowAgent session history persistence so multi-turn sessions include prior assistant responses (addressing #1694), by ensuring the completed response is available to after-run context providers (e.g., InMemoryHistoryProvider) in both non-streaming and streaming runs.

Changes:

  • Set session_context._response before invoking _run_after_providers in the non-streaming execution path.
  • In streaming, collect emitted AgentResponseUpdates and reconstruct a final AgentResponse via AgentResponse.from_updates(...) so after-run providers can persist assistant messages.
  • Add tests covering multi-turn history persistence for non-streaming, streaming, and session serialize/deserialize round-trips.
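The streaming change listed above follows a collect-then-reconstruct pattern, which can be sketched without the framework. Everything here is an assumption for illustration: `ToyUpdate`, `merge_updates`, `fake_workflow_stream`, and the dict-based context are hypothetical, and `merge_updates` is a simplified stand-in rather than the behavior of `AgentResponse.from_updates`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToyUpdate:
    """Hypothetical streamed chunk: a role plus a text fragment."""
    role: str
    text: str


def merge_updates(updates: list) -> list:
    """Hypothetical merge: joins consecutive chunks that share a role."""
    merged = []
    for update in updates:
        if merged and merged[-1][0] == update.role:
            merged[-1] = (update.role, merged[-1][1] + update.text)
        else:
            merged.append((update.role, update.text))
    return merged


async def fake_workflow_stream():
    """Hypothetical workflow that streams one assistant message in chunks."""
    for chunk in ("Hel", "lo ", "there"):
        yield ToyUpdate("assistant", chunk)


async def run_stream(context: dict):
    all_updates = []
    async for update in fake_workflow_stream():
        all_updates.append(update)  # keep a copy for later reconstruction
        yield update                # still forward each chunk to the caller
    if all_updates:
        # Runs once the stream is exhausted, before any after-run step.
        context["response"] = merge_updates(all_updates)


async def main():
    context = {"response": None}
    seen = [update.text async for update in run_stream(context)]
    return seen, context["response"]


seen, final = asyncio.run(main())
```

Note that in this sketch the reconstruction happens only when the caller consumes the stream to the end; a caller that abandons the generator early would never reach the code after the loop.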

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/packages/core/agent_framework/_workflows/_agent.py | Ensures after-run providers can persist assistant outputs by populating SessionContext._response for both streaming and non-streaming workflow runs. |
| python/packages/core/tests/workflow/test_workflow_agent.py | Adds regression tests validating assistant responses are persisted across turns (including streaming and session round-trip). |

@moonbox3 moonbox3 added this pull request to the merge queue Feb 26, 2026
Merged via the queue into microsoft:main with commit 6f7e55c Feb 26, 2026
34 checks passed

Development

Successfully merging this pull request may close these issues.

Python: Unexpected thread behaviour across workflow agents

5 participants