
fix: resolve followup message failure after frontend tool calls #2

Open
mme wants to merge 24 commits into main from feat/fix-followup-after-frontend-tools

mme commented Mar 19, 2026

Summary

  • Fix Bedrock ValidationException ("Expected toolResult blocks") on follow-up messages after using shared state (Canvas/Todos) or any frontend tool call
  • Three fixes in copilotkit/copilotkit_lg_middleware.py:
    • Position-preserving dedup: the real tool result replaces the placeholder in-place instead of being kept at the wrong position (appended at the end)
    • Adjacency-based unanswered check: only ToolMessages immediately following an AIMessage count as "answered"
    • Orphan cleanup: removes leftover ToolMessages after stripping unanswered tool_calls
  • Add app mode UI (Chat/App toggle) with enableAppMode/enableChatMode frontend tools
  • Add error handling in agent entrypoint (emits RunErrorEvent instead of silent crash)
  • Replace debug print statements with logging
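
A minimal sketch of the entrypoint error handling described above, not the code in this PR. It assumes events are plain dicts with a `type` key (the real AG-UI events are model objects), and `forward_events` and `collect` are hypothetical names. It also drops `None` events and logs through `logging` rather than `print`.

```python
import asyncio
import logging
from typing import AsyncIterator, List

logger = logging.getLogger(__name__)


async def forward_events(agent_events: AsyncIterator[dict]) -> AsyncIterator[dict]:
    """Forward agent events; on failure emit a RUN_ERROR event instead of crashing."""
    try:
        async for event in agent_events:
            if event is None:  # skip empty events rather than forwarding them
                continue
            yield event
    except Exception as exc:
        # Log with traceback, then tell the client the run failed.
        logger.exception("agent run failed")
        yield {"type": "RUN_ERROR", "message": str(exc)}


def collect(events: AsyncIterator[dict]) -> List[dict]:
    """Helper for trying the sketch: drain an async event stream into a list."""
    async def _drain() -> List[dict]:
        return [event async for event in events]
    return asyncio.run(_drain())
```

With this shape the client always receives an explicit error event when the agent raises, so the UI can surface the failure instead of waiting on a stream that ended without explanation.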

Test plan

  • Pie chart (Controlled Generative UI)
  • Bar chart (Controlled Generative UI)
  • MCP apps (Open Generative UI / Excalidraw)
  • Change theme (Frontend Tools)
  • Scheduling (Human In The Loop)
  • Canvas (Shared State) + follow-up messages

blove and others added 24 commits March 13, 2026 15:53
Adds end-to-end CopilotKit v2 chat support — streaming generative UI
(pie chart), frontend action tools, and persistent memory — running on
Bedrock AgentCore with Claude Sonnet via the Converse API.

## Backend (patterns/langgraph-single-agent)

### langgraph_agent.py — complete rewrite for AgentCore + CopilotKit
- Replaces `create_react_agent` + raw streaming with `create_agent`
  (from `langchain.agents`) using `CopilotKitMiddleware`, connecting the
  LangGraph agent to CopilotKit's frontend action / generative-UI protocol
- Adopts `BedrockAgentCoreApp` (`@app.entrypoint`) instead of FastAPI,
  keeping AgentCore's native invocation model
- MCP tools fetched per-request from the AgentCore Gateway via
  `MultiServerMCPClient` with a fresh OAuth2 token each time (avoids
  token-expiry in long-running processes)
- Actor identity resolved from `forwardedProps` keys
  (`actor_id`/`user_id`) or the `sub` claim in the Cognito Bearer JWT,
  then threaded through `AgentCoreMemorySaver` so each user's history is
  isolated
- `ActorAwareLangGraphAgent(LangGraphAGUIAgent)` adds three overrides
  required by `AgentCoreMemorySaver`:
  1. `_filter_orphan_tool_messages` — restores `tool_calls` stripped by
     `clean_orphan_tool_calls` (frontend tools have no ToolMessage in the
     checkpoint); ensures MESSAGES_SNAPSHOT carries `toolCalls` so the
     rendered component (e.g. pie chart) is not removed when the snapshot
     overwrites client state
  2. `langgraph_default_merge_state` — prepends repaired AIMessages when
     Run 2 (CopilotKit follow-up) adds a ToolMessage; without this,
     `_fix_messages_for_bedrock` strips the `tool_use` content block
     (because `tool_calls=[]`) → orphan ToolMessage → Bedrock API error
  3. `get_checkpoint_before_message` — injects `actor_id` into the
     LangGraph config for time-travel / edit history lookups
- `serialize_agui_event` / terminal-event guard in `invocations` ensure
  the AG-UI stream always ends with `RUN_FINISHED` or `RUN_ERROR`
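
The actor-identity bullet above can be illustrated with a small sketch. This is an assumption-laden illustration, not the code in `langgraph_agent.py`: `resolve_actor_id` and `jwt_claims` are hypothetical names, the lookup order (`actor_id`, then `user_id`, then `sub`) is assumed, and the JWT payload is decoded without signature verification on the assumption that the token was already validated upstream.

```python
import base64
import json
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def resolve_actor_id(forwarded_props: dict, authorization: Optional[str]) -> Optional[str]:
    """Prefer an explicit actor_id/user_id, else fall back to the JWT `sub` claim."""
    for key in ("actor_id", "user_id"):  # assumed precedence
        value = forwarded_props.get(key)
        if value:
            return str(value)
    if authorization and authorization.lower().startswith("bearer "):
        return jwt_claims(authorization[len("bearer "):].strip()).get("sub")
    return None
```

The resolved value would then be placed in the LangGraph config so the checkpointer can key each user's history separately.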

### requirements.txt
- Replaces `langgraph` + `langchain-aws` stubs with the full dependency
  set: `ag-ui-protocol`, `partialjson`, `langgraph==1.0.10rc1`,
  `langchain-aws==1.0.0`, `langchain-mcp-adapters`, `copilotkit` (local
  vendor), `bedrock-agentcore`, and pinned versions for reproducibility

### Dockerfile
- Installs the local `copilotkit/` and `ag_ui_langgraph/` vendor packages
  via `pip install -e` so no PyPI dependency is needed for these patched
  libraries

## Vendored packages (copilotkit/ and ag_ui_langgraph/)

Local copies of CopilotKit SDK and ag-ui-langgraph with Bedrock
Converse API compatibility patches (sourced from PR mme:mme/local-copilotkit):

### copilotkit/copilotkit_lg_middleware.py — CopilotKitMiddleware
- `_fix_messages_for_bedrock`: strips unanswered `tool_calls` and syncs
  `tool_use` content blocks before each Bedrock model call, preventing
  the validation errors the Converse API raises when a `toolUse` block
  is not followed by its matching `toolResult`
- `before_agent`: injects app context from `copilotkit.context` as a
  `SystemMessage` at the start of each agent turn
- `after_model` / `after_agent`: intercepts frontend tool calls so they
  are not forwarded to `ToolNode`, then restores them to the checkpoint
  after the agent exits (enables CopilotKit's frontend-action loop)
- `awrap_model_call`: merges frontend tool definitions into the model
  request so the LLM can call them

### ag_ui_langgraph/agent.py — LangGraphAGUIAgent
- `langgraph_default_merge_state`: fixes string `args` in checkpoint
  `tool_calls`, replaces fake ToolMessages injected by
  `patch_orphan_tool_calls` with the real AG-UI result, and deduplicates
  tool definitions across runs
- `_filter_orphan_tool_messages` / `_ORPHAN_TOOL_MSG_RE`: removes fake
  ToolMessages (pattern: "Tool call '…' with id '…' was interrupted
  before completion.") that AgentCore's saver injects when a tool call
  has no matching result in the checkpoint
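
For illustration, a pattern of the kind `_ORPHAN_TOOL_MSG_RE` describes could look like the sketch below. The actual regex is not shown in this commit message, so the pattern here is reconstructed from the quoted placeholder text and is an assumption, as are the names `ORPHAN_RE` and `is_fake_tool_message`.

```python
import re

# Assumed pattern, rebuilt from the placeholder wording quoted above.
ORPHAN_RE = re.compile(
    r"^Tool call '(?P<name>.+?)' with id '(?P<id>.+?)' "
    r"was interrupted before completion\.$"
)


def is_fake_tool_message(content: str) -> bool:
    """Return True when the content looks like a saver-injected placeholder."""
    return ORPHAN_RE.match(content) is not None
```

Matching on the full sentence rather than a short substring lowers the chance of discarding a real tool result that merely mentions an interruption.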

## CopilotKit Lambda runtime (infra-cdk/lambdas/copilotkit-runtime/)

New TypeScript Lambda that sits between the frontend and the Python
AgentCore agent, acting as the CopilotKit server-side runtime:
- `CopilotRuntime` with `InMemoryAgentRunner` wraps the Python agent as
  an `HttpAgent` (AG-UI over HTTP)
- `CopilotKitRunner` (extends `InMemoryAgentRunner`) overrides `connect`:
  on reconnect it replays `TOOL_CALL_RESULT` events for every `toolCall`
  in the `MESSAGES_SNAPSHOT`, preventing CopilotKit's `processAgentResult`
  from re-triggering Run 2 when the user reloads the page
- Supports multiple named agents via env vars
  (`LANGGRAPH_AGENTCORE_AG_UI_URL`, `STRANDS_AGENTCORE_AG_UI_URL`) or a
  single `AGENTCORE_AG_UI_URL` fallback; agent selected by
  `COPILOTKIT_AGENT_NAME`

## Frontend (frontend/)

### CopilotChatInterface.tsx (new)
- Loads runtime config (`copilotKitRuntimeUrl`) from `aws-exports.json`
- Wraps `<CopilotKitProvider>` + `<CopilotChat>` with Cognito Bearer
  token forwarded as `Authorization` header

### Generative UI
- `PieChart.tsx`: Recharts-based pie chart component with legend,
  dark-mode support, and a Zod schema (`PieChartPropsSchema`) so
  CopilotKit can validate tool arguments before rendering
- `useGenerativeUi.ts`: registers `pieChart` as a controlled generative-UI
  component via `useComponent` from `@copilotkit/react-core/v2`
- `useExampleSuggestions.ts`: registers suggested prompts in the chat UI

### ChatPage / ChatInterface routing
- `ChatPage.tsx` routes to the new `CopilotChatInterface` when
  `copilotKitRuntimeUrl` is present in config; falls back to the existing
  `ChatInterface` otherwise
- `main.tsx` / `auth.ts` updated for compatibility

### package.json
- Adds `@copilotkit/react-core`, `@copilotkit/react-ui`, `recharts`, and
  `zod`; pins `@copilotkit/runtime-client-gql` for the v2 API surface

## Infrastructure

### CDK (infra-cdk/)
- `backend-stack.ts`: adds a `CopilotKitRuntimeFunction` (Node.js 22
  Lambda, 512 MB, 5 min timeout) with `LANGGRAPH_AGENTCORE_AG_UI_URL` /
  `STRANDS_AGENTCORE_AG_UI_URL` env vars pointing to the AgentCore
  runtime URLs; exposes `/copilotkit/{proxy+}` on the existing API
  Gateway with Cognito authorizer
- `fast-main-stack.ts`: threads the new Lambda construct through the
  stack output
- `config.yaml`: adds `copilotKitRuntimeUrl` to the frontend runtime
  config exported to `aws-exports.json`

### Terraform (infra-terraform/)
- `copilotkit_runtime.tf`: equivalent Lambda + IAM + API Gateway
  resources for Terraform deployments
- `locals.tf`, `outputs.tf`, `ssm.tf`: expose `copilotkit_runtime_url`
  as an SSM parameter and stack output
- `deploy-frontend.{py,sh}` / `scripts/deploy-frontend.py`: write
  `copilotKitRuntimeUrl` into `aws-exports.json` during frontend deploy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oint-aws 1.0.5

Upgrade langgraph-checkpoint-aws from 1.0.1 to 1.0.5. Version 1.0.5 uses
patch_orphan_tool_calls (injecting placeholder ToolMessages) instead of
clean_orphan_tool_calls (stripping tool_calls). The vendored ag_ui_langgraph
base class already handles patch_orphan_tool_calls correctly via
_filter_orphan_tool_messages and langgraph_default_merge_state, and
CopilotKitMiddleware handles Bedrock API errors via _fix_messages_for_bedrock.

Remove from ActorAwareLangGraphAgent:
- _reconstruct_tool_calls helper
- _filter_orphan_tool_messages override
- langgraph_default_merge_state override
- get_checkpoint_before_message override

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sage bugs

Fix two bugs in the LangGraph AG-UI agent when used with CopilotKit frontend tools:

**Fix 1: Infinite loop after frontend tool call (pie chart / createPieChart)**

Root cause: `patch_orphan_tool_calls` (langgraph-checkpoint-aws v1.0.5) generates a
new random ToolMessage ID on every checkpoint load. The previous approach tried to
replace the placeholder by ID in `stream_input["messages"]`, but the ID from the first
`aget_state()` call never matches the placeholder ID from the internal `astream_events`
reload — so the real result was appended alongside the placeholder, causing Bedrock to
reject duplicate `toolResult` IDs (ValidationException).

Fix: added step 4 to `_fix_messages_for_bedrock` in `CopilotKitMiddleware` to
deduplicate `ToolMessage`s by `tool_call_id` in-place before each Converse API call,
keeping the real result over any "interrupted before completion" placeholder. Simplified
`langgraph_default_merge_state` to pass incoming `ToolMessage`s through without
attempting unreliable ID-based replacement.

Also upgrades `langgraph-checkpoint-aws` from 1.0.1 to 1.0.5 which switches from
`clean_orphan_tool_calls` (removes tool_calls, breaking continuation) to
`patch_orphan_tool_calls` (adds placeholder ToolMessages, enabling continuation).

**Fix 2: RUN_ERROR on multi-message follow-up turns**

Root cause: when the checkpoint had more messages than the incoming request, the
agent always triggered time-travel regeneration — including for legitimate follow-up
turns (new user message after a prior exchange).

Fix: checks whether all incoming non-ToolMessage IDs are already in the checkpoint
before deciding to time-travel. If they are, it's a continuation; only time-travel
when the last user message ID exists in the checkpoint but the incoming messages are
a proper subset (indicating a genuine re-generation request).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oList, TodoCanvas)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add explicit typed cast for agent.state in TodoCanvas to avoid untyped access
- Import Todo type in TodoCanvas
- Prevent saving empty values in TodoCard.saveEdit by keeping the editor open
- Remove redundant `as "pending" | "completed"` cast in TodoList.toggleStatus
- Remove dead null-check on non-nullable todos array in TodoList

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Kit registrations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s into CopilotChatInterface

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… infrastructure

Introduces a query_data LangChain tool backed by a 40-row financial CSV (db.csv),
updates tools/__init__.py to export it, adds three passing unit tests, and adds
conftest.py files that resolve the repo-root tools/ vs agent tools/ sys.path conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dos tools

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove manual terminal event tracking, exception wrapping, and
serialize_agui_event helper from the invocations entrypoint — just
forward events via model_dump, matching the FastAPI endpoint pattern.
Also set admin_user_email and fix strands runtime ARN fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire @ag-ui/mcp-apps-middleware into the CopilotKit runtime Lambda so
agents can use Excalidraw's MCP server for open generative UI. Add the
corresponding "network diagram" suggestion to the frontend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix Bedrock ValidationException ("Expected toolResult blocks") that occurred on
follow-up messages after using shared state (Canvas/Todos). The root cause was
that _fix_messages_for_bedrock deduplicated ToolMessages by dropping the
placeholder (correct position, adjacent to AI message) and keeping the real
result (wrong position, appended at end). Bedrock requires toolResult blocks
immediately after their corresponding toolUse.
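
A minimal sketch of position-preserving dedup, assuming a simplified `ToolMsg` stand-in for LangChain's `ToolMessage` and a substring check for the placeholder text. The function and class names are hypothetical and the middleware's real implementation may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToolMsg:
    """Hypothetical stand-in for a ToolMessage; only the needed fields."""
    tool_call_id: str
    content: str


PLACEHOLDER_MARKER = "was interrupted before completion"


def is_placeholder(msg: ToolMsg) -> bool:
    return PLACEHOLDER_MARKER in msg.content


def dedup_preserving_position(messages: List[object]) -> List[object]:
    """Keep one ToolMsg per tool_call_id, in the slot of its first occurrence."""
    slot: Dict[str, int] = {}  # tool_call_id -> index in `out`
    out: List[object] = []
    for msg in messages:
        if not isinstance(msg, ToolMsg):
            out.append(msg)
            continue
        idx = slot.get(msg.tool_call_id)
        if idx is None:
            slot[msg.tool_call_id] = len(out)
            out.append(msg)
        elif is_placeholder(out[idx]) and not is_placeholder(msg):
            out[idx] = msg  # real result takes over the placeholder's position
        # any other duplicate is dropped
    return out
```

Because the real result is written into the placeholder's slot, it stays adjacent to the AI message that issued the call, which is the ordering the paragraph above says Bedrock requires.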

Three fixes in copilotkit_lg_middleware.py:
- Dedup now replaces placeholder in-place with real result, preserving position
- Unanswered tool_call detection uses adjacency check instead of global lookup
- Orphan ToolMessages are cleaned up after stripping tool_calls
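
The adjacency check and orphan cleanup listed above can be sketched together. This is an illustration under stated assumptions, not the middleware code: `AIMsg` and `ToolMsg` are simplified stand-ins for the LangChain message classes, tool calls are dicts with an `id` key, and `strip_unanswered` is a hypothetical name.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AIMsg:
    content: str
    tool_calls: List[dict] = field(default_factory=list)


@dataclass
class ToolMsg:
    tool_call_id: str
    content: str


def strip_unanswered(messages: List[object]) -> List[object]:
    """Keep only tool_calls answered by ToolMsgs directly after the AIMsg."""
    out: List[object] = []
    i, n = 0, len(messages)
    while i < n:
        msg = messages[i]
        if isinstance(msg, AIMsg) and msg.tool_calls:
            # Gather the run of ToolMsgs immediately following this AIMsg.
            j = i + 1
            following: List[ToolMsg] = []
            while j < n and isinstance(messages[j], ToolMsg):
                following.append(messages[j])
                j += 1
            answered = {t.tool_call_id for t in following}
            kept_calls = [c for c in msg.tool_calls if c["id"] in answered]
            kept_ids = {c["id"] for c in kept_calls}
            out.append(AIMsg(msg.content, kept_calls))
            out.extend(t for t in following if t.tool_call_id in kept_ids)
            i = j
        elif isinstance(msg, ToolMsg):
            i += 1  # orphan: not directly behind an AIMsg with tool_calls
        else:
            out.append(msg)
            i += 1
    return out
```

A global lookup would treat a late `ToolMsg` anywhere in the history as an answer. The adjacency rule strips that tool call instead, and the cleanup branch then removes the late message so no `toolResult` is left without a preceding `toolUse`.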

Also adds app mode UI (Chat/App toggle), error handling in agent entrypoint,
and None-event filtering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2 participants