
Failed A2A task errors leak into conversation history as regular content #4309

@filipecaixeta

Description


🔴 Required Information

Describe the Bug:
When a remote agent task fails (whether due to a tool exception or an internal error such as a sessions database connection failure), the error message is converted into regular conversation content instead of being treated as an error. The error is then propagated to downstream agents as artifacts/messages, polluting the conversation history and leaking internal details (such as SQL queries and stack traces) into the LLM prompt.

Steps to Reproduce:

Scenario 1: Session database failure

  1. Set up a multi-agent system with Agent B (router) calling Agent A (sub-agent) via A2A protocol
  2. Configure Agent B with a DatabaseSessionService connected to PostgreSQL
  3. Cause a database connection failure in Agent B (e.g., by restarting the database or waiting for connection timeout)
  4. Send a user message to Agent B that triggers a transfer to Agent A
  5. Observe that the SQLAlchemy error message becomes part of the conversation history and is visible to the user/LLM

Scenario 2: Tool failure propagated as artifact

  1. Create Agent A with a tool that raises an exception
  2. Have Agent B call Agent A via RemoteA2aAgent
  3. Have Agent B then transfer to Agent C
  4. Observe that Agent A's error message is sent to Agent C as an artifact (instead of being treated as an error)

Expected Behavior:

  1. When a task fails (state: "failed"), RemoteA2aAgent should create an Event with error_message set instead of content
  2. Events with error_message should NOT be included in the conversation history sent to the LLM
  3. The error should still be logged/propagated for debugging, but not pollute the prompt

Observed Behavior:

The failed task's error message is converted to event.content via convert_a2a_task_to_event(), making it part of the conversation history. When the router agent transfers to another agent, the error appears in the context.

Example conversation showing leaked error:

User: Last week performance YOY

[agent_marketing] called tool transfer_to_agent with parameters: {'agent_name': 'agent_analytics'}
[agent_marketing] transfer_to_agent tool returned result: {'result': None}
[agent_analytics] said: (sqlalchemy.dialects.postgresql.asyncpg.InterfaceError) <class 'asyncpg.exceptions._base.InterfaceError'>: connection is closed
[SQL: SELECT sessions.app_name AS sessions_app_name, sessions.user_id AS sessions_user_id, sessions.id AS sessions_id, sessions.state AS sessions_state, sessions.create_time AS sessions_create_time, sessions.update_time AS sessions_update_time
FROM sessions
WHERE sessions.app_name = $1::VARCHAR AND sessions.user_id = $2::VARCHAR AND sessions.id = $3::VARCHAR]
[parameters: ('agent_analytics', 'user@example.com', '93b169b8-35b4-4b8e-9c80-80f3c7f8f29c')]

Environment Details:

  • ADK Library Version: 1.1.0
  • Desktop OS: Linux (Kubernetes)
  • Python Version: 3.11

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used: gemini-2.5-pro

🟡 Optional Information

Regression:
N/A - this appears to be existing behavior

Logs:

Scenario 1: Agent B's session database fails, error is returned to the caller and becomes part of the conversation.

Scenario 2: Agent A sends this failed task response to Agent B:

{
  "id": "097e4d0b-d8c8-4b2b-9bb9-136b547004e3",
  "jsonrpc": "2.0",
  "result": {
    "contextId": "734279ec-91d4-4ae0-9f7b-73094f247500",
    "final": true,
    "kind": "status-update",
    "status": {
      "message": {
        "kind": "message",
        "messageId": "ffa48ac4-8bc3-42b3-9d2f-f06e80eec66e",
        "parts": [
          {
            "kind": "text",
            "text": "my exception message"
          }
        ],
        "role": "agent"
      },
      "state": "failed",
      "timestamp": "2026-01-28T18:20:03.090828+00:00"
    },
    "taskId": "6aaf8f8b-ff60-4d78-8672-6566b5821d58"
  }
}
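Note that the failed state is unambiguous in the payload above: `result.status.state` is `"failed"`. A caller-side check for this condition is trivial, as this illustrative sketch shows (not actual ADK code):

```python
import json

def is_failed_task(payload: str) -> bool:
    """Return True if an A2A JSON-RPC result reports a failed task status."""
    result = json.loads(payload).get("result", {})
    status = result.get("status") or {}
    return status.get("state") == "failed"

# The status-update above would be detected as failed:
assert is_failed_task('{"result": {"status": {"state": "failed"}}}')
```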

Agent B then sends this to Agent C (error has become an artifact):

{
  "id": "5deb9c00-7de5-4c85-b482-8a7be84fc507",
  "jsonrpc": "2.0",
  "result": {
    "artifact": {
      "artifactId": "c5ed0c8d-0389-4482-8cc1-4c756e17d75d",
      "parts": [
        {
          "kind": "text",
          "text": "my exception message"
        }
      ]
    },
    "contextId": "18caacab-edaa-4e70-b218-5737a677cf72",
    "kind": "artifact-update",
    "lastChunk": true,
    "taskId": "eb3b41d3-51da-46ec-807d-6b3d20afe4fb"
  }
}

Screenshots / Video:
N/A

Additional Context:

The root cause is in RemoteA2aAgent._handle_a2a_response() (google/adk/agents/remote_a2a_agent.py). When processing a failed task, it calls convert_a2a_task_to_event() which puts the error message in event.content. There's no check for TaskState.failed to handle errors differently.

Proposed fix: After convert_a2a_task_to_event(), check if task.status.state == TaskState.failed and if so, create an Event with error_message set (and content empty):

# After convert_a2a_task_to_event() has produced `event`:
if task and task.status and task.status.state == TaskState.failed:
    error_text = "Remote agent task failed"
    # Salvage any human-readable text from the converted content.
    if event.content and event.content.parts:
        text_parts = [p.text for p in event.content.parts if hasattr(p, 'text') and p.text]
        if text_parts:
            error_text = " ".join(text_parts)
    # Replace the content-bearing event with an error-only event.
    event = Event(
        author=self.name,
        error_message=error_text,
        invocation_id=ctx.invocation_id,
        branch=ctx.branch,
    )

The same fix should be applied to streaming task status updates (A2ATaskStatusUpdateEvent handling).

Minimal Reproduction Code:

from google.adk.agents import LlmAgent
from google.adk.agents.remote_a2a_agent import RemoteA2aAgent

def failing_tool() -> str:
    """Tool that always raises, simulating a remote task failure."""
    raise RuntimeError("my exception message")

# Agent A - will fail (e.g., tool exception or session DB failure)
agent_a = LlmAgent(
    name="agent_a",
    model="gemini-2.5-flash",
    tools=[failing_tool],
)

# Agent B - router that calls Agent A
agent_b = LlmAgent(
    name="agent_b",
    model="gemini-2.5-flash",
    sub_agents=[
        RemoteA2aAgent(
            name="agent_a",
            agent_card="http://agent-a:8080/.well-known/agent-card.json",
        ),
    ],
)

# When agent_b transfers to agent_a and agent_a fails,
# the error message becomes part of agent_b's conversation history

How often has this issue occurred?:

  • Always (100%) - whenever a remote agent task fails

Labels: a2a [Component] — this issue relates to a2a support inside ADK.