diff --git a/docs/docs.json b/docs/docs.json index 727443ac..dc0fc8d2 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -82,12 +82,24 @@ "group": "Draft: In Progress and May Change", "hidden": true, "pages": [ + "protocol/draft/overview", "protocol/draft/authentication", + "protocol/draft/initialization", "protocol/draft/session-setup", "protocol/draft/session-list", "protocol/draft/session-delete", + "protocol/draft/prompt-turn", + "protocol/draft/content", + "protocol/draft/tool-calls", "protocol/draft/file-system", "protocol/draft/cancellation", + "protocol/draft/terminals", + "protocol/draft/agent-plan", + "protocol/draft/session-modes", + "protocol/draft/session-config-options", + "protocol/draft/slash-commands", + "protocol/draft/extensibility", + "protocol/draft/transports", "protocol/draft/schema" ] } diff --git a/docs/protocol/draft/agent-plan.mdx b/docs/protocol/draft/agent-plan.mdx new file mode 100644 index 00000000..874572a7 --- /dev/null +++ b/docs/protocol/draft/agent-plan.mdx @@ -0,0 +1,83 @@ +--- +title: "Agent Plan" +description: "How Agents communicate their execution plans" +--- + +Plans are execution strategies for complex tasks that require multiple steps. + +Agents may share plans with Clients through [`session/update`](./prompt-turn#3-agent-reports-output) notifications, providing real-time visibility into their thinking and progress. + +## Creating Plans + +When the language model creates an execution plan, the Agent **SHOULD** report it to the Client: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "plan", + "entries": [ + { + "content": "Analyze the existing codebase structure", + "priority": "high", + "status": "pending" + }, + { + "content": "Identify components that need refactoring", + "priority": "high", + "status": "pending" + }, + { + "content": "Create unit tests for critical functions", + "priority": "medium", + "status": "pending" + } + ] + } + } +} +``` + + + An array of [plan entries](#plan-entries) representing the tasks to be + accomplished + + +## Plan Entries + +Each plan entry represents a specific task or goal within the overall execution strategy: + + + A human-readable description of what this task aims to accomplish + + + + The relative importance of this task. + +- `high` +- `medium` +- `low` + + + + + The current [execution status](#status) of this task + +- `pending` +- `in_progress` +- `completed` + + + +## Updating Plans + +As the Agent progresses through the plan, it **SHOULD** report updates by sending more `session/update` notifications with the same structure. + +The Agent **MUST** send a complete list of all plan entries in each update and their current status. The Client **MUST** replace the current plan completely. + +### Dynamic Planning + +Plans can evolve during execution. The Agent **MAY** add, remove, or modify plan entries as it discovers new requirements or completes tasks, allowing it to adapt based on what it learns. diff --git a/docs/protocol/draft/content.mdx b/docs/protocol/draft/content.mdx new file mode 100644 index 00000000..931f451e --- /dev/null +++ b/docs/protocol/draft/content.mdx @@ -0,0 +1,204 @@ +--- +title: "Content" +description: "Understanding content blocks in the Agent Client Protocol" +--- + +Content blocks represent displayable information that flows through the Agent Client Protocol. They provide a structured way to handle various types of user-facing content—whether it's text from language models, images for analysis, or embedded resources for context. + +Content blocks appear in: + +- User prompts sent via [`session/prompt`](./prompt-turn#1-user-message) +- Language model output streamed through [`session/update`](./prompt-turn#3-agent-reports-output) notifications +- Progress updates and results from [tool calls](./tool-calls) + +## Content Types + +The Agent Client Protocol uses the same `ContentBlock` structure as the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/specification/2025-06-18/schema#contentblock). + +This design choice enables Agents to seamlessly forward content from MCP tool outputs without transformation. + +### Text Content + +Plain text messages form the foundation of most interactions. + +```json +{ + "type": "text", + "text": "What's the weather like today?" +} +``` + +All Agents **MUST** support text content blocks when included in prompts. + + + The text content to display + + + + Optional metadata about how the content should be used or displayed. [Learn + more](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). + + +### Image Content + +Images can be included for visual context or analysis. + +```json +{ + "type": "image", + "mimeType": "image/png", + "data": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAAB..." +} +``` + + Requires the `image` [prompt +capability](./initialization#prompt-capabilities) when included in prompts. + + + Base64-encoded image data + + + + The MIME type of the image (e.g., "image/png", "image/jpeg") + + + + Optional URI reference for the image source + + + + Optional metadata about how the content should be used or displayed. [Learn + more](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). + + +### Audio Content + +Audio data for transcription or analysis. + +```json +{ + "type": "audio", + "mimeType": "audio/wav", + "data": "UklGRiQAAABXQVZFZm10IBAAAAABAAEAQB8AAAB..." +} +``` + + Requires the `audio` [prompt +capability](./initialization#prompt-capabilities) when included in prompts. + + + Base64-encoded audio data + + + + The MIME type of the audio (e.g., "audio/wav", "audio/mp3") + + + + Optional metadata about how the content should be used or displayed. [Learn + more](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). + + +### Embedded Resource + +Complete resource contents embedded directly in the message. + +```json +{ + "type": "resource", + "resource": { + "uri": "file:///home/user/script.py", + "mimeType": "text/x-python", + "text": "def hello():\n print('Hello, world!')" + } +} +``` + +This is the preferred way to include context in prompts, such as when using @-mentions to reference files or other resources. + +By embedding the content directly in the request, Clients can include context from sources that the Agent may not have direct access to. + + Requires the `embeddedContext` [prompt +capability](./initialization#prompt-capabilities) when included in prompts. + + + The embedded resource contents, which can be either: + + + + The URI identifying the resource + + + + The text content of the resource + + + + Optional MIME type of the text content + + + + + + + The URI identifying the resource + + + + Base64-encoded binary data + + + + Optional MIME type of the blob + + + + + + + Optional metadata about how the content should be used or displayed. [Learn + more](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). + + +### Resource Link + +References to resources that the Agent can access. + +```json +{ + "type": "resource_link", + "uri": "file:///home/user/document.pdf", + "name": "document.pdf", + "mimeType": "application/pdf", + "size": 1024000 +} +``` + + + The URI of the resource + + + + A human-readable name for the resource + + + + The MIME type of the resource + + + + Optional display title for the resource + + + + Optional description of the resource contents + + + + Optional size of the resource in bytes + + + + Optional metadata about how the content should be used or displayed. [Learn + more](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). + diff --git a/docs/protocol/draft/error.mdx b/docs/protocol/draft/error.mdx new file mode 100644 index 00000000..01d1aba3 --- /dev/null +++ b/docs/protocol/draft/error.mdx @@ -0,0 +1,6 @@ +--- +title: "Error" +description: "Error handling in the Agent Client Protocol" +--- + +_Documentation coming soon_ diff --git a/docs/protocol/draft/extensibility.mdx b/docs/protocol/draft/extensibility.mdx new file mode 100644 index 00000000..31441013 --- /dev/null +++ b/docs/protocol/draft/extensibility.mdx @@ -0,0 +1,134 @@ +--- +title: "Extensibility" +description: "Adding custom data and capabilities" +--- + +The Agent Client Protocol provides built-in extension mechanisms that allow implementations to add custom functionality while maintaining compatibility with the core protocol. These mechanisms ensure that Agents and Clients can innovate without breaking interoperability. + +## The `_meta` Field + +All types in the protocol include a `_meta` field with type `{ [key: string]: unknown }` that implementations can use to attach custom information. This includes requests, responses, notifications, and even nested types like content blocks, tool calls, plan entries, and capability objects. + +```json +{ + "jsonrpc": "2.0", + "id": 1, + "method": "session/prompt", + "params": { + "sessionId": "sess_abc123def456", + "prompt": [ + { + "type": "text", + "text": "Hello, world!" + } + ], + "_meta": { + "traceparent": "00-80e1afed08e019fc1110464cfa66635c-7a085853722dc6d2-01", + "zed.dev/debugMode": true + } + } +} +``` + +Clients may propagate fields to the agent for correlation purposes, such as `requestId`. The following root-level keys in `_meta` **SHOULD** be reserved for [W3C trace context](https://www.w3.org/TR/trace-context/) to guarantee interop with existing MCP implementations and OpenTelemetry tooling: + +- `traceparent` +- `tracestate` +- `baggage` + +Implementations **MUST NOT** add any custom fields at the root of a type that's part of the specification. All possible names are reserved for future protocol versions. + +## Extension Methods + +The protocol reserves any method name starting with an underscore (`_`) for custom extensions. This allows implementations to add new functionality without the risk of conflicting with future protocol versions. + +Extension methods follow standard [JSON-RPC 2.0](https://www.jsonrpc.org/specification) semantics: + +- **[Requests](https://www.jsonrpc.org/specification#request_object)** - Include an `id` field and expect a response +- **[Notifications](https://www.jsonrpc.org/specification#notification)** - Omit the `id` field and are one-way + +### Custom Requests + +In addition to the requests specified by the protocol, implementations **MAY** expose and call custom JSON-RPC requests as long as their name starts with an underscore (`_`). + +```json +{ + "jsonrpc": "2.0", + "id": 1, + "method": "_zed.dev/workspace/buffers", + "params": { + "language": "rust" + } +} +``` + +Upon receiving a custom request, implementations **MUST** respond accordingly with the provided `id`: + +```json +{ + "jsonrpc": "2.0", + "id": 1, + "result": { + "buffers": [ + { "id": 0, "path": "/home/user/project/src/main.rs" }, + { "id": 1, "path": "/home/user/project/src/editor.rs" } + ] + } +} +``` + +If the receiving end doesn't recognize the custom method name, it should respond with the standard "Method not found" error: + +```json +{ + "jsonrpc": "2.0", + "id": 1, + "error": { + "code": -32601, + "message": "Method not found" + } +} +``` + +To avoid such cases, extensions **SHOULD** advertise their [custom capabilities](#advertising-custom-capabilities) so that callers can check their availability first and adapt their behavior or interface accordingly. + +### Custom Notifications + +Custom notifications are regular JSON-RPC notifications that start with an underscore (`_`). Like all notifications, they omit the `id` field: + +```json +{ + "jsonrpc": "2.0", + "method": "_zed.dev/file_opened", + "params": { + "path": "/home/user/project/src/editor.rs" + } +} +``` + +Unlike with custom requests, implementations **SHOULD** ignore unrecognized notifications. + +## Advertising Custom Capabilities + +Implementations **SHOULD** use the `_meta` field in capability objects to advertise support for extensions and their methods: + +```json +{ + "jsonrpc": "2.0", + "id": 0, + "result": { + "protocolVersion": 1, + "agentCapabilities": { + "loadSession": true, + "_meta": { + "zed.dev": { + "workspace": true, + "fileNotifications": true + } + } + } + } +} +``` + +This allows implementations to negotiate custom features during initialization without breaking compatibility with standard Clients and Agents. diff --git a/docs/protocol/draft/initialization.mdx b/docs/protocol/draft/initialization.mdx new file mode 100644 index 00000000..50307e71 --- /dev/null +++ b/docs/protocol/draft/initialization.mdx @@ -0,0 +1,225 @@ +--- +title: "Initialization" +description: "How all Agent Client Protocol connections begin" +--- + +{/* todo! link to all concepts */} + +The Initialization phase allows [Clients](./overview#client) and [Agents](./overview#agent) to negotiate protocol versions, capabilities, and authentication methods. + +
+ +```mermaid +sequenceDiagram + participant Client + participant Agent + + Note over Client, Agent: Connection established + Client->>Agent: initialize + Note right of Agent: Negotiate protocol
version & capabilities + Agent-->>Client: initialize response + Note over Client,Agent: Ready for session setup +``` + +
+ +Before a Session can be created, Clients **MUST** initialize the connection by calling the `initialize` method with: + +- The latest [protocol version](#protocol-version) supported +- The [capabilities](#client-capabilities) supported + +They **SHOULD** also provide a name and version to the Agent. + +```json +{ + "jsonrpc": "2.0", + "id": 0, + "method": "initialize", + "params": { + "protocolVersion": 1, + "clientCapabilities": { + "fs": { + "readTextFile": true, + "writeTextFile": true + }, + "terminal": true + }, + "clientInfo": { + "name": "my-client", + "title": "My Client", + "version": "1.0.0" + } + } +} +``` + +The Agent **MUST** respond with the chosen [protocol version](#protocol-version) and the [capabilities](#agent-capabilities) it supports. It **SHOULD** also provide a name and version to the Client as well: + +```json +{ + "jsonrpc": "2.0", + "id": 0, + "result": { + "protocolVersion": 1, + "agentCapabilities": { + "loadSession": true, + "promptCapabilities": { + "image": true, + "audio": true, + "embeddedContext": true + }, + "mcpCapabilities": { + "http": true, + "sse": true + } + }, + "agentInfo": { + "name": "my-agent", + "title": "My Agent", + "version": "1.0.0" + }, + "authMethods": [] + } +} +``` + +## Protocol version + +The protocol versions that appear in the `initialize` requests and responses are a single integer that identifies a **MAJOR** protocol version. This version is only incremented when breaking changes are introduced. + +Clients and Agents **MUST** agree on a protocol version and act according to its specification. + +See [Capabilities](#capabilities) to learn how non-breaking features are introduced. + +### Version Negotiation + +The `initialize` request **MUST** include the latest protocol version the Client supports. + +If the Agent supports the requested version, it **MUST** respond with the same version. Otherwise, the Agent **MUST** respond with the latest version it supports. + +If the Client does not support the version specified by the Agent in the `initialize` response, the Client **SHOULD** close the connection and inform the user about it. + +## Capabilities + +Capabilities describe features supported by the Client and the Agent. + +All capabilities included in the `initialize` request are **OPTIONAL**. Clients and Agents **SHOULD** support all possible combinations of their peer's capabilities. + +The introduction of new capabilities is not considered a breaking change. Therefore, Clients and Agents **MUST** treat all capabilities omitted in the `initialize` request as **UNSUPPORTED**. + +Capabilities are high-level and are not attached to a specific base protocol concept. + +Capabilities may specify the availability of protocol methods, notifications, or a subset of their parameters. They may also signal behaviors of the Agent or Client implementation. + +Implementations can also [advertise custom capabilities](./extensibility#advertising-custom-capabilities) using the `_meta` field to indicate support for protocol extensions. + +### Client Capabilities + +The Client **SHOULD** specify whether it supports the following capabilities: + +#### File System + + + The `fs/read_text_file` method is available. + + + + The `fs/write_text_file` method is available. + + + + Learn more about File System methods + + +#### Terminal + + + All `terminal/*` methods are available, allowing the Agent to execute and + manage shell commands. + + + + Learn more about Terminals + + +### Agent Capabilities + +The Agent **SHOULD** specify whether it supports the following capabilities: + + + The [`session/load`](./session-setup#loading-sessions) method is available. + + + + Object indicating the different types of [content](./content) that may be + included in `session/prompt` requests. + + +#### Prompt capabilities + +As a baseline, all Agents **MUST** support `ContentBlock::Text` and `ContentBlock::ResourceLink` in `session/prompt` requests. + +Optionally, they **MAY** support richer types of [content](./content) by specifying the following capabilities: + + + The prompt may include `ContentBlock::Image` + + + + The prompt may include `ContentBlock::Audio` + + + + The prompt may include `ContentBlock::Resource` + + +#### MCP capabilities + + + The Agent supports connecting to MCP servers over HTTP. + + + + The Agent supports connecting to MCP servers over SSE. + +Note: This transport has been deprecated by the MCP spec. + + + +#### Session Capabilities + +As a baseline, all Agents **MUST** support `session/new`, `session/prompt`, `session/cancel`, and `session/update`. + +Optionally, they **MAY** support other session methods and notifications by specifying additional capabilities. + + + `session/load` is still handled by the top-level `load_session` capability. + This will be unified in future versions of the protocol. + + +## Implementation Information + +Both Clients and Agents **SHOULD** provide information about their implementation in the `clientInfo` and `agentInfo` fields respectively. Both take the following three fields: + + + Intended for programmatic or logical use, but can be used as a display name + fallback if title isn’t present. + + + + Intended for UI and end-user contexts — optimized to be human-readable and + easily understood. If not provided, the name should be used for display. + + + + Version of the implementation. Can be displayed to the user or used for + debugging or metrics purposes. + + + + Note: in future versions of the protocol, this information will be required. + + +--- + +Once the connection is initialized, you're ready to [create a session](./session-setup) and begin the conversation with the Agent. diff --git a/docs/protocol/draft/overview.mdx b/docs/protocol/draft/overview.mdx new file mode 100644 index 00000000..a8dbf1dd --- /dev/null +++ b/docs/protocol/draft/overview.mdx @@ -0,0 +1,214 @@ +--- +title: "Overview" +description: "How the Agent Client Protocol works" +--- + +The Agent Client Protocol allows [Agents](#agent) and [Clients](#client) to communicate by exposing methods that each side can call and sending notifications to inform each other of events. + +## Communication Model + +The protocol follows the [JSON-RPC 2.0](https://www.jsonrpc.org/specification) specification with two types of messages: + +- **Methods**: Request-response pairs that expect a result or error +- **Notifications**: One-way messages that don't expect a response + +## Message Flow + +A typical flow follows this pattern: + + + + +- Client → Agent: `initialize` to establish connection +- Client → Agent: `authenticate` if required by the Agent + + + + + +- Client → Agent: `session/new` to create a new session +- Client → Agent: `session/load` to resume an existing session if supported + + + + + - Client → Agent: `session/prompt` to send user message + - Agent → Client: `session/update` notifications for progress updates + - Agent → Client: File operations or permission requests as needed + - Client → Agent: `session/cancel` to interrupt processing if needed + - Turn ends and the Agent sends the `session/prompt` response with a stop reason + + + +## Agent + +Agents are programs that use generative AI to autonomously modify code. They typically run as subprocesses of the Client. + +### Baseline Methods + +Schema]} +> + [Negotiate versions and exchange capabilities.](./initialization). + + +Schema]} +> + Authenticate with the Agent (if required). + + +Schema]} +> + [Create a new conversation session](./session-setup#creating-a-session). + + +Schema]} +> + [Send user prompts](./prompt-turn#1-user-message) to the Agent. + + +### Optional Methods + +Schema]} +> + [Load an existing session](./session-setup#loading-sessions) (requires + `loadSession` capability). + + +Schema]} +> + [Switch between agent operating + modes](./session-modes#setting-the-current-mode). + + +### Notifications + +Schema]} +> + [Cancel ongoing operations](./prompt-turn#cancellation) (no response + expected). + + +## Client + +Clients provide the interface between users and agents. They are typically code editors (IDEs, text editors) but can also be other UIs for interacting with agents. Clients manage the environment, handle user interactions, and control access to resources. + +### Baseline Methods + +Schema]} +> + [Request user authorization](./tool-calls#requesting-permission) for tool + calls. + + +### Optional Methods + +Schema]} +> + [Read file contents](./file-system#reading-files) (requires `fs.readTextFile` + capability). + + +Schema]} +> + [Write file contents](./file-system#writing-files) (requires + `fs.writeTextFile` capability). + + +Schema]} +> + [Create a new terminal](./terminals) (requires `terminal` capability). + + +Schema]} +> + Get terminal output and exit status (requires `terminal` capability). + + +Schema]} +> + Release a terminal (requires `terminal` capability). + + +Schema]} +> + Wait for terminal command to exit (requires `terminal` capability). + + +Schema]} +> + Kill terminal command without releasing (requires `terminal` capability). + + +### Notifications + +Schema]} +> + [Send session updates](./prompt-turn#3-agent-reports-output) to inform the + Client of changes (no response expected). This includes: - [Message + chunks](./content) (agent, user, thought) - [Tool calls and + updates](./tool-calls) - [Plans](./agent-plan) - [Available commands + updates](./slash-commands#advertising-commands) - [Mode + changes](./session-modes#from-the-agent) + + +## Argument requirements + +- All file paths in the protocol **MUST** be absolute. +- Line numbers are 1-based + +## Error Handling + +All methods follow standard JSON-RPC 2.0 [error handling](https://www.jsonrpc.org/specification#error_object): + +- Successful responses include a `result` field +- Errors include an `error` object with `code` and `message` +- Notifications never receive responses (success or error) + +## Extensibility + +The protocol provides built-in mechanisms for adding custom functionality while maintaining compatibility: + +- Add custom data using `_meta` fields +- Create custom methods by prefixing their name with underscore (`_`) +- Advertise custom capabilities during initialization + +Learn about [protocol extensibility](./extensibility) to understand how to use these mechanisms. + +## Next Steps + +- Learn about [Initialization](./initialization) to understand version and capability negotiation +- Understand [Session Setup](./session-setup) for creating and loading sessions +- Review the [Prompt Turn](./prompt-turn) lifecycle +- Explore [Extensibility](./extensibility) to add custom features diff --git a/docs/protocol/draft/prompt-turn.mdx b/docs/protocol/draft/prompt-turn.mdx new file mode 100644 index 00000000..57e8df34 --- /dev/null +++ b/docs/protocol/draft/prompt-turn.mdx @@ -0,0 +1,319 @@ +--- +title: "Prompt Turn" +description: "Understanding the core conversation flow" +--- + +A prompt turn represents a complete interaction cycle between the [Client](./overview#client) and [Agent](./overview#agent), starting with a user message and continuing until the Agent completes its response. This may involve multiple exchanges with the language model and tool invocations. + +Before sending prompts, Clients **MUST** first complete the [initialization](./initialization) phase and [session setup](./session-setup). + +## The Prompt Turn Lifecycle + +A prompt turn follows a structured flow that enables rich interactions between the user, Agent, and any connected tools. + +
+ +```mermaid +sequenceDiagram + participant Client + participant Agent + + Note over Agent,Client: Session ready + + Note left of Client: User sends message + Client->>Agent: session/prompt (user message) + Note right of Agent: Process with LLM + + loop Until completion + Note right of Agent: LLM responds with
content/tool calls + Agent->>Client: session/update (plan) + Agent->>Client: session/update (agent_message_chunk) + + opt Tool calls requested + Agent->>Client: session/update (tool_call) + opt Permission required + Agent->>Client: session/request_permission + Note left of Client: User grants/denies + Client-->>Agent: Permission response + end + Agent->>Client: session/update (tool_call status: in_progress) + Note right of Agent: Execute tool + Agent->>Client: session/update (tool_call status: completed) + Note right of Agent: Send tool results
back to LLM + end + + opt User cancelled during execution + Note left of Client: User cancels prompt + Client->>Agent: session/cancel + Note right of Agent: Abort operations + Agent-->>Client: session/prompt response (cancelled) + end + end + + Agent-->>Client: session/prompt response (stopReason) + +``` + +### 1. User Message + +The turn begins when the Client sends a `session/prompt`: + +```json +{ + "jsonrpc": "2.0", + "id": 2, + "method": "session/prompt", + "params": { + "sessionId": "sess_abc123def456", + "prompt": [ + { + "type": "text", + "text": "Can you analyze this code for potential issues?" + }, + { + "type": "resource", + "resource": { + "uri": "file:///home/user/project/main.py", + "mimeType": "text/x-python", + "text": "def process_data(items):\n for item in items:\n print(item)" + } + } + ] + } +} +``` + + + The [ID](./session-setup#session-id) of the session to send this message to. + + + The contents of the user message, e.g. text, images, files, etc. + + Clients **MUST** restrict types of content according to the [Prompt Capabilities](./initialization#prompt-capabilities) established during [initialization](./initialization). + + + Learn more about Content + + + + +### 2. Agent Processing + +Upon receiving the prompt request, the Agent processes the user's message and sends it to the language model, which **MAY** respond with text content, tool calls, or both. + +### 3. Agent Reports Output + +The Agent reports the model's output to the Client via `session/update` notifications. This may include the Agent's plan for accomplishing the task: + +```json expandable +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "plan", + "entries": [ + { + "content": "Check for syntax errors", + "priority": "high", + "status": "pending" + }, + { + "content": "Identify potential type issues", + "priority": "medium", + "status": "pending" + }, + { + "content": "Review error handling patterns", + "priority": "medium", + "status": "pending" + }, + { + "content": "Suggest improvements", + "priority": "low", + "status": "pending" + } + ] + } + } +} +``` + + + Learn more about Agent Plans + + +The Agent then reports text responses from the model: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "agent_message_chunk", + "content": { + "type": "text", + "text": "I'll analyze your code for potential issues. Let me examine it..." + } + } + } +} +``` + +If the model requested tool calls, these are also reported immediately: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "tool_call", + "toolCallId": "call_001", + "title": "Analyzing Python code", + "kind": "other", + "status": "pending" + } + } +} +``` + +### 4. Check for Completion + +If there are no pending tool calls, the turn ends and the Agent **MUST** respond to the original `session/prompt` request with a `StopReason`: + +```json +{ + "jsonrpc": "2.0", + "id": 2, + "result": { + "stopReason": "end_turn" + } +} +``` + +Agents **MAY** stop the turn at any point by returning the corresponding [`StopReason`](#stop-reasons). + +### 5. Tool Invocation and Status Reporting + +Before proceeding with execution, the Agent **MAY** request permission from the Client via the `session/request_permission` method. + +Once permission is granted (if required), the Agent **SHOULD** invoke the tool and report a status update marking the tool as `in_progress`: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "tool_call_update", + "toolCallId": "call_001", + "status": "in_progress" + } + } +} +``` + +As the tool runs, the Agent **MAY** send additional updates, providing real-time feedback about tool execution progress. + +While tools execute on the Agent, they **MAY** leverage Client capabilities such as the file system (`fs`) methods to access resources within the Client's environment. + +When the tool completes, the Agent sends another update with the final status and any content: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "tool_call_update", + "toolCallId": "call_001", + "status": "completed", + "content": [ + { + "type": "content", + "content": { + "type": "text", + "text": "Analysis complete:\n- No syntax errors found\n- Consider adding type hints for better clarity\n- The function could benefit from error handling for empty lists" + } + } + ] + } + } +} +``` + + + Learn more about Tool Calls + + +### 6. Continue Conversation + +The Agent sends the tool results back to the language model as another request. + +The cycle returns to [step 2](#2-agent-processing), continuing until the language model completes its response without requesting additional tool calls or the turn gets stopped by the Agent or cancelled by the Client. + +## Stop Reasons + +When an Agent stops a turn, it must specify the corresponding `StopReason`: + + + The language model finishes responding without requesting more tools + + + + The maximum token limit is reached + + + + The maximum number of model requests in a single turn is exceeded + + +The Agent refuses to continue + +The Client cancels the turn + +## Cancellation + +Clients **MAY** cancel an ongoing prompt turn at any time by sending a `session/cancel` notification: + +```json +{ + "jsonrpc": "2.0", + "method": "session/cancel", + "params": { + "sessionId": "sess_abc123def456" + } +} +``` + +The Client **SHOULD** preemptively mark all non-finished tool calls pertaining to the current turn as `cancelled` as soon as it sends the `session/cancel` notification. + +The Client **MUST** respond to all pending `session/request_permission` requests with the `cancelled` outcome. + +When the Agent receives this notification, it **SHOULD** stop all language model requests and all tool call invocations as soon as possible. + +After all ongoing operations have been successfully aborted and pending updates have been sent, the Agent **MUST** respond to the original `session/prompt` request with the `cancelled` [stop reason](#stop-reasons). + + + API client libraries and tools often throw an exception when their operation is aborted, which may propagate as an error response to `session/prompt`. + +Clients often display unrecognized errors from the Agent to the user, which would be undesirable for cancellations as they aren't considered errors. + +Agents **MUST** catch these errors and return the semantically meaningful `cancelled` stop reason, so that Clients can reliably confirm the cancellation. + + + +The Agent **MAY** send `session/update` notifications with content or tool call updates after receiving the `session/cancel` notification, but it **MUST** ensure that it does so before responding to the `session/prompt` request. + +The Client **SHOULD** still accept tool call updates received after sending `session/cancel`. + +--- + +Once a prompt turn completes, the Client may send another `session/prompt` to continue the conversation, building on the context established in previous turns. diff --git a/docs/protocol/draft/session-config-options.mdx b/docs/protocol/draft/session-config-options.mdx new file mode 100644 index 00000000..524b489f --- /dev/null +++ b/docs/protocol/draft/session-config-options.mdx @@ -0,0 +1,282 @@ +--- +title: "Session Config Options" +description: "Flexible configuration selectors for agent sessions" +--- + +Agents can provide an arbitrary list of configuration options for a session, allowing Clients to offer users customizable selectors for things like models, modes, reasoning levels, and more. + + + Session Config Options are the preferred way to expose session-level + configuration. If an Agent provides `configOptions`, Clients **SHOULD** use + them instead of the [`modes`](./session-modes) field. Modes will be removed in + a future version of the protocol. + + +## Initial State + +During [Session Setup](./session-setup) the Agent **MAY** return a list of configuration options and their current values: + +```json +{ + "jsonrpc": "2.0", + "id": 1, + "result": { + "sessionId": "sess_abc123def456", + "configOptions": [ + { + "id": "mode", + "name": "Session Mode", + "description": "Controls how the agent requests permission", + "category": "mode", + "type": "select", + "currentValue": "ask", + "options": [ + { + "value": "ask", + "name": "Ask", + "description": "Request permission before making any changes" + }, + { + "value": "code", + "name": "Code", + "description": "Write and modify code with full tool access" + } + ] + }, + { + "id": "model", + "name": "Model", + "category": "model", + "type": "select", + "currentValue": "model-1", + "options": [ + { + "value": "model-1", + "name": "Model 1", + "description": "The fastest model" + }, + { + "value": "model-2", + "name": "Model 2", + "description": "The most powerful model" + } + ] + } + ] + } +} +``` + + + The list of configuration options available for this session. The order of + this array represents the Agent's preferred priority. Clients **SHOULD** + respect this ordering when displaying options. + + +### ConfigOption + + + Unique identifier for this configuration option. Used when setting values. + + + + Human-readable label for the option + + + + Optional description providing more details about what this option controls + + + + Optional [semantic category](#option-categories) to help Clients provide + consistent UX. + + + + The type of input control. Currently only `select` is supported. + + + + The currently selected value for this option + + + + The available values for this option + + +### ConfigOptionValue + + + The value identifier used when setting this option + + + + Human-readable name to display + + + + Optional description of what this value does + + +## Option Categories + +Each config option **MAY** include a `category` field. Categories are semantic metadata intended to help Clients provide consistent UX, such as attaching keyboard shortcuts, choosing icons, or deciding placement. + + + Categories are for UX purposes only and **MUST NOT** be required for + correctness. Clients **MUST** handle missing or unknown categories gracefully. + + +Category names beginning with `_` are free for custom use (e.g., `_my_custom_category`). Category names that do not begin with `_` are reserved for the ACP spec. + +| Category | Description | +| --------------- | -------------------------------- | +| `mode` | Session mode selector | +| `model` | Model selector | +| `thought_level` | Thought/reasoning level selector | + +When multiple options share the same category, Clients **SHOULD** use the array ordering to resolve ties, preferring earlier options in the list for prominent placement or keyboard shortcuts. + +## Option Ordering + +The order of the `configOptions` array is significant. Agents **SHOULD** place higher-priority options first in the list. + +Clients **SHOULD**: + +- Display options in the order provided by the Agent +- Use ordering to resolve ties when multiple options share the same category +- If displaying a limited number of options, prefer those at the beginning of the list + +## Default Values and Graceful Degradation + +Agents **MUST** always provide a default value for every configuration option. This ensures the Agent can operate correctly even if: + +- The Client doesn't support configuration options +- The Client chooses not to display certain options +- The Client receives an option type it doesn't recognize + +If a Client receives an option with an unrecognized `type`, it **SHOULD** ignore that option. The Agent will continue using its default value. + +## Setting a Config Option + +The current value of a config option can be changed at any point during a session, whether the Agent is idle or generating a response. + +### From the Client + +Clients can change a config option value by calling the `session/set_config_option` method: + +```json +{ + "jsonrpc": "2.0", + "id": 2, + "method": "session/set_config_option", + "params": { + "sessionId": "sess_abc123def456", + "configId": "mode", + "value": "code" + } +} +``` + + + The ID of the session + + + + The `id` of the configuration option to change + + + + The new value to set. Must be one of the values listed in the option's + `options` array. + + +The Agent **MUST** respond with the complete list of all configuration options and their current values: + +```json +{ + "jsonrpc": "2.0", + "id": 2, + "result": { + "configOptions": [ + { + "id": "mode", + "name": "Session Mode", + "type": "select", + "currentValue": "code", + "options": [...] + }, + { + "id": "model", + "name": "Model", + "type": "select", + "currentValue": "model-1", + "options": [...] + } + ] + } +} +``` + + + The response always contains the **complete** configuration state. This allows + Agents to reflect dependent changes. For example, if changing the model + affects available reasoning options, or if an option's available values change + based on another selection. + + +### From the Agent + +The Agent can also change configuration options and notify the Client by sending a `config_option_update` session notification: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "config_option_update", + "configOptions": [ + { + "id": "mode", + "name": "Session Mode", + "type": "select", + "currentValue": "code", + "options": [...] + }, + { + "id": "model", + "name": "Model", + "type": "select", + "currentValue": "model-2", + "options": [...] + } + ] + } + } +} +``` + +This notification also contains the complete configuration state. Common reasons an Agent might update configuration options include: + +- Switching modes after completing a planning phase +- Falling back to a different model due to rate limits or errors +- Adjusting available options based on context discovered during execution + +## Relationship to Session Modes + +Session Config Options supersede the older [Session Modes](./session-modes) API. However, during the transition period, Agents that provide mode-like configuration **SHOULD** send both: + +- `configOptions` with a `category: "mode"` option for Clients that support config options +- `modes` for Clients that only support the older API + +If an Agent provides both `configOptions` and `modes` in the session response: + +- Clients that support config options **SHOULD** use `configOptions` exclusively and ignore `modes` +- Clients that don't support config options **SHOULD** fall back to `modes` +- Agents **SHOULD** keep both in sync to ensure consistent behavior regardless of which field the Client uses + + + Learn about the Session Modes API + diff --git a/docs/protocol/draft/session-modes.mdx b/docs/protocol/draft/session-modes.mdx new file mode 100644 index 00000000..03bdaa17 --- /dev/null +++ b/docs/protocol/draft/session-modes.mdx @@ -0,0 +1,173 @@ +--- +title: "Session Modes" +description: "Switch between different agent operating modes" +--- + + + You can now use [Session Config Options](./session-config-options). Dedicated + session mode methods will be removed in a future version of the protocol. + Until then, you can offer both to clients for backwards compatibility. + + +Agents can provide a set of modes they can operate in. Modes often affect the system prompts used, the availability of tools, and whether they request permission before running. + +## Initial state + +During [Session Setup](./session-setup) the Agent **MAY** return a list of modes it can operate in and the currently active mode: + +```json +{ + "jsonrpc": "2.0", + "id": 1, + "result": { + "sessionId": "sess_abc123def456", + "modes": { + "currentModeId": "ask", + "availableModes": [ + { + "id": "ask", + "name": "Ask", + "description": "Request permission before making any changes" + }, + { + "id": "architect", + "name": "Architect", + "description": "Design and plan software systems without implementation" + }, + { + "id": "code", + "name": "Code", + "description": "Write and modify code with full tool access" + } + ] + } + } +} +``` + + + The current mode state for the session + + +### SessionModeState + + + The ID of the mode that is currently active + + + + The set of modes that the Agent can operate in + + +### SessionMode + + + Unique identifier for this mode + + + + Human-readable name of the mode + + + + Optional description providing more details about what this mode does + + +## Setting the current mode + +The current mode can be changed at any point during a session, whether the Agent is idle or generating a response. + +### From the Client + +Typically, Clients display the available modes to the user and allow them to change the current one, which they can do by calling the [`session/set_mode`](./schema#session%2Fset-mode) method. + +```json +{ + "jsonrpc": "2.0", + "id": 2, + "method": "session/set_mode", + "params": { + "sessionId": "sess_abc123def456", + "modeId": "code" + } +} +``` + + + The ID of the session to set the mode for + + + + The ID of the mode to switch to. Must be one of the modes listed in + `availableModes` + + +### From the Agent + +The Agent can also change its own mode and let the Client know by sending the `current_mode_update` session notification: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "current_mode_update", + "modeId": "code" + } + } +} +``` + +#### Exiting plan modes + +A common case where an Agent might switch modes is from within a special "exit mode" tool that can be provided to the language model during plan/architect modes. The language model can call this tool when it determines it's ready to start implementing a solution. + +This "switch mode" tool will usually request permission before running, which it can do just like any other tool: + +```json +{ + "jsonrpc": "2.0", + "id": 3, + "method": "session/request_permission", + "params": { + "sessionId": "sess_abc123def456", + "toolCall": { + "toolCallId": "call_switch_mode_001", + "title": "Ready for implementation", + "kind": "switch_mode", + "status": "pending", + "content": [ + { + "type": "text", + "text": "## Implementation Plan..." + } + ] + }, + "options": [ + { + "optionId": "code", + "name": "Yes, and auto-accept all actions", + "kind": "allow_always" + }, + { + "optionId": "ask", + "name": "Yes, and manually accept actions", + "kind": "allow_once" + }, + { + "optionId": "reject", + "name": "No, stay in architect mode", + "kind": "reject_once" + } + ] + } +} +``` + +When an option is chosen, the tool runs, setting the mode and sending the `current_mode_update` notification mentioned above. + + + Learn more about permission requests + diff --git a/docs/protocol/draft/slash-commands.mdx b/docs/protocol/draft/slash-commands.mdx new file mode 100644 index 00000000..271b115d --- /dev/null +++ b/docs/protocol/draft/slash-commands.mdx @@ -0,0 +1,96 @@ +--- +title: "Slash Commands" +description: "Advertise available slash commands to clients" +--- + +Agents can advertise a set of slash commands that users can invoke. These commands provide quick access to specific agent capabilities and workflows. Commands are run as part of regular [prompt](./prompt-turn) requests where the Client includes the command text in the prompt. + +## Advertising commands + +After creating a session, the Agent **MAY** send a list of available commands via the `available_commands_update` session notification: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "available_commands_update", + "availableCommands": [ + { + "name": "web", + "description": "Search the web for information", + "input": { + "hint": "query to search for" + } + }, + { + "name": "test", + "description": "Run tests for the current project" + }, + { + "name": "plan", + "description": "Create a detailed implementation plan", + "input": { + "hint": "description of what to plan" + } + } + ] + } + } +} +``` + + + The list of commands available in this session + + +### AvailableCommand + + + The command name (e.g., "web", "test", "plan") + + + + Human-readable description of what the command does + + + + Optional input specification for the command + + +### AvailableCommandInput + +Currently supports unstructured text input: + + + A hint to display when the input hasn't been provided yet + + +## Dynamic updates + +The Agent can update the list of available commands at any time during a session by sending another `available_commands_update` notification. This allows commands to be added based on context, removed when no longer relevant, or modified with updated descriptions. + +## Running commands + +Commands are included as regular user messages in prompt requests: + +```json +{ + "jsonrpc": "2.0", + "id": 3, + "method": "session/prompt", + "params": { + "sessionId": "sess_abc123def456", + "prompt": [ + { + "type": "text", + "text": "/web agent client protocol" + } + ] + } +} +``` + +The Agent recognizes the command prefix and processes it accordingly. Commands may be accompanied by any other user message content types (images, audio, etc.) in the same prompt array. diff --git a/docs/protocol/draft/terminals.mdx b/docs/protocol/draft/terminals.mdx new file mode 100644 index 00000000..270ec758 --- /dev/null +++ b/docs/protocol/draft/terminals.mdx @@ -0,0 +1,281 @@ +--- +title: "Terminals" +description: "Executing and managing terminal commands" +--- + +The terminal methods allow Agents to execute shell commands within the Client's environment. These methods enable Agents to run build processes, execute scripts, and interact with command-line tools while providing real-time output streaming and process control. + +## Checking Support + +Before attempting to use terminal methods, Agents **MUST** verify that the Client supports this capability by checking the [Client Capabilities](./initialization#client-capabilities) field in the `initialize` response: + +```json highlight={7} +{ + "jsonrpc": "2.0", + "id": 0, + "result": { + "protocolVersion": 1, + "clientCapabilities": { + "terminal": true + } + } +} +``` + +If `terminal` is `false` or not present, the Agent **MUST NOT** attempt to call any terminal methods. + +## Executing Commands + +The `terminal/create` method starts a command in a new terminal: + +```json +{ + "jsonrpc": "2.0", + "id": 5, + "method": "terminal/create", + "params": { + "sessionId": "sess_abc123def456", + "command": "npm", + "args": ["test", "--coverage"], + "env": [ + { + "name": "NODE_ENV", + "value": "test" + } + ], + "cwd": "/home/user/project", + "outputByteLimit": 1048576 + } +} +``` + + + The [Session ID](./session-setup#session-id) for this request + + + + The command to execute + + + + Array of command arguments + + + + Environment variables for the command. + +Each variable has: + +- `name`: The environment variable name +- `value`: The environment variable value + + + + + Working directory for the command (absolute path) + + + + Maximum number of output bytes to retain. Once exceeded, earlier output is + truncated to stay within this limit. + +When the limit is exceeded, the Client truncates from the beginning of the output +to stay within the limit. + +The Client **MUST** ensure truncation happens at a character boundary to maintain valid +string output, even if this means the retained output is slightly less than the +specified limit. + + + +The Client returns a Terminal ID immediately without waiting for completion: + +```json +{ + "jsonrpc": "2.0", + "id": 5, + "result": { + "terminalId": "term_xyz789" + } +} +``` + +This allows the command to run in the background while the Agent performs other operations. + +After creating the terminal, the Agent can use the `terminal/wait_for_exit` method to wait for the command to complete. + + + The Agent **MUST** release the terminal using `terminal/release` when it's no + longer needed. + + +## Embedding in Tool Calls + +Terminals can be embedded directly in [tool calls](./tool-calls) to provide real-time output to users: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "tool_call", + "toolCallId": "call_002", + "title": "Running tests", + "kind": "execute", + "status": "in_progress", + "content": [ + { + "type": "terminal", + "terminalId": "term_xyz789" + } + ] + } + } +} +``` + +When a terminal is embedded in a tool call, the Client displays live output as it's generated and continues to display it even after the terminal is released. + +## Getting Output + +The `terminal/output` method retrieves the current terminal output without waiting for the command to complete: + +```json +{ + "jsonrpc": "2.0", + "id": 6, + "method": "terminal/output", + "params": { + "sessionId": "sess_abc123def456", + "terminalId": "term_xyz789" + } +} +``` + +The Client responds with the current output and exit status (if the command has finished): + +```json +{ + "jsonrpc": "2.0", + "id": 6, + "result": { + "output": "Running tests...\n✓ All tests passed (42 total)\n", + "truncated": false, + "exitStatus": { + "exitCode": 0, + "signal": null + } + } +} +``` + + + The terminal output captured so far + + + + Whether the output was truncated due to byte limits + + + + Present only if the command has exited. Contains: + +- `exitCode`: The process exit code (may be null) +- `signal`: The signal that terminated the process (may be null) + + + +## Waiting for Exit + +The `terminal/wait_for_exit` method returns once the command completes: + +```json +{ + "jsonrpc": "2.0", + "id": 7, + "method": "terminal/wait_for_exit", + "params": { + "sessionId": "sess_abc123def456", + "terminalId": "term_xyz789" + } +} +``` + +The Client responds once the command exits: + +```json +{ + "jsonrpc": "2.0", + "id": 7, + "result": { + "exitCode": 0, + "signal": null + } +} +``` + + + The process exit code (may be null if terminated by signal) + + + + The signal that terminated the process (may be null if exited normally) + + +## Killing Commands + +The `terminal/kill` method terminates a command without releasing the terminal: + +```json +{ + "jsonrpc": "2.0", + "id": 8, + "method": "terminal/kill", + "params": { + "sessionId": "sess_abc123def456", + "terminalId": "term_xyz789" + } +} +``` + +After killing a command, the terminal remains valid and can be used with: + +- `terminal/output` to get the final output +- `terminal/wait_for_exit` to get the exit status + +The Agent **MUST** still call `terminal/release` when it's done using it. + +### Building a Timeout + +Agents can implement command timeouts by combining terminal methods: + +1. Create a terminal with `terminal/create` +2. Start a timer for the desired timeout duration +3. Concurrently wait for either the timer to expire or `terminal/wait_for_exit` to return +4. If the timer expires first: + - Call `terminal/kill` to terminate the command + - Call `terminal/output` to retrieve any final output + - Include the output in the response to the model +5. Call `terminal/release` when done + +## Releasing Terminals + +The `terminal/release` kills the command if still running and releases all resources: + +```json +{ + "jsonrpc": "2.0", + "id": 9, + "method": "terminal/release", + "params": { + "sessionId": "sess_abc123def456", + "terminalId": "term_xyz789" + } +} +``` + +After release the terminal ID becomes invalid for all other `terminal/*` methods. + +If the terminal was added to a tool call, the client **SHOULD** continue to display its output after release. diff --git a/docs/protocol/draft/tool-calls.mdx b/docs/protocol/draft/tool-calls.mdx new file mode 100644 index 00000000..2982296a --- /dev/null +++ b/docs/protocol/draft/tool-calls.mdx @@ -0,0 +1,310 @@ +--- +title: "Tool Calls" +description: "How Agents report tool call execution" +--- + +Tool calls represent actions that language models request Agents to perform during a [prompt turn](./prompt-turn). When an LLM determines it needs to interact with external systems—like reading files, running code, or fetching data—it generates tool calls that the Agent executes on its behalf. + +Agents report tool calls through [`session/update`](./prompt-turn#3-agent-reports-output) notifications, allowing Clients to display real-time progress and results to users. + +While Agents handle the actual execution, they may leverage Client capabilities like [permission requests](#requesting-permission) or [file system access](./file-system) to provide a richer, more integrated experience. + +## Creating + +When the language model requests a tool invocation, the Agent **SHOULD** report it to the Client: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "tool_call", + "toolCallId": "call_001", + "title": "Reading configuration file", + "kind": "read", + "status": "pending" + } + } +} +``` + + + A unique identifier for this tool call within the session + + + + A human-readable title describing what the tool is doing + + + + The category of tool being invoked. + + + - `read` - Reading files or data - `edit` - Modifying files or content - + `delete` - Removing files or data - `move` - Moving or renaming files - + `search` - Searching for information - `execute` - Running commands or code - + `think` - Internal reasoning or planning - `fetch` - Retrieving external data + - `other` - Other tool types (default) + + +Tool kinds help Clients choose appropriate icons and optimize how they display tool execution progress. + + + + + The current [execution status](#status) (defaults to `pending`) + + + + [Content produced](#content) by the tool call + + + + [File locations](#following-the-agent) affected by this tool call + + + + The raw input parameters sent to the tool + + + + The raw output returned by the tool + + +## Updating + +As tools execute, Agents send updates to report progress and results. + +Updates use the `session/update` notification with `tool_call_update`: + +```json +{ + "jsonrpc": "2.0", + "method": "session/update", + "params": { + "sessionId": "sess_abc123def456", + "update": { + "sessionUpdate": "tool_call_update", + "toolCallId": "call_001", + "status": "in_progress", + "content": [ + { + "type": "content", + "content": { + "type": "text", + "text": "Found 3 configuration files..." + } + } + ] + } + } +} +``` + +All fields except `toolCallId` are optional in updates. Only the fields being changed need to be included. + +## Requesting Permission + +The Agent **MAY** request permission from the user before executing a tool call by calling the `session/request_permission` method: + +```json +{ + "jsonrpc": "2.0", + "id": 5, + "method": "session/request_permission", + "params": { + "sessionId": "sess_abc123def456", + "toolCall": { + "toolCallId": "call_001" + }, + "options": [ + { + "optionId": "allow-once", + "name": "Allow once", + "kind": "allow_once" + }, + { + "optionId": "reject-once", + "name": "Reject", + "kind": "reject_once" + } + ] + } +} +``` + + + The session ID for this request + + + + The tool call update containing details about the operation + + + + Available [permission options](#permission-options) for the user to choose + from + + +The Client responds with the user's decision: + +```json +{ + "jsonrpc": "2.0", + "id": 5, + "result": { + "outcome": { + "outcome": "selected", + "optionId": "allow-once" + } + } +} +``` + +Clients **MAY** automatically allow or reject permission requests according to the user settings. + +If the current prompt turn gets [cancelled](./prompt-turn#cancellation), the Client **MUST** respond with the `"cancelled"` outcome: + +```json +{ + "jsonrpc": "2.0", + "id": 5, + "result": { + "outcome": { + "outcome": "cancelled" + } + } +} +``` + + + The user's decision, either: - `cancelled` - The [prompt turn was + cancelled](./prompt-turn#cancellation) - `selected` with an `optionId` - The + ID of the selected permission option + + +### Permission Options + +Each permission option provided to the Client contains: + + + Unique identifier for this option + + + + Human-readable label to display to the user + + + + A hint to help Clients choose appropriate icons and UI treatment for each option. + +- `allow_once` - Allow this operation only this time +- `allow_always` - Allow this operation and remember the choice +- `reject_once` - Reject this operation only this time +- `reject_always` - Reject this operation and remember the choice + + + +## Status + +Tool calls progress through different statuses during their lifecycle: + + + The tool call hasn't started running yet because the input is either streaming + or awaiting approval + + + + The tool call is currently running + + + + The tool call completed successfully + + +The tool call failed with an error + +## Content + +Tool calls can produce different types of content: + +### Regular Content + +Standard [content blocks](./content) like text, images, or resources: + +```json +{ + "type": "content", + "content": { + "type": "text", + "text": "Analysis complete. Found 3 issues." + } +} +``` + +### Diffs + +File modifications shown as diffs: + +```json +{ + "type": "diff", + "path": "/home/user/project/src/config.json", + "oldText": "{\n \"debug\": false\n}", + "newText": "{\n \"debug\": true\n}" +} +``` + + + The absolute file path being modified + + + + The original content (null for new files) + + + + The new content after modification + + +### Terminals + +Live terminal output from command execution: + +```json +{ + "type": "terminal", + "terminalId": "term_xyz789" +} +``` + + + The ID of a terminal created with `terminal/create` + + +When a terminal is embedded in a tool call, the Client displays live output as it's generated and continues to display it even after the terminal is released. + + + Learn more about Terminals + + +## Following the Agent + +Tool calls can report file locations they're working with, enabling Clients to implement "follow-along" features that track which files the Agent is accessing or modifying in real-time. + +```json +{ + "path": "/home/user/project/src/main.py", + "line": 42 +} +``` + + + The absolute file path being accessed or modified + + + + Optional line number within the file + diff --git a/docs/protocol/draft/transports.mdx b/docs/protocol/draft/transports.mdx new file mode 100644 index 00000000..274fdbdb --- /dev/null +++ b/docs/protocol/draft/transports.mdx @@ -0,0 +1,52 @@ +--- +title: "Transports" +description: "Mechanisms for agents and clients to communicate with each other" +--- + +ACP uses JSON-RPC to encode messages. JSON-RPC messages **MUST** be UTF-8 encoded. + +The protocol currently defines the following transport mechanisms for agent-client communication: + +1. [stdio](#stdio), communication over standard in and standard out +2. _[Streamable HTTP](#streamable-http) (draft proposal in progress)_ + +Agents and clients **SHOULD** support stdio whenever possible. + +It is also possible for agents and clients to implement [custom transports](#custom-transports). + +## stdio + +In the **stdio** transport: + +- The client launches the agent as a subprocess. +- The agent reads JSON-RPC messages from its standard input (`stdin`) and sends messages to its standard output (`stdout`). +- Messages are individual JSON-RPC requests, notifications, or responses. +- Messages are delimited by newlines (`\n`), and **MUST NOT** contain embedded newlines. +- The agent **MAY** write UTF-8 strings to its standard error (`stderr`) for logging purposes. Clients **MAY** capture, forward, or ignore this logging. +- The agent **MUST NOT** write anything to its `stdout` that is not a valid ACP message. +- The client **MUST NOT** write anything to the agent's `stdin` that is not a valid ACP message. + +```mermaid +sequenceDiagram + participant Client + participant Agent Process + + Client->>+Agent Process: Launch subprocess + loop Message Exchange + Client->>Agent Process: Write to stdin + Agent Process->>Client: Write to stdout + Agent Process--)Client: Optional logs on stderr + end + Client->>Agent Process: Close stdin, terminate subprocess + deactivate Agent Process +``` + +## _Streamable HTTP_ + +_In discussion, draft proposal in progress._ + +## Custom Transports + +Agents and clients **MAY** implement additional custom transport mechanisms to suit their specific needs. The protocol is transport-agnostic and can be implemented over any communication channel that supports bidirectional message exchange. + +Implementers who choose to support custom transports **MUST** ensure they preserve the JSON-RPC message format and lifecycle requirements defined by ACP. Custom transports **SHOULD** document their specific connection establishment and message exchange patterns to aid interoperability.