feat(agent): show tool result images & support send img to remote cha… #1690
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,43 @@ | ||
| # Plan | ||
| | ||
| ## Current behavior | ||
| | ||
| Tool execution can attach visual previews to `tool_call.imagePreviews`. The desktop renderer shows those previews only inside the expanded tool-call details, not as normal assistant image messages. `prepareToolImagePreviewPresentation()` currently promotes only successful built-in `image_generate` previews into assistant `image` blocks. Other tool result images, including screenshots, remain embedded in the tool-call metadata. | ||
| | ||
| Remote snapshots historically persisted only assistant `image` blocks. The first fix added a fallback that also persists `tool_call.imagePreviews`, but the broader issue is conversation-level visibility: the assistant transcript itself should contain the image result. | ||
| | ||
| ## Implementation approach | ||
| | ||
| 1. Generalize `prepareToolImagePreviewPresentation()` so successful, non-error tool result previews with usable `data` are promoted into assistant `image` blocks for any tool source. | ||
| 2. Keep the existing special-case behavior for built-in `image_generate`: its previews are promoted and removed from the tool-call detail panel. | ||
| 3. For other tools, promote usable previews while preserving metadata-only/unusable previews on `tool_call.imagePreviews` so the detail panel can still show what is available. | ||
| 4. Add stable image block metadata linking promoted images back to the tool call and preview source/title. | ||
| 5. Keep the remote snapshot fallback for legacy conversations where previews are already stored only in `tool_call.imagePreviews`. | ||
| 6. Update tests to cover screenshot/tool-output promotion in the normal runtime path and the remote fallback path. | ||
| | ||
| ## Affected interfaces | ||
| | ||
| - `AssistantMessageBlock` remains unchanged; promoted images use existing `type: 'image'` and `image_data` fields. | ||
| - `AssistantMessageExtra` gains optional metadata keys through its existing index signature, such as `toolCallId`, `toolImagePreviewId`, `toolImagePreviewSource`, and `toolImagePreviewTitle`. | ||
| - `RemoteConversationSnapshot.generatedImages` remains unchanged. | ||
| | ||
| ## Data flow | ||
| | ||
| 1. Tool execution returns `imagePreviews`. | ||
| 2. Runtime normalizes the tool result and calls `prepareToolImagePreviewPresentation()`. | ||
| 3. Usable previews become assistant `image` blocks inserted after the tool-call block. | ||
| 4. Desktop conversation renders those images as normal assistant images. | ||
| 5. Remote snapshot persists those image blocks into `generatedImages`; legacy unpromoted previews are also persisted as fallback. | ||
| | ||
| ## Compatibility | ||
| | ||
| - Existing generated-image behavior remains compatible: built-in `image_generate` still hides promoted previews from the tool detail panel. | ||
| - Saved conversations with only `tool_call.imagePreviews` continue remote delivery via the fallback persistence path. | ||
| - Error tool results are not promoted into normal image blocks. | ||
| | ||
| ## Test strategy | ||
| | ||
| - Update `agentRuntimePresenter/dispatch` tests to assert generic successful tool image previews are promoted into assistant image blocks. | ||
| - Keep tests for built-in `image_generate`, MCP same-name tool, and error results aligned with the new promotion rules. | ||
| - Keep `RemoteConversationRunner` tests covering fallback persistence from `tool_call.imagePreviews`. | ||
| - Run focused tests, typecheck, format, i18n, and lint. | ||
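The promotion rules in the plan above can be sketched roughly as follows. This is a minimal illustration, not the actual `prepareToolImagePreviewPresentation()` implementation: the `ImagePreview`, `ToolCallBlock`, and `ImageBlock` shapes, the `builtIn` and `isError` flags, the layout of `image_data`, the `image/png` default, and the helper name `promoteToolImagePreviews` are all assumptions made for the example. It also assumes that promoted previews are removed from `tool_call.imagePreviews` for every tool, and that built-in `image_generate` additionally drops any unusable previews.

```typescript
// Hypothetical shapes for illustration; the real types in the codebase may differ.
interface ImagePreview {
  id: string
  data?: string // image payload; absent for metadata-only previews
  mimeType?: string
  source?: string
  title?: string
}

interface ToolCallBlock {
  type: 'tool_call'
  id: string
  name: string
  builtIn: boolean // assumption: separates built-in tools from same-name MCP tools
  isError: boolean
  imagePreviews: ImagePreview[]
}

interface ImageBlock {
  type: 'image'
  image_data: { data: string; mimeType: string } // assumed field layout
  extra: Record<string, string | undefined>
}

function promoteToolImagePreviews(toolCall: ToolCallBlock): {
  toolCall: ToolCallBlock
  images: ImageBlock[]
} {
  // Error results are never promoted; their previews stay on the tool call.
  if (toolCall.isError) return { toolCall, images: [] }

  const isUsable = (p: ImagePreview): boolean =>
    typeof p.data === 'string' && p.data.length > 0

  // Each usable preview becomes an image block that links back to its tool call.
  const images = toolCall.imagePreviews.filter(isUsable).map(
    (p): ImageBlock => ({
      type: 'image',
      image_data: { data: p.data as string, mimeType: p.mimeType ?? 'image/png' },
      extra: {
        toolCallId: toolCall.id,
        toolImagePreviewId: p.id,
        toolImagePreviewSource: p.source,
        toolImagePreviewTitle: p.title
      }
    })
  )

  // Assumed interpretation: built-in image_generate clears the detail panel,
  // other tools keep only the previews that could not be promoted.
  const isBuiltInImageGenerate = toolCall.builtIn && toolCall.name === 'image_generate'
  const remaining = isBuiltInImageGenerate
    ? []
    : toolCall.imagePreviews.filter((p) => !isUsable(p))

  return { toolCall: { ...toolCall, imagePreviews: remaining }, images }
}
```

A caller would insert the returned `images` directly after the tool-call block in the assistant message, which is what makes them render as ordinary assistant images.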
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| # Tool result images in conversation and remote delivery | ||
| | ||
| ## User need | ||
| | ||
| Users expect visual tool results to appear as first-class images in the normal chat transcript and in remote-control channels. Today a tool such as `Page.captureScreenshot` can complete successfully and store the screenshot in `tool_call.imagePreviews`, but the assistant may then continue with text only, or produce no final content at all. The image stays hidden behind the tool-call details, and remote channels may not receive it unless the result is converted separately. | ||
| | ||
| ## Goal | ||
| | ||
| Promote suitable function/tool-call image results into assistant `image` blocks so they are visible in the desktop conversation without depending on the model to restate them. Remote delivery should then reuse the same image blocks and, as a compatibility fallback, still handle unpromoted `tool_call.imagePreviews`. | ||
| | ||
| ## Acceptance criteria | ||
| | ||
| - Successful `tool_call` results with resolvable `imagePreviews` create assistant `image` blocks adjacent to the tool call. | ||
| - `Page.captureScreenshot`, MCP image outputs, file-read image previews, and other non-error tool result images can become visible conversation images. | ||
| - The tool-call detail panel still shows preview metadata, but only when an image cannot be promoted or when the tool result is an error. | ||
| - The model context can continue safely without requiring the assistant to output the image itself. | ||
| - Remote snapshots deliver promoted image blocks through the existing `generatedImages` path and can still deliver legacy/unpromoted tool result previews. | ||
| - Raw base64 is not leaked into normal text messages. | ||
| | ||
| ## Constraints | ||
| | ||
| - Preserve existing image-generation promotion behavior and compatibility for saved conversations. | ||
| - Keep channel-specific remote code unchanged where possible. | ||
| - Avoid promoting error tool results as normal assistant images. | ||
| - Skip previews without usable image data. | ||
| | ||
| ## Non-goals | ||
| | ||
| - Changing remote channel APIs or settings. | ||
| - Adding live streaming of images before tool completion. | ||
| - Sending images from tools that only expose remote HTTP URLs without cached/data payloads. | ||
| - Reworking renderer image components. | ||
| | ||
| ## Open questions | ||
| | ||
| None. | ||
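The remote-delivery criterion above (promoted image blocks go through `generatedImages`, with legacy `tool_call.imagePreviews` as a fallback) could look something like the sketch below. The `Block` union, the `GeneratedImage` shape, the helper name `collectGeneratedImages`, and the de-duplication by `toolImagePreviewId` are assumptions for illustration only; the real `RemoteConversationSnapshot` code may be organized differently.

```typescript
// Hypothetical block and asset shapes; not the project's actual types.
type Block =
  | {
      type: 'image'
      image_data: { data: string; mimeType: string }
      extra?: { toolImagePreviewId?: string }
    }
  | {
      type: 'tool_call'
      imagePreviews?: { id: string; data?: string; mimeType?: string }[]
    }
  | { type: 'content'; text: string }

interface GeneratedImage {
  data: string
  mimeType: string
}

function collectGeneratedImages(blocks: Block[]): GeneratedImage[] {
  // Pass 1: remember which previews were already promoted into image blocks,
  // so the fallback does not deliver the same image twice.
  const promotedIds = new Set<string>()
  for (const block of blocks) {
    if (block.type === 'image' && block.extra?.toolImagePreviewId) {
      promotedIds.add(block.extra.toolImagePreviewId)
    }
  }

  // Pass 2: walk the transcript in order, taking image blocks directly and
  // falling back to legacy tool-call previews that still carry usable data.
  const images: GeneratedImage[] = []
  for (const block of blocks) {
    if (block.type === 'image') {
      images.push({ data: block.image_data.data, mimeType: block.image_data.mimeType })
    } else if (block.type === 'tool_call') {
      for (const preview of block.imagePreviews ?? []) {
        if (!preview.data || promotedIds.has(preview.id)) continue
        images.push({ data: preview.data, mimeType: preview.mimeType ?? 'image/png' })
      }
    }
  }
  return images
}
```

Text blocks contribute nothing to the result, which is consistent with the requirement that raw base64 never appears in normal text messages.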
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # Tasks | ||
| | ||
| - [x] Inspect remote snapshot and channel image delivery flow. | ||
| - [x] Document the initial remote fallback issue and implementation plan. | ||
| - [x] Persist completed tool-call image previews as remote image assets. | ||
| - [x] Re-scope the SDD artifacts to include conversation-level image visibility. | ||
| - [x] Promote successful generic tool result image previews into assistant image blocks. | ||
| - [x] Update focused tests for generic promotion and remote fallback delivery. | ||
| - [x] Run formatter, i18n check/generation, lint, typecheck, and relevant tests. | ||