fix(core): parse screenshot MIME in GoogleCUAClient function responses #2159
yawbtng wants to merge 2 commits into
🦋 Changeset detected
Latest commit: 246ef50
The changes in this PR will be included in the next version bump. This PR includes changesets to release 4 packages.
This PR is from an external contributor and must be approved by a stagehand team member with write access before CI can run.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9b7e1b1861
} from "../flowlogger/FlowLogger.js";
import { v7 as uuidv7 } from "uuid";

const IMAGE_DATA_URL_PATTERN = /^data:(image\/[a-zA-Z0-9.+-]+);base64,(.*)$/;
Handle newline-wrapped base64 data URLs
The new parser regex uses (.*) for the payload, but . does not match newlines in JavaScript regexes, so valid data:image/...;base64,... strings that contain line breaks will fail to match and fall back to returning the entire data URL as inlineData.data. In that case Google receives data:image/...;base64,... instead of raw base64 bytes, which can break function-response image decoding. This is a regression from the previous replace(/^data:image\/png;base64,/, "") behavior, which stripped the prefix regardless of payload newlines.
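The failure described in this comment can be reproduced in isolation. The snippet below is a minimal sketch: the two patterns are copied from the review text, while the wrapped payload and the variable names are made up for illustration.

```typescript
// Pattern as first proposed in the PR: `.` stops at line terminators.
const FIRST_PATTERN = /^data:(image\/[a-zA-Z0-9.+-]+);base64,(.*)$/;
// Prefix strip used before the PR (PNG only), as quoted in the review.
const OLD_PREFIX = /^data:image\/png;base64,/;

// Hypothetical payload wrapped across two lines (not a real image).
const wrapped = "data:image/png;base64,iVBORw0KGgo\nAAAANSUhEUg";

// `(.*)` consumes "iVBORw0KGgo", then `$` needs end of input but finds "\n",
// so the whole match fails and exec returns null.
const firstMatch = FIRST_PATTERN.exec(wrapped);

// The old replace only anchors on the prefix, so it still strips it.
const oldResult = wrapped.replace(OLD_PREFIX, "");

console.log(firstMatch === null); // true
console.log(JSON.stringify(oldResult)); // "iVBORw0KGgo\nAAAANSUhEUg"
```

With a null match, the caller has nothing to fall back on except the full input string, which is how the data URL prefix would end up inside the payload.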
Good catch — fixed in 246ef50. Switched (.*) to ([\s\S]*) so the payload group captures newline-wrapped (MIME-style) base64, and added a regression test (parses a PNG data URL with newline-wrapped base64 (MIME-style)) so this can't drift back. The new behavior is now a strict superset of the previous replace(/^data:image\/png;base64,/, "") — wrapped payloads, JPEG/WebP, raw base64, and unrecognized URLs all handled. 6/6 tests pass locally.
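For comparison, here is a small sketch of the `[\s\S]*` form next to the other common way to let a payload group span line breaks, the ES2018 `s` (dotAll) flag. The dotAll variant is not something this PR uses; it is shown only as an alternative, and the sample input and variable names are hypothetical.

```typescript
// Character-class form described in the reply: [\s\S] matches any character,
// including line terminators, without needing a flag.
const CLASS_PATTERN = /^data:(image\/[a-zA-Z0-9.+-]+);base64,([\s\S]*)$/;

// Alternative (not from the PR): keep `.` and add the dotAll flag.
// Built with the RegExp constructor so it compiles regardless of TS target.
const DOTALL_PATTERN = new RegExp(
  "^data:(image\\/[a-zA-Z0-9.+-]+);base64,(.*)$",
  "s",
);

// Hypothetical payload wrapped with CRLF.
const wrappedInput = "data:image/jpeg;base64,/9j/4AAQ\r\nSkZJRgAB";

const viaClass = CLASS_PATTERN.exec(wrappedInput);
const viaDotAll = DOTALL_PATTERN.exec(wrappedInput);

console.log(viaClass?.[1], JSON.stringify(viaClass?.[2]));
console.log(viaDotAll?.[1], JSON.stringify(viaDotAll?.[2]));
```

Both forms capture the same MIME type and payload. Note that the captured payload still contains the line breaks; neither pattern removes them.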
why
Closes #2046.
GoogleCUAClient builds function-response image parts with inlineData.mimeType = "image/png" and strips the data URL prefix using a PNG-only regex. If a screenshot is JPEG/WebP (or anything non-PNG), the metadata is wrong and the regex no-ops, so the whole data URL ends up as the payload.
what changed
- Added a parseImageDataUrl(input) helper that extracts { mimeType, data } from an image/<type> data URL, falling back to image/png + the raw input for unrecognized formats (preserves prior behavior).
- Replaced the hardcoded mimeType at the function-response call site with the helper's parsed values.
- Left captureScreenshot() unchanged: it still normalizes raw provider input to PNG data URLs, matching the issue's stated current contract.
test plan
New packages/core/tests/unit/google-cua-client.test.ts covering:
- PNG data URL → image/png + base64 payload
- JPEG data URL → image/jpeg + base64 payload
- WebP data URL → image/webp + base64 payload
- Raw base64 (no data URL prefix) → image/png with the input as data
- Non-image data URL (text/plain) → fallback to image/png
Also ran pnpm run typecheck and pnpm run eslint in packages/core; both clean.
Summary by cubic
Fixes screenshot MIME parsing in GoogleCUAClient function responses by deriving the type and base64 payload from image data URLs, including newline-wrapped payloads. Addresses Linear issue #2046 and ensures JPEG/WebP screenshots set the correct inlineData.mimeType, with a PNG fallback for raw inputs.
- Added parseImageDataUrl(input) to extract { mimeType, data } from data:image/*;base64,..., handling newline-wrapped base64.
- Replaced the hardcoded mimeType with parsed values.
Written for commit 246ef50. Summary will update on new commits.
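The helper described in this PR can be sketched roughly as follows. This is an illustration built from the description and review thread, not the code from the diff: the ParsedImage interface name, the sample inputs, and the shape of the part object at the end are assumptions, and the real call site in GoogleCUAClient may look different.

```typescript
// Hypothetical result type; the PR only states the { mimeType, data } shape.
interface ParsedImage {
  mimeType: string;
  data: string;
}

// Pattern after the review fix: the payload group spans line breaks.
const IMAGE_DATA_URL_PATTERN =
  /^data:(image\/[a-zA-Z0-9.+-]+);base64,([\s\S]*)$/;

function parseImageDataUrl(input: string): ParsedImage {
  const match = IMAGE_DATA_URL_PATTERN.exec(input);
  if (match === null) {
    // Raw base64 or an unrecognized URL: keep the prior behavior of
    // labelling it image/png and passing the input through untouched.
    return { mimeType: "image/png", data: input };
  }
  return { mimeType: match[1], data: match[2] };
}

// Sample inputs (made up, truncated payloads).
const jpeg = parseImageDataUrl("data:image/jpeg;base64,/9j/4AAQ");
const raw = parseImageDataUrl("iVBORw0KGgo");
const text = parseImageDataUrl("data:text/plain;base64,aGVsbG8=");

// Assumed shape of a function-response image part, based on the
// inlineData.mimeType / inlineData.data fields named in the description.
const part = { inlineData: jpeg };

console.log(part.inlineData.mimeType); // image/jpeg
console.log(raw.mimeType, raw.data); // image/png iVBORw0KGgo
console.log(text.mimeType); // image/png
```

One consequence of the fallback, visible in the text/plain case, is that a non-image data URL is passed through whole and labelled image/png. That matches the "preserves prior behavior" note in the description rather than rejecting the input.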