Skip to content

improvement(sandbox): upgrade pptx/docx/pdf bootstrap with image helpers, MIME guards, and 256 MB isolate limit#4505

Merged
waleedlatif1 merged 9 commits intostagingfrom
fix/pptx
May 8, 2026
Merged

improvement(sandbox): upgrade pptx/docx/pdf bootstrap with image helpers, MIME guards, and 256 MB isolate limit#4505
waleedlatif1 merged 9 commits intostagingfrom
fix/pptx

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

@waleedlatif1 waleedlatif1 commented May 8, 2026

Summary

  • Rewrote pptx/docx/pdf sandbox bootstraps with full image helper globals (getFileBase64, addImage, drawImage, embedImage)
  • Added explicit PNG/JPEG allowlist in pdf embedImage with clear error for unsupported types (WebP, GIF, BMP)
  • Added comma-separator guard in embedImage to prevent silent empty-buffer errors
  • Aligned error messages across all 3 tasks: "6 MB raw / ~8 MB base64" matches the actual _MAX_IMG_B64 = 8 * 1024 * 1024 constant
  • Raised isolated-vm memoryLimit from 128 → 256 MB at both isolate creation sites (4 workers × 256 MB = 1 GB, safe vs 16 GB ECS task)
  • Added page geometry constants to pptx bootstrap (SLIDE_W, SLIDE_H, MARGIN, CONTENT_W, CONTENT_H)
  • Added rgb, StandardFonts, LETTER, A4 shortcuts to pdf bootstrap
  • Added docx page geometry constants (PAGE_W, PAGE_H, MARGIN, CONTENT_W) and full addSection / doc finalize logic
  • Fixed docx finalize to use hand-rolled base64 decoder (JSZip browser build doesn't support nodebuffer output)
  • DOCX addImage: upfront width/height validation — now consistent with PDF drawImage and PPTX addImage
  • PDF embedImage: removed dead Buffer ternary and redundant size guard already enforced by getFileBase64
  • isolated-vm worker: added friendly MemoryLimitError branch in both execute paths — OOM now surfaces a readable message instead of a raw V8 error

Type of Change

  • Bug fix
  • Improvement

Testing

Tested manually — generated PPTX, DOCX, and PDF documents via Mothership. Confirmed image embedding, error messages, and output correctness.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel Bot commented May 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 8, 2026 2:46am

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented May 8, 2026

PR Summary

Medium Risk
Touches sandbox execution limits and bootstrap helpers used by PDF/DOCX/PPTX generation; mistakes could cause increased resource usage or new runtime failures when embedding images.

Overview
Improves document-generation sandbox tasks by adding higher-level image utilities and stricter input validation: getFileBase64 now validates fileId, enforces an ~8MB base64 size cap, and pdf-generate/docx-generate/pptx-generate expose helpers like embedImage/drawImage/addImage with MIME allowlists and clearer errors.

Raises isolated-vm memoryLimit from 128 → 256 MB in both code-execution paths and adds a dedicated MemoryLimitError mapping so sandbox OOMs return a friendly, actionable message instead of raw V8 errors. Also adds convenience geometry constants (letter/A4 for PDF, slide/page dimensions for PPTX/DOCX) to simplify user scripts.

Reviewed by Cursor Bugbot for commit b7db38a. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR hardens the PPTX, DOCX, and PDF sandbox bootstraps with unified image-helper globals (getFileBase64, addImage/drawImage, embedImage), explicit MIME-type guards, data-URI comma guards, and page-geometry constants. It also raises the isolated-vm memoryLimit from 128 MB to 256 MB at both creation sites and adds a MemoryLimitError branch in the executeCode and executeTask catch blocks.

  • Sandbox bootstrap improvements: all three tasks now expose consistent image helpers with upfront size (8 MB base64 cap), comma-separator, and required-opts validation; DOCX and PDF add explicit MIME allowlists that throw clear errors for unsupported types.
  • Memory & error handling: isolate limit doubled to 256 MB; OOM errors are now surfaced with a descriptive MemoryLimitError instead of a raw V8 message, detected via Array buffer allocation failed and memory limit string matching.
  • DOCX finalize: uses a hand-rolled base64 decoder to work around the JSZip browser build's lack of nodebuffer output type support.

Confidence Score: 5/5

Safe to merge; all three sandbox tasks have consistent, well-guarded image helpers and the isolate memory increase is conservative relative to the ECS task limit.

The changes are self-contained sandbox bootstrap improvements and a worker config bump. The image helpers add input validation rather than removing it, the MIME guards eliminate previously silent failures, and the memory-limit increase is well-reasoned. The two open notes (pptx MIME validation gap and broad memory limit string match) are non-blocking quality items that do not affect correctness under normal usage.

No files require special attention; isolated-vm-worker.cjs carries the OOM detection logic worth a second read if the exact isolated-vm error-message format needs to be confirmed against the library version in use.

Important Files Changed

Filename Overview
apps/sim/lib/execution/isolated-vm-worker.cjs Memory limit raised to 256 MB at both isolate-creation sites; friendly MemoryLimitError branch added in both executeCode and executeTask catch blocks using string-matching on err.message
apps/sim/sandbox-tasks/docx-generate.ts Full image-helper globals added (getFileBase64, addImage) with MIME allowlist, comma-guard, and size cap; page geometry constants added; finalize uses hand-rolled base64 decoder due to JSZip browser build limitation
apps/sim/sandbox-tasks/pdf-generate.ts embedImage hardened with explicit PNG/JPEG allowlist and comma-guard; drawImage, getFileBase64, rgb, StandardFonts, LETTER, A4 shortcuts added; dead Buffer ternary removed
apps/sim/sandbox-tasks/pptx-generate.ts Image helpers and slide geometry constants added; addImage validates required positional opts; getFileBase64 strips data: prefix for PptxGenJS format — but unlike docx/pdf, no explicit MIME-type guard before passing to PptxGenJS

Sequence Diagram

sequenceDiagram
    participant UserCode as User Sandbox Code
    participant Bootstrap as Bootstrap Globals
    participant Broker as workspaceFileBroker
    participant Lib as PDF/DOCX/PPTX Library

    UserCode->>Bootstrap: getFileBase64(fileId)
    Bootstrap->>Broker: "workspaceFile({ fileId })"
    Broker-->>Bootstrap: "{ dataUri }"
    Bootstrap->>Bootstrap: "size check (> 8 MB base64 throws)"
    Bootstrap-->>UserCode: dataUri string

    UserCode->>Bootstrap: addImage / drawImage / embedImage(...)
    Bootstrap->>Bootstrap: validate required opts (x,y,w,h / width,height)
    Bootstrap->>Bootstrap: parse comma separator (-1 throws)
    Bootstrap->>Bootstrap: MIME type guard (unsupported throws)
    Bootstrap->>Lib: embedPng / embedJpg / ImageRun / slide.addImage
    Lib-->>UserCode: embedded image object

    UserCode->>Bootstrap: finalize (auto-called by executor)
    Bootstrap->>Lib: save / write / toBase64String
    Lib-->>Bootstrap: bytes / base64
    Bootstrap-->>Bootstrap: hand-rolled base64 decode (DOCX only)
    Bootstrap-->>UserCode: Uint8Array output
Loading

Reviews (9): Last reviewed commit: "fix(sandbox): consistency and cleanup pa..." | Re-trigger Greptile

Comment thread apps/sim/sandbox-tasks/docx-generate.ts Outdated
Comment thread apps/sim/sandbox-tasks/docx-generate.ts Outdated
Comment thread apps/sim/sandbox-tasks/pdf-generate.ts
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/sandbox-tasks/docx-generate.ts
Comment thread apps/sim/sandbox-tasks/docx-generate.ts Outdated
Comment thread apps/sim/sandbox-tasks/docx-generate.ts Outdated
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/sandbox-tasks/pptx-generate.ts Outdated
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/sandbox-tasks/pptx-generate.ts
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/sandbox-tasks/docx-generate.ts Outdated
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 925d2ba. Configure here.

- DOCX addImage: upfront width/height validation (matches PDF/PPTX pattern)
- PDF embedImage: remove dead Buffer ternary; drop redundant size guard already enforced in getFileBase64
- isolated-vm-worker: add friendly MemoryLimitError branch in both execute paths so OOM produces a clear message instead of a raw V8 error
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit b7db38a. Configure here.

@waleedlatif1 waleedlatif1 merged commit a2aa648 into staging May 8, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/pptx branch May 8, 2026 03:01
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

Comment thread apps/sim/lib/execution/isolated-vm-worker.cjs
Comment thread apps/sim/lib/execution/isolated-vm-worker.cjs
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b7db38a. Configure here.

name: 'MemoryLimitError',
},
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOM check is unreachable due to isDisposed ordering

High Severity

The new MemoryLimitError branches in both executeCode and executeTask are unreachable dead code. The isolate?.isDisposed check on lines 382 and 936 runs before the OOM message check. Since isolated-vm auto-disposes the isolate on OOM, isDisposed is always true when an out-of-memory error occurs, so the AbortError ("Execution cancelled") branch fires first. OOM errors will be misreported as user-initiated cancellations. The OOM check needs to be moved before the isDisposed guard to disambiguate the two cases.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit b7db38a. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant