Implement streaming response optimization for non-Next.js publisher proxy #563

@aram356

Context

The publisher proxy currently buffers the entire response body in memory before sending any bytes to the client. For a 222 KB HTML page, peak memory is ~4x the response size (roughly 0.9 MB), and no bytes reach the client until all processing completes.

Performance results (staging vs production, median over 5 runs, Chrome 1440x900)

| Metric | Production (v135, buffered) | Staging (v136, streaming) | Delta |
| --- | --- | --- | --- |
| TTFB | 54 ms | 35 ms | -19 ms (-35%) |
| First Paint | 186 ms | 160 ms | -26 ms (-14%) |
| First Contentful Paint | 186 ms | 160 ms | -26 ms (-14%) |
| DOM Content Loaded | 286 ms | 282 ms | -4 ms (~same) |
| DOM Complete | 1060 ms | 663 ms | -397 ms (-37%) |

Measured on getpurpose.ai. Production (v135) buffers the entire response before sending. Staging (v136) streams processed chunks incrementally via StreamingBody.

Spec

See streaming response design spec (PR #562).

Plan

See implementation plan (PR #562).

Phase 1: Make streaming pipeline chunk-emitting (PR #583)

Ships independently with immediate memory savings.
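As a rough illustration of what "chunk-emitting" could look like, the sketch below reads the upstream body in fixed-size chunks and writes each processed chunk out immediately instead of collecting the whole result first. The names `process_chunk` and `stream_process` are hypothetical and not taken from the codebase, and the uppercase transform only stands in for the real HTML processing.

```rust
use std::io::{self, Read, Write};

/// Hypothetical per-chunk transform; a placeholder for the real processing.
fn process_chunk(chunk: &[u8]) -> Vec<u8> {
    chunk.to_ascii_uppercase()
}

/// Reads `input` in chunks of at most `chunk_size` bytes and writes each
/// processed chunk to `output` right away, so only one chunk is held in
/// memory at a time. Returns the total number of bytes written.
fn stream_process<R: Read, W: Write>(
    mut input: R,
    mut output: W,
    chunk_size: usize,
) -> io::Result<usize> {
    let mut buf = vec![0u8; chunk_size];
    let mut written = 0;
    loop {
        let n = input.read(&mut buf)?;
        if n == 0 {
            break; // end of input (or a zero-sized chunk buffer)
        }
        let processed = process_chunk(&buf[..n]);
        output.write_all(&processed)?;
        written += processed.len();
    }
    output.flush()?;
    Ok(written)
}
```

With this shape, peak memory depends on `chunk_size` and any parser state rather than on the size of the response body.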

Phase 2: Stream responses to client via StreamingBody (PR #585)

Depends on Phase 1. Improves time to first byte (TTFB) and time to last byte (TTLB).

Phase 3: Make script rewriters fragment-safe (PR #591)

Depends on Phase 2. Removes the buffered fallback, enabling full streaming even with GTM/NextJS script rewriters active.
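One way to think about fragment safety: a pattern the rewriter looks for may be split across two chunks, so the rewriter has to hold back a short tail of each chunk until it knows whether that tail starts a match. The sketch below shows this for plain byte-string replacement only. `FragmentSafeReplacer` is a hypothetical name, and the actual GTM/NextJS rewriters are likely more involved than a literal substring swap.

```rust
/// Replaces every occurrence of `needle` with `replacement`, even when a
/// match is split across chunk boundaries. Illustrative only.
struct FragmentSafeReplacer {
    needle: Vec<u8>,
    replacement: Vec<u8>,
    carry: Vec<u8>, // unemitted bytes that may begin a split match
}

impl FragmentSafeReplacer {
    fn new(needle: &[u8], replacement: &[u8]) -> Self {
        assert!(!needle.is_empty(), "needle must not be empty");
        Self {
            needle: needle.to_vec(),
            replacement: replacement.to_vec(),
            carry: Vec::new(),
        }
    }

    /// Feeds one chunk and returns the bytes that are safe to emit now.
    fn push(&mut self, chunk: &[u8]) -> Vec<u8> {
        self.carry.extend_from_slice(chunk);
        let mut out = Vec::new();
        let mut start = 0;
        // Replace every complete match currently in the buffer.
        while let Some(pos) = find(&self.carry[start..], &self.needle) {
            out.extend_from_slice(&self.carry[start..start + pos]);
            out.extend_from_slice(&self.replacement);
            start += pos + self.needle.len();
        }
        // Hold back up to needle.len() - 1 trailing bytes: they could be
        // the prefix of a match completed by the next chunk.
        let keep = (self.needle.len() - 1).min(self.carry.len() - start);
        let emit_end = self.carry.len() - keep;
        out.extend_from_slice(&self.carry[start..emit_end]);
        self.carry.drain(..emit_end);
        out
    }

    /// Call once at end of stream to emit whatever is still held back.
    fn finish(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.carry)
    }
}

/// Returns the index of the first occurrence of `needle` in `haystack`.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}
```

The held-back tail is bounded by the pattern length, so this keeps the constant-memory property while producing the same output regardless of where chunk boundaries fall.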

Phase 4: Stream binary pass-through responses

Depends on Phase 2. Non-processable content (images, fonts, video) currently buffers in memory unnecessarily. Phase 4 streams them directly via io::copy into StreamingBody.
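A minimal sketch of the pass-through idea, assuming the streaming sink implements `std::io::Write` (an assumption here about StreamingBody; the sketch uses a generic writer so it stays self-contained). `pass_through` is a hypothetical helper name.

```rust
use std::io::{self, Read, Write};

/// Copies `upstream` into `sink` without any processing. `io::copy` moves
/// the data in bounded-size pieces rather than collecting the whole body
/// first. Returns the number of bytes copied.
fn pass_through<R: Read, W: Write>(mut upstream: R, mut sink: W) -> io::Result<u64> {
    let copied = io::copy(&mut upstream, &mut sink)?;
    sink.flush()?;
    Ok(copied)
}
```

Because nothing inspects the bytes, this path avoids both the processing overhead and the full-body buffer for images, fonts, and video.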

Acceptance Criteria

  • Streaming activates for all 2xx responses (text and binary)
  • Peak memory per request reduced from ~4x to constant (chunk buffer + parser state)
  • Client receives first body bytes after first processed chunk, not after full buffering
  • No regressions on static, auction, or discovery endpoints
  • Buffered fallback remains available for HTML with post-processors and for non-2xx error pages
  • Script rewriters (GTM, NextJS) work correctly under streaming fragmentation
  • Binary responses (images, fonts) stream via pass-through without processing overhead
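The criteria above imply a per-response choice between three delivery paths. The following is only a sketch of that decision under stated assumptions: `DeliveryMode` and `choose_mode` are hypothetical names, and treating every `text/*` content type as processable is a guess, since the issue does not list which types the proxy actually processes.

```rust
#[derive(Debug, PartialEq, Eq)]
enum DeliveryMode {
    /// Process chunk by chunk and stream the result.
    StreamProcessed,
    /// Stream bytes through unchanged (images, fonts, video).
    StreamPassThrough,
    /// Buffer the whole body before sending.
    Buffered,
}

/// Picks a delivery path from the status code, content type, and whether
/// HTML post-processors are registered for this response.
fn choose_mode(status: u16, content_type: &str, has_post_processors: bool) -> DeliveryMode {
    // Non-2xx responses (error pages) keep the buffered path.
    if !(200..300).contains(&status) {
        return DeliveryMode::Buffered;
    }
    let ct = content_type.trim().to_ascii_lowercase();
    // HTML with post-processors needs the full document, so it is buffered.
    if ct.starts_with("text/html") && has_post_processors {
        return DeliveryMode::Buffered;
    }
    // Assumption: text/* is processable, everything else is pass-through.
    if ct.starts_with("text/") {
        DeliveryMode::StreamProcessed
    } else {
        DeliveryMode::StreamPassThrough
    }
}
```

For example, `choose_mode(200, "image/png", false)` selects the pass-through path, while a 404 HTML page stays buffered.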
