Context
The publisher proxy currently buffers the entire response body in memory before sending any bytes to the client. For a 222 KB HTML page, peak memory is ~4x the response size, and no bytes reach the client until all processing completes.
Performance results (staging vs production, median over 5 runs, Chrome 1440x900)
| Metric | Production (v135, buffered) | Staging (v136, streaming) | Delta |
|---|---|---|---|
| TTFB | 54 ms | 35 ms | -19 ms (-35%) |
| First Paint | 186 ms | 160 ms | -26 ms (-14%) |
| First Contentful Paint | 186 ms | 160 ms | -26 ms (-14%) |
| DOM Content Loaded | 286 ms | 282 ms | -4 ms (-1%) |
| DOM Complete | 1060 ms | 663 ms | -397 ms (-37%) |
Measured on getpurpose.ai. Production (v135) buffers the entire response before sending. Staging (v136) streams processed chunks incrementally via StreamingBody.
Spec
See streaming response design spec (PR #562).
Plan
See implementation plan (PR #562).
Phase 1: Make streaming pipeline chunk-emitting (PR #583)
Ships independently with immediate memory savings.
- Phase 1, Task 1: Fix encoder finalization in process_through_compression #568
- Phase 1, Task 2: Convert process_gzip_to_gzip to chunk-based processing #569
- Phase 1, Task 3: Convert decompress_and_process to chunk-based processing #570
- Phase 1, Task 4: Rewrite HtmlRewriterAdapter for incremental streaming #571
- Phase 1, Task 5: Full verification #572
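The chunk-based processing and encoder finalization that Phase 1 targets can be illustrated with a minimal, std-only sketch. Everything here is an assumption for illustration: `TrailerEncoder` is a toy stand-in for a real compression encoder (it appends a 4-byte additive checksum where gzip would write its own trailer), and `process_chunked` is a hypothetical name, not the project's `process_through_compression`.

```rust
use std::io::{self, Read, Write};

/// Hypothetical stand-in for a compression encoder: it forwards bytes
/// unchanged and appends a 4-byte additive checksum trailer on `finish`.
struct TrailerEncoder<W: Write> {
    inner: W,
    checksum: u32,
}

impl<W: Write> TrailerEncoder<W> {
    fn new(inner: W) -> Self {
        Self { inner, checksum: 0 }
    }

    /// Updates the running checksum and forwards the chunk immediately.
    fn write_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        for &b in chunk {
            self.checksum = self.checksum.wrapping_add(u32::from(b));
        }
        self.inner.write_all(chunk)
    }

    /// Writes the trailer, flushes, and returns the underlying writer.
    fn finish(mut self) -> io::Result<W> {
        self.inner.write_all(&self.checksum.to_le_bytes())?;
        self.inner.flush()?;
        Ok(self.inner)
    }
}

/// Reads `input` in fixed-size chunks, transforms each chunk, and emits it
/// right away. Only one input chunk is held in memory at a time.
fn process_chunked<R: Read, W: Write>(
    mut input: R,
    output: W,
    chunk_size: usize,
    mut transform: impl FnMut(&[u8]) -> Vec<u8>,
) -> io::Result<W> {
    let mut encoder = TrailerEncoder::new(output);
    let mut buf = vec![0u8; chunk_size.max(1)];
    loop {
        let n = input.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        encoder.write_chunk(&transform(&buf[..n]))?;
    }
    // Finalize explicitly so the trailer is never silently dropped.
    encoder.finish()
}
```

Consuming the encoder in `finish` makes it hard to forget finalization: the only way to get the writer back is to write the trailer first.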
Phase 2: Stream responses to client via StreamingBody (PR #585)
Depends on Phase 1. Improves TTFB (time to first byte) and TTLB (time to last byte).
- Phase 2, Task 6: Migrate entry point from #[fastly::main] to raw main() #573
- Phase 2, Task 7: Refactor process_response_streaming to accept W: Write #574
- Phase 2, Task 8: Add streaming path to publisher proxy #575
- Phase 2, Task 9: Full verification #576
- Phase 2, Task 10: Chrome DevTools MCP baseline and comparison #577
Phase 3: Make script rewriters fragment-safe (PR #591)
Depends on Phase 2. Removes the buffered fallback, enabling full streaming even with GTM/NextJS script rewriters active.
- Phase 3, Task 11: Make NextJsNextDataRewriter fragment-safe #586
- Phase 3, Task 12: Make GoogleTagManagerIntegration rewrite fragment-safe #587
- Phase 3, Task 13: Remove buffered mode from HtmlRewriterAdapter #588
- Phase 3, Task 14: Always use streaming adapter in create_html_processor #589
- Phase 3, Task 15: Full verification and regression tests #590
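"Fragment-safe" here means a rewriter must produce the same output regardless of where chunk boundaries fall. This issue does not describe how the GTM and NextJS rewriters achieve that, so the following is only one generic approach, sketched under assumptions: a byte-level replacer that holds back a short tail of each chunk in case it is the start of a match. `FragmentSafeReplacer` and the `gtm.js` / `proxy.js` strings are hypothetical.

```rust
/// Replaces every occurrence of `needle` with `replacement`, even when the
/// needle is split across chunk boundaries.
struct FragmentSafeReplacer {
    needle: Vec<u8>,
    replacement: Vec<u8>,
    /// Bytes held back because they could be the start of a match.
    carry: Vec<u8>,
}

impl FragmentSafeReplacer {
    fn new(needle: &[u8], replacement: &[u8]) -> Self {
        assert!(!needle.is_empty(), "needle must not be empty");
        Self {
            needle: needle.to_vec(),
            replacement: replacement.to_vec(),
            carry: Vec::new(),
        }
    }

    /// Feeds one chunk and returns the bytes that are safe to emit now.
    fn push(&mut self, chunk: &[u8]) -> Vec<u8> {
        // Prepend the tail held back from the previous chunk.
        let mut data = std::mem::take(&mut self.carry);
        data.extend_from_slice(chunk);

        let mut out = Vec::with_capacity(data.len());
        let mut i = 0;
        // Scan only positions where a full needle could still fit.
        while i + self.needle.len() <= data.len() {
            if data[i..].starts_with(&self.needle) {
                out.extend_from_slice(&self.replacement);
                i += self.needle.len();
            } else {
                out.push(data[i]);
                i += 1;
            }
        }
        // Fewer than needle.len() bytes remain: hold them for the next chunk.
        self.carry = data[i..].to_vec();
        out
    }

    /// Emits whatever is still held back once the input has ended.
    fn finish(self) -> Vec<u8> {
        self.carry
    }
}
```

The held-back tail is always shorter than the needle, so memory stays bounded by the chunk size plus the needle length. A regression test for this property can feed the same input split at every possible offset and compare against the unsplit result.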
Phase 4: Stream binary pass-through responses
Depends on Phase 2. Non-processable content (images, fonts, video) currently buffers in memory unnecessarily. Phase 4 streams them directly via io::copy into StreamingBody.
- Phase 4, Task 16: Stream binary pass-through responses via io::copy #592
- Phase 4, Task 17: Binary pass-through tests and verification #593
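As a rough illustration of the pass-through idea, `std::io::copy` moves bytes from any `Read` to any `Write` through a small internal buffer, so the payload is never held in memory as a whole. The helper name `pass_through` is hypothetical; in the proxy the sink would presumably be the `StreamingBody`, which this sketch replaces with a generic writer so that it runs with the standard library alone.

```rust
use std::io::{self, Read, Write};

/// Copies `body` to `sink` without materializing the whole payload.
/// Returns the number of bytes copied.
fn pass_through<R: Read, W: Write>(mut body: R, mut sink: W) -> io::Result<u64> {
    let copied = io::copy(&mut body, &mut sink)?;
    // Make sure buffered bytes reach the client before reporting success.
    sink.flush()?;
    Ok(copied)
}
```

Because the function is generic over `Write`, tests can pass a `Vec<u8>` as the sink and compare the result byte for byte with the input.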
Acceptance Criteria
- Streaming activates for all 2xx responses (text and binary)
- Peak memory per request reduced from ~4x the response size to a constant (chunk buffer + parser state)
- Client receives first body bytes after first processed chunk, not after full buffering
- No regressions on static, auction, or discovery endpoints
- Buffered fallback remains for HTML with post-processors and for non-2xx error pages
- Script rewriters (GTM, NextJS) work correctly under streaming fragmentation
- Binary responses (images, fonts) stream via pass-through without processing overhead
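One possible reading of how these criteria combine is sketched below as a small decision function. This is an interpretation, not the project's code: `DeliveryMode`, `choose_delivery_mode`, and its three inputs are hypothetical names, and the precedence between "all 2xx responses stream" and "buffered fallback for HTML with post-processors" is an assumption.

```rust
/// Hypothetical delivery modes derived from the acceptance criteria.
#[derive(Debug, PartialEq, Eq)]
enum DeliveryMode {
    /// Processable content, rewritten and emitted chunk by chunk.
    StreamProcessed,
    /// Non-processable content (images, fonts), copied through untouched.
    StreamPassThrough,
    /// Whole body collected before sending.
    Buffered,
}

/// Picks a delivery mode. Assumption: the buffered fallback takes
/// precedence over streaming when post-processors are registered.
fn choose_delivery_mode(
    status: u16,
    processable: bool,
    has_post_processors: bool,
) -> DeliveryMode {
    if !(200..300).contains(&status) {
        return DeliveryMode::Buffered; // non-2xx error pages
    }
    if !processable {
        return DeliveryMode::StreamPassThrough;
    }
    if has_post_processors {
        DeliveryMode::Buffered
    } else {
        DeliveryMode::StreamProcessed
    }
}
```

Keeping the decision in one pure function makes each criterion checkable with a plain unit test, independent of the Fastly runtime.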