fix(export): stream database dump with LIMIT/OFFSET to fix large-DB failure (#59) #270

Open
Vinzz2303 wants to merge 1 commit into outerbase:main from Vinzz2303:fix/issue-59-streaming-database-dump

@Vinzz2303 Vinzz2303 commented Jun 4, 2026

Fixes

Closes #59 — Database dumps do not work on large databases

Problem

The previous implementation loaded the entire database into a single string in memory before sending any HTTP response:

let dumpContent = 'SQLite format 3\0'
// ... all rows appended ...
const blob = new Blob([dumpContent], ...)
return new Response(blob, { headers })

This caused two failure modes on large databases:

  • Memory exhaustion — Durable Objects have a 1 GB memory cap. Large databases cause OOM crashes mid-request.
  • Gateway timeout — The response body is not sent until the entire dump is assembled. Cloudflare's 30-second limit fires before a single byte is delivered to the client.

Solution

1. Pre-flight metadata collection

Table names and CREATE TABLE schemas are collected before the stream opens, so a DB error at this stage returns a clean HTTP 500 rather than an HTTP 200 with a truncated body.
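
A minimal sketch of the pre-flight idea, under stated assumptions: `QueryFn` and `handleDump` are hypothetical names rather than this PR's actual API, and the `Content-Type` value is a guess. The point it illustrates is that the schema query runs before any byte is written, so a failure can still change the status code.

```typescript
// Sketch only: `QueryFn` and `handleDump` are hypothetical names, not the PR's API.
type QueryFn = (sql: string) => Promise<Array<Record<string, unknown>>>;

async function handleDump(query: QueryFn): Promise<Response> {
    let schemas: string[];
    try {
        // Pre-flight: run the query that may fail before the stream is created.
        const rows = await query("SELECT name, sql FROM sqlite_master WHERE type = 'table'");
        schemas = rows.map((row) => String(row.sql));
    } catch {
        // No bytes have been sent yet, so the status code can still be 500.
        return new Response("Failed to read database schema", { status: 500 });
    }

    const encoder = new TextEncoder();
    const body = new ReadableStream<Uint8Array>({
        start(controller) {
            // Only schemas are written here; row data is omitted to keep the sketch short.
            for (const sql of schemas) controller.enqueue(encoder.encode(`${sql};\n`));
            controller.close();
        },
    });
    // The Content-Type value is an assumption; the PR text does not state it.
    return new Response(body, { status: 200, headers: { "Content-Type": "application/sql" } });
}
```

An error raised after the stream has started cannot be reported this way, because the 200 status line has already been sent.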

2. ReadableStream with LIMIT/OFFSET batching

Rows are fetched in pages of 1 000 rows and enqueued to a ReadableStream<Uint8Array> immediately. The HTTP response starts flowing after the first enqueue() call, so the first byte no longer waits for the whole dump to be assembled and the gateway timeout is avoided.
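
The batching described above could look roughly like the sketch below. It is not the PR's code: `QueryFn`, `streamTable`, and `formatRow` are hypothetical names, and using the stream's `pull()` callback (rather than a loop inside `start()`) is a design choice made here so that the next page is only fetched when the consumer is ready for it.

```typescript
// Sketch only: `QueryFn`, `streamTable`, and `formatRow` are hypothetical names.
type Row = Record<string, unknown>;
type QueryFn = (sql: string) => Promise<Row[]>;

const BATCH_SIZE = 1000;

function streamTable(
    table: string,
    query: QueryFn,
    formatRow: (table: string, row: Row) => string,
    batchSize: number = BATCH_SIZE,
): ReadableStream<Uint8Array> {
    const encoder = new TextEncoder();
    // Quote the identifier so unusual table names do not break the query.
    const quoted = `"${table.replace(/"/g, '""')}"`;
    let offset = 0;

    return new ReadableStream<Uint8Array>({
        // pull() runs each time the consumer is ready for more data.
        async pull(controller) {
            const rows = await query(`SELECT * FROM ${quoted} LIMIT ${batchSize} OFFSET ${offset}`);
            offset += rows.length;
            if (rows.length > 0) {
                const chunk = rows.map((row) => formatRow(table, row)).join("");
                controller.enqueue(encoder.encode(chunk));
            }
            // A short (or empty) page means the table is exhausted.
            if (rows.length < batchSize) controller.close();
        },
    });
}
```

Only one page of rows is held in memory at a time. Without an ORDER BY clause SQLite does not guarantee a stable row order across pages, so a real implementation may want to order by rowid or a primary key; that is a suggestion, not something the PR text states.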

3. Event-loop yielding

await new Promise(r => setTimeout(r, 0)) is called between batches and between tables, preventing the Durable Object's single JS thread from being monopolised.
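
To illustrate the yielding step, here is a small sketch. `yieldToEventLoop` and `forEachBatch` are hypothetical helper names, not identifiers from the PR. Awaiting a zero-delay timer lets already-queued timers and I/O callbacks run before the next batch starts.

```typescript
// Sketch only: both helper names are hypothetical.
const yieldToEventLoop = (): Promise<void> =>
    new Promise((resolve) => setTimeout(resolve, 0));

async function forEachBatch<T>(
    batches: T[][],
    handle: (batch: T[]) => void,
): Promise<void> {
    for (const batch of batches) {
        handle(batch);
        // Give other pending work a chance to run before the next batch.
        await yieldToEventLoop();
    }
}
```

A plain `await Promise.resolve()` would not have the same effect, since it only defers to the microtask queue and never returns control to the timer and I/O phases.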

4. Correct value escaping

| Type | Output |
| --- | --- |
| null / undefined | NULL |
| boolean | 1 / 0 |
| number | bare literal |
| Uint8Array (BLOB) | X'hexstring' |
| string | 'single-quoted' with '' escaping |
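
The mapping above can be sketched as a single function. `toSqlLiteral` is a hypothetical name, and the handling of non-finite numbers (emitting NULL) is an assumption, since the PR text does not cover that case.

```typescript
// Sketch only: `toSqlLiteral` is a hypothetical helper name.
type SqlValue = null | undefined | boolean | number | string | Uint8Array;

function toSqlLiteral(value: SqlValue): string {
    if (value === null || value === undefined) return "NULL";
    if (typeof value === "boolean") return value ? "1" : "0";
    if (typeof value === "number") {
        // NaN and Infinity have no SQL literal; NULL is an assumed fallback.
        return Number.isFinite(value) ? String(value) : "NULL";
    }
    if (value instanceof Uint8Array) {
        // BLOBs become X'..' hex literals, two hex digits per byte.
        const hex = Array.from(value, (b) => b.toString(16).padStart(2, "0")).join("");
        return `X'${hex}'`;
    }
    // Strings: wrap in single quotes and double any embedded single quote.
    return `'${value.replace(/'/g, "''")}'`;
}
```

For example, `toSqlLiteral("O'Brien")` yields `'O''Brien'` and `toSqlLiteral(new Uint8Array([0, 255]))` yields `X'00ff'`.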

5. Exclude sqlite_ internal tables

The pre-flight query filters out sqlite_sequence, sqlite_stat1, etc. to produce a clean, importable dump.
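
One way such a filter might be written is sketched below; the PR's actual query is not shown here, so `USER_TABLES_SQL` and `isUserTable` are hypothetical. Because `_` is a single-character wildcard in LIKE, the sketch escapes it so that only names with the literal `sqlite_` prefix are excluded.

```typescript
// Sketch only: both names are hypothetical, not taken from the PR.
// The ESCAPE clause makes "\_" match a literal underscore instead of any character.
const USER_TABLES_SQL =
    "SELECT name, sql FROM sqlite_master " +
    "WHERE type = 'table' AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'";

// Equivalent check on the JavaScript side, useful as a second line of defence.
function isUserTable(name: string): boolean {
    return !name.toLowerCase().startsWith("sqlite_");
}
```

SQLite reserves table names beginning with `sqlite_` for internal use, so a prefix check is enough to separate internal tables from user tables.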

Tests

9 tests — all passing (npx vitest run src/export/dump.test.ts):

  • ✅ Streaming response has correct Content-Type and Transfer-Encoding: chunked headers
  • ✅ Body contains CREATE TABLE + INSERT statements
  • ✅ Empty database produces valid SQL preamble/epilogue only
  • ✅ Tables with no rows are schema-only (no INSERT)
  • ✅ Single quotes in strings are properly escaped
  • ✅ NULL columns render as SQL NULL (new)
  • ✅ Pagination: full 1 000-row batch followed by partial batch = 1 001 INSERT statements (new)
  • ✅ Pre-flight DB error returns HTTP 500 (new)
  • ✅ sqlite_ internal tables excluded from dump (new)

/claim #59

…T batching

Fixes outerbase#59 - Database dumps do not work on large databases

## Problem

The previous implementation accumulated the entire database dump as a
string in memory before sending the HTTP response. This caused two
failure modes on large databases:
- Memory exhaustion (Durable Objects 1 GB cap)
- Gateway timeout (Cloudflare 30s limit before first byte)

## Solution

1. Pre-flight metadata collection - fetch table names and schemas BEFORE
   opening the stream. Any DB error returns clean HTTP 500 instead of
   broken HTTP 200 with truncated body.

2. ReadableStream with LIMIT/OFFSET batching - rows fetched in pages of
   1000 rows (BATCH_SIZE) and enqueued immediately. HTTP response starts
   flowing to client after first enqueue(), no more timeout issues.

3. Event-loop yielding - await new Promise(r => setTimeout(r, 0)) between
   batches and tables prevents Durable Object thread from blocking.

4. Correct value escaping:
   - null/undefined -> NULL
   - boolean        -> 1 / 0
   - number         -> bare literal
   - Uint8Array     -> X'hexstring' (BLOB)
   - string         -> single-quoted with escaped quotes

5. Exclude internal sqlite_ tables from dump output.

## Tests (9 total, all passing)

- Streaming response has correct Content-Type and Transfer-Encoding headers
- Body contains CREATE TABLE + INSERT statements
- Empty database produces valid SQL preamble/epilogue
- Tables with no rows are schema-only
- Single quotes in strings are properly escaped
- NULL columns render as SQL NULL (new test)
- Pagination across two LIMIT/OFFSET batches: 1000 + 1 = 1001 rows (new)
- Pre-flight DB error returns HTTP 500, not HTTP 200 (new test)
- sqlite_ internal tables excluded from dump (new test)
@Vinzz2303

Author

Hi team 👋 This PR is fully ready for review, featuring a streaming implementation with batching and 9 functional tests. Could you please check it out or approve the workflow to let tests run? Thank you!
