fix(export): stream database dump with LIMIT/OFFSET to fix large-DB failure (#59) #270
Open
Vinzz2303 wants to merge 1 commit into
…T batching

Fixes outerbase#59 - Database dumps do not work on large databases

## Problem

The previous implementation accumulated the entire database dump as a string in memory before sending the HTTP response. This caused two failure modes on large databases:

- Memory exhaustion (Durable Objects 1 GB cap)
- Gateway timeout (Cloudflare 30s limit before first byte)

## Solution

1. Pre-flight metadata collection - fetch table names and schemas BEFORE opening the stream. Any DB error returns clean HTTP 500 instead of broken HTTP 200 with truncated body.
2. ReadableStream with LIMIT/OFFSET batching - rows fetched in pages of 1000 rows (BATCH_SIZE) and enqueued immediately. HTTP response starts flowing to client after first enqueue(), no more timeout issues.
3. Event-loop yielding - await new Promise(r => setTimeout(r, 0)) between batches and tables prevents Durable Object thread from blocking.
4. Correct value escaping:
   - null/undefined -> NULL
   - boolean -> 1 / 0
   - number -> bare literal
   - Uint8Array -> X'hexstring' (BLOB)
   - string -> single-quoted with escaped quotes
5. Exclude internal sqlite_ tables from dump output.

## Tests (9 total, all passing)

- Streaming response has correct Content-Type and Transfer-Encoding headers
- Body contains CREATE TABLE + INSERT statements
- Empty database produces valid SQL preamble/epilogue
- Tables with no rows are schema-only
- Single quotes in strings are properly escaped
- NULL columns render as SQL NULL (new test)
- Pagination across two LIMIT/OFFSET batches: 1000 + 1 = 1001 rows (new)
- Pre-flight DB error returns HTTP 500, not HTTP 200 (new test)
- sqlite_ internal tables excluded from dump (new test)
Author
Hi team 👋 This PR is fully ready for review, featuring a streaming implementation with batching and 9 functional tests. Could you please check it out or approve the workflow to let tests run? Thank you!
Fixes
Closes #59: Database dumps do not work on large databases
Problem
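The previous implementation loaded the entire database into a single string in memory before sending any HTTP response.

This caused two failure modes on large databases:

- Memory exhaustion (Durable Objects 1 GB cap)
- Gateway timeout (Cloudflare 30s limit before first byte)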
Solution
1. Pre-flight metadata collection

Table names and `CREATE TABLE` schemas are collected before the stream opens. Any DB error returns a clean HTTP 500 (not a broken HTTP 200 with truncated body).

2. `ReadableStream` with `LIMIT`/`OFFSET` batching

Rows are fetched in pages of 1,000 rows and enqueued to a `ReadableStream<Uint8Array>` immediately. The HTTP response starts flowing after the first `enqueue()` call, eliminating the gateway-timeout problem entirely.

3. Event-loop yielding

`await new Promise(r => setTimeout(r, 0))` is called between batches and between tables, preventing the Durable Object's single JS thread from being monopolised.

4. Correct value escaping

- `null` / `undefined` -> `NULL`
- `boolean` -> `1` / `0`
- `number` -> bare literal
- `Uint8Array` (BLOB) -> `X'hexstring'`
- `string` -> `'single-quoted'` with `''` escaping

5. Exclude `sqlite_` internal tables

The pre-flight query filters out `sqlite_sequence`, `sqlite_stat1`, etc. to produce a clean, importable dump.

Tests

9 tests, all passing (`npx vitest run src/export/dump.test.ts`):

- Streaming response has correct `Content-Type` and `Transfer-Encoding: chunked` headers
- Body contains `CREATE TABLE` + `INSERT` statements
- Empty database produces valid SQL preamble/epilogue
- Tables with no rows are schema-only (no `INSERT`)
- Single quotes in strings are properly escaped
- `NULL` columns render as SQL `NULL` (new)
- Pagination across two `LIMIT`/`OFFSET` batches: 1000 + 1 = 1001 `INSERT` statements (new)
- Pre-flight DB error returns HTTP 500, not HTTP 200 (new)
- `sqlite_` internal tables excluded from dump (new)

/claim #59
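For reviewers, the value-escaping rules listed under step 4 could be implemented roughly as follows. This is a minimal sketch rather than the code in this PR: `SqlValue`, `escapeSqlValue`, and `toInsert` are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper names; the PR's actual functions are not shown here.
type SqlValue = null | undefined | boolean | number | string | Uint8Array;

// Render one JavaScript value as a SQL literal, following the step 4 rules.
function escapeSqlValue(value: SqlValue): string {
    if (value === null || value === undefined) return "NULL";
    if (typeof value === "boolean") return value ? "1" : "0";
    if (typeof value === "number") return String(value); // bare literal
    if (value instanceof Uint8Array) {
        // BLOBs become X'..' with two lowercase hex digits per byte.
        const hex = Array.from(value, (byte) =>
            byte.toString(16).padStart(2, "0"),
        ).join("");
        return `X'${hex}'`;
    }
    // Strings: wrap in single quotes and double any embedded single quote.
    return `'${value.replace(/'/g, "''")}'`;
}

// Build one INSERT statement; the table name is quoted as an identifier.
function toInsert(table: string, row: SqlValue[]): string {
    const ident = `"${table.replace(/"/g, '""')}"`;
    return `INSERT INTO ${ident} VALUES (${row.map(escapeSqlValue).join(", ")});`;
}
```

For example, `toInsert("users", [1, "O'Brien", null])` yields `INSERT INTO "users" VALUES (1, 'O''Brien', NULL);`.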
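Steps 1, 2, 3, and 5 fit together roughly like this. The sketch is an illustration under assumptions, not the code in this PR: the `Db` interface, `dumpChunks`, `dumpResponse`, the injected `renderRow` callback, and the `application/sql` content type are all hypothetical. It also swaps in a `pull()`-driven stream instead of enqueueing everything from `start()`, so a batch is fetched only when the consumer asks for more; the `setTimeout(r, 0)` yield described in step 3 is kept.

```typescript
// Hypothetical database handle; the project's real interface may differ.
type Row = Record<string, unknown>;
interface Db {
    query(sql: string, params?: unknown[]): Promise<Row[]>;
}
type TableMeta = { name: string; sql: string };
type RenderRow = (table: string, row: Row) => string;

const BATCH_SIZE = 1000;
const tick = () => new Promise<void>((resolve) => setTimeout(resolve, 0));

// Yields the dump piece by piece: one schema statement per table, then one
// chunk of INSERT statements per LIMIT/OFFSET page.
async function* dumpChunks(
    db: Db,
    tables: TableMeta[],
    renderRow: RenderRow,
): AsyncGenerator<string, void, void> {
    for (const table of tables) {
        yield `${table.sql};\n`;
        const ident = `"${table.name.replace(/"/g, '""')}"`;
        for (let offset = 0; ; offset += BATCH_SIZE) {
            const rows = await db.query(
                `SELECT * FROM ${ident} LIMIT ? OFFSET ?`,
                [BATCH_SIZE, offset],
            );
            if (rows.length > 0) {
                yield rows.map((row) => renderRow(table.name, row) + "\n").join("");
            }
            if (rows.length < BATCH_SIZE) break; // short page: table is done
            await tick(); // yield to the event loop between batches
        }
        await tick(); // and between tables
    }
}

async function dumpResponse(db: Db, renderRow: RenderRow): Promise<Response> {
    let tables: TableMeta[];
    try {
        // Pre-flight: a failure here becomes a clean HTTP 500, because no
        // response body has been started yet. Internal sqlite_ tables are
        // filtered out (the underscore is escaped so LIKE treats it literally).
        const rows = await db.query(
            "SELECT name, sql FROM sqlite_master WHERE type = 'table' " +
                "AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'",
        );
        tables = rows.map((r) => ({ name: String(r.name), sql: String(r.sql) }));
    } catch {
        return new Response("Database dump failed", { status: 500 });
    }

    const chunks = dumpChunks(db, tables, renderRow);
    const encoder = new TextEncoder();
    const stream = new ReadableStream<Uint8Array>({
        // Called whenever the consumer has room for another chunk.
        async pull(controller) {
            const next = await chunks.next();
            if (next.done) {
                controller.close();
            } else {
                controller.enqueue(encoder.encode(next.value));
            }
        },
        // Stop querying if the client disconnects mid-dump.
        async cancel() {
            await chunks.return(undefined);
        },
    });
    // The content type is an assumption; the PR does not state its value.
    return new Response(stream, { headers: { "Content-Type": "application/sql" } });
}
```

One design point that may be worth discussing in review: `LIMIT`/`OFFSET` paging without an `ORDER BY` relies on SQLite returning rows in the same order on every query, so an explicit ordering could be considered if the table can change while the dump is running.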