feat: external-source data replication plugin (closes #72) #273
Open
mikhaeelatefrizk wants to merge 4 commits into
Author
/claim #72
Purpose
Closes #72.
Adds a replication plugin that pulls new/changed rows from an external source (Postgres or MySQL — e.g. a Supabase database) into StarbaseDB's internal SQLite on a schedule, giving you a locally‑queryable edge replica. It covers all four asks in the issue:
- `tracking_col` + `tracking_type` (`timestamp` or `id`) drive change detection.

How it works
A Durable Object has a single alarm, which `src/do.ts` hardcodes to the cron plugin. Rather than competing for that alarm, the replication plugin reuses cron: it registers a cron task per job (`addEvent`) and runs its sync when that task fires (`onEvent`). No changes to `do.ts`, no alarm collision.
On each tick, per active job, it pulls `SELECT … WHERE tracking_col > ? ORDER BY tracking_col ASC LIMIT batch` from the source (the predicate is omitted on the first run), upserts the page with `INSERT OR REPLACE` into internal SQLite, advances the cursor to the max tracking value, and persists it, paginating until a short page or a per-run page cap.

Notable details
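The per-tick loop described under "How it works" can be sketched as follows. This is a minimal, synchronous illustration and not the plugin's code: `fetchPage`, `syncOnce`, the `Row` shape, and the in-memory `source` / `replica` are hypothetical stand-ins for the external query and the `INSERT OR REPLACE` upsert, and the tracking column is assumed to be a numeric `id`.

```typescript
// Hypothetical types for illustration; not the plugin's actual API.
type Row = { id: number; name: string };

interface SyncResult {
    cursor: number | null; // last tracking value seen, null if nothing synced yet
    rowsSynced: number;
    pages: number;
}

// Stand-in for the external source. Returns up to `limit` rows with id > cursor,
// ordered ascending. A null cursor means "first run", so the predicate is omitted.
function fetchPage(source: Row[], cursor: number | null, limit: number): Row[] {
    return source
        .filter((r) => cursor === null || r.id > cursor)
        .sort((a, b) => a.id - b.id)
        .slice(0, limit);
}

// One sync run: page through the source until a short page or the page cap.
function syncOnce(
    source: Row[],
    replica: { [id: number]: Row },
    cursor: number | null,
    batch: number,
    maxPages: number
): SyncResult {
    let rowsSynced = 0;
    let pages = 0;
    while (pages < maxPages) {
        const page = fetchPage(source, cursor, batch);
        pages++;
        for (const row of page) {
            replica[row.id] = row; // upsert keyed on id, like INSERT OR REPLACE
            cursor = row.id; // page is ascending, so this ends at the page maximum
        }
        rowsSynced += page.length;
        if (page.length < batch) break; // short page: the replica has caught up
    }
    return { cursor, rowsSynced, pages };
}
```

With five source rows and `batch = 2`, this sketch stops on the third page because that page is shorter than the batch. When the row count is an exact multiple of the batch, it needs one extra empty page to detect the end.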
- … `?` placeholders (the `@outerbase/sdk` rewrites these per dialect); table/column identifiers are validated against `^[A-Za-z0-9_]+$` and always quoted (injection-safe, reserved-word-safe).
- … `isRaw`) so the query reaches the source unmodified.
- … `id`, temporal for `timestamp`); typed schema inference; chunked upserts within SQLite's bound-variable limit.
- `last_run_at` / `last_error` / `rows_synced` observability; the source password is redacted on list responses and in stored error messages; all routes are admin-only.

Included alongside the plugin
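Two of the details listed under "Notable details", identifier validation and chunked upserts, could look roughly like the sketch below. The helper names (`quoteIdent`, `buildUpserts`) and the `MAX_BOUND_VARS` value are assumptions made for illustration. The real bound-variable limit depends on the SQLite build or host, and the plugin's own helpers may differ.

```typescript
// Same pattern the PR description gives for table/column identifiers.
const IDENT = /^[A-Za-z0-9_]+$/;

// Validate, then always double-quote, so reserved words such as "order" are safe.
function quoteIdent(ident: string): string {
    if (!IDENT.test(ident)) throw new Error("invalid identifier: " + ident);
    return '"' + ident + '"';
}

// Assumed limit for this sketch only; check the actual limit of your SQLite host.
const MAX_BOUND_VARS = 100;

interface Upsert {
    sql: string;
    params: unknown[];
}

// Split rows into multi-row INSERT OR REPLACE statements so that no single
// statement binds more than `maxVars` values.
function buildUpserts(
    table: string,
    columns: string[],
    rows: unknown[][],
    maxVars: number = MAX_BOUND_VARS
): Upsert[] {
    if (columns.length === 0 || columns.length > maxVars) {
        throw new Error("column count must be between 1 and " + maxVars);
    }
    const rowsPerChunk = Math.floor(maxVars / columns.length);
    const cols = columns.map(quoteIdent).join(", ");
    const rowPlaceholder = "(" + columns.map(() => "?").join(", ") + ")";
    const statements: Upsert[] = [];
    for (let i = 0; i < rows.length; i += rowsPerChunk) {
        const chunk = rows.slice(i, i + rowsPerChunk);
        const params: unknown[] = [];
        for (const row of chunk) {
            for (const value of row) params.push(value); // values are bound, never inlined
        }
        const values = chunk.map(() => rowPlaceholder).join(", ");
        statements.push({
            sql: `INSERT OR REPLACE INTO ${quoteIdent(table)} (${cols}) VALUES ${values}`,
            params,
        });
    }
    return statements;
}
```

Only identifiers are interpolated into the SQL text, and only after passing the allow-list pattern. Row values always travel as bound parameters.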
- `plugins/cron`: a small, backward-compatible `removeEvent(name, dataSource?)` (and an optional `dataSource` arg on `addEvent`) so schedulers can clean up their tasks and not depend on middleware ordering.
- `src/operation.ts`: `executeSDKQuery` now closes the external driver connection in a `finally` (mirroring the Hyperdrive branch). Previously it leaked a connection per external query, so a forever-polling replicator would eventually exhaust the source's available connections.

Tasks
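The connection-leak fix listed under "Included alongside the plugin" follows the usual `try` / `finally` shape. The sketch below shows that shape only: the `Connection` interface and `queryAndClose` are hypothetical and are not the actual signature of `executeSDKQuery` or of any driver.

```typescript
// Hypothetical connection shape, for illustration only.
interface Connection {
    query(sql: string, params: unknown[]): Promise<unknown[]>;
    close(): Promise<void>;
}

// Open a connection, run one query, and always close the connection,
// whether the query resolves or throws.
async function queryAndClose(
    open: () => Promise<Connection>,
    sql: string,
    params: unknown[]
): Promise<unknown[]> {
    const conn = await open();
    try {
        return await conn.query(sql, params);
    } finally {
        await conn.close(); // runs on success and on error, so nothing leaks
    }
}
```

Because the close call sits in `finally`, a poller that runs this on every tick holds at most one connection at a time, even when the source returns errors.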
- `plugins/replication/index.ts`: the plugin
- `plugins/replication/index.test.ts`: 50 unit tests
- `plugins/replication/README.md` + `meta.json`
- `src/index.ts`
- `CronPlugin.removeEvent` + optional `dataSource`
- `executeSDKQuery` connection-leak fix
- `prettier` clean, no new type errors in touched files

Verify
```sh
pnpm install
pnpm vitest run plugins/replication   # 50 passing
```
End-to-end (optional), pointing a job at a Supabase table:
See `plugins/replication/README.md` for the full API (create / list / run / reset / pause / delete).

Before
No mechanism to replicate an external table into the internal database; reads against external data always hit the remote source.
After
A configurable, per-table pull replicator with cursor-based change tracking, exposed under `/replication/*`.