
feat(plugins): add Replicator plugin to mirror external data#250

Draft
ThaiTrevor wants to merge 2 commits into outerbase:main from ThaiTrevor:fix/issue-72-starbasedb-replicate-data-from-external

🤖 AI-assisted contribution — This PR was drafted with AI assistance and reviewed by a native Vietnamese speaker before submission. Placeholders, terminology, and file format were validated automatically. I will respond to review feedback. Happy to revise or close if not a fit.


Purpose

Closes #72.

Adds a new ReplicatorPlugin under plugins/replicator/ that pulls rows from an external database (Postgres, MySQL, Cloudflare D1, Turso, or another StarbaseDB) into the StarbaseDB internal SQLite store. Each replication pass is driven by a per-table watermark column (e.g. updated_at or a monotonic id) so only rows that changed since the previous run are transferred.

What changed

  • plugins/replicator/index.ts — new ReplicatorPlugin extending StarbasePlugin. Creates tmp_replication_state(table_name, last_value, last_synced_at) on registration, exposes sync() and POST /replicator/sync (admin-only), and upserts rows via INSERT ... ON CONFLICT(primaryKey) DO UPDATE.
  • Watermark tracking compares values numerically when both sides parse as numbers, so a monotonic integer id column no longer falls into the lexicographic trap (as strings, "99" sorts after "100"). Otherwise a string comparison is used, which still orders ISO timestamps correctly.
  • Identifiers (name, watermarkColumn, primaryKey, destTable) are validated against [A-Za-z_][A-Za-z0-9_]* at construction time and quoted in the external SELECT using dialect-appropriate quoting (backticks for MySQL, double quotes elsewhere) — consistent with the existing double-quoted destination-side identifiers.
  • plugins/replicator/index.test.ts — vitest suite covering constructor validation, identifier validation, state-table creation, initial pull, watermark-bounded pulls, dest-table override, MySQL dialect quoting, and numeric-watermark ordering.
  • plugins/replicator/README.md — usage, configuration, Cron-plugin scheduling snippet, destination-table DDL template, and a note that large backfills require repeated sync() calls because each call is bounded by batchSize.
  • plugins/replicator/meta.json — registry metadata to match the other plugins.
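
The identifier handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `Dialect`, `assertValidIdentifier`, and `quoteIdentifier` are hypothetical, and only the `[A-Za-z_][A-Za-z0-9_]*` pattern and the backtick/double-quote rule come from the description.

```typescript
// Hypothetical names for illustration; the plugin's real helpers may differ.
type Dialect = "postgres" | "mysql" | "sqlite";

// Pattern quoted in the PR description, anchored so the whole string must match.
const IDENTIFIER_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Reject anything that is not a plain identifier before it reaches SQL text.
function assertValidIdentifier(value: string, label: string): string {
  if (!IDENTIFIER_PATTERN.test(value)) {
    throw new Error(`Invalid ${label}: ${JSON.stringify(value)}`);
  }
  return value;
}

// Backticks for MySQL, double quotes for every other dialect.
function quoteIdentifier(name: string, dialect: Dialect): string {
  assertValidIdentifier(name, "identifier");
  return dialect === "mysql" ? `\`${name}\`` : `"${name}"`;
}
```

Validating first means the quoting step never has to escape embedded quote characters, since the pattern cannot match them.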

How it works

  1. On registration the plugin creates tmp_replication_state(table_name, last_value, last_synced_at).
  2. Each sync() call reads the stored watermark per table, runs SELECT * FROM "<table>" WHERE "<watermarkColumn>" > ? ORDER BY "<watermarkColumn>" ASC LIMIT <batchSize> against the external source, and upserts each row into the internal store using ON CONFLICT(<primaryKey>) DO UPDATE.
  3. After the batch, the highest watermark seen (numeric or lexicographic depending on the value type) becomes the new stored last_value.
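
The three steps above could look roughly like this in code. It is a sketch under assumptions, not the implementation in this PR: every function name here (`buildSelect`, `buildUpsert`, `compareWatermarks`, `highestWatermark`) is hypothetical, identifiers are assumed to have been validated already, and double-quote quoting is used throughout for brevity.

```typescript
type Watermark = string | number;

// Step 2a: bounded, ordered pull from the external source (one bind parameter).
function buildSelect(table: string, watermarkColumn: string, batchSize: number): string {
  return `SELECT * FROM "${table}" WHERE "${watermarkColumn}" > ? ` +
    `ORDER BY "${watermarkColumn}" ASC LIMIT ${batchSize}`;
}

// Step 2b: SQLite upsert keyed on the primary key.
function buildUpsert(destTable: string, primaryKey: string, columns: string[]): string {
  const cols = columns.map((c) => `"${c}"`).join(", ");
  const placeholders = columns.map(() => "?").join(", ");
  const updates = columns
    .filter((c) => c !== primaryKey)
    .map((c) => `"${c}" = excluded."${c}"`)
    .join(", ");
  // With no non-key columns there is nothing to update on conflict.
  const action = updates ? `DO UPDATE SET ${updates}` : "DO NOTHING";
  return `INSERT INTO "${destTable}" (${cols}) VALUES (${placeholders}) ` +
    `ON CONFLICT("${primaryKey}") ${action}`;
}

function isNumeric(v: Watermark): boolean {
  if (typeof v === "number") return Number.isFinite(v);
  return v.trim() !== "" && Number.isFinite(Number(v));
}

// Numeric compare when both sides parse as numbers, string compare otherwise.
function compareWatermarks(a: Watermark, b: Watermark): number {
  if (isNumeric(a) && isNumeric(b)) return Number(a) - Number(b);
  const sa = String(a);
  const sb = String(b);
  return sa < sb ? -1 : sa > sb ? 1 : 0;
}

// Step 3: the highest watermark seen in the batch becomes the new last_value.
function highestWatermark(
  rows: Record<string, unknown>[],
  column: string,
  previous: Watermark | null,
): Watermark | null {
  let best = previous;
  for (const row of rows) {
    const value = row[column];
    if (typeof value !== "string" && typeof value !== "number") continue;
    if (best === null || compareWatermarks(value, best) > 0) best = value;
  }
  return best;
}
```

Note that `Number()` loses precision above `Number.MAX_SAFE_INTEGER`, so very large integer ids would need a `BigInt`-based comparison in a sketch like this one.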

Scheduling is delegated to the existing Cron plugin — the README shows the snippet.

Tasks

  • Implement the plugin
  • Add unit tests
  • Document usage, scheduling, and destination-table bootstrapping
  • Validate identifiers and quote them in the external SELECT
  • Compare numeric watermarks numerically

Verify

  • npx vitest run plugins/replicator/index.test.ts — 11/11 passing.
  • The 4 failures in src/rls/index.test.ts from a full npx vitest run are pre-existing on main and unrelated to this change (confirmed by running the RLS suite on main).

Before

  • Branch contains exactly one commit, scoped to plugins/replicator/*.
  • No edits to unrelated files.

…nal store

Adds a new StarbasePlugin under plugins/replicator that pulls rows from a
configured external database (Postgres, MySQL, D1, Turso, StarbaseDB) into
StarbaseDB's internal SQLite store using a per-table watermark column.
Exposes POST /replicator/sync (admin-only) and a public sync() method that
can be wired into the Cron plugin or any external scheduler.

Closes outerbase#72
