feat: Add pgschema.toml configuration file support #433

Open
NFUChen wants to merge 39 commits into pgplex:main from NFUChen:feat/upstream-pr-config-file

Conversation

@NFUChen NFUChen commented May 14, 2026

Add pgschema.toml configuration file support

Summary

Adds support for a TOML-based configuration file (pgschema.toml) so users can define connection parameters, flags, and per-environment overrides without repeating CLI flags on every invocation.

Before:

pgschema plan --host localhost --port 5432 --db myapp --user postgres --schema public --file schema.sql
pgschema apply --host localhost --port 5432 --db myapp --user postgres --schema public --file schema.sql

After:

# with pgschema.toml in the working directory
pgschema plan
pgschema apply

Features

1. Flat config file

All existing CLI flags can be set in pgschema.toml:

host = "localhost"
port = 5432
db = "myapp"
user = "postgres"
schema = "public"
file = "schema.sql"

2. Named environments (--env)

[env.*] blocks define per-environment overrides that inherit from the base level:

schema = "public"
file = "schema.sql"

[env.dev]
host = "localhost"
db = "myapp_dev"
user = "postgres"

[env.prod]
host = "prod-db.internal"
db = "myapp_prod"
user = "app_user"
lock-timeout = "30s"
auto-approve = false

pgschema plan --env dev
pgschema apply --env prod

Environment merging uses TOML metadata (IsDefined) so explicitly setting a boolean to false in an env block correctly overrides a base-level true — zero values are not silently skipped.
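The merge rule can be illustrated without a TOML library. The following is a minimal sketch, not the PR's implementation: key presence in a map stands in for the TOML metadata IsDefined check, and mergeEnv is a hypothetical helper name.

```go
package main

import "fmt"

// mergeEnv overlays env on base. A key counts as "defined" when it is
// present in the env map, even if its value is a zero value such as false.
// Key presence is a stand-in here for the TOML metadata IsDefined check.
func mergeEnv(base, env map[string]any) map[string]any {
	out := make(map[string]any, len(base)+len(env))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range env {
		out[k] = v // an explicit false or "" still overrides the base value
	}
	return out
}

func main() {
	base := map[string]any{"schema": "public", "auto-approve": true}
	prod := map[string]any{"host": "prod-db.internal", "auto-approve": false}
	merged := mergeEnv(base, prod)
	fmt.Println(merged["schema"], merged["host"], merged["auto-approve"])
}
```

A merge that instead skipped zero values (for example `if v != false`) would leave auto-approve at the base-level true, which is the failure mode the IsDefined approach avoids.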

3. Multi-tenant schema loop ([schemas])

For multi-tenant setups, a [schemas] block with a SQL query discovers schema names at runtime. plan and apply iterate over all discovered schemas automatically:

host = "localhost"
db = "myapp"
user = "postgres"
file = "tenant.sql"

[schemas]
query = "SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'tenant_%'"

pgschema plan    # plans for each tenant schema
pgschema apply   # applies to each tenant schema

The discovery query runs inside a read-only transaction to prevent accidental data modification (CREATE/DROP/INSERT are rejected by Postgres).
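A rough sketch of what read-only discovery can look like with database/sql follows. The helper names discoverSchemas and readOnlyTxOptions are hypothetical, and the PR's DiscoverSchemas may differ in signature and error handling; only BeginTx with sql.TxOptions{ReadOnly: true} is taken from the description.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// readOnlyTxOptions returns the options used for the discovery transaction.
func readOnlyTxOptions() *sql.TxOptions {
	return &sql.TxOptions{ReadOnly: true}
}

// discoverSchemas runs query inside a read-only transaction and returns the
// first column of every row as a schema name. Hypothetical helper.
func discoverSchemas(ctx context.Context, db *sql.DB, query string) ([]string, error) {
	tx, err := db.BeginTx(ctx, readOnlyTxOptions())
	if err != nil {
		return nil, fmt.Errorf("begin read-only tx: %w", err)
	}
	defer tx.Rollback() // nothing to commit, so always roll back

	rows, err := tx.QueryContext(ctx, query)
	if err != nil {
		return nil, fmt.Errorf("run discovery query: %w", err)
	}
	defer rows.Close()

	var names []string
	for rows.Next() {
		var name string
		if err := rows.Scan(&name); err != nil {
			return nil, err
		}
		names = append(names, name)
	}
	return names, rows.Err()
}

func main() {
	// No database is opened here; this only shows the transaction options.
	fmt.Println("read-only:", readOnlyTxOptions().ReadOnly)
}
```

Calling discoverSchemas requires a *sql.DB opened with a registered Postgres driver, which is left out of the sketch.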

4. Precedence

CLI flags always win: CLI flags > env vars > config env block > config base > defaults.

Config values are applied in PreRunE hooks before env var resolution, so existing PGHOST/PGPORT/etc. behavior is preserved. A flag is only populated from config if cmd.Flags().Changed(flag) returns false.
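The precedence chain can be sketched as a first-match lookup over the layers. This is an illustration only; the source type and resolve function are hypothetical and do not reflect how the PR wires values through cobra flags.

```go
package main

import "fmt"

// source is one candidate value; set reports whether that layer provided it.
type source struct {
	value string
	set   bool
}

// resolve returns the first provided value in precedence order:
// CLI flag, environment variable, config env block, config base, default.
func resolve(flag, envVar, envBlock, base source, def string) string {
	for _, s := range []source{flag, envVar, envBlock, base} {
		if s.set {
			return s.value
		}
	}
	return def
}

func main() {
	unset := source{}
	// No flag or env var: the [env.prod] value beats the base value.
	fmt.Println(resolve(unset, unset, source{"prod-db.internal", true}, source{"localhost", true}, "localhost"))
	// An explicit flag beats everything below it.
	fmt.Println(resolve(source{"127.0.0.1", true}, source{"pghost.example", true}, unset, unset, "localhost"))
}
```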

Global flags added

| Flag | Default | Description |
|------|---------|-------------|
| --config | pgschema.toml | Path to config file |
| --env | (none) | Named environment to use |

  • Explicit --config to a missing file → exit with error.
  • Default pgschema.toml missing → silently proceed (backward compatible).
  • --env without a config file → exit with error.
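The three rules above amount to a small decision function. The sketch below is one way to express them; checkConfig and its error messages are hypothetical, not the text pgschema prints.

```go
package main

import (
	"errors"
	"fmt"
)

// checkConfig applies the missing-file rules. configExplicit means --config
// was passed on the command line, fileExists is the result of checking the
// path, and env is the --env value ("" when unset).
// It reports whether a config file should be loaded.
func checkConfig(configExplicit, fileExists bool, env string) (bool, error) {
	if fileExists {
		return true, nil
	}
	if configExplicit {
		return false, errors.New("config file not found")
	}
	if env != "" {
		return false, errors.New("--env requires a config file")
	}
	return false, nil // default pgschema.toml missing: proceed without config
}

func main() {
	_, err := checkConfig(true, false, "")
	fmt.Println(err)
	use, err := checkConfig(false, false, "")
	fmt.Println(use, err)
}
```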

Files changed

Code

| File | Change |
|------|--------|
| cmd/config/config.go | New: TOML parsing, env merging via IsDefined, DiscoverSchemas (read-only tx, URL-encoded DSN), Get()/SetResolved() singleton |
| cmd/config/config_test.go | New: unit tests for config loading, env overrides, booleans, schemas section, edge cases |
| cmd/config_integration_test.go | New: integration tests including read-only enforcement on schema discovery |
| cmd/root.go | --config / --env global flags, loadConfig() in PersistentPreRun |
| cmd/plan/plan.go | applyConfigToPlan() PreRunE, runPlanMultiSchema() using top-level Plan for combined output, removed MarkFlagRequired("file"), unified processOutput() |
| cmd/plan/output_test.go | Adjusted (whitespace + final newline; TestDeriveSchemaOutputTarget removed since per-schema-file logic was dropped in favor of single combined output) |
| cmd/apply/apply.go | applyConfigToApply() PreRunE, runApplyMultiSchema(), applyPlanFile() iterating over Plan.Schemas in sorted order with auto-detection of single vs multi-schema plan files |
| cmd/apply/apply_test.go | New TestRunApply_PlanFlagSkipsMultiSchema ensures --plan short-circuits the multi-schema path even when [schemas] is configured |
| cmd/apply/apply_integration_test.go | Updated call sites: GeneratePlan → GenerateSchemaPlan; plan files now go through Plan.AddSchema("public", ...) and are read back via Schemas["public"] |
| cmd/dump/dump.go | applyConfigToDump() PreRunE |
| cmd/{ignore,migrate}_integration_test.go, cmd/plan/external_db_integration_test.go | Updated to call GenerateSchemaPlan and wrap with plan.NewPlan().AddSchema(...) for output |
| internal/plan/plan.go | Slimmed down (~1100 lines removed). Plan is now a top-level container: { version, pgschema_version, created_at, schemas: map[string]*SchemaPlan }. Adds NewPlan(), AddSchema(), SortedSchemaNames(), SummaryString(), multi-schema-aware HumanColored() / ToSQL() (single-schema renders without header), and FromJSON() for the new shape |
| internal/plan/schema_plan.go | New file (~1120 lines): SchemaPlan type holds the per-schema Groups, SourceFingerprint, SourceDiffs, plus all the previous Plan rendering logic (HumanColored, ToSQL, calculateSummaryFromSteps, table/view/materialized-view detail writers, helpers). All extracted verbatim from the old plan.go. |
| internal/plan/schema_plan_test.go | New: covers SchemaPlan summary/no-changes, JSON round-trip across testdata/diff/migrate/v* (now via top-level Plan), debug JSON round-trip with SourceDiffs, single-schema header omission |
| internal/plan/plan_test.go | Rewritten for the new top-level Plan API: AddSchema, SortedSchemaNames, ToJSON/FromJSON round-trip with schemas key, SchemaEntry_ExcludesTopLevelFields, HumanColored_MultiSchema, ToSQL_MultiSchema, SummaryString, CreatedAt_UsesTestTime |

Test data — regenerated in this revision

All ~180 testdata/diff/**/plan.json golden files were regenerated to match the new top-level JSON shape. The change is purely structural — no SQL, fingerprints, operations, paths, or step ordering were modified.

Before (single-schema flat):

{
  "version": "1.0.0",
  "pgschema_version": "1.9.0",
  "created_at": "1970-01-01T00:00:00Z",
  "source_fingerprint": { "hash": "..." },
  "groups": [ { "steps": [ ... ] } ]
}

After (schema-keyed):

{
  "version": "1.0.0",
  "pgschema_version": "1.9.0",
  "created_at": "1970-01-01T00:00:00Z",
  "schemas": {
    "public": {
      "source_fingerprint": { "hash": "..." },
      "groups": [ { "steps": [ ... ] } ]
    }
  }
}

Why every file changed:

  • Plan is now always the multi-schema container; groups and source_fingerprint moved inside schemas.<name>.
  • Even single-schema runs (the entire existing diff suite) now serialize through the same path used by multi-tenant runs, ensuring one canonical on-disk format.
  • Top-level version, pgschema_version, and created_at are preserved; per-schema entries deliberately omit them (verified by TestPlan_SchemaEntry_ExcludesTopLevelFields).

The diff per file is mechanical (added wrapping "schemas": { "public": { ... } }, plus 2-space indentation shift), which is why the diffstat shows ~180 files with ±5–6k lines but no semantic changes:

184 files changed, 6511 insertions(+), 5249 deletions(-)

with internal/plan/plan.go shrinking by ~1100 lines as logic moved into internal/plan/schema_plan.go.

Misc

  • README.md — documentation for config file, named environments, multi-schema loop, and precedence.

Design decisions

  • TOML over YAML/JSON: pgschema already depends on BurntSushi/toml for ignore config — no new dependency.
  • Global singleton (config.Get()): Config is loaded once in PersistentPreRun and read by subcommands via config.Get(). Matches the existing pattern where global vars are set in PreRunE hooks.
  • --file is no longer MarkFlagRequired: When config provides file, requiring --file on the CLI would defeat the purpose. Validation moved to runtime in runPlan ("--file is required (provide via flag, config file, or environment)").
  • Read-only transaction for schema discovery: The [schemas].query is user-provided SQL executed against the target database. Wrapping in BeginTx(ctx, &sql.TxOptions{ReadOnly: true}) prevents accidental CREATE/DROP/INSERT even if the query is malformed or malicious. Verified by TestDiscoverSchemas_ReadOnlyEnforcement.
  • URL-encoded DSN in DiscoverSchemas: Built via net/url to avoid injection through host/user/password fields.
  • Unified plan file format: In multi-schema mode, all schema plans are written to a single file using the Plan JSON format ({"schemas": {"tenant_1": {...}, "tenant_2": {...}}}). apply --plan iterates schemas in sorted order. This eliminates the previous limitation of needing one plan file per tenant.
  • Single-schema rendering omits headers: Plan.HumanColored() and Plan.ToSQL() detect len(Schemas) == 1 and delegate directly to the underlying SchemaPlan, so single-schema CLI output is unchanged from before.
  • --plan short-circuits multi-schema: When --plan is provided, RunApply skips the [schemas] discovery path even if config has it set, since the plan file itself dictates which schemas to apply. Covered by TestRunApply_PlanFlagSkipsMultiSchema.

Backward compatibility

  • No CLI flag or env var semantics changed.
  • Without pgschema.toml, command behavior is identical to before — except plan JSON output now wraps under "schemas". This is a breaking change for anyone parsing plan JSON externally or replaying plan files produced by older pgschema versions. All golden plan files in testdata/diff/**/plan.json were regenerated accordingly.
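  • Tests that do not depend on the plan API or on golden plan files pass unmodified; the test files listed under Files changed were updated for the GeneratePlan to GenerateSchemaPlan rename and the new Plan shape.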
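External tooling that parses plan JSON could tolerate both shapes by probing for the "schemas" key. The sketch below is an assumption about how a consumer might do this, not code from the PR; schemaNames and its fallbackSchema parameter are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// schemaNames reports which schemas a plan JSON document covers. Documents
// in the new shape carry a "schemas" map; older flat documents carry
// "groups" at the top level and are reported under fallbackSchema.
func schemaNames(data []byte, fallbackSchema string) ([]string, error) {
	var top struct {
		Schemas map[string]json.RawMessage `json:"schemas"`
		Groups  json.RawMessage            `json:"groups"`
	}
	if err := json.Unmarshal(data, &top); err != nil {
		return nil, err
	}
	if top.Schemas != nil {
		names := make([]string, 0, len(top.Schemas))
		for name := range top.Schemas {
			names = append(names, name)
		}
		sort.Strings(names)
		return names, nil
	}
	if top.Groups != nil {
		return []string{fallbackSchema}, nil
	}
	return nil, fmt.Errorf("unrecognized plan JSON: no \"schemas\" or \"groups\" key")
}

func main() {
	oldShape := []byte(`{"version":"1.0.0","groups":[]}`)
	newShape := []byte(`{"version":"1.0.0","schemas":{"tenant_2":{},"tenant_1":{}}}`)
	fmt.Println(schemaNames(oldShape, "public"))
	fmt.Println(schemaNames(newShape, "public"))
}
```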

Flow diagrams

End-to-end command flow

flowchart TD
	A[Start command plan/apply/dump] --> B[PersistentPreRun loadConfig]
	B --> C{Config file exists?}
	C -->|No and --config explicit| C1[Exit with error]
	C -->|No and --env set| C2[Exit with error]
	C -->|No default file| C3[Continue with nil config]
	C -->|Yes| D[Parse TOML base + env]
	D --> E[Set global resolved config]

	E --> F[Subcommand PreRunE applyConfigToX]
	F --> G[Apply config values only when flag not changed]
	G --> H[Apply env vars and connection defaults]

	H --> I{Has schemas.query and schema not explicitly set?}
	I -->|No| J[Single-schema flow]
	I -->|Yes| K[Discover schemas in read-only transaction]

	K --> L{Command type}
	L -->|plan| M[Loop schemas, GenerateSchemaPlan, AddSchema to combined Plan]
	L -->|apply --file| N[Loop schemas, GenerateSchemaPlan + ApplyMigration each]
	L -->|apply --plan| N2[Load Plan from file, iterate Schemas in sorted order]

	M --> M1[Write combined Plan JSON/SQL/Human to single output]
	M1 --> M2[Progress logs to stderr]

	N --> N3[Progress logs to stderr]
	N2 --> N4[Apply each SchemaPlan with its schema name]

	J --> O[Single-schema behavior preserved, no header in output]
	M2 --> P[Done]
	N3 --> P
	N4 --> P
	O --> P

Precedence and output behavior

Precedence order

| Priority | Source | Notes |
|----------|--------|-------|
| 1 (highest) | CLI flags | Explicit command-line input always wins |
| 2 | Environment variables | Applied after config fallback |
| 3 | Config env block ([env.<name>]) | Overrides base config when key is explicitly defined |
| 4 | Config base | Default values from top-level pgschema.toml |
| 5 (lowest) | Built-in defaults | Hardcoded defaults in command flags |

Multi-schema plan output

All schemas are combined into a single output. JSON wraps per-schema plans in a "schemas" map:

| Output format | Behavior |
|---------------|----------|
| --output-json plan.json | Single combined Plan JSON file with all schemas |
| --output-json stdout | Combined Plan JSON printed to stdout |
| --output-human | Schemas listed with ── Schema: <name> ── headers (single-schema: no header) |
| --output-sql | Combined SQL with -- Schema: <name> comment headers (single-schema: no header) |
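The header rule for SQL output can be sketched as follows. renderSQL is a hypothetical stand-in for the multi-schema ToSQL path, and the exact spacing around headers is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderSQL joins per-schema SQL. With exactly one schema the SQL is
// returned unchanged; with several, each block gets a comment header and
// blocks are emitted in sorted schema order.
func renderSQL(perSchema map[string]string) string {
	if len(perSchema) == 1 {
		for _, sql := range perSchema {
			return sql
		}
	}
	names := make([]string, 0, len(perSchema))
	for name := range perSchema {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "-- Schema: %s\n%s\n", name, perSchema[name])
	}
	return b.String()
}

func main() {
	fmt.Print(renderSQL(map[string]string{
		"tenant_2": "ALTER TABLE t ADD COLUMN c int;",
		"tenant_1": "CREATE TABLE t (id int);",
	}))
}
```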

Plan JSON format

{
  "version": "1.0.0",
  "pgschema_version": "1.9.0",
  "created_at": "2025-01-01T00:00:00Z",
  "schemas": {
    "tenant_1": {
      "source_fingerprint": { "hash": "..." },
      "groups": [...]
    },
    "tenant_2": {
      "source_fingerprint": { "hash": "..." },
      "groups": [...]
    }
  }
}

apply --plan combined.json iterates schemas in sorted order and applies each SchemaPlan against its named schema.

William-W-Chen and others added 23 commits May 14, 2026 22:17
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a config file defines [schemas] with a SQL query, plan and apply
commands discover tenant schemas dynamically and iterate over each one.
Dump is excluded since it produces a single template schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests cover: no config file, explicit config path, env overrides with
inheritance, schemas section, plan fields, boolean overrides, and
command-level config fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: Add `pgschema.toml` configuration file support
@NFUChen NFUChen changed the title from "Feat/upstream pr config file" to "feat: Add pgschema.toml configuration file support" May 14, 2026
@NFUChen NFUChen marked this pull request as ready for review May 14, 2026 17:47
greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR adds TOML configuration support for pgschema commands. The main changes are:

  • Adds pgschema.toml loading with named environment overrides.
  • Applies config values before existing environment variable handling.
  • Adds multi-schema discovery and looping for plan and apply.
  • Adds config fallback handling for dump, plan, and apply flags.
  • Documents config files, environments, and multi-tenant schema loops.

Confidence Score: 3/5

This is close, but the structured multi-schema output path should be fixed before merging.

  • Multi-schema planning can still print invalid JSON to stdout when more than one schema is discovered.
  • The config and apply paths otherwise follow the intended precedence model from the changed code.

cmd/plan/plan.go

Important Files Changed

| Filename | Overview |
|----------|----------|
| cmd/plan/plan.go | Adds config fallback and multi-schema planning; stdout structured output still needs aggregation. |
| cmd/apply/apply.go | Adds config fallback and multi-schema apply paths with plan flag guard handling. |
| cmd/config/config.go | Adds TOML parsing, environment merging, global resolved config, and schema discovery. |

Reviews (2): Last reviewed commit: "test: add tests for deriveSchemaOutputTa..."

@NFUChen NFUChen marked this pull request as draft May 15, 2026 02:38
@NFUChen NFUChen marked this pull request as ready for review May 15, 2026 04:05
- Renamed `GeneratePlan` to `GenerateSchemaPlan` for clarity.
- Updated `runPlan` and `runPlanMultiSchema` to use the new schema plan generation function.
- Consolidated the `MultiPlan` and `Plan` structures into a unified `Plan` structure that handles both single and multi-schema operations.
- Adjusted methods to work with the new `Plan` structure, including `AddSchema`, `HasAnyChanges`, `ToJSON`, and `ToSQL`.
- Updated tests to reflect the changes in the plan structure and ensure proper functionality.
- Enhanced JSON serialization and deserialization for the new plan structure.