Release/5.33.2 #66

Merged
datakitchen-devops merged 20 commits into main from release/5.33.2
May 15, 2026

@aarthy-dk
Contributor

Features

  • mcp: hygiene issues — list, get, search, update + resource + prompt (TG-1029)
    (a026cc1)

Bug Fixes

  • ui: visual glitches in connections and test results (a4c557e)
  • bigquery: error on freshness monitor when tolerance is null (a111764)
  • logs: suppress streamlit warnings (61f8d0e)
  • standalone: command argument precedence - add log path to config.env (e91277e)
  • standalone: honor config.env for ports, hosts, and base URLs (e28d5a8)
  • mcp: hygiene issues review feedback (TG-1029) (3c6d005)
  • standalone: graceful shutdown of embedded postgres on signal (fde7321)
  • standalone: resolve embedded host/port at connection-build time (4e7cc45)

Documentation

  • simplify README install section to point at docs site (043da3c)

rboni-dk and others added 20 commits May 3, 2026 09:25
…ompt (TG-1029)

Expose profiling anomalies for read, search, and triage. Adds:

* `list_hygiene_issues` — paginated list per profiling run / table group
* `get_hygiene_issue` — full detail with type definition + column profile
* `search_hygiene_issues` — cross-run search with project scoping + `since`
* `update_hygiene_issue` — disposition update, gated by `disposition` permission
* `testgen://hygiene-issue-types` — reference table resource
* `hygiene_triage` — guided workflow prompt

Likelihood is split into `issue_likelihood` and `pii_risk` so that PII discovery
doesn't muddle regular-issue likelihood values; filtering on one automatically
excludes the other category through ordinary query semantics.
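
A minimal sketch of how such a split can make one filter exclude the other category, assuming each issue populates exactly one of `issue_likelihood` or `pii_risk` and leaves the other empty. The `HygieneIssue` dataclass, the `filter_issues` helper, and the sample values are hypothetical and are not taken from the TestGen code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HygieneIssue:
    # Hypothetical row shape: exactly one of the two rating fields is set.
    name: str
    issue_likelihood: Optional[str] = None
    pii_risk: Optional[str] = None


def filter_issues(issues, issue_likelihood=None, pii_risk=None):
    """Return the issues that match every filter that was provided."""
    result = []
    for issue in issues:
        if issue_likelihood is not None and issue.issue_likelihood != issue_likelihood:
            continue
        if pii_risk is not None and issue.pii_risk != pii_risk:
            continue
        result.append(issue)
    return result


# Sample data (made up for illustration).
ISSUES = [
    HygieneIssue("Leading spaces", issue_likelihood="Likely"),
    HygieneIssue("Invalid zip code", issue_likelihood="Definite"),
    HygieneIssue("Email column", pii_risk="High"),
]

# PII rows carry no issue_likelihood, so this filter drops them without
# any explicit "exclude PII" condition.
likely = filter_issues(ISSUES, issue_likelihood="Likely")
high_pii = filter_issues(ISSUES, pii_risk="High")
```

No separate category flag is needed in this sketch: a row with an empty field can never equal the requested value, which is one plausible reading of "natural query semantics".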

User-facing terminology applied: `Quality Dimension`, `Muted` (for `Inactive`).
Run timestamp sourced from `JobExecution.started_at` per the run/JE consolidation
direction. No "anomaly" in user-facing surfaces; DB columns keep their names.

Sync main to enterprise

See merge request dkinternal/testgen/dataops-testgen!501

fix(ui): visual glitches in connections and test results

See merge request dkinternal/testgen/dataops-testgen!502

fix: standalone and bigquery fixes + streamlit warning suppression

See merge request dkinternal/testgen/dataops-testgen!503

Several settings called os.getenv directly instead of the local getenv
helper that also reads from ~/.testgen/config.env. As a result, values
written by `testgen standalone-setup` (e.g. TG_UI_PORT) were silently
ignored by `testgen run-app`. Also derive UI_BASE_URL from UI_PORT
instead of Streamlit's STREAMLIT_SERVER_PORT, matching the API side.
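
The fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `getenv` helper: the lookup order (process environment first, then `config.env`, then the default), the `KEY=VALUE` file format, the `TG_UI_BASE_URL` name, and the `8501` default port are all assumed. Only `TG_UI_PORT` and `~/.testgen/config.env` come from the commit message.

```python
import os
from pathlib import Path


def load_config_env(path):
    """Parse KEY=VALUE lines from a config file; a missing file yields no values."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def getenv(name, default=None, config_path=None):
    """Look up a setting: process environment, then config.env, then default."""
    if name in os.environ:
        return os.environ[name]
    path = config_path or Path.home() / ".testgen" / "config.env"
    return load_config_env(path).get(name, default)


def ui_base_url(config_path=None):
    """Derive the UI base URL from the UI port unless it is set explicitly."""
    port = getenv("TG_UI_PORT", "8501", config_path)  # default port is assumed
    # TG_UI_BASE_URL is a hypothetical variable name.
    return getenv("TG_UI_BASE_URL", f"http://localhost:{port}", config_path)
```

Calling `os.getenv("TG_UI_PORT")` directly would skip the file lookup entirely, which matches the symptom the commit describes: a port written by setup is ignored at run time.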

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(standalone): honor config.env for ports, hosts, and base URLs

See merge request dkinternal/testgen/dataops-testgen!504

Round of review fixes applied on top of the initial implementation:

* Drop the legacy `profile_run_id` from the MCP layer — model methods take
  a `job_execution_id` and JOIN `ProfilingRun` to filter. New
  `ProfilingRun.get_latest_complete_je_id_for_table_group` avoids reading the
  `table_groups.last_complete_profile_run_id` cache (which still points at the
  internal run PK; schema-level cleanup tracked separately).
* Trust the run/JE backfill migration — drop `if x is None` guards and the
  descriptive-string fallback in `_resolve_profile_run`.
* StrEnums for fixed string sets: `Disposition`, `IssueLikelihood`, `PiiRisk`
  (in `common/models/hygiene_issue.py`) and `QualityDimension` (new shared
  `common/enums.py`). Parsers return enum types; coalesce defaults reference
  enum members; sentinel sets dropped.
* Tighten labels: heading uses "for profiling run"; `get_hygiene_issue` no
  longer triples "Run / Profiling Run / Run Date".
* Empty states render with title + italic marker.
* `dq_prevalence` removed from output and dataclasses (no precedent label,
  unhelpful score-engine internal).
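
The StrEnum bullet above might look something like this in practice. The member values and the `parse_disposition` helper are assumptions made for illustration; only the `Disposition` name and the `Muted`/`Inactive` wording come from the commit messages. The `try`/`except` shim is only there so the sketch also runs on Python versions before 3.11.

```python
import enum

try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:  # small stand-in for older interpreters
    class StrEnum(str, enum.Enum):
        def __str__(self):
            return str(self.value)


class Disposition(StrEnum):
    # Member values are assumed, not copied from hygiene_issue.py.
    CONFIRMED = "Confirmed"
    DISMISSED = "Dismissed"
    INACTIVE = "Inactive"


# User-facing label mapped to the stored value: users see "Muted".
_LABEL_ALIASES = {"muted": Disposition.INACTIVE}


def parse_disposition(text):
    """Convert user input to a Disposition, accepting the user-facing alias."""
    key = text.strip().lower()
    if key in _LABEL_ALIASES:
        return _LABEL_ALIASES[key]
    for member in Disposition:
        if member.value.lower() == key:
            return member
    allowed = ", ".join(m.value for m in Disposition)
    raise ValueError(f"Unknown disposition {text!r}; expected one of: {allowed}")
```

Because the parser returns enum members, callers compare against `Disposition.INACTIVE` rather than a string literal, and the enum itself replaces any separate set of allowed values.
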

feat(mcp): hygiene issues for MCP (TG-1029)

See merge request dkinternal/testgen/dataops-testgen!499

`testgen run-app` was leaving orphan postgres processes after Ctrl+C in
standalone (pip) mode, breaking `tg delete` (data dir locked).

pixeltable-pgserver tracks handles via a disk-backed PID registry at
`<pgdata>/.handle_pids.json` and only stops postgres when the calling PID
is the last one registered. Two paths were leaking PIDs into the registry:

1. Each `run-app all` child re-entered `cli()` and called `start_server()`
   itself, registering its own PID alongside the parent's.
2. On Windows, `_forward_signal_to_child` used `terminate()`
   (TerminateProcess), so children's atexit never ran — their PIDs
   stayed in the registry forever, and the parent's cleanup silently
   no-op'd.
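
To see why a leaked PID blocks shutdown, here is a toy model of a "last registered PID stops the server" registry. It is not pixeltable-pgserver's code: the JSON list format and the `register`/`release` helpers are assumptions made only to illustrate the behaviour described above.

```python
import json
from pathlib import Path


def _read(registry):
    """Return the list of registered PIDs, or an empty list if no file exists."""
    return json.loads(registry.read_text()) if registry.exists() else []


def register(registry, pid):
    """Add pid to the on-disk registry."""
    pids = _read(registry)
    if pid not in pids:
        pids.append(pid)
    registry.write_text(json.dumps(pids))


def release(registry, pid):
    """Remove pid from the registry.

    Returns True only when the caller was the last registered PID, which is
    the only case in which the server would be stopped in this model.
    """
    pids = _read(registry)
    if pid not in pids:
        return False
    pids.remove(pid)
    registry.write_text(json.dumps(pids))
    return not pids
```

In this model, if a child registers its PID and is then killed without running its cleanup, the parent's later `release` call returns `False` and the server keeps running, which mirrors the silent no-op described in the second path.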

Fix:

- `cli()` now calls `ensure_standalone_setup(uri)` instead of
  `start_server()` when `_TG_STANDALONE_URI` is inherited, so children
  attach to the parent's pgserver without registering a PID.
- `run_app("all")` injects the parent's pgserver URI into `child_env`.
- `run_ui()` Streamlit env-var assembly falls back to the inherited env
  var when this process doesn't own pgserver.
- New `_subprocess_spawn_kwargs()` helper: POSIX `start_new_session`,
  Windows `CREATE_NEW_PROCESS_GROUP`. Applied at both `Popen` sites.
- Windows branch of `_forward_signal_to_child` now sends
  `CTRL_BREAK_EVENT` instead of `terminate()`, so children's atexit and
  SIGBREAK handlers run.
- New `_install_shutdown_handler()` registers SIGINT/SIGTERM/(SIGBREAK)
  in one call. Used in `run_ui`, `run_app("all")`, and `Scheduler.run`.
- `run_app("all")` children-watcher loop now iterates a snapshot of
  `children` so simultaneous child exits don't leave a sibling un-reaped.
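
The spawn and signal bullets above can be sketched with the standard library alone. The function names echo the commit message, but the bodies are an assumed reconstruction rather than the project's implementation.

```python
import signal
import subprocess
import sys


def subprocess_spawn_kwargs():
    """Popen kwargs that place the child in its own session or process group."""
    if sys.platform == "win32":
        # A new process group is needed before CTRL_BREAK_EVENT can be sent.
        return {"creationflags": subprocess.CREATE_NEW_PROCESS_GROUP}
    return {"start_new_session": True}


def forward_signal_to_child(child, signum):
    """Relay a shutdown signal so the child's own handlers get a chance to run."""
    if child.poll() is not None:
        return  # child has already exited
    if sys.platform == "win32":
        child.send_signal(signal.CTRL_BREAK_EVENT)
    else:
        child.send_signal(signum)


def install_shutdown_handler(handler):
    """Register handler for SIGINT, SIGTERM and, where it exists, SIGBREAK."""
    signums = [signal.SIGINT, signal.SIGTERM]
    if hasattr(signal, "SIGBREAK"):  # defined on Windows only
        signums.append(signal.SIGBREAK)
    for signum in signums:
        signal.signal(signum, handler)
    return signums
```

A parent would typically pass `**subprocess_spawn_kwargs()` to each `subprocess.Popen` call and call `forward_signal_to_child` from the handler it registers with `install_shutdown_handler`. Note that `signal.signal` must be called from the main thread.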

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
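
The snapshot change in the last bullet of the commit above addresses a general Python pitfall, shown here with stand-in objects. Whether the original watcher loop mutated the list in exactly this way is an assumption; `FakeChild` and both `reap_*` functions are hypothetical.

```python
class FakeChild:
    """Stand-in for subprocess.Popen that exposes only poll()."""

    def __init__(self, name, returncode):
        self.name = name
        self._returncode = returncode

    def poll(self):
        return self._returncode  # None would mean "still running"


def reap_in_place(children):
    """Problematic variant: removes items from the list being iterated."""
    reaped = []
    for child in children:
        if child.poll() is not None:
            children.remove(child)  # shifts later items; the next one is skipped
            reaped.append(child.name)
    return reaped


def reap_snapshot(children):
    """Safer variant: iterate a copy, mutate the original."""
    reaped = []
    for child in list(children):
        if child.poll() is not None:
            children.remove(child)
            reaped.append(child.name)
    return reaped


def make_children():
    # Three children that have all exited at the same moment.
    return [FakeChild("ui", 0), FakeChild("api", 0), FakeChild("scheduler", 0)]
```

With three children that have all exited, `reap_in_place` handles `ui` and `scheduler` but skips `api`, while `reap_snapshot` handles all three.
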
The dk-installer now offers Docker and pip install modes, both fully
documented at docs.datakitchen.io. Replace the long install instructions
in the README — which had drifted out of sync with the installer and
the docs — with a brief pointer to the install pages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs: simplify README install section to point at docs site

See merge request dkinternal/testgen/dataops-testgen!506

fix(standalone): graceful shutdown of embedded postgres on signal

See merge request dkinternal/testgen/dataops-testgen!505

Windows pgserver picks a fresh ephemeral TCP port on every startup, so the
demo-DB connection row written during quick-start became stale as soon as
run-app started a new pgserver session — "Test Connection" and any target-DB
query failed with "connection refused".

Store a <embedded> sentinel in project_host instead of the live host/port;
resolve_connection_params rewrites it to the live values when standalone mode
is active. Single chokepoint covers UI test-connection, profiling, test
execution, and quick-start increments.
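
A minimal sketch of the sentinel approach, assuming a simple parameter object. The `<embedded>` value, `project_host`, and the `resolve_connection_params` name come from the commit message. The `ConnectionParams` fields, the function signature, and the choice to raise when standalone mode is inactive are assumptions.

```python
from dataclasses import dataclass, replace
from typing import Optional

EMBEDDED_SENTINEL = "<embedded>"


@dataclass(frozen=True)
class ConnectionParams:
    # Hypothetical subset of a stored connection row.
    project_host: str
    project_port: Optional[int]
    project_db: str


def resolve_connection_params(params, live_endpoint):
    """Swap the sentinel for the live host/port; leave other rows untouched.

    live_endpoint is the (host, port) of the running embedded server, or
    None when standalone mode is not active.
    """
    if params.project_host != EMBEDDED_SENTINEL:
        return params
    if live_endpoint is None:
        # Assumed behaviour: fail loudly rather than connect to a stale port.
        raise RuntimeError("embedded database requested but standalone mode is not active")
    host, port = live_endpoint
    return replace(params, project_host=host, project_port=port)
```

Because the stored row never contains a concrete port, a restart that picks a new ephemeral port needs no database update; every caller that builds a connection goes through the same rewrite.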

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

datakitchen-devops merged commit 82d4828 into main on May 15, 2026
2 checks passed
datakitchen-devops deleted the release/5.33.2 branch on May 15, 2026, 03:26
@github-actions

Coverage

Coverage Report

| File | Stmts | Miss | Branch | BrPart | Cover | Missing |
|---|---|---|---|---|---|---|
| `testgen/__main__.py` | 500 | 500 | 112 | 0 | 0% | 10–1060 |
| `testgen/settings.py` | 177 | 7 | 8 | 1 | 92% | 12–18 |
| `testgen/commands/run_launch_db_config.py` | 37 | 37 | 2 | 0 | 0% | 1–103 |
| `testgen/commands/run_quick_start.py` | 162 | 162 | 44 | 0 | 0% | 1–327 |
| `testgen/common/standalone_postgres.py` | 69 | 46 | 18 | 0 | 26% | 35–36, 40, 53–58, 72–92, 97, 106–108, 118–133, 139–142, 147, 159–167 |
| `testgen/common/database/flavor/flavor_service.py` | 90 | 22 | 6 | 0 | 71% | 57–59, 63–71, 108, 111, 114, 117–118, 125–129, 133, 136 |
| `testgen/common/models/hygiene_issue.py` | 186 | 38 | 8 | 0 | 76% | 64, 154–157, 186–191, 195, 207–224, 229–247, 251, 273–303, 318–353, 366–406, 415–417, 423–462 |
| `testgen/common/models/profiling_run.py` | 174 | 57 | 20 | 0 | 60% | 83, 134–135, 140–149, 153–163, 173–180, 187–194, 204–268, 275–290, 294–296, 300–318, 321–328, 331–338, 341–352 |
| `testgen/mcp/server.py` | 90 | 75 | 4 | 0 | 16% | 52–63, 68–75, 88–198, 215–217 |
| `testgen/mcp/prompts/workflows.py` | 25 | 19 | 2 | 0 | 22% | 6, 26–28, 52, 68, 86–111, 120–122 |
| `testgen/mcp/tools/common.py` | 129 | 92 | 28 | 0 | 24% | 41–44, 48–52, 56–57, 61–62, 66–69, 73–77, 81–85, 94–98, 103–106, 110–125, 129–139, 144–149, 154–160, 165–169, 174–177, 186–191, 196–205 |
| `testgen/mcp/tools/hygiene_issues.py` | 222 | 222 | 110 | 0 | 0% | 1–420 |
| `testgen/mcp/tools/reference.py` | 62 | 62 | 28 | 0 | 0% | 1–110 |
| `testgen/mcp/tools/test_definitions.py` | 186 | 163 | 80 | 0 | 9% | 45–103, 114–213, 224–251, 260–269, 274–288, 304–351 |
| `testgen/scheduler/cli_scheduler.py` | 177 | 136 | 46 | 0 | 18% | 34–41, 47–75, 79–80, 89–91, 94–95, 98–126, 129–139, 142–154, 164–181, 184–218, 223–230, 234–251 |
| TOTAL | 16408 | 12193 | 3840 | 3 | 21% | |

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 4 | 0 💤 | 0 ❌ | 4 🔥 | 15.789s ⏱️ |
