Release/5.33.2#66
Merged
…ompt (TG-1029)
Expose profiling anomalies for read, search, and triage. Adds:
* `list_hygiene_issues`: paginated list per profiling run / table group
* `get_hygiene_issue`: full detail with type definition + column profile
* `search_hygiene_issues`: cross-run search with project scoping + `since`
* `update_hygiene_issue`: disposition update, gated by `disposition` permission
* `testgen://hygiene-issue-types`: reference table resource
* `hygiene_triage`: guided workflow prompt
Likelihood split into `issue_likelihood` + `pii_risk` so PII discovery doesn't muddle regular-issue likelihood values; providing one auto-excludes the other category via the natural query semantics.
User-facing terminology applied: `Quality Dimension`, `Muted` (for `Inactive`). Run timestamp sourced from `JobExecution.started_at` per the run/JE consolidation direction. No "anomaly" in user-facing surfaces; DB columns keep their names.
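One way to read "providing one auto-excludes the other category" is sketched below. This is a minimal in-memory illustration, not the MCP tool's actual query: the `HygieneIssue` dataclass, the `filter_issues` function, and the example values `"Likely"` and `"High"` are all hypothetical. The only assumption taken from the text is that a given issue carries either an `issue_likelihood` or a `pii_risk`, so an equality filter on one field never matches rows of the other category.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HygieneIssue:
    """Hypothetical row shape: exactly one of the two fields is populated."""
    name: str
    issue_likelihood: Optional[str] = None  # set for regular issues only
    pii_risk: Optional[str] = None          # set for PII findings only


def filter_issues(issues, issue_likelihood=None, pii_risk=None):
    """Keep issues matching every filter that was provided.

    A PII row has issue_likelihood=None, so any issue_likelihood filter
    excludes it without a dedicated "category" check, and vice versa.
    """
    result = []
    for issue in issues:
        if issue_likelihood is not None and issue.issue_likelihood != issue_likelihood:
            continue
        if pii_risk is not None and issue.pii_risk != pii_risk:
            continue
        result.append(issue)
    return result
```

The same effect falls out of a SQL `WHERE issue_likelihood = :value` clause, since a `NULL` column never satisfies an equality comparison.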
Sync main to enterprise
See merge request dkinternal/testgen/dataops-testgen!501
fix(ui): visual glitches in connections and test results
See merge request dkinternal/testgen/dataops-testgen!502
fix: standalone and bigquery fixes + streamlit warning suppression
See merge request dkinternal/testgen/dataops-testgen!503
Several settings called `os.getenv` directly instead of the local `getenv` helper that also reads from `~/.testgen/config.env`. As a result, values written by `testgen standalone-setup` (e.g. `TG_UI_PORT`) were silently ignored by `testgen run-app`.
Also derive `UI_BASE_URL` from `UI_PORT` instead of Streamlit's `STREAMLIT_SERVER_PORT`, matching the API side.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
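The failure mode described above can be pictured with a minimal sketch. It assumes `config.env` is a plain `KEY=VALUE` file and that a real environment variable takes precedence over the file; the function names, the precedence order, the fallback port `8501`, and the `http://localhost` URL shape are illustrative assumptions rather than TestGen's actual implementation.

```python
import os
from pathlib import Path

# Path named in the commit message; the KEY=VALUE format is an assumption.
CONFIG_ENV_PATH = Path.home() / ".testgen" / "config.env"


def load_config_env(path):
    """Parse KEY=VALUE lines, skipping blanks, '#' comments, and malformed lines."""
    values = {}
    if not path.is_file():
        return values
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def getenv(name, default=None, config_path=CONFIG_ENV_PATH):
    """Like os.getenv, but falls back to config.env before the default."""
    if name in os.environ:
        return os.environ[name]
    return load_config_env(config_path).get(name, default)


def ui_base_url(config_path=CONFIG_ENV_PATH):
    """Derive the base URL from the UI port setting (hypothetical default)."""
    port = getenv("TG_UI_PORT", "8501", config_path)
    return f"http://localhost:{port}"
```

A setting read with plain `os.getenv("TG_UI_PORT")` would miss a value that exists only in the file, which matches the symptom the commit describes.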
fix(standalone): honor config.env for ports, hosts, and base URLs
See merge request dkinternal/testgen/dataops-testgen!504
Round of review fixes applied on top of the initial implementation:
* Drop the legacy `profile_run_id` from the MCP layer: model methods take a `job_execution_id` and JOIN `ProfilingRun` to filter. New `ProfilingRun.get_latest_complete_je_id_for_table_group` avoids reading the `table_groups.last_complete_profile_run_id` cache (which still points at the internal run PK; schema-level cleanup tracked separately).
* Trust the run/JE backfill migration: drop `if x is None` guards and the descriptive-string fallback in `_resolve_profile_run`.
* StrEnums for fixed string sets: `Disposition`, `IssueLikelihood`, `PiiRisk` (in `common/models/hygiene_issue.py`) and `QualityDimension` (new shared `common/enums.py`). Parsers return enum types; coalesce defaults reference enum members; sentinel sets dropped.
* Tighten labels: heading uses "for profiling run"; `get_hygiene_issue` no longer triples "Run / Profiling Run / Run Date".
* Empty states render with title + italic marker.
* `dq_prevalence` removed from output and dataclasses (no precedent label, unhelpful score-engine internal).
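The "parsers return enum types" item can be sketched as follows. The member names and values of `Disposition`, the `parse_disposition` and `display_label` helpers, and the label mapping are assumptions for illustration; the only detail taken from the text above is that the stored `Inactive` value is presented to users as `Muted`.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # minimal stand-in for older interpreters
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class Disposition(StrEnum):
    # Member values are hypothetical stored strings.
    CONFIRMED = "Confirmed"
    DISMISSED = "Dismissed"
    INACTIVE = "Inactive"  # shown to users as "Muted"


# User-facing labels that differ from the stored value.
_USER_LABELS = {Disposition.INACTIVE: "Muted"}


def display_label(disposition):
    """Return the user-facing label for a stored disposition."""
    return _USER_LABELS.get(disposition, disposition.value)


def parse_disposition(text):
    """Accept either the stored value or the user-facing label, case-insensitively."""
    normalized = text.strip().casefold()
    for member in Disposition:
        candidates = (member.value.casefold(), display_label(member).casefold())
        if normalized in candidates:
            return member
    allowed = ", ".join(display_label(m) for m in Disposition)
    raise ValueError(f"Unknown disposition {text!r}; expected one of: {allowed}")
```

Returning the enum member instead of a raw string means downstream code compares against `Disposition.INACTIVE` rather than a sentinel set of literals, which is the simplification the bullet describes.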
feat(mcp): hygiene issues for MCP (TG-1029)
See merge request dkinternal/testgen/dataops-testgen!499
`testgen run-app` was leaving orphan postgres processes after Ctrl+C in
standalone (pip) mode, breaking `tg delete` (data dir locked).
pixeltable-pgserver tracks handles via a disk-backed PID registry at
`<pgdata>/.handle_pids.json` and only stops postgres when the calling PID
is the last one registered. Two paths were leaking PIDs into the registry:
1. Each `run-app all` child re-entered `cli()` and called `start_server()`
itself, registering its own PID alongside the parent's.
2. On Windows, `_forward_signal_to_child` used `terminate()`
(TerminateProcess), so children's atexit never ran — their PIDs
stayed in the registry forever, and the parent's cleanup silently
no-op'd.
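The "last PID out stops the server" rule, and why a leaked PID defeats it, can be shown with a toy model. This is not pixeltable-pgserver's code: the `PidRegistry` class, its method names, and the JSON-list file layout are assumptions made only to illustrate the behavior described above.

```python
import json
from pathlib import Path


class PidRegistry:
    """Toy disk-backed registry of PIDs holding a handle to the server."""

    def __init__(self, path):
        self.path = Path(path)

    def _read(self):
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))

    def _write(self, pids):
        self.path.write_text(json.dumps(pids), encoding="utf-8")

    def register(self, pid):
        """Record that this PID holds a handle."""
        pids = self._read()
        if pid not in pids:
            pids.append(pid)
        self._write(pids)

    def release(self, pid):
        """Drop this PID; return True only when no handles remain."""
        remaining = [p for p in self._read() if p != pid]
        self._write(remaining)
        return not remaining
```

If a parent (say PID 100) and a child (PID 200) both register and the child is killed before it can call `release`, the parent's `release(100)` returns `False`, so in this model the server is never told to stop. That is the orphaned-postgres symptom in miniature.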
Fix:
- `cli()` now calls `ensure_standalone_setup(uri)` instead of
`start_server()` when `_TG_STANDALONE_URI` is inherited, so children
attach to the parent's pgserver without registering a PID.
- `run_app("all")` injects the parent's pgserver URI into `child_env`.
- `run_ui()` Streamlit env-var assembly falls back to the inherited env
var when this process doesn't own pgserver.
- New `_subprocess_spawn_kwargs()` helper: POSIX `start_new_session`,
Windows `CREATE_NEW_PROCESS_GROUP`. Applied at both `Popen` sites.
- Windows branch of `_forward_signal_to_child` now sends
`CTRL_BREAK_EVENT` instead of `terminate()`, so children's atexit and
SIGBREAK handlers run.
- New `_install_shutdown_handler()` registers SIGINT/SIGTERM/(SIGBREAK)
in one call. Used in `run_ui`, `run_app("all")`, and `Scheduler.run`.
- `run_app("all")` children-watcher loop now iterates a snapshot of
`children` so simultaneous child exits don't leave a sibling un-reaped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
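The process-handling pieces of the fix listed above can be sketched with the standard library alone. The function names mirror the helpers named in the commit message but drop the leading underscore to signal that these are illustrative re-creations, not the project's code; `reap_exited` is a hypothetical name for the children-watcher step.

```python
import os
import signal
import subprocess


def subprocess_spawn_kwargs():
    """Popen kwargs that put the child in its own session or process group."""
    if os.name == "nt":
        # Needed so CTRL_BREAK_EVENT can later be delivered to the child.
        return {"creationflags": subprocess.CREATE_NEW_PROCESS_GROUP}
    return {"start_new_session": True}


def forward_signal_to_child(child, signum):
    """Ask a running child to shut down so its cleanup hooks still run."""
    if child.poll() is not None:
        return  # already exited
    if os.name == "nt":
        # terminate() maps to TerminateProcess and would skip atexit handlers.
        child.send_signal(signal.CTRL_BREAK_EVENT)
    else:
        child.send_signal(signum)


def install_shutdown_handler(handler):
    """Register one handler for SIGINT, SIGTERM, and SIGBREAK where it exists."""
    signums = [signal.SIGINT, signal.SIGTERM]
    if hasattr(signal, "SIGBREAK"):  # Windows only
        signums.append(signal.SIGBREAK)
    for signum in signums:
        signal.signal(signum, handler)
    return signums


def reap_exited(children):
    """Remove exited children from the list and return them.

    Iterating over a snapshot avoids skipping a sibling when an element
    is removed from the list during the loop.
    """
    exited = []
    for child in list(children):
        if child.poll() is not None:
            children.remove(child)
            exited.append(child)
    return exited
```

A child would be started with something like `subprocess.Popen(cmd, **subprocess_spawn_kwargs())`. Note that `signal.signal` can only be called from the main thread, so `install_shutdown_handler` belongs in the process entry point.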
The dk-installer now offers Docker and pip install modes, both fully documented at docs.datakitchen.io. Replace the long install instructions in the README, which had drifted out of sync with the installer and the docs, with a brief pointer to the install pages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs: simplify README install section to point at docs site
See merge request dkinternal/testgen/dataops-testgen!506
fix(standalone): graceful shutdown of embedded postgres on signal
See merge request dkinternal/testgen/dataops-testgen!505
Windows pgserver picks a fresh ephemeral TCP port on every startup, so the demo-DB connection row written during quick-start became stale as soon as run-app started a new pgserver session: "Test Connection" and any target-DB query failed with "connection refused".
Store a `<embedded>` sentinel in `project_host` instead of the live host/port; `resolve_connection_params` rewrites it to the live values when standalone mode is active. Single chokepoint covers UI test-connection, profiling, test execution, and quick-start increments.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
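The sentinel approach can be sketched as below. The `ConnectionParams` dataclass, the `project_port` field, the `live_address` argument, and the error raised outside standalone mode are assumptions for illustration; only the `<embedded>` marker, the `project_host` column, and the `resolve_connection_params` name come from the commit message, and the real function's signature is not shown in the source.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

EMBEDDED_HOST_SENTINEL = "<embedded>"


@dataclass(frozen=True)
class ConnectionParams:
    """Hypothetical subset of a stored connection row."""
    project_host: str
    project_port: str


def resolve_connection_params(
    params: ConnectionParams,
    live_address: Optional[Tuple[str, int]],
) -> ConnectionParams:
    """Swap the sentinel for the embedded server's current host and port.

    live_address is None when standalone mode is not active.
    """
    if params.project_host != EMBEDDED_HOST_SENTINEL:
        return params  # ordinary connection, nothing to rewrite
    if live_address is None:
        # Assumed behavior: fail loudly instead of connecting to "<embedded>".
        raise RuntimeError("embedded connection requested outside standalone mode")
    host, port = live_address
    return replace(params, project_host=host, project_port=str(port))
```

Because the stored row never contains the ephemeral port, it stays valid across restarts; each caller resolves the live address at the moment it connects.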
datakitchen-devops
approved these changes
May 15, 2026
Features
(a026cc1)
Bug Fixes
Documentation