From 6d7f39f6e3e5af5bd27edba9ad66def1c2ebbbc8 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:17:52 -0700
Subject: [PATCH 01/12] feat: add preview review reference for Data Designer skill

---
 .../references/preview-review.md | 29 +++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 skills/data-designer/references/preview-review.md

diff --git a/skills/data-designer/references/preview-review.md b/skills/data-designer/references/preview-review.md
new file mode 100644
index 00000000..19aac3af
--- /dev/null
+++ b/skills/data-designer/references/preview-review.md
@@ -0,0 +1,29 @@
+# Preview Review Guide
+
+## Mindset
+
+Quality is statistical, not per-record. Fix systemic issues that affect many records; don't chase cosmetic flaws in individual ones. But don't stop early — clear patterns of broken data or ignored instructions are worth fixing.
+
+## Reading Sample Records
+
+Load `dataset.parquet` from the preview results directory (printed as `Results path:` by the preview command, or the most recent `artifacts/preview_results_*/` directory). Use pandas to load the parquet file and print the records in a compact, reviewable format.
+
+## What to Look For
+
+The specifics depend on the dataset and its intended use. The categories below are common starting points — adapt based on what matters for this dataset.
+
+### Diversity
+- **Mode collapse**: are records clustering around the same patterns, topics, or phrasings?
+- **Sampler effectiveness**: are samplers being used effectively to steer diversity in the dataset?
+- **Structural monotony**: do LLM-generated columns follow the same template across records?
+
+### Data Quality
+- **Instruction compliance**: does generated content follow prompt constraints (step counts, format requirements, allowed values)?
+- **Internal consistency**: does data within a record agree with itself?
+- **Encoding integrity**: no garbled encoding, mojibake, or broken unicode.
+- **Plausibility**: do examples look like they could come from the real domain, or are they obviously synthetic?
+
+### Design Choices
+- **Column types**: if a text column consistently produces structured data or code, use the appropriate specialized column type. If values come from a fixed set or known distribution, use a sampler instead of an LLM column.
+- **Validation**: if output could be checked programmatically (syntax, schema conformance, value ranges), attach a validator.
+- **Judge calibration** (if applicable): are scores consistent across similar-quality records? Does the judge catch visible problems? Consider the user's intent — uniformly high scores may be correct if the judge is a quality filter; a spread matters more if it's a training signal.
From 95ff92bdd79268476b039004dcfb934fa8815f99 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:21:19 -0700
Subject: [PATCH 02/12] feat: add preview review offer to interactive workflow iterate step

---
 skills/data-designer/workflows/interactive.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index 81d22c94..5b1fa07e 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -23,7 +23,7 @@ This is an interactive, iterative design process. Do not disengage from the loop
 6. **Preview** — Run `data-designer preview --save-results` to generate sample records as HTML files.
    - Note the sample records directory printed by the `data-designer preview` command
    - Give the user a clickable link: `file:///sample_records_browser.html`
-7. **Iterate** — Ask the user for feedback. Edit the script, re-validate, re-preview, and serve again. Repeat until they are satisfied.
+7. **Iterate** — Ask the user for feedback and offer to review the records and suggest improvements yourself. If asked to review, read `references/preview-review.md`. Edit the script, re-validate, re-preview, and serve again. Repeat until they are satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
    - Warn the user that generation can take a long time for large record counts (50+).
From e5f7b17e4bfcd71460e35075f381aa85f162a720 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:22:22 -0700
Subject: [PATCH 03/12] fix: remove stale "and serve again" from iterate step

---
 skills/data-designer/workflows/interactive.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index 5b1fa07e..34ee764e 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -23,7 +23,7 @@ This is an interactive, iterative design process. Do not disengage from the loop
 6. **Preview** — Run `data-designer preview --save-results` to generate sample records as HTML files.
    - Note the sample records directory printed by the `data-designer preview` command
    - Give the user a clickable link: `file:///sample_records_browser.html`
-7. **Iterate** — Ask the user for feedback and offer to review the records and suggest improvements yourself. If asked to review, read `references/preview-review.md`. Edit the script, re-validate, re-preview, and serve again. Repeat until they are satisfied.
+7. **Iterate** — Ask the user for feedback and offer to review the records and suggest improvements yourself. If asked to review, read `references/preview-review.md`. Edit the script, re-validate, and re-preview. Repeat until they are satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
    - Warn the user that generation can take a long time for large record counts (50+).
From a897cb7a3d30ea48f176cc6c0db0f6f30ff1d4a6 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:25:45 -0700
Subject: [PATCH 04/12] fix: reframe design choices as general feature-fit guidance

---
 skills/data-designer/references/preview-review.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/skills/data-designer/references/preview-review.md b/skills/data-designer/references/preview-review.md
index 19aac3af..d4c309a4 100644
--- a/skills/data-designer/references/preview-review.md
+++ b/skills/data-designer/references/preview-review.md
@@ -24,6 +24,7 @@ The specifics depend on the dataset and its intended use. The categories below a
 - **Plausibility**: do examples look like they could come from the real domain, or are they obviously synthetic?
 
 ### Design Choices
-- **Column types**: if a text column consistently produces structured data or code, use the appropriate specialized column type. If values come from a fixed set or known distribution, use a sampler instead of an LLM column.
-- **Validation**: if output could be checked programmatically (syntax, schema conformance, value ranges), attach a validator.
-- **Judge calibration** (if applicable): are scores consistent across similar-quality records? Does the judge catch visible problems? Consider the user's intent — uniformly high scores may be correct if the judge is a quality filter; a spread matters more if it's a training signal.
+Are the right Data Designer features being used? For example:
+- A text column that consistently produces structured data or code might be better as a specialized column type.
+- Values drawn from a fixed set or known distribution could use a sampler instead of an LLM column.
+- If the dataset has judge columns, check whether scores are consistent across similar-quality records and whether the judge catches visible problems.
From b4bfdb41e4fd2b6b8c37e9c446af39a42db944c5 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:26:47 -0700
Subject: [PATCH 05/12] fix: move judge calibration from design choices to data quality

---
 skills/data-designer/references/preview-review.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/data-designer/references/preview-review.md b/skills/data-designer/references/preview-review.md
index d4c309a4..479d687b 100644
--- a/skills/data-designer/references/preview-review.md
+++ b/skills/data-designer/references/preview-review.md
@@ -22,9 +22,9 @@ The specifics depend on the dataset and its intended use. The categories below a
 - **Internal consistency**: does data within a record agree with itself?
 - **Encoding integrity**: no garbled encoding, mojibake, or broken unicode.
 - **Plausibility**: do examples look like they could come from the real domain, or are they obviously synthetic?
+- **Judge calibration** (if applicable): are scores consistent across similar-quality records? Does the judge catch visible problems?
 
 ### Design Choices
 Are the right Data Designer features being used? For example:
 - A text column that consistently produces structured data or code might be better as a specialized column type.
 - Values drawn from a fixed set or known distribution could use a sampler instead of an LLM column.
-- If the dataset has judge columns, check whether scores are consistent across similar-quality records and whether the judge catches visible problems.
From 0c916c134a2edaf20a44bbffb0902c12aafde446 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:39:46 -0700
Subject: [PATCH 06/12] fix: make review offer more prominent in iterate step

---
 skills/data-designer/workflows/interactive.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index 34ee764e..d5046e36 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -23,7 +23,10 @@ This is an interactive, iterative design process. Do not disengage from the loop
 6. **Preview** — Run `data-designer preview --save-results` to generate sample records as HTML files.
    - Note the sample records directory printed by the `data-designer preview` command
    - Give the user a clickable link: `file:///sample_records_browser.html`
-7. **Iterate** — Ask the user for feedback and offer to review the records and suggest improvements yourself. If asked to review, read `references/preview-review.md`. Edit the script, re-validate, and re-preview. Repeat until they are satisfied.
+7. **Iterate**
+   - Ask the user for feedback.
+   - Offer to review the records yourself and suggest improvements. If the user accepts, read `references/preview-review.md` for guidance.
+   - Apply changes, re-validate, and re-preview. Repeat until they are satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
    - Warn the user that generation can take a long time for large record counts (50+).
From bcc94c1b8b777d37924aa03478883e89ebc02048 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 17:39:56 -0700
Subject: [PATCH 07/12] fix: clarify "the user" in iterate step

---
 skills/data-designer/workflows/interactive.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index d5046e36..dea6aa4e 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -26,7 +26,7 @@ This is an interactive, iterative design process. Do not disengage from the loop
 7. **Iterate**
    - Ask the user for feedback.
    - Offer to review the records yourself and suggest improvements. If the user accepts, read `references/preview-review.md` for guidance.
-   - Apply changes, re-validate, and re-preview. Repeat until they are satisfied.
+   - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
    - Warn the user that generation can take a long time for large record counts (50+).
From 205e64728e026bd756eb9aceb694671256704840 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 20:31:52 -0700
Subject: [PATCH 08/12] fix: generalize generation time warning across workflows

---
 skills/data-designer/workflows/autopilot.md | 4 ++--
 skills/data-designer/workflows/interactive.md | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/skills/data-designer/workflows/autopilot.md b/skills/data-designer/workflows/autopilot.md
index 4fd08489..2a61c19f 100644
--- a/skills/data-designer/workflows/autopilot.md
+++ b/skills/data-designer/workflows/autopilot.md
@@ -20,7 +20,7 @@ In this mode, make reasonable design decisions autonomously based on the dataset
    - Note the sample records directory printed by the `data-designer preview` command
    - Give the user a clickable link: `file:///sample_records_browser.html`
 7. **Create** — If the user specified a record count:
-   - 50 or fewer: run `data-designer create --num-records --dataset-name ` directly.
-   - More than 50: warn that generation can take a long time and ask for confirmation before running.
+   - Run `data-designer create --num-records --dataset-name `.
+   - Generation time depends on record count, number of LLM columns, and inference throughput. For larger datasets, warn the user and ask for confirmation before running.
    - If no record count was specified, skip this step.
 8. **Present** — Summarize what was built: columns, samplers used, key design choices. If the create command was run, share the results. Ask the user if they want any changes. If so, edit the script, re-validate, re-preview, and iterate.
diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index dea6aa4e..e7cb6868 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -29,5 +29,5 @@ This is an interactive, iterative design process. Do not disengage from the loop
    - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
-   - Warn the user that generation can take a long time for large record counts (50+).
-   - Do not run this command yourself — it can take a long time for large datasets and the user should control when it runs.
+   - Note that generation time depends on record count, number of LLM columns, and inference throughput — it can range from seconds to hours.
+   - Do not run this command yourself — the user should control when it runs.
From b5922ddde61bf62b44c04e7cae405c3759ac9f73 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 20:33:12 -0700
Subject: [PATCH 09/12] fix: soften generation time warning wording

---
 skills/data-designer/workflows/autopilot.md | 2 +-
 skills/data-designer/workflows/interactive.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/skills/data-designer/workflows/autopilot.md b/skills/data-designer/workflows/autopilot.md
index 2a61c19f..a101effc 100644
--- a/skills/data-designer/workflows/autopilot.md
+++ b/skills/data-designer/workflows/autopilot.md
@@ -21,6 +21,6 @@ In this mode, make reasonable design decisions autonomously based on the dataset
    - Give the user a clickable link: `file:///sample_records_browser.html`
 7. **Create** — If the user specified a record count:
    - Run `data-designer create --num-records --dataset-name `.
-   - Generation time depends on record count, number of LLM columns, and inference throughput. For larger datasets, warn the user and ask for confirmation before running.
+   - Generation time varies — it depends on factors like record count, number of LLM columns, and inference throughput. For larger datasets, warn the user and ask for confirmation before running.
    - If no record count was specified, skip this step.
 8. **Present** — Summarize what was built: columns, samplers used, key design choices. If the create command was run, share the results. Ask the user if they want any changes. If so, edit the script, re-validate, re-preview, and iterate.
diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index e7cb6868..5a797a26 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -29,5 +29,5 @@ This is an interactive, iterative design process. Do not disengage from the loop
    - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
-   - Note that generation time depends on record count, number of LLM columns, and inference throughput — it can range from seconds to hours.
+   - Warn that generation time varies — it depends on factors like record count, number of LLM columns, and inference throughput.
    - Do not run this command yourself — the user should control when it runs.
From 834f822405d453fe2cdb483c322d2fe611cb143a Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 20:37:00 -0700
Subject: [PATCH 10/12] fix: rephrase generation time warning

---
 skills/data-designer/workflows/autopilot.md | 2 +-
 skills/data-designer/workflows/interactive.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/skills/data-designer/workflows/autopilot.md b/skills/data-designer/workflows/autopilot.md
index a101effc..2f13b7e7 100644
--- a/skills/data-designer/workflows/autopilot.md
+++ b/skills/data-designer/workflows/autopilot.md
@@ -21,6 +21,6 @@ In this mode, make reasonable design decisions autonomously based on the dataset
    - Give the user a clickable link: `file:///sample_records_browser.html`
 7. **Create** — If the user specified a record count:
    - Run `data-designer create --num-records --dataset-name `.
-   - Generation time varies — it depends on factors like record count, number of LLM columns, and inference throughput. For larger datasets, warn the user and ask for confirmation before running.
+   - Generation speed depends heavily on the dataset configuration and the user's inference setup. For larger datasets, warn the user and ask for confirmation before running.
    - If no record count was specified, skip this step.
 8. **Present** — Summarize what was built: columns, samplers used, key design choices. If the create command was run, share the results. Ask the user if they want any changes. If so, edit the script, re-validate, re-preview, and iterate.
diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index 5a797a26..d4a4ab33 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -29,5 +29,5 @@ This is an interactive, iterative design process. Do not disengage from the loop
    - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
 8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
-   - Warn that generation time varies — it depends on factors like record count, number of LLM columns, and inference throughput.
+   - Caution the user that generation speed depends heavily on the dataset configuration and their inference setup.
    - Do not run this command yourself — the user should control when it runs.
From d4f55565b90f851f0f125d71ebfdfc0b96b37006 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Thu, 19 Mar 2026 20:43:39 -0700
Subject: [PATCH 11/12] feat: make preview review a dedicated workflow step

---
 skills/data-designer/workflows/interactive.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index d4a4ab33..3c4f31ef 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -23,11 +23,9 @@ This is an interactive, iterative design process. Do not disengage from the loop
 6. **Preview** — Run `data-designer preview --save-results` to generate sample records as HTML files.
    - Note the sample records directory printed by the `data-designer preview` command
    - Give the user a clickable link: `file:///sample_records_browser.html`
-7. **Iterate**
-   - Ask the user for feedback.
-   - Offer to review the records yourself and suggest improvements. If the user accepts, read `references/preview-review.md` for guidance.
-   - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
-8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
+7. **Review** — Review the preview records following `references/preview-review.md`. Share a brief assessment — what looks good and what could improve. Then ask the user if they want to act on any of it or have other feedback.
+8. **Iterate** — Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
+9. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
    - Caution the user that generation speed depends heavily on the dataset configuration and their inference setup.
    - Do not run this command yourself — the user should control when it runs.
From 71eaa3ab6492b20a6afa6484bba32cd5eea60173 Mon Sep 17 00:00:00 2001
From: Johnny Greco
Date: Fri, 20 Mar 2026 07:11:17 -0700
Subject: [PATCH 12/12] fix: revert preview review to an offer within the iterate step

---
 skills/data-designer/workflows/interactive.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index 3c4f31ef..d4a4ab33 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -23,9 +23,11 @@ This is an interactive, iterative design process. Do not disengage from the loop
 6. **Preview** — Run `data-designer preview --save-results` to generate sample records as HTML files.
    - Note the sample records directory printed by the `data-designer preview` command
    - Give the user a clickable link: `file:///sample_records_browser.html`
-7. **Review** — Review the preview records following `references/preview-review.md`. Share a brief assessment — what looks good and what could improve. Then ask the user if they want to act on any of it or have other feedback.
-8. **Iterate** — Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
-9. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
+7. **Iterate**
+   - Ask the user for feedback.
+   - Offer to review the records yourself and suggest improvements. If the user accepts, read `references/preview-review.md` for guidance.
+   - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
+8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
    - `data-designer create --num-records --dataset-name `.
    - Caution the user that generation speed depends heavily on the dataset configuration and their inference setup.
    - Do not run this command yourself — the user should control when it runs.
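
Reviewer note on the "Reading Sample Records" section that patch 01 adds to `preview-review.md`: the load-and-print step it describes could look roughly like the sketch below. This is an illustration under stated assumptions, not part of the series. The helper names (`latest_preview_dir`, `load_records`, `format_record`, `print_preview`) are hypothetical, "most recent" is assumed to mean the newest modification time, and the column layout of `dataset.parquet` is not assumed at all. Only the `artifacts/preview_results_*/` pattern and the `dataset.parquet` filename come from the patch text.

```python
from pathlib import Path


def latest_preview_dir(artifacts_root="artifacts"):
    """Return the most recently modified preview_results_* directory.

    Assumption: "most recent" is judged by modification time.
    """
    candidates = [p for p in Path(artifacts_root).glob("preview_results_*") if p.is_dir()]
    if not candidates:
        raise FileNotFoundError(f"no preview_results_* directory under {artifacts_root}")
    return max(candidates, key=lambda p: p.stat().st_mtime)


def load_records(results_dir):
    """Load dataset.parquet from a preview results directory as a list of dicts."""
    import pandas as pd  # imported lazily so the formatting helpers work without pandas

    df = pd.read_parquet(Path(results_dir) / "dataset.parquet")
    return df.to_dict(orient="records")


def format_record(index, record, max_chars=200):
    """Render one record as compact 'column: value' lines, truncating long values."""
    lines = [f"--- record {index} ---"]
    for column, value in record.items():
        text = " ".join(str(value).split())  # collapse newlines and repeated spaces
        if len(text) > max_chars:
            text = text[:max_chars] + "..."
        lines.append(f"{column}: {text}")
    return "\n".join(lines)


def print_preview(artifacts_root="artifacts", limit=10):
    """Print the first `limit` records of the latest preview in a reviewable form."""
    records = load_records(latest_preview_dir(artifacts_root))
    for index, record in enumerate(records[:limit]):
        print(format_record(index, record))
```

Calling `print_preview()` from the project root would print up to ten records from the latest preview. If the preview command already printed a `Results path:`, passing that path straight to `load_records` avoids the modification-time guess.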