feat: add preview review reference and update interactive iterate step #441
base: main
@@ -0,0 +1,30 @@
# Preview Review Guide

## Mindset

Quality is statistical, not per-record. Fix systemic issues that affect many records; don't chase cosmetic flaws in individual ones. But don't stop early: clear patterns of broken data or ignored instructions are worth fixing.

## Reading Sample Records

Load `dataset.parquet` from the preview results directory (printed as `Results path:` by the preview command, or the most recent `artifacts/preview_results_*/` directory). Use pandas to load the parquet file and print the records in a compact, reviewable format.
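A minimal pandas sketch of this step. The helper name, truncation width, and directory-selection logic are illustrative assumptions; the `artifacts/preview_results_*/` pattern comes from the guide itself.

```python
import contextlib
import glob
import os

import pandas as pd


def print_records(df: pd.DataFrame, n: int = 5) -> None:
    """Print the first n records, one truncated field per line."""
    for i, row in df.head(n).iterrows():
        print(f"--- record {i} ---")
        for col, val in row.items():
            # Flatten newlines and truncate so each field stays on one line.
            text = str(val).replace("\n", " ")
            print(f"{col}: {text[:120]}")


# Pick the most recent preview results directory, as the guide suggests.
dirs = sorted(glob.glob("artifacts/preview_results_*/"), key=os.path.getmtime)
if dirs:
    records = pd.read_parquet(os.path.join(dirs[-1], "dataset.parquet"))
    print_records(records)
```

If the preview command printed an explicit `Results path:`, pass that directory instead of globbing.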

## What to Look For

The specifics depend on the dataset and its intended use. The categories below are common starting points; adapt based on what matters for this dataset.

### Diversity

- **Mode collapse**: are records clustering around the same patterns, topics, or phrasings?
- **Sampler effectiveness**: are samplers being used effectively to steer diversity in the dataset?
- **Structural monotony**: do LLM-generated columns follow the same template across records?
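The mode-collapse check can be quantified with a rough sketch like the following. The function name and the two signals (exact-duplicate rate, concentration in the top values) are illustrative assumptions, not part of Data Designer.

```python
import pandas as pd


def diversity_report(df: pd.DataFrame, column: str, top_k: int = 5) -> dict:
    """Rough diversity signals for one column: the exact-duplicate rate and
    how concentrated values are in the top_k most common entries."""
    values = df[column].astype(str)
    dup_rate = 1 - values.nunique() / len(values)
    top_share = values.value_counts(normalize=True).head(top_k).sum()
    return {"duplicate_rate": float(dup_rate), f"top_{top_k}_share": float(top_share)}
```

A high duplicate rate or top-k share on a column that is supposed to vary is a hint of mode collapse; the exact thresholds depend on the dataset.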

### Data Quality

- **Instruction compliance**: does generated content follow prompt constraints (step counts, format requirements, allowed values)?
- **Internal consistency**: does data within a record agree with itself?
- **Encoding integrity**: no garbled encoding, mojibake, or broken unicode.
- **Plausibility**: do examples look like they could come from the real domain, or are they obviously synthetic?
- **Judge calibration** (if applicable): are scores consistent across similar-quality records? Does the judge catch visible problems?
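The allowed-values and encoding-integrity checks above can be sketched with pandas. The helper name, the `allowed` set, and the mojibake markers are illustrative assumptions; real checks should match the dataset's own constraints.

```python
import pandas as pd


def quality_checks(df: pd.DataFrame, column: str, allowed: set) -> pd.DataFrame:
    """Flag records whose value is outside the allowed set or contains
    common mojibake markers (e.g. 'â€' runs from mis-decoded UTF-8)."""
    col = df[column].astype(str)
    bad_value = ~col.isin(allowed)
    mojibake = col.str.contains("â€|Ã©|\ufffd", regex=True)
    return df.assign(bad_value=bad_value, mojibake=mojibake)
```

Sorting or filtering on the flag columns then surfaces the records worth reading closely.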

### Design Choices

Are the right Data Designer features being used? For example:

- A text column that consistently produces structured data or code might be better as a specialized column type.
- Values drawn from a fixed set or known distribution could use a sampler instead of an LLM column.
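A rough heuristic for the sampler suggestion above: columns whose values come from a small fixed set are candidates for a sampler instead of an LLM column. The function name and cardinality cutoff are arbitrary illustrations.

```python
import pandas as pd


def sampler_candidates(df: pd.DataFrame, max_unique: int = 10) -> list:
    """List columns with at most max_unique distinct values."""
    return [c for c in df.columns if df[c].nunique(dropna=True) <= max_unique]
```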

skills/data-designer/workflows/autopilot.md
@@ -20,7 +20,7 @@ In this mode, make reasonable design decisions autonomously based on the dataset
 - Note the sample records directory printed by the `data-designer preview` command
 - Give the user a clickable link: `file://<sample-records-dir>/sample_records_browser.html`
 7. **Create** — If the user specified a record count:
-  - 50 or fewer: run `data-designer create <path> --num-records <N> --dataset-name <name>` directly.
-  - More than 50: warn that generation can take a long time and ask for confirmation before running.
+  - Run `data-designer create <path> --num-records <N> --dataset-name <name>`.
+  - Generation speed depends heavily on the dataset configuration and the user's inference setup. For larger datasets, warn the user and ask for confirmation before running.
Comment on lines +23 to +24

**Contributor**

**Contradictory instruction order in Create step**

The first bullet instructs the agent to run the create command unconditionally, but the second bullet says to warn and ask for confirmation before running for larger datasets. An agent following these bullets in order would execute the long-running command first, then warn the user, making the confirmation meaningless. The warning/confirmation check should come before the run instruction. Consider restructuring so the guard comes first:

```suggestion
- Generation speed depends heavily on the dataset configuration and the user's inference setup. For larger datasets, warn the user and ask for confirmation before running.
- Run `data-designer create <path> --num-records <N> --dataset-name <name>`.
```
   - If no record count was specified, skip this step.
 8. **Present** — Summarize what was built: columns, samplers used, key design choices. If the create command was run, share the results. Ask the user if they want any changes. If so, edit the script, re-validate, re-preview, and iterate.
**`dataset.parquet` existence not guaranteed by workflow**

The guide instructs the agent to load `dataset.parquet` from the preview results directory, but the **Preview** step in both workflow files only documents that `data-designer preview --save-results` produces HTML files (specifically `sample_records_browser.html`). There is no mention anywhere in the workflow documentation that a `dataset.parquet` file is written to that directory.

If `--save-results` does not produce a parquet file, an agent following this guide will hit a missing-file error when trying to load it for self-review. Either:

- confirm that `--save-results` always produces a `dataset.parquet` alongside the HTML output (and note this in the **Preview** steps of both workflow files), or