
docs: agent-assisted development plan for DataDesigner #428

Open
nabinchha wants to merge 8 commits into main from nmulepati/docs/427-plans-for-agent-first


@nabinchha nabinchha commented Mar 17, 2026

Summary

Adds a comprehensive development plan for introducing agent-assisted workflows into DataDesigner, inspired by NVIDIA/OpenShell. The plan covers infrastructure consolidation, documentation restructuring, and GitHub machinery updates across four phases.

Related Issue

Closes #427

Changes

  • New plans/427/agent-first-development-plan.md with:
    • Problem statement: DataDesigner has meaningful agent infrastructure (7 skills, introspection CLI) but top-level docs don't surface it
    • Phase 1: Consolidate agent assets into .agents/ as a tool-agnostic canonical path, with symlinks for .claude/ and .codex/ compatibility
    • Phase 2: Restructure documentation — split AGENTS.md into focused files (STYLEGUIDE.md, DEVELOPMENT.md), update README.md and CONTRIBUTING.md to advertise agent workflows, create architecture/ skeleton
    • Phase 3: GitHub machinery — update issue templates with agent investigation fields, add PR template, update CODEOWNERS, create label taxonomy
    • Phase 4: Future work (new skills, sub-agent personas, triage automation) — flagged as requiring separate planning
    • Delivery strategy: Incremental PRs per phase, with Phase 4 requiring its own planning pass before implementation
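The Phase 1 consolidation can be sketched as a small script. This is a minimal sketch, not the plan's implementation: it assumes each tool directory (`.claude`, `.codex`) is linked as a whole to `.agents/`, whereas the plan may link only subdirectories. The function name `link_tool_dirs` is hypothetical.

```python
import os
from pathlib import Path


def link_tool_dirs(repo_root: Path, tool_dirs=(".claude", ".codex")) -> list[Path]:
    """Create relative symlinks from tool-specific directories to .agents/.

    Assumption: whole-directory links. Existing paths are left untouched so a
    real directory is never clobbered; those need a manual migration first.
    """
    canonical = repo_root / ".agents"
    canonical.mkdir(exist_ok=True)

    created = []
    for name in tool_dirs:
        link = repo_root / name
        # is_symlink() is checked first so that broken links are also skipped.
        if link.is_symlink() or link.exists():
            continue
        # A relative target keeps the link valid when the repo is cloned elsewhere.
        os.symlink(".agents", link, target_is_directory=True)
        created.append(link)
    return created
```

Calling `link_tool_dirs(Path("."))` from the repository root would create the links once and do nothing on later runs. Note that Git stores symlinks as links, so checkouts on platforms without symlink support may need a fallback.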

Plan for optimizing DataDesigner for agent-assisted development
workflows, inspired by patterns from NVIDIA/OpenShell. Covers
foundation document updates, GitHub machinery, skill infrastructure
consolidation, and architecture documentation.

Closes #427
@nabinchha nabinchha requested a review from a team as a code owner March 17, 2026 21:11

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR adds a comprehensive agent-assisted development plan (plans/427/agent-first-development-plan.md) that outlines four phases for making DataDesigner more agent-friendly: restructuring AGENTS.md, consolidating skill infrastructure into a tool-agnostic .agents/ directory, updating contributor-facing documentation, and adding GitHub machinery (issue/PR templates, labels, CODEOWNERS). The plan is well-structured and the phased delivery strategy is sensible.

Two factual issues were found that should be corrected before the plan is used to drive implementation:

  • "Forthcoming" usage skill already exists: The plan repeatedly defers README and CONTRIBUTING.md edits to a future moment "when the usage skill ships", but skills/data-designer/SKILL.md was already merged into main via PR #434 ("feat: add Data Designer skill"). References on lines 21, 162, 166, and 189 should be updated to treat this skill as present.
  • new-sdg skill missing from the skills table: The proposed CONTRIBUTING.md skills table lists only 6 of the 7 .claude/skills/ skills, omitting new-sdg. Given the plan's emphasis on clearly separating development and usage surfaces, new-sdg's categorization should be made explicit.

Confidence Score: 3/5

  • Safe to merge after correcting two factual inaccuracies — the plan references a skill that has already shipped as forthcoming, and omits a skill from the proposed contributor inventory.
  • Documentation-only change with no code impact, so no runtime risk. However, two concrete factual errors could mislead implementers: (1) the skills/data-designer/ usage skill was already merged before this branch was last synced with main, making several "once available" deferrals incorrect, and (2) new-sdg is missing from the skills table in a plan that emphasizes skill discoverability. These issues are scoped entirely to the planning document.
  • plans/427/agent-first-development-plan.md: lines 21, 162, 166, 189 (forthcoming skill), and the skills table around lines 193-199 (missing new-sdg).

Important Files Changed

| Filename | Overview |
|---|---|
| plans/427/agent-first-development-plan.md | New planning document for agent-assisted development workflow across 4 phases. Contains two factual issues: the "forthcoming" usage skill (skills/data-designer/) already shipped in PR #434, and the new-sdg skill is omitted from the proposed CONTRIBUTING.md skills inventory table despite being a .claude/skills/ resident that sits on the development/usage boundary the plan defines. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    P0["Phase 0\nAGENTS.md restructure\n(~50 lines)"]
    P1["Phase 1\nSkill & Agent Infrastructure\n(.agents/ consolidation,\n.claude/ symlinks)"]
    P2a["Phase 2\nFoundation Docs\n(STYLEGUIDE.md, DEVELOPMENT.md,\nCONTRIBUTING.md, README.md)"]
    P3a["Phase 3 (parallel)\nIssue templates"]
    P3b["Phase 3 (parallel)\nPR template"]
    P3c["Phase 3 (parallel)\nCODEOWNERS"]
    P3d["Phase 3 (parallel)\nLabel taxonomy"]
    P3e["Phase 3 (parallel)\narchitecture/ skeleton"]
    P3f["Phase 3\nSkill template conformance\n(create-pr, review-code)"]
    P4["Phase 4 (separate plan)\nNew skills, sub-agent personas,\ntriage automation"]

    P0 --> P1
    P0 --> P2a
    P2a --> P3a
    P2a --> P3b
    P1 --> P2a
    P3a --> P3f
    P3b --> P3f
    P3c -.->|independent| P3a
    P3d -.->|independent| P3a
    P3e -.->|independent| P3a
    P3f --> P4
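
The solid edges in the flowchart above imply a delivery order. As a rough sketch, they can be fed to Python's standard `graphlib` to list which phases can proceed together. The dashed "independent" edges are treated here as carrying no prerequisite, which is an interpretation of the chart rather than something the plan states; `DEPS` and `delivery_waves` are hypothetical names.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of phases it depends on (solid arrows only).
# Assumption: P3c, P3d and P3e have no hard prerequisites because their
# only edges are dashed and labelled "independent".
DEPS = {
    "P1": {"P0"},
    "P2a": {"P0", "P1"},
    "P3a": {"P2a"},
    "P3b": {"P2a"},
    "P3f": {"P3a", "P3b"},
    "P4": {"P3f"},
    "P3c": set(),
    "P3d": set(),
    "P3e": set(),
}


def delivery_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group phases into waves; every phase in a wave has all prerequisites done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the edges contain a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(delivery_waves(DEPS), start=1):
        print(number, ", ".join(wave))
```

Under these assumptions the first wave is P0 together with the three independent Phase 3 items, followed by P1, P2a, then P3a and P3b, then P3f, and finally P4.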
Prompt To Fix All With AI
This is a comment left during a code review.
Path: plans/427/agent-first-development-plan.md
Line: 21

Comment:
**"Forthcoming" usage skill has already shipped**

The plan describes the official "build a dataset" skill as forthcoming and uses conditional language ("once available", "when usage tooling ships") throughout, but `skills/data-designer/SKILL.md` already exists in the repository — it was merged into `main` via PR #434 (`96d1956`) before this plan's branch was synced. This affects several sections of the document:

- Line 21: `"the forthcoming official 'build a dataset' skill"` — should be `"the official 'data-designer' skill at skills/data-designer/"`
- Line 162: `"the official usage skill (once available)"` — the skill is already available
- Line 166: `"When usage tooling … ships, link to it here"` — it has already shipped; the README can link to it now
- Line 189: `"see … the official usage skill (once available)"` — same issue

Because the plan uses this "forthcoming" framing to defer several concrete README and CONTRIBUTING.md edits to a future phase, leaving these references as-is may cause confusion when implementing: the skill exists, its canonical path is `skills/data-designer/`, and it already lives outside `.claude/skills/` in the way the plan intended. The plan should be updated to reference the skill as present and link to its actual location.

How can I resolve this? If you propose a fix, please make it concise.
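
One way to locate the stale deferrals listed above is a simple phrase scan over the plan file. This is only a sketch: the phrase list contains the two wordings quoted in full in the comment ("forthcoming", "once available"), and `find_stale_phrases` is a hypothetical helper, not part of the repository.

```python
# Phrases taken from the review comment; extend the tuple as needed.
STALE_PHRASES = ("forthcoming", "once available")


def find_stale_phrases(text: str, phrases=STALE_PHRASES) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for lines containing any phrase.

    Matching is case-insensitive and line numbers are 1-based, so they can be
    compared directly with the line references in a review comment.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        if any(phrase in lowered for phrase in phrases):
            hits.append((number, line.strip()))
    return hits
```

A typical use would be `find_stale_phrases(Path("plans/427/agent-first-development-plan.md").read_text())` after importing `Path` from `pathlib`. Hits still need a human read, since "forthcoming" may legitimately describe other work.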

---

This is a comment left during a code review.
Path: plans/427/agent-first-development-plan.md
Line: 193-199

Comment:
**`new-sdg` skill omitted from the skills inventory table**

The skills table in the proposed `CONTRIBUTING.md` section lists only 6 of the 7 skills currently in `.claude/skills/`:

| Category | Skills |
|---|---|
| Investigation | `search-docs`, `search-github` |
| Development | `commit`, `create-pr`, `update-pr` |
| Review | `review-code` |

The `new-sdg` skill (`disable-model-invocation: true`, description: *"Implement a new synthetic data generator using NeMo Data Designer"*) is absent. This is a meaningful omission given that the plan places significant emphasis on distinguishing the **development** surface from the **usage** surface. `new-sdg` sits squarely on the boundary: it helps implement a new SDG configuration inside the repo, but it drives DataDesigner as a user would. Its classification should be made explicit in the plan — either:

- Add it to the table under a new "Usage-adjacent" or "Authoring" category, or
- Acknowledge in Phase 1b (Skill Cross-Reference Cleanup) that `new-sdg`'s `internal: true` metadata already marks it as a development-only skill and note where it falls in the taxonomy.

Leaving it out of the table means the `CONTRIBUTING.md` will silently omit a discoverable skill, which contradicts the plan's goal of making agent infrastructure visible to contributors.

How can I resolve this? If you propose a fix, please make it concise.
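
An omission like this can be caught mechanically by comparing the skill directories on disk with the names that appear in the proposed table. The sketch below assumes each skill is a directory containing a `SKILL.md` file (the layout seen at `skills/data-designer/SKILL.md`) and that skill names are written in backticks in the table; `missing_from_table` is a hypothetical helper.

```python
import re
from pathlib import Path


def missing_from_table(skills_dir: Path, table_text: str) -> list[str]:
    """Return skill names found on disk but never mentioned in backticks.

    Assumption: a skill is any subdirectory of skills_dir with a SKILL.md file.
    """
    on_disk = sorted(
        entry.name for entry in skills_dir.iterdir() if (entry / "SKILL.md").is_file()
    )
    # Collect every `inline code` span in the table text.
    mentioned = set(re.findall(r"`([^`\n]+)`", table_text))
    return [name for name in on_disk if name not in mentioned]
```

Run against `.claude/skills/` and the table quoted above, this would be expected to report `new-sdg`. Whether that skill belongs in the table at all is a taxonomy decision the check cannot make.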

Last reviewed commit: "Merge branch 'main' ..."

nabinchha and others added 3 commits March 17, 2026 15:18
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Incorporate johnnygreco's review comments from PR #428:
- Distinguish development tooling vs usage tooling throughout
- Promote AGENTS.md restructure to Phase 0 (~50 lines target)
- Remove skills inventory, workflows, and conventions from AGENTS.md scope
- Remove new-sdg from skill categories (repo skills = development only)
- Overhaul CONTRIBUTING.md toward plan-submission-via-issues workflow
- Tone down README agent-first messaging to 1-2 sentences
- Simplify CODEOWNERS to single maintainer group
- Resolve 4 of 5 open questions per reviewer answers
- Fix malformed markdown and Out of Scope contradiction
- Add AGENTS.md redirect for dataset-building agents
- Tag skills as development-scoped in metadata

Made-with: Cursor
@nabinchha nabinchha requested a review from johnnygreco March 18, 2026 21:59
Development

Successfully merging this pull request may close these issues.

chore: optimize Data Designer for agent-first development
