diff --git a/README.md b/README.md index ce5efe2..a0448af 100644 --- a/README.md +++ b/README.md @@ -101,8 +101,8 @@ flowchart LR | Command | Also triggers on… | What it does | |---|---|---| -| **`/lazarus:discover`** | *"make this run locally"* · *"why won't this start?"* · *"onboard this repo"* · *"help me get oriented"* | Investigates **read-only**, writes `DISCOVERY.md` — a plan plus a concrete *definition of done* — then **stops and waits for you**. | -| **`/lazarus:repair`** | *"execute the repair plan"* · *"fix this codebase"* · *"work the blockers"* | Works the blockers in order, logs every command it actually ran to `VERIFICATION_REPORT.md`, and promotes the commands that *truly worked* into a `CLAUDE.md`. Needs a ratified `DISCOVERY.md` first. | +| **`/lazarus:discover`** | *"make this run locally"* · *"why won't this start?"* · *"onboard this repo"* · *"help me get oriented"* | Investigates **read-only**, writes `DISCOVERY.md` — a *repairability verdict*, a plan, and a concrete *definition of done* — then **stops and waits for you**. | +| **`/lazarus:repair`** | *"execute the repair plan"* · *"fix this codebase"* · *"work the blockers"* | Works the blockers in order, logs every command it actually ran to `VERIFICATION_REPORT.md`, and promotes the commands that *truly worked* into a `CLAUDE.md`. Needs a ratified `DISCOVERY.md` first — and refuses one whose verdict is *not-repairable* (never-built features are feature work, not a repair). | | **`/lazarus:audit`** | *"review this code"* · *"audit this repo"* · *"what should we fix first?"* · *"refactor or rewrite?"* | Produces a 12-section `CODEBASE_AUDIT.md` — architecture, risks, security, frontend/accessibility, a phased plan. **Read-only**; feeds `audit-repair` if you choose to act on it. | | **`/lazarus:audit-repair`** | *"execute the audit"* · *"fix the audit findings"* · *"work the Top 10 action items"* · *"apply the modernization plan"* | Executes a ratified `CODEBASE_AUDIT.md` §11 **one finding at a time** — ratify → act → verify against each item's acceptance check — safety-rails first, behind the guard. The strategic apply phase (`audit → audit-repair`), mirroring `discover → repair`. | diff --git a/docs/OVERVIEW.md b/docs/OVERVIEW.md index 5ae2cbd..7598996 100644 --- a/docs/OVERVIEW.md +++ b/docs/OVERVIEW.md @@ -36,10 +36,10 @@ The name is the namesake: it resurrects dead codebases. But it's just as useful Lazarus is **three skills in two workflows**, with a guard running across everything. ### `discover` — understand (read-only) -Runs in Claude Code's **Plan Mode** (read-only at the tool level — it physically cannot edit). It traces how the code is meant to run and writes a `DISCOVERY.md` file containing: what the app appears to do, the inferred setup/build/test/run commands, a ranked list of blockers, and a **Mechanical Definition of Done** — runnable assertions like *"`npm install` exits 0, the server stays up 30 seconds, this endpoint returns 200."* Then it **stops and waits for you to approve.** +Runs in Claude Code's **Plan Mode** (read-only at the tool level — it physically cannot edit). It traces how the code is meant to run and writes a `DISCOVERY.md` file containing: a **repairability verdict** (`repairable` / `partially-runnable` / `not-repairable` — broken-but-fixable blockers are split from never-built gaps), what the app appears to do, the inferred setup/build/test/run commands, a ranked list of blockers, and a **Mechanical Definition of Done** — runnable assertions like *"`npm install` exits 0, the server stays up 30 seconds, this endpoint returns 200."* Then it **stops and waits for you to approve.** ### `repair` — act (the only skill that changes code) -It **requires** a ratified `DISCOVERY.md` first. It works the blockers in dependency order (environment → install → build → runtime → tests → main flow), logs every command it *actually executed* to a separate `VERIFICATION_REPORT.md`, and promotes only genuinely-verified commands into a durable `CLAUDE.md`. It treats the Definition of Done as a contract — if the contract turns out wrong, it proposes an amendment rather than silently rewriting it. +It **requires** a ratified `DISCOVERY.md` first, and refuses one whose verdict is `not-repairable` — never-built functionality is feature work, not a repair. It works the blockers in dependency order (environment → install → build → runtime → tests → main flow), logs every command it *actually executed* to a separate `VERIFICATION_REPORT.md`, and promotes only genuinely-verified commands into a durable `CLAUDE.md`. It treats the Definition of Done as a contract — if the contract turns out wrong, it proposes an amendment rather than silently rewriting it. ### `audit` — assess (read-only, standalone) A separate workflow that answers a different question: *should we own this?* It produces a 12-section `CODEBASE_AUDIT.md` — architecture, risks, security, dependency health, testing, frontend/accessibility, and a phased modernization plan. It is deliberately decoupled: it doesn't depend on discover or repair, and its report is a deliverable for a human (e.g. handed to a client), not an input to the other skills. diff --git a/plugins/lazarus/skills/discover/SKILL.md b/plugins/lazarus/skills/discover/SKILL.md index 7ee2f27..7b80caf 100644 --- a/plugins/lazarus/skills/discover/SKILL.md +++ b/plugins/lazarus/skills/discover/SKILL.md @@ -57,6 +57,9 @@ Write the output to `DISCOVERY.md` at the repository root. Use this structure: ```markdown # DISCOVERY.md +## Repairability verdict +[repairable | partially-runnable | not-repairable] — [one-sentence justification, citing evidence] [tag] + ## Repository shape - Type: [single project | monorepo with N workspaces] - Languages: [list] @@ -78,9 +81,14 @@ Write the output to `DISCOVERY.md` at the repository root. Use this structure: - ... ## Blockers preventing local startup +[Fixable defects only — things that exist but are broken. A missing feature is not a blocker; it goes under Gaps.] 1. [Title] — [evidence] — [tag] — [severity: critical/high/medium] 2. ... +## Gaps (never built) +[Functionality that is referenced but was never implemented: stub bodies, imports of modules that don't exist, README/route/schema features with no code behind them. Write "None found." if there are none.] +1. [Title] — [evidence] — [tag] + ## Proposed Mechanical Definition of Done The repair phase is done when ALL of these check: - [ ] `` exits 0 @@ -97,16 +105,25 @@ The repair phase is done when ALL of these check: [Things the human must decide before repair starts. Be specific.] ``` +**Choosing the Repairability verdict.** Blockers and gaps are different kinds, not different severities — a blocker is broken code that exists; a gap can't be "fixed," only built, and building it is feature work outside repair's scope. The verdict follows from the split: + +- `repairable` — every obstacle is a fixable blocker; the Mechanical Definition of Done is achievable by repair alone. +- `partially-runnable` — the core app can be made to boot, but some of its advertised functionality is a gap. The DoD covers only what exists; list each gap so the user knows exactly what repair will NOT deliver. +- `not-repairable` — the thing the user wants to run was never built (essential components are absent, not broken). The honest deliverable is this verdict itself, with the evidence. Do not propose a DoD that quietly substitutes "build the missing pieces" for repair. + +Gaps never appear as DoD checkboxes. If a gap blocks the smoke check, that is evidence for `partially-runnable` or `not-repairable` — not a reason to add "implement X" to the plan. + Note on the smoke check for **hardware- or service-coupled apps**: if the one end-to-end assertion can't be run without something you can't supply — a physical device, a paid/external API, real credentials, a running database — say so explicitly. Make it a ratification Open Question and mark that DoD item `requires: ` instead of a plain checkbox. Never fake a smoke check or silently drop it; "this needs the camera / DB / API key to verify" is the correct, honest output, not a green check you didn't earn. ### 6. Stop for ratification Do NOT proceed to repair. After writing DISCOVERY.md, present a short summary in chat and ask the user to: -1. Review the proposed Definition of Done — these are the mechanical checks that will determine when repair is complete -2. Confirm scope (especially for monorepos) -3. Resolve any open questions -4. Approve, modify, or reject +1. Confirm the Repairability verdict — it decides whether repair runs at all, and on what subset +2. Review the proposed Definition of Done — these are the mechanical checks that will determine when repair is complete +3. Confirm scope (especially for monorepos) +4. Resolve any open questions +5. Approve, modify, or reject When the user approves, they should invoke the `repair` skill in a fresh prompt that references the ratified DISCOVERY.md. @@ -118,6 +135,8 @@ When the user approves, they should invoke the `repair` skill in a fresh prompt - Trying to discover an entire monorepo in one pass. Pick a workspace. - Recommending fixes during discovery. This phase is observation only. - Continuing into repair without explicit user ratification. +- Filing a never-built gap as a blocker. A gap can't be fixed, only built — listing it as a blocker hands repair feature work in disguise. +- Defaulting to `repairable` to be agreeable. If the evidence says the app was never finished, `not-repairable` is the useful answer, not a failure of the discovery. ## Research grounding diff --git a/plugins/lazarus/skills/repair/SKILL.md b/plugins/lazarus/skills/repair/SKILL.md index 1ad83f4..8c131b5 100644 --- a/plugins/lazarus/skills/repair/SKILL.md +++ b/plugins/lazarus/skills/repair/SKILL.md @@ -17,6 +17,10 @@ This skill executes against a ratified `DISCOVERY.md`. It is NOT a generic "fix This precondition exists because agent repair without an upstream contract has a documented failure mode: the agent silently redefines success as it goes (see arxiv 2604.04580 — "Beyond Fixed Tests"). +**Second refusal — `not-repairable`.** If `DISCOVERY.md` carries `Repairability verdict: not-repairable`, stop. Repair fixes blockers in code that exists; it does not build functionality that was never written — that is feature work needing its own plan. Quote the verdict's justification back to the user and offer the real options: re-run discovery scoped to the subset that does exist, or commission the missing pieces as deliberate feature development outside this skill. + +If the verdict is `partially-runnable`, proceed — but the `## Gaps (never built)` list stays out of scope: work only the blockers, and never quietly start building a gap because it would make the app "more done." If `DISCOVERY.md` has no verdict field (it predates the field), treat it as `repairable` and note that in `VERIFICATION_REPORT.md`. + ## Workflow ### 1. Load and confirm the contract @@ -24,6 +28,7 @@ This precondition exists because agent repair without an upstream contract has a Read `DISCOVERY.md`. State back to the user, in two or three sentences: - What the app appears to do +- The Repairability verdict (and, if `partially-runnable`, which gaps are explicitly NOT being built) - What the Mechanical Definition of Done requires - Which blockers will be worked through @@ -157,6 +162,7 @@ Produce `IMPLEMENTATION_SUMMARY.md` at repo root: - Promoting unverified claims to CLAUDE.md — pollutes durable guidance with assumptions - Continuing to grind on a blocker after two genuine attempts — mark deferred and move on - Removing business logic just because it looks old or doesn't match modern patterns +- Building a never-built gap because it "blocks" a DoD item — that's feature work in disguise; surface it as a DoD amendment or a `not-repairable` escalation instead - Committing generated build artifacts — the build writes to the config's `outDir` (read it; don't assume `dist/`); gitignore that folder plus `node_modules` and caches before staging, so a repair never commits build output (a newly-created *lockfile*, though, you keep) ## Research grounding