From 88ae78ff6acb15ac54f670ae1f5c685243b12505 Mon Sep 17 00:00:00 2001
From: Larry Stewart
Date: Tue, 9 Jun 2026 15:11:27 -0400
Subject: [PATCH 1/2] discover: add Repairability verdict + blockers/gaps split; repair refuses not-repairable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

DISCOVERY.md previously had no way to say 'this can't be repaired because it was never built' — only blocker severity. Now:

- discover emits a top-of-file Repairability verdict (repairable | partially-runnable | not-repairable) and splits fixable blockers from never-built gaps; gaps never become DoD checkboxes
- repair gains a second refusal: it stops on not-repairable, keeps gaps out of scope on partially-runnable, and treats a missing verdict field (pre-existing DISCOVERY.md files) as repairable
- README + OVERVIEW updated to match

Co-Authored-By: Claude Fable 5
---
 README.md                                |  4 ++--
 docs/OVERVIEW.md                         |  4 ++--
 plugins/lazarus/skills/discover/SKILL.md | 27 ++++++++++++++++++++----
 plugins/lazarus/skills/repair/SKILL.md   |  6 ++++++
 4 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index ce5efe2..a0448af 100644
--- a/README.md
+++ b/README.md
@@ -101,8 +101,8 @@ flowchart LR

 | Command | Also triggers on… | What it does |
 |---|---|---|
-| **`/lazarus:discover`** | *"make this run locally"* · *"why won't this start?"* · *"onboard this repo"* · *"help me get oriented"* | Investigates **read-only**, writes `DISCOVERY.md` — a plan plus a concrete *definition of done* — then **stops and waits for you**. |
-| **`/lazarus:repair`** | *"execute the repair plan"* · *"fix this codebase"* · *"work the blockers"* | Works the blockers in order, logs every command it actually ran to `VERIFICATION_REPORT.md`, and promotes the commands that *truly worked* into a `CLAUDE.md`. Needs a ratified `DISCOVERY.md` first. |
+| **`/lazarus:discover`** | *"make this run locally"* · *"why won't this start?"* · *"onboard this repo"* · *"help me get oriented"* | Investigates **read-only**, writes `DISCOVERY.md` — a *repairability verdict*, a plan, and a concrete *definition of done* — then **stops and waits for you**. |
+| **`/lazarus:repair`** | *"execute the repair plan"* · *"fix this codebase"* · *"work the blockers"* | Works the blockers in order, logs every command it actually ran to `VERIFICATION_REPORT.md`, and promotes the commands that *truly worked* into a `CLAUDE.md`. Needs a ratified `DISCOVERY.md` first — and refuses one whose verdict is *not-repairable* (never-built features are feature work, not a repair). |
 | **`/lazarus:audit`** | *"review this code"* · *"audit this repo"* · *"what should we fix first?"* · *"refactor or rewrite?"* | Produces a 12-section `CODEBASE_AUDIT.md` — architecture, risks, security, frontend/accessibility, a phased plan. **Read-only**; feeds `audit-repair` if you choose to act on it. |
 | **`/lazarus:audit-repair`** | *"execute the audit"* · *"fix the audit findings"* · *"work the Top 10 action items"* · *"apply the modernization plan"* | Executes a ratified `CODEBASE_AUDIT.md` §11 **one finding at a time** — ratify → act → verify against each item's acceptance check — safety-rails first, behind the guard. The strategic apply phase (`audit → audit-repair`), mirroring `discover → repair`. |

diff --git a/docs/OVERVIEW.md b/docs/OVERVIEW.md
index 5ae2cbd..7598996 100644
--- a/docs/OVERVIEW.md
+++ b/docs/OVERVIEW.md
@@ -36,10 +36,10 @@ The name is the namesake: it resurrects dead codebases. But it's just as useful
 Lazarus is **three skills in two workflows**, with a guard running across everything.

 ### `discover` — understand (read-only)
-Runs in Claude Code's **Plan Mode** (read-only at the tool level — it physically cannot edit). It traces how the code is meant to run and writes a `DISCOVERY.md` file containing: what the app appears to do, the inferred setup/build/test/run commands, a ranked list of blockers, and a **Mechanical Definition of Done** — runnable assertions like *"`npm install` exits 0, the server stays up 30 seconds, this endpoint returns 200."* Then it **stops and waits for you to approve.**
+Runs in Claude Code's **Plan Mode** (read-only at the tool level — it physically cannot edit). It traces how the code is meant to run and writes a `DISCOVERY.md` file containing: a **repairability verdict** (`repairable` / `partially-runnable` / `not-repairable` — broken-but-fixable blockers are split from never-built gaps), what the app appears to do, the inferred setup/build/test/run commands, a ranked list of blockers, and a **Mechanical Definition of Done** — runnable assertions like *"`npm install` exits 0, the server stays up 30 seconds, this endpoint returns 200."* Then it **stops and waits for you to approve.**

 ### `repair` — act (the only skill that changes code)
-It **requires** a ratified `DISCOVERY.md` first. It works the blockers in dependency order (environment → install → build → runtime → tests → main flow), logs every command it *actually executed* to a separate `VERIFICATION_REPORT.md`, and promotes only genuinely-verified commands into a durable `CLAUDE.md`. It treats the Definition of Done as a contract — if the contract turns out wrong, it proposes an amendment rather than silently rewriting it.
+It **requires** a ratified `DISCOVERY.md` first, and refuses one whose verdict is `not-repairable` — never-built functionality is feature work, not a repair. It works the blockers in dependency order (environment → install → build → runtime → tests → main flow), logs every command it *actually executed* to a separate `VERIFICATION_REPORT.md`, and promotes only genuinely-verified commands into a durable `CLAUDE.md`. It treats the Definition of Done as a contract — if the contract turns out wrong, it proposes an amendment rather than silently rewriting it.

 ### `audit` — assess (read-only, standalone)
 A separate workflow that answers a different question: *should we own this?* It produces a 12-section `CODEBASE_AUDIT.md` — architecture, risks, security, dependency health, testing, frontend/accessibility, and a phased modernization plan. It is deliberately decoupled: it doesn't depend on discover or repair, and its report is a deliverable for a human (e.g. handed to a client), not an input to the other skills.
diff --git a/plugins/lazarus/skills/discover/SKILL.md b/plugins/lazarus/skills/discover/SKILL.md
index 7ee2f27..7b80caf 100644
--- a/plugins/lazarus/skills/discover/SKILL.md
+++ b/plugins/lazarus/skills/discover/SKILL.md
@@ -57,6 +57,9 @@ Write the output to `DISCOVERY.md` at the repository root. Use this structure:
 ```markdown
 # DISCOVERY.md

+## Repairability verdict
+[repairable | partially-runnable | not-repairable] — [one-sentence justification, citing evidence] [tag]
+
 ## Repository shape
 - Type: [single project | monorepo with N workspaces]
 - Languages: [list]
@@ -78,9 +81,14 @@ Write the output to `DISCOVERY.md` at the repository root. Use this structure:
 - ...

 ## Blockers preventing local startup
+[Fixable defects only — things that exist but are broken. A missing feature is not a blocker; it goes under Gaps.]
 1. [Title] — [evidence] — [tag] — [severity: critical/high/medium]
 2. ...

+## Gaps (never built)
+[Functionality that is referenced but was never implemented: stub bodies, imports of modules that don't exist, README/route/schema features with no code behind them. Write "None found." if there are none.]
+1. [Title] — [evidence] — [tag]
+
 ## Proposed Mechanical Definition of Done
 The repair phase is done when ALL of these check:
 - [ ] `` exits 0
@@ -97,16 +105,25 @@ The repair phase is done when ALL of these check:
 [Things the human must decide before repair starts. Be specific.]
 ```

+**Choosing the Repairability verdict.** Blockers and gaps are different kinds, not different severities — a blocker is broken code that exists; a gap can't be "fixed," only built, and building it is feature work outside repair's scope. The verdict follows from the split:
+
+- `repairable` — every obstacle is a fixable blocker; the Mechanical Definition of Done is achievable by repair alone.
+- `partially-runnable` — the core app can be made to boot, but some of its advertised functionality is a gap. The DoD covers only what exists; list each gap so the user knows exactly what repair will NOT deliver.
+- `not-repairable` — the thing the user wants to run was never built (essential components are absent, not broken). The honest deliverable is this verdict itself, with the evidence. Do not propose a DoD that quietly substitutes "build the missing pieces" for repair.
+
+Gaps never appear as DoD checkboxes. If a gap blocks the smoke check, that is evidence for `partially-runnable` or `not-repairable` — not a reason to add "implement X" to the plan.
+
 Note on the smoke check for **hardware- or service-coupled apps**: if the one end-to-end assertion can't be run without something you can't supply — a physical device, a paid/external API, real credentials, a running database — say so explicitly. Make it a ratification Open Question and mark that DoD item `requires: ` instead of a plain checkbox. Never fake a smoke check or silently drop it; "this needs the camera / DB / API key to verify" is the correct, honest output, not a green check you didn't earn.

 ### 6. Stop for ratification

 Do NOT proceed to repair. After writing DISCOVERY.md, present a short summary in chat and ask the user to:

-1. Review the proposed Definition of Done — these are the mechanical checks that will determine when repair is complete
-2. Confirm scope (especially for monorepos)
-3. Resolve any open questions
-4. Approve, modify, or reject
+1. Confirm the Repairability verdict — it decides whether repair runs at all, and on what subset
+2. Review the proposed Definition of Done — these are the mechanical checks that will determine when repair is complete
+3. Confirm scope (especially for monorepos)
+4. Resolve any open questions
+5. Approve, modify, or reject

 When the user approves, they should invoke the `repair` skill in a fresh prompt that references the ratified DISCOVERY.md.

@@ -118,6 +135,8 @@ When the user approves, they should invoke the `repair` skill in a fresh prompt
 - Trying to discover an entire monorepo in one pass. Pick a workspace.
 - Recommending fixes during discovery. This phase is observation only.
 - Continuing into repair without explicit user ratification.
+- Filing a never-built gap as a blocker. A gap can't be fixed, only built — listing it as a blocker hands repair feature work in disguise.
+- Defaulting to `repairable` to be agreeable. If the evidence says the app was never finished, `not-repairable` is the useful answer, not a failure of the discovery.

 ## Research grounding

diff --git a/plugins/lazarus/skills/repair/SKILL.md b/plugins/lazarus/skills/repair/SKILL.md
index 1ad83f4..8c131b5 100644
--- a/plugins/lazarus/skills/repair/SKILL.md
+++ b/plugins/lazarus/skills/repair/SKILL.md
@@ -17,6 +17,10 @@ This skill executes against a ratified `DISCOVERY.md`. It is NOT a generic "fix

 This precondition exists because agent repair without an upstream contract has a documented failure mode: the agent silently redefines success as it goes (see arxiv 2604.04580 — "Beyond Fixed Tests").

+**Second refusal — `not-repairable`.** If `DISCOVERY.md` carries `Repairability verdict: not-repairable`, stop. Repair fixes blockers in code that exists; it does not build functionality that was never written — that is feature work needing its own plan. Quote the verdict's justification back to the user and offer the real options: re-run discovery scoped to the subset that does exist, or commission the missing pieces as deliberate feature development outside this skill.
+
+If the verdict is `partially-runnable`, proceed — but the `## Gaps (never built)` list stays out of scope: work only the blockers, and never quietly start building a gap because it would make the app "more done." If `DISCOVERY.md` has no verdict field (it predates the field), treat it as `repairable` and note that in `VERIFICATION_REPORT.md`.
+
 ## Workflow

 ### 1. Load and confirm the contract
@@ -24,6 +28,7 @@ This precondition exists because agent repair without an upstream contract has a
 Read `DISCOVERY.md`. State back to the user, in two or three sentences:

 - What the app appears to do
+- The Repairability verdict (and, if `partially-runnable`, which gaps are explicitly NOT being built)
 - What the Mechanical Definition of Done requires
 - Which blockers will be worked through

@@ -157,6 +162,7 @@ Produce `IMPLEMENTATION_SUMMARY.md` at repo root:
 - Promoting unverified claims to CLAUDE.md — pollutes durable guidance with assumptions
 - Continuing to grind on a blocker after two genuine attempts — mark deferred and move on
 - Removing business logic just because it looks old or doesn't match modern patterns
+- Building a never-built gap because it "blocks" a DoD item — that's feature work in disguise; surface it as a DoD amendment or a `not-repairable` escalation instead
 - Committing generated build artifacts — the build writes to the config's `outDir` (read it; don't assume `dist/`); gitignore that folder plus `node_modules` and caches before staging, so a repair never commits build output (a newly-created *lockfile*, though, you keep)

 ## Research grounding

From 80cf3573827d140925dc565224859cab2679fdeb Mon Sep 17 00:00:00 2001
From: Larry Stewart
Date: Tue, 9 Jun 2026 15:18:10 -0400
Subject: [PATCH 2/2] =?UTF-8?q?README:=20restructure=20pair-first=20?=
 =?UTF-8?q?=E2=80=94=20two=20journeys,=20four=20commands?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Four commands read as four choices; they're not. Lead with the two journeys (discover→repair, audit→audit-repair), each plan → you approve → execute:

- 'Which to reach for' becomes 'Two journeys, four commands' — a two-row journey table (guard as a standing line, not a row)
- new callout: don't memorize the order — every skill routes you to the right phase (repair offers discover; audit-repair refuses without a ratified audit)
- mermaid finally shows the audit→audit-repair leg (it dead-ended at the report, predating v0.4.0)
- command table split into the two journey groups
- intro bullets name their command pairs; FAQ rows catch up to audit-repair

Co-Authored-By: Claude Fable 5
---
 README.md | 46 +++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index a0448af..227c8f3 100644
--- a/README.md
+++ b/README.md
@@ -18,24 +18,24 @@ Point Claude at a repository and let Lazarus help make it: Alive again, document

 **Lazarus** is a Claude Code plugin for working on a codebase with an AI agent you can actually trust. It does **two jobs** on *any* repo — yours, one you inherited, an open-source project, healthy or broken:

-- 🔧 **Make it run** — point it at code that won't start, or that you simply don't know yet. It investigates, proposes a plan with a concrete "done" checklist you approve, then works through the blockers until the app boots — and writes down what actually worked so the next person (or the next you) doesn't start from zero.
-- 🧭 **Assess it** — get a principal-engineer read: what's risky, what to fix first, and whether to maintain, refactor, or rewrite. A report you act on or hand to a client. Nothing in the repo changes.
+- 🔧 **Make it run** (`discover` → `repair`) — point it at code that won't start, or that you simply don't know yet. It investigates, proposes a plan with a concrete "done" checklist you approve, then works through the blockers until the app boots — and writes down what actually worked so the next person (or the next you) doesn't start from zero.
+- 🧭 **Assess it — and, if you choose, fix it** (`audit` → `audit-repair`) — get a principal-engineer read: what's risky, what to fix first, and whether to maintain, refactor, or rewrite. A report you act on, hand to a client — or have executed finding-by-finding, each behind your approval.

-Both run behind a guard that blocks destructive commands before they ever run — and on the "make it run" side, **nothing changes until you approve the plan.** It'll resurrect a dead repo that won't even start (the namesake), but it's just as useful on healthy code you want made runnable, understood, or assessed.
+Both journeys run behind a guard that blocks destructive commands before they ever run — and in both, **nothing changes until you approve a plan.** It'll resurrect a dead repo that won't even start (the namesake), but it's just as useful on healthy code you want made runnable, understood, or assessed.

-## 🧭 Which to reach for
+## 🧭 Two journeys, four commands

-Four skills in **two workflows** — each now *plan-then-execute* — with the guard across both. Match your situation:
+Lazarus looks like four skills, but you only ever choose between **two journeys**. Each is *plan → you approve → execute* — the four commands are just the steps:

-| Your situation | Reach for | What happens |
+| You want… | The journey | How it flows |
 |---|---|---|
-| *"It won't run"* · *"I'm lost in this repo"* · *"I need to change it safely"* | 🔍 **`discover`** → *you approve* → 🔧 **`repair`** | `discover` investigates read-only and writes a plan with a runnable "done" checklist; you approve it; `repair` works the blockers until each one passes — recording what actually worked in `CLAUDE.md`. |
-| *"What shape is this in?"* · *"What do we fix first?"* · *"Maintain, refactor, or rewrite?"* | 🧭 **`audit`** | A read-only, 12-section principal-engineer report — architecture, risks, security, dependency health, a phased plan. Changes nothing; it's a deliverable you act on (or hand to a client). |
-| *"Now go fix what the audit found"* · *"work the Top 10"* · *"apply the modernization plan"* | 🧭 **`audit`** → *you approve* → 🛠️ **`audit-repair`** | After the audit, `audit-repair` executes the Top 10 findings **one at a time** behind the ratify gate — verifying each against its acceptance check. The apply phase for `audit`, exactly as `repair` is for `discover`. |
-| *"Don't let the agent wreck my machine"* | 🛡️ *(automatic)* | The guard blocks `rm -rf /`, force-push, `DROP TABLE`, and ~25 more — the whole time. |
+| **It running** — *"it won't start"* · *"I'm lost in this repo"* · *"I need to change it safely"* | 🔍 **`discover`** → 🧑 *you approve* → 🔧 **`repair`** | `discover` investigates read-only and writes a plan with a runnable "done" checklist; you approve it; `repair` works the blockers until each one passes — recording what actually worked in `CLAUDE.md`. |
+| **It assessed — and, if you choose, fixed** — *"what shape is this in?"* · *"maintain, refactor, or rewrite?"* · *"now go fix what the audit found"* | 🧭 **`audit`** → 🧑 *your call* → 🛠️ **`audit-repair`** | `audit` writes a read-only, 12-section principal-engineer report. Stop there — it's a deliverable you can hand to a client — or ratify its Top 10 and `audit-repair` executes them **one at a time**, verifying each against its acceptance check. |
+
+And the whole time — both journeys, every step — the 🛡️ **guard** blocks `rm -rf /`, force-push, `DROP TABLE`, and ~25 more destructive commands before they ever execute.

 > [!NOTE]
-> **Two workflows, one gate — now symmetric.** Each is *plan-then-execute* with your approval as the gate: **`discover` → `repair`** (make it run) and **`audit` → `audit-repair`** (assess, then fix). `repair` won't run without a ratified `discover` plan; `audit-repair` won't run without a ratified `audit`. The two workflows stay independent — neither requires the other, and `audit` is still perfectly useful as a read-only report you never act on.
+> **Don't memorize the order — start anywhere.** The skills route you: type `/lazarus:repair` with no plan and it stops and offers to run `discover` first; finish `discover` and it names the next command; `audit-repair` refuses to run until an `audit` is ratified. The two journeys stay independent — neither requires the other, and `audit` is still perfectly useful as a report you never act on.

 **New here?** The three commands below get you running in under a minute — no config, no keys. **Want the internals?** The collapsible **Deep dive** sections further down open up the guard's design, the anti-hallucination model, and the research behind it. For the whole picture in one read, see the [complete project overview](docs/OVERVIEW.md).

@@ -74,9 +74,9 @@ A scary repo to a running app — discover, you approve, repair, and the guard s
 Animated terminal: discover writes a plan, you approve, repair fixes the blockers, the guard blocks rm -rf /, and the app boots

-## 🗺️ The two workflows
+## 🗺️ The two journeys

-Two independent workflows. One makes the code run; the other tells you what to do about it.
+Two independent journeys.
One makes the code run; the other tells you what to do about it — and fixes it, if you say so.

 ```mermaid
 flowchart LR
@@ -90,21 +90,33 @@ flowchart LR

     B -->|assess it| H["🧭 lazarus:audit<br/>read-only"]
     H --> I["📊 CODEBASE_AUDIT.md<br/>risks · what to fix first<br/>· refactor vs rewrite"]
+    I -.->|"optional"| J(["🧑 you ratify<br/>the Top 10"])
+    J --> K["🛠️ lazarus:audit-repair<br/>one finding at a time"]
+    K --> L["✅ findings fixed +<br/>verified against checks"]

     style A fill:#fee2e2,stroke:#ef4444,color:#111
     style G fill:#dcfce7,stroke:#22c55e,color:#111
     style I fill:#e0f2fe,stroke:#0ea5e9,color:#111
     style E fill:#fef9c3,stroke:#eab308,color:#111
+    style J fill:#fef9c3,stroke:#eab308,color:#111
+    style L fill:#dcfce7,stroke:#22c55e,color:#111
 ```

-**Type the command, or just describe what you want** — both work. The fast path is the command itself: `/lazarus:discover`, `/lazarus:repair`, `/lazarus:audit` (start typing `/discover`, `/repair`, or `/audit` and it autocompletes). Plain English triggers the same skill.
+**Type the command, or just describe what you want** — both work. The fast path is the command itself (start typing `/discover`, `/repair`, or `/audit` and it autocompletes); plain English triggers the same skill.
+
+**Journey 1 — make it run**

 | Command | Also triggers on… | What it does |
 |---|---|---|
 | **`/lazarus:discover`** | *"make this run locally"* · *"why won't this start?"* · *"onboard this repo"* · *"help me get oriented"* | Investigates **read-only**, writes `DISCOVERY.md` — a *repairability verdict*, a plan, and a concrete *definition of done* — then **stops and waits for you**. |
 | **`/lazarus:repair`** | *"execute the repair plan"* · *"fix this codebase"* · *"work the blockers"* | Works the blockers in order, logs every command it actually ran to `VERIFICATION_REPORT.md`, and promotes the commands that *truly worked* into a `CLAUDE.md`. Needs a ratified `DISCOVERY.md` first — and refuses one whose verdict is *not-repairable* (never-built features are feature work, not a repair). |
+
+**Journey 2 — assess it, then (optionally) fix it**
+
+| Command | Also triggers on… | What it does |
+|---|---|---|
 | **`/lazarus:audit`** | *"review this code"* · *"audit this repo"* · *"what should we fix first?"* · *"refactor or rewrite?"* | Produces a 12-section `CODEBASE_AUDIT.md` — architecture, risks, security, frontend/accessibility, a phased plan. **Read-only**; feeds `audit-repair` if you choose to act on it. |
-| **`/lazarus:audit-repair`** | *"execute the audit"* · *"fix the audit findings"* · *"work the Top 10 action items"* · *"apply the modernization plan"* | Executes a ratified `CODEBASE_AUDIT.md` §11 **one finding at a time** — ratify → act → verify against each item's acceptance check — safety-rails first, behind the guard. The strategic apply phase (`audit → audit-repair`), mirroring `discover → repair`. |
+| **`/lazarus:audit-repair`** | *"execute the audit"* · *"fix the audit findings"* · *"work the Top 10 action items"* · *"apply the modernization plan"* | Executes a ratified `CODEBASE_AUDIT.md` §11 **one finding at a time** — ratify → act → verify against each item's acceptance check — safety-rails first, behind the guard. The strategic apply phase, exactly as `repair` is for `discover`. |

 > [!TIP]
 > **Pairs with `/code-review`** — a *built-in* Claude Code command (not part of Lazarus). Point it at your current diff for a focused bug-and-cleanup pass once the app runs.
@@ -242,13 +254,13 @@ The skill reads `CODEBASE_AUDIT.md` §11, shows you the proposed issues, lets yo
 I installed it but /lazarus:discover (or the guard) does nothing. Why?
-You almost certainly skipped /reload-plugins. Installing registers the plugin; its skills, hooks, and guard only go live after you run /reload-plugins (or restart claude) in that session. Run it once and the /lazarus:discover, /lazarus:repair, and /lazarus:audit commands appear.
+You almost certainly skipped /reload-plugins. Installing registers the plugin; its skills, hooks, and guard only go live after you run /reload-plugins (or restart claude) in that session. Run it once and the /lazarus:discover, /lazarus:repair, /lazarus:audit, and /lazarus:audit-repair commands appear.
 Will it actually change my code without asking?
-Discovery and audit are read-only (Plan Mode). Repair changes code — but only after you approve the plan, and the guard blocks destructive shell commands throughout. You stay in the loop at the one decision that matters: ratifying what "done" means.
+Discovery and audit are read-only (Plan Mode). Repair and audit-repair change code — but only after you ratify a plan (the "done" checklist, or the audit's Top 10), and the guard blocks destructive shell commands throughout. You stay in the loop at the one decision that matters: ratifying what "done" means.
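
For illustration only, and not part of either patch: the verdict handling that PATCH 1/2 describes for `repair` (refuse on `not-repairable`, keep gaps out of scope on `partially-runnable`, treat a missing verdict as `repairable`) could be sketched roughly as below. The names `read_verdict`, `gate`, and `Decision` are hypothetical, and the parsing assumes the verdict sits on the first non-blank line under a `## Repairability verdict` heading, as in the `DISCOVERY.md` template; the skills themselves are prose instructions, so this is one possible reading rather than the plugin's implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: heading text taken from the DISCOVERY.md template in PATCH 1/2.
HEADING = "## repairability verdict"
VERDICT_RE = re.compile(r"^(not-repairable|partially-runnable|repairable)\b")


@dataclass
class Decision:
    verdict: str   # the verdict repair acts on
    proceed: bool  # False means repair must stop
    note: str      # what to tell the user or record


def read_verdict(discovery_text: str) -> Optional[str]:
    """Return the verdict, or None when the file has no verdict section."""
    lines = discovery_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower() != HEADING:
            continue
        for body in lines[i + 1:]:
            # Tolerate markdown decoration such as `not-repairable` or [..].
            text = body.strip().lstrip("[`*")
            if not text:
                continue
            match = VERDICT_RE.match(text)
            if match is None:
                raise ValueError("verdict section has no recognized verdict")
            return match.group(1)
        raise ValueError("verdict section is empty")
    return None


def gate(discovery_text: str) -> Decision:
    """Decide whether repair may run, following the rules in PATCH 1/2."""
    verdict = read_verdict(discovery_text)
    if verdict is None:
        # Pre-existing DISCOVERY.md files predate the field.
        return Decision("repairable", True,
                        "no verdict field; treated as repairable, "
                        "note this in VERIFICATION_REPORT.md")
    if verdict == "not-repairable":
        return Decision(verdict, False,
                        "stop: never-built functionality is feature work")
    if verdict == "partially-runnable":
        return Decision(verdict, True,
                        "work blockers only; gaps stay out of scope")
    return Decision(verdict, True, "work the blockers in order")
```

Raising on a present-but-unreadable verdict, instead of falling back to `repairable`, is a design choice of this sketch: the fallback in the patch is stated only for files that lack the field entirely.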