From d52a3a4c2b83b4d27a9e1c483960221fe5aa147e Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 05:07:12 -0600
Subject: [PATCH 01/37] =?UTF-8?q?feat:=20add=20maintenance=20skills=20?=
 =?UTF-8?q?=E2=80=94=20deps-audit,=20bench-check,=20test-health,=20houseke?=
 =?UTF-8?q?ep?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Four recurring maintenance routines as Claude Code skills:
- /deps-audit: vulnerability scanning, staleness, unused deps, license checks
- /bench-check: benchmark regression detection against saved baselines
- /test-health: flaky test detection, dead tests, coverage gap analysis
- /housekeep: clean worktrees, dirt files, sync main, prune branches
---
 .claude/skills/bench-check/SKILL.md | 223 +++++++++++++++++++++++
 .claude/skills/deps-audit/SKILL.md  | 164 +++++++++++++++++
 .claude/skills/housekeep/SKILL.md   | 266 ++++++++++++++++++++++++++++
 .claude/skills/test-health/SKILL.md | 248 ++++++++++++++++++++++++++
 4 files changed, 901 insertions(+)
 create mode 100644 .claude/skills/bench-check/SKILL.md
 create mode 100644 .claude/skills/deps-audit/SKILL.md
 create mode 100644 .claude/skills/housekeep/SKILL.md
 create mode 100644 .claude/skills/test-health/SKILL.md
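
Note on the /deps-audit health score defined in the diff below: the deduction rules (start at 100, subtract per finding, floor at 0) can be checked by hand with a small helper. This is only a sketch; the function name `health_score` and its argument order are hypothetical and are not part of the skill.

```bash
# Hypothetical helper: compute the /deps-audit health score from finding counts.
# Usage: health_score <critical> <high> <moderate> <stale> <aging> <unused> <copyleft>
health_score() {
  local critical=$1 high=$2 moderate=$3 stale=$4 aging=$5 unused=$6 copyleft=$7
  local score=100
  # Vulnerabilities: -20 per critical, -10 per high, -3 per moderate.
  score=$(( score - 20 * critical - 10 * high - 3 * moderate ))
  # Staleness: -5 per stale (major behind) dep, -2 per aging dep.
  score=$(( score - 5 * stale - 2 * aging ))
  # Unused deps and copyleft license risks.
  score=$(( score - 5 * unused - 10 * copyleft ))
  # Floor at 0 so a badly affected tree never reports a negative score.
  if [ "$score" -lt 0 ]; then score=0; fi
  echo "$score"
}
```

For example, `health_score 0 1 2 1 3 1 0` covers one high and two moderate vulnerabilities, one stale and three aging packages, and one unused dependency.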

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
new file mode 100644
index 00000000..2b48ff3a
--- /dev/null
+++ b/.claude/skills/bench-check/SKILL.md
@@ -0,0 +1,223 @@
+---
+name: bench-check
+description: Run benchmarks against a saved baseline, detect performance regressions, and update the baseline — guards against silent slowdowns
+argument-hint: "[--save-baseline | --compare-only | --threshold 15]  (default: compare + save)"
+allowed-tools: Bash, Read, Write, Edit, Glob, Grep, Agent
+---
+
+# /bench-check — Performance Regression Check
+
+Run the project's benchmark suite, compare results against a saved baseline, flag regressions beyond a threshold, and optionally update the baseline. Prevents silent performance degradation between releases.
+
+## Arguments
+
+- `$ARGUMENTS` may contain:
+  - `--save-baseline` — run benchmarks and save as the new baseline (no comparison)
+  - `--compare-only` — compare against baseline without updating it
+  - `--threshold N` — regression threshold percentage (default: 15%)
+  - No arguments — compare against baseline, then update it if no regressions
+
+## Phase 0 — Pre-flight
+
+1. Confirm we're in the codegraph repo root
+2. Check that benchmark scripts exist:
+   - `scripts/benchmark.js` (build speed, query latency)
+   - `scripts/incremental-benchmark.js` (incremental build tiers)
+   - `scripts/query-benchmark.js` (query depth scaling)
+   - `scripts/embedding-benchmark.js` (search recall) — optional, skip if embedding deps missing
+3. Parse `$ARGUMENTS`:
+   - `SAVE_ONLY=true` if `--save-baseline`
+   - `COMPARE_ONLY=true` if `--compare-only`
+   - `THRESHOLD=N` from `--threshold N` (default: 15)
+4. Check for existing baseline at `generated/bench-check/baseline.json`
+   - If missing and not `--save-baseline`: warn that this will be an initial baseline run
+
+## Phase 1 — Run Benchmarks
+
+Run each benchmark script and collect results. Each script outputs JSON to stdout.
+
+### 1a. Build & Query Benchmark
+
+```bash
+node scripts/benchmark.js 2>/dev/null
+```
+
+Extract:
+- `buildTime` (ms) — per engine (native, WASM)
+- `queryTime` (ms) — per query type
+- `nodeCount`, `edgeCount` — graph size
+
+### 1b. Incremental Benchmark
+
+```bash
+node scripts/incremental-benchmark.js 2>/dev/null
+```
+
+Extract:
+- `noOpRebuild` (ms) — time for no-change rebuild
+- `singleFileRebuild` (ms) — time after one file change
+- `importResolution` (ms) — resolution throughput
+
+### 1c. Query Depth Benchmark
+
+```bash
+node scripts/query-benchmark.js 2>/dev/null
+```
+
+Extract:
+- `fnDeps` scaling by depth
+- `fnImpact` scaling by depth
+- `diffImpact` latency
+
+### 1d. Embedding Benchmark (optional)
+
+```bash
+node scripts/embedding-benchmark.js 2>/dev/null
+```
+
+Extract:
+- `embeddingTime` (ms)
+- `recall` at Hit@1, Hit@3, Hit@5, Hit@10
+
+> **Timeout:** Each benchmark gets 5 minutes max. If it times out, record `"timeout"` for that suite and continue.
+
+> **Errors:** If a benchmark script fails (non-zero exit), record `"error: <message>"` and continue with remaining benchmarks.
+
+## Phase 2 — Normalize Results
+
+Build a flat metrics object from all benchmark results:
+
+```json
+{
+  "timestamp": "<ISO 8601>",
+  "version": "<from package.json>",
+  "gitRef": "<current HEAD short SHA>",
+  "metrics": {
+    "build.native.ms": 1234,
+    "build.wasm.ms": 2345,
+    "query.fnDeps.depth3.ms": 45,
+    "query.fnImpact.depth3.ms": 67,
+    "query.diffImpact.ms": 89,
+    "incremental.noOp.ms": 12,
+    "incremental.singleFile.ms": 34,
+    "incremental.importResolution.ms": 56,
+    "graph.nodes": 500,
+    "graph.edges": 1200,
+    "embedding.time.ms": 3000,
+    "embedding.recall.hit1": 0.85,
+    "embedding.recall.hit5": 0.95
+  }
+}
+```
+
+Adapt the metric keys to match whatever the benchmark scripts actually output — the above are representative. The goal is a flat key→number map for easy comparison.
+
+## Phase 3 — Compare Against Baseline
+
+Skip this phase if `SAVE_ONLY=true` or no baseline exists.
+
+For each metric in the current run:
+
+1. Look up the same metric in the baseline
+2. Compute: `delta_pct = ((current - baseline) / baseline) * 100`
+3. Classify:
+   - **Regression**: metric increased by more than `THRESHOLD`% (for time metrics) or decreased by more than `THRESHOLD`% (for recall/quality metrics)
+   - **Improvement**: metric decreased by more than `THRESHOLD`% (time) or increased by more than `THRESHOLD`% (recall/quality)
+   - **Stable**: within threshold
+
+> **Direction awareness:** For latency metrics (ms), higher = worse. For recall/quality metrics, higher = better. For count metrics (nodes, edges), changes are informational only — not regressions.
+
+### Regression table
+
+| Metric | Baseline | Current | Delta | Status |
+|--------|----------|---------|-------|--------|
+| build.native.ms | 1200 | 1500 | +25% | REGRESSION |
+| query.fnDeps.depth3.ms | 45 | 43 | -4.4% | stable |
+
+## Phase 4 — Verdict
+
+Based on comparison results:
+
+### No regressions found
+- Print: `BENCH-CHECK PASSED — no regressions beyond {THRESHOLD}% threshold`
+- If not `COMPARE_ONLY`: update baseline with current results
+
+### Regressions found
+- Print: `BENCH-CHECK FAILED — {N} regressions detected`
+- List each regression with metric name, baseline value, current value, delta %
+- Do NOT update the baseline
+- Suggest investigation:
+  - `git log --oneline <baseline-ref>..HEAD` to find what changed
+  - `codegraph diff-impact <baseline-ref> -T` to find structural changes
+  - Re-run individual benchmarks to confirm (not flaky)
+
+### First run (no baseline)
+- Print: `BENCH-CHECK — initial baseline saved`
+- Save current results as baseline
+
+## Phase 5 — Save Baseline
+
+When saving (initial run, `--save-baseline`, or passed comparison):
+
+Write to `generated/bench-check/baseline.json`:
+```json
+{
+  "savedAt": "<ISO 8601>",
+  "version": "<package version>",
+  "gitRef": "<HEAD short SHA>",
+  "threshold": 15,
+  "metrics": { ... }
+}
+```
+
+Also append a one-line summary to `generated/bench-check/history.ndjson`:
+```json
+{"timestamp":"...","version":"...","gitRef":"...","metrics":{...}}
+```
+
+This creates a running log of benchmark results over time.
+
+## Phase 6 — Report
+
+Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`:
+
+```markdown
+# Benchmark Report — <date>
+
+**Version:** X.Y.Z | **Git ref:** abc1234 | **Threshold:** 15%
+
+## Verdict: PASSED / FAILED
+
+## Comparison vs Baseline
+
+<!-- Full comparison table with all metrics -->
+
+## Regressions (if any)
+
+<!-- Detail each regression with possible causes -->
+
+## Trend (if history.ndjson has 3+ entries)
+
+<!-- Show trend for key metrics: build time, query time, graph size -->
+
+## Raw Results
+
+<!-- Full JSON output from each benchmark -->
+```
+
+## Phase 7 — Cleanup
+
+1. If report was written, print its path
+2. If baseline was updated, print confirmation
+3. Print one-line summary: `PASSED (0 regressions) | FAILED (N regressions) | BASELINE SAVED`
+
+## Rules
+
+- **Never skip a benchmark** — if it fails, record the failure and continue
+- **Timeout is 5 minutes per benchmark** — use appropriate timeout flags
+- **Don't update baseline on regression** — the user must investigate first
+- **Recall/quality metrics are inverted** — a decrease is a regression
+- **Count metrics are informational** — graph growing isn't a regression
+- **The baseline file is committed to git** — it's a shared reference point
+- **history.ndjson is append-only** — never truncate or rewrite it
+- Generated files go in `generated/bench-check/` — create the directory if needed
diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
new file mode 100644
index 00000000..cc2e4b12
--- /dev/null
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -0,0 +1,164 @@
+---
+name: deps-audit
+description: Audit dependencies for vulnerabilities, staleness, unused packages, and license risks — produce a health report with actionable fixes
+argument-hint: "[--fix]  (optional — auto-fix safe updates)"
+allowed-tools: Bash, Read, Write, Edit, Glob, Grep, Agent
+---
+
+# /deps-audit — Dependency Health Audit
+
+Audit the project's dependency tree for security vulnerabilities, outdated packages, unused dependencies, and license compliance. Produce a structured report and optionally auto-fix safe updates.
+
+## Arguments
+
+- `$ARGUMENTS` may contain `--fix` to auto-apply safe updates (patch/minor only)
+
+## Phase 0 — Pre-flight
+
+1. Confirm we're in the codegraph repo root (check for `package.json` and `package-lock.json`)
+2. Run `node --version` — must be >= 20
+3. Run `npm --version` to capture toolchain info
+4. Parse `$ARGUMENTS` — set `AUTO_FIX=true` if `--fix` is present
+
+## Phase 1 — Security Vulnerabilities
+
+Run `npm audit --json` and parse the output:
+
+1. Count vulnerabilities by severity: `critical`, `high`, `moderate`, `low`, `info`
+2. For each `critical` or `high` vulnerability:
+   - Record: package name, severity, CVE/GHSA ID, vulnerable version range, patched version, dependency path (direct vs transitive)
+   - Check if a fix is available (`npm audit fix --dry-run --json`)
+3. Summarize: total vulns, fixable count, breaking-fix count
+
+**If `AUTO_FIX` is set:** Run `npm audit fix` (non-breaking fixes only). Record what changed. Do NOT run `npm audit fix --force` — breaking changes require manual review.
+
+## Phase 2 — Outdated Dependencies
+
+Run `npm outdated --json` and categorize:
+
+### 2a. Direct dependencies (`dependencies` + `devDependencies`)
+
+For each outdated package, record:
+- Package name
+- Current version → Wanted (semver-compatible) → Latest
+- Whether the update is patch, minor, or major
+- If major: check the package's CHANGELOG/release notes for breaking changes relevant to our usage
+
+### 2b. Staleness score
+
+Classify each outdated dep:
+| Category | Definition |
+|----------|-----------|
+| **Fresh** | On latest or within 1 patch |
+| **Aging** | 1+ minor versions behind |
+| **Stale** | 1+ major versions behind |
+| **Abandoned** | No release in 12+ months (check npm registry publish date) |
+
+For any package classified as **Abandoned**, check if there's a maintained fork or alternative.
+
+**If `AUTO_FIX` is set:** Run `npm update` to apply semver-compatible updates. Record what changed.
+
+## Phase 3 — Unused Dependencies
+
+Detect dependencies declared in `package.json` but never imported:
+
+1. Read `dependencies` and `devDependencies` from `package.json`
+2. For each dependency, search for imports/requires across `src/`, `tests/`, `scripts/`, `cli.js`, `index.js`:
+   - `require('<pkg>')` or `require('<pkg>/...')`
+   - `import ... from '<pkg>'` or `import '<pkg>'`
+   - `import('<pkg>')` (dynamic imports)
+3. Skip known implicit dependencies that don't have direct imports:
+   - `@anthropic-ai/tokenizer` — may be used by `@anthropic-ai/sdk`
+   - `tree-sitter-*` and `web-tree-sitter` — loaded dynamically via WASM
+   - `@biomejs/biome` — used as CLI tool only
+   - `commit-and-tag-version` — used as npm script
+   - `@optave/codegraph-*` — platform-specific optional binaries
+   - `vitest` — test runner, invoked via CLI
+   - Anything in `optionalDependencies`
+4. For each truly unused dep: recommend removal with `npm uninstall <pkg>`
+
+> **Important:** Some deps are used transitively or via CLI — don't blindly remove. Flag as "likely unused" and let the user decide.
+
+## Phase 4 — License Compliance
+
+Check licenses for all direct dependencies:
+
+1. For each package in `dependencies`, read its `node_modules/<pkg>/package.json` → `license` field
+2. Classify:
+   - **Permissive** (MIT, ISC, BSD-2-Clause, BSD-3-Clause, Apache-2.0, 0BSD, Unlicense): OK
+   - **Weak copyleft** (LGPL-2.1, LGPL-3.0, MPL-2.0): Flag for review
+   - **Strong copyleft** (GPL-2.0, GPL-3.0, AGPL-3.0): Flag as risk — may conflict with MIT license of codegraph
+   - **Unknown/UNLICENSED/missing**: Flag for investigation
+3. Only flag non-permissive licenses — don't list every MIT dep
+
+## Phase 5 — Duplicate Packages
+
+Check for duplicate versions of the same package in the dependency tree:
+
+1. Run `npm ls --all --json` and look for packages that appear multiple times with different versions
+2. Only flag duplicates that add significant bundle weight (> 100KB) or are security-sensitive (crypto, auth, etc.)
+3. Suggest deduplication: `npm dedupe`
+
+## Phase 6 — Report
+
+Write a report to `generated/deps-audit/DEPS_AUDIT_<date>.md` with this structure:
+
+```markdown
+# Dependency Audit Report — <date>
+
+## Summary
+
+| Metric | Value |
+|--------|-------|
+| Total dependencies (direct) | N |
+| Total dependencies (transitive) | N |
+| Security vulnerabilities | N critical, N high, N moderate, N low |
+| Outdated packages | N stale, N aging, N fresh |
+| Unused dependencies | N |
+| License risks | N |
+| Duplicates | N |
+| **Health score** | **X/100** |
+
+## Health Score Calculation
+
+- Start at 100
+- -20 per critical vuln, -10 per high vuln, -3 per moderate vuln
+- -5 per stale (major behind) dep, -2 per aging dep
+- -5 per unused dep
+- -10 per copyleft license risk
+- Floor at 0
+
+## Security Vulnerabilities
+<!-- Detail each critical/high vuln with remediation -->
+
+## Outdated Packages
+<!-- Table: package, current, latest, category, notes -->
+
+## Unused Dependencies
+<!-- List with evidence (no imports found) -->
+
+## License Flags
+<!-- Only non-permissive licenses -->
+
+## Duplicates
+<!-- Only significant ones -->
+
+## Recommended Actions
+<!-- Prioritized list: fix vulns > remove unused > update stale > dedupe -->
+```
+
+## Phase 7 — Auto-fix Summary (if `--fix`)
+
+If `AUTO_FIX` was set, summarize all changes made:
+1. List each package updated/fixed
+2. Run `npm test` to verify nothing broke
+3. If tests fail, revert with `git checkout -- package.json package-lock.json`, run `npm ci` to resync `node_modules` with the restored lockfile, and report what failed
+
+## Rules
+
+- **Never run `npm audit fix --force`** — breaking changes need human review
+- **Never remove a dependency** without asking the user, even if it appears unused — flag it in the report instead
+- **Always run tests** after any auto-fix changes
+- **If `--fix` causes test failures**, revert all changes and report the failure
+- Treat `optionalDependencies` separately — they're expected to fail on some platforms
+- The report goes in `generated/deps-audit/` — create the directory if it doesn't exist
diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
new file mode 100644
index 00000000..a00a88f5
--- /dev/null
+++ b/.claude/skills/housekeep/SKILL.md
@@ -0,0 +1,266 @@
+---
+name: housekeep
+description: Local repo maintenance — clean stale worktrees, remove dirt files, sync with main, update codegraph, prune branches, and verify repo health
+argument-hint: "[--full | --dry-run | --skip-update]  (default: full cleanup)"
+allowed-tools: Bash, Read, Write, Edit, Glob, Grep
+---
+
+# /housekeep — Local Repository Maintenance
+
+Clean up the local repo: remove stale worktrees, delete dirt/temp files, sync with main, update codegraph to latest, prune merged branches, and verify repo health. The "spring cleaning" routine.
+
+## Arguments
+
+- `$ARGUMENTS` may contain:
+  - `--full` — run all phases (default behavior)
+  - `--dry-run` — show what would be cleaned without actually doing it
+  - `--skip-update` — skip the codegraph npm update phase
+  - No arguments — full cleanup
+
+## Phase 0 — Pre-flight
+
+1. Confirm we're in the codegraph repo root (check `package.json` with `"name": "@optave/codegraph"`)
+2. Parse `$ARGUMENTS`:
+   - `DRY_RUN=true` if `--dry-run`
+   - `SKIP_UPDATE=true` if `--skip-update`
+3. Record current branch: `git branch --show-current`
+4. Record current git status: `git status --short`
+5. Warn the user if there are uncommitted changes — housekeeping works best from a clean state
+
+## Phase 1 — Clean Stale Worktrees
+
+### 1a. List all worktrees
+
+```bash
+git worktree list
+```
+
+### 1b. Identify stale worktrees
+
+A worktree is stale if:
+- Its directory no longer exists on disk (prunable)
+- It has no uncommitted changes AND its branch has been merged to main
+- It was created more than 7 days ago with no commits since (abandoned)
+
+Check `.claude/worktrees/` for Claude Code worktrees specifically.
+
+### 1c. Clean up
+
+For prunable worktrees (missing directory):
+```bash
+git worktree prune
+```
+
+For stale worktrees with merged branches:
+- List them and ask the user for confirmation before removing
+- If confirmed (or `--full` without `--dry-run`):
+  ```bash
+  git worktree remove <path>
+  git branch -d <branch>  # only if fully merged
+  ```
+
+**If `DRY_RUN`:** Just list what would be removed, don't do it.
+
+> **Never force-remove** a worktree with uncommitted changes. List it as "has uncommitted work" and skip.
+
+## Phase 2 — Delete Dirt Files
+
+Remove temporary and generated files that accumulate over time:
+
+### 2a. Known dirt patterns
+
+Search for and remove:
+- `*.tmp.*`, `*.bak`, `*.orig` files in the repo (but NOT in `node_modules/`)
+- `.DS_Store` files
+- `*.log` files in repo root (not in `node_modules/`)
+- Empty directories (except `.codegraph/`, `.claude/`, `node_modules/`)
+- `coverage/` directory (regenerated by `npm run test:coverage`)
+- `.codegraph/graph.db-journal` (SQLite rollback-journal leftovers; WAL mode would leave `-wal`/`-shm` files instead)
+- Stale lock files: `.codegraph/*.lock` older than 1 hour
+
+### 2b. Large untracked files
+
+Find untracked files larger than 1MB:
+```bash
+git ls-files --others --exclude-standard | while IFS= read -r f; do
+  size=$(stat --format='%s' "$f" 2>/dev/null || stat -f '%z' "$f" 2>/dev/null)
+  if [ "${size:-0}" -gt 1048576 ]; then echo "$f ($size bytes)"; fi
+done
+```
+
+Flag these for user review — they might be accidentally untracked binaries.
+
+### 2c. Clean up
+
+**If `DRY_RUN`:** List all files that would be removed with their sizes.
+
+**Otherwise:**
+- Remove known dirt patterns automatically
+- For large untracked files: list and ask the user
+
+> **Never delete** files that are tracked by git. Only clean untracked/ignored files.
+
+## Phase 3 — Sync with Main
+
+### 3a. Fetch latest
+
+```bash
+git fetch origin
+```
+
+### 3b. Check main branch status
+
+```bash
+git log HEAD..origin/main --oneline
+```
+
+If main has new commits:
+- If on main: `git pull origin main`
+- If on a feature branch: inform the user how many commits behind main they are
+  - Suggest: `git merge origin/main` (never rebase — per project rules)
+
+### 3c. Check for diverged branches
+
+List local branches that have diverged from their remote tracking branch:
+```bash
+git for-each-ref --format='%(refname:short) %(upstream:track)' refs/heads/
+```
+
+Flag any branches marked `[ahead N, behind M]` — these may need attention.
+
+## Phase 4 — Prune Merged Branches
+
+### 4a. Find merged branches
+
+```bash
+git branch --merged main
+```
+
+### 4b. Safe to delete
+
+Branches that are:
+- Fully merged into main
+- Not `main` itself
+- Not the current branch
+- Not a worktree branch (check `git worktree list`)
+
+### 4c. Prune remote tracking refs
+
+```bash
+git remote prune origin
+```
+
+This removes local refs to branches that no longer exist on the remote.
+
+### 4d. Clean up
+
+**If `DRY_RUN`:** List branches that would be deleted.
+
+**Otherwise:** Delete merged branches:
+```bash
+git branch -d <branch>  # safe delete, only if fully merged
+```
+
+> **Never use `git branch -D`** (force delete). If `-d` fails, the branch has unmerged work — skip it.
+
+## Phase 5 — Update Codegraph
+
+**Skip if `SKIP_UPDATE` is set.**
+
+### 5a. Check current version
+
+```bash
+node -e "console.log(require('./package.json').version)"
+```
+
+### 5b. Check latest published version
+
+```bash
+npm view @optave/codegraph version
+```
+
+### 5c. Update if needed
+
+If a newer version is available:
+- Show the version diff (current → latest)
+- Check the CHANGELOG for what changed
+- If it's a patch/minor: update automatically
+  ```bash
+  npm install
+  ```
+- If it's a major: warn the user and ask for confirmation
+
+### 5d. Rebuild
+
+After any update:
+```bash
+npm install
+```
+
+Verify the build works:
+```bash
+npx codegraph stats 2>/dev/null && echo "OK" || echo "FAILED"
+```
+
+## Phase 6 — Verify Repo Health
+
+Quick health checks to catch issues:
+
+### 6a. Graph integrity
+
+```bash
+npx codegraph stats
+```
+
+If the graph is stale (built from a different commit), rebuild:
+```bash
+npx codegraph build
+```
+
+### 6b. Node modules integrity
+
+```bash
+npm ls --depth=0 2>&1 | grep -cE "missing|invalid|WARN"
+```
+
+If issues found: `npm install` to fix.
+
+### 6c. Git integrity
+
+```bash
+git fsck --no-dangling 2>&1 | head -20
+```
+
+Flag any errors (rare but important).
+
+## Phase 7 — Report
+
+Print a summary to the console (no file needed — this is a local maintenance task):
+
+```
+=== Housekeeping Report ===
+
+Worktrees:  removed 2 stale, 1 has uncommitted work (skipped)
+Dirt files: cleaned 5 temp files (12KB), 1 large untracked flagged
+Branches:   pruned 3 merged branches, 2 remote refs
+Main sync:  up to date (or: 4 commits behind — merge suggested)
+Codegraph:  v3.1.2 → v3.1.3 updated (or: already latest)
+Graph:      rebuilt (was stale) (or: fresh)
+Node mods:  OK (or: fixed 2 missing deps)
+Git:        OK
+
+Status: CLEAN ✓
+```
+
+**If `DRY_RUN`:** prefix with `[DRY RUN]` and show what would happen without doing it.
+
+## Rules
+
+- **Never force-delete** anything — use safe deletes only (`git branch -d`, `git worktree remove`)
+- **Never rebase** — sync with main via merge only (per project rules)
+- **Never delete tracked files** — only clean untracked/ignored dirt
+- **Never delete worktrees with uncommitted changes** — warn and skip
+- **Ask before deleting large untracked files** — they might be intentional
+- **This is a local-only operation** — no pushes, no remote modifications, no PR creation
+- **Idempotent** — running twice should be safe (second run finds nothing to clean)
+- **`--dry-run` is sacred** — it must NEVER modify anything, only report
diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
new file mode 100644
index 00000000..2bb06194
--- /dev/null
+++ b/.claude/skills/test-health/SKILL.md
@@ -0,0 +1,248 @@
+---
+name: test-health
+description: Audit test suite health — detect flaky tests, dead tests, coverage gaps, and missing assertions — produce a health report with fix suggestions
+argument-hint: "[--flaky-runs 5 | --coverage | --quick]  (default: full audit)"
+allowed-tools: Bash, Read, Write, Edit, Glob, Grep, Agent
+---
+
+# /test-health — Test Suite Health Audit
+
+Audit the test suite for flaky tests, dead/trivial tests, coverage gaps on recent changes, missing assertions, and structural issues. Produce a health report with prioritized recommendations.
+
+## Arguments
+
+- `$ARGUMENTS` may contain:
+  - `--flaky-runs N` — number of times to run the suite for flaky detection (default: 5)
+  - `--coverage` — only run the coverage gap analysis (skip flaky/dead detection)
+  - `--quick` — skip flaky detection (most time-consuming), run everything else
+  - No arguments — full audit
+
+## Phase 0 — Pre-flight
+
+1. Confirm we're in the codegraph repo root
+2. Verify vitest is available: `npx vitest --version`
+3. Parse `$ARGUMENTS`:
+   - `FLAKY_RUNS=N` from `--flaky-runs N` (default: 5)
+   - `COVERAGE_ONLY=true` if `--coverage`
+   - `QUICK=true` if `--quick`
+4. Discover all test files:
+   ```bash
+   find tests/ -name '*.test.js' -o -name '*.test.ts' | sort
+   ```
+5. Count total test files and categorize by directory (integration, parsers, graph, search, unit)
+
+## Phase 1 — Flaky Test Detection
+
+**Skip if `COVERAGE_ONLY` or `QUICK` is set.**
+
+Run the full test suite `FLAKY_RUNS` times and track per-test pass/fail:
+
+```bash
+for i in $(seq 1 "$FLAKY_RUNS"); do
+  npx vitest run --reporter=json --outputFile="generated/test-health/flaky-run-$i.json" 2>/dev/null
+done
+```
+
+For each run, parse the JSON reporter output to get per-test results.
+
+### Analysis
+
+A test is **flaky** if it passes in some runs and fails in others.
+
+For each flaky test found:
+1. Record: test file, test name, pass count, fail count, failure messages
+2. Categorize likely cause:
+   - **Timing-dependent**: failure message mentions timeout, race condition, or test has `setTimeout`/`sleep`
+   - **Order-dependent**: only fails when run with other tests (passes in isolation)
+   - **Resource-dependent**: mentions file system, network, port, or temp directory
+   - **Non-deterministic**: random/Date.now/Math.random in test or source
+
+> **Timeout:** Each full suite run gets 3 minutes. If it times out, record partial results and continue.
+
+## Phase 2 — Dead & Trivial Test Detection
+
+Scan all test files for problematic patterns:
+
+### 2a. Empty / no-assertion tests
+
+Search for test bodies that:
+- Have no `expect()`, `assert()`, `toBe()`, `toEqual()`, or similar assertion calls
+- Only contain `console.log` or comments
+- Are skipped: `it.skip(`, `test.skip(`, `xit(`, `xtest(`
+- Are TODO: `it.todo(`, `test.todo(`
+
+```
+Pattern: test bodies with 0 assertions = dead tests
+```
+
+### 2b. Trivial / tautological tests
+
+Detect tests that assert on constants or trivially true conditions:
+- `expect(true).toBe(true)`
+- `expect(1).toBe(1)`
+- `expect(result).toBeDefined()` as the ONLY assertion (too weak)
+
+### 2c. Commented-out tests
+
+Search for commented-out test blocks:
+- `// it(`, `// test(`, `/* it(`, `/* test(`
+- Large commented blocks inside `describe` blocks
+
+### 2d. Orphaned fixtures
+
+Check if any files in `tests/fixtures/` are not referenced by any test file.
+
+### 2e. Duplicate test names
+
+Search for duplicate test descriptions within the same `describe` block — these indicate copy-paste errors.
+
+## Phase 3 — Coverage Gap Analysis
+
+Run vitest with coverage and analyze:
+
+```bash
+npx vitest run --coverage --coverage.reporter=json-summary --coverage.reporter=json 2>/dev/null
+```
+
+### 3a. Overall coverage
+
+Parse `coverage/coverage-summary.json` and extract:
+- Line coverage %
+- Branch coverage %
+- Function coverage %
+- Statement coverage %
+
+### 3b. Uncovered files
+
+Find source files in `src/` with 0% coverage (no tests touch them at all).
+
+### 3c. Low-coverage hotspots
+
+Find files with < 50% line coverage. For each:
+- List uncovered functions (from the detailed coverage data)
+- Check if the file is in `domain/` or `features/` (core logic — coverage matters more)
+- Check file's complexity with `codegraph complexity <file> -T` — high complexity + low coverage = high risk
+
+### 3d. Recent changes without coverage
+
+Compare against `main` branch to find recently changed files:
+
+```bash
+git diff --name-only main...HEAD -- src/
+```
+
+For each changed source file, check if:
+1. It has corresponding test changes
+2. Its coverage increased, decreased, or stayed the same
+3. New functions/exports were added without test coverage
+
+> **Note:** If the coverage tool is not configured or fails, skip this phase and note it in the report. Coverage is a vitest plugin — it may need `@vitest/coverage-v8` installed.
+
+## Phase 4 — Test Structure Analysis
+
+Analyze the test suite's structural health:
+
+### 4a. Test-to-source mapping
+
+For each directory in `src/`:
+- Count source files
+- Count corresponding test files
+- Calculate test coverage ratio (files with tests / total files)
+- Flag directories with < 30% test file coverage
+
+### 4b. Test file size distribution
+
+- Find oversized test files (> 500 lines) — may need splitting
+- Find tiny test files (< 10 lines) — may be stubs or dead
+
+### 4c. Setup/teardown hygiene
+
+Check for:
+- Tests that create temp files/dirs but don't clean up (`afterEach`/`afterAll` missing)
+- Tests that mutate global state without restoration
+- Missing `beforeEach` resets in `describe` blocks that share state
+
+### 4d. Timeout analysis
+
+- Find tests with custom timeouts: `{ timeout: ... }`
+- Find tests that exceed the 30s timeout in recent runs (this assumes the repo sets `testTimeout` to 30s; Vitest's built-in default is 5s)
+- High timeouts often indicate tests that should be restructured or are testing too much
+
+## Phase 5 — Report
+
+Write report to `generated/test-health/TEST_HEALTH_<date>.md`:
+
+```markdown
+# Test Health Report — <date>
+
+## Summary
+
+| Metric | Value |
+|--------|-------|
+| Total test files | N |
+| Total test cases | N |
+| Flaky tests | N |
+| Dead/trivial tests | N |
+| Skipped tests | N |
+| Coverage (lines) | X% |
+| Coverage (branches) | X% |
+| Uncovered source files | N |
+| **Health score** | **X/100** |
+
+## Health Score Calculation
+
+- Start at 100
+- -10 per flaky test
+- -3 per dead/trivial test
+- -2 per skipped test (without TODO explaining why)
+- -1 per uncovered source file in `domain/` or `features/`
+- -(100 - line_coverage) / 5 (coverage penalty)
+- Floor at 0
+
+## Flaky Tests
+<!-- For each: file, name, pass/fail ratio, likely cause, suggested fix -->
+
+## Dead & Trivial Tests
+<!-- For each: file, line, issue, recommendation -->
+
+## Coverage Gaps
+<!-- Uncovered files, low-coverage hotspots with complexity -->
+
+## Structural Issues
+<!-- Oversized files, missing cleanup, timeout issues -->
+
+## Recommended Actions
+
+### Priority 1 — Fix flaky tests
+<!-- List with specific suggestions -->
+
+### Priority 2 — Remove or fix dead tests
+<!-- List with specific suggestions -->
+
+### Priority 3 — Add coverage for high-risk gaps
+<!-- List uncovered functions in core modules, ordered by complexity -->
+
+### Priority 4 — Structural improvements
+<!-- Split large files, add cleanup, reduce timeouts -->
+```
+
+## Phase 6 — Quick Wins
+
+After writing the report, identify tests that can be fixed immediately (< 5 min each):
+
+1. Remove `.skip` from tests that now pass (run them to check)
+2. Add missing assertions to empty test bodies (if the intent is clear)
+3. Delete commented-out test blocks older than 6 months (check git blame)
+
+**Do NOT auto-fix** — list these as suggestions in the report. The user decides.
+
+## Rules
+
+- **Never delete or modify test files** without explicit user approval — this is a read-only audit
+- **Flaky detection is slow** — warn the user before running 5+ iterations
+- **Coverage requires `@vitest/coverage-v8`** — if missing, skip coverage and note it
+- **Order-dependent flakiness** requires running tests both in suite and in isolation — only do this for tests that flaked in Phase 1
+- **Fixture files may be shared** across tests — don't flag as orphaned if used indirectly
+- **Skipped tests aren't always bad** — only flag if there's no `TODO` or comment explaining why
+- Generated files go in `generated/test-health/` — create the directory if needed
+- **This is a diagnostic tool** — it reports problems, it doesn't fix them (unless the user opts in)

From a562b523045e4d72145f447b97f792ae0db6ea58 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 05:25:16 -0600
Subject: [PATCH 02/37] fix(bench-check): capture stderr, guard
 division-by-zero, commit baseline
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Replace 2>/dev/null with output=$(... 2>&1) + exit_code check on all
  four benchmark invocations so error messages are captured and recorded
- Add division-by-zero guard in Phase 3: when baseline == 0, mark delta
  as "N/A — baseline was zero" (informational only, not a regression)
- Add git add + git commit step in Phase 5 so the baseline file is
  actually committed after each save, matching the documented rule
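
For reference, the zero-baseline guard described above can be sketched in shell roughly as follows. This is a minimal illustration, not the skill's actual implementation: `delta_pct` is a hypothetical helper name, and the skill's full marker text is longer than the bare `N/A` printed here.

```shell
# Hypothetical helper (not part of the skill): prints the percentage change
# of current vs. baseline, or "N/A" when the baseline is zero.
delta_pct() {
  current=$1
  baseline=$2
  awk -v c="$current" -v b="$baseline" 'BEGIN {
    # Guard first: dividing by a zero baseline is undefined.
    if (b == 0) { print "N/A"; exit }
    printf "%.2f\n", ((c - b) / b) * 100
  }'
}
```

A caller would treat the `N/A` result as informational only and skip the regression classification for that metric.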
---
 .claude/skills/bench-check/SKILL.md | 33 +++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 2b48ff3a..334345a1 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -39,9 +39,12 @@ Run each benchmark script and collect results. Each script outputs JSON to stdou
 ### 1a. Build & Query Benchmark
 
 ```bash
-node scripts/benchmark.js 2>/dev/null
+output=$(node scripts/benchmark.js 2>&1)
+exit_code=$?
 ```
 
+If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+
 Extract:
 - `buildTime` (ms) — per engine (native, WASM)
 - `queryTime` (ms) — per query type
@@ -50,9 +53,12 @@ Extract:
 ### 1b. Incremental Benchmark
 
 ```bash
-node scripts/incremental-benchmark.js 2>/dev/null
+output=$(node scripts/incremental-benchmark.js 2>&1)
+exit_code=$?
 ```
 
+If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+
 Extract:
 - `noOpRebuild` (ms) — time for no-change rebuild
 - `singleFileRebuild` (ms) — time after one file change
@@ -61,9 +67,12 @@ Extract:
 ### 1c. Query Depth Benchmark
 
 ```bash
-node scripts/query-benchmark.js 2>/dev/null
+output=$(node scripts/query-benchmark.js 2>&1)
+exit_code=$?
 ```
 
+If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+
 Extract:
 - `fnDeps` scaling by depth
 - `fnImpact` scaling by depth
@@ -72,9 +81,12 @@ Extract:
 ### 1d. Embedding Benchmark (optional)
 
 ```bash
-node scripts/embedding-benchmark.js 2>/dev/null
+output=$(node scripts/embedding-benchmark.js 2>&1)
+exit_code=$?
 ```
 
+If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+
 Extract:
 - `embeddingTime` (ms)
 - `recall` at Hit@1, Hit@3, Hit@5, Hit@10
@@ -119,8 +131,9 @@ Skip this phase if `SAVE_ONLY=true` or no baseline exists.
 For each metric in the current run:
 
 1. Look up the same metric in the baseline
-2. Compute: `delta_pct = ((current - baseline) / baseline) * 100`
-3. Classify:
+2. Guard against division-by-zero: if `baseline == 0`, mark the delta as `"N/A � baseline was zero"` and treat the metric as **informational only** (not a regression or improvement)
+3. Otherwise compute: `delta_pct = ((current - baseline) / baseline) * 100`
+4. Classify:
    - **Regression**: metric increased by more than `THRESHOLD`% (for time metrics) or decreased by more than `THRESHOLD`% (for recall/quality metrics)
    - **Improvement**: metric decreased by more than `THRESHOLD`% (time) or increased (quality)
    - **Stable**: within threshold
@@ -177,6 +190,12 @@ Also append a one-line summary to `generated/bench-check/history.ndjson`:
 
 This creates a running log of benchmark results over time.
 
+After writing both files, commit the baseline so it is a shared reference point:
+```bash
+git add generated/bench-check/baseline.json generated/bench-check/history.ndjson
+git commit -m "chore: update bench-check baseline (<gitRef>)"
+```
+
 ## Phase 6 — Report
 
 Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`:
@@ -218,6 +237,6 @@ Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`:
 - **Don't update baseline on regression** — the user must investigate first
 - **Recall/quality metrics are inverted** — a decrease is a regression
 - **Count metrics are informational** — graph growing isn't a regression
-- **The baseline file is committed to git** — it's a shared reference point
+- **The baseline file is committed to git** — it's a shared reference point; Phase 5 always commits it
 - **history.ndjson is append-only** — never truncate or rewrite it
 - Generated files go in `generated/bench-check/` — create the directory if needed

From 4fc994d8ce28ed74e433a0b4758736d7ce8892c3 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 05:25:30 -0600
Subject: [PATCH 03/37] fix(deps-audit): run npm ci after revert, document
 tokenizer skip reason

- After reverting package.json + package-lock.json on --fix test failure,
  also run `npm ci` to resync node_modules/ with the restored lock file;
  without this the manifest is reverted but installed packages are not
- Add explanatory comment on @anthropic-ai/tokenizer skip-list entry
  clarifying it is a peer dependency of @anthropic-ai/sdk and may be
  required at runtime without an explicit import in our code
---
 .claude/skills/deps-audit/SKILL.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index cc2e4b12..8240bf7e 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -68,7 +68,7 @@ Detect dependencies declared in `package.json` but never imported:
    - `import ... from '<pkg>'` or `import '<pkg>'`
    - `import('<pkg>')` (dynamic imports)
 3. Skip known implicit dependencies that don't have direct imports:
-   - `@anthropic-ai/tokenizer` — may be used by `@anthropic-ai/sdk`
+   - `@anthropic-ai/tokenizer` — peer dependency of `@anthropic-ai/sdk`; the SDK may require it at runtime without an explicit import in our code (verify against package.json before removing)
    - `tree-sitter-*` and `web-tree-sitter` — loaded dynamically via WASM
    - `@biomejs/biome` — used as CLI tool only
    - `commit-and-tag-version` — used as npm script
@@ -152,13 +152,16 @@ Write a report to `generated/deps-audit/DEPS_AUDIT_<date>.md` with this structur
 If `AUTO_FIX` was set, summarize all changes made:
 1. List each package updated/fixed
 2. Run `npm test` to verify nothing broke
-3. If tests fail, revert with `git checkout -- package.json package-lock.json` and report what failed
+3. If tests fail:
+   - Revert the manifest: `git checkout -- package.json package-lock.json`
+   - Restore `node_modules/` to match the reverted lock file: `npm ci`
+   - Report what failed
 
 ## Rules
 
 - **Never run `npm audit fix --force`** — breaking changes need human review
 - **Never remove a dependency** without asking the user, even if it appears unused — flag it in the report instead
 - **Always run tests** after any auto-fix changes
-- **If `--fix` causes test failures**, revert all changes and report the failure
+- **If `--fix` causes test failures**, revert manifest with `git checkout -- package.json package-lock.json` then run `npm ci` to resync `node_modules/`, and report the failure
 - Treat `optionalDependencies` separately — they're expected to fail on some platforms
 - The report goes in `generated/deps-audit/` — create the directory if it doesn't exist

From 89aef6b0a1bc57a114b1214ca73c3fd484c16fb5 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 05:25:42 -0600
Subject: [PATCH 04/37] fix(housekeep): guard Phase 5 in source repo, fix
 stale-worktree criterion

- Phase 5 (Update Codegraph): add source-repo guard that skips the
  self-update logic when running inside the codegraph source repo;
  comparing the dev version to the published release and running
  npm install is a no-op since codegraph is not one of its own deps
- Phase 1b stale-worktree criterion: replace "created more than 7 days
  ago" (not determinable via git worktree list) with "last commit on the
  branch is more than 7 days old AND branch has no commits ahead of
  origin/main", using `git log -1 --format=%ci <branch>`
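
A rough sketch of the new criterion, assuming `origin/main` exists locally. `is_stale_branch` is a hypothetical name, and `%ct` (committer timestamp in epoch seconds) is used here instead of `%ci` only because it is easier to compare arithmetically:

```shell
# Hypothetical helper: succeeds when the branch has no commits ahead of
# origin/main AND its last commit is more than 7 days old.
is_stale_branch() {
  branch=$1
  # Commits reachable from the branch but not from origin/main.
  ahead=$(git rev-list --count "origin/main..$branch")
  # Committer timestamp of the branch tip, in seconds since the epoch.
  last=$(git log -1 --format=%ct "$branch")
  now=$(date +%s)
  [ "$ahead" -eq 0 ] && [ $(( now - last )) -gt $(( 7 * 24 * 3600 )) ]
}
```

Usage would be along the lines of `is_stale_branch my-branch && echo "stale"`, with the removal itself still left to the user.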
---
 .claude/skills/housekeep/SKILL.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index a00a88f5..d5bd9d7a 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -40,7 +40,8 @@ git worktree list
 A worktree is stale if:
 - Its directory no longer exists on disk (prunable)
 - It has no uncommitted changes AND its branch has been merged to main
-- It was created more than 7 days ago with no commits since (abandoned)
+- Its branch has no commits ahead of `origin/main` AND the branch's last commit is more than 7 days old
+  (check: `git log -1 --format=%ci <branch>` — `git worktree list` does not expose creation timestamps)
 
 Check `.claude/worktrees/` for Claude Code worktrees specifically.
 
@@ -167,6 +168,9 @@ git branch -d <branch>  # safe delete, only if fully merged
 
 **Skip if `SKIP_UPDATE` is set.**
 
+> **Source-repo guard:** This phase is only meaningful when codegraph is installed as a *dependency* of a consumer project. Because the pre-flight confirms we are inside the codegraph *source* repo (`"name": "@optave/codegraph"`), comparing the dev version to the published release and running `npm install` would be a no-op — codegraph is not one of its own dependencies. **Skip this entire phase** when running inside the source repo and print:
+> `Codegraph: skipped (running inside source repo — update via git pull / branch sync instead)`
+
 ### 5a. Check current version
 
 ```bash

From ce5d811225d4d53da0a1934c0da97e8a0297b7a5 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 19:34:16 -0600
Subject: [PATCH 05/37] fix: address Round 3 Greptile review feedback

---
 .claude/skills/bench-check/SKILL.md |  2 +-
 .claude/skills/deps-audit/SKILL.md  | 16 ++++++++++++----
 .claude/skills/housekeep/SKILL.md   |  2 +-
 .claude/skills/test-health/SKILL.md |  4 ++--
 4 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 334345a1..85031103 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -193,7 +193,7 @@ This creates a running log of benchmark results over time.
 After writing both files, commit the baseline so it is a shared reference point:
 ```bash
 git add generated/bench-check/baseline.json generated/bench-check/history.ndjson
-git commit -m "chore: update bench-check baseline (<gitRef>)"
+git diff --cached --quiet || git commit -m "chore: update bench-check baseline (<gitRef>)"
 ```
 
 ## Phase 6 — Report
diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index 8240bf7e..0098779d 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -149,11 +149,19 @@ Write a report to `generated/deps-audit/DEPS_AUDIT_<date>.md` with this structur
 
 ## Phase 7 — Auto-fix Summary (if `--fix`)
 
-If `AUTO_FIX` was set, summarize all changes made:
+If `AUTO_FIX` was set:
+
+**Before running any auto-fix** (in Phase 1/2), save the original manifests so pre-existing unstaged changes are preserved:
+```bash
+git stash push -m "deps-audit-backup" -- package.json package-lock.json
+```
+
+Summarize all changes made:
 1. List each package updated/fixed
 2. Run `npm test` to verify nothing broke
-3. If tests fail:
-   - Revert the manifest: `git checkout -- package.json package-lock.json`
+3. If tests pass: drop the saved state (`git stash drop`)
+4. If tests fail:
+   - Restore the saved manifests: `git stash pop`
    - Restore `node_modules/` to match the reverted lock file: `npm ci`
    - Report what failed
 
@@ -162,6 +170,6 @@ If `AUTO_FIX` was set, summarize all changes made:
 - **Never run `npm audit fix --force`** — breaking changes need human review
 - **Never remove a dependency** without asking the user, even if it appears unused — flag it in the report instead
 - **Always run tests** after any auto-fix changes
-- **If `--fix` causes test failures**, revert manifest with `git checkout -- package.json package-lock.json` then run `npm ci` to resync `node_modules/`, and report the failure
+- **If `--fix` causes test failures**, restore manifests from the saved state (`git stash pop`) then run `npm ci` to resync `node_modules/`, and report the failure
 - Treat `optionalDependencies` separately — they're expected to fail on some platforms
 - The report goes in `generated/deps-audit/` — create the directory if it doesn't exist
diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index d5bd9d7a..d8c75fb7 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -134,7 +134,7 @@ Flag any branches marked `[ahead N, behind M]` — these may need attention.
 ### 4a. Find merged branches
 
 ```bash
-git branch --merged main
+git branch --merged origin/main
 ```
 
 ### 4b. Safe to delete
diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
index 2bb06194..855b0628 100644
--- a/.claude/skills/test-health/SKILL.md
+++ b/.claude/skills/test-health/SKILL.md
@@ -39,7 +39,7 @@ Run the full test suite `FLAKY_RUNS` times and track per-test pass/fail:
 
 ```bash
 for i in $(seq 1 $FLAKY_RUNS); do
-  npx vitest run --reporter=json 2>/dev/null
+  npx vitest run --reporter=json 2>&1
 done
 ```
 
@@ -101,7 +101,7 @@ Search for duplicate test descriptions within the same `describe` block — thes
 Run vitest with coverage and analyze:
 
 ```bash
-npx vitest run --coverage --coverage.reporter=json 2>/dev/null
+npx vitest run --coverage --coverage.reporter=json 2>&1
 ```
 
 ### 3a. Overall coverage

From 01b5110296b67bef5e8938764a3d2a1126d60c41 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:32:56 -0600
Subject: [PATCH 06/37] fix: move deps-audit stash to Phase 0, before npm
 commands modify manifests

---
 .claude/skills/deps-audit/SKILL.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index 0098779d..189bbe04 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -19,6 +19,10 @@ Audit the project's dependency tree for security vulnerabilities, outdated packa
 2. Run `node --version` — must be >= 20
 3. Run `npm --version` to capture toolchain info
 4. Parse `$ARGUMENTS` — set `AUTO_FIX=true` if `--fix` is present
+5. **If `AUTO_FIX` is set:** Save the original manifests now, before any npm commands run, so pre-existing unstaged changes are preserved:
+   ```bash
+   git stash push -m "deps-audit-backup" -- package.json package-lock.json
+   ```
 
 ## Phase 1 — Security Vulnerabilities
 
@@ -151,11 +155,6 @@ Write a report to `generated/deps-audit/DEPS_AUDIT_<date>.md` with this structur
 
 If `AUTO_FIX` was set:
 
-**Before running any auto-fix** (in Phase 1/2), save the original manifests so pre-existing unstaged changes are preserved:
-```bash
-git stash push -m "deps-audit-backup" -- package.json package-lock.json
-```
-
 Summarize all changes made:
 1. List each package updated/fixed
 2. Run `npm test` to verify nothing broke

From 3b0e29309310763957f2a37c8aaed362f41ebe11 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:33:04 -0600
Subject: [PATCH 07/37] fix: capture flaky-detection loop output to per-run
 files for comparison

---
 .claude/skills/test-health/SKILL.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
index 855b0628..5ec83719 100644
--- a/.claude/skills/test-health/SKILL.md
+++ b/.claude/skills/test-health/SKILL.md
@@ -38,12 +38,13 @@ Audit the test suite for flaky tests, dead/trivial tests, coverage gaps on recen
 Run the full test suite `FLAKY_RUNS` times and track per-test pass/fail:
 
 ```bash
+mkdir -p /tmp/test-health-runs
 for i in $(seq 1 $FLAKY_RUNS); do
-  npx vitest run --reporter=json 2>&1
+  npx vitest run --reporter=json > /tmp/test-health-runs/run-$i.json 2>/tmp/test-health-runs/run-$i.err
 done
 ```
 
-For each run, parse the JSON reporter output to get per-test results.
+For each run, parse the JSON reporter output from `/tmp/test-health-runs/run-$i.json` to get per-test results.
 
 ### Analysis
 

From 52de49526db077c1a2291a606b1b21b71bbdad15 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:33:13 -0600
Subject: [PATCH 08/37] fix: always require confirmation for stale worktree
 removal

---
 .claude/skills/housekeep/SKILL.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index d8c75fb7..e0659672 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -53,8 +53,8 @@ git worktree prune
 ```
 
 For stale worktrees with merged branches:
-- List them and ask the user for confirmation before removing
-- If confirmed (or `--full` without `--dry-run`):
+- List them and **always ask the user for confirmation before removing**, regardless of `--full`
+- If confirmed:
   ```bash
   git worktree remove <path>
   git branch -d <branch>  # only if fully merged

From 8be5cecf241e82ea1ad50a34df8b084fef1f3b32 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:33:22 -0600
Subject: [PATCH 09/37] fix: use parsed threshold in baseline.json, guard
 --compare-only on first run

---
 .claude/skills/bench-check/SKILL.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 85031103..fc2ccff4 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -131,7 +131,7 @@ Skip this phase if `SAVE_ONLY=true` or no baseline exists.
 For each metric in the current run:
 
 1. Look up the same metric in the baseline
-2. Guard against division-by-zero: if `baseline == 0`, mark the delta as `"N/A � baseline was zero"` and treat the metric as **informational only** (not a regression or improvement)
+2. Guard against division-by-zero: if `baseline == 0`, mark the delta as `"N/A � baseline was zero"` and treat the metric as **informational only** (not a regression or improvement)
 3. Otherwise compute: `delta_pct = ((current - baseline) / baseline) * 100`
 4. Classify:
    - **Regression**: metric increased by more than `THRESHOLD`% (for time metrics) or decreased by more than `THRESHOLD`% (for recall/quality metrics)
@@ -165,8 +165,8 @@ Based on comparison results:
   - Re-run individual benchmarks to confirm (not flaky)
 
 ### First run (no baseline)
-- Print: `BENCH-CHECK — initial baseline saved`
-- Save current results as baseline
+- If `COMPARE_ONLY` is set: print a warning that no baseline exists and exit without saving
+- Otherwise: print `BENCH-CHECK — initial baseline saved` and save current results as baseline
 
 ## Phase 5 — Save Baseline
 
@@ -178,7 +178,7 @@ Write to `generated/bench-check/baseline.json`:
   "savedAt": "<ISO 8601>",
   "version": "<package version>",
   "gitRef": "<HEAD short SHA>",
-  "threshold": 15,
+  "threshold": $THRESHOLD,
   "metrics": { ... }
 }
 ```

From 87d9213cb3bcf75d1bcbff252d73d9b348bde8be Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:55:16 -0600
Subject: [PATCH 10/37] fix(deps-audit): track stash creation to avoid
 operating on wrong entry

When Phase 0 stash push is a no-op (manifests unchanged), Phase 7
was calling stash drop/pop on the wrong entry. Track STASH_CREATED
exit code and branch on it: use git checkout when no stash exists.
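
Caveat and sketch: `git stash push` commonly exits 0 even when it reports that there are no local changes to save, so the exit code alone may not distinguish the no-op case. One way to detect it without relying on the exit code is to compare `refs/stash` before and after the push. The function name below is hypothetical, and the 0/1 convention simply mirrors the one used in the skill:

```shell
# Hypothetical helper: sets STASH_CREATED=0 when a stash entry was actually
# created, and STASH_CREATED=1 when the push was a no-op.
stash_manifests() {
  # refs/stash is absent when the stash is empty; tolerate that case.
  before=$(git rev-parse -q --verify refs/stash 2>/dev/null || true)
  git stash push -q -m "deps-audit-backup" -- package.json package-lock.json
  after=$(git rev-parse -q --verify refs/stash 2>/dev/null || true)
  if [ "$before" != "$after" ]; then
    STASH_CREATED=0
  else
    STASH_CREATED=1
  fi
}
```

Both manifests are assumed to be tracked; an untracked `package-lock.json` would make the pathspec fail.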
---
 .claude/skills/deps-audit/SKILL.md | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index 189bbe04..d4c672af 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -22,7 +22,9 @@ Audit the project's dependency tree for security vulnerabilities, outdated packa
 5. **If `AUTO_FIX` is set:** Save the original manifests now, before any npm commands run, so pre-existing unstaged changes are preserved:
    ```bash
    git stash push -m "deps-audit-backup" -- package.json package-lock.json
+   STASH_CREATED=$?
    ```
+   Track `STASH_CREATED` — when `0`, a stash entry was actually created; when `1`, the files had no changes so nothing was stashed.
 
 ## Phase 1 — Security Vulnerabilities
 
@@ -158,17 +160,22 @@ If `AUTO_FIX` was set:
 Summarize all changes made:
 1. List each package updated/fixed
 2. Run `npm test` to verify nothing broke
-3. If tests pass: drop the saved state (`git stash drop`)
-4. If tests fail:
+3. If tests pass and `STASH_CREATED` is `0`: drop the saved state (`git stash drop`)
+   If tests pass and `STASH_CREATED` is `1`: discard manifest changes with `git checkout -- package.json package-lock.json` (no stash entry exists)
+4. If tests fail and `STASH_CREATED` is `0`:
    - Restore the saved manifests: `git stash pop`
    - Restore `node_modules/` to match the reverted lock file: `npm ci`
    - Report what failed
+5. If tests fail and `STASH_CREATED` is `1`:
+   - Discard manifest changes: `git checkout -- package.json package-lock.json`
+   - Restore `node_modules/` to match the reverted lock file: `npm ci`
+   - Report what failed
 
 ## Rules
 
 - **Never run `npm audit fix --force`** — breaking changes need human review
 - **Never remove a dependency** without asking the user, even if it appears unused — flag it in the report instead
 - **Always run tests** after any auto-fix changes
-- **If `--fix` causes test failures**, restore manifests from the saved state (`git stash pop`) then run `npm ci` to resync `node_modules/`, and report the failure
+- **If `--fix` causes test failures**, restore manifests from the saved state (git stash pop if `STASH_CREATED=0`, or `git checkout` if stash was a no-op) then run `npm ci` to resync `node_modules/`, and report the failure
 - Treat `optionalDependencies` separately — they're expected to fail on some platforms
 - The report goes in `generated/deps-audit/` — create the directory if it doesn't exist

From 65d983698c9a1353e90763944cfffbbb52631d7c Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:55:27 -0600
Subject: [PATCH 11/37] fix(test-health): use mktemp for flaky-run directory to
 avoid concurrent corruption

Replace hardcoded /tmp/test-health-runs/ with mktemp -d so parallel
sessions get isolated directories. Add cleanup at end of analysis.
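
As a variation on the explicit cleanup step (an alternative sketch, not what this patch adds), a `trap` removes the directory even if the analysis aborts early. The `TMPDIR` fallback and the `run_json` variable are assumptions for illustration:

```shell
# Create an isolated per-session directory; the XXXXXX suffix is randomized.
RUN_DIR=$(mktemp -d "${TMPDIR:-/tmp}/test-health-XXXXXX")
# Remove the directory when the shell exits, including on early failure.
trap 'rm -rf "$RUN_DIR"' EXIT

# Per-run output paths then hang off RUN_DIR, for example run 1:
run_json="$RUN_DIR/run-1.json"
```

The trade-off is that the directory disappears as soon as the shell exits, so the JSON must be parsed within the same session.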
---
 .claude/skills/test-health/SKILL.md | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
index 5ec83719..0cc0bc1f 100644
--- a/.claude/skills/test-health/SKILL.md
+++ b/.claude/skills/test-health/SKILL.md
@@ -38,13 +38,18 @@ Audit the test suite for flaky tests, dead/trivial tests, coverage gaps on recen
 Run the full test suite `FLAKY_RUNS` times and track per-test pass/fail:
 
 ```bash
-mkdir -p /tmp/test-health-runs
+RUN_DIR=$(mktemp -d /tmp/test-health-XXXXXX)
 for i in $(seq 1 $FLAKY_RUNS); do
-  npx vitest run --reporter=json > /tmp/test-health-runs/run-$i.json 2>/tmp/test-health-runs/run-$i.err
+  npx vitest run --reporter=json > "$RUN_DIR/run-$i.json" 2>"$RUN_DIR/run-$i.err"
 done
 ```
 
-For each run, parse the JSON reporter output from `/tmp/test-health-runs/run-$i.json` to get per-test results.
+For each run, parse the JSON reporter output from `$RUN_DIR/run-$i.json` to get per-test results.
+
+After all runs are parsed and analysis is complete, clean up the temporary directory:
+```bash
+rm -rf "$RUN_DIR"
+```
 
 ### Analysis
 

From eef2c03fa9f16784639bd2a6935b428be55e5d93 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:55:37 -0600
Subject: [PATCH 12/37] fix(bench-check): add save-baseline verdict path, fix
 em-dash, use explicit commit paths

Add 4th verdict path for --save-baseline when baseline already exists.
Replace corrupted em-dash character in N/A string. Change commit command
to use explicit file paths per project convention.
---
 .claude/skills/bench-check/SKILL.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index fc2ccff4..cb12ab92 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -131,7 +131,7 @@ Skip this phase if `SAVE_ONLY=true` or no baseline exists.
 For each metric in the current run:
 
 1. Look up the same metric in the baseline
-2. Guard against division-by-zero: if `baseline == 0`, mark the delta as `"N/A � baseline was zero"` and treat the metric as **informational only** (not a regression or improvement)
+2. Guard against division-by-zero: if `baseline == 0`, mark the delta as `"N/A — baseline was zero"` and treat the metric as **informational only** (not a regression or improvement)
 3. Otherwise compute: `delta_pct = ((current - baseline) / baseline) * 100`
 4. Classify:
    - **Regression**: metric increased by more than `THRESHOLD`% (for time metrics) or decreased by more than `THRESHOLD`% (for recall/quality metrics)
@@ -168,6 +168,10 @@ Based on comparison results:
 - If `COMPARE_ONLY` is set: print a warning that no baseline exists and exit without saving
 - Otherwise: print `BENCH-CHECK — initial baseline saved` and save current results as baseline
 
+### Save-baseline with existing baseline (`--save-baseline`)
+- Print: `BENCH-CHECK — baseline overwritten (previous: <old gitRef>, new: <new gitRef>)`
+- Save current results as the new baseline (overwrite existing)
+
 ## Phase 5 — Save Baseline
 
 When saving (initial run, `--save-baseline`, or passed comparison):
@@ -192,8 +196,7 @@ This creates a running log of benchmark results over time.
 
 After writing both files, commit the baseline so it is a shared reference point:
 ```bash
-git add generated/bench-check/baseline.json generated/bench-check/history.ndjson
-git diff --cached --quiet || git commit -m "chore: update bench-check baseline (<gitRef>)"
+git diff --quiet generated/bench-check/baseline.json generated/bench-check/history.ndjson || git commit generated/bench-check/baseline.json generated/bench-check/history.ndjson -m "chore: update bench-check baseline (<gitRef>)"
 ```
 
 ## Phase 6 — Report

From 19b14e93d6125bc131efadfc11879956b62f8aff Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 00:05:37 -0600
Subject: [PATCH 13/37] docs(roadmap): update Phase 5 TypeScript migration with
 accurate progress

Phase 5 was listed as "2 of 7 complete" with outdated pre-Phase 3 file
paths. Updated to reflect actual state: 32 of 269 source modules migrated
(~12%). Steps 5.3-5.5 now list exact migrated/remaining files with verified
counts (5.3=8, 5.4=54, 5.5=175, total=237 JS-only files). Added note about
14 stale .js counterparts of already-migrated .ts files needing deletion.
---
 docs/roadmap/ROADMAP.md | 128 +++++++++++++++++++++++++++++-----------
 1 file changed, 94 insertions(+), 34 deletions(-)

diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md
index b9664157..06fdbc0e 100644
--- a/docs/roadmap/ROADMAP.md
+++ b/docs/roadmap/ROADMAP.md
@@ -18,7 +18,7 @@ Codegraph is a strong local-first code graph CLI. This roadmap describes planned
 | [**2.7**](#phase-27--deep-analysis--graph-enrichment) | Deep Analysis & Graph Enrichment | Dataflow analysis, intraprocedural CFG, AST node storage, expanded node/edge types, extractors refactoring, CLI consolidation, interactive viewer, exports command, normalizeSymbol | **Complete** (v3.0.0) |
 | [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring (Vertical Slice) | Unified AST analysis framework, command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, builder pipeline, presentation layer, domain grouping, curated API, unified graph model, qualified names, CLI composability | **Complete** (v3.1.5) |
 | [**4**](#phase-4--resolution-accuracy) | Resolution Accuracy | Dead role sub-categories, receiver type tracking, interface/trait implementation edges, resolution precision/recall benchmarks, `package.json` exports field, monorepo workspace resolution | **In Progress** (5 of 6 complete) |
-| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration, supply-chain security, CI coverage gates | **In Progress** (2 of 7 complete) |
+| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | **In Progress** (32 of 269 src modules migrated; 14 stale `.js` to delete) |
 | [**6**](#phase-6--native-analysis-acceleration) | Native Analysis Acceleration | Move JS-only build phases (AST nodes, CFG, dataflow, insert nodes, structure, roles, complexity) to Rust; fix incremental rebuild data loss on native; sub-100ms 1-file rebuilds | Planned |
 | [**7**](#phase-7--runtime--extensibility) | Runtime & Extensibility | Event-driven pipeline, unified engine strategy, subgraph export filtering, transitive confidence, query caching, configuration profiles, pagination, plugin system, DX & onboarding, confidence annotations, shell completion | Planned |
 | [**8**](#phase-8--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic metadata, module summaries | Planned |
@@ -1080,12 +1080,16 @@ npm workspaces (`package.json` `workspaces`), `pnpm-workspace.yaml`, and `lerna.
 
 ## Phase 5 -- TypeScript Migration
 
-> **Status:** In Progress
+> **Status:** In Progress — 32 of 269 source modules migrated (~12%), plus 14 stale `.js` counterparts to delete
 
 **Goal:** Migrate the codebase from plain JavaScript to TypeScript, leveraging the clean module boundaries established in Phase 3. Incremental module-by-module migration starting from leaf modules inward.
 
 **Why after Phase 4:** The resolution accuracy work (Phase 4) operates on the existing JS codebase and produces immediate accuracy gains. TypeScript migration builds on Phase 3's clean module boundaries to add type safety across the entire codebase. Every subsequent phase benefits from types: MCP schema auto-generation, API contracts, refactoring safety. The Phase 4 resolution improvements (receiver tracking, interface edges) establish the resolution model that TypeScript types will formalize.
 
+**Note:** File paths below reflect the post-Phase 3 directory structure. Migration has progressed non-linearly — some orchestration modules were migrated before all leaf/core modules were complete. `.js` and `.ts` coexist during migration (`allowJs: true` in tsconfig). 14 already-migrated modules still have stale `.js` counterparts that need deletion (see cleanup note at the end of this section).
+
+**File counts (as of March 2026):** 32 `.ts` modules in `src/`, 237 `.js`-only files needing migration, 14 stale `.js` duplicates of already-migrated `.ts` files needing deletion. Remaining by step: 5.3 = 8, 5.4 = 54, 5.5 = 175 (total = 237).
+
 ### ~~5.1 -- Project Setup~~ ✅
 
 TypeScript project configured with strict mode, ES module output, path aliases, incremental compilation, and `dist/` build output with source maps. Biome configured for `.ts` files. `package.json` `exports` point to compiled output.
@@ -1108,50 +1112,106 @@ Comprehensive TypeScript type definitions for the entire domain model — symbol
 
 **New file:** `src/types.ts` ([#516](https://github.com/optave/codegraph/pull/516))
 
-### 5.3 -- Leaf Module Migration
+### 5.3 -- Leaf Module Migration (In Progress)
 
-Migrate modules with no internal dependencies first:
+Migrate modules with no or minimal internal dependencies.
+
+**Migrated:**
 
 | Module | Notes |
 |--------|-------|
-| `src/errors.ts` | Domain error hierarchy (Phase 3.7) |
-| `src/logger.ts` | Minimal, no internal deps |
-| `src/constants.ts` | Pure data |
-| `src/config.ts` | Config types derived from `.codegraphrc.json` schema |
-| `src/db/connection.ts` | SQLite connection wrapper |
-| `src/db/migrations.ts` | Schema version management |
-| `src/formatters/*.ts` | Pure input->string transforms |
-| `src/paginate.ts` | Generic pagination helpers |
+| `src/shared/errors.ts` | Domain error hierarchy (Phase 3.7) |
+| `src/shared/kinds.ts` | Symbol and edge kind constants |
+| `src/shared/normalize.ts` | Symbol name normalization |
+| `src/shared/paginate.ts` | Generic pagination helpers |
+| `src/infrastructure/logger.ts` | Structured logging |
+| `src/infrastructure/result-formatter.ts` | JSON/NDJSON output formatting |
+| `src/infrastructure/test-filter.ts` | Test file detection heuristics |
+| `src/presentation/colors.ts` | ANSI color constants |
+| `src/presentation/table.ts` | CLI table formatting |
+
+**Remaining:**
 
-Allow `.js` and `.ts` to coexist during migration (`allowJs: true` in tsconfig).
+| Module | Notes |
+|--------|-------|
+| `src/shared/constants.js` | `EXTENSIONS`, `IGNORE_DIRS` constants |
+| `src/shared/file-utils.js` | File path utilities |
+| `src/shared/generators.js` | Generator/async iterator helpers |
+| `src/shared/hierarchy.js` | Hierarchy traversal helpers |
+| `src/infrastructure/config.js` | Config loading, env overrides, secret resolution |
+| `src/infrastructure/native.js` | Native napi-rs addon loader with WASM fallback |
+| `src/infrastructure/registry.js` | Global repo registry for multi-repo MCP |
+| `src/infrastructure/update-check.js` | npm update availability check |
+
+### 5.4 -- Core Module Migration (In Progress)
 
-### 5.4 -- Core Module Migration
+Migrate modules that implement domain logic and Phase 3 interfaces.
 
-Migrate modules that implement Phase 3 interfaces:
+**Migrated:**
 
 | Module | Key types |
 |--------|-----------|
-| `src/db/repository.ts` | `Repository` interface, all prepared statements typed |
-| `src/parser/engine.ts` | `Engine` interface, native/WASM dispatch |
-| `src/parser/registry.ts` | `LanguageEntry` type, extension mapping |
-| `src/parser/tree-utils.ts` | Tree-sitter node helpers |
-| `src/parser/base-extractor.ts` | `Extractor` interface, handler map |
-| `src/parser/extractors/*.ts` | Per-language extractors |
-| `src/analysis/*.ts` | Typed analysis results (impact scores, call chains) |
-| `src/resolve.ts` | Import resolution with confidence types |
-
-### 5.5 -- Orchestration & Public API Migration
-
-Migrate top-level orchestration and entry points:
+| `src/graph/model.ts` | `CodeGraph` class, unified graph model |
+| `src/graph/algorithms/bfs.ts` | Breadth-first search traversal |
+| `src/graph/algorithms/centrality.ts` | Centrality metrics (degree, betweenness) |
+| `src/graph/algorithms/shortest-path.ts` | Shortest path between symbols |
+| `src/graph/algorithms/tarjan.ts` | Tarjan SCC (cycle detection) |
+| `src/graph/algorithms/leiden/rng.ts` | Random number generator for Leiden |
+| `src/graph/classifiers/risk.ts` | Risk scoring classifier |
+| `src/graph/classifiers/roles.ts` | Symbol role classifier |
+| `src/domain/graph/resolve.ts` | Import resolution with confidence types |
+
+**Remaining (54 files):**
+
+| Module | Files | Key types |
+|--------|-------|-----------|
+| `src/db/` | 18 | `Repository` interface, SQLite connection, migrations, query builder, all repository modules |
+| `src/domain/parser.js` | 1 | `Engine` interface, tree-sitter WASM wrapper, `LANGUAGE_REGISTRY` |
+| `src/domain/queries.js` | 1 | Query functions: symbol search, file deps, impact analysis, diff-impact |
+| `src/domain/analysis/` | 9 | Analysis results (context, impact, dependencies, exports, roles, etc.) |
+| `src/extractors/` | 11 | Per-language extractors (JS, TS, Go, Rust, Java, C#, PHP, Ruby, Python, HCL) + helpers + barrel |
+| `src/graph/algorithms/` | 8 | Louvain, Leiden (6 files: adapter, CPM, index, modularity, optimiser, partition), algorithms barrel |
+| `src/graph/builders/` | 4 | Dependency, structure, temporal graph builders + barrel |
+| `src/graph/classifiers/index.js` + `src/graph/index.js` | 2 | Barrel exports |
+
+### 5.5 -- Orchestration & Public API Migration (In Progress)
+
+Migrate top-level orchestration, features, and entry points.
+
+**Migrated:**
 
 | Module | Notes |
 |--------|-------|
-| `src/builder.ts` | Pipeline stages with typed `PipelineStage` |
-| `src/watcher.ts` | File system events + pipeline |
-| `src/embeddings/*.ts` | Vector store interface, model registry |
-| `src/mcp/*.ts` | Tool schemas, typed handlers |
-| `src/cli/*.ts` | Command objects with typed options |
-| `src/index.ts` | Curated public API with proper export types |
+| `src/domain/graph/builder.ts` | Graph build orchestrator |
+| `src/domain/graph/builder/context.ts` | Build context (options, state) |
+| `src/domain/graph/builder/helpers.ts` | Builder utility functions |
+| `src/domain/graph/builder/pipeline.ts` | Pipeline stage definitions |
+| `src/domain/graph/watcher.ts` | File system events + rebuild triggers |
+| `src/domain/search/generator.ts` | Embedding vector generation |
+| `src/domain/search/index.ts` | Search module entry point |
+| `src/domain/search/models.ts` | Model management |
+| `src/mcp/index.ts` | MCP server entry point |
+| `src/mcp/middleware.ts` | MCP middleware layer |
+| `src/mcp/server.ts` | MCP server implementation |
+| `src/mcp/tool-registry.ts` | Dynamic tool list builder |
+| `src/features/export.ts` | Graph export orchestration |
+
+**Remaining (175 files):**
+
+| Module | Files | Notes |
+|--------|-------|-------|
+| `src/cli.js` + `src/cli/` | 48 | Commander CLI entry point, 43 command handlers (`commands/`), barrel, shared CLI utilities (`shared/`: open-graph, options, output) |
+| `src/index.js` | 1 | Curated public API exports |
+| `src/features/` | 20 | ast, audit, batch, boundaries, branch-compare, cfg, check, cochange, communities, complexity, dataflow, flow, graph-enrichment, manifesto, owners, sequence, snapshot, structure, triage, `shared/find-nodes` |
+| `src/presentation/` | 28 | All presentation formatters (14 files), `queries-cli/` (7 files), result-formatter, sequence-renderer, viewer, query, export, flow, brief |
+| `src/mcp/tools/` | 36 | Individual MCP tool handlers + barrel |
+| `src/domain/graph/builder/stages/` | 9 | Build pipeline stages (collect-files, parse-files, resolve-imports, etc.) |
+| `src/domain/graph/builder/incremental.js` | 1 | Incremental rebuild logic |
+| `src/domain/graph/` | 3 | `cycles.js`, `journal.js`, `change-journal.js` |
+| `src/domain/search/` | 11 | Search subsystem: `search/` (6 files), `stores/` (2 files), `strategies/` (3 files) |
+| `src/ast-analysis/` | 18 | AST analysis framework, visitors, language-specific rules |
+
+**JS counterpart cleanup (14 files to delete):** The following `.js` files are stale counterparts of already-migrated `.ts` files and should be deleted once all consumers import from `.ts`: `domain/graph/builder.js`, `domain/graph/builder/{context,helpers,pipeline}.js`, `domain/graph/resolve.js`, `domain/graph/watcher.js`, `domain/search/{generator,index,models}.js`, `features/export.js`, `mcp/{index,middleware,server,tool-registry}.js`
 
 ### 5.6 -- Test Migration
 
@@ -1162,7 +1222,7 @@ Migrate top-level orchestration and entry points:
 
 **Verification:** All existing tests pass. `tsc --noEmit` succeeds with zero errors. No `any` escape hatches except at FFI boundaries (napi-rs addon, tree-sitter WASM).
 
-**Affected files:** All `src/**/*.js` -> `src/**/*.ts`, all `tests/**/*.js` -> `tests/**/*.ts`, `package.json`, `biome.json`
+**Affected files:** All remaining `src/**/*.js` → `src/**/*.ts`, all `tests/**/*.js` → `tests/**/*.ts`, `package.json`, `biome.json`
 
 ---
 

From 5bda6bae7ff30242a9fb0f7144580382d2305a53 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 08:48:50 -0600
Subject: [PATCH 14/37] fix: deps-audit success path should keep npm changes,
 not revert (#565)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When STASH_CREATED=1 and tests pass, the npm audit fix changes are
good — no action needed. Previously it ran git checkout to discard
them, which undid the successful fix.
---
 .claude/skills/deps-audit/SKILL.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index d4c672af..ccd7e6c4 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -160,8 +160,8 @@ If `AUTO_FIX` was set:
 Summarize all changes made:
 1. List each package updated/fixed
 2. Run `npm test` to verify nothing broke
-3. If tests pass and `STASH_CREATED` is `0`: drop the saved state (`git stash drop`)
-   If tests pass and `STASH_CREATED` is `1`: discard manifest changes with `git checkout -- package.json package-lock.json` (no stash entry exists)
+3. If tests pass and `STASH_CREATED` is `0`: drop the saved state (`git stash drop`) — the npm changes are good, no rollback needed
+   If tests pass and `STASH_CREATED` is `1`: no action needed — the npm changes are good and no stash entry exists to clean up
 4. If tests fail and `STASH_CREATED` is `0`:
    - Restore the saved manifests: `git stash pop`
    - Restore `node_modules/` to match the reverted lock file: `npm ci`

From bd0ba1a39a8aad1c4709058b450eeeee20d27186 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 08:49:00 -0600
Subject: [PATCH 15/37] fix: bench-check use git add + diff --cached to detect
 new files (#565)

git diff --quiet ignores untracked files, so on the first run when
baseline.json and history.ndjson are newly created, the commit was
skipped. Stage first with git add, then check with --cached.
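
A minimal sketch of the behaviour described above, run in a throwaway
repository. The helper name and the scratch file are hypothetical and
exist only for this illustration; they are not part of the skill.

```shell
# Hypothetical helper: prints the exit codes of the two checks, before
# and after staging, for a brand-new file in a scratch repository.
demo_untracked_detection() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    echo '{}' > baseline.json                 # new, untracked file
    git diff --quiet -- baseline.json
    before=$?                                 # untracked files are not in the diff
    git add baseline.json
    git diff --cached --quiet -- baseline.json
    after=$?                                  # the staged addition is now visible
    echo "$before $after"
  )
  rm -rf "$repo"
}

demo_untracked_detection   # prints "0 1"
```

The first check reports "no changes" (0) even though the file exists,
which is why the old one-liner skipped the commit on the first run.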
---
 .claude/skills/bench-check/SKILL.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index cb12ab92..16a0d4a9 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -196,9 +196,12 @@ This creates a running log of benchmark results over time.
 
 After writing both files, commit the baseline so it is a shared reference point:
 ```bash
-git diff --quiet generated/bench-check/baseline.json generated/bench-check/history.ndjson || git commit generated/bench-check/baseline.json generated/bench-check/history.ndjson -m "chore: update bench-check baseline (<gitRef>)"
+git add generated/bench-check/baseline.json generated/bench-check/history.ndjson
+git diff --cached --quiet || git commit generated/bench-check/baseline.json generated/bench-check/history.ndjson -m "chore: update bench-check baseline (<gitRef>)"
 ```
 
+> `git add` first so that newly created files (first run) are staged; `--cached` then detects them correctly. Without this, `git diff --quiet` ignores untracked files and the baseline is never committed on the first run.
+
 ## Phase 6 — Report
 
 Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`:

From 7b91e3c524701e52902da50c673a6eee09e6aca1 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 08:49:10 -0600
Subject: [PATCH 16/37] fix: housekeep require confirmation before branch
 deletion (#565)

Branch deletion now asks for user confirmation before each delete,
consistent with worktree removal in Phase 1c.
---
 .claude/skills/housekeep/SKILL.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index e0659672..0a086fa6 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -157,12 +157,17 @@ This removes local refs to branches that no longer exist on the remote.
 
 **If `DRY_RUN`:** List branches that would be deleted.
 
-**Otherwise:** Delete merged branches:
+**Otherwise:** For each merged branch, ask the user for confirmation before deleting:
+```
+Delete merged branch '<branch>'? (y/n)
+```
+If confirmed, delete the branch:
 ```bash
 git branch -d <branch>  # safe delete, only if fully merged
 ```
 
 > **Never use `git branch -D`** (force delete). If `-d` fails, the branch has unmerged work — skip it.
+> **Always confirm before deleting** — consistent with worktree removal in Phase 1c.
 
 ## Phase 5 — Update Codegraph
 

From 5462d32e905a146aab333753eace0bae752a08aa Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 19:14:52 -0600
Subject: [PATCH 17/37] fix: scope git diff --cached to bench-check files only
 (#565)

---
 .claude/skills/bench-check/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 16a0d4a9..9ba48309 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -197,7 +197,7 @@ This creates a running log of benchmark results over time.
 After writing both files, commit the baseline so it is a shared reference point:
 ```bash
 git add generated/bench-check/baseline.json generated/bench-check/history.ndjson
-git diff --cached --quiet || git commit generated/bench-check/baseline.json generated/bench-check/history.ndjson -m "chore: update bench-check baseline (<gitRef>)"
+git diff --cached --quiet -- generated/bench-check/baseline.json generated/bench-check/history.ndjson || git commit generated/bench-check/baseline.json generated/bench-check/history.ndjson -m "chore: update bench-check baseline (<gitRef>)"
 ```
 
 > `git add` first so that newly created files (first run) are staged; `--cached` then detects them correctly. Without this, `git diff --quiet` ignores untracked files and the baseline is never committed on the first run.

From 457e6b9dfe195a793b52167ba809c46aec57aa19 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 19:14:56 -0600
Subject: [PATCH 18/37] fix: use json-summary reporter to match
 coverage-summary.json output (#565)

---
 .claude/skills/test-health/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
index 0cc0bc1f..e9d57d2e 100644
--- a/.claude/skills/test-health/SKILL.md
+++ b/.claude/skills/test-health/SKILL.md
@@ -107,7 +107,7 @@ Search for duplicate test descriptions within the same `describe` block — thes
 Run vitest with coverage and analyze:
 
 ```bash
-npx vitest run --coverage --coverage.reporter=json 2>&1
+npx vitest run --coverage --coverage.reporter=json-summary 2>&1
 ```
 
 ### 3a. Overall coverage

From 852003dfde920059dc940792c9067e2495f2f77d Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 19:15:01 -0600
Subject: [PATCH 19/37] fix: capture stash ref by name to avoid position-based
 targeting (#565)

---
 .claude/skills/deps-audit/SKILL.md | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index ccd7e6c4..83cc0481 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -25,6 +25,11 @@ Audit the project's dependency tree for security vulnerabilities, outdated packa
    STASH_CREATED=$?
    ```
    Track `STASH_CREATED` — when `0`, a stash entry was actually created; when `1`, the files had no changes so nothing was stashed.
+   If `STASH_CREATED` is `0`, immediately capture the stash ref for later use:
+   ```bash
+   STASH_REF=$(git stash list --format='%gd %s' | grep 'deps-audit-backup' | head -1 | awk '{print $1}')
+   ```
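+   Use `$STASH_REF` (not a hard-coded `stash@{0}`) in all later stash drop/pop commands. Note that `%gd` yields a positional name such as `stash@{0}`, so if other stashes may have been pushed in the interim, re-run this lookup immediately before each drop/pop to avoid targeting the wrong entry.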
 
 ## Phase 1 — Security Vulnerabilities
 
@@ -160,10 +165,10 @@ If `AUTO_FIX` was set:
 Summarize all changes made:
 1. List each package updated/fixed
 2. Run `npm test` to verify nothing broke
-3. If tests pass and `STASH_CREATED` is `0`: drop the saved state (`git stash drop`) — the npm changes are good, no rollback needed
+3. If tests pass and `STASH_CREATED` is `0`: drop the saved state (`git stash drop $STASH_REF`) — the npm changes are good, no rollback needed
    If tests pass and `STASH_CREATED` is `1`: no action needed — the npm changes are good and no stash entry exists to clean up
 4. If tests fail and `STASH_CREATED` is `0`:
-   - Restore the saved manifests: `git stash pop`
+   - Restore the saved manifests: `git stash pop $STASH_REF`
    - Restore `node_modules/` to match the reverted lock file: `npm ci`
    - Report what failed
 5. If tests fail and `STASH_CREATED` is `1`:
@@ -176,6 +181,6 @@ Summarize all changes made:
 - **Never run `npm audit fix --force`** — breaking changes need human review
 - **Never remove a dependency** without asking the user, even if it appears unused — flag it in the report instead
 - **Always run tests** after any auto-fix changes
-- **If `--fix` causes test failures**, restore manifests from the saved state (git stash pop if `STASH_CREATED=0`, or `git checkout` if stash was a no-op) then run `npm ci` to resync `node_modules/`, and report the failure
+- **If `--fix` causes test failures**, restore manifests from the saved state (`git stash pop $STASH_REF` if `STASH_CREATED=0`, or `git checkout` if stash was a no-op) then run `npm ci` to resync `node_modules/`, and report the failure
 - Treat `optionalDependencies` separately — they're expected to fail on some platforms
 - The report goes in `generated/deps-audit/` — create the directory if it doesn't exist

From eea2954bdeceaa594a9599cb3fe037d692cfcb7a Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 19:15:06 -0600
Subject: [PATCH 20/37] fix: remove unreachable Phase 5 subphases since
 source-repo guard always skips (#565)

---
 .claude/skills/housekeep/SKILL.md | 35 -------------------------------
 1 file changed, 35 deletions(-)

diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index 0a086fa6..c4c7cca0 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -176,41 +176,6 @@ git branch -d <branch>  # safe delete, only if fully merged
 > **Source-repo guard:** This phase is only meaningful when codegraph is installed as a *dependency* of a consumer project. Because the pre-flight confirms we are inside the codegraph *source* repo (`"name": "@optave/codegraph"`), comparing the dev version to the published release and running `npm install` would be a no-op — codegraph is not one of its own dependencies. **Skip this entire phase** when running inside the source repo and print:
 > `Codegraph: skipped (running inside source repo — update via git pull / branch sync instead)`
 
-### 5a. Check current version
-
-```bash
-node -e "console.log(require('./package.json').version)"
-```
-
-### 5b. Check latest published version
-
-```bash
-npm view @optave/codegraph version
-```
-
-### 5c. Update if needed
-
-If a newer version is available:
-- Show the version diff (current → latest)
-- Check the CHANGELOG for what changed
-- If it's a patch/minor: update automatically
-  ```bash
-  npm install
-  ```
-- If it's a major: warn the user and ask for confirmation
-
-### 5d. Rebuild
-
-After any update:
-```bash
-npm install
-```
-
-Verify the build works:
-```bash
-npx codegraph stats 2>/dev/null && echo "OK" || echo "FAILED"
-```
-
 ## Phase 6 — Verify Repo Health
 
 Quick health checks to catch issues:

From 9b4869c47c8afcf40cf2084888779807f6737cfc Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 19:31:28 -0600
Subject: [PATCH 21/37] fix: use dynamic threshold variable in bench-check
 Phase 6 report template (#565)

---
 .claude/skills/bench-check/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 9ba48309..e1926569 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -209,7 +209,7 @@ Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`:
 ```markdown
 # Benchmark Report — <date>
 
-**Version:** X.Y.Z | **Git ref:** abc1234 | **Threshold:** 15%
+**Version:** X.Y.Z | **Git ref:** abc1234 | **Threshold:** $THRESHOLD%
 
 ## Verdict: PASSED / FAILED
 

From 8d92c999cf344344149a0e5c73e667b7217ab409 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 21:32:34 -0600
Subject: [PATCH 22/37] fix: address open review items in maintenance skills
 (#565)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- bench-check: add timeout 300 wrappers to all 4 benchmark invocations
  with exit code 124 check for timeout detection
- bench-check: add explicit COMPARE_ONLY guard at Phase 5 entry
- housekeep: fix grep portability — use grep -cE instead of GNU \| syntax
- test-health: add timeout 180 wrapper in flaky detection loop
- test-health: fix find command -o precedence with grouping parentheses
---
 .claude/skills/bench-check/SKILL.md | 16 +++++++++++-----
 .claude/skills/housekeep/SKILL.md   |  2 +-
 .claude/skills/test-health/SKILL.md |  9 ++++++---
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index e1926569..362b1db0 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -39,10 +39,11 @@ Run each benchmark script and collect results. Each script outputs JSON to stdou
 ### 1a. Build & Query Benchmark
 
 ```bash
-output=$(node scripts/benchmark.js 2>&1)
+output=$(timeout 300 node scripts/benchmark.js 2>&1)
 exit_code=$?
 ```
 
+If `exit_code` is 124: record `"timeout"` for this suite and continue.
 If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
 
 Extract:
@@ -53,10 +54,11 @@ Extract:
 ### 1b. Incremental Benchmark
 
 ```bash
-output=$(node scripts/incremental-benchmark.js 2>&1)
+output=$(timeout 300 node scripts/incremental-benchmark.js 2>&1)
 exit_code=$?
 ```
 
+If `exit_code` is 124: record `"timeout"` for this suite and continue.
 If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
 
 Extract:
@@ -67,10 +69,11 @@ Extract:
 ### 1c. Query Depth Benchmark
 
 ```bash
-output=$(node scripts/query-benchmark.js 2>&1)
+output=$(timeout 300 node scripts/query-benchmark.js 2>&1)
 exit_code=$?
 ```
 
+If `exit_code` is 124: record `"timeout"` for this suite and continue.
 If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
 
 Extract:
@@ -81,17 +84,18 @@ Extract:
 ### 1d. Embedding Benchmark (optional)
 
 ```bash
-output=$(node scripts/embedding-benchmark.js 2>&1)
+output=$(timeout 300 node scripts/embedding-benchmark.js 2>&1)
 exit_code=$?
 ```
 
+If `exit_code` is 124: record `"timeout"` for this suite and continue.
 If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
 
 Extract:
 - `embeddingTime` (ms)
 - `recall` at Hit@1, Hit@3, Hit@5, Hit@10
 
-> **Timeout:** Each benchmark gets 5 minutes max. If it times out, record `"timeout"` for that suite and continue.
+> **Timeout:** Each benchmark gets 5 minutes max (`timeout 300`). Exit code 124 indicates timeout — record `"timeout"` for that suite and continue.
 
 > **Errors:** If a benchmark script fails (non-zero exit), record `"error: <message>"` and continue with remaining benchmarks.
 
@@ -174,6 +178,8 @@ Based on comparison results:
 
 ## Phase 5 — Save Baseline
 
+**Skip this phase if `COMPARE_ONLY` is set.** Compare-only mode never writes or commits baselines.
+
 When saving (initial run, `--save-baseline`, or passed comparison):
 
 Write to `generated/bench-check/baseline.json`:
diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index c4c7cca0..0c8c0d6b 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -194,7 +194,7 @@ npx codegraph build
 ### 6b. Node modules integrity
 
 ```bash
-npm ls --depth=0 2>&1 | grep -c "missing\|invalid\|WARN"
+npm ls --depth=0 2>&1 | grep -cE "missing|invalid|WARN"
 ```
 
 If issues found: `npm install` to fix.
diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
index e9d57d2e..bb66790e 100644
--- a/.claude/skills/test-health/SKILL.md
+++ b/.claude/skills/test-health/SKILL.md
@@ -27,7 +27,7 @@ Audit the test suite for flaky tests, dead/trivial tests, coverage gaps on recen
    - `QUICK=true` if `--quick`
 4. Discover all test files:
    ```bash
-   find tests/ -name '*.test.js' -o -name '*.test.ts' | sort
+   find tests/ \( -name '*.test.js' -o -name '*.test.ts' \) | sort
    ```
 5. Count total test files and categorize by directory (integration, parsers, graph, search, unit)
 
@@ -40,7 +40,10 @@ Run the full test suite `FLAKY_RUNS` times and track per-test pass/fail:
 ```bash
 RUN_DIR=$(mktemp -d /tmp/test-health-XXXXXX)
 for i in $(seq 1 $FLAKY_RUNS); do
-  npx vitest run --reporter=json > "$RUN_DIR/run-$i.json" 2>"$RUN_DIR/run-$i.err"
+  timeout 180 npx vitest run --reporter=json > "$RUN_DIR/run-$i.json" 2>"$RUN_DIR/run-$i.err"
+  if [ $? -eq 124 ]; then
+    echo '{"timeout":true}' > "$RUN_DIR/run-$i.json"
+  fi
 done
 ```
 
@@ -63,7 +66,7 @@ For each flaky test found:
    - **Resource-dependent**: mentions file system, network, port, or temp directory
    - **Non-deterministic**: random/Date.now/Math.random in test or source
 
-> **Timeout:** Each full suite run gets 3 minutes. If it times out, record partial results and continue.
+> **Timeout:** Each full suite run gets 3 minutes (`timeout 180`). Exit code 124 indicates timeout — the run is recorded as `{"timeout":true}` and the loop continues.
 
 ## Phase 2 — Dead & Trivial Test Detection
 

From 9ad37ea1117be6ffd8ed72a9ae43f961dd62a546 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Sun, 22 Mar 2026 23:25:11 -0600
Subject: [PATCH 23/37] fix: add COVERAGE_ONLY guards to Phase 2 and Phase 4 in
 test-health

---
 .claude/skills/test-health/SKILL.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/.claude/skills/test-health/SKILL.md b/.claude/skills/test-health/SKILL.md
index bb66790e..4c836586 100644
--- a/.claude/skills/test-health/SKILL.md
+++ b/.claude/skills/test-health/SKILL.md
@@ -70,6 +70,8 @@ For each flaky test found:
 
 ## Phase 2 — Dead & Trivial Test Detection
 
+**Skip if `COVERAGE_ONLY` is set.**
+
 Scan all test files for problematic patterns:
 
 ### 2a. Empty / no-assertion tests
@@ -149,6 +151,8 @@ For each changed source file, check if:
 
 ## Phase 4 — Test Structure Analysis
 
+**Skip if `COVERAGE_ONLY` is set.**
+
 Analyze the test suite's structural health:
 
 ### 4a. Test-to-source mapping

From 30ab30e64f5db5eacb0a608676e58ad0ddfe498c Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 01:38:03 -0600
Subject: [PATCH 24/37] fix: add regression skip guard to bench-check Phase 5,
 expand deps-audit search dirs

---
 .claude/skills/bench-check/SKILL.md | 1 +
 .claude/skills/deps-audit/SKILL.md  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 362b1db0..70b6d70b 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -179,6 +179,7 @@ Based on comparison results:
 ## Phase 5 — Save Baseline
 
 **Skip this phase if `COMPARE_ONLY` is set.** Compare-only mode never writes or commits baselines.
+**Skip this phase if regressions were detected in Phase 4.** The baseline is only updated on a clean run.
 
 When saving (initial run, `--save-baseline`, or passed comparison):
 
diff --git a/.claude/skills/deps-audit/SKILL.md b/.claude/skills/deps-audit/SKILL.md
index 83cc0481..083a8c77 100644
--- a/.claude/skills/deps-audit/SKILL.md
+++ b/.claude/skills/deps-audit/SKILL.md
@@ -74,7 +74,7 @@ For any package classified as **Abandoned**, check if there's a maintained fork
 Detect dependencies declared in `package.json` but never imported:
 
 1. Read `dependencies` and `devDependencies` from `package.json`
-2. For each dependency, search for imports/requires across `src/`, `tests/`, `scripts/`, `cli.js`, `index.js`:
+2. For each dependency, search for imports/requires across `src/`, `tests/`, `scripts/`, `mcp/`, `graph/`, `ast-analysis/`, `cli.js`, `index.js`:
    - `require('<pkg>')` or `require('<pkg>/...')`
    - `import ... from '<pkg>'` or `import '<pkg>'`
    - `import('<pkg>')` (dynamic imports)

From a8631d2a9522dd24e29cb169c2f927a48b056427 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 02:26:00 -0600
Subject: [PATCH 25/37] fix: add empty-string guard for stat size check in
 housekeep (#565)

When both stat variants (GNU and BSD) fail, $size is empty and the
arithmetic comparison errors out. Add a [ -z "$size" ] && continue
guard so the loop skips files whose size cannot be determined.
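
A small sketch of the guarded loop, wrapped in a function so it can be
exercised outside the skill. The function name and the byte-limit
parameter are assumptions made for this example; the skill itself reads
from git ls-files and hard-codes 1048576.

```shell
# Hypothetical helper: reads file paths on stdin and prints those whose
# size exceeds the given byte limit.
list_large_files() {
  limit=$1
  while IFS= read -r f; do
    # Try GNU stat first, then BSD stat; both can fail (e.g. file vanished).
    size=$(stat --format='%s' "$f" 2>/dev/null || stat -f '%z' "$f" 2>/dev/null)
    [ -z "$size" ] && continue   # size unknown: skip rather than break the -gt test
    if [ "$size" -gt "$limit" ]; then echo "$f ($size bytes)"; fi
  done
}
```

Without the guard, an empty $size makes the integer comparison fail with
an error for that file instead of being skipped quietly.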
---
 .claude/skills/housekeep/SKILL.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.claude/skills/housekeep/SKILL.md b/.claude/skills/housekeep/SKILL.md
index 0c8c0d6b..ef15efb7 100644
--- a/.claude/skills/housekeep/SKILL.md
+++ b/.claude/skills/housekeep/SKILL.md
@@ -85,6 +85,7 @@ Find untracked files larger than 1MB:
 ```bash
 git ls-files --others --exclude-standard | while read f; do
   size=$(stat --format='%s' "$f" 2>/dev/null || stat -f '%z' "$f" 2>/dev/null)
+  [ -z "$size" ] && continue
   if [ "$size" -gt 1048576 ]; then echo "$f ($size bytes)"; fi
 done
 ```

From 8fd7430c0b167a57826f838c837238881fc25d43 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 02:26:11 -0600
Subject: [PATCH 26/37] fix: add BASELINE SAVED verdict path and clarify
 if/else-if in bench-check (#565)

Phase 6: when SAVE_ONLY or first-run (no prior baseline), write a
shortened report with "Verdict: BASELINE SAVED" instead of the full
comparison report.

Phases 1a-1d: replace ambiguous "If timeout / If non-zero" with
explicit "If timeout / Else if non-zero" so the two conditions are
clearly mutually exclusive.
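
The intended branching can be sketched as a shell helper. This is an
illustration only: the function name is hypothetical, and it assumes the
GNU coreutils timeout command is available, as the skill already does.

```shell
# Hypothetical helper: runs a command under a time limit and prints
# "timeout", "error: <output>", or the command's output.
run_with_timeout() {
  limit=$1
  shift
  output=$(timeout "$limit" "$@" 2>&1)
  exit_code=$?
  if [ "$exit_code" -eq 124 ]; then
    echo "timeout"                 # timeout(1) killed the command
  elif [ "$exit_code" -ne 0 ]; then
    echo "error: $output"          # command failed on its own
  else
    echo "$output"
  fi
}
```

Because 124 is also non-zero, two independent "if" checks would record
both results for a timed-out suite; the elif keeps them exclusive.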
---
 .claude/skills/bench-check/SKILL.md | 34 +++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/.claude/skills/bench-check/SKILL.md b/.claude/skills/bench-check/SKILL.md
index 70b6d70b..a6f474d3 100644
--- a/.claude/skills/bench-check/SKILL.md
+++ b/.claude/skills/bench-check/SKILL.md
@@ -43,8 +43,8 @@ output=$(timeout 300 node scripts/benchmark.js 2>&1)
 exit_code=$?
 ```
 
-If `exit_code` is 124: record `"timeout"` for this suite and continue.
-If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+If `exit_code` is 124: record `"timeout"` for this suite and skip to the next suite.
+Else if `exit_code` is non-zero: record `"error: $output"` for this suite and skip to the next suite.
 
 Extract:
 - `buildTime` (ms) — per engine (native, WASM)
@@ -58,8 +58,8 @@ output=$(timeout 300 node scripts/incremental-benchmark.js 2>&1)
 exit_code=$?
 ```
 
-If `exit_code` is 124: record `"timeout"` for this suite and continue.
-If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+If `exit_code` is 124: record `"timeout"` for this suite and skip to the next suite.
+Else if `exit_code` is non-zero: record `"error: $output"` for this suite and skip to the next suite.
 
 Extract:
 - `noOpRebuild` (ms) — time for no-change rebuild
@@ -73,8 +73,8 @@ output=$(timeout 300 node scripts/query-benchmark.js 2>&1)
 exit_code=$?
 ```
 
-If `exit_code` is 124: record `"timeout"` for this suite and continue.
-If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+If `exit_code` is 124: record `"timeout"` for this suite and skip to the next suite.
+Else if `exit_code` is non-zero: record `"error: $output"` for this suite and skip to the next suite.
 
 Extract:
 - `fnDeps` scaling by depth
@@ -88,8 +88,8 @@ output=$(timeout 300 node scripts/embedding-benchmark.js 2>&1)
 exit_code=$?
 ```
 
-If `exit_code` is 124: record `"timeout"` for this suite and continue.
-If `exit_code` is non-zero: record `"error: $output"` for this suite and continue.
+If `exit_code` is 124: record `"timeout"` for this suite and skip to the next suite.
+Else if `exit_code` is non-zero: record `"error: $output"` for this suite and skip to the next suite.
 
 Extract:
 - `embeddingTime` (ms)
@@ -211,7 +211,23 @@ git diff --cached --quiet -- generated/bench-check/baseline.json generated/bench
 
 ## Phase 6 — Report
 
-Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`:
+Write a human-readable report to `generated/bench-check/BENCH_REPORT_<date>.md`.
+
+**If `SAVE_ONLY` is set or no prior baseline existed (first run):** write a shortened report — omit the "Comparison vs Baseline" and "Regressions" sections since no comparison was performed:
+
+```markdown
+# Benchmark Report — <date>
+
+**Version:** X.Y.Z | **Git ref:** abc1234 | **Threshold:** $THRESHOLD%
+
+## Verdict: BASELINE SAVED — no comparison performed
+
+## Raw Results
+
+<!-- Full JSON output from each benchmark -->
+```
+
+**Otherwise (comparison was performed):** write the full report with comparison and verdict:
 
 ```markdown
 # Benchmark Report — <date>

From 23f2f76c3afa2690659cefc8927a34198ea7dc35 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 02:26:49 -0600
Subject: [PATCH 27/37] docs(roadmap): mark Phase 4 complete, update Phase 5
 progress (5 of 7)

Phase 4 (Resolution Accuracy) had all 6 sub-phases merged but status
still said "In Progress". Phase 5 (TypeScript Migration) had 5.3-5.5
merged via PRs #553, #554, #555, #566 but was listed with stale counts.
Updated both to reflect actual state: Phase 4 complete, Phase 5 at 5/7
with 76 of 283 modules migrated (~27%).
---
 docs/roadmap/ROADMAP.md | 142 +++++++++++-----------------------------
 1 file changed, 37 insertions(+), 105 deletions(-)

diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md
index 06fdbc0e..3d6af3e4 100644
--- a/docs/roadmap/ROADMAP.md
+++ b/docs/roadmap/ROADMAP.md
@@ -17,8 +17,8 @@ Codegraph is a strong local-first code graph CLI. This roadmap describes planned
 | [**2.5**](#phase-25--analysis-expansion) | Analysis Expansion | Complexity metrics, community detection, flow tracing, co-change, manifesto, boundary rules, check, triage, audit, batch, hybrid search | **Complete** (v2.7.0) |
 | [**2.7**](#phase-27--deep-analysis--graph-enrichment) | Deep Analysis & Graph Enrichment | Dataflow analysis, intraprocedural CFG, AST node storage, expanded node/edge types, extractors refactoring, CLI consolidation, interactive viewer, exports command, normalizeSymbol | **Complete** (v3.0.0) |
 | [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring (Vertical Slice) | Unified AST analysis framework, command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, builder pipeline, presentation layer, domain grouping, curated API, unified graph model, qualified names, CLI composability | **Complete** (v3.1.5) |
-| [**4**](#phase-4--resolution-accuracy) | Resolution Accuracy | Dead role sub-categories, receiver type tracking, interface/trait implementation edges, resolution precision/recall benchmarks, `package.json` exports field, monorepo workspace resolution | **In Progress** (5 of 6 complete) |
-| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | **In Progress** (32 of 269 src modules migrated; 14 stale `.js` to delete) |
+| [**4**](#phase-4--resolution-accuracy) | Resolution Accuracy | Dead role sub-categories, receiver type tracking, interface/trait implementation edges, resolution precision/recall benchmarks, `package.json` exports field, monorepo workspace resolution | **Complete** (v3.3.1) |
+| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | **In Progress** (5 of 7 complete — 76 of 283 src modules migrated, ~27%) |
 | [**6**](#phase-6--native-analysis-acceleration) | Native Analysis Acceleration | Move JS-only build phases (AST nodes, CFG, dataflow, insert nodes, structure, roles, complexity) to Rust; fix incremental rebuild data loss on native; sub-100ms 1-file rebuilds | Planned |
 | [**7**](#phase-7--runtime--extensibility) | Runtime & Extensibility | Event-driven pipeline, unified engine strategy, subgraph export filtering, transitive confidence, query caching, configuration profiles, pagination, plugin system, DX & onboarding, confidence annotations, shell completion | Planned |
 | [**8**](#phase-8--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic metadata, module summaries | Planned |
@@ -994,9 +994,9 @@ src/domain/
 
 ---
 
-## Phase 4 -- Resolution Accuracy
+## Phase 4 -- Resolution Accuracy ✅
 
-> **Status:** In Progress
+> **Status:** Complete -- all 6 sub-phases shipped across v3.2.0 → v3.3.1
 
 **Goal:** Close the most impactful gaps in call graph accuracy before investing in type safety or native acceleration. The entire value proposition — blast radius, impact analysis, dependency chains — rests on the call graph. These targeted improvements make the graph trustworthy.
 
@@ -1080,15 +1080,15 @@ npm workspaces (`package.json` `workspaces`), `pnpm-workspace.yaml`, and `lerna.
 
 ## Phase 5 -- TypeScript Migration
 
-> **Status:** In Progress — 32 of 269 source modules migrated (~12%), plus 14 stale `.js` counterparts to delete
+> **Status:** In Progress — 5 of 7 steps complete (76 of 283 source modules migrated, ~27%)
 
 **Goal:** Migrate the codebase from plain JavaScript to TypeScript, leveraging the clean module boundaries established in Phase 3. Incremental module-by-module migration starting from leaf modules inward.
 
 **Why after Phase 4:** The resolution accuracy work (Phase 4) operates on the existing JS codebase and produces immediate accuracy gains. TypeScript migration builds on Phase 3's clean module boundaries to add type safety across the entire codebase. Every subsequent phase benefits from types: MCP schema auto-generation, API contracts, refactoring safety. The Phase 4 resolution improvements (receiver tracking, interface edges) establish the resolution model that TypeScript types will formalize.
 
-**Note:** File paths below reflect the post-Phase 3 directory structure. Migration has progressed non-linearly — some orchestration modules were migrated before all leaf/core modules were complete. `.js` and `.ts` coexist during migration (`allowJs: true` in tsconfig). 14 already-migrated modules still have stale `.js` counterparts that need deletion (see cleanup note at the end of this section).
+**Note:** File paths below reflect the post-Phase 3 directory structure. `.js` and `.ts` coexist during migration (`allowJs: true` in tsconfig). Steps 5.3–5.5 completed across PRs #553, #554, #555, #566. Remaining work: test migration (5.6) and remaining `.js` source files (~207 files).
 
-**File counts (as of March 2026):** 32 `.ts` modules in `src/`, 237 `.js`-only files needing migration, 14 stale `.js` duplicates of already-migrated `.ts` files needing deletion. Remaining by step: 5.3 = 8, 5.4 = 54, 5.5 = 175 (total = 237).
+**File counts (as of March 2026):** 76 `.ts` modules in `src/`, ~207 `.js` files remaining. Steps 5.1–5.5 complete.
 
 ### ~~5.1 -- Project Setup~~ ✅
 
@@ -1112,104 +1112,36 @@ Comprehensive TypeScript type definitions for the entire domain model — symbol
 
 **New file:** `src/types.ts` ([#516](https://github.com/optave/codegraph/pull/516))
 
-### 5.3 -- Leaf Module Migration (In Progress)
-
-Migrate modules with no or minimal internal dependencies.
-
-**Migrated:**
-
-| Module | Notes |
-|--------|-------|
-| `src/shared/errors.ts` | Domain error hierarchy (Phase 3.7) |
-| `src/shared/kinds.ts` | Symbol and edge kind constants |
-| `src/shared/normalize.ts` | Symbol name normalization |
-| `src/shared/paginate.ts` | Generic pagination helpers |
-| `src/infrastructure/logger.ts` | Structured logging |
-| `src/infrastructure/result-formatter.ts` | JSON/NDJSON output formatting |
-| `src/infrastructure/test-filter.ts` | Test file detection heuristics |
-| `src/presentation/colors.ts` | ANSI color constants |
-| `src/presentation/table.ts` | CLI table formatting |
-
-**Remaining:**
-
-| Module | Notes |
-|--------|-------|
-| `src/shared/constants.js` | `EXTENSIONS`, `IGNORE_DIRS` constants |
-| `src/shared/file-utils.js` | File path utilities |
-| `src/shared/generators.js` | Generator/async iterator helpers |
-| `src/shared/hierarchy.js` | Hierarchy traversal helpers |
-| `src/infrastructure/config.js` | Config loading, env overrides, secret resolution |
-| `src/infrastructure/native.js` | Native napi-rs addon loader with WASM fallback |
-| `src/infrastructure/registry.js` | Global repo registry for multi-repo MCP |
-| `src/infrastructure/update-check.js` | npm update availability check |
-
-### 5.4 -- Core Module Migration (In Progress)
-
-Migrate modules that implement domain logic and Phase 3 interfaces.
-
-**Migrated:**
-
-| Module | Key types |
-|--------|-----------|
-| `src/graph/model.ts` | `CodeGraph` class, unified graph model |
-| `src/graph/algorithms/bfs.ts` | Breadth-first search traversal |
-| `src/graph/algorithms/centrality.ts` | Centrality metrics (degree, betweenness) |
-| `src/graph/algorithms/shortest-path.ts` | Shortest path between symbols |
-| `src/graph/algorithms/tarjan.ts` | Tarjan SCC (cycle detection) |
-| `src/graph/algorithms/leiden/rng.ts` | Random number generator for Leiden |
-| `src/graph/classifiers/risk.ts` | Risk scoring classifier |
-| `src/graph/classifiers/roles.ts` | Symbol role classifier |
-| `src/domain/graph/resolve.ts` | Import resolution with confidence types |
-
-**Remaining (54 files):**
-
-| Module | Files | Key types |
-|--------|-------|-----------|
-| `src/db/` | 18 | `Repository` interface, SQLite connection, migrations, query builder, all repository modules |
-| `src/domain/parser.js` | 1 | `Engine` interface, tree-sitter WASM wrapper, `LANGUAGE_REGISTRY` |
-| `src/domain/queries.js` | 1 | Query functions: symbol search, file deps, impact analysis, diff-impact |
-| `src/domain/analysis/` | 9 | Analysis results (context, impact, dependencies, exports, roles, etc.) |
-| `src/extractors/` | 11 | Per-language extractors (JS, TS, Go, Rust, Java, C#, PHP, Ruby, Python, HCL) + helpers + barrel |
-| `src/graph/algorithms/` | 8 | Louvain, Leiden (6 files: adapter, CPM, index, modularity, optimiser, partition), algorithms barrel |
-| `src/graph/builders/` | 4 | Dependency, structure, temporal graph builders + barrel |
-| `src/graph/classifiers/index.js` + `src/graph/index.js` | 2 | Barrel exports |
-
-### 5.5 -- Orchestration & Public API Migration (In Progress)
-
-Migrate top-level orchestration, features, and entry points.
-
-**Migrated:**
-
-| Module | Notes |
-|--------|-------|
-| `src/domain/graph/builder.ts` | Graph build orchestrator |
-| `src/domain/graph/builder/context.ts` | Build context (options, state) |
-| `src/domain/graph/builder/helpers.ts` | Builder utility functions |
-| `src/domain/graph/builder/pipeline.ts` | Pipeline stage definitions |
-| `src/domain/graph/watcher.ts` | File system events + rebuild triggers |
-| `src/domain/search/generator.ts` | Embedding vector generation |
-| `src/domain/search/index.ts` | Search module entry point |
-| `src/domain/search/models.ts` | Model management |
-| `src/mcp/index.ts` | MCP server entry point |
-| `src/mcp/middleware.ts` | MCP middleware layer |
-| `src/mcp/server.ts` | MCP server implementation |
-| `src/mcp/tool-registry.ts` | Dynamic tool list builder |
-| `src/features/export.ts` | Graph export orchestration |
-
-**Remaining (175 files):**
-
-| Module | Files | Notes |
-|--------|-------|-------|
-| `src/cli.js` + `src/cli/` | 48 | Commander CLI entry point, 43 command handlers (`commands/`), barrel, shared CLI utilities (`shared/`: open-graph, options, output) |
-| `src/index.js` | 1 | Curated public API exports |
-| `src/features/` | 20 | ast, audit, batch, boundaries, branch-compare, cfg, check, cochange, communities, complexity, dataflow, flow, graph-enrichment, manifesto, owners, sequence, snapshot, structure, triage, `shared/find-nodes` |
-| `src/presentation/` | 28 | All presentation formatters (14 files), `queries-cli/` (7 files), result-formatter, sequence-renderer, viewer, query, export, flow, brief |
-| `src/mcp/tools/` | 36 | Individual MCP tool handlers + barrel |
-| `src/domain/graph/builder/stages/` | 9 | Build pipeline stages (collect-files, parse-files, resolve-imports, etc.) |
-| `src/domain/graph/builder/incremental.js` | 1 | Incremental rebuild logic |
-| `src/domain/graph/` | 3 | `cycles.js`, `journal.js`, `change-journal.js` |
-| `src/domain/search/` | 11 | Search subsystem: `search/` (6 files), `stores/` (2 files), `strategies/` (3 files) |
-| `src/ast-analysis/` | 18 | AST analysis framework, visitors, language-specific rules |
+### ~~5.3 -- Leaf Module Migration~~ ✅
+
+Migrated 25 leaf modules (no internal dependencies) from JavaScript to TypeScript in two waves:
+
+- ✅ Wave 1 (17 modules): `shared/errors`, `shared/kinds`, `shared/normalize`, `shared/paginate`, `infrastructure/logger`, `infrastructure/result-formatter`, `infrastructure/test-filter`, `db/index`, `domain/analysis/*` (context, dependencies, exports, impact, implementations, module-map, roles, symbol-lookup), `domain/graph/cycles`, `presentation/colors`, `presentation/table` ([#553](https://github.com/optave/codegraph/pull/553))
+- ✅ Wave 2 (8 modules): `shared/constants`, `shared/file-utils`, `shared/generators`, `shared/hierarchy`, `infrastructure/config`, `infrastructure/native`, `infrastructure/registry`, `infrastructure/update-check` ([#566](https://github.com/optave/codegraph/pull/566))
+
+### ~~5.4 -- Core Module Migration~~ ✅
+
+Migrated 54 core modules that implement Phase 3 interfaces — database repository, parser engine, language extractors, import resolution, graph builders, and analysis modules.
+
+- ✅ `db/repository/*.ts` — all prepared statements typed
+- ✅ `domain/parser.ts`, `domain/graph/resolve.ts` — engine and resolution with confidence types
+- ✅ `extractors/*.ts` — all 11 language extractors
+- ✅ `domain/graph/builder/**/*.ts` — full build pipeline
+- ✅ `graph/**/*.ts` — unified graph model, algorithms (Tarjan, Louvain, Leiden, BFS, centrality, shortest-path), classifiers (role, risk), builders
+
+([#554](https://github.com/optave/codegraph/pull/554))
+
+### ~~5.5 -- Orchestration & Public API Migration~~ ✅
+
+Migrated top-level orchestration and entry points — builder pipeline, watcher, embeddings subsystem, MCP server, CLI commands, and public API index.
+
+- ✅ `domain/graph/builder.ts`, `domain/graph/watcher.ts` — pipeline stages typed
+- ✅ `domain/search/*.ts` — vector store, model registry, search modes
+- ✅ `mcp/*.ts` — tool schemas, typed handlers
+- ✅ `features/*.ts`, `presentation/*.ts` — feature modules and CLI formatters
+- ✅ `index.ts` — curated public API with proper export types
+
+([#555](https://github.com/optave/codegraph/pull/555))
 
 **JS counterpart cleanup (14 files to delete):** The following `.js` files are stale counterparts of already-migrated `.ts` files and should be deleted once all consumers import from `.ts`: `domain/graph/builder.js`, `domain/graph/builder/{context,helpers,pipeline}.js`, `domain/graph/resolve.js`, `domain/graph/watcher.js`, `domain/search/{generator,index,models}.js`, `features/export.js`, `mcp/{index,middleware,server,tool-registry}.js`
 

From 2616c788d891aac17c3d20b290cca20eff39b264 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 02:41:03 -0600
Subject: [PATCH 28/37] =?UTF-8?q?docs(roadmap):=20correct=20Phase=205=20pr?=
 =?UTF-8?q?ogress=20=E2=80=94=205.3/5.4/5.5=20still=20in=20progress?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previous commit incorrectly marked 5.3-5.5 as complete. In reality
76 of 283 src files are .ts (~27%) while 207 remain .js (~73%).
PRs #553, #554, #555, #566 migrated a first wave but left substantial
work in each step: 4 leaf files, 39 core files, 159 orchestration
files. Updated each step with accurate migrated/remaining counts.
---
 docs/roadmap/ROADMAP.md | 73 ++++++++++++++++++++++++++---------------
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md
index 3d6af3e4..4614c75e 100644
--- a/docs/roadmap/ROADMAP.md
+++ b/docs/roadmap/ROADMAP.md
@@ -18,7 +18,7 @@ Codegraph is a strong local-first code graph CLI. This roadmap describes planned
 | [**2.7**](#phase-27--deep-analysis--graph-enrichment) | Deep Analysis & Graph Enrichment | Dataflow analysis, intraprocedural CFG, AST node storage, expanded node/edge types, extractors refactoring, CLI consolidation, interactive viewer, exports command, normalizeSymbol | **Complete** (v3.0.0) |
 | [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring (Vertical Slice) | Unified AST analysis framework, command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, builder pipeline, presentation layer, domain grouping, curated API, unified graph model, qualified names, CLI composability | **Complete** (v3.1.5) |
 | [**4**](#phase-4--resolution-accuracy) | Resolution Accuracy | Dead role sub-categories, receiver type tracking, interface/trait implementation edges, resolution precision/recall benchmarks, `package.json` exports field, monorepo workspace resolution | **Complete** (v3.3.1) |
-| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | **In Progress** (5 of 7 complete — 76 of 283 src modules migrated, ~27%) |
+| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | **In Progress** (76 of 283 src files migrated, ~27%) |
 | [**6**](#phase-6--native-analysis-acceleration) | Native Analysis Acceleration | Move JS-only build phases (AST nodes, CFG, dataflow, insert nodes, structure, roles, complexity) to Rust; fix incremental rebuild data loss on native; sub-100ms 1-file rebuilds | Planned |
 | [**7**](#phase-7--runtime--extensibility) | Runtime & Extensibility | Event-driven pipeline, unified engine strategy, subgraph export filtering, transitive confidence, query caching, configuration profiles, pagination, plugin system, DX & onboarding, confidence annotations, shell completion | Planned |
 | [**8**](#phase-8--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic metadata, module summaries | Planned |
@@ -1080,15 +1080,13 @@ npm workspaces (`package.json` `workspaces`), `pnpm-workspace.yaml`, and `lerna.
 
 ## Phase 5 -- TypeScript Migration
 
-> **Status:** In Progress — 5 of 7 steps complete (76 of 283 source modules migrated, ~27%)
+> **Status:** In Progress — 76 of 283 source files migrated (~27%), 207 `.js` files remaining
 
 **Goal:** Migrate the codebase from plain JavaScript to TypeScript, leveraging the clean module boundaries established in Phase 3. Incremental module-by-module migration starting from leaf modules inward.
 
 **Why after Phase 4:** The resolution accuracy work (Phase 4) operates on the existing JS codebase and produces immediate accuracy gains. TypeScript migration builds on Phase 3's clean module boundaries to add type safety across the entire codebase. Every subsequent phase benefits from types: MCP schema auto-generation, API contracts, refactoring safety. The Phase 4 resolution improvements (receiver tracking, interface edges) establish the resolution model that TypeScript types will formalize.
 
-**Note:** File paths below reflect the post-Phase 3 directory structure. `.js` and `.ts` coexist during migration (`allowJs: true` in tsconfig). Steps 5.3–5.5 completed across PRs #553, #554, #555, #566. Remaining work: test migration (5.6) and remaining `.js` source files (~207 files).
-
-**File counts (as of March 2026):** 76 `.ts` modules in `src/`, ~207 `.js` files remaining. Steps 5.1–5.5 complete.
+**Note:** `.js` and `.ts` coexist during migration (`allowJs: true` in tsconfig). PRs #553, #554, #555, #566 migrated a first wave of files across steps 5.3–5.5, but substantial work remains in each step. 13 stale `.js` files have `.ts` counterparts and need deletion.
 
 ### ~~5.1 -- Project Setup~~ ✅
 
@@ -1112,38 +1110,61 @@ Comprehensive TypeScript type definitions for the entire domain model — symbol
 
 **New file:** `src/types.ts` ([#516](https://github.com/optave/codegraph/pull/516))
 
-### ~~5.3 -- Leaf Module Migration~~ ✅
+### 5.3 -- Leaf Module Migration (In Progress)
+
+Migrate modules with no or minimal internal dependencies. 25 migrated, 4 remaining.
+
+**Migrated (25):** `shared/errors`, `shared/kinds`, `shared/normalize`, `shared/paginate`, `shared/constants`, `shared/file-utils`, `shared/generators`, `shared/hierarchy`, `infrastructure/logger`, `infrastructure/config`, `infrastructure/native`, `infrastructure/registry`, `infrastructure/update-check`, `infrastructure/result-formatter`, `infrastructure/test-filter`, `db/repository/*` (14 files), `domain/analysis/*` (9 files), `presentation/colors`, `presentation/table` — via [#553](https://github.com/optave/codegraph/pull/553), [#566](https://github.com/optave/codegraph/pull/566)
+
+**Remaining (4):**
+
+| Module | Notes |
+|--------|-------|
+| `src/db/connection.js` | SQLite connection wrapper |
+| `src/db/index.js` | DB barrel/schema entry point |
+| `src/db/migrations.js` | Schema version management |
+| `src/db/query-builder.js` | Dynamic query builder |
 
-Migrated 25 leaf modules (no internal dependencies) from JavaScript to TypeScript in two waves:
+### 5.4 -- Core Module Migration (In Progress)
 
-- ✅ Wave 1 (17 modules): `shared/errors`, `shared/kinds`, `shared/normalize`, `shared/paginate`, `infrastructure/logger`, `infrastructure/result-formatter`, `infrastructure/test-filter`, `db/index`, `domain/analysis/*` (context, dependencies, exports, impact, implementations, module-map, roles, symbol-lookup), `domain/graph/cycles`, `presentation/colors`, `presentation/table` ([#553](https://github.com/optave/codegraph/pull/553))
-- ✅ Wave 2 (8 modules): `shared/constants`, `shared/file-utils`, `shared/generators`, `shared/hierarchy`, `infrastructure/config`, `infrastructure/native`, `infrastructure/registry`, `infrastructure/update-check` ([#566](https://github.com/optave/codegraph/pull/566))
+Migrate modules that implement domain logic and Phase 3 interfaces. Some migrated via [#554](https://github.com/optave/codegraph/pull/554), 39 files remaining.
 
-### ~~5.4 -- Core Module Migration~~ ✅
+**Migrated:** `db/repository/*.ts` (14 files), `domain/parser.ts`, `domain/graph/resolve.ts`, `extractors/*.ts` (11 files), `domain/graph/builder.ts` + `context.ts` + `helpers.ts` + `pipeline.ts`, `domain/graph/watcher.ts`, `domain/search/{generator,index,models}.ts`, `graph/model.ts`, `graph/algorithms/{bfs,centrality,shortest-path,tarjan}.ts`, `graph/algorithms/leiden/rng.ts`, `graph/classifiers/{risk,roles}.ts`
 
-Migrated 54 core modules that implement Phase 3 interfaces — database repository, parser engine, language extractors, import resolution, graph builders, and analysis modules.
+**Remaining (39):**
 
-- ✅ `db/repository/*.ts` — all prepared statements typed
-- ✅ `domain/parser.ts`, `domain/graph/resolve.ts` — engine and resolution with confidence types
-- ✅ `extractors/*.ts` — all 11 language extractors
-- ✅ `domain/graph/builder/**/*.ts` — full build pipeline
-- ✅ `graph/**/*.ts` — unified graph model, algorithms (Tarjan, Louvain, Leiden, BFS, centrality, shortest-path), classifiers (role, risk), builders
+| Module | Files | Notes |
+|--------|-------|-------|
+| `domain/graph/builder/stages/` | 9 | All 9 build pipeline stages (collect-files, parse-files, resolve-imports, build-edges, etc.) |
+| `domain/graph/builder/incremental.js` | 1 | Incremental rebuild logic |
+| `domain/graph/{cycles,journal,change-journal}.js` | 3 | Graph utilities |
+| `domain/queries.js` | 1 | Core query functions |
+| `domain/search/search/` | 6 | Search subsystem (hybrid, semantic, keyword, filters, cli-formatter, prepare) |
+| `domain/search/stores/` | 2 | FTS5, SQLite blob stores |
+| `domain/search/strategies/` | 3 | Source, structured, text-utils strategies |
+| `graph/algorithms/leiden/` | 6 | Leiden community detection (adapter, CPM, modularity, optimiser, partition, index) |
+| `graph/algorithms/{louvain,index}.js` | 2 | Louvain + algorithms barrel |
+| `graph/builders/` | 4 | Dependency, structure, temporal builders + barrel |
+| `graph/classifiers/index.js` + `graph/index.js` | 2 | Barrel exports |
 
-([#554](https://github.com/optave/codegraph/pull/554))
+### 5.5 -- Orchestration & Public API Migration (In Progress)
 
-### ~~5.5 -- Orchestration & Public API Migration~~ ✅
+Migrate top-level orchestration, features, and entry points. Some migrated via [#555](https://github.com/optave/codegraph/pull/555), 159 files remaining.
 
-Migrated top-level orchestration and entry points — builder pipeline, watcher, embeddings subsystem, MCP server, CLI commands, and public API index.
+**Migrated:** `domain/graph/builder.ts` + `context.ts` + `helpers.ts` + `pipeline.ts`, `domain/graph/watcher.ts`, `domain/search/{generator,index,models}.ts`, `mcp/{index,middleware,server,tool-registry}.ts`, `features/export.ts`, `index.ts`
 
-- ✅ `domain/graph/builder.ts`, `domain/graph/watcher.ts` — pipeline stages typed
-- ✅ `domain/search/*.ts` — vector store, model registry, search modes
-- ✅ `mcp/*.ts` — tool schemas, typed handlers
-- ✅ `features/*.ts`, `presentation/*.ts` — feature modules and CLI formatters
-- ✅ `index.ts` — curated public API with proper export types
+**Remaining (159):**
 
-([#555](https://github.com/optave/codegraph/pull/555))
+| Module | Files | Notes |
+|--------|-------|-------|
+| `cli.js` + `cli/` | 55 | Commander entry point, 43 command handlers (`commands/`), barrel, shared CLI utilities |
+| `mcp/tools/` | 36 | Individual MCP tool handlers + barrel |
+| `presentation/` | 28 | Presentation formatters (14 files), `queries-cli/` (7 files), sequence-renderer, viewer, export, etc. |
+| `features/` | 21 | audit, batch, boundaries, cfg, check, cochange, communities, complexity, dataflow, flow, graph-enrichment, manifesto, owners, sequence, snapshot, structure, triage, ast, branch-compare, `shared/find-nodes` |
+| `ast-analysis/` | 18 | AST analysis framework, visitors (4), language-specific rules (9), engine, metrics, shared, visitor-utils |
+| `index.js` | 1 | Public API exports (stale — `.ts` exists) |
 
-**JS counterpart cleanup (14 files to delete):** The following `.js` files are stale counterparts of already-migrated `.ts` files and should be deleted once all consumers import from `.ts`: `domain/graph/builder.js`, `domain/graph/builder/{context,helpers,pipeline}.js`, `domain/graph/resolve.js`, `domain/graph/watcher.js`, `domain/search/{generator,index,models}.js`, `features/export.js`, `mcp/{index,middleware,server,tool-registry}.js`
+**Stale `.js` counterparts to delete (13 files):** `domain/graph/builder.js`, `domain/graph/builder/{context,helpers,pipeline}.js`, `domain/graph/watcher.js`, `domain/search/{generator,index,models}.js`, `features/export.js`, `mcp/{index,middleware,server,tool-registry}.js` — these have `.ts` counterparts already
 
 ### 5.6 -- Test Migration
 

From 9d2b7ff1e10320ac0953a417c339fe31454569d3 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 03:00:16 -0600
Subject: [PATCH 29/37] fix(skill): ban untracked deferrals in /review skill
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The /review skill allowed replying "acknowledged as follow-up" to
reviewer comments without tracking them anywhere. These deferrals
get lost — nobody revisits PR comment threads after merge.

Now: if a fix is genuinely out of scope, the skill must create a
GitHub issue with the follow-up label before replying. The reply
must include the issue link. A matching rule in the Rules section
reinforces the ban.
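
As a rough illustration of the rule, the reply text could be produced
by a helper that refuses to run without an issue number. The function
deferral_reply is hypothetical and not part of the skill, and its
wording only approximates the reply template in the diff.

```shell
# Hypothetical helper: build the reply for a deferred review comment.
# Fails (non-zero status, nothing on stdout) unless $1 is an issue number.
deferral_reply() {
  issue=$1
  case "$issue" in
    ''|*[!0-9]*)
      echo "refusing to defer without a tracked issue" >&2
      return 1
      ;;
  esac
  echo "Out of scope for this PR, tracked in #$issue"
}
```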
---
 .claude/skills/review/SKILL.md | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index b0a7e00d..ce3ef428 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -120,7 +120,29 @@ For **each** review comment — including minor suggestions, nits, style feedbac
 2. **Read the relevant code** at the file and line referenced.
 3. **Make the change.** Even if the comment is marked as "nit" or "suggestion" or "minor" — address it. The goal is zero outstanding comments.
 4. **If you disagree** with a suggestion (e.g., it would introduce a bug or contradicts project conventions), do NOT silently ignore it. Reply to the comment explaining why you chose a different approach.
-5. **Reply to each comment** explaining what you did. The reply mechanism depends on where the comment lives:
+5. **If the fix is genuinely out of scope** for this PR (e.g., it affects a different module not touched by this PR, or requires a design decision beyond the PR's purpose), you MUST create a GitHub issue to track it before replying. Never reply with "acknowledged as follow-up" or "noted for later" without a tracked issue — untracked deferrals get lost and nobody will ever revisit them.
+
+   ```bash
+   # Create a tracking issue for the deferred item
+   gh issue create \
+     --title "follow-up: <concise description of what needs to be done>" \
+     --body "$(cat <<'EOF'
+   Deferred from PR #<number> review.
+
+   **Original reviewer comment:** https://github.com/optave/codegraph/pull/<number>#discussion_r<comment-id>
+
+   **Context:** <why this is out of scope for the current PR and what the fix entails>
+   EOF
+   )" \
+     --label "follow-up"
+   ```
+
+   Then reply to the reviewer comment referencing the issue:
+   ```bash
+   gh api repos/optave/codegraph/pulls/<number>/comments/<comment-id>/replies \
+     -f body="Out of scope for this PR — tracked in #<issue-number>"
+   ```
+6. **Reply to each comment** explaining what you did. The reply mechanism depends on where the comment lives:
 
    **For inline PR review comments** (from Claude, Greptile, or humans — these have a `path` and `line`):
    ```bash
@@ -220,3 +242,4 @@ After processing all PRs, output a summary table:
 - **One concern per commit** — don't lump conflict resolution with code fixes.
 - **Flag scope creep.** If a PR's diff contains files unrelated to its stated purpose (e.g., a docs PR carrying `src/` or test changes from a merged feature branch), flag it immediately. Split the unrelated changes into a separate branch and PR. Do not proceed with review until the PR is scoped correctly — scope creep is not acceptable.
 - If a PR is fundamentally broken beyond what review feedback can fix, note it in the summary and skip to the next PR.
+- **Never defer without tracking.** Do not reply "acknowledged as follow-up", "noted for later", or "tracking for follow-up" to a reviewer comment without creating a GitHub issue first. If you can't fix it now and it's genuinely out of scope, create an issue with the `follow-up` label and include the issue link in your reply. Untracked acknowledgements are the same as ignoring the comment — they will never be revisited.

From dbe4a736fc3ac2cd1a8d3966e21817a9e65537c0 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 03:13:02 -0600
Subject: [PATCH 30/37] fix(skill): add --repo flag, multi-endpoint reply for
 deferrals, and scope guidance (#568)

---
 .claude/skills/review/SKILL.md | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index ce3ef428..86e1c92e 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -120,11 +120,12 @@ For **each** review comment — including minor suggestions, nits, style feedbac
 2. **Read the relevant code** at the file and line referenced.
 3. **Make the change.** Even if the comment is marked as "nit" or "suggestion" or "minor" — address it. The goal is zero outstanding comments.
 4. **If you disagree** with a suggestion (e.g., it would introduce a bug or contradicts project conventions), do NOT silently ignore it. Reply to the comment explaining why you chose a different approach.
-5. **If the fix is genuinely out of scope** for this PR (e.g., it affects a different module not touched by this PR, or requires a design decision beyond the PR's purpose), you MUST create a GitHub issue to track it before replying. Never reply with "acknowledged as follow-up" or "noted for later" without a tracked issue — untracked deferrals get lost and nobody will ever revisit them.
+5. **If the fix is genuinely out of scope** for this PR, you MUST create a GitHub issue to track it before replying. Never reply with "acknowledged as follow-up" or "noted for later" without a tracked issue — untracked deferrals get lost and nobody will ever revisit them. "Genuinely out of scope" means the fix touches a different module not in the PR's diff, requires an architectural decision beyond the PR's mandate, or would introduce unrelated risk. Fixing a variable name, adding a null check, or adjusting a string in a file already in the diff is NOT out of scope — just do it.
 
    ```bash
    # Create a tracking issue for the deferred item
    gh issue create \
+     --repo optave/codegraph \
      --title "follow-up: <concise description of what needs to be done>" \
      --body "$(cat <<'EOF'
    Deferred from PR #<number> review.
@@ -137,10 +138,14 @@ For **each** review comment — including minor suggestions, nits, style feedbac
      --label "follow-up"
    ```
 
-   Then reply to the reviewer comment referencing the issue:
+   Then reply to the reviewer comment referencing the issue. Use the same reply mechanism as step 6 below — inline PR review comments use `/pulls/<number>/comments/<comment-id>/replies`, top-level review bodies and issue-style comments use `/issues/<number>/comments`:
    ```bash
+   # For inline PR review comments:
    gh api repos/optave/codegraph/pulls/<number>/comments/<comment-id>/replies \
      -f body="Out of scope for this PR — tracked in #<issue-number>"
+   # For top-level review bodies or issue-style comments:
+   gh api repos/optave/codegraph/issues/<number>/comments \
+     -f body="Out of scope for this PR — tracked in #<issue-number>"
    ```
 6. **Reply to each comment** explaining what you did. The reply mechanism depends on where the comment lives:
 

From ca1d3697222608f9c352a9b2f0b016c02f1b4850 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 03:59:58 -0600
Subject: [PATCH 31/37] fix(skill): guard follow-up label creation before gh
 issue create (#568)

gh issue create --label "follow-up" fails if the label doesn't exist
in the repo. Add a gh label create guard step that is safe to re-run.
---
 .claude/skills/review/SKILL.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index 86e1c92e..2a2249ce 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -123,6 +123,9 @@ For **each** review comment — including minor suggestions, nits, style feedbac
 5. **If the fix is genuinely out of scope** for this PR, you MUST create a GitHub issue to track it before replying. Never reply with "acknowledged as follow-up" or "noted for later" without a tracked issue — untracked deferrals get lost and nobody will ever revisit them. "Genuinely out of scope" means the fix touches a different module not in the PR's diff, requires an architectural decision beyond the PR's mandate, or would introduce unrelated risk. Fixing a variable name, adding a null check, or adjusting a string in a file already in the diff is NOT out of scope — just do it.
 
    ```bash
+   # Ensure the follow-up label exists (safe to re-run)
+   gh label create "follow-up" --color "0e8a16" --description "Deferred from PR review" --repo optave/codegraph 2>/dev/null || true
+
    # Create a tracking issue for the deferred item
    gh issue create \
      --repo optave/codegraph \

From 424980e54481aa58535875e87c97c561c357873c Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 04:28:10 -0600
Subject: [PATCH 32/37] feat(skill): parallelize /review with one subagent per
 PR

---
 .claude/skills/review/SKILL.md | 56 +++++++++++++++++++++++++---------
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index 2a2249ce..fe941936 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -26,21 +26,30 @@ Record each PR's number, branch, base, merge status, and CI state.
 
 ---
 
-## Step 2: Process Each PR
+## Step 2: Launch Parallel Subagents
 
-For **each** open PR, perform the following steps in order. Process PRs one at a time to avoid cross-contamination.
+Each PR is independent work — **launch one Agent subagent per PR, all in parallel.** Use `isolation: "worktree"` so each agent gets its own copy of the repo with no cross-PR contamination.
 
-### 2a. Switch to the PR branch
+Pass each agent the full PR processing instructions (Steps 2a–2i below) along with the PR number, branch, base, and current state from Step 1. The agent prompt must include **all** the rules from the Rules section at the bottom of this skill.
 
-Ensure the working tree is clean before switching to avoid cross-PR contamination:
-
-```bash
-if [ -n "$(git status --porcelain)" ]; then
-  git stash push -m "pre-checkout stash"
-fi
 ```
+For each PR, launch an Agent with:
+- description: "Review PR #<number>"
+- isolation: "worktree"
+- prompt: <the full PR processing instructions below, with PR details filled in>
+```
+
+Launch **all** PR agents in a single message (one tool call per PR) so they run concurrently. Do NOT wait for one to finish before starting the next.
+
+Each agent will return a result summary. Collect all results for the final summary table in Step 3.
+
+---
+
+## PR Processing Instructions (for each subagent)
 
-Then check out the PR branch:
+The following steps are executed by each subagent for its assigned PR.
+
+### 2a. Check out the PR branch
 
 ```bash
 gh pr checkout <number>
@@ -191,7 +200,7 @@ After addressing all comments for a PR:
 
 ### 2g. Re-trigger reviewers
 
-**Greptile:** Before re-triggering, check if your last reply to Greptile already has a positive emoji reaction (👍, ✅, 🎉, etc.) from `greptileai`. A positive reaction means Greptile is satisfied with your fix — do NOT re-trigger in that case, move on. Only re-trigger if there is no positive reaction on your last comment:
+**Greptile:** Before re-triggering, check if your last reply to Greptile already has a positive emoji reaction (thumbs up, check, party, etc.) from `greptileai`. A positive reaction means Greptile is satisfied with your fix — do NOT re-trigger in that case, move on. Only re-trigger if there is no positive reaction on your last comment:
 
 ```bash
 # Check reactions on your most recent comment to see if Greptile already approved
@@ -219,14 +228,29 @@ After re-triggering:
 1. Wait for the new reviews to come in (check after a reasonable interval).
 2. Fetch new comments again (repeat Step 2d).
 3. If there are **new** comments from Greptile or Claude, go back to Step 2e and address them.
-4. **Repeat this loop for a maximum of 3 rounds.** If after 3 rounds there are still actionable comments, mark the PR as "needs human review" in the summary table and move to the next PR.
+4. **Repeat this loop for a maximum of 3 rounds.** If after 3 rounds there are still actionable comments, set the PR's Status to `needs-human-review` in the result.
 5. Verify CI is still green after all changes.
 
+### 2i. Return result
+
+At the end of processing, the subagent MUST return a structured result with these fields so the main agent can build the summary table:
+
+```
+PR: #<number>
+Branch: <branch-name>
+Conflicts: resolved | none
+CI: green | red | pending
+Comments Addressed: <count>
+Reviewers Re-triggered: <list>
+Status: ready | needs-work | needs-human-review | skipped
+Notes: <any issues encountered>
+```
+
 ---
 
-## Step 3: Summary
+## Step 3: Collect Results and Summarize
 
-After processing all PRs, output a summary table:
+After **all** subagents complete, collect their results and output a summary table:
 
 ```
 | PR | Branch | Conflicts | CI | Comments Addressed | Reviewers Re-triggered | Status |
@@ -234,6 +258,8 @@ After processing all PRs, output a summary table:
 | #N | branch | resolved/none | green/red | N comments | greptile, claude | ready/needs-work |
 ```
 
+If any subagent failed or returned an error, note it in the Status column as `agent-error` with the failure reason.
+
 ---
 
 ## Rules
@@ -242,7 +268,7 @@ After processing all PRs, output a summary table:
 - **Never force-push** unless fixing a commit message that fails commitlint. Amend + force-push is the only way to fix a pushed commit title (messages are part of the SHA). This is safe on feature branches. For all other problems, fix with a new commit.
 - **Address ALL comments from ALL reviewers** (Claude, Greptile, and humans), even minor/nit/optional ones. Leave zero unaddressed. Do not only respond to one reviewer and skip another.
 - **Always reply to comments** explaining what was done. Don't just fix silently. Every reviewer must see a reply on their feedback.
-- **Don't re-trigger Greptile if already approved.** If your last reply to a Greptile comment has a positive emoji reaction (👍, ✅, 🎉) from `greptileai`, it's already satisfied — skip re-triggering.
+- **Don't re-trigger Greptile if already approved.** If your last reply to a Greptile comment has a positive emoji reaction from `greptileai`, it's already satisfied — skip re-triggering.
 - **Only re-trigger Claude** if you addressed Claude's feedback specifically.
 - **No co-author lines** in commit messages.
 - **No Claude Code references** in commit messages or comments.

From 84acbc76eb1ee1f887b2b5a9555544a7344883a8 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 04:46:21 -0600
Subject: [PATCH 33/37] fix: correct heredoc terminator indentation in review
 skill (#568)

---
 .claude/skills/review/SKILL.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index fe941936..7312006d 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -139,13 +139,13 @@ For **each** review comment — including minor suggestions, nits, style feedbac
    gh issue create \
      --repo optave/codegraph \
      --title "follow-up: <concise description of what needs to be done>" \
-     --body "$(cat <<'EOF'
-   Deferred from PR #<number> review.
+     --body "$(cat <<-'EOF'
+	Deferred from PR #<number> review.
 
-   **Original reviewer comment:** https://github.com/optave/codegraph/pull/<number>#discussion_r<comment-id>
+	**Original reviewer comment:** https://github.com/optave/codegraph/pull/<number>#discussion_r<comment-id>
 
-   **Context:** <why this is out of scope for the current PR and what the fix entails>
-   EOF
+	**Context:** <why this is out of scope for the current PR and what the fix entails>
+	EOF
    )" \
      --label "follow-up"
    ```

From 7643a795c82d9a9f1e0b346da1062acaa122e4fd Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 14:25:58 -0600
Subject: [PATCH 34/37] fix(skill): capture gh issue create output before
 referencing issue number
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

gh issue create prints the new issue URL to stdout — capture it and
extract the number so reply templates can reference it unambiguously.
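
As a minimal sketch of the extraction step in isolation, assuming the
captured URL ends in the issue number (the URL and the number 123 below
are hypothetical placeholders, not real command output):

```shell
# Hypothetical stand-in for the URL that `gh issue create` prints to stdout
issue_url="https://github.com/optave/codegraph/issues/123"
# Keep only the trailing run of digits, i.e. the issue number
issue_number=$(echo "$issue_url" | grep -oE '[0-9]+$')
echo "tracked in #$issue_number"
```

Matching only the trailing digits means the extraction does not depend on
the host or the owner/repo portion of the URL.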
---
 .claude/skills/review/SKILL.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index f415eb15..ab8f7d3b 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -135,8 +135,8 @@ For **each** review comment — including minor suggestions, nits, style feedbac
    # Ensure the follow-up label exists (safe to re-run)
    gh label create "follow-up" --color "0e8a16" --description "Deferred from PR review" --repo optave/codegraph 2>/dev/null || true
 
-   # Create a tracking issue for the deferred item
-   gh issue create \
+   # Create a tracking issue for the deferred item and capture the issue number
+   issue_url=$(gh issue create \
      --repo optave/codegraph \
      --title "follow-up: <concise description of what needs to be done>" \
      --body "$(cat <<-'EOF'
@@ -147,17 +147,18 @@ For **each** review comment — including minor suggestions, nits, style feedbac
 	**Context:** <why this is out of scope for the current PR and what the fix entails>
 	EOF
    )" \
-     --label "follow-up"
+     --label "follow-up")
+   issue_number=$(echo "$issue_url" | grep -oE '[0-9]+$')
    ```
 
-   Then reply to the reviewer comment referencing the issue. Use the same reply mechanism as step 6 below — inline PR review comments use `/pulls/<number>/comments/<comment-id>/replies`, top-level review bodies and issue-style comments use `/issues/<number>/comments`:
+   Then reply to the reviewer comment referencing the issue (using `$issue_number` captured above). Use the same reply mechanism as step 6 below — inline PR review comments use `/pulls/<number>/comments/<comment-id>/replies`, top-level review bodies and issue-style comments use `/issues/<number>/comments`:
    ```bash
    # For inline PR review comments:
    gh api repos/optave/codegraph/pulls/<number>/comments/<comment-id>/replies \
-     -f body="Out of scope for this PR — tracked in #<issue-number>"
+     -f body="Out of scope for this PR — tracked in #$issue_number"
    # For top-level review bodies or issue-style comments:
    gh api repos/optave/codegraph/issues/<number>/comments \
-     -f body="Out of scope for this PR — tracked in #<issue-number>"
+     -f body="Out of scope for this PR — tracked in #$issue_number"
    ```
 6. **Reply to each comment** explaining what you did. The reply mechanism depends on where the comment lives:
 

From f4978aa2482387a8305302e5e06fa8a295d4c907 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 18:54:42 -0600
Subject: [PATCH 35/37] fix(skill): surface follow-up issues in review result
 format and summary table

Add "Issues Created" field to the subagent result format and an "Issues" column
to the Step 3 summary table, so deferred out-of-scope items are visible in the
final report.
---
 .claude/skills/review/SKILL.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index ab8f7d3b..5890ac44 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -242,6 +242,7 @@ Branch: <branch-name>
 Conflicts: resolved | none
 CI: green | red | pending
 Comments Addressed: <count>
+Issues Created: <comma-separated list of #<n> follow-up issues, or "none">
 Reviewers Re-triggered: <list>
 Status: ready | needs-work | needs-human-review | skipped
 Notes: <any issues encountered>
@@ -254,9 +255,9 @@ Notes: <any issues encountered>
 After **all** subagents complete, collect their results and output a summary table:
 
 ```
-| PR | Branch | Conflicts | CI | Comments Addressed | Reviewers Re-triggered | Status |
-|----|--------|-----------|----|--------------------|----------------------|--------|
-| #N | branch | resolved/none | green/red | N comments | greptile, claude | ready/needs-work |
+| PR | Branch | Conflicts | CI | Comments Addressed | Issues | Reviewers Re-triggered | Status |
+|----|--------|-----------|----|--------------------|--------|----------------------|--------|
+| #N | branch | resolved/none | green/red | N comments | #X, #Y or none | greptile, claude | ready/needs-work |
 ```
 
 If any subagent failed or returned an error, note it in the Status column as `agent-error` with the failure reason.

From e671711381d51dbd564cfa76a92c981390903671 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 19:15:17 -0600
Subject: [PATCH 36/37] fix(skill): require verbatim rule propagation to
 subagents

---
 .claude/skills/review/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index 5890ac44..17d994b7 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -30,7 +30,7 @@ Record each PR's number, branch, base, merge status, and CI state.
 
 Each PR is independent work — **launch one Agent subagent per PR, all in parallel.** Use `isolation: "worktree"` so each agent gets its own copy of the repo with no cross-PR contamination.
 
-Pass each agent the full PR processing instructions (Steps 2a–2i below) along with the PR number, branch, base, and current state from Step 1. The agent prompt must include **all** the rules from the Rules section at the bottom of this skill.
+Pass each agent the full PR processing instructions (Steps 2a–2i below) along with the PR number, branch, base, and current state from Step 1. The agent prompt must include **all** the rules from the Rules section at the bottom of this skill — copy them **verbatim**, do not paraphrase or summarize.
 
 ```
 For each PR, launch an Agent with:

From 6577f9114897b017f6eef134092101169bd05f41 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 23 Mar 2026 19:52:01 -0600
Subject: [PATCH 37/37] fix(skill): align Issues field name between 2i result
 and Step 3 table (#568)

---
 .claude/skills/review/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.claude/skills/review/SKILL.md b/.claude/skills/review/SKILL.md
index 17d994b7..ce8474ab 100644
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -242,7 +242,7 @@ Branch: <branch-name>
 Conflicts: resolved | none
 CI: green | red | pending
 Comments Addressed: <count>
-Issues Created: <comma-separated list of #<n> follow-up issues, or "none">
+Issues: <comma-separated list of #<n> follow-up issues, or "none">
 Reviewers Re-triggered: <list>
 Status: ready | needs-work | needs-human-review | skipped
 Notes: <any issues encountered>