feat: /titan-run orchestrator with diff review, semantic assertions, arch snapshots #557
carlos-alm wants to merge 66 commits into main from
Conversation
…in backlog

These two items deliver the highest immediate impact on agent experience and graph accuracy without requiring Rust porting or TypeScript migration. They should be implemented before any Phase 4+ roadmap work.
- #83: hook-optimized `codegraph brief` enriches passively-injected context
- #71: basic type inference closes the biggest resolution gap for TS/Java
Impact: 14 functions changed, 0 affected
Add new Phase 4 covering the port of JS-only build phases to Rust:
- 4.1-4.3: AST nodes, CFG, dataflow visitor ports (~587ms savings)
- 4.4: Batch SQLite inserts (~143ms)
- 4.5: Role classification & structure (~42ms)
- 4.6: Complete complexity pre-computation
- 4.7: Fix incremental rebuild data loss on native engine
- 4.8: Incremental rebuild performance (target sub-100ms)

Bump old Phases 4-10 to 5-11 with all cross-references updated. Benchmark evidence shows ~50% of native build time is spent in JS visitors that run identically on both engines.
Take main's corrected #57 section anchors; keep HEAD's v2.7.0 version reference. Impact: 10 functions changed, 11 affected
…ative-acceleration

Impact: 25 functions changed, 46 affected
- Add COMMITS=0 guard in publish.yml to return clean version when HEAD is exactly at a tag (mirrors bench-version.js early return)
- Change bench-version.js to use PATCH+1-dev.COMMITS format instead of PATCH+COMMITS-dev.SHA (mirrors publish.yml's new scheme)
- Fix fallback in bench-version.js to use dev.1 matching publish.yml's no-tags COMMITS=1 default

Impact: 1 function changed, 0 affected
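The versioning scheme described above can be sketched in a few lines — the helper name and the assumption that tags look like `v<major>.<minor>.<patch>` are illustrative, not part of the PR:

```python
def dev_version(tag: str, commits: int) -> str:
    # Hypothetical helper illustrating the described scheme: if HEAD is exactly
    # at the tag (COMMITS=0), return the clean tag version; otherwise bump PATCH
    # by one and append the -dev.<commits> pre-release suffix.
    major, minor, patch = (int(p) for p in tag.lstrip("v").split("."))
    if commits == 0:
        return f"{major}.{minor}.{patch}"
    return f"{major}.{minor}.{patch + 1}-dev.{commits}"
```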
The release skill now scans commit history using conventional commit rules to determine major/minor/patch automatically. Explicit version argument still works as before.
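The bump selection described above can be sketched as follows — the function name and the simplified parsing rules are assumptions, not the skill's actual implementation:

```python
def bump_from_commits(messages):
    # Hypothetical sketch: scan conventional-commit messages and return the
    # highest-priority bump level. A breaking change ("!" in the type prefix or
    # a BREAKING CHANGE footer) forces major; "feat" outranks fix/chore.
    bump = "patch"
    for msg in messages:
        header = msg.split("\n", 1)[0]
        type_part = header.split(":", 1)[0]
        if "!" in type_part or "BREAKING CHANGE" in msg:
            return "major"
        if type_part.startswith("feat"):
            bump = "minor"
    return bump
```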
…ns, and architectural snapshot

Add /titan-run skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling end-to-end autonomous execution.

Hardening layers added across the pipeline:
- Pre-Agent Gate (G1-G4): git health, worktree validity, state integrity, backups
- Post-phase validation (V1-V15): artifact structure, coverage, consistency checks
- Stall detection with per-phase thresholds and no-progress abort
- Mandatory human checkpoint before forge (unless --yes)

New validation tools integrated into forge and gate:
- Diff Review Agent (forge Step 9): verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs
- Semantic Assertions (gate Step 5): export signature stability, import resolution integrity, dependency direction, re-export chain validation
- Architectural Snapshot Comparator (gate Step 5.5): community stability, cross-domain dependency direction, cohesion delta, drift detection vs pre-forge baseline
Greptile Summary

This PR introduces the /titan-run orchestrator skill.
This PR has been through an extensive review cycle and addresses a very large number of previously flagged issues. The only remaining inaccuracy found is in the README command table listings.

Confidence Score: 4/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant TitanRun as /titan-run (Orchestrator)
    participant Recon as /titan-recon (Sub-agent)
    participant Gauntlet as /titan-gauntlet (Sub-agent, loop)
    participant Sync as /titan-sync (Sub-agent)
    participant Forge as /titan-forge (Sub-agent, loop)
    participant Gate as /titan-gate (Sub-agent, per-commit)
    User->>TitanRun: /titan-run [--yes] [--start-from phase]
    TitanRun->>TitanRun: Step 0: Pre-flight (worktree, args, state check)
    TitanRun->>TitanRun: Step 0.5: Artifact pre-validation (if --start-from / --skip-*)
    TitanRun->>TitanRun: G1-G4: Pre-Agent Gate
    TitanRun->>Recon: Dispatch (Step 1)
    Recon-->>TitanRun: titan-state.json, GLOBAL_ARCH.md
    TitanRun->>TitanRun: V1-V4: Validate recon artifacts
    loop Gauntlet iterations (stall limit: 3)
        TitanRun->>TitanRun: G1-G4: Pre-Agent Gate
        TitanRun->>Gauntlet: Dispatch (Step 2)
        Gauntlet-->>TitanRun: gauntlet.ndjson (incremental NDJSON)
        TitanRun->>TitanRun: Progress/stall check, NDJSON integrity
    end
    TitanRun->>TitanRun: V5-V7: Validate gauntlet artifacts
    TitanRun->>TitanRun: G1-G4: Pre-Agent Gate
    TitanRun->>Sync: Dispatch (Step 3)
    Sync-->>TitanRun: sync.json
    TitanRun->>TitanRun: V8-V10: Validate sync artifacts
    TitanRun->>TitanRun: Step 3.5a: Capture arch snapshot (codegraph communities/structure/drift)
    TitanRun->>User: Step 3.5b: FORGE CHECKPOINT — confirm to proceed
    User-->>TitanRun: Confirmed (or --yes)
    loop Forge iterations (stall limit: 2)
        TitanRun->>TitanRun: G1-G4: Pre-Agent Gate
        TitanRun->>Forge: Dispatch --phase N (Step 4)
        loop Per target in phase
            Forge->>Forge: Stage files (Step 8)
            Forge->>Forge: Diff review D1-D5 (Step 9)
            Forge->>Forge: Run tests (Step 10)
            Forge->>Gate: /titan-gate (Step 11)
            Gate->>Gate: Structural + Semantic + Arch checks
            Gate-->>Forge: PASS / FAIL(test) / FAIL(semantic)
            Forge->>Forge: Commit or rollback (Steps 12/13)
        end
        Forge-->>TitanRun: titan-state.json updated
        TitanRun->>TitanRun: V11-V13: Stall check, commit audit, post-phase tests
    end
    TitanRun->>TitanRun: V14-V15: Final state + gate-log consistency
    TitanRun->>User: Step 5: Final pipeline report
```
Reviews (23): Last reviewed commit: "fix(titan-run): guard NDJSON integrity c..."
```
if currentAuditedCount == previousAuditedCount:
    stallCount += 1
    Print: "WARNING: Gauntlet iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)"
    if stallCount >= maxStalls:
        Stop: "Gauntlet stalled for <maxStalls> consecutive iterations at <currentAuditedCount>/<expectedTargetCount> targets. Likely stuck on a problematic target. Check gauntlet.ndjson for the last successful entry and investigate the next target in the batch."
```
Undefined variable `previousAuditedCountBeforeAgent` in gauntlet efficiency check
The efficiency check references previousAuditedCountBeforeAgent, which is never defined in the pseudocode. By the time this line is reached, previousAuditedCount has already been updated to currentAuditedCount on the line just above, so using it there would always yield 0.
To correctly compute how many targets this iteration processed, you need to capture the pre-agent count before the update. The current pseudocode reads:

```
previousAuditedCount = currentAuditedCount   # update for next iteration's stall check
# Efficiency check: if progress is very slow (< 2 targets per iteration), warn
targetsThisIteration = currentAuditedCount - previousAuditedCountBeforeAgent   # ← undefined
```

Should be:

```
# Save count before update for efficiency check
countBeforeUpdate = previousAuditedCount
previousAuditedCount = currentAuditedCount   # update for next iteration's stall check
# Efficiency check
targetsThisIteration = currentAuditedCount - countBeforeUpdate
if targetsThisIteration == 1 and iteration > 3:
    Print: "WARNING: Only 1 target per iteration..."
```
This same issue exists in the identical copy at docs/examples/claude-code-skills/titan-run/SKILL.md at the same line.
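The corrected accounting can be exercised as a small runnable function — the function wrapper and return shape are assumptions for illustration; the variable roles follow the pseudocode above:

```python
def gauntlet_progress(audited_counts, max_stalls=3):
    # Replays the corrected stall/efficiency accounting. `audited_counts` is
    # the total audited-target count observed after each sub-agent iteration
    # (previous count assumed to start at 0). Returns the per-iteration target
    # deltas and whether the loop stalled out.
    previous_audited = 0
    stall_count = 0
    per_iteration = []
    for current in audited_counts:
        if current == previous_audited:
            stall_count += 1
            if stall_count >= max_stalls:
                return per_iteration, "stalled"
        else:
            stall_count = 0
        count_before_update = previous_audited   # saved BEFORE reassignment
        previous_audited = current               # update for next stall check
        per_iteration.append(current - count_before_update)
    return per_iteration, "ok"
```

Saving `count_before_update` before the reassignment is exactly the fix: without it, `current - previous_audited` after the update would always be 0.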
Fixed — saved pre-update count to countBeforeUpdate before reassignment. Applied to both .claude/skills/ and docs/examples/ copies.
Impact: 1662 functions changed, 760 affected
Resolved merge conflicts with main and fixed the undefined previousAuditedCountBeforeAgent variable in the gauntlet loop efficiency check.
> If any SHA doesn't exist → **WARN**: "Commit <sha> recorded in state but not found in git history. State may be out of sync."
>
> **V15. Gate log consistency:**
> If `.codegraph/titan/gate-log.ndjson` exists:
Incomplete merge conflict detection — misses AU, UA, DU, UD markers
The check only lists UU, AA, DD as conflict indicators in git status --porcelain output. Git also produces AU (added by us, modified by them), UA (modified by us, added by them), DU (deleted by us, modified by them), and UD (modified by us, deleted by them) in the porcelain format for merge conflicts. Missing these cases could allow the pipeline to proceed with unresolved conflicts into a forge sub-agent that commits.
Suggested:

> - **Merge conflicts** (lines starting with `UU`, `AA`, `DD`, `AU`, `UA`, `DU`, `UD`): Stop immediately: "Unresolved merge conflict detected. Resolve before continuing."
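The broadened check reduces to a few lines — the helper name is hypothetical; git's porcelain format puts the two-letter XY status at the start of each line:

```python
# Unmerged XY codes in `git status --porcelain` output, including the
# AU/UA/DU/UD cases the original check missed.
CONFLICT_CODES = {"UU", "AA", "DD", "AU", "UA", "DU", "UD"}

def has_merge_conflicts(porcelain_output: str) -> bool:
    # True if any status line carries an unmerged two-letter code.
    return any(line[:2] in CONFLICT_CODES for line in porcelain_output.splitlines())
```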
Fixed — added AU, UA, DU, UD to the merge conflict detection markers in both .claude/skills/ and docs/examples/ copies.
> Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5).
> The arch snapshot is at .codegraph/titan/arch-snapshot.json.
> 6. Commit on success, rollback on failure
--start-from forge silently disables architectural comparison without warning
When using --start-from forge, Step 3.5a (architectural snapshot capture) is skipped entirely since it lives inside Step 3.5, which comes after sync. The gate's Step 5.5 gracefully handles a missing arch-snapshot.json by skipping the comparison, but the user receives no warning that the architectural drift check will be inactive for the entire forge run.
Consider adding an explicit notice at the start of the forge loop when arch-snapshot.json is absent and the start phase was forge:
```
if startPhase == "forge" and not exists(".codegraph/titan/arch-snapshot.json"):
    Print: "NOTE: No arch-snapshot.json found. Architectural comparison in /titan-gate (Step 5.5) will be skipped for this run.
            To enable it, run '/titan-run --start-from sync' to re-capture the pre-forge snapshot."
```
Fixed — added a pre-loop check in Step 4a that warns when arch-snapshot.json is missing, advising the user to run --start-from sync to re-capture it.
- Fix undefined previousAuditedCountBeforeAgent variable in gauntlet efficiency check (save pre-update count before reassignment)
- Add AU, UA, DU, UD to merge conflict detection markers
- Add warning when --start-from forge runs without arch-snapshot.json
.claude/skills/titan-forge/SKILL.md
Outdated
> 13. **On failure (test, gate, or diff-review):**
>     ```bash
>     git checkout -- <changed files>
>     ```
Incomplete rollback — staged files not unstaged in Step 13
The new flow stages files in Step 8 (git add) before running tests (Step 10) and gate (Step 11). When a test or gate failure routes to Step 13, only the working-tree restoration is performed:

```bash
git checkout -- <changed files>
```

This restores the file contents on disk but does not remove the files from the staging area. The changes remain indexed in git. If the orchestrator loops back and the next target's git add picks up different files, the leftover staged changes from the failed target will be silently included in the next commit.
The rollback should unstage first, then restore the working tree:
Suggested:

> 13. **On failure (test, gate, or diff-review):**
>     ```bash
>     git reset HEAD <changed files>
>     git checkout -- <changed files>
>     ```
Note: the DIFF FAIL path in Step 9 already says "Unstage and revert changes" correctly — this gap only affects Step 13. The same fix is needed in the docs mirror at docs/examples/claude-code-skills/titan-forge/SKILL.md.
Fixed — added unstaging step before working tree restore in Step 13 rollback, so staged files are properly cleared before checkout. Applied to both .claude/skills/ and docs/examples/ copies.
Both commands are called in titan-forge's new diff review step (Step 9): fn-impact for deletion audit (D4) and exports for re-export chain checks.
Addressed the remaining open item from the summary — the README command table (P2).
> | `codegraph exports` | GAUNTLET | Per-symbol export consumers |
> | `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius |
> | `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) |
Command table stale after new diff-review and semantic-assertion additions
Three commands were removed from FORGE/GATE in the table, but the new steps introduced in this PR use them:
- `codegraph fn-impact` — listed as "GAUNTLET, SYNC" only, but is still called in titan-forge Step 4 (pre-touch understanding) and in the new Step 9 D4 (deletion audit). It's also now used in titan-gate Step 5a. Both missing users (FORGE, GATE) should be added.
- `codegraph context` — listed as "SYNC" only, but titan-forge Step 4 still calls `codegraph context <target> -T --json` (this code is unchanged and confirmed in the current file). FORGE should be restored.
- `codegraph exports` — listed as "GAUNTLET" only, but the new titan-gate Step 5a explicitly runs `codegraph exports <changed-file> -T --json`. GATE should be added.
The same table appears in .claude/skills/ (mirrored) so the same correction applies there.
Suggested correction:
| | `codegraph exports` | GAUNTLET | Per-symbol export consumers | | |
| | `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius | | |
| | `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) | | |
| | `codegraph exports` | GAUNTLET, GATE | Per-symbol export consumers | | |
| | `codegraph fn-impact` | GAUNTLET, SYNC, FORGE, GATE | Blast radius | |
And separately restore FORGE to the codegraph context row:
| `codegraph context` | SYNC, FORGE | Full function context |
Fixed — added GATE to exports and fn-impact rows, added FORGE to context row in the command table. Verified against actual usage in titan-gate Step 5a and titan-forge Steps 4 and 9.
Fixed — added GATE to codegraph communities, codegraph structure, and codegraph deps in the command table. These are all used by GATE Step 5.5 (architectural snapshot comparison) but were missing from the "Used by" column.
> 5. Run /titan-gate — read .claude/skills/titan-gate/SKILL.md and follow it exactly.
>    Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5).
>    The arch snapshot is at .codegraph/titan/arch-snapshot.json.
> 6. Commit on success, rollback on failure
--yes description inaccurate — forge no longer accepts this flag
The argument hint says --yes is "passed through to forge", but this PR removes --yes from titan-forge's supported arguments list (the --yes → skip confirmation prompt line was deleted from titan-forge). The flag is now passed to forge's dispatch invocation (line ~835 of this file: yesFlag = "--yes" if autoConfirm else ""), but forge silently ignores it since it no longer handles that argument.
--yes still has meaningful effects within the orchestrator itself (auto-resume on existing state, skip the pre-pipeline and forge-checkpoint confirmations), so it should stay — but the description should be updated to reflect its actual scope.
Suggested:

> - `--yes` → skip all confirmation prompts in the orchestrator (pre-pipeline, forge checkpoint, and resume prompts)
Fixed — updated the --yes description to reflect its actual scope: orchestrator-level prompts only (pre-pipeline, forge checkpoint, and resume). Applied to both .claude/skills/ and docs/examples/ copies.
Fixed — updated the Rules section description of --yes to clarify both purposes: (1) passed through to forge so it skips its per-phase confirmation prompt, and (2) within the orchestrator itself, skips pre-pipeline, forge checkpoint, and resume prompts. Updated in both .claude/skills/titan-run/SKILL.md and the example copy.
.claude/skills/titan-run/SKILL.md
Outdated
```
if newCompletedPhases == previousCompletedPhases:
    stallCount += 1
    Print: "WARNING: Forge iteration <iteration> did not complete phase <nextPhase> (stall <stallCount>/<maxStalls>)"
    if stallCount >= maxStalls:
        Stop: "Forge stalled on phase <nextPhase> for <maxStalls> consecutive iterations. Check titan-state.json → execution.failedTargets for details."
else:
    stallCount = 0
```
Forge stall detection ignores within-phase target progress
stallCount increments whenever a forge sub-agent returns without completing a full phase — but forge is explicitly designed to be resumable within a phase. If a phase has more targets than a single sub-agent context window can process, the agent will complete N targets, save state, and return. Because the phase number never appears in completedPhases until every target in it is done, newCompletedPhases == previousCompletedPhases evaluates true on every single iteration — and with maxStalls = 2, the orchestrator aborts after just two sub-agent invocations, even though real per-target progress is being made on each one.
Concrete failure scenario: a phase with 15 targets where each sub-agent processes 5. After iteration 1 (5 done, phase still open) → stallCount = 1. After iteration 2 (10 done, phase still open) → stallCount = 2 ≥ maxStalls → pipeline aborts with "Forge stalled."
Fix: also track previousCompletedTargets and only increment stallCount when both phases AND targets are unchanged:
```
previousCompletedPhases = execution.completedPhases (or [])
previousCompletedTargets = execution.completedTargets (or [])   # add this
iteration = 0
while iteration < maxIterations:
    ...
    newCompletedPhases = execution.completedPhases (or [])
    newCompletedTargets = execution.completedTargets (or [])
    newFailedTargets = execution.failedTargets (or [])
    if newCompletedPhases == previousCompletedPhases and len(newCompletedTargets) == len(previousCompletedTargets):
        stallCount += 1
        Print: "WARNING: Forge iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)"
        if stallCount >= maxStalls:
            Stop: "Forge stalled on phase <nextPhase> ..."
    else:
        stallCount = 0
    previousCompletedPhases = newCompletedPhases
    previousCompletedTargets = newCompletedTargets   # add this
```
The same fix is needed in docs/examples/claude-code-skills/titan-run/SKILL.md at the corresponding lines.
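The two-condition check reduces to a small predicate — the function shape is an assumption for illustration; the semantics follow the fix described above:

```python
def is_forge_stall(prev_phases, prev_targets, new_phases, new_targets):
    # A stall only counts when NEITHER the completed-phase list NOR the
    # completed-target count has advanced since the last iteration.
    return new_phases == prev_phases and len(new_targets) == len(prev_targets)
```

This directly fixes the 15-target scenario: a sub-agent that finishes 5 more targets without closing the phase no longer registers as a stall.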
Fixed — forge stall detection now tracks both completedPhases AND completedTargets. stallCount only increments when neither advances, preventing false aborts on multi-target phases. Applied to both .claude/skills/ and docs/examples/ copies. Also removed a duplicate newCompletedTargets assignment.
.claude/skills/titan-run/SKILL.md
Outdated
>     Print: "Consider reverting: git revert <headBefore>..<headAfter>"
>     Stop.
>
> previousCompletedPhases = newCompletedPhases
Duplicate `previousCompletedPhases` assignment — leftover from stall-detection fix
previousCompletedPhases = newCompletedPhases is assigned twice in the forge loop: first at line 478 (immediately after the stall check), and again here at the very end of the loop body. The second assignment is fully redundant — newCompletedPhases hasn't been modified by V12, V13, or anything else between the two assignments.
The prior review thread that introduced the two-variable stall check noted "Also removed a duplicate newCompletedTargets assignment" — and indeed previousCompletedTargets appears only once. However, the corresponding duplicate for previousCompletedPhases was not removed.
While the runtime effect is a no-op (setting the variable to the same value it already holds), an AI agent reading this pseudocode may treat the end-of-loop placement as semantically significant and hesitate before the first assignment, or assume some state mutation occurred in between.
> previousCompletedPhases = newCompletedPhases
(Remove the lone previousCompletedPhases = newCompletedPhases line at the end of the loop body. The assignment on line 478 is sufficient.)
The identical duplicate is present in docs/examples/claude-code-skills/titan-run/SKILL.md at the corresponding line.
Fixed — removed the duplicate previousCompletedPhases = newCompletedPhases at the end of the forge loop body in both .claude/skills and docs/examples copies.
> **A2. Dependency direction between domains:**
> From `GLOBAL_ARCH.md`, extract the expected dependency direction between domains (e.g., "presentation depends on features, not the reverse").
A2 runs `codegraph deps` but already has the new edges from Step 1 diff-impact
A2 instructs the agent to run:

```
codegraph deps <changed-file> --json
```

to find new cross-domain dependencies — but this command returns ALL dependencies of the changed file, including pre-existing ones that have nothing to do with this commit. The filtering instruction ("'New' means the edge appears in the Step 1 diff-impact results") is correct in intent, but having codegraph deps listed alongside it implies the agent should use its output to discover new edges. An agent can easily interpret this as "run deps and check all edges for domain violations", generating false-positive FAILs on pre-existing upward dependencies that were already accepted.
Step 5c already correctly extracts new edges from the Step 1 diff-impact results without re-running any extra command. A2 should follow the same pattern — filter new edges from diff-impact and then use codegraph deps only to resolve the domain layer of the two endpoints, not to enumerate new edges themselves.
Consider clarifying A2 to make the data flow explicit:

```
From the new dependency edges already extracted in Step 5c (diff-impact Step 1 results):
For each new edge, resolve the domain/layer of source and target using:
    codegraph deps <changed-file> --json
If source is a lower layer importing a higher layer → **FAIL**
```

The same issue is present in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
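The suggested data flow can be sketched as a filter over the already-collected edges — the data shapes (`(source, target)` tuples, a module→layer-rank map) and the helper name are assumptions for illustration:

```python
def upward_violations(new_edges, layer_of):
    # `new_edges`: (source, target) import edges taken from the Step 1
    # diff-impact results — NEW edges only; nothing is re-enumerated here.
    # `layer_of`: module path -> layer rank, lower number = lower layer.
    # A lower layer importing a higher layer is an upward dependency -> FAIL.
    return [(src, dst) for src, dst in new_edges
            if layer_of[src] < layer_of[dst]]
```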
Fixed — A2 now filters new cross-domain edges from the Step 1 diff-impact results instead of running codegraph deps on all dependencies. codegraph deps is only used to resolve the domain/layer of edge endpoints. Applied to both .claude/skills and docs/examples copies.
.claude/skills/titan-gate/SKILL.md
Outdated
> Clean up the temp file (recover path from sidecar):
> ```bash
> BARREL_TMP=$(cat .codegraph/titan/.barrel-tmp 2>/dev/null)
> if [ -n "$BARREL_TMP" ]; then rm -f "$BARREL_TMP"; fi
> rm -f .codegraph/titan/.barrel-tmp
> ```
Step 5d cleanup not guaranteed to run on early exit
Step 5.5 explicitly marks its cleanup block with "MUST run even on failure or early exit, before proceeding to Step 9." Step 5d's cleanup has no such instruction — just a trailing block at the end of the step.
When an AI agent produces a FAIL verdict at Step 5d (exports dropped with active callers), it may jump directly to Step 9 for verdict aggregation without executing the cleanup block. The .barrel-tmp sidecar and the mktemp file will persist. On the next gate invocation:
- A new `BARREL_TMP` path is written to `.barrel-tmp`, overwriting the old one
- The orphaned temp file from the previous run is never deleted
This is the exact class of stale-file problem that was explicitly fixed for .arch-tmpdir — but the same guarantee was not applied here. The same applies to docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
Consider adding the same explicit instruction:
> Clean up the temp file (recover path from sidecar). **This MUST run even if Step 5d produced a FAIL verdict — run it before proceeding to Step 9:**
> ```bash
> BARREL_TMP=$(cat .codegraph/titan/.barrel-tmp 2>/dev/null)
> if [ -n "$BARREL_TMP" ]; then rm -f "$BARREL_TMP"; fi
> rm -f .codegraph/titan/.barrel-tmp
> ```
Fixed — added the same "MUST run even if Step 5d produced a FAIL verdict" guarantee to the barrel cleanup block, matching the pattern already used in Step 5.5's cleanup. Applied to both .claude/skills/ and docs/examples/ copies.
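The "must run on every exit path" guarantee maps naturally onto a try/finally; a minimal sketch, with the wrapper name and callback shape as assumptions:

```python
import os
import tempfile

def with_barrel_tmp(check):
    # Create the temp file, run the check, and guarantee cleanup on every exit
    # path — normal return, FAIL verdict, or exception — mirroring the
    # "MUST run even on failure or early exit" rule from Step 5.5.
    fd, barrel_tmp = tempfile.mkstemp(prefix="barrel-")
    os.close(fd)
    try:
        return check(barrel_tmp)
    finally:
        if os.path.exists(barrel_tmp):
            os.unlink(barrel_tmp)
```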
.claude/skills/titan-gate/SKILL.md
Outdated
> - Check against `codegraph check --boundaries -T --json`
> - New dependency from a lower layer to a higher layer → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary"
> - New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition"
Step 5c suggests running `codegraph check --boundaries` without `--staged`, producing false positives

The bullet point `Check against codegraph check --boundaries -T --json` specifies the command without `--staged`. An agent following this literally runs a full-codebase boundaries check, which returns ALL boundary violations — including pre-existing ones that were there before this commit. This would incorrectly flag long-standing boundary violations as new failures introduced by the current change.
Step 1 already ran codegraph check --staged --cycles --blast-radius 30 --boundaries -T --json, which includes the boundaries check scoped to staged changes. Step 5c should reference those results instead:
> From the diff-impact results already collected in Step 1, extract any **new** edges (imports that didn't exist before).
> For each new dependency:
> - Check against `GLOBAL_ARCH.md` layer rules (if Titan artifacts exist)
> - Check the Step 1 `codegraph check --staged --boundaries` results for violations on this edge (already collected — do not re-run)
> - New dependency from a lower layer to a higher layer → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary"
> - New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition"
The same issue exists in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding line.
Fixed — Step 5c now references the Step 1 codegraph check --staged --boundaries results instead of re-running without --staged. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-forge/SKILL.md
Outdated
> 11. **Run /titan-gate:**
>     Use the Skill tool to invoke `titan-gate`.
>     - If FAIL on **test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree.
>     - If FAIL on **semantic/structural** (gate preserves staged changes per its no-rollback rule) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — gate left staged changes intact for potential in-place fixing, and step 13 would silently destroy them.
Step 11 semantic FAIL path claims to preserve working tree but immediately destroys it
The instruction says "gate preserves staged changes per its no-rollback rule" and then immediately does git reset HEAD <files> && git checkout -- <files>, which both unstages AND reverts the working tree — destroying all changes. The note "Do NOT go to step 13 — gate left staged changes intact for potential in-place fixing, and step 13 would silently destroy them" is misleading because Step 11 has already destroyed them.
The phrase "gate preserved staged changes" is factually correct about what gate did, but forge immediately overrides that by performing its own full rollback. The "do not go to step 13" is only to avoid a double-rollback attempt, not to preserve any working tree state for fixing.
This creates an inconsistency: gate's Step 11 FAIL template tells the user "Staged changes are intact. Fix the issues above, or manually run git reset HEAD to unstage" — but if forge called gate, those changes have already been wiped by forge Step 11 before the user sees the message.
Consider clarifying the intent — if the purpose is to clean up for the next target (not preserve for fixing), the comment should say so:
Suggested:

> - If FAIL on **semantic/structural** (gate does not auto-rollback its staging area but forge must clean up for the next target) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for test/gate failures where gate already unstaged; going there again would attempt a duplicate rollback.
The same issue exists in docs/examples/claude-code-skills/titan-forge/SKILL.md at the corresponding line.
Fixed — Step 11 semantic FAIL now correctly says "gate does not auto-rollback its staging area, but forge must clean up for the next target". The "Do NOT go to step 13" note now explains it avoids duplicate rollback, not that changes are preserved for fixing. Applied to both copies.
> @@ -18,7 +18,6 @@ Your goal: read `sync.json`, find the next incomplete execution phase, make the
> - `--phase N` → jump to specific phase
> - `--target <name>` → run single target only (for retrying failures)
> - `--dry-run` → show what would be done without changing code
--yes removed from argument-hint and list but still functional in Step 0.8
The PR removes --yes from the argument-hint frontmatter and from the arguments section, but Step 0.8 still reads $ARGUMENTS for it:
> "Ask for confirmation before starting (unless `$ARGUMENTS` contains `--yes`)."
/titan-run also still passes --yes through to forge (Rules section: "Pass --yes through to forge if the user provided it, so forge skips its per-phase confirmation prompt"). So the flag continues to work when passed by the orchestrator, but is now undocumented when invoking forge directly.
A user calling /titan-forge --yes directly won't see it in the argument list and may think it's invalid. Consider either re-adding it as an undocumented passthrough, or clarifying in the docs that --yes is accepted but only meaningful when invoked by /titan-run.
The same applies to docs/examples/claude-code-skills/titan-forge/SKILL.md.
Fixed — re-added --yes to the arguments list with description: "skip confirmation prompt (typically passed by /titan-run orchestrator)". Applied to both copies.
10. **Run tests** (detect the project's test command from package.json scripts — `npm test`, `yarn test`, `pnpm test`, etc.):
```bash
<detected-test-command> 2>&1
```
If tests fail → go to rollback (step 13).
Tests run twice per target — in Step 10 and again inside gate (Step 4)
Step 10 runs the test suite:

```bash
<detected-test-command> 2>&1
```

Then Step 11 invokes `/titan-gate`, which also runs tests in its Step 4. For projects with even moderately long test suites, this doubles execution time on every target — a 30-second test run becomes 60 seconds per target, multiplied across all forge targets.
The Step 10 pre-gate test run makes sense as a fast-fail optimization (failing before triggering all gate codegraph checks), but this tradeoff should be documented. Consider noting the rationale inline, or providing a way to skip the Step 10 test if the user prefers the gate to handle all checks (e.g., an environment variable or a --no-pre-gate-test flag).
The same applies to docs/examples/claude-code-skills/titan-forge/SKILL.md.
Documented — added a Note block after Step 10 explaining the pre-gate test is a fast-fail optimization with the tradeoff: catches failures faster at the cost of 2x test time on passing targets. Applied to both copies.
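The Step 10 detection discussed in this thread can be sketched as a small pure function. This is a hypothetical illustration, not the skill's actual code; the lockfile-to-runner mapping is an assumption:

```javascript
// Hypothetical sketch of Step 10's test-command detection (not the skill's
// actual implementation). Inputs: the parsed package.json object and the
// list of filenames at the repo root. The lockfile-to-runner mapping below
// is an assumption.
function detectTestCommand(pkg, rootFiles) {
  // No test script declared: nothing to run pre-gate.
  if (!pkg.scripts || !pkg.scripts.test) return null;
  // Pick the package manager from whichever lockfile is present.
  if (rootFiles.includes('pnpm-lock.yaml')) return 'pnpm test';
  if (rootFiles.includes('yarn.lock')) return 'yarn test';
  return 'npm test'; // default when only package-lock.json (or nothing) exists
}
```

Forge would then run the returned command with `2>&1` as in Step 10, and could skip the pre-gate test entirely when the function returns `null`.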
Step 5c referenced codegraph check --boundaries without --staged, causing false positives from pre-existing violations. Now references the already-collected Step 1 results instead.
…#557) - Step 11 semantic FAIL: clarify forge does its own rollback (not gate) - Re-add --yes to argument list as documented passthrough - Add diffWarnings: [] to execution state init block - Document pre-gate test as fast-fail optimization with tradeoff note
…ptave/codegraph into feat/release-skill-auto-semver
Addressed all latest Greptile feedback:
All changes applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-gate/SKILL.md
```bash
codegraph deps <endpoint-symbol> --json
```
(Only call this to look up which domain/layer an individual edge endpoint belongs to — not to enumerate all dependencies.)
codegraph deps wrong command for domain-layer resolution
A2 instructs the agent to call codegraph deps <endpoint-symbol> --json to resolve "which domain/layer an individual edge endpoint belongs to." But codegraph deps is documented as a file-level dependency map — it returns the dependencies of a file, not metadata about a symbol's location in the architecture.
Calling `codegraph deps parseConfig --json` (a symbol name) would fail or produce wrong results on every project. The correct approach to resolve a symbol's domain/layer is:

- Use `codegraph where <symbol>` to get the file path.
- Infer the layer from the path (e.g., `src/presentation/...` → presentation layer) by matching against the domain map in `GLOBAL_ARCH.md`.
Alternatively, the source file is already available in the diff-impact edge output and doesn't need a separate lookup.
The same issue exists in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
Fixed — replaced codegraph deps (file-level command) with path-based layer inference using the file paths already available in diff-impact edge output. Applied to both .claude/skills/ and docs/examples/ copies.
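The path-based inference this fix adopts can be sketched as follows. The layer names and path prefixes are illustrative assumptions; a real run would derive them from the domain map in `GLOBAL_ARCH.md`:

```javascript
// Hypothetical sketch of path-based layer inference. The prefix table is an
// assumption standing in for the domain map parsed from GLOBAL_ARCH.md.
const LAYER_PREFIXES = [
  ['src/presentation/', 'presentation'],
  ['src/application/', 'application'],
  ['src/domain/', 'domain'],
  ['src/infrastructure/', 'infrastructure'],
];

// Map a file path (as found in diff-impact edge output) to its layer.
function inferLayer(filePath) {
  const match = LAYER_PREFIXES.find(([prefix]) => filePath.startsWith(prefix));
  return match ? match[1] : 'unknown';
}
```

Because the diff-impact edge output already carries both endpoint file paths, this lookup needs no extra codegraph invocation.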
**D1. Scope — only planned files touched:**
Compare staged file paths against `sync.json → executionOrder[currentPhase].targets` and their known file paths (from gauntlet entries). Flag any file NOT associated with the current target or phase.
- File in a completely different domain → **DIFF FAIL**
- File is a direct dependency of the target (consumer or import) → **OK** (expected ripple)
D2 intent-match check fails silently for dead-code targets
D2 reads "the gauntlet entry's recommendation field" for the current target. However, dead-code targets (deletions of unreferenced symbols) come from titan-state.json → roles.deadSymbols identified during RECON — they have no gauntlet.ndjson entry. When forge processes one of these dead-code targets, D2 would find no matching entry and either skip the intent check or hallucinate the recommendation field, letting any diff pass without verification.
V9 in titan-run already acknowledges this: "OR in titan-state.json → roles.deadSymbols". D2 should have a corresponding guard:
**D2. Intent match:**
- If this is a dead-code target (in `titan-state.json → roles.deadSymbols`), expected recommendation is "remove dead code / delete symbol" — skip gauntlet entry lookup and verify the diff shows only deletions.
- Otherwise, read the gauntlet entry's `recommendation` field...

The same fix is needed in docs/examples/claude-code-skills/titan-forge/SKILL.md.
Fixed — D2 now checks if the target is in titan-state.json deadSymbols first. For dead-code targets, it skips gauntlet entry lookup and verifies the diff shows only deletions. Applied to both copies.
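The D2 guard can be sketched as below. Input shapes are assumptions: `deadSymbols` stands in for `titan-state.json → roles.deadSymbols`, and `numstat` for parsed `git diff --cached --numstat` rows:

```javascript
// Hypothetical sketch of the D2 dead-code guard (not the skill's actual code).
// deadSymbols: assumed array of symbol names from titan-state.json.
// numstat: assumed rows of [linesAdded, linesRemoved, filePath] from git.
function checkDeadCodeTarget(targetName, deadSymbols, numstat) {
  if (!deadSymbols.includes(targetName)) {
    // Not a dead-code target: fall through to the gauntlet-entry lookup.
    return { deadCode: false };
  }
  // A dead-code diff must be pure deletion: no file may gain lines.
  const filesWithAdditions = numstat.filter(([added]) => added > 0);
  return { deadCode: true, pass: filesWithAdditions.length === 0 };
}
```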
```bash
TITAN_HEAD_SHA=$(git rev-parse HEAD)
node -e "
const fs = require('fs');
const communities = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-communities.json','utf8'));
const structure = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-structure.json','utf8'));
const drift = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-drift.json','utf8'));
const snapshot = {
  timestamp: new Date().toISOString(),
  capturedBefore: 'forge',
  headSha: '$TITAN_HEAD_SHA',
  communities,
  structure,
  drift
};
fs.writeFileSync('.codegraph/titan/arch-snapshot.json', JSON.stringify(snapshot, null, 2));
"
```
node -e script has no error handling — silent failure produces no arch-snapshot.json
If any of the three preceding codegraph commands produces malformed output (non-zero exit, partial JSON, or an error message instead of JSON), the corresponding file will contain invalid JSON. The node -e script calls JSON.parse(fs.readFileSync(...)) without any try/catch, so it throws an unhandled exception and exits non-zero. The orchestrator has no documented check for this failure, so it silently proceeds to Step 3.5b and then dispatches forge — but arch-snapshot.json was never written.
Gate Step 5.5 gracefully skips when arch-snapshot.json is absent and prints no user-visible error. The entire architectural comparison layer is silently disabled for the entire forge run with no indication of why.
Add error handling:
```bash
TITAN_HEAD_SHA=$(git rev-parse HEAD)
node -e "
try {
  const fs = require('fs');
  const communities = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-communities.json','utf8'));
  const structure = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-structure.json','utf8'));
  const drift = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-drift.json','utf8'));
  const snapshot = { timestamp: new Date().toISOString(), capturedBefore: 'forge',
    headSha: '$TITAN_HEAD_SHA', communities, structure, drift };
  fs.writeFileSync('.codegraph/titan/arch-snapshot.json', JSON.stringify(snapshot, null, 2));
  console.log('arch-snapshot.json written');
} catch(e) { console.error('ARCH_SNAPSHOT_FAILED: ' + e.message); process.exit(1); }
"
```

If the script fails, print a warning and continue without the snapshot (rather than silently producing no file). The same applies to docs/examples/claude-code-skills/titan-run/SKILL.md.
Fixed — wrapped the node -e script in try/catch with error logging. On failure, prints a warning and continues without the snapshot instead of silently proceeding. Applied to both copies.
If `.codegraph/titan/titan-state.json` should exist at this point (i.e., we're past recon):
```bash
node -e "try { JSON.parse(require('fs').readFileSync('.codegraph/titan/titan-state.json','utf8')); console.log('OK'); } catch(e) { console.log('CORRUPT: '+e.message); process.exit(1); }"
```
- If **CORRUPT** → attempt recovery from backup (see State Backup below). If no backup → stop: "State file corrupted with no backup. Run `/titan-reset` and start over."

### G4. State backup
Before every sub-agent dispatch, back up the current state file:
```bash
cp .codegraph/titan/titan-state.json .codegraph/titan/titan-state.json.bak 2>/dev/null || true
```
If a sub-agent corrupts the state, G3 on the next iteration will detect it and restore from `.bak`.
G3 corruption recovery procedure is not documented
G3 says "attempt recovery from backup (see State Backup below)" but G4 only documents how to create the backup (cp ... .bak), not how to restore from it. An AI agent reading G3 has no documented procedure for the recovery step and would likely hallucinate one (e.g., cp .bak titan-state.json without verifying the backup is valid).
Add an explicit recovery procedure to G3:
### G3. State file integrity
If `.codegraph/titan/titan-state.json` should exist at this point:
```bash
node -e "try { JSON.parse(require('fs').readFileSync('.codegraph/titan/titan-state.json','utf8')); console.log('OK'); } catch(e) { console.log('CORRUPT: '+e.message); process.exit(1); }"
```
- If CORRUPT → attempt recovery:
```bash
# Verify backup is valid before restoring
node -e "try { JSON.parse(require('fs').readFileSync('.codegraph/titan/titan-state.json.bak','utf8')); console.log('OK'); } catch(e) { console.log('CORRUPT'); process.exit(1); }"
# If backup is valid:
cp .codegraph/titan/titan-state.json.bak .codegraph/titan/titan-state.json
```
- If backup is also corrupt or missing → stop.
Same fix needed in `docs/examples/claude-code-skills/titan-run/SKILL.md`.
Fixed — G3 now has an explicit recovery procedure: verify backup exists, validate it's valid JSON, cp .bak over the corrupt file. If backup is also corrupt or missing, stop. Applied to both copies.
```bash
node -e "
const fs = require('fs');
const lines = fs.readFileSync('.codegraph/titan/gauntlet.ndjson','utf8').trim().split('\n');
let valid = 0, corrupt = 0;
for (const line of lines) {
  try { JSON.parse(line); valid++; } catch { corrupt++; }
}
console.log(JSON.stringify({ valid, corrupt, total: lines.length }));
"
```
NDJSON integrity check fails if `gauntlet.ndjson` does not yet exist
The NDJSON integrity check at step 2c unconditionally calls fs.readFileSync('.codegraph/titan/gauntlet.ndjson', 'utf8'). If the file doesn't exist (e.g., the gauntlet sub-agent crashed before writing any entry), readFileSync throws ENOENT — the node -e script exits non-zero with a Node.js stack trace, which is a confusing diagnostic for what is simply "gauntlet wrote nothing."
The check should guard against a missing file:
```javascript
const fs = require('fs');
if (!fs.existsSync('.codegraph/titan/gauntlet.ndjson')) {
  console.log(JSON.stringify({ valid: 0, corrupt: 0, total: 0, missing: true }));
  process.exit(0);
}
const lines = fs.readFileSync('.codegraph/titan/gauntlet.ndjson','utf8').trim().split('\n');
```

Then the caller should handle `missing: true` as equivalent to `valid == 0` (stop the pipeline). The same fix is needed in docs/examples/claude-code-skills/titan-run/SKILL.md.
Fixed — added fs.existsSync guard before readFileSync. When gauntlet.ndjson is missing, outputs {valid:0, corrupt:0, total:0, missing:true} and exits 0. Caller handles missing:true as equivalent to valid==0. Applied to both copies.
…ce in A2 codegraph deps is a file-level dependency map, not a symbol lookup. The diff-impact edge output already contains source/target file paths, so the domain/layer can be inferred directly from the path using the GLOBAL_ARCH.md domain map without an extra command.
Dead-code targets have no gauntlet.ndjson entry, causing D2 to fail silently. Now checks titan-state.json deadSymbols first and validates the diff shows only deletions without needing a gauntlet lookup.
The node -e script that assembles arch-snapshot.json had no error handling — if any input file was missing or malformed, it failed silently and no snapshot was written. Now catches errors, prints a warning, and continues without the snapshot.
G3 referenced backup recovery without documenting the steps. Now includes: check backup exists, validate it is valid JSON before restoring, cp .bak over the corrupt file, and stop if backup is also corrupt or missing.
The node -e script crashed with ENOENT if gauntlet.ndjson did not exist yet. Now checks fs.existsSync first and outputs a result with missing:true so the caller can handle it gracefully.
```diff
 | `codegraph path` | SYNC | Dependency paths between targets |
 | `codegraph cycles` | SYNC, GATE | Circular dependency detection |
-| `codegraph deps` | SYNC | File-level dependency map |
+| `codegraph deps` | SYNC, GATE | File-level dependency map |
```
codegraph deps incorrectly lists GATE as a consumer
Gate A2 was updated (as part of a previous fix cycle in this PR) to stop calling codegraph deps entirely. The current titan-gate/SKILL.md Step 5.5 A2 now explicitly says:
"do not re-run `codegraph deps` on the full file… No additional codegraph command is needed — the diff-impact edge output contains the file paths directly."
The GATE entry was added to this row based on an earlier iteration of the design that called codegraph deps for domain/layer resolution. That approach was subsequently replaced with path-based inference from the diff-impact results already in memory. The table was never updated to reflect the final implementation.
An agent or developer reading this table to understand which commands gate calls would incorrectly expect a codegraph deps invocation in the gate flow.
```diff
-| `codegraph deps` | SYNC, GATE | File-level dependency map |
+| `codegraph deps` | SYNC | File-level dependency map |
```
The same row exists in .claude/skills/ — but since the docs mirror is identical to .claude/skills/ at HEAD, only this file needs the fix.
Summary

- `/titan-run` orchestrator skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling hands-free end-to-end execution
- `/titan-forge` (Step 9) — verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs (scope check, intent match, commit message accuracy, deletion audit, leftover check)
- `/titan-gate` (Step 5) — export signature stability, import resolution integrity, dependency direction assertions, re-export chain validation
- `/titan-gate` (Step 5.5) — captures pre-forge architectural baseline, compares community stability, cross-domain dependency direction, cohesion delta, and drift after each commit
- `/titan-run`: Pre-Agent Gate (G1-G4), post-phase validation (V1-V15), stall detection, state file backup/recovery, NDJSON integrity checks, mandatory human checkpoint before forge

Test plan

- Run `/titan-run` on a test codebase in a worktree — verify full pipeline completes
- `--start-from forge` skips analysis phases but validates their artifacts exist