From fec01d4213e85f9902364de263b37d4234364a41 Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 16 Mar 2026 04:50:10 -0600 Subject: [PATCH 01/52] docs: promote #83 (brief command) and #71 (type inference) to Tier 0 in backlog These two items deliver the highest immediate impact on agent experience and graph accuracy without requiring Rust porting or TypeScript migration. They should be implemented before any Phase 4+ roadmap work. - #83: hook-optimized `codegraph brief` enriches passively-injected context - #71: basic type inference closes the biggest resolution gap for TS/Java --- docs/roadmap/BACKLOG.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/docs/roadmap/BACKLOG.md b/docs/roadmap/BACKLOG.md index 7c7cea66..c5876a18 100644 --- a/docs/roadmap/BACKLOG.md +++ b/docs/roadmap/BACKLOG.md @@ -21,6 +21,17 @@ Each item has a short title, description, category, expected benefit, and four a ## Backlog +### Tier 0 — Promote before Phase 4-5 (highest immediate impact) + +These two items directly improve agent experience and graph accuracy today, without requiring Rust porting or TypeScript migration. They should be implemented before any Phase 4+ roadmap work begins. + +**Rationale:** Item #83 enriches the *passively-injected* context that agents actually see via hooks — the single highest-leverage surface for reducing blind edits. Item #71 closes the biggest accuracy gap in the graph for TypeScript and Java, where missing type-aware resolution causes hallucinated "no callers" results. + +| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking | Depends on | +|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|------------| +| 83 | Hook-optimized `codegraph brief` command | New `codegraph brief ` command designed for Claude Code hook context injection. 
Returns a compact, token-efficient summary per file: each symbol with its role and caller count (e.g. `buildGraph [core, 12 callers]`), blast radius count on importers (`Imported by: src/cli.js (+8 transitive)`), and overall file risk tier. Current `deps --json` output used by `enrich-context.sh` is shallow — just file-level imports/importedBy and symbol names with no role or blast radius info. The `brief` command would include: **(a)** symbol roles in the output — knowing a file defines `core` vs `leaf` symbols changes editing caution; **(b)** per-symbol transitive caller counts — makes blast radius visible without a separate `fn-impact` call; **(c)** file-level risk tier (high/medium/low based on max fan-in and role composition). Output optimized for `additionalContext` injection — single compact block, not verbose JSON. Also add `--brief` flag to `deps` as an alias. | Embeddability | The `enrich-context.sh` hook is the only codegraph context agents actually see (they ignore CLAUDE.md instructions to run commands manually). Making that passively-injected context richer — with roles, caller counts, and risk tiers — directly reduces blind edits to high-impact code. Currently the hook shows `Defines: function buildGraph` but not that it's a core symbol with 12 transitive callers | ✓ | ✓ | 4 | No | — | +| 71 | Basic type inference for typed languages | Extract type annotations from TypeScript and Java AST nodes (variable declarations, function parameters, return types, generics) to resolve method calls through typed references. Currently `const x: Router = express.Router(); x.get(...)` produces no edge because `x.get` can't be resolved without knowing `x` is a `Router`. Tree-sitter already parses type annotations — we just don't use them for resolution. Start with declared types (no flow inference), which covers the majority of TS/Java code. 
| Resolution | Dramatically improves call graph completeness for TypeScript and Java — the two languages where developers annotate types explicitly and expect tooling to use them. Directly prevents hallucinated "no callers" results for methods called through typed variables | ✓ | ✓ | 5 | No | — | + ### Tier 1 — Zero-dep + Foundation-aligned (build these first) Non-breaking, ordered by problem-fit: @@ -144,7 +155,6 @@ These address fundamental limitations in the parsing and resolution pipeline tha | ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking | Depends on | |----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|------------| -| 71 | Basic type inference for typed languages | Extract type annotations from TypeScript and Java AST nodes (variable declarations, function parameters, return types, generics) to resolve method calls through typed references. Currently `const x: Router = express.Router(); x.get(...)` produces no edge because `x.get` can't be resolved without knowing `x` is a `Router`. Tree-sitter already parses type annotations — we just don't use them for resolution. Start with declared types (no flow inference), which covers the majority of TS/Java code. | Resolution | Dramatically improves call graph completeness for TypeScript and Java — the two languages where developers annotate types explicitly and expect tooling to use them. Directly prevents hallucinated "no callers" results for methods called through typed variables | ✓ | ✓ | 5 | No | — | | 72 | Interprocedural dataflow analysis | Extend the existing intraprocedural dataflow (ID 14) to propagate `flows_to`/`returns`/`mutates` edges across function boundaries. When function A calls B with argument X, and B's dataflow shows X flows to its return value, connect A's call site to the downstream consumers of B's return. 
Requires stitching per-function dataflow summaries at call edges — no new parsing, just graph traversal over existing `dataflow` + `edges` tables. Start with single-level propagation (caller↔callee), not transitive closure. | Analysis | Current dataflow stops at function boundaries, missing the most important flows — data passing through helper functions, middleware chains, and factory patterns. Single-function scope means `dataflow` can't answer "where does this user input end up?" across call boundaries. Cross-function propagation is the difference between toy dataflow and useful taint-like analysis | ✓ | ✓ | 5 | No | 14 | | 73 | Improved dynamic call resolution | Upgrade the current "best-effort" dynamic dispatch resolution for Python, Ruby, and JavaScript. Three concrete improvements: **(a)** receiver-type tracking — when `x = SomeClass()` is followed by `x.method()`, resolve `method` to `SomeClass.method` using the assignment chain (leverages existing `ast_nodes` + `dataflow` tables); **(b)** common pattern recognition — resolve `EventEmitter.on('event', handler)` callback registration, `Promise.then/catch` chains, `Array.map/filter/reduce` with named function arguments, and decorator/annotation patterns; **(c)** confidence-tiered edges — mark dynamically-resolved edges with a confidence score (high for direct assignment, medium for pattern match, low for heuristic) so consumers can filter by reliability. | Resolution | In Python/Ruby/JS, 30-60% of real calls go through dynamic dispatch — method calls on variables, callbacks, event handlers, higher-order functions. The current best-effort resolution misses most of these, leaving massive gaps in the call graph for the languages where codegraph is most commonly used. 
Even partial improvement here has outsized impact on graph completeness | ✓ | ✓ | 5 | No | — | | 81 | Track dynamic `import()` and re-exports as graph edges | Extract `import()` expressions as `dynamic-imports` edges in both WASM extraction paths (query-based and walk-based). Destructured names (`const { a } = await import(...)`) feed into `importedNames` for call resolution. **Partially done:** WASM JS/TS extraction works (PR #389). Remaining: **(a)** native Rust engine support — `crates/codegraph-core/src/extractors/javascript.rs` doesn't extract `import()` calls; **(b)** non-static paths (`import(\`./plugins/${name}.js\`)`, `import(variable)`) are skipped with a debug warning; **(c)** re-export consumer counting in `exports --unused` only checks `calls` edges, not `imports`/`dynamic-imports` — symbols consumed only via import edges show as zero-consumer false positives. | Resolution | Fixes false "zero consumers" reports for symbols consumed via dynamic imports. 95 `dynamic-imports` edges found in codegraph's own codebase — these were previously invisible to impact analysis, exports audit, and dead-export hooks | ✓ | ✓ | 5 | No | — | @@ -163,7 +173,6 @@ These close gaps in search expressiveness, cross-repo navigation, implementation | 78 | Cross-repo symbol resolution | In multi-repo mode, resolve import edges that cross repository boundaries. When repo A imports `@org/shared-lib`, and repo B is `@org/shared-lib` in the registry, create cross-repo edges linking A's import to B's actual exported symbol. Requires matching npm/pip/go package names to registered repos. Store cross-repo edges with a `repo` qualifier in the `edges` table. Enables cross-repo `fn-impact` (changing a shared library function shows impact across all consuming repos), cross-repo `path` queries, and cross-repo `diff-impact`. | Navigation | Multi-repo mode currently treats each repo as isolated — agents can search across repos but can't trace dependencies between them. 
Cross-repo edges enable "if I change this shared utility, which downstream repos break?" — the highest-value question in monorepo and multi-repo architectures | ✓ | ✓ | 5 | No | — | | 79 | Advanced query language with boolean operators and output shaping | Extend `codegraph search` and `codegraph where` with a structured query syntax supporting: **(a)** boolean operators — `kind:function AND file:src/` , `name:parse OR name:extract`, `NOT kind:class`; **(b)** compound filters — `kind:method AND complexity.cognitive>15 AND role:core`; **(c)** output shaping — `--select symbols` (just names), `--select files` (distinct files), `--select owners` (CODEOWNERS for matches), `--select stats` (aggregate counts by kind/file/role); **(d)** result aggregation — `--group-by file`, `--group-by kind`, `--group-by community` with counts. Parse the query into a SQL WHERE clause against the `nodes`/`function_complexity`/`edges` tables. Expose as `query_language` MCP tool parameter. | Search | Current search is either keyword/semantic (fuzzy) or exact-name (`where`). Agents needing "all core functions with cognitive complexity > 15 in src/api/" must chain multiple commands and filter manually — wasting tokens on intermediate results. A structured query language answers compound questions in one call | ✓ | ✓ | 4 | No | — | | 80 | Find implementations in impact analysis | When a function signature or interface definition changes, automatically include all implementations/subtypes in `fn-impact` and `diff-impact` blast radius. Currently impact only follows `calls` edges — changing an interface method signature breaks every implementor, but this is invisible. Requires ID 74's `implements` edges. Add `--include-implementations` flag (on by default) to impact commands. | Analysis | Catches the most dangerous class of missed blast radius — interface/trait changes that silently break all implementors. 
A single method signature change on a widely-implemented interface can break dozens of files, none of which appear in the current call-graph-only impact analysis | ✓ | ✓ | 5 | No | 74 | -| 83 | Hook-optimized `codegraph brief` command | New `codegraph brief ` command designed for Claude Code hook context injection. Returns a compact, token-efficient summary per file: each symbol with its role and caller count (e.g. `buildGraph [core, 12 callers]`), blast radius count on importers (`Imported by: src/cli.js (+8 transitive)`), and overall file risk tier. Current `deps --json` output used by `enrich-context.sh` is shallow — just file-level imports/importedBy and symbol names with no role or blast radius info. The `brief` command would include: **(a)** symbol roles in the output — knowing a file defines `core` vs `leaf` symbols changes editing caution; **(b)** per-symbol transitive caller counts — makes blast radius visible without a separate `fn-impact` call; **(c)** file-level risk tier (high/medium/low based on max fan-in and role composition). Output optimized for `additionalContext` injection — single compact block, not verbose JSON. Also add `--brief` flag to `deps` as an alias. | Embeddability | The `enrich-context.sh` hook is the only codegraph context agents actually see (they ignore CLAUDE.md instructions to run commands manually). Making that passively-injected context richer — with roles, caller counts, and risk tiers — directly reduces blind edits to high-impact code. 
Currently the hook shows `Defines: function buildGraph` but not that it's a core symbol with 12 transitive callers | ✓ | ✓ | 4 | No | — | ### Tier 2 — Foundation-aligned, needs dependencies From 41d664f1b106c7a72484864b9a029803a3df8f9d Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 16 Mar 2026 04:55:02 -0600 Subject: [PATCH 02/52] docs: add Phase 4 (Native Analysis Acceleration) to roadmap Add new Phase 4 covering the port of JS-only build phases to Rust: - 4.1-4.3: AST nodes, CFG, dataflow visitor ports (~587ms savings) - 4.4: Batch SQLite inserts (~143ms) - 4.5: Role classification & structure (~42ms) - 4.6: Complete complexity pre-computation - 4.7: Fix incremental rebuild data loss on native engine - 4.8: Incremental rebuild performance (target sub-100ms) Bump old Phases 4-10 to 5-11 with all cross-references updated. Benchmark evidence shows ~50% of native build time is spent in JS visitors that run identically on both engines. --- docs/roadmap/ROADMAP.md | 325 +++++++++++++++++++++++++++------------- 1 file changed, 220 insertions(+), 105 deletions(-) diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md index 4ca9cf9d..9edda2d1 100644 --- a/docs/roadmap/ROADMAP.md +++ b/docs/roadmap/ROADMAP.md @@ -2,7 +2,7 @@ > **Current version:** 3.1.4 | **Status:** Active development | **Updated:** March 2026 -Codegraph is a strong local-first code graph CLI. This roadmap describes planned improvements across ten phases -- closing gaps with commercial code intelligence platforms while preserving codegraph's core strengths: fully local, open source, zero cloud dependency by default. +Codegraph is a strong local-first code graph CLI. This roadmap describes planned improvements across eleven phases -- closing gaps with commercial code intelligence platforms while preserving codegraph's core strengths: fully local, open source, zero cloud dependency by default. 
**LLM strategy:** All LLM-powered features are **optional enhancements**. Everything works without an API key. When configured (OpenAI, Anthropic, Ollama, or any OpenAI-compatible endpoint), users unlock richer semantic search and natural language queries. @@ -13,17 +13,18 @@ Codegraph is a strong local-first code graph CLI. This roadmap describes planned | Phase | Theme | Key Deliverables | Status | |-------|-------|-----------------|--------| | [**1**](#phase-1--rust-core) | Rust Core | Rust parsing engine via napi-rs, parallel parsing, incremental tree-sitter, JS orchestration layer | **Complete** (v1.3.0) | -| [**2**](#phase-2--foundation-hardening) | Foundation Hardening | Parser registry, complete MCP, test coverage, enhanced config, multi-repo MCP | **Complete** (v1.4.0) | -| [**2.5**](#phase-25--analysis-expansion) | Analysis Expansion | Complexity metrics, community detection, flow tracing, co-change, manifesto, boundary rules, check, triage, audit, batch, hybrid search | **Complete** (v2.6.0) | +| [**2**](#phase-2--foundation-hardening) | Foundation Hardening | Parser registry, complete MCP, test coverage, enhanced config, multi-repo MCP | **Complete** (v1.5.0) | +| [**2.5**](#phase-25--analysis-expansion) | Analysis Expansion | Complexity metrics, community detection, flow tracing, co-change, manifesto, boundary rules, check, triage, audit, batch, hybrid search | **Complete** (v2.7.0) | | [**2.7**](#phase-27--deep-analysis--graph-enrichment) | Deep Analysis & Graph Enrichment | Dataflow analysis, intraprocedural CFG, AST node storage, expanded node/edge types, extractors refactoring, CLI consolidation, interactive viewer, exports command, normalizeSymbol | **Complete** (v3.0.0) | | [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring (Vertical Slice) | Unified AST analysis framework, command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, builder pipeline, presentation 
layer, domain grouping, curated API, unified graph model, qualified names, CLI composability | **In Progress** (v3.1.4) | -| [**4**](#phase-4--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration, supply-chain security, CI coverage gates | Planned | -| [**5**](#phase-5--runtime--extensibility) | Runtime & Extensibility | Event-driven pipeline, unified engine strategy, subgraph export filtering, transitive confidence, query caching, configuration profiles, pagination, plugin system, DX & onboarding | Planned | -| [**6**](#phase-6--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic metadata, module summaries | Planned | -| [**7**](#phase-7--natural-language-queries) | Natural Language Queries | `ask` command, conversational sessions, LLM-narrated graph queries, onboarding tools | Planned | -| [**8**](#phase-8--expanded-language-support) | Expanded Language Support | 8 new languages (11 -> 19), parser utilities | Planned | -| [**9**](#phase-9--github-integration--ci) | GitHub Integration & CI | Reusable GitHub Action, LLM-enhanced PR review, visual impact graphs, SARIF output | Planned | -| [**10**](#phase-10--interactive-visualization--advanced-features) | Visualization & Advanced | Web UI, dead code detection, monorepo, agentic search, refactoring analysis | Planned | +| [**4**](#phase-4--native-analysis-acceleration) | Native Analysis Acceleration | Move JS-only build phases (AST nodes, CFG, dataflow, insert nodes, structure, roles, complexity) to Rust; fix incremental rebuild data loss on native; sub-100ms 1-file rebuilds | Planned | +| [**5**](#phase-5--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration, supply-chain security, CI coverage gates | Planned | +| [**6**](#phase-6--runtime--extensibility) | 
Runtime & Extensibility | Event-driven pipeline, unified engine strategy, subgraph export filtering, transitive confidence, query caching, configuration profiles, pagination, plugin system, DX & onboarding | Planned | +| [**7**](#phase-7--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic metadata, module summaries | Planned | +| [**8**](#phase-8--natural-language-queries) | Natural Language Queries | `ask` command, conversational sessions, LLM-narrated graph queries, onboarding tools | Planned | +| [**9**](#phase-9--expanded-language-support) | Expanded Language Support | 8 new languages (11 -> 19), parser utilities | Planned | +| [**10**](#phase-10--github-integration--ci) | GitHub Integration & CI | Reusable GitHub Action, LLM-enhanced PR review, visual impact graphs, SARIF output | Planned | +| [**11**](#phase-11--interactive-visualization--advanced-features) | Visualization & Advanced | Web UI, dead code detection, monorepo, agentic search, refactoring analysis | Planned | ### Dependency graph @@ -33,12 +34,13 @@ Phase 1 (Rust Core) |--> Phase 2.5 (Analysis Expansion) |--> Phase 2.7 (Deep Analysis & Graph Enrichment) |--> Phase 3 (Architectural Refactoring) - |--> Phase 4 (TypeScript Migration) - |--> Phase 5 (Runtime & Extensibility) - |--> Phase 6 (Embeddings + Metadata) --> Phase 7 (NL Queries + Narration) - |--> Phase 8 (Languages) - |--> Phase 9 (GitHub/CI) <-- Phase 6 (risk_score, side_effects) -Phases 1-7 --> Phase 10 (Visualization + Refactoring Analysis) + |--> Phase 4 (Native Analysis Acceleration) + |--> Phase 5 (TypeScript Migration) + |--> Phase 6 (Runtime & Extensibility) + |--> Phase 7 (Embeddings + Metadata) --> Phase 8 (NL Queries + Narration) + |--> Phase 9 (Languages) + |--> Phase 10 (GitHub/CI) <-- Phase 7 (risk_score, side_effects) +Phases 1-8 --> Phase 11 (Visualization + Refactoring Analysis) ``` --- @@ -113,7 +115,7 @@ Ensure the transition is seamless. 
## Phase 2 -- Foundation Hardening ✅ -> **Status:** Complete -- shipped in v1.4.0 +> **Status:** Complete -- shipped in v1.5.0 **Goal:** Fix structural issues that make subsequent phases harder. @@ -199,11 +201,11 @@ Support querying multiple codebases from a single MCP server instance. ## Phase 2.5 -- Analysis Expansion ✅ -> **Status:** Complete -- shipped across v2.0.0 -> v2.6.0 +> **Status:** Complete -- shipped across v2.0.0 -> v2.7.0 **Goal:** Build a comprehensive analysis toolkit on top of the graph -- complexity metrics, community detection, risk triage, architecture boundary enforcement, CI validation, and hybrid search. This phase emerged organically as features were needed and wasn't in the original roadmap. -### 2.5.1 -- Complexity Metrics ✅ +### 2.6.1 -- Complexity Metrics ✅ Per-function complexity analysis using language-specific AST rules. @@ -217,7 +219,7 @@ Per-function complexity analysis using language-specific AST rules. **New file:** `src/complexity.js` (2,163 lines) -### 2.5.2 -- Community Detection & Drift ✅ +### 2.6.2 -- Community Detection & Drift ✅ Louvain community detection at file or function level. @@ -228,7 +230,7 @@ Louvain community detection at file or function level. **New file:** `src/communities.js` (310 lines) -### 2.5.3 -- Structure & Role Classification ✅ +### 2.6.3 -- Structure & Role Classification ✅ Directory structure graph with node role classification. @@ -241,7 +243,7 @@ Directory structure graph with node role classification. **New file:** `src/structure.js` (668 lines) -### 2.5.4 -- Execution Flow Tracing ✅ +### 2.6.4 -- Execution Flow Tracing ✅ Forward BFS from framework entry points through callees to leaves. @@ -251,7 +253,7 @@ Forward BFS from framework entry points through callees to leaves. **New file:** `src/flow.js` (362 lines) -### 2.5.5 -- Temporal Coupling (Co-change Analysis) ✅ +### 2.6.5 -- Temporal Coupling (Co-change Analysis) ✅ Git history analysis for temporal file coupling. 
@@ -262,7 +264,7 @@ Git history analysis for temporal file coupling. **New file:** `src/cochange.js` (502 lines) -### 2.5.6 -- Manifesto Rule Engine ✅ +### 2.6.6 -- Manifesto Rule Engine ✅ Configurable rule engine with warn/fail thresholds for function, file, and graph rules. @@ -274,7 +276,7 @@ Configurable rule engine with warn/fail thresholds for function, file, and graph **New file:** `src/manifesto.js` (511 lines) -### 2.5.7 -- Architecture Boundary Rules ✅ +### 2.6.7 -- Architecture Boundary Rules ✅ Architecture enforcement using glob patterns and presets. @@ -285,7 +287,7 @@ Architecture enforcement using glob patterns and presets. **New file:** `src/boundaries.js` (347 lines) -### 2.5.8 -- CI Validation Predicates (`check`) ✅ +### 2.6.8 -- CI Validation Predicates (`check`) ✅ Structured pass/fail checks for CI pipelines. @@ -299,7 +301,7 @@ Structured pass/fail checks for CI pipelines. **New file:** `src/check.js` (433 lines) -### 2.5.9 -- Composite Analysis Commands ✅ +### 2.6.9 -- Composite Analysis Commands ✅ High-level commands that compose multiple analysis steps. @@ -309,7 +311,7 @@ High-level commands that compose multiple analysis steps. **New files:** `src/audit.js` (424 lines), `src/batch.js` (91 lines), `src/triage.js` (274 lines) -### 2.5.10 -- Hybrid Search ✅ +### 2.6.10 -- Hybrid Search ✅ BM25 keyword search + semantic vector search with RRF fusion. @@ -321,7 +323,7 @@ BM25 keyword search + semantic vector search with RRF fusion. **Affected file:** `src/embedder.js` (grew from 525 -> 1,113 lines) -### 2.5.11 -- Supporting Infrastructure ✅ +### 2.6.11 -- Supporting Infrastructure ✅ Cross-cutting utilities added during the expansion. @@ -333,7 +335,7 @@ Cross-cutting utilities added during the expansion. 
- ✅ **Journal:** change journal validation/management (`src/journal.js`, 110 lines) - ✅ **Update Check:** npm registry polling with 24h cache (`src/update-check.js`, 161 lines) -### 2.5.12 -- MCP Tool Expansion ✅ +### 2.6.12 -- MCP Tool Expansion ✅ MCP grew from 12 -> 25 tools, covering all new analysis capabilities. @@ -365,7 +367,7 @@ MCP grew from 12 -> 25 tools, covering all new analysis capabilities. **Goal:** Add deeper static analysis capabilities (dataflow, control flow graphs, AST querying), enrich the graph model with sub-declaration node types and structural edges, refactor extractors into per-language modules, consolidate the CLI surface area, and introduce interactive visualization. This phase emerged from competitive analysis against Joern and Narsil-MCP. -### 2.7.1 -- Dataflow Analysis ✅ +### 2.8.1 -- Dataflow Analysis ✅ Define-use chain extraction tracking how data flows between functions. @@ -382,7 +384,7 @@ Define-use chain extraction tracking how data flows between functions. **New file:** `src/dataflow.js` (1,187 lines) -### 2.7.2 -- Expanded Node Types (Phase 1) ✅ +### 2.8.2 -- Expanded Node Types (Phase 1) ✅ Extend the graph model with sub-declaration node kinds. @@ -396,7 +398,7 @@ Extend the graph model with sub-declaration node kinds. **Affected files:** All extractors, `src/builder.js`, `src/queries.js`, `src/db.js` -### 2.7.3 -- Expanded Edge Types (Phase 2) ✅ +### 2.8.3 -- Expanded Edge Types (Phase 2) ✅ Structural edges for richer graph relationships. @@ -407,7 +409,7 @@ Structural edges for richer graph relationships. **Affected files:** `src/builder.js`, `src/queries.js` -### 2.7.4 -- Intraprocedural Control Flow Graph (CFG) ✅ +### 2.8.4 -- Intraprocedural Control Flow Graph (CFG) ✅ Basic-block control flow graph construction from function ASTs. @@ -422,7 +424,7 @@ Basic-block control flow graph construction from function ASTs. 
**New file:** `src/cfg.js` (1,451 lines) -### 2.7.5 -- Stored Queryable AST Nodes ✅ +### 2.8.5 -- Stored Queryable AST Nodes ✅ Persist and query selected AST node types for pattern-based codebase exploration. @@ -437,7 +439,7 @@ Persist and query selected AST node types for pattern-based codebase exploration **New file:** `src/ast.js` (392 lines) -### 2.7.6 -- Extractors Refactoring ✅ +### 2.8.6 -- Extractors Refactoring ✅ Split per-language extractors from monolithic `parser.js` into dedicated modules. @@ -451,7 +453,7 @@ Split per-language extractors from monolithic `parser.js` into dedicated modules **New directory:** `src/extractors/` -### 2.7.7 -- normalizeSymbol Utility ✅ +### 2.8.7 -- normalizeSymbol Utility ✅ Stable JSON schema for symbol output across all query functions. @@ -461,7 +463,7 @@ Stable JSON schema for symbol output across all query functions. **Affected file:** `src/queries.js` -### 2.7.8 -- Interactive Graph Viewer ✅ +### 2.8.8 -- Interactive Graph Viewer ✅ Self-contained HTML visualization with vis-network. @@ -478,7 +480,7 @@ Self-contained HTML visualization with vis-network. **New file:** `src/viewer.js` (948 lines) -### 2.7.9 -- Exports Command ✅ +### 2.8.9 -- Exports Command ✅ Per-symbol consumer analysis for file exports. @@ -489,7 +491,7 @@ Per-symbol consumer analysis for file exports. **Affected file:** `src/queries.js` -### 2.7.10 -- Export Format Expansion ✅ +### 2.8.10 -- Export Format Expansion ✅ Three new graph export formats for external tooling integration. @@ -499,7 +501,7 @@ Three new graph export formats for external tooling integration. **Affected file:** `src/export.js` (681 lines) -### 2.7.11 -- CLI Consolidation ✅ +### 2.8.11 -- CLI Consolidation ✅ First CLI surface area reduction -- 5 commands merged into existing ones. @@ -512,7 +514,7 @@ First CLI surface area reduction -- 5 commands merged into existing ones. 
**Affected file:** `src/cli.js` -### 2.7.12 -- MCP Tool Consolidation & Expansion ✅ +### 2.8.12 -- MCP Tool Consolidation & Expansion ✅ MCP tools were both consolidated and expanded, resulting in a net change from 25 → 30 tools (31 in multi-repo mode). @@ -540,7 +542,7 @@ Plus updated enums on existing tools (edge_kinds, symbol kinds). ### 2.7 Summary -| Metric | Before (v2.6.0) | After (v3.0.0) | Delta | +| Metric | Before (v2.7.0) | After (v3.0.0) | Delta | |--------|-----------------|-----------------|-------| | Source modules | 35 | 50 | +15 | | Total source lines | 17,830 | 26,277 | +47% | @@ -991,13 +993,126 @@ Practical cleanup to make the CLI surface match the internal composability that --- -## Phase 4 -- TypeScript Migration +## Phase 4 -- Native Analysis Acceleration + +**Goal:** Move the remaining JS-only build phases to Rust so that `--engine native` eliminates all redundant WASM visitor walks. Today only 3 of 10 build phases (parse, resolve imports, build edges) run in Rust — the other 7 execute identical JavaScript regardless of engine, leaving ~50% of native build time on the table. + +**Why its own phase:** This is a substantial Rust engineering effort — porting 6 JS visitors to `crates/codegraph-core/`, fixing a data loss bug in incremental rebuilds, and optimizing the 1-file rebuild path. Doing this before the TS migration avoids rewriting the same visitor code twice (once to TS, once to Rust). The Phase 3 module boundaries make each phase a self-contained target. 
+ +**Evidence (v3.1.4 benchmarks on 398 files):** + +| Phase | Native | WASM | Ratio | Status | +|-------|-------:|-----:|------:|--------| +| Parse | 468ms | 1483ms | 3.2x faster | Already Rust | +| Build edges | 88ms | 152ms | 1.7x faster | Already Rust | +| Resolve imports | 8ms | 9ms | ~1x | Already Rust | +| **AST nodes** | **361ms** | **347ms** | **~1x** | JS visitor — biggest win | +| **CFG** | **126ms** | **125ms** | **~1x** | JS visitor | +| **Dataflow** | **100ms** | **98ms** | **~1x** | JS visitor | +| **Insert nodes** | **143ms** | **148ms** | **~1x** | Pure SQLite batching | +| **Roles** | **29ms** | **32ms** | **~1x** | JS classification | +| **Structure** | **13ms** | **17ms** | **~1x** | JS directory tree | +| Complexity | 16ms | 77ms | 5x faster | Partly pre-computed | + +**Target:** Reduce native full-build time from ~1,400ms to ~700ms (2x improvement) by eliminating ~690ms of redundant JS visitor work. + +### 4.1 -- AST Node Extraction in Rust + +The largest single opportunity. Currently the native parser returns partial AST node data, so the JS `buildAstNodes()` visitor re-walks all WASM trees anyway (~361ms). + +- Extend `crates/codegraph-core/` to extract all AST node types (`call`, `new`, `string`, `regex`, `throw`, `await`) during the native parse phase +- Return complete AST node data in the `FileSymbols` result so `run-analyses.js` can skip the WASM walker entirely +- Validate parity: ensure native extraction produces identical node counts to the WASM visitor (benchmark already tracks this via `nodes/file`) + +**Affected files:** `crates/codegraph-core/src/lib.rs`, `src/features/ast.js`, `src/domain/graph/builder/stages/run-analyses.js` + +### 4.2 -- CFG Construction in Rust + +The intraprocedural control-flow graph visitor runs in JS even on native builds (~126ms). 
+ +- Port `createCfgVisitor()` logic to Rust: basic block detection, branch/loop edges, entry/exit nodes +- Return CFG block data per function in `FileSymbols` so the JS visitor is fully bypassed +- Validate parity: CFG block counts and edge counts must match the WASM visitor output + +**Affected files:** `crates/codegraph-core/src/lib.rs`, `src/features/cfg.js`, `src/ast-analysis/visitors/cfg-visitor.js` + +### 4.3 -- Dataflow Analysis in Rust + +Dataflow edges are computed by a JS visitor that walks WASM trees (~100ms on native builds). + +- Port `createDataflowVisitor()` to Rust: variable definitions, assignments, reads, def-use chains +- Return dataflow edges in `FileSymbols` +- Validate parity against WASM visitor output + +**Affected files:** `crates/codegraph-core/src/lib.rs`, `src/features/dataflow.js`, `src/ast-analysis/visitors/dataflow-visitor.js` + +### 4.4 -- Batch SQLite Inserts via Rust + +`insertNodes` is pure SQLite work (~143ms) but runs row-by-row from JS. Batching in Rust can reduce JS↔native boundary crossings. + +- Expose a `batchInsertNodes(nodes[])` function from Rust that uses a single prepared statement in a transaction +- Alternatively, generate the SQL batch on the JS side and execute as a single `better-sqlite3` call (may be sufficient without Rust) +- Benchmark both approaches; pick whichever is faster + +**Affected files:** `crates/codegraph-core/src/lib.rs`, `src/db/index.js`, `src/domain/graph/builder/stages/insert-nodes.js` + +### 4.5 -- Role Classification & Structure in Rust + +Smaller wins (~42ms combined) but complete the picture of a fully native build pipeline. 
+ +- Port `classifyNodeRoles()` to Rust: hub/leaf/bridge/utility classification based on in/out degree and betweenness +- Port directory structure building and metrics aggregation +- Return role assignments and structure data alongside parse results + +**Affected files:** `crates/codegraph-core/src/lib.rs`, `src/features/structure.js`, `src/domain/graph/builder/stages/build-structure.js` + +### 4.6 -- Complete Complexity Pre-computation + +Complexity is partly pre-computed by native (~16ms vs 77ms WASM) but not all functions are covered. + +- Ensure native parse computes cognitive, cyclomatic, Halstead, and MI metrics for every function, not just a subset +- Eliminate the WASM fallback path in `buildComplexityMetrics()` when running native + +**Affected files:** `crates/codegraph-core/src/lib.rs`, `src/features/complexity.js` + +### 4.7 -- Fix Incremental Rebuild Data Loss on Native Engine + +**Bug:** On native 1-file rebuilds, complexity, CFG, and dataflow data for the changed file is **silently lost**. `purgeFilesFromGraph` removes the old data, but the analysis phases never re-compute it because: + +1. The native parser does not produce a `_tree` (WASM tree-sitter tree) +2. The unified walker at `src/ast-analysis/engine.js:108-109` skips files without `_tree` +3. The `buildXxx` functions check for pre-computed fields (`d.complexity`, `d.cfg?.blocks`) which the native parser does not provide for these analyses +4. Result: 0.1ms no-op — the phases run but do nothing + +This is confirmed by the v3.1.4 1-file rebuild data: complexity (0.1ms), CFG (0.1ms), dataflow (0.2ms) on native — these are just module import overhead, not actual computation. Contrast with v3.1.3 where the numbers were higher (1.3ms, 8.7ms, 4ms) because earlier versions triggered a WASM fallback tree via `ensureWasmTrees`. 
+ +**Fix (prerequisite: 4.1–4.3):** Once the native parser returns complete AST nodes, CFG blocks, and dataflow edges in `FileSymbols`, the `run-analyses` stage can store them directly without needing a WASM tree. The incremental path must: + +- Ensure `parseFilesAuto()` returns pre-computed analysis data for the single changed file +- Have `run-analyses.js` store that data (currently it only stores if `_tree` exists or if pre-computed fields are present — the latter path needs to work reliably) +- Add an integration test: rebuild 1 file on native engine, then query its complexity/CFG/dataflow and assert non-empty results + +**Affected files:** `src/ast-analysis/engine.js`, `src/domain/graph/builder/stages/run-analyses.js`, `src/domain/parser.js`, `tests/integration/` + +### 4.8 -- Incremental Rebuild Performance + +With analysis data loss fixed, optimize the 1-file rebuild path end-to-end. Current native 1-file rebuild is 265ms — dominated by parse (51ms), structure (13ms), roles (27ms), edges (13ms), insert (12ms), and finalize (12ms). + +- **Skip unchanged phases:** Structure and roles are graph-wide computations. On a 1-file change, only the changed file's nodes/edges need updating — skip full reclassification unless the file's degree changed significantly +- **Incremental edge rebuild:** Only rebuild edges involving the changed file's symbols, not the full edge set +- **Benchmark target:** Sub-100ms native 1-file rebuilds (from current 265ms) + +**Affected files:** `src/domain/graph/builder/stages/build-structure.js`, `src/domain/graph/builder/stages/build-edges.js`, `src/domain/graph/builder/pipeline.js` + +--- + +## Phase 5 -- TypeScript Migration **Goal:** Migrate the codebase from plain JavaScript to TypeScript, leveraging the clean module boundaries established in Phase 3. Incremental module-by-module migration starting from leaf modules inward. 
**Why after Phase 3:** The architectural refactoring creates small, well-bounded modules with explicit interfaces (Repository, Engine, BaseExtractor, Pipeline stages, Command objects). These are natural type boundaries -- typing monolithic 2,000-line files that are about to be split would be double work. -### 4.1 -- Project Setup +### 5.1 -- Project Setup - Add `typescript` as a devDependency - Create `tsconfig.json` with strict mode, ES module output, path aliases matching the Phase 3 module structure @@ -1008,7 +1123,7 @@ Practical cleanup to make the CLI surface match the internal composability that **Affected files:** `package.json`, `biome.json`, new `tsconfig.json` -### 4.2 -- Core Type Definitions +### 5.2 -- Core Type Definitions Define TypeScript interfaces for all abstractions introduced in Phase 3: @@ -1036,7 +1151,7 @@ These interfaces serve as the migration contract -- each module is migrated to s **New file:** `src/types.ts` -### 4.3 -- Leaf Module Migration +### 5.3 -- Leaf Module Migration Migrate modules with no internal dependencies first: @@ -1053,7 +1168,7 @@ Migrate modules with no internal dependencies first: Allow `.js` and `.ts` to coexist during migration (`allowJs: true` in tsconfig). 
-### 4.4 -- Core Module Migration +### 5.4 -- Core Module Migration Migrate modules that implement Phase 3 interfaces: @@ -1068,7 +1183,7 @@ Migrate modules that implement Phase 3 interfaces: | `src/analysis/*.ts` | Typed analysis results (impact scores, call chains) | | `src/resolve.ts` | Import resolution with confidence types | -### 4.5 -- Orchestration & Public API Migration +### 5.5 -- Orchestration & Public API Migration Migrate top-level orchestration and entry points: @@ -1081,7 +1196,7 @@ Migrate top-level orchestration and entry points: | `src/cli/*.ts` | Command objects with typed options | | `src/index.ts` | Curated public API with proper export types | -### 4.6 -- Test Migration +### 5.6 -- Test Migration - Migrate test files from `.js` to `.ts` - Add type-safe test utilities and fixture builders @@ -1092,7 +1207,7 @@ Migrate top-level orchestration and entry points: **Affected files:** All `src/**/*.js` -> `src/**/*.ts`, all `tests/**/*.js` -> `tests/**/*.ts`, `package.json`, `biome.json` -### 4.7 -- Supply-Chain Security & Audit +### 5.7 -- Supply-Chain Security & Audit **Gap:** No `npm audit` in CI pipeline. No supply-chain attestation (SLSA/SBOM). No formal security audit history. @@ -1105,33 +1220,33 @@ Migrate top-level orchestration and entry points: **Affected files:** `.github/workflows/ci.yml`, `.github/workflows/publish.yml`, `docs/security/` -### 4.8 -- CI Test Quality & Coverage Gates +### 5.8 -- CI Test Quality & Coverage Gates **Gaps:** - No coverage thresholds enforced in CI (coverage report runs locally only) - Embedding tests in separate workflow requiring HuggingFace token - 312 `setTimeout`/`sleep` instances in tests — potential flakiness under load -- No dependency audit step in CI (see also [4.7](#47----supply-chain-security--audit)) +- No dependency audit step in CI (see also [5.7](#57----supply-chain-security--audit)) **Deliverables:** 1. **Coverage gate** -- add `vitest --coverage` to CI with minimum threshold (e.g. 
80% lines/branches); fail the pipeline when coverage drops below the threshold 2. **Unified test workflow** -- merge embedding tests into the main CI workflow using a securely stored `HF_TOKEN` secret; eliminate the separate workflow 3. **Timer cleanup** -- audit and reduce `setTimeout`/`sleep` usage in tests; replace with deterministic waits (event-based, polling with backoff, or `vi.useFakeTimers()`) to reduce flakiness -4. > _Dependency audit step is covered by [4.7](#47----supply-chain-security--audit) deliverable 1._ +4. > _Dependency audit step is covered by [5.7](#57----supply-chain-security--audit) deliverable 1._ **Affected files:** `.github/workflows/ci.yml`, `vitest.config.js`, `tests/` --- -## Phase 5 -- Runtime & Extensibility +## Phase 6 -- Runtime & Extensibility -**Goal:** Harden the runtime for large codebases and open the platform to external contributors. These items were deferred from Phase 3 -- they depend on the clean module boundaries and domain layering established there, and benefit from TypeScript's type safety (Phase 4) for safe refactoring of cross-cutting concerns like caching, streaming, and plugin contracts. +**Goal:** Harden the runtime for large codebases and open the platform to external contributors. These items were deferred from Phase 3 -- they depend on the clean module boundaries and domain layering established there, and benefit from TypeScript's type safety (Phase 5) for safe refactoring of cross-cutting concerns like caching, streaming, and plugin contracts. **Why after TypeScript Migration:** Several of these items introduce new internal contracts (plugin API, cache interface, streaming protocol, engine strategy). Defining those contracts in TypeScript from the start avoids a second migration pass and gives contributors type-checked extension points. -### 5.1 -- Event-Driven Pipeline +### 6.1 -- Event-Driven Pipeline Replace the synchronous build/analysis pipeline with an event/streaming architecture. 
Enables progress reporting, cancellation tokens, and bounded memory usage on large repositories (10K+ files). @@ -1143,7 +1258,7 @@ Replace the synchronous build/analysis pipeline with an event/streaming architec **Affected files:** `src/domain/graph/builder.js`, `src/cli/`, `src/mcp/` -### 5.2 -- Unified Engine Interface (Strategy Pattern) +### 6.2 -- Unified Engine Interface (Strategy Pattern) Replace scattered `engine.name === 'native'` / `engine === 'wasm'` branching throughout the codebase with a formal Strategy pattern. Each engine implements a common `ParsingEngine` interface with methods like `parse(file)`, `batchParse(files)`, `supports(language)`, and `capabilities()`. @@ -1155,7 +1270,7 @@ Replace scattered `engine.name === 'native'` / `engine === 'wasm'` branching thr **Affected files:** `src/infrastructure/native.js`, `src/domain/parser.js`, `src/domain/graph/builder.js` -### 5.3 -- Subgraph Export Filtering +### 6.3 -- Subgraph Export Filtering Add focus and depth controls to `codegraph export` so users can produce usable visualizations of specific subsystems rather than the entire graph. @@ -1172,7 +1287,7 @@ codegraph export --focus "buildGraph" --depth 3 --format dot **Affected files:** `src/features/export.js`, `src/presentation/export.js` -### 5.4 -- Transitive Import-Aware Confidence +### 6.4 -- Transitive Import-Aware Confidence Improve import resolution accuracy by walking the import graph before falling back to proximity heuristics. Currently the 6-level priority system uses directory proximity as a strong signal, but this can mis-resolve when a symbol is re-exported through an index file several directories away. @@ -1183,7 +1298,7 @@ Improve import resolution accuracy by walking the import graph before falling ba **Affected files:** `src/domain/graph/resolve.js` -### 5.5 -- Query Result Caching +### 6.5 -- Query Result Caching Add an LRU/TTL cache layer between the analysis/query functions and the SQLite repository. 
With 34+ MCP tools that often run overlapping queries within a session, caching eliminates redundant DB round-trips. @@ -1196,7 +1311,7 @@ Add an LRU/TTL cache layer between the analysis/query functions and the SQLite r **Affected files:** `src/domain/analysis/`, `src/db/index.js` -### 5.6 -- Configuration Profiles +### 6.6 -- Configuration Profiles Support named configuration profiles for monorepos and multi-service projects where different parts of the codebase need different settings. @@ -1217,7 +1332,7 @@ Support named configuration profiles for monorepos and multi-service projects wh **Affected files:** `src/infrastructure/config.js`, `src/cli/` -### 5.7 -- Pagination Standardization +### 6.7 -- Pagination Standardization Standardize SQL-level `LIMIT`/`OFFSET` pagination across all repository queries and surface it consistently through the CLI and MCP. @@ -1229,7 +1344,7 @@ Standardize SQL-level `LIMIT`/`OFFSET` pagination across all repository queries **Affected files:** `src/shared/paginate.js`, `src/db/index.js`, `src/domain/analysis/`, `src/mcp/` -### 5.8 -- Plugin System for Custom Commands +### 6.8 -- Plugin System for Custom Commands Allow users to extend codegraph with custom commands by dropping a JS/TS module into `~/.codegraph/plugins/` (global) or `.codegraph/plugins/` (project-local). @@ -1257,7 +1372,7 @@ export function data(db: Database, args: ParsedArgs, config: Config): object { **Affected files:** `src/cli/`, `src/mcp/`, new `src/infrastructure/plugins.js` -### 5.9 -- Developer Experience & Onboarding +### 6.9 -- Developer Experience & Onboarding Lower the barrier to first successful use. Today codegraph requires manual install, manual config, and prior knowledge of which command to run next. @@ -1271,13 +1386,13 @@ Lower the barrier to first successful use. 
Today codegraph requires manual insta --- -## Phase 6 -- Intelligent Embeddings +## Phase 7 -- Intelligent Embeddings **Goal:** Dramatically improve semantic search quality by embedding natural-language descriptions instead of raw code. -> **Phase 6.3 (Hybrid Search) was completed early** during Phase 2.5 -- FTS5 BM25 + semantic search with RRF fusion is already shipped in v2.6.0. +> **Phase 7.3 (Hybrid Search) was completed early** during Phase 2.5 -- FTS5 BM25 + semantic search with RRF fusion is already shipped in v2.7.0. -### 6.1 -- LLM Description Generator +### 7.1 -- LLM Description Generator For each function/method/class node, generate a concise natural-language description: @@ -1305,7 +1420,7 @@ For each function/method/class node, generate a concise natural-language descrip **New file:** `src/describer.js` -### 6.2 -- Enhanced Embedding Pipeline +### 7.2 -- Enhanced Embedding Pipeline - When descriptions exist, embed the description text instead of raw code - Keep raw code as fallback when no description is available @@ -1316,11 +1431,11 @@ For each function/method/class node, generate a concise natural-language descrip **Affected files:** `src/embedder.js` -### ~~6.3 -- Hybrid Search~~ ✅ Completed in Phase 2.5 +### ~~7.3 -- Hybrid Search~~ ✅ Completed in Phase 2.5 -Shipped in v2.6.0. FTS5 BM25 keyword search + semantic vector search with RRF fusion. Three search modes: `hybrid` (default), `semantic`, `keyword`. +Shipped in v2.7.0. FTS5 BM25 keyword search + semantic vector search with RRF fusion. Three search modes: `hybrid` (default), `semantic`, `keyword`. -### 6.4 -- Build-time Semantic Metadata +### 7.4 -- Build-time Semantic Metadata Enrich nodes with LLM-generated metadata beyond descriptions. Computed incrementally at build time (only for changed nodes), stored as columns on the `nodes` table. @@ -1333,9 +1448,9 @@ Enrich nodes with LLM-generated metadata beyond descriptions. 
Computed increment - MCP tool: `assess ` -- returns complexity rating + specific concerns - Cascade invalidation: when a node changes, mark dependents for re-enrichment -**Depends on:** 6.1 (LLM provider abstraction) +**Depends on:** 7.1 (LLM provider abstraction) -### 6.5 -- Module Summaries +### 7.5 -- Module Summaries Aggregate function descriptions + dependency direction into file-level narratives. @@ -1343,17 +1458,17 @@ Aggregate function descriptions + dependency direction into file-level narrative - MCP tool: `explain_module ` -- returns module purpose, key exports, role in the system - `naming_conventions` metadata per module -- detected patterns (camelCase, snake_case, verb-first), flag outliers -**Depends on:** 6.1 (function-level descriptions must exist first) +**Depends on:** 7.1 (function-level descriptions must exist first) > **Full spec:** See [llm-integration.md](./llm-integration.md) for detailed architecture, infrastructure table, and prompt design. --- -## Phase 7 -- Natural Language Queries +## Phase 8 -- Natural Language Queries **Goal:** Allow developers to ask questions about their codebase in plain English. -### 7.1 -- Query Engine +### 8.1 -- Query Engine ```bash codegraph ask "How does the authentication flow work?" @@ -1379,7 +1494,7 @@ codegraph ask "How does the authentication flow work?" **New file:** `src/nlquery.js` -### 7.2 -- Conversational Sessions +### 8.2 -- Conversational Sessions Multi-turn conversations with session memory. @@ -1393,7 +1508,7 @@ codegraph sessions clear - Store conversation history in SQLite table `sessions` - Include prior Q&A pairs in subsequent prompts -### 7.3 -- MCP Integration +### 8.3 -- MCP Integration New MCP tool: `ask_codebase` -- natural language query via MCP. @@ -1401,7 +1516,7 @@ Enables AI coding agents (Claude Code, Cursor, etc.) 
to ask codegraph questions **Affected files:** `src/mcp.js` -### 7.4 -- LLM-Narrated Graph Queries +### 8.4 -- LLM-Narrated Graph Queries Graph traversal + LLM narration for questions that require both structural data and natural-language explanation. Each query walks the graph first, then sends the structural result to the LLM for narration. @@ -1414,9 +1529,9 @@ Graph traversal + LLM narration for questions that require both structural data Pre-computed `flow_narratives` table caches results for key entry points at build time, invalidated when any node in the chain changes. -**Depends on:** 6.4 (`side_effects` metadata), 6.1 (descriptions for narration context) +**Depends on:** 7.4 (`side_effects` metadata), 7.1 (descriptions for narration context) -### 7.5 -- Onboarding & Navigation Tools +### 8.5 -- Onboarding & Navigation Tools Help new contributors and AI agents orient in an unfamiliar codebase. @@ -1425,15 +1540,15 @@ Help new contributors and AI agents orient in an unfamiliar codebase. - MCP tool: `get_started` -- returns ordered list: "start here, then read this, then this" - `change_plan ` -- LLM reads description, graph identifies relevant modules, returns touch points and test coverage gaps -**Depends on:** 6.5 (module summaries for context), 7.1 (query engine) +**Depends on:** 7.5 (module summaries for context), 8.1 (query engine) --- -## Phase 8 -- Expanded Language Support +## Phase 9 -- Expanded Language Support **Goal:** Go from 11 -> 19 supported languages. -### 8.1 -- Batch 1: High Demand +### 9.1 -- Batch 1: High Demand | Language | Extensions | Grammar | Effort | |----------|-----------|---------|--------| @@ -1442,7 +1557,7 @@ Help new contributors and AI agents orient in an unfamiliar codebase. 
| Kotlin | `.kt`, `.kts` | `tree-sitter-kotlin` | Low | | Swift | `.swift` | `tree-sitter-swift` | Medium | -### 8.2 -- Batch 2: Growing Ecosystems +### 9.2 -- Batch 2: Growing Ecosystems | Language | Extensions | Grammar | Effort | |----------|-----------|---------|--------| @@ -1451,7 +1566,7 @@ Help new contributors and AI agents orient in an unfamiliar codebase. | Lua | `.lua` | `tree-sitter-lua` | Low | | Zig | `.zig` | `tree-sitter-zig` | Low | -### 8.3 -- Parser Abstraction Layer +### 9.3 -- Parser Abstraction Layer Extract shared patterns from existing extractors into reusable helpers. @@ -1467,13 +1582,13 @@ Extract shared patterns from existing extractors into reusable helpers. --- -## Phase 9 -- GitHub Integration & CI +## Phase 10 -- GitHub Integration & CI **Goal:** Bring codegraph's analysis into pull request workflows. > **Note:** Phase 2.5 delivered `codegraph check` (CI validation predicates with exit code 0/1), which provides the foundation for GitHub Action integration. The boundary violation, blast radius, and cycle detection predicates are already available. -### 9.1 -- Reusable GitHub Action +### 10.1 -- Reusable GitHub Action A reusable GitHub Action that runs on PRs: @@ -1496,7 +1611,7 @@ A reusable GitHub Action that runs on PRs: **New file:** `.github/actions/codegraph-ci/action.yml` -### 9.2 -- PR Review Integration +### 10.2 -- PR Review Integration ```bash codegraph review --pr @@ -1519,7 +1634,7 @@ Requires `gh` CLI. For each changed function: **New file:** `src/github.js` -### 9.3 -- Visual Impact Graphs for PRs +### 10.3 -- Visual Impact Graphs for PRs Extend the existing `diff-impact --format mermaid` foundation with CI automation and LLM annotations. 
@@ -1540,15 +1655,15 @@ Extend the existing `diff-impact --format mermaid` foundation with CI automation - Highlight fragile nodes: high churn + high fan-in = high breakage risk - Track blast radius trends: "this PR's blast radius is 2x larger than your average" -**Depends on:** 9.1 (GitHub Action), 6.4 (`risk_score`, `side_effects`) +**Depends on:** 10.1 (GitHub Action), 7.4 (`risk_score`, `side_effects`) -### 9.4 -- SARIF Output +### 10.4 -- SARIF Output Add SARIF output format for cycle detection. SARIF integrates with GitHub Code Scanning, showing issues inline in the PR. **Affected files:** `src/export.js` -### 9.5 -- Auto-generated Docstrings +### 10.5 -- Auto-generated Docstrings ```bash codegraph annotate @@ -1557,15 +1672,15 @@ codegraph annotate --changed-only LLM-generated docstrings aware of callers, callees, and types. Diff-aware: only regenerate for functions whose code or dependencies changed. Stores in `docstrings` column on nodes table -- does not modify source files unless explicitly requested. -**Depends on:** 6.1 (LLM provider abstraction), 6.4 (side effects context) +**Depends on:** 7.1 (LLM provider abstraction), 7.4 (side effects context) --- -## Phase 10 -- Interactive Visualization & Advanced Features +## Phase 11 -- Interactive Visualization & Advanced Features -### 10.1 -- Interactive Web Visualization (Partially Complete) +### 11.1 -- Interactive Web Visualization (Partially Complete) -> **Phase 2.7 progress:** `codegraph plot` (Phase 2.7.8) ships a self-contained HTML viewer with vis-network. It supports layout switching, color/size/cluster overlays, drill-down, community detection, and a detail panel. The remaining work is the server-based experience below. +> **Phase 2.7 progress:** `codegraph plot` (Phase 2.8.8) ships a self-contained HTML viewer with vis-network. It supports layout switching, color/size/cluster overlays, drill-down, community detection, and a detail panel. The remaining work is the server-based experience below. 
```bash codegraph viz @@ -1584,7 +1699,7 @@ Opens a local web UI at `localhost:3000` extending the static HTML viewer with: **New file:** `src/visualizer.js` -### 10.2 -- Dead Code Detection +### 11.2 -- Dead Code Detection ```bash codegraph dead @@ -1597,7 +1712,7 @@ Find functions/methods/classes with zero incoming edges (never called). Filters **Affected files:** `src/queries.js` -### 10.3 -- Cross-Repository Support (Monorepo) +### 11.3 -- Cross-Repository Support (Monorepo) Support multi-package monorepos with cross-package edges. @@ -1607,7 +1722,7 @@ Support multi-package monorepos with cross-package edges. - `codegraph build --workspace` to scan all packages - Impact analysis across package boundaries -### 10.4 -- Agentic Search +### 11.4 -- Agentic Search Recursive reference-following search that traces connections. @@ -1629,7 +1744,7 @@ codegraph agent-search "payment processing" **New file:** `src/agentic-search.js` -### 10.5 -- Refactoring Analysis +### 11.5 -- Refactoring Analysis LLM-powered structural analysis that identifies refactoring opportunities. The graph provides the structural data; the LLM interprets it. @@ -1644,9 +1759,9 @@ LLM-powered structural analysis that identifies refactoring opportunities. The g > **Note:** `hotspots` and `boundary_analysis` already have data foundations from Phase 2.5 (structure.js hotspots, boundaries.js evaluation). This phase adds LLM interpretation on top. -**Depends on:** 6.4 (`risk_score`, `complexity_notes`), 6.5 (module summaries) +**Depends on:** 7.4 (`risk_score`, `complexity_notes`), 7.5 (module summaries) -### 10.6 -- Auto-generated Docstrings +### 11.6 -- Auto-generated Docstrings ```bash codegraph annotate @@ -1655,7 +1770,7 @@ codegraph annotate --changed-only LLM-generated docstrings aware of callers, callees, and types. Diff-aware: only regenerate for functions whose code or dependencies changed. 
Stores in `docstrings` column on nodes table -- does not modify source files unless explicitly requested. -**Depends on:** 6.1 (LLM provider abstraction), 6.4 (side effects context) +**Depends on:** 7.1 (LLM provider abstraction), 7.4 (side effects context) > **Full spec:** See [llm-integration.md](./llm-integration.md) for detailed architecture, infrastructure tables, and prompt design for all LLM-powered features. From 30fdd26a4a965694d76a33c14cb054f86b7bddd8 Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 16 Mar 2026 05:32:32 -0600 Subject: [PATCH 03/52] docs: fix sub-section numbering to match parent phase headings --- docs/roadmap/ROADMAP.md | 52 ++++++++++++++++++++--------------------- 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md index d37c9506..3f0c2abe 100644 --- a/docs/roadmap/ROADMAP.md +++ b/docs/roadmap/ROADMAP.md @@ -205,7 +205,7 @@ Support querying multiple codebases from a single MCP server instance. **Goal:** Build a comprehensive analysis toolkit on top of the graph -- complexity metrics, community detection, risk triage, architecture boundary enforcement, CI validation, and hybrid search. This phase emerged organically as features were needed and wasn't in the original roadmap. -### 2.6.1 -- Complexity Metrics ✅ +### 2.5.1 -- Complexity Metrics ✅ Per-function complexity analysis using language-specific AST rules. @@ -219,7 +219,7 @@ Per-function complexity analysis using language-specific AST rules. **New file:** `src/complexity.js` (2,163 lines) -### 2.6.2 -- Community Detection & Drift ✅ +### 2.5.2 -- Community Detection & Drift ✅ Louvain community detection at file or function level. @@ -230,7 +230,7 @@ Louvain community detection at file or function level. 
**New file:** `src/communities.js` (310 lines) -### 2.6.3 -- Structure & Role Classification ✅ +### 2.5.3 -- Structure & Role Classification ✅ Directory structure graph with node role classification. @@ -243,7 +243,7 @@ Directory structure graph with node role classification. **New file:** `src/structure.js` (668 lines) -### 2.6.4 -- Execution Flow Tracing ✅ +### 2.5.4 -- Execution Flow Tracing ✅ Forward BFS from framework entry points through callees to leaves. @@ -253,7 +253,7 @@ Forward BFS from framework entry points through callees to leaves. **New file:** `src/flow.js` (362 lines) -### 2.6.5 -- Temporal Coupling (Co-change Analysis) ✅ +### 2.5.5 -- Temporal Coupling (Co-change Analysis) ✅ Git history analysis for temporal file coupling. @@ -264,7 +264,7 @@ Git history analysis for temporal file coupling. **New file:** `src/cochange.js` (502 lines) -### 2.6.6 -- Manifesto Rule Engine ✅ +### 2.5.6 -- Manifesto Rule Engine ✅ Configurable rule engine with warn/fail thresholds for function, file, and graph rules. @@ -276,7 +276,7 @@ Configurable rule engine with warn/fail thresholds for function, file, and graph **New file:** `src/manifesto.js` (511 lines) -### 2.6.7 -- Architecture Boundary Rules ✅ +### 2.5.7 -- Architecture Boundary Rules ✅ Architecture enforcement using glob patterns and presets. @@ -287,7 +287,7 @@ Architecture enforcement using glob patterns and presets. **New file:** `src/boundaries.js` (347 lines) -### 2.6.8 -- CI Validation Predicates (`check`) ✅ +### 2.5.8 -- CI Validation Predicates (`check`) ✅ Structured pass/fail checks for CI pipelines. @@ -301,7 +301,7 @@ Structured pass/fail checks for CI pipelines. **New file:** `src/check.js` (433 lines) -### 2.6.9 -- Composite Analysis Commands ✅ +### 2.5.9 -- Composite Analysis Commands ✅ High-level commands that compose multiple analysis steps. @@ -311,7 +311,7 @@ High-level commands that compose multiple analysis steps. 
**New files:** `src/audit.js` (424 lines), `src/batch.js` (91 lines), `src/triage.js` (274 lines) -### 2.6.10 -- Hybrid Search ✅ +### 2.5.10 -- Hybrid Search ✅ BM25 keyword search + semantic vector search with RRF fusion. @@ -323,7 +323,7 @@ BM25 keyword search + semantic vector search with RRF fusion. **Affected file:** `src/embedder.js` (grew from 525 -> 1,113 lines) -### 2.6.11 -- Supporting Infrastructure ✅ +### 2.5.11 -- Supporting Infrastructure ✅ Cross-cutting utilities added during the expansion. @@ -335,7 +335,7 @@ Cross-cutting utilities added during the expansion. - ✅ **Journal:** change journal validation/management (`src/journal.js`, 110 lines) - ✅ **Update Check:** npm registry polling with 24h cache (`src/update-check.js`, 161 lines) -### 2.6.12 -- MCP Tool Expansion ✅ +### 2.5.12 -- MCP Tool Expansion ✅ MCP grew from 12 -> 25 tools, covering all new analysis capabilities. @@ -367,7 +367,7 @@ MCP grew from 12 -> 25 tools, covering all new analysis capabilities. **Goal:** Add deeper static analysis capabilities (dataflow, control flow graphs, AST querying), enrich the graph model with sub-declaration node types and structural edges, refactor extractors into per-language modules, consolidate the CLI surface area, and introduce interactive visualization. This phase emerged from competitive analysis against Joern and Narsil-MCP. -### 2.8.1 -- Dataflow Analysis ✅ +### 2.7.1 -- Dataflow Analysis ✅ Define-use chain extraction tracking how data flows between functions. @@ -384,7 +384,7 @@ Define-use chain extraction tracking how data flows between functions. **New file:** `src/dataflow.js` (1,187 lines) -### 2.8.2 -- Expanded Node Types (Phase 1) ✅ +### 2.7.2 -- Expanded Node Types (Phase 1) ✅ Extend the graph model with sub-declaration node kinds. @@ -398,7 +398,7 @@ Extend the graph model with sub-declaration node kinds. 
**Affected files:** All extractors, `src/builder.js`, `src/queries.js`, `src/db.js` -### 2.8.3 -- Expanded Edge Types (Phase 2) ✅ +### 2.7.3 -- Expanded Edge Types (Phase 2) ✅ Structural edges for richer graph relationships. @@ -409,7 +409,7 @@ Structural edges for richer graph relationships. **Affected files:** `src/builder.js`, `src/queries.js` -### 2.8.4 -- Intraprocedural Control Flow Graph (CFG) ✅ +### 2.7.4 -- Intraprocedural Control Flow Graph (CFG) ✅ Basic-block control flow graph construction from function ASTs. @@ -424,7 +424,7 @@ Basic-block control flow graph construction from function ASTs. **New file:** `src/cfg.js` (1,451 lines) -### 2.8.5 -- Stored Queryable AST Nodes ✅ +### 2.7.5 -- Stored Queryable AST Nodes ✅ Persist and query selected AST node types for pattern-based codebase exploration. @@ -439,7 +439,7 @@ Persist and query selected AST node types for pattern-based codebase exploration **New file:** `src/ast.js` (392 lines) -### 2.8.6 -- Extractors Refactoring ✅ +### 2.7.6 -- Extractors Refactoring ✅ Split per-language extractors from monolithic `parser.js` into dedicated modules. @@ -453,7 +453,7 @@ Split per-language extractors from monolithic `parser.js` into dedicated modules **New directory:** `src/extractors/` -### 2.8.7 -- normalizeSymbol Utility ✅ +### 2.7.7 -- normalizeSymbol Utility ✅ Stable JSON schema for symbol output across all query functions. @@ -463,7 +463,7 @@ Stable JSON schema for symbol output across all query functions. **Affected file:** `src/queries.js` -### 2.8.8 -- Interactive Graph Viewer ✅ +### 2.7.8 -- Interactive Graph Viewer ✅ Self-contained HTML visualization with vis-network. @@ -480,7 +480,7 @@ Self-contained HTML visualization with vis-network. **New file:** `src/viewer.js` (948 lines) -### 2.8.9 -- Exports Command ✅ +### 2.7.9 -- Exports Command ✅ Per-symbol consumer analysis for file exports. @@ -491,7 +491,7 @@ Per-symbol consumer analysis for file exports. 
**Affected file:** `src/queries.js` -### 2.8.10 -- Export Format Expansion ✅ +### 2.7.10 -- Export Format Expansion ✅ Three new graph export formats for external tooling integration. @@ -501,7 +501,7 @@ Three new graph export formats for external tooling integration. **Affected file:** `src/export.js` (681 lines) -### 2.8.11 -- CLI Consolidation ✅ +### 2.7.11 -- CLI Consolidation ✅ First CLI surface area reduction -- 5 commands merged into existing ones. @@ -514,7 +514,7 @@ First CLI surface area reduction -- 5 commands merged into existing ones. **Affected file:** `src/cli.js` -### 2.8.12 -- MCP Tool Consolidation & Expansion ✅ +### 2.7.12 -- MCP Tool Consolidation & Expansion ✅ MCP tools were both consolidated and expanded, resulting in a net change from 25 → 30 tools (31 in multi-repo mode). @@ -542,7 +542,7 @@ Plus updated enums on existing tools (edge_kinds, symbol kinds). ### 2.7 Summary -| Metric | Before (v2.7.0) | After (v3.0.0) | Delta | +| Metric | Before (v2.7.0 baseline) | After (v3.0.0) | Delta | |--------|-----------------|-----------------|-------| | Source modules | 35 | 50 | +15 | | Total source lines | 17,830 | 26,277 | +47% | @@ -1680,7 +1680,7 @@ LLM-generated docstrings aware of callers, callees, and types. Diff-aware: only ### 11.1 -- Interactive Web Visualization (Partially Complete) -> **Phase 2.7 progress:** `codegraph plot` (Phase 2.8.8) ships a self-contained HTML viewer with vis-network. It supports layout switching, color/size/cluster overlays, drill-down, community detection, and a detail panel. The remaining work is the server-based experience below. +> **Phase 2.7 progress:** `codegraph plot` (Phase 2.7.8) ships a self-contained HTML viewer with vis-network. It supports layout switching, color/size/cluster overlays, drill-down, community detection, and a detail panel. The remaining work is the server-based experience below. 
```bash codegraph viz From 2fce6905fb3d69485b0aa2d5adf6ad3580ef106e Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 16 Mar 2026 23:22:57 -0600 Subject: [PATCH 04/52] fix: align version computation between publish.yml and bench-version.js - Add COMMITS=0 guard in publish.yml to return clean version when HEAD is exactly at a tag (mirrors bench-version.js early return) - Change bench-version.js to use PATCH+1-dev.COMMITS format instead of PATCH+COMMITS-dev.SHA (mirrors publish.yml's new scheme) - Fix fallback in bench-version.js to use dev.1 matching publish.yml's no-tags COMMITS=1 default Impact: 1 functions changed, 0 affected --- .github/workflows/publish.yml | 8 +++++--- scripts/bench-version.js | 19 ++++++------------- 2 files changed, 11 insertions(+), 16 deletions(-) diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index a6538d19..81a70e52 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -77,9 +77,11 @@ jobs: COMMITS=1 IFS='.' read -r MAJOR MINOR PATCH <<< "$CURRENT" fi - DEV_PATCH=$((PATCH + COMMITS)) - SHORT_SHA=$(echo "${{ github.sha }}" | cut -c1-7) - VERSION="${MAJOR}.${MINOR}.${DEV_PATCH}-dev.${SHORT_SHA}" + if [ "$COMMITS" -eq 0 ]; then + VERSION="${MAJOR}.${MINOR}.${PATCH}" + else + VERSION="${MAJOR}.${MINOR}.$((PATCH + 1))-dev.${COMMITS}" + fi NPM_TAG="dev" echo "Dev release: $VERSION (${COMMITS} commits since ${RELEASE_TAG:-none})" fi diff --git a/scripts/bench-version.js b/scripts/bench-version.js index accc7a8b..7fd2f84e 100644 --- a/scripts/bench-version.js +++ b/scripts/bench-version.js @@ -6,8 +6,8 @@ * 2. `git rev-list ..HEAD --count` → count commits since that tag * * - If HEAD is exactly tagged (0 commits): returns "2.5.0" - * - Otherwise: returns "2.5.N-dev.hash" (e.g. "2.5.3-dev.c50f7f5") - * where N = PATCH + commits since tag, hash = short commit SHA + * - Otherwise: returns "2.5.(PATCH+1)-dev.COMMITS" (e.g. 
"2.5.3-dev.45") + * where COMMITS = number of commits since the tag * * This prevents dev/dogfood benchmark runs from overwriting release data * in the historical benchmark reports (which deduplicate by version). @@ -38,24 +38,17 @@ export function getBenchmarkVersion(pkgVersion, cwd) { // Exact tag (0 commits since tag): return clean release version if (commits === 0) return `${major}.${minor}.${patch}`; - // Dev build: MAJOR.MINOR.(PATCH+COMMITS)-dev.SHORT_SHA - const hash = execFileSync('git', ['rev-parse', '--short', 'HEAD'], { cwd, ...GIT_OPTS }).trim(); - const devPatch = Number(patch) + commits; - return `${major}.${minor}.${devPatch}-dev.${hash}`; + // Dev build: MAJOR.MINOR.(PATCH+1)-dev.COMMITS + return `${major}.${minor}.${Number(patch) + 1}-dev.${commits}`; } catch { /* git not available or no tags */ } - // Fallback: no git or no tags — match publish.yml's no-tags behavior (PATCH+1-dev.SHA) + // Fallback: no git or no tags — match publish.yml's no-tags behavior (COMMITS=1) const parts = pkgVersion.split('.'); if (parts.length === 3) { const [major, minor, patch] = parts; - try { - const hash = execFileSync('git', ['rev-parse', '--short', 'HEAD'], { cwd, ...GIT_OPTS }).trim(); - return `${major}.${minor}.${Number(patch) + 1}-dev.${hash}`; - } catch { - return `${major}.${minor}.${Number(patch) + 1}-dev`; - } + return `${major}.${minor}.${Number(patch) + 1}-dev.1`; } return `${pkgVersion}-dev`; } From 3b6dccf482a3366b9c2851d7b20fe5069d485267 Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Tue, 17 Mar 2026 04:47:43 -0600 Subject: [PATCH 05/52] feat: auto-detect semver bump in /release skill when no version provided The release skill now scans commit history using conventional commit rules to determine major/minor/patch automatically. Explicit version argument still works as before. 
--- .claude/skills/release/SKILL.md | 50 +++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 11 deletions(-) diff --git a/.claude/skills/release/SKILL.md b/.claude/skills/release/SKILL.md index a05c2f9b..5df0febd 100644 --- a/.claude/skills/release/SKILL.md +++ b/.claude/skills/release/SKILL.md @@ -1,27 +1,55 @@ --- name: release description: Prepare a codegraph release — bump versions, update CHANGELOG, ROADMAP, BACKLOG, README, create PR -argument-hint: +argument-hint: "[version e.g. 3.1.1] (optional — auto-detects from commits)" allowed-tools: Bash, Read, Write, Edit, Glob, Grep, Agent --- -# Release v$ARGUMENTS +# Release -You are preparing a release for `@optave/codegraph` version **$ARGUMENTS**. +You are preparing a release for `@optave/codegraph`. + +**Version argument:** `$ARGUMENTS` +- If a version was provided (e.g. `3.1.1`), use it as the target version. +- If no version was provided (empty or blank `$ARGUMENTS`), you will auto-detect it in Step 1b. --- -## Step 1: Gather context +## Step 1a: Gather context Run these in parallel: -1. `git log --oneline v..HEAD` — all commits since the last release tag +1. `git log --oneline v..HEAD` — all commits since the last release tag (use `git describe --tags --match "v*" --abbrev=0` to find the previous tag) 2. Read `CHANGELOG.md` (first 80 lines) — understand the format 3. Read `package.json` — current version 4. `git describe --tags --match "v*" --abbrev=0` — find the previous stable release tag +## Step 1b: Determine version (if not provided) + +If `$ARGUMENTS` is empty or blank, determine the semver bump from the commits gathered in Step 1a. + +Scan **every commit message** between the last tag and HEAD. Apply these rules in priority order: + +| Condition | Bump | +|-----------|------| +| Any commit has a `BREAKING CHANGE:` or `BREAKING-CHANGE:` footer, **or** uses the `!` suffix (e.g. 
`feat!:`, `fix!:`, `refactor!:`) | **major** | +| Any commit uses `feat:` or `feat(scope):` | **minor** | +| Everything else (`fix:`, `refactor:`, `perf:`, `chore:`, `docs:`, `test:`, `ci:`, etc.) | **patch** | + +Given the current version `MAJOR.MINOR.PATCH` from `package.json`, compute the new version: +- **major** → `(MAJOR+1).0.0` +- **minor** → `MAJOR.(MINOR+1).0` +- **patch** → `MAJOR.MINOR.(PATCH+1)` + +Print the detected bump reason and the resolved version, e.g.: +> Detected **minor** bump (found `feat:` commits). Version: 3.1.0 → **3.2.0** + +Use the resolved version as `VERSION` for all subsequent steps. + +If `$ARGUMENTS` was provided, use it directly as `VERSION`. + ## Step 2: Bump version in package.json -Edit `package.json` to set `"version": "$ARGUMENTS"`. +Edit `package.json` to set `"version": "VERSION"`. **Do NOT bump:** - `crates/codegraph-core/Cargo.toml` — synced automatically by `scripts/sync-native-versions.js` during the publish workflow @@ -104,16 +132,16 @@ Run `grep` to confirm the new version appears in `package-lock.json` and that al ## Step 8: Create branch, commit, push, PR -1. Create branch: `git checkout -b release/$ARGUMENTS` +1. Create branch: `git checkout -b release/VERSION` 2. Stage only the files you changed: `CHANGELOG.md`, `package.json`, `package-lock.json`, `docs/roadmap/ROADMAP.md`, `docs/roadmap/BACKLOG.md` if changed, `README.md` if changed -3. Commit: `chore: release v$ARGUMENTS` -4. Push: `git push -u origin release/$ARGUMENTS` +3. Commit: `chore: release vVERSION` +4. Push: `git push -u origin release/VERSION` 5. 
Create PR: ``` -gh pr create --title "chore: release v$ARGUMENTS" --body "$(cat <<'EOF' +gh pr create --title "chore: release vVERSION" --body "$(cat <<'EOF' ## Summary -- Bump version to $ARGUMENTS +- Bump version to VERSION - Add CHANGELOG entry for all commits since previous release - Update ROADMAP progress From b0e5c30fafc3efb18c35f7bfb33db312f959cadc Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Sat, 21 Mar 2026 02:38:42 -0600 Subject: [PATCH 06/52] feat: add /titan-run orchestrator with diff review, semantic assertions, and architectural snapshot MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add /titan-run skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling end-to-end autonomous execution. Hardening layers added across the pipeline: - Pre-Agent Gate (G1-G4): git health, worktree validity, state integrity, backups - Post-phase validation (V1-V15): artifact structure, coverage, consistency checks - Stall detection with per-phase thresholds and no-progress abort - Mandatory human checkpoint before forge (unless --yes) New validation tools integrated into forge and gate: - Diff Review Agent (forge Step 9): verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs - Semantic Assertions (gate Step 5): export signature stability, import resolution integrity, dependency direction, re-export chain validation - Architectural Snapshot Comparator (gate Step 5.5): community stability, cross-domain dependency direction, cohesion delta, drift detection vs pre-forge baseline --- .claude/skills/titan-forge/SKILL.md | 293 +++++++++ .claude/skills/titan-gate/SKILL.md | 126 +++- .claude/skills/titan-run/SKILL.md | 574 ++++++++++++++++++ docs/examples/claude-code-skills/README.md | 44 +- .../claude-code-skills/titan-forge/SKILL.md | 293 +++++++++ 
.../claude-code-skills/titan-gate/SKILL.md | 126 +++- .../claude-code-skills/titan-run/SKILL.md | 574 ++++++++++++++++++ docs/use-cases/titan-paradigm.md | 27 +- 8 files changed, 2022 insertions(+), 35 deletions(-) create mode 100644 .claude/skills/titan-forge/SKILL.md create mode 100644 .claude/skills/titan-run/SKILL.md create mode 100644 docs/examples/claude-code-skills/titan-forge/SKILL.md create mode 100644 docs/examples/claude-code-skills/titan-run/SKILL.md diff --git a/.claude/skills/titan-forge/SKILL.md b/.claude/skills/titan-forge/SKILL.md new file mode 100644 index 00000000..44c4c36b --- /dev/null +++ b/.claude/skills/titan-forge/SKILL.md @@ -0,0 +1,293 @@ +--- +name: titan-forge +description: Execute the sync.json plan — refactor code, validate with /titan-gate, commit, and advance state (Titan Paradigm Phase 4) +argument-hint: <--phase N> <--target name> <--dry-run> +allowed-tools: Bash, Read, Write, Edit, Glob, Grep, Skill, Agent +--- + +# Titan FORGE — Execute Sync Plan + +You are running the **FORGE** phase of the Titan Paradigm. + +Your goal: read `sync.json`, find the next incomplete execution phase, make the actual code changes for each target, validate with `/titan-gate`, commit, and advance state. + +> **Context budget:** One phase per invocation. Do not attempt all phases in one session — the context window will fill. Run one phase, report, stop. User re-runs for the next phase. + +**Arguments** (from `$ARGUMENTS`): +- No args → run next incomplete phase +- `--phase N` → jump to specific phase +- `--target ` → run single target only (for retrying failures) +- `--dry-run` → show what would be done without changing code + +--- + +## Step 0 — Pre-flight + +1. **Worktree check:** + ```bash + git rev-parse --show-toplevel && git worktree list + ``` + If not in a worktree, stop: "Run `/worktree` first." + +2. 
**Sync with main:** + ```bash + git fetch origin main && git merge origin/main --no-edit + ``` + If there are merge conflicts, stop: "Merge conflict detected. Resolve conflicts and re-run `/titan-forge`." + +3. **Load artifacts.** Read: + - `.codegraph/titan/sync.json` — execution plan (if missing: "Run `/titan-sync` first.") + - `.codegraph/titan/titan-state.json` — current state + - `.codegraph/titan/gauntlet.ndjson` — per-target audit details + - `.codegraph/titan/gauntlet-summary.json` — aggregated results + +4. **Validate state.** If `titan-state.json` has `currentPhase` other than `"sync"` and no existing `execution` block, stop: "State not ready. Run `/titan-sync` first." + +5. **Initialize execution state** (if first run). Add to `titan-state.json`: + ```json + { + "execution": { + "currentPhase": 1, + "completedPhases": [], + "currentTarget": null, + "completedTargets": [], + "failedTargets": [], + "commits": [] + } + } + ``` + +6. **Determine next phase.** Use `--phase N` if provided, otherwise find the lowest phase number not in `completedPhases`. + +7. **Print plan:** + > Phase N: \ — N targets, estimated N commits + +8. **Ask for confirmation** before starting (unless `$ARGUMENTS` contains `--yes`). + +--- + +## Step 1 — Phase-specific execution strategies + +Each phase type requires different code-change logic: + +### Phase 1: Dead code cleanup +- Delete the symbol/export +- Verify no consumers: `codegraph fn-impact -T --json` +- Remove any orphaned imports +- Run lint to clean up + +### Phase 2: Shared abstractions +- Extract function/interface to new or existing file +- Update imports in all consumers +- Verify with: `codegraph exports -T --json` + +### Phase 3: Empty catches / error handling +- Replace `catch {}` with `catch (e) { logger.debug(...) 
}` or explicit fallback +- Use contextually appropriate error handling +- Subphases: each distinct catch pattern = one commit + +### Phase 4: Extractor decomposition +- Split large `walkXNode` switch cases into handler functions +- Keep dispatcher thin — handler per node kind +- Subphases: each extractor = one commit + +### Phase 5: General decomposition +- Read the gauntlet recommendation for the specific target +- Apply the recommended decomposition strategy +- Subphases: each function split = one commit + +### Phase 6: Small FAIL fixes +- Read the gauntlet recommendation for the specific target +- Apply the recommended fix (complexity reduction, metric improvement) +- Group by domain where possible + +--- + +## Step 2 — Per-target execution loop + +For each target in the current phase: + +1. **Skip if done.** Check if target is already in `execution.completedTargets`. If so, skip. + +2. **Update state.** Set `execution.currentTarget` in `titan-state.json`. + +3. **Read gauntlet entry.** Find this target in `gauntlet.ndjson` → get recommendation, violations, metrics. + +4. **Understand before touching.** Run codegraph commands: + ```bash + codegraph context -T --json + ``` + If blast radius > 0: + ```bash + codegraph fn-impact -T --json + ``` + +5. **Check if already fixed.** If the file has changed since gauntlet ran, re-check metrics: + ```bash + codegraph complexity --file --health -T --json + ``` + If the target now passes all thresholds, skip with note: "Target already passes — skipping." + +6. **Read source file(s).** Understand the code before editing. + +7. **Apply the change** based on phase strategy (Step 1) + gauntlet recommendation. + +8. **Stage changed files:** + ```bash + git add + ``` + +9. **Diff review (intent verification):** + Before running gate or tests, verify the diff matches the intent. This catches cases where the code change is structurally valid but doesn't match what was planned. 
+ + Collect the context: + ```bash + git diff --cached --stat + git diff --cached + ``` + + Load the gauntlet entry for this target (from `gauntlet.ndjson`) and the sync plan entry (from `sync.json → executionOrder[currentPhase]`). + + **Check all of the following:** + + **D1. Scope — only planned files touched:** + Compare staged file paths against `sync.json → executionOrder[currentPhase].targets` and their known file paths (from gauntlet entries). Flag any file NOT associated with the current target or phase. + - File in a completely different domain → **DIFF FAIL** + - File is a direct dependency of the target (consumer or import) → **OK** (expected ripple) + - Test file for the target → **OK** + + **D2. Intent match — diff aligns with gauntlet recommendation:** + Read the gauntlet entry's `recommendation` field and `violations` list. Verify the diff addresses them: + - If recommendation says "split" → diff should show new functions extracted, original simplified + - If recommendation says "remove dead code" → diff should show deletions, not additions + - If violation was "complexity > threshold" → diff should reduce complexity, not just move code around + - If the diff does something **entirely different** from the recommendation → **DIFF FAIL** + + **D3. Commit message accuracy:** + Compare the planned commit message from `sync.json` against what the diff actually does. + - Message says "remove dead code" but diff adds new functions → **DIFF WARN** + - Message says "extract X from Y" but diff only modifies Y without creating X → **DIFF FAIL** + + **D4. Deletion audit:** + If the diff deletes code (lines removed > 10): + ```bash + codegraph fn-impact -T --json 2>/dev/null + ``` + If the deleted symbol has active callers not updated in this diff → **DIFF FAIL**: "Deleted still has callers not updated in this commit." + + **D5. 
Leftover check:** + If the gauntlet recommendation mentioned specific symbols to remove/refactor, verify they were actually addressed: + - Dead symbols listed for removal → should be deleted in the diff + - Functions marked for decomposition → original should be simplified or removed + + **On DIFF FAIL:** Unstage and revert changes, add to `execution.failedTargets` with reason starting with `"diff-review: "`. Continue to next target. + **On DIFF WARN:** Log the warning but proceed to gate. Include the warning in the gate-log entry. + +10. **Run tests:** + ```bash + npm test 2>&1 + ``` + If tests fail → go to rollback (step 13). + +11. **Run /titan-gate:** + Use the Skill tool to invoke `titan-gate`. If FAIL → go to rollback (step 13). + +12. **On success:** + ```bash + git commit -m "" + ``` + - Record commit SHA in `execution.commits` + - Add target to `execution.completedTargets` + - Record any diff-review warnings in `execution.diffWarnings` (if any) + - Update `titan-state.json` + +13. **On failure (test, gate, or diff-review):** + ```bash + git checkout -- + ``` + - Add to `execution.failedTargets` with reason: `{ "target": "", "reason": "", "phase": N }` + - Clear `execution.currentTarget` + - **Continue to next target** — don't block the whole phase + +--- + +## Step 3 — Phase completion + +When all targets in the phase are processed: + +1. Add phase number to `execution.completedPhases` +2. Advance `execution.currentPhase` to the next phase number +3. Clear `execution.currentTarget` +4. Write updated `titan-state.json` + +--- + +## Step 4 — Report + +Print: + +``` +## Phase N Complete: