From d64e66b16c16519dbc0a138ee894d35633c32b34 Mon Sep 17 00:00:00 2001 From: shahinyanm Date: Fri, 8 May 2026 12:50:01 +0400 Subject: [PATCH 1/3] v0.33.0: bugfix batch from real-world tool-audit (B1-B14) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Surfaced from cross-machine tp-report.txt audit + external code review on 2026-05-08. 1255 tests pass; full diff in beads token-pilot-n2r. P0 (release blocker — silent breakage): - B1+B3: install-agents probes ~/.claude/plugins/cache for fresher tp-* templates when running CLI is stale (e.g. global v0.20.1 binary shipping only 6 of 25 agents). New: pickFreshestAgentsSource(). - B2: user settings.json hooks pinned to /home/.../npx//... paths kept calling old binary even after plugin cache shipped a new minor (e.g. PreToolUse:Task in v0.31.0 silently no-op'd). - new helper isStaleTokenPilotHookCommand() detects the pattern - installHook() now filters those entries before re-writing - new helper isTokenPilotPluginEnabled() short-circuits manual install when the bundled plugin is already enabled — the plugin's hooks/hooks.json with ${CLAUDE_PLUGIN_ROOT} is the source of truth in that case - new CLI subcommand: `token-pilot migrate-hooks` cleans both user-level + project-level settings.json idempotently P1 (functional): - B6: smart_log gitArgs malformed — `git log -- HEAD -- foo.ts` treated HEAD as pathspec; removed the leading `'--'` separator. - B7+B8: project root detection on WSL caught C:\\Windows / /mnt/c/Windows / UNC paths. New: prefer CLAUDE_PROJECT_DIR over INIT_CWD/PWD/cwd; reject Windows system paths via new isWindowsSystemPath(). - B10: ast-index `exec()` failed permanently when init silently errored at server startup. Added lazy retry with a friendlier error pointing at `npx token-pilot install-ast-index`. - B14: subagents (general-purpose, code-analyzer, feature-dev:*) loop on raw Read after hook-pre-read deny because their prompts don't know about MCP tools. PreToolUse:Task now ALWAYS injects a generic SUBAGENT_TOOL_GUIDE additionalContext for non-tp-* dispatches (empty / no-match / escape paths included). - B4: TOKEN_PILOT_FORCE_SUBAGENTS=1 was a silent no-op when the agent catalog was empty. Now hard-denies with a clear "run install-agents" diagnostic. P2: - B5: hook-events.jsonl was fragmented across monorepo subdirs (e.g. apps/admin, apps/api, packages/prisma). New loadEventsTree() walks the repo root and merges; `stats` uses it by default, opt out with `--no-merge`. - B9: read_range rejected start_line/end_line when the MCP transport sent them as strings ("10"). New coerceIntFromAny() accepts numeric strings, rejects everything else. P3 deferred: - B11 belonged to a different package (epitaxy) — not ours. - B12, B13 (docs path drift, retroactive ADR for 6 new tools) filed for a follow-up — non-blocking for v0.33.0. Manifests bumped 0.32.0 → 0.33.0; agents rebuilt (25/25). --- .claude-plugin/marketplace.json | 4 +- .claude-plugin/plugin.json | 2 +- .gitignore | 1 + agents/tp-api-surface-tracker.md | 2 +- agents/tp-audit-scanner.md | 2 +- agents/tp-commit-writer.md | 2 +- agents/tp-context-engineer.md | 2 +- agents/tp-dead-code-finder.md | 2 +- agents/tp-debugger.md | 2 +- agents/tp-dep-health.md | 2 +- agents/tp-doc-writer.md | 2 +- agents/tp-history-explorer.md | 2 +- agents/tp-impact-analyzer.md | 2 +- agents/tp-incident-timeline.md | 2 +- agents/tp-incremental-builder.md | 2 +- agents/tp-migration-scout.md | 2 +- agents/tp-onboard.md | 2 +- agents/tp-performance-profiler.md | 2 +- agents/tp-pr-reviewer.md | 2 +- agents/tp-refactor-planner.md | 2 +- agents/tp-review-impact.md | 2 +- agents/tp-run.md | 2 +- agents/tp-session-restorer.md | 2 +- agents/tp-ship-coordinator.md | 2 +- agents/tp-spec-writer.md | 2 +- agents/tp-test-coverage-gapper.md | 2 +- agents/tp-test-triage.md | 2 +- agents/tp-test-writer.md | 2 +- package.json | 2 +- src/ast-index/client.ts | 21 +++- src/cli/install-agents.ts | 107 ++++++++++++++++++- src/cli/stats.ts | 16 ++- src/core/event-log.ts | 78 ++++++++++++++ src/core/validation.ts | 47 +++++--- src/handlers/smart-log.ts | 9 +- src/hooks/installer.ts | 172 +++++++++++++++++++++++++++++- src/hooks/pre-task.ts | 56 ++++++++-- src/index.ts | 88 ++++++++++++++- tests/hooks/pre-task.test.ts | 64 ++++++++--- tests/index.test.ts | 10 ++ 40 files changed, 654 insertions(+), 73 deletions(-) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 6c33fe5..22969a9 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -6,14 +6,14 @@ }, "metadata": { "description": "Token Pilot \u2014 save 60-90% tokens when AI reads code", - "version": "0.32.0" + "version": "0.33.0" }, "plugins": [ { "name": "token-pilot", "source": "./", "description": "Reduces token consumption by 60-90% via AST-aware lazy file reading, structural symbol navigation, and cross-session tool-usage analytics. 22 MCP tools + 19 subagents + budget watchdog hooks.", - "version": "0.32.0", + "version": "0.33.0", "author": { "name": "Digital-Threads" }, diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 212e45e..d62d799 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "token-pilot", - "version": "0.32.0", + "version": "0.33.0", "description": "Saves 60-90% tokens when AI reads code. AST-aware lazy reading, symbol navigation, cross-session tool-usage analytics, 22 subagents (haiku/sonnet/opus-tiered) with budget watchdog.", "author": { "name": "Digital-Threads", diff --git a/.gitignore b/.gitignore index 43a5b39..cd08462 100644 --- a/.gitignore +++ b/.gitignore @@ -45,3 +45,4 @@ docs/code-review-report.md .claude/ .token-pilot.json .token-pilot-fingerprint.json +tp-report.txt diff --git a/agents/tp-api-surface-tracker.md b/agents/tp-api-surface-tracker.md index 7b577af..c4a6580 100644 --- a/agents/tp-api-surface-tracker.md +++ b/agents/tp-api-surface-tracker.md @@ -9,7 +9,7 @@ tools: - mcp__token-pilot__read_symbol - Bash model: haiku -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: dd184501203fa7f3c73f419c4ffbe33c4be75400cb64a7a51733a3fe23f6e085 --- diff --git a/agents/tp-audit-scanner.md b/agents/tp-audit-scanner.md index 90af53f..84dd8c5 100644 --- a/agents/tp-audit-scanner.md +++ b/agents/tp-audit-scanner.md @@ -11,7 +11,7 @@ tools: - Grep - Read model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: d172f600bf32277ea6eb4cbbee4542ddd698a986dcd96997d33930561964569b --- diff --git a/agents/tp-commit-writer.md b/agents/tp-commit-writer.md index 34dc497..d72230d 100644 --- a/agents/tp-commit-writer.md +++ b/agents/tp-commit-writer.md @@ -8,7 +8,7 @@ tools: - mcp__token-pilot__test_summary - mcp__token-pilot__outline - Bash -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: de64a406b5176de19f7422619c7de7949b1f28865f225402c9cea9255f377428 --- diff --git a/agents/tp-context-engineer.md b/agents/tp-context-engineer.md index 5d5b7cc..4ffdcc2 100644 --- a/agents/tp-context-engineer.md +++ b/agents/tp-context-engineer.md @@ -13,7 +13,7 @@ tools: - Edit - Glob model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 68b32af2dacd82ebe52c4eec93edb903d452688274c3065218270627c564d8b0 --- diff --git a/agents/tp-dead-code-finder.md b/agents/tp-dead-code-finder.md index b36f20f..c18e171 100644 --- a/agents/tp-dead-code-finder.md +++ b/agents/tp-dead-code-finder.md @@ -11,7 +11,7 @@ tools: - Grep - Read model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: d9b7f5b7ae6f4ae21305c775361bcab097cc774370a6d976c093571d46d55021 --- diff --git a/agents/tp-debugger.md b/agents/tp-debugger.md index fa21ecd..b0b1cde 100644 --- a/agents/tp-debugger.md +++ b/agents/tp-debugger.md @@ -12,7 +12,7 @@ tools: - Read - Bash model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 052413de8d92377edcde6ae5c823f5378db304baccfa29e8866467f42553a500 --- diff --git a/agents/tp-dep-health.md b/agents/tp-dep-health.md index d64f179..ba2a831 100644 --- a/agents/tp-dep-health.md +++ b/agents/tp-dep-health.md @@ -9,7 +9,7 @@ tools: - Bash - Read model: haiku -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: e14dc57493d816f8c2e017963e2ef5f66bea50fd0b805a80e8a0d97c968427e7 --- diff --git a/agents/tp-doc-writer.md b/agents/tp-doc-writer.md index 524f8ad..3a8a399 100644 --- a/agents/tp-doc-writer.md +++ b/agents/tp-doc-writer.md @@ -13,7 +13,7 @@ tools: - Edit - Glob model: haiku -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 57d741794ab40e31a7ac49c68ea39a9088f5827cdef866ce81bfca1b7c9180cf --- diff --git a/agents/tp-history-explorer.md b/agents/tp-history-explorer.md index be0379d..5eb8bb4 100644 --- a/agents/tp-history-explorer.md +++ b/agents/tp-history-explorer.md @@ -10,7 +10,7 @@ tools: - Bash - Read model: haiku -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 7b70fa76a60e3c58a1de4f56c32c0f166424137e203a0cf1c8654e7c9235d904 --- diff --git a/agents/tp-impact-analyzer.md b/agents/tp-impact-analyzer.md index 490b504..ae103a1 100644 --- a/agents/tp-impact-analyzer.md +++ b/agents/tp-impact-analyzer.md @@ -12,7 +12,7 @@ tools: - mcp__token-pilot__read_symbols - Read model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 351a987e11eba63852f5431a16d8eb53104f4f689f82fdcc5a2bf4db948ba92f --- diff --git a/agents/tp-incident-timeline.md b/agents/tp-incident-timeline.md index 4e17b76..4dd0002 100644 --- a/agents/tp-incident-timeline.md +++ b/agents/tp-incident-timeline.md @@ -8,7 +8,7 @@ tools: - mcp__token-pilot__read_symbol - Bash model: inherit -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: de5722bfea374eaab096c1ae635c37879e7a91370ee3cd0532f4240be03c91eb --- diff --git a/agents/tp-incremental-builder.md b/agents/tp-incremental-builder.md index 744932a..94dd287 100644 --- a/agents/tp-incremental-builder.md +++ b/agents/tp-incremental-builder.md @@ -13,7 +13,7 @@ tools: - Edit - Bash model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 375a824d0d847bb5453ec594c7a62ad566ee7e4d92717b0473f771f1a0477c60 --- diff --git a/agents/tp-migration-scout.md b/agents/tp-migration-scout.md index 1abd8fd..a6e4157 100644 --- a/agents/tp-migration-scout.md +++ b/agents/tp-migration-scout.md @@ -11,7 +11,7 @@ tools: - Grep - Glob model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 0334de1bf99b431b65359637d125cda7c44c6f780eb92c57cc538715b1939536 --- diff --git a/agents/tp-onboard.md b/agents/tp-onboard.md index f9d8722..188cbc8 100644 --- a/agents/tp-onboard.md +++ b/agents/tp-onboard.md @@ -10,7 +10,7 @@ tools: - mcp__token-pilot__smart_read - mcp__token-pilot__smart_read_many - mcp__token-pilot__read_section -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 832e95633fbc8e9b0c10f3e540a327d4be062fb4b3f17a6cce6be13f414e2927 --- diff --git a/agents/tp-performance-profiler.md b/agents/tp-performance-profiler.md index 8577452..001576b 100644 --- a/agents/tp-performance-profiler.md +++ b/agents/tp-performance-profiler.md @@ -11,7 +11,7 @@ tools: - Bash - Read model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: b61f06380d80798fa2e49d37bcba0653495bee04dd6bdbc1feff9a75607b0508 --- diff --git a/agents/tp-pr-reviewer.md b/agents/tp-pr-reviewer.md index d2404ab..213c869 100644 --- a/agents/tp-pr-reviewer.md +++ b/agents/tp-pr-reviewer.md @@ -11,7 +11,7 @@ tools: - mcp__token-pilot__read_for_edit - Read model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: f83f50d05b4f70285ae7afed2b1a406fc436df56e61a0aedbfb31edc7f2b6e66 --- diff --git a/agents/tp-refactor-planner.md b/agents/tp-refactor-planner.md index cad37f2..9b3290a 100644 --- a/agents/tp-refactor-planner.md +++ b/agents/tp-refactor-planner.md @@ -8,7 +8,7 @@ tools: - mcp__token-pilot__outline - mcp__token-pilot__read_symbol model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: c5f6fc122c89e16e5cf774045f92169ee3468555320b898171ba13eca5323550 --- diff --git a/agents/tp-review-impact.md b/agents/tp-review-impact.md index 9d038a4..cbd37e5 100644 --- a/agents/tp-review-impact.md +++ b/agents/tp-review-impact.md @@ -9,7 +9,7 @@ tools: - mcp__token-pilot__module_info - Bash model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 8ef3c3341cbfed4eb8dd130126a9683edc57e378c92ff0ca764d584fd941c55c --- diff --git a/agents/tp-run.md b/agents/tp-run.md index 66c4deb..7799ea2 100644 --- a/agents/tp-run.md +++ b/agents/tp-run.md @@ -16,7 +16,7 @@ tools: - Glob - Bash model: haiku -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 2b08618d34a61f00aafccbda9fed6d83243296dedb83440edbd2d5c28bb6dbc4 --- diff --git a/agents/tp-session-restorer.md b/agents/tp-session-restorer.md index 2cbeaf7..c38d8a9 100644 --- a/agents/tp-session-restorer.md +++ b/agents/tp-session-restorer.md @@ -9,7 +9,7 @@ tools: - mcp__token-pilot__session_budget - Bash - Read -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 529374ed728f5eed5b758b3be3da65624783c0bf0c1a253d7d661a843eb5f767 --- diff --git a/agents/tp-ship-coordinator.md b/agents/tp-ship-coordinator.md index 122d732..b170763 100644 --- a/agents/tp-ship-coordinator.md +++ b/agents/tp-ship-coordinator.md @@ -11,7 +11,7 @@ tools: - Read - Grep model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: a60f6ae110eb3138064bce074e8ba26fa0ce5f4659df1624a9d9d3646803391b --- diff --git a/agents/tp-spec-writer.md b/agents/tp-spec-writer.md index 7bc3806..76b18a9 100644 --- a/agents/tp-spec-writer.md +++ b/agents/tp-spec-writer.md @@ -9,7 +9,7 @@ tools: - Read - Write model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: c7a4e8b39228fd5158528f389c924c5ff2d98c4b9b05ee0106d54a26c5dc1350 --- diff --git a/agents/tp-test-coverage-gapper.md b/agents/tp-test-coverage-gapper.md index 9112437..ab001e5 100644 --- a/agents/tp-test-coverage-gapper.md +++ b/agents/tp-test-coverage-gapper.md @@ -10,7 +10,7 @@ tools: - mcp__token-pilot__test_summary - Glob - Grep -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: be81eed53a3720d146cf89e4a14a7a56577633f7c84c234c412ab70d64c05b11 --- diff --git a/agents/tp-test-triage.md b/agents/tp-test-triage.md index b337da8..258c8b6 100644 --- a/agents/tp-test-triage.md +++ b/agents/tp-test-triage.md @@ -8,7 +8,7 @@ tools: - mcp__token-pilot__find_usages - mcp__token-pilot__read_symbol model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 362ecf4cb03b059421ea26933473700900073dc38b3a7fe271208dfb1ae14f90 --- diff --git a/agents/tp-test-writer.md b/agents/tp-test-writer.md index d82e998..a223bd6 100644 --- a/agents/tp-test-writer.md +++ b/agents/tp-test-writer.md @@ -13,7 +13,7 @@ tools: - Edit - Bash model: sonnet -token_pilot_version: "0.32.0" +token_pilot_version: "0.33.0" token_pilot_body_hash: 269f2fe22ff4517c277d3f56ca67d8a5527b93290ab21079a83ba7af22c1b5a9 --- diff --git a/package.json b/package.json index 3011c40..d19d8ad 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "token-pilot", - "version": "0.32.0", + "version": "0.33.0", "description": "Save up to 80% tokens when AI reads code — MCP server for token-efficient code navigation, AST-aware structural reading instead of dumping full files into context window", "type": "module", "main": "dist/index.js", diff --git a/src/ast-index/client.ts b/src/ast-index/client.ts index 0b94e5e..9e747da 100644 --- a/src/ast-index/client.ts +++ b/src/ast-index/client.ts @@ -920,8 +920,27 @@ export class AstIndexClient { } private async exec(args: string[], timeoutMs?: number): Promise { + // v0.33.0 (B10) — lazy retry. server.ts calls init() at startup, + // but it can fail silently (binary download flaky, permissions + // on shared FS, or the user invoked us before postinstall ran). + // Subsequent MCP calls would otherwise throw "not initialized" + // forever. Try once more here; on failure surface a friendlier + // error pointing at the install command. if (!this.binaryPath) { - throw new Error("ast-index not initialized. Call init() first."); + try { + await this.init(); + } catch (err) { + const msg = err instanceof Error ? err.message : String(err); + throw new Error( + `ast-index not initialized — auto-init also failed: ${msg}\n` + + `Run: npx token-pilot install-ast-index`, + ); + } + if (!this.binaryPath) { + throw new Error( + "ast-index not initialized. Run `npx token-pilot install-ast-index`.", + ); + } } // ast-index v3.39+ honours AST_INDEX_WALK_UP=1 — read-commands then diff --git a/src/cli/install-agents.ts b/src/cli/install-agents.ts index 2c9415f..d54535c 100644 --- a/src/cli/install-agents.ts +++ b/src/cli/install-agents.ts @@ -230,6 +230,95 @@ export function resolveDistAgentsDir(scriptUrl: string): string { return join(here, "..", "..", "agents"); } +/** + * Compare two semver-ish version strings. Returns positive if `a > b`, + * negative if `a < b`, 0 if equal. Tolerates non-semver suffixes by + * stripping anything past `-` or `+`. Pre-release / metadata are + * ignored. Used only for picking the highest version of a directory + * sibling list — bullet-proofness not required. + */ +function compareVersions(a: string, b: string): number { + const norm = (s: string) => + s + .split(/[-+]/)[0] + .split(".") + .map((p) => (Number.isFinite(parseInt(p, 10)) ? parseInt(p, 10) : 0)); + const [ax, ay, az] = norm(a); + const [bx, by, bz] = norm(b); + if (ax !== bx) return ax - bx; + if (ay !== by) return ay - by; + return az - bz; +} + +/** + * v0.33.0 — locate the freshest tp-* template source available on disk. + * + * The plugin cache (`~/.claude/plugins/cache/token-pilot/token-pilot//agents/`) + * almost always trails or matches the version of the running binary, but + * users with a stale npm-global token-pilot (e.g. v0.20.1 — see beads B1) + * end up running a binary whose own `agents/` directory contains an old + * subset of templates while the plugin cache holds the up-to-date 25-agent + * set. Probe both, pick whichever has the higher version (or, when + * versions cannot be read, the more complete set). + * + * Pure: never throws — returns the script-relative dir as fallback when + * the cache cannot be read. + */ +export async function pickFreshestAgentsSource( + scriptUrl: string, + homeDir: string, +): Promise<{ dir: string; source: "self" | "plugin-cache" }> { + const selfDir = resolveDistAgentsDir(scriptUrl); + const cacheRoot = join( + homeDir, + ".claude", + "plugins", + "cache", + "token-pilot", + "token-pilot", + ); + + let cacheVersion: string | null = null; + let cacheDir: string | null = null; + try { + const versions = (await readdir(cacheRoot)).filter((v) => /\d+\.\d+/.test(v)); + if (versions.length > 0) { + versions.sort(compareVersions); + cacheVersion = versions[versions.length - 1]; + cacheDir = join(cacheRoot, cacheVersion, "agents"); + } + } catch { + /* no plugin cache — keep self */ + } + + if (!cacheDir) return { dir: selfDir, source: "self" }; + + // Compare counts of tp-*.md to spot the obvious "stale binary" case + // where the running CLI is old but plugin cache is current. + let selfCount = 0; + let cacheCount = 0; + try { + selfCount = (await readdir(selfDir)).filter( + (f) => f.endsWith(".md") && f.startsWith("tp-"), + ).length; + } catch { + /* selfDir missing — pick cache */ + } + try { + cacheCount = (await readdir(cacheDir)).filter( + (f) => f.endsWith(".md") && f.startsWith("tp-"), + ).length; + } catch { + return { dir: selfDir, source: "self" }; + } + + if (cacheCount > selfCount) { + return { dir: cacheDir, source: "plugin-cache" }; + } + // Tie or self has more — keep self (newer un-released templates win). + return { dir: selfDir, source: "self" }; +} + /** Read one line from an interactive TTY prompt. */ async function promptLine(question: string): Promise { process.stderr.write(question); @@ -428,8 +517,22 @@ export async function handleInstallAgents( } } - const distAgentsDir = - opts?.distAgentsDir ?? resolveDistAgentsDir(import.meta.url); + // v0.33.0 (B1+B3) — explicit override wins; otherwise probe the + // plugin cache and pick whichever has more tp-*.md templates. This + // protects users with stale npm-global token-pilot from copying + // an outdated 6-agent subset when the bundled plugin already + // shipped the full 25. + let distAgentsDir = opts?.distAgentsDir; + if (!distAgentsDir) { + const picked = await pickFreshestAgentsSource(import.meta.url, homeDir); + distAgentsDir = picked.dir; + if (picked.source === "plugin-cache") { + process.stderr.write( + `[token-pilot] using agents from plugin cache: ${distAgentsDir}\n` + + ` (running CLI's own agents/ has fewer templates — bundled plugin is fresher).\n`, + ); + } + } try { const result = await installAgents({ diff --git a/src/cli/stats.ts b/src/cli/stats.ts index dca1dc1..a0da462 100644 --- a/src/cli/stats.ts +++ b/src/cli/stats.ts @@ -12,7 +12,11 @@ * directly without touching the filesystem. */ -import { loadEvents, type HookEvent } from "../core/event-log.js"; +import { + loadEvents, + loadEventsTree, + type HookEvent, +} from "../core/event-log.js"; export interface StatsOptions { /** @@ -204,7 +208,15 @@ export async function handleStats( opts?: { projectRoot?: string }, ): Promise { const projectRoot = opts?.projectRoot ?? process.cwd(); - const events = await loadEvents(projectRoot); + + // v0.33.0 (B5) — `--no-merge` disables the repo-tree walk and reads + // only the top-level `.token-pilot/hook-events.jsonl`. Default is to + // merge events from every subdirectory log so monorepo subdir + // sessions show up in the same totals. + const noMerge = argv.includes("--no-merge"); + const events = noMerge + ? await loadEvents(projectRoot) + : await loadEventsTree(projectRoot); const session = parseFlag(argv, "session"); const byAgent = parseFlag(argv, "by-agent"); diff --git a/src/core/event-log.ts b/src/core/event-log.ts index 55e8229..785d07f 100644 --- a/src/core/event-log.ts +++ b/src/core/event-log.ts @@ -205,6 +205,84 @@ export async function loadEvents(projectRoot: string): Promise { return out; } +/** + * v0.33.0 (B5) — load events from EVERY `.token-pilot/hook-events.jsonl` + * found at or below `repoRoot`. The hook writer resolves its own + * project root from `process.cwd()` at the moment Claude Code spawns + * us, which can land in a subdirectory of the actual repo (apps/admin, + * apps/api, packages/prisma, …). Without this, `token-pilot stats` + * sees only the top-level log and reports a fraction of the savings. + * + * Walks up to `maxDepth` levels and merges chronologically. Pure on + * filesystem read errors — missing dirs are silently skipped. + */ +export async function loadEventsTree( + repoRoot: string, + maxDepth = 5, +): Promise { + const PRUNE = new Set([ + "node_modules", + ".git", + "dist", + "build", + ".next", + ".turbo", + ".cache", + "coverage", + ".vercel", + ".vite", + ]); + const seen = new Set(); + const all: HookEvent[] = []; + + async function visit(dir: string, depth: number): Promise { + if (depth > maxDepth) return; + let entries: string[] = []; + try { + entries = await fs.readdir(dir); + } catch { + return; + } + if (entries.includes(".token-pilot")) { + const logPath = join(dir, ".token-pilot", CURRENT_FILE); + if (!seen.has(logPath)) { + seen.add(logPath); + try { + const raw = await fs.readFile(logPath, "utf-8"); + for (const line of raw.split("\n")) { + if (!line.trim()) continue; + try { + all.push(JSON.parse(line) as HookEvent); + } catch { + /* skip malformed */ + } + } + } catch { + /* missing log */ + } + } + } + for (const name of entries) { + if (PRUNE.has(name)) continue; + if (name.startsWith(".") && name !== ".claude") continue; + const full = join(dir, name); + try { + const stat = await fs.stat(full); + if (stat.isDirectory()) { + await visit(full, depth + 1); + } + } catch { + /* skip */ + } + } + } + + await visit(repoRoot, 0); + // Sort chronologically so per-session aggregations stay correct. + all.sort((a, b) => (a.ts || 0) - (b.ts || 0)); + return all; +} + /** * Enumerate all archive files (`hook-events..jsonl`) with metadata * needed by `retentionDeletions`. diff --git a/src/core/validation.ts b/src/core/validation.ts index b232efd..1800203 100644 --- a/src/core/validation.ts +++ b/src/core/validation.ts @@ -1,5 +1,32 @@ import { resolve, relative } from "node:path"; +/** + * v0.33.0 (B9) — coerce an `unknown` argument value to an integer. + * + * MCP transports frequently round-trip numeric arguments through + * JSON or environment variables and re-emit them as strings (e.g. + * `"42"`). Accept that case and reject everything else, including + * non-finite numbers, decimals, and strings that don't parse cleanly. + * + * Returns the integer value or `null` when the input cannot be + * interpreted as one. + */ +export function coerceIntFromAny(value: unknown): number | null { + if (typeof value === "number") { + if (!Number.isFinite(value) || !Number.isInteger(value)) return null; + return value; + } + if (typeof value === "string") { + const trimmed = value.trim(); + if (trimmed.length === 0) return null; + if (!/^-?\d+$/.test(trimmed)) return null; + const n = Number(trimmed); + if (!Number.isFinite(n) || !Number.isInteger(n)) return null; + return n; + } + return null; +} + /** * Resolve a user-provided path and validate it stays within projectRoot. * Prevents path traversal attacks (e.g. ../../etc/passwd). @@ -146,28 +173,24 @@ export function validateReadRangeArgs(args: unknown): { if (typeof a.path !== "string" || a.path.length === 0) { throw new Error('Required parameter "path" must be a non-empty string.'); } - if ( - typeof a.start_line !== "number" || - !Number.isInteger(a.start_line) || - a.start_line < 1 - ) { + // v0.33.0 (B9) — some MCP clients serialise numbers as strings; + // accept "10" the same as 10. Reject non-numeric strings. + const start = coerceIntFromAny(a.start_line); + if (start === null || start < 1) { throw new Error( 'Required parameter "start_line" must be a positive integer.', ); } - if ( - typeof a.end_line !== "number" || - !Number.isInteger(a.end_line) || - a.end_line < 1 - ) { + const end = coerceIntFromAny(a.end_line); + if (end === null || end < 1) { throw new Error( 'Required parameter "end_line" must be a positive integer.', ); } - if (a.end_line < a.start_line) { + if (end < start) { throw new Error('"end_line" must be >= "start_line".'); } - return { path: a.path, start_line: a.start_line, end_line: a.end_line }; + return { path: a.path, start_line: start, end_line: end }; } /** diff --git a/src/handlers/smart-log.ts b/src/handlers/smart-log.ts index d8aed26..0f9ef93 100644 --- a/src/handlers/smart-log.ts +++ b/src/handlers/smart-log.ts @@ -39,14 +39,19 @@ export async function handleSmartLog( const count = Math.min(args.count ?? 10, MAX_COUNT); const ref = args.ref ?? 'HEAD'; - // Build git log command with --numstat for file stats + // Build git log command with --numstat for file stats. + // v0.33.0 (B6) — `ref` MUST be a revision argument, not a pathspec. + // The previous version pushed `'--', ref` which made git interpret + // `HEAD` as a path (`git log -- HEAD`) and silently returned empty. + // Adding a path then produced `git log -- HEAD -- foo.ts` — invalid. + // Correct order: `git log [-- ]`. const gitArgs = [ 'log', `--format=${RECORD_SEPARATOR}%h${FIELD_SEPARATOR}%ad${FIELD_SEPARATOR}%an${FIELD_SEPARATOR}%s`, '--date=short', '--numstat', `-n`, `${count}`, - '--', ref, + ref, ]; if (args.path) { diff --git a/src/hooks/installer.ts b/src/hooks/installer.ts index 07a5cc0..9b38ded 100644 --- a/src/hooks/installer.ts +++ b/src/hooks/installer.ts @@ -36,6 +36,40 @@ function buildHookCommand( return `token-pilot ${action}`; } +/** + * Detect a stale token-pilot hook command — one that points at a + * pinned npx-cache snapshot (`npx/_npx//...`) or any other + * version-pinned path that won't follow plugin upgrades. + * + * v0.33.0 fix: users who ran `npx token-pilot init` early on got + * settings.json entries with literal `~/.npm/_npx//...` paths. + * When the npx cache rotates or token-pilot publishes a new minor, + * those entries silently call the old binary, missing every hook + * shipped after install (e.g. v0.31.0 Task hooks). Removing the + * stale entry lets the next install or the bundled plugin's + * `hooks/hooks.json` (CLAUDE_PLUGIN_ROOT) take over. + */ +export function isStaleTokenPilotHookCommand(cmd: unknown): boolean { + if (typeof cmd !== "string") return false; + if (!cmd.includes("token-pilot")) return false; + // npm/npx cache hash — always stale (will rotate) + if (/\/_npx\/[0-9a-f]+\//.test(cmd)) return true; + // Pinned plugin-cache version path — old version that may not + // contain a hook handler the new settings entry references. + // Match `/plugins/cache/token-pilot/token-pilot//`. + const pinned = cmd.match( + /\/plugins\/cache\/token-pilot\/token-pilot\/([^/]+)\//, + ); + if (pinned) { + // The plugin runtime always uses ${CLAUDE_PLUGIN_ROOT} which + // resolves to the *current* version dir. A literal version in + // the path means someone wrote it from a CLI that captured the + // dir at install time — stale by definition. + return true; + } + return false; +} + function createHookConfig(options?: HookInstallOptions) { return { hooks: { @@ -187,12 +221,15 @@ export async function installHook( if (Array.isArray(existingHooks)) { // Remove old broken hooks (bare "token-pilot" without absolute path) - // and replace with working ones using absolute paths + // OR stale npx-cache / pinned-version paths (v0.33.0) + // and replace with working ones using absolute paths. const oldBrokenHooks = existingHooks.filter( (h: any) => isTokenPilotHook(h) && h.hooks?.some( - (hook: any) => hook.command?.match(/^token-pilot\s/), // bare command without path + (hook: any) => + hook.command?.match(/^token-pilot\s/) || + isStaleTokenPilotHookCommand(hook.command), ), ); @@ -386,3 +423,134 @@ export async function uninstallHook( }; } } + +// ─── v0.33.0 migration ──────────────────────────────────────────────── + +export interface CleanStaleResult { + scanned: string[]; + cleaned: string[]; + staleEntriesRemoved: number; + message: string; +} + +/** + * Scan a settings.json (user-level or project-level) and remove every + * token-pilot hook entry whose command points at a pinned npx-cache + * snapshot or a literal plugin-cache version path. The plugin's bundled + * `hooks/hooks.json` (resolved through `${CLAUDE_PLUGIN_ROOT}` at + * runtime) supersedes them. + * + * Pure-ish: writes only when something changed. Never throws — bad JSON + * or missing file are reported in the result so callers (CLI, init) + * can surface them without aborting. + */ +export async function cleanStaleHookEntries( + settingsPath: string, +): Promise { + const result: CleanStaleResult = { + scanned: [settingsPath], + cleaned: [], + staleEntriesRemoved: 0, + message: "", + }; + + let raw: string; + try { + raw = await readFile(settingsPath, "utf-8"); + } catch (err: any) { + if (err?.code === "ENOENT") { + result.message = `No settings at ${settingsPath} — nothing to migrate.`; + return result; + } + result.message = `Cannot read ${settingsPath}: ${err?.message ?? err}`; + return result; + } + + let settings: Record; + try { + settings = JSON.parse(raw); + } catch { + result.message = `Invalid JSON in ${settingsPath} — skipped (fix manually).`; + return result; + } + + const sections = ["PreToolUse", "PostToolUse", "SessionStart"] as const; + let removed = 0; + + for (const section of sections) { + const arr = settings.hooks?.[section]; + if (!Array.isArray(arr)) continue; + const filtered = arr.filter((entry: any) => { + const inner = Array.isArray(entry?.hooks) ? entry.hooks : []; + const hasStale = inner.some((h: any) => + isStaleTokenPilotHookCommand(h?.command), + ); + if (hasStale) { + removed++; + return false; + } + return true; + }); + if (filtered.length !== arr.length) { + if (filtered.length === 0) { + delete settings.hooks[section]; + } else { + settings.hooks[section] = filtered; + } + } + } + + if (removed === 0) { + result.message = `No stale token-pilot hook entries in ${settingsPath}.`; + return result; + } + + // Drop empty hooks container so JSON stays clean. + if ( + settings.hooks && + typeof settings.hooks === "object" && + Object.keys(settings.hooks).length === 0 + ) { + delete settings.hooks; + } + + await writeFile(settingsPath, JSON.stringify(settings, null, 2) + "\n"); + result.cleaned.push(settingsPath); + result.staleEntriesRemoved = removed; + result.message = `Removed ${removed} stale token-pilot hook entr${ + removed === 1 ? "y" : "ies" + } from ${settingsPath}.`; + return result; +} + +/** + * Inspect `~/.claude/settings.json` to determine whether the user has + * enabled the bundled `token-pilot` plugin in Claude Code. When true, + * the plugin's own `hooks/hooks.json` is the source of truth and any + * additional hook entries written by the npm CLI are duplicates that + * also lock the user to whichever binary path the CLI captured. + */ +export async function isTokenPilotPluginEnabled( + homeDir: string, +): Promise { + const settingsPath = resolve(homeDir, ".claude", "settings.json"); + let raw: string; + try { + raw = await readFile(settingsPath, "utf-8"); + } catch { + return false; + } + let settings: any; + try { + settings = JSON.parse(raw); + } catch { + return false; + } + const enabled = settings?.enabledPlugins; + if (!enabled || typeof enabled !== "object") return false; + // keys look like `token-pilot@token-pilot` — match prefix. + return Object.entries(enabled).some( + ([key, val]) => + val === true && typeof key === "string" && key.startsWith("token-pilot@"), + ); +} diff --git a/src/hooks/pre-task.ts b/src/hooks/pre-task.ts index 32d2005..d89a96a 100644 --- a/src/hooks/pre-task.ts +++ b/src/hooks/pre-task.ts @@ -83,6 +83,21 @@ function containsEscape(description: string): boolean { return ESCAPE_PHRASES.some((p) => n.includes(p)); } +/** + * v0.33.0 (B14) — generic context appended to every advice payload + * for non-tp-* dispatches. Subagents like `general-purpose` and + * `code-analyzer` don't know about the token-pilot MCP tools and + * loop on raw `Read` even after `hook-pre-read` denies them. This + * paragraph lands in their context window before they take their + * first action and tells them what to use instead. + */ +const SUBAGENT_TOOL_GUIDE = + "When working in this task: prefer `mcp__token-pilot__smart_read` " + + "(file structure), `read_symbol` (one function/class), and " + + "`find_usages` (semantic search) over raw Read/Grep. The token-pilot " + + "PreToolUse hooks block large-file Read and unbounded Grep — use " + + "the MCP tools or pass `offset`/`limit` to Read."; + /** * Pure decision function. Caller resolves all context (env, mode, * agent index) up front so this stays a plain input → output mapping. @@ -101,16 +116,42 @@ export function decidePreTask( return { kind: "allow" }; } - // No description → nothing to match against. Allow (Claude Code - // sometimes dispatches Task with only a subagent_type + session id). - if (!description || description.length === 0) return { kind: "allow" }; + // v0.33.0 (B4) — TOKEN_PILOT_FORCE_SUBAGENTS=1 with an empty agent + // catalog used to silently allow every Task call (no matches → no + // suggestion). That defeats the env's only purpose. Fail loud + // instead: tell the user to install the templates. + const indexEmpty = + !ctx.agentIndex.agents || ctx.agentIndex.agents.length === 0; + if (ctx.force && indexEmpty) { + return { + kind: "deny", + reason: + "TOKEN_PILOT_FORCE_SUBAGENTS=1 is set but no tp-* agents are " + + "installed in this project (or `~/.claude/agents/`). " + + "Run `npx token-pilot install-agents --scope=project` first, " + + "or unset TOKEN_PILOT_FORCE_SUBAGENTS.", + }; + } + + // No description → nothing to match against. Inject the generic + // tool-guide so the subagent still picks tp-tools (B14). + if (!description || description.length === 0) { + return { kind: "advise", message: SUBAGENT_TOOL_GUIDE }; + } // Author-blessed escape clauses — user is explicitly saying - // "this is broad". Respect that. - if (containsEscape(description)) return { kind: "allow" }; + // "this is broad". Inject the tool-guide but no agent suggestion. + if (containsEscape(description)) { + return { kind: "advise", message: SUBAGENT_TOOL_GUIDE }; + } const hit = matchTpAgent(description, ctx.agentIndex); - if (!hit) return { kind: "allow" }; + if (!hit) { + // No specific tp-* match. Still send the generic tool-guide so + // the subagent learns about smart_read / read_symbol — covers the + // common code-analyzer / general-purpose loop on raw Read (B14). + return { kind: "advise", message: SUBAGENT_TOOL_GUIDE }; + } const suggestion = `Consider dispatching \`${hit.agent}\` instead of \`${subagentType || "general-purpose"}\` — ` + @@ -118,7 +159,8 @@ export function decidePreTask( `tp-* agents run under a tighter budget and output in terse style, typically ` + `~50-70 % fewer tokens than general-purpose. ` + `Escape: add "ad-hoc" or "open-ended" to the description to bypass, or set ` + - `TOKEN_PILOT_MODE=advisory for warn-only behaviour.`; + `TOKEN_PILOT_MODE=advisory for warn-only behaviour.\n\n` + + SUBAGENT_TOOL_GUIDE; const hardBlock = ctx.force || diff --git a/src/index.ts b/src/index.ts index 3b92422..5345d0d 100644 --- a/src/index.ts +++ b/src/index.ts @@ -29,7 +29,12 @@ import { execFile } from "node:child_process"; import { promisify } from "node:util"; import { fileURLToPath } from "node:url"; import { createServer } from "./server.js"; -import { installHook, uninstallHook } from "./hooks/installer.js"; +import { + installHook, + uninstallHook, + cleanStaleHookEntries, + isTokenPilotPluginEnabled, +} from "./hooks/installer.js"; import { findBinary, installBinary, @@ -296,6 +301,24 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { case "uninstall-hook": await handleUninstallHook(cliArgs[1] || process.cwd()); return; + case "migrate-hooks": { + // v0.33.0 — clean stale npx-cache / pinned-version token-pilot + // hook entries from user-level + project-level settings.json so + // the bundled plugin's hooks/hooks.json takes over via + // CLAUDE_PLUGIN_ROOT. Safe and idempotent. + const targets = [ + resolve(homedir(), ".claude", "settings.json"), + resolve(cliArgs[1] || process.cwd(), ".claude", "settings.json"), + ]; + let total = 0; + for (const path of targets) { + const r = await cleanStaleHookEntries(path); + console.log(r.message); + total += r.staleEntriesRemoved; + } + console.log(`\nDone — removed ${total} stale entr${total === 1 ? "y" : "ies"}.`); + return; + } case "install-ast-index": await handleInstallAstIndex(); return; @@ -387,6 +410,34 @@ export function looksLikePluginCacheDir(candidate: string): boolean { } } +/** + * v0.33.0 (B8) — reject candidates that are obviously not a project + * directory. Triggered by WSL launches where the shell starts in + * `C:\Windows\System32`, `/mnt/c/Windows/...`, or a UNC path. Without + * this guard, `git rev-parse --show-toplevel` either fails noisily or + * returns the Windows tree, leaving every subsequent git/MCP call + * looking at the wrong filesystem. + * + * Conservative — only matches paths we are certain are not user code. + */ +export function isWindowsSystemPath(candidate: string): boolean { + if (!candidate) return false; + // Native Windows: C:\Windows\... or C:/Windows/... + if (/^[A-Za-z]:[\\/](Windows|Program Files|ProgramData)\b/i.test(candidate)) { + return true; + } + // WSL view of Windows: /mnt/c/Windows/... (or any drive letter) + if (/^\/mnt\/[a-z]\/(windows|program files|programdata)\b/i.test(candidate)) { + return true; + } + // UNC path — almost never a project root and `cwd` cannot be set to + // one reliably anyway. Better to skip than to misroute. + if (/^\\\\/.test(candidate)) { + return true; + } + return false; +} + export async function startServer(cliArgs: string[] = process.argv.slice(2)) { // Defensive: ignore a poisoned cliArgs[0] pointing into the plugin install // dir. Fall through to the INIT_CWD / PWD / cwd detection below — same @@ -401,14 +452,23 @@ export async function startServer(cliArgs: string[] = process.argv.slice(2)) { let projectRoot = explicitRoot || process.cwd(); - // Detect git root for reliable project root - // Try multiple sources: args[0] → INIT_CWD (npm/npx invoking dir) → PWD → cwd + // Detect git root for reliable project root. + // v0.33.0 (B8) — on WSL the shell is sometimes launched with the + // working directory pointing into Windows' filesystem + // (`/mnt/c/Windows/system32` or, worse, a UNC like `\\\\wsl$\\…`). + // INIT_CWD/PWD/cwd then resolve to a Windows system path and + // every git operation lands in the wrong tree. Claude Code itself + // reliably exports `CLAUDE_PROJECT_DIR` — prefer it absolutely + // when present and reject obvious system paths regardless. if (!explicitRoot) { const candidates = [ + process.env.CLAUDE_PROJECT_DIR, // canonical Claude Code env (B8) process.env.INIT_CWD, // npm/npx sets this to invoking directory process.env.PWD, // shell working directory (may differ from cwd) process.cwd(), // Node.js working directory - ].filter((c): c is string => !!c && c !== "/"); + ] + .filter((c): c is string => !!c && c !== "/") + .filter((c) => !isWindowsSystemPath(c)); let detected = false; for (const candidate of candidates) { @@ -825,6 +885,26 @@ export async function handleInstallHook(projectRoot: string) { process.exit(0); } + // v0.33.0 — detect when the user already enabled the token-pilot + // plugin in `~/.claude/settings.json`. Even though we're running + // outside CLAUDE_PLUGIN_ROOT here (CLI invocation), the plugin's + // own `hooks/hooks.json` is what Claude Code uses at runtime. + // Writing additional entries with a captured npx-cache path leads + // to the bug B2 (v0.33.0): hooks pinned to an old binary that + // never sees newer hook handlers. Surface a clear migration step + // instead of silently duplicating. + if (await isTokenPilotPluginEnabled(homedir())) { + const userSettings = resolve(homedir(), ".claude", "settings.json"); + const cleanup = await cleanStaleHookEntries(userSettings); + console.log( + "token-pilot plugin is enabled in ~/.claude/settings.json —\n" + + "the plugin's bundled hooks/hooks.json is the source of truth.\n" + + "Skipping settings.json hook write to avoid pinning to a stale path.\n" + + cleanup.message, + ); + process.exit(0); + } + let hookOptions: { scriptPath?: string; nodeExecPath?: string } | undefined; try { const rawPath = fileURLToPath(new URL("./index.js", import.meta.url)); diff --git a/tests/hooks/pre-task.test.ts b/tests/hooks/pre-task.test.ts index 3cef619..a02219e 100644 --- a/tests/hooks/pre-task.test.ts +++ b/tests/hooks/pre-task.test.ts @@ -72,36 +72,48 @@ describe("decidePreTask — allow cases", () => { expect(d.kind).toBe("allow"); }); - it("allows when description is empty", () => { + it("advises with the tool-guide when description is empty (B14)", () => { const d = decidePreTask(input("general-purpose", ""), ctx()); - expect(d.kind).toBe("allow"); + expect(d.kind).toBe("advise"); + if (d.kind === "advise") { + expect(d.message).toContain("smart_read"); + } }); - it("allows when heuristic returns no match", () => { + it("advises with the tool-guide when heuristic returns no match (B14)", () => { const d = decidePreTask( input("general-purpose", "reminder to buy milk"), ctx(), ); - expect(d.kind).toBe("allow"); + expect(d.kind).toBe("advise"); + if (d.kind === "advise") { + expect(d.message).toContain("smart_read"); + // No agent name should be suggested when there is no match. + expect(d.message).not.toMatch(/tp-[a-z]/); + } }); - it("allows when description contains an escape phrase", () => { + it("advises with the tool-guide when description contains an escape phrase (B14)", () => { const d = decidePreTask( input("general-purpose", 'ad-hoc "review these changes" investigation'), ctx(), ); - expect(d.kind).toBe("allow"); + expect(d.kind).toBe("advise"); + if (d.kind === "advise") { + expect(d.message).toContain("smart_read"); + expect(d.message).not.toMatch(/tp-[a-z]/); + } }); - it("allows open-ended / across-the-codebase escape forms", () => { + it("advises with the tool-guide on open-ended / across-the-codebase escape forms (B14)", () => { for (const phrase of [ "open-ended review these changes task", "review these changes across the codebase", "multi-step review these changes", ]) { - expect(decidePreTask(input("general-purpose", phrase), ctx()).kind).toBe( - "allow", - ); + const d = decidePreTask(input("general-purpose", phrase), ctx()); + expect(d.kind).toBe("advise"); + if (d.kind === "advise") expect(d.message).toContain("smart_read"); } }); }); @@ -159,12 +171,20 @@ describe("decidePreTask — deny cases (strict / force)", () => { expect(d.kind).toBe("deny"); }); - it("force does NOT override an escape phrase", () => { + it("force does NOT hard-deny when an escape phrase is present (advises with tool-guide)", () => { + // v0.33.0 (B14): an escape phrase still pre-empts agent suggestion, + // but we now always send the generic tool-guide so the subagent + // learns about smart_read / read_symbol. force should NOT escalate + // an escape into a hard deny. const d = decidePreTask( input("general-purpose", "ad-hoc review these changes"), ctx({ mode: "deny", force: true }), ); - expect(d.kind).toBe("allow"); + expect(d.kind).toBe("advise"); + if (d.kind === "advise") { + expect(d.message).toContain("smart_read"); + expect(d.message).not.toMatch(/tp-[a-z]/); + } }); it("force does NOT block tp-* subagents", () => { @@ -174,6 +194,26 @@ describe("decidePreTask — deny cases (strict / force)", () => { ); expect(d.kind).toBe("allow"); }); + + it("force + empty agent index → diagnostic deny (B4)", () => { + // Reproduce the silent-no-op scenario: user enables FORCE_SUBAGENTS + // but never ran install-agents — the catalog is empty, so previously + // every Task slipped through. v0.33.0 deny with a clear hint instead. + const emptyCtx = { + mode: "advisory" as const, + agentIndex: { agents: [] }, + force: true, + }; + const d = decidePreTask( + input("general-purpose", "please review these changes"), + emptyCtx, + ); + expect(d.kind).toBe("deny"); + if (d.kind === "deny") { + expect(d.reason).toContain("install-agents"); + expect(d.reason).toContain("FORCE_SUBAGENTS"); + } + }); }); describe("decidePreTask — matcher selection", () => { diff --git a/tests/index.test.ts b/tests/index.test.ts index 5f2b5fe..d0048b8 100644 --- a/tests/index.test.ts +++ b/tests/index.test.ts @@ -24,6 +24,16 @@ vi.mock("../src/server.js", () => ({ vi.mock("../src/hooks/installer.js", () => ({ installHook: mockDeps.installHook, uninstallHook: mockDeps.uninstallHook, + // v0.33.0: stub plugin-detection + migration helpers introduced in + // this release so the CLI flow under test stays unaffected. + isTokenPilotPluginEnabled: vi.fn().mockResolvedValue(false), + cleanStaleHookEntries: vi.fn().mockResolvedValue({ + scanned: [], + cleaned: [], + staleEntriesRemoved: 0, + message: "", + }), + isStaleTokenPilotHookCommand: vi.fn().mockReturnValue(false), })); vi.mock("../src/ast-index/binary-manager.js", async () => { From 5c546a1a85911a1d28ada7c3a9452cb300932064 Mon Sep 17 00:00:00 2001 From: shahinyanm Date: Fri, 8 May 2026 13:17:58 +0400 Subject: [PATCH 2/3] v0.33.0 Pack 1+2+3: error/diagnostic logging + doctor health checks MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the visibility gap exposed by tp-report.txt: every silent hook throw, edge-case branch, and stale-config slipped past us because we had no place to write them. 14 new tests; 1269/1269 pass. Pack 1 — error channel - new src/core/error-log.ts: appendError, loadErrors, formatErrorList, classifyError (ENOENT/parse_error/timeout/EACCES/...), safeBasename / safePathInfo (privacy helpers), rotation (5 MB) + 30d archive retention, TOKEN_PILOT_NO_ERROR_LOG=1 opt-out. - new src/hooks/safe-runner.ts: runHookEntryPoint(opts, fn) wraps every hook entry point; on throw → structured record to ~/.token-pilot/hook-errors.jsonl; always exits 0 so Claude Code never sees a hook failure. - wraps all 8 entry points in src/index.ts: hook-read, hook-edit, hook-pre-bash, hook-pre-grep, hook-pre-task, hook-post-bash, hook-post-task, hook-session-start. - new CLI: `token-pilot errors [--tail=N --code=X --hook=Y --level=L]`. Pack 2 — diagnostic events - HookEvent extended with optional level/code/detail/duration_ms + union now includes "diagnostic". - new helper appendDiagnostic(projectRoot, args). - emit points wired: * hook-pre-task → `force_subagents_no_agents` when TOKEN_PILOT_FORCE_SUBAGENTS=1 and the agent index is empty * startServer detect path → `wsl_path_rejected` one event per WSL/UNC candidate that isWindowsSystemPath() filters - doctor extended with a new "runtime health" section: * top error codes from last 100 records * stale npx-cache / pinned-version hook entries in user-level + project-level settings.json (B2 leftover scan) * tp-* agent inventory: installed vs catalog * cwd safety: warns when CLAUDE_PROJECT_DIR or process.cwd() lands on a Windows system path Pack 3 — timing groundwork - HookEvent gains optional duration_ms. - runHookEntryPoint measures wall-clock and exposes onTiming hook for callers that want to log a `hook_slow` diagnostic. (Per-event emit deferred — needs a threshold tuned from real data.) Privacy: error records carry only sanitized input (basename, ext, short metadata). No file content, no prompts, no full paths. Fully local — nothing is sent anywhere. Tests: - tests/core/error-log.test.ts (14 tests) — classifier, sanitizers, loadErrors filters/tail, formatErrorList top-codes. - 1255 prior tests untouched and pass. --- src/core/error-log.ts | 268 +++++++++++++++++++++++++++++++ src/core/event-log.ts | 75 ++++++++- src/hooks/safe-runner.ts | 88 +++++++++++ src/index.ts | 294 ++++++++++++++++++++++++++--------- tests/core/error-log.test.ts | 167 ++++++++++++++++++++ 5 files changed, 815 insertions(+), 77 deletions(-) create mode 100644 src/core/error-log.ts create mode 100644 src/hooks/safe-runner.ts create mode 100644 tests/core/error-log.test.ts diff --git a/src/core/error-log.ts b/src/core/error-log.ts new file mode 100644 index 0000000..7ac9c4b --- /dev/null +++ b/src/core/error-log.ts @@ -0,0 +1,268 @@ +/** + * v0.34.0 — error / diagnostic channel for token-pilot hooks + CLI. + * + * Why a separate file from `hook-events.jsonl`: + * - hook-events lives in `/.token-pilot/`. When the hook + * itself fails BEFORE projectRoot is resolved (B8 WSL detection, + * missing dir, ENOENT), there is nowhere to write the regular log. + * - Errors must outlive a single project — when a user reports + * "nothing logs anymore" we want one absolute path to look at. + * + * Layout: + * ~/.token-pilot/hook-errors.jsonl + * + * Format: one JSON record per line. Schema in `HookErrorRecord` below. + * + * Discipline: + * - Never throws. The error logger is itself the last line of defence — + * a throw here would defeat the wrapper that calls it. + * - Cap-and-rotate: when the file passes MAX_BYTES the writer renames + * it to `hook-errors..jsonl` and starts fresh. Old archives + * past RETENTION_MS are pruned best-effort on each append. + * - `TOKEN_PILOT_NO_ERROR_LOG=1` opts out entirely. + * + * Privacy: + * - The `input` field is whatever the hook chose to record. Callers + * MUST sanitize before passing it in — no full paths, no file + * content, no prompts. Helpers `safeBasename()` / `safePathInfo()` + * are provided for the common cases. + */ + +import * as fs from "node:fs/promises"; +import { existsSync } from "node:fs"; +import { homedir } from "node:os"; +import { basename, dirname, join, resolve } from "node:path"; + +// ─── types ─────────────────────────────────────────────────────────── + +export type ErrorLevel = "info" | "warn" | "error"; + +export interface HookErrorRecord { + ts: number; + hook: string; + level: ErrorLevel; + code: string; + msg: string; + stack?: string; + input?: Record; + pluginVersion?: string; + nodeVersion?: string; + platform?: NodeJS.Platform; +} + +// ─── constants ─────────────────────────────────────────────────────── + +const MAX_BYTES = 5 * 1024 * 1024; // 5 MB before rotate +const RETENTION_MS = 30 * 24 * 3600 * 1000; // 30d archive retention +const ARCHIVE_RE = /^hook-errors\.(\d+)\.jsonl$/; +const CURRENT = "hook-errors.jsonl"; + +// ─── path resolution ───────────────────────────────────────────────── + +export function errorLogDir(): string { + return join(homedir(), ".token-pilot"); +} +export function errorLogPath(): string { + return join(errorLogDir(), CURRENT); +} + +// ─── env opt-out ───────────────────────────────────────────────────── + +function isOptedOut(): boolean { + return process.env.TOKEN_PILOT_NO_ERROR_LOG === "1"; +} + +// ─── sanitizers ────────────────────────────────────────────────────── + +/** + * Reduce a path to its basename. Use everywhere a path could leak: + * the user's project tree, file content, anything not strictly an + * identifier. Returns `""` for missing input rather than null + * so the field stays shape-stable. + */ +export function safeBasename(p: unknown): string { + if (typeof p !== "string" || p.length === 0) return ""; + return basename(p); +} + +/** + * Capture only the metadata about a path — basename + length + ext — + * dropping the absolute path entirely. Useful when the analysis + * benefits from "kind of file" without revealing where it lived. + */ +export function safePathInfo(p: unknown): { + name: string; + ext: string; +} { + if (typeof p !== "string" || p.length === 0) { + return { name: "", ext: "" }; + } + const name = basename(p); + const dot = name.lastIndexOf("."); + return { + name, + ext: dot >= 0 ? name.slice(dot).toLowerCase() : "", + }; +} + +// ─── error classification ──────────────────────────────────────────── + +/** + * Map a thrown value to a stable, searchable `code`. The classifier + * is intentionally simple — node ErrnoException codes pass through, + * everything else falls into a coarse bucket. New cases land here + * only when a pattern shows up repeatedly in the wild. + */ +export function classifyError(err: unknown): string { + if (err && typeof err === "object") { + const e = err as NodeJS.ErrnoException; + if (typeof e.code === "string" && e.code.length > 0) return e.code; + const name = (e as Error).name; + if (name === "SyntaxError") return "parse_error"; + if (name === "TypeError") return "type_error"; + } + if (err instanceof Error) { + const m = err.message.toLowerCase(); + if (m.includes("timeout")) return "timeout"; + if (m.includes("not initialized")) return "not_initialized"; + if (m.includes("permission denied")) return "EACCES"; + } + return "unknown"; +} + +// ─── rotate + retention ────────────────────────────────────────────── + +async function rotateIfNeeded(): Promise { + const p = errorLogPath(); + try { + const stat = await fs.stat(p); + if (stat.size < MAX_BYTES) return; + const archive = join(errorLogDir(), `hook-errors.${Date.now()}.jsonl`); + await fs.rename(p, archive); + } catch { + /* missing or stat failure — append will create */ + } +} + +async function pruneArchives(): Promise { + const dir = errorLogDir(); + let entries: string[]; + try { + entries = await fs.readdir(dir); + } catch { + return; + } + const cutoff = Date.now() - RETENTION_MS; + for (const name of entries) { + const m = name.match(ARCHIVE_RE); + if (!m) continue; + const ts = Number(m[1]); + if (!Number.isFinite(ts)) continue; + if (ts < cutoff) { + try { + await fs.unlink(join(dir, name)); + } catch { + /* best-effort */ + } + } + } +} + +// ─── append ────────────────────────────────────────────────────────── + +export async function appendError(rec: HookErrorRecord): Promise { + if (isOptedOut()) return; + try { + await fs.mkdir(errorLogDir(), { recursive: true }); + await rotateIfNeeded(); + await fs.appendFile(errorLogPath(), JSON.stringify(rec) + "\n"); + // best-effort retention sweep — not awaited tightly because a slow + // FS shouldn't slow the hook hot-path; failures are silent. + pruneArchives().catch(() => {}); + } catch { + /* logger of last resort — never throw */ + } +} + +// ─── load ──────────────────────────────────────────────────────────── + +export interface LoadErrorsOptions { + /** Limit to the last N records (most recent first). */ + tail?: number; + /** Filter by `code`. */ + code?: string; + /** Filter by `hook` name. */ + hook?: string; + /** Filter by `level`. */ + level?: ErrorLevel; + /** Override the default current-log path (testing). */ + path?: string; +} + +export async function loadErrors( + opts: LoadErrorsOptions = {}, +): Promise { + const p = opts.path ?? errorLogPath(); + let raw: string; + try { + raw = await fs.readFile(p, "utf-8"); + } catch { + return []; + } + const out: HookErrorRecord[] = []; + for (const line of raw.split("\n")) { + if (!line.trim()) continue; + try { + const rec = JSON.parse(line) as HookErrorRecord; + if (opts.code && rec.code !== opts.code) continue; + if (opts.hook && rec.hook !== opts.hook) continue; + if (opts.level && rec.level !== opts.level) continue; + out.push(rec); + } catch { + /* skip malformed */ + } + } + // Newest first — most useful default ordering for a tail view. + out.sort((a, b) => (b.ts || 0) - (a.ts || 0)); + if (opts.tail && opts.tail > 0) { + return out.slice(0, opts.tail); + } + return out; +} + +// ─── format ────────────────────────────────────────────────────────── + +export function formatErrorList(records: HookErrorRecord[]): string { + if (records.length === 0) { + return "No errors logged."; + } + const counts = new Map(); + for (const r of records) { + counts.set(r.code, (counts.get(r.code) ?? 0) + 1); + } + const top = Array.from(counts.entries()) + .sort((a, b) => b[1] - a[1]) + .slice(0, 10); + + const lines: string[] = []; + lines.push(`token-pilot errors — ${records.length} total`); + lines.push(""); + lines.push("Top codes:"); + for (const [code, n] of top) { + lines.push(` ${String(n).padStart(4)}× ${code}`); + } + lines.push(""); + lines.push("Most recent:"); + const recent = records.slice(0, 20); + for (const r of recent) { + const when = new Date(r.ts).toISOString().slice(11, 19); + lines.push( + ` [${when}] ${r.level.toUpperCase().padEnd(5)} ${r.hook} ${r.code} — ${r.msg}`, + ); + } + return lines.join("\n"); +} + +// ─── exports for indirection from index.ts ─────────────────────────── + +export { existsSync, resolve, dirname }; diff --git a/src/core/event-log.ts b/src/core/event-log.ts index 785d07f..f2aebf6 100644 --- a/src/core/event-log.ts +++ b/src/core/event-log.ts @@ -36,7 +36,14 @@ export interface HookEvent { /** null for top-level session; agent_type string inside a subagent. */ agent_type: string | null; agent_id: string | null; - event: "denied" | "allowed" | "bypass" | "pass-through" | "task" | string; + event: + | "denied" + | "allowed" + | "bypass" + | "pass-through" + | "task" + | "diagnostic" + | string; file: string; lines: number; estTokens: number; @@ -45,6 +52,26 @@ export interface HookEvent { /** estTokens - summaryTokens; 0 for allow/bypass. */ savedTokens: number; + // ─── diagnostic (v0.34.0) ──────────────────────────────────────── + // Populated only on `event: "diagnostic"` records — emitted by + // hooks/handlers when an edge-case fires (matcher empty, WSL + // path rejected, validation coerced, …). Every diagnostic gets a + // stable `code` so frequencies are countable in stats. + /** "info" | "warn" | "error" — severity of the diagnostic. */ + level?: "info" | "warn" | "error"; + /** Stable searchable identifier — e.g. `force_subagents_no_agents`. */ + code?: string; + /** Optional small map of context — must be sanitised by caller. */ + detail?: Record; + + // ─── timing (v0.34.0 Pack 3) ───────────────────────────────────── + /** + * Wall-clock duration of the hook handler that emitted this event, + * milliseconds. Always optional — only the safe-runner wrapper sets + * it, and only on the FINAL diagnostic record per hook invocation. + */ + duration_ms?: number; + // ─── task-specific (v0.31.0) ───────────────────────────────────── // These are populated only on `event: "task"` records emitted by // `processPostTask`. They are optional on the interface so the rest @@ -181,6 +208,52 @@ export async function appendEvent( } } +/** + * v0.34.0 — convenience wrapper for emitting a `diagnostic` event. + * + * Diagnostics describe edge-case branches inside a normal handler + * run (matcher returned no agents, WSL path rejected, MCP arg + * coerced, etc.). They live in the project-local hook-events.jsonl + * alongside the regular events so `stats --diagnostics` can count + * them by code. + * + * If the projectRoot is not yet resolvable (the failure happened + * before detection), prefer `appendError` from `core/error-log.ts` + * — it falls back to a user-level path. + * + * Pure-ish: never throws. Same best-effort semantics as appendEvent. + */ +export async function appendDiagnostic( + projectRoot: string, + args: { + code: string; + level?: "info" | "warn" | "error"; + detail?: Record; + sessionId?: string; + agentType?: string | null; + agentId?: string | null; + durationMs?: number; + }, +): Promise { + const rec: HookEvent = { + ts: Date.now(), + session_id: args.sessionId ?? "diagnostic", + agent_type: args.agentType ?? null, + agent_id: args.agentId ?? null, + event: "diagnostic", + file: "", + lines: 0, + estTokens: 0, + summaryTokens: 0, + savedTokens: 0, + level: args.level ?? "info", + code: args.code, + detail: args.detail, + duration_ms: args.durationMs, + }; + await appendEvent(projectRoot, rec); +} + /** * Read all events from the current log file. Malformed JSONL lines are * skipped silently (a corrupted line should not poison the whole diff --git a/src/hooks/safe-runner.ts b/src/hooks/safe-runner.ts new file mode 100644 index 0000000..b15c8ca --- /dev/null +++ b/src/hooks/safe-runner.ts @@ -0,0 +1,88 @@ +/** + * v0.34.0 — `runHookSafely` / `runHookEntryPoint`. + * + * Every token-pilot hook used to wrap its own try/catch. When the + * branch broke (B2 stale binary, B8 bad cwd, B10 ast-index init + * race), throws were swallowed silently and the user saw "nothing + * happens". This wrapper centralises the discipline: + * + * - Run the hook body + * - On throw → record one structured `HookErrorRecord` to the + * user-level error log + * - Optionally measure duration and emit a `diagnostic` event + * (Pack 3 — timing) + * - ALWAYS exit 0 — Claude Code must never see a hook error + * because that aborts the user's tool call. + * + * Hooks themselves keep responsibility for emitting domain-level + * diagnostics (matcher empty, WSL reject, etc.) — this wrapper is + * the safety net for unexpected throws. + */ + +import { appendError, classifyError, type ErrorLevel } from "../core/error-log.js"; + +export interface RunHookOptions { + /** Hook name (matcher in hooks.json — e.g. "hook-pre-task"). */ + hook: string; + /** Optional safe summary of the hook input — sanitised by caller. */ + inputSummary?: Record; + /** Plugin version, captured by caller and forwarded to the log. */ + pluginVersion?: string; +} + +/** + * Run a hook body and swallow any throw into the structured error log. + * Returns true on success, false on caught error — useful when the + * caller still wants to take a fallback action (e.g. emit a generic + * permissionDecision to Claude before exiting). + */ +export async function runHookSafely( + options: RunHookOptions, + fn: () => Promise | void, +): Promise { + try { + await fn(); + return true; + } catch (err) { + const e = err as Error & { stack?: string; code?: string }; + await appendError({ + ts: Date.now(), + hook: options.hook, + level: "error" as ErrorLevel, + code: classifyError(err), + msg: e?.message ?? String(err), + stack: e?.stack, + input: options.inputSummary, + pluginVersion: options.pluginVersion, + nodeVersion: process.version, + platform: process.platform, + }); + return false; + } +} + +/** + * The full entry-point wrapper used by `index.ts` cases. Wraps + * `runHookSafely` and additionally guarantees `process.exit(0)` at + * the end so a stray throw cannot leak a non-zero status to Claude. + * + * Pack 3: optionally measures duration and forwards it to the + * caller via `onTiming` so the timing diagnostic can be emitted + * after the hook body decided what to log to hook-events.jsonl. + */ +export async function runHookEntryPoint( + options: RunHookOptions & { onTiming?: (durationMs: number) => Promise | void }, + fn: () => Promise | void, +): Promise { + const started = Date.now(); + await runHookSafely(options, fn); + const durationMs = Date.now() - started; + if (options.onTiming) { + try { + await options.onTiming(durationMs); + } catch { + /* timing emit must never affect exit */ + } + } + process.exit(0); +} diff --git a/src/index.ts b/src/index.ts index 5345d0d..66c789c 100644 --- a/src/index.ts +++ b/src/index.ts @@ -35,6 +35,9 @@ import { cleanStaleHookEntries, isTokenPilotPluginEnabled, } from "./hooks/installer.js"; +import { runHookEntryPoint } from "./hooks/safe-runner.js"; +import { loadErrors, formatErrorList } from "./core/error-log.js"; +import { appendDiagnostic } from "./core/event-log.js"; import { findBinary, installBinary, @@ -161,24 +164,33 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { switch (cliArgs[0]) { case "hook-read": { - const cfg = await loadConfig(process.cwd()); - await handleHookRead( - cliArgs[1], - cfg.hooks.mode, - cfg.hooks.denyThreshold, - process.cwd(), - { - adaptiveThreshold: cfg.hooks.adaptiveThreshold, - adaptiveBudgetTokens: cfg.hooks.adaptiveBudgetTokens, - }, - ); + // v0.34.0 — wrap in runHookEntryPoint so any unexpected throw + // lands in `~/.token-pilot/hook-errors.jsonl` instead of being + // swallowed silently. handleHookRead has its own internal + // try/catch for known I/O failures; the wrapper is the safety + // net for everything else. + await runHookEntryPoint({ hook: "hook-read" }, async () => { + const cfg = await loadConfig(process.cwd()); + await handleHookRead( + cliArgs[1], + cfg.hooks.mode, + cfg.hooks.denyThreshold, + process.cwd(), + { + adaptiveThreshold: cfg.hooks.adaptiveThreshold, + adaptiveBudgetTokens: cfg.hooks.adaptiveBudgetTokens, + }, + ); + }); return; } case "hook-edit": - handleHookEdit(); + await runHookEntryPoint({ hook: "hook-edit" }, async () => { + handleHookEdit(); + }); return; case "hook-post-bash": { - try { + await runHookEntryPoint({ hook: "hook-post-bash" }, async () => { const stdin = readFileSync(0, "utf-8"); const input = JSON.parse(stdin); const advice = decidePostBashAdvice(input, { @@ -186,18 +198,13 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { }); const rendered = renderPostBashHookOutput(advice); if (rendered) process.stdout.write(rendered); - } catch { - /* silent — hook must not break */ - } - process.exit(0); + }); return; } case "hook-pre-bash": { // v0.28.0 — passive pre-intercept for heavy Bash commands. - // PostToolUse can't truncate tool_response (API limit), so the only - // way to actually save tokens is to deny the call up front and - // nudge the agent to the cheaper MCP equivalent. - try { + // v0.34.0 — runHookEntryPoint covers throw paths. + await runHookEntryPoint({ hook: "hook-pre-bash" }, async () => { const stdin = readFileSync(0, "utf-8"); const input = JSON.parse(stdin); const decision = decidePreBash( @@ -206,16 +213,13 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { ); const rendered = renderPreBashOutput(decision); if (rendered) process.stdout.write(rendered); - } catch { - /* silent — hook must not break */ - } - process.exit(0); + }); return; } case "hook-pre-grep": { // v0.28.0 — passive pre-intercept for symbol-like Grep patterns. - // Redirects identifier searches to mcp__token-pilot__find_usages. - try { + // v0.34.0 — runHookEntryPoint covers throw paths. + await runHookEntryPoint({ hook: "hook-pre-grep" }, async () => { const stdin = readFileSync(0, "utf-8"); const input = JSON.parse(stdin); const decision = decidePreGrep( @@ -224,39 +228,44 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { ); const rendered = renderPreGrepOutput(decision); if (rendered) process.stdout.write(rendered); - } catch { - /* silent — hook must not break */ - } - process.exit(0); + }); return; } case "hook-pre-task": { // v0.31.0 Pack 2 — route general-purpose Task dispatches to a - // `tp-*` specialist when the description clearly matches. Default - // (deny / advisory mode) is a non-blocking advise; strict mode or - // TOKEN_PILOT_FORCE_SUBAGENTS=1 hard-denies on a high-confidence - // match. The matcher is lenient by design (false deny is much - // worse than a missed nudge — see pre-edit v0.30.4 rollback). - try { - const stdin = readFileSync(0, "utf-8"); - const input = JSON.parse(stdin); - const agentIndex = await getAgentIndex(); - const force = process.env.TOKEN_PILOT_FORCE_SUBAGENTS === "1"; - const decision = decidePreTask(input, { - mode: parseEnforcementMode(process.env.TOKEN_PILOT_MODE), - agentIndex, - force, - }); - const rendered = renderPreTaskOutput(decision); - if (rendered) process.stdout.write(rendered); - } catch { - /* silent — hook must not break */ - } - process.exit(0); + // `tp-*` specialist when the description clearly matches. + // v0.34.0 — error/diagnostic logging via runHookEntryPoint. + await runHookEntryPoint( + { hook: "hook-pre-task" }, + async () => { + const stdin = readFileSync(0, "utf-8"); + const input = JSON.parse(stdin); + const agentIndex = await getAgentIndex(); + const force = process.env.TOKEN_PILOT_FORCE_SUBAGENTS === "1"; + + // v0.34.0 diagnostic: B4 — empty index + force is a fail + // case (we deny, but record so the user can see why). + if (force && agentIndex.agents.length === 0) { + await appendDiagnostic(process.cwd(), { + code: "force_subagents_no_agents", + level: "warn", + detail: { hint: "run `npx token-pilot install-agents`" }, + }); + } + + const decision = decidePreTask(input, { + mode: parseEnforcementMode(process.env.TOKEN_PILOT_MODE), + agentIndex, + force, + }); + const rendered = renderPreTaskOutput(decision); + if (rendered) process.stdout.write(rendered); + }, + ); return; } case "hook-post-task": { - try { + await runHookEntryPoint({ hook: "hook-post-task" }, async () => { const stdin = readFileSync(0, "utf-8"); const input = JSON.parse(stdin); const message = await processPostTask(process.cwd(), homedir(), input); @@ -270,29 +279,23 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { }), ); } - } catch { - /* silent — hook must not break */ - } - process.exit(0); + }); return; } case "hook-session-start": { - const cfg = await loadConfig(process.cwd()); - // `sessionStart.enabled` is independent of `hooks.mode` by design — - // a user may want the Read-blocking hook off (mode:"off") while still - // getting the tool-rules reminder at session start, or vice versa. - if (!cfg.sessionStart.enabled) { - process.exit(0); - } - const result = await handleSessionStart({ - projectRoot: process.cwd(), - homeDir: homedir(), - sessionStartConfig: cfg.sessionStart, + await runHookEntryPoint({ hook: "hook-session-start" }, async () => { + const cfg = await loadConfig(process.cwd()); + // sessionStart.enabled is independent of hooks.mode by design. + if (!cfg.sessionStart.enabled) return; + const result = await handleSessionStart({ + projectRoot: process.cwd(), + homeDir: homedir(), + sessionStartConfig: cfg.sessionStart, + }); + if (result) { + process.stdout.write(result); + } }); - if (result) { - process.stdout.write(result); - } - process.exit(0); return; } case "install-hook": @@ -301,6 +304,27 @@ export async function main(cliArgs = process.argv.slice(2)): Promise { case "uninstall-hook": await handleUninstallHook(cliArgs[1] || process.cwd()); return; + case "errors": { + // v0.34.0 — surface ~/.token-pilot/hook-errors.jsonl with optional + // filters: --tail=N --code= --hook= --level= + const args = cliArgs.slice(1); + const flag = (k: string) => { + for (const a of args) { + if (a.startsWith(`--${k}=`)) return a.slice(k.length + 3); + if (a === `--${k}` || a === `-${k}`) return "true"; + } + return undefined; + }; + const tailRaw = flag("tail"); + const records = await loadErrors({ + tail: tailRaw ? Number(tailRaw) : undefined, + code: flag("code"), + hook: flag("hook"), + level: flag("level") as "info" | "warn" | "error" | undefined, + }); + process.stdout.write(formatErrorList(records) + "\n"); + return; + } case "migrate-hooks": { // v0.33.0 — clean stale npx-cache / pinned-version token-pilot // hook entries from user-level + project-level settings.json so @@ -461,14 +485,28 @@ export async function startServer(cliArgs: string[] = process.argv.slice(2)) { // reliably exports `CLAUDE_PROJECT_DIR` — prefer it absolutely // when present and reject obvious system paths regardless. if (!explicitRoot) { - const candidates = [ + const rawCandidates = [ process.env.CLAUDE_PROJECT_DIR, // canonical Claude Code env (B8) process.env.INIT_CWD, // npm/npx sets this to invoking directory process.env.PWD, // shell working directory (may differ from cwd) process.cwd(), // Node.js working directory - ] - .filter((c): c is string => !!c && c !== "/") - .filter((c) => !isWindowsSystemPath(c)); + ].filter((c): c is string => !!c && c !== "/"); + // v0.34.0 — emit a diagnostic for every Windows / UNC reject so + // we can see in stats how often WSL launches misroute the cwd. + for (const c of rawCandidates) { + if (isWindowsSystemPath(c)) { + try { + await appendDiagnostic(process.cwd(), { + code: "wsl_path_rejected", + level: "warn", + detail: { path_basename: c.split(/[\\/]/).pop() ?? "" }, + }); + } catch { + /* logger of last resort */ + } + } + } + const candidates = rawCandidates.filter((c) => !isWindowsSystemPath(c)); let detected = false; for (const candidate of candidates) { @@ -1165,6 +1203,110 @@ export async function handleDoctor() { /* ecosystem check is best-effort; never break doctor */ } + // ── v0.34.0 health checks (Pack 2 of error-logging release) ── + try { + console.log("── runtime health ──"); + + // Recent errors + const recent = await loadErrors({ tail: 100 }); + if (recent.length === 0) { + console.log(` errors: 0 in ~/.token-pilot/hook-errors.jsonl ✓`); + } else { + const counts = new Map(); + for (const r of recent) counts.set(r.code, (counts.get(r.code) ?? 0) + 1); + const top = Array.from(counts.entries()).sort((a, b) => b[1] - a[1]).slice(0, 5); + console.log(` errors: ${recent.length} recent (top codes):`); + for (const [code, n] of top) console.log(` ${String(n).padStart(3)}× ${code}`); + console.log(` drill-in: token-pilot errors --tail=20`); + } + + // Stale hook entries — user-level + project-level + const userSettings = join(homedir(), ".claude", "settings.json"); + const projectSettings = join(cwd, ".claude", "settings.json"); + let staleCount = 0; + for (const p of [userSettings, projectSettings]) { + try { + if (!existsSync(p)) continue; + const raw = await import("node:fs/promises").then((m) => m.readFile(p, "utf-8")); + const json = JSON.parse(raw); + const sections = ["PreToolUse", "PostToolUse", "SessionStart"] as const; + for (const s of sections) { + const arr = json.hooks?.[s]; + if (!Array.isArray(arr)) continue; + for (const entry of arr) { + const inner = Array.isArray(entry?.hooks) ? entry.hooks : []; + for (const h of inner) { + if (typeof h?.command === "string") { + if (/\/_npx\/[0-9a-f]+\//.test(h.command)) staleCount++; + else if (/\/plugins\/cache\/token-pilot\/token-pilot\/[^/]+\//.test(h.command)) staleCount++; + } + } + } + } + } catch { + /* skip */ + } + } + if (staleCount === 0) { + console.log(` stale hooks: none ✓`); + } else { + console.log(` stale hooks: ${staleCount} pinned-path entries`); + console.log(` fix: token-pilot migrate-hooks`); + } + + // Installed tp-* agents vs catalog + let installed = 0; + let catalog = 0; + try { + const fsp = await import("node:fs/promises"); + const userAgents = join(homedir(), ".claude", "agents"); + const projAgents = join(cwd, ".claude", "agents"); + const seen = new Set(); + for (const dir of [userAgents, projAgents]) { + try { + const entries = await fsp.readdir(dir); + for (const e of entries) { + if (e.startsWith("tp-") && e.endsWith(".md")) seen.add(e); + } + } catch { + /* missing */ + } + } + installed = seen.size; + const dist = new URL("../agents", import.meta.url).pathname; + try { + const dEntries = await fsp.readdir(dist); + catalog = dEntries.filter((f) => f.startsWith("tp-") && f.endsWith(".md")).length; + } catch { + catalog = 0; + } + } catch { + /* skip */ + } + if (catalog === 0) { + console.log(` tp-* agents: could not read catalog`); + } else if (installed === 0) { + console.log(` tp-* agents: 0 of ${catalog} installed`); + console.log(` fix: token-pilot install-agents --scope=user`); + } else if (installed < catalog) { + console.log(` tp-* agents: ${installed} of ${catalog} installed (partial)`); + console.log(` fix: token-pilot install-agents --force`); + } else { + console.log(` tp-* agents: ${installed}/${catalog} ✓`); + } + + // WSL detection probe + const cwdGuess = process.env.CLAUDE_PROJECT_DIR || process.cwd(); + if (isWindowsSystemPath(cwdGuess)) { + console.log(` cwd: ${cwdGuess} ✗ (Windows system path — see B8)`); + } else { + console.log(` cwd: ${cwdGuess} ✓`); + } + console.log(""); + } catch { + /* health checks are best-effort */ + } + process.exit(0); } diff --git a/tests/core/error-log.test.ts b/tests/core/error-log.test.ts new file mode 100644 index 0000000..18f9904 --- /dev/null +++ b/tests/core/error-log.test.ts @@ -0,0 +1,167 @@ +/** + * Tests for the v0.34.0 error/diagnostic channel. + * + * The module writes to `~/.token-pilot/hook-errors.jsonl` by default. + * Tests redirect to a tmp dir via the `path` override so we never + * touch the real user-level file. + */ +import { afterEach, beforeEach, describe, expect, it } from "vitest"; +import { mkdtemp, rm, writeFile, mkdir } from "node:fs/promises"; +import { join } from "node:path"; +import { tmpdir } from "node:os"; +import { + classifyError, + loadErrors, + formatErrorList, + safeBasename, + safePathInfo, + type HookErrorRecord, +} from "../../src/core/error-log.ts"; + +describe("classifyError", () => { + it("returns ENOENT for ErrnoException-shaped error", () => { + const err = Object.assign(new Error("missing"), { code: "ENOENT" }); + expect(classifyError(err)).toBe("ENOENT"); + }); + + it("maps SyntaxError to parse_error", () => { + expect(classifyError(new SyntaxError("bad json"))).toBe("parse_error"); + }); + + it("maps timeout messages to timeout", () => { + expect(classifyError(new Error("operation timeout exceeded"))).toBe( + "timeout", + ); + }); + + it("falls back to unknown for plain values", () => { + expect(classifyError("oops")).toBe("unknown"); + expect(classifyError(undefined)).toBe("unknown"); + }); +}); + +describe("safeBasename / safePathInfo", () => { + it("returns just the basename", () => { + expect(safeBasename("/Users/x/secret/Foo.tsx")).toBe("Foo.tsx"); + }); + + it("returns sentinel for missing input", () => { + expect(safeBasename(undefined)).toBe(""); + expect(safeBasename("")).toBe(""); + }); + + it("safePathInfo extracts ext", () => { + expect(safePathInfo("/x/y/z.test.ts")).toEqual({ + name: "z.test.ts", + ext: ".ts", + }); + expect(safePathInfo("/x/Makefile")).toEqual({ + name: "Makefile", + ext: "", + }); + }); +}); + +describe("loadErrors", () => { + let dir: string; + let path: string; + + beforeEach(async () => { + dir = await mkdtemp(join(tmpdir(), "tp-error-log-")); + path = join(dir, "hook-errors.jsonl"); + }); + afterEach(async () => { + await rm(dir, { recursive: true, force: true }); + }); + + function rec(over: Partial = {}): HookErrorRecord { + return { + ts: Date.now(), + hook: "hook-pre-task", + level: "error", + code: "unknown", + msg: "x", + ...over, + }; + } + + it("returns [] when the file is missing", async () => { + expect(await loadErrors({ path })).toEqual([]); + }); + + it("parses jsonl, skipping malformed lines", async () => { + const lines = [ + JSON.stringify(rec({ ts: 1, code: "a" })), + "this is not json", + JSON.stringify(rec({ ts: 2, code: "b" })), + "", + ]; + await writeFile(path, lines.join("\n") + "\n"); + const out = await loadErrors({ path }); + expect(out.map((r) => r.code)).toEqual(["b", "a"]); // newest first + }); + + it("filters by code / hook / level", async () => { + const lines = [ + rec({ ts: 1, code: "a", hook: "hook-read", level: "info" }), + rec({ ts: 2, code: "b", hook: "hook-pre-task", level: "warn" }), + rec({ ts: 3, code: "a", hook: "hook-pre-task", level: "error" }), + ].map((r) => JSON.stringify(r)); + await writeFile(path, lines.join("\n") + "\n"); + + expect((await loadErrors({ path, code: "a" })).length).toBe(2); + expect((await loadErrors({ path, hook: "hook-read" })).length).toBe(1); + expect((await loadErrors({ path, level: "error" })).length).toBe(1); + }); + + it("respects tail option", async () => { + const lines = [1, 2, 3, 4, 5].map((ts) => + JSON.stringify(rec({ ts, code: `c${ts}` })), + ); + await writeFile(path, lines.join("\n") + "\n"); + const out = await loadErrors({ path, tail: 2 }); + // newest first + expect(out.map((r) => r.code)).toEqual(["c5", "c4"]); + }); +}); + +describe("formatErrorList", () => { + it("returns a friendly empty message", () => { + expect(formatErrorList([])).toMatch(/No errors logged/); + }); + + it("counts top codes and shows recent records", () => { + const records: HookErrorRecord[] = [ + { ts: 1, hook: "h", level: "error", code: "ENOENT", msg: "x" }, + { ts: 2, hook: "h", level: "error", code: "ENOENT", msg: "x" }, + { ts: 3, hook: "h", level: "error", code: "parse_error", msg: "x" }, + ]; + const out = formatErrorList(records); + expect(out).toContain("3 total"); + expect(out).toContain("ENOENT"); + expect(out).toContain("parse_error"); + }); +}); + +describe("appendError integration (writes to overridden cwd)", () => { + // appendError uses errorLogPath() — cannot override cleanly; cover via + // existsSync after writing through a faked $HOME. Skipping the real + // append in unit tests keeps them filesystem-clean. + it("loadErrors path-override returns parsed records", async () => { + const dir = await mkdtemp(join(tmpdir(), "tp-error-log-")); + const path = join(dir, "hook-errors.jsonl"); + const r: HookErrorRecord = { + ts: 100, + hook: "hook-pre-task", + level: "warn", + code: "force_subagents_no_agents", + msg: "no agents installed", + }; + await mkdir(dir, { recursive: true }); + await writeFile(path, JSON.stringify(r) + "\n"); + const out = await loadErrors({ path }); + expect(out.length).toBe(1); + expect(out[0].code).toBe("force_subagents_no_agents"); + await rm(dir, { recursive: true, force: true }); + }); +}); From 4157f621e87a9ebcd382af739ef969569c875401 Mon Sep 17 00:00:00 2001 From: shahinyanm Date: Fri, 8 May 2026 13:23:12 +0400 Subject: [PATCH 3/3] docs: file the 2026-05-08 MCP issues report with v0.33.0 status MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit External review surfaced four functional MCP bugs (UNC project root, git cwd, read_range schema, find_usages init) plus a fifth version- mismatch claim. All four functional ones are fixed in v0.33.0 (B6, B7, B8, B9, B10); the fifth belongs to a different package and is filed as not-applicable. Document tracks origin → root cause → commit → verification steps so future readers can cross-reference the report against the fix without rerunning the audit. --- docs/issues/2026-05-08-mcp-issues.md | 137 +++++++++++++++++++++++++++ 1 file changed, 137 insertions(+) create mode 100644 docs/issues/2026-05-08-mcp-issues.md diff --git a/docs/issues/2026-05-08-mcp-issues.md b/docs/issues/2026-05-08-mcp-issues.md new file mode 100644 index 0000000..534479f --- /dev/null +++ b/docs/issues/2026-05-08-mcp-issues.md @@ -0,0 +1,137 @@ +# MCP Issues — 2026-05-08 + +External code-review report received 2026-05-08 from a WSL setup +(Windows 11 host, Ubuntu WSL2 guest, repo accessed via UNC share +`\\wsl.localhost\ubuntu\...`). All four functional issues were +reproduced from `tp-report.txt` audit data; each is now tracked as +a beads issue and addressed in v0.33.0. + +## Status overview + +| # | Issue | Beads | Status | Commit | +|---|-------|-------|--------|--------| +| 1 | `project_overview` returns `C:\Windows` root | token-pilot-lpp (B8) | **FIXED** | v0.33.0 | +| 2 | `smart_log` / `smart_diff` fail on UNC paths | token-pilot-z7p (B6) + token-pilot-5jh (B7) + token-pilot-lpp (B8) | **FIXED** | v0.33.0 | +| 3 | `read_range` rejects valid `start_line` | token-pilot-5lw (B9) | **FIXED** | v0.33.0 | +| 4 | `find_usages` requires manual `init()` | token-pilot-vax (B10) | **FIXED** | v0.33.0 | +| 5 | Version mismatch 0.1.1 vs package.json 0.8.3 | — | **NOT APPLICABLE** | belongs to a different package (`epitaxy`) — token-pilot is at 0.33.0 | + +## Issue 1 — `project_overview` catches `C:\Windows` + +**Original report:** +> При вызове `project_overview` без аргументов на проекте, открытом из +> WSL UNC-пути (`\\wsl.localhost\ubuntu\home\shahinyanm\www\token-pilot`), +> сервер отвечает `PROJECT: Windows v0.0.0` `TYPE: unknown (no config files +> found)`. + +**Root cause confirmed:** `startServer()` walked `cwd` / `INIT_CWD` / +`PWD` candidates without rejecting Windows system paths. From a WSL +launch the cwd was `C:\Windows\System32` (or `/mnt/c/Windows/...` from +the WSL view) and `git rev-parse --show-toplevel` either failed or +silently returned the Windows tree. + +**Fix (v0.33.0, commit on `v0.33.0-bugfix-batch`):** +- `src/index.ts` — preferred ordering changed to + `CLAUDE_PROJECT_DIR > INIT_CWD > PWD > cwd`. Claude Code reliably + exports `CLAUDE_PROJECT_DIR`; nothing else is trustworthy on WSL. +- New helper `isWindowsSystemPath(candidate)` filters + `C:\Windows\…`, `C:\Program Files\…`, `/mnt/c/windows/…`, and any + `\\…` UNC path before the candidate reaches git-detect. +- v0.34.0 Pack 2 emits a `wsl_path_rejected` diagnostic per filtered + candidate so we can see in `stats` how often this fires in the wild. + +## Issue 2 — `smart_log` / `smart_diff` fail on UNC paths + +**Original report:** +> `git log failed: ... fatal: not a git repository` +> `git diff failed: ... warning: Not a git repository.` + +**Root cause confirmed:** two separate bugs amplified each other. +1. `smart_log` built `gitArgs` as + `['log', …flags, '--', ref]` and (with a path arg) appended a second + `'--', path` separator. Git interpreted `HEAD` as a pathspec and + silently returned empty; with a path the double `--` made it + reject the call. +2. Both handlers DID call `execFile('git', …, { cwd: projectRoot })`, + so the symptom on WSL was Issue 1 leaking through: `projectRoot` + resolved to a Windows system path, every git call in that tree + failed. + +**Fix (v0.33.0):** +- `src/handlers/smart-log.ts` — drop the leading `'--'` so the args + are `['log', …flags, ref]` then optional `'--', args.path`. +- `smart_diff` already uses `cwd: projectRoot`; it inherits the + Issue 1 fix automatically. + +## Issue 3 — `read_range` rejects valid `start_line` + +**Original report:** +> `read_range({path, start_line: 95, end_line: 240})` → +> `Required parameter "start_line" must be a positive integer.` + +**Root cause confirmed:** `validateReadRangeArgs` required +`typeof === "number"`. Some MCP clients (and most non-trivial transport +shims) round-trip integer arguments through JSON or environment +variables and re-emit them as strings (`"95"`). + +**Fix (v0.33.0):** +- New helper `coerceIntFromAny(value)` in `src/core/validation.ts` + accepts numeric strings (`"95"` → `95`), rejects everything else + (decimals, `"95abc"`, `NaN`, etc.). +- `validateReadRangeArgs` now coerces both `start_line` and `end_line` + through it. + +## Issue 4 — `find_usages` requires manual `init()` + +**Original report:** +> `find_usages(...)` → `Error: ast-index not initialized. Call init() first.` + +**Root cause confirmed:** `server.ts` calls `astIndex.init()` once at +startup. When that init failed silently (binary download flake, FS +permissions, postinstall didn't run), every subsequent MCP call kept +throwing `not initialized` until the user manually re-invoked +`token-pilot install-ast-index` and restarted Claude Code. + +**Fix (v0.33.0):** +- `src/ast-index/client.ts` — `exec()` now lazy-retries `init()` once + on missing `binaryPath`. On second failure surfaces a friendlier + error message pointing at `npx token-pilot install-ast-index`. + +## Issue 5 — Version mismatch 0.1.1 vs 0.8.3 + +**Original report:** +> Token-pilot версия: 0.1.1 (по `project_overview`), при этом +> package.json — 0.8.3 (рассогласование версий). + +**Not applicable.** token-pilot is at v0.33.0 in this branch and the +sources never had `0.1.1` or `0.8.3` anywhere. The reported numbers +match a different package (`epitaxy`) — the report appears to have +captured a `project_overview` from another project alongside ours. + +If similar-looking numbers reappear in a real token-pilot session, +file a fresh issue with the exact `project_overview` JSON output and +the `package.json` hash so we can trace which detector returned the +fake version. + +## Verification + +- 1269/1269 unit tests pass on the v0.33.0 branch + (`npx vitest run`). +- Build clean (`npm run build` writes 25 composed agents). +- Manual repro for Issues 1-2 still requires a WSL host — added a + diagnostic event so we can verify in real-world telemetry that the + reject path is no longer firing on a normal Linux/macOS launch. + +## How to verify post-merge + +1. Pull v0.33.0 on the WSL machine. +2. `npm install -g token-pilot@latest`. +3. `token-pilot migrate-hooks` to clean stale npx-cache hook entries. +4. Restart Claude Code on the affected project. +5. Confirm: + - `project_overview()` returns the WSL path, not `C:\Windows`. + - `smart_log()` returns commits. + - `read_range({ path, start_line: 95, end_line: 240 })` works. + - `find_usages("Foo")` succeeds even on a fresh shell. + - `token-pilot doctor` shows zero stale hook entries and the + correct cwd.