diff --git a/docs/bundles/code-review/run.md b/docs/bundles/code-review/run.md index 6b11b39..d1d7ab4 100644 --- a/docs/bundles/code-review/run.md +++ b/docs/bundles/code-review/run.md @@ -39,6 +39,8 @@ The pipeline reviews **`.py`** and **`.pyi`** only. The **`--focus docs`** facet | `--score-only` | Print just the reward delta integer | | `--no-tests` | Skip the TDD gate | | `--fix` | Apply Ruff autofixes, then rerun the review | +| `--preview-fixes` | For **`--focus simplify`**, compute non-mutating patch evidence for supported safe-mechanical simplification fixers | +| `--with-mutation` | For **`--focus simplify`**, record opt-in mutation proof evidence for cleanup candidates; unavailable tooling is inconclusive | | `--interactive` | Prompt for scope decisions before execution | | `--instructions` | Print AI-facing simplify / clean-code workflow instructions and exit without running review | @@ -53,6 +55,9 @@ The Typer entrypoint validates **review flags** first: it raises **`typer.BadPar - **`--include-tests` with `--exclude-tests`**: pick at most one test inclusion mode. Runtime error: **`Cannot use both --include-tests and --exclude-tests`** - **`--out` without `--json`**: **`--out`** is accepted only when **`--json`** is also set. Runtime error: **`Use --out together with --json.`** - **`--json` with `--score-only`**: do not combine JSON report output with score-only mode (**`Use either --json or --score-only, not both.`**). +- **`--preview-fixes` with `--fix`**: preview is non-mutating and cannot be combined with write mode. Runtime error: **`Cannot combine --preview-fixes with --fix.`** +- **`--preview-fixes` without simplify focus**: preview evidence is scoped to cleanup findings. Runtime error: **`Use --preview-fixes only with --focus simplify.`** +- **`--with-mutation` without simplify focus**: mutation proof is scoped to cleanup candidates. Runtime error: **`Use --with-mutation only with --focus simplify.`** **Supported targeting:** either pass **positional file paths** for a fixed review set (the pipeline still only reviews Python sources it accepts, such as **`.py`** / **`.pyi`**), or omit files and use **`--scope`** / **`--path`** (and related test flags) for auto-discovery — do not mix positional paths with **`--scope`** or **`--path`**. @@ -90,13 +95,14 @@ specfact code review run --scope changed --mode shadow --json --out /tmp/review- ### `--focus` facets (repeatable) -Use **`--focus`** with **`source`**, **`tests`**, **`docs`**, and/or **`simplify`** (union of facets, then intersect with scope). Do not combine **`--focus`** with **`--include-tests`** or **`--exclude-tests`**. The **`simplify`** facet produces simplification-focused reports: advisory **`ai_bloat`** findings plus high-confidence **`dry`** and **`kiss`** findings that carry deterministic metadata such as **`rewrite_hint`**, **`canonical_pattern`**, **`intent_key`**, **`estimated_deletion_lines`**, and **`related_locations`**. Simplification-focused findings are score-neutral and non-blocking. +Use **`--focus`** with **`source`**, **`tests`**, **`docs`**, and/or **`simplify`** (union of facets, then intersect with scope). Do not combine **`--focus`** with **`--include-tests`** or **`--exclude-tests`**. The **`simplify`** facet produces simplification-focused reports: advisory **`ai_bloat`** findings plus high-confidence **`dry`** and **`kiss`** findings that carry deterministic metadata such as **`rewrite_hint`**, **`canonical_pattern`**, **`intent_key`**, **`estimated_deletion_lines`**, and **`related_locations`**. Simplification-focused JSON also includes **`cleanup_forecast`**, **`signal_trace`**, **`preserve_reasons`**, and **`remediation_packet`** fields when available. ```bash specfact code review run --scope changed --focus tests specfact code review run --scope full --path packages/specfact-code-review --focus source specfact code review run --scope full --focus docs -specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json +specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json +specfact code review run --scope changed --focus simplify --with-mutation --json --out .specfact/code-review-simplify.json ``` ### AI instructions fallback @@ -107,7 +113,7 @@ When an IDE does not support bundled prompts or skills, print the same guided si specfact code review run --instructions ``` -The output explains how to remove AI bloat and apply clean-code simplifications using SpecFact evidence, including `safe_mechanical`, `needs_tests`, `design_judgment`, and `preserve` handling, patch previews, conservative keep/skip defaults, and per-file validation. It also tells assistants how to handle clean PR branches where `--scope changed` has no worktree files: find branch-delta Python files with a base-ref diff such as `git diff --name-only ...HEAD -- '*.py' '*.pyi'`, review those files as explicit positional files, and treat findings without `guidance_kind` as unguided advisories rather than auto-fix input. +The output explains how to remove AI bloat and apply clean-code simplifications using SpecFact evidence, including `cleanup_forecast`, `safe_mechanical`, `needs_tests`, `design_judgment`, `preserve`, `remediation_packet`, patch previews, conservative keep/skip defaults, and per-file validation. It also tells assistants how to handle clean PR branches where `--scope changed` has no worktree files: find branch-delta Python files with a base-ref diff such as `git diff --name-only ...HEAD -- '*.py' '*.pyi'`, review those files as explicit positional files, and treat findings without `guidance_kind` as unguided advisories rather than auto-fix input. `ai_bloat` findings are cleanup signals, not proof of AI authorship. ### Positional files (explicit Python paths) @@ -137,10 +143,10 @@ The built-in `specfact/ai-bloat-patterns` policy pack is parallel to `specfact/c Use `--focus simplify` when producing the IDE simplification queue: ```bash -specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json +specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json ``` -Simplify-focused reports keep advisory `ai_bloat` findings plus high-confidence `dry` and `kiss` findings that include deterministic simplification metadata. Metadata fields such as `rewrite_hint`, `canonical_pattern`, `intent_key`, `estimated_deletion_lines`, and `related_locations` are additive; legacy consumers can keep reading the original finding fields. Simplification findings remain score-neutral and non-blocking. +Simplify-focused reports keep advisory `ai_bloat` findings plus high-confidence `dry` and `kiss` findings that include deterministic simplification metadata. Metadata fields such as `rewrite_hint`, `canonical_pattern`, `intent_key`, `estimated_deletion_lines`, `related_locations`, `signal_trace`, `preserve_reasons`, and `remediation_packet` are additive; legacy consumers can keep reading the original finding fields. The report-level `cleanup_forecast` summarizes reviewed LOC, estimated deletion ranges, guidance-kind totals, normalized AI-bloat density, weighted bloat points, and cleanup-yield LOC per KLOC. Simplification findings remain score-neutral; enforce mode blocks only unresolved safe-mechanical cleanup candidates. ## Related diff --git a/docs/modules/code-review.md b/docs/modules/code-review.md index efe5ad8..6693606 100644 --- a/docs/modules/code-review.md +++ b/docs/modules/code-review.md @@ -59,6 +59,11 @@ Options (aligned with `specfact code review run --help`): - `--score-only`: print only the integer `reward_delta` - `--no-tests`: skip the targeted TDD gate - `--fix`: apply Ruff autofixes and re-run the review before printing results +- `--preview-fixes`: with **`--focus simplify`**, compute non-mutating patch + evidence for supported safe-mechanical simplification fixers +- `--with-mutation`: with **`--focus simplify`**, record opt-in mutation proof + evidence for cleanup candidates; missing mutation tooling is recorded as + inconclusive - `--interactive`: ask whether changed test files should be included before auto-detected review runs - `--instructions`: print AI-facing simplify / clean-code workflow instructions @@ -75,6 +80,9 @@ The command rejects incompatible mixes (same rules as the bundle run guide): Typ - **`--include-tests` with `--exclude-tests`**: pick at most one test inclusion mode. - **`--out` without `--json`**: **`--out`** is accepted only when **`--json`** is also set. - **`--json` with `--score-only`**: pick one, not both (**`--json`** cannot be used with **`--score-only`**). +- **`--preview-fixes` with `--fix`**: preview is non-mutating and cannot be combined with write mode. +- **`--preview-fixes` without `--focus simplify`**: preview evidence is scoped to cleanup findings. +- **`--with-mutation` without `--focus simplify`**: mutation proof is scoped to cleanup candidates. When `FILES` is omitted, the command falls back to: @@ -114,7 +122,7 @@ guide (same Typer surface as this section). The review pipeline also emits `ai_bloat` findings for code shapes commonly amplified by AI-assisted generation: manual append loops, passthrough lambdas, identity `try/except`, one-call wrappers, speculative `Optional[...] = None` parameters, duplicate terminal guards, long low-branch functions, and redundant intermediates. -These findings are `severity=info`, advisory-only, and score-neutral. They are written to `.specfact/code-review.json` when the report includes all severities; for simplification queues, write `.specfact/code-review-simplify.json` with `--focus simplify` so `/specfact.08-simplify` can filter them by `category=ai_bloat` for per-change confirmed rewrites. They do not claim AI authorship; they identify simplification candidates. +These findings are `severity=info`, advisory-only, and score-neutral. They are written to `.specfact/code-review.json` when the report includes all severities; for simplification queues, write `.specfact/code-review-simplify.json` with `--focus simplify` so `/specfact.08-simplify` can filter them by `category=ai_bloat` for per-change confirmed rewrites. Simplify JSON now includes `cleanup_forecast` at report level plus per-finding `signal_trace`, `preserve_reasons`, and `remediation_packet` where available. They do not claim AI authorship; they identify simplification candidates. For the lowest-friction AI onboarding path, start with the built-in instruction printer instead of requiring a user to install IDE prompts or skills first: @@ -125,8 +133,9 @@ specfact code review run --instructions Paste that output into any AI coding assistant and ask it to simplify or remove AI bloat with SpecFact. The instructions explain the expected report file, -`guidance_kind` handling, patch-preview decision cards, conservative defaults -for `design_judgment`, and per-file validation. They also cover clean PR +`cleanup_forecast`, `guidance_kind`, `remediation_packet` handling, +patch-preview decision cards, conservative defaults for `design_judgment`, +and per-file validation. They also cover clean PR branches where `--scope changed` has no worktree files: the assistant should find branch-delta Python files with a base-ref diff such as `git diff --name-only ...HEAD -- '*.py' '*.pyi'`, review those files @@ -190,6 +199,19 @@ specfact code review run --score-only packages/specfact-code-review/src/specfact specfact code review run --fix packages/specfact-code-review/src/specfact_code_review/run/commands.py ``` +For simplify-focused cleanup, prefer a JSON-first preview loop before writing: + +```bash +specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json +``` + +Inspect `cleanup_forecast` to estimate cleanup yield and sort by +`guidance_kind`. For each finding, use `remediation_packet` as the portable AI +IDE contract. The preview evidence reports patch deltas without editing tracked +files. Use `--with-mutation` only when you explicitly want candidate-scoped +mutation evidence; missing or timed-out tooling is inconclusive, not proof that +deletion is safe. + ## Tool runners The `specfact-code-review` bundle now includes internal runners that translate tool diff --git a/docs/quickstart-ai-bloat.md b/docs/quickstart-ai-bloat.md index 50ddb77..0880a30 100644 --- a/docs/quickstart-ai-bloat.md +++ b/docs/quickstart-ai-bloat.md @@ -10,7 +10,7 @@ expertise_level: [beginner, intermediate] # AI bloat quickstart -Use the Code Review bundle to detect bloat patterns commonly produced by AI-assisted coding, then use the Project bundle's `/specfact.08-simplify` prompt to review each cleanup with confirmation. +Use the Code Review bundle to detect bloat patterns commonly produced by AI-assisted coding, estimate cleanup impact, and hand structured remediation packets to any AI IDE. The Project bundle's `/specfact.08-simplify` prompt can still drive the confirmed rewrite loop. ## 1. Install and refresh prompts @@ -20,17 +20,17 @@ specfact module install nold-ai/specfact-project specfact init ide ``` -## 2. Run review with full JSON evidence +## 2. Run simplify review with cleanup forecast evidence ```bash -specfact code review run --json --out .specfact/code-review.json +specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json ``` -Omit `--level` for this report. `--level error` intentionally filters info-level findings, including `ai_bloat`, out of the command output. +Omit `--level` for this report. `--level error` intentionally filters info-level findings, including `ai_bloat`, out of the command output. `--preview-fixes` is non-mutating: it adds patch forecast evidence without editing tracked files. ## 3. Inspect the signal -Look for findings where `category` is `ai_bloat`. They are `severity=info`, advisory-only, and score-neutral. +Look first at `cleanup_forecast`. It summarizes reviewed LOC, low/expected/high deletion estimates, guidance-kind totals, AI-bloat density, weighted bloat points, and cleanup-yield LOC per KLOC. Then inspect findings where `category` is `ai_bloat`. They are `severity=info`, advisory-only, and score-neutral. Example output from the implementation dry run for this change: the AST detector found advisory `ai_bloat` candidates across `specfact-code-review` and `specfact-project` package sources, with no automatic rewrites applied. `/specfact.08-simplify` is the human-confirmed rewrite path. @@ -38,13 +38,18 @@ Example output from the implementation dry run for this change: the AST detector { "category": "ai_bloat", "severity": "info", - "rule": "ai-bloat.identity-try-except" + "rule": "ai-bloat.identity-try-except", + "guidance_kind": "safe_mechanical", + "remediation_packet": { + "safe_to_autofix": true, + "validation_plan": ["run targeted tests", "rerun simplify review"] + } } ``` ## 4. Simplify in the IDE -Run `/specfact.08-simplify`. The prompt reads `.specfact/code-review.json`, groups findings by file and rule, and asks before applying each edit. +Run `/specfact.08-simplify` or pass `.specfact/code-review-simplify.json` to your AI IDE. The JSON is the contract: sort by `guidance_kind`, use each `remediation_packet`, preserve anything with `preserve_reasons`, and ask before editing `design_judgment` findings. Example cleanup: @@ -83,7 +88,7 @@ def double(values: list[int]) -> list[int]: ## 5. Re-run review ```bash -specfact code review run --json --out .specfact/code-review.json +specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json ``` Use the new report to confirm accepted simplifications cleared the corresponding `ai_bloat` findings. This is bloat-shape detection, not AI-authorship detection. diff --git a/openspec/CHANGE_ORDER.md b/openspec/CHANGE_ORDER.md index 4291b67..14e1b0c 100644 --- a/openspec/CHANGE_ORDER.md +++ b/openspec/CHANGE_ORDER.md @@ -103,6 +103,7 @@ The architecture pillar remains active because `architecture-02-well-architected | code-review + project | 03 | code-review-ai-bloat-detection | [#269](https://github.com/nold-ai/specfact-cli-modules/issues/269) | Parent Feature: [#175](https://github.com/nold-ai/specfact-cli-modules/issues/175); Epic: [#162](https://github.com/nold-ai/specfact-cli-modules/issues/162); no known blockers | | code-review + project | 04 | code-review-11-simplification-feedback-loop | [#276](https://github.com/nold-ai/specfact-cli-modules/issues/276) | Parent Feature: [#275](https://github.com/nold-ai/specfact-cli-modules/issues/275); Epic: [#162](https://github.com/nold-ai/specfact-cli-modules/issues/162); blocked by `code-review-ai-bloat-detection` / [#269](https://github.com/nold-ai/specfact-cli-modules/issues/269) | | code-review + project | 05 | code-review-12-guided-simplification-enforcement | [#286](https://github.com/nold-ai/specfact-cli-modules/issues/286) | Parent Feature: [#275](https://github.com/nold-ai/specfact-cli-modules/issues/275); Epic: [#162](https://github.com/nold-ai/specfact-cli-modules/issues/162); blocked by `code-review-11-simplification-feedback-loop` / [#276](https://github.com/nold-ai/specfact-cli-modules/issues/276) | +| code-review + project | 06 | code-review-13-cleanup-forecast-agent-handoff | [#297](https://github.com/nold-ai/specfact-cli-modules/issues/297) | Parent Feature: [#275](https://github.com/nold-ai/specfact-cli-modules/issues/275); Epic: [#162](https://github.com/nold-ai/specfact-cli-modules/issues/162); blocked by `code-review-12-guided-simplification-enforcement` / [#286](https://github.com/nold-ai/specfact-cli-modules/issues/286) | ### Documentation restructure diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/.openspec.yaml b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/.openspec.yaml new file mode 100644 index 0000000..6894814 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/.openspec.yaml @@ -0,0 +1,2 @@ +schema: spec-driven +created: 2026-05-24 diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/README.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/README.md new file mode 100644 index 0000000..90554e1 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/README.md @@ -0,0 +1,3 @@ +# code-review-13-cleanup-forecast-agent-handoff + +Cleanup forecast, AI-bloat index, remediation packets, and AI IDE handoff for code review. diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/TDD_EVIDENCE.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/TDD_EVIDENCE.md new file mode 100644 index 0000000..bae7b93 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/TDD_EVIDENCE.md @@ -0,0 +1,55 @@ +# TDD Evidence: code-review-13-cleanup-forecast-agent-handoff + +## Failing-before evidence + +Command: + +```bash +hatch run pytest tests/unit/specfact_code_review/run/test_findings.py tests/unit/specfact_code_review/run/test_runner.py tests/unit/specfact_code_review/run/test_commands.py tests/unit/specfact_code_review/review/test_commands.py -q +``` + +Result before implementation: + +- Exit code: 2 +- Collection failed because `AiBloatIndex` and `_preserve_reasons_for_finding` did not exist yet. + +## Passing evidence + +Targeted implementation command: + +```bash +hatch run pytest tests/unit/specfact_code_review/run/test_cleanup_evidence.py tests/unit/specfact_code_review/run/test_forecast.py tests/unit/specfact_code_review/run/test_findings.py tests/unit/specfact_code_review/run/test_runner.py tests/unit/specfact_code_review/run/test_commands.py tests/unit/specfact_code_review/review/test_commands.py -q +``` + +Result after implementation: + +- Exit code: 0 +- 137 passed + +Docs and packaged-resource parity command: + +```bash +hatch run pytest tests/unit/docs/test_code_review_docs_parity.py tests/unit/specfact_code_review/rules/test_updater.py tests/unit/test_guided_simplify_resources.py -q +``` + +Result: + +- Exit code: 0 +- 22 passed + +Required final gates: + +- `hatch run format` — exit code 0 +- `hatch run type-check` — exit code 0 +- `hatch run lint` — exit code 0 +- `hatch run yaml-lint` — exit code 0 +- `hatch run check-bundle-imports` — exit code 0 +- `hatch run verify-modules-signature --payload-from-filesystem --enforce-version-bump` — exit code 0 +- `hatch run specfact code review run --bug-hunt --json --out .specfact/code-review.json --scope changed` — exit code 0 +- `openspec validate code-review-13-cleanup-forecast-agent-handoff --strict` — exit code 0 + +Full suite wrappers: + +- `hatch run contract-test -- tests/cli-contracts/specfact-code-review-run.scenarios.yaml` — exit code 0, 785 passed, 2 warnings +- `hatch run smart-test` — exit code 0, 785 passed, 2 warnings +- `hatch run test -- -q` — exit code 0, 785 passed, 2 warnings diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/design.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/design.md new file mode 100644 index 0000000..b0edb4d --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/design.md @@ -0,0 +1,37 @@ +## Context + +The current simplify report already has per-finding `estimated_deletion_lines`, `guidance_kind`, `recommended_action`, `safety_checks`, and action status. That is enough for an expert to inspect individual findings, but it is not enough for a user or AI IDE to decide where to start, estimate cleanup impact, or distinguish a safe patch preview from a judgment call. + +The next layer should stay deterministic and Python-first. CPG, Joern, and polyglot clone analysis remain follow-up work. This change should improve the current runner without adding a heavy default dependency path or turning bloat advisories into proof of AI authorship. + +## Decisions + +- `cleanup_forecast` is derived from reviewed Python LOC and guided simplification metadata. It reports raw finding counts and normalized metrics so teams can compare repositories and PRs without over-weighting file size. +- Forecast weights are fixed in V1: `safe_mechanical=1.0`, `needs_tests=0.6`, `design_judgment=0.25`, and `preserve=0.0`. +- `--preview-fixes` is non-mutating. It may create temporary files or in-memory diffs, but it must not edit tracked sources. +- `--with-mutation` is explicit and valid only with `--focus simplify`. Timeouts and tool absence are inconclusive evidence, not proof that cleanup is safe. +- Preserve reasons short-circuit automatic cleanup. The closed taxonomy uses the emitted enum tokens `contract_lambda`, `public_api`, `protocol_member`, `cli_callback`, `compat_shim`, `spec_linked`, `domain_wrapper`, and `load_bearing`. +- `remediation_packet` is the universal handoff surface. IDE prompts and skills may summarize it, but the JSON is authoritative. + +## Data Shape + +`cleanup_forecast` should include: + +- `reviewed_loc`: production, test, and total Python LOC for the reviewed file set. +- `estimated_deletion_lines`: low, expected, high, plus totals by guidance kind. +- `ai_bloat_index`: findings per KLOC, weighted bloat points per KLOC, and cleanup-yield LOC per KLOC. +- `by_guidance_kind`: counts and estimated deletion lines for each guidance kind. +- `by_action_status`: lifecycle counts when present. + +Finding additions should be optional: + +- `signal_trace`: deterministic source signals, including tool name, fired flag, score/value, evidence reference, and explanation. +- `preserve_reasons`: closed-list preserve reasons with evidence refs. +- `remediation_packet`: plain-language issue, recommended action, why it may need to stay, safety checks, validation plan, safe-to-autofix flag, and optional patch forecast refs. + +## Risks + +- **Forecasts can look like guarantees.** Mitigation: use low/expected/high ranges and label deletion estimates as non-binding until preview or mutation evidence exists. +- **Mutation can be slow or flaky.** Mitigation: keep it opt-in, candidate-scoped, and inconclusive on timeout. +- **Preserve detection can hide real bloat.** Mitigation: preserve only blocks automatic cleanup; it can still be reported as kept with rationale. +- **JSON growth can break consumers.** Mitigation: additive fields only, keep original required fields and legacy validation intact. diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/proposal.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/proposal.md new file mode 100644 index 0000000..33f6f7f --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/proposal.md @@ -0,0 +1,48 @@ +## Why + +`code-review-12-guided-simplification-enforcement` made simplify findings safer for humans and agents to interpret, but the report still does not quantify cleanup impact or package each recommendation as a portable handoff artifact. Developers need to know how much code is likely removable, how much risk each cleanup carries, and what proof an AI IDE should use before editing. + +This change turns `specfact code review run --focus simplify` into a cleanup forecast and remediation handoff loop. It keeps the current conservative guidance model, adds normalized bloat metrics, records deterministic preserve signals, and emits remediation packets that any AI IDE or LLM can consume without relying on vendor-specific prompts. + +## What Changes + +- Add a `cleanup_forecast` report summary with reviewed LOC, estimated removable LOC ranges, guidance-kind breakdowns, AI-bloat density, weighted bloat points, and cleanup-yield metrics. +- Extend findings with optional `signal_trace`, `preserve_reasons`, and `remediation_packet` fields so each cleanup recommendation carries explainable evidence and an AI-ready action contract. +- Add `--preview-fixes` for simplify-focused runs to compute non-mutating patch and numstat forecasts for supported safe-mechanical fixers. +- Add opt-in `--with-mutation` for mutation-backed proof on simplify candidates without making mutation testing part of the default review path. +- Expand preserve detection for contracts, public APIs, protocol/ABC members, Typer/Click callbacks, compatibility shims, explicit preserve markers, and load-bearing mutation evidence. +- Update modules docs, quickstart, the packaged `specfact-code-review` skill, and command-contract coverage to present the JSON report as the universal AI IDE handoff artifact. + +## Capabilities + +### New Capabilities + +- `cleanup-forecast-review`: Quantified cleanup forecasting and AI-bloat index reporting for simplify-focused review runs. +- `ai-ide-remediation-handoff`: Portable remediation packets for AI IDEs and headless agents. + +### Modified Capabilities + +- `review-finding-model`: Add optional evidence and handoff fields while preserving legacy report compatibility. +- `review-run-command`: Add non-mutating preview and opt-in mutation proof flags. +- `guided-simplification-review`: Use stronger preserve and proof signals before cleanup recommendations are applied. +- `review-cli-contracts`: Cover the new flags, invalid combinations, and JSON output shape. +- `house-rules-skill`: Teach agents to use cleanup forecasts and remediation packets. + +## Impact + +- **Affected bundle:** `packages/specfact-code-review`. +- **Affected docs:** Code Review bundle/module pages and AI bloat quickstart. +- **Affected JSON:** `.specfact/code-review-simplify.json` receives additive optional fields; existing required fields remain compatible. +- **Affected command surface:** `specfact code review run` gains `--preview-fixes` and `--with-mutation`. +- **Release impact:** `specfact-code-review` version, registry entry, and signatures must be refreshed if packaged assets or manifests change. + +## Source Tracking + + +- **Modules Epic:** [#162](https://github.com/nold-ai/specfact-cli-modules/issues/162) +- **Parent Feature:** [#275](https://github.com/nold-ai/specfact-cli-modules/issues/275) +- **GitHub Issue:** [#297](https://github.com/nold-ai/specfact-cli-modules/issues/297) +- **Repository:** nold-ai/specfact-cli-modules +- **Prior Baseline:** [#286](https://github.com/nold-ai/specfact-cli-modules/issues/286) / `code-review-12-guided-simplification-enforcement` +- **Last Synced Status:** synced +- **Sanitized:** false diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/ai-ide-remediation-handoff/spec.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/ai-ide-remediation-handoff/spec.md new file mode 100644 index 0000000..5c6ebc3 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/ai-ide-remediation-handoff/spec.md @@ -0,0 +1,19 @@ +## ADDED Requirements + +### Requirement: Review JSON is the portable AI IDE handoff contract + +The Code Review bundle SHALL expose cleanup guidance through machine-readable JSON so Claude, Codex, Cursor, Copilot, and other assistants can act without vendor-specific prompt assumptions. + +#### Scenario: Remediation packets guide AI cleanup + +- **WHEN** a simplify-focused report contains cleanup findings +- **THEN** each actionable finding SHALL include or be able to derive a remediation packet +- **AND** the packet SHALL state whether the finding may be auto-fixed, needs tests, needs design judgment, or should be preserved +- **AND** the packet SHALL include a validation plan for any accepted cleanup + +#### Scenario: AI instructions prioritize the JSON contract + +- **WHEN** `specfact code review run --instructions` is executed +- **THEN** the instructions SHALL tell assistants to generate simplify evidence first +- **AND** they SHALL tell assistants to sort findings by `guidance_kind`, inspect `cleanup_forecast`, and follow remediation packets before editing +- **AND** they SHALL prohibit treating `ai_bloat` findings as proof of AI authorship diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/cleanup-forecast-review/spec.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/cleanup-forecast-review/spec.md new file mode 100644 index 0000000..31b04e2 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/cleanup-forecast-review/spec.md @@ -0,0 +1,40 @@ +## ADDED Requirements + +### Requirement: Simplify review reports include cleanup forecasts + +Simplify-focused review reports SHALL include a cleanup forecast that quantifies likely cleanup impact without treating estimates as guaranteed deletions. + +#### Scenario: Forecast summarizes reviewed LOC and deletion estimates + +- **WHEN** `specfact code review run --focus simplify --json` emits guided simplification findings +- **THEN** the report SHALL include `cleanup_forecast.reviewed_loc` +- **AND** it SHALL include low, expected, and high estimated deletion-line totals +- **AND** it SHALL include deletion estimates grouped by `guidance_kind` +- **AND** legacy report consumers SHALL still be able to ignore the new field + +#### Scenario: Forecast exposes normalized AI-bloat index + +- **WHEN** a cleanup forecast is present +- **THEN** it SHALL include normalized metrics per KLOC for finding density, weighted bloat points, and cleanup yield +- **AND** the default weights SHALL be `safe_mechanical=1.0`, `needs_tests=0.6`, `design_judgment=0.25`, and `preserve=0.0` +- **AND** preserve findings SHALL contribute no weighted bloat points + +### Requirement: Cleanup forecasts distinguish advice from proof + +The cleanup forecast SHALL distinguish estimate-only signals from previewed or mutation-backed proof. + +#### Scenario: Preview evidence upgrades forecast confidence + +- **WHEN** `--preview-fixes` computes a patch forecast for safe-mechanical findings +- **THEN** the cleanup forecast SHALL include preview evidence for affected findings +- **AND** the preview SHALL report added, removed, and net line counts without editing tracked files + +#### Scenario: Mutation evidence is opt-in + +- **WHEN** `--with-mutation` is not provided +- **THEN** the review SHALL NOT run mutation testing +- **AND** the report SHALL NOT imply mutation-backed proof exists + +- **WHEN** `--with-mutation` is provided for simplify focus +- **THEN** mutation outcomes SHALL be recorded as evidence for candidate findings +- **AND** timeouts or unavailable mutation tooling SHALL be recorded as inconclusive rather than safe cleanup proof diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/review-finding-model/spec.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/review-finding-model/spec.md new file mode 100644 index 0000000..01db8a1 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/review-finding-model/spec.md @@ -0,0 +1,25 @@ +## MODIFIED Requirements + +### Requirement: ReviewFinding schema supports additive simplification metadata + +The `ReviewFinding` model SHALL accept optional simplification metadata while preserving the existing governed finding fields and category/severity validation. The report schema version SHALL advance additively when simplification metadata, guided simplification metadata, cleanup forecast metadata, or AI IDE handoff metadata is emitted. + +#### Scenario: Finding carries signal trace evidence + +- **WHEN** a `ReviewFinding` payload includes `signal_trace` +- **THEN** model validation SHALL accept deterministic signal entries with tool/source name, fired status, optional score/value, evidence references, and explanation +- **AND** legacy finding payloads without `signal_trace` SHALL remain valid + +#### Scenario: Finding carries preserve reasons + +- **WHEN** a `ReviewFinding` payload includes `preserve_reasons` +- **THEN** each reason SHALL come from a closed taxonomy of preserve contexts +- **AND** the finding SHALL NOT be considered safe for automatic cleanup while a preserve reason is present +- **AND** the preserve reason SHALL include enough evidence for a developer or AI agent to explain why cleanup was not applied + +#### Scenario: Finding carries remediation packet + +- **WHEN** a simplify-focused finding includes `remediation_packet` +- **THEN** the packet SHALL include a plain-language issue, recommended action, possible keep reason, safety checks, validation plan, and safe-to-autofix flag +- **AND** the packet MAY include patch forecast references when preview evidence exists +- **AND** AI IDE prompts and skills SHALL treat the JSON packet as authoritative over prompt prose diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/review-run-command/spec.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/review-run-command/spec.md new file mode 100644 index 0000000..4719078 --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/specs/review-run-command/spec.md @@ -0,0 +1,26 @@ +## MODIFIED Requirements + +### Requirement: End-to-End `specfact code review run` in modules repo + +The `specfact-code-review` bundle SHALL provide a fully wired `specfact code review run` command that orchestrates the existing tool runners, supports scoped file selection, emits governed review reports, and provides simplify-specific cleanup forecast and handoff controls. + +#### Scenario: Run command previews simplify fixes without mutating files + +- **WHEN** `specfact code review run --focus simplify --preview-fixes --json --out ` is executed +- **THEN** the command SHALL compute preview evidence for supported safe-mechanical simplification fixers +- **AND** it SHALL write the forecast evidence to the JSON report +- **AND** it SHALL NOT edit tracked source files + +#### Scenario: Run command rejects preview and fix together + +- **WHEN** `specfact code review run --focus simplify --preview-fixes --fix` is executed +- **THEN** the command SHALL fail before review execution with a clear invalid-combination error + +#### Scenario: Run command scopes mutation proof to simplify focus + +- **WHEN** `specfact code review run --with-mutation` is executed without `--focus simplify` +- **THEN** the command SHALL fail before review execution with a clear invalid-combination error + +- **WHEN** `specfact code review run --focus simplify --with-mutation` is executed +- **THEN** the command SHALL run mutation proof only for candidate cleanup findings +- **AND** it SHALL record mutation outcomes in the report without making mutation proof part of the default review path diff --git a/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/tasks.md b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/tasks.md new file mode 100644 index 0000000..987638a --- /dev/null +++ b/openspec/changes/code-review-13-cleanup-forecast-agent-handoff/tasks.md @@ -0,0 +1,44 @@ +## 1. GitHub readiness and OpenSpec setup + +- [x] 1.1 Create OpenSpec change `code-review-13-cleanup-forecast-agent-handoff`. +- [x] 1.2 Create GitHub issue [#297](https://github.com/nold-ai/specfact-cli-modules/issues/297), link it under Feature [#275](https://github.com/nold-ai/specfact-cli-modules/issues/275), and label it with `enhancement`, `codebase`, `openspec`, and `change-proposal`. +- [x] 1.3 Confirm issue project assignment, open/Todo state, parent linkage, blocked-by relationship, source tracking, and absence of implementation concurrency. +- [x] 1.4 Add `openspec/CHANGE_ORDER.md` row as order 06, blocked by [#286](https://github.com/nold-ai/specfact-cli-modules/issues/286). +- [x] 1.5 Validate the OpenSpec change with `openspec validate code-review-13-cleanup-forecast-agent-handoff --strict`. + +## 2. Spec-first failing tests + +- [x] 2.1 Add model tests for `cleanup_forecast`, `signal_trace`, `preserve_reasons`, and `remediation_packet` compatibility. +- [x] 2.2 Add forecast tests for reviewed LOC, estimated deletion ranges, guidance-kind totals, and AI-bloat index weights. +- [x] 2.3 Add preserve-detection tests for icontract, public API exports, Protocol/ABC members, Typer/Click callbacks, compatibility shims, explicit markers, and mutation load-bearing evidence. +- [x] 2.4 Add CLI tests for `--preview-fixes`, `--with-mutation`, and invalid combinations with non-simplify focus. +- [x] 2.5 Add command-contract and docs parity tests for new flags and report fields. +- [x] 2.6 Record failing-before evidence in `TDD_EVIDENCE.md`. + +## 3. Review model and forecast implementation + +- [x] 3.1 Extend `ReviewReport` with additive `cleanup_forecast` and schema version derivation. +- [x] 3.2 Extend `ReviewFinding` with additive evidence and handoff fields. +- [x] 3.3 Compute reviewed LOC and forecast metrics from the resolved review file set. +- [x] 3.4 Keep scoring and merge-quality verdict behavior unchanged outside simplify-specific enforcement. + +## 4. Preview, preserve, and mutation proof + +- [x] 4.1 Implement non-mutating patch forecast support for existing safe-mechanical simplification fixers. +- [x] 4.2 Implement preserve-reason detection before automatic cleanup eligibility is calculated. +- [x] 4.3 Add opt-in mutation proof scaffolding for simplify candidates, treating tool absence or timeout as inconclusive. +- [x] 4.4 Ensure `--fix` still mutates only deterministic safe-mechanical findings and records action evidence. + +## 5. AI IDE handoff and docs + +- [x] 5.1 Emit remediation packets suitable for Claude, Codex, Cursor, Copilot, or headless agents. +- [x] 5.2 Update `--instructions` and packaged skill guidance to prioritize cleanup forecast and remediation packets. +- [x] 5.3 Update modules docs and AI bloat quickstart for the new JSON-first cleanup workflow. +- [x] 5.4 Coordinate with the paired core docs change before final wording is published. + +## 6. Packaging, signatures, and verification + +- [x] 6.1 Bump affected module versions when packaged resources change. +- [x] 6.2 Refresh registry metadata and module manifest integrity/signatures. +- [x] 6.3 Re-run targeted tests and record passing evidence in `TDD_EVIDENCE.md`. +- [x] 6.4 Run required gates for touched scope: `hatch run format`, `hatch run type-check`, `hatch run lint`, `hatch run yaml-lint`, `hatch run check-bundle-imports`, `hatch run verify-modules-signature --payload-from-filesystem --enforce-version-bump`, `hatch run contract-test`, relevant `hatch run smart-test`, relevant `hatch run test`, and `hatch run specfact code review run --bug-hunt --json --out .specfact/code-review.json --scope changed`. diff --git a/packages/specfact-code-review/module-package.yaml b/packages/specfact-code-review/module-package.yaml index a4f2d38..62eda58 100644 --- a/packages/specfact-code-review/module-package.yaml +++ b/packages/specfact-code-review/module-package.yaml @@ -1,5 +1,5 @@ name: nold-ai/specfact-code-review -version: 0.47.27 +version: 0.47.30 commands: - code tier: official @@ -23,5 +23,5 @@ description: Official SpecFact code review bundle package. category: codebase bundle_group_command: code integrity: - checksum: sha256:a86d8cfde2035059414370bdadd323dcdcf02bd5104e3b30b252bc350a96cd98 - signature: F/Ld51xKWPkQWWS8c00cs9UTbe/kJJ6chBh08sFuKTlAwpRTydf8//wLVaDuI2jt4jX0CwYaHnBqoTd0ELMLAQ== + checksum: sha256:3c7c6b62c2c3a7aea2f3bff95628861fa6a3612b4b3de8064cb9380dba843561 + signature: +9rmzN2qzb53LMUFO+LPJMLaP7pA4QMokYvqwYEq7HNMvbogGFgQupeauqMdYxgd3xtyxUkgrELzAKyABhu3CA== diff --git a/packages/specfact-code-review/src/specfact_code_review/resources/skills/specfact-code-review/SKILL.md b/packages/specfact-code-review/src/specfact_code_review/resources/skills/specfact-code-review/SKILL.md index a642508..d28ec14 100644 --- a/packages/specfact-code-review/src/specfact_code_review/resources/skills/specfact-code-review/SKILL.md +++ b/packages/specfact-code-review/src/specfact_code_review/resources/skills/specfact-code-review/SKILL.md @@ -8,9 +8,10 @@ Updated: 2026-05-22 | Module: nold-ai/specfact-code-review Use this skill as an interactive cleanup coach, not a raw lint executor. When a user says "remove AI bloat", "simplify", "apply clean code", "fix SpecFact review", or similar, run the SpecFact review workflow, explain decisions in the user's language, show exact patch previews, and validate after small changes. ## DO - Treat `specfact code review run --help` as authoritative; use `--instructions` as the fallback AI workflow when prompts/skills are unavailable -- For simplification queues, run `specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json` -- Ask for walkthrough level when interactive: vibe coder, junior developer, senior/pro, or headless agent; auto-adjust if obvious -- For vibe coders, present each finding as a decision card: plain-language issue, why it might need to stay, exact patch preview, validation plan, and recommended choice +- For simplification queues, run `specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json` +- Inspect `cleanup_forecast` first, then treat each finding's `remediation_packet` as the portable AI IDE contract +- Preserve anything with `preserve_reasons`; those reasons block automatic cleanup even when a shorter patch exists +- Ask for walkthrough level when interactive; for vibe coders, present each finding as a decision card with issue, keep reason, patch preview, validation plan, and recommendation - Interpret `guidance_kind`: `safe_mechanical` may apply after local safety checks, `needs_tests` requires tests first, `design_judgment` needs human choice with intent evidence, `preserve` means keep and log `preserve_reason` - For `design_judgment`, inspect API, callback, framework hook, adapter, public symbol, CLI boundary, compatibility shim, and readability intent; if intent is unclear, default to keep or skip - Log each simplification action as recommended, applied, kept, skipped, failed, with evidence of improvement or preserved contract @@ -22,12 +23,10 @@ Use this skill as an interactive cleanup coach, not a raw lint executor. When a - Delete unused private helpers and speculative abstractions quickly (YAGNI) - Extract repeated function shapes once the second copy appears (DRY) - Split persistence and transport concerns instead of mixing `repository.*` with `http_client.*` (SOLID) -- Add @require/@ensure (icontract) + @beartype to all new public APIs -- Run hatch run contract-test-contracts before any commit -- Write the test file BEFORE the feature file (TDD-first) +- Add @require/@ensure (icontract) + @beartype to all new public APIs; write tests before feature code ## DON'T - Don't copy prompt templates into AI IDEs when this installed skill can carry the reusable workflow guidance -- Don't treat simplification findings as AI-authorship proof or apply batch rewrites without explicit user approval +- Don't treat simplification findings as AI-authorship proof, guaranteed LOC removal, or permission for batch rewrites without explicit approval - Don't ask non-expert users to infer code intent from a raw warning; provide the evidence and safest recommendation - Don't apply `design_judgment` findings just because the patch looks shorter - Don't enable known noisy findings unless you explicitly want strict/full review output diff --git a/packages/specfact-code-review/src/specfact_code_review/review/commands.py b/packages/specfact-code-review/src/specfact_code_review/review/commands.py index c3cc721..d3604d6 100644 --- a/packages/specfact-code-review/src/specfact_code_review/review/commands.py +++ b/packages/specfact-code-review/src/specfact_code_review/review/commands.py @@ -31,26 +31,29 @@ Use this when the user asks to remove AI bloat, simplify code, apply clean-code patterns, reduce boilerplate, or act on SpecFact review findings. 1. Generate evidence first: - specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json + specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json If the worktree is clean on a PR branch and --scope changed finds no files, review the branch-delta Python files as explicit positional files and omit --scope. Find them with the PR base ref, for example: git diff --name-only ...HEAD -- '*.py' '*.pyi' -2. Treat guidance_kind as the action contract: +2. Inspect cleanup_forecast before editing. Use reviewed_loc, estimated_deletion_lines, ai_bloat_index, and by_guidance_kind to decide where cleanup will actually pay off. These estimates are cleanup forecasts, not guarantees. + +3. Sort findings by guidance_kind before editing, then treat guidance_kind and remediation_packet as the action contract: - safe_mechanical: apply only after local safety checks pass. - needs_tests: add or identify targeted tests before changing behavior. - design_judgment: inspect intent evidence and ask before editing. - preserve: keep by default and record preserve_reason. Findings without guidance_kind are unguided advisories: summarize them separately, do not auto-apply them, and ask before using them as refactor input. + Prefer each finding's remediation_packet over prose instructions because the JSON report is the portable AI IDE handoff contract. -3. For vibe-coder or junior users, present each finding as a decision card: +4. For vibe-coder or junior users, present each finding as a decision card: Finding, plain-language issue, why it might need to stay, exact patch preview or small before/after proposal, validation plan, recommended choice. -4. For design_judgment findings, check API, callback, framework hook, adapter, public symbol, CLI boundary, compatibility shim, and readability intent. If intent is unclear, default to keep or skip. +5. For design_judgment findings, check API, callback, framework hook, adapter, public symbol, CLI boundary, compatibility shim, and readability intent. If intent is unclear, default to keep or skip. -5. Apply one file at a time. After each accepted file or very small batch, run targeted tests or rerun: +6. Apply one file at a time. After each accepted file or very small batch, run targeted tests or rerun: specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json -6. Log every action as recommended, applied, kept, skipped, or failed with evidence. Never batch-apply design_judgment findings just because the patch is shorter. +7. Log every action as recommended, applied, kept, skipped, or failed with evidence. Never batch-apply design_judgment findings just because the patch is shorter. Never treat ai_bloat findings as proof of AI authorship; they are cleanup signals only, not proof of AI authorship. """ @@ -142,6 +145,16 @@ def run( score_only: bool = typer.Option(False, "--score-only"), no_tests: bool = typer.Option(False, "--no-tests"), fix: bool = typer.Option(False, "--fix"), + preview_fixes: bool = typer.Option( + False, + "--preview-fixes", + help="Preview supported safe-mechanical simplify fixes without editing tracked files.", + ), + with_mutation: bool = typer.Option( + False, + "--with-mutation", + help="Record opt-in mutation proof evidence for simplify cleanup candidates.", + ), interactive: bool = typer.Option(False, "--interactive"), instructions: bool = typer.Option( False, @@ -182,6 +195,8 @@ def run( score_only=score_only, no_tests=no_tests, fix=fix, + preview_fixes=preview_fixes, + with_mutation=with_mutation, ) except (ValueError, ViolationError) as exc: raise typer.BadParameter(_friendly_run_command_error(exc)) from exc diff --git a/packages/specfact-code-review/src/specfact_code_review/rules/updater.py b/packages/specfact-code-review/src/specfact_code_review/rules/updater.py index 38c36b4..e7ee03b 100644 --- a/packages/specfact-code-review/src/specfact_code_review/rules/updater.py +++ b/packages/specfact-code-review/src/specfact_code_review/rules/updater.py @@ -32,8 +32,9 @@ "- Use this skill when asked to run, interpret, or act on SpecFact code review in Codex CLI or another AI IDE", "- Treat `specfact code review run --help` as authoritative; self-heal stale options by checking help " "before changing workflow", - "- For simplification queues, run `specfact code review run --scope changed --focus simplify --json " - "--out .specfact/code-review-simplify.json`", + "- For simplification queues, run `specfact code review run --scope changed --focus simplify " + "--preview-fixes --json --out .specfact/code-review-simplify.json`", + "- Inspect `cleanup_forecast`, then follow each finding's `remediation_packet`; preserve reasons block autofix", "- Ask for walkthrough level when interactive: vibe coder, junior developer, senior/pro, or headless agent; " "auto-adjust if obvious", "- Interpret `guidance_kind`: `safe_mechanical` may apply after local safety checks, `needs_tests` requires " @@ -61,6 +62,7 @@ "- Don't copy prompt templates into AI IDEs when this installed skill can carry the reusable workflow guidance", "- Don't treat simplification findings as AI-authorship proof or apply batch rewrites without explicit " "user approval", + "- Don't treat cleanup forecasts as guaranteed LOC removal; validate previews and tests first", "- Don't enable known noisy findings unless you explicitly want strict/full review output", "- Don't use bare except: or except Exception: pass", "- Don't add # noqa / # type: ignore without inline justification", diff --git a/packages/specfact-code-review/src/specfact_code_review/run/cleanup_evidence.py b/packages/specfact-code-review/src/specfact_code_review/run/cleanup_evidence.py new file mode 100644 index 0000000..b57c2c1 --- /dev/null +++ b/packages/specfact-code-review/src/specfact_code_review/run/cleanup_evidence.py @@ -0,0 +1,184 @@ +"""Cleanup preview and mutation evidence helpers for review runs.""" + +from __future__ import annotations + +import shutil +import tempfile +from collections.abc import Callable +from pathlib import Path + +from beartype import beartype +from icontract import ensure, require + +from specfact_code_review.run.findings import ( + EvidenceRef, + RemediationPacket, + ReviewFinding, + ReviewReport, + SignalTraceEntry, +) +from specfact_code_review.run.forecast import build_cleanup_forecast + + +ApplySimplificationFixes = Callable[[ReviewReport], list[ReviewFinding]] + + +@beartype +@require(lambda files: isinstance(files, list), "files must be a list") +@ensure(lambda result: isinstance(result, ReviewReport), "result must be a review report") +def with_previewed_simplification_findings( + report: ReviewReport, + files: list[Path], + apply_simplification_fixes: ApplySimplificationFixes, +) -> ReviewReport: + previewed_findings = _preview_simplification_fixes(report, apply_simplification_fixes) + if not previewed_findings: + return with_refreshed_cleanup_forecast(report, files) + replacements = {(finding.file, finding.line, finding.rule): finding for finding in previewed_findings} + findings = [replacements.get((finding.file, finding.line, finding.rule), finding) for finding in report.findings] + return with_refreshed_cleanup_forecast(report.model_copy(update={"findings": findings}), files) + + +@beartype +@require(lambda files: isinstance(files, list), "files must be a list") +@ensure(lambda result: isinstance(result, ReviewReport), "result must be a review report") +def with_mutation_evidence(report: ReviewReport, files: list[Path]) -> ReviewReport: + findings = [_with_mutation_signal(finding) for finding in report.findings] + return with_refreshed_cleanup_forecast(report.model_copy(update={"findings": findings}), files) + + +@beartype +@require(lambda files: isinstance(files, list), "files must be a list") +@ensure(lambda result: isinstance(result, ReviewReport), "result must be a review report") +def with_refreshed_cleanup_forecast(report: ReviewReport, files: list[Path]) -> ReviewReport: + data = report.model_dump() + data["cleanup_forecast"] = build_cleanup_forecast(report.findings, files) + data["simplification_summary"] = None + return ReviewReport(**data) + + +def _preview_simplification_fixes( + report: ReviewReport, + apply_simplification_fixes: ApplySimplificationFixes, +) -> list[ReviewFinding]: + previewed: list[ReviewFinding] = [] + for finding in _fixable_simplifications_by_stable_line_order(report.findings): + preview = _preview_single_simplification(finding, apply_simplification_fixes) + if preview is not None: + previewed.append(preview) + return previewed + + +def _preview_single_simplification( + finding: ReviewFinding, + apply_simplification_fixes: ApplySimplificationFixes, +) -> ReviewFinding | None: + source_path = Path(finding.file) + try: + before = source_path.read_text(encoding="utf-8") + except OSError: + return None + with tempfile.TemporaryDirectory(prefix="specfact-review-preview-") as tmpdir: + preview_path = Path(tmpdir) / source_path.name + if not _write_preview_source(preview_path, before): + return None + preview_finding = finding.model_copy(update={"file": str(preview_path)}) + if not apply_simplification_fixes(_report_for_preview(preview_finding)): + return None + try: + after = preview_path.read_text(encoding="utf-8") + except OSError: + return None + added, removed = _line_delta(before, after) + return _with_patch_preview(finding, added=added, removed=removed) + + +def _fixable_simplifications_by_stable_line_order(findings: list[ReviewFinding]) -> list[ReviewFinding]: + return sorted( + [finding for finding in findings if finding.is_safe_mechanical_simplification()], + key=lambda finding: (finding.file, -finding.line, finding.rule), + ) + + +def _write_preview_source(path: Path, source: str) -> bool: + try: + path.write_text(source, encoding="utf-8") + except OSError: + return False + return True + + +def _report_for_preview(finding: ReviewFinding) -> ReviewReport: + return ReviewReport( + run_id="preview", + score=85, + findings=[finding], + summary="Preview simplification fix.", + ) + + +def _line_delta(before: str, after: str) -> tuple[int, int]: + before_count = len(before.splitlines()) + after_count = len(after.splitlines()) + return max(0, after_count - before_count), max(0, before_count - after_count) + + +def _with_patch_preview(finding: ReviewFinding, *, added: int, removed: int) -> ReviewFinding: + patch_ref = f"preview:{finding.file}:{finding.line}" + signal_trace = [ + *(finding.signal_trace or []), + SignalTraceEntry( + tool="specfact", + source="preview_fixes", + fired=True, + score=1.0, + value=f"added={added}; removed={removed}; net={added - removed}", + evidence_refs=[EvidenceRef(path=finding.file, start_line=finding.line)], + explanation="Non-mutating preview computed a safe-mechanical patch forecast.", + ), + ] + packet = _packet_with_patch_ref(finding, patch_ref) + return ReviewFinding(**{**finding.model_dump(), "signal_trace": signal_trace, "remediation_packet": packet}) + + +def _packet_with_patch_ref(finding: ReviewFinding, patch_ref: str) -> RemediationPacket: + if finding.remediation_packet is None: + return RemediationPacket( + issue=finding.message, + recommended_action=finding.recommended_action or "inspect", + possible_keep_reason=finding.preserve_reason, + safety_checks=finding.safety_checks or ["inspect the surrounding behavior before editing"], + validation_plan=["run targeted tests", "rerun simplify review"], + safe_to_autofix=finding.is_safe_mechanical_simplification() and finding.fixable, + patch_forecast_refs=[patch_ref], + ) + refs = list(finding.remediation_packet.patch_forecast_refs or []) + if patch_ref not in refs: + refs.append(patch_ref) + return finding.remediation_packet.model_copy(update={"patch_forecast_refs": refs}) + + +def _with_mutation_signal(finding: ReviewFinding) -> ReviewFinding: + if not finding.is_safe_mechanical_simplification(): + return finding + value = "inconclusive: mutation scaffolding only" + explanation = "Mutation tooling was available, but candidate-scoped execution is not configured yet." + if not _mutation_tool_available(): + value = "inconclusive: mutmut unavailable" + explanation = "Mutation proof was requested, but mutmut is not installed." + signal_trace = [ + *(finding.signal_trace or []), + SignalTraceEntry( + tool="mutmut", + source="mutation", + fired=False, + value=value, + evidence_refs=[EvidenceRef(path=finding.file, start_line=finding.line)], + explanation=explanation, + ), + ] + return ReviewFinding(**{**finding.model_dump(), "signal_trace": signal_trace}) + + +def _mutation_tool_available() -> bool: + return shutil.which("mutmut") is not None diff --git a/packages/specfact-code-review/src/specfact_code_review/run/commands.py b/packages/specfact-code-review/src/specfact_code_review/run/commands.py index a2b9805..02a5e8c 100644 --- a/packages/specfact-code-review/src/specfact_code_review/run/commands.py +++ b/packages/specfact-code-review/src/specfact_code_review/run/commands.py @@ -16,6 +16,10 @@ from rich.console import Console from rich.table import Table +from specfact_code_review.run.cleanup_evidence import ( + with_mutation_evidence, + with_previewed_simplification_findings, +) from specfact_code_review.run.findings import EvidenceRef, ReviewFinding, ReviewReport from specfact_code_review.run.runner import ReviewFocus, run_review @@ -63,6 +67,8 @@ class ReviewRunRequest: score_only: bool = False no_tests: bool = False fix: bool = False + preview_fixes: bool = False + with_mutation: bool = False bug_hunt: bool = False review_mode: ReviewRunMode = "enforce" review_level: ReviewLevelFilter | None = None @@ -75,6 +81,8 @@ class _ReviewLoopFlags: no_tests: bool include_noise: bool fix: bool + preview_fixes: bool + with_mutation: bool progress_callback: Callable[[str], None] | None bug_hunt: bool review_mode: ReviewRunMode @@ -321,6 +329,17 @@ def _with_applied_simplification_findings(report: ReviewReport, applied_findings return ReviewReport(**data) +def _with_simplify_enforce_verdict(report: ReviewReport, flags: _ReviewLoopFlags) -> ReviewReport: + if ( + flags.review_focus == "simplify" + and flags.review_mode == "enforce" + and report.simplification_summary is not None + and report.simplification_summary.blocking_simplification_count > 0 + ): + return report.model_copy(update={"overall_verdict": "FAIL", "ci_exit_code": 1}) + return report + + def _fixable_simplifications_by_stable_line_order(findings: list[ReviewFinding]) -> list[ReviewFinding]: indexed_findings = [ (index, finding) @@ -636,6 +655,8 @@ def _emit_progress(description: str) -> None: no_tests=flags.no_tests, include_noise=flags.include_noise, fix=flags.fix, + preview_fixes=flags.preview_fixes, + with_mutation=flags.with_mutation, progress_callback=_emit_progress, bug_hunt=flags.bug_hunt, review_mode=flags.review_mode, @@ -654,6 +675,8 @@ def _run_review_with_status( no_tests=flags.no_tests, include_noise=flags.include_noise, fix=False, + preview_fixes=False, + with_mutation=False, progress_callback=status.update, bug_hunt=flags.bug_hunt, review_mode=flags.review_mode, @@ -671,7 +694,13 @@ def _run_review_with_status( status.update("Re-running review after autofixes...") report = _run_review_once(files, base) report = _with_applied_simplification_findings(report, applied_simplification_findings) - return report + if flags.preview_fixes: + status.update("Previewing safe mechanical simplification fixes...") + report = with_previewed_simplification_findings(report, files, _apply_simplification_fixes) + if flags.with_mutation: + status.update("Recording mutation proof evidence...") + report = with_mutation_evidence(report, files) + return _with_simplify_enforce_verdict(report, flags) def _run_review_once(files: list[Path], flags: _ReviewLoopFlags) -> ReviewReport: @@ -713,7 +742,11 @@ def _run_review_once(files: list[Path], flags: _ReviewLoopFlags) -> ReviewReport focus=flags.review_focus, ) report = _with_applied_simplification_findings(report, applied_simplification_findings) - return report + if flags.preview_fixes: + report = with_previewed_simplification_findings(report, files, _apply_simplification_fixes) + if flags.with_mutation: + report = with_mutation_evidence(report, files) + return _with_simplify_enforce_verdict(report, flags) def _as_auto_scope(value: object) -> AutoScope | None: @@ -830,6 +863,8 @@ def _get_optional_param(name: str, validator: Callable[[object], object], defaul score_only=_get_bool_param("score_only"), no_tests=_get_bool_param("no_tests"), fix=_get_bool_param("fix"), + preview_fixes=_get_bool_param("preview_fixes"), + with_mutation=_get_bool_param("with_mutation"), bug_hunt=_get_bool_param("bug_hunt"), review_mode=_as_review_mode(request_kwargs.pop("review_mode", "enforce")), review_level=_as_review_level(request_kwargs.pop("review_level", None)), @@ -863,6 +898,12 @@ def _validate_review_request(request: ReviewRunRequest) -> None: raise InvalidOptionCombinationError("Use either --json or --score-only, not both.") if not request.json_output and request.out is not None: raise MissingOutForJsonError("Use --out together with --json.") + if request.preview_fixes and request.fix: + raise InvalidOptionCombinationError("Cannot combine --preview-fixes with --fix.") + if request.preview_fixes and request.review_focus != "simplify": + raise InvalidOptionCombinationError("Use --preview-fixes only with --focus simplify.") + if request.with_mutation and request.review_focus != "simplify": + raise InvalidOptionCombinationError("Use --with-mutation only with --focus simplify.") def _normalize_review_request(request: ReviewRunRequest) -> ReviewRunRequest: @@ -879,6 +920,8 @@ def _normalize_review_request(request: ReviewRunRequest) -> ReviewRunRequest: score_only=request.score_only, no_tests=request.no_tests, fix=request.fix, + preview_fixes=request.preview_fixes, + with_mutation=request.with_mutation, bug_hunt=request.bug_hunt, review_mode=request.review_mode, review_level=request.review_level, @@ -931,6 +974,8 @@ def run_command( no_tests=request.no_tests, include_noise=request.include_noise, fix=request.fix, + preview_fixes=request.preview_fixes, + with_mutation=request.with_mutation, progress_callback=None, bug_hunt=request.bug_hunt, review_mode=request.review_mode, diff --git a/packages/specfact-code-review/src/specfact_code_review/run/findings.py b/packages/specfact-code-review/src/specfact_code_review/run/findings.py index 50a00d1..500be6d 100644 --- a/packages/specfact-code-review/src/specfact_code_review/run/findings.py +++ b/packages/specfact-code-review/src/specfact_code_review/run/findings.py @@ -29,6 +29,16 @@ VALID_SEVERITIES = ("error", "warning", "info") GUIDANCE_KINDS = ("safe_mechanical", "needs_tests", "design_judgment", "preserve") ACTION_STATUSES = ("recommended", "applied", "kept", "skipped", "failed") +PRESERVE_REASONS = ( + "contract_lambda", + "protocol_member", + "public_api", + "compat_shim", + "cli_callback", + "domain_wrapper", + "spec_linked", + "load_bearing", +) PASS = "PASS" PASS_WITH_ADVISORY = "PASS_WITH_ADVISORY" FAIL = "FAIL" @@ -67,6 +77,130 @@ def _validate_invariants(self) -> EvidenceRef: return self +class SignalTraceEntry(BaseModel): + """Deterministic source signal that contributed to a cleanup finding.""" + + tool: str = Field(..., description="Tool or analysis layer that produced the signal.") + source: str = Field(..., description="Stable signal or rule source identifier.") + fired: bool = Field(..., description="Whether the signal fired for this finding.") + score: float | None = Field(default=None, description="Optional normalized signal score.") + value: str | int | float | bool | None = Field(default=None, description="Optional raw signal value.") + evidence_refs: list[EvidenceRef] | None = Field(default=None, description="Evidence backing the signal.") + explanation: str = Field(..., description="Short explanation of the signal.") + + @field_validator("tool", "source", "explanation") + @classmethod + def _validate_non_empty_text(cls, value: str) -> str: + if not value.strip(): + raise ValueError("value must not be empty") + return value + + +class PreserveReasonEvidence(BaseModel): + """Closed-taxonomy reason that prevents automatic cleanup.""" + + reason: Literal[ + "contract_lambda", + "protocol_member", + "public_api", + "compat_shim", + "cli_callback", + "domain_wrapper", + "spec_linked", + "load_bearing", + ] = Field(..., description="Closed preserve-reason taxonomy value.") + evidence_refs: list[EvidenceRef] = Field(..., min_length=1, description="Evidence for the preserve reason.") + explanation: str = Field(..., description="Why this context must be preserved.") + + @field_validator("explanation") + @classmethod + def _validate_non_empty_text(cls, value: str) -> str: + if not value.strip(): + raise ValueError("value must not be empty") + return value + + +class RemediationPacket(BaseModel): + """Portable AI IDE handoff contract for one cleanup finding.""" + + issue: str = Field(..., description="Plain-language issue description.") + recommended_action: str = Field(..., description="Recommended cleanup action.") + possible_keep_reason: str | None = Field(default=None, description="Why the code might need to stay.") + safety_checks: list[str] = Field(..., min_length=1, description="Checks required before editing.") + validation_plan: list[str] = Field(..., min_length=1, description="Validation steps after editing.") + safe_to_autofix: bool = Field(..., description="Whether an agent may apply this automatically.") + patch_forecast_refs: list[str] | None = Field(default=None, description="Patch preview references when present.") + + @field_validator("issue", "recommended_action", "possible_keep_reason") + @classmethod + def _validate_non_empty_text(cls, value: str | None) -> str | None: + if value is not None and not value.strip(): + raise ValueError("value must not be empty") + return value + + @field_validator("safety_checks", "validation_plan", "patch_forecast_refs") + @classmethod + def _validate_non_empty_entries(cls, value: list[str] | None) -> list[str] | None: + if value is not None and any(not item.strip() for item in value): + raise ValueError("entries must not be empty") + return value + + +class ReviewedLoc(BaseModel): + """Reviewed Python LOC split by production and tests.""" + + production: int = Field(..., ge=0) + tests: int = Field(..., ge=0) + total: int = Field(..., ge=0) + + @model_validator(mode="after") + def _validate_total_matches_parts(self) -> ReviewedLoc: + if self.total != self.production + self.tests: + raise ValueError("reviewed_loc.total must equal production + tests") + return self + + +class DeletionEstimate(BaseModel): + """Non-binding deletion-line range.""" + + low: int = Field(..., ge=0) + expected: int = Field(..., ge=0) + high: int = Field(..., ge=0) + + @model_validator(mode="after") + def _validate_ordering(self) -> DeletionEstimate: + if not self.low <= self.expected <= self.high: + raise ValueError("estimated_deletion_lines must satisfy low <= expected <= high") + return self + + +class AiBloatIndex(BaseModel): + """Normalized cleanup metrics per KLOC.""" + + findings_per_kloc: float = Field(..., ge=0.0) + weighted_bloat_points_per_kloc: float = Field(..., ge=0.0) + cleanup_yield_loc_per_kloc: float = Field(..., ge=0.0) + + +class GuidanceKindForecast(BaseModel): + """Forecast aggregate for one guidance kind.""" + + count: int = Field(..., ge=0) + estimated_deletion_lines: int = Field(..., ge=0) + + +class CleanupForecast(BaseModel): + """Aggregate cleanup impact forecast for simplify-focused reviews.""" + + reviewed_loc: ReviewedLoc + estimated_deletion_lines: DeletionEstimate + ai_bloat_index: AiBloatIndex + by_guidance_kind: dict[str, GuidanceKindForecast] = Field(default_factory=dict) + by_action_status: dict[str, int] = Field(default_factory=dict) + preview_evidence_count: int = Field(default=0, ge=0) + mutation_evidence_count: int = Field(default=0, ge=0) + + class ReviewFinding(BaseModel): """Structured representation of a code-review finding.""" @@ -156,6 +290,18 @@ class ReviewFinding(BaseModel): before_ref: EvidenceRef | None = Field(default=None, description="Evidence reference before an applied action.") after_ref: EvidenceRef | None = Field(default=None, description="Evidence reference after an applied action.") improvement: str | None = Field(default=None, description="Evidence-backed improvement summary.") + signal_trace: list[SignalTraceEntry] | None = Field( + default=None, + description="Optional deterministic signal trace for cleanup findings.", + ) + preserve_reasons: list[PreserveReasonEvidence] | None = Field( + default=None, + description="Optional closed-taxonomy preserve reasons that block automatic cleanup.", + ) + remediation_packet: RemediationPacket | None = Field( + default=None, + description="Optional portable cleanup handoff packet for AI IDEs.", + ) @field_validator( "tool", @@ -186,6 +332,13 @@ def _validate_safety_checks(cls, value: list[str] | None) -> list[str] | None: raise ValueError("safety_checks entries must not be empty") return value + @field_validator("signal_trace", "preserve_reasons") + @classmethod + def _validate_non_empty_evidence_list(cls, value: list[object] | None) -> list[object] | None: + if value is not None and not value: + raise ValueError("evidence lists must not be empty when provided") + return value + @model_validator(mode="after") def _validate_guided_metadata(self) -> ReviewFinding: guided_fields = ( @@ -238,6 +391,9 @@ def has_simplification_metadata(self) -> bool: self.before_ref, self.after_ref, self.improvement, + self.signal_trace, + self.preserve_reasons, + self.remediation_packet, ) ) @@ -251,7 +407,17 @@ def has_guided_simplification_metadata(self) -> bool: @ensure(lambda result: isinstance(result, bool)) def is_safe_mechanical_simplification(self) -> bool: """Return whether the finding is an unresolved safe mechanical simplification.""" - return self.guidance_kind == "safe_mechanical" and self.action_status in {None, "recommended", "failed"} + return ( + self.guidance_kind == "safe_mechanical" + and self.action_status in {None, "recommended", "failed"} + and not self.preserve_reasons + ) + + @beartype + @ensure(lambda result: isinstance(result, bool)) + def has_cleanup_handoff_metadata(self) -> bool: + """Return whether this finding carries cleanup forecast or handoff metadata.""" + return self.signal_trace is not None or self.preserve_reasons is not None or self.remediation_packet is not None @beartype @ensure(lambda result: isinstance(result, bool)) @@ -302,6 +468,10 @@ class ReviewReport(BaseModel): default=None, description="Aggregate simplification guidance and action-status evidence.", ) + cleanup_forecast: CleanupForecast | None = Field( + default=None, + description="Aggregate cleanup forecast for simplify-focused review runs.", + ) house_rules_updates: list[str] = Field(default_factory=list, description="Suggested house-rules updates.") @field_validator("schema_version", "run_id", "summary") @@ -322,7 +492,11 @@ def _normalize_timestamp(cls, value: datetime) -> datetime: def _derive_governance_fields(self) -> ReviewReport: if self.simplification_summary is None: self.simplification_summary = _build_simplification_summary(self.findings) - if self.simplification_summary is not None: + if self.cleanup_forecast is not None or any( + finding.has_cleanup_handoff_metadata() for finding in self.findings + ): + self.schema_version = "1.3" + elif self.simplification_summary is not None: self.schema_version = "1.2" elif any(finding.has_simplification_metadata() for finding in self.findings): self.schema_version = "1.1" diff --git a/packages/specfact-code-review/src/specfact_code_review/run/forecast.py b/packages/specfact-code-review/src/specfact_code_review/run/forecast.py new file mode 100644 index 0000000..60386ed --- /dev/null +++ b/packages/specfact-code-review/src/specfact_code_review/run/forecast.py @@ -0,0 +1,120 @@ +"""Cleanup forecast metrics for simplify-focused review reports.""" + +from __future__ import annotations + +from dataclasses import dataclass +from pathlib import Path + +from beartype import beartype +from icontract import ensure, require + +from specfact_code_review.run.findings import ( + AiBloatIndex, + CleanupForecast, + DeletionEstimate, + GuidanceKindForecast, + ReviewedLoc, + ReviewFinding, +) + + +_CLEANUP_FORECAST_WEIGHTS = { + "safe_mechanical": 1.0, + "needs_tests": 0.6, + "design_judgment": 0.25, + "preserve": 0.0, +} + + +@dataclass +class _CleanupForecastTotals: + by_guidance_kind: dict[str, GuidanceKindForecast] + by_action_status: dict[str, int] + low: int = 0 + expected: float = 0.0 + high: int = 0 + weighted_points: float = 0.0 + + +@beartype +@require(lambda files: isinstance(files, list), "files must be a list") +@require(lambda findings: isinstance(findings, list), "findings must be a list") +@ensure(lambda result: isinstance(result, CleanupForecast), "result must be a cleanup forecast") +def build_cleanup_forecast(findings: list[ReviewFinding], files: list[Path]) -> CleanupForecast: + reviewed_loc = _reviewed_loc_for_files(files) + guided = [finding for finding in findings if finding.guidance_kind is not None] + totals = _cleanup_forecast_totals(guided) + kloc = max(reviewed_loc.total / 1000.0, 0.001) + expected_lines = round(totals.expected) + return CleanupForecast( + reviewed_loc=reviewed_loc, + estimated_deletion_lines=DeletionEstimate(low=totals.low, expected=expected_lines, high=totals.high), + ai_bloat_index=AiBloatIndex( + findings_per_kloc=round(len(guided) / kloc, 3), + weighted_bloat_points_per_kloc=round(totals.weighted_points / kloc, 3), + cleanup_yield_loc_per_kloc=round(expected_lines / kloc, 3), + ), + by_guidance_kind=totals.by_guidance_kind, + by_action_status=totals.by_action_status, + preview_evidence_count=sum( + 1 + for finding in guided + if finding.remediation_packet is not None and finding.remediation_packet.patch_forecast_refs + ), + mutation_evidence_count=sum( + 1 + for finding in guided + if finding.signal_trace is not None and any(signal.source == "mutation" for signal in finding.signal_trace) + ), + ) + + +def _reviewed_loc_for_files(files: list[Path]) -> ReviewedLoc: + production = 0 + tests = 0 + for file_path in files: + if file_path.suffix not in {".py", ".pyi"}: + continue + try: + loc = _count_python_loc(file_path) + except (OSError, UnicodeDecodeError): + continue + if "tests" in file_path.parts: + tests += loc + else: + production += loc + return ReviewedLoc(production=production, tests=tests, total=production + tests) + + +def _count_python_loc(file_path: Path) -> int: + return sum( + 1 + for line in file_path.read_text(encoding="utf-8").splitlines() + if line.strip() and not line.lstrip().startswith("#") + ) + + +def _cleanup_forecast_totals(guided: list[ReviewFinding]) -> _CleanupForecastTotals: + totals = _CleanupForecastTotals(by_guidance_kind={}, by_action_status={}) + for finding in guided: + _add_cleanup_forecast_finding(totals, finding) + return totals + + +def _add_cleanup_forecast_finding(totals: _CleanupForecastTotals, finding: ReviewFinding) -> None: + guidance_kind = finding.guidance_kind or "design_judgment" + deletion_lines = finding.estimated_deletion_lines or 0 + current = totals.by_guidance_kind.get(guidance_kind, GuidanceKindForecast(count=0, estimated_deletion_lines=0)) + totals.by_guidance_kind[guidance_kind] = GuidanceKindForecast( + count=current.count + 1, + estimated_deletion_lines=current.estimated_deletion_lines + deletion_lines, + ) + if finding.action_status is not None: + totals.by_action_status[finding.action_status] = totals.by_action_status.get(finding.action_status, 0) + 1 + if guidance_kind == "safe_mechanical": + totals.low += deletion_lines + if guidance_kind != "preserve": + totals.high += deletion_lines + weight = _CLEANUP_FORECAST_WEIGHTS.get(guidance_kind, 0.0) + totals.expected += deletion_lines * weight + totals.weighted_points += weight diff --git a/packages/specfact-code-review/src/specfact_code_review/run/runner.py b/packages/specfact-code-review/src/specfact_code_review/run/runner.py index 0f72cbf..95ba06d 100644 --- a/packages/specfact-code-review/src/specfact_code_review/run/runner.py +++ b/packages/specfact-code-review/src/specfact_code_review/run/runner.py @@ -21,7 +21,16 @@ from icontract import ensure, require from specfact_code_review._review_utils import normalize_path_variants, tool_error -from specfact_code_review.run.findings import ReviewFinding, ReviewReport +from specfact_code_review.run.findings import ( + CleanupForecast, + EvidenceRef, + PreserveReasonEvidence, + RemediationPacket, + ReviewFinding, + ReviewReport, + SignalTraceEntry, +) +from specfact_code_review.run.forecast import build_cleanup_forecast from specfact_code_review.run.scorer import score_review from specfact_code_review.tools.ai_bloat_runner import run_ai_bloat from specfact_code_review.tools.ast_clean_code_runner import run_ast_clean_code @@ -345,6 +354,290 @@ def _filter_findings_by_focus(findings: list[ReviewFinding], focus: ReviewFocus raise ValueError(f"Unsupported review focus: {focus}") +def _enrich_cleanup_findings(findings: list[ReviewFinding]) -> list[ReviewFinding]: + return [_enriched_cleanup_finding(finding) for finding in findings] + + +def _enriched_cleanup_finding(finding: ReviewFinding) -> ReviewFinding: + if finding.guidance_kind is None: + return finding + preserve_reasons = list(finding.preserve_reasons or []) + preserve_reasons.extend( + reason + for reason in _preserve_reasons_for_finding(finding, load_bearing=False) + if reason not in preserve_reasons + ) + updates: dict[str, object] = { + "signal_trace": _signal_trace_for_finding(finding), + } + if preserve_reasons: + updates.update( + { + "guidance_kind": "preserve", + "recommended_action": "keep", + "estimated_deletion_lines": 0, + "action_status": "kept", + "preserve_reason": "; ".join(reason.explanation for reason in preserve_reasons), + "preserve_reasons": preserve_reasons, + } + ) + candidate = finding.model_copy(update=updates) + return candidate.model_copy(update={"remediation_packet": _remediation_packet_for_finding(candidate)}) + + +def _signal_trace_for_finding(finding: ReviewFinding) -> list[SignalTraceEntry]: + existing = list(finding.signal_trace or []) + if existing: + return existing + return [ + SignalTraceEntry( + tool=finding.tool, + source=finding.rule, + fired=True, + score=1.0 if finding.confidence == "high" else None, + value=finding.canonical_pattern, + evidence_refs=[EvidenceRef(path=finding.file, start_line=finding.line)], + explanation=f"{finding.tool} emitted {finding.rule}.", + ) + ] + + +def _remediation_packet_for_finding(finding: ReviewFinding) -> RemediationPacket: + possible_keep_reason = finding.preserve_reason + if possible_keep_reason is None and finding.guidance_kind in {"design_judgment", "needs_tests"}: + possible_keep_reason = "Keep the current shape if tests, API compatibility, or domain readability need it." + return RemediationPacket( + issue=finding.message, + recommended_action=finding.recommended_action or "inspect", + possible_keep_reason=possible_keep_reason, + safety_checks=finding.safety_checks or ["inspect the surrounding behavior before editing"], + validation_plan=["run targeted tests for the touched file", "rerun specfact code review with --focus simplify"], + safe_to_autofix=finding.is_safe_mechanical_simplification() and finding.fixable, + ) + + +def _preserve_reasons_for_finding(finding: ReviewFinding, *, load_bearing: bool) -> list[PreserveReasonEvidence]: + reasons: list[PreserveReasonEvidence] = [] + evidence_ref = EvidenceRef(path=finding.file, start_line=finding.line) + if load_bearing: + reasons.append( + PreserveReasonEvidence( + reason="load_bearing", + evidence_refs=[evidence_ref], + explanation="Mutation proof indicates this code is load-bearing.", + ) + ) + try: + source = Path(finding.file).read_text(encoding="utf-8") + tree = ast.parse(source, filename=finding.file) + except (OSError, SyntaxError, UnicodeDecodeError): + return reasons + lines = source.splitlines() + function_node = _function_containing_line(tree, finding.line) + class_node = _class_containing_line(tree, finding.line) + public_names = _module_all_names(tree) + if function_node is not None: + if _has_contract_decorator(function_node): + reasons.append( + PreserveReasonEvidence( + reason="contract_lambda", + evidence_refs=[evidence_ref], + explanation="Function is protected by an icontract-style contract decorator.", + ) + ) + if function_node.name in public_names: + reasons.append( + PreserveReasonEvidence( + reason="public_api", + evidence_refs=[evidence_ref], + explanation="Function is exported through __all__ and is public API.", + ) + ) + if _has_cli_decorator(function_node): + reasons.append( + PreserveReasonEvidence( + reason="cli_callback", + evidence_refs=[evidence_ref], + explanation="Function is registered as a Typer or Click callback.", + ) + ) + if _has_preserve_marker(lines, function_node.lineno): + reasons.append( + PreserveReasonEvidence( + reason="compat_shim", + evidence_refs=[evidence_ref], + explanation="Function has an explicit specfact preserve compatibility marker.", + ) + ) + if _has_spec_marker(lines, function_node.lineno): + reasons.append( + PreserveReasonEvidence( + reason="spec_linked", + evidence_refs=[evidence_ref], + explanation="Function has an explicit spec requirement marker.", + ) + ) + if class_node is not None and _is_protocol_or_abstract_member(class_node, function_node): + reasons.append( + PreserveReasonEvidence( + reason="protocol_member", + evidence_refs=[evidence_ref], + explanation="Finding is inside an abstract Protocol or ABC member contract.", + ) + ) + return _dedupe_preserve_reasons(reasons) + + +def _dedupe_preserve_reasons(reasons: list[PreserveReasonEvidence]) -> list[PreserveReasonEvidence]: + deduped: list[PreserveReasonEvidence] = [] + seen: set[str] = set() + for reason in reasons: + if reason.reason in seen: + continue + seen.add(reason.reason) + deduped.append(reason) + return deduped + + +def _function_containing_line(tree: ast.AST, line: int) -> ast.FunctionDef | ast.AsyncFunctionDef | None: + functions = [ + node + for node in ast.walk(tree) + if isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef) + and node.lineno <= line <= (node.end_lineno or node.lineno) + ] + return max(functions, key=lambda node: node.lineno, default=None) + + +def _class_containing_line(tree: ast.AST, line: int) -> ast.ClassDef | None: + classes = [ + node + for node in ast.walk(tree) + if isinstance(node, ast.ClassDef) and node.lineno <= line <= (node.end_lineno or node.lineno) + ] + return max(classes, key=lambda node: node.lineno, default=None) + + +def _module_all_names(tree: ast.Module) -> set[str]: + names: set[str] = set() + for node in tree.body: + exported_values = _module_all_assignment_values(node) + if exported_values is None: + continue + for item in exported_values: + item_name = _string_constant_value(item) + if item_name is not None: + names.add(item_name) + return names + + +def _module_all_assignment_values(node: ast.stmt) -> list[ast.expr] | None: + if not isinstance(node, ast.Assign): + return None + if not any(isinstance(target, ast.Name) and target.id == "__all__" for target in node.targets): + return None + if isinstance(node.value, ast.List | ast.Tuple | ast.Set): + return list(node.value.elts) + return None + + +def _string_constant_value(node: ast.AST) -> str | None: + if isinstance(node, ast.Constant) and isinstance(node.value, str): + return node.value + if isinstance(node, ast.Str) and isinstance(node.s, str): + return node.s + return None + + +def _decorator_full_name(node: ast.AST) -> str: + target = node.func if isinstance(node, ast.Call) else node + if isinstance(target, ast.Name): + return target.id + if isinstance(target, ast.Attribute): + prefix = _decorator_full_name(target.value) + return f"{prefix}.{target.attr}" if prefix else target.attr + return "" + + +def _has_contract_decorator(function_node: ast.FunctionDef | ast.AsyncFunctionDef) -> bool: + contract_names = {"require", "ensure", "invariant", "icontract.require", "icontract.ensure", "icontract.invariant"} + return any(_decorator_full_name(decorator) in contract_names for decorator in function_node.decorator_list) + + +def _has_cli_decorator(function_node: ast.FunctionDef | ast.AsyncFunctionDef) -> bool: + return any( + _decorator_full_name(decorator).split(".")[-1] in {"command", "callback"} + for decorator in function_node.decorator_list + ) + + +def _has_abstractmethod(function_node: ast.FunctionDef | ast.AsyncFunctionDef | None) -> bool: + if function_node is None: + return False + return any( + _decorator_full_name(decorator).split(".")[-1] == "abstractmethod" for decorator in function_node.decorator_list + ) + + +def _is_protocol_or_abstract_member( + class_node: ast.ClassDef, + function_node: ast.FunctionDef | ast.AsyncFunctionDef | None, +) -> bool: + if function_node is None: + return False + if _has_abstractmethod(function_node): + return True + return _has_base_named(class_node, {"Protocol"}) and _is_stub_function(function_node) + + +def _is_stub_function(function_node: ast.FunctionDef | ast.AsyncFunctionDef) -> bool: + body = [statement for statement in function_node.body if not _is_docstring_statement(statement)] + return all(_is_stub_statement(statement) for statement in body) + + +def _is_docstring_statement(statement: ast.stmt) -> bool: + return ( + isinstance(statement, ast.Expr) + and isinstance(statement.value, ast.Constant) + and isinstance(statement.value.value, str) + ) + + +def _is_stub_statement(statement: ast.stmt) -> bool: + if isinstance(statement, ast.Pass): + return True + return ( + isinstance(statement, ast.Expr) + and isinstance(statement.value, ast.Constant) + and statement.value.value is Ellipsis + ) + + +def _has_base_named(class_node: ast.ClassDef, names: set[str]) -> bool: + return any(_base_name(base).rsplit(".", maxsplit=1)[-1] in names for base in class_node.bases) + + +def _base_name(node: ast.AST) -> str: + if isinstance(node, ast.Name): + return node.id + if isinstance(node, ast.Attribute): + prefix = _base_name(node.value) + return f"{prefix}.{node.attr}" if prefix else node.attr + if isinstance(node, ast.Subscript): + return _base_name(node.value) + return "" + + +def _has_preserve_marker(lines: list[str], line: int) -> bool: + context = "\n".join(lines[max(0, line - 3) : line]) + return "specfact: preserve(" in context + + +def _has_spec_marker(lines: list[str], line: int) -> bool: + context = "\n".join(lines[max(0, line - 3) : line]) + return "# spec:" in context or "# specfact: requirement(" in context + + def _collect_tdd_inputs(files: list[Path]) -> tuple[list[Path], list[Path], list[ReviewFinding]]: source_files = [file_path for file_path in files if _expected_test_path(file_path) is not None] findings: list[ReviewFinding] = [] @@ -633,6 +926,10 @@ def run_review( findings = _filter_findings_by_review_level(findings, review_options.review_level) findings = _filter_findings_by_focus(findings, review_options.focus) + cleanup_forecast: CleanupForecast | None = None + if review_options.focus == "simplify": + findings = _enrich_cleanup_findings(findings) + cleanup_forecast = build_cleanup_forecast(findings, files) score = score_review( findings=findings, @@ -648,6 +945,7 @@ def run_review( score=score.score, findings=findings, summary=_summary_for_findings(findings), + cleanup_forecast=cleanup_forecast, ) if review_options.review_mode == "shadow": return report.model_copy(update={"ci_exit_code": 0}) diff --git a/packages/specfact-project/module-package.yaml b/packages/specfact-project/module-package.yaml index 089eec4..06b52b6 100644 --- a/packages/specfact-project/module-package.yaml +++ b/packages/specfact-project/module-package.yaml @@ -1,5 +1,5 @@ name: nold-ai/specfact-project -version: 0.41.17 +version: 0.41.18 commands: - project - plan @@ -27,5 +27,5 @@ core_compatibility: '>=0.40.0,<1.0.0' description: Official SpecFact project bundle package. bundle_group_command: project integrity: - checksum: sha256:98760f97c05a8bb7606931dd50b2c885c8d98309332cdab6bc5b2d5153564cdc - signature: Hp8M0QWi1OO1+gCPil5K0MbUy3tY3xg8R3OzdHRbeJFWnYPaB8GMuEpEkN163OVjzEqToTKYnifLpGqfgJt9CA== + checksum: sha256:4f2905d81e5287a7bfa81a93fe50ac17d6f55f5d5b7ea6b2d328ed83869d99a2 + signature: 55f2vT+jV/u0xcxWFgQnEZ1urtuJjfOhXYNo1tOUrnPZGhHWAhLaBGuwbYYb1V2FeUbFPdUxbbkx/TjbcLvZDA== diff --git a/packages/specfact-project/resources/prompts/specfact.08-simplify.md b/packages/specfact-project/resources/prompts/specfact.08-simplify.md index 32bec99..2547719 100644 --- a/packages/specfact-project/resources/prompts/specfact.08-simplify.md +++ b/packages/specfact-project/resources/prompts/specfact.08-simplify.md @@ -56,16 +56,18 @@ If this slash prompt or the installed skill is unavailable in another AI IDE, te Read `.specfact/code-review-simplify.json`. If it is missing, ask the user to run: ```bash -specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json +specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json ``` -Explain that this report is the evidence file: it lists candidate cleanups, the safety checks, and the preserve reasons the assistant must use before touching code. Do not edit files until the report exists. +Explain that this report is the evidence file: it lists candidate cleanups, cleanup forecast, safety checks, remediation packets, and preserve reasons the assistant must use before touching code. Do not edit files until the report exists. + +Inspect `cleanup_forecast` first. Use reviewed LOC, deletion estimate ranges, AI-bloat index, weighted bloat points, cleanup yield, and guidance-kind totals to decide where cleanup has the highest likely payoff. Treat estimates as forecasts, not guaranteed LOC removal. If the report contains no findings where `category == "ai_bloat"` and no findings with simplification metadata such as `intent_key`, `rewrite_hint`, `canonical_pattern`, or `guidance_kind`, report that there are no simplification candidates and stop without editing files. ### Step 2: Group Candidates -Group findings by `intent_key` first when present, then by file or domain and rule. For each candidate, inspect the referenced source location, inspect any related locations from `related_locations`, and capture small surrounding snippets before proposing a rewrite. +Group findings by `intent_key` first when present, then by file or domain and rule. For each candidate, inspect the referenced source location, inspect any related locations from `related_locations`, and capture small surrounding snippets before proposing a rewrite. When `remediation_packet` is present, treat it as authoritative for the issue, recommended action, possible keep reason, safety checks, validation plan, and `safe_to_autofix`. Use `guidance_kind` as the action contract: @@ -74,6 +76,8 @@ Use `guidance_kind` as the action contract: - `design_judgment`: inspect intent evidence first, explain tradeoffs in plain language, default to keep/skip when intent is unclear, and ask before editing. - `preserve`: keep by default; record the `preserve_reason` as a false-positive or intentional-pattern note. +If `preserve_reasons` is present, do not autofix the finding even when a shorter patch exists. + For vibe-coder and junior walkthroughs, present findings as a decision card instead of a raw lint warning: ```text @@ -120,7 +124,7 @@ Compare the new report with the prior findings and summarize which `ai_bloat` or Use the CLI as the verification source: ```bash -specfact code review run --scope changed --focus simplify --json --out .specfact/code-review-simplify.json +specfact code review run --scope changed --focus simplify --preview-fixes --json --out .specfact/code-review-simplify.json specfact code review run --scope changed --bug-hunt --json --out .specfact/code-review-bughunt.json ``` diff --git a/registry/index.json b/registry/index.json index fdbdee2..c559e8e 100644 --- a/registry/index.json +++ b/registry/index.json @@ -2,9 +2,9 @@ "modules": [ { "id": "nold-ai/specfact-project", - "latest_version": "0.41.17", - "download_url": "modules/specfact-project-0.41.17.tar.gz", - "checksum_sha256": "bb9a30df380bc23cb915f74ae5a49d89782e85dc845b358b6938d056b4ad1f04", + "latest_version": "0.41.18", + "download_url": "modules/specfact-project-0.41.18.tar.gz", + "checksum_sha256": "58d5652c6c5e609af1acd661ad816250a251ed80c965f301b40939fc263a51cd", "tier": "official", "publisher": { "name": "nold-ai", @@ -78,9 +78,9 @@ }, { "id": "nold-ai/specfact-code-review", - "latest_version": "0.47.27", - "download_url": "modules/specfact-code-review-0.47.27.tar.gz", - "checksum_sha256": "1f7167cc3f973ebb1a60613655ef22b57a75054044f586f89de73c904b74d740", + "latest_version": "0.47.30", + "download_url": "modules/specfact-code-review-0.47.30.tar.gz", + "checksum_sha256": "8e9db9d4659f6bd71d572e213a186f188e20ddfa1108838d18cfb4decca5761e", "core_compatibility": ">=0.44.0,<1.0.0", "tier": "official", "publisher": { diff --git a/registry/modules/specfact-code-review-0.47.28.tar.gz b/registry/modules/specfact-code-review-0.47.28.tar.gz new file mode 100644 index 0000000..adf70ff Binary files /dev/null and b/registry/modules/specfact-code-review-0.47.28.tar.gz differ diff --git a/registry/modules/specfact-code-review-0.47.28.tar.gz.sha256 b/registry/modules/specfact-code-review-0.47.28.tar.gz.sha256 new file mode 100644 index 0000000..043c920 --- /dev/null +++ b/registry/modules/specfact-code-review-0.47.28.tar.gz.sha256 @@ -0,0 +1 @@ +968386b7d14223a47f76de0aa1289e7b4a2aabe486a5c256a40feedcec3988bd diff --git a/registry/modules/specfact-code-review-0.47.29.tar.gz b/registry/modules/specfact-code-review-0.47.29.tar.gz new file mode 100644 index 0000000..6bf9353 Binary files /dev/null and b/registry/modules/specfact-code-review-0.47.29.tar.gz differ diff --git a/registry/modules/specfact-code-review-0.47.29.tar.gz.sha256 b/registry/modules/specfact-code-review-0.47.29.tar.gz.sha256 new file mode 100644 index 0000000..f04d69a --- /dev/null +++ b/registry/modules/specfact-code-review-0.47.29.tar.gz.sha256 @@ -0,0 +1 @@ +e4856920ab47db60d58800b0662ac878ce43cb32d4cb2938cda3cd750e68cb51 diff --git a/registry/modules/specfact-code-review-0.47.30.tar.gz b/registry/modules/specfact-code-review-0.47.30.tar.gz new file mode 100644 index 0000000..6a4a1da Binary files /dev/null and b/registry/modules/specfact-code-review-0.47.30.tar.gz differ diff --git a/registry/modules/specfact-code-review-0.47.30.tar.gz.sha256 b/registry/modules/specfact-code-review-0.47.30.tar.gz.sha256 new file mode 100644 index 0000000..771796b --- /dev/null +++ b/registry/modules/specfact-code-review-0.47.30.tar.gz.sha256 @@ -0,0 +1 @@ +8e9db9d4659f6bd71d572e213a186f188e20ddfa1108838d18cfb4decca5761e diff --git a/registry/modules/specfact-project-0.41.18.tar.gz b/registry/modules/specfact-project-0.41.18.tar.gz new file mode 100644 index 0000000..132895b Binary files /dev/null and b/registry/modules/specfact-project-0.41.18.tar.gz differ diff --git a/registry/modules/specfact-project-0.41.18.tar.gz.sha256 b/registry/modules/specfact-project-0.41.18.tar.gz.sha256 new file mode 100644 index 0000000..bde46f1 --- /dev/null +++ b/registry/modules/specfact-project-0.41.18.tar.gz.sha256 @@ -0,0 +1 @@ +58d5652c6c5e609af1acd661ad816250a251ed80c965f301b40939fc263a51cd diff --git a/tests/cli-contracts/specfact-code-review-run.scenarios.yaml b/tests/cli-contracts/specfact-code-review-run.scenarios.yaml index 43e6314..e5cd60e 100644 --- a/tests/cli-contracts/specfact-code-review-run.scenarios.yaml +++ b/tests/cli-contracts/specfact-code-review-run.scenarios.yaml @@ -168,3 +168,24 @@ scenarios: exit_code: 2 stderr_contains: - Cannot combine --focus with --include-tests or --exclude-tests + - name: preview-fixes-cannot-combine-with-fix + type: anti-pattern + argv: + - tests/fixtures/review/clean_module.py + - --focus + - simplify + - --preview-fixes + - --fix + expect: + exit_code: 2 + stderr_contains: + - Cannot combine --preview-fixes with --fix + - name: mutation-proof-requires-simplify-focus + type: anti-pattern + argv: + - tests/fixtures/review/clean_module.py + - --with-mutation + expect: + exit_code: 2 + stderr_contains: + - Use --with-mutation only with --focus simplify diff --git a/tests/unit/specfact_code_review/review/test_commands.py b/tests/unit/specfact_code_review/review/test_commands.py index 6bb6131..0f0e726 100644 --- a/tests/unit/specfact_code_review/review/test_commands.py +++ b/tests/unit/specfact_code_review/review/test_commands.py @@ -41,9 +41,13 @@ def _fail_run_command(_files: list[Path], **_kwargs: object) -> tuple[int, str | assert "branch-delta Python files" in result.output assert "git diff --name-only ...HEAD" in result.output assert "Findings without guidance_kind are unguided advisories" in result.output + assert "Sort findings by guidance_kind before editing" in result.output assert "exact patch preview" in result.output assert "default to keep or skip" in result.output assert "specfact code review run --scope changed --focus simplify" in result.output + assert "cleanup_forecast" in result.output + assert "remediation_packet" in result.output + assert "not proof of AI authorship" in result.output def test_review_run_interactive_prompts_for_test_inclusion(monkeypatch: Any) -> None: diff --git a/tests/unit/specfact_code_review/run/test_cleanup_evidence.py b/tests/unit/specfact_code_review/run/test_cleanup_evidence.py new file mode 100644 index 0000000..5a89fb6 --- /dev/null +++ b/tests/unit/specfact_code_review/run/test_cleanup_evidence.py @@ -0,0 +1,66 @@ +from __future__ import annotations + +from pathlib import Path + +import pytest + +from specfact_code_review.run.cleanup_evidence import with_mutation_evidence, with_previewed_simplification_findings +from specfact_code_review.run.findings import ReviewFinding, ReviewReport + + +def _finding(file_path: Path) -> ReviewFinding: + return ReviewFinding( + category="ai_bloat", + severity="info", + tool="ast", + rule="ai-bloat.redundant-intermediate", + file=str(file_path), + line=2, + message="Simplify local code.", + fixable=True, + confidence="high", + rewrite_hint="Inline the temporary.", + canonical_pattern="one-use-temporary", + estimated_deletion_lines=1, + guidance_kind="safe_mechanical", + recommended_action="inline", + clean_code_principle="kiss", + rationale="The local variable is assigned once and returned.", + safety_checks=["same expression is returned"], + action_status="recommended", + ) + + +def _report(finding: ReviewFinding) -> ReviewReport: + return ReviewReport(run_id="review", score=90, findings=[finding], summary="Simplify") + + +def test_with_previewed_simplification_findings_records_patch_ref(tmp_path: Path) -> None: + source = tmp_path / "sample.py" + source.write_text( + "def total(values: list[int]) -> int:\n result = sum(values)\n return result\n", encoding="utf-8" + ) + finding = _finding(source) + + def _apply(report: ReviewReport) -> list[ReviewFinding]: + preview_path = Path(report.findings[0].file) + preview_path.write_text("def total(values: list[int]) -> int:\n return sum(values)\n", encoding="utf-8") + return [report.findings[0]] + + previewed = with_previewed_simplification_findings(_report(finding), [source], _apply) + + assert previewed.findings[0].remediation_packet is not None + assert previewed.findings[0].remediation_packet.patch_forecast_refs == [f"preview:{source}:2"] + assert source.read_text(encoding="utf-8").count("result") == 2 + + +def test_with_mutation_evidence_records_inconclusive_signal(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None: + source = tmp_path / "sample.py" + source.write_text("def total(values: list[int]) -> int:\n return sum(values)\n", encoding="utf-8") + monkeypatch.setattr("specfact_code_review.run.cleanup_evidence._mutation_tool_available", lambda: False) + + report = with_mutation_evidence(_report(_finding(source)), [source]) + + assert report.findings[0].signal_trace is not None + assert report.findings[0].signal_trace[-1].source == "mutation" + assert report.findings[0].signal_trace[-1].value == "inconclusive: mutmut unavailable" diff --git a/tests/unit/specfact_code_review/run/test_commands.py b/tests/unit/specfact_code_review/run/test_commands.py index e8bbee2..9f28906 100644 --- a/tests/unit/specfact_code_review/run/test_commands.py +++ b/tests/unit/specfact_code_review/run/test_commands.py @@ -312,6 +312,88 @@ def fake_run_review(files: list[Path], **kwargs: Any) -> ReviewReport: assert recorded == {"files": [package_file], "focus": "simplify"} +def test_run_command_rejects_preview_fixes_with_fix() -> None: + result = runner.invoke( + app, + [ + "review", + "run", + "--focus", + "simplify", + "--preview-fixes", + "--fix", + "tests/fixtures/review/clean_module.py", + ], + ) + + assert result.exit_code == 2 + assert "Cannot combine --preview-fixes with --fix" in result.output + + +def test_run_command_rejects_with_mutation_without_simplify_focus() -> None: + result = runner.invoke( + app, + ["review", "run", "--with-mutation", "tests/fixtures/review/clean_module.py"], + ) + + assert result.exit_code == 2 + assert "Use --with-mutation only with --focus simplify" in result.output + + +def test_preview_fixes_adds_patch_forecast_without_mutating_tracked_file(monkeypatch: Any, tmp_path: Path) -> None: + target = tmp_path / "sample.py" + source = "def total(values: list[int]) -> int:\n result = sum(values)\n return result\n" + target.write_text(source, encoding="utf-8") + report = _safe_mechanical_report(target, line=2, rule="ai-bloat.redundant-intermediate") + monkeypatch.setattr("specfact_code_review.run.commands.run_review", lambda files, **kwargs: report) + + exit_code, output = run_commands.run_command( + run_commands.ReviewRunRequest( + files=[target], + json_output=True, + out=tmp_path / "review-report.json", + focus_facets=("simplify",), + preview_fixes=True, + ) + ) + + assert exit_code == 1 + assert output == str(tmp_path / "review-report.json") + assert target.read_text(encoding="utf-8") == source + previewed = ReviewReport.model_validate_json((tmp_path / "review-report.json").read_text(encoding="utf-8")) + assert previewed.cleanup_forecast is not None + assert previewed.cleanup_forecast.preview_evidence_count == 1 + assert previewed.findings[0].remediation_packet is not None + assert previewed.findings[0].remediation_packet.patch_forecast_refs == [f"preview:{target}:2"] + + +def test_with_mutation_records_inconclusive_evidence_for_missing_tool(monkeypatch: Any, tmp_path: Path) -> None: + target = tmp_path / "sample.py" + target.write_text( + "def total(values: list[int]) -> int:\n result = sum(values)\n return result\n", encoding="utf-8" + ) + report = _safe_mechanical_report(target, line=2, rule="ai-bloat.redundant-intermediate") + monkeypatch.setattr("specfact_code_review.run.commands.run_review", lambda files, **kwargs: report) + monkeypatch.setattr("specfact_code_review.run.cleanup_evidence._mutation_tool_available", lambda: False) + + exit_code, output = run_commands.run_command( + run_commands.ReviewRunRequest( + files=[target], + json_output=True, + out=tmp_path / "review-report.json", + focus_facets=("simplify",), + with_mutation=True, + ) + ) + + assert exit_code == 1 + assert output == str(tmp_path / "review-report.json") + mutation_report = ReviewReport.model_validate_json((tmp_path / "review-report.json").read_text(encoding="utf-8")) + assert mutation_report.findings[0].signal_trace is not None + assert mutation_report.findings[0].signal_trace[-1].source == "mutation" + assert mutation_report.findings[0].signal_trace[-1].value == "inconclusive: mutmut unavailable" + + def test_apply_simplification_fixes_inlines_redundant_intermediate(tmp_path: Path) -> None: target = tmp_path / "sample.py" target.write_text( @@ -531,6 +613,8 @@ def test_run_review_once_applies_simplification_fixes_before_rerun(monkeypatch: no_tests=True, include_noise=False, fix=True, + preview_fixes=False, + with_mutation=False, progress_callback=None, bug_hunt=False, review_mode="enforce", diff --git a/tests/unit/specfact_code_review/run/test_findings.py b/tests/unit/specfact_code_review/run/test_findings.py index 847965c..3748d25 100644 --- a/tests/unit/specfact_code_review/run/test_findings.py +++ b/tests/unit/specfact_code_review/run/test_findings.py @@ -6,7 +6,19 @@ import pytest from pydantic import ValidationError -from specfact_code_review.run.findings import EvidenceRef, ReviewFinding, ReviewReport +from specfact_code_review.run.findings import ( + AiBloatIndex, + CleanupForecast, + DeletionEstimate, + EvidenceRef, + GuidanceKindForecast, + PreserveReasonEvidence, + RemediationPacket, + ReviewedLoc, + ReviewFinding, + ReviewReport, + SignalTraceEntry, +) class ReviewFindingPayload(TypedDict, total=False): @@ -49,6 +61,9 @@ class ReviewFindingPayload(TypedDict, total=False): before_ref: EvidenceRef after_ref: EvidenceRef improvement: str + signal_trace: list[SignalTraceEntry] + preserve_reasons: list[PreserveReasonEvidence] + remediation_packet: RemediationPacket def _finding_data(**overrides: Unpack[ReviewFindingPayload]) -> ReviewFindingPayload: @@ -138,6 +153,86 @@ def test_review_finding_accepts_guided_simplification_metadata() -> None: assert finding.is_safe_mechanical_simplification() +def test_review_finding_accepts_cleanup_handoff_metadata() -> None: + finding = ReviewFinding( + **_finding_data( + category="ai_bloat", + severity="info", + rule="ai-bloat.redundant-intermediate", + confidence="high", + rewrite_hint="Inline the one-use temporary into the return statement.", + canonical_pattern="one-use-temporary", + estimated_deletion_lines=1, + guidance_kind="safe_mechanical", + recommended_action="inline", + clean_code_principle="kiss", + rationale="The local variable is assigned once and read only by the following return.", + safety_checks=["same expression is returned", "temporary has no later reads"], + action_status="recommended", + signal_trace=[ + SignalTraceEntry( + tool="ast", + source="ai-bloat.redundant-intermediate", + fired=True, + score=1.0, + value="one-use-temporary", + evidence_refs=[EvidenceRef(path="src/example.py", start_line=12)], + explanation="AST pattern matched a one-use temporary.", + ) + ], + preserve_reasons=[ + PreserveReasonEvidence( + reason="public_api", + evidence_refs=[EvidenceRef(path="src/example.py", start_line=12)], + explanation="Exported in __all__.", + ) + ], + remediation_packet=RemediationPacket( + issue="One-use temporary can be inlined.", + recommended_action="inline", + possible_keep_reason="Keep only if readability would regress.", + safety_checks=["same expression is returned"], + validation_plan=["run targeted tests", "rerun simplify review"], + safe_to_autofix=False, + patch_forecast_refs=["preview:src/example.py:12"], + ), + ) + ) + + assert finding.signal_trace is not None + assert finding.signal_trace[0].tool == "ast" + assert finding.preserve_reasons is not None + assert finding.preserve_reasons[0].reason == "public_api" + assert finding.remediation_packet is not None + assert not finding.remediation_packet.safe_to_autofix + assert not finding.is_safe_mechanical_simplification() + + +def test_review_finding_rejects_unknown_preserve_reason() -> None: + with pytest.raises(ValidationError): + ReviewFinding( + **_finding_data( + category="ai_bloat", + severity="info", + guidance_kind="safe_mechanical", + recommended_action="inline", + clean_code_principle="kiss", + rationale="The local rewrite is safe.", + safety_checks=["same expression is returned"], + preserve_reasons=cast( + Any, + [ + { + "reason": "unknown_reason", + "evidence_refs": [EvidenceRef(path="src/example.py", start_line=12)], + "explanation": "Not in taxonomy.", + } + ], + ), + ) + ) + + def test_review_finding_accepts_guided_metadata_without_action_status() -> None: finding = ReviewFinding( **_finding_data( @@ -362,6 +457,47 @@ def test_review_report_uses_schema_1_2_and_summary_when_guided_metadata_is_prese assert report.simplification_summary.blocking_simplification_count == 1 +def test_review_report_uses_schema_1_3_when_cleanup_forecast_is_present() -> None: + report = ReviewReport( + run_id="run-cleanup-forecast", + timestamp=datetime(2026, 5, 24, tzinfo=UTC), + score=85, + findings=[], + summary="Cleanup forecast.", + cleanup_forecast=CleanupForecast( + reviewed_loc=ReviewedLoc(production=80, tests=20, total=100), + estimated_deletion_lines=DeletionEstimate(low=2, expected=5, high=8), + ai_bloat_index=AiBloatIndex( + findings_per_kloc=20.0, + weighted_bloat_points_per_kloc=16.0, + cleanup_yield_loc_per_kloc=50.0, + ), + by_guidance_kind={ + "safe_mechanical": GuidanceKindForecast(count=2, estimated_deletion_lines=2), + "needs_tests": GuidanceKindForecast(count=1, estimated_deletion_lines=5), + }, + by_action_status={"recommended": 3}, + ), + ) + + assert report.schema_version == "1.3" + assert report.cleanup_forecast is not None + assert report.cleanup_forecast.ai_bloat_index.weighted_bloat_points_per_kloc == 16.0 + + +def test_reviewed_loc_rejects_total_mismatch() -> None: + with pytest.raises(ValidationError, match=r"reviewed_loc.total must equal production \+ tests"): + ReviewedLoc(production=80, tests=20, total=90) + + +def test_deletion_estimate_rejects_inverted_range() -> None: + with pytest.raises(ValidationError, match="estimated_deletion_lines must satisfy low <= expected <= high"): + DeletionEstimate(low=6, expected=5, high=10) + + with pytest.raises(ValidationError, match="estimated_deletion_lines must satisfy low <= expected <= high"): + DeletionEstimate(low=1, expected=5, high=4) + + def test_review_report_counts_failed_safe_mechanical_findings_as_blocking() -> None: report = ReviewReport( run_id="run-guided-simplify", diff --git a/tests/unit/specfact_code_review/run/test_forecast.py b/tests/unit/specfact_code_review/run/test_forecast.py new file mode 100644 index 0000000..e41133c --- /dev/null +++ b/tests/unit/specfact_code_review/run/test_forecast.py @@ -0,0 +1,67 @@ +from __future__ import annotations + +from pathlib import Path +from typing import Literal, cast + +from specfact_code_review.run.findings import ReviewFinding +from specfact_code_review.run.forecast import build_cleanup_forecast + + +def _finding(*, guidance_kind: str, deletion_lines: int) -> ReviewFinding: + return ReviewFinding( + category="ai_bloat", + severity="info", + tool="ast", + rule="ai-bloat.redundant-intermediate", + file="src/example.py", + line=1, + message="Simplify local code.", + confidence="high", + rewrite_hint="Inline the temporary.", + canonical_pattern="one-use-temporary", + estimated_deletion_lines=deletion_lines, + guidance_kind=cast( + Literal["safe_mechanical", "needs_tests", "design_judgment", "preserve"], + guidance_kind, + ), + recommended_action="keep" if guidance_kind == "preserve" else "inline", + clean_code_principle="api_stability" if guidance_kind == "preserve" else "kiss", + rationale="The local variable is assigned once and returned.", + safety_checks=["same expression is returned"], + preserve_reason="Public compatibility boundary." if guidance_kind == "preserve" else None, + action_status="recommended", + ) + + +def test_build_cleanup_forecast_counts_loc_and_weighted_bloat(tmp_path: Path) -> None: + source = tmp_path / "src" / "example.py" + source.parent.mkdir() + source.write_text("# comment\n\nvalue = 1\nprint(value)\n", encoding="utf-8") + test_file = tmp_path / "tests" / "test_example.py" + test_file.parent.mkdir() + test_file.write_text("def test_example():\n assert True\n", encoding="utf-8") + + forecast = build_cleanup_forecast( + [ + _finding(guidance_kind="safe_mechanical", deletion_lines=2), + _finding(guidance_kind="preserve", deletion_lines=5), + ], + [source, test_file], + ) + + assert forecast.reviewed_loc.production == 2 + assert forecast.reviewed_loc.tests == 2 + assert forecast.estimated_deletion_lines.low == 2 + assert forecast.estimated_deletion_lines.high == 2 + assert forecast.by_guidance_kind["preserve"].estimated_deletion_lines == 5 + assert forecast.ai_bloat_index.weighted_bloat_points_per_kloc == 250.0 + + +def test_build_cleanup_forecast_skips_undecodable_python_files(tmp_path: Path) -> None: + source = tmp_path / "legacy.py" + source.write_bytes(b"\xff\xfe\x00") + + forecast = build_cleanup_forecast([_finding(guidance_kind="safe_mechanical", deletion_lines=2)], [source]) + + assert forecast.reviewed_loc.total == 0 + assert forecast.estimated_deletion_lines.expected == 2 diff --git a/tests/unit/specfact_code_review/run/test_runner.py b/tests/unit/specfact_code_review/run/test_runner.py index 061b5b7..4db4420 100644 --- a/tests/unit/specfact_code_review/run/test_runner.py +++ b/tests/unit/specfact_code_review/run/test_runner.py @@ -7,11 +7,13 @@ from pathlib import Path from typing import Literal +import pytest from pytest import MonkeyPatch from specfact_code_review.run.findings import ReviewFinding, ReviewReport from specfact_code_review.run.runner import ( _coverage_findings, + _preserve_reasons_for_finding, _pytest_python_executable, _pytest_targets, _run_pytest_with_coverage, @@ -226,7 +228,7 @@ def test_run_review_simplify_focus_keeps_only_simplification_queue(monkeypatch: ("ai_bloat", "high"), ("dry", "high"), ] - assert report.schema_version == "1.1" + assert report.schema_version == "1.3" assert report.overall_verdict == "PASS" @@ -256,11 +258,62 @@ def test_run_review_simplify_enforce_fails_only_safe_mechanical_recommendations( review_mode="enforce", ) - assert report.schema_version == "1.2" + assert report.schema_version == "1.3" assert report.overall_verdict == "FAIL" assert report.ci_exit_code == 1 assert report.simplification_summary is not None assert report.simplification_summary.blocking_simplification_count == 1 + assert report.cleanup_forecast is not None + assert report.cleanup_forecast.by_guidance_kind["safe_mechanical"].count == 1 + assert report.cleanup_forecast.by_guidance_kind["needs_tests"].count == 1 + assert report.cleanup_forecast.by_guidance_kind["preserve"].count == 1 + assert report.cleanup_forecast.ai_bloat_index.weighted_bloat_points_per_kloc >= 0.0 + + +def test_run_review_simplify_forecast_counts_loc_and_weighted_bloat(monkeypatch: MonkeyPatch, tmp_path: Path) -> None: + source = tmp_path / "src/example.py" + source.parent.mkdir(parents=True) + source.write_text( + "def one() -> int:\n" + " value = 1\n" + " return value\n" + "\n" + "# ignored comment\n" + "def two() -> bool:\n" + " if True:\n" + " return True\n" + " return False\n", + encoding="utf-8", + ) + test_file = tmp_path / "tests/test_example.py" + test_file.parent.mkdir(parents=True) + test_file.write_text("def test_example() -> None:\n assert True\n", encoding="utf-8") + safe = _simplification_finding(category="ai_bloat", guidance_kind="safe_mechanical") + needs_tests = _simplification_finding(category="ai_bloat", guidance_kind="needs_tests") + design = _simplification_finding(category="ai_bloat", guidance_kind="design_judgment") + preserve = _simplification_finding(category="ai_bloat", guidance_kind="preserve") + monkeypatch.setattr("specfact_code_review.run.runner.run_ruff", lambda files: []) + monkeypatch.setattr("specfact_code_review.run.runner.run_radon", lambda files: []) + monkeypatch.setattr("specfact_code_review.run.runner.run_semgrep", lambda files: []) + monkeypatch.setattr("specfact_code_review.run.runner.run_semgrep_bugs", lambda files: []) + monkeypatch.setattr("specfact_code_review.run.runner.run_ai_bloat", lambda files: [safe, needs_tests]) + monkeypatch.setattr("specfact_code_review.run.runner.run_ast_clean_code", lambda files: [design, preserve]) + monkeypatch.setattr("specfact_code_review.run.runner.run_basedpyright", lambda files: []) + monkeypatch.setattr("specfact_code_review.run.runner.run_pylint", lambda files: []) + monkeypatch.setattr("specfact_code_review.run.runner.run_contract_check", lambda files, **_: []) + monkeypatch.setattr("specfact_code_review.run.runner._evaluate_tdd_gate", lambda files: ([], None)) + + report = run_review([source, test_file], no_tests=True, focus="simplify") + + assert report.cleanup_forecast is not None + assert report.cleanup_forecast.reviewed_loc.production == 7 + assert report.cleanup_forecast.reviewed_loc.tests == 2 + assert report.cleanup_forecast.estimated_deletion_lines.low == 3 + assert report.cleanup_forecast.estimated_deletion_lines.expected == 6 + assert report.cleanup_forecast.estimated_deletion_lines.high == 9 + assert report.cleanup_forecast.ai_bloat_index.findings_per_kloc == pytest.approx(444.444, abs=0.001) + assert report.cleanup_forecast.ai_bloat_index.weighted_bloat_points_per_kloc == pytest.approx(205.556, abs=0.001) + assert report.cleanup_forecast.ai_bloat_index.cleanup_yield_loc_per_kloc == pytest.approx(666.667, abs=0.001) def test_run_review_simplify_enforce_passes_design_and_preserve_guidance(monkeypatch: MonkeyPatch) -> None: @@ -293,6 +346,101 @@ def test_run_review_simplify_enforce_passes_design_and_preserve_guidance(monkeyp assert report.simplification_summary.blocking_simplification_count == 0 +def test_preserve_detection_covers_contract_public_protocol_cli_compat_and_load_bearing(tmp_path: Path) -> None: + source = tmp_path / "api.py" + source_text = ( + "from typing import Protocol\n" + "import typer\n" + "from abc import ABC, abstractmethod\n" + "\n" + "__all__ = ['exported']\n" + "app = typer.Typer()\n" + "\n" + "@icontract.require(lambda value: value > 0)\n" + "def contracted(value: int) -> int:\n" + " return value\n" + "\n" + "def exported() -> None:\n" + " return None\n" + "\n" + "class Handler(Protocol):\n" + " def handle(self, payload: str) -> str: ...\n" + "\n" + "class BaseHandler(ABC):\n" + " @abstractmethod\n" + " def abstract_handle(self, payload: str) -> str:\n" + " raise NotImplementedError\n" + "\n" + " def concrete_helper(self, payload: str) -> str:\n" + " result = payload.strip()\n" + " return result\n" + "\n" + "@app.command()\n" + "def cli_main() -> None:\n" + " return None\n" + "\n" + "# specfact: preserve(compat)\n" + "def shim() -> None:\n" + " return None\n" + ) + source.write_text(source_text, encoding="utf-8") + + def line_containing(text: str) -> int: + return next(index for index, line in enumerate(source_text.splitlines(), start=1) if text in line) + + finding_lines = { + "contract_lambda": line_containing("return value"), + "public_api": line_containing("return None"), + "protocol_member": line_containing("def handle"), + "cli_callback": line_containing("def cli_main"), + "compat_shim": line_containing("def shim"), + } + + for expected_reason, line in finding_lines.items(): + finding = _simplification_finding(category="ai_bloat", guidance_kind="safe_mechanical").model_copy( + update={"file": str(source), "line": line} + ) + reasons = _preserve_reasons_for_finding(finding, load_bearing=False) + assert expected_reason in {reason.reason for reason in reasons} + + abstract_finding = _simplification_finding(category="ai_bloat", guidance_kind="safe_mechanical").model_copy( + update={"file": str(source), "line": line_containing("raise NotImplementedError")} + ) + abstract_reasons = _preserve_reasons_for_finding(abstract_finding, load_bearing=False) + assert "protocol_member" in {reason.reason for reason in abstract_reasons} + + concrete_finding = _simplification_finding(category="ai_bloat", guidance_kind="safe_mechanical").model_copy( + update={"file": str(source), "line": line_containing("return result")} + ) + concrete_reasons = _preserve_reasons_for_finding(concrete_finding, load_bearing=False) + assert "protocol_member" not in {reason.reason for reason in concrete_reasons} + + load_bearing_finding = _simplification_finding(category="ai_bloat", guidance_kind="safe_mechanical").model_copy( + update={"file": str(source), "line": line_containing("def exported")} + ) + reasons = _preserve_reasons_for_finding(load_bearing_finding, load_bearing=True) + assert "load_bearing" in {reason.reason for reason in reasons} + + +def test_preserve_detection_treats_docstring_only_protocol_method_as_stub(tmp_path: Path) -> None: + source = tmp_path / "api.py" + source_text = ( + "from typing import Protocol\n" + "\n" + "class Handler(Protocol):\n" + " def handle(self, payload: str) -> str:\n" + ' """Handle the payload."""\n' + ) + source.write_text(source_text, encoding="utf-8") + + finding = _simplification_finding(category="ai_bloat", guidance_kind="safe_mechanical").model_copy( + update={"file": str(source), "line": 4} + ) + reasons = _preserve_reasons_for_finding(finding, load_bearing=False) + + assert "protocol_member" in {reason.reason for reason in reasons} + + def test_run_review_simplify_focus_preserves_tool_errors(monkeypatch: MonkeyPatch) -> None: monkeypatch.setattr("specfact_code_review.run.runner.run_ruff", lambda files: []) monkeypatch.setattr("specfact_code_review.run.runner.run_radon", lambda files: [])