Description
Problem Statement
The completed task list is not cross-referenced against the actual codebase after /speckit.implement.
After completing ~770 structured tasks with spec-kit /implement, I found 2 tasks marked [X] complete whose corresponding code had never been written. Each was discovered only when the feature misbehaved during testing. The point is not to review the quality of the implementation, but to verify that the tasks were actually done.
The two examples:
- A task specified "Create outcome classification types in outcomeTypes.ts." The file was never created. The types were silently folded into a different module.
- A task specified "Modify patternDetector.ts to filter out turns with operationalFailures." The file was never modified. The filtering was applied elsewhere but never extended to pattern detection.
I'm calling these phantom completions: the agent marks a task done because the [X] token is the statistically favored continuation in a list of completed tasks, not because it verified that the work exists in the filesystem.
Proposed Solution
Add an optional command, /speckit.verify-tasks, to run after implement (similar to how analyze is an optional step). The command would check each [X]-marked task against file existence, git diffs, and expected patterns in the codebase, flagging gaps before they reach code review or production. It would produce a verify-tasks-report.md in the specs branch directory, then walk through the report so that each unverified completed task can be resolved as Fix, Skip, or Investigate.
I'm building this as a spec-kit extension: https://github.com/datastone-inc/speckit-verify-tasks. A detailed writeup of the phantom completion phenomenon and its causes: https://datastone.ca/blog/task-phantom-completions-ai-assisted-development/.
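To make the file-existence part of the proposed check concrete, here is a minimal sketch. It is not the extension's implementation. The task line shape (`- [X] T001 ... path/to/file.ts`), the path-matching heuristic, and the names `verify_tasks` and `Finding` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from pathlib import Path

# Assumed task line shape: "- [X] T012 Create types in src/outcomeTypes.ts"
TASK_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<mark>[xX ])\]\s*(?P<id>T\d+)?\s*(?P<text>.*)$"
)
# Assumed heuristic: a token ending in a dot plus a short extension is a file path.
# This can produce false positives on abbreviations such as "e.g".
PATH_RE = re.compile(r"[\w./-]+\.[A-Za-z]{1,5}\b")


@dataclass
class Finding:
    task_id: str
    path: str
    status: str  # "ok" or "missing"


def completed_tasks(tasks_md: str):
    """Yield (task_id, text) for every task marked [X] or [x]."""
    for line in tasks_md.splitlines():
        m = TASK_RE.match(line)
        if m and m.group("mark").lower() == "x":
            yield m.group("id") or "?", m.group("text")


def verify_tasks(tasks_md: str, repo_root: Path) -> list[Finding]:
    """Check that each file named by a completed task exists under repo_root."""
    findings = []
    for task_id, text in completed_tasks(tasks_md):
        for rel in PATH_RE.findall(text):
            exists = (repo_root / rel).is_file()
            findings.append(Finding(task_id, rel, "ok" if exists else "missing"))
    return findings
```

An existence check like this would flag the first example above (outcomeTypes.ts was never created) but not the second, where patternDetector.ts already existed and was simply never modified. Catching that case needs the git diff and expected-pattern checks as well.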
Alternatives Considered
No response
Component
Specify CLI (initialization, commands)
AI Agent (if applicable)
None
Use Cases
Valuable for improving the quality of long-running projects in which many tasks are implemented.
Acceptance Criteria
The [X] completed tasks are verified as backed by real implementation.
Additional Context
There are other issues and extensions "circling" this problem, but I think this feature drills down to a specific, real gap:
- https://github.com/ismaelJimenez/spec-kit-verify - broader-scope verification that can miss phantom completions
- [Feature]: Add /speckit.verify Command for Implementation Validation #1745 - appears to be resolved by the extension above
- "End of task guardrails" are routinely skipped as context fills up #847 - context-window degradation causing guardrails to be skipped (related but different mechanism). verify-tasks partially addresses this by suggesting that it be run with a clear context (see my repo).
- Add Post-Implementation Debugging and Fixing Workflow #442 - Post-implementation debugging workflow (addresses visible failures, not silent phantom completions)