140 | 140 | - TypeScript tasks use the `.test.ts` extension (Jest convention) and specify the timeout in milliseconds (60000), not seconds |
141 | 141 | - The `npx jest --testTimeout` flag sets the default per-test timeout in milliseconds |
142 | 142 | --- |
| 143 | + |
| 144 | +## 2026-02-16 - US-008 |
| 145 | +- Registered 9 navprove tasks in `configs/selected_benchmark_tasks.json` |
| 146 | +- All entries have: benchmark=ccb_navprove, sdlc_phase=Debugging, category=navigation_verified, difficulty=hard |
| 147 | +- Languages: 5 python (qutebrowser×4, ansible), 3 go (teleport, vuls, flipt), 1 typescript (tutanota) |
| 148 | +- Repos: qutebrowser/qutebrowser, ansible/ansible, gravitational/teleport, future-architect/vuls, flipt-io/flipt, tutao/tutanota |
| 149 | +- MCP benefit scores: qutebrowser=0.83, ansible=0.86, teleport=0.85, vuls=0.77, flipt=0.80, tutanota=0.82 |
| 150 | +- Updated statistics: total_selected 166→175, Debugging 3→12, go 41→44, python 45→50, typescript 10→11 |
| 151 | +- Added ccb_navprove to tasks_per_benchmark and selection_targets |
| 152 | +- Files changed: configs/selected_benchmark_tasks.json |
| 153 | +- **Learnings for future iterations:** |
| 154 | + - The JSON field is `task_id` (not `task_name`) in selected_benchmark_tasks.json entries |
| 155 | + - MCP scores are a weighted sum: cc×0.25 + cfd×0.30 + ssp×0.20 + tcw×0.25; tcw=0.80 for search-heavy navprove tasks |
| 156 | + - Statistics section needs manual update: total_selected, tasks_per_sdlc_phase, tasks_per_benchmark, tasks_per_language |
| 157 | + - Always validate with `json.load()` after editing; a single trailing comma breaks the file |
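The last two learnings (manual statistics updates and JSON validation) could be checked together with a small script. This is a minimal sketch, not the project's actual tooling: the top-level `tasks` and `statistics` keys and the per-entry `language` field are assumptions about the file layout, and the function names are hypothetical. Only the four statistic names and the `benchmark` and `sdlc_phase` fields come from the notes above.

```python
import json
from collections import Counter


def recompute_statistics(tasks):
    """Recount the four statistics that the log says need manual updates."""
    return {
        "total_selected": len(tasks),
        "tasks_per_sdlc_phase": dict(Counter(t["sdlc_phase"] for t in tasks)),
        "tasks_per_benchmark": dict(Counter(t["benchmark"] for t in tasks)),
        # ASSUMPTION: each entry carries a "language" field.
        "tasks_per_language": dict(Counter(t["language"] for t in tasks)),
    }


def check_config_text(text):
    """Parse the config text and return the names of stale statistics.

    json.loads raises json.JSONDecodeError on syntax errors such as a
    trailing comma, so a successful return also confirms the file parses.
    """
    data = json.loads(text)
    # ASSUMPTION: entries live under "tasks", counts under "statistics".
    expected = recompute_statistics(data["tasks"])
    recorded = data.get("statistics", {})
    return [key for key, value in expected.items() if recorded.get(key) != value]
```

Reading `configs/selected_benchmark_tasks.json` and passing its contents to `check_config_text` after each edit would surface both a broken file and a forgotten counter, provided the assumed layout matches the real one.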
| 158 | +--- |
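The MCP score formula recorded in the US-008 learnings reads as a weighted sum of four components. A minimal sketch of that reading follows. The weights are taken from the log; the interpretation of `cc(0.25)` as "component times weight", the function name, and the sample component values are assumptions for illustration only.

```python
# Weights as recorded in the log; they sum to 1.0.
WEIGHTS = {"cc": 0.25, "cfd": 0.30, "ssp": 0.20, "tcw": 0.25}


def mcp_score(components):
    """Return the weighted sum of the four component scores.

    `components` maps each of cc, cfd, ssp, tcw to a value, assumed
    here to lie in [0, 1].
    """
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical inputs; only tcw=0.80 is taken from the log.
example = mcp_score({"cc": 0.8, "cfd": 0.9, "ssp": 0.8, "tcw": 0.80})
print(f"{example:.2f}")
```

Keeping the weights in one dictionary makes it easy to confirm that they still sum to 1.0 if the formula is revised later.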