Commit 95683eb

LoCoBench Bot and claude committed
chore: update PRD and progress for US-008 completion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9cdbae5 commit 95683eb

2 files changed: +17 -1 lines changed

ralph-navprove/prd.json

Lines changed: 1 addition & 1 deletion
@@ -135,7 +135,7 @@
 "json.load(open('configs/selected_benchmark_tasks.json')) succeeds without error (valid JSON)"
 ],
 "priority": 8,
-"passes": false,
+"passes": true,
 "notes": "MCP benefit scores: qutebrowser tasks should be high (large Python codebase, multi-file bugs). Use formula cc(0.25)+cfd(0.30)+ssp(0.20)+tcw(0.25). tcw=0.8 for navprove (heavy search needed). Estimate cc/cfd/ssp per-repo based on codebase size and file count."
 }
 ]
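
The `notes` field above describes the MCP benefit score as a weighted sum. A minimal sketch of that arithmetic follows. The weights and the default `tcw=0.80` come from the note; the function name, the assumption that each component lies in [0, 1], the rounding to two decimals, and the sample component values are all hypothetical and are not taken from the repository.

```python
# Weights as stated in the PRD note: cc(0.25)+cfd(0.30)+ssp(0.20)+tcw(0.25).
WEIGHTS = {"cc": 0.25, "cfd": 0.30, "ssp": 0.20, "tcw": 0.25}


def mcp_benefit_score(cc, cfd, ssp, tcw=0.80):
    """Return the weighted sum of the four components, rounded to 2 decimals.

    Assumption (not stated in the source): every component is in [0, 1].
    """
    components = {"cc": cc, "cfd": cfd, "ssp": ssp, "tcw": tcw}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    total = sum(WEIGHTS[name] * value for name, value in components.items())
    return round(total, 2)


if __name__ == "__main__":
    # Hypothetical per-repo estimates; tcw keeps the navprove default of 0.80.
    # 0.25*0.8 + 0.30*0.9 + 0.20*0.8 + 0.25*0.8 = 0.83
    print(mcp_benefit_score(cc=0.8, cfd=0.9, ssp=0.8))
```

Because the weights sum to 1.0, the score stays in the same [0, 1] range as its inputs, which keeps scores comparable across repos.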

ralph-navprove/progress.txt

Lines changed: 16 additions & 0 deletions
@@ -140,3 +140,19 @@
 - TypeScript tasks use `.test.ts` extension (Jest convention), timeout in ms (60000) not seconds
 - npx jest --timeout flag sets per-test timeout in milliseconds
 ---
+
+## 2026-02-16 - US-008
+- Registered 9 navprove tasks in `configs/selected_benchmark_tasks.json`
+- All entries have: benchmark=ccb_navprove, sdlc_phase=Debugging, category=navigation_verified, difficulty=hard
+- Languages: 5 python (qutebrowser×4, ansible), 3 go (teleport, vuls, flipt), 1 typescript (tutanota)
+- Repos: qutebrowser/qutebrowser, ansible/ansible, gravitational/teleport, future-architect/vuls, flipt-io/flipt, tutao/tutanota
+- MCP benefit scores: qb=0.83, ansible=0.86, teleport=0.85, vuls=0.77, flipt=0.80, tutanota=0.82
+- Updated statistics: total_selected 166→175, Debugging 3→12, go 41→44, python 45→50, typescript 10→11
+- Added ccb_navprove to tasks_per_benchmark and selection_targets
+- Files changed: configs/selected_benchmark_tasks.json
+- **Learnings for future iterations:**
+- The JSON field is `task_id` (not `task_name`) in selected_benchmark_tasks.json entries
+- MCP scores use formula cc(0.25)+cfd(0.30)+ssp(0.20)+tcw(0.25); tcw=0.80 for search-heavy navprove tasks
+- Statistics section needs manual update: total_selected, tasks_per_sdlc_phase, tasks_per_benchmark, tasks_per_language
+- Always validate with `json.load()` after editing — single trailing comma breaks the file
+---
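
The last two learnings (manual statistics updates and validating after every edit) could be supported by a small helper along these lines. This is a sketch under stated assumptions, not the project's tooling. The statistics key names come from the log entry above, but the function names, the `language` key on each task entry, and the idea that tasks are available as a flat list of dicts are hypothetical.

```python
import json
from collections import Counter


def load_validated(text):
    """Parse JSON text, reporting where a syntax error (e.g. a trailing comma) is."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
        ) from exc


def recompute_statistics(tasks):
    """Derive the counters that the log says are otherwise updated by hand.

    Assumes each task is a dict with `sdlc_phase`, `benchmark`, and
    `language` keys; the `language` key name is a guess.
    """
    return {
        "total_selected": len(tasks),
        "tasks_per_sdlc_phase": dict(Counter(t["sdlc_phase"] for t in tasks)),
        "tasks_per_benchmark": dict(Counter(t["benchmark"] for t in tasks)),
        "tasks_per_language": dict(Counter(t["language"] for t in tasks)),
    }


if __name__ == "__main__":
    # Hypothetical two-entry sample; the task_id values are placeholders.
    sample = """[
      {"task_id": "example-1", "benchmark": "ccb_navprove",
       "sdlc_phase": "Debugging", "language": "python"},
      {"task_id": "example-2", "benchmark": "ccb_navprove",
       "sdlc_phase": "Debugging", "language": "go"}
    ]"""
    print(recompute_statistics(load_validated(sample)))

    # A single trailing comma is enough to make the parse fail.
    try:
        load_validated('{"tasks": [1, 2,]}')
    except ValueError as err:
        print(err)
```

Comparing the recomputed counters against the stored statistics section after each edit would catch a missed manual update such as `total_selected` staying at 166 instead of 175.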
