chore: update PRD and progress for US-008 completion

LoCoBench Bot · claude · LoCoBench Bot · commit 95683eba4720 · 2026-02-16T20:48:24.000Z
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/ralph-navprove/prd.json b/ralph-navprove/prd.json
@@ -135,7 +135,7 @@
         "json.load(open('configs/selected_benchmark_tasks.json')) succeeds without error (valid JSON)"
       ],
       "priority": 8,
-      "passes": false,
+      "passes": true,
       "notes": "MCP benefit scores: qutebrowser tasks should be high (large Python codebase, multi-file bugs). Use formula cc(0.25)+cfd(0.30)+ssp(0.20)+tcw(0.25). tcw=0.8 for navprove (heavy search needed). Estimate cc/cfd/ssp per-repo based on codebase size and file count."
     }
   ]
diff --git a/ralph-navprove/progress.txt b/ralph-navprove/progress.txt
@@ -140,3 +140,19 @@
   - TypeScript tasks use `.test.ts` extension (Jest convention), timeout in ms (60000) not seconds
   - npx jest --timeout flag sets per-test timeout in milliseconds
 ---
+
+## 2026-02-16 - US-008
+- Registered 9 navprove tasks in `configs/selected_benchmark_tasks.json`
+- All entries have: benchmark=ccb_navprove, sdlc_phase=Debugging, category=navigation_verified, difficulty=hard
+- Languages: 5 python (qutebrowser×4, ansible), 3 go (teleport, vuls, flipt), 1 typescript (tutanota)
+- Repos: qutebrowser/qutebrowser, ansible/ansible, gravitational/teleport, future-architect/vuls, flipt-io/flipt, tutao/tutanota
+- MCP benefit scores: qb=0.83, ansible=0.86, teleport=0.85, vuls=0.77, flipt=0.80, tutanota=0.82
+- Updated statistics: total_selected 166→175, Debugging 3→12, go 41→44, python 45→50, typescript 10→11
+- Added ccb_navprove to tasks_per_benchmark and selection_targets
+- Files changed: configs/selected_benchmark_tasks.json
+- **Learnings for future iterations:**
+  - The JSON field is `task_id` (not `task_name`) in selected_benchmark_tasks.json entries
+  - MCP scores use formula cc(0.25)+cfd(0.30)+ssp(0.20)+tcw(0.25); tcw=0.80 for search-heavy navprove tasks
+  - Statistics section needs manual update: total_selected, tasks_per_sdlc_phase, tasks_per_benchmark, tasks_per_language
+  - Always validate with `json.load()` after editing — single trailing comma breaks the file
+---
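
The weighted formula quoted in the notes, cc(0.25)+cfd(0.30)+ssp(0.20)+tcw(0.25), can be sketched as a small helper. This is only an illustration: the function name `mcp_benefit_score`, the assumption that each component lies in [0, 1], and the example component values are hypothetical and are not the per-repo estimates behind the scores recorded in this commit. Only the weights and tcw=0.80 come from the notes.

```python
# Weights copied from the notes in this commit. The abbreviations
# (cc, cfd, ssp, tcw) are left unexpanded because the source does not define them.
WEIGHTS = {"cc": 0.25, "cfd": 0.30, "ssp": 0.20, "tcw": 0.25}


def mcp_benefit_score(components):
    """Return the weighted sum of the four component estimates.

    Assumption (not stated in the source): every component is in [0, 1].
    """
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside [0, 1]")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical component estimates; only tcw=0.80 is taken from the notes.
example = {"cc": 0.6, "cfd": 0.9, "ssp": 0.5, "tcw": 0.80}
score = mcp_benefit_score(example)
print(f"{score:.2f}")
```

Because the four weights sum to 1.0, the result stays in the same [0, 1] range as the inputs, which makes scores comparable across repos.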

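
The last two learnings (manual statistics updates and validating with `json.load()`) suggest a small consistency check. The sketch below runs on an in-memory sample rather than the real `configs/selected_benchmark_tasks.json`. The top-level keys `tasks` and `statistics`, the per-task `language` field, and the sample task IDs are assumptions made for illustration; the names `task_id`, `benchmark`, `sdlc_phase`, `total_selected`, `tasks_per_sdlc_phase`, `tasks_per_benchmark`, and `tasks_per_language` come from the progress notes.

```python
import json
from collections import Counter


def check_statistics(doc):
    """Return a list of mismatches between recorded statistics and task counts."""
    tasks = doc["tasks"]        # assumed top-level key
    stats = doc["statistics"]   # assumed top-level key
    problems = []
    if stats["total_selected"] != len(tasks):
        problems.append(
            f"total_selected: recorded {stats['total_selected']}, actual {len(tasks)}"
        )
    # Each statistics table is compared against a recount of one task field.
    for stat_key, field in (
        ("tasks_per_benchmark", "benchmark"),
        ("tasks_per_sdlc_phase", "sdlc_phase"),
        ("tasks_per_language", "language"),  # assumed field name
    ):
        actual = dict(Counter(task[field] for task in tasks))
        if stats[stat_key] != actual:
            problems.append(f"{stat_key}: recorded {stats[stat_key]}, actual {actual}")
    return problems


# Hypothetical sample in which total_selected was not bumped after adding a task.
sample = """
{
  "statistics": {
    "total_selected": 1,
    "tasks_per_benchmark": {"ccb_navprove": 2},
    "tasks_per_sdlc_phase": {"Debugging": 2},
    "tasks_per_language": {"python": 1, "go": 1}
  },
  "tasks": [
    {"task_id": "example-task-1", "benchmark": "ccb_navprove",
     "sdlc_phase": "Debugging", "language": "python"},
    {"task_id": "example-task-2", "benchmark": "ccb_navprove",
     "sdlc_phase": "Debugging", "language": "go"}
  ]
}
"""
problems = check_statistics(json.loads(sample))
print(problems)

# A single trailing comma is enough to make the standard parser reject the input.
try:
    json.loads('{"passes": true,}')
    trailing_comma_accepted = True
except json.JSONDecodeError:
    trailing_comma_accepted = False
print(trailing_comma_accepted)
```

Running a check like this after each edit would catch both a stale counter and a syntax slip before the file is committed.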