Commit 44045f1
fix: trace audit — fix 3 broken verifiers, MANIFEST model fallback, drop 2 TAC tasks
- navidrome test.sh: pytest → go test (Go/Ginkgo project)
- nodebb-notif/nodebb-plugin test.sh: pytest → npx mocha + Mocha reward parsing
- openlibrary Dockerfile.sg_only: pre-install Node.js 22 + Claude Code (sweap-images Node 16 broken)
- generate_manifest.py: model extraction falls back to result.json when config.json missing
(fixes ccb_feature/ccb_refactor incorrectly showing opus instead of haiku)
- Drop 2 llamacpp TAC tasks (need external RocketChat server, incompatible with benchmark)
- selected_benchmark_tasks.json: 414 → 412 tasks, ccb_test 20 → 18
- Add handoff doc for rerun setup (local Docker + Daytona)
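
The Mocha reward parsing mentioned for the NodeBB tasks is not shown in this commit view. As a rough sketch only: Mocha's default spec reporter ends with summary lines such as `3 passing (12ms)` and `1 failing`, so a verifier could derive a reward from those counts. The function name `mocha_reward` and the pass-fraction formula are assumptions for illustration, not the repository's actual test.sh logic.

```python
import re

# Summary lines printed by Mocha's spec reporter, e.g. "  3 passing (12ms)".
PASSING = re.compile(r"^\s*(\d+) passing\b", re.MULTILINE)
FAILING = re.compile(r"^\s*(\d+) failing\b", re.MULTILINE)


def mocha_reward(output: str) -> float:
    """Hypothetical reward: fraction of passing tests, 0.0 if no summary is found."""
    passing = PASSING.search(output)
    failing = FAILING.search(output)
    n_pass = int(passing.group(1)) if passing else 0
    n_fail = int(failing.group(1)) if failing else 0
    total = n_pass + n_fail
    # An empty or unparseable run yields no reward rather than a crash.
    return n_pass / total if total else 0.0
```

Returning 0.0 when no summary line is present means a run where Mocha never started is scored as a failure instead of being silently skipped.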
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent e100ad4 commit 44045f1
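
The generate_manifest.py fallback described in the commit message (use result.json when config.json is missing) could look roughly like the sketch below. The function name `extract_model` and the `"model"` key are assumptions for illustration; the real script's field names and layout are not visible in this commit view.

```python
import json
from pathlib import Path
from typing import Optional


def extract_model(run_dir: Path) -> Optional[str]:
    """Return the model recorded for a run, preferring config.json over result.json."""
    for name in ("config.json", "result.json"):
        path = run_dir / name
        if not path.is_file():
            continue  # fall through to the next candidate file
        try:
            data = json.loads(path.read_text())
        except json.JSONDecodeError:
            continue  # treat an unreadable file like a missing one
        # "model" is a hypothetical key name used for this sketch.
        model = data.get("model") if isinstance(data, dict) else None
        if model:
            return model
    return None
```

Returning `None` instead of a default model name keeps a missing value visible, which avoids the kind of mislabeling the commit describes (runs shown as opus instead of haiku).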
File tree
7 files changed
+601 -696 lines changed
- benchmarks/ccb_fix
  - nodebb-notif-dropdown-fix-001/tests
  - nodebb-plugin-validate-fix-001/tests
  - openlibrary-solr-boolean-fix-001/environment
- configs
- docs/ops
- scripts
Lines changed: 4 additions & 4 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 112 | 112 | |
| 113 | 113 | |
| 114 | 114 | |
| 115 | | - |
| 116 | | - |
| | 115 | + |
| | 116 | + |
| 117 | 117 | |
| 118 | | - |
| 119 | | - |
| | 118 | + |
| | 119 | + |
| 120 | 120 | |
| 121 | 121 | |
| 122 | 122 | |
Lines changed: 13 additions & 3 deletions
Large diffs are not rendered by default.
Lines changed: 12 additions & 2 deletions
Large diffs are not rendered by default.
Lines changed: 8 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 6 | 6 | |
| 7 | 7 | |
| 8 | 8 | |
| | 9 | + |
| | 10 | + |
| | 11 | + |
| | 12 | + |
| | 13 | + |
| | 14 | + |
| | 15 | + |
| | 16 | + |
| 9 | 17 | |
| 10 | 18 | |
| 11 | 19 | |