Commit 92b4eea
fix: extract script handles old-format dirs, add V2 report audit
- Fix extract_v2_report_data.py to scan both old-format (ccb_*/csb_*
batch dirs) and new-format (timestamp dirs), recovering 1366 evals
from 98 old-format config dirs. Fixes bustub-hyperloglog-impl-001
appearing unpaired (369→370 paired tasks).
- Add audit_v2_report_data.py for comprehensive V2 report data
validation: coverage checks, reward integrity, suspicious pattern
detection, normalization verification, and old-format impact analysis.
- Archive V1 technical report to docs/technical_reports/archive/.
- Regenerate script registry and index.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 6dd3f87 commit 92b4eea
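
The directory scan described in the first bullet can be pictured with a minimal sketch. This is not the code from extract_v2_report_data.py: the function names, the regexes, and in particular the timestamp pattern are assumptions made for illustration. Only the `ccb_`/`csb_` prefixes come from the commit message.

```python
import re
from pathlib import Path

# Old-format batch dirs: prefixes taken from the commit message (ccb_*/csb_*).
OLD_FORMAT = re.compile(r"^(ccb|csb)_")
# New-format timestamp dirs: HYPOTHETICAL pattern, the real layout is not shown.
NEW_FORMAT = re.compile(r"^\d{4}-\d{2}-\d{2}")


def classify_run_dir(name: str) -> str:
    """Return 'old', 'new', or 'unknown' for a run directory name."""
    if OLD_FORMAT.match(name):
        return "old"
    if NEW_FORMAT.match(name):
        return "new"
    return "unknown"


def scan_run_dirs(root: Path) -> dict[str, list[Path]]:
    """Group the immediate subdirectories of root by detected format."""
    found: dict[str, list[Path]] = {"old": [], "new": [], "unknown": []}
    for child in sorted(root.iterdir()):
        if child.is_dir():
            found[classify_run_dir(child.name)].append(child)
    return found
```

Grouping both formats in one pass, rather than scanning only timestamp dirs, is what would let evals stored in old-format dirs be paired with their counterparts.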
File tree
5 files changed: +1208 -1 lines changed
- docs
- ops
- technical_reports/archive
- scripts
File renamed without changes.