Commit 46f2b0e

sjarmak and claude committed
Fix default model in suite launchers from Opus to Haiku
sdlc_suite_2config.sh and validate_one_per_benchmark.sh both defaulted to claude-opus-4-6, causing accidental Opus runs that are incomparable with the existing 250-pair Haiku baseline. Change default to claude-haiku-4-5-20251001 to match run_selected_tasks.sh.

Also make TIMEOUT_MULTIPLIER overridable via env var in both sdlc_suite_2config.sh and run_selected_tasks.sh.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 0afecda commit 46f2b0e

4 files changed: 22 additions, 4 deletions

configs/sdlc_suite_2config.sh

Lines changed: 2 additions & 2 deletions
@@ -43,9 +43,9 @@ SUITE_STEM="${SUITE#ccb_}"
 # ============================================
 BENCHMARK_DIR="$(pwd)/benchmarks"
 AGENT_PATH="agents.claude_baseline_agent:BaselineClaudeCodeAgent"
-MODEL="${MODEL:-anthropic/claude-opus-4-6}"
+MODEL="${MODEL:-anthropic/claude-haiku-4-5-20251001}"
 CONCURRENCY=1
-TIMEOUT_MULTIPLIER=10
+TIMEOUT_MULTIPLIER="${TIMEOUT_MULTIPLIER:-10}"
 RUN_BASELINE=true
 RUN_FULL=true
 CATEGORY="${CATEGORY:-staging}"
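
The `${VAR:-default}` expansion now used for both `MODEL` and `TIMEOUT_MULTIPLIER` keeps the default unless the caller supplies a non-empty value. A minimal sketch of that behavior follows; `resolve_settings` is a hypothetical stand-in for the launcher's config section, not a function from the repository.

```shell
#!/bin/sh
# Sketch only: resolve_settings is a hypothetical stand-in for the config
# section of configs/sdlc_suite_2config.sh, not code from the repository.
resolve_settings() {
    # ${VAR:-default} expands to the default when VAR is unset or empty.
    model="${MODEL:-anthropic/claude-haiku-4-5-20251001}"
    timeout_multiplier="${TIMEOUT_MULTIPLIER:-10}"
    echo "model=${model} timeout_multiplier=${timeout_multiplier}"
}

# No overrides: the Haiku default and a multiplier of 10 apply.
( unset MODEL TIMEOUT_MULTIPLIER; resolve_settings )

# Explicit opt-in to another model and a longer timeout, scoped to a subshell.
( MODEL="anthropic/claude-opus-4-6"; TIMEOUT_MULTIPLIER=20; resolve_settings )
```

With the real launcher, the same override would presumably be written as an environment prefix on the command line, for example `MODEL=anthropic/claude-opus-4-6 TIMEOUT_MULTIPLIER=20 ./configs/sdlc_suite_2config.sh`; any arguments the script itself requires are not shown in this diff and are omitted here.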

configs/validate_one_per_benchmark.sh

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ cd "$REPO_ROOT"
 export PYTHONPATH="$(pwd):${PYTHONPATH:-}"
 
 SELECTION_FILE="$REPO_ROOT/configs/selected_benchmark_tasks.json"
-MODEL="${MODEL:-anthropic/claude-opus-4-6}"
+MODEL="${MODEL:-anthropic/claude-haiku-4-5-20251001}"
 AGENT_PATH="agents.claude_baseline_agent:BaselineClaudeCodeAgent"
 TIMESTAMP=$(date +%Y%m%d_%H%M%S)

docs/ops/SCRIPT_INDEX.md

Lines changed: 2 additions & 0 deletions
@@ -171,6 +171,7 @@ Generated from `scripts/registry.json` by `scripts/generate_script_index.py`.
 - `scripts/extract_analysis_metrics.py` - Utility script for extract analysis metrics.
 - `scripts/find_mcp_distracted.py` - Utility script for find mcp distracted.
 - `scripts/fix_h3_tokens.py` [one_off] - Historical one-off script: fix h3 tokens.
+- `scripts/handoff_monitor_scrollend.sh` - Utility script for handoff monitor scrollend.
 - `scripts/hydrate_task_specs.py` - Utility script for hydrate task specs.
 - `scripts/icp_profiles.py` - Utility script for icp profiles.
 - `scripts/integrate_answer_json_wave1.py` - Utility script for integrate answer json wave1.
@@ -179,6 +180,7 @@ Generated from `scripts/registry.json` by `scripts/generate_script_index.py`.
 - `scripts/judge_demo.py` - Utility script for judge demo.
 - `scripts/list_gemini_models.py` - Utility script for list gemini models.
 - `scripts/mirror_largerepo_expansion.sh` - Utility script for mirror largerepo expansion.
+- `scripts/plan_variance_runs.py` - Utility script for plan variance runs.
 - `scripts/regenerate_artifact_dockerfiles.py` - Utility script for regenerate artifact dockerfiles.
 - `scripts/remirror_mcp_unique_repos.sh` - Utility script for remirror mcp unique repos.
 - `scripts/repair_h3_trajectories.py` [one_off] - Historical one-off script: repair h3 trajectories.

scripts/registry.json

Lines changed: 17 additions & 1 deletion
@@ -546,6 +546,14 @@
 "language": "python",
 "summary": "QA/validation script for governance evaluator."
 },
+{
+"name": "handoff_monitor_scrollend.sh",
+"path": "scripts/handoff_monitor_scrollend.sh",
+"category": "misc",
+"status": "maintained",
+"language": "shell",
+"summary": "Utility script for handoff monitor scrollend."
+},
 {
 "name": "headless_login.py",
 "path": "scripts/headless_login.py",
@@ -786,6 +794,14 @@
 "language": "python",
 "summary": "Submission/reporting script for package submission."
 },
+{
+"name": "plan_variance_runs.py",
+"path": "scripts/plan_variance_runs.py",
+"category": "misc",
+"status": "maintained",
+"language": "python",
+"summary": "Utility script for plan variance runs."
+},
 {
 "name": "prebuild_images.sh",
 "path": "scripts/prebuild_images.sh",
@@ -1155,7 +1171,7 @@
 "infra_mirrors": 16,
 "library_helpers": 7,
 "migration": 4,
-"misc": 37,
+"misc": 39,
 "qa_quality": 10,
 "submission_reporting": 7,
 "task_creation_selection": 12,

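The `misc` total moves from 37 to 39 because two entries with `"category": "misc"` were added in the same file. A rough consistency check between per-entry categories and the summary totals could be sketched as below. The embedded sample is a hypothetical, cut-down stand-in for `scripts/registry.json`: the `scripts` and `category_counts` key names and the `example_*` entries are assumptions, since the diff does not show the file's top-level layout.

```shell
#!/bin/sh
# Sketch only: sample_registry is a hypothetical stand-in for
# scripts/registry.json; key names and entries are assumed for illustration.
sample_registry='{
  "scripts": [
    {
      "name": "example_a.sh",
      "category": "misc"
    },
    {
      "name": "example_b.py",
      "category": "misc"
    },
    {
      "name": "example_c.py",
      "category": "qa_quality"
    }
  ],
  "category_counts": {
    "misc": 2,
    "qa_quality": 1
  }
}'

# Count entries whose "category" field equals $1 (one field per line).
count_entries() {
    printf '%s\n' "$sample_registry" | grep -c "\"category\": \"$1\""
}

# Read the declared total for category $1 from the summary block.
declared_count() {
    printf '%s\n' "$sample_registry" |
        sed -n "s/^ *\"$1\": \([0-9][0-9]*\),\{0,1\}\$/\1/p"
}

# Report whether each category's entry count matches its declared total.
for category in misc qa_quality; do
    if [ "$(count_entries "$category")" = "$(declared_count "$category")" ]; then
        echo "$category: ok"
    else
        echo "$category: mismatch"
    fi
done
```

Line-based matching like this only works while each field sits on its own line, as it does in the hunks above; a JSON-aware tool would be more robust for the real file.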