Objective
Add a mutator subagent to agentv-bench that autonomously generates improved versions of the artifact under test (skill, prompt, config) based on failure analysis from the analyzer subagent.
Currently, agentv-bench Step 5 (Improve) is human-directed — the analyzer identifies failures and suggests improvements, but the human decides what to change. The mutator subagent closes this gap by generating the rewrite itself, enabling unattended optimization loops.
Design Latitude
Location: plugins/agentv-dev/skills/agentv-bench/agents/mutator.md
Inputs (provided by agentv-bench orchestrator):
- Current best artifact content
- Per-assertion pass rates (e.g., IDENTIFIES_CLARITY_ISSUES: 3/5)
- Top failure descriptions from the analyzer
- Original artifact (for reference, not as mutation base)
Output: A rewritten artifact that addresses failing criteria.
Mutation strategy (adapted from karpathy/autoresearch and pi-autoresearch):
- For any assertion below 80% pass rate: add explicit, concrete instructions
- Preserve instructions that already pass consistently
- Prefer simplification when score is maintained (Karpathy's "simplicity criterion" — cleaner code at equal performance is an improvement)
- Never add speculative features — only address observed failures
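The thresholding rule above could be sketched as follows (a minimal sketch with hypothetical names; the issue does not specify how pass rates are parsed, so here they are assumed to arrive as `(passed, total)` pairs):

```python
def failing_assertions(pass_rates, threshold=0.8):
    """Select assertions whose pass rate is strictly below the threshold.

    pass_rates maps assertion name -> (passed, total), e.g.
    {"IDENTIFIES_CLARITY_ISSUES": (3, 5)}.
    Returns a sorted list of assertion names the mutator should target.
    """
    return sorted(
        name
        for name, (passed, total) in pass_rates.items()
        if total > 0 and passed / total < threshold
    )

rates = {"IDENTIFIES_CLARITY_ISSUES": (3, 5), "CITES_EVIDENCE": (5, 5)}
print(failing_assertions(rates))  # ['IDENTIFIES_CLARITY_ISSUES']
```

Only the selected assertions get new explicit instructions; everything that passes consistently is left alone, per the strategy above.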
Integration with agentv-bench Step 5:
- Interactive mode (existing): human still directs improvements; mutator available as optional assist ("generate a suggestion based on failures")
- Autoresearch mode (see feat(bench): autoresearch mode — unattended eval-improve loop with hill-climbing ratchet #748): mutator dispatched automatically, no human input needed
Acceptance Signals
- agents/mutator.md exists in agentv-bench with clear instructions for generating artifact rewrites
- agentv-bench Step 5 can dispatch the mutator subagent as an alternative to human-directed improvement
- Mutator output is a complete rewritten artifact (not a diff or suggestion list)
- Mutator reads from "best" version, not from the failed candidate (hill-climbing ratchet)
- Works for skill files (SKILL.md), prompt templates, and agent configs
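The hill-climbing ratchet can be sketched as the following loop (hypothetical `score` and `mutate` callables; the real orchestration lives in agentv-bench, this only illustrates the "mutate from best, never regress" invariant):

```python
def ratchet_loop(best_artifact, score, mutate, iterations=5):
    """Hill-climbing ratchet: always mutate from the best-so-far artifact
    and keep a candidate only if it scores at least as well.

    Accepting ties (>=) leaves room for Karpathy's simplicity criterion:
    a simpler artifact at equal score replaces the current best.
    """
    best_score = score(best_artifact)
    for _ in range(iterations):
        # The mutator reads from the best version, never the failed candidate.
        candidate = mutate(best_artifact)
        candidate_score = score(candidate)
        if candidate_score >= best_score:  # ratchet: never regress
            best_artifact, best_score = candidate, candidate_score
    return best_artifact, best_score
```

Note the asymmetry: a failed candidate is simply dropped, so the next mutation starts again from the retained best rather than compounding a regression.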
Non-Goals
- Not a general-purpose code rewriter — only rewrites the specific artifact being evaluated
- Not a replacement for human-directed improvement — both modes coexist
- Does not modify the eval definition (EVAL.yaml) — only the artifact under test
- Does not run evals itself — agentv-bench orchestrates the full loop
Context
The autoresearch pattern — proven by karpathy/autoresearch (ML training optimization) and pi-autoresearch (generic optimization loops) — automates the improvement step: score → keep/drop → mutate → repeat. The mutator subagent is the core building block that enables this in agentv-bench.
Key design insight from Karpathy: constrain mutation to a single file (the artifact) and keep the evaluation harness immutable. This prevents the agent from gaming its own scoring.
Related
- feat(eval): Ralph Loop — iterative improvement with feedback injection #699 — Ralph Loop (complementary: Ralph re-prompts the target with feedback during a run; mutator rewrites the artifact between runs)