Rework evalbuff: commit learning, parallel agents, trace compression#481
Merged
…sion

Two-mode architecture: learn mode walks git history commit-by-commit, prompt mode runs a specific task. Both use iterative doc improvement with parallel agent execution, judging, and keep/reject loop.

Key changes:
- Add commit-task-generator: extracts tasks from git history via LLM
- Add trace-compressor: hybrid compression stores large tool results in files with inline pointers so doc writer can see agent reasoning
- Rewrite run-evalbuff with runLearnMode/runPromptMode, parallel agent execution (N runs per task), and iterative doc improvement
- Fix cli-runner timeout: kill entire process group via detached spawn
- Update judge with judgeTaskResult for prompt mode (no ground truth)
- Update docs-optimizer: always analyze, agent trace support, revert logic that preserves previously-accepted doc edits
- Rewrite tests for new architecture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
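The trace-compressor idea in this commit (large tool results go to files, the trace keeps an inline pointer) could look roughly like the sketch below. It is an illustration only, not the code from this PR: `TraceStep`, `compressTrace`, the pointer format, and the 2000-character default threshold are all assumed names and values.

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical trace shape; the real evalbuff trace format may differ.
interface TraceStep {
  kind: "reasoning" | "tool_result";
  text: string;
}

// Keep reasoning steps inline. Move oversized tool results to files and
// leave a short pointer plus preview in their place.
function compressTrace(
  steps: TraceStep[],
  outDir: string,
  maxInlineChars = 2000, // assumed threshold
): TraceStep[] {
  return steps.map((step, i) => {
    if (step.kind !== "tool_result" || step.text.length <= maxInlineChars) {
      return step;
    }
    const file = join(outDir, `tool-result-${i}.txt`);
    writeFileSync(file, step.text, "utf8");
    const preview = step.text.slice(0, 40);
    return {
      kind: step.kind,
      text: `[stored ${step.text.length} chars in ${file}] ${preview}...`,
    };
  });
}

// Usage: only the 300-character tool result is moved out of the trace.
const dir = mkdtempSync(join(tmpdir(), "trace-"));
const compressed = compressTrace(
  [
    { kind: "reasoning", text: "think" },
    { kind: "tool_result", text: "x".repeat(300) },
    { kind: "tool_result", text: "short" },
  ],
  dir,
  50,
);
```

The point of the hybrid approach is that the doc writer still reads the agent's reasoning in order, while bulky outputs stay one file read away instead of crowding the context.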
…e prompts

- Read full file contents at parent commit (up to 500K) to give the prompt generator rich context about the codebase, matching buffbench's approach
- Include the complete diff (up to 200K chars) instead of truncating at 8K
- Rewrite system prompt to produce human-like prompts: high-level functional requirements, natural language, no file paths unless a human would mention them
- Skip commits with diffs >200K instead of >50K

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
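The size limits in this commit can be read as a simple gating step before the prompt generator runs. The following is a minimal sketch under stated assumptions: `buildPromptContext` and `FileContent` are hypothetical names, the 500K cap is assumed to be a character count applied to the total of all files, and dropping whole files once the budget is exhausted is a guess at the policy (the commit message does not say whether files are truncated or dropped).

```typescript
// Limits taken from the commit message; units for the 500K cap are assumed.
const MAX_DIFF_CHARS = 200_000;
const MAX_CONTEXT_CHARS = 500_000;

// Hypothetical shape for a file read at the parent commit.
interface FileContent {
  path: string;
  content: string;
}

interface PromptContext {
  diff: string;
  files: FileContent[];
}

// Returns null when the commit should be skipped because its diff is too
// large. Otherwise keeps the complete diff and as many full files as fit
// in the context budget.
function buildPromptContext(
  diff: string,
  files: FileContent[],
): PromptContext | null {
  if (diff.length > MAX_DIFF_CHARS) {
    return null; // skip this commit
  }
  const kept: FileContent[] = [];
  let used = 0;
  for (const file of files) {
    if (used + file.content.length > MAX_CONTEXT_CHARS) break;
    kept.push(file);
    used += file.content.length;
  }
  return { diff, files: kept };
}
```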
Summary
process.kill(-pid) to kill entire agent process trees
Test plan
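For readers unfamiliar with the `process.kill(-pid)` technique named in the summary, here is a hedged sketch of how a timeout can take down a whole process tree in Node.js. It is not the cli-runner code from this PR: `killProcessGroup` and `runWithTimeout` are hypothetical helpers. It relies on POSIX behavior, where `detached: true` makes the child the leader of a new process group and a negative PID targets that group; this does not apply on Windows.

```typescript
import { spawn } from "node:child_process";

type KillFn = (pid: number, signal: NodeJS.Signals) => unknown;

// Hypothetical helper: signal every process in the group led by `pid`.
// The `kill` parameter is injectable so the logic can be tested without
// spawning anything.
function killProcessGroup(
  pid: number,
  signal: NodeJS.Signals = "SIGKILL",
  kill: KillFn = (p, s) => process.kill(p, s),
): boolean {
  try {
    kill(-pid, signal); // negative PID addresses the process group (POSIX)
    return true;
  } catch {
    return false; // the group has most likely already exited
  }
}

// Hypothetical runner: spawn the command in its own process group and
// kill the whole group if it outlives the timeout.
function runWithTimeout(
  command: string,
  args: string[],
  timeoutMs: number,
): Promise<{ code: number | null; timedOut: boolean }> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { detached: true, stdio: "ignore" });
    let timedOut = false;
    const timer = setTimeout(() => {
      timedOut = true;
      if (child.pid !== undefined) killProcessGroup(child.pid);
    }, timeoutMs);
    child.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on("exit", (code) => {
      clearTimeout(timer);
      resolve({ code, timedOut });
    });
  });
}
```

Killing only `child.pid` would leave any grandchildren the agent started still running, which is why the group is targeted instead.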
🤖 Generated with Claude Code