Improve evalbuff prompt generation: full file context, buffbench-style prompts
- Read full file contents at parent commit (up to 500K) to give the prompt
  generator rich context about the codebase, matching buffbench's approach
- Include the complete diff (up to 200K chars) instead of truncating at 8K
- Rewrite system prompt to produce human-like prompts: high-level functional
  requirements, natural language, no file paths unless a human would mention them
- Skip commits with diffs >200K instead of >50K
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
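
The size limits named in the message can be sketched in isolation. This is a minimal illustration, not the code from this commit: `MAX_TOTAL_FILE_CHARS`, `MAX_DIFF_CHARS`, `shouldSkipCommit`, and `collectFilesWithinBudget` are hypothetical names, the limits are assumed to count string length, and skipping a file that would exceed the budget (rather than truncating it) is an assumed policy.

```typescript
// Hypothetical constants mirroring the limits named in the commit message.
const MAX_TOTAL_FILE_CHARS = 500_000
const MAX_DIFF_CHARS = 200_000

/** Assumed check: a commit is skipped when its diff exceeds the diff limit. */
function shouldSkipCommit(diff: string): boolean {
  return diff.length > MAX_DIFF_CHARS
}

/**
 * Collect file contents until the total size budget is used up.
 * `readFile` stands in for whatever reads a file at the parent commit;
 * it returns null when the file does not exist there.
 */
function collectFilesWithinBudget(
  filePaths: string[],
  readFile: (filePath: string) => string | null,
  maxTotalSize: number = MAX_TOTAL_FILE_CHARS,
): Record<string, string> {
  const files: Record<string, string> = {}
  let totalSize = 0
  for (const filePath of filePaths) {
    const content = readFile(filePath)
    // A null result means the file is absent at that commit (for example, newly added).
    if (content === null) continue
    // Assumed policy: skip any file that would push the total over the budget.
    if (totalSize + content.length > maxTotalSize) continue
    files[filePath] = content
    totalSize += content.length
  }
  return files
}
```

Passing the reader in as a function keeps the budgeting logic testable without a git repository; a real caller would presumably supply something that wraps `git show`.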
  * Get a list of commits from the repo, oldest first.
  * Starts from `startAfterSha` (exclusive) or HEAD~commitCount if no state.
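
The range described in this comment could map onto `git log` arguments roughly as follows. This is a sketch under assumptions: `buildCommitListArgs` is a hypothetical helper, not a function shown in this diff, and the `HEAD~N..HEAD` form assumes the history has at least N commits before HEAD.

```typescript
/**
 * Build `git log` arguments that list commit SHAs oldest first.
 * Hypothetical helper for illustration only.
 */
function buildCommitListArgs(commitCount: number, startAfterSha?: string): string[] {
  // `A..HEAD` excludes A itself, which matches the "exclusive" wording.
  const range = startAfterSha ? `${startAfterSha}..HEAD` : `HEAD~${commitCount}..HEAD`
  // `--reverse` flips the default newest-first order; `--format=%H` prints full SHAs only.
  return ['log', '--reverse', '--format=%H', range]
}
```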
@@ -87,41 +89,125 @@ export function getCommitInfo(
 }

 /**
- * Generate a human-like task prompt from a commit's message and diff.
- * Uses Claude CLI to rephrase the commit into a natural coding task.
+ * Read a file's content at a specific commit SHA.
+ * Returns null if the file doesn't exist at that commit.
  */
-export async function generatePromptFromCommit(
-  message: string,
-  diff: string,
+function readFileAtCommit(
+  repoPath: string,
+  sha: string,
+  filePath: string,
+): string | null {
+  try {
+    return execSync(`git show ${sha}:${JSON.stringify(filePath)}`, {
+      cwd: repoPath,
+      encoding: 'utf-8',
+      maxBuffer: 10 * 1024 * 1024,
+    })
+  } catch {
+    return null
+  }
+}
+
+/**
+ * Read the full contents of all files being modified at the parent commit.
+ * This gives the prompt generator context about what the code looks like
+ * before the change, so it can write a realistic human prompt.
+ */
+function readFilesAtParent(
+  repoPath: string,
+  parentSha: string,
   filesChanged: string[],
-): Promise<string> {
-  const systemPrompt = `You are generating a task prompt that a developer might write to ask a coding agent to make changes to a codebase. You'll be given a git commit message and diff. Your job is to write a natural, human-sounding prompt that would lead an agent to make similar changes.
+): Record<string, string> {
+  const files: Record<string, string> = {}
+  let totalSize = 0
+  const maxTotalSize = 500_000 // 500K total for all files
+const PROMPT_GEN_SYSTEM = `You are generating a task prompt that a human developer would realistically write to ask an AI coding agent to make changes to their codebase.
+
+You will receive:
+- A git diff showing exactly what was changed
+- The full contents of all files being modified (as they looked BEFORE the change)
+- The commit message (as a hint, but don't just copy it)
+
+Your job is to write a natural, human-sounding prompt — the kind of thing a developer would type into a chat with an AI assistant.
+
+## Key Principles
+
+1. Focus on high-level functional requirements, not implementation details
+   - GOOD: "add user authentication to the API"
+   - BAD: "implement an authenticateUser function in src/auth/middleware.ts"
+
+2. Use natural language — like a Slack message or ticket description
+   - GOOD: "the nightly CI is pointing at the wrong directory, it should be agents not .agents"
+   - BAD: "Update the directory reference in .github/workflows/nightly-e2e.yml from .agents to agents"
+
+3. Describe what you WANT or what's WRONG, not how to fix it
+   - GOOD: "the hover state on buttons looks broken"
+   - BAD: "change the CSS hover opacity from 0.5 to 0.8 in Button.tsx"
+
+4. Don't reference specific file paths unless a human naturally would. Humans describe the feature area, not the file tree.
+   - GOOD: "our login page needs to redirect to freebuff.com instead of codebuff.com"
+   - BAD: "update src/auth/login.ts, src/config/urls.ts, and tests/auth.test.ts to change codebuff.com to freebuff.com"

-## Rules
+5. Don't over-specify. Leave room for the agent to figure out the implementation.

-1. Write as if you're a developer describing what you want done — NOT as if you've seen the solution
-2. Be vague enough that the agent has to figure out the implementation details, but specific enough about the desired outcome
-3. Do NOT mention specific line numbers, exact variable names from the diff, or implementation details
-4. DO mention the general area of the codebase, the feature/bug, and the desired behavior
-5. Keep it to 1-4 sentences
-6. Sound natural — like a Slack message or a ticket description, not a formal spec
+6. Keep it to 1-4 sentences.
+
+7. Read the FULL file contents to understand context. The diff alone can be misleading — understanding the surrounding code helps you write a prompt that makes sense for this codebase.

 ## Output

-Respond with ONLY the prompt text, nothing else.`
+Respond with ONLY the prompt text. No quotes, no preamble, no explanation.`

-  const userPrompt = `Commit message: ${message}
+/**
+ * Generate a human-like task prompt from a commit.
+ * Reads the full files at the parent commit for context, similar to how
+ * buffbench uses file-explorer agents to understand the codebase.
+ */
+export async function generatePromptFromCommit(
+  repoPath: string,
+  parentSha: string,
+  message: string,
+  diff: string,
+  filesChanged: string[],
+): Promise<string> {
+  // Read full file contents at the parent commit for context
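
One possible variation on `readFileAtCommit`, offered as a sketch rather than as part of this commit: the diff quotes the path with `JSON.stringify` before interpolating it into a shell command, and a double-quoted string in a POSIX shell still expands `$` and backticks. Passing the arguments as an array to `execFileSync` avoids the shell entirely. The name `readFileAtCommitNoShell` is hypothetical, and the `stdio` setting is an added assumption to keep git's error output quiet.

```typescript
import { execFileSync } from 'node:child_process'

/**
 * Read a file's content at a specific commit without going through a shell.
 * Returns null if git fails, for example when the file is absent at that commit.
 */
function readFileAtCommitNoShell(
  repoPath: string,
  sha: string,
  filePath: string,
): string | null {
  try {
    // The `<sha>:<path>` spec is a single argv entry, so no quoting is needed.
    return execFileSync('git', ['show', `${sha}:${filePath}`], {
      cwd: repoPath,
      encoding: 'utf-8',
      maxBuffer: 10 * 1024 * 1024,
      stdio: ['ignore', 'pipe', 'ignore'], // discard git's stderr on failure
    })
  } catch {
    return null
  }
}
```

The call shape otherwise mirrors the function in the diff (same `cwd`, `encoding`, and `maxBuffer`), so it could be swapped in without changing callers.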