Commit 9d74730

Author: LoCoBench Bot

Merge remote-tracking branch 'origin/ralph/gapfill-investigation'

# Conflicts:
#   configs/codereview_2config.sh
#   configs/selected_benchmark_tasks.json
#   prd.json
#   progress.txt

2 parents 9ac71f2 + 92ef760 commit 9d74730

49 files changed, +4657 -255 lines changed

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ docs/
 ERROR_CATALOG.md # Known error fingerprints, causes, fixes
 ```
 
-## Benchmarks (13 total, 11 active)
+## Benchmarks (16 total, 14 active)
 
 | Benchmark | Tasks | Language(s) | Focus | Status |
 |-----------|-------|-------------|-------|--------|

benchmarks/ccb_codereview/cr-aspnetcore-001/environment/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ RUN git clone --filter=blob:none --no-checkout https://github.com/dotnet/aspnetc
     git config user.name "Agent"
 
 # Inject defects into the codebase
-COPY inject_defects.sh /tmp/inject_defects.sh
+COPY environment/inject_defects.sh /tmp/inject_defects.sh
 RUN chmod +x /tmp/inject_defects.sh && /tmp/inject_defects.sh && rm /tmp/inject_defects.sh
 
 # Create directories for verifier (tests uploaded at runtime by Harbor verifier)

benchmarks/ccb_codereview/cr-ghost-001/environment/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ RUN git clone --filter=blob:none --no-checkout https://github.com/TryGhost/Ghost
     git config user.name "Agent"
 
 # Inject defects into the codebase
-COPY inject_defects.sh /tmp/inject_defects.sh
+COPY environment/inject_defects.sh /tmp/inject_defects.sh
 RUN chmod +x /tmp/inject_defects.sh && /tmp/inject_defects.sh && rm /tmp/inject_defects.sh
 
 # Create directories for verifier (tests uploaded at runtime by Harbor verifier)
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+FROM ubuntu:22.04
+
+WORKDIR /workspace
+
+# Install dependencies
+RUN apt-get update && apt-get install -y \
+    git \
+    curl \
+    python3 \
+    npm \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Claude Code CLI
+RUN npm install -g @anthropic-ai/claude-code
+
+# Clone Envoy at the target commit (before router header mutation fix)
+# PR #40856 (commit 25f893b) fixes the bug where response_headers_to_add
+# is processed multiple times for local responses due to double
+# finalizeResponseHeaders() call. We check out the parent to get the broken state.
+# The bug was introduced by PR #39534 which moved finalizeResponseHeaders()
+# into the modify_headers_ callback, causing double processing on local replies.
+RUN git clone --filter=blob:none --no-checkout https://github.com/envoyproxy/envoy.git . && \
+    git checkout 25f893b~1 && \
+    git config user.email "agent@example.com" && \
+    git config user.name "Agent"
+
+# Create output directories
+RUN mkdir -p /logs/agent /logs/verifier
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+# Investigation: Duplicate Response Headers in Envoy Filter Pipeline
+
+**Repository:** envoyproxy/envoy
+**Task Type:** Deep Causal Chain (investigation only — no code fixes)
+
+## Scenario
+
+An Envoy proxy operator reports that response headers configured via `response_headers_to_add` in route configuration are being **duplicated** in certain responses. Specifically, when the router filter generates a local reply (e.g., upstream timeout, connection failure, or request too large), custom response headers like `x-request-id-echo` appear twice in the HTTP response.
+
+The issue is intermittent — it only affects responses where the router itself generates the reply (local replies) rather than forwarding an upstream response. Normal proxied responses have headers added exactly once.
+
+Access log snippet showing the problem (using `%RESPONSE_CODE_DETAILS%` formatter):
+
+```
+[2025-08-15T10:23:45.001Z] "GET /api/v1/data HTTP/1.1" 504 UT
+  response_code_details=upstream_response_timeout
+  x-custom-trace: abc123
+  x-custom-trace: abc123
+```
+
+The header `x-custom-trace` appears twice. The route config has:
+
+```yaml
+response_headers_to_add:
+  - header:
+      key: "x-custom-trace"
+      value: "%REQ(x-request-id)%"
+    append_action: OVERWRITE_IF_EXISTS_OR_ADD
+```
+
+Despite using `OVERWRITE_IF_EXISTS_OR_ADD`, the header is duplicated on local replies but not on proxied upstream responses.
+
+## Your Task
+
+Investigate the root cause of this duplicate header behavior and produce a report at `/logs/agent/investigation.md`.
+
+Your report MUST cover:
+
+1. **How the router filter processes response headers** — specifically the `finalizeResponseHeaders()` call chain and the `modify_headers_` callback
+2. **How local replies are generated** — the `sendLocalReply` code path in the router and how it differs from the upstream response path
+3. **The specific mechanism causing double processing** — which PR/change moved `finalizeResponseHeaders()` into the `modify_headers_` callback, and why this causes double invocation for local replies
+4. **The interaction between `sendLocalReply` and `modify_headers_`** — how `sendLocalReply` calls `finalizeResponseHeaders()` directly, AND the `modify_headers_` callback also calls it, resulting in headers being added twice
+5. **The role of the `append_action` / `append` proto fields** — how `HeaderValueOption` config is parsed in `header_parser.cc`, including the deprecated `append` BoolValue field vs the newer `append_action` enum, and how proto default values affect behavior
+6. **The filter manager's encode path** — how `FilterManager::encodeHeaders()` iterates through filters in reverse order and how the header mutation filter interacts with route-level header additions
+7. **Which files and functions form the full causal chain** from symptom (duplicate headers in access log) to root cause (double `finalizeResponseHeaders()` call)
+
+## Output Requirements
+
+Write your investigation report to `/logs/agent/investigation.md` with these sections:
+
+```
+# Investigation Report
+
+## Summary
+<1-2 sentence finding>
+
+## Root Cause
+<Specific file, function, and mechanism>
+
+## Evidence
+<Code references with file paths and line numbers>
+
+## Affected Components
+<List of packages/modules impacted>
+
+## Causal Chain
+<Ordered list: symptom → intermediate hops → root cause>
+
+## Recommendation
+<Fix strategy and diagnostic steps>
+```
+
+## Constraints
+
+- Do NOT write any code fixes
+- Do NOT modify any source files
+- Your job is investigation and analysis only
+- The causal chain spans at least 4 packages: `source/common/router/`, `source/common/http/`, `source/extensions/filters/http/header_mutation/`, and `api/envoy/config/core/v3/`
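
Review note: the double-invocation mechanism described in items 3 and 4 of the task can be pictured with a toy model. The sketch below is not Envoy code; `apply_header`, `make_finalizer`, and `local_reply` are hypothetical names, and the header list is a plain Python list. It only shows that running the same header-addition step once directly and once through a callback yields two copies under append semantics. In this toy, overwrite semantics hide the second call, so it does not reproduce the scenario's observation under `OVERWRITE_IF_EXISTS_OR_ADD`; how the configured action is actually resolved is item 5 of the task.

```python
from typing import Callable, List, Tuple

Headers = List[Tuple[str, str]]

APPEND = "APPEND_IF_EXISTS_OR_ADD"
OVERWRITE = "OVERWRITE_IF_EXISTS_OR_ADD"


def apply_header(headers: Headers, key: str, value: str, action: str) -> None:
    """Toy header addition: overwrite drops existing copies first, append does not."""
    if action == OVERWRITE:
        headers[:] = [(k, v) for (k, v) in headers if k.lower() != key.lower()]
    headers.append((key, value))


def make_finalizer(key: str, value: str, action: str) -> Callable[[Headers], None]:
    """Stand-in for a 'finalize response headers' step (hypothetical)."""
    def finalize(headers: Headers) -> None:
        apply_header(headers, key, value, action)
    return finalize


def local_reply(finalize: Callable[[Headers], None], call_directly: bool) -> Headers:
    """Toy local reply: the callback always finalizes; the direct call is optional."""
    headers: Headers = [(":status", "504")]
    modify_headers = finalize      # callback that already includes finalization
    if call_directly:
        finalize(headers)          # first application (direct call)
    modify_headers(headers)        # second application (via callback)
    return headers


def count(headers: Headers, key: str) -> int:
    """Number of header entries with the given (case-insensitive) name."""
    return sum(1 for k, _ in headers if k.lower() == key.lower())
```

Calling `local_reply(make_finalizer("x-custom-trace", "abc123", APPEND), call_directly=True)` gives two `x-custom-trace` entries, while `call_directly=False` gives one.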
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+version = "1.0"
+
+[metadata]
+name = "inv-deep-001"
+description = "Deep causal chain: Envoy router double-processes response header mutations for local replies"
+license = "Apache-2.0"
+
+[task]
+id = "inv-deep-001"
+repo = "envoyproxy/envoy"
+category = "deep_causal_chain"
+language = "cpp"
+difficulty = "hard"
+time_limit_sec = 1200
+
+[verification]
+type = "test"
+command = "bash /tests/test.sh"
+
+reward_type = "checklist"
+description = "Weighted checklist scoring investigation report against ground-truth findings"
+
+[environment]
+build_timeout_sec = 1800.0
+
+[environment.setup_scripts]
+mcp_config = """#!/bin/bash
+# Setup Sourcegraph MCP if credentials provided
+if [ -n "$SOURCEGRAPH_ACCESS_TOKEN" ] && [ -n "$SOURCEGRAPH_URL" ]; then
+    echo "Setting up Sourcegraph MCP configuration..."
+    mkdir -p /root/.config/claude
+
+    cat > /root/.config/claude/mcp.json << 'MCPEOF'
+{
+  "mcpServers": {
+    "sourcegraph": {
+      "command": "npx",
+      "args": ["-y", "@sourcegraph/mcp-server"],
+      "env": {
+        "SRC_ACCESS_TOKEN": "$SOURCEGRAPH_ACCESS_TOKEN",
+        "SOURCEGRAPH_URL": "$SOURCEGRAPH_URL"
+      }
+    }
+  }
+}
+MCPEOF
+
+    echo "MCP configuration created"
+else
+    echo "No Sourcegraph credentials provided, MCP disabled"
+fi
+exit 0
+"""
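
Review note: the setup script writes `mcp.json` through a here-document whose delimiter is quoted (`'MCPEOF'`). In POSIX shells a quoted delimiter disables parameter expansion in the body, so the file would contain the literal strings `$SOURCEGRAPH_ACCESS_TOKEN` and `$SOURCEGRAPH_URL`. Whether the consumer of that file expands them later is not shown in this commit. As a point of comparison, here is a minimal sketch that resolves the values before writing. The function names `build_mcp_config` and `write_mcp_config` are hypothetical; the JSON keys are copied from the script above.

```python
import json
from pathlib import Path
from typing import Mapping, Optional


def build_mcp_config(env: Mapping[str, str]) -> Optional[dict]:
    """Return the MCP config dict, or None when either credential is missing."""
    token = env.get("SOURCEGRAPH_ACCESS_TOKEN")
    url = env.get("SOURCEGRAPH_URL")
    if not token or not url:
        return None
    return {
        "mcpServers": {
            "sourcegraph": {
                "command": "npx",
                "args": ["-y", "@sourcegraph/mcp-server"],
                # Values are resolved here, so no shell expansion is needed later.
                "env": {"SRC_ACCESS_TOKEN": token, "SOURCEGRAPH_URL": url},
            }
        }
    }


def write_mcp_config(env: Mapping[str, str], path: Path) -> bool:
    """Write the config to `path`; return False (and write nothing) if disabled."""
    config = build_mcp_config(env)
    if config is None:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

A caller would pass `os.environ` and the target path, for example `write_mcp_config(os.environ, Path("/root/.config/claude/mcp.json"))`. Using `json.dumps` also escapes any quotes or backslashes in the token, which a here-document would not.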
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
+{
+  "task_id": "inv-deep-001",
+  "description": "Deep causal chain: Envoy router double-processes response header mutations for local replies",
+  "weights": {
+    "required_findings": 0.40,
+    "file_references": 0.30,
+    "causal_chain": 0.20,
+    "negative_checks": 0.10
+  },
+  "required_findings": [
+    {
+      "id": "f1",
+      "description": "Identifies finalizeResponseHeaders as the function that applies response_headers_to_add",
+      "patterns": ["finalizeResponseHeaders", "finalize.*[Rr]esponse.*[Hh]eaders"],
+      "weight": 0.12
+    },
+    {
+      "id": "f2",
+      "description": "Identifies the modify_headers_ callback/lambda in the router",
+      "patterns": ["modify_headers_", "modify_headers"],
+      "weight": 0.12
+    },
+    {
+      "id": "f3",
+      "description": "Identifies sendLocalReply as the code path where double processing occurs",
+      "patterns": ["sendLocalReply", "send.*[Ll]ocal.*[Rr]eply", "local.?reply"],
+      "weight": 0.12
+    },
+    {
+      "id": "f4",
+      "description": "Explains double invocation: sendLocalReply calls finalizeResponseHeaders AND modify_headers_ also calls it",
+      "patterns": ["double|twice|two.*times|duplicate.*call|called.*again|both.*call|re.?process|multiple.*times"],
+      "weight": 0.12
+    },
+    {
+      "id": "f5",
+      "description": "Mentions PR #39534 or its effect of moving finalizeResponseHeaders into modify_headers_ callback",
+      "patterns": ["39534|moved.*finalize.*into.*modify|modify_headers.*callback.*finalize"],
+      "weight": 0.08
+    },
+    {
+      "id": "f6",
+      "description": "Identifies HeaderParser or header_parser.cc as involved in parsing header config",
+      "patterns": ["[Hh]eader[Pp]arser|header_parser"],
+      "weight": 0.08
+    },
+    {
+      "id": "f7",
+      "description": "Mentions append_action enum or OVERWRITE_IF_EXISTS_OR_ADD behavior",
+      "patterns": ["append_action|OVERWRITE_IF_EXISTS_OR_ADD|APPEND_IF_EXISTS_OR_ADD|[Hh]eader[Aa]ppend[Aa]ction"],
+      "weight": 0.08
+    },
+    {
+      "id": "f8",
+      "description": "Identifies HeaderValueOption proto or the deprecated append BoolValue field",
+      "patterns": ["HeaderValueOption|append.*BoolValue|deprecated.*append|has_append"],
+      "weight": 0.06
+    },
+    {
+      "id": "f9",
+      "description": "Mentions FilterManager or filter_manager role in encode path",
+      "patterns": ["FilterManager|filter_manager|encodeHeaders"],
+      "weight": 0.06
+    },
+    {
+      "id": "f10",
+      "description": "Mentions response_headers_to_add route configuration",
+      "patterns": ["response_headers_to_add"],
+      "weight": 0.06
+    },
+    {
+      "id": "f11",
+      "description": "Identifies that local replies vs proxied upstream responses behave differently",
+      "patterns": ["local.*reply.*differ|upstream.*response.*not.*affect|only.*local|proxied.*response.*correct|upstream.*path.*single"],
+      "weight": 0.05
+    },
+    {
+      "id": "f12",
+      "description": "Mentions getResponseHeaderParsers or evaluateHeaders in the header processing chain",
+      "patterns": ["getResponseHeaderParsers|evaluateHeaders"],
+      "weight": 0.05
+    }
+  ],
+  "file_references": [
+    {
+      "id": "r1",
+      "description": "Identifies source/common/router/router.cc (where sendLocalReply and modify_headers_ live)",
+      "patterns": ["source/common/router/router\\.cc", "common/router/router\\.cc", "router/router\\.cc"],
+      "weight": 0.25
+    },
+    {
+      "id": "r2",
+      "description": "Identifies source/common/router/config_impl.cc (where finalizeResponseHeaders is defined)",
+      "patterns": ["source/common/router/config_impl\\.cc", "router/config_impl\\.cc"],
+      "weight": 0.20
+    },
+    {
+      "id": "r3",
+      "description": "Identifies source/common/router/header_parser.cc (HeadersToAddEntry, append/append_action logic)",
+      "patterns": ["source/common/router/header_parser\\.cc", "router/header_parser\\.cc", "header_parser\\.cc"],
+      "weight": 0.15
+    },
+    {
+      "id": "r4",
+      "description": "Identifies source/common/http/filter_manager.cc (encode path iteration)",
+      "patterns": ["source/common/http/filter_manager\\.cc", "http/filter_manager\\.cc", "filter_manager\\.cc"],
+      "weight": 0.10
+    },
+    {
+      "id": "r5",
+      "description": "Identifies api/envoy/config/core/v3/base.proto (HeaderValueOption, append_action enum)",
+      "patterns": ["api/envoy/config/core/v3/base\\.proto", "core/v3/base\\.proto", "config/core/v3/base"],
+      "weight": 0.10
+    },
+    {
+      "id": "r6",
+      "description": "Identifies header_mutation filter (source/extensions/filters/http/header_mutation/)",
+      "patterns": ["extensions/filters/http/header_mutation", "header_mutation\\.cc", "header_mutation/header_mutation"],
+      "weight": 0.10
+    },
+    {
+      "id": "r7",
+      "description": "Identifies source/common/http/header_mutation.cc (AppendMutation, evaluateHeaders)",
+      "patterns": ["source/common/http/header_mutation\\.cc", "http/header_mutation\\.cc"],
+      "weight": 0.10
+    }
+  ],
+  "causal_chain": [
+    {
+      "id": "c1",
+      "description": "Hop 1-2: sendLocalReply in router.cc calls finalizeResponseHeaders, then modify_headers_ callback also calls finalizeResponseHeaders, causing response_headers_to_add to be processed twice",
+      "patterns": ["sendLocalReply|local.*reply", "finalizeResponseHeaders|finalize.*response", "modify_headers_|modify.*header.*callback", "twice|double|duplicate|two.*times|multiple|re.?process"],
+      "ordered": true,
+      "weight": 0.40
+    },
+    {
+      "id": "c2",
+      "description": "Hop 3: finalizeResponseHeaders iterates header parsers via getResponseHeaderParsers/evaluateHeaders which applies response_headers_to_add",
+      "patterns": ["finalizeResponseHeaders|finalize.*response", "response_headers_to_add|header.*parser|evaluateHeaders"],
+      "ordered": true,
+      "weight": 0.25
+    },
+    {
+      "id": "c3",
+      "description": "Hop 4-5: Header config parsing in header_parser.cc resolves append_action from proto, where append vs append_action deprecation and proto defaults (enum 0 = APPEND_IF_EXISTS_OR_ADD) affect mutation behavior",
+      "patterns": ["header_parser|HeadersToAddEntry|HeaderParser", "append_action|append.*BoolValue|proto.*default|APPEND_IF_EXISTS"],
+      "ordered": false,
+      "weight": 0.35
+    }
+  ],
+  "negative_checks": [
+    {
+      "id": "n1",
+      "description": "Does NOT blame the upstream server, network issues, or Envoy version incompatibility as root cause",
+      "patterns": ["upstream.*server.*root.?cause|network.*root.?cause|version.*incompat.*root.?cause"],
+      "must_be_absent": true,
+      "weight": 0.50
+    },
+    {
+      "id": "n2",
+      "description": "Does NOT claim the bug is in the header_mutation HTTP filter extension itself (it's in the router)",
+      "patterns": ["header_mutation.*filter.*root.?cause|header.?mutation.*extension.*bug|header_mutation\\.cc.*root.?cause"],
+      "must_be_absent": true,
+      "weight": 0.50
+    }
+  ]
+}
