Commit a611acb

sjarmak and claude committed
feat: curate oracle definitions for all 20 new MCP-unique tasks
Phase 3 of MCP-unique expansion. For each of the 20 new tasks:
- Created oracle_answer.json with verified files, symbols, chains, and text
- Updated task_spec.json with populated oracle fields and evaluation checks
- All oracles verified via Sourcegraph MCP queries against pinned repo versions

Coverage: 5 Kafka tasks (dep-trace, vuln-remed, migration, incident, onboard), 7 Envoy tasks (TLS, v2 API, conn pool, access logging, xDS, cross-org, HTTP filter), 2 Rust tasks (type inference), 3 Kubernetes tasks (watch events, CRD client), 3 cross-repo chain tasks (domain lineage, agentic correctness).

Also fixes sg-benchmarks -> sg-evals in 4 repo set fixture files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 3278e35 commit a611acb

44 files changed: +1731 -277 lines changed

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+{
+  "files": [
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.cc"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/file/config.cc"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/http_grpc_access_log_impl.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/tcp_grpc_access_log_impl.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/open_telemetry/access_log_impl.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/fluentd/fluentd_access_log_impl.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/stream/config.cc"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/access_log_base.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/file_access_log_impl.h"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/filters/cel/cel.h"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/file/v3/file.proto"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/grpc/v3/als.proto"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/open_telemetry/v3/logs_service.proto"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/fluentd/v3/fluentd.proto"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/stream/v3/stream.proto"}
+  ],
+  "symbols": [
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.h", "symbol": "AccessLogFactory", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/access_log_base.h", "symbol": "ImplBase", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/file_access_log_impl.h", "symbol": "FileAccessLog", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/file/config.cc", "symbol": "FileAccessLogFactory", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/http_grpc_access_log_impl.h", "symbol": "HttpGrpcAccessLog", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/tcp_grpc_access_log_impl.h", "symbol": "TcpGrpcAccessLog", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/open_telemetry/access_log_impl.h", "symbol": "AccessLog", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/fluentd/fluentd_access_log_impl.h", "symbol": "FluentdAccessLog", "kind": "class"},
+    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/filters/cel/cel.h", "symbol": "CELAccessLogExtensionFilter", "kind": "class"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/file/v3/file.proto", "symbol": "FileAccessLog", "kind": "class"},
+    {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/grpc/v3/als.proto", "symbol": "HttpGrpcAccessLogConfig", "kind": "class"}
+  ],
+  "text": "Found access logging infrastructure across sg-evals/envoy--v1.31.2 and sg-evals/data-plane-api--84e84367. Core infrastructure: access_log_impl.h/cc defines filter implementations (StatusCodeFilter, DurationFilter, ResponseFlagFilter, GrpcStatusFilter, etc.) and AccessLogFactory. Extension base class ImplBase (access_log_base.h) handles filter evaluation before delegating to emitLog(). Logger implementations: (1) File (file/config.cc, FileAccessLogFactory) — supports text AND JSON (json_format, typed_json_format, log_format with json). (2) Stdout/Stderr (stream/config.cc, StdoutAccessLogFactory/StderrAccessLogFactory) — supports JSON via SubstitutionFormatString log_format. (3) HTTP gRPC (grpc/http_grpc_access_log_impl.h, HttpGrpcAccessLog) — sends protobuf HTTPAccessLogEntry over ALS stream, not text/JSON. (4) TCP gRPC (grpc/tcp_grpc_access_log_impl.h, TcpGrpcAccessLog) — sends TCPAccessLogEntry, not text/JSON. (5) OpenTelemetry (open_telemetry/access_log_impl.h) — structured via OTel AnyValue/KeyValueList protos, not JSON. (6) Fluentd (fluentd/fluentd_access_log_impl.h, FluentdAccessLog) — supports JSON (record field is google.protobuf.Struct, converted to msgpack). (7) CEL filter (filters/cel/cel.h, CELAccessLogExtensionFilter) — access log filter, not a sink. Proto definitions in data-plane-api: file.proto (FileAccessLog), als.proto (HttpGrpcAccessLogConfig, TcpGrpcAccessLogConfig), logs_service.proto (OpenTelemetryAccessLogConfig), fluentd.proto (FluentdAccessLogConfig), stream.proto (StdoutAccessLog, StderrAccessLog).",
+  "_metadata": {
+    "oracle_type": "file_set_match",
+    "discovery_method": "sourcegraph_keyword_search",
+    "queries": [
+      "repo:^github.com/sg-evals/envoy--v1.31.2$ file:source/extensions/access_loggers/ AccessLog",
+      "repo:^github.com/sg-evals/envoy--v1.31.2$ file:source/common/access_log/ AccessLog",
+      "repo:^github.com/sg-evals/data-plane-api--84e84367$ file:envoy/extensions/access_loggers/"
+    ],
+    "verified_at": "2026-02-23",
+    "pinned_version": "v1.31.2 (envoy), 84e84367 (data-plane-api)"
+  }
+}
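
An oracle of type `file_set_match` lends itself to a set comparison over `(repo, path)` pairs. The commit does not show how the harness scores an answer, so what follows is only a minimal sketch under assumptions: the function names, the precision/recall/F1 metrics, and the `unrelated.cc` path in the sample answer are hypothetical and are not taken from the benchmark code.

```python
from typing import Iterable


def file_key(entry: dict) -> tuple[str, str]:
    """Identify a file by (repo, path), the two fields each oracle entry carries."""
    return (entry["repo"], entry["path"])


def file_set_scores(oracle_files: Iterable[dict], answer_files: Iterable[dict]) -> dict:
    """Hypothetical scoring: precision, recall, and F1 over (repo, path) pairs."""
    expected = {file_key(e) for e in oracle_files}
    found = {file_key(e) for e in answer_files}
    hits = expected & found
    precision = len(hits) / len(found) if found else 0.0
    recall = len(hits) / len(expected) if expected else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "missing": sorted(expected - found),  # oracle files the answer did not list
    }


# Two entries copied from the oracle above.
oracle = [
    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.h"},
    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/file/config.cc"},
]
# A made-up agent answer: one correct file and one hypothetical wrong path.
answer = [
    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.h"},
    {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/unrelated.cc"},
]
scores = file_set_scores(oracle, answer)
print(scores)
```

Keying on the pair rather than the path alone matters here because the oracle spans two repositories (envoy and data-plane-api).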

benchmarks/ccb_mcp_compliance/ccx-compliance-052/tests/task_spec.json

Lines changed: 49 additions & 10 deletions
@@ -13,23 +13,62 @@
   "artifacts": {
     "repo_set_id": "envoy-service-mesh",
     "oracle": {
-      "required_files": [],
-      "required_symbols": [],
+      "required_files": [
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.cc"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/file/config.cc"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/http_grpc_access_log_impl.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/tcp_grpc_access_log_impl.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/open_telemetry/access_log_impl.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/fluentd/fluentd_access_log_impl.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/stream/config.cc"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/access_log_base.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/file_access_log_impl.h"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/filters/cel/cel.h"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/file/v3/file.proto"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/grpc/v3/als.proto"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/open_telemetry/v3/logs_service.proto"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/fluentd/v3/fluentd.proto"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/stream/v3/stream.proto"}
+      ],
+      "required_symbols": [
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/common/access_log/access_log_impl.h", "symbol": "AccessLogFactory", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/access_log_base.h", "symbol": "ImplBase", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/common/file_access_log_impl.h", "symbol": "FileAccessLog", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/file/config.cc", "symbol": "FileAccessLogFactory", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/http_grpc_access_log_impl.h", "symbol": "HttpGrpcAccessLog", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/grpc/tcp_grpc_access_log_impl.h", "symbol": "TcpGrpcAccessLog", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/open_telemetry/access_log_impl.h", "symbol": "AccessLog", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/fluentd/fluentd_access_log_impl.h", "symbol": "FluentdAccessLog", "kind": "class"},
+        {"repo": "sg-evals/envoy--v1.31.2", "path": "source/extensions/access_loggers/filters/cel/cel.h", "symbol": "CELAccessLogExtensionFilter", "kind": "class"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/file/v3/file.proto", "symbol": "FileAccessLog", "kind": "class"},
+        {"repo": "sg-evals/data-plane-api--84e84367", "path": "envoy/extensions/access_loggers/grpc/v3/als.proto", "symbol": "HttpGrpcAccessLogConfig", "kind": "class"}
+      ],
       "required_references": [],
       "dependency_chains": []
     }
   },
   "evaluation": {
     "modes": ["deterministic"],
     "checks": [
-      {
-        "type": "file_set_match",
-        "params": {
-          "search_pattern": "",
-          "file_filter": ""
-        }
-      }
-    ],
+      {
+        "type": "file_set_match",
+        "params": {
+          "search_pattern": "AccessLog",
+          "file_filter": "access_loggers/"
+        }
+      },
+      {
+        "type": "symbol_resolution",
+        "params": {}
+      },
+      {
+        "type": "keyword_presence",
+        "params": {
+          "required_keywords": ["FileAccessLog", "HttpGrpcAccessLog", "TcpGrpcAccessLog", "FluentdAccessLog", "OpenTelemetry", "json_format", "CELAccessLogExtensionFilter"]
+        }
+      }
+    ],
     "eval_script": "/tests/eval.sh",
     "pass_exit_code": 0
   },
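
The `keyword_presence` check added in this hunk lists seven required keywords but the diff does not show how `/tests/eval.sh` evaluates them. As a rough illustration only, the sketch below assumes a case-sensitive substring match against the agent's answer text; the function name, the return shape, and the sample answer are hypothetical.

```python
def keyword_presence(answer_text: str, required_keywords: list[str]) -> dict:
    """Hypothetical check: every keyword must appear as a case-sensitive substring."""
    missing = [kw for kw in required_keywords if kw not in answer_text]
    return {"passed": not missing, "missing": missing}


# Keyword list copied from the task_spec.json hunk above.
required = [
    "FileAccessLog", "HttpGrpcAccessLog", "TcpGrpcAccessLog", "FluentdAccessLog",
    "OpenTelemetry", "json_format", "CELAccessLogExtensionFilter",
]
# A made-up, incomplete agent answer.
sample = "FileAccessLog supports json_format; HttpGrpcAccessLog and TcpGrpcAccessLog stream protobufs."
result = keyword_presence(sample, required)
print(result)
```

Under this assumption a keyword such as `FileAccessLog` would also be satisfied by a longer identifier like `FileAccessLogFactory`; a stricter evaluator might match on word boundaries instead.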
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
{
2+
"files": [
3+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java"},
4+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizer.java"},
5+
{"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java"},
6+
{"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Authorizer.java"}
7+
],
8+
"symbols": [
9+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "StandardAuthorizerData"},
10+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "logAuditMessage"},
11+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "buildAuditMessage"},
12+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "auditLog"},
13+
{"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizer.java", "symbol": "StandardAuthorizer"},
14+
{"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java", "symbol": "logIfAllowed"},
15+
{"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java", "symbol": "logIfDenied"}
16+
],
17+
"text": "Kafka's ACL-based authorization audit logging is concentrated in the StandardAuthorizer subsystem within the metadata module. The architecture has four key files:\n\n1. **StandardAuthorizerData** (metadata/src/main/java/.../StandardAuthorizerData.java): The primary audit log producer. Contains a dedicated `auditLog` Logger field initialized via `auditLogger()` which returns `LoggerFactory.getLogger(\"kafka.authorizer.logger\")`. The `authorize()` method calls `logAuditMessage()` after every authorization decision. `logAuditMessage()` switches on ALLOWED/DENIED: for ALLOWED, it logs at debug level (if `logIfAllowed` is set) or trace level; for DENIED, it logs at info level (if `logIfDenied` is set) or trace level. The `buildAuditMessage()` method constructs the audit string: \"Principal = {principal} is Allowed/Denied operation = {op} from host = {host} on resource = {resource} for request = {apiKey} with resourceRefCount = {count} based on rule {rule}\".\n\n2. **StandardAuthorizer** (metadata/src/main/java/.../StandardAuthorizer.java): The built-in Authorizer implementation that stores ACLs in the metadata log. Its `authorize()` method delegates to `StandardAuthorizerData.authorize()` which triggers the audit logging, and also records authorization metrics via an inner `AuthorizerMetrics` class tracking allowed/denied rates.\n\n3. **Action** (clients/src/main/java/.../Action.java): Defines the `logIfAllowed` and `logIfDenied` boolean fields that control whether a given authorization action should be included in audit logs. These flags distinguish between actual access grants/denials and metadata-only operations (e.g., describing authorized operations).\n\n4. **Authorizer** (clients/src/main/java/.../Authorizer.java): The pluggable authorizer interface. Its Javadoc notes that custom implementations should override `authorizeByResourceType()` to integrate with audit logging (the default implementation cannot).",
18+
"_metadata": {
19+
"oracle_type": "file_set_match",
20+
"discovery_method": "sourcegraph_keyword_search",
21+
"queries": [
22+
"repo:^github.com/sg-evals/kafka--0753c489$ audit file:.*.java$",
23+
"repo:^github.com/sg-evals/kafka--0753c489$ AuditLog",
24+
"repo:^github.com/sg-evals/kafka--0753c489$ authorization log file:.*.java$",
25+
"repo:^github.com/sg-evals/kafka--0753c489$ file:authorizer log",
26+
"repo:^github.com/sg-evals/kafka--0753c489$ \"kafka.authorizer.logger\"",
27+
"repo:^github.com/sg-evals/kafka--0753c489$ \"logAudit\" OR \"auditLog\" file:.*.java$"
28+
],
29+
"verified_at": "2026-02-23",
30+
"pinned_version": "0753c489"
31+
}
32+
}
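
The oracle text above describes how `logAuditMessage()` picks a log level and how `buildAuditMessage()` lays out the audit string. The Python sketch below restates that description for illustration; it is not the Kafka Java source, the function and parameter names are hypothetical transliterations, and the sample principal, host, resource, and rule values are made up.

```python
def audit_log_level(result: str, log_if_allowed: bool, log_if_denied: bool) -> str:
    """Return the level name the oracle text describes for one authorization decision."""
    if result == "ALLOWED":
        # Allowed decisions log at debug when the action asks for it, else trace.
        return "debug" if log_if_allowed else "trace"
    if result == "DENIED":
        # Denied decisions log at info when the action asks for it, else trace.
        return "info" if log_if_denied else "trace"
    raise ValueError(f"unexpected authorization result: {result!r}")


def build_audit_message(principal, result, op, host, resource, api_key, ref_count, rule) -> str:
    """Fill in the audit string template quoted in the oracle text."""
    verdict = "Allowed" if result == "ALLOWED" else "Denied"
    return (
        f"Principal = {principal} is {verdict} operation = {op} "
        f"from host = {host} on resource = {resource} for request = {api_key} "
        f"with resourceRefCount = {ref_count} based on rule {rule}"
    )


# Hypothetical sample values, for illustration only.
level = audit_log_level("DENIED", log_if_allowed=True, log_if_denied=True)
message = build_audit_message(
    "User:alice", "DENIED", "WRITE", "10.0.0.5",
    "Topic:LITERAL:payments", "PRODUCE", 1, "DefaultDeny",
)
print(level, message)
```

The asymmetry is the point of the description: with both flags set, a denial surfaces at info while a grant only reaches debug, and anything with its flag unset drops to trace.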

benchmarks/ccb_mcp_compliance/ccx-compliance-053/tests/task_spec.json

Lines changed: 37 additions & 10 deletions
@@ -13,23 +13,50 @@
   "artifacts": {
     "repo_set_id": "apache-kafka-ecosystem",
     "oracle": {
-      "required_files": [],
-      "required_symbols": [],
+      "required_files": [
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizer.java"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Authorizer.java"}
+      ],
+      "required_symbols": [
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "StandardAuthorizerData"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "logAuditMessage"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "buildAuditMessage"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "symbol": "auditLog"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizer.java", "symbol": "StandardAuthorizer"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java", "symbol": "logIfAllowed"},
+        {"repo": "sg-evals/kafka--0753c489", "path": "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java", "symbol": "logIfDenied"}
+      ],
       "required_references": [],
       "dependency_chains": []
     }
   },
   "evaluation": {
     "modes": ["deterministic"],
     "checks": [
-      {
-        "type": "file_set_match",
-        "params": {
-          "search_pattern": "",
-          "file_filter": ""
-        }
-      }
-    ],
+      {
+        "type": "file_set_match",
+        "params": {}
+      },
+      {
+        "type": "symbol_resolution",
+        "params": {}
+      },
+      {
+        "type": "keyword_presence",
+        "params": {
+          "required_keywords": ["StandardAuthorizerData", "logAuditMessage", "auditLog", "kafka.authorizer.logger", "logIfAllowed", "logIfDenied"]
+        }
+      },
+      {
+        "type": "provenance",
+        "params": {
+          "must_cite_repos": ["sg-evals/kafka--0753c489"],
+          "must_cite_paths": ["metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java", "clients/src/main/java/org/apache/kafka/server/authorizer/Action.java"]
+        }
+      }
+    ],
     "eval_script": "/tests/eval.sh",
     "pass_exit_code": 0
   },
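
Because each task now carries the same oracle data in two places (`oracle_answer.json` and the `oracle` block of `task_spec.json`), a small consistency lint could catch drift between them. The sketch below is one possible shape for such a lint, not part of the commit: the function name is hypothetical, it assumes `artifacts` and `evaluation` are top-level keys of `task_spec.json` (the hunk starts at line 13, so the enclosing structure is not visible), and the `r`, `a.java`, and `b.java` values are placeholders.

```python
def lint_task_spec(oracle_answer: dict, task_spec: dict) -> list[str]:
    """Hypothetical lint: report mismatches between the two curated files."""
    problems = []
    oracle = task_spec["artifacts"]["oracle"]

    # The populated oracle fields should mirror oracle_answer.json.
    if oracle["required_files"] != oracle_answer["files"]:
        problems.append("required_files differs from oracle_answer files")
    if oracle["required_symbols"] != oracle_answer["symbols"]:
        problems.append("required_symbols differs from oracle_answer symbols")

    # Provenance checks should only cite repos and paths the oracle knows about.
    known_repos = {f["repo"] for f in oracle["required_files"]}
    known_paths = {f["path"] for f in oracle["required_files"]}
    for check in task_spec["evaluation"]["checks"]:
        if check["type"] != "provenance":
            continue
        for repo in check["params"].get("must_cite_repos", []):
            if repo not in known_repos:
                problems.append(f"provenance repo not in required_files: {repo}")
        for path in check["params"].get("must_cite_paths", []):
            if path not in known_paths:
                problems.append(f"provenance path not in required_files: {path}")
    return problems


# Placeholder data: the provenance check cites a path the oracle does not list.
oracle_answer = {"files": [{"repo": "r", "path": "a.java"}], "symbols": []}
task_spec = {
    "artifacts": {"oracle": {"required_files": [{"repo": "r", "path": "a.java"}],
                             "required_symbols": []}},
    "evaluation": {"checks": [
        {"type": "file_set_match", "params": {}},
        {"type": "provenance",
         "params": {"must_cite_repos": ["r"], "must_cite_paths": ["b.java"]}},
    ]},
}
problems = lint_task_spec(oracle_answer, task_spec)
print(problems)
```

Applied to the Kafka task above, such a lint would pass on the provenance check, since both cited paths appear in `required_files`.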
