[APMSVLS-487] respect Datadog-Client-Computed-Stats header#1245
[APMSVLS-487] respect Datadog-Client-Computed-Stats header#1245lucaspimentel wants to merge 13 commits into
Datadog-Client-Computed-Stats header#1245Conversation
Datadog-Client-Computed-Stats header
|
d5220c9 to
003415f
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 55f436b34e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Updates trace/stat computation behavior to respect the tracer’s Datadog-Client-Computed-Stats signal, moving _dd.compute_stats from function tags to a per-span backend directive and ensuring extension-side stat generation is suppressed when the tracer already computed stats.
Changes:
- Stamp
_dd.compute_stats="1"per-span only when the backend should compute stats; otherwise leave it absent. - Propagate
client_computed_statsthrough the invocation (Path B) and OTLP pipelines to prevent double stat generation. - Update integration/unit tests to stop asserting
_dd.compute_statsin tags/logs and add truth-table coverage for stamping + stats suppression.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| bottlecap/src/traces/trace_processor.rs | Adds StatsComputedBy resolution, per-span stamping, and aligned stats-generation guards; extends tests. |
| bottlecap/src/otlp/agent.rs | Uses shared StatsComputedBy logic to skip extension stats when tracer computed stats. |
| bottlecap/src/traces/trace_agent.rs | Propagates client_computed_stats into invocation processing for Path B. |
| bottlecap/src/lifecycle/invocation/context.rs | Stores client_computed_stats on context to carry tracer signal through Path B. |
| bottlecap/src/lifecycle/invocation/processor.rs | Threads client_computed_stats into extension-generated spans via header tags. |
| bottlecap/src/lifecycle/invocation/processor_service.rs | Extends command plumbing to carry client_computed_stats. |
| bottlecap/src/tags/lambda/tags.rs | Stops baking _dd.compute_stats into function tags; exposes key constant; updates tests. |
| bottlecap/tests/apm_integration_test.rs | Adds end-to-end fake-intake tests validating stamping and stats suppression matrix. |
| bottlecap/tests/logs_integration_test.rs | Removes _dd.compute_stats assertion in logs payload (no longer a tag). |
| bottlecap/tests/metrics_integration_test.rs | Updates expected payload comment to reflect removal of _dd.compute_stats from tags. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 529cfc18a3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
529cfc1 to
2700ebf
Compare
Tests-first plan for respecting the Datadog-Client-Computed-Stats header: canonical _dd.compute_stats rule, Path A/B production changes, and tiered tests (header contract, trace_processor unit, Path B unit, full fake-intake E2E). 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
Document how libdatadog parses client_computed_stats (non-empty) and client_computed_top_level (presence-only) differently from each other and from the Go agent's uniform isHeaderTrue/ParseBool rule. Note both are latent and fold both into the follow-up libdatadog PR. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
…tead of baking it Move _dd.compute_stats from a baked-in function tag to a per-span backend directive stamped by the trace processor, and respect the tracer's Datadog-Client-Computed-Stats signal. - tags.rs: stop inserting _dd.compute_stats in tags_from_env (it no longer leaks into _dd.tags.function); make COMPUTE_STATS_KEY pub(crate). - trace_processor.rs: add client_computed_stats to ChunkProcessor; stamp _dd.compute_stats="1" on each span's meta only when neither the extension (compute_trace_stats_on_extension) nor the tracer (client_computed_stats) computes stats, matching the Go agent (set "1" or leave absent). Add && !client_computed_stats to the extension-side stats-generation guard. - Tests: truth-table on Span.meta (#1118 guard), stats-skip guard via the real StatsConcentratorService, updated tags.rs unit tests, and removed the stale _dd.compute_stats assertions from the logs/metrics integration tests. All 548 workspace tests pass; fmt + clippy clean.
…ath B aws.lambda span Thread the tracer's Datadog-Client-Computed-Stats signal through the extension-generated aws.lambda span (Path B) so _dd.compute_stats is stamped consistently with Path A (the tracer-trace path handled in Tier 1). - context.rs: add Context.client_computed_stats (+ Default false); ContextBuffer::add_tracer_span takes the flag and records it on the matching context. - processor.rs: add_tracer_span threads the param; send_ctx_spans captures context.client_computed_stats before get_ctx_spans consumes the context; send_spans takes the param and sets it on its TracerHeaderTags so Path B reuses the Tier-1 ChunkProcessor stamping (single source of truth); send_cold_start_span passes false. - processor_service.rs: ProcessorCommand::AddTracerSpan + handle + handler carry the flag. - trace_agent.rs: annotate tracer_header_tags type and pass client_computed_stats to add_tracer_span at the placeholder-span branch. - Tests: context-level flag recording, and an end-to-end Path B test that drives send_ctx_spans through the trace_tx channel and asserts _dd.compute_stats on the aws.lambda span across the full truth table. All 550 workspace tests pass; fmt + clippy clean.
…s + stats suppression Add end-to-end coverage that routes a trace through SendingTraceProcessor::send_processed_traces (so process_traces/ChunkProcessor actually run) and asserts on the wire payloads captured by the in-process fake-intake. - run_processor_pipeline(compute_on_extension, client_computed_stats): drains the trace_tx channel into AggregatorService synchronously after send (avoids a flush race), flushes via TraceFlusher, and flushes stats via StatsConcentratorService -> StatsAggregator -> StatsFlusher. Parameterizes header_tags_with(client_computed_stats). - T3.1 e2e_client_computed_stats_leaves_compute_stats_absent: tracer computed -> span meta absent. - T3.2 e2e_compute_stats_truth_table_on_captured_span: "1" only for the neither-computes row. - T3.3 e2e_stats_suppressed_unless_extension_computes: one stats payload only for (compute_on_extension, !client_computed_stats); empty otherwise. - T3.4 e2e_client_computed_stats_absent_meta_and_no_stats: combined meta-absent + zero stats. 554/554 workspace tests pass; fmt + clippy clean.
The OTLP trace pipeline's stats-generation guard only checked compute_trace_stats_on_extension, so an OTLP request carrying Datadog-Client-Computed-Stats would still generate extension-side stats, double-counting against the tracer's client-side stats. Mirror the guard already used in send_processed_traces by also skipping when client_computed_stats is set. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
Promote COMPUTE_STATS_KEY from pub(crate) to pub so the apm_integration test crate can import it instead of re-declaring the "_dd.compute_stats" string literal, keeping a single source of truth for the tag key. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
…By enum The decision of who computes trace stats (backend, extension, or tracer) was encoded as three separate boolean expressions over the same two inputs: the per-span _dd.compute_stats stamp in ChunkProcessor and the extension-side stats-generation guards in SendingTraceProcessor and the OTLP agent. Phrased differently and spread across two files, they could silently drift apart, either dropping stats (nobody computes) or double-counting them (extension and backend both compute). Derive the decision once via StatsComputedBy::resolve and have all three sites ask a single unambiguous question. No behavior change. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
2700ebf to
756c0bc
Compare
Overview
Makes the extension respect the tracer's
Datadog-Client-Computed-Statsheader and moves_dd.compute_statsfrom a baked-in function tag to a per-span backend directive.We tried supporting the
Datadog-Client-Computed-Statsheader before in #1118, but that was reverted in #1176.Background
Span attribute
_dd.compute_statsasks the backend to compute trace stats. It must be set to"1"only when nobody else computed them — neither the extension (agent) nor the tracer. Previously the extension:_dd.compute_statsinto the function tags unconditionally (tags_from_env), which also leaked the key into_dd.tags.function.Datadog-Client-Computed-Statsentirely, so when a tracer computed stats client-side, the backend was still asked to compute them.Canonical semantics (validated against the Go agent)
The Go agent (
pkg/serverless/tags/tags.go) only ever sets_dd.compute_stats = "1", never"0", and leaves the key absent otherwise. This PR matches that:client_computed_stats(header from tracer)compute_on_extension(extension config)_dd.compute_stats="1"?Changes
tags/lambda/tags.rs— stop baking_dd.compute_statsintags_from_env(no longer leaks into_dd.tags.function);COMPUTE_STATS_KEYis nowpubso the integration test can reuse it instead of re-declaring the literal.traces/trace_processor.rs—ChunkProcessorgainsclient_computed_statsand stamps_dd.compute_stats="1"per-span only when neither side computes stats; the extension-side stats-generation guard insend_processed_tracesnow also skips whenclient_computed_statsis set.aws.lambdaspan) —client_computed_statsis propagated from the tracer's placeholder span throughcontext.rs→processor.rs→processor_service.rs→trace_agent.rs, so Path B reuses the sameChunkProcessorstamping (single source of truth).otlp/agent.rs— the OTLP stats-generation guard previously checked onlycompute_trace_stats_on_extension, so an OTLP request carryingDatadog-Client-Computed-Statswould still generate extension-side stats and double-count against the tracer. It now also skips whenclient_computed_statsis set, mirroring thesend_processed_tracesguard.traces/trace_processor.rs— the three decisions over the same two inputs (the per-span_dd.compute_statsstamp, plus the extension-side stats-generation guards insend_processed_tracesandotlp/agent.rs) are now derived from oneStatsComputedBy::resolve(compute_on_extension, client_computed_stats)helper, so the stamp and the guards can't silently drift apart.Note on the header value (cross-runtime)
Datadog-Client-Computed-Statsis not standardized ("true".NET/Java/PHP/Python,"yes"JS/Ruby/C++,"t"Go). bottlecap consumes the already-parsedclient_computed_statsbool fromlibdd_trace_utils, where any non-empty value →true, so the fix triggers on every runtime. A separate libdatadog PR (DataDog/libdatadog#2071) aligns the header parsing with the Go agent'sisHeaderTrue/ParseBoolrules; this PR only consumes the bool and does not depend on that change.Testing
Tier 0 (header-parsing contract)— moved to the libdatadog bump PR chore(deps): bumplibdatadogto48da0d8for client-computed header fix #1244, since those tests assert libdatadog parsing behavior that thedb05e1f → 48da0d8bump changes. This branch keeps Tiers 1–3 and rebases onto #1244 after it merges.trace_processor.rs) — truth-table onSpan.meta, stats-skip guard via the realStatsConcentratorService, and updatedtags.rsunit tests asserting the key no longer appears in the tags map. Fixed the logs/metrics integration tests that asserted the old leak.context.rs,processor.rs) — context-level flag recording and an end-to-end Path B test drivingsend_ctx_spansthrough thetrace_txchannel, asserting_dd.compute_statson theaws.lambdaspan across the truth table.apm_integration_test.rs) — full fake-intake E2E routing a trace throughSendingTraceProcessor::send_processed_traces: asserts on the capturedAgentPayloadspan meta and onstats_payloads()(stats suppressed unless the extension computes and the tracer did not).APMSVLS-487