fix(span): clear dangling root_span aliases on free (fixes baggage SIGSEGV) #3943

Open
jdani-airalo wants to merge 1 commit into DataDog:master from jdani-airalo:harden-baggage-span-inheritance

Conversation

@jdani-airalo

@jdani-airalo jdani-airalo commented Jun 2, 2026


Root cause. A span stack holds a non-owning pointer to its root span (ddtrace_init_span_stack copies it without taking a reference, ext/span.c). The parent-stack reference chain normally keeps the root span alive while any stack aliases it, but out-of-order root span drops break that invariant: ddtrace_span_alter_root_span_config() (runtime config change) and the rejection branch of ddtrace_drop_span() free the root span while only NULLing the current stack's pointer, leaving sibling/descendant stacks dangling. The next cross-stack inherit (ddtrace_set_root_span_properties → ddtrace_inherit_span_properties) then dereferences the freed span.
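To make the lifetime problem concrete, here is a minimal toy model of the aliasing described above. It is only a sketch: the `toy_*` types and functions are hypothetical stand-ins, not the real ddtrace structs, and "freeing" is modelled with a `live` flag so the example stays well-defined instead of actually reading freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these are NOT the real ddtrace structs. */
typedef struct toy_root_span {
    bool live;                 /* false once the span has been "freed" */
} toy_root_span;

typedef struct toy_span_stack {
    toy_root_span *root_span;  /* non-owning alias: no reference is taken */
} toy_span_stack;

/* A child stack copies the parent's root pointer without taking a reference. */
void toy_init_child_stack(toy_span_stack *child, const toy_span_stack *parent) {
    child->root_span = parent->root_span;
}

/* Out-of-order drop: releases the root span but clears only `current`. */
void toy_drop_root_buggy(toy_span_stack *current) {
    assert(current->root_span != NULL);
    current->root_span->live = false;  /* stands in for freeing the object */
    current->root_span = NULL;
}

/* Returns true when the child still aliases a root span that is no longer live. */
bool toy_child_dangles_after_drop(void) {
    toy_root_span root = { .live = true };
    toy_span_stack parent = { .root_span = &root };
    toy_span_stack child = { .root_span = NULL };

    toy_init_child_stack(&child, &parent);
    toy_drop_root_buggy(&parent);

    return parent.root_span == NULL
        && child.root_span != NULL
        && !child.root_span->live;
}
```

In this model the dropping stack ends up with NULL, while the child keeps a non-NULL pointer to a dead span, which is the state the next inherit would read from.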

It faults on the baggage copy specifically: with baggage populated, property_baggage points at a heap zend_array freed with the span, so ZVAL_COPY_DEREF → GC_ADDREF dereferences freed memory (addl $0x1,(%rdx) with %rdx=0). With empty baggage it stays the immutable ZVAL_EMPTY_ARRAY default, whose pointer survives the freed read. That is why disabling baggage extraction masked it, and why \DDTrace\active_span()->baggage in normal user code does not crash (the root span is alive there).

Fix. When a root span object is freed, scrub every span stack still aliasing it so the weak pointer can never outlive the object, regardless of which drop path freed it. NULL is already the "no root span" sentinel handled throughout (e.g. ddtrace_set_root_span_properties guards on parent_root), so this introduces no new state. The scan runs ~once per trace, mirroring ddtrace_mark_all_span_stacks_flushable.
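A rough sketch of the scrub-on-free idea, under stated assumptions: the `sk_*` names and the fixed-size registry are hypothetical illustrations, not the code in this PR, and the real extension tracks its stacks differently. The point is only the ordering: clear every alias first, then release the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these are NOT the real ddtrace structs. */
typedef struct sk_root_span { bool live; } sk_root_span;
typedef struct sk_span_stack { sk_root_span *root_span; } sk_span_stack;

/* Hypothetical registry of every stack that may alias a root span. */
#define SK_MAX_STACKS 8
typedef struct sk_registry {
    sk_span_stack *stacks[SK_MAX_STACKS];
    size_t count;
} sk_registry;

void sk_register_stack(sk_registry *reg, sk_span_stack *stack) {
    assert(reg->count < SK_MAX_STACKS);
    reg->stacks[reg->count++] = stack;
}

/* Clears every alias of `root`, then releases it.
 * Returns the number of stack pointers that were reset to NULL. */
size_t sk_free_root_and_scrub(sk_registry *reg, sk_root_span *root) {
    size_t scrubbed = 0;
    for (size_t i = 0; i < reg->count; i++) {
        if (reg->stacks[i]->root_span == root) {
            reg->stacks[i]->root_span = NULL;  /* the existing "no root span" sentinel */
            scrubbed++;
        }
    }
    root->live = false;  /* stands in for freeing the object */
    return scrubbed;
}

/* Parent and child alias the dropped root; an unrelated stack must be untouched. */
bool sk_scrub_demo(void) {
    sk_root_span dropped = { .live = true };
    sk_root_span other = { .live = true };
    sk_span_stack parent = { .root_span = &dropped };
    sk_span_stack child = { .root_span = &dropped };
    sk_span_stack unrelated = { .root_span = &other };

    sk_registry reg = { .count = 0 };
    sk_register_stack(&reg, &parent);
    sk_register_stack(&reg, &child);
    sk_register_stack(&reg, &unrelated);

    size_t scrubbed = sk_free_root_and_scrub(&reg, &dropped);

    return scrubbed == 2
        && parent.root_span == NULL
        && child.root_span == NULL
        && unrelated.root_span == &other
        && other.live;
}
```

Because readers already treat NULL as "no root span", a scrub like this needs no new state; the cost is one linear pass over the stacks per root span free.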

This supersedes the previous defensive guard in ddtrace_inherit_span_properties (reverted here): the malformed (refcounted-flagged, NULL/freed counted pointer) zval can no longer arise.

Note: not compiled/tested locally — relies on CI and maintainer review.

@jdani-airalo jdani-airalo requested a review from a team as a code owner June 2, 2026 11:52
@jdani-airalo jdani-airalo changed the title Guard span baggage inheritance against malformed parent zval fix: Guard span baggage inheritance against malformed parent zval Jun 2, 2026
@bwoebi

bwoebi commented Jun 3, 2026

Collaborator

Hey @jdani-airalo.

The big question is: how does that become NULL? This is an invalid state, and direct access to \DDTrace\active_span()->baggage in very normal user code would probably crash as well.

I was unable to pin down a possible path to get this to crash. Do you have a reproducer? Is there any information your AI could extract from the core dump / source code to reconstruct a path to this crash?

This PR is a band-aid, not a proper fix :-(

@jdani-airalo jdani-airalo force-pushed the harden-baggage-span-inheritance branch from 8991876 to a155215 Compare June 8, 2026 11:28
@jdani-airalo jdani-airalo changed the title fix: Guard span baggage inheritance against malformed parent zval fix(span): clear dangling root_span aliases on free (fixes baggage SIGSEGV) Jun 8, 2026
@jdani-airalo

Author

Dug into the lifetime and I think I have the actual root cause — it's not invalid baggage, it's a use-after-free of the parent root span.

stack->root_span is a non-owning pointer (ddtrace_init_span_stack copies it without taking a reference). The parent-stack reference chain normally keeps it alive, but the out-of-order drops, ddtrace_span_alter_root_span_config() (runtime config change) and the rejection branch in ddtrace_drop_span(), free the root span while only NULLing the current stack's pointer, so sibling/descendant stacks dangle. The next ddtrace_set_root_span_properties → ddtrace_inherit_span_properties reads the freed span.

Baggage is just the trigger, which is what makes it look like an "invalid state":

  • populated baggage → property_baggage points at a heap zend_array freed with the span → GC_ADDREF on freed memory → the addl $0x1,(%rdx) fault.
  • empty baggage → it stays the immutable ZVAL_EMPTY_ARRAY default whose pointer survives the freed read → no crash (this is why disabling extraction "fixes" it).
  • \DDTrace\active_span()->baggage in normal code is fine because the root span is alive there.
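The difference between the first two bullets can be sketched with a toy value type. This is an illustration under assumptions, not Zend Engine code: the `bg_*` names are hypothetical, and the `refcounted` flag stands in for the refcounted bit in a zval's type info, on the understanding that the immutable empty-array default is not flagged refcounted and so a copy never writes through its pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these are NOT the real Zend Engine structs. */
typedef struct bg_counted { unsigned refcount; } bg_counted;

typedef struct bg_value {
    bg_counted *counted;  /* payload pointer */
    bool refcounted;      /* stands in for the zval's refcounted type flag */
} bg_value;

/* Shared immutable default with static lifetime (assumed analogue of the empty array). */
bg_counted bg_shared_empty = { .refcount = 2 };

void bg_init_empty(bg_value *v) {
    v->counted = &bg_shared_empty;
    v->refcounted = false;  /* immutable: copies must not touch the count */
}

/* Copy semantics: only a value flagged refcounted writes through its pointer. */
void bg_copy(bg_value *dst, const bg_value *src) {
    assert(dst != NULL && src != NULL);
    *dst = *src;
    if (src->refcounted) {
        src->counted->refcount++;  /* the write that would hit freed memory */
    }
}

/* Returns true if the empty default is left untouched and the heap payload is bumped. */
bool bg_copy_demo(void) {
    bg_value empty, copy_of_empty;
    bg_init_empty(&empty);
    unsigned before = bg_shared_empty.refcount;
    bg_copy(&copy_of_empty, &empty);
    bool empty_untouched = bg_shared_empty.refcount == before;

    bg_counted heap_array = { .refcount = 1 };  /* stands in for a populated baggage array */
    bg_value populated = { .counted = &heap_array, .refcounted = true };
    bg_value copy_of_populated;
    bg_copy(&copy_of_populated, &populated);
    bool populated_bumped = heap_array.refcount == 2;

    return empty_untouched && populated_bumped;
}
```

In this model, copying the empty default never dereferences the payload, so a stale read of it goes unnoticed, while copying a populated value increments a count that lives inside the payload, which is where a freed payload would fault.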

I've replaced the guard with a fix at the free site: when a root span is freed, scrub any stack still aliasing it so the weak pointer can't outlive the object. NULL is already the handled "no root span" sentinel, so no new state is introduced, and it covers every drop path rather than one read site.

I haven't been able to compile/run it locally, so it'll need CI + your build. If useful I can also share (a) a candidate ASAN reproducer — fork a child stack so it captures root_span, drop the parent root span via the generate_root_span config path, then start a new trace whose parent_stack dangles — and (b) the exact gdb session to confirm from the existing core dump that parent is freed memory (parent->std.handlers not ddtrace_root_span_data_handlers, and %rdx == parent->property_baggage.value.counted).

@jdani-airalo jdani-airalo reopened this Jun 8, 2026
Fixes a SIGSEGV in ddtrace_inherit_span_properties that surfaced in
production (PHP 8.3 NTS, fpm-fcgi) as a GC_ADDREF on a NULL/freed pointer
while copying a parent span's baggage.

Root cause: a span stack keeps a NON-owning pointer to its root span
(ddtrace_init_span_stack copies it without taking a reference). The
parent-stack reference chain normally keeps the root span alive while any
stack aliases it, but out-of-order root span drops break that invariant:
ddtrace_span_alter_root_span_config() (runtime config change) and the
rejection branch of ddtrace_drop_span() free the root span while only
NULLing the *current* stack's pointer, leaving sibling/descendant stacks
dangling. The next cross-stack inherit (ddtrace_set_root_span_properties ->
ddtrace_inherit_span_properties) then dereferences the freed span.

It crashes on the baggage copy specifically: with baggage populated,
property_baggage points at a heap zend_array freed with the span, so the
ZVAL_COPY_DEREF -> GC_ADDREF dereferences freed memory. With empty baggage
it stays the immutable ZVAL_EMPTY_ARRAY default whose pointer survives the
freed read -- which is why disabling baggage extraction masked the bug.

Fix: when a root span object is actually freed, scrub every span stack that
still aliases it so the weak pointer can never outlive the object,
regardless of which drop path freed it. NULL is already the "no root span"
sentinel handled throughout (e.g. ddtrace_set_root_span_properties guards on
parent_root), so this introduces no new state. The scan runs once per root
span free (~once per trace), mirroring ddtrace_mark_all_span_stacks_flushable.

This supersedes the previous defensive guard in ddtrace_inherit_span_properties
(reverted here): the malformed (refcounted-flagged, NULL/freed counted pointer)
zval can no longer arise.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jdani-airalo jdani-airalo force-pushed the harden-baggage-span-inheritance branch from a155215 to 49cc658 Compare June 8, 2026 11:32