Suppress spurious "Failed to detach context" log in OTel interceptor#1596
Open
brianstrauch wants to merge 1 commit into
Open
Suppress spurious "Failed to detach context" log in OTel interceptor#1596brianstrauch wants to merge 1 commit into
brianstrauch wants to merge 1 commit into
Conversation
The tracing workflow interceptor attaches an OTel context at the start of execute_workflow/handle_query and detaches it in the finally. The workflow event loop runs portions of the workflow inside contextvars.copy_context().run(...), so the finally can execute in a copied contextvars.Context. A copy preserves the OTel context value (so the existing `context is get_current()` guard passes) but invalidates the attach token, so opentelemetry.context.detach logs "Failed to detach context". Because LogCapturer attaches to the process-global opentelemetry.context logger, that stray log also bled into test_opentelemetry_safe_detach when a tracing workflow tore down during its capture window, making it flaky. Route the best-effort detach through _safe_detach, which drops only that one log record via a scoped logging filter. It still calls opentelemetry.context.detach so attach/detach calls stay balanced, satisfying the leak invariant enforced by test_opentelemetry_context_restored_after_activity.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Flaky test observed here: https://github.com/temporalio/sdk-python/actions/runs/27432588766/job/81086412642?pr=1378
TracingWorkflowInboundInterceptorattaches an OTel context at the start ofexecute_workflow/handle_queryand detaches it in thefinally. The workflow event loop runs portions of the workflow insidecontextvars.copy_context().run(...)(e.g.wait_condition), so thatfinallycan execute in a copiedcontextvars.Context. A copy preserves the OTel context value — so the existingcontext is get_current()guard passes — but invalidates the attach token, soopentelemetry.context.detachfails internally and logs "Failed to detach context".Because
LogCapturerattaches a handler to the process-globalopentelemetry.contextlogger, that stray ERROR log bled intotest_opentelemetry_safe_detachwhenever a tracing workflow tore down during its capture window (same xdist worker process), making the test flaky. In production it's recurring ERROR-level noise on affected workflow completions.Fix
Route the best-effort detach through a small
_safe_detachhelper that drops only the"Failed to detach context"record via a scopedlogging.Filter.It still calls
opentelemetry.context.detach, so attach/detach call counts stay balanced — the leak invariant enforced bytest_opentelemetry_context_restored_after_activity. (Alternatives that avoid the publicdetach— restoring viaattach(previous), or resetting the runtime contextvar directly — both break that balance test, so suppressing at the logger is the only approach that satisfies balanced calls + no spurious log + no test weakening.)Caveat
During a detach the filter is attached to the process-global OTel context logger, so it would also drop an unrelated "Failed to detach context" emitted by other code in that brief window. Given the alternatives break a correctness invariant, this is the right trade-off.
Testing
tests/contrib/opentelemetry/andtests/contrib/openai_agents/test_openai_tracing.pypass together (27/27), including the attach/detach balance test.detachbut produces no record through_safe_detach.