fix: ensure LLM callbacks share the same OTel span context#4854
brucearctor wants to merge 1 commit into google:main
Conversation
Move `before_model_callback` inside the `call_llm` span and wrap `after_model_callback` with `trace.use_span(span)` to re-activate the `call_llm` span context. This ensures `before_model_callback`, `after_model_callback`, and `on_model_error_callback` all see the same `span_id`, fixing the mismatch that broke the BigQuery Analytics Plugin.

The root cause was twofold:

1. `before_model_callback` ran outside the `call_llm` span
2. `after_model_callback` ran inside a child `generate_content` span (created by `_run_and_handle_error` via `use_inference_span`)

Fixes google#4851
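Both root causes and the re-activation fix can be sketched with a stdlib-only analogy (the names `use_span`, `stream_responses`, and the span names are illustrative, not the real OTel API): a `ContextVar` stands in for the active span, and a plain generator models `_run_and_handle_error`, which yields responses while its child span is still active.

```python
import contextlib
import contextvars

# Hypothetical stand-in for the active OTel span (not the real API).
_active = contextvars.ContextVar("active_span", default="none")


@contextlib.contextmanager
def use_span(name):
  """Activates a span for the duration of the with-block."""
  token = _active.set(name)
  try:
    yield
  finally:
    _active.reset(token)


def stream_responses():
  # Models _run_and_handle_error: it opens a child span and yields while
  # that span is still active. Per PEP 567, a sync generator shares the
  # caller's context, so the child span stays active at the yield point.
  with use_span("generate_content"):
    yield "llm_response"


with use_span("call_llm"):
  before = _active.get()  # what before_model_callback now sees
  for _ in stream_responses():
    naive_after = _active.get()  # the bug: child span is active here
    with use_span("call_llm"):  # the fix: re-activate the call_llm span
      fixed_after = _active.get()

print(before, naive_after, fixed_after)
```

Without the re-activation, a callback run at the yield point records the child `generate_content` span, which is exactly the `span_id` mismatch described above.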
Code Review
This pull request effectively resolves the OpenTelemetry span inconsistency between before_model_callback and after_model_callback. The approach of moving before_model_callback into the call_llm span and reactivating this span for after_model_callback is correct. The new regression tests are comprehensive and well-written, covering success, error, and short-circuit scenarios. I have one suggestion to refactor a small piece of duplicated code that was introduced with this fix to improve maintainability.
```python
with trace.use_span(span):
  if altered_llm_response := await self._handle_after_model_callback(
      invocation_context, llm_response, model_response_event
  ):
    llm_response = altered_llm_response
```
This block of code for handling the after_model_callback is duplicated in the else branch on lines 1192-1196. To improve maintainability and avoid repeating code (DRY principle), consider extracting this logic into a local helper coroutine within the _call_llm_with_tracing function.
For example:

```python
async def _apply_after_model_callback(response: LlmResponse) -> LlmResponse:
  """Applies after_model_callback within the correct span context."""
  with trace.use_span(span):
    if altered_response := await self._handle_after_model_callback(
        invocation_context, response, model_response_event
    ):
      return altered_response
  return response
```

You could then replace both duplicated blocks with a single call:

```python
llm_response = await _apply_after_model_callback(llm_response)
```
Description
Fixes #4851.
When OpenTelemetry tracing is enabled, `before_model_callback` and `after_model_callback`/`on_model_error_callback` see different span IDs, causing `LLM_REQUEST.span_id != LLM_RESPONSE.span_id` in the BigQuery Analytics Plugin.

Root Cause
Two issues in `base_llm_flow.py`:

- `before_model_callback` ran outside the `call_llm` span
- `after_model_callback` ran inside a child `generate_content` span (created by `_run_and_handle_error` → `use_inference_span`)

Fix
- Moved `before_model_callback` inside the `call_llm` span so it shares the same span context as the other callbacks
- Wrapped `after_model_callback` with `trace.use_span(span)` to re-activate the `call_llm` span (needed because the async generator from `_run_and_handle_error` yields responses inside the child `generate_content` span)
- Imported `trace` from `opentelemetry`

Testing
Added 3 new tests in `test_llm_callback_span_consistency.py`:

- `test_before_and_after_model_callbacks_share_span_id` — core regression test
- `test_before_and_on_error_model_callbacks_share_span_id` — error path
- `test_before_model_callback_short_circuit_has_span` — short-circuit case

All 51 existing callback/tracing tests continue to pass.
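The shape of the core regression assertion can be sketched as follows (a minimal mock, not the actual test fixtures: `start_span`, the callback names, and the span id are all hypothetical stand-ins for the real OTel spans the tests use):

```python
import contextlib
import contextvars

# Hypothetical stand-in for the current span id (not the real OTel API).
_current_span_id = contextvars.ContextVar("span_id", default=0)
seen = {}


@contextlib.contextmanager
def start_span(span_id):
  """Activates a span id for the duration of the with-block."""
  token = _current_span_id.set(span_id)
  try:
    yield
  finally:
    _current_span_id.reset(token)


def before_model_callback():
  seen["before"] = _current_span_id.get()


def after_model_callback():
  seen["after"] = _current_span_id.get()


# With the fix, both callbacks fire inside the same call_llm span, so the
# span id each one records must match.
with start_span(0x1234):
  before_model_callback()
  after_model_callback()

assert seen["before"] == seen["after"] == 0x1234
print("shared span id:", hex(seen["before"]))
```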