fix: ensure LLM callbacks share the same OTel span context#4854

Open
brucearctor wants to merge 1 commit into google:main from brucearctor:fix/otel-span-id-mismatch-4851

Conversation

@brucearctor

Description

Fixes #4851.

When OpenTelemetry tracing is enabled, before_model_callback and after_model_callback / on_model_error_callback see different span IDs, causing LLM_REQUEST.span_id != LLM_RESPONSE.span_id in the BigQuery Analytics Plugin.

Root Cause

Two issues in base_llm_flow.py:

  1. before_model_callback ran outside the call_llm span
  2. after_model_callback ran inside a child generate_content span (created by _run_and_handle_error via use_inference_span)

Fix

  1. Move before_model_callback inside the call_llm span so it shares the same span context as the other callbacks
  2. Wrap after_model_callback with trace.use_span(span) to re-activate the call_llm span (needed because the async generator from _run_and_handle_error yields responses inside the child generate_content span)
  3. Import trace from opentelemetry
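The re-activation in step 2 matters because Python async generators run in their caller's context: a span (or context variable) attached inside the generator stays active in the consumer between yields. A minimal stdlib sketch of that leak and of the re-activation fix, using a ContextVar as a stand-in for the OTel span context (all names here are illustrative, not the ADK API):

```python
import asyncio
import contextvars

# Stand-in for OTel's current-span context; names are illustrative only.
current_span = contextvars.ContextVar("current_span", default="none")


async def model_stream():
  # Mimics _run_and_handle_error: a child "generate_content" span is
  # attached inside the generator, so it is still active in the consumer
  # while this generator is suspended at `yield`.
  token = current_span.set("generate_content")
  try:
    yield "llm_response"
  finally:
    current_span.reset(token)


spans = {}


async def call_llm_async():
  token = current_span.set("call_llm")  # the parent call_llm span
  try:
    spans["before"] = current_span.get()  # before_model_callback runs here
    async for _ in model_stream():
      # Without the fix, after_model_callback ran here under the child span:
      spans["after_naive"] = current_span.get()
      # The fix: re-activate the parent span around the callback,
      # analogous to `with trace.use_span(span):`.
      inner = current_span.set("call_llm")
      try:
        spans["after_fixed"] = current_span.get()
      finally:
        current_span.reset(inner)
  finally:
    current_span.reset(token)


asyncio.run(call_llm_async())
```

Running this records "call_llm" for before, "generate_content" for the naive after-callback, and "call_llm" again once the parent context is re-activated. With real OpenTelemetry, `with trace.use_span(span):` plays the role of the inner set/reset pair.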

Testing

Added 3 new tests in test_llm_callback_span_consistency.py:

  • test_before_and_after_model_callbacks_share_span_id — core regression test
  • test_before_and_on_error_model_callbacks_share_span_id — error path
  • test_before_model_callback_short_circuit_has_span — short-circuit case
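The test bodies aren't shown in this thread; as a rough sketch of their shape (a stub span stands in for the real OpenTelemetry objects, and every name below is hypothetical), each callback records the span id current at call time and the test asserts the ids match:

```python
# Hypothetical shape of the core regression test; StubSpan stands in for
# the real OpenTelemetry span and is not part of the ADK API.

class StubSpan:
  def __init__(self, span_id):
    self.span_id = span_id


recorded = {}


def before_model_callback(span):
  # Runs inside the call_llm span after the fix.
  recorded["LLM_REQUEST"] = span.span_id


def after_model_callback(span):
  # Re-activated call_llm span: same span object as above.
  recorded["LLM_RESPONSE"] = span.span_id


call_llm_span = StubSpan(span_id=0xABCD)
before_model_callback(call_llm_span)
after_model_callback(call_llm_span)

# Core regression assertion: request and response events share one span id.
assert recorded["LLM_REQUEST"] == recorded["LLM_RESPONSE"]
```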

All 51 existing callback/tracing tests continue to pass.

Move before_model_callback inside the call_llm span and wrap
after_model_callback with trace.use_span(span) to re-activate the
call_llm span context. This ensures before_model_callback,
after_model_callback, and on_model_error_callback all see the same
span_id, fixing the mismatch that broke the BigQuery Analytics Plugin.

The root cause was twofold:
1. before_model_callback ran outside the call_llm span
2. after_model_callback ran inside a child generate_content span
   (created by _run_and_handle_error via use_inference_span)

Fixes google#4851
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical OpenTelemetry tracing issue where LLM callbacks (before_model_callback, after_model_callback, on_model_error_callback) were not consistently sharing the same span ID. This inconsistency led to fragmented tracing data, particularly affecting analytics plugins. The fix involves strategically adjusting the execution context of these callbacks within the _call_llm_async method to ensure they all operate under the same call_llm span, thereby providing accurate and unified tracing information.

Highlights

  • OpenTelemetry Span Consistency: Resolved an issue where before_model_callback and after_model_callback/on_model_error_callback had different OpenTelemetry span IDs, leading to inconsistent tracing data in the BigQuery Analytics Plugin.
  • Callback Execution Context Adjustment: Modified base_llm_flow.py to ensure before_model_callback executes within the call_llm span and after_model_callback reactivates the call_llm span, guaranteeing all callbacks share the same span context.
  • New Span Consistency Tests: Introduced three new unit tests to validate that LLM callbacks consistently share the same OpenTelemetry span ID across various scenarios, including successful calls, error handling, and short-circuiting.


Changelog
  • src/google/adk/flows/llm_flows/base_llm_flow.py
    • Imported the opentelemetry.trace module.
    • Refactored the _call_llm_async method to move the before_model_callback invocation inside the call_llm span.
    • Wrapped after_model_callback calls with trace.use_span(span) to re-activate the call_llm span.
  • tests/unittests/flows/llm_flows/test_llm_callback_span_consistency.py
    • Added a new test file to verify LLM callback span consistency.
    • Implemented test_before_and_after_model_callbacks_share_span_id to confirm span ID matching for successful LLM calls.
    • Implemented test_before_and_on_error_model_callbacks_share_span_id to verify span ID consistency during error handling.
    • Implemented test_before_model_callback_short_circuit_has_span to ensure before_model_callback has a valid span even when short-circuiting the LLM call.
Activity
  • New unit tests were added to validate the fix for OpenTelemetry span consistency in LLM callbacks.
  • All 51 existing callback and tracing tests were confirmed to still pass after the changes.

@adk-bot adk-bot added the tracing [Component] This issue is related to OpenTelemetry tracing label Mar 17, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively resolves the OpenTelemetry span inconsistency between before_model_callback and after_model_callback. The approach of moving before_model_callback into the call_llm span and reactivating this span for after_model_callback is correct. The new regression tests are comprehensive and well-written, covering success, error, and short-circuit scenarios. I have one suggestion to refactor a small piece of duplicated code that was introduced with this fix to improve maintainability.

Comment on lines +1149 to +1153
with trace.use_span(span):
  if altered_llm_response := await self._handle_after_model_callback(
      invocation_context, llm_response, model_response_event
  ):
    llm_response = altered_llm_response

medium

This block of code for handling the after_model_callback is duplicated in the else branch on lines 1192-1196. To improve maintainability and avoid repeating code (DRY principle), consider extracting this logic into a local helper coroutine within the _call_llm_with_tracing function.

For example:

async def _apply_after_model_callback(response: LlmResponse) -> LlmResponse:
    """Applies after_model_callback within the correct span context."""
    with trace.use_span(span):
        if altered_response := await self._handle_after_model_callback(
            invocation_context, response, model_response_event
        ):
            return altered_response
    return response

You could then replace both duplicated blocks with a single call:
llm_response = await _apply_after_model_callback(llm_response)


Labels

tracing [Component] This issue is related to OpenTelemetry tracing

Development

Successfully merging this pull request may close these issues.

OpenTelemetry integration creates span ID mismatch between LLM_REQUEST and LLM_RESPONSE/LLM_ERROR (BigQuery Analytics Plugin)
