fix: ensure LLM callbacks share the same OTel span context#4854
brucearctor wants to merge 1 commit into google:main
Conversation
Move `before_model_callback` inside the `call_llm` span and wrap `after_model_callback` with `trace.use_span(span)` to re-activate the `call_llm` span context. This ensures `before_model_callback`, `after_model_callback`, and `on_model_error_callback` all see the same `span_id`, fixing the mismatch that broke the BigQuery Analytics Plugin.

The root cause was twofold:

1. `before_model_callback` ran outside the `call_llm` span
2. `after_model_callback` ran inside a child `generate_content` span (created by `_run_and_handle_error` via `use_inference_span`)

Fixes google#4851
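Both root causes and the re-activation fix can be sketched with a stdlib-only analogy (the names `use_span`, `stream_responses`, and the span names are illustrative, not the real OTel API): a `ContextVar` stands in for the active span, and a plain generator models `_run_and_handle_error`, which yields responses while its child span is still active.

```python
import contextlib
import contextvars

# Hypothetical stand-in for the active OTel span (not the real API).
_active = contextvars.ContextVar("active_span", default="none")


@contextlib.contextmanager
def use_span(name):
  """Activates a span for the duration of the with-block."""
  token = _active.set(name)
  try:
    yield
  finally:
    _active.reset(token)


def stream_responses():
  # Models _run_and_handle_error: it opens a child span and yields while
  # that span is still active. Per PEP 567, a sync generator shares the
  # caller's context, so the child span stays active at the yield point.
  with use_span("generate_content"):
    yield "llm_response"


with use_span("call_llm"):
  before = _active.get()  # what before_model_callback now sees
  for _ in stream_responses():
    naive_after = _active.get()  # the bug: child span is active here
    with use_span("call_llm"):  # the fix: re-activate the call_llm span
      fixed_after = _active.get()

print(before, naive_after, fixed_after)
```

Without the re-activation, a callback run at the yield point records the child `generate_content` span, which is exactly the `span_id` mismatch described above.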
Code Review
This pull request effectively resolves the OpenTelemetry span inconsistency between before_model_callback and after_model_callback. The approach of moving before_model_callback into the call_llm span and reactivating this span for after_model_callback is correct. The new regression tests are comprehensive and well-written, covering success, error, and short-circuit scenarios. I have one suggestion to refactor a small piece of duplicated code that was introduced with this fix to improve maintainability.
```python
with trace.use_span(span):
  if altered_llm_response := await self._handle_after_model_callback(
      invocation_context, llm_response, model_response_event
  ):
    llm_response = altered_llm_response
```
This block of code for handling the after_model_callback is duplicated in the else branch on lines 1192-1196. To improve maintainability and avoid repeating code (DRY principle), consider extracting this logic into a local helper coroutine within the _call_llm_with_tracing function.
For example:

```python
async def _apply_after_model_callback(response: LlmResponse) -> LlmResponse:
  """Applies after_model_callback within the correct span context."""
  with trace.use_span(span):
    if altered_response := await self._handle_after_model_callback(
        invocation_context, response, model_response_event
    ):
      return altered_response
  return response
```

You could then replace both duplicated blocks with a single call:

```python
llm_response = await _apply_after_model_callback(llm_response)
```
Description
Fixes #4851.
When OpenTelemetry tracing is enabled, `before_model_callback` and `after_model_callback`/`on_model_error_callback` see different span IDs, causing `LLM_REQUEST.span_id != LLM_RESPONSE.span_id` in the BigQuery Analytics Plugin.

Root Cause
Two issues in `base_llm_flow.py`:

- `before_model_callback` ran outside the `call_llm` span
- `after_model_callback` ran inside a child `generate_content` span (created by `_run_and_handle_error` → `use_inference_span`)

Fix
- Moved `before_model_callback` inside the `call_llm` span so it shares the same span context as the other callbacks
- Wrapped `after_model_callback` with `trace.use_span(span)` to re-activate the `call_llm` span (needed because the async generator from `_run_and_handle_error` yields responses inside the child `generate_content` span)
- Imported `trace` from `opentelemetry`

Testing
Added 3 new tests in `test_llm_callback_span_consistency.py`:

- `test_before_and_after_model_callbacks_share_span_id` — core regression test
- `test_before_and_on_error_model_callbacks_share_span_id` — error path
- `test_before_model_callback_short_circuit_has_span` — short-circuit case

All 51 existing callback/tracing tests continue to pass.
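The shape of the core regression assertion can be sketched as follows (a minimal mock, not the actual test fixtures: `start_span`, the callback names, and the span id are all hypothetical stand-ins for the real OTel spans the tests use):

```python
import contextlib
import contextvars

# Hypothetical stand-in for the current span id (not the real OTel API).
_current_span_id = contextvars.ContextVar("span_id", default=0)
seen = {}


@contextlib.contextmanager
def start_span(span_id):
  """Activates a span id for the duration of the with-block."""
  token = _current_span_id.set(span_id)
  try:
    yield
  finally:
    _current_span_id.reset(token)


def before_model_callback():
  seen["before"] = _current_span_id.get()


def after_model_callback():
  seen["after"] = _current_span_id.get()


# With the fix, both callbacks fire inside the same call_llm span, so the
# span id each one records must match.
with start_span(0x1234):
  before_model_callback()
  after_model_callback()

assert seen["before"] == seen["after"] == 0x1234
print("shared span id:", hex(seen["before"]))
```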