Python: Fix ENABLE_SENSITIVE_DATA env var not respected after load_dotenv()#4120
Python: Fix ENABLE_SENSITIVE_DATA env var not respected after load_dotenv()#4120droideronline wants to merge 5 commits intomicrosoft:mainfrom
Conversation
…v at call time When load_dotenv() runs after module import, the module-level OBSERVABILITY_SETTINGS singleton has stale cached values. Both configure_otel_providers() and enable_instrumentation() now fall back to os.environ when their parameters are None. Added _env_bool() helper and 5 unit tests. Fixes microsoft#4119
|
I've verified this fix locally - all existing tests pass (46 passed, 14 skipped for optional OTLP deps, 0 failures) along with the 5 new tests added here. Let me know if you'd like to address this with a different approach, happy to rework it. |
There was a problem hiding this comment.
Pull request overview
Fixes a timing issue in the Python observability module where module-level cached settings (e.g., ENABLE_SENSITIVE_DATA, VS_CODE_EXTENSION_PORT) were not reflecting values loaded into os.environ by load_dotenv() after import.
Changes:
- Added a small
_env_bool()helper to re-read boolean env vars at call time. - Updated
enable_instrumentation()and theconfigure_otel_providers()non-env_file_pathpath to re-check env vars when parameters are omitted. - Added unit tests covering env fallback behavior and explicit parameter overrides.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
python/packages/core/agent_framework/observability.py |
Re-reads select env vars at call time to avoid stale module-level settings after load_dotenv(). |
python/packages/core/tests/core/test_observability.py |
Adds new tests to validate env fallbacks and precedence rules. |
…n tests - VS_CODE_EXTENSION_PORT: strip whitespace and wrap int() in try/except with a clear ValueError mentioning the env var name. - Tests: mock _configure to no-op in configure_otel_providers tests to prevent global OTel provider state leakage across test runs.
|
@eavanvalkenburg To add context on why this approach makes sense: no well-behaved library should call |
|
Which is why we made a change to no longer use load_dotenv, so please check with the rc1 release if this is still a issue |
|
@eavanvalkenburg I am on the latest main ( However, the bug is not about the framework calling from dotenv import load_dotenv
load_dotenv()
# os.environ now has ENABLE_SENSITIVE_DATA=true
from agent_framework.observability import configure_otel_providers
configure_otel_providers() # does NOT pick up ENABLE_SENSITIVE_DATA from os.environThe module-level The current observability samples happen to work because they pass |
|
We shouldn't create the global observability setting at import time. We can create the setting when |
|
@TaoChenOSU That makes sense — moving the |
…ort time Move OBSERVABILITY_SETTINGS from eagerly-created singleton at import time to None, deferring creation into configure_otel_providers() and enable_instrumentation(). This ensures environment variables populated by load_dotenv() after import are correctly picked up. - OBSERVABILITY_SETTINGS starts as None (no import-time construction) - configure_otel_providers() always creates a fresh ObservabilitySettings with enable_instrumentation=True, reading current os.environ via load_settings() - enable_instrumentation() creates ObservabilitySettings on first call if still None - All consumers add None guards before accessing OBSERVABILITY_SETTINGS - Removed _env_bool() helper (no longer needed) - Updated tests to expect None after module reload Fixes microsoft#4119
|
|
||
| global OBSERVABILITY_SETTINGS | ||
| OBSERVABILITY_SETTINGS: ObservabilitySettings = ObservabilitySettings() | ||
| OBSERVABILITY_SETTINGS: ObservabilitySettings | None = None |
There was a problem hiding this comment.
With OBSERVABILITY_SETTINGS now defaulting to None, setting ENABLE_INSTRUMENTATION=true in the process environment no longer automatically enables instrumentation until some code explicitly calls enable_instrumentation()/configure_otel_providers(). This appears to regress the documented “zero-code via env vars” behavior (e.g., python/samples/02-agents/observability/README.md:156 says ENABLE_INSTRUMENTATION=true should cause the framework to emit observability data). Consider lazily initializing OBSERVABILITY_SETTINGS on first telemetry check (or at import time when env vars are already present) so env-only enablement continues to work while still supporting load_dotenv-after-import timing.
| OBSERVABILITY_SETTINGS: ObservabilitySettings | None = None | |
| if "ENABLE_INSTRUMENTATION" in os.environ: | |
| OBSERVABILITY_SETTINGS: ObservabilitySettings | None = load_settings() | |
| else: | |
| OBSERVABILITY_SETTINGS: ObservabilitySettings | None = None |
There was a problem hiding this comment.
Every sample in the repo explicitly calls configure_otel_providers() or enable_instrumentation() — no sample relies on env-var-only enablement. The README's "zero-code" pattern (Pattern 5) refers to the opentelemetry-instrument CLI, which doesn't depend on OBSERVABILITY_SETTINGS. Deferring creation to call time is intentional so that env vars set via load_dotenv() after import are correctly picked up.
| # Build kwargs for a fresh ObservabilitySettings, excluding None values. | ||
| # Creating the settings here (instead of at import time) ensures that | ||
| # environment variables populated by the caller's load_dotenv() are picked up. | ||
| settings_kwargs: dict[str, Any] = { | ||
| "enable_instrumentation": True, | ||
| } | ||
| if env_file_path is not None: | ||
| settings_kwargs["env_file_path"] = env_file_path | ||
| if env_file_encoding is not None: | ||
| settings_kwargs["env_file_encoding"] = env_file_encoding | ||
| if enable_sensitive_data is not None: | ||
| settings_kwargs["enable_sensitive_data"] = enable_sensitive_data | ||
| if vs_code_extension_port is not None: | ||
| settings_kwargs["vs_code_extension_port"] = vs_code_extension_port | ||
|
|
||
| OBSERVABILITY_SETTINGS = ObservabilitySettings(**settings_kwargs) | ||
|
|
||
| OBSERVABILITY_SETTINGS._configure( # type: ignore[reportPrivateUsage] |
There was a problem hiding this comment.
configure_otel_providers() now always replaces the global OBSERVABILITY_SETTINGS with a new ObservabilitySettings instance. This breaks the function’s documented “call once”/idempotent behavior: calling configure_otel_providers() a second time will reset _executed_setup back to False and can reconfigure providers/handlers again, which can lead to duplicate exporters/handlers and unexpected telemetry. Consider reusing the existing settings when setup has already been executed (or otherwise preserving the original _executed_setup state) so repeated calls remain a no-op after the first successful configuration.
There was a problem hiding this comment.
The docstring states: "DO NOT call this method multiple times, as it may lead to unexpected behavior." Single-call-at-startup is the expected usage pattern. Not a regression.
| settings = OBSERVABILITY_SETTINGS | ||
| super_get_response = super().get_response # type: ignore[misc] | ||
|
|
||
| if not OBSERVABILITY_SETTINGS.ENABLED: | ||
| if settings is None or not settings.ENABLED: | ||
| return super_get_response(messages=messages, stream=stream, options=options, **kwargs) # type: ignore[no-any-return] |
There was a problem hiding this comment.
ChatTelemetryLayer.get_response() currently short-circuits when OBSERVABILITY_SETTINGS is None, which means env-var-only enablement (ENABLE_INSTRUMENTATION=true) won’t activate telemetry unless some other code has already created the settings object. If OBSERVABILITY_SETTINGS is meant to be “deferred”, consider ensuring it is initialized here (or via a shared helper) before checking settings.ENABLED so the first instrumented call can pick up env vars populated by load_dotenv().
There was a problem hiding this comment.
Same rationale as above — telemetry activation requires an explicit configure_otel_providers() or enable_instrumentation() call, which all samples do. None means no setup was called, so telemetry should be disabled.
| # Configure if instrumentation is enabled (via enable_instrumentation() or env var) | ||
| if OBSERVABILITY_SETTINGS.ENABLED: | ||
| if OBSERVABILITY_SETTINGS is not None and OBSERVABILITY_SETTINGS.ENABLED: | ||
| # Only configure providers if not already executed | ||
| if not OBSERVABILITY_SETTINGS._executed_setup: | ||
| # Call configure_otel_providers to set up exporters. |
There was a problem hiding this comment.
This check will skip observability setup whenever OBSERVABILITY_SETTINGS is still None. With the new deferred initialization, an env-var-only setup (ENABLE_INSTRUMENTATION=true) may never create OBSERVABILITY_SETTINGS, so DevUI will log “Instrumentation not enabled” even though env vars request it. Consider instantiating/initializing settings here (or calling a shared accessor) before checking ENABLED, so DevUI respects env-based enablement.
There was a problem hiding this comment.
DevUI calls configure_otel_providers() itself when it needs observability (visible just below in the same function). The None guard correctly skips the fast-path check when no setup has been done yet.
|
@TaoChenOSU Updated per your feedback. |
Motivation and Context
Fixes #4119
When users set
ENABLE_SENSITIVE_DATA=truein a.envfile and callload_dotenv()beforeconfigure_otel_providers()orenable_instrumentation(), the setting is silently ignored. This is because the module-levelOBSERVABILITY_SETTINGSsingleton caches env vars at import time — beforeload_dotenv()populatesos.environ.The same timing bug affects
VS_CODE_EXTENSION_PORT.Description
Added
_env_bool(name)helper — re-reads a boolean fromos.environat call time, using the same truthy strings ("true","1","yes","on") as_coerce_valuein_settings.py.Fixed
enable_instrumentation()— whenenable_sensitive_dataparam isNone, falls back to_env_bool("ENABLE_SENSITIVE_DATA")to pick up values loaded byload_dotenv()after import.Fixed
configure_otel_providers()else branch — same_env_boolfallback forenable_sensitive_data, plusos.getenv("VS_CODE_EXTENSION_PORT")fallback for the VS Code extension port.Added 5 unit tests covering:
configure_otel_providers()readsENABLE_SENSITIVE_DATAfrom envenable_sensitive_data=Falseparam overrides envenable_instrumentation()readsENABLE_SENSITIVE_DATAfrom envFalseinenable_instrumentation()overrides envconfigure_otel_providers()readsVS_CODE_EXTENSION_PORTfrom envContribution Checklist
None