Skip to content

fix(flask): wrap wsgi_app call in try/except to prevent active_requests gauge leak#4433

Open
alliasgher wants to merge 3 commits intoopen-telemetry:mainfrom
alliasgher:fix-flask-active-requests-gauge-leak
Open

fix(flask): wrap wsgi_app call in try/except to prevent active_requests gauge leak#4433
alliasgher wants to merge 3 commits intoopen-telemetry:mainfrom
alliasgher:fix-flask-active-requests-gauge-leak

Conversation

@alliasgher
Copy link
Copy Markdown

Fixes #4431. If wsgi_app() raises, the active_requests_counter.add(-1,...) at the end was never reached. Add a bare try/except that decrements and re-raises, matching the existing WSGI instrumentation pattern.

…ts gauge leak

If wsgi_app() raises an uncaught exception, the active_requests_counter
decrement at the end of _wrapped_app was never reached, causing the gauge
to permanently read high. Kubernetes HPA and similar systems would see
phantom load.

Add a bare try/except that decrements the counter and re-raises on
exception, matching the pattern already used in the WSGI instrumentation.

Fixes open-telemetry#4431

Signed-off-by: alliasgher <alliasgher123@gmail.com>
@liyaka
Copy link
Copy Markdown

liyaka commented Apr 14, 2026

Thanks for the quick fix! We hit this exact bug in production — the leaked gauge was preventing our Kubernetes HPA from scaling down.

One suggestion: consider using try/finally instead of try/except to keep the decrement in a single code path. With the current approach, active_requests_counter.add(-1, ...) exists in two places (the except block and line 458), which is easy to get out of sync during future refactors.

The WSGI instrumentation already uses this pattern:

# wsgi/__init__.py
try:
    ...
    return _end_span_after_iterating(iterable, span, token)
except Exception as ex:
    raise
finally:
    self.active_requests_counter.add(-1, active_requests_count_attrs)

For Flask it would look like:

try:
    result = wsgi_app(wrapped_app_environ, _start_response)
    # ...duration recording unchanged...
    return result
finally:
    active_requests_counter.add(-1, active_requests_count_attrs)

This also removes the original decrement at line 458, so there's exactly one decrement path.

Also worth adding a test to prevent regression — I have one in #4437 (test_active_requests_counter_decremented_on_error) that could be adapted here.

Signed-off-by: alliasgher <alliasgher123@gmail.com>
@alliasgher
Copy link
Copy Markdown
Author

alliasgher commented Apr 14, 2026

Good catch — updated to use try/finally so the decrement is in a single code path regardless of whether wsgi_app succeeds or raises @liyaka

Signed-off-by: alliasgher <alliasgher123@gmail.com>
@alliasgher alliasgher force-pushed the fix-flask-active-requests-gauge-leak branch from bb98692 to 5c2b07a Compare April 14, 2026 20:20
@alliasgher
Copy link
Copy Markdown
Author

Added the regression test test_active_requests_counter_decremented_on_error. It hits /hello/500 (which raises ValueError internally), collects http.server.active_requests, and asserts the value is back to 0 after both requests complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Flask instrumentation: http.server.active_requests gauge leaks on exception (missing try/finally)

2 participants