Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
([#3938](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/3938))
- `opentelemetry-instrumentation-aiohttp-server`: Support passing `TracerProvider` when instrumenting.
([#3819](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/3819))
- `opentelemetry-instrumentation-urllib`: add ability to capture custom headers
([#4051](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4051))

### Fixed

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,97 @@ def response_hook(span: Span, request: Request, response: HTTPResponse):

will exclude requests such as ``https://site/client/123/info`` and ``https://site/xyz/healthcheck``.

Capture HTTP request and response headers
*****************************************
You can configure the agent to capture specified HTTP headers as span attributes, according to the
`semantic conventions <https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-client-span>`_.

Request headers
***************
To capture HTTP request headers as span attributes, set the environment variable
``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST`` to a comma delimited list of HTTP header names.

For example using the environment variable,
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST="content-type,custom_request_header"

will extract ``content-type`` and ``custom_request_header`` from the request headers and add them as span attributes.

Request header names in urllib are case-insensitive. So, giving the header name as ``CUStom-Header`` in the environment
variable will capture the header named ``custom-header``.

Regular expressions may also be used to match multiple headers that correspond to the given pattern. For example:
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST="Accept.*,X-.*"

Would match all request headers that start with ``Accept`` and ``X-``.

To capture all request headers, set ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST`` to ``".*"``.
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST=".*"

The name of the added span attribute will follow the format ``http.request.header.<header_name>`` where ``<header_name>``
is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
single item list containing all the header values.

For example:
``http.request.header.custom_request_header = ["<value1>", "<value2>"]``

Response headers
****************
To capture HTTP response headers as span attributes, set the environment variable
``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE`` to a comma delimited list of HTTP header names.

For example using the environment variable,
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE="content-type,custom_response_header"

will extract ``content-type`` and ``custom_response_header`` from the response headers and add them as span attributes.

Response header names in urllib are case-insensitive. So, giving the header name as ``CUStom-Header`` in the environment
variable will capture the header named ``custom-header``.

Regular expressions may also be used to match multiple headers that correspond to the given pattern. For example:
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE="Content.*,X-.*"

Would match all response headers that start with ``Content`` and ``X-``.

To capture all response headers, set ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE`` to ``".*"``.
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE=".*"

The name of the added span attribute will follow the format ``http.response.header.<header_name>`` where ``<header_name>``
is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
list containing the header values.

For example:
``http.response.header.custom_response_header = ["<value1>", "<value2>"]``

Sanitizing headers
******************
In order to prevent storing sensitive data such as personally identifiable information (PII), session keys, passwords,
etc, set the environment variable ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS``
to a comma delimited list of HTTP header names to be sanitized.

Regexes may be used, and all header names will be matched in a case-insensitive manner.

For example using the environment variable,
::

export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS=".*session.*,set-cookie"

will replace the value of headers such as ``session-id`` and ``set-cookie`` with ``[REDACTED]`` in the span.

Note:
The environment variable names used to capture HTTP headers are still experimental, and thus are subject to change.

API
---
"""
Expand Down Expand Up @@ -135,8 +226,15 @@ def response_hook(span: Span, request: Request, response: HTTPResponse):
)
from opentelemetry.trace import Span, SpanKind, Tracer, get_tracer
from opentelemetry.util.http import (
OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST,
OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE,
OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS,
ExcludeList,
get_custom_header_attributes,
get_custom_headers,
get_excluded_urls,
normalise_request_header_name,
normalise_response_header_name,
parse_excluded_urls,
redact_url,
sanitize_method,
Expand Down Expand Up @@ -169,6 +267,9 @@ def _instrument(self, **kwargs: Any):
``response_hook``: An optional callback which is invoked right before the span is finished processing a response
``excluded_urls``: A string containing a comma-delimited
list of regexes used to exclude URLs from tracking
``captured_request_headers``: A comma-separated list of regexes to match against request headers to capture
``captured_response_headers``: A comma-separated list of regexes to match against response headers to capture
``sensitive_headers``: A comma-separated list of regexes to match against captured headers to be sanitized
"""
# initialize semantic conventions opt-in if needed
_OpenTelemetrySemanticConventionStability._initialize()
Expand Down Expand Up @@ -205,6 +306,24 @@ def _instrument(self, **kwargs: Any):
else parse_excluded_urls(excluded_urls)
),
sem_conv_opt_in_mode=sem_conv_opt_in_mode,
captured_request_headers=kwargs.get(
"captured_request_headers",
get_custom_headers(
OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST
),
),
captured_response_headers=kwargs.get(
"captured_response_headers",
get_custom_headers(
OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE
),
),
sensitive_headers=kwargs.get(
"sensitive_headers",
get_custom_headers(
OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS
),
),
)

def _uninstrument(self, **kwargs: Any):
Expand All @@ -223,6 +342,9 @@ def _instrument(
response_hook: _ResponseHookT = None,
excluded_urls: ExcludeList | None = None,
sem_conv_opt_in_mode: _StabilityMode = _StabilityMode.DEFAULT,
captured_request_headers: list[str] | None = None,
captured_response_headers: list[str] | None = None,
sensitive_headers: list[str] | None = None,
):
"""Enables tracing of all requests calls that go through
:code:`urllib.Client._make_request`"""
Expand All @@ -232,7 +354,10 @@ def _instrument(
@functools.wraps(opener_open)
def instrumented_open(opener, fullurl, data=None, timeout=None):
if isinstance(fullurl, str):
request_ = Request(fullurl, data)
# in case of multiple entries for the same header Opener.open sends the first value
request_ = Request(
fullurl, data, headers=dict(reversed(opener.addheaders))
)
else:
request_ = fullurl

Expand Down Expand Up @@ -275,14 +400,23 @@ def _instrumented_open_call(
)
_set_http_url(labels, url, sem_conv_opt_in_mode)

headers = get_or_create_headers()
labels.update(
get_custom_header_attributes(
headers,
captured_request_headers,
sensitive_headers,
normalise_request_header_name,
)
)

with tracer.start_as_current_span(
span_name, kind=SpanKind.CLIENT, attributes=labels
) as span:
exception = None
if callable(request_hook):
request_hook(span, request)

headers = get_or_create_headers()
inject(headers)

with suppress_http_instrumentation():
Expand Down Expand Up @@ -310,6 +444,16 @@ def _instrumented_open_call(
labels, f"{ver_[:1]}.{ver_[:-1]}", sem_conv_opt_in_mode
)

if span.is_recording():
span.set_attributes(
get_custom_header_attributes(
result.headers,
captured_response_headers,
sensitive_headers,
normalise_response_header_name,
)
)

if exception is not None and _report_new(sem_conv_opt_in_mode):
span.set_attribute(ERROR_TYPE, type(exception).__qualname__)
labels[ERROR_TYPE] = type(exception).__qualname__
Expand Down
Loading