|
| 1 | +PEP: 8XX |
| 2 | +Title: Add timestamps to exception tracebacks |
| 3 | +Author: Gregory P. Smith <greg@krypto.org> |
| 4 | +PEP-Delegate: |
| 5 | +Discussions-To: TODO |
| 6 | +Status: Draft |
| 7 | +Type: Standards Track |
| 8 | +Created: 09-Sep-2025 |
| 9 | +Python-Version: 3.15 |
| 10 | +Post-History: |
| 11 | + |
| 12 | + |
| 13 | +Abstract |
| 14 | +======== |
| 15 | + |
| 16 | +This PEP proposes adding optional timestamps to Python exception objects that |
| 17 | +can be displayed in tracebacks. When enabled, each exception will record the |
| 18 | +time it was instantiated (typically when raised), and this timestamp can be |
| 19 | +displayed alongside the exception message in formatted tracebacks. This feature |
| 20 | +is particularly useful for debugging complex asynchronous applications, exception |
| 21 | +groups, and distributed systems where understanding the temporal ordering of |
| 22 | +exceptions is critical. |
| 23 | + |
| 24 | + |
| 25 | +Motivation |
| 26 | +========== |
| 27 | + |
| 28 | +In modern Python applications, especially those using asynchronous programming |
| 29 | +patterns and exception groups introduced in :pep:`654`, understanding the |
| 30 | +temporal relationships between exceptions has become increasingly important. |
| 31 | +Consider the following scenarios: |
| 32 | + |
| 33 | +1. **Async servers with exception groups**: When multiple exceptions are caught |
| 34 | + and grouped together, it's often difficult to understand which exception |
| 35 | + occurred first and how they relate temporally to each other and to external |
| 36 | + events in the system. |
| 37 | + |
| 38 | +2. **Distributed systems debugging**: When correlating exceptions across |
| 39 | + different services, having precise timestamps allows developers to match |
| 40 | + exceptions with logs from other systems, making root cause analysis more |
| 41 | + straightforward. |
| 42 | + |
| 43 | +3. **Complex exception chains**: In applications with deep exception chains |
| 44 | + (exceptions raised during handling of other exceptions), timestamps help |
| 45 | + clarify the sequence of events that led to the final error state. |
| 46 | + |
| 47 | +4. **Performance debugging**: Timestamps can reveal performance issues, such as |
| 48 | + delays between exception creation and handling, or help identify timeout-related |
| 49 | + problems. |
| 50 | + |
| 51 | +Currently, developers must manually add timing information to exception messages |
| 52 | +or use external logging frameworks, which can be cumbersome and inconsistent. |
| 53 | +This PEP proposes a built-in, standardized approach. |
| 54 | + |
| 55 | + |
| 56 | +Compelling Example: Async Service with Exception Groups |
| 57 | +-------------------------------------------------------- |
| 58 | + |
| 59 | +Consider an async web service that fetches data from multiple backends |
| 60 | +concurrently. When failures occur, understanding the timing is crucial for |
| 61 | +diagnostics:: |
| 62 | + |
| 63 | + import asyncio |
| 64 | + from datetime import datetime |
| 65 | + |
| 66 | + async def fetch_user_data(user_id: str): |
| 67 | + # Simulating a service that fails after 0.5 seconds |
| 68 | + await asyncio.sleep(0.5) |
| 69 | + raise ConnectionError(f"User service timeout for {user_id}") |
| 70 | + |
| 71 | + async def fetch_order_data(user_id: str): |
| 72 | + # Simulating a service that fails after 0.1 seconds |
| 73 | + await asyncio.sleep(0.1) |
| 74 | + raise ValueError(f"Invalid user_id format: {user_id}") |
| 75 | + |
| 76 | + async def fetch_recommendations(user_id: str): |
| 77 | + # Simulating a service that fails after 2.3 seconds |
| 78 | + await asyncio.sleep(2.3) |
| 79 | + raise TimeoutError(f"Recommendation service timeout") |
| 80 | + |
| 81 | + async def fetch_inventory(items: list): |
| 82 | + # Simulating a service that fails after 0.8 seconds |
| 83 | + await asyncio.sleep(0.8) |
| 84 | + raise KeyError(f"Item 'widget-42' not found in inventory") |
| 85 | + |
| 86 | + async def get_user_dashboard(user_id: str): |
| 87 | + """Fetch all user dashboard data concurrently.""" |
| 88 | + async with asyncio.TaskGroup() as tg: |
| 89 | + user_task = tg.create_task(fetch_user_data(user_id)) |
| 90 | + order_task = tg.create_task(fetch_order_data(user_id)) |
| 91 | + rec_task = tg.create_task(fetch_recommendations(user_id)) |
| 92 | + inv_task = tg.create_task(fetch_inventory(['widget-42'])) |
| 93 | + |
| 94 | +With ``PYTHON_TRACEBACK_TIMESTAMPS=iso``, the output would be:: |
| 95 | + |
| 96 | + Traceback (most recent call last): |
| 97 | + ... |
| 98 | + ExceptionGroup: unhandled errors in a TaskGroup (4 sub-exceptions) |
| 99 | + +-+---------------- 1 ---------------- |
| 100 | + | Traceback (most recent call last): |
| 101 | + | File "service.py", line 11, in fetch_order_data |
| 102 | + | raise ValueError(f"Invalid user_id format: {user_id}") |
| 103 | + | ValueError: Invalid user_id format: usr_12@34 <@2025-03-15T10:23:41.142857Z> |
| 104 | + +---------------- 2 ---------------- |
| 105 | + | Traceback (most recent call last): |
| 106 | + | File "service.py", line 7, in fetch_user_data |
| 107 | + | raise ConnectionError(f"User service timeout for {user_id}") |
| 108 | + | ConnectionError: User service timeout for usr_12@34 <@2025-03-15T10:23:41.542901Z> |
| 109 | + +---------------- 3 ---------------- |
| 110 | + | Traceback (most recent call last): |
| 111 | + | File "service.py", line 19, in fetch_inventory |
| 112 | + | raise KeyError(f"Item 'widget-42' not found in inventory") |
| 113 | + | KeyError: "Item 'widget-42' not found in inventory" <@2025-03-15T10:23:41.842856Z> |
| 114 | + +---------------- 4 ---------------- |
| 115 | + | Traceback (most recent call last): |
| 116 | + | File "service.py", line 15, in fetch_recommendations |
| 117 | + | raise TimeoutError(f"Recommendation service timeout") |
| 118 | + | TimeoutError: Recommendation service timeout <@2025-03-15T10:23:43.342912Z> |
| 119 | + |
| 120 | +From these timestamps, we can immediately see: |
| 121 | + |
| 122 | +1. The ``ValueError`` from order validation failed first (at .142s) |
| 123 | +2. The user service timed out 400ms later (at .542s) |
| 124 | +3. The inventory service failed 300ms after that (at .842s) |
| 125 | +4. The recommendation service was the slowest, timing out after 2.3 seconds (at 43.342s) |
| 126 | + |
| 127 | +This temporal information is invaluable for debugging: |
| 128 | + |
| 129 | +- We can see the order service fails fast with validation, suggesting a data |
| 130 | + quality issue rather than a performance problem |
| 131 | +- The 2.3-second delay for recommendations indicates it might be the bottleneck |
| 132 | +- The failures are spread across 3.2 seconds, showing they're not caused by a |
| 133 | + single systemic issue (like a network partition) |
| 134 | +- We can correlate these timestamps with metrics from monitoring systems, |
| 135 | + load balancer logs, or database query logs |
| 136 | + |
| 137 | +Without timestamps, we would only know that four services failed, but not |
| 138 | +their temporal relationship, making root cause analysis significantly harder. |
| 139 | + |
| 140 | + |
| 141 | +Rationale |
| 142 | +========= |
| 143 | + |
| 144 | +The decision to add timestamps directly to exception objects rather than using |
| 145 | +alternative approaches (such as exception notes from :pep:`678`) was driven by |
| 146 | +several factors: |
| 147 | + |
| 148 | +1. **Performance**: Adding timestamps as notes would require creating string |
| 149 | + and list objects for every exception, even when timestamps aren't being |
| 150 | + displayed. The proposed approach stores a single integer (nanoseconds since |
| 151 | + epoch) and only formats it when needed. |
| 152 | + |
| 153 | +2. **Consistency**: Having a standardized timestamp attribute ensures consistent |
| 154 | + formatting and behavior across the Python ecosystem. |
| 155 | + |
| 156 | +3. **Backwards compatibility**: The feature is entirely opt-in, with no impact |
| 157 | + on existing code unless explicitly enabled. |
| 158 | + |
| 159 | +4. **Simplicity**: The implementation is straightforward and doesn't require |
| 160 | + changes to exception handling semantics. |
| 161 | + |
| 162 | + |
| 163 | +Specification |
| 164 | +============= |
| 165 | + |
| 166 | +Exception Timestamp Attribute |
| 167 | +----------------------------- |
| 168 | + |
| 169 | +A new attribute ``__timestamp_ns__`` will be added to the ``BaseException`` |
| 170 | +class. This attribute will store the time in nanoseconds since the Unix epoch |
| 171 | +(same precision as ``time.time_ns()``) when the exception was instantiated. |
| 172 | + |
| 173 | +- The attribute will be set to ``0`` by default (timestamps disabled) |
| 174 | +- When timestamps are enabled, it will be set automatically during exception |
| 175 | + instantiation |
| 176 | +- The timestamp represents when the exception object was created, which is |
| 177 | + typically when it was raised |
| 178 | + |
| 179 | +Special Cases |
| 180 | +------------- |
| 181 | + |
| 182 | +To avoid performance impacts on normal control flow, timestamps will **not** be |
| 183 | +collected for the following exception types even when the feature is enabled: |
| 184 | + |
| 185 | +- ``StopIteration`` |
| 186 | +- ``AsyncStopIteration`` |
| 187 | + |
| 188 | +These exceptions are frequently used for control flow in iterators and async |
| 189 | +iterators, and adding timestamps would introduce unnecessary overhead. |
| 190 | + |
| 191 | +Configuration |
| 192 | +------------- |
| 193 | + |
| 194 | +The feature can be enabled through two mechanisms: |
| 195 | + |
| 196 | +1. **Environment variable**: ``PYTHON_TRACEBACK_TIMESTAMPS`` |
| 197 | + |
| 198 | + - ``"us"`` or ``"1"``: Display timestamps in microseconds |
| 199 | + - ``"ns"``: Display timestamps in nanoseconds |
| 200 | + - ``"iso"``: Display timestamps in ISO 8601 format |
| 201 | + - Empty or unset: Timestamps disabled (default) |
| 202 | + |
| 203 | +2. **Command-line option**: ``-X traceback_timestamps=<format>`` |
| 204 | + |
| 205 | + Uses the same format options as the environment variable. |
| 206 | + |
| 207 | +The command-line option takes precedence over the environment variable if both |
| 208 | +are specified. |
| 209 | + |
| 210 | +Display Format |
| 211 | +-------------- |
| 212 | + |
| 213 | +When timestamps are enabled, they will be displayed at the end of exception |
| 214 | +messages in tracebacks, using the format: ``<@timestamp>`` |
| 215 | + |
| 216 | +Example with ``PYTHON_TRACEBACK_TIMESTAMPS=iso``:: |
| 217 | + |
| 218 | + Traceback (most recent call last): |
| 219 | + File "<stdin>", line 3, in california_raisin |
| 220 | + raise RuntimeError("not enough sunshine") |
| 221 | + RuntimeError: not enough sunshine <@2025-02-01T20:43:01.026169Z> |
| 222 | + |
| 223 | + During handling of the above exception, another exception occurred: |
| 224 | + |
| 225 | + Traceback (most recent call last): |
| 226 | + File "<stdin>", line 1, in <module> |
| 227 | + california_raisin() |
| 228 | + File "<stdin>", line 5, in california_raisin |
| 229 | + raise OSError(2, "on a cloudy day") |
| 230 | + FileNotFoundError: [Errno 2] on a cloudy day <@2025-02-01T20:43:01.026176Z> |
| 231 | + |
| 232 | +Traceback Module Updates |
| 233 | +------------------------ |
| 234 | + |
| 235 | +The ``traceback`` module will be updated to support the new timestamp feature: |
| 236 | + |
| 237 | +1. ``TracebackException`` class will gain a ``no_timestamp`` parameter |
| 238 | + (default ``False``) to control whether timestamps are displayed |
| 239 | + |
| 240 | +2. Functions like ``print_exception()`` will gain a ``no_timestamp`` parameter |
| 241 | + to allow suppressing timestamp display even when globally enabled |
| 242 | + |
| 243 | +3. The formatting functions will check the global configuration and the |
| 244 | + ``__timestamp_ns__`` attribute to determine whether and how to display |
| 245 | + timestamps |
| 246 | + |
| 247 | +Python Configuration |
| 248 | +-------------------- |
| 249 | + |
| 250 | +A new field ``traceback_timestamps`` will be added to ``PyConfig`` to store |
| 251 | +the selected timestamp format. This will be accessible through |
| 252 | +``sys.flags.traceback_timestamps``. |
| 253 | + |
| 254 | + |
| 255 | +Backwards Compatibility |
| 256 | +======================= |
| 257 | + |
| 258 | +This proposal maintains full backwards compatibility: |
| 259 | + |
| 260 | +1. The feature is disabled by default |
| 261 | +2. Existing exception handling code continues to work unchanged |
| 262 | +3. The new attribute is only set when the feature is explicitly enabled |
| 263 | +4. Pickle/unpickle of exceptions works correctly with the new attribute |
| 264 | +5. Third-party exception formatting libraries can ignore the attribute if desired |
| 265 | + |
| 266 | + |
| 267 | +Security Implications |
| 268 | +===================== |
| 269 | + |
| 270 | +Timestamps in exceptions could potentially reveal information about: |
| 271 | + |
| 272 | +1. System performance characteristics |
| 273 | +2. Timing of operations that might be security-sensitive |
| 274 | +3. Patterns of exception handling that could be used for timing attacks |
| 275 | + |
| 276 | +However, since the feature is opt-in and primarily intended for development |
| 277 | +and debugging, these concerns are minimal. Production systems concerned about |
| 278 | +information leakage should not enable this feature. |
| 279 | + |
| 280 | + |
| 281 | +How to Teach This |
| 282 | +================= |
| 283 | + |
| 284 | +The feature should be documented in: |
| 285 | + |
| 286 | +1. The Python documentation for the ``exceptions`` module |
| 287 | +2. The ``traceback`` module documentation |
| 288 | +3. The Python tutorial section on exception handling (as an advanced topic) |
| 289 | +4. The command-line interface documentation |
| 290 | + |
| 291 | +Example use cases should focus on: |
| 292 | + |
| 293 | +- Debugging async applications with multiple concurrent exceptions |
| 294 | +- Correlating exceptions with external system logs |
| 295 | +- Understanding performance issues in exception-heavy code |
| 296 | + |
| 297 | + |
| 298 | +Reference Implementation |
| 299 | +======================== |
| 300 | + |
| 301 | +The reference implementation is available in `CPython PR #129337 |
| 302 | +<https://github.com/python/cpython/pull/129337>`_. |
| 303 | + |
| 304 | +The implementation includes: |
| 305 | + |
| 306 | +- Core exception object changes to add the ``__timestamp_ns__`` attribute |
| 307 | +- Traceback formatting updates to display timestamps |
| 308 | +- Configuration through environment variables and command-line options |
| 309 | +- Special handling for ``StopIteration`` and ``AsyncStopIteration`` |
| 310 | +- Comprehensive test coverage |
| 311 | +- Documentation updates |
| 312 | + |
| 313 | + |
| 314 | +Rejected Ideas |
| 315 | +============== |
| 316 | + |
| 317 | +Using Exception Notes |
| 318 | +--------------------- |
| 319 | + |
| 320 | +An alternative approach would be to use the exception notes feature from |
| 321 | +:pep:`678` to store timestamps. This was rejected because: |
| 322 | + |
| 323 | +1. It would require creating string and list objects for every exception |
| 324 | +2. The performance impact would be significant even when not displaying timestamps |
| 325 | +3. It would make the timestamp less structured and harder to process programmatically |
| 326 | + |
| 327 | +Always Collecting Timestamps |
| 328 | +----------------------------- |
| 329 | + |
| 330 | +Collecting timestamps for all exceptions unconditionally was considered but |
| 331 | +rejected due to: |
| 332 | + |
| 333 | +1. Performance overhead for exceptions used in control flow |
| 334 | +2. Unnecessary memory usage for the vast majority of use cases |
| 335 | +3. Potential security concerns in production environments |
| 336 | + |
| 337 | +Millisecond Precision |
| 338 | +--------------------- |
| 339 | + |
| 340 | +Using millisecond precision instead of nanosecond was considered, but |
| 341 | +nanosecond precision was chosen to: |
| 342 | + |
| 343 | +1. Match the precision of ``time.time_ns()`` |
| 344 | +2. Ensure sufficient precision for high-frequency exception scenarios |
| 345 | +3. Allow for future use cases that might require sub-millisecond precision |
| 346 | + |
| 347 | + |
| 348 | +Open Issues |
| 349 | +=========== |
| 350 | + |
| 351 | +1. Should there be a Python API to programmatically enable/disable timestamps |
| 352 | + at runtime? |
| 353 | + |
| 354 | +2. Should there be a way to retroactively add timestamps to existing exception |
| 355 | + objects? |
| 356 | + |
| 357 | +3. Should the timestamp format be customizable beyond the predefined options? |
| 358 | + |
| 359 | + |
| 360 | +Acknowledgements |
| 361 | +================ |
| 362 | + |
| 363 | +Thanks to Nathaniel J. Smith for the original idea suggestion, and to |
| 364 | +dcolascione for initial review feedback on the implementation. |
| 365 | + |
| 366 | + |
| 367 | +Footnotes |
| 368 | +========= |
| 369 | + |
| 370 | +References: |
| 371 | + |
| 372 | +- `CPython PR #129337 <https://github.com/python/cpython/pull/129337>`_ |
| 373 | +- :pep:`654` -- Exception Groups and except* |
| 374 | +- :pep:`678` -- Enriching Exceptions with Notes |
| 375 | + |
| 376 | + |
| 377 | +Copyright |
| 378 | +========= |
| 379 | + |
| 380 | +This document is placed in the public domain or under the |
| 381 | +CC0-1.0-Universal license, whichever is more permissive. |
0 commit comments