Commit 7dfc4ee

Merge main into inverted_flamegraph
2 parents af60172 + b1c9582 commit 7dfc4ee

33 files changed, +995 -79 lines changed

Doc/library/profiling.sampling.rst

Lines changed: 67 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -470,9 +470,10 @@ which you can use to judge whether the data is sufficient for your analysis.
470470
Profiling modes
471471
===============
472472

473-
The sampling profiler supports three modes that control which samples are
473+
The sampling profiler supports four modes that control which samples are
474474
recorded. The mode determines what the profile measures: total elapsed time,
475-
CPU execution time, or time spent holding the global interpreter lock.
475+
CPU execution time, time spent holding the global interpreter lock, or
476+
exception handling.
476477

477478

478479
Wall-clock mode
@@ -553,6 +554,67 @@ single-threaded programs to distinguish Python execution time from time spent
553554
in C extensions or I/O.
554555

555556

557+
Exception mode
558+
--------------
559+
560+
Exception mode (``--mode=exception``) records samples only when a thread has
561+
an active exception::
562+
563+
python -m profiling.sampling run --mode=exception script.py
564+
565+
Samples are recorded in two situations: when an exception is being propagated
566+
up the call stack (after ``raise`` but before being caught), or when code is
567+
executing inside an ``except`` block where exception information is still
568+
present in the thread state.
569+
570+
The following example illustrates which code regions are captured:
571+
572+
.. code-block:: python
573+
574+
def example():
575+
try:
576+
raise ValueError("error") # Captured: exception being raised
577+
except ValueError:
578+
process_error() # Captured: inside except block
579+
finally:
580+
cleanup() # NOT captured: exception already handled
581+
582+
def example_propagating():
583+
try:
584+
try:
585+
raise ValueError("error")
586+
finally:
587+
cleanup() # Captured: exception propagating through
588+
except ValueError:
589+
pass
590+
591+
def example_no_exception():
592+
try:
593+
do_work()
594+
finally:
595+
cleanup() # NOT captured: no exception involved
596+
597+
Note that ``finally`` blocks are only captured when an exception is actively
598+
propagating through them. Once an ``except`` block finishes executing, Python
599+
clears the exception information before running any subsequent ``finally``
600+
block. Similarly, ``finally`` blocks that run during normal execution (when no
601+
exception was raised) are not captured because no exception state is present.
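
The three cases above can be observed from inside a running program. What follows is a minimal sketch, not part of the documented change: it uses ``sys.exc_info()`` as a stand-in for the exception state the profiler inspects, and the helper name ``active_exception`` and the ``observed`` dictionary are hypothetical. The profiler reads the thread state from outside the process, so this only approximates what it sees.

```python
import sys

observed = {}


def active_exception():
    """Return the exception currently being handled in this thread, or None."""
    return sys.exc_info()[1]


def handled():
    try:
        raise ValueError("error")
    except ValueError:
        # Inside the handler: exception information is present.
        observed["except"] = active_exception()
    finally:
        # The handler has finished, so the exception information is cleared.
        observed["finally_after_except"] = active_exception()


def propagating():
    try:
        try:
            raise ValueError("error")
        finally:
            # The exception is still travelling up through this block.
            observed["finally_propagating"] = active_exception()
    except ValueError:
        pass


def no_exception():
    try:
        pass
    finally:
        # Normal execution: no exception was ever raised.
        observed["finally_normal"] = active_exception()


handled()
propagating()
no_exception()

for region, exc in observed.items():
    print(f"{region}: {'active' if exc is not None else 'none'}")
```

Only the ``except`` block and the ``finally`` block with a propagating exception report an active exception, which matches the regions marked as captured in the documentation example.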
602+
603+
This mode is useful for understanding where your program spends time handling
604+
errors. Exception handling can be a significant source of overhead in code
605+
that uses exceptions for flow control (such as ``StopIteration`` in iterators)
606+
or in applications that process many error conditions (such as network servers
607+
handling connection failures).
608+
609+
Exception mode helps answer questions like "how much time is spent handling
610+
exceptions?" and "which exception handlers are the most expensive?" It can
611+
reveal hidden performance costs in code that catches and processes many
612+
exceptions, even when those exceptions are handled gracefully. For example,
613+
if a parsing library uses exceptions internally to signal format errors, this
614+
mode will capture time spent in those handlers even if the calling code never
615+
sees the exceptions.
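
As an illustration of the parsing scenario, here is a small hypothetical workload in which a helper signals bad input with exceptions that its caller never sees. The function names and the input data are invented for this sketch. Saved as ``script.py``, it could be run with the ``--mode=exception`` command shown earlier in this section.

```python
def parse_port(text):
    """Parse a TCP port number, using an exception to signal bad input."""
    try:
        port = int(text)
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        return port
    except ValueError:
        # This handler runs with an active exception, so by the description
        # above, time spent here is what exception mode would sample.
        return None


def count_valid(tokens):
    """Count tokens that parse as ports; callers never see the ValueError."""
    return sum(1 for token in tokens if parse_port(token) is not None)


if __name__ == "__main__":
    # Half of these tokens fail to parse, so the handler runs often.
    tokens = ["80", "http", "443", "70000", "", "8080"] * 20_000
    print(count_valid(tokens))
```

In wall-clock mode the cost of the handler would be mixed in with the rest of ``parse_port``. Exception mode should isolate it, because samples are only recorded while the exception is active.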
616+
617+
556618
Output formats
557619
==============
558620

@@ -1006,8 +1068,9 @@ Mode options
10061068

10071069
.. option:: --mode <mode>
10081070

1009-
Sampling mode: ``wall`` (default), ``cpu``, or ``gil``.
1010-
The ``cpu`` and ``gil`` modes are incompatible with ``--async-aware``.
1071+
Sampling mode: ``wall`` (default), ``cpu``, ``gil``, or ``exception``.
1072+
The ``cpu``, ``gil``, and ``exception`` modes are incompatible with
1073+
``--async-aware``.
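
The compatibility rule can be expressed as a small validation step. The sketch below uses ``argparse`` and is an illustration only: the real command-line interface may implement the check differently. Treating ``--async-aware`` as a boolean flag is an assumption, and the names ``SYNC_ONLY_MODES``, ``build_parser`` and ``parse_args`` are hypothetical.

```python
import argparse

# Modes the documentation lists as incompatible with --async-aware.
SYNC_ONLY_MODES = {"cpu", "gil", "exception"}


def build_parser():
    """Build a toy parser with only the two options discussed here."""
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument(
        "--mode",
        choices=["wall", "cpu", "gil", "exception"],
        default="wall",
    )
    # Assumption: modelled as a simple on/off flag for this sketch.
    parser.add_argument("--async-aware", action="store_true")
    return parser


def parse_args(argv):
    """Parse argv and reject the documented incompatible combinations."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.async_aware and args.mode in SYNC_ONLY_MODES:
        # parser.error() prints a usage message and exits with status 2.
        parser.error(f"--mode={args.mode} is incompatible with --async-aware")
    return args
```

With this sketch, ``parse_args(["--mode", "exception"])`` succeeds, while adding ``--async-aware`` to the same arguments exits with an error.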
10111074

10121075
.. option:: --async-mode <mode>
10131076

Doc/whatsnew/3.15.rst

Lines changed: 6 additions & 0 deletions
@@ -146,6 +146,8 @@ Key features include:
146146
and blocking. Use this to identify CPU-bound bottlenecks and optimize computational work.
147147
* **GIL-holding time** (``--mode gil``): Measures time spent holding Python's Global Interpreter
148148
Lock. Use this to identify which threads dominate GIL usage in multi-threaded applications.
149+
* **Exception handling time** (``--mode exception``): Captures samples only from threads with
150+
an active exception. Use this to analyze exception handling overhead.
149151

150152
* **Thread-aware profiling**: Option to profile all threads (``-a``) or just the main thread,
151153
essential for understanding multi-threaded application behavior.
@@ -175,6 +177,10 @@ Key features include:
175177
(``--async-aware``). See which coroutines are consuming time, with options to show only
176178
running tasks or all tasks including those waiting.
177179

180+
* **Opcode-level profiling**: Gather bytecode opcode information for instruction-level
181+
profiling (``--opcodes``). Shows which bytecode instructions are executing, including
182+
specializations from the adaptive interpreter.
183+
178184
See :mod:`profiling.sampling` for the complete documentation, including all
179185
available output formats, profiling modes, and configuration options.
180186

Include/cpython/pyatomic.h

Lines changed: 11 additions & 0 deletions
@@ -591,6 +591,17 @@ static inline void _Py_atomic_fence_release(void);
591591

592592
// --- aliases ---------------------------------------------------------------
593593

594+
// Compilers don't really support "consume" semantics, so we fake it. Use
595+
// "acquire" with TSan to support false positives. Use "relaxed" otherwise,
596+
// because CPUs on all platforms we support respect address dependencies without
597+
// extra barriers.
598+
// See 2.6.7 in https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2055r0.pdf
599+
#if defined(_Py_THREAD_SANITIZER)
600+
# define _Py_atomic_load_ptr_consume _Py_atomic_load_ptr_acquire
601+
#else
602+
# define _Py_atomic_load_ptr_consume _Py_atomic_load_ptr_relaxed
603+
#endif
604+
594605
#if SIZEOF_LONG == 8
595606
# define _Py_atomic_load_ulong(p) \
596607
_Py_atomic_load_uint64((uint64_t *)p)

Include/internal/pycore_debug_offsets.h

Lines changed: 12 additions & 0 deletions
@@ -110,8 +110,15 @@ typedef struct _Py_DebugOffsets {
110110
uint64_t status;
111111
uint64_t holds_gil;
112112
uint64_t gil_requested;
113+
uint64_t current_exception;
114+
uint64_t exc_state;
113115
} thread_state;
114116

117+
// Exception stack item offset
118+
struct {
119+
uint64_t exc_value;
120+
} err_stackitem;
121+
115122
// InterpreterFrame offset;
116123
struct _interpreter_frame {
117124
uint64_t size;
@@ -282,6 +289,11 @@ typedef struct _Py_DebugOffsets {
282289
.status = offsetof(PyThreadState, _status), \
283290
.holds_gil = offsetof(PyThreadState, holds_gil), \
284291
.gil_requested = offsetof(PyThreadState, gil_requested), \
292+
.current_exception = offsetof(PyThreadState, current_exception), \
293+
.exc_state = offsetof(PyThreadState, exc_state), \
294+
}, \
295+
.err_stackitem = { \
296+
.exc_value = offsetof(_PyErr_StackItem, exc_value), \
285297
}, \
286298
.interpreter_frame = { \
287299
.size = sizeof(_PyInterpreterFrame), \

Include/internal/pycore_pyatomic_ft_wrappers.h

Lines changed: 3 additions & 0 deletions
@@ -31,6 +31,8 @@ extern "C" {
3131
_Py_atomic_store_ptr(&value, new_value)
3232
#define FT_ATOMIC_LOAD_PTR_ACQUIRE(value) \
3333
_Py_atomic_load_ptr_acquire(&value)
34+
#define FT_ATOMIC_LOAD_PTR_CONSUME(value) \
35+
_Py_atomic_load_ptr_consume(&value)
3436
#define FT_ATOMIC_LOAD_UINTPTR_ACQUIRE(value) \
3537
_Py_atomic_load_uintptr_acquire(&value)
3638
#define FT_ATOMIC_LOAD_PTR_RELAXED(value) \
@@ -125,6 +127,7 @@ extern "C" {
125127
#define FT_ATOMIC_LOAD_SSIZE_ACQUIRE(value) value
126128
#define FT_ATOMIC_LOAD_SSIZE_RELAXED(value) value
127129
#define FT_ATOMIC_LOAD_PTR_ACQUIRE(value) value
130+
#define FT_ATOMIC_LOAD_PTR_CONSUME(value) value
128131
#define FT_ATOMIC_LOAD_UINTPTR_ACQUIRE(value) value
129132
#define FT_ATOMIC_LOAD_PTR_RELAXED(value) value
130133
#define FT_ATOMIC_LOAD_UINT8(value) value

Lib/profiling/sampling/_flamegraph_assets/flamegraph.css

Lines changed: 22 additions & 12 deletions
@@ -343,34 +343,44 @@ body.resizing-sidebar {
343343
gap: 8px;
344344
padding: 8px 10px;
345345
background: var(--bg-primary);
346-
border: 1px solid var(--border);
346+
border: 2px solid var(--border);
347347
border-radius: 8px;
348348
transition: all var(--transition-fast);
349349
animation: slideUp 0.4s ease-out backwards;
350-
animation-delay: calc(var(--i, 0) * 0.05s);
350+
animation-delay: calc(var(--i, 0) * 0.08s);
351351
overflow: hidden;
352+
position: relative;
352353
}
353354

354-
.summary-card:nth-child(1) { --i: 0; }
355-
.summary-card:nth-child(2) { --i: 1; }
356-
.summary-card:nth-child(3) { --i: 2; }
357-
.summary-card:nth-child(4) { --i: 3; }
355+
.summary-card:nth-child(1) { --i: 0; --card-color: 55, 118, 171; }
356+
.summary-card:nth-child(2) { --i: 1; --card-color: 40, 167, 69; }
357+
.summary-card:nth-child(3) { --i: 2; --card-color: 255, 193, 7; }
358+
.summary-card:nth-child(4) { --i: 3; --card-color: 111, 66, 193; }
358359

359360
.summary-card:hover {
360-
border-color: var(--accent);
361-
background: var(--accent-glow);
361+
border-color: rgba(var(--card-color), 0.6);
362+
background: linear-gradient(135deg, rgba(var(--card-color), 0.08) 0%, var(--bg-primary) 100%);
363+
transform: translateY(-2px);
364+
box-shadow: 0 4px 12px rgba(var(--card-color), 0.15);
362365
}
363366

364367
.summary-icon {
365-
font-size: 16px;
368+
font-size: 14px;
366369
width: 28px;
367370
height: 28px;
368371
display: flex;
369372
align-items: center;
370373
justify-content: center;
371-
background: var(--bg-tertiary);
374+
background: linear-gradient(135deg, rgba(var(--card-color), 0.15) 0%, rgba(var(--card-color), 0.05) 100%);
375+
border: 1px solid rgba(var(--card-color), 0.2);
372376
border-radius: 6px;
373377
flex-shrink: 0;
378+
transition: all var(--transition-fast);
379+
}
380+
381+
.summary-card:hover .summary-icon {
382+
transform: scale(1.05);
383+
background: linear-gradient(135deg, rgba(var(--card-color), 0.25) 0%, rgba(var(--card-color), 0.1) 100%);
374384
}
375385

376386
.summary-data {
@@ -382,8 +392,8 @@ body.resizing-sidebar {
382392
.summary-value {
383393
font-family: var(--font-mono);
384394
font-size: 13px;
385-
font-weight: 700;
386-
color: var(--accent);
395+
font-weight: 800;
396+
color: rgb(var(--card-color));
387397
line-height: 1.2;
388398
white-space: nowrap;
389399
overflow: hidden;

Lib/profiling/sampling/_flamegraph_assets/flamegraph.js

Lines changed: 36 additions & 0 deletions
@@ -190,6 +190,27 @@ function restoreUIState() {
190190
}
191191
}
192192

193+
// ============================================================================
194+
// Logo/Favicon Setup
195+
// ============================================================================
196+
197+
function setupLogos() {
198+
const logo = document.querySelector('.sidebar-logo-img img');
199+
if (!logo) return;
200+
201+
const navbarLogoContainer = document.getElementById('navbar-logo');
202+
if (navbarLogoContainer) {
203+
const navbarLogo = logo.cloneNode(true);
204+
navbarLogoContainer.appendChild(navbarLogo);
205+
}
206+
207+
const favicon = document.createElement('link');
208+
favicon.rel = 'icon';
209+
favicon.type = 'image/png';
210+
favicon.href = logo.src;
211+
document.head.appendChild(favicon);
212+
}
213+
193214
// ============================================================================
194215
// Status Bar
195216
// ============================================================================
@@ -201,6 +222,11 @@ function updateStatusBar(nodeData, rootValue) {
201222
const timeMs = (nodeData.value / 1000).toFixed(2);
202223
const percent = rootValue > 0 ? ((nodeData.value / rootValue) * 100).toFixed(1) : "0.0";
203224

225+
const brandEl = document.getElementById('status-brand');
226+
const taglineEl = document.getElementById('status-tagline');
227+
if (brandEl) brandEl.style.display = 'none';
228+
if (taglineEl) taglineEl.style.display = 'none';
229+
204230
const locationEl = document.getElementById('status-location');
205231
const funcItem = document.getElementById('status-func-item');
206232
const timeItem = document.getElementById('status-time-item');
@@ -233,6 +259,11 @@ function clearStatusBar() {
233259
const el = document.getElementById(id);
234260
if (el) el.style.display = 'none';
235261
});
262+
263+
const brandEl = document.getElementById('status-brand');
264+
const taglineEl = document.getElementById('status-tagline');
265+
if (brandEl) brandEl.style.display = 'flex';
266+
if (taglineEl) taglineEl.style.display = 'flex';
236267
}
237268

238269
// ============================================================================
@@ -723,6 +754,10 @@ function populateThreadStats(data, selectedThreadId = null) {
723754

724755
const gcPctElem = document.getElementById('gc-pct');
725756
if (gcPctElem) gcPctElem.textContent = `${(threadStats.gc_pct || 0).toFixed(1)}%`;
757+
758+
// Exception stats
759+
const excPctElem = document.getElementById('exc-pct');
760+
if (excPctElem) excPctElem.textContent = `${(threadStats.has_exception_pct || 0).toFixed(1)}%`;
726761
}
727762

728763
// ============================================================================
@@ -1235,6 +1270,7 @@ function toggleInvert() {
12351270
function initFlamegraph() {
12361271
ensureLibraryLoaded();
12371272
restoreUIState();
1273+
setupLogos();
12381274

12391275
if (EMBEDDED_DATA.strings) {
12401276
stringTable = EMBEDDED_DATA.strings;

Lib/profiling/sampling/_flamegraph_assets/flamegraph_template.html

Lines changed: 13 additions & 2 deletions
@@ -3,7 +3,7 @@
33
<head>
44
<meta charset="UTF-8" />
55
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
6-
<title>Tachyon Profiler - Flamegraph</title>
6+
<title>Tachyon Profiler - Flamegraph Report</title>
77
<!-- INLINE_VENDOR_D3_JS -->
88
<!-- INLINE_VENDOR_FLAMEGRAPH_CSS -->
99
<!-- INLINE_VENDOR_FLAMEGRAPH_JS -->
@@ -15,9 +15,10 @@
1515
<!-- Top Bar -->
1616
<header class="top-bar">
1717
<div class="brand">
18+
<div class="brand-logo" id="navbar-logo"></div>
1819
<span class="brand-text">Tachyon</span>
1920
<span class="brand-divider"></span>
20-
<span class="brand-subtitle">Profiler</span>
21+
<span class="brand-subtitle">Flamegraph Report</span>
2122
</div>
2223
<div class="search-wrapper">
2324
<input
@@ -171,6 +172,10 @@ <h3 class="section-title">Runtime Stats</h3>
171172
<div class="stat-tile-value" id="gc-pct">--</div>
172173
<div class="stat-tile-label">GC</div>
173174
</div>
175+
<div class="stat-tile stat-tile--red" id="exc-stat">
176+
<div class="stat-tile-value" id="exc-pct">--</div>
177+
<div class="stat-tile-label">Exception</div>
178+
</div>
174179
</div>
175180
</div>
176181
</section>
@@ -296,6 +301,12 @@ <h3 class="section-title">Heat Map</h3>
296301

297302
<!-- Status Bar -->
298303
<footer class="status-bar">
304+
<span class="status-item" id="status-brand">
305+
<span class="status-value">Tachyon Profiler</span>
306+
</span>
307+
<span class="status-item" id="status-tagline">
308+
<span class="status-label">Python Sampling Profiler</span>
309+
</span>
299310
<span class="status-item" id="status-location" style="display: none;">
300311
<span class="status-label">File:</span>
301312
<span class="status-value" id="status-file">--</span>

Lib/profiling/sampling/_heatmap_assets/heatmap.css

Lines changed: 0 additions & 18 deletions
@@ -20,24 +20,6 @@
2020
z-index: 100;
2121
}
2222

23-
/* Back link in toolbar */
24-
.back-link {
25-
color: white;
26-
text-decoration: none;
27-
padding: 6px 14px;
28-
background: rgba(255, 255, 255, 0.12);
29-
border: 1px solid rgba(255, 255, 255, 0.18);
30-
border-radius: 6px;
31-
font-size: 13px;
32-
font-weight: 500;
33-
transition: all var(--transition-fast);
34-
}
35-
36-
.back-link:hover {
37-
background: rgba(255, 255, 255, 0.22);
38-
border-color: rgba(255, 255, 255, 0.35);
39-
}
40-
4123
/* --------------------------------------------------------------------------
4224
Main Content Area
4325
-------------------------------------------------------------------------- */
