 .. module:: profiling
    :synopsis: Python profiling tools for performance analysis.

+.. versionadded:: 3.15
+
 **Source code:** :source:`Lib/profiling/`

 --------------

-.. versionadded:: 3.15
-
 .. index::
    single: statistical profiling
    single: profiling, statistical
@@ -34,16 +34,15 @@ a single namespace. It contains two submodules, each implementing a different
 profiling methodology:

 :mod:`profiling.sampling`
-   A statistical profiler that periodically samples the call stack of running
-   processes. It can attach to any Python process without requiring code
-   modification, making it ideal for production debugging. The sampling
-   approach introduces virtually no overhead to the profiled program.
+   A statistical profiler that periodically samples the call stack. It can
+   run scripts directly or attach to running processes by PID, and it
+   provides multiple output formats (flamegraphs, heatmaps, Firefox
+   Profiler), GIL analysis, GC tracking, and several profiling modes
+   (wall-clock, CPU, GIL), all with virtually no overhead.

 :mod:`profiling.tracing`
    A deterministic profiler that traces every function call, return, and
-   exception event. It provides exact call counts and timing information,
-   making it suitable for development and testing where precision matters
-   more than minimal overhead.
+   exception event. It provides exact call counts and precise timing
+   information, capturing every invocation, including very fast functions.

 .. note::

@@ -60,86 +59,82 @@ profiling methodology:
 Choosing a Profiler
 ===================

-The choice between statistical sampling and deterministic tracing depends on
-your specific use case. Each approach offers distinct advantages that make it
-better suited to certain scenarios.
+For most performance analysis, use the statistical profiler
+(:mod:`profiling.sampling`). It has minimal overhead, works in both
+development and production, and offers rich output options, including
+flamegraphs, heatmaps, and GIL analysis.

-Statistical profiling excels when you need to understand performance
-characteristics without affecting the program's behavior. Because the sampling
-profiler reads process memory externally rather than instrumenting code, it
-introduces virtually no overhead. This makes it the right choice for profiling
-production systems, investigating intermittent performance issues, and
-analyzing programs where timing accuracy is critical.
-
-Deterministic profiling provides complete visibility into program execution.
-Every function call is recorded with precise timing, giving you exact call
-counts and the ability to trace the full call graph. This level of detail is
-valuable during development when you need to understand exactly how code
-executes, verify that optimizations reduce call counts, or identify unexpected
-function calls.
+Use the deterministic profiler (:mod:`profiling.tracing`) when you need
+**exact call counts** and cannot afford to miss any function calls. Because it
+instruments every function call and return, it captures even very fast
+functions that a sampling profiler could miss between samples. The tradeoff is
+higher overhead.

 The following table summarizes the key differences:

 +--------------------+------------------------------+------------------------------+
 | Feature            | Statistical Sampling         | Deterministic                |
 |                    | (:mod:`profiling.sampling`)  | (:mod:`profiling.tracing`)   |
 +====================+==============================+==============================+
-| **Target**         | Running process              | Code you run                 |
-+--------------------+------------------------------+------------------------------+
 | **Overhead**       | Virtually none               | Moderate                     |
 +--------------------+------------------------------+------------------------------+
 | **Accuracy**       | Statistical estimate         | Exact call counts            |
 +--------------------+------------------------------+------------------------------+
-| **Setup**          | Attach to any PID            | Instrument code              |
+| **Output formats** | pstats, flamegraph, heatmap, | pstats                       |
+|                    | gecko, collapsed             |                              |
++--------------------+------------------------------+------------------------------+
+| **Profiling modes**| Wall-clock, CPU, GIL         | Wall-clock                   |
 +--------------------+------------------------------+------------------------------+
-| **Use Case**       | Production debugging         | Development/testing          |
+| **Special frames** | GC, native (C extensions)    | N/A                          |
 +--------------------+------------------------------+------------------------------+
-| **Implementation** | C extension                  | C extension                  |
+| **Attach to PID**  | Yes                          | No                           |
 +--------------------+------------------------------+------------------------------+


 When to Use Statistical Sampling
 --------------------------------

-The statistical profiler (:mod:`profiling.sampling`) is recommended when:
+The statistical profiler (:mod:`profiling.sampling`) is recommended for most
+performance analysis tasks. Use it the same way you would use ``cProfile``::

-You need to profile a production system where any performance impact is
-unacceptable. The sampling profiler runs in a separate process and reads the
-target process memory without interrupting its execution.
+   python -m profiling.sampling run script.py

-You want to profile an already-running process without restarting it or
-modifying its code. Simply provide the process ID and the profiler will attach
-and begin collecting data.
+One of the main strengths of the sampling profiler is its variety of output
+formats. Beyond traditional pstats tables, it can generate interactive
+flamegraphs that visualize call hierarchies, line-level source heatmaps that
+show exactly where time is spent in your code, and Firefox Profiler output for
+timeline-based analysis.

-You are investigating performance issues that might be affected by profiler
-overhead. Since statistical profiling does not instrument the code, it captures
-the program's natural behavior.
+The profiler also provides insight into Python interpreter behavior that
+deterministic profiling cannot capture. Use ``--mode gil`` to identify GIL
+contention in multi-threaded code, ``--mode cpu`` to measure actual CPU time
+excluding I/O waits, or inspect ``<GC>`` frames to understand garbage collection
+overhead. The ``--native`` option reveals time spent in C extensions, helping
+distinguish Python overhead from library performance.

-Your application uses multiple threads and you want to understand how work is
-distributed across them. The sampling profiler can collect stack traces from
-all threads simultaneously.
+For multi-threaded applications, the ``-a`` option samples all threads
+simultaneously, showing how work is distributed across them. For production
+debugging, the ``attach`` command connects to any running Python process by
+PID without requiring a restart or code changes.
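+
+For example, to look for GIL contention while running a script, or to sample
+every thread of an already-running process in CPU mode (assuming the options
+above can be combined freely with either command)::
+
+   python -m profiling.sampling run --mode gil script.py
+   python -m profiling.sampling attach -a --mode cpu 1234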


 When to Use Deterministic Tracing
 ---------------------------------

-The deterministic profiler (:mod:`profiling.tracing`) is recommended when:
+The deterministic profiler (:mod:`profiling.tracing`) instruments every function
+call and return. This approach has higher overhead than sampling, but guarantees
+complete coverage of program execution.

-You need exact call counts for every function. Statistical profiling provides
-estimates based on sampling frequency, which may miss or undercount
-short-lived function calls.
+The primary reason to choose deterministic tracing is the need for exact call
+counts. Statistical profiling estimates call frequency from samples and may
+undercount short-lived functions that complete between samples. If you need to
+verify that an optimization actually reduced the number of function calls, or
+if you want to trace the complete call graph to understand caller-callee
+relationships, deterministic tracing is the right choice.

-You want to trace the complete call graph and understand caller-callee
-relationships. Deterministic profiling records every call, enabling detailed
-analysis of how functions interact.
-
-You are developing or testing code and can tolerate moderate overhead in
-exchange for precise measurements. The overhead is acceptable for most
-development workflows.
-
-You need to measure time spent in specific functions accurately, including
-functions that execute quickly. Deterministic profiling captures every
-invocation rather than relying on statistical sampling.
+Deterministic tracing also excels at capturing functions that execute in
+microseconds. Such functions may not appear frequently enough in statistical
+samples, but deterministic tracing records every invocation regardless of
+duration.
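+
+For example, to verify that an optimization reduced call counts, you can save
+the profile data to a file and inspect it with :mod:`pstats`. A minimal sketch
+(it assumes :func:`profiling.tracing.run` accepts an output filename, as the
+legacy :func:`cProfile.run` does, and ``process_data`` stands in for your own
+function)::
+
+   import profiling.tracing
+   import pstats
+
+   # Profile the workload and write the raw data to a file.
+   profiling.tracing.run('process_data()', 'after.prof')
+
+   # Show the ten most frequently called functions; compare the ncalls
+   # column against a profile taken before the change.
+   pstats.Stats('after.prof').sort_stats('calls').print_stats(10)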


 Quick Start
@@ -152,76 +147,83 @@ documentation, see the dedicated pages for each profiler.
 Statistical Profiling
 ---------------------

-To profile a running Python process, use the :mod:`profiling.sampling` module
-with the process ID::
+To profile a script, use the :mod:`profiling.sampling` module with the ``run``
+command::
+
+   python -m profiling.sampling run script.py
+   python -m profiling.sampling run -m mypackage.module

-   python -m profiling.sampling 1234
+This runs the script under the profiler and prints a summary of where time was
+spent. For an interactive flamegraph::

-This attaches to process 1234, samples its call stack for the default duration,
-and prints a summary of where time was spent. No changes to the target process
-are required.
+   python -m profiling.sampling run --flamegraph script.py
+
+To profile an already-running process, use the ``attach`` command with the
+process ID::
+
+   python -m profiling.sampling attach 1234

 For custom settings, specify the sampling interval (in microseconds) and
 duration (in seconds)::

-   python -m profiling.sampling -i 50 -d 30 1234
+   python -m profiling.sampling run -i 50 -d 30 script.py

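+For example, these settings can be combined with an output format, or applied
+to a running process (assuming ``attach`` accepts the same sampling options as
+``run``)::
+
+   python -m profiling.sampling run -d 30 --flamegraph script.py
+   python -m profiling.sampling attach -i 50 -d 30 1234
+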

 Deterministic Profiling
 -----------------------

-To profile a piece of code, use the :mod:`profiling.tracing` module::
+To profile a script from the command line::
+
+   python -m profiling.tracing myscript.py
+
+To profile a piece of code programmatically::

    import profiling.tracing
    profiling.tracing.run('my_function()')

 This executes the given code under the profiler and prints a summary showing
-function call counts and timing. For profiling a script from the command line::
-
-   python -m profiling.tracing myscript.py
+exact function call counts and timing.
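+
+If :mod:`profiling.tracing` keeps the legacy :mod:`cProfile` command-line
+options (an assumption worth checking against the module's own documentation),
+you can sort the report or save the raw data for later analysis with
+:mod:`pstats`::
+
+   python -m profiling.tracing -s cumulative myscript.py
+   python -m profiling.tracing -o myscript.prof myscript.py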


 .. _profile-output:

 Understanding Profile Output
 ============================

-Both profilers produce output showing function-level statistics. The
-deterministic profiler output looks like this::
-
-      214 function calls (207 primitive calls) in 0.002 seconds
-
-   Ordered by: cumulative time
+Both profilers collect function-level statistics, though they present them in
+different formats. The sampling profiler offers multiple visualizations
+(flamegraphs, heatmaps, Firefox Profiler, pstats tables), while the
+deterministic profiler produces pstats-compatible output. Regardless of format,
+the underlying concepts are the same.

-   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
-        1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
-        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
-        1    0.000    0.000    0.001    0.001 __init__.py:250(compile)
+Key profiling concepts:

-The first line indicates that 214 calls were monitored, of which 207 were
-:dfn:`primitive` (not induced via recursion). The column headings are:
+**Direct time** (also called *self time* or *tottime*)
+   Time spent executing code in the function itself, excluding time spent in
+   functions it called. High direct time indicates the function contains
+   expensive operations.

-ncalls
-   The number of calls to this function. When two numbers appear separated by
-   a slash (for example, ``3/1``), the function recursed: the first number is
-   the total calls and the second is the primitive (non-recursive) calls.
+**Cumulative time** (also called *total time* or *cumtime*)
+   Time spent in the function and all functions it called. This measures the
+   total cost of calling a function, including its entire call subtree.

-tottime
-   The total time spent in this function alone, excluding time spent in
-   functions it called.
+**Call count** (also called *ncalls* or *samples*)
+   How many times the function was called (deterministic) or sampled
+   (statistical). In deterministic profiling, this is exact. In statistical
+   profiling, it represents the number of times the function appeared in a
+   stack sample.

-percall
-   The quotient of ``tottime`` divided by ``ncalls``.
+**Primitive calls**
+   Calls that are not induced by recursion. When a function recurses, the
+   total call count includes the recursive invocations, while the primitive
+   count includes only calls that were not reached through recursion.
+   Displayed as ``total/primitive`` (for example, ``3/1`` means 3 total
+   calls, 1 primitive).

-cumtime
-   The cumulative time spent in this function and all functions it called.
-   This figure is accurate even for recursive functions.
-
-percall
-   The quotient of ``cumtime`` divided by primitive calls.
-
-filename:lineno(function)
-   The location and name of the function.
+**Caller/Callee relationships**
+   Which functions called a given function (callers) and which functions it
+   called (callees). Flamegraphs visualize this as nested rectangles; pstats
+   can display it via the :meth:`~pstats.Stats.print_callers` and
+   :meth:`~pstats.Stats.print_callees` methods.
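+
+For example, the following sketch (the function names are illustrative) makes
+the distinction between direct and cumulative time concrete: ``outer`` has
+almost no direct time of its own, but its cumulative time includes everything
+spent in ``inner``::
+
+   import profiling.tracing
+
+   def inner():
+       # All of the real work happens here, so inner() accumulates direct time.
+       return sum(i * i for i in range(200_000))
+
+   def outer():
+       # outer() does almost no work itself: its direct (self) time stays
+       # small, while its cumulative time includes every call to inner().
+       return [inner() for _ in range(5)]
+
+   profiling.tracing.run('outer()')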


 Legacy Compatibility
@@ -240,11 +242,12 @@ continue to work without modification in all future Python versions.

 .. seealso::

-   :mod:`profiling.tracing`
-      Deterministic tracing profiler for development and testing.
-
    :mod:`profiling.sampling`
-      Statistical sampling profiler for production debugging.
+      Statistical sampling profiler with flamegraphs, heatmaps, and GIL
+      analysis. Recommended for most users.
+
+   :mod:`profiling.tracing`
+      Deterministic tracing profiler for exact call counts.

    :mod:`pstats`
       Statistics analysis and formatting for profile data.