@@ -96,35 +96,133 @@ performance issues in production environments.
9696Key features include:
9797
9898* **Zero-overhead profiling**: Attach to any running Python process without
99-   affecting its performance
100- * **No code modification required**: Profile existing applications without restart
101- * **Real-time statistics**: Monitor sampling quality during data collection
102- * **Multiple output formats**: Generate both detailed statistics and flamegraph data
103- * **Thread-aware profiling**: Option to profile all threads or just the main thread
99+   affecting its performance. Ideal for production debugging where you can't afford
100+   to restart or slow down your application.
104101
105- Profile process 1234 for 10 seconds with default settings:
102+ * **No code modification required**: Profile existing applications without restart.
103+   Simply point the profiler at a running process by PID and start collecting data.
104+
105+ * **Flexible target modes**:
106+
107+   * Profile running processes by PID (``-p``) - attach to already-running applications
108+   * Run and profile scripts directly - profile from the very start of execution
109+   * Execute and profile modules (``-m``) - profile packages run as ``python -m module``
110+
111+ * **Multiple profiling modes**: Choose what to measure based on your performance investigation:
112+
113+   * **Wall-clock time** (``--mode wall``, default): Measures real elapsed time including I/O,
114+     network waits, and blocking operations. Use this to understand where your program spends
115+     elapsed time, including time spent waiting for external resources.
116+   * **CPU time** (``--mode cpu``): Measures only active CPU execution time, excluding I/O waits
117+     and blocking. Use this to identify CPU-bound bottlenecks and optimize computational work.
118+   * **GIL-holding time** (``--mode gil``): Measures time spent holding Python's Global Interpreter
119+     Lock. Use this to identify which threads dominate GIL usage in multi-threaded applications.
120+
121+ * **Thread-aware profiling**: Option to profile all threads (``-a``) or just the main thread,
122+   essential for understanding multi-threaded application behavior.
123+
124+ * **Multiple output formats**: Choose the visualization that best fits your workflow:
125+
126+   * ``--pstats``: Detailed tabular statistics compatible with :mod:`pstats`. Shows function-level
127+     timing with direct and cumulative samples. Best for detailed analysis and integration with
128+     existing Python profiling tools.
129+   * ``--collapsed``: Generates collapsed stack traces (one line per stack). This format is
130+     specifically designed for creating flamegraphs with external tools like Brendan Gregg's
131+     FlameGraph scripts or speedscope.
132+   * ``--flamegraph``: Generates a self-contained interactive HTML flamegraph using D3.js.
133+     Opens directly in your browser for immediate visual analysis. Flamegraphs show the call
134+     hierarchy where width represents time spent, making it easy to spot bottlenecks at a glance.
135+   * ``--gecko``: Generates Gecko Profiler format compatible with Firefox Profiler
136+     (https://profiler.firefox.com). Upload the output to Firefox Profiler for advanced
137+     timeline-based analysis with features like stack charts, markers, and network activity.
138+
139+ * **Advanced sorting options**: Sort by direct samples, total time, cumulative time,
140+   sample percentage, cumulative percentage, or function name. Sorting by sample count
141+   surfaces the functions that appear most often in the sampled stack traces.
142+
143+ * **Flexible output control**: Limit results to top N functions (``-l``), customize sorting,
144+   and disable summary sections for cleaner output suitable for automation.
145+
146+ **Basic usage examples:**
147+
148+ Attach to a running process and get quick profiling stats:
149+
150+ .. code-block:: shell
151+
152+    python -m profiling.sampling -p 1234
153+
154+ Profile a script from the start of its execution:
155+
156+ .. code-block:: shell
157+
158+    python -m profiling.sampling myscript.py arg1 arg2
159+
160+ Profile a module (for example, ``python -m http.server``):
161+
162+ .. code-block:: shell
163+
164+    python -m profiling.sampling -m http.server
165+
166+ **Understanding different profiling modes:**
167+
168+ Investigate why your web server feels slow (includes I/O waits):
169+
170+ .. code-block:: shell
171+
172+    python -m profiling.sampling --mode wall -p 1234
173+
174+ Find CPU-intensive functions (excludes I/O and sleep time):
175+
176+ .. code-block:: shell
177+
178+    python -m profiling.sampling --mode cpu -p 1234
179+
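The difference between the wall-clock and CPU modes can be illustrated without
the profiler itself. The following is a minimal sketch using only the standard
:mod:`time` module; ``measure``, ``io_like`` and ``cpu_bound`` are hypothetical
helpers, not part of ``profiling.sampling``. A sleeping function accumulates
wall-clock time but almost no CPU time, which is why such code would be expected
to stand out under ``--mode wall`` and largely disappear under ``--mode cpu``.

.. code-block:: python

   import time


   def measure(func):
       """Return (wall_seconds, cpu_seconds) spent running func()."""
       wall_start = time.perf_counter()   # real elapsed time
       cpu_start = time.process_time()    # CPU time used by this process
       func()
       wall = time.perf_counter() - wall_start
       cpu = time.process_time() - cpu_start
       return wall, cpu


   def io_like():
       time.sleep(0.2)  # stands in for a blocking I/O wait


   def cpu_bound():
       sum(i * i for i in range(200_000))  # pure computation


   if __name__ == "__main__":
       for func in (io_like, cpu_bound):
           wall, cpu = measure(func)
           print(f"{func.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
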
180+ Debug GIL contention in multi-threaded applications:
106181
107182.. code-block:: shell
108183
109-    python -m profiling.sampling 1234
184+    python -m profiling.sampling --mode gil -a -p 1234
185+
186+ **Visualization and output formats:**
187+
188+ Generate an interactive flamegraph for visual analysis (opens in browser):
189+
190+ .. code-block:: shell
191+
192+    python -m profiling.sampling --flamegraph -p 1234
193+
194+ Upload to Firefox Profiler for timeline-based analysis:
195+
196+ .. code-block:: shell
197+
198+    python -m profiling.sampling --gecko -o profile.json -p 1234
199+    # Then upload profile.json to https://profiler.firefox.com
200+
201+ Generate collapsed stacks for custom processing:
202+
203+ .. code-block:: shell
204+
205+    python -m profiling.sampling --collapsed -o stacks.txt -p 1234
206+
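As an illustration of what "custom processing" of collapsed stacks might look
like, here is a minimal sketch. It assumes the file follows the conventional
folded-stack layout used by the FlameGraph scripts (semicolon-separated frames,
then a space and a sample count); the exact frame naming produced by
``--collapsed`` is not shown in this section, so the sample data and the helper
names ``parse_collapsed`` and ``self_samples`` are made up for the example.

.. code-block:: python

   from collections import Counter


   def parse_collapsed(lines):
       """Parse folded stacks of the assumed form 'root;child;leaf COUNT'."""
       stacks = []
       for line in lines:
           line = line.strip()
           if not line:
               continue
           # The count is the text after the last space on the line.
           stack, _, count = line.rpartition(" ")
           stacks.append((stack.split(";"), int(count)))
       return stacks


   def self_samples(stacks):
       """Sum samples per leaf frame (the function running when sampled)."""
       totals = Counter()
       for frames, count in stacks:
           totals[frames[-1]] += count
       return totals


   # Made-up sample data, for illustration only.
   SAMPLE = """\
   main;handle_request;parse_json 12
   main;handle_request;query_db 30
   main;idle 5
   main;handle_request;parse_json 3
   """

   if __name__ == "__main__":
       totals = self_samples(parse_collapsed(SAMPLE.splitlines()))
       for name, count in totals.most_common():
           print(f"{count:5d}  {name}")
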
207+ **Advanced usage:**
110208
111- Profile with custom interval and duration, save to file:
209+ Profile all threads with real-time sampling statistics:
112210
113211.. code-block:: shell
114212
115-    python -m profiling.sampling -i 50 -d 30 -o profile.stats 1234
213+    python -m profiling.sampling -a --realtime-stats -p 1234
116214
117- Generate collapsed stacks for flamegraph:
215+ High-frequency sampling (1 ms intervals) for 60 seconds:
118216
119217.. code-block:: shell
120218
121-    python -m profiling.sampling --collapsed 1234
219+    python -m profiling.sampling -i 1000 -d 60 -p 1234
122220
123- Profile all threads and sort by total time:
221+ Show only the top 20 CPU-consuming functions:
124222
125223.. code-block:: shell
126224
127-    python -m profiling.sampling -a --sort-tottime 1234
225+    python -m profiling.sampling --sort-tottime -l 20 -p 1234
128226
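Because the ``--pstats`` format is described as compatible with :mod:`pstats`,
a saved profile could plausibly be post-processed from Python as well. The
sketch below rests on that assumption and has not been checked against real
profiler output: it uses :mod:`cProfile` to write a stand-in stats file so the
example runs on its own, and the file name ``profile.stats`` and the helpers
``top_functions`` and ``make_standin_stats`` are hypothetical.

.. code-block:: python

   import cProfile
   import io
   import os
   import pstats
   import tempfile


   def busy():
       return sum(i * i for i in range(50_000))


   def top_functions(stats_path, limit=20):
       """Return a text report of the top `limit` entries by total time."""
       out = io.StringIO()
       stats = pstats.Stats(stats_path, stream=out)
       stats.sort_stats(pstats.SortKey.TIME).print_stats(limit)
       return out.getvalue()


   def make_standin_stats(path):
       """Write a pstats-readable file with cProfile (stand-in data only)."""
       profiler = cProfile.Profile()
       profiler.enable()
       busy()
       profiler.disable()
       profiler.dump_stats(path)


   if __name__ == "__main__":
       with tempfile.TemporaryDirectory() as tmp:
           path = os.path.join(tmp, "profile.stats")  # hypothetical file name
           make_standin_stats(path)
           print(top_functions(path, limit=5))
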
129227 The profiler generates statistical estimates of where time is spent:
130228