Merged
33 changes: 33 additions & 0 deletions Makefile
@@ -200,3 +200,36 @@ clean: sweep clean-tools-cache
clean-tools-cache:
	@echo "Removing cached tool binaries..."
	rm -rf tools-cache

# Generate command documentation from help text
COMMANDS := report benchmark metrics telemetry flamegraph lock config update extract
SUBCOMMANDS := metrics:trim config:restore

.PHONY: docs
docs: perfspect
	@echo "Generating command documentation..."
	@mkdir -p docs
	@echo '# perfspect' > docs/perfspect.md
	@echo '' >> docs/perfspect.md
	@echo '```text' >> docs/perfspect.md
	@./perfspect --help >> docs/perfspect.md
	@echo '```' >> docs/perfspect.md
	@for cmd in $(COMMANDS); do \
		echo "  $$cmd"; \
		echo "# perfspect $$cmd" > docs/perfspect_$$cmd.md; \
		echo '' >> docs/perfspect_$$cmd.md; \
		echo '```text' >> docs/perfspect_$$cmd.md; \
		./perfspect $$cmd --help >> docs/perfspect_$$cmd.md; \
		echo '```' >> docs/perfspect_$$cmd.md; \
	done
	@for sub in $(SUBCOMMANDS); do \
		cmd=$${sub%%:*}; \
		subcmd=$${sub##*:}; \
		echo "  $$cmd $$subcmd"; \
		echo "# perfspect $$cmd $$subcmd" > docs/perfspect_$${cmd}_$${subcmd}.md; \
		echo '' >> docs/perfspect_$${cmd}_$${subcmd}.md; \
		echo '```text' >> docs/perfspect_$${cmd}_$${subcmd}.md; \
		./perfspect $$cmd $$subcmd --help >> docs/perfspect_$${cmd}_$${subcmd}.md; \
		echo '```' >> docs/perfspect_$${cmd}_$${subcmd}.md; \
	done
	@echo "Documentation generated in docs/"
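
Reviewer note: the `SUBCOMMANDS` loop relies on POSIX parameter expansion to split each `command:subcommand` pair. The sketch below reproduces that step in plain shell so it can be checked outside of `make`. The function name `split_sub` is hypothetical and not part of this PR; inside the Makefile recipe every `$` is doubled (`$${sub%%:*}`), while plain shell uses a single `$`.

```shell
# Hypothetical helper (not in the PR): derive the docs file name for a
# "command:subcommand" pair, mirroring the expansions in the docs recipe.
split_sub() {
  sub="$1"
  cmd="${sub%%:*}"      # drop the longest suffix matching ":*"  -> text before the first colon
  subcmd="${sub##*:}"   # drop the longest prefix matching "*:"  -> text after the last colon
  printf 'docs/perfspect_%s_%s.md\n' "$cmd" "$subcmd"
}

# Same pairs as the SUBCOMMANDS variable in the Makefile.
for sub in metrics:trim config:restore; do
  split_sub "$sub"
done
```

For the two pairs listed in `SUBCOMMANDS`, this yields the file names that the README links to (`docs/perfspect_metrics_trim.md` and `docs/perfspect_config_restore.md`).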
50 changes: 26 additions & 24 deletions README.md
@@ -28,30 +28,32 @@ Usage:

### Commands

| Command | Description |
| ------- | ----------- |
| [`metrics`](#metrics-command) | CPU core and uncore metrics |
| [`report`](#report-command) | System configuration and health |
| [`benchmark`](#benchmark-command) | Performance benchmarks |
| [`telemetry`](#telemetry-command) | System telemetry |
| [`flamegraph`](#flamegraph-command) | Software call-stacks as flamegraphs |
| [`lock`](#lock-command) | Software hot spot, cache-to-cache and lock contention |
| [`config`](#config-command) | Modify system configuration |
| Command | Description | Reference |
| ------- | ----------- | --------- |
| [`metrics`](#metrics-command) | CPU core and uncore metrics | [options](docs/perfspect_metrics.md) |
| [`report`](#report-command) | System configuration and health | [options](docs/perfspect_report.md) |
| [`benchmark`](#benchmark-command) | Performance benchmarks | [options](docs/perfspect_benchmark.md) |
| [`telemetry`](#telemetry-command) | System telemetry | [options](docs/perfspect_telemetry.md) |
| [`flamegraph`](#flamegraph-command) | Software call-stacks as flamegraphs | [options](docs/perfspect_flamegraph.md) |
| [`lock`](#lock-command) | Software hot spot, cache-to-cache and lock contention | [options](docs/perfspect_lock.md) |
| [`config`](#config-command) | Modify system configuration | [options](docs/perfspect_config.md) |

> [!TIP]
> Run `perfspect [command] -h` to view command-specific help text.
> Run `perfspect [command] -h` to view command-specific help text. See [`perfspect -h`](docs/perfspect.md) for global options.

Additional commands: [`update`](docs/perfspect_update.md) checks for and applies application updates (Intel network only), and [`extract`](docs/perfspect_extract.md) extracts embedded resources (for developers).

#### Metrics Command

The `metrics` command generates reports containing CPU architectural performance characterization metrics in HTML and CSV formats. Run `perfspect metrics`.

![screenshot of the TMAM page from the metrics command HTML report, provides a description of TMAM on the left and a pie chart showing the 1st and 2nd level TMAM metrics on the right](docs/metrics_html_tma.png)
![screenshot of the TMAM page from the metrics command HTML report, provides a description of TMAM on the left and a pie chart showing the 1st and 2nd level TMAM metrics on the right](docs/images/metrics_html_tma.png)

##### Live Metrics

The `metrics` command supports two modes -- default and "live". Default mode behaves as above -- metrics are collected and saved into report files for review. The "live" mode prints the metrics to stdout where they can be viewed in the console and/or redirected into a file or observability pipeline. Run `perfspect metrics --live`.

![screenshot of live CSV metrics in a text terminal](docs/metrics_live.png)
![screenshot of live CSV metrics in a text terminal](docs/images/metrics_live.png)

##### Metrics Without Root Permissions

@@ -65,7 +67,7 @@ Once the configuration changes are applied, use the `--noroot` flag on the comma

##### Refining Metrics to a Specific Time Range

After collecting metrics, you can generate new summary reports for a specific time interval using the `metrics trim` subcommand. This is useful when you've collected metrics for an entire workload but want to analyze only a specific portion, excluding setup, teardown, or other unwanted phases.
After collecting metrics, you can generate new summary reports for a specific time interval using the [`metrics trim`](docs/perfspect_metrics_trim.md) subcommand. This is useful when you've collected metrics for an entire workload but want to analyze only a specific portion, excluding setup, teardown, or other unwanted phases.

The time range can be specified using either absolute timestamps (seconds since epoch) or relative offsets from the beginning/end of the data. At least one time parameter must be specified.

@@ -85,13 +87,13 @@ $ ./perfspect metrics trim --input perfspect_2025-11-28_09-21-56 --start-time 17

The `metrics` command can expose metrics via a Prometheus compatible `metrics` endpoint. This allows integration with Prometheus monitoring systems. To enable the Prometheus endpoint, use the `--prometheus-server` flag. By default, the endpoint listens on port 9090. The port can be changed using the `--prometheus-server-addr` flag. Run `perfspect metrics --prometheus-server`. See the [example daemonset](docs/perfspect-daemonset.md) for deploying in Kubernetes.

See `perfspect metrics -h` for the extensive set of options and examples.
See [`perfspect metrics -h`](docs/perfspect_metrics.md) for the extensive set of options and examples.
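
Reviewer note: a minimal sketch of checking the Prometheus endpoint described above by hand. It assumes the endpoint is served at the conventional `/metrics` path on the default port 9090; the README only states that a `metrics` endpoint is exposed, so the exact path is an assumption, and the helper name `metrics_url` is hypothetical.

```shell
# Hypothetical helper: build the scrape URL for a perfspect Prometheus endpoint.
# Defaults follow the README text (port 9090); the "/metrics" path is an assumption.
metrics_url() {
  host="${1:-localhost}"
  port="${2:-9090}"
  printf 'http://%s:%s/metrics\n' "$host" "$port"
}

# In one terminal, start the exporter:
#   perfspect metrics --prometheus-server
# In another terminal, fetch a single scrape:
#   curl -s "$(metrics_url)"
```

If the port is changed with `--prometheus-server-addr`, pass the matching host and port to the helper, for example `metrics_url localhost 9091`.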

#### Report Command

The `report` command generates system configuration reports in a variety of formats. All categories of information are collected by default. See `perfspect report -h` for all options.
The `report` command generates system configuration reports in a variety of formats. All categories of information are collected by default. See [`perfspect report -h`](docs/perfspect_report.md) for all options.

![screenshot of a small section of the HTML report from the report command](docs/report_html.png)
![screenshot of a small section of the HTML report from the report command](docs/images/report_html.png)

It's possible to report a subset of information by providing command line options. Note that when only the `txt` format is specified, the report is printed to stdout as well as written to a report file.

@@ -119,7 +121,7 @@ The `benchmark` command runs performance micro-benchmarks to evaluate system hea
./perfspect benchmark --no-summary # Exclude system summary from output
```

See `perfspect benchmark -h` for all options.
See [`perfspect benchmark -h`](docs/perfspect_benchmark.md) for all options.

| Benchmark | Description |
| --------- | ----------- |
@@ -133,9 +135,9 @@ See `perfspect benchmark -h` for all options.

#### Telemetry Command

The `telemetry` command reports CPU utilization, instruction mix, disk stats, network stats, and more on the specified target(s). All telemetry types are collected by default. To choose telemetry types, see the additional command line options (`perfspect telemetry -h`).
The `telemetry` command reports CPU utilization, instruction mix, disk stats, network stats, and more on the specified target(s). All telemetry types are collected by default. To choose telemetry types, see the additional command line options ([`perfspect telemetry -h`](docs/perfspect_telemetry.md)).

![screenshot of the CPU utilization chart from the HTML output of the telemetry command](docs/telemetry_html.png)
![screenshot of the CPU utilization chart from the HTML output of the telemetry command](docs/images/telemetry_html.png)

##### Additional Telemetry via Environment Variables

@@ -151,20 +153,20 @@ The following optional telemetry sources can be enabled via environment variable

#### Flamegraph Command

Software flamegraphs are useful in diagnosing software performance bottlenecks. Run `perfspect flamegraph` to capture a system-wide software flamegraph.
Software flamegraphs are useful in diagnosing software performance bottlenecks. Run `perfspect flamegraph` to capture a system-wide software flamegraph. See [`perfspect flamegraph -h`](docs/perfspect_flamegraph.md) for all options.

> [!TIP]
> By default, flamegraphs are collected using the `cycles:P` event. To analyze different performance aspects, use the `--perf-event` flag to specify an alternative perf event (e.g., `cache-misses`, `instructions`, `branches`, `context-switches`, `mem-loads`, `mem-stores`, etc.).

![screenshot of a flamegraph from the HTML output of the flamegraph command](docs/flamegraph.png)
![screenshot of a flamegraph from the HTML output of the flamegraph command](docs/images/flamegraph.png)

#### Lock Command

As systems contain more and more cores, it can be useful to analyze the Linux kernel lock overhead and potential false-sharing that impacts system scalability. Run `perfspect lock` to collect system-wide hot spot, cache-to-cache and lock contention information. Experienced performance engineers can analyze the collected information to identify bottlenecks.
As systems contain more and more cores, it can be useful to analyze the Linux kernel lock overhead and potential false-sharing that impacts system scalability. Run `perfspect lock` to collect system-wide hot spot, cache-to-cache and lock contention information. Experienced performance engineers can analyze the collected information to identify bottlenecks. See [`perfspect lock -h`](docs/perfspect_lock.md) for all options.

#### Config Command

The `config` command provides a method to view and change various system configuration parameters. Run `perfspect config -h` to view the parameters that can be modified.
The `config` command provides a method to view and change various system configuration parameters. Run [`perfspect config -h`](docs/perfspect_config.md) to view the parameters that can be modified.

> [!WARNING]
> Misconfiguring the system may cause it to stop functioning. In some cases, a reboot may be required to restore default settings.
@@ -189,7 +191,7 @@ Configuration recorded to: perfspect_2025-12-01_14-30-45/gnr_config.txt

##### Restoring Configuration

The `config restore` subcommand restores configuration from a previously recorded file. This is useful for reverting changes or applying a known-good configuration across multiple systems.
The [`config restore`](docs/perfspect_config_restore.md) subcommand restores configuration from a previously recorded file. This is useful for reverting changes or applying a known-good configuration across multiple systems.

Example:

File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
42 changes: 42 additions & 0 deletions docs/perfspect.md
@@ -0,0 +1,42 @@
# perfspect

```text
PerfSpect (perfspect) is a multi-function utility for performance engineers analyzing software running on Intel Xeon platforms.

Usage:
perfspect [command] [flags]

Examples:
Generate a configuration report: $ perfspect report
Collect micro-architectural metrics: $ perfspect metrics
Generate a configuration report on a remote target: $ perfspect report --target 192.168.1.2 --user elaine --key ~/.ssh/id_rsa
Generate configuration reports for multiple remote targets: $ perfspect report --targets ./targets.yaml

Use "perfspect [command] --help" for more information about a command.

Commands:
report Collect configuration data from target(s)
benchmark Run performance benchmarks on target(s)
metrics Collect performance metrics from target(s)
telemetry Collect system telemetry from target(s)
flamegraph Collect flamegraph data from target(s)
lock Collect kernel lock data from target(s)
config Modify system configuration on target(s)

Other Commands:
update Update the application (Intel network only)
extract Extract the embedded resources (for developers)

Flags:
--debug enable debug logging and retain temporary directories
-h, --help help for perfspect
--log-stdout write logs to stdout
--noupdate skip application update check
--output string override the output directory
--syslog write logs to syslog instead of a file
--tempdir string override the temporary target directory, must exist and allow execution
-v, --version version for perfspect

Additional help topics:
perfspect
```
44 changes: 44 additions & 0 deletions docs/perfspect_benchmark.md
@@ -0,0 +1,44 @@
# perfspect benchmark

```text
Run performance benchmarks on target(s)

Usage: perfspect benchmark [flags]

Examples:
Run all benchmarks: $ perfspect benchmark
Run specific benchmarks: $ perfspect benchmark --speed --power
Benchmark remote target: $ perfspect benchmark --target 192.168.1.1 --user fred --key fred_key
Benchmark multiple targets:$ perfspect benchmark --targets targets.yaml

Flags:
Benchmark Options:
--all run all benchmarks (default: true)
--speed CPU speed benchmark (default: false)
--power power consumption benchmark (default: false)
--temperature temperature benchmark (default: false)
--frequency turbo frequency benchmark (default: false)
--memory memory latency and bandwidth benchmark (default: false)
--numa NUMA bandwidth matrix benchmark (default: false)
--storage storage performance benchmark (default: false)
Other Options:
--no-summary do not include system summary in output (default: false)
--storage-dir existing directory where storage performance benchmark data will be temporarily stored (default: /tmp)
--format choose output format(s) from: all, html, xlsx, json, txt (default: [all])
Remote Target Options:
--target host name or IP address of remote target
--port port for SSH to remote target
--user user name for SSH to remote target
--key private key file for SSH to remote target
--targets file with remote target(s) connection details. See targets.yaml for format.
Advanced Options:
--input ".raw" file, or directory containing ".raw" files. Will skip data collection and use raw data for reports.

Global Flags:
--debug enable debug logging and retain temporary directories (default: false)
--log-stdout write logs to stdout (default: false)
--noupdate skip application update check (default: false)
--output override the output directory
--syslog write logs to syslog instead of a file (default: false)
--tempdir override the temporary target directory, must exist and allow execution
```
73 changes: 73 additions & 0 deletions docs/perfspect_config.md
@@ -0,0 +1,73 @@
# perfspect config

```text
Sets system configuration items on target platform(s).

USE CAUTION! Target may become unstable. It is up to the user to ensure that the requested configuration is valid for the target. There is not an automated way to revert the configuration changes. If all else fails, reboot the target.

Usage: perfspect config [flags]

Examples:
Set core count on local host: $ perfspect config --cores 32
Set multiple config items on local host: $ perfspect config --core-max 3.0 --uncore-max 2.1 --tdp 120
Record config to file before changes: $ perfspect config --c6 disable --epb 0 --record
Restore config from file: $ perfspect config restore gnr_config.txt
Set core count on remote target: $ perfspect config --cores 32 --target 192.168.1.1 --user fred --key fred_key
View current config on remote target: $ perfspect config --target 192.168.1.1 --user fred --key fred_key
Set governor on remote targets: $ perfspect config --gov performance --targets targets.yaml

Flags:
General Options:
--cores number of physical cores per processor
--llc LLC size in MB
--tdp maximum power per processor in Watts
--core-max SSE frequency in GHz
--core-max-buckets SSE frequencies for all core buckets in GHz (e.g., 1-40/3.5, 41-60/3.4, 61-86/3.2)
--epb energy perf bias from best performance (0) to most power savings (15)
--epp energy perf profile from best performance (0) to most power savings (255)
--gov CPU scaling governor (performance, powersave)
--elc efficiency latency control (latency, power) [SRF+]
Uncore Frequency Options:
--uncore-max maximum uncore frequency in GHz [EMR-]
--uncore-min minimum uncore frequency in GHz [EMR-]
--uncore-max-compute maximum uncore compute die frequency in GHz [SRF+]
--uncore-min-compute minimum uncore compute die frequency in GHz [SRF+]
--uncore-max-io maximum uncore IO die frequency in GHz [SRF+]
--uncore-min-io minimum uncore IO die frequency in GHz [SRF+]
Prefetcher Options:
--pref-l2hw L2 HW [all] (enable, disable)
--pref-l2adj L2 Adj [all] (enable, disable)
--pref-dcuhw DCU HW [all] (enable, disable)
--pref-dcuip DCU IP [all] (enable, disable)
--pref-dcunp DCU NP [all] (enable, disable)
--pref-amp AMP [SPR,EMR,GNR,DMR] (enable, disable)
--pref-llcpp LLCPP [GNR,DMR] (enable, disable)
--pref-aop AOP [GNR] (enable, disable)
--pref-l2p L2P [DMR] (enable, disable)
--pref-homeless Homeless [SPR,EMR,GNR] (enable, disable)
--pref-llc LLC [SPR,EMR,GNR] (enable, disable)
--pref-llcstream LLC Stream [SRF,CWF] (enable, disable)
C-State Options:
--c6 C6 (enable, disable)
--c1-demotion C1 Demotion (enable, disable)
Other Options:
--no-summary do not print configuration summary
--record record the current configuration to a file to be restored later
Remote Target Options:
--target host name or IP address of remote target
--port port for SSH to remote target
--user user name for SSH to remote target
--key private key file for SSH to remote target
--targets file with remote target(s) connection details. See targets.yaml for format.

Subcommands:
restore: Restore system configuration from file

Global Flags:
--debug enable debug logging and retain temporary directories (default: false)
--log-stdout write logs to stdout (default: false)
--noupdate skip application update check (default: false)
--output override the output directory
--syslog write logs to syslog instead of a file (default: false)
--tempdir override the temporary target directory, must exist and allow execution
```