Compiler profiling #8997

Open
mcourteaux wants to merge 3 commits into halide:main from mcourteaux:compiler-profiling

@mcourteaux

This is a proof-of-concept PR. I tried to keep it minimally invasive:

  1. IRVisitors and IRMutators can be wrapped in Profiled<...> to get a version that profiles the visit functions.
  2. The ZoneScoped macro (and a few variants) times a scope and labels it with the name of the enclosing function.
  3. I also added IRVisitor::profiled_visit(), IRMutator::profiled_mutate(), and IRGraphVisitor::profiled_include(), so that you can profile an IRVisitor/IRMutator without necessarily profiling the entire IR tree being walked.
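
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of how a scope timer and a `Profiled<...>` style decorator can fit together. It is not the code in this PR: `Event`, `events()`, `ScopeTimer`, `SKETCH_ZONE_SCOPED`, `Node`, `ToyVisitor`, `SumVisitor`, and `sum_nodes` are all hypothetical names invented for illustration, and the toy visitor stands in for Halide's real `IRVisitor`.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record; not the PR's actual data structure.
struct Event {
    std::string name;
    long long start_us;
    long long duration_us;
};

// Global event log, kept deliberately simple (single-threaded sketch).
inline std::vector<Event> &events() {
    static std::vector<Event> log;
    return log;
}

// RAII timer: records one event covering the lifetime of the object.
class ScopeTimer {
    using Clock = std::chrono::steady_clock;
    std::string name_;
    Clock::time_point start_;

public:
    explicit ScopeTimer(std::string name)
        : name_(std::move(name)), start_(Clock::now()) {}
    ScopeTimer(const ScopeTimer &) = delete;
    ScopeTimer &operator=(const ScopeTimer &) = delete;

    ~ScopeTimer() {
        const Clock::time_point end = Clock::now();
        assert(end >= start_);  // steady_clock never goes backwards
        auto us = [](Clock::duration d) -> long long {
            return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
        };
        events().push_back({name_, us(start_.time_since_epoch()), us(end - start_)});
    }
};

// Stand-in for a ZoneScoped-like macro: names the zone after the enclosing function.
#define SKETCH_ZONE_SCOPED ScopeTimer sketch_zone_timer_(__func__)

// Toy IR node and visitor, standing in for the real IR classes.
struct Node {
    int value;
};

struct ToyVisitor {
    virtual ~ToyVisitor() = default;
    virtual void visit(const Node &n) { (void)n; }
};

struct SumVisitor : ToyVisitor {
    int total = 0;
    void visit(const Node &n) override { total += n.value; }
};

// Decorator: derives from the visitor it wraps and times every visit call.
template<typename Base>
struct Profiled : Base {
    using Base::Base;
    void visit(const Node &n) override {
        ScopeTimer timer("Profiled::visit");
        Base::visit(n);
    }
};

// Usage: one zone for the whole function plus one event per visited node.
int sum_nodes(const std::vector<Node> &nodes) {
    SKETCH_ZONE_SCOPED;
    Profiled<SumVisitor> visitor;
    for (const Node &n : nodes) {
        visitor.visit(n);
    }
    return visitor.total;
}
```

Because the decorator inherits from the wrapped visitor, a call site only has to change the declared type, which is one way to keep this kind of instrumentation minimally invasive.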

To use this:

  1. Enable it at build time by passing -DWITH_COMPILER_PROFILING to CMake. Without this flag, the code compiles as before, with zero runtime overhead.
  2. Run the Halide compiler on some Pipeline with the environment variable HL_COMPILER_TRACE_FILE=somefile.trace.json set. If the variable is not set, no profiling events are collected or written.
  3. The file is written when the program exits.
  4. Open the file in https://ui.perfetto.dev
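
As a rough illustration of the steps above, the sketch below gates trace output on the `HL_COMPILER_TRACE_FILE` environment variable and serializes events as JSON. The exact file format this PR emits is not stated here; the sketch assumes the Chrome Trace Event JSON format (a `traceEvents` array of `"ph":"X"` complete events with microsecond `ts` and `dur`), which the Perfetto UI can open. `TraceEvent`, `to_trace_json`, and `write_trace_if_requested` are hypothetical names, not functions from this PR.

```cpp
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical event record; names are assumed to need no JSON escaping.
struct TraceEvent {
    std::string name;
    long long ts_us;   // start time in microseconds
    long long dur_us;  // duration in microseconds
};

// Serialize events as Chrome Trace Event "complete" events (ph = "X").
// pid and tid are fixed to 1 because this sketch is single-threaded.
std::string to_trace_json(const std::vector<TraceEvent> &events) {
    std::ostringstream out;
    out << "{\"traceEvents\":[";
    for (std::size_t i = 0; i < events.size(); ++i) {
        const TraceEvent &e = events[i];
        if (i > 0) {
            out << ",";
        }
        out << "{\"name\":\"" << e.name << "\",\"ph\":\"X\",\"ts\":" << e.ts_us
            << ",\"dur\":" << e.dur_us << ",\"pid\":1,\"tid\":1}";
    }
    out << "]}";
    return out.str();
}

// Write the trace only when HL_COMPILER_TRACE_FILE is set and non-empty.
// Returns true if the file was written successfully.
bool write_trace_if_requested(const std::vector<TraceEvent> &events) {
    const char *path = std::getenv("HL_COMPILER_TRACE_FILE");
    if (path == nullptr || *path == '\0') {
        return false;  // variable not set: nothing is written
    }
    std::ofstream file(path);
    if (!file) {
        return false;
    }
    file << to_trace_json(events);
    return static_cast<bool>(file);
}
```

A real implementation would also need to call something like `write_trace_if_requested` at program exit (for example from a handler registered with `std::atexit`) and would have to escape event names properly.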

Example trace file: vector_shuffle.trace.json

With sample screenshot:

[Image: Screenshot 2026-03-09 at 00 12 17]

This work led to #8996

mcourteaux requested a review from alexreinking on March 8, 2026, 23:19
Comment on lines +1473 to +1475
Profiled<ExtractBlockSize> block_size;
block_size.profiled_visit(op);
Stmt loop(op);

@alexreinking Can you validate this change?
