-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Optimize ISLE compilation #12303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Optimize ISLE compilation #12303
Conversation
|
@bongjunj thanks for this experimentation! Before moving further, would you mind benchmarking compilation time as well? One of the reasons for putting all of the codegen for an ISLE term in one function is so that the compiler can optimize the code together; splitting that code between functions and (especially) introducing ABI boundaries may produce slowdowns when Cranelift runs. IMHO, it's worth it to spend a few extra seconds when compiling Cranelift if it makes Cranelift itself run faster. If no effect, of course, then no problem and I'll be happy to review this. Thanks! |
Subscribe to Label ActionDetailsThis issue or pull request has been labeled: "cranelift", "isle"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
|
An 11x reduction in compile time is quite massive, so even if this is a ~NN% regression in compile-time-of-wasm-code I think this would be a great thing to land at least for debug builds and maybe optionally for release builds with some sort of tunable too. |
|
Note, that's an 11x reduction with Bongjun's huge new ruleset; ~10% on |
|
If it does result in Wasm compilation time regressions, it may still make sense to have this as an option of the ISLE compiler that we can enable during development or something, if we can make the maintenance burden minimal. |
|
Thanks for the comment! I've ran the benchmarks and measure the compilation time.
|
|
Sorry for the delayed response here -- was traveling then lost track of this PR. Thanks for the experiments! I think I agree with Nick that this would be good to have as an option, probably off by default. 0.67% compile time is still an unfortunate regression to take (on the flip side, 1% speedups make us pretty happy and take work to find), and is (usually) more important than the time taken to compile Cranelift, since usually you compile Cranelift once then use that compiled Cranelift-containing program many times. But when developing Cranelift, it would be useful to have this option. Could you put this under a Cargo feature that alters the behavior of |
|
@cfallin thanks for the comment! Thank you. |
c9aa301 to
a7db47c
Compare
Overview
The current implementation of ISLE code generation emits a single Rust function for each ISLE term, and
rustccompiles the Rust code to integrate ISLE with other modules of Cranelift.This can cause a compilation bottleneck when a ISLE term contains numerous rules ending up with producing a very very large Rust function, since
rustccannot efficiently compile such functions. Notably, the termsimplifywhich implements the mid-end peephole optimizations has about 500 rules, and the symptom is already visible. (And the ruleset is growing!) You can check, compiling Cranelift,cranelift-codegentakes most of the time in the following report: cargo-timing.htmlWith this PR, the ISLE codegen helps
rustcby wrapping a large match statement generated byislecin a closure. By introducing closures, a portion of a huge Rust function (for a ISLE term) can be split into several smaller compilation units forrustc, reducing the compilation time.Evaluation
On the
mainbranch, the build times forcranelift-codegenbefore/after this optimization are measured. This PR saves ~2 seconds on my machine (x86-64, 64 core, 512GB memory)Extra
I am experimenting with over 6,000 ISLE
simplifyrules.The compilation time reduced from 841.37 seconds to 69.34 seconds with this PR. (+11x speedup)