Merge a non-functional Wizer into Wasmtime #11815

alexcrichton · 2025-10-08T15:18:31Z

This is a history-preserving version of #11805 without any Wasmtime-specific changes. The goal here is to merge everything into this repo while only worrying about preserving the history and then a follow-up PR would actually integrate Wizer with Wasmtime.

This history was procedurally created by running these commands in a Wizer clone:

python3 ~/git-filter-repo --to-subdirectory-filter crates/wizer
python3 ~/git-filter-repo --invert-paths --path crates/wizer/benches/uap_bench.wizer.wasm
python3 ~/git-filter-repo --invert-paths --path crates/wizer/benches/regex_bench.wizer.wasm
python3 ~/git-filter-repo --invert-paths --path crates/wizer/benches/uap_bench.control.wasm
python3 ~/git-filter-repo --invert-paths --path crates/wizer/benches/regex_bench.control.wasm
python3 ~/git-filter-repo --invert-paths --path crates/wizer/tests/regex_test.wasm
python3 ~/git-filter-repo --invert-paths --path crates/wizer/.github/actions/github-release
git push git@github.com:alexcrichton/wizer main:wasmtime-merge -f

and then running these commands in Wasmtime

git fetch https://github.com/alexcrichton/wizer wasmtime-merge
git merge FETCH_HEAD --allow-unrelated-histories

Notably the big binary blobs are removed from the history to avoid pulling in too much new size here (uap_bench.wizer.wasm was 35M+).

A final commit is then handwritten which excludes crates/wizer from the overall workspace and that's intended to be the starting point to rebase #11805 onto after merging.

Procedurally I do not plan to use the merge queue for this PR. Instead I plan on waiting for CI to fully pass on this PR and then I'll push this directly to main after toggling a switch in the settings for "allow admins to bypass restrictions". I'll then re-enable that switch after I do the push.

This commit updates the wasmtime and wasmtime-wasi crates to 0.22 and handles the API changes. Signed-off-by: Radu M <root@radu.sh>

Update Wasmtime to 0.22

Check rustfmt in CI

When initializing a module, it is sometimes useful to rename a function after the module has been modified. For example, we may wish to substitute an alternate entry point (for WASI, this is `_start()`) since we have already performed some initialization that the normal entry point would do.

This PR adds a `-r` option to allow such renaming. For example,

$ wizer -r _start=my_new_start -o out.wasm in.wasm

will rename the `my_new_start` export in `in.wasm` to `_start` in `out.wasm`, replacing the old `_start` export.

Multiple renamings are supported at once.
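The renaming semantics described above can be modelled in a few lines. This is only an illustrative sketch, not Wizer's implementation: the export table is assumed to be a plain mapping from export name to function index, and `parse_rename` and `apply_renames` are hypothetical helper names.

```python
def parse_rename(spec):
    """Split a `dst=src` spec into (dst, src); both halves must be non-empty."""
    dst, sep, src = spec.partition("=")
    if not sep or not dst or not src:
        raise ValueError(f"expected `dst=src`, got {spec!r}")
    return dst, src


def apply_renames(exports, specs):
    """Return a new export table with every `dst=src` rename applied.

    `exports` maps export name -> function index (a simplifying assumption).
    The `src` export is re-exported under the name `dst`, replacing any
    existing `dst` export, and the old `src` name is dropped.
    """
    renames = [parse_rename(spec) for spec in specs]
    for _dst, src in renames:
        if src not in exports:
            raise KeyError(f"no export named {src!r}")

    result = dict(exports)
    # Drop every source name first so that several renames can be applied at once.
    for _dst, src in renames:
        result.pop(src, None)
    # Then bind each destination name to the function its source pointed at.
    for dst, src in renames:
        result[dst] = exports[src]
    return result


before = {"_start": 0, "my_new_start": 1, "memory": 2}
after = apply_renames(before, ["_start=my_new_start"])
```

Here `after` no longer contains `my_new_start`, and `_start` refers to the function that `my_new_start` used to export.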

Previously, it was difficult to properly use Wizer with C++ programs that have static constructors. These constructors are normally run before `main()` is invoked by some libc/linker plumbing at the usual entry point. This causes a set of undesirable problems:

- In the Wizer initialization entry point, our globals' constructors have not yet run.
- If we manually invoke the constructors to resolve the first issue, then they will be *re*-invoked when the program's main entry point is run on the initialized module. This may cause surprising or inconsistent results.

In order to properly handle this, we add a `wizer.h` header with a convenient macro that allows specifying a user initialization function, and adds an exported top-level Wizer initialization entry point and an exported replacement main entry point.

The generated initialization entry point invokes any global constructors (using the function `__wasm_call_ctors()` which is generated by `wasm-ld`) then invokes the user's initialization function.

The generated main entry point is meant to replace `_start` and performs the same work, *except* that it does not re-invoke constructors. Instead, it immediately invokes `main()` (because everything has already been initialized) and then calls destructors and exits.

Taken together, the two entry points match the code that is present in `_start` in an ordinary Wasm binary compiled by wasi-sdk. This approach should be relatively stable, as long as the symbols `__wasm_call_ctors()`, `__wasm_call_dtors()`, and `__original_main()` are not renamed in the WASI SDK (this seems unlikely).
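The ordering problem can be illustrated with a toy model that only records which steps run. This is a sketch under assumptions, not the contents of `wizer.h`: the class and method names are hypothetical stand-ins, with `call_ctors`, `main`, and `call_dtors` playing the roles of `__wasm_call_ctors()`, `__original_main()`, and `__wasm_call_dtors()`, and `wizer_resume` standing in for the generated replacement main entry point.

```python
class Program:
    """Toy model of a C++ Wasm program; each step appends its name to a log."""

    def __init__(self):
        self.log = []

    # Stand-ins for the symbols mentioned in the commit message.
    def call_ctors(self):
        self.log.append("ctors")

    def user_init(self):
        self.log.append("user_init")

    def main(self):
        self.log.append("main")

    def call_dtors(self):
        self.log.append("dtors")

    # Generated initialization entry point: constructors, then user init.
    def wizer_initialize(self):
        self.call_ctors()
        self.user_init()

    # The ordinary `_start`: constructors, main, destructors.
    def original_start(self):
        self.call_ctors()
        self.main()
        self.call_dtors()

    # Replacement main entry point: same as `_start` minus the constructors.
    def wizer_resume(self):
        self.main()
        self.call_dtors()


def run_with_original_start():
    program = Program()
    program.wizer_initialize()   # happens at wizening time
    program.original_start()     # happens when the initialized module runs
    return program.log


def run_with_replacement_start():
    program = Program()
    program.wizer_initialize()
    program.wizer_resume()
    return program.log
```

In this model `run_with_original_start()` logs the constructors twice, while `run_with_replacement_start()` yields the same sequence as an ordinary `_start` with the user initialization slotted in after the constructors.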

Add option to rename functions, and add C++ example using function renaming to handle ctors properly.

Keeps an alias around so the old flag will keep working. Also provides a "value name" so that the generated help text shows "<dst=src>" instead of "<func_renames>".

Small tweaks and dep updates

This makes it so that repeated wizenings of the same Wasm module (e.g. with different WASI inputs) don't require re-compiling the Wasm to native code every time. Fixes #9

Enable Wasmtime's code cache

Add knobs for WASI inheriting stdio and env vars

Although we are implicitly recording only the state that is different from the default (e.g. non-zero regions of memory) we aren't actually *computing* the difference between anything. The word "diff" makes me think of things like `git diff` that do edit distance-style computations, which we definitely aren't doing. I think "snapshot" more correctly reflects the implementation.
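To make the distinction concrete, here is a minimal sketch of "recording only the state that differs from the default" for a linear memory: it scans for non-zero runs and never compares two versions of anything. The function name `nonzero_regions` is hypothetical, and Wizer's real snapshot code (not shown in this thread) may differ, for instance in how it groups nearby regions.

```python
def nonzero_regions(memory):
    """Return a list of (offset, data) pairs covering every non-zero run.

    `memory` is any bytes-like object standing in for a Wasm linear memory.
    Zero bytes are the default state, so they are simply left out.
    """
    regions = []
    start = None  # offset where the current non-zero run began, if any
    for offset, byte in enumerate(memory):
        if byte != 0 and start is None:
            start = offset
        elif byte == 0 and start is not None:
            regions.append((start, bytes(memory[start:offset])))
            start = None
    if start is not None:
        # The memory ended in the middle of a non-zero run.
        regions.append((start, bytes(memory[start:])))
    return regions


snapshot = nonzero_regions(bytes([0, 0, 1, 2, 0, 3]))
```

Each `(offset, data)` pair corresponds roughly to what a data segment in the pre-initialized module would need to contain.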

Add the ability to preopen dirs during WASI initialization

Add C++ example to Readme

This commit adds support for 99% of module linking to Wizer. This allows modules within the "bundle" to import/export memories, globals, tables, and instances from/to each other, but the root module still cannot import any external state. The one big feature from module linking that isn't supported yet is importing and exporting modules, rather than instances. More on this restriction later.

Module linking breaks Wizer's previous assumption that a module is instantiated and initialized exactly once (or, at least, that if it were instantiated and initialized multiple times you would get the same result every time due to the restrictions on imports and what can be used during initialization). Module linking allows the same module to be instantiated multiple times and have a different set of functions called on each instance during initialization. This can lead to distinct initialized states for each instantiation. Here is a simple example:

```wat
(module
  (module $A
    (global $g (mut i32) (i32.const 0))
    (func (export "f")
      global.get $g
      i32.const 1
      i32.add
      global.set $g))

  (instance $a1 (instantiate $A))
  (instance $a2 (instantiate $A))

  (func (export "wizer.initialize")
    call (func $a1 "f")
    call (func $a2 "f")
    call (func $a2 "f")))
```

After initialization, the `$a1` instance has had its `f` function export called once and therefore its global's value is 1, while `$a2` has had its `f` function export called twice and therefore its global's value is 2.

Because each instantiation has its own initialized state, we need to emit a pre-initialized module for _each_ instantiation in the bundle. At the same time, however, we do _not_ want to duplicate the `f` function across each pre-initialized module! The code section is generally the largest section in a Wasm module, and if we naively duplicated the code section for each instantiation, we could get exponential code size blow ups.

(For example, consider a module linking bundle that contains a leaf module named `$Module1` that is instantiated twice in another module called `$Module2`, which is in turn instantiated twice in `$Module4`, which is in turn instantiated twice in `$Module8`, etc... If the last module in this chain is instantiated `n` times, then `$Module1` is instantiated `2^n` times, and if we duplicate the code section for each instantiation we've got `2^n` copies of `$Module1`'s code section.)

The solution is to split out a "code module" containing only the static sections from the original module. All the state in the module (memories, globals, and nested instantiations) are imported via a `__wizer_state` instance import. Then, each instantiation defines its own "state module" which contains the specific initialized state for that instance, and which is instantiated exactly once. To create the initialized version of the original instantiation, we join the state with the code by instantiating the code module and passing in this instantiation's state instance.

Let's return to our original example from above. Here is the input Wasm again:

```wat
(module
  (module $A
    (global $g (mut i32) (i32.const 0))
    (func (export "f")
      global.get $g
      i32.const 1
      i32.add
      global.set $g))

  (instance $a1 (instantiate $A))
  (instance $a2 (instantiate $A))

  (func (export "wizer.initialize")
    call (func $a1 "f")
    call (func $a2 "f")
    call (func $a2 "f")))
```

And this is roughly what it looks like after wizening:

```wat
(module
  ;; The code module for $A.
  (module $A_code
    ;; Note that the code module imports state, rather than defining it.
    (import "__wizer_state"
      (instance (export "__wizer_global_0" (global $g (mut i32)))))
    (func (export "f")
      global.get $g
      i32.const 1
      i32.add
      global.set $g))

  ;; State module for $a1. Global is initialized to 1 since $a1's `f`
  ;; function was called once in the initialization.
  (module $a1_state_module
    (global (export "__wizer_global_0" (mut i32) (i32.const 1))))
  ;; The state instance for $a1.
  (instance $a1_state_instance (instantiate $a1_state_module))
  ;; Finally, we join the code+state together to create $a1.
  (instance $a1
    (instantiate $A_code (import "__wizer_state" $a1_state_instance)))

  ;; State module for $a2. Global is initialized to 2 since $a2's `f`
  ;; function was called twice in the initialization.
  (module $a2_state_module
    (global (export "__wizer_global_0" (mut i32) (i32.const 2))))
  ;; The state instance for $a2.
  (instance $a2_state_instance (instantiate $a2_state_module))
  ;; Finally, we join the code+state together to create $a2.
  (instance $a2
    (instantiate $A_code (import "__wizer_state" $a2_state_instance)))

  ;; Initialization function removed.
)
```

Using this code splitting technique allows us to avoid unnecessary duplication for each instantiation. The only bits that are duplicated are the things that are inherently different across instantiations: the instantiation's internal state.

The reason we don't allow importing and exporting modules within the bundle is that two different modules that have the same module type signature could have two different sets of internal, not-exported state. This throws a wrench in the instrumentation pass, which needs to force export all internal state, and in this case could make the modules have incompatible type signatures, so they couldn't both be imported by the same module anymore. That, in turn, would force us to duplicate and specialize modules that import other modules for each instantiation. This is do-able, but requires significant additional engineering work. Therefore, this feature is left for some time in the future.

As a final note, adding support for module linking required pretty much a full rewrite of Wizer. It no longer just parses the Wasm section by section, tweaking it in place. Instead, there is an initial "parse" phase that creates a `ModuleInfo` tree that serves as an AST. Our subsequent passes -- the instrumentation and rewriting passes -- process the `ModuleInfo` tree rather than reparsing the input Wasm and tweaking it directly.
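The code/state split can be mimicked outside of Wasm to see why it avoids duplication. The following is a rough analogy rather than Wizer's rewriting pass: `CodeModule`, `State`, `Instance`, `wizen_example`, and `leaf_code_copies` are hypothetical names, and the model assumes a chain in which each module instantiates the previous one twice.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Per-instantiation state, analogous to a `__wizer_state` instance."""
    globals: dict = field(default_factory=lambda: {"__wizer_global_0": 0})


class CodeModule:
    """Shared, stateless code for $A; state is passed in rather than defined here."""

    @staticmethod
    def f(state):
        state.globals["__wizer_global_0"] += 1


@dataclass
class Instance:
    """Join of one shared code module with one private state instance."""
    code: type
    state: State

    def call(self, export_name):
        getattr(self.code, export_name)(self.state)


def wizen_example():
    """Replay the `wizer.initialize` body from the example: f on $a1 once, on $a2 twice."""
    a1 = Instance(CodeModule, State())
    a2 = Instance(CodeModule, State())
    a1.call("f")
    a2.call("f")
    a2.call("f")
    return a1, a2


def leaf_code_copies(depth, split):
    """Copies of the leaf module's code when each level instantiates the previous twice."""
    instantiations = 2 ** depth
    return 1 if split else instantiations


a1, a2 = wizen_example()
```

Both instances end up with different global values while referring to the same `CodeModule`, and in this model `leaf_code_copies(depth, split=True)` stays at 1 no matter how deep the chain is.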

* Fix Intel Mac CI builds.

  We build both x86-64 and aarch64 ("Intel Mac" and "Apple Silicon Mac") binaries for wizer in CI on the `macos-latest` CI runner. Historically, this runner was an x86-64 machine, so while we could do a direct compile for x86-64 binaries, we added a target override for `aarch64-darwin` for the aarch64 builds to force a cross-compile.

  When GitHub switched macOS CI runners to aarch64 (ARM64) machines somewhat recently, the `macos-latest` runner image began producing aarch64 binaries by default, and the target override for cross-compilation became meaningless (redundant). As a result, *both* of our x86-64 and aarch64 macOS binary artifacts contained aarch64 binaries. This is a problem for anyone running an Intel Mac.

  This PR switches the CI config to specify a cross-compilation of x86-64 binaries, so each target builds its proper architecture.

* Use actions/upload-artifact v4

* Use v4 of actions/download-artifact too

* add option for reference_types

  only enables the wasmtime feature, which allows initializing Modules that were compiled with newer llvm versions but don't actually use reference_types

  Disabled by default

* Update src/lib.rs

  Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>

* reject reference-types related instructions

* add setter for wasm_reference_types

* bail instead of unreachable

* tests for reference-types still rejecting table modifying instructions

* better error messages

* test indirect call with reference_types enabled

* same docstring for library and cli

* (hopefully) actually use the new call_indirect

---------

Co-authored-by: Gentle <ramon.klass@gmail.com>
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>

…liance#129)

* enable reference types and bulk memory support by default

* update tests

---------

Co-authored-by: Gentle <ramon.klass@gmail.com>

* chore(deps): update wasmtime & upstream deps

  This commit updates wasmtime along with other deps:

  - wasmtime: v31 -> v36.0.2
  - wasmparser et al: *.228 -> *.238

* fix(deps): use major version of wasmtime in dep

  Co-authored-by: Till Schneidereit <till@tillschneidereit.net>

---------

Co-authored-by: Till Schneidereit <till@tillschneidereit.net>

cfallin · 2025-10-08T20:41:29Z

Procedural idea: to keep history linear and thus make Wasmtime bisects less confusing, would it make sense to graft the initial commit of Wizer on top of Wasmtime main? Then the history goes from

(wizer initial) ------------------\
                                   >--(merge)----------
(Wasmtime initial)----------------/

to

                                               /--(wizer initial) ------------------\
                                              /                                      >--(merge)----------
(Wasmtime initial)------------------------------------------------------------------/

cfallin · 2025-10-08T20:42:09Z

(I believe this could be done with git rebase --onto main when on the Wizer branch, before the merge commit)

cfallin · 2025-10-08T20:42:55Z

(Happy to talk about this tomorrow in the Wasmtime meeting, too; the downside is that by rewriting commits it does change all the hashes)

alexcrichton · 2025-10-09T19:55:23Z

Ok I've tested what happens with git bisect with unrelated histories. I created a repo like this:

* commit 953fddf46cba6f53a67cc644d18d37dfc86b3b68 (HEAD -> main)
| Author: Alex Crichton <alex@alexcrichton.com>
| Date:   Thu Oct 9 12:51:18 2025 -0700
|
|     main3
|
*   commit 5daf51f205fbb562ac22bb084b1d9d6a620fc9ed
|\  Merge: a9307e8 709165d
| | Author: Alex Crichton <alex@alexcrichton.com>
| | Date:   Thu Oct 9 12:51:12 2025 -0700
| |
| |     Merge branch 'adjacent'
| |
| * commit 709165d603c4f09fce67027e358ba1604927a0c2 (adjacent)
|   Author: Alex Crichton <alex@alexcrichton.com>
|   Date:   Thu Oct 9 12:49:14 2025 -0700
|
|       adjacent1
|
* commit a9307e87fb2eee68a5515099e8fbf2bd5feee5d6
| Author: Alex Crichton <alex@alexcrichton.com>
| Date:   Thu Oct 9 12:51:10 2025 -0700
|
|     main2
|
* commit 97c3f6070514e30a101e60ed25a0ad6201958b6a
  Author: Alex Crichton <alex@alexcrichton.com>
  Date:   Thu Oct 9 12:49:14 2025 -0700

      main1

and then I ran this bisect session:

$ git bisect start
status: waiting for both good and bad commits
$ git bisect good 97c3f6070514e30a101e60ed25a0ad6201958b6a
status: waiting for bad commit, 1 good commit known
$ git bisect bad 953fddf46cba6f53a67cc644d18d37dfc86b3b68
Bisecting: 2 revisions left to test after this (roughly 1 step)
[709165d603c4f09fce67027e358ba1604927a0c2] adjacent1
$ git bisect skip
Bisecting: 0 revisions left to test after this (roughly 1 step)
[5daf51f205fbb562ac22bb084b1d9d6a620fc9ed] Merge branch 'adjacent'
$ git bisect bad
Bisecting: 1 revision left to test after this (roughly 1 step)
[a9307e87fb2eee68a5515099e8fbf2bd5feee5d6] main2
$ git bisect bad
a9307e87fb2eee68a5515099e8fbf2bd5feee5d6 is the first bad commit
commit a9307e87fb2eee68a5515099e8fbf2bd5feee5d6
Author: Alex Crichton <alex@alexcrichton.com>
Date:   Thu Oct 9 12:51:10 2025 -0700

    main2

 a | 2 ++
 1 file changed, 2 insertions(+)

Basically looks like git bisect figured out how to switch between trees alright.

Overall I'm also more comfortable not rewriting commits/dates/authors with git rebase, so I'm going to stick to this branch's merge strategy of two unrelated histories with one merge commit to join them together
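One way to see why the bisect session above behaves sensibly: `git bisect` searches the commits reachable from the bad commit that are not reachable from the good one, and a root commit from an unrelated history is simply part of that set. Below is a small sketch of that reachability computation on the test repo's graph. The commit labels reuse the commit subjects from the log above, with `merge` standing for "Merge branch 'adjacent'", and the helper names are hypothetical.

```python
# Parent links for the five-commit test repository shown above.
PARENTS = {
    "main1": [],                     # root of the main history
    "adjacent1": [],                 # root of the unrelated history
    "main2": ["main1"],
    "merge": ["main2", "adjacent1"],
    "main3": ["merge"],
}


def ancestors(commit, parents=PARENTS):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def bisect_candidates(good, bad, parents=PARENTS):
    """Commits that could be the first bad one: reachable from bad, not from good."""
    return ancestors(bad, parents) - ancestors(good, parents)


candidates = bisect_candidates(good="main1", bad="main3")
```

The candidate set contains `adjacent1` alongside `main2`, `merge`, and `main3`, which matches the session: the first commit checked out was `adjacent1`, and skipping it still let the search converge on `main2`.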

…into wizer-inert

prtest:full

Radu M and others added 30 commits January 13, 2021 19:19

Update Wasmtime to 0.22

7e1a896

This commit updates the wasmtime and wasmtime-wasi crates to 0.22 and handles the API changes. Signed-off-by: Radu M <root@radu.sh>

Merge pull request #2 from radu-matei/update-wasmtime

7bf8baf

Update Wasmtime to 0.22

Bump to version 1.0.2

a3c6858

Check rustfmt in CI

37e4e43

Merge pull request #3 from fitzgen/rustfmt-ci

38c0b88

Check rustfmt in CI

Merge pull request #5 from cfallin/cpp-ctors

3c5c988

Add option to rename functions, and add C++ example using function renaming to handle ctors properly.

Bump to version 1.1.0

7d4bdb6

Make config knobs for our supported Wasm proposals

cf715d5

Rename --func-rename to --rename-func

5c91035

Keeps an alias around so the old flag will keep working. Also provides a "value name" so that the generated help text shows "<dst=src>" instead of "<func_renames>".

Add a test for multi-memory Wasm modules

4662bef

Upgrade dependencies

6f6ecc9

Update Wasmtime to 0.23.0

13bd26f

Merge pull request #6 from fitzgen/small-tweaks-and-dep-updates

08b6d85

Small tweaks and dep updates

Enable Wasmtime's code cache

3b858aa

This makes it so that repeated wizenings of the same Wasm module (e.g. with different WASI inputs) don't require re-compiling the Wasm to native code every time. Fixes #9

Merge pull request #11 from fitzgen/code-cache

921c378

Enable Wasmtime's code cache

Bump to version 1.1.1

45ee076

Update Wasmtime and other deps

691b6f4

Bump to 1.1.2

168e514

Add knobs for WASI inheriting stdio and env vars

0f1329d

Merge pull request #12 from fitzgen/inherit-stdio-env-vars

979c92a

Add knobs for WASI inheriting stdio and env vars

Bump to version 1.2.0

0d700cb

Add the ability to preopen dirs during WASI initialization

13e9de3

Bump to version 1.3.0

9e8634f

Merge pull request #13 from fitzgen/preopen-dirs

4662dd8

Add the ability to preopen dirs during WASI initialization

Add C++ example to Readme

18686da

Merge pull request #14 from aminya/patch-1

c2839e1

Add C++ example to Readme

Guy Bedford and others added 13 commits August 20, 2024 17:18

release: fix version tag reader (bytecodealliance#116)

5b91ee3

7.0.4 (bytecodealliance#117)

e843a35

7.0.5 (bytecodealliance#119)

16b97f0

update ci write permissions (bytecodealliance#120)

e1f43b1

Update wasmtime to v29 (bytecodealliance#127)

4818cbe

enable reference types and bulk memory support by default (bytecodeal…

fc78a01

…liance#129) * enable reference types and bulk memory support by default * update tests --------- Co-authored-by: Gentle <ramon.klass@gmail.com>

Bump to version 8

92f77df

chore: bump deps (bytecodealliance#132)

2f76c85

Bump to version 9.0.0 (bytecodealliance#135)

45c4b6a

Bump to version 10.0.0 (bytecodealliance#137)

a94726c

alexcrichton requested review from a team as code owners October 8, 2025 15:18

alexcrichton requested review from dicej and removed request for a team October 8, 2025 15:18

alexcrichton force-pushed the wizer-inert branch from 6725c52 to 40de2f7 Compare October 8, 2025 15:19

dicej approved these changes Oct 8, 2025

alexcrichton mentioned this pull request Oct 8, 2025

Integrate wizer into this repository #11805

Merged

alexcrichton added 2 commits October 9, 2025 13:08

Merge branch 'wasmtime-merge' of https://github.com/alexcrichton/wizer …

71f290b

…into wizer-inert

Ignore wizer subdirectory for now

19c3380

prtest:full

alexcrichton force-pushed the wizer-inert branch from 9a8a760 to 19c3380 Compare October 9, 2025 20:09

alexcrichton merged commit 19c3380 into bytecodealliance:main Oct 9, 2025
40 checks passed

alexcrichton deleted the wizer-inert branch October 9, 2025 20:10

yoshuawuyts mentioned this pull request Nov 25, 2025

Migrate proposals into the WASI monorepo WebAssembly/WASI#826

Merged


19 participants
