juj (Collaborator) commented Jan 29, 2026

Fix a +200% Windows performance regression caused by PR #4897. On Windows with the Visual Studio compiler, constructing a std::stringstream object is extremely slow, because it calls the std::locale constructor, which in turn takes some kind of process-global locale mutex lock (maybe to get the current system locale?).

Visual Studio profiler showed this hotspot as:

[image: Visual Studio profiler screenshot of the hotspot]

And Markus Stange's fantastic Samply profiler highlighted the size of the issue as

[image: Samply profile screenshot]

where 79% of total time in wasm-opt was taken by the std::stringstream constructor on Windows.

Live link to the above profile: https://share.firefox.dev/4a2wen8
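To check whether a given toolchain shows the same construction cost, a rough micro-benchmark along these lines could be used. This is only a sketch and not part of the PR: the helper name `timeConstruction` is made up for illustration, and the numbers it produces depend heavily on the compiler, standard library, and optimization level.

```cpp
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not from Binaryen): time `iterations` default
// constructions (and destructions) of T and return the elapsed time.
template <typename T>
std::chrono::nanoseconds timeConstruction(std::size_t iterations) {
  using Clock = std::chrono::steady_clock;
  // Storing the object's address through a volatile pointer makes it harder
  // for the optimizer to drop the construction entirely. This is a rough
  // guard, not a guarantee.
  const void* volatile sink = nullptr;
  auto start = Clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    T object;
    sink = &object;
  }
  auto end = Clock::now();
  (void)sink;
  return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start);
}
```

Comparing `timeConstruction<std::stringstream>(n)` against `timeConstruction<std::string>(n)` for the same `n` gives a first impression of the gap described above; a profiler remains the better tool for confirming where time goes in a real wasm-opt run.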

The slow behavior occurred with at least the following command lines:

wasm-opt --strip-dwarf --post-emscripten -O3 --low-memory-unused --zero-filled-memory
--pass-arg=directize-initial-contents-immutable --strip-debug --strip-producers
build.wasm -o build2.wasm --mvp-features --enable-bulk-memory --enable-exception-handling
--enable-mutable-globals --enable-nontrapping-float-to-int --enable-sign-ext --enable-simd

and

wasm-opt --strip-target-features --post-emscripten -O2 --low-memory-unused --zero-filled-memory
--pass-arg=directize-initial-contents-immutable build.wasm -o build2.wasm --mvp-features
--enable-bulk-memory --enable-bulk-memory-opt --enable-call-indirect-overlong --enable-exception-handling
--enable-multivalue --enable-mutable-globals --enable-nontrapping-float-to-int --enable-reference-types
--enable-sign-ext --enable-simd

The main issue is in the very hot function PassRunner::runPassOnFunction(). After #4897, it increased wasm-opt link times on Windows from ~20 seconds to ~60 seconds when optimizing a large 33MB .wasm file with wasm-opt -O2. The std::string ctor does not share this performance problem on Windows.

I searched through the codebase for other uses of std::stringstream that were "optional". The function PassRunner::run() had a similar usage pattern, so I fixed that too for consistency (even though that call site did not show up in profiles as hot).

After this change, the wasm-opt time returned from ~60s to ~20s.
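The general shape of such a workaround can be sketched as follows. This is an illustration under assumptions, not the actual diff: `Function`, `printFunction`, and this simplified `runPassOnFunction` signature are hypothetical stand-ins, not Binaryen's real types or API. The idea is to keep only a cheap std::string on the hot path and construct the std::stringstream only when the debug text is really needed.

```cpp
#include <functional>
#include <sstream>
#include <string>

// Hypothetical stand-in for illustration; not Binaryen's real Function type.
struct Function {
  std::string name;
  std::string body;
};

// Serialize a function to text. The std::stringstream lives only inside this
// helper, so callers that never ask for the text never construct one.
std::string printFunction(const Function& func) {
  std::stringstream ss;
  ss << "(func $" << func.name << " " << func.body << ")";
  return ss.str();
}

// Simplified, hypothetical version of a per-function pass runner.
// Returns true if debugging is enabled and the pass changed the printed form.
bool runPassOnFunction(Function& func,
                       const std::function<void(Function&)>& pass,
                       bool debug) {
  // Default-constructing a std::string is cheap and does not touch locales.
  std::string bodyBefore;
  if (debug) {
    // Only the debug path pays for the std::stringstream construction.
    bodyBefore = printFunction(func);
  }
  pass(func);
  if (debug) {
    return printFunction(func) != bodyBefore;
  }
  return false;
}
```

With `debug` set to false, the hot path default-constructs one empty std::string and never reaches the std::stringstream constructor.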

kripken (Member) commented Jan 29, 2026

Oh wow, good find!

This code path should only be reached in debug mode, though (--debug or BINARYEN_PASS_DEBUG in the env) - just want to confirm that? Your quoted commands don't seem to be debug ones, so there may be an additional problem, if we are getting to that code at all (the check on line 854 should prevent that).

juj (Collaborator, Author) commented Jan 29, 2026

> This code path should only be reached in debug mode, though

> there may be an additional problem, if we are getting to that code at all (the check on line 854 should prevent that).

Line 854 does prevent entering the std::stringstream construction code for the moduleBefore object, and I did not see a perf issue there. That area just had the same std::stringstream usage pattern, so I decided to refactor it as well for consistency.

The function PassRunner::runPassOnFunction(), starting at line 1008, was the hot one, and it is called in release builds as well, without any debug gate around the std::stringstream ctor?

kripken (Member) left a comment

I see, thanks, I didn't notice the second usage below was not indented in debug code...

@kripken kripken merged commit e574f53 into WebAssembly:main Jan 29, 2026
17 checks passed