Implement the build side of spec shaking#2353
Conversation
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR adds automatic filtering of unused types from contract specifications during the build process. It implements reachability analysis using fixed-point iteration to identify and preserve only types that are transitively referenced by contract functions, while always preserving functions and events. The implementation replaces wasm-gen with wasm-encoder for more flexible WASM manipulation capabilities.
Key changes:
- Implements a new filter module with reachability analysis to detect unused UDTs
- Refactors WASM manipulation to use wasm-encoder instead of wasm-gen, enabling section replacement
- Integrates filtering into the contract build pipeline after metadata injection
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| cmd/soroban-cli/src/commands/contract/build.rs | Adds filter_spec call to build pipeline; refactors inject_meta to use wasm-encoder; adds new error types |
| cmd/soroban-cli/Cargo.toml | Replaces wasm-gen dependency with wasm-encoder 0.235.0 |
| cmd/crates/soroban-spec-tools/src/lib.rs | Exports new filter module |
| cmd/crates/soroban-spec-tools/src/filter.rs | Implements reachability analysis and filtering logic with comprehensive unit tests |
| cmd/crates/soroban-spec-tools/src/contract.rs | Adds filter_unused_types, filtered_spec_xdr methods and replace_custom_section utility; includes tests |
| cmd/crates/soroban-spec-tools/Cargo.toml | Adds wasm-encoder 0.235.0 dependency |
| Cargo.lock | Updates dependencies with wasm-encoder and wasmparser 0.235.0; removes obsolete wasm-gen dependencies |
mootz12
left a comment
There was a problem hiding this comment.
Looks great! Only left nitpicks.
mootz12
left a comment
There was a problem hiding this comment.
Looks great! Left a couple of pedantic comments inline.
Leaving review as comment for now - will revise after doing some manual testing.
|
Opened a couple related issues as a result of this work:
I've opened PRs for both, if those PRs merged before this one, this change will benefit from them. |
### What Add `Spec::verify()` to `soroban-spec-tools` that checks every UDT referenced in function signatures, event params, and type definitions is defined within the spec. Integrate the check into `contract build` to emit warnings after a successful build. Add a hidden `contract spec-verify` subcommand for standalone verification of a WASM file. ### Why Contracts can compile and deploy with incomplete specs when referenced types are missing — for example, due to spec shaking in the SDK. Downstream tooling such as bindings generators and explorers silently fail when types are unresolvable. Surfacing this at build time gives developers an early, actionable signal. This will also act as a backup safety to help alert us and developers if changes we make and experiment with wrt spec shaking in #2353 cause specs to disappear that were otherwise needed. Close #2425
…f wasm (#1672) ### What Implement spec shaking using dead code elimination and data section of wasm. Embed markers in the WASM data section for each type/event that is used at external boundaries. These markers enable post-build tools to filter unused entries from the `contractspecv0` section. Also, all contracttypes, contracterrors, and contractevents are exported in contract specs now if they are directly or indirectly referenced by a function parameter, return value, event. ### Why Contract WASM files can contain unused type and event definitions in the `contractspecv0` custom section that inflate binary size unnecessarily. No shaking of specs occurrs today. Because of this contract imports never export types otherwise importing one contract would export all its types. Because of how proc-macros run in isolation and cannot coordinate in WASM builds in Rust, there's nothing the soroban-sdk can do to identify whether a contracttype that sometimes needs to be exported hasn't been used and can be excluded. The macros can however embed markers in the regular data section for each type/event when it's used. These markers survive DCE only if the corresponding code is reachable. A post-build tool (stellar-cli) can extract these markers and filter the spec accordingly. Close #1569 Close stellar/stellar-cli#2030 ### Side-effects A side-effect of this change, is that types/events brought into a contract from a `contractimport` can now be automatically exported from that contract if they use the types at their boundary. This is why the change also removes the `export = false` from types/events that come from a `contractimport`. Close #1330 ### How The SDK embeds markers using volatile reads in conversion/publish functions: **Marker format (12 bytes):** - Bytes 0-5: `SpEcV1` magic prefix - Bytes 6-13: 64-bit truncated SHA256 hash of the spec entry XDR bytes **Embedding mechanism:** ```rust impl SpecShakingMarker for MyType { #[inline(always)] fn spec_shaking_marker() { #[cfg(target_family = "wasm")] { static MARKER: [u8; 14] = *b"SpEcV1\x93\xd5\x05\x04\x36\x68\x40\xd2"; let _ = unsafe { ::core::ptr::read_volatile(MARKER.as_ptr()) }; } } } ``` **When markers are included:** - Types: `spec_shaking_marker()` is called from `TryFromVal` implementations via `IncludeSpec` trait - Events: `spec_shaking_marker()` is called from the `publish()` method **Why this works:** If a type/event is never used at an external boundary, its `__include_spec_marker()` function is never called, and DCE removes both the function and its marker. If the type is used, the marker survives in the data section. **Type identities:** Note that a long time problem with doing any sort of work like this has been identifying the identity of one spec type vs another. That is improved and worked around in this change by using a hash of the spec entry rather than the name of the spec entry. ### CLI Support Required This feature requires CLI support for filtering specs using these markers. See: stellar/stellar-cli#2353 ### Try It Out Install the modified `stellar-cli` that strips unused specs: ``` cargo install --locked --git https://github.com/stellar/stellar-cli --branch spec-clean stellar-cli ``` Import the modified `soroban-sdk` that marks used specs: ``` [dependencies] soroban-sdk = { git = "https://github.com/stellar/rs-soroban-sdk", branch = "spec-markers", features = ["experimental_spec_shaking_v2"] } [dev-dependencies] soroban-sdk = { git = "https://github.com/stellar/rs-soroban-sdk", branch = "spec-markers", features = ["experimental_spec_shaking_v2", "testutils"] }" ``` Build using the stellar-cli: ``` stellar contract build ``` Inspect the contract interface and it should only show types actually used at the boundary of the contract (inputs, outputs, and events): ``` stellar contract info interface --wasm target/wasm32v1-none/release/contract.wasm ``` ### Todo - [x] Figure out a roll out plan, because you can't really use this sdk until you use the new cli, otherwise your specs will increase in size rather than decrease because imported types are now exported then stripped. - [x] Evaluate the cost of the volatile read on a large contract (expect impact is small) - [x] Look for alternative ways other than a volatile read to force the marker to stick around - [x] Try black box - [x] Try `#[used]` - [x] Markers need to be included for any nested user defined types and types inside containers - [x] Come up with better name for feature - [x] Come up with better name for env var (should it just be Stellar CLI version?) - [x] Write out every possible test case, with dimensions: - [x] type: struct, struct tuple, enum, enum int, error - [x] containers: vec, map, option, result, tuple (0-12), struct, struct tuple, enum - [x] mod: lib, sub - [x] visibility: pub, pub(crate), private - [x] imported: and not exported, and exported - [x] context: contract fn param, non-contract fn param, contract fn return, non-contract fn return, event topic field, event data field --------- Co-authored-by: mootz12 <38118608+mootz12@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
What
Filter unused type and event definitions from the
contractspecv0WASM section duringstellar contract buildusing markers embedded by the SDK.Changes:
replace_custom_section()in newsoroban_spec_tools::wasmmodule for WASM custom section replacementsoroban_spec::shakingfrom the SDKrssdkfeat=experimental_spec_shaking_v2contract meta entrySOROBAN_SDK_BUILD_SYSTEM_SUPPORTS_SPEC_SHAKING_V2env var during builds to signal SDK supportWhy
Contract WASM files can contain unused type and event definitions in the
contractspecv0custom section that inflate binary size unnecessarily.The problem: Any exported
contracttypesorcontracteventsdefined in the SDK or any library that don't get used in the importing contract still end up in the contract spec. This results in awkward behaviour where we avoid exporting contracttypes in the SDK, even though doing so would be convenient and enable using contracttypes more frequently there.Why the SDK can't fix this alone: Because of how proc-macros run in isolation and cannot coordinate in WASM builds in Rust (the ctor crate cannot be used like we use it in non-wasm builds in tests), there's nothing the soroban-sdk can do to identify whether a contracttype that sometimes needs to be exported hasn't been used and can be excluded.
Solution: The SDK now embeds markers in the WASM data section for each type/event that is actually used. These markers survive dead code elimination only if the corresponding code is reachable. The CLI extracts these markers and filters the spec accordingly.
Examples of unused entries that get filtered:
Close #2030
Close stellar/rs-soroban-sdk#1569
Close stellar/rs-soroban-sdk#1330
How
The filtering uses markers embedded by the SDK (via
soroban_spec::shaking):Marker format (14 bytes):
SpEcV1magic prefixFiltering process:
rssdkfeat=experimental_spec_shaking_v2— if absent, skip filtering (SDK does not support spec shaking)SpEcV1markers viasoroban_spec::shaking::find_all()soroban_spec::shaking::filter():contractspecv0custom section with the filtered versionWhy this works: The SDK's proc-macros generate code that references a static marker when a type/event is used. If that code path survives dead code elimination (DCE), the marker remains in the data section. If the type/event is never used, DCE removes the code and the marker disappears.
Build integration: The CLI sets
SOROBAN_SDK_BUILD_SYSTEM_SUPPORTS_SPEC_SHAKING_V2=1during builds to inform the SDK that the build system supports marker-based filtering.WASM manipulation: The new
soroban_spec_tools::wasm::replace_custom_section()function rebuilds the WASM, copying all sections verbatim except the target custom section, which is replaced with new content at the end. If multiple custom sections share the same name, all are consolidated into one.SDK Support Required
This feature requires SDK support for embedding markers. See: stellar/rs-soroban-sdk#1571
Try It Out
Install the modified
stellar-clithat strips unused specs:Import the modified
soroban-sdkthat marks used specs:Build using the stellar-cli:
Inspect the contract interface and it should only show types actually used at the boundary of the contract (inputs, outputs, and events):