fix(sys): use THREAD_LOCAL TLS for mimalloc v3 on Apple #71
Draft
shulaoda wants to merge 1 commit into
shulaoda added a commit to rolldown/rolldown that referenced this pull request on May 22, 2026
) See napi-rs/mimalloc-safe#71 & https://github.com/rolldown/rolldown/actions/runs/26264825278/job/77305893292 This is temporarily merged to allow upgrade and integration, and will be validated in subsequent CI runs. It has already been confirmed to work in the current PR. It will be reverted before the next release, and mimalloc-safe will switch back to the official release version for production use.
Summary
Replaces the Apple-branch TLS model in mimalloc v3's prim.h selector with MI_TLS_MODEL_THREAD_LOCAL + MI_TLS_RECURSE_GUARD, solving both the multi-instance heap-corruption issue (Mode A below) and the rayon-worker SIGABRT issue introduced by #67 (Mode B below). The patch is applied by build.rs at build time, is idempotent, and leaves the submodule pristine in git.
Problem
Two distinct failure modes have been observed in production:
Mode A: multi-instance heap corruption (FIXED_SLOT, upstream default)
mimalloc v3 on Apple stores the per-thread theap pointer at fixed TCB slots 108/109, accessed directly via tpidrro_el0 (arm64) or %gs: (x86_64). When multiple statically linked mimalloc-safe instances coexist in one process (the canonical case being multiple Node.js napi addons), every instance writes its own heap pointer to the same slot on every thread it touches.
On any thread called into by more than one addon (the main JS thread above all, plus the libuv worker pool and ThreadsafeFunction dispatcher threads), the second instance reads back the first's heap pointer, treats it as its own, and corrupts both.
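The collision can be illustrated with a toy model. This is a simplified sketch, not mimalloc code: slot index 108 comes from the description above, while `ThreadControlBlock`, `AllocatorInstance`, `heap_for_thread`, and the heap values are hypothetical names chosen for illustration. It assumes each instance claims an empty slot and trusts a non-empty one.

```rust
// Toy model only; this is not mimalloc code. Slot 108 comes from the text
// above, and every type and function name here is hypothetical.
const HEAP_SLOT: usize = 108;

/// Stand-in for one thread's control block: a table of pointer-sized slots.
struct ThreadControlBlock {
    slots: [usize; 256],
}

/// Stand-in for one statically linked allocator instance.
struct AllocatorInstance {
    own_heap: usize, // pretend address of this instance's heap
}

impl AllocatorInstance {
    /// Fixed-slot lookup: an empty slot is claimed, a non-empty slot is
    /// assumed to already hold this instance's heap.
    fn heap_for_thread(&self, tcb: &mut ThreadControlBlock) -> usize {
        if tcb.slots[HEAP_SLOT] == 0 {
            tcb.slots[HEAP_SLOT] = self.own_heap;
        }
        tcb.slots[HEAP_SLOT]
    }
}

/// Runs two instances on one simulated thread and returns the heap each sees.
fn simulate_shared_slot() -> (usize, usize) {
    let mut main_thread = ThreadControlBlock { slots: [0; 256] };
    let addon_a = AllocatorInstance { own_heap: 0xA000 };
    let addon_b = AllocatorInstance { own_heap: 0xB000 };
    let seen_by_a = addon_a.heap_for_thread(&mut main_thread);
    let seen_by_b = addon_b.heap_for_thread(&mut main_thread);
    (seen_by_a, seen_by_b)
}

fn main() {
    let (seen_by_a, seen_by_b) = simulate_shared_slot();
    println!("addon A uses heap {:#x}", seen_by_a);
    println!("addon B uses heap {:#x} (its own heap is 0xb000)", seen_by_b);
}
```

In this model the second instance ends up operating on the first instance's heap, which is the shape of the corruption described above. Giving each instance its own per-thread storage removes the shared slot from the picture.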
mimalloc upstream acknowledges this in prim.h.
Mode B: rayon-worker SIGABRT (DYNAMIC_PTHREADS, #67's workaround)
PR #67 addressed Mode A by passing -DMI_HAS_TLS_SLOT=0, routing Apple to MI_TLS_MODEL_DYNAMIC_PTHREADS (per-image pthread_key_create + pthread_setspecific). That code path is primarily designed and tested for OpenBSD/Android; its interaction with long-lived non-tokio threads on macOS is fragile.
Observed in rolldown:
- oxc_cfg → oxc_index → rayon) hits a pthread_setspecific timing inconsistency on its first or post-main allocation.
- handle_alloc_error fires and the process aborts with SIGABRT after the bundle has otherwise completed successfully (Finished in ~15ms is printed to stdout immediately before).
- it.repeats(100) on rolldown's cli-e2e.test.ts.
Solution
Switch the Apple branch in mimalloc v3's prim.h selector to use MI_TLS_MODEL_THREAD_LOCAL together with MI_TLS_RECURSE_GUARD.
Both are mature upstream code paths: THREAD_LOCAL is the default on Linux/FreeBSD/NetBSD/etc.; RECURSE_GUARD has been in the source since v2 specifically to handle dyld-TLV first-touch on macOS.
- mi_decl_hidden mi_decl_thread produces per-image __thread symbols → dyld TLV allocates per-image per-thread storage. No shared TCB slot.
- MI_TLS_RECURSE_GUARD short-circuits the fast path via a plain non-TLS _mi_process_is_initialized bool until process init completes.
- MI_HAS_TLS_SLOT stays at the upstream default 1, so _mi_prim_thread_id() keeps using mi_prim_tls_slot(0) (Apple's system-defined thread-id TSD slot: semantically shared across consumers, no conflict, no allocation).
- pthread_setspecific timing surface: every thread just reads/writes its own TLV slot via a normal __thread access.
Why a build.rs patch (not a fork or vendoring)
mimalloc3 is a git submodule pinned to upstream microsoft/mimalloc at v3.3.2 (commit 30b2d9d8). Tradeoffs of the alternatives:
A build.rs patch is what's done here. It is minimal and surgical, and:
- assert!s if the upstream selector block can't be located. If a future mimalloc release rewrites prim.h's selector, the next submodule bump fails the build with a clear error message pointing back at the patcher.
- git submodule update --init on a fresh clone works unchanged. CI re-applies per build.
- cargo:rerun-if-changed=<prim.h> for change detection. Emitted via cargo:warning so the patch application is visible in build logs.
Migration
- The -DMI_HAS_TLS_SLOT=0 cflag is removed.
- v3 feature when bumping from 0.1.61.
Performance
Fast-path overhead from RECURSE_GUARD: one BSS load + one branch on _mi_process_is_initialized. After the first microsecond of process life the branch is predictable to "yes". Per upstream documentation the impact is sub-microsecond in microbenchmarks; it is not measurable in real workloads.
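The load-plus-branch guard described here can be sketched in Rust. This is an illustration of the pattern under stated assumptions, not mimalloc's implementation: `PROCESS_IS_INITIALIZED` stands in for `_mi_process_is_initialized`, and `FALLBACK_HEAP`, `THREAD_HEAP`, `current_heap`, `finish_process_init`, and `set_thread_heap` are hypothetical names.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for _mi_process_is_initialized: an ordinary static
// (not thread-local), so reading it never touches thread-local storage.
static PROCESS_IS_INITIALIZED: AtomicBool = AtomicBool::new(false);

// Hypothetical sentinel returned while the process is still initializing.
const FALLBACK_HEAP: usize = usize::MAX;

thread_local! {
    // Per-thread heap handle; stands in for the per-image __thread variable.
    static THREAD_HEAP: Cell<usize> = Cell::new(0);
}

/// Fast path: one load and one branch before any thread-local access.
fn current_heap() -> usize {
    if !PROCESS_IS_INITIALIZED.load(Ordering::Acquire) {
        // Too early to touch thread-local storage safely; short-circuit.
        return FALLBACK_HEAP;
    }
    THREAD_HEAP.with(|heap| heap.get())
}

/// Marks process initialization as complete; the guard passes from here on.
fn finish_process_init() {
    PROCESS_IS_INITIALIZED.store(true, Ordering::Release);
}

/// Records a heap handle for the calling thread only.
fn set_thread_heap(value: usize) {
    THREAD_HEAP.with(|heap| heap.set(value));
}

fn main() {
    // Before init completes, the guard returns the fallback sentinel.
    println!("before init: {:#x}", current_heap());

    finish_process_init();
    set_thread_heap(0x1234);
    println!("after init: {:#x}", current_heap());

    // Another thread has its own storage and does not see 0x1234.
    let other = std::thread::spawn(current_heap).join().unwrap();
    println!("other thread: {:#x}", other);
}
```

Once the flag is set, every call takes the same side of the branch, which is why the guard's cost reduces to a well-predicted check in front of the ordinary thread-local read.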