Skip to content

[APP-7B2] [DO NOT MERGE] Reproduce and validate iOS deadlock fix in ComponentDescriptorRegistry#91607

Draft
JakubKorytko wants to merge 6 commits into
Expensify:mainfrom
software-mansion-labs:korytko/app-7B2
Draft

[APP-7B2] [DO NOT MERGE] Reproduce and validate iOS deadlock fix in ComponentDescriptorRegistry#91607
JakubKorytko wants to merge 6 commits into
Expensify:mainfrom
software-mansion-labs:korytko/app-7B2

Conversation

@JakubKorytko
Copy link
Copy Markdown
Contributor

@JakubKorytko JakubKorytko commented May 25, 2026

DO NOT MERGE - APP-7B2 Deadlock Reproduction & Fix Validation

This PR exists solely to reproduce and validate the fix for the APP-7B2 fatal iOS app hang described in #91292. It is not intended for merge.

How to test

This branch has two commits. Each should be tested with a separate ad-hoc build:

  1. simulate app-7B2 crash - demonstrates the deadlock (app hangs)
  2. fix app-7B2 crash - applies the fix (app no longer hangs)

Steps:

  1. Run the relevant ad-hoc build
  2. Log in via OldDot (Expensify Classic)
  3. Navigate to NewDot (the transition triggers the reproduction)
  4. Observe:
    • Without fix (commit 1): App freezes - the deadlock watchdog fires after ~3s, the app crashes or hangs
    • With fix (commit 2): Transition completes normally, deadlock did NOT occur

What is APP-7B2?

A fatal iOS app hang caused by mutex contention in React Native Fabric's ComponentDescriptorRegistry. It occurs during OldDot -> NewDot transitions in HybridApp when legacy interop components are lazily registered.

The deadlock (two threads, circular dependency)

Thread Holds Waiting for
Main (CADisplayLink -> stepAnimations) - shared_lock(mutex_) in findComponentDescriptorByHandle_DO_NOT_USE_THIS_IS_BROKEN
Background (JS -> _registerComponentIfPossible) unique_lock(mutex_) in ComponentDescriptorRegistry::add() Main queue via RCTUnsafeExecuteOnMainQueueSync (inside LegacyViewManagerInteropComponentDescriptor constructor)

The main thread can't acquire shared_lock because the background thread holds unique_lock. The background thread can't proceed because it's waiting for the main thread (which is blocked). Classic deadlock.

Commit 1: Reproduce (+037 patch)

Patch: react-native+0.83.1+037+deadlock-repro-APP-7B2.patch

Adds an RCTDeadlockRepro native module that recreates the exact conditions from the Sentry crash report:

  • Main thread: A CADisplayLink repeatedly calls findComponentDescriptorByHandle_DO_NOT_USE_THIS_IS_BROKEN, acquiring shared_lock(mutex_) - matching Sentry Thread 0's stack
  • Background thread: Clears and re-registers legacy interop component descriptors, which calls ComponentDescriptorRegistry::add() (acquires unique_lock(mutex_)) -> LegacyViewManagerInteropComponentDescriptor constructor -> RCTUnsafeExecuteOnMainQueueSync - matching Sentry Thread 24's stack

The repro also wraps the original getViewManagerClass() call in RCTUnsafeExecuteOnMainQueueSync inside constructCoordinator() to force the main-queue dependency that exists in production (where view managers must be accessed on the main thread).

Key implementation details:

  • Stores ComponentDescriptorProvider objects (not just names) to preserve flavor shared_ptrs across re-registration cycles, keeping ComponentHandle values stable
  • Uses selective provider clearing (clearProvidersForTesting) to only remove legacy interop providers, leaving standard component providers intact for subsequent ComponentDescriptorRegistry creation

Files modified:

  • RCTComponentViewFactory.mm - adds clearDescriptorCacheForTesting and callBrokenLookupForTesting
  • ComponentDescriptorProviderRegistry.cpp/.h - adds clearForTesting, clearProvidersForTesting, callBrokenLookupForTesting
  • ComponentDescriptorRegistry.cpp/.h - adds logging and retainCurrentDescriptors
  • LegacyViewManagerInteropComponentDescriptor.mm - wraps getViewManagerClass in RCTUnsafeExecuteOnMainQueueSync
  • RCTDeadlockRepro.mm/.h - the native module orchestrating the reproduction
  • HybridAppHandler.tsx - triggers the repro on OldDot->NewDot transition

Commit 2: Fix (+038 patch)

Patch: react-native+0.83.1+038+fix-deadlock-APP-7B2.patch

Moves the descriptor construction outside the unique_lock in ComponentDescriptorRegistry::add(). The lock is only acquired for the final map insertions (_registryByHandle, _registryByName):

// BEFORE (deadlocks):
void ComponentDescriptorRegistry::add(...) {
  std::unique_lock lock(mutex_);          // 1. Lock
  auto descriptor = provider.constructor(...); // 2. Construct (calls RCTUnsafeExecuteOnMainQueueSync) -> BLOCKED
  _registryByHandle[handle] = descriptor;
}

// AFTER (fixed):
void ComponentDescriptorRegistry::add(...) {
  auto descriptor = provider.constructor(...); // 1. Construct (no lock held, main thread can proceed)
  std::unique_lock lock(mutex_);               // 2. Lock (only for map insertion)
  _registryByHandle[handle] = descriptor;
}

This breaks the circular dependency: the background thread no longer holds unique_lock when calling RCTUnsafeExecuteOnMainQueueSync, so the main thread can acquire shared_lock freely.

Videos

Simulated (commit 1)

Simulated.mov

Simulated with fix (commit 2)

Fixed.mov

@github-actions
Copy link
Copy Markdown
Contributor

⚠️ This PR is possibly changing native code and/or updating libraries, it may cause problems with HybridApp. Please check if any patch updates are required in the HybridApp repo and run an AdHoc build to verify that HybridApp will not break. Ask Contributor Plus for help if you are not sure how to handle this. ⚠️

@github-actions
Copy link
Copy Markdown
Contributor

🚧 @mountiny has triggered a test Expensify/App build. You can view the workflow run here.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown
Contributor

🚧 @mountiny has triggered a test Expensify/App build. You can view the workflow run here.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 25, 2026

Codecov Report

❌ Looks like you've decreased code coverage for some files. Please write tests to increase, or at least maintain, the existing level of code coverage. See our documentation here for how to interpret this table.

Files with missing lines Coverage Δ
src/HybridAppHandler.tsx 52.94% <0.00%> (-3.31%) ⬇️
... and 21 files with indirect coverage changes

@github-actions

This comment has been minimized.

This reverts commit 6d39405.
@github-actions
Copy link
Copy Markdown
Contributor

🚧 @mountiny has triggered a test Expensify/App build. You can view the workflow run here.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown
Contributor

🚧 @mountiny has triggered a test Expensify/App build. You can view the workflow run here.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown
Contributor

🚧 @mountiny has triggered a test Expensify/App build. You can view the workflow run here.

@github-actions
Copy link
Copy Markdown
Contributor

🧪🧪 Use the links below to test this adhoc build on Android, iOS, and Web. Happy testing! 🧪🧪
Built from App PR #91607.

Android 🤖 iOS 🍎
⏩ SKIPPED ⏩ https://ad-hoc-expensify-cash.s3.us-east-1.amazonaws.com/rock-artifacts/ad-hoc/rock-ios-device-AdHoc-83beb69-75cbb1b-baa1c14f6a3287c18c348f8948556cd9c3028b62/index.html
The build for Android was skipped iOS
Web 🕸️
⏩ SKIPPED ⏩
The build for Web was skipped

👀 View the workflow run that generated this build 👀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant