Fix PerfMap crash when Enable IPC command is sent early in startup#124055
Merged
Fix PerfMap crash when Enable IPC command is sent early in startup#124055
Conversation
Copilot
AI
changed the title
[WIP] Investigate ways to fix runtime issue 123438
Fix PerfMap crash when Enable IPC command is sent early in startup
Feb 5, 2026
AppDomain is created in SystemDomain::Attach, which runs after DiagnosticServerAdapter::Initialize. A PerfMapEnable IPC command received during PauseForDiagnosticsMonitor would crash calling IterateAssembliesEx on null. Safe to skip: no assemblies are loaded before SystemDomain::Attach.
EEJitManager is created in ExecutionManager::Init, which runs after DiagnosticServerAdapter::Initialize. A PerfMapEnable IPC command received during PauseForDiagnosticsMonitor would crash on null dereference. Safe to skip: no JIT'd code exists before ExecutionManager::Init. CodeVersionManager::LockHolder remains inside the null check intentionally - the lock is only needed while iterating code versions, not for the null check. The lock is already initialized in CodeVersionManager::StaticInitialize before DiagnosticServerAdapter::Initialize.
e94322b to
62d36a5
Compare
This was referenced Feb 6, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes a CoreCLR startup crash when a PerfMapEnable IPC command arrives very early (before AppDomain and/or EEJitManager are initialized), by making PerfMap::Enable(sendExisting: true) resilient to those components being unavailable.
Changes:
- Guard assembly enumeration behind a null-check for
GetAppDomain()to avoid callingIterateAssembliesExbeforeSystemDomain::Attach(). - Guard JIT code heap enumeration behind a null-check for
ExecutionManager::GetEEJitManager()to avoid iterating heaps beforeExecutionManager::Init(). - Reuse the
EEJitManager*local when building theCodeHeapIterator.
Contributor
|
Tagging subscribers to this area: @agocke |
jkotas
reviewed
Feb 6, 2026
The previous null checks on GetAppDomain() and GetEEJitManager() have race conditions because statics like m_pEEJitManager are not Volatile. Use a Volatile<bool> s_sendExistingReady flag instead. The flag is set by PerfMap::SignalSendExistingReady, called from EEStartup after ExecutionManager::Init. When Enable is called before the flag is set, sendExisting iteration is skipped since no assemblies are loaded and no code is JIT'd anyway.
jkotas
reviewed
Feb 7, 2026
…ndenciesReady - Rename method to better capture intent - Update comment to clarify it must be called before any code is JITed or restored from R2R - Remove stale comment about call site location
jkotas
reviewed
Feb 9, 2026
jkotas
approved these changes
Feb 9, 2026
This was referenced Feb 9, 2026
Open
Member
|
/backport to release/10.0 |
Contributor
|
Started backporting to |
Contributor
|
@mdh1418 backporting to git am output$ git am --3way --empty=keep --ignore-whitespace --keep-non-patch changes.patch
Applying: Add null check for AppDomain in early-startup PerfMap::Enable
Applying: Add null check for EEJitManager in early-startup PerfMap::Enable
Using index info to reconstruct a base tree...
M src/coreclr/vm/perfmap.cpp
Falling back to patching base and 3-way merge...
Auto-merging src/coreclr/vm/perfmap.cpp
CONFLICT (content): Merge conflict in src/coreclr/vm/perfmap.cpp
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0002 Add null check for EEJitManager in early-startup PerfMap::Enable
Error: The process '/usr/bin/git' failed with exit code 128 |
4 tasks
steveisok
pushed a commit
that referenced
this pull request
Feb 11, 2026
…y in startup (#124208) Manual backport of #123226 and #124055 to release/10.0 ## Customer Impact - [ ] Customer reported - [x] Found internally The DiagnosticServer allows profilers to enable PerfMap early in start-up, even before PerfMap is initialized and dependencies are ready, inducing a crash. Enabling PerfMap this early in start-up was not tested until user_events support was added in .NET 10, where One-Collect's [record-trace](https://github.com/microsoft/one-collect) aggressively enables PerfMaps as soon as it connects to a .NET Process so it can resolve JIT'd code. As .NET user_events + record-trace gains more users, this crash will likely be observed more frequently. With the changes in the two PRs, the runtime is resilient to early PerfMap::Enable commands received by the DiagnosticServer. DiagnosticServer initialization will still be early in start-up, PerfMap initialization is just bumped right before it. ## Regression - [ ] Yes - [x] No ## Testing I validated on a wsl2 instance, pausing the startup logic with DOTNET_DefaultDiagnosticPortSuspend=1 and kicking off a separate app invoking DiagnosticClient.EnablePerfMap. Before the change, it would sigsegv and crash. After the changes, its resilient and doesn't crash. ## Risk Low. PerfMap initialization doesn't depend on the logic between its former and proposed location. The logic to handle early IPC PerfMap Enable commands makes the problematic logic a no-op until the dependencies are ready.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fixes #123438
PerfMap::Enable(sendExisting=true) crashes when a PerfMapEnable IPC command is received very early in startup (e.g., during PauseForDiagnosticsMonitor). At this point:
Both are initialized after DiagnosticServerAdapter::Initialize, so IPC commands can arrive before these are ready.
This PR adds a volatile bool to signal when
sendExistinglogic dependencies are initialized, otherwise sendExisting logic is skipped. sendExisting logic is safe to skip when called early - no assemblies are loaded and no code is JIT'd at that point. Future JIT'd methods will still be logged.Startup Timeline
Testing
mdh1418 validated on a wsl2 instance, pausing the startup logic with
DOTNET_DefaultDiagnosticPortSuspend=1and kicking off a separate app invoking DiagnosticClient.EnablePerfMap. Before the change, it would sigsegv and crash.After the changes, its resilient and doesn't crash.
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.