
kqueue: elide mutex and atomic barriers in single-threaded mode #139

Closed

mvandeberg wants to merge 1 commit into cppalliance:develop from mvandeberg:feature/kqueue-elide-mutex

Conversation

Contributor

@mvandeberg mvandeberg commented Feb 13, 2026

When concurrency_hint == 1, the scheduler now bypasses all synchronization overhead: conditional_mutex/conditional_unique_lock skip pthread_mutex calls entirely, and conditional_atomic decomposes RMW ops (fetch_add/fetch_sub/exchange/CAS) into a plain relaxed load+modify+store, eliminating ldaxr/stlxr exclusive pairs on ARM64 and LOCK prefixes on x86. Mutexes spin briefly (40 iterations with yield/pause hints) before falling back to the OS futex.

Benchmark factory updated to accept a concurrency hint so single-threaded benchmarks explicitly opt in with factory(1).
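
A minimal sketch of the fast-path idea behind conditional_atomic, under assumed names and signatures (illustrative only, not the actual corosio code):

#include <atomic>

// Illustrative sketch: when `enabled_` is false, a read-modify-write is
// decomposed into a plain relaxed load, a modification, and a relaxed
// store, so no LOCK-prefixed or ldaxr/stlxr instructions are needed on
// the single-threaded fast path.
template<class T>
class conditional_atomic
{
    std::atomic<T> value_;
    bool enabled_;

public:
    conditional_atomic(T v, bool enabled) noexcept
        : value_(v), enabled_(enabled)
    {
    }

    T fetch_add(T arg, std::memory_order order = std::memory_order_seq_cst) noexcept
    {
        if (enabled_)
            return value_.fetch_add(arg, order);
        // Single-threaded fast path: no atomic RMW required.
        T old = value_.load(std::memory_order_relaxed);
        value_.store(old + arg, std::memory_order_relaxed);
        return old;
    }
};

Since no other thread can observe the value in single-threaded mode, the decomposed relaxed operations produce the same result while letting the compiler emit ordinary loads and stores.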

Summary by CodeRabbit

Release Notes

  • New Features

    • Added concurrency-aware synchronization to optimize performance for single-threaded scenarios.
    • Enhanced context factory to accept concurrency hints for improved resource utilization.
  • Refactor

    • Updated all backends to use lightweight synchronization primitives with optional locking when concurrency is not required.


coderabbitai bot commented Feb 13, 2026

📝 Walkthrough


Introduced conditional synchronization primitives (conditional_atomic, conditional_mutex, conditional_event) to enable optional single-threaded optimization. Modified the context_factory interface to accept a concurrency hint parameter. Updated all benchmark invocations and kqueue scheduler to integrate concurrency-aware locking and factory construction.

Changes

Backend Factory Interface
perf/common/backend_selection.hpp
Replaced the function-pointer typedef with a context_factory struct accepting an unsigned concurrency hint. Updated all backend factory lambdas (epoll, kqueue, select, iocp) to forward the hint to the context constructors (a rough sketch of this interface follows the table).

Benchmark Factory Calls
perf/profile/concurrent_io_bench.cpp, perf/profile/coroutine_post_bench.cpp, perf/profile/queue_depth_bench.cpp, perf/profile/scheduler_contention_bench.cpp, perf/profile/small_io_bench.cpp
Updated factory invocations to pass the concurrency hint: factory(num_threads) in workload setup and factory(1) in warmup blocks.

Conditional Synchronization Primitives
src/corosio/src/detail/conditional_atomic.hpp, src/corosio/src/detail/conditional_mutex.hpp
Added new headers implementing conditional_atomic<T>, conditional_mutex, conditional_unique_lock, conditional_event, and a spin_pause() utility, providing optional single-threaded optimization with relaxed/non-atomic fast paths when locking is disabled.

Kqueue Backend Integration
src/corosio/src/detail/kqueue/op.hpp, src/corosio/src/detail/kqueue/scheduler.hpp, src/corosio/src/detail/kqueue/scheduler.cpp
Replaced std::mutex/std::atomic/std::condition_variable with the conditional variants. Updated the constructor to accept a concurrency_hint and propagate the locking configuration. Changed all lock parameter types from std::unique_lock<std::mutex>& to conditional_unique_lock&.
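
A rough sketch of what a hint-carrying factory could look like; all names and the struct layout here are assumptions for illustration, and the real interface in perf/common/backend_selection.hpp may differ:

#include <memory>

// Hypothetical stand-in for a backend execution context that takes a
// concurrency hint at construction.
struct io_context
{
    explicit io_context(unsigned concurrency_hint) : hint(concurrency_hint) {}
    unsigned hint;
};

// Sketch of a factory object that forwards the hint to the context
// constructor, replacing a plain function-pointer typedef that could
// not carry the extra parameter.
struct context_factory
{
    std::unique_ptr<io_context> (*make)(unsigned concurrency_hint);

    std::unique_ptr<io_context>
    operator()(unsigned concurrency_hint) const
    {
        return make(concurrency_hint);
    }
};

Benchmarks would then call factory(num_threads) for workload setup and factory(1) in warmup blocks, matching the calls listed above.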

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related issues

  • Implement kqueue reactor #13: Directly implements concurrency-aware kqueue scheduler with conditional locking primitives and updates to core scheduler internals for the kqueue reactor implementation.

Possibly related PRs

Poem

🐰 A whisker-twitch of optimization true,
Conditional locks where single threads pass through,
Concurrency hints now flow with grace,
Kqueue runs faster in its rightful place!
✨ No barriers where none are due.

🚥 Pre-merge checks | ✅ 3 passed

Description Check: ✅ Passed. Check skipped - CodeRabbit's high-level summary is enabled.
Title check: ✅ Passed. The title accurately summarizes the primary change: introducing optimizations to elide mutex and atomic barriers when operating in single-threaded mode (concurrency_hint == 1), which is the main objective of the pull request.
Merge Conflict Detection: ✅ Passed. No merge conflicts detected when merging into develop.




codecov bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.20%. Comparing base (4777a92) to head (138aba6).
⚠️ Report is 6 commits behind head on develop.

Additional details and impacted files


@@             Coverage Diff             @@
##           develop     #139      +/-   ##
===========================================
+ Coverage    81.03%   81.20%   +0.17%     
===========================================
  Files           64       64              
  Lines         5710     5710              
===========================================
+ Hits          4627     4637      +10     
+ Misses        1083     1073      -10     

see 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4777a92...138aba6.


@cppalliance-bot

An automated preview of the documentation is available at https://139.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-02-13 00:25:40 UTC


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/conditional_mutex.hpp`:
- Around line 35-42: The spin_pause() implementation uses GCC/Clang-only
intrinsics and will not compile under MSVC; update spin_pause() to add
MSVC-specific branches: include <intrin.h> when _MSC_VER is defined, use
__yield() for ARM64 (_M_ARM64) and _mm_pause() for x86 (_M_X64, _M_IX86) instead
of __asm__ volatile and __builtin_ia32_pause(), and keep the existing GCC/Clang
branches for non-MSVC compilers; ensure all compiler/arch checks are properly
`#ifdef-guarded` so the header remains portable.
🧹 Nitpick comments (4)
src/corosio/src/detail/conditional_mutex.hpp (1)

149-153: conditional_event::wait() is silently a no-op when locking is disabled — document the busy-spin hazard.

When disabled, lock.underlying().owns_lock() is false, so wait() returns immediately without blocking. Any caller that loops on a condition using cond_.wait(lock) (e.g., wait_for_signal) would degrade into an infinite busy-spin consuming 100% CPU.

Currently this is safe because wait_for_signal is structurally unreachable in single-threaded mode (the reactor sentinel ensures the queue is never empty). However, this is a non-obvious invariant — a future refactor that breaks that invariant would silently introduce a livelock. Consider adding an assertion or a brief comment at the call site to document why this is safe.
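
To make the hazard concrete, a small self-contained sketch with hypothetical names (not the scheduler's actual code):

// Hypothetical sketch of the hazard: an event whose wait() is a no-op
// when locking is disabled.
struct fake_conditional_event
{
    bool locking_enabled = false;

    template<class Lock>
    void wait(Lock&) const
    {
        if (!locking_enabled)
            return;                 // returns without blocking
        // the enabled path would block on a condition_variable here
    }
};

// A caller that loops on a predicate busy-spins forever (100% CPU) if
// nothing running on this thread ever makes the predicate true.
template<class Lock>
void drain_until(fake_conditional_event& ev, Lock& lock, bool const& signaled)
{
    while (!signaled)
        ev.wait(lock);
}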

src/corosio/src/detail/kqueue/scheduler.hpp (1)

206-208: Note: kqueue scheduler now diverges from the epoll scheduler's locking interface.

The epoll scheduler still uses std::unique_lock<std::mutex>& (per src/corosio/src/detail/epoll/scheduler.hpp:172). If the single-threaded optimization proves successful, consider applying the same treatment to the epoll backend to keep the implementations aligned.

src/corosio/src/detail/kqueue/scheduler.cpp (2)

691-697: Verify descriptor mutex is never used concurrently before register_descriptor configures it.

The conditional_mutex in descriptor_state is default-constructed with enabled = true. Lines 691–692 then reconfigure it to match the scheduler's settings. Between construction and this point, the mutex is in a fully-locked (enabled) mode regardless of concurrency_hint. This is safe because register_descriptor is called during socket setup before any I/O events can fire, but it would be slightly more robust to construct descriptor_state::mutex in the disabled state or accept the enabled flag at construction time to avoid any window of incorrect configuration.
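
One possible shape of that suggestion, sketched with assumed names; the PR's actual conditional_mutex interface may differ:

#include <mutex>

// Hypothetical: accept the locking mode at construction so the
// descriptor's mutex never passes through a window where it is
// configured differently from the scheduler.
class conditional_mutex
{
    std::mutex m_;
    bool enabled_;

public:
    explicit conditional_mutex(bool enabled = true) noexcept
        : enabled_(enabled)
    {
    }

    void lock()   { if (enabled_) m_.lock(); }
    void unlock() { if (enabled_) m_.unlock(); }
};

struct descriptor_state
{
    conditional_mutex mutex;

    explicit descriptor_state(bool locking_enabled)
        : mutex(locking_enabled)
    {
    }
};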


843-867: wait_for_signal / wait_for_signal_for silently degrade in single-threaded mode.

When locking is disabled, cond_.wait(lock) is a no-op (the underlying std::unique_lock doesn't own the mutex). wait_for_signal becomes an infinite busy-spin, and wait_for_signal_for becomes a single non-blocking check — effectively poll_one() semantics regardless of the requested timeout.

As currently designed, the reactor sentinel in the completed-ops queue prevents these paths from being reached in single-threaded mode. However, this invariant is non-obvious. A defensive assertion would catch any future regression:

Defensive assertion
 void
 kqueue_scheduler::
 wait_for_signal(conditional_unique_lock& lock) const
 {
+    // Unreachable in single-threaded mode: the reactor sentinel
+    // ensures the queue is never empty when outstanding_work > 0.
+    assert(locking_enabled() && "wait_for_signal reached in single-threaded mode");
     while ((state_ & signaled_bit) == 0)
     {

Comment on lines +35 to +42
inline void spin_pause() noexcept
{
#if defined(__aarch64__) || defined(_M_ARM64)
    __asm__ volatile("yield");
#elif defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
    __builtin_ia32_pause();
#endif
}

⚠️ Potential issue | 🟡 Minor

spin_pause() will fail to compile under MSVC due to GCC/Clang-only intrinsics.

The MSVC-specific macros (_M_ARM64, _M_X64, _M_IX86) are detected, but the code paths use __asm__ volatile and __builtin_ia32_pause() which are GCC/Clang intrinsics. MSVC requires __yield() (ARM64) and _mm_pause() (x86) from <intrin.h>. Since this header is in the shared detail/ directory (not under kqueue/), it should be portable in case other backends adopt it.

Suggested portable fix
 inline void spin_pause() noexcept
 {
-#if defined(__aarch64__) || defined(_M_ARM64)
+#if defined(__aarch64__)
     __asm__ volatile("yield");
-#elif defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
+#elif defined(_M_ARM64)
+    __yield();
+#elif defined(__x86_64__) || defined(__i386__)
     __builtin_ia32_pause();
+#elif defined(_M_X64) || defined(_M_IX86)
+    _mm_pause();
 #endif
 }

The MSVC paths would also need #include <intrin.h> guarded by #ifdef _MSC_VER.


@cppalliance-bot

GCOVR code coverage report https://139.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://139.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://139.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-02-13 00:33:13 UTC

@mvandeberg
Contributor Author

We are going to implement this at compile time. Closing.

@mvandeberg mvandeberg closed this Feb 13, 2026
