Conversation
📝 Walkthrough

Adds per-thread private work queues and bookkeeping to the epoll scheduler: a same-thread fast path for posting, draining of private queues into the global completed-ops queue, wake semantics adjusted to rouse only as many threads as there is work, and timerfd processing deferred until a timer event actually occurs.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Thread as Thread (posting/run)
    participant Context as scheduler_context (private)
    participant Scheduler as epoll_scheduler (global)
    participant GlobalQ as completed_ops (global queue)
    participant Other as Other Thread

    Note over Thread,Context: Same-thread fast-path
    Thread->>Context: post(work)
    Context->>Context: enqueue -> private_queue
    Context->>Context: ++private_outstanding_work
    Note over Context: no mutex, no wake (fast)

    Note over Other,Scheduler: Cross-thread slow-path
    Other->>Scheduler: post(work)
    Scheduler->>Scheduler: lock mutex
    Scheduler->>GlobalQ: enqueue completed_ops
    Scheduler->>Scheduler: unlock mutex
    Scheduler->>Other: wake necessary idle threads

    Note over Thread,Scheduler: Drain during run_reactor
    Thread->>Context: drain_thread_queue()
    Context->>Scheduler: lock mutex
    Context->>GlobalQ: splice private_queue into completed_ops
    Scheduler->>Context: unlock mutex
    Thread->>Scheduler: process completed_ops and timers (timers only if timerfd fired)
```
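To make the fast path concrete, here is a minimal sketch of the two posting paths from the diagram. `scheduler_context::find()` stands in for the context_stack lookup, and member names beyond those shown in the diagram are assumptions, not the PR's actual signatures:

```cpp
// Hypothetical sketch of post(), assuming the names in the diagram.
void epoll_scheduler::post(operation* op)
{
    // Fast path: this thread is currently running this scheduler, so
    // the op goes onto its private queue -- no mutex, no wakeup.
    if (scheduler_context* ctx = scheduler_context::find(this))
    {
        ctx->private_queue.push(op);
        ++ctx->private_outstanding_work;
        return;
    }

    // Slow path: a cross-thread post goes through the global queue
    // under the mutex and wakes at most one thread for the one op.
    std::unique_lock lock(mutex_);
    completed_ops_.push(op);
    ++outstanding_work_;
    wake_one_thread_and_unlock(lock);
}
```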
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 checks passed
Force-pushed from bf8f934 to 747ddd2
Codecov Report

❌ Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           develop     #104      +/-  ##
===========================================
- Coverage    80.48%   80.42%   -0.06%
===========================================
  Files           64       64
  Lines         5344     5401      +57
===========================================
+ Hits          4301     4344      +43
- Misses        1043     1057      +14
```

... and 1 file with indirect coverage changes.

Continue to review the full report in Codecov by Sentry.
An automated preview of the documentation is available at https://104.corosio.prtest3.cppalliance.org/index.html. If more commits are pushed to the pull request, the docs will rebuild at the same URL. (2026-02-04 16:38:41 UTC)
GCOVR code coverage report: https://104.corosio.prtest3.cppalliance.org/gcovr/index.html (build time: 2026-02-04 16:40:10 UTC)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/corosio/src/detail/epoll/scheduler.cpp (1)
589-626: ⚠️ Potential issue | 🟡 Minor

Update cached expiry only after `timerfd_settime` succeeds.

If `timerfd_settime` throws, the cache is already updated, so later calls may early-return even though the fd wasn’t re-armed.

🔧 Suggested adjustment

```diff
-    last_timerfd_expiry_ = nearest;
-
     itimerspec ts{};
     int flags = 0;
@@
-    if (::timerfd_settime(timer_fd_, flags, &ts, nullptr) < 0)
+    if (::timerfd_settime(timer_fd_, flags, &ts, nullptr) < 0)
         detail::throw_system_error(make_err(errno), "timerfd_settime");
+
+    last_timerfd_expiry_ = nearest;
```
🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/epoll/scheduler.cpp`:
- Around line 660-664: The epoll loop currently sets check_timers = true when
events[i].data.ptr == &timer_fd_ but does not consume the timerfd, which can
leave it readable and cause epoll to spin; modify the epoll handling in the
scheduler loop (where events[i].data.ptr is compared to &timer_fd_) to perform a
non-blocking read of timer_fd_ (read the uint64_t expirations) and ignore
EAGAIN/EWOULDBLOCK errors, logging or handling other errors as appropriate, so
the timerfd is cleared and epoll_wait won't repeatedly wake immediately.
In `@src/corosio/src/detail/epoll/scheduler.hpp`:
- Around line 170-171: The mutable field last_timerfd_expiry_ is updated from
multiple contexts (timer_service callback and reactor) and is not thread-safe;
protect concurrent access to last_timerfd_expiry_ in update_timerfd() and any
readers by using the existing mutex_ (or a dedicated std::mutex) or switch to an
atomic time representation: acquire mutex_ before reading/writing
last_timerfd_expiry_ in update_timerfd() and where it’s read, or replace
last_timerfd_expiry_ with an atomic-backed value and update it via atomic
operations to eliminate the data race.
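For the atomic alternative this prompt mentions, one possible shape is sketched below. Note that a later commit in this PR removed the cache entirely instead; the nanoseconds-since-epoch representation and the helper names here are assumptions for illustration:

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Store the cached expiry as nanoseconds-since-epoch in an atomic so
// the timer_service callback and the reactor thread can race safely.
std::atomic<std::int64_t> last_timerfd_expiry_ns_{0};

void cache_expiry(std::chrono::steady_clock::time_point nearest)
{
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  nearest.time_since_epoch()).count();
    last_timerfd_expiry_ns_.store(ns, std::memory_order_relaxed);
}

bool expiry_unchanged(std::chrono::steady_clock::time_point nearest)
{
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  nearest.time_since_epoch()).count();
    // Early-return check in update_timerfd(): skip timerfd_settime
    // when the armed expiry already matches.
    return last_timerfd_expiry_ns_.load(std::memory_order_relaxed) == ns;
}
```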
GCOVR code coverage report: https://104.corosio.prtest3.cppalliance.org/gcovr/index.html (build time: 2026-02-04 16:10:25 UTC)
Reduce mutex contention by processing events into a local queue without holding the mutex. The mutex is only acquired briefly when splicing completions into the completed_ops_ queue.

Changes:
- Process events into a local op_queue without holding the mutex
- Only acquire the mutex for the completed_ops_ splice operation
- Add a check_timers flag so timers are only processed when the timerfd fires
- Cache the last timerfd expiry to skip redundant timerfd_settime calls
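A sketch of that event-loop shape; `process_event` is a hypothetical stand-in for the per-event handling:

```cpp
// Handle ready events into a thread-local queue with no lock held,
// then take the mutex once to splice the whole batch.
op_queue local_ops;
bool check_timers = false;

for (int i = 0; i < num_events; ++i)
{
    if (events[i].data.ptr == &timer_fd_)
        check_timers = true;                  // defer timer processing
    else
        process_event(events[i], local_ops); // hypothetical helper; no mutex held
}

{
    std::lock_guard lock(mutex_);
    completed_ops_.splice(local_ops);         // single brief critical section
}
```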
Only wake idle threads, and only as many as we have work available. This prevents waking all threads when only a few completions arrive.
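A sketch of that wake policy, assuming an `idle_threads_` counter of condvar waiters and a `notify_one()` on the wakeup event (both assumptions):

```cpp
#include <algorithm>

// Wake at most one idle thread per newly queued completion rather
// than notify_all(): running threads will find the work themselves,
// and surplus wakeups only burn futex calls.
long to_wake = std::min<long>(new_completions, idle_threads_);
for (long i = 0; i < to_wake; ++i)
    wakeup_event_.notify_one();
```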
When posting work from within the scheduler's run loop, use a thread-local queue instead of acquiring the global mutex. This matches Asio's thread_info::private_op_queue optimization.

- Extend scheduler_context with private_queue and a work counter
- Fast path in post() detects same-thread posting via context_stack
- Drain points: before blocking, after the reactor splice, on exit
- Reduces futex calls from ~450K to 1 in multi-threaded benchmarks
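The three drain points can share one small helper. `drain_thread_queue` appears later in this review; the other names are assumptions:

```cpp
// Called before blocking, after the reactor splice, and on exit:
// hand this thread's privately queued ops to the global queue so
// other threads can pick them up.
void flush_private_queue(scheduler_context& ctx)
{
    if (ctx.private_queue.empty())
        return;
    long n = ctx.private_outstanding_work;
    ctx.private_outstanding_work = 0;
    drain_thread_queue(ctx.private_queue, n); // splices under the mutex
}
```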
- Consume the timerfd on expiry to prevent epoll busy-spinning (a level-triggered fd must be read to clear its readable state)
- Remove the last_timerfd_expiry_ caching optimization to match Asio (eliminates the data race between the timer callback and the reactor thread)
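A minimal sketch of that consume step, assuming `timer_fd_` was created with `TFD_NONBLOCK` (the error helpers mirror the ones used elsewhere in this review):

```cpp
// In the event loop, when the timerfd entry fires: read the 8-byte
// expiration count so the level-triggered fd stops reporting readable.
// EAGAIN means another thread already consumed it -- not an error.
if (events[i].data.ptr == &timer_fd_)
{
    std::uint64_t expirations = 0;
    if (::read(timer_fd_, &expirations, sizeof(expirations)) < 0 &&
        errno != EAGAIN && errno != EWOULDBLOCK)
    {
        detail::throw_system_error(make_err(errno), "timerfd read");
    }
    check_timers = true;
}
```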
Force-pushed from 6166eca to 5bce71e
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/epoll/scheduler.cpp`:
- Around line 533-542: The drain_thread_queue implementation uses
wakeup_event_.notify_all() which only wakes condvar waiters and can leave the
reactor unaware when no threads are idle; change drain_thread_queue (and its
call sites if needed) to reuse the scheduler's existing reactor wake/interrupt
path instead of a raw notify_all: after splicing into completed_ops_ (and
preserving outstanding_work_), invoke the scheduler's established wake/interrupt
helper (the same function used elsewhere to interrupt or wake the reactor) so
queued handlers are promptly observed by the reactor; reference
epoll_scheduler::drain_thread_queue, completed_ops_, outstanding_work_, and
wakeup_event_ when locating the change.
```cpp
void
epoll_scheduler::
drain_thread_queue(op_queue& queue, long count) const
{
    std::lock_guard lock(mutex_);
    // Note: outstanding_work_ was already incremented when posting
    completed_ops_.splice(queue);
    if (count > 0)
        wakeup_event_.notify_all();
}
```
Ensure drained work wakes or interrupts the reactor when no threads are idle.

`notify_all()` only wakes condvar waiters. If no threads are idle and the reactor is waiting in `epoll_wait`, the newly queued handlers can sit until an unrelated wakeup. Consider reusing the existing wake/interrupt logic so queued work is promptly observed.
🛠️ Suggested fix

```diff
 void
 epoll_scheduler::
 drain_thread_queue(op_queue& queue, long count) const
 {
-    std::lock_guard lock(mutex_);
+    std::unique_lock lock(mutex_);
     // Note: outstanding_work_ was already incremented when posting
     completed_ops_.splice(queue);
     if (count > 0)
-        wakeup_event_.notify_all();
+        wake_one_thread_and_unlock(lock);
 }
```
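For context, a sketch of what such a wake/interrupt helper can look like. `idle_threads_` and `interrupt_reactor()` are illustrative assumptions; the essential point from the review is that a cross-thread wake must be able to reach a thread blocked in `epoll_wait`, not only condvar waiters:

```cpp
#include <mutex>

// Illustrative sketch, not the PR's actual implementation.
void
epoll_scheduler::
wake_one_thread_and_unlock(std::unique_lock<std::mutex>& lock)
{
    if (idle_threads_ > 0)
    {
        // A thread is parked on the condvar; one notify is enough.
        lock.unlock();
        wakeup_event_.notify_one();
    }
    else
    {
        // No condvar waiters: interrupt the reactor (e.g. write to an
        // eventfd registered with epoll) so epoll_wait returns and the
        // reactor thread sees the newly queued handlers.
        lock.unlock();
        interrupt_reactor();
    }
}
```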