libc/semaphore: Allow Idle Task to acquire available semaphores#18288
libc/semaphore: Allow Idle Task to acquire available semaphores#18288aviralgarg05 wants to merge 1 commit intoapache:masterfrom
Conversation
Move the assertion check in `nxsem_wait` to after the fast path logic. This prevents `Idle_Task` from asserting when acquiring an uncontended semaphore/mutex (e.g. for logging), while still preventing it from waiting on a contended one. Fixes: apache#17865
| /* This API should not be called from the idleloop or interrupt */ | ||
|
|
||
| #if defined(CONFIG_BUILD_FLAT) || defined(__KERNEL__) | ||
| DEBUGASSERT(!OSINIT_IDLELOOP() || !sched_idletask() || |
There was a problem hiding this comment.
why move assert to the end
There was a problem hiding this comment.
Confirmed that the fix addresses the root cause identified in the issue description (Assertion failed at file: semaphore/sem_wait.c:146 task: Idle_Task).
Provide logs from before and after your change, and information about the platform you tested on.
Also, if the idle task shouldn't be using this API, what difference does it make to move the assert statement? You're allowing the idle task to sometimes use the API.
Hi @linguini1, Regarding the logs, the "before" log is detailed in the issue description of #17865: Testing Environment:
"After" result:
The "After" log is essentially the normal, clean boot sequence: (The assertion at Rationale for moving the assertion: However, with the atomic "fast-path" acquisition now in place, |
|
@aviralgarg05 thank you for the logs! In the future, please include that information in your testing section. Your explanation doesn't really make sense to me. Why does the idle task try to wait on a semaphore only in the LVGL demo? Doesn't this assertion indicate that the problem is with something in the LVGL configuration, since no other NuttX code encounters this issue? |
From what I understand, the Idle task in the simulator is running the host event loop (input/display), so it’s effectively behaving like a driver thread. Since these drivers use standard APIs, they’re protected by mutexes. Earlier, the assertion was basically asking “Am I calling a wait function?”, which meant even safe, instant atomic acquisitions were blocked. The fix changes that logic to instead ask “Am I actually going to sleep?”. With this change, it seems okay for the Idle task to grab a free semaphore through the fast/atomic path, since that doesn’t block and is safe. But if it tries to wait on a busy semaphore and goes down the slow path, we still panic — which keeps the safety guarantees intact. |
Hmmm, that's different than my understanding. I was always under the impression that the idle task doesn't do anything except occupy other CPU cores while no tasks are running on them. Maybe someone more familiar with it can weigh in. @acassis any ideas? |
sem_wait shouldn't be called from interrupt/idle context regardless whether the wait really happen. |
@xiaoxiang781216 I understand, thank you for the clarification. The issue arises because the Simulator architecture currently relies on the Idle loop to |
do you use the latest master code? all code in idle loop is already moved into the callback of wdog/wqueue. |
@xiaoxiang781216 actually yes, sorry for the misunderstandings and thank you for pointing out. |
|
let's close this pr. |
Move the assertion check in
nxsem_waitto after the fast path logic. This preventsIdle_Taskfrom asserting when acquiring an uncontended semaphore/mutex (e.g. for logging), while still preventing it from waiting on a contended one.Fixes: #17865
Summary
The
nxsem_waitfunction previously contained aDEBUGASSERTat the very beginning that prohibited theIdle_Taskor an interrupt handler from calling the function entirely. However, with the introduction of atomic "fast-path" acquisition, it is safe for these contexts to acquire a semaphore or mutex if it is currently available (uncontended), as this does not result in a context switch or blocking.This change moves the assertion to the point just before calling
nxsem_wait_slow. This ensures that:Idle_Taskand interrupt handlers can successfully acquire available semaphores/mutexes via the fast path.This specifically resolves a crash observed in the LVGL simulator when the mouse enters the page and the
Idle_Taskor related kernel threads trigger an assertion while performing operations (like logging) that involve semaphores.Impact
sim:lvgl_lcdconfiguration and potentially other simulator/kernel scenarios.Testing
./tools/checkpatch.sh -f libs/libc/semaphore/sem_wait.c.nxsem_waitimplementation inlibs/libc/semaphore/sem_wait.cto ensure the logic flow correctly bypasses the assertion only upon successful atomic acquisition.Host: macOS (Apple Silicon)
Target: Simulator (sim:lvgl_lcd)
Logs
Before Change (Crash):
After Change (Success):
(The assertion at
sem_wait.c:146is no longer encountered, and the system boots correctly).