block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT by blktests-ci[bot] · Pull Request #802 · linux-blktests/linux-block

blktests-ci · 2026-05-06T07:26:24Z

Pull request for series with
subject: block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT
version: 6
url: https://patchwork.kernel.org/project/linux-block/list/?series=1090278

blktests-ci · 2026-05-06T07:26:25Z

Upstream branch: 6d35786
series: https://patchwork.kernel.org/project/linux-block/list/?series=1090278
version: 6

blktests-ci · 2026-05-10T16:16:52Z

Upstream branch: aa54b1d
series: https://patchwork.kernel.org/project/linux-block/list/?series=1090278
version: 6

blktests-ci · 2026-05-12T06:35:38Z

Upstream branch: aa54b1d
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

blktests-ci · 2026-05-12T17:58:26Z

Upstream branch: aa54b1d
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

blktests-ci · 2026-05-15T08:05:49Z

Upstream branch: 70eda68
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

blktests-ci · 2026-05-21T03:29:39Z

Upstream branch: 8bc67e4
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

blktests-ci · 2026-05-22T02:26:31Z

Upstream branch: 6779b50
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

blktests-ci · 2026-05-23T07:22:17Z

Upstream branch: 79bd2dd
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

blktests-ci · 2026-05-23T17:52:02Z

Upstream branch: eed108e
series: https://patchwork.kernel.org/project/linux-block/list/?series=1093294
version: 7

… on RT

On PREEMPT_RT kernels, commit 6bda857 ("block: fix ordering between checking QUEUE_FLAG_QUIESCED request adding") causes a severe throughput regression on systems with many MSI-X interrupt vectors.

That commit closed a store/load race between blk_mq_run_hw_queue() and blk_mq_unquiesce_queue() by taking q->queue_lock around the requiesce re-check in blk_mq_run_hw_queue(). Its changelog noted two ways to fix the race -- (1) a pair of memory barriers, or (2) the queue_lock -- and picked (2) because barriers are harder to maintain.

On RT, spinlock_t becomes a sleeping rt_mutex. blk_mq_run_hw_queue() is called from every IRQ thread, and the re-check path is hit on the very common "nothing pending" case, so all IRQ threads end up serialising on the single q->queue_lock and block in D-state. On a Broadcom/LSI MegaRAID 12GSAS/PCIe Secure SAS39xx (megaraid_sas, 128 MSI-X vectors, 120 hw queues) throughput drops from 640 MB/s to 153 MB/s.

Take approach (1) instead, and while at it turn quiesce_depth into the single source of truth for the quiesce state:

- quiesce_depth becomes atomic_t and QUEUE_FLAG_QUIESCED is removed; blk_queue_quiesced() is now "atomic_read(&q->quiesce_depth) > 0". This also makes blk_queue_quiesced(), which is read locklessly from the dispatch path, a clean atomic load instead of a plain-int read racing with a spin_lock-protected int update.

- blk_mq_quiesce_queue_nowait() does an atomic_inc() followed by smp_mb__after_atomic(). The spin_lock() it used to take only served to publish the state change; every caller still follows the quiesce with blk_mq_wait_quiesce_done() (synchronize_srcu()/synchronize_rcu()), which is what actually drains in-flight dispatchers and makes the new state globally visible. The barrier here just keeps the helper self-contained for the few callers that defer that wait.

- blk_mq_unquiesce_queue() uses atomic_dec_if_positive() (so the WARN-on-underflow check and the decrement are one atomic op) followed by smp_mb__after_atomic() before blk_mq_run_hw_queues(). This is the write side of the race fixed above: a full barrier between the quiesce_depth store and the blk_mq_hctx_has_pending() load.

- blk_mq_run_hw_queue() drops the q->queue_lock around the requiesce re-check and uses smp_mb() instead. This is the read side: a full barrier between the just-inserted request (the store that makes blk_mq_hctx_has_pending() true) and the quiesce-state load.

A full barrier is required on both sides -- this is a classic store-buffer pattern -- so smp_mb()/smp_mb__after_atomic() rather than a read barrier; with that, at least one of the two racing CPUs observes the other's store and the hw queue is not left both un-quiesced and not rerun.

No locking remains on the dispatch hot path.

Performance on the RT kernel and the hardware above:

- Before: 153 MB/s, IRQ threads in D-state on q->queue_lock
- After: 640 MB/s, no IRQ threads blocked

The non-RT path replaces a queue_lock acquire/release on the re-check with an smp_mb(), so it should be no worse, and it also stops taking q->queue_lock from blk_mq_run_hw_queue() entirely.

Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: 6bda857 ("block: fix ordering between checking QUEUE_FLAG_QUIESCED request adding")
Cc: stable@vger.kernel.org
Signed-off-by: Ionut Nechita <ionut.nechita@windriver.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
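The barrier pairing described in the commit message can be modelled in user space. The sketch below is not the patch itself: it uses C11 `<stdatomic.h>` in place of the kernel primitives, with `atomic_thread_fence(memory_order_seq_cst)` standing in for smp_mb()/smp_mb__after_atomic(). The `model_*` names, `struct model_queue` and its `has_pending` and `dispatched` fields are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a request queue; not kernel code. */
struct model_queue {
	atomic_int quiesce_depth;	/* > 0 means the queue is quiesced */
	atomic_bool has_pending;	/* stands in for blk_mq_hctx_has_pending() */
	atomic_int dispatched;		/* number of dispatches, for checking */
};

static void model_queue_init(struct model_queue *q)
{
	atomic_init(&q->quiesce_depth, 0);
	atomic_init(&q->has_pending, false);
	atomic_init(&q->dispatched, 0);
}

/* Lockless state check: a plain atomic load of the depth counter. */
static bool model_queue_quiesced(struct model_queue *q)
{
	return atomic_load_explicit(&q->quiesce_depth, memory_order_relaxed) > 0;
}

static void model_quiesce_nowait(struct model_queue *q)
{
	atomic_fetch_add_explicit(&q->quiesce_depth, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb__after_atomic() */
}

/*
 * Models atomic_dec_if_positive(): decrement only while the value is > 0.
 * Returns the old value minus one, so a negative result means "refused".
 */
static int model_dec_if_positive(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	do {
		if (old <= 0)
			return old - 1;
	} while (!atomic_compare_exchange_weak_explicit(v, &old, old - 1,
			memory_order_relaxed, memory_order_relaxed));
	return old - 1;
}

/* Claim the pending work at most once, even if both sides race to run it. */
static bool model_dispatch(struct model_queue *q)
{
	if (!atomic_exchange(&q->has_pending, false))
		return false;
	atomic_fetch_add(&q->dispatched, 1);
	return true;
}

/* Read side: store "pending", full fence, then load the quiesce state. */
static bool model_insert_and_run(struct model_queue *q)
{
	atomic_store_explicit(&q->has_pending, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	if (model_queue_quiesced(q))
		return false;	/* left for the unquiesce side to rerun */
	return model_dispatch(q);
}

/* Write side: drop the depth, full fence, then load "pending". */
static bool model_unquiesce(struct model_queue *q)
{
	int ret = model_dec_if_positive(&q->quiesce_depth);

	if (ret < 0)
		return false;	/* underflow; the kernel would WARN here */
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb__after_atomic() */
	if (ret > 0)
		return false;	/* still quiesced by another caller */
	return model_dispatch(q);
}

/* Single-threaded walk-through: the request inserted while quiesced is
 * picked up by the rerun on unquiesce. Returns the dispatch count. */
static int model_demo(void)
{
	struct model_queue q;
	bool ran, rerun;

	model_queue_init(&q);
	model_quiesce_nowait(&q);
	ran = model_insert_and_run(&q);
	rerun = model_unquiesce(&q);
	assert(!ran && rerun);
	return atomic_load(&q.dispatched);
}
```

Each side performs a store, then a full fence, then a load of the variable the other side stores. With sequentially consistent fences on both sides, at least one of two racing threads sees the other's store, so the pending request is dispatched by one of them. A read-only or release/acquire barrier would not give that guarantee for this store-buffer shape.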

blktests-ci Bot added new linus-master V6 labels May 6, 2026

blktests-ci Bot force-pushed the linus-master_base branch from 1f0d33a to b1870f6 Compare May 10, 2026 15:59

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from 16ab228 to fe6d375 Compare May 10, 2026 16:16

blktests-ci Bot added V7 and removed V6 labels May 12, 2026

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from fe6d375 to fb405e9 Compare May 12, 2026 06:35

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from fb405e9 to 8ee03f2 Compare May 12, 2026 17:58

blktests-ci Bot force-pushed the linus-master_base branch from b1870f6 to ca57796 Compare May 15, 2026 07:55

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from 8ee03f2 to 494b9e2 Compare May 15, 2026 08:05

blktests-ci Bot force-pushed the linus-master_base branch from ca57796 to c1feb59 Compare May 21, 2026 02:54

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from 494b9e2 to 73e448e Compare May 21, 2026 03:29

blktests-ci Bot force-pushed the linus-master_base branch from c1feb59 to ea833a1 Compare May 22, 2026 01:53

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from 73e448e to de77f90 Compare May 22, 2026 02:26

blktests-ci Bot force-pushed the linus-master_base branch from ea833a1 to 7af85d1 Compare May 23, 2026 06:11

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from de77f90 to f069506 Compare May 23, 2026 07:22

blktests-ci Bot force-pushed the linus-master_base branch from 7af85d1 to de94ac7 Compare May 23, 2026 17:08

blktests-ci Bot force-pushed the series/1090278=>linus-master branch from f069506 to 8ea3510 Compare May 23, 2026 17:52

blktests-ci[bot] wants to merge 1 commit into linus-master_base from series/1090278=>linus-master
