drbd: serialize UUID snapshot in drbd_md_write()#793

Open
blktests-ci[bot] wants to merge 1 commit into linus-master_base from series/1089070=>linus-master

@blktests-ci blktests-ci Bot commented May 4, 2026

Pull request for series with
subject: drbd: serialize UUID snapshot in drbd_md_write()
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1089070

blktests-ci Bot commented May 4, 2026

Upstream branch: 66edb90
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

blktests-ci Bot commented May 4, 2026

Upstream branch: 6d35786
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from 5d46c76 to 1401f0b Compare May 4, 2026 11:04
@blktests-ci blktests-ci Bot force-pushed the linus-master_base branch from 6f75bd1 to 1f0d33a Compare May 5, 2026 15:39

blktests-ci Bot commented May 5, 2026

Upstream branch: 6d35786
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from 1401f0b to 1e0831b Compare May 5, 2026 15:48
blktests-ci Bot pushed a commit that referenced this pull request May 10, 2026
…equeue_peeked

When sfb has children (e.g. a qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (e.g. tbf, as illustrated in patch #3) wants to retrieve an skb from
its child (sfb in this case), it will do the following:
 1a. do a peek() and, when it senses there is an skb the child can offer:
     - the child in this case (sfb) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasn't a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (sfb). And herein lies the problem.
     - sfb will call the child's dequeue() which will essentially just
       try to grab something off qfq's queue.

[  127.594489][  T453] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[  127.594741][  T453] CPU: 2 UID: 0 PID: 453 Comm: ping Not tainted 7.1.0-rc1-00035-gac961974495b-dirty #793 PREEMPT(full)
[  127.595059][  T453] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  127.595254][  T453] RIP: 0010:qfq_dequeue+0x35c/0x1650 [sch_qfq]
[  127.595461][  T453] Code: 00 fc ff df 80 3c 02 00 0f 85 17 0e 00 00 4c 8d 73 48 48 89 9d b8 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 76 0c 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b
[  127.596081][  T453] RSP: 0018:ffff88810e5af440 EFLAGS: 00010216
[  127.596337][  T453] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: dffffc0000000000
[  127.596623][  T453] RDX: 0000000000000009 RSI: 0000001880000000 RDI: ffff888104fd82b0
[  127.596917][  T453] RBP: ffff888104fd8000 R08: ffff888104fd8280 R09: 1ffff110211893a3
[  127.597165][  T453] R10: 1ffff110211893a6 R11: 1ffff110211893a7 R12: 0000001880000000
[  127.597404][  T453] R13: ffff888104fd82b8 R14: 0000000000000048 R15: 0000000040000000
[  127.597644][  T453] FS:  00007fc380cbfc40(0000) GS:ffff88816f2a8000(0000) knlGS:0000000000000000
[  127.597956][  T453] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  127.598160][  T453] CR2: 00005610aa9890a8 CR3: 000000010369e000 CR4: 0000000000750ef0
[  127.598390][  T453] PKRU: 55555554
[  127.598509][  T453] Call Trace:
[  127.598629][  T453]  <TASK>
[  127.598718][  T453]  ? mark_held_locks+0x40/0x70
[  127.598890][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599053][  T453]  sfb_dequeue+0x88/0x4d0
[  127.599174][  T453]  ? ktime_get+0x137/0x230
[  127.599328][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599480][  T453]  ? qdisc_peek_dequeued+0x7b/0x350 [sch_qfq]
[  127.599670][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599831][  T453]  tbf_dequeue+0x6b1/0x1098 [sch_tbf]
[  127.599988][  T453]  __qdisc_run+0x169/0x1900

The right thing to do in #1b is to grab the skb off the gso_skb queue.
This patchset fixes that issue by changing #1b to use the
qdisc_dequeue_peeked() method instead.

Fixes: e13e02a ("net_sched: SFB flow scheduler")
Signed-off-by: Victor Nogueria <victor@mojatatu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430152957.194015-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
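
The peek/dequeue mismatch described in steps 1a and 1b can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct pkt`, `struct child_q`, the `stash` field (standing in for the gso_skb queue), and all function names are hypothetical and are not the kernel's types or code. The model shows the packet being bypassed by a plain dequeue; it does not reproduce the null-pointer dereference from the trace above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an skb and a child qdisc; not kernel types. */
struct pkt { int id; };

struct child_q {
    struct pkt *queue[4];   /* pending packets, FIFO order */
    size_t head, tail;      /* queue[head..tail) is still pending */
    struct pkt *stash;      /* models the gso_skb slot filled by peek() */
};

/* Plain dequeue: takes from the main queue and never looks at the stash. */
struct pkt *child_dequeue(struct child_q *q)
{
    if (q->head == q->tail)
        return NULL;
    return q->queue[q->head++];
}

/* Models a peek that dequeues once and parks the packet in the stash. */
struct pkt *child_peek(struct child_q *q)
{
    if (!q->stash)
        q->stash = child_dequeue(q);
    return q->stash;
}

/* Models a dequeue that hands out the parked packet before anything else. */
struct pkt *child_dequeue_peeked(struct child_q *q)
{
    struct pkt *p = q->stash;

    if (p) {
        q->stash = NULL;
        return p;
    }
    return child_dequeue(q);
}

/* Walks through steps 1a and 1b on a queue holding a single packet. */
void model_selfcheck(void)
{
    struct pkt p = { 1 };
    struct child_q q = { .queue = { &p }, .head = 0, .tail = 1, .stash = NULL };

    assert(child_peek(&q) == &p);            /* 1a: packet is parked in the stash */
    assert(child_dequeue(&q) == NULL);       /* 1b, old path: stash is bypassed */
    assert(child_dequeue_peeked(&q) == &p);  /* 1b, fixed path: stash is drained */
}
```

In this model the parent that peeks and then calls the plain dequeue never receives the packet it was just shown, which is the inconsistency the commit message attributes to sfb calling its child's dequeue() directly.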
@blktests-ci blktests-ci Bot force-pushed the linus-master_base branch from 1f0d33a to b1870f6 Compare May 10, 2026 15:59

blktests-ci Bot commented May 10, 2026

Upstream branch: aa54b1d
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from 1e0831b to e312319 Compare May 10, 2026 16:07
@blktests-ci blktests-ci Bot force-pushed the linus-master_base branch from b1870f6 to ca57796 Compare May 15, 2026 07:55

blktests-ci Bot commented May 15, 2026

Upstream branch: 70eda68
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from e312319 to cce318d Compare May 15, 2026 08:29
@blktests-ci blktests-ci Bot force-pushed the linus-master_base branch from ca57796 to c1feb59 Compare May 21, 2026 02:54

blktests-ci Bot commented May 21, 2026

Upstream branch: 8bc67e4
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from cce318d to 52f2a85 Compare May 21, 2026 03:32
@blktests-ci blktests-ci Bot force-pushed the linus-master_base branch from c1feb59 to ea833a1 Compare May 22, 2026 01:53

blktests-ci Bot commented May 22, 2026

Upstream branch: 6779b50
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from 52f2a85 to 81293eb Compare May 22, 2026 02:29
@blktests-ci blktests-ci Bot force-pushed the linus-master_base branch from ea833a1 to 7af85d1 Compare May 23, 2026 06:11

blktests-ci Bot commented May 23, 2026

Upstream branch: 79bd2dd
series: https://patchwork.kernel.org/project/linux-block/list/?series=1089070
version: 1

drbd_md_write() copies device->ldev->md.uuid[] into the on-disk
metadata block without holding uuid_lock.

The write-side helpers drbd_uuid_new_current() and drbd_uuid_set_bm()
update md.uuid[] under uuid_lock, and some updates span multiple UUID
slots as one logical state transition. An unlocked drbd_md_write() can
therefore observe and persist a mixed UUID tuple assembled from two
different states.

This is problematic because the serialized UUID tuple is written to
stable storage and later consumed by reconnect and resync decision
logic, meaning an inconsistent on-disk snapshot can represent a state that
never existed atomically in memory.

Protect the UUID copy with uuid_lock so drbd_md_write() serializes one
coherent snapshot.

Fixes: b411b36 ("The DRBD driver")
Signed-off-by: Ziyu Zhang <ziyuzhang201@gmail.com>
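
The locking pattern described in the commit message above can be sketched in user space as follows. This is a minimal model, not the patch itself: `struct md_model`, `md_rotate_current()`, `md_snapshot()`, the four-slot layout, and the use of a pthread mutex in place of the kernel's uuid_lock are all assumptions made for illustration. The point it tries to show is that the reader copies every slot while holding the same lock the writers hold, so the copy can only reflect a state before or after a multi-slot update, never a mix.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define UUID_SLOTS 4   /* assumed slot count for this sketch */

/* Hypothetical model of the in-memory metadata; not the DRBD structure. */
struct md_model {
    pthread_mutex_t uuid_lock;      /* stands in for uuid_lock */
    uint64_t uuid[UUID_SLOTS];      /* stands in for md.uuid[] */
};

/* Writer: a two-slot update that forms one logical state transition. */
void md_rotate_current(struct md_model *md, uint64_t new_current)
{
    pthread_mutex_lock(&md->uuid_lock);
    md->uuid[1] = md->uuid[0];      /* old value of slot 0 moves to slot 1 */
    md->uuid[0] = new_current;      /* slot 0 takes the new value */
    pthread_mutex_unlock(&md->uuid_lock);
}

/* Reader: copy all slots under the lock to get one coherent snapshot. */
void md_snapshot(struct md_model *md, uint64_t out[UUID_SLOTS])
{
    pthread_mutex_lock(&md->uuid_lock);
    memcpy(out, md->uuid, sizeof(md->uuid));
    pthread_mutex_unlock(&md->uuid_lock);
}

/* Single-threaded walk-through: the snapshot reflects the whole update. */
void md_selfcheck(void)
{
    struct md_model md = {
        .uuid_lock = PTHREAD_MUTEX_INITIALIZER,
        .uuid = { 10, 20, 30, 40 },
    };
    uint64_t snap[UUID_SLOTS];

    md_rotate_current(&md, 99);
    md_snapshot(&md, snap);
    assert(snap[0] == 99 && snap[1] == 10);
}
```

Without the lock in `md_snapshot()`, a concurrent `md_rotate_current()` could run between the copies of slot 0 and slot 1, and the copy could pair a slot from the old state with a slot from the new one. The self-check is single-threaded and only demonstrates the intended result, not the race.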
@blktests-ci blktests-ci Bot force-pushed the series/1089070=>linus-master branch from 81293eb to f56d565 Compare May 23, 2026 07:25