nvme: check MUD support before firmware commit #3403

Open
guzebing1612-dev wants to merge 1 commit into linux-nvme:master from guzebing1612-dev:fw-commit-cache-smud

@guzebing1612-dev

This PR follows up on #3396 and the mailing-list patch:

https://lore.kernel.org/linux-nvme/20260519134542.1841435-1-guzebing1612@gmail.com/

The current code checks FRMW.SMUD only after Firmware Commit succeeds, so the Identify Controller command used for that check can be issued after firmware activation has already started.

This patch caches FRMW.SMUD before submitting Firmware Commit and uses the cached value when printing the completion MUD value. This avoids issuing Identify Controller after activation has started, and makes MUD handling use the pre-activation controller capability.
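The ordering change can be sketched with a small mock. This is not the nvme-cli or libnvme code: `mock_ctrl`, `mock_identify_frmw()`, `mock_fw_commit()` and both flow functions are hypothetical names introduced for illustration. It assumes SMUD is bit 5 of FRMW, and that an Identify issued while the controller is resetting fails with `ENXIO`, matching the "No such device or address" text in the linked issue title.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: SMUD is bit 5 of the Identify Controller FRMW field. */
#define FRMW_SMUD (1u << 5)

/* Hypothetical stand-in for a controller; not a libnvme type. */
struct mock_ctrl {
	uint8_t frmw;   /* FRMW reported by the running firmware */
	bool resetting; /* true once activation has started */
};

/* Models Identify Controller: rejected while the controller is resetting. */
static int mock_identify_frmw(const struct mock_ctrl *c, uint8_t *frmw)
{
	if (c->resetting)
		return -ENXIO;
	*frmw = c->frmw;
	return 0;
}

/* Models a successful Firmware Commit that triggers immediate activation. */
static int mock_fw_commit(struct mock_ctrl *c, uint32_t *result)
{
	*result = 1; /* arbitrary completion value for the sketch */
	c->resetting = true;
	return 0;
}

/* Old order: commit first, then Identify. Returns 1 to print MUD,
 * 0 to skip it, or a negative errno if Identify fails. */
static int commit_then_check(struct mock_ctrl *c, uint32_t *result)
{
	uint8_t frmw;
	int err = mock_fw_commit(c, result);

	if (err)
		return err;
	err = mock_identify_frmw(c, &frmw);
	if (err)
		return err;
	return (frmw & FRMW_SMUD) != 0;
}

/* New order: cache SMUD first, then commit and reuse the cached value. */
static int check_then_commit(struct mock_ctrl *c, uint32_t *result)
{
	uint8_t frmw;
	bool smud;
	int err = mock_identify_frmw(c, &frmw);

	if (err)
		return err;
	smud = (frmw & FRMW_SMUD) != 0;
	err = mock_fw_commit(c, result);
	if (err)
		return err;
	return smud;
}
```

In this mock, `commit_then_check()` reports an Identify error even though the commit itself succeeded, while `check_then_commit()` never touches the controller after the commit.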

Closes #3396

Tested:

  • make checkpatch-diff
  • meson test -C .build

Firmware Commit returns the Multiple Update Detected value in the command
completion. nvme-cli currently identifies the controller after the
command completes to decide whether to print that value.

That post-command Identify is fragile for immediate activation. A possible
Linux race looks like:

  nvme-cli thread              nvme driver AEN work
  ---------------              --------------------
  libnvme_exec_admin_passthru()
    -> Firmware Commit succeeds
                               nvme_handle_aen_notice()
                                 -> FW_ACT_STARTING
                                 -> nvme_change_ctrl_state(RESETTING)
                                 -> nvme_fw_act_work()
                                    -> nvme_quiesce_io_queues()
                                    -> wait for activation
  fw_commit_print_mud()
    -> fw_commit_support_mud()
       -> nvme_identify_ctrl()
          -> admin ioctl passthru
          -> nvme_user_cmd*()
          -> blk_mq_alloc_request()
          -> __nvme_check_ready()
             rejects user admin command while resetting

nvme-cli then prints "identify-ctrl: ..." after the successful
fw-commit output. The extra error makes it unclear whether Firmware
Commit itself failed, even though the command completion was already
successful.

The post-command Identify can also observe the wrong capability. MUD is
a bit in the Firmware Commit completion, so it should be interpreted
using the controller capability that applied when that command was
processed. If the old firmware does not report support for firmware
image overlap but the new firmware does, a post-activation Identify can
make nvme-cli interpret the old command completion using the new
firmware capability.
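The capability mismatch can be illustrated the same way. The sketch below is a hypothetical model, not the nvme-cli code: `fw_swap_ctrl` and its helpers are invented names, SMUD is assumed to be bit 5 of FRMW, and activation is assumed to complete before the post-command Identify runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: SMUD is bit 5 of the Identify Controller FRMW field. */
#define FRMW_SMUD (1u << 5)

/* Hypothetical controller whose FRMW changes when new firmware activates. */
struct fw_swap_ctrl {
	uint8_t frmw_old; /* capability of the firmware that takes the commit */
	uint8_t frmw_new; /* capability of the firmware being activated */
	bool activated;
};

/* Identify returns the capability of whichever firmware is running. */
static uint8_t swap_identify_frmw(const struct fw_swap_ctrl *c)
{
	return c->activated ? c->frmw_new : c->frmw_old;
}

/* Firmware Commit with immediate activation switches to the new image. */
static void swap_fw_commit(struct fw_swap_ctrl *c)
{
	c->activated = true;
}

/* Post-command check: reads SMUD from the new firmware. */
static bool report_mud_post(struct fw_swap_ctrl *c)
{
	swap_fw_commit(c);
	return (swap_identify_frmw(c) & FRMW_SMUD) != 0;
}

/* Cached check: reads SMUD from the firmware that processes the command. */
static bool report_mud_cached(struct fw_swap_ctrl *c)
{
	bool smud = (swap_identify_frmw(c) & FRMW_SMUD) != 0;

	swap_fw_commit(c);
	return smud;
}
```

With `frmw_old = 0` and `frmw_new = FRMW_SMUD`, the post-command variant would report a MUD value that the old firmware never claimed to support, while the cached variant would not.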

Read and cache SMUD before submitting Firmware Commit, and use the
cached value to decide whether to print the completion MUD value. This
ties MUD reporting to the firmware revision that accepted the command
and avoids a post-success Identify during firmware activation.

Fixes: b447292 ("nvme: Check fw-commit command support MUD")
Reviewed-by: Tokunori Ikegami <ikegami.t@gmail.com>
Signed-off-by: Guzebing <guzebing@bytedance.com>

Successfully merging this pull request may close these issues.

fw-commit prints "identify-ctrl: No such device or address" after successful Firmware Commit
