Skip to content

<fix>[host]: filter out disks of type loop and rom#3523

Open
zstack-robot-2 wants to merge 93 commits intofeature-zsv-5.0.0-vm-support-vtpm-and-secucebootfrom
sync/zstackio/rebase-5-0-0
Open

<fix>[host]: filter out disks of type loop and rom#3523
zstack-robot-2 wants to merge 93 commits intofeature-zsv-5.0.0-vm-support-vtpm-and-secucebootfrom
sync/zstackio/rebase-5-0-0

Conversation

@zstack-robot-2
Copy link
Collaborator

Resolves: ZSV-10987

Change-Id: I7277616f7378676d696772796b766676766c706b

sync from gitlab !9382

taogan21 and others added 30 commits January 15, 2026 14:38
Resolves: ZSV-10987

Change-Id: I7277616f7378676d696772796b766676766c706b
<fix>[host]: filter out disks of type loop and rom

See merge request zstackio/zstack!9023
This patch is for zsv_4.10.28

Resolves: ZSV-6434
Related: ZSV-9610

Change-Id: I7763786570626f646a636b767577616a6e726172
<fix>[account-import]: add progress for deleting ldap server

See merge request zstackio/zstack!9048
This patch is for zsv_4.10.28

Resolves: ZSV-11206

Change-Id: I77736869617878636b64756c656b7a666f777966
<chore>[conf]: update i18n json for mevoco module

See merge request zstackio/zstack!9056
This patch is for zsv_4.10.28

Related: ZSV-10444

Change-Id: I70636878647a7962636a64766b6c67756574756c
<chore>[conf]: update i18n files

See merge request zstackio/zstack!9058
update BatchCreateVmFailDeadlockCase, BatchCreateVmFailOnLocalStorageCase,
VmOperationMultyTypeStorageCase.

This patch is for zsv_4.10.28

Related: ZSV-8585

Change-Id: I71617374747574706576627a74646875746d6b68
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage

See merge request zstackio/zstack!9065
This patch is for zsv_4.10.28;
This patch is cherry-picked from ZSTAC-79487, see commit
bd5f6d6.

Resolves: ZSV-11252

Change-Id: I646379678a7923767067706e7a6e667a796f686d
<fix>[utils]: correct CPU model and MAC retrieval in license check

See merge request zstackio/zstack!9069
update LocalNfsMultiCombineCase and OnePsCreateVmCase

This patch is for zsv_4.10.28

Related: ZSV-8585

Change-Id: I64646671626761696b6a7577676a68787871726a
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage

See merge request zstackio/zstack!9073
update cases:
* CreateVmAssignNfsPsCase
* CreateDataVolumeWithOtherPlatformCase

This patch is for zsv_4.10.28

Related: ZSV-8585

Change-Id: I7262796b6b6b73786b786276677870676965796d
<test>[volume]: remove CreateVmInstance.dataDiskOfferingUuids usage

See merge request zstackio/zstack!9081
allow users to initizalize the primary storage in the disk block directly.
vm {
    disk {
        sizeGB(10)
        diskUsePrimaryStorage("nfs")
    }
}

Resolves: ZSV-11263

Change-Id: I6a7567646e6c64636279717576727a6878756476
<chore>[testlib]: support diskUsePrimaryStorage in disk block

See merge request zstackio/zstack!9078
update cases:
* CreateVmWithDesignatedPSCase
* CreateVmWithVolumeSpecifiedPsCase
* CreateVmHostAllocateCase

This patch is for zsv_4.10.28

Related: ZSV-8585

Change-Id: I68646b70716d7579676d6b6e756b6a637a676777
This patch is for zsv_4.10.28

Related: ZSV-10444

Change-Id: I68786e677666656666746868766d646c67616b69
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage

See merge request zstackio/zstack!9092
the systemTag for diskAO.diskOfferingUuid is not initialized in
the previous version and uses the default value of the systemTag.
so, now initialize the systemTag for each diskAO whose
diskOfferingUuid is not null.

Resolves: ZSV-11278

Change-Id: I616462756a656e646563716579716a6172687573
…ssageParamValidator

use the msg class name to replace the ApiMessageParamValidator class name

Resolves: ZSV-11283

Change-Id: I64707261797962716e6676756a6d6f6474626e72
<chore>[ceph]: support systemTag for diskAO.diskOfferingUuid

See merge request zstackio/zstack!9096
update cases:
* MultiPsStartVmCase
* CreateVmAssignPsCase
* MultiNfsAttachMultiClusterMultiHostCase

This patch is for zsv_5.0.0

Related: ZSV-8585

Change-Id: I7367677175776a6170646d6b6e7164656466656c
<fix>[message]: fix incorrect parameter of the format string in ApiMessageParamValidator

See merge request zstackio/zstack!9097
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage

See merge request zstackio/zstack!9095
…r modifying host password

after modifying the host password, the zstack MN will fail to reconnect
the hosts and get the error message that is not well translated.
we put the translated error message in the "i18nDetails" field.

Resolves: ZSV-8182

Change-Id: I617066656b656c6e74726374777573616b666f6e
<fix>[i18n]: translate the error message when reconnecting hosts after modifying host password

See merge request zstackio/zstack!9219
the volume can be deleted and if users want to recover the vm
through the snapshot that took after binding the aforesaid volume
, the system may complain the null pointer exception because we do
not check if the volumes are deleted or not.

Resolves: ZSV-5838

Change-Id: I636c626d7372676776637661766d626466707a72
zhong.zhou and others added 28 commits March 17, 2026 19:49
* Move HPKE host secret envelope crypto to premium
* Move logic of syncing public key to premium

Resolves: ZSPHER-113
Related: http://dev.zstack.io:9080/zstackio/premium/-/merge_requests/13121
Resolves: ZSV-11439
Related: ZSV-11310

Change-Id: I63706e6a7a75616366716574646a677a6e646276
* Added support for cross-VM cloning of host files (NvRam, TpmState)
  and TPM encryption key cloning
* KvmSecureBootManager transitioned from a Component to a service
* Introducing new message/response types, workflows,
  and TPM key backend interfaces along
  with their virtual implementations

Resolves: ZSV-11439
Related: ZSV-11310

Change-Id: I796b77716b6c64657469697a7a797a7268636e6e
Resolves: ZSV-11439
Related: ZSV-11310

Change-Id: I626c6775707a67657570736778766d6b75736570
Resolves: ZSV-11439
Related: ZSV-11310

Change-Id: I68696b7776656f78677679686370707a70616665
Related: ZSV-11439
Related: ZSV-11310

Change-Id: I65616264756468686477647a70716f72626f6365
DBImpact

Resolves: ZSV-11473

Change-Id: I676766646c6568687472776f6c776c6f72777279
* A new hook (afterRollbackPersistVmInstanceVO) has been added
  to the VM rollback path
* introduce the TPM key provider management interface and implementation
  (attach/detach/find) have been introduced and extended

Resolves: ZSV-11489
Related: ZSV-11310

Change-Id: I73646c617971626d77726d6f646467746f637075
@coderabbitai
Copy link

coderabbitai bot commented Mar 17, 2026

Walkthrough

此 PR 引入了 TPM(可信平台模块)、NvRam 和密钥提供程序支持,扩展了虚拟机生命周期管理、存储系统和 RBAC 权限控制,包括新增数据库架构、API 消息类型和底层的主机端文件管理功能。

Changes

Cohort / File(s) Summary
TPM 实体与数据模型
header/src/main/java/org/zstack/header/tpm/entity/*, sdk/src/main/java/org/zstack/sdk/tpm/entity/*
新增 TpmVO、TpmInventory、TpmCapabilityView 等 JPA 实体和 SDK 数据模型,支持 TPM 资源的数据持久化和 API 表示。
TPM API 消息与事件
header/src/main/java/org/zstack/header/tpm/api/*, header/src/main/java/org/zstack/header/tpm/message/*, sdk/src/main/java/org/zstack/sdk/tpm/api/*
新增 AddTpmMsg、RemoveTpmMsg、APIAddTpmEvent 等 20+ 个 TPM 相关的 API 消息、事件和对应的 SDK 操作类,支持 TPM 的创建、删除、查询和更新。
TPM 核心服务实现
compute/src/main/java/org/zstack/compute/vm/devices/Tpm*.java, plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpm*.java
新增 TpmApiInterceptor、VmTpmExtensions、VmTpmManager、KvmTpmManager、KvmTpmExtensions 等服务类,处理 TPM 的验证、持久化、KVM 集成和虚拟机克隆。
密钥提供程序基础设施
header/src/main/java/org/zstack/header/keyprovider/*, sdk/src/main/java/org/zstack/sdk/keyprovider/*
新增 KeyProviderInventory、KmsInventory、NkpInventory 等 20+ 个密钥管理 API 操作类,支持 KMS 和 NKP(网络密钥提供程序)的创建、查询、更新和证书管理。
宿主机密钥与秘密管理
header/src/main/java/org/zstack/header/host/HostKeyIdentity*.java, header/src/main/java/org/zstack/header/secret/*, plugin/kvm/src/main/java/org/zstack/kvm/HostSecretEnvelopeCryptoExtensionPoint.java, plugin/kvm/src/main/java/org/zstack/kvm/HostKeyIdentityHelper.java
新增宿主机公钥身份验证、秘密信封加密、DEK(数据加密密钥)管理等功能,支持与 KVM 宿主机的安全通信。
虚拟机主机文件管理
header/src/main/java/org/zstack/header/vm/additions/*, sdk/src/main/java/org/zstack/sdk/vm/entity/*
新增 VmHostFileVO、VmHostBackupFileVO、VmHostFileInventory 等实体,用于管理 NvRam 和 TPM 状态文件等虚拟机关联的宿主机侧文件。
虚拟机设备规范
header/src/main/java/org/zstack/header/vm/devices/VmDevicesSpec.java, header/src/main/java/org/zstack/header/vm/devices/TpmSpec.java
新增虚拟机设备规范类,允许在虚拟机创建消息中传递 TPM 和其他设备配置。
虚拟机生命周期扩展
compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java, compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java, header/src/main/java/org/zstack/header/vm/VmInstanceCreateExtensionPoint.java
扩展虚拟机创建流程,新增 afterPersistVmInstanceVO 和 afterRollbackPersistVmInstanceVO 扩展点,支持在虚拟机持久化前后执行自定义逻辑。
虚拟机 NvRam 支持
compute/src/main/java/org/zstack/compute/vm/VmDeleteVolumeFlow.java, compute/src/main/java/org/zstack/compute/vm/VmExpungeNvRamVolumeFlow.java, header/src/main/java/org/zstack/header/volume/VolumeType.java
新增 NvRam 卷类型支持,在虚拟机删除流程中添加 NvRam 卷清理逻辑。
Secure Boot 与 EDK 管理
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java, plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java, plugin/kvm/src/main/java/org/zstack/kvm/BootKvmStartVmExtension.java
新增 KVM Secure Boot、EDK(Enroll On Request Data Key)和 NvRam 文件同步功能,支持虚拟机启动时配置 UEFI Secure Boot 和 EDK 版本管理。
KVM 命令与响应扩展
plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
扩展 KVM 代理命令集,新增 TpmTO、NvRamTO、VmHostFileTO、PublicKey/Secret 相关命令,支持与 KVM 宿主机的 TPM 和秘密信息交互。
存储系统 NvRam 支持
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java, plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java, plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java
在 Ceph、本地存储和 NFS 存储后端中添加 NvRam 卷格式(RAW)和安装路径的处理逻辑。
虚拟机创建消息与API
header/src/main/java/org/zstack/header/vm/CreateVmInstanceMsg.java, header/src/main/java/org/zstack/header/vm/APICreateVmInstanceMsg.java, header/src/main/java/org/zstack/header/vm/InstantiateNewCreatedVmInstanceMsg.java
扩展虚拟机创建相关消息,新增 devicesSpec(设备规范)字段,支持在虚拟机创建时指定 TPM 和其他设备。
虚拟机克隆与快照
header/src/main/java/org/zstack/header/vm/APIConvertTemplatedVmInstanceToVmInstanceMsg.java, header/src/main/java/org/zstack/header/vm/APICreateVmInstanceFromVolumeSnapshotGroupMsg.java, plugin/kvm/src/main/java/org/zstack/kvm/efi/CloneVmHostFileMsg.java, plugin/kvm/src/main/java/org/zstack/kvm/tpm/CloneVmTpmMsg.java
新增虚拟机克隆时的 TPM 和 NvRam 文件克隆支持,包括 resetTpm 参数控制 TPM 状态重置。
RBAC 与权限
header/src/main/java/org/zstack/header/RBACInfo.java, header/src/main/java/org/zstack/header/identity/AccountConstant.java, header/src/main/java/org/zstack/header/tpm/RBACInfo.java, identity/src/main/java/org/zstack/identity/rbac/RBACAPIRequestChecker.java
新增"resource-viewer"角色和 ALL_RESOURCES_READABLE_ROLE_UUID,允许特定账户查询所有资源;更新 RBAC 检查以支持基于 isReadOnlyApi 的权限验证。
全局配置
compute/src/main/java/org/zstack/compute/legacy/ComputeLegacyGlobalProperty.java, compute/src/main/java/org/zstack/compute/vm/VmGlobalConfig.java, plugin/kvm/src/main/java/org/zstack/kvm/KVMGlobalConfig.java
新增全局配置项:enableNvRamTypeVolume、RESET_TPM_AFTER_VM_CLONE、ENABLE_UEFI_SECURE_BOOT、VM_EDK_VERSION_CONFIG。
数据库架构
conf/db/zsv/V5.0.0__schema.sql, build/deploydb.sh, conf/deploydb.sh
新增 10+ 个数据库表(TpmVO、VmHostFileVO、VmHostBackupFileVO、KeyProviderVO、KmsVO 等),扩展部署脚本以支持新的 ZSV 架构文件目录。
测试用例更新
test/src/test/groovy/org/zstack/test/integration/*
更新虚拟机创建测试用例,将基于 instanceOffering/diskOffering 的旧方式替换为基于 diskAOs 的新方式,反映虚拟机创建 API 的演进。

Sequence Diagram(s)

sequenceDiagram
    participant Client as API Client
    participant Manager as VmInstanceManager
    participant Extension as VmTpmExtensions
    participant DB as DatabaseFacade
    participant CloudBus as CloudBus

    Client->>Manager: CreateVmInstanceMsg
    Manager->>Manager: persistVmInstanceVO()
    Manager->>Extension: afterPersistVmInstanceVO(vo, msg)
    alt TPM Enabled
        Extension->>DB: persistTpmVO(tpmUuid, vmUuid)
        Extension->>Extension: attachKeyProvider(tpmUuid, keyProviderUuid)
    end
    Extension-->>Manager: Success
    
    alt VM Creation Succeeds
        Manager->>CloudBus: Reply Success
        Client-->>Client: VM Created
    else Rollback on Error
        Manager->>Extension: afterRollbackPersistVmInstanceVO(vo, msg)
        Extension->>DB: deleteTpmVO(tpmUuid)
        Extension->>Extension: detachKeyProvider(tpmUuid)
        Manager->>CloudBus: Reply Error
        Client-->>Client: Creation Failed
    end
Loading
sequenceDiagram
    participant Client as API Client
    participant KvmHost as KVM Host
    participant SecureBootExt as KvmSecureBootExtensions
    participant HostFile as VmHostFile Service

    Client->>KvmHost: StartVmCmd (with TpmTO, NvRamSpec)
    KvmHost->>SecureBootExt: beforeStartVmOnKvm()
    alt NvRam as Volume
        SecureBootExt->>HostFile: Create NvRam Volume
        SecureBootExt->>KvmHost: Inject Volume UUID
    else NvRam as Host File
        SecureBootExt->>HostFile: Prepare Host File (VmHostFileVO)
        SecureBootExt->>KvmHost: Set Host File Path
    end
    KvmHost->>KvmHost: Start VM with EDK & Secure Boot Config
    KvmHost-->>Client: VM Started
    
    note over SecureBootExt: Reads NvRam/TPM State<br/>after VM Shutdown
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Reasoning: 此 PR 是一个大规模、高复杂度的功能增强,涉及超过 20,000 行的新增代码,跨越计算、存储、KVM、SDK 和身份验证等多个核心模块。主要复杂性体现在:

  1. 新概念引入:TPM、NvRam、密钥管理、EDK 等多个新特性的整体设计和实现
  2. 虚拟机生命周期改造:修改了创建、克隆、删除等关键流程,涉及新的扩展点和回滚逻辑
  3. 多层交互:API 层、消息层、存储层、KVM 代理层的深度整合
  4. 数据库架构扩展:新增 10+ 个表,涉及复杂的外键和级联关系
  5. 异构变更模式:不同模块的变更方式差异较大,需要逐一理解设计意图

Poem

🐰 TPM 与 NvRam 轻轻落地,
密钥管理守护虚机起伏,
数据库表纷纷涌现,
安全启动闪闪发光,
代码森林中,信任之花绽放!

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/zstackio/rebase-5-0-0

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugin/sharedMountPointPrimaryStorage/src/main/java/org/zstack/storage/primary/smp/KvmBackend.java (1)

971-980: ⚠️ Potential issue | 🟠 Major

NvRam 在“预设 installPath”场景会被错误标记为 QCOW2。

当前先走 installPath 非空分支,会跳过 NvRam 的 RAW 赋值;随后 Line 993 使用 cmd.volumeFormat 回填,最终可能把 NvRam 卷写成 QCOW2。建议先按类型设置 volumeFormat,再处理 installPath 选择。

建议修复
 private void createEmptyVolume(final VolumeInventory volume, String hostUuid, final ReturnValueCompletion<InstantiateVolumeOnPrimaryStorageReply> completion) {
     final CreateEmptyVolumeCmd cmd = new CreateEmptyVolumeCmd();
+    boolean isNvRam = VolumeType.NvRam.toString().equals(volume.getType());
+    if (isNvRam) {
+        cmd.volumeFormat = VolumeConstant.VOLUME_FORMAT_RAW;
+    }

     if (StringUtils.isNotEmpty(volume.getInstallPath())) {
         cmd.installPath = volume.getInstallPath();
     } else if (VolumeType.Root.toString().equals(volume.getType())) {
         cmd.installPath = makeRootVolumeInstallUrl(volume);
     } else if (VolumeType.Data.toString().equals(volume.getType())) {
         cmd.installPath = makeDataVolumeInstallUrl(volume.getUuid());
-    } else if (VolumeType.NvRam.toString().equals(volume.getType())) {
+    } else if (isNvRam) {
         cmd.installPath = makeNvRamVolumeInstallUrl(volume.getUuid());
-        cmd.volumeFormat = VolumeConstant.VOLUME_FORMAT_RAW;
     } else {
         DebugUtils.Assert(false, "Should not be here");
     }

Also applies to: 992-994

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/sharedMountPointPrimaryStorage/src/main/java/org/zstack/storage/primary/smp/KvmBackend.java`
around lines 971 - 980, The issue is that when volume.getInstallPath() is
non-empty the code skips the VolumeType.NvRam branch and never sets
cmd.volumeFormat to RAW, causing NvRam volumes to be treated as QCOW2; modify
KvmBackend so you set type-specific fields (e.g., set cmd.volumeFormat =
VolumeConstant.VOLUME_FORMAT_RAW for VolumeType.NvRam) prior to the installPath
selection block (the code referencing volume.getInstallPath(), VolumeType.NvRam,
makeNvRamVolumeInstallUrl, makeRootVolumeInstallUrl and cmd.volumeFormat) so the
format is always correct even when installPath is preset.
plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java (1)

1102-1113: ⚠️ Potential issue | 🟠 Major

NvRam 在已有 installPath 场景下会被错误写成 QCOW2

当前先判断 installPath,会绕过 NvRam 分支,导致 cmd.getVolumeFormat() 仍是默认 QCOW2;随后 Line 1145 会把卷格式写错。建议将 NvRam 格式设置独立于安装路径分支。

建议修复
diff --git a/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java b/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java
@@
         cmd.setName(volume.getName());
         cmd.setSize(volume.getSize());
         cmd.setVolumeUuid(volume.getUuid());
+        if (volume.getType().equals(VolumeType.NvRam.toString())) {
+            cmd.setVolumeFormat(VolumeConstant.VOLUME_FORMAT_RAW);
+        }
+
         if (StringUtils.isNotEmpty(volume.getInstallPath())) {
             cmd.setInstallUrl(volume.getInstallPath());
         } else if (volume.getType().equals(VolumeType.Root.toString())) {
             cmd.setInstallUrl(NfsPrimaryStorageKvmHelper.makeRootVolumeInstallUrl(pinv, volume));
         } else if (volume.getType().equals(VolumeType.Data.toString())) {
@@
         } else if (volume.getType().equals(VolumeType.Cache.toString())) {
             cmd.setInstallUrl(NfsPrimaryStorageKvmHelper.makeDataVolumeInstallUrl(pinv, volume.getUuid()));
         } else if (volume.getType().equals(VolumeType.NvRam.toString())) {
             cmd.setInstallUrl(NfsPrimaryStorageKvmHelper.makeNvRamVolumeInstallUrl(pinv, volume.getUuid()));
-            cmd.setVolumeFormat(VolumeConstant.VOLUME_FORMAT_RAW);
         } else {
             throw new CloudRuntimeException(String.format("unknown volume type %s", volume.getType()));
         }

Also applies to: 1145-1145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java`
around lines 1102 - 1113, The NvRam volume format is only set inside the
installPath-less branch, so when volume.getInstallPath() is present NvRam stays
QCOW2; update NfsPrimaryStorageKVMBackend so that after the installUrl selection
logic (the block that calls NfsPrimaryStorageKvmHelper.makeRootVolumeInstallUrl
/ makeDataVolumeInstallUrl / makeNvRamVolumeInstallUrl) you add a type check for
VolumeType.NvRam and unconditionally call
cmd.setVolumeFormat(VolumeConstant.VOLUME_FORMAT_RAW) for that case (or move the
existing cmd.setVolumeFormat call out of the NvRam-only branch), ensuring NvRam
gets RAW format regardless of volume.getInstallPath().
🟠 Major comments (18)
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java-1733-1743 (1)

1733-1743: ⚠️ Potential issue | 🟠 Major

cmd.format 已动态计算,但返回卷格式仍被固定为 RAW,存在元数据不一致风险

这里允许非 NvRam 卷走 QCOW2,但在 Line 1758 仍固定 vol.setFormat(VolumeConstant.VOLUME_FORMAT_RAW),会把实际 qcow2 卷记录成 raw。建议将返回格式与 cmd.format 保持一致。

建议修改
@@
         if (type == VolumeType.NvRam) {
             cmd.format = VolumeConstant.VOLUME_FORMAT_RAW;
         } else {
             cmd.format = msg.hasSystemTag(VolumeSystemTags.FORMAT_QCOW2.getTagFormat()) ?
                     VolumeConstant.VOLUME_FORMAT_QCOW2 :
                     VolumeConstant.VOLUME_FORMAT_RAW ;
         }
@@
             public void success(CreateEmptyVolumeRsp ret) {
                 VolumeInventory vol = msg.getVolume();
                 vol.setInstallPath(buildEmptyVolumeInstallPath(finalPoolName, cmd.installPath, ret.getInstallPath()));
-                vol.setFormat(VolumeConstant.VOLUME_FORMAT_RAW);
+                vol.setFormat(cmd.getFormat());
                 vol.setActualSize(ret.actualSize);
                 reply.setVolume(vol);
                 bus.reply(msg, reply);
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java`
around lines 1733 - 1743, The volume format stored in CephPrimaryStorageBase is
inconsistent: cmd.format is computed conditionally (using VolumeType and
VolumeSystemTags) but the persisted VolumeVO is always set to
VolumeConstant.VOLUME_FORMAT_RAW via vol.setFormat(...), risking metadata
mismatch; update the code that sets the volume record (the vol.setFormat call)
to use the same computed value (cmd.format or the same conditional logic) so the
recorded format matches cmd.format (ensure references to VolumeType,
VolumeSystemTags.FORMAT_QCOW2 and VolumeConstant are reused).
header/src/main/java/org/zstack/header/storage/snapshot/VolumeSnapshotConstant.java-25-30 (1)

25-30: ⚠️ Potential issue | 🟠 Major

将公开状态常量改为不可变集合,避免运行时被篡改

当前 Arrays.asList(...) 生成的列表可被 set() 修改。作为 public 常量暴露后,外部代码可意外改写允许状态,导致快照校验规则漂移。

建议修复(保持现有 API 类型)
 import org.zstack.header.vm.VmInstanceState;
 
 import java.util.Arrays;
+import java.util.Collections;
 import java.util.List;
@@
-    public static final List<VmInstanceState> ALLOW_TAKE_SNAPSHOTS_VM_STATES = Arrays.asList(
+    public static final List<VmInstanceState> ALLOW_TAKE_SNAPSHOTS_VM_STATES = Collections.unmodifiableList(Arrays.asList(
             VmInstanceState.Running, VmInstanceState.Stopped, VmInstanceState.Paused
-    );
-    public static final List<VmInstanceState> ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES = Arrays.asList(
+    ));
+    public static final List<VmInstanceState> ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES = Collections.singletonList(
             VmInstanceState.Stopped
     );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/storage/snapshot/VolumeSnapshotConstant.java`
around lines 25 - 30, The public mutable lists ALLOW_TAKE_SNAPSHOTS_VM_STATES
and ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES (containing VmInstanceState values)
must be made unmodifiable so callers cannot mutate them at runtime; replace the
Arrays.asList(...) initializers with an immutable wrapper such as
Collections.unmodifiableList(Arrays.asList(...)) or
List.copyOf(Arrays.asList(...)) (keeping the public static final
List<VmInstanceState> type and the same elements) so the constants remain
read-only.
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java-1309-1311 (1)

1309-1311: ⚠️ Potential issue | 🟠 Major

NvRam 新增后,目录推导分支未补齐,存在断言失败风险

这里已经引入 VolumeType.NvRam 的安装路径与格式处理,但 makeVolumeInstallDir(VolumeInventory vol)(Line 1000-1014)仍未覆盖 NvRam。后续凡是走到该通用目录方法(例如 handle(ChangeVolumeTypeOnPrimaryStorageMsg ...))时,volPath 可能为空并在 Line 1012 触发断言。

建议修复
diff --git a/plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java b/plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java
@@
     public String makeVolumeInstallDir(VolumeInventory vol) {
         String volPath = null;
         if (VolumeType.Data.toString().equals(vol.getType())) {
             volPath = makeDataVolumeInstallUrl(vol.getUuid());
         } else if (VolumeType.Root.toString().equals(vol.getType())) {
             volPath = makeRootVolumeInstallUrl(vol);
         } else if (VolumeType.Memory.toString().equals(vol.getType())) {
             volPath = makeMemoryVolumeInstallUrl(vol);
         } else if (VolumeType.Cache.toString().equals(vol.getType())) {
             volPath = makeDataVolumeInstallUrl(vol.getUuid());
+        } else if (VolumeType.NvRam.toString().equals(vol.getType())) {
+            volPath = makeNvRamVolumeInstallUrl(vol.getUuid());
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java`
around lines 1309 - 1311, The code added handling for VolumeType.NvRam in
LocalStorageKvmBackend (setting install URL and RAW format) but did not update
makeVolumeInstallDir(VolumeInventory vol) to return a directory for NvRam, so
callers like handle(ChangeVolumeTypeOnPrimaryStorageMsg...) can get a null
volPath and hit the assertion; update makeVolumeInstallDir to include a branch
for VolumeType.NvRam that returns the correct install directory (use the same
naming/structure as makeNvRamVolumeInstallUrl or the existing volume dirs),
ensuring volPath is never null for NvRam volumes and keeping behavior consistent
with other volume types.
plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java-47-49 (1)

47-49: ⚠️ Potential issue | 🟠 Major

请补齐 makeVolumeInstallDir 的 NvRam 分支以避免断言失败

新增了 NvRam URL 构造,但 makeVolumeInstallDir 仍未覆盖 VolumeType.NvRam。当后续流程传入 NvRam 卷时,volPath 可能为空并触发断言失败。

建议修复
diff --git a/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java b/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java
@@
         } else if (VolumeType.Memory.toString().equals(vol.getType())) {
             volPath = makeMemoryVolumeInstallUrl(pinv, vol);
         } else if (VolumeType.Cache.toString().equals(vol.getType())) {
             volPath = makeDataVolumeInstallUrl(pinv, vol.getUuid());
+        } else if (VolumeType.NvRam.toString().equals(vol.getType())) {
+            volPath = makeNvRamVolumeInstallUrl(pinv, vol.getUuid());
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java`
around lines 47 - 49, makeVolumeInstallDir is missing handling for
VolumeType.NvRam, causing volPath to be null and assertions to fail when NvRam
volumes are used; update the method to include a branch for VolumeType.NvRam
that calls the new makeNvRamVolumeInstallUrl(PrimaryStorageInventory, String)
(and/or PrimaryStoragePathMaker.makeNvRamVolumeInstallPath) to build the NvRam
install path, assign it to volPath, and return/create the directory like other
volume types so assertions no longer trigger.
sdk/src/main/java/org/zstack/sdk/tpm/api/QueryTpmResult.java-5-12 (1)

5-12: ⚠️ Potential issue | 🟠 Major

inventories 需要指定泛型类型参数。

inventories 当前声明为原始 java.util.List 缺少泛型信息,Gson 反序列化时会将列表元素反序列化为 Map/LinkedTreeMap,而非 TpmInventory 对象,导致 SDK 用户无法直接使用查询结果。请将字段和 getter/setter 改为 java.util.List<TpmInventory>,并添加必要的 import 语句。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/src/main/java/org/zstack/sdk/tpm/api/QueryTpmResult.java` around lines 5
- 12, Change the raw inventories field and its accessor methods in
QueryTpmResult to use a generic type so Gson deserializes elements as
TpmInventory: update the field declaration inventories to
java.util.List<TpmInventory>, change setInventories and getInventories
signatures to accept/return java.util.List<TpmInventory>, and add the import for
org.zstack.sdk.tpm.api.TpmInventory (or the correct TpmInventory package) so the
compiler recognizes the type.
compute/src/main/java/org/zstack/compute/vm/devices/VmTpmManager.java-53-61 (1)

53-61: ⚠️ Potential issue | 🟠 Major

避免把可空资源配置直接拆箱成 boolean

getResourceConfigValue(vmUuid, Boolean.class) 返回的是包装类型;这里直接 returnboolean,在没有资源级值时会因为自动拆箱抛 NullPointerException。对“没有 TPM 且未单独配置 secure boot”的 VM,这条判断路径会直接把 NVRAM 注册流程打断。至少这里要先做空值收敛,并让全局默认值兜底。

🛠 建议修改
         ResourceConfig resourceConfig = resourceConfigFacade.getResourceConfig(ENABLE_UEFI_SECURE_BOOT.getIdentity());
-        return resourceConfig.getResourceConfigValue(vmUuid, Boolean.class);
+        return Boolean.TRUE.equals(resourceConfig.getResourceConfigValue(vmUuid, Boolean.class));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/devices/VmTpmManager.java` around
lines 53 - 61, The method needRegisterNvRam unboxes a nullable Boolean from
resourceConfig.getResourceConfigValue(vmUuid, Boolean.class) which can throw
NPE; change it to first read the value into a Boolean (e.g., Boolean cfg =
resourceConfig.getResourceConfigValue(vmUuid, Boolean.class)), avoid direct
unboxing by returning Boolean.TRUE.equals(cfg) when a per-VM value exists, and
if cfg is null fall back to the global default for ENABLE_UEFI_SECURE_BOOT
(retrieve via the ResourceConfig instance for ENABLE_UEFI_SECURE_BOOT or use a
sensible default) so that needRegisterNvRam (and symbols TpmVO /
TpmVO_.vmInstanceUuid, resourceConfigFacade, ENABLE_UEFI_SECURE_BOOT) never
causes a NullPointerException.
build/deploydb.sh-45-45 (1)

45-45: ⚠️ Potential issue | 🟠 Major

不要把 zsv 全量 schema 和常规 upgrade 链路混到同一次 Flyway 迁移里。

Line 44 已经把 conf/db/upgrade/* 放进 flyway_sql,现在 Line 45 又把 conf/db/zsv/* 放进同一个目录;而 Lines 55-58 会对这里的全部版本脚本执行一次 migrate。如果 zsv 目录里包含 V5.0.0__schema.sql 这类全量 schema,初始化时会在已有升级链路之后再次回放 DDL,很容易因为对象已存在或版本链冲突直接失败。建议把 upgradezsv 做成互斥的迁移来源,不要无条件混跑。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@build/deploydb.sh` at line 45, The script currently unconditionally copies
conf/db/upgrade/* and conf/db/zsv/* into the same ${flyway_sql} directory (the
cp command that copies ${base}/../conf/db/zsv/* into ${flyway_sql}), which
causes Flyway to run both upgrade (Lines 44) and zsv full-schema scripts
together and can replay full DDL during migrate; change the logic to make
upgrade and zsv mutually exclusive by either (a) only copying zsv when no
upgrade scripts are present, (b) copying zsv into a separate directory and point
Flyway to one location at a time, or (c) add a flag/env var to choose between
upgrade vs zsv migration; specifically modify the cp logic around the
${flyway_sql} assignment and the cp ${base}/../conf/db/zsv/* ${flyway_sql}
command so Flyway’s migrate operates on only one source.
conf/db/zsv/V5.0.0__schema.sql-3-10 (1)

3-10: ⚠️ Potential issue | 🟠 Major

TpmVO.vmInstanceUuid 加唯一约束。

现在只有外键,没有把 “一台 VM 只能挂一个 TPM” 真正落到数据库层。只要这张表能写出同一个 vmInstanceUuid 的多行,按 VM 反查 TPM 的路径就会变成不确定行为,后续的 update/remove/get-capability 都可能命中错误记录。

🛠️ 建议修改
 CREATE TABLE IF NOT EXISTS `zstack`.`TpmVO` (
     `uuid` char(32) NOT NULL UNIQUE,
     `vmInstanceUuid` char(32) NOT NULL,
     `lastOpDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     `createDate` timestamp NOT NULL DEFAULT '1999-12-31 23:59:59',
     PRIMARY KEY (`uuid`),
+    UNIQUE KEY `ukTpmVOVmInstanceUuid` (`vmInstanceUuid`),
     CONSTRAINT `fkTpmVOVmInstanceVO` FOREIGN KEY (`vmInstanceUuid`) REFERENCES `VmInstanceEO` (`uuid`) ON UPDATE RESTRICT ON DELETE CASCADE
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Based on learnings, TpmMessageAutoCompleter 会在只提供 tpmUuidvmInstanceUuid 时自动补全另一侧参数,这隐含 VM 与 TPM 之间必须是一对一映射。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conf/db/zsv/V5.0.0__schema.sql` around lines 3 - 10, Add a unique constraint
on TpmVO.vmInstanceUuid so the DB enforces one-to-one VM→TPM mapping; alter the
DDL for table TpmVO (or add an ALTER TABLE statement) to create a UNIQUE
index/constraint on the vmInstanceUuid column (in addition to the existing
foreign key fkTpmVOVmInstanceVO) so inserts/updates cannot create multiple rows
with the same vmInstanceUuid.
plugin/kvm/src/main/java/org/zstack/kvm/HostKeyIdentityHelper.java-66-68 (1)

66-68: ⚠️ Potential issue | 🟠 Major

空白公钥也要撤销旧的 verified 状态。

saveOrUpdateHostKeyIdentity() 的注释和后面几个非法 key 分支都在做“降级为未验证”,但 Line 66-68 这里直接返回了。若某个 host 之前已经 verified=true,现在返回空白公钥,会把旧的信任状态保留下来。

💡 建议修改
     public static void saveOrUpdateHostKeyIdentity(DatabaseFacade dbf, String hostUuid, String publicKey, boolean verified) {
         if (StringUtils.isBlank(publicKey)) {
+            setVerified(dbf, hostUuid, false);
             return;
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/HostKeyIdentityHelper.java` around
lines 66 - 68, The early return in HostKeyIdentityHelper when publicKey is blank
leaves any existing HostKeyIdentity.verified=true unchanged; instead, in
saveOrUpdateHostKeyIdentity handle the blank publicKey by retrieving the
HostKeyIdentity for the given host (via hostUuid/HostKeyIdentity lookup) and
explicitly downgrade it to unverified (set verified=false and persist/update
using the same save/update path used by other invalid-key branches) so previous
trust is revoked when an empty key is provided.
compute/src/main/java/org/zstack/compute/vm/devices/VmTpmExtensions.java-88-91 (1)

88-91: ⚠️ Potential issue | 🟠 Major

优先用 VmInstanceSpec 里的 boot mode,别回查 system tag。

afterBuildVmSpec 已经拿到了 VmInstanceSpec。Line 88-91 再按 vmUuid 回查 BOOT_MODE,在创建链路里很容易读不到尚未持久化的 tag,导致 UEFI + Secure Boot 的 VM 漏掉 NvRam 注册。这里应先读 spec.getBootMode(),必要时再 fallback 到 tag 查询。

💡 建议修改
-            String bootMode = VmSystemTags.BOOT_MODE.getTokenByResourceUuid(vmUuid, VmSystemTags.BOOT_MODE_TOKEN);
+            String bootMode = spec.getBootMode();
+            if (bootMode == null) {
+                bootMode = VmSystemTags.BOOT_MODE.getTokenByResourceUuid(vmUuid, VmSystemTags.BOOT_MODE_TOKEN);
+            }
             if (vmTpmManager.isUefiBootMode(bootMode)) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/devices/VmTpmExtensions.java`
around lines 88 - 91, In VmTpmExtensions (afterBuildVmSpec) don't read BOOT_MODE
via system tag by vmUuid; use the VmInstanceSpec's boot mode first (call
spec.getBootMode()) when determining UEFI mode for
vmTpmManager.isUefiBootMode(...) and only fallback to
VmSystemTags.BOOT_MODE.getTokenByResourceUuid(vmUuid, ...) if spec.getBootMode()
is null/empty; keep the existing
resourceConfigFacade.getResourceConfig(ENABLE_UEFI_SECURE_BOOT.getIdentity())
and needRegisterNvRam logic but base the initial bootMode value off
spec.getBootMode() to avoid missing unpersisted tags.
compute/src/main/java/org/zstack/compute/vm/devices/TpmMessageAutoCompleter.java-50-57 (1)

50-57: ⚠️ Potential issue | 🟠 Major

把“TPM 不存在”和“TPM/VM 不匹配”分开返回。

Line 50-57 和 Line 69-73 现在只要按 tpmUuid 查不到记录,就要么报 “not consistent”,要么把 vmInstanceUuid 补成 null。这样一来,只传 tpmUuid 的查询/更新请求在 TPM 已删除时会落到后面的 KVM 校验,最终返回 NOT_SUPPORTED,而不是 TPM_NOT_FOUND。建议先查出实际 vmInstanceUuid:为空时对非删除消息抛 TPM_NOT_FOUND,只有查到了 TPM 但归属 VM 不同才报参数不一致。

Also applies to: 69-73

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compute/src/main/java/org/zstack/compute/vm/devices/TpmMessageAutoCompleter.java`
around lines 50 - 57, In TpmMessageAutoCompleter, change the current combined
check that queries by tpmUuid and immediately treats missing records as "not
consistent": first load the TpmVO by tpmUuid (using Q.New(TpmVO.class) /
TpmVO_.uuid) and if the VO is null then—unless the incoming message is a
DeleteMessage—throw a TPM_NOT_FOUND style error (instead of nulling
vmInstanceUuid); if the VO exists, set vmUuid from VO.vmInstanceUuid and only
then validate that the provided vmUuid matches, throwing
ApiMessageInterceptionException(argerr(...)) when they differ. Apply the same
refactor to the second block (lines around 69-73) that currently repeats the
logic.
compute/src/main/java/org/zstack/compute/vm/devices/TpmApiInterceptor.java-75-85 (1)

75-85: ⚠️ Potential issue | 🟠 Major

删除空操作应在 KVM 校验前短路返回。

APIRemoveTpmMsg 允许只传 tpmUuid。当该 TPM 已不存在时,自动补全拿不到 vmInstanceUuid,Line 76 会先抛 NOT_SUPPORTED,根本到不了 Line 81-85 的幂等成功分支。这里应该先对“只传 tpmUuid 且记录不存在”的情况直接发布 APIRemoveTpmEvent 并停止路由。

💡 建议修改
     private void validate(APIRemoveTpmMsg msg) {
+        if (msg.getVmInstanceUuid() == null && msg.getTpmUuid() != null) {
+            boolean tpmExists = Q.New(TpmVO.class)
+                    .eq(TpmVO_.uuid, msg.getTpmUuid())
+                    .isExists();
+            if (!tpmExists) {
+                APIRemoveTpmEvent evt = new APIRemoveTpmEvent(msg.getId());
+                bus.publish(evt);
+                throw new StopRoutingException();
+            }
+        }
+
         makeSureVmInstanceIsKvmType(msg.getVmInstanceUuid());
 
         boolean tpmExists = Q.New(TpmVO.class)
                 .eq(TpmVO_.vmInstanceUuid, msg.getVmInstanceUuid())
                 .isExists();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/devices/TpmApiInterceptor.java`
around lines 75 - 85, 在 validate(APIRemoveTpmMsg msg) 中,先判断 TPM 记录是否存在再去调用
makeSureVmInstanceIsKvmType;具体来说:使用 the
Q.New(TpmVO.class).eq(TpmVO_.vmInstanceUuid,
msg.getVmInstanceUuid()).isExists()(并且当 msg 只带 tpmUuid 无法补全 vmInstanceUuid
时也要能检测“记录不存在”的情形)如果记录不存在则立即构造并发布 APIRemoveTpmEvent(msg.getId()) 并抛出
StopRoutingException,以短路返回并避免先执行 makeSureVmInstanceIsKvmType 导致的 NOT_SUPPORTED
异常。
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java-217-227 (1)

217-227: ⚠️ Potential issue | 🟠 Major

不要把单个主机文件读取失败升级成整次同步失败

Line 223-227 和 Line 278-282 现在会把 to.getError() 聚合后直接 completion.fail(...)。这会把可容忍的单文件读错升级成整次 syncVmHostFilesFromHost() 失败,并把上层流程推到降级分支。这里应该记录 warning 并继续处理其它文件;只有 agent/RPC 级别失败才应返回失败。

建议修改
-                List<ErrorCode> errors = new ArrayList<>();
                 for (String path : cmd.getPaths()) {
                     VmHostFileTO to = findOneOrNull(readRsp.getHostFiles(), item -> item.getPath().equals(path));
                     if (to == null) {
                         continue;
                     }
                     if (to.getError() != null) {
-                        errors.add(operr("failed to read file %s", path)
-                                .withOpaque("path", path)
-                                .withException(to.getError()));
+                        logger.warn(String.format(
+                                "failed to read vm host file[path=%s] from host[uuid=%s], continue with cached content: %s",
+                                path, context.hostUuid, to.getError().getReadableDetails()));
                         continue;
                     }
-                if (errors.isEmpty()) {
-                    completion.success();
-                } else {
-                    completion.fail(operr("failed to read file content from host[uuid=%s]", context.hostUuid).withCause(errors));
-                }
+                completion.success();

Based on learnings: In ZStack's KvmSecureBootExtensions.syncVmHostFilesFromHost method, when reading individual host files (NvRam/TpmState) fails (to.getError() != null), it is intentional to log a warning and continue rather than failing the whole operation.

Also applies to: 278-282

plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java-5412-5418 (1)

5412-5418: ⚠️ Potential issue | 🟠 Major

遇到 key rotate / integrity mismatch 时要让本地 host key 缓存失效。

HostKeyIdentityHelper 已经把 KEY_AGENT_KEYS_NOT_ON_DISKKEY_AGENT_KEY_FILES_INTEGRITY_MISMATCH 定义为需要 rotate/re-fetch 的场景,但这里仅透传错误,没有把当前 host 的 verified 状态打回。这样后续重试仍会继续使用陈旧公钥,直到人工 reconnect 才能恢复。

💡 建议修改
-                    if (rsp != null && rsp.getError() != null) {
+                    if (rsp != null && rsp.getError() != null) {
+                        if (HostKeyIdentityHelper.isRotateNeededGetError(rsp.getError())) {
+                            HostKeyIdentityHelper.setVerified(dbf, hostUuid, false);
+                        }
                         ErrorCode err = new ErrorCode();
                         err.setCode(rsp.getError());
                         err.setDetails(rsp.getError());
                         reply.setError(err);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 5412 -
5418, The reply path currently only forwards the error from the agent without
invalidating the host key state; when rsp.getError() equals
HostKeyIdentityHelper.KEY_AGENT_KEYS_NOT_ON_DISK or
HostKeyIdentityHelper.KEY_AGENT_KEY_FILES_INTEGRITY_MISMATCH you must mark the
host's verified/key-cache as invalid so it will re-fetch/rotate keys on retry.
Locate the block in KVMHost where rsp and reply are handled (the branch that
creates ErrorCode err and the else path using operr(...)) and, before setting
reply.setError(...), call the existing host key invalidation/verification update
method (or add a short helper) to set the host's verified=false / clear its
cached key entry for this host, ensuring subsequent reconnect attempts will
re-fetch the keys rather than reuse stale ones.
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java-5407-5410 (1)

5407-5410: ⚠️ Potential issue | 🟠 Major

成功分支需要校验 secretUuid 非空。

SecretHostDefineReply 的业务结果只有 secretUuid。这里如果 agent 返回 success=true 但 UUID 为空,调用方会拿到“成功但不可用”的结果,后续很容易在落库或再次引用时出错。

💡 建议修改
-                if (rsp != null && rsp.isSuccess()) {
-                    if (rsp.getSecretUuid() != null) {
-                        reply.setSecretUuid(rsp.getSecretUuid());
-                    }
+                if (rsp != null && rsp.isSuccess()) {
+                    if (StringUtils.isBlank(rsp.getSecretUuid())) {
+                        reply.setError(operr("agent returned success but secretUuid is empty"));
+                    } else {
+                        reply.setSecretUuid(rsp.getSecretUuid());
+                    }
                 } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 5407 -
5410, The success path currently trusts rsp.isSuccess() without verifying
rsp.getSecretUuid(); update the block in KVMHost where SecretHostDefineReply is
constructed (referencing rsp and reply) to validate rsp.getSecretUuid() is
non-null/non-empty when rsp.isSuccess() is true, and if the UUID is missing
treat it as a failure (set reply.setSuccess(false) and populate a clear error
message or exception) so callers never receive a "success" with a null
secretUuid.
plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java-408-422 (1)

408-422: ⚠️ Potential issue | 🟠 Major

encryptedDek@NoLogging

这是 DEK 的密文封装,不应该进入请求/错误日志。当前字段没有像同文件里的 chapPasswordvmXmlcontentBase64 那样做脱敏,调试或异常日志一旦序列化整条命令,就会泄露密钥材料。

🔒 建议修改
 public static class SecretHostDefineCmd extends AgentCommand {
     /** Base64 envelope of DEK; agent expects this field name (encryptedDek). */
+    `@NoLogging`
     private String encryptedDek;
     private String vmUuid;
     private String purpose;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java` around lines
408 - 422, Add the `@NoLogging` annotation to the encryptedDek sensitive field in
SecretHostDefineCmd so it is excluded from request/error logging like
chapPassword/vmXml/contentBase64; apply the annotation to the private String
encryptedDek declaration (and mirror the pattern used for other sensitive
fields—on the field/getter if your project convention requires it) and add the
corresponding import for the NoLogging annotation so the class compiles.
plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpmManager.java-490-492 (1)

490-492: ⚠️ Potential issue | 🟠 Major

不要暴露一个永远失败的 UpdateTpm API。

这个 PR 同时新增了 sdk/src/main/java/org/zstack/sdk/tpm/api/UpdateTpmAction.java,但这里对 APIUpdateTpmMsg 直接抛 NOT_SUPPORTED。结果是新增的 UpdateTpm API 无论参数是否正确都必然失败。要么在这里补齐实际更新逻辑,要么不要注册对应的 API / SDK 入口。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpmManager.java` around lines
490 - 492, The handler method for APIUpdateTpmMsg in KvmTpmManager currently
always throws NOT_SUPPORTED, which leaves the newly added SDK UpdateTpmAction
unusable; either implement the update flow in the handle(APIUpdateTpmMsg msg)
method (perform validation of msg parameters, invoke the TPM update logic in the
appropriate class/component, handle success/failure and reply with proper API
events) referencing KvmTpmManager.handle(APIUpdateTpmMsg) and any TPM update
helper classes, or remove/unregister the corresponding API/SDK entry (the
generated UpdateTpmAction / APIUpdateTpmMsg registration) so the SDK does not
expose a never-working endpoint; pick one approach and update the code paths and
tests accordingly.
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java-282-293 (1)

282-293: ⚠️ Potential issue | 🟠 Major

按内容缺失而不是按文件缺失回退到 backup。

这里仅在 context.files 缺少类型时才查询 VmHostBackupFileVO。如果源 VM 已经有 VmHostFileVO,但上一步从宿主机读取失败且数据库里还没有对应的 VmHostFileContentVO,后面的 Line 322-324 会直接跳过,目标 VM 就会静默丢掉这份 NvRam / TPM state。建议改成按 type 判断“是否拿到了可用 content”,再决定是否回退到 backup,不要只看 VmHostFileVO 是否存在。

Based on learnings, "individual host file sync failures should be tolerated and must not cause the entire completion to fail."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java` around
lines 282 - 293, The current logic treats a file type as present if a
VmHostFileVO exists in context.files, but we must treat it as present only if
usable content exists (VmHostFileContentVO), otherwise fall back to backups;
change the computation of missingTypes (where context.typesNeedClone is compared
to transform(context.files, VmHostFileVO::getType)) to instead consider types
for which no valid content was obtained—e.g., inspect context.files entries for
each VmHostFileVO and filter out those that have associated VmHostFileContentVO
(or a non-null content flag) and treat the remainder as missingTypes; then query
VmHostBackupFileVO for those missing types (as currently done with
Q.New(...).in(..., missingTypes).list()) so individual host file read failures
fall back to backups rather than silently dropping NvRam/TPM state.
🟡 Minor comments (8)
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java-175-176 (1)

175-176: ⚠️ Potential issue | 🟡 Minor

错误消息格式不一致:多余的花括号

错误消息中包含多余的花括号 {...},与本文件中其他 operr() 调用的格式不一致(参见 79、107、133 行)。

🔧 建议修复
-                    errs.add(operr("{the physical capacity usage of the host[uuid:%s] has exceeded the threshold[%s]}",
+                    errs.add(operr("the physical capacity usage of the host[uuid:%s] has exceeded the threshold[%s]",
                             ref.getHostUuid(), physicalCapacityMgr.getRatio(ref.getPrimaryStorageUuid())));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java`
around lines 175 - 176, The error message constructed in the
errs.add(operr(...)) call contains extraneous curly braces around the formatted
string causing inconsistent message format with other operr() calls; update the
errs.add invocation that uses ref.getHostUuid() and
physicalCapacityMgr.getRatio(ref.getPrimaryStorageUuid()) to remove the
surrounding "{...}" so the message matches the style of other calls (e.g., at
lines with similar operr usages).
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java-181-181 (1)

181-181: ⚠️ Potential issue | 🟡 Minor

错误消息语法问题

"failed allocate localstorage" 存在语法错误,建议修改为 "failed to allocate local storage""local storage allocation failed"

🔧 建议修复
-                ret.errorCode = operr("failed allocate localstorage").withCause(errs);
+                ret.errorCode = operr("failed to allocate local storage").withCause(errs);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java`
at line 181, The error message assigned to ret.errorCode in
LocalStorageMainAllocatorFlow (the operr("failed allocate localstorage") call)
has a grammar/spacing issue; update the operr call to a clear, grammatically
correct message such as operr("failed to allocate local storage") (or
operr("local storage allocation failed")) so "localstorage" becomes two words
and the phrase reads correctly.
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java-924-954 (1)

924-954: ⚠️ Potential issue | 🟡 Minor

应在用户配置指定主存储时,为数据盘设置主存储 UUID。

对比同文件中的 settingRootVolume(第 811 行)和 settingDataVolume(第 904 行),当用户配置中指定了主存储时,这两个方法都会分别调用 msg.setPrimaryStorageUuidForRootVolume()msg.setPrimaryStorageUuidForDataVolume()

但在 settingDataDisks 方法中,虽然代码检查了 config.getAllocate().getPrimaryStorage()(第 943 行),但仅用其值设置了 Ceph 池系统标签,并未调用 diskAO.setPrimaryStorageUuid(config.getAllocate().getPrimaryStorage().getUuid()) 来设置主存储 UUID。

建议在第 943 行的条件块内,补充对主存储 UUID 的设置,以保持与根卷和数据卷处理逻辑的一致性:

if (config.getAllocate().getPrimaryStorage() instanceof CephPrimaryStorageAllocateConfig) {
    // 设置主存储 UUID
    diskAO.setPrimaryStorageUuid(config.getAllocate().getPrimaryStorage().getUuid());
    
    CephPrimaryStorageAllocateConfig primaryStorageAllocateConfig = (CephPrimaryStorageAllocateConfig) config.getAllocate().getPrimaryStorage();
    // ... 现有的池标签设置逻辑
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java`
around lines 924 - 954, In settingDataDisks, when
config.getAllocate().getPrimaryStorage() is a CephPrimaryStorageAllocateConfig
you must also set the primary storage UUID on the DiskAO (call
diskAO.setPrimaryStorageUuid(config.getAllocate().getPrimaryStorage().getUuid()))
before applying Ceph pool system tags; locate the if-check that casts to
CephPrimaryStorageAllocateConfig and add the diskAO.setPrimaryStorageUuid(...)
call there (keep the existing CephSystemTags.USE_CEPH_PRIMARY_STORAGE_POOL logic
intact).
header/src/main/java/org/zstack/header/tpm/api/APIRemoveTpmMsgDoc_zh_cn.groovy-20-40 (1)

20-40: ⚠️ Potential issue | 🟡 Minor

tpmUuid / vmInstanceUuid 的“至少传一个”约束写进文档。

Lines 25-40 里两个字段都标成了 optional true,但当前文档没有说明它们是二选一关系,调用方很容易误以为空请求也合法。实际只有先提供其中一个时,前置补全器才会从库里补齐另一个;两个都为空时仍会在后续校验里失败。建议在请求 desc 或这两个参数的 desc 里明确“二选一,至少传一个”。
Based on learnings, TpmMessageAutoCompleter 会在 TpmApiInterceptor 之前为 APIRemoveTpmMsg 自动补全另一个字段,但前提是已提供 tpmUuidvmInstanceUuid 之一。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/tpm/api/APIRemoveTpmMsgDoc_zh_cn.groovy`
around lines 20 - 40, 两个参数 tpmUuid 和 vmInstanceUuid 都被标为
optional,但文档未说明它们是“二选一至少需传一个”的关系;请在 APIRemoveTpmMsg 的顶层 desc 或每个参数的 desc
中明确写出“二选一,至少传一个(提供任一项后 TpmMessageAutoCompleter 会补齐另一个,两个都为空会在 TpmApiInterceptor
的后续校验失败)”,并保留原有 optional 标注以匹配代码逻辑(参照参数名 tpmUuid、vmInstanceUuid 以及类/流程标识
APIRemoveTpmMsg、TpmMessageAutoCompleter、TpmApiInterceptor 以便定位修改位置)。
header/src/main/java/org/zstack/header/vm/APICreateVmInstanceMsg.java-485-498 (1)

485-498: ⚠️ Potential issue | 🟡 Minor

setDevices() 后可能导致 devicesSpec 缓存过期。

如果在调用 getDevicesSpec() 之后再调用 setDevices(),缓存的 devicesSpec 将与新的 devices 不一致。建议在 setDevices() 中清除缓存。

🐛 建议的修复
 public void setDevices(Map<String, Object> devices) {
     this.devices = devices;
+    this.devicesSpec = null;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@header/src/main/java/org/zstack/header/vm/APICreateVmInstanceMsg.java` around
lines 485 - 498, setDevices currently replaces the underlying devices map but
does not invalidate the cached devicesSpec, which can leave devicesSpec
out-of-sync after calling getDevicesSpec; update setDevices(Map<String,Object>
devices) to clear/reset devicesSpec (e.g. set devicesSpec to null) whenever
devices is changed so that subsequent getDevicesSpec() will rehydrate from the
new devices via JSONObjectUtil.rehashObject, and keep
setDevicesSpec(VmDevicesSpec) behavior unchanged.
header/src/main/java/org/zstack/header/tpm/api/APIUpdateTpmMsgDoc_zh_cn.groovy-24-41 (1)

24-41: ⚠️ Potential issue | 🟡 Minor

补充 vmInstanceUuid / tpmUuid 的联合必填说明。

文档把这两个字段都标成了 optional true,但没有说明“至少提供一个”。客户端按当前文档构造只带 keyProviderUuid 的请求时,会在后续校验阶段被拒绝。建议在字段描述或请求说明里明确二选一必填,以及缺失的一项会被自动补全;如果这是生成文件,也建议在生成源头同步修正。

🧭 建议补充到文档中的约束
 				column {
 					name "vmInstanceUuid"
 					enclosedIn "updateTpm"
-					desc "虚拟机 UUID"
+					desc "虚拟机 UUID,与 tpmUuid 二选一必填;只提供其一时另一项会被自动补全"
 					location "body"
 					type "String"
 					optional true
 					since "5.0.0"
 				}
 				column {
 					name "tpmUuid"
 					enclosedIn "updateTpm"
-					desc "TPM UUID"
+					desc "TPM UUID,与 vmInstanceUuid 二选一必填;只提供其一时另一项会被自动补全"
 					location "body"
 					type "String"
 					optional true
 					since "5.0.0"
 				}

Based on learnings TpmMessageAutoCompleter 会在 TpmApiInterceptor 之前自动补全 APIUpdateTpmMsg 的参数;如果只提供 tpmUuidvmInstanceUuid 之一,它会自动填充另一个字段。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/tpm/api/APIUpdateTpmMsgDoc_zh_cn.groovy`
around lines 24 - 41, Update the APIUpdateTpmMsg documentation so vmInstanceUuid
and tpmUuid explicitly state they are a mutual-one-of (至少提供一个) constraint: keep
both fields optional=true but add to each column's desc (or to the request-level
description) that at least one of vmInstanceUuid or tpmUuid must be provided and
that the missing field will be auto-completed by TpmMessageAutoCompleter before
TpmApiInterceptor; if this file is generated, also fix the generator/source that
emits APIUpdateTpmMsg to include the same explanatory note.
header/src/main/java/org/zstack/header/vm/VmInstanceCreateExtensionPoint.java-13-16 (1)

13-16: ⚠️ Potential issue | 🟡 Minor

修复 Javadoc 内联链接语法错误。

Line 15 的 Javadoc 中 {@link #afterPersistVmInstanceVO(VmInstanceVO, CreateVmInstanceMsg)} 包含多余的反引号,不符合标准 Javadoc 语法。应修改为 {@link #afterPersistVmInstanceVO(VmInstanceVO, CreateVmInstanceMsg)},与文件中其他链接写法保持一致。

✏️ 建议修复
-     * {`@link` `#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)} so extensions can
+     * {`@link` `#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)} so extensions can
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/vm/VmInstanceCreateExtensionPoint.java`
around lines 13 - 16, The Javadoc inline link in the
VmInstanceCreateExtensionPoint JavaDoc is malformed due to extra backticks;
replace the incorrect `{`@link` `#afterPersistVmInstanceVO`(VmInstanceVO,
CreateVmInstanceMsg)}` with a proper Javadoc link `{`@link`
`#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)}` so it matches the
other links in the file and correctly references the
afterPersistVmInstanceVO(VmInstanceVO, CreateVmInstanceMsg) method.
header/src/main/java/org/zstack/header/vm/additions/VmHostFileInventory.java-102-103 (1)

102-103: ⚠️ Potential issue | 🟡 Minor

__example__() 里的 TPM 路径用了错误的 UUID

Line 103 当前把 hostUuid 拼进 /var/lib/libvirt/swtpm/...,但运行时 plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java 在 Line 503-504 用的是 buildTpmStateFilePath(context.vmUuid)。这个示例会把 API 文档和 SDK 样例导向错误的目录格式。

建议修改
-        ref.setPath("/var/lib/libvirt/swtpm/" + ref.getHostUuid() + "/");
+        ref.setPath("/var/lib/libvirt/swtpm/" + ref.getVmInstanceUuid() + "/");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@header/src/main/java/org/zstack/header/vm/additions/VmHostFileInventory.java`
around lines 102 - 103, The example builds a TPM path using the host UUID but
the runtime uses the VM UUID, causing API docs/SDK samples to point to the wrong
directory; update the example in VmHostFileInventory::__example__() so that the
TPM path uses the VM UUID (use ref.getVmUuid() or the VM identifier used
elsewhere) instead of ref.getHostUuid(), and ensure the resulting path matches
the format produced by
KvmSecureBootExtensions.buildTpmStateFilePath(context.vmUuid).

Comment on lines +180 to +181
pluginRgty.getExtensionList(ApplianceVmInstanceCreateExtensionPoint.class).forEach(
extension -> extension.afterPersistApplianceVmInstanceVO(finalVO));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== VmInstanceCreateExtensionPoint implementations =="
rg -nP --type=java -C2 'implements[^{]*\bVmInstanceCreateExtensionPoint\b|afterPersistVmInstanceVO\s*\('

echo
echo "== ApplianceVmInstanceCreateExtensionPoint implementations =="
rg -nP --type=java -C2 'implements[^{]*\bApplianceVmInstanceCreateExtensionPoint\b|afterPersistApplianceVmInstanceVO\s*\('

Repository: MatheMatrix/zstack

Length of output: 3852


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check VmTpmExtensions class definition =="
rg -nP --type=java -A5 -B5 'class VmTpmExtensions' | head -40

echo
echo "== All implementations of VmInstanceCreateExtensionPoint =="
rg -nP --type=java 'implements\s*\b.*VmInstanceCreateExtensionPoint\b'

echo
echo "== Check if VmTpmExtensions also implements ApplianceVmInstanceCreateExtensionPoint =="
rg -nP --type=java -A20 'class VmTpmExtensions' | grep -i appliance

Repository: MatheMatrix/zstack

Length of output: 1679


VmTpmExtensions 需要为 Appliance VM 提供 TPM 初始化支持。

VmTpmExtensions 实现了 VmInstanceCreateExtensionPointafterPersistVmInstanceVO() 方法以初始化 VM 的 TPM 设备。但在 Appliance VM 的创建流程中(CreateApplianceVmJob 第 180-181 行),仅调用 ApplianceVmInstanceCreateExtensionPoint 的实现,导致 VmTpmExtensions 的回调被跳过。当前搜索结果中未发现任何 ApplianceVmInstanceCreateExtensionPoint 的实现。需要让 VmTpmExtensions 也实现 ApplianceVmInstanceCreateExtensionPoint 并提供相应的 TPM 初始化逻辑,或确立替代的初始化路径。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/applianceVm/src/main/java/org/zstack/appliancevm/CreateApplianceVmJob.java`
around lines 180 - 181, The appliance VM creation currently only invokes
ApplianceVmInstanceCreateExtensionPoint implementations in CreateApplianceVmJob
(pluginRgty.getExtensionList(ApplianceVmInstanceCreateExtensionPoint.class).forEach(...
afterPersistApplianceVmInstanceVO(finalVO))), which skips VmTpmExtensions’ TPM
init (it implements VmInstanceCreateExtensionPoint.afterPersistVmInstanceVO).
Fix by either having VmTpmExtensions also implement
ApplianceVmInstanceCreateExtensionPoint and move/duplicate its TPM
initialization into afterPersistApplianceVmInstanceVO(finalVO) for appliance
VMs, or alter CreateApplianceVmJob to also call
pluginRgty.getExtensionList(VmInstanceCreateExtensionPoint.class).forEach(extension
-> extension.afterPersistVmInstanceVO(finalVO)) so existing VmTpmExtensions
logic runs; update the TPM init code to use finalVO and ensure no duplicate
runs.

Comment on lines +476 to +479
public void fail(ErrorCode errorCode) {
logger.warn(String.format("failed to read vm host file for VM[vmUuid=%s] but still continue: %s",
context.vmUuid, errorCode.getReadableDetails()));
trigger.next();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

同步失败后仍应保留已有 VmHostFileVO 的优先级

Line 476-479 失败分支没有设置 context.vmHostFile,而 Line 487-489 只在它非空时才跳过 backup 流。结果是“主记录读取失败”被误判成“主记录不存在”,这和 Line 429-430 写的优先级相反;如果 backup 比 MN 里现有的 VmHostFileContentVO 更旧,这里会把旧的 NvRam/TPM 状态回写到目标宿主机。

建议修改
                     `@Override`
                     public void fail(ErrorCode errorCode) {
+                        context.vmHostFile = vmHostFile;
                         logger.warn(String.format("failed to read vm host file for VM[vmUuid=%s] but still continue: %s",
                                 context.vmUuid, errorCode.getReadableDetails()));
                         trigger.next();
                     }

Also applies to: 487-489

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java`
around lines 476 - 479, The fail branch in the anonymous fail(ErrorCode
errorCode) handler does not preserve the existing VmHostFileVO in context
(context.vmHostFile), causing the code later to treat a read-failure as "no main
record" and wrongly run/skip the backup branch; fix by setting
context.vmHostFile to the already-known/existing VmHostFileVO (load it from the
DB or retain the prior value on the context) in the fail handler so the
downstream backup-skip logic (which checks context.vmHostFile != null) keeps the
intended priority; apply the same preservation change to the other similar
failure branch around the lines referenced (487-489).

Comment on lines +114 to +123
VmInstanceState vmState = Q.New(VmInstanceVO.class)
.select(VmInstanceVO_.state)
.eq(VmInstanceVO_.uuid, group.getVmInstanceUuid())
.findValue();
if (!VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES.contains(vmState)) {
throw new ApiMessageInterceptionException(argerr(
"Can not revert VM with memory snapshot: unexpected VM state")
.withOpaque("vm.uuid", group.getVmInstanceUuid())
.withOpaque("vm.state", vmState.toString())
.withOpaque("expect.states", VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

修复空指针风险:vmState 可能为 null

findValue() 可能返回 null。当前 Line 122 直接 vmState.toString(),在 VM 记录缺失时会抛 NPE,拦截器无法返回预期业务错误。

建议修复
             VmInstanceState vmState = Q.New(VmInstanceVO.class)
                     .select(VmInstanceVO_.state)
                     .eq(VmInstanceVO_.uuid, group.getVmInstanceUuid())
                     .findValue();
+            if (vmState == null) {
+                throw new ApiMessageInterceptionException(argerr(
+                        "Can not revert VM with memory snapshot: vm not found or vm state is null")
+                        .withOpaque("vm.uuid", group.getVmInstanceUuid()));
+            }
             if (!VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES.contains(vmState)) {
                 throw new ApiMessageInterceptionException(argerr(
                         "Can not revert VM with memory snapshot: unexpected VM state")
                         .withOpaque("vm.uuid", group.getVmInstanceUuid())
-                        .withOpaque("vm.state", vmState.toString())
+                        .withOpaque("vm.state", vmState.toString())
                         .withOpaque("expect.states", VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES));
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java`
around lines 114 - 123, The code reads VmInstanceState via
Q.New(VmInstanceVO.class).findValue() into vmState and then calls
vmState.toString(), which can NPE if the DB returns null; update the
VolumeSnapshotApiInterceptor check to handle null: if vmState is null, throw
ApiMessageInterceptionException (using argerr) with an appropriate message and
include opaque fields (vm.uuid, vm.state as "null" or a safe representation) and
the expected states
(VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES); otherwise keep
the existing contains check against
VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES and throw as
before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants