<fix>[host]: filter out disks of type loop and rom#3523
<fix>[host]: filter out disks of type loop and rom#3523zstack-robot-2 wants to merge 93 commits intofeature-zsv-5.0.0-vm-support-vtpm-and-secucebootfrom
Conversation
Resolves: ZSV-10987 Change-Id: I7277616f7378676d696772796b766676766c706b
<fix>[host]: filter out disks of type loop and rom See merge request zstackio/zstack!9023
This patch is for zsv_4.10.28 Resolves: ZSV-6434 Related: ZSV-9610 Change-Id: I7763786570626f646a636b767577616a6e726172
<fix>[account-import]: add progress for deleting ldap server See merge request zstackio/zstack!9048
This patch is for zsv_4.10.28 Resolves: ZSV-11206 Change-Id: I77736869617878636b64756c656b7a666f777966
<chore>[conf]: update i18n json for mevoco module See merge request zstackio/zstack!9056
This patch is for zsv_4.10.28 Related: ZSV-10444 Change-Id: I70636878647a7962636a64766b6c67756574756c
<chore>[conf]: update i18n files See merge request zstackio/zstack!9058
update BatchCreateVmFailDeadlockCase, BatchCreateVmFailOnLocalStorageCase, VmOperationMultyTypeStorageCase. This patch is for zsv_4.10.28 Related: ZSV-8585 Change-Id: I71617374747574706576627a74646875746d6b68
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage See merge request zstackio/zstack!9065
This patch is for zsv_4.10.28; This patch is cherry-picked from ZSTAC-79487, see commit bd5f6d6. Resolves: ZSV-11252 Change-Id: I646379678a7923767067706e7a6e667a796f686d
<fix>[utils]: correct CPU model and MAC retrieval in license check See merge request zstackio/zstack!9069
update LocalNfsMultiCombineCase and OnePsCreateVmCase This patch is for zsv_4.10.28 Related: ZSV-8585 Change-Id: I64646671626761696b6a7577676a68787871726a
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage See merge request zstackio/zstack!9073
update cases: * CreateVmAssignNfsPsCase * CreateDataVolumeWithOtherPlatformCase This patch is for zsv_4.10.28 Related: ZSV-8585 Change-Id: I7262796b6b6b73786b786276677870676965796d
<test>[volume]: remove CreateVmInstance.dataDiskOfferingUuids usage See merge request zstackio/zstack!9081
allow users to initizalize the primary storage in the disk block directly.
vm {
disk {
sizeGB(10)
diskUsePrimaryStorage("nfs")
}
}
Resolves: ZSV-11263
Change-Id: I6a7567646e6c64636279717576727a6878756476
<chore>[testlib]: support diskUsePrimaryStorage in disk block See merge request zstackio/zstack!9078
update cases: * CreateVmWithDesignatedPSCase * CreateVmWithVolumeSpecifiedPsCase * CreateVmHostAllocateCase This patch is for zsv_4.10.28 Related: ZSV-8585 Change-Id: I68646b70716d7579676d6b6e756b6a637a676777
This patch is for zsv_4.10.28 Related: ZSV-10444 Change-Id: I68786e677666656666746868766d646c67616b69
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage See merge request zstackio/zstack!9092
the systemTag for diskAO.diskOfferingUuid is not initialized in the previous version and uses the default value of the systemTag. so, now initialize the systemTag for each diskAO whose diskOfferingUuid is not null. Resolves: ZSV-11278 Change-Id: I616462756a656e646563716579716a6172687573
…ssageParamValidator use the msg class name to replace the ApiMessageParamValidator class name Resolves: ZSV-11283 Change-Id: I64707261797962716e6676756a6d6f6474626e72
<chore>[ceph]: support systemTag for diskAO.diskOfferingUuid See merge request zstackio/zstack!9096
update cases: * MultiPsStartVmCase * CreateVmAssignPsCase * MultiNfsAttachMultiClusterMultiHostCase This patch is for zsv_5.0.0 Related: ZSV-8585 Change-Id: I7367677175776a6170646d6b6e7164656466656c
<fix>[message]: fix incorrect parameter of the format string in ApiMessageParamValidator See merge request zstackio/zstack!9097
<test>[storage]: remove CreateVmInstance.dataDiskOfferingUuids usage See merge request zstackio/zstack!9095
…r modifying host password after modifying the host password, the zstack MN will fail to reconnect the hosts and get the error message that is not well translated. we put the translated error message in the "i18nDetails" field. Resolves: ZSV-8182 Change-Id: I617066656b656c6e74726374777573616b666f6e
<fix>[i18n]: translate the error message when reconnecting hosts after modifying host password See merge request zstackio/zstack!9219
the volume can be deleted and if users want to recover the vm through the snapshot that took after binding the aforesaid volume , the system may complain the null pointer exception because we do not check if the volumes are deleted or not. Resolves: ZSV-5838 Change-Id: I636c626d7372676776637661766d626466707a72
* Move HPKE host secret envelope crypto to premium * Move logic of syncing public key to premium Resolves: ZSPHER-113 Related: http://dev.zstack.io:9080/zstackio/premium/-/merge_requests/13121
Resolves: ZSV-11439 Related: ZSV-11310 Change-Id: I63706e6a7a75616366716574646a677a6e646276
* Added support for cross-VM cloning of host files (NvRam, TpmState) and TPM encryption key cloning * KvmSecureBootManager transitioned from a Component to a service * Introducing new message/response types, workflows, and TPM key backend interfaces along with their virtual implementations Resolves: ZSV-11439 Related: ZSV-11310 Change-Id: I796b77716b6c64657469697a7a797a7268636e6e
Resolves: ZSV-11439 Related: ZSV-11310 Change-Id: I626c6775707a67657570736778766d6b75736570
Resolves: ZSV-11439 Related: ZSV-11310 Change-Id: I68696b7776656f78677679686370707a70616665
Related: ZSV-11439 Related: ZSV-11310 Change-Id: I65616264756468686477647a70716f72626f6365
DBImpact Resolves: ZSV-11473 Change-Id: I676766646c6568687472776f6c776c6f72777279
* A new hook (afterRollbackPersistVmInstanceVO) has been added to the VM rollback path * introduce the TPM key provider management interface and implementation (attach/detach/find) have been introduced and extended Resolves: ZSV-11489 Related: ZSV-11310 Change-Id: I73646c617971626d77726d6f646467746f637075
Walkthrough此 PR 引入了 TPM(可信平台模块)、NvRam 和密钥提供程序支持,扩展了虚拟机生命周期管理、存储系统和 RBAC 权限控制,包括新增数据库架构、API 消息类型和底层的主机端文件管理功能。 Changes
Sequence Diagram(s)sequenceDiagram
participant Client as API Client
participant Manager as VmInstanceManager
participant Extension as VmTpmExtensions
participant DB as DatabaseFacade
participant CloudBus as CloudBus
Client->>Manager: CreateVmInstanceMsg
Manager->>Manager: persistVmInstanceVO()
Manager->>Extension: afterPersistVmInstanceVO(vo, msg)
alt TPM Enabled
Extension->>DB: persistTpmVO(tpmUuid, vmUuid)
Extension->>Extension: attachKeyProvider(tpmUuid, keyProviderUuid)
end
Extension-->>Manager: Success
alt VM Creation Succeeds
Manager->>CloudBus: Reply Success
Client-->>Client: VM Created
else Rollback on Error
Manager->>Extension: afterRollbackPersistVmInstanceVO(vo, msg)
Extension->>DB: deleteTpmVO(tpmUuid)
Extension->>Extension: detachKeyProvider(tpmUuid)
Manager->>CloudBus: Reply Error
Client-->>Client: Creation Failed
end
sequenceDiagram
participant Client as API Client
participant KvmHost as KVM Host
participant SecureBootExt as KvmSecureBootExtensions
participant HostFile as VmHostFile Service
Client->>KvmHost: StartVmCmd (with TpmTO, NvRamSpec)
KvmHost->>SecureBootExt: beforeStartVmOnKvm()
alt NvRam as Volume
SecureBootExt->>HostFile: Create NvRam Volume
SecureBootExt->>KvmHost: Inject Volume UUID
else NvRam as Host File
SecureBootExt->>HostFile: Prepare Host File (VmHostFileVO)
SecureBootExt->>KvmHost: Set Host File Path
end
KvmHost->>KvmHost: Start VM with EDK & Secure Boot Config
KvmHost-->>Client: VM Started
note over SecureBootExt: Reads NvRam/TPM State<br/>after VM Shutdown
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Reasoning: 此 PR 是一个大规模、高复杂度的功能增强,涉及超过 20,000 行的新增代码,跨越计算、存储、KVM、SDK 和身份验证等多个核心模块。主要复杂性体现在:
Poem
✨ Finishing Touches
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 3
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugin/sharedMountPointPrimaryStorage/src/main/java/org/zstack/storage/primary/smp/KvmBackend.java (1)
971-980:⚠️ Potential issue | 🟠 MajorNvRam 在“预设 installPath”场景会被错误标记为 QCOW2。
当前先走
installPath非空分支,会跳过 NvRam 的RAW赋值;随后 Line 993 使用cmd.volumeFormat回填,最终可能把 NvRam 卷写成QCOW2。建议先按类型设置volumeFormat,再处理 installPath 选择。建议修复
private void createEmptyVolume(final VolumeInventory volume, String hostUuid, final ReturnValueCompletion<InstantiateVolumeOnPrimaryStorageReply> completion) { final CreateEmptyVolumeCmd cmd = new CreateEmptyVolumeCmd(); + boolean isNvRam = VolumeType.NvRam.toString().equals(volume.getType()); + if (isNvRam) { + cmd.volumeFormat = VolumeConstant.VOLUME_FORMAT_RAW; + } if (StringUtils.isNotEmpty(volume.getInstallPath())) { cmd.installPath = volume.getInstallPath(); } else if (VolumeType.Root.toString().equals(volume.getType())) { cmd.installPath = makeRootVolumeInstallUrl(volume); } else if (VolumeType.Data.toString().equals(volume.getType())) { cmd.installPath = makeDataVolumeInstallUrl(volume.getUuid()); - } else if (VolumeType.NvRam.toString().equals(volume.getType())) { + } else if (isNvRam) { cmd.installPath = makeNvRamVolumeInstallUrl(volume.getUuid()); - cmd.volumeFormat = VolumeConstant.VOLUME_FORMAT_RAW; } else { DebugUtils.Assert(false, "Should not be here"); }Also applies to: 992-994
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/sharedMountPointPrimaryStorage/src/main/java/org/zstack/storage/primary/smp/KvmBackend.java` around lines 971 - 980, The issue is that when volume.getInstallPath() is non-empty the code skips the VolumeType.NvRam branch and never sets cmd.volumeFormat to RAW, causing NvRam volumes to be treated as QCOW2; modify KvmBackend so you set type-specific fields (e.g., set cmd.volumeFormat = VolumeConstant.VOLUME_FORMAT_RAW for VolumeType.NvRam) prior to the installPath selection block (the code referencing volume.getInstallPath(), VolumeType.NvRam, makeNvRamVolumeInstallUrl, makeRootVolumeInstallUrl and cmd.volumeFormat) so the format is always correct even when installPath is preset.plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java (1)
1102-1113:⚠️ Potential issue | 🟠 MajorNvRam 在已有
installPath场景下会被错误写成 QCOW2当前先判断
installPath,会绕过 NvRam 分支,导致cmd.getVolumeFormat()仍是默认 QCOW2;随后 Line 1145 会把卷格式写错。建议将 NvRam 格式设置独立于安装路径分支。建议修复
diff --git a/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java b/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java @@ cmd.setName(volume.getName()); cmd.setSize(volume.getSize()); cmd.setVolumeUuid(volume.getUuid()); + if (volume.getType().equals(VolumeType.NvRam.toString())) { + cmd.setVolumeFormat(VolumeConstant.VOLUME_FORMAT_RAW); + } + if (StringUtils.isNotEmpty(volume.getInstallPath())) { cmd.setInstallUrl(volume.getInstallPath()); } else if (volume.getType().equals(VolumeType.Root.toString())) { cmd.setInstallUrl(NfsPrimaryStorageKvmHelper.makeRootVolumeInstallUrl(pinv, volume)); } else if (volume.getType().equals(VolumeType.Data.toString())) { @@ } else if (volume.getType().equals(VolumeType.Cache.toString())) { cmd.setInstallUrl(NfsPrimaryStorageKvmHelper.makeDataVolumeInstallUrl(pinv, volume.getUuid())); } else if (volume.getType().equals(VolumeType.NvRam.toString())) { cmd.setInstallUrl(NfsPrimaryStorageKvmHelper.makeNvRamVolumeInstallUrl(pinv, volume.getUuid())); - cmd.setVolumeFormat(VolumeConstant.VOLUME_FORMAT_RAW); } else { throw new CloudRuntimeException(String.format("unknown volume type %s", volume.getType())); }Also applies to: 1145-1145
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java` around lines 1102 - 1113, The NvRam volume format is only set inside the installPath-less branch, so when volume.getInstallPath() is present NvRam stays QCOW2; update NfsPrimaryStorageKVMBackend so that after the installUrl selection logic (the block that calls NfsPrimaryStorageKvmHelper.makeRootVolumeInstallUrl / makeDataVolumeInstallUrl / makeNvRamVolumeInstallUrl) you add a type check for VolumeType.NvRam and unconditionally call cmd.setVolumeFormat(VolumeConstant.VOLUME_FORMAT_RAW) for that case (or move the existing cmd.setVolumeFormat call out of the NvRam-only branch), ensuring NvRam gets RAW format regardless of volume.getInstallPath().
🟠 Major comments (18)
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java-1733-1743 (1)
1733-1743:⚠️ Potential issue | 🟠 Major
cmd.format已动态计算,但返回卷格式仍被固定为 RAW,存在元数据不一致风险这里允许非 NvRam 卷走 QCOW2,但在 Line 1758 仍固定
vol.setFormat(VolumeConstant.VOLUME_FORMAT_RAW),会把实际 qcow2 卷记录成 raw。建议将返回格式与cmd.format保持一致。建议修改
@@ if (type == VolumeType.NvRam) { cmd.format = VolumeConstant.VOLUME_FORMAT_RAW; } else { cmd.format = msg.hasSystemTag(VolumeSystemTags.FORMAT_QCOW2.getTagFormat()) ? VolumeConstant.VOLUME_FORMAT_QCOW2 : VolumeConstant.VOLUME_FORMAT_RAW ; } @@ public void success(CreateEmptyVolumeRsp ret) { VolumeInventory vol = msg.getVolume(); vol.setInstallPath(buildEmptyVolumeInstallPath(finalPoolName, cmd.installPath, ret.getInstallPath())); - vol.setFormat(VolumeConstant.VOLUME_FORMAT_RAW); + vol.setFormat(cmd.getFormat()); vol.setActualSize(ret.actualSize); reply.setVolume(vol); bus.reply(msg, reply); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java` around lines 1733 - 1743, The volume format stored in CephPrimaryStorageBase is inconsistent: cmd.format is computed conditionally (using VolumeType and VolumeSystemTags) but the persisted VolumeVO is always set to VolumeConstant.VOLUME_FORMAT_RAW via vol.setFormat(...), risking metadata mismatch; update the code that sets the volume record (the vol.setFormat call) to use the same computed value (cmd.format or the same conditional logic) so the recorded format matches cmd.format (ensure references to VolumeType, VolumeSystemTags.FORMAT_QCOW2 and VolumeConstant are reused).header/src/main/java/org/zstack/header/storage/snapshot/VolumeSnapshotConstant.java-25-30 (1)
25-30:⚠️ Potential issue | 🟠 Major将公开状态常量改为不可变集合,避免运行时被篡改
当前
Arrays.asList(...)生成的列表可被set()修改。作为public常量暴露后,外部代码可意外改写允许状态,导致快照校验规则漂移。建议修复(保持现有 API 类型)
import org.zstack.header.vm.VmInstanceState; import java.util.Arrays; +import java.util.Collections; import java.util.List; @@ - public static final List<VmInstanceState> ALLOW_TAKE_SNAPSHOTS_VM_STATES = Arrays.asList( + public static final List<VmInstanceState> ALLOW_TAKE_SNAPSHOTS_VM_STATES = Collections.unmodifiableList(Arrays.asList( VmInstanceState.Running, VmInstanceState.Stopped, VmInstanceState.Paused - ); - public static final List<VmInstanceState> ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES = Arrays.asList( + )); + public static final List<VmInstanceState> ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES = Collections.singletonList( VmInstanceState.Stopped );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/storage/snapshot/VolumeSnapshotConstant.java` around lines 25 - 30, The public mutable lists ALLOW_TAKE_SNAPSHOTS_VM_STATES and ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES (containing VmInstanceState values) must be made unmodifiable so callers cannot mutate them at runtime; replace the Arrays.asList(...) initializers with an immutable wrapper such as Collections.unmodifiableList(Arrays.asList(...)) or List.copyOf(Arrays.asList(...)) (keeping the public static final List<VmInstanceState> type and the same elements) so the constants remain read-only.plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java-1309-1311 (1)
1309-1311:⚠️ Potential issue | 🟠 MajorNvRam 新增后,目录推导分支未补齐,存在断言失败风险
这里已经引入
VolumeType.NvRam的安装路径与格式处理,但makeVolumeInstallDir(VolumeInventory vol)(Line 1000-1014)仍未覆盖 NvRam。后续凡是走到该通用目录方法(例如handle(ChangeVolumeTypeOnPrimaryStorageMsg ...))时,volPath可能为空并在 Line 1012 触发断言。建议修复
diff --git a/plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java b/plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java @@ public String makeVolumeInstallDir(VolumeInventory vol) { String volPath = null; if (VolumeType.Data.toString().equals(vol.getType())) { volPath = makeDataVolumeInstallUrl(vol.getUuid()); } else if (VolumeType.Root.toString().equals(vol.getType())) { volPath = makeRootVolumeInstallUrl(vol); } else if (VolumeType.Memory.toString().equals(vol.getType())) { volPath = makeMemoryVolumeInstallUrl(vol); } else if (VolumeType.Cache.toString().equals(vol.getType())) { volPath = makeDataVolumeInstallUrl(vol.getUuid()); + } else if (VolumeType.NvRam.toString().equals(vol.getType())) { + volPath = makeNvRamVolumeInstallUrl(vol.getUuid()); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java` around lines 1309 - 1311, The code added handling for VolumeType.NvRam in LocalStorageKvmBackend (setting install URL and RAW format) but did not update makeVolumeInstallDir(VolumeInventory vol) to return a directory for NvRam, so callers like handle(ChangeVolumeTypeOnPrimaryStorageMsg...) can get a null volPath and hit the assertion; update makeVolumeInstallDir to include a branch for VolumeType.NvRam that returns the correct install directory (use the same naming/structure as makeNvRamVolumeInstallUrl or the existing volume dirs), ensuring volPath is never null for NvRam volumes and keeping behavior consistent with other volume types.plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java-47-49 (1)
47-49:⚠️ Potential issue | 🟠 Major请补齐
makeVolumeInstallDir的 NvRam 分支以避免断言失败新增了 NvRam URL 构造,但
makeVolumeInstallDir仍未覆盖VolumeType.NvRam。当后续流程传入 NvRam 卷时,volPath可能为空并触发断言失败。建议修复
diff --git a/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java b/plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java @@ } else if (VolumeType.Memory.toString().equals(vol.getType())) { volPath = makeMemoryVolumeInstallUrl(pinv, vol); } else if (VolumeType.Cache.toString().equals(vol.getType())) { volPath = makeDataVolumeInstallUrl(pinv, vol.getUuid()); + } else if (VolumeType.NvRam.toString().equals(vol.getType())) { + volPath = makeNvRamVolumeInstallUrl(pinv, vol.getUuid()); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKvmHelper.java` around lines 47 - 49, makeVolumeInstallDir is missing handling for VolumeType.NvRam, causing volPath to be null and assertions to fail when NvRam volumes are used; update the method to include a branch for VolumeType.NvRam that calls the new makeNvRamVolumeInstallUrl(PrimaryStorageInventory, String) (and/or PrimaryStoragePathMaker.makeNvRamVolumeInstallPath) to build the NvRam install path, assign it to volPath, and return/create the directory like other volume types so assertions no longer trigger.sdk/src/main/java/org/zstack/sdk/tpm/api/QueryTpmResult.java-5-12 (1)
5-12:⚠️ Potential issue | 🟠 Major
inventories需要指定泛型类型参数。
inventories当前声明为原始java.util.List缺少泛型信息,Gson 反序列化时会将列表元素反序列化为Map/LinkedTreeMap,而非TpmInventory对象,导致 SDK 用户无法直接使用查询结果。请将字段和 getter/setter 改为java.util.List<TpmInventory>,并添加必要的 import 语句。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/src/main/java/org/zstack/sdk/tpm/api/QueryTpmResult.java` around lines 5 - 12, Change the raw inventories field and its accessor methods in QueryTpmResult to use a generic type so Gson deserializes elements as TpmInventory: update the field declaration inventories to java.util.List<TpmInventory>, change setInventories and getInventories signatures to accept/return java.util.List<TpmInventory>, and add the import for org.zstack.sdk.tpm.api.TpmInventory (or the correct TpmInventory package) so the compiler recognizes the type.compute/src/main/java/org/zstack/compute/vm/devices/VmTpmManager.java-53-61 (1)
53-61:⚠️ Potential issue | 🟠 Major避免把可空资源配置直接拆箱成
boolean。
getResourceConfigValue(vmUuid, Boolean.class)返回的是包装类型;这里直接return给boolean,在没有资源级值时会因为自动拆箱抛NullPointerException。对“没有 TPM 且未单独配置 secure boot”的 VM,这条判断路径会直接把 NVRAM 注册流程打断。至少这里要先做空值收敛,并让全局默认值兜底。🛠 建议修改
ResourceConfig resourceConfig = resourceConfigFacade.getResourceConfig(ENABLE_UEFI_SECURE_BOOT.getIdentity()); - return resourceConfig.getResourceConfigValue(vmUuid, Boolean.class); + return Boolean.TRUE.equals(resourceConfig.getResourceConfigValue(vmUuid, Boolean.class));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/devices/VmTpmManager.java` around lines 53 - 61, The method needRegisterNvRam unboxes a nullable Boolean from resourceConfig.getResourceConfigValue(vmUuid, Boolean.class) which can throw NPE; change it to first read the value into a Boolean (e.g., Boolean cfg = resourceConfig.getResourceConfigValue(vmUuid, Boolean.class)), avoid direct unboxing by returning Boolean.TRUE.equals(cfg) when a per-VM value exists, and if cfg is null fall back to the global default for ENABLE_UEFI_SECURE_BOOT (retrieve via the ResourceConfig instance for ENABLE_UEFI_SECURE_BOOT or use a sensible default) so that needRegisterNvRam (and symbols TpmVO / TpmVO_.vmInstanceUuid, resourceConfigFacade, ENABLE_UEFI_SECURE_BOOT) never causes a NullPointerException.build/deploydb.sh-45-45 (1)
45-45:⚠️ Potential issue | 🟠 Major不要把
zsv全量 schema 和常规 upgrade 链路混到同一次 Flyway 迁移里。Line 44 已经把
conf/db/upgrade/*放进flyway_sql,现在 Line 45 又把conf/db/zsv/*放进同一个目录;而 Lines 55-58 会对这里的全部版本脚本执行一次migrate。如果zsv目录里包含V5.0.0__schema.sql这类全量 schema,初始化时会在已有升级链路之后再次回放 DDL,很容易因为对象已存在或版本链冲突直接失败。建议把upgrade和zsv做成互斥的迁移来源,不要无条件混跑。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@build/deploydb.sh` at line 45, The script currently unconditionally copies conf/db/upgrade/* and conf/db/zsv/* into the same ${flyway_sql} directory (the cp command that copies ${base}/../conf/db/zsv/* into ${flyway_sql}), which causes Flyway to run both upgrade (Lines 44) and zsv full-schema scripts together and can replay full DDL during migrate; change the logic to make upgrade and zsv mutually exclusive by either (a) only copying zsv when no upgrade scripts are present, (b) copying zsv into a separate directory and point Flyway to one location at a time, or (c) add a flag/env var to choose between upgrade vs zsv migration; specifically modify the cp logic around the ${flyway_sql} assignment and the cp ${base}/../conf/db/zsv/* ${flyway_sql} command so Flyway’s migrate operates on only one source.conf/db/zsv/V5.0.0__schema.sql-3-10 (1)
3-10:⚠️ Potential issue | 🟠 Major给
TpmVO.vmInstanceUuid加唯一约束。现在只有外键,没有把 “一台 VM 只能挂一个 TPM” 真正落到数据库层。只要这张表能写出同一个
vmInstanceUuid的多行,按 VM 反查 TPM 的路径就会变成不确定行为,后续的 update/remove/get-capability 都可能命中错误记录。🛠️ 建议修改
CREATE TABLE IF NOT EXISTS `zstack`.`TpmVO` ( `uuid` char(32) NOT NULL UNIQUE, `vmInstanceUuid` char(32) NOT NULL, `lastOpDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `createDate` timestamp NOT NULL DEFAULT '1999-12-31 23:59:59', PRIMARY KEY (`uuid`), + UNIQUE KEY `ukTpmVOVmInstanceUuid` (`vmInstanceUuid`), CONSTRAINT `fkTpmVOVmInstanceVO` FOREIGN KEY (`vmInstanceUuid`) REFERENCES `VmInstanceEO` (`uuid`) ON UPDATE RESTRICT ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8;Based on learnings,
TpmMessageAutoCompleter会在只提供tpmUuid或vmInstanceUuid时自动补全另一侧参数,这隐含 VM 与 TPM 之间必须是一对一映射。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/db/zsv/V5.0.0__schema.sql` around lines 3 - 10, Add a unique constraint on TpmVO.vmInstanceUuid so the DB enforces one-to-one VM→TPM mapping; alter the DDL for table TpmVO (or add an ALTER TABLE statement) to create a UNIQUE index/constraint on the vmInstanceUuid column (in addition to the existing foreign key fkTpmVOVmInstanceVO) so inserts/updates cannot create multiple rows with the same vmInstanceUuid.plugin/kvm/src/main/java/org/zstack/kvm/HostKeyIdentityHelper.java-66-68 (1)
66-68:⚠️ Potential issue | 🟠 Major空白公钥也要撤销旧的
verified状态。
saveOrUpdateHostKeyIdentity()的注释和后面几个非法 key 分支都在做“降级为未验证”,但 Line 66-68 这里直接返回了。若某个 host 之前已经verified=true,现在返回空白公钥,会把旧的信任状态保留下来。💡 建议修改
public static void saveOrUpdateHostKeyIdentity(DatabaseFacade dbf, String hostUuid, String publicKey, boolean verified) { if (StringUtils.isBlank(publicKey)) { + setVerified(dbf, hostUuid, false); return; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/HostKeyIdentityHelper.java` around lines 66 - 68, The early return in HostKeyIdentityHelper when publicKey is blank leaves any existing HostKeyIdentity.verified=true unchanged; instead, in saveOrUpdateHostKeyIdentity handle the blank publicKey by retrieving the HostKeyIdentity for the given host (via hostUuid/HostKeyIdentity lookup) and explicitly downgrade it to unverified (set verified=false and persist/update using the same save/update path used by other invalid-key branches) so previous trust is revoked when an empty key is provided.compute/src/main/java/org/zstack/compute/vm/devices/VmTpmExtensions.java-88-91 (1)
88-91:⚠️ Potential issue | 🟠 Major优先用
VmInstanceSpec里的 boot mode,别回查 system tag。
afterBuildVmSpec已经拿到了VmInstanceSpec。Line 88-91 再按vmUuid回查BOOT_MODE,在创建链路里很容易读不到尚未持久化的 tag,导致 UEFI + Secure Boot 的 VM 漏掉NvRam注册。这里应先读spec.getBootMode(),必要时再 fallback 到 tag 查询。💡 建议修改
- String bootMode = VmSystemTags.BOOT_MODE.getTokenByResourceUuid(vmUuid, VmSystemTags.BOOT_MODE_TOKEN); + String bootMode = spec.getBootMode(); + if (bootMode == null) { + bootMode = VmSystemTags.BOOT_MODE.getTokenByResourceUuid(vmUuid, VmSystemTags.BOOT_MODE_TOKEN); + } if (vmTpmManager.isUefiBootMode(bootMode)) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/devices/VmTpmExtensions.java` around lines 88 - 91, In VmTpmExtensions (afterBuildVmSpec) don't read BOOT_MODE via system tag by vmUuid; use the VmInstanceSpec's boot mode first (call spec.getBootMode()) when determining UEFI mode for vmTpmManager.isUefiBootMode(...) and only fallback to VmSystemTags.BOOT_MODE.getTokenByResourceUuid(vmUuid, ...) if spec.getBootMode() is null/empty; keep the existing resourceConfigFacade.getResourceConfig(ENABLE_UEFI_SECURE_BOOT.getIdentity()) and needRegisterNvRam logic but base the initial bootMode value off spec.getBootMode() to avoid missing unpersisted tags.compute/src/main/java/org/zstack/compute/vm/devices/TpmMessageAutoCompleter.java-50-57 (1)
50-57:⚠️ Potential issue | 🟠 Major把“TPM 不存在”和“TPM/VM 不匹配”分开返回。
Line 50-57 和 Line 69-73 现在只要按
tpmUuid查不到记录,就要么报 “not consistent”,要么把vmInstanceUuid补成null。这样一来,只传tpmUuid的查询/更新请求在 TPM 已删除时会落到后面的 KVM 校验,最终返回NOT_SUPPORTED,而不是TPM_NOT_FOUND。建议先查出实际vmInstanceUuid:为空时对非删除消息抛TPM_NOT_FOUND,只有查到了 TPM 但归属 VM 不同才报参数不一致。Also applies to: 69-73
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/devices/TpmMessageAutoCompleter.java` around lines 50 - 57, In TpmMessageAutoCompleter, change the current combined check that queries by tpmUuid and immediately treats missing records as "not consistent": first load the TpmVO by tpmUuid (using Q.New(TpmVO.class) / TpmVO_.uuid) and if the VO is null then—unless the incoming message is a DeleteMessage—throw a TPM_NOT_FOUND style error (instead of nulling vmInstanceUuid); if the VO exists, set vmUuid from VO.vmInstanceUuid and only then validate that the provided vmUuid matches, throwing ApiMessageInterceptionException(argerr(...)) when they differ. Apply the same refactor to the second block (lines around 69-73) that currently repeats the logic.compute/src/main/java/org/zstack/compute/vm/devices/TpmApiInterceptor.java-75-85 (1)
75-85:⚠️ Potential issue | 🟠 Major删除空操作应在 KVM 校验前短路返回。
APIRemoveTpmMsg允许只传tpmUuid。当该 TPM 已不存在时,自动补全拿不到vmInstanceUuid,Line 76 会先抛NOT_SUPPORTED,根本到不了 Line 81-85 的幂等成功分支。这里应该先对“只传tpmUuid且记录不存在”的情况直接发布APIRemoveTpmEvent并停止路由。💡 建议修改
private void validate(APIRemoveTpmMsg msg) { + if (msg.getVmInstanceUuid() == null && msg.getTpmUuid() != null) { + boolean tpmExists = Q.New(TpmVO.class) + .eq(TpmVO_.uuid, msg.getTpmUuid()) + .isExists(); + if (!tpmExists) { + APIRemoveTpmEvent evt = new APIRemoveTpmEvent(msg.getId()); + bus.publish(evt); + throw new StopRoutingException(); + } + } + makeSureVmInstanceIsKvmType(msg.getVmInstanceUuid()); boolean tpmExists = Q.New(TpmVO.class) .eq(TpmVO_.vmInstanceUuid, msg.getVmInstanceUuid()) .isExists();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/devices/TpmApiInterceptor.java` around lines 75 - 85, 在 validate(APIRemoveTpmMsg msg) 中,先判断 TPM 记录是否存在再去调用 makeSureVmInstanceIsKvmType;具体来说:使用 the Q.New(TpmVO.class).eq(TpmVO_.vmInstanceUuid, msg.getVmInstanceUuid()).isExists()(并且当 msg 只带 tpmUuid 无法补全 vmInstanceUuid 时也要能检测“记录不存在”的情形)如果记录不存在则立即构造并发布 APIRemoveTpmEvent(msg.getId()) 并抛出 StopRoutingException,以短路返回并避免先执行 makeSureVmInstanceIsKvmType 导致的 NOT_SUPPORTED 异常。plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java-217-227 (1)
217-227:⚠️ Potential issue | 🟠 Major不要把单个主机文件读取失败升级成整次同步失败
Line 223-227 和 Line 278-282 现在会把
to.getError()聚合后直接completion.fail(...)。这会把可容忍的单文件读错升级成整次syncVmHostFilesFromHost()失败,并把上层流程推到降级分支。这里应该记录 warning 并继续处理其它文件;只有 agent/RPC 级别失败才应返回失败。建议修改
- List<ErrorCode> errors = new ArrayList<>(); for (String path : cmd.getPaths()) { VmHostFileTO to = findOneOrNull(readRsp.getHostFiles(), item -> item.getPath().equals(path)); if (to == null) { continue; } if (to.getError() != null) { - errors.add(operr("failed to read file %s", path) - .withOpaque("path", path) - .withException(to.getError())); + logger.warn(String.format( + "failed to read vm host file[path=%s] from host[uuid=%s], continue with cached content: %s", + path, context.hostUuid, to.getError().getReadableDetails())); continue; }- if (errors.isEmpty()) { - completion.success(); - } else { - completion.fail(operr("failed to read file content from host[uuid=%s]", context.hostUuid).withCause(errors)); - } + completion.success();Based on learnings: In ZStack's
KvmSecureBootExtensions.syncVmHostFilesFromHostmethod, when reading individual host files (NvRam/TpmState) fails (to.getError() != null), it is intentional to log a warning and continue rather than failing the whole operation.Also applies to: 278-282
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java-5412-5418 (1)
5412-5418:⚠️ Potential issue | 🟠 Major遇到 key rotate / integrity mismatch 时要让本地 host key 缓存失效。
HostKeyIdentityHelper已经把KEY_AGENT_KEYS_NOT_ON_DISK和KEY_AGENT_KEY_FILES_INTEGRITY_MISMATCH定义为需要 rotate/re-fetch 的场景,但这里仅透传错误,没有把当前 host 的verified状态打回。这样后续重试仍会继续使用陈旧公钥,直到人工 reconnect 才能恢复。💡 建议修改
- if (rsp != null && rsp.getError() != null) { + if (rsp != null && rsp.getError() != null) { + if (HostKeyIdentityHelper.isRotateNeededGetError(rsp.getError())) { + HostKeyIdentityHelper.setVerified(dbf, hostUuid, false); + } ErrorCode err = new ErrorCode(); err.setCode(rsp.getError()); err.setDetails(rsp.getError()); reply.setError(err);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 5412 - 5418, The reply path currently only forwards the error from the agent without invalidating the host key state; when rsp.getError() equals HostKeyIdentityHelper.KEY_AGENT_KEYS_NOT_ON_DISK or HostKeyIdentityHelper.KEY_AGENT_KEY_FILES_INTEGRITY_MISMATCH you must mark the host's verified/key-cache as invalid so it will re-fetch/rotate keys on retry. Locate the block in KVMHost where rsp and reply are handled (the branch that creates ErrorCode err and the else path using operr(...)) and, before setting reply.setError(...), call the existing host key invalidation/verification update method (or add a short helper) to set the host's verified=false / clear its cached key entry for this host, ensuring subsequent reconnect attempts will re-fetch the keys rather than reuse stale ones.plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java-5407-5410 (1)
5407-5410:⚠️ Potential issue | 🟠 Major成功分支需要校验
secretUuid非空。
SecretHostDefineReply的业务结果只有secretUuid。这里如果 agent 返回success=true但 UUID 为空,调用方会拿到“成功但不可用”的结果,后续很容易在落库或再次引用时出错。💡 建议修改
- if (rsp != null && rsp.isSuccess()) { - if (rsp.getSecretUuid() != null) { - reply.setSecretUuid(rsp.getSecretUuid()); - } + if (rsp != null && rsp.isSuccess()) { + if (StringUtils.isBlank(rsp.getSecretUuid())) { + reply.setError(operr("agent returned success but secretUuid is empty")); + } else { + reply.setSecretUuid(rsp.getSecretUuid()); + } } else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 5407 - 5410, The success path currently trusts rsp.isSuccess() without verifying rsp.getSecretUuid(); update the block in KVMHost where SecretHostDefineReply is constructed (referencing rsp and reply) to validate rsp.getSecretUuid() is non-null/non-empty when rsp.isSuccess() is true, and if the UUID is missing treat it as a failure (set reply.setSuccess(false) and populate a clear error message or exception) so callers never receive a "success" with a null secretUuid.plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java-408-422 (1)
408-422:⚠️ Potential issue | 🟠 Major给
encryptedDek加@NoLogging。这是 DEK 的密文封装,不应该进入请求/错误日志。当前字段没有像同文件里的
chapPassword、vmXml、contentBase64那样做脱敏,调试或异常日志一旦序列化整条命令,就会泄露密钥材料。🔒 建议修改
public static class SecretHostDefineCmd extends AgentCommand { /** Base64 envelope of DEK; agent expects this field name (encryptedDek). */ + `@NoLogging` private String encryptedDek; private String vmUuid; private String purpose;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java` around lines 408 - 422, Add the `@NoLogging` annotation to the encryptedDek sensitive field in SecretHostDefineCmd so it is excluded from request/error logging like chapPassword/vmXml/contentBase64; apply the annotation to the private String encryptedDek declaration (and mirror the pattern used for other sensitive fields—on the field/getter if your project convention requires it) and add the corresponding import for the NoLogging annotation so the class compiles.plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpmManager.java-490-492 (1)
490-492:⚠️ Potential issue | 🟠 Major不要暴露一个永远失败的 UpdateTpm API。
这个 PR 同时新增了
sdk/src/main/java/org/zstack/sdk/tpm/api/UpdateTpmAction.java,但这里对APIUpdateTpmMsg直接抛NOT_SUPPORTED。结果是新增的 UpdateTpm API 无论参数是否正确都必然失败。要么在这里补齐实际更新逻辑,要么不要注册对应的 API / SDK 入口。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpmManager.java` around lines 490 - 492, The handler method for APIUpdateTpmMsg in KvmTpmManager currently always throws NOT_SUPPORTED, which leaves the newly added SDK UpdateTpmAction unusable; either implement the update flow in the handle(APIUpdateTpmMsg msg) method (perform validation of msg parameters, invoke the TPM update logic in the appropriate class/component, handle success/failure and reply with proper API events) referencing KvmTpmManager.handle(APIUpdateTpmMsg) and any TPM update helper classes, or remove/unregister the corresponding API/SDK entry (the generated UpdateTpmAction / APIUpdateTpmMsg registration) so the SDK does not expose a never-working endpoint; pick one approach and update the code paths and tests accordingly.plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java-282-293 (1)
282-293:⚠️ Potential issue | 🟠 Major按内容缺失而不是按文件缺失回退到 backup。
这里仅在
context.files缺少类型时才查询VmHostBackupFileVO。如果源 VM 已经有VmHostFileVO,但上一步从宿主机读取失败且数据库里还没有对应的VmHostFileContentVO,后面的 Line 322-324 会直接跳过,目标 VM 就会静默丢掉这份 NvRam / TPM state。建议改成按type判断“是否拿到了可用 content”,再决定是否回退到 backup,不要只看VmHostFileVO是否存在。Based on learnings, "individual host file sync failures should be tolerated and must not cause the entire completion to fail."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java` around lines 282 - 293, The current logic treats a file type as present if a VmHostFileVO exists in context.files, but we must treat it as present only if usable content exists (VmHostFileContentVO), otherwise fall back to backups; change the computation of missingTypes (where context.typesNeedClone is compared to transform(context.files, VmHostFileVO::getType)) to instead consider types for which no valid content was obtained—e.g., inspect context.files entries for each VmHostFileVO and filter out those that have associated VmHostFileContentVO (or a non-null content flag) and treat the remainder as missingTypes; then query VmHostBackupFileVO for those missing types (as currently done with Q.New(...).in(..., missingTypes).list()) so individual host file read failures fall back to backups rather than silently dropping NvRam/TPM state.
🟡 Minor comments (8)
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java-175-176 (1)
175-176:⚠️ Potential issue | 🟡 Minor错误消息格式不一致:多余的花括号
错误消息中包含多余的花括号
{...},与本文件中其他operr()调用的格式不一致(参见 79、107、133 行)。🔧 建议修复
- errs.add(operr("{the physical capacity usage of the host[uuid:%s] has exceeded the threshold[%s]}", + errs.add(operr("the physical capacity usage of the host[uuid:%s] has exceeded the threshold[%s]", ref.getHostUuid(), physicalCapacityMgr.getRatio(ref.getPrimaryStorageUuid())));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java` around lines 175 - 176, The error message constructed in the errs.add(operr(...)) call contains extraneous curly braces around the formatted string causing inconsistent message format with other operr() calls; update the errs.add invocation that uses ref.getHostUuid() and physicalCapacityMgr.getRatio(ref.getPrimaryStorageUuid()) to remove the surrounding "{...}" so the message matches the style of other calls (e.g., at lines with similar operr usages).plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java-181-181 (1)
181-181:⚠️ Potential issue | 🟡 Minor错误消息语法问题
"failed allocate localstorage"存在语法错误,建议修改为"failed to allocate local storage"或"local storage allocation failed"。🔧 建议修复
- ret.errorCode = operr("failed allocate localstorage").withCause(errs); + ret.errorCode = operr("failed to allocate local storage").withCause(errs);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageMainAllocatorFlow.java` at line 181, The error message assigned to ret.errorCode in LocalStorageMainAllocatorFlow (the operr("failed allocate localstorage") call) has a grammar/spacing issue; update the operr call to a clear, grammatically correct message such as operr("failed to allocate local storage") (or operr("local storage allocation failed")) so "localstorage" becomes two words and the phrase reads correctly.plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java-924-954 (1)
924-954:⚠️ Potential issue | 🟡 Minor应在用户配置指定主存储时,为数据盘设置主存储 UUID。
对比同文件中的
settingRootVolume(第 811 行)和settingDataVolume(第 904 行),当用户配置中指定了主存储时,这两个方法都会分别调用msg.setPrimaryStorageUuidForRootVolume()和msg.setPrimaryStorageUuidForDataVolume()。但在
settingDataDisks方法中,虽然代码检查了config.getAllocate().getPrimaryStorage()(第 943 行),但仅用其值设置了 Ceph 池系统标签,并未调用diskAO.setPrimaryStorageUuid(config.getAllocate().getPrimaryStorage().getUuid())来设置主存储 UUID。建议在第 943 行的条件块内,补充对主存储 UUID 的设置,以保持与根卷和数据卷处理逻辑的一致性:
if (config.getAllocate().getPrimaryStorage() instanceof CephPrimaryStorageAllocateConfig) { // 设置主存储 UUID diskAO.setPrimaryStorageUuid(config.getAllocate().getPrimaryStorage().getUuid()); CephPrimaryStorageAllocateConfig primaryStorageAllocateConfig = (CephPrimaryStorageAllocateConfig) config.getAllocate().getPrimaryStorage(); // ... 现有的池标签设置逻辑 }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java` around lines 924 - 954, In settingDataDisks, when config.getAllocate().getPrimaryStorage() is a CephPrimaryStorageAllocateConfig you must also set the primary storage UUID on the DiskAO (call diskAO.setPrimaryStorageUuid(config.getAllocate().getPrimaryStorage().getUuid())) before applying Ceph pool system tags; locate the if-check that casts to CephPrimaryStorageAllocateConfig and add the diskAO.setPrimaryStorageUuid(...) call there (keep the existing CephSystemTags.USE_CEPH_PRIMARY_STORAGE_POOL logic intact).header/src/main/java/org/zstack/header/tpm/api/APIRemoveTpmMsgDoc_zh_cn.groovy-20-40 (1)
20-40:⚠️ Potential issue | 🟡 Minor把
tpmUuid/vmInstanceUuid的“至少传一个”约束写进文档。Lines 25-40 里两个字段都标成了
optional true,但当前文档没有说明它们是二选一关系,调用方很容易误以为空请求也合法。实际只有先提供其中一个时,前置补全器才会从库里补齐另一个;两个都为空时仍会在后续校验里失败。建议在请求desc或这两个参数的desc里明确“二选一,至少传一个”。
Based on learnings,TpmMessageAutoCompleter会在TpmApiInterceptor之前为APIRemoveTpmMsg自动补全另一个字段,但前提是已提供tpmUuid或vmInstanceUuid之一。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/tpm/api/APIRemoveTpmMsgDoc_zh_cn.groovy` around lines 20 - 40, 两个参数 tpmUuid 和 vmInstanceUuid 都被标为 optional,但文档未说明它们是“二选一至少需传一个”的关系;请在 APIRemoveTpmMsg 的顶层 desc 或每个参数的 desc 中明确写出“二选一,至少传一个(提供任一项后 TpmMessageAutoCompleter 会补齐另一个,两个都为空会在 TpmApiInterceptor 的后续校验失败)”,并保留原有 optional 标注以匹配代码逻辑(参照参数名 tpmUuid、vmInstanceUuid 以及类/流程标识 APIRemoveTpmMsg、TpmMessageAutoCompleter、TpmApiInterceptor 以便定位修改位置)。header/src/main/java/org/zstack/header/vm/APICreateVmInstanceMsg.java-485-498 (1)
485-498:⚠️ Potential issue | 🟡 Minor
setDevices()后可能导致devicesSpec缓存过期。如果在调用
getDevicesSpec()之后再调用setDevices(),缓存的devicesSpec将与新的devices不一致。建议在setDevices()中清除缓存。🐛 建议的修复
public void setDevices(Map<String, Object> devices) { this.devices = devices; + this.devicesSpec = null; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/vm/APICreateVmInstanceMsg.java` around lines 485 - 498, setDevices currently replaces the underlying devices map but does not invalidate the cached devicesSpec, which can leave devicesSpec out-of-sync after calling getDevicesSpec; update setDevices(Map<String,Object> devices) to clear/reset devicesSpec (e.g. set devicesSpec to null) whenever devices is changed so that subsequent getDevicesSpec() will rehydrate from the new devices via JSONObjectUtil.rehashObject, and keep setDevicesSpec(VmDevicesSpec) behavior unchanged.header/src/main/java/org/zstack/header/tpm/api/APIUpdateTpmMsgDoc_zh_cn.groovy-24-41 (1)
24-41:⚠️ Potential issue | 🟡 Minor补充
vmInstanceUuid/tpmUuid的联合必填说明。文档把这两个字段都标成了
optional true,但没有说明“至少提供一个”。客户端按当前文档构造只带keyProviderUuid的请求时,会在后续校验阶段被拒绝。建议在字段描述或请求说明里明确二选一必填,以及缺失的一项会被自动补全;如果这是生成文件,也建议在生成源头同步修正。🧭 建议补充到文档中的约束
column { name "vmInstanceUuid" enclosedIn "updateTpm" - desc "虚拟机 UUID" + desc "虚拟机 UUID,与 tpmUuid 二选一必填;只提供其一时另一项会被自动补全" location "body" type "String" optional true since "5.0.0" } column { name "tpmUuid" enclosedIn "updateTpm" - desc "TPM UUID" + desc "TPM UUID,与 vmInstanceUuid 二选一必填;只提供其一时另一项会被自动补全" location "body" type "String" optional true since "5.0.0" }Based on learnings TpmMessageAutoCompleter 会在
TpmApiInterceptor之前自动补全APIUpdateTpmMsg的参数;如果只提供tpmUuid或vmInstanceUuid之一,它会自动填充另一个字段。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/tpm/api/APIUpdateTpmMsgDoc_zh_cn.groovy` around lines 24 - 41, Update the APIUpdateTpmMsg documentation so vmInstanceUuid and tpmUuid explicitly state they are a mutual-one-of (至少提供一个) constraint: keep both fields optional=true but add to each column's desc (or to the request-level description) that at least one of vmInstanceUuid or tpmUuid must be provided and that the missing field will be auto-completed by TpmMessageAutoCompleter before TpmApiInterceptor; if this file is generated, also fix the generator/source that emits APIUpdateTpmMsg to include the same explanatory note.header/src/main/java/org/zstack/header/vm/VmInstanceCreateExtensionPoint.java-13-16 (1)
13-16:⚠️ Potential issue | 🟡 Minor修复 Javadoc 内联链接语法错误。
Line 15 的 Javadoc 中
{@link#afterPersistVmInstanceVO(VmInstanceVO, CreateVmInstanceMsg)}包含多余的反引号,不符合标准 Javadoc 语法。应修改为{@link#afterPersistVmInstanceVO(VmInstanceVO, CreateVmInstanceMsg)},与文件中其他链接写法保持一致。✏️ 建议修复
- * {`@link` `#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)} so extensions can + * {`@link` `#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)} so extensions can🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/vm/VmInstanceCreateExtensionPoint.java` around lines 13 - 16, The Javadoc inline link in the VmInstanceCreateExtensionPoint JavaDoc is malformed due to extra backticks; replace the incorrect `{`@link` `#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)}` with a proper Javadoc link `{`@link` `#afterPersistVmInstanceVO`(VmInstanceVO, CreateVmInstanceMsg)}` so it matches the other links in the file and correctly references the afterPersistVmInstanceVO(VmInstanceVO, CreateVmInstanceMsg) method.header/src/main/java/org/zstack/header/vm/additions/VmHostFileInventory.java-102-103 (1)
102-103:⚠️ Potential issue | 🟡 Minor
__example__()里的 TPM 路径用了错误的 UUIDLine 103 当前把
hostUuid拼进/var/lib/libvirt/swtpm/...,但运行时plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java在 Line 503-504 用的是buildTpmStateFilePath(context.vmUuid)。这个示例会把 API 文档和 SDK 样例导向错误的目录格式。建议修改
- ref.setPath("/var/lib/libvirt/swtpm/" + ref.getHostUuid() + "/"); + ref.setPath("/var/lib/libvirt/swtpm/" + ref.getVmInstanceUuid() + "/");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/vm/additions/VmHostFileInventory.java` around lines 102 - 103, The example builds a TPM path using the host UUID but the runtime uses the VM UUID, causing API docs/SDK samples to point to the wrong directory; update the example in VmHostFileInventory::__example__() so that the TPM path uses the VM UUID (use ref.getVmUuid() or the VM identifier used elsewhere) instead of ref.getHostUuid(), and ensure the resulting path matches the format produced by KvmSecureBootExtensions.buildTpmStateFilePath(context.vmUuid).
| pluginRgty.getExtensionList(ApplianceVmInstanceCreateExtensionPoint.class).forEach( | ||
| extension -> extension.afterPersistApplianceVmInstanceVO(finalVO)); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== VmInstanceCreateExtensionPoint implementations =="
rg -nP --type=java -C2 'implements[^{]*\bVmInstanceCreateExtensionPoint\b|afterPersistVmInstanceVO\s*\('
echo
echo "== ApplianceVmInstanceCreateExtensionPoint implementations =="
rg -nP --type=java -C2 'implements[^{]*\bApplianceVmInstanceCreateExtensionPoint\b|afterPersistApplianceVmInstanceVO\s*\('Repository: MatheMatrix/zstack
Length of output: 3852
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check VmTpmExtensions class definition =="
rg -nP --type=java -A5 -B5 'class VmTpmExtensions' | head -40
echo
echo "== All implementations of VmInstanceCreateExtensionPoint =="
rg -nP --type=java 'implements\s*\b.*VmInstanceCreateExtensionPoint\b'
echo
echo "== Check if VmTpmExtensions also implements ApplianceVmInstanceCreateExtensionPoint =="
rg -nP --type=java -A20 'class VmTpmExtensions' | grep -i applianceRepository: MatheMatrix/zstack
Length of output: 1679
VmTpmExtensions 需要为 Appliance VM 提供 TPM 初始化支持。
VmTpmExtensions 实现了 VmInstanceCreateExtensionPoint 的 afterPersistVmInstanceVO() 方法以初始化 VM 的 TPM 设备。但在 Appliance VM 的创建流程中(CreateApplianceVmJob 第 180-181 行),仅调用 ApplianceVmInstanceCreateExtensionPoint 的实现,导致 VmTpmExtensions 的回调被跳过。当前搜索结果中未发现任何 ApplianceVmInstanceCreateExtensionPoint 的实现。需要让 VmTpmExtensions 也实现 ApplianceVmInstanceCreateExtensionPoint 并提供相应的 TPM 初始化逻辑,或确立替代的初始化路径。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugin/applianceVm/src/main/java/org/zstack/appliancevm/CreateApplianceVmJob.java`
around lines 180 - 181, The appliance VM creation currently only invokes
ApplianceVmInstanceCreateExtensionPoint implementations in CreateApplianceVmJob
(pluginRgty.getExtensionList(ApplianceVmInstanceCreateExtensionPoint.class).forEach(...
afterPersistApplianceVmInstanceVO(finalVO))), which skips VmTpmExtensions’ TPM
init (it implements VmInstanceCreateExtensionPoint.afterPersistVmInstanceVO).
Fix by either having VmTpmExtensions also implement
ApplianceVmInstanceCreateExtensionPoint and move/duplicate its TPM
initialization into afterPersistApplianceVmInstanceVO(finalVO) for appliance
VMs, or alter CreateApplianceVmJob to also call
pluginRgty.getExtensionList(VmInstanceCreateExtensionPoint.class).forEach(extension
-> extension.afterPersistVmInstanceVO(finalVO)) so existing VmTpmExtensions
logic runs; update the TPM init code to use finalVO and ensure no duplicate
runs.
| public void fail(ErrorCode errorCode) { | ||
| logger.warn(String.format("failed to read vm host file for VM[vmUuid=%s] but still continue: %s", | ||
| context.vmUuid, errorCode.getReadableDetails())); | ||
| trigger.next(); |
There was a problem hiding this comment.
同步失败后仍应保留已有 VmHostFileVO 的优先级
Line 476-479 失败分支没有设置 context.vmHostFile,而 Line 487-489 只在它非空时才跳过 backup 流。结果是“主记录读取失败”被误判成“主记录不存在”,这和 Line 429-430 写的优先级相反;如果 backup 比 MN 里现有的 VmHostFileContentVO 更旧,这里会把旧的 NvRam/TPM 状态回写到目标宿主机。
建议修改
`@Override`
public void fail(ErrorCode errorCode) {
+ context.vmHostFile = vmHostFile;
logger.warn(String.format("failed to read vm host file for VM[vmUuid=%s] but still continue: %s",
context.vmUuid, errorCode.getReadableDetails()));
trigger.next();
}Also applies to: 487-489
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java`
around lines 476 - 479, The fail branch in the anonymous fail(ErrorCode
errorCode) handler does not preserve the existing VmHostFileVO in context
(context.vmHostFile), causing the code later to treat a read-failure as "no main
record" and wrongly run/skip the backup branch; fix by setting
context.vmHostFile to the already-known/existing VmHostFileVO (load it from the
DB or retain the prior value on the context) in the fail handler so the
downstream backup-skip logic (which checks context.vmHostFile != null) keeps the
intended priority; apply the same preservation change to the other similar
failure branch around the lines referenced (487-489).
| VmInstanceState vmState = Q.New(VmInstanceVO.class) | ||
| .select(VmInstanceVO_.state) | ||
| .eq(VmInstanceVO_.uuid, group.getVmInstanceUuid()) | ||
| .findValue(); | ||
| if (!VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES.contains(vmState)) { | ||
| throw new ApiMessageInterceptionException(argerr( | ||
| "Can not revert VM with memory snapshot: unexpected VM state") | ||
| .withOpaque("vm.uuid", group.getVmInstanceUuid()) | ||
| .withOpaque("vm.state", vmState.toString()) | ||
| .withOpaque("expect.states", VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES)); |
There was a problem hiding this comment.
修复空指针风险:vmState 可能为 null
findValue() 可能返回 null。当前 Line 122 直接 vmState.toString(),在 VM 记录缺失时会抛 NPE,拦截器无法返回预期业务错误。
建议修复
VmInstanceState vmState = Q.New(VmInstanceVO.class)
.select(VmInstanceVO_.state)
.eq(VmInstanceVO_.uuid, group.getVmInstanceUuid())
.findValue();
+ if (vmState == null) {
+ throw new ApiMessageInterceptionException(argerr(
+ "Can not revert VM with memory snapshot: vm not found or vm state is null")
+ .withOpaque("vm.uuid", group.getVmInstanceUuid()));
+ }
if (!VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES.contains(vmState)) {
throw new ApiMessageInterceptionException(argerr(
"Can not revert VM with memory snapshot: unexpected VM state")
.withOpaque("vm.uuid", group.getVmInstanceUuid())
- .withOpaque("vm.state", vmState.toString())
+ .withOpaque("vm.state", vmState.toString())
.withOpaque("expect.states", VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES));
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java`
around lines 114 - 123, The code reads VmInstanceState via
Q.New(VmInstanceVO.class).findValue() into vmState and then calls
vmState.toString(), which can NPE if the DB returns null; update the
VolumeSnapshotApiInterceptor check to handle null: if vmState is null, throw
ApiMessageInterceptionException (using argerr) with an appropriate message and
include opaque fields (vm.uuid, vm.state as "null" or a safe representation) and
the expected states
(VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES); otherwise keep
the existing contains check against
VolumeSnapshotConstant.ALLOW_TAKE_MEMORY_SNAPSHOTS_VM_STATES and throw as
before.
Resolves: ZSV-10987
Change-Id: I7277616f7378676d696772796b766676766c706b
sync from gitlab !9382