<fix>[plugin-premium]: GPU/VM page keeps loading when shutting down or... by ZStack-Robot · Pull Request #3549 · MatheMatrix/zstack

ZStack-Robot · 2026-03-20T06:55:57Z

[plugin-premium]: GPU/VM page keeps loading when shutting down or encountering errors in Zaku cluster.
Resolves: ZSTAC-80202
Change-Id: I7778676171646874706164777869707279776172

sync from gitlab !9014

…igration. support starting vm without nic in other clusters. Resolves/Related: ZSTAC-80468 Change-Id: I6370646f7a796265677a6a656c716f6867706d69

…isk offering Resolves: ZSTAC-74683 Change-Id: Id0339ed0221e92e506f60745cde972cc3ee6d9ae

When anti-split-brain check selects a disconnected MDS node, the HTTP call now times out after 30s instead of 5+ minutes, and automatically retries the next available MDS via tryNext mechanism. Resolves: ZSTAC-80595 Change-Id: I1be80f1b70cad1606eb38d1f0078c8f2781e6941

When MN restarts during a destroy operation, the hypervisor may report the VM as Stopped. Without this transition, the state machine throws an exception and the VM stays stuck in Destroying state forever. Resolves: ZSTAC-80620 Change-Id: I037edba70d145a44a88ce0d3573089182fedb162

…pacity Resolves: ZSTAC-79709 Change-Id: I45a2133bbb8c51c25ae3549d59e588976192a08d

Resolves: ZSTAC-78989 Change-Id: I0fe3a56ab724978944c69afadaab7ff7353e4c0f

Resolves: ZSTAC-82153 Change-Id: Ib51c2e21553277416d1a9444be55aca2aa4b2fc4

…fterAddIpAddress to prevent NPE during rollback Resolves: ZSTAC-81741 Change-Id: I53bcf20a10306afc7b6172da294d347b74e6c41f

Resolves: ZSTAC-81182 Change-Id: Id1bb642154dc66ae9995dcc4d9fc00cdce9bcaf8

<fix>[vm]: add Destroying->Stopped state transition See merge request zstackio/zstack!9156

<fix>[i18n]: improve snapshot error message for unattached volume See merge request zstackio/zstack!9192

<fix>[zbs]: reduce mds connect timeout and enable tryNext for volume clients See merge request zstackio/zstack!9153

<fix>[vm]: use max of virtual and actual size for root disk allocation See merge request zstackio/zstack!9155

…talling In dual management node scenarios, concurrent modifications to the consistent hash ring from heartbeat reconciliation and canonical event callbacks can cause NodeHash/Nodes inconsistency, leading to message routing failures and task timeouts. Fix: (1) synchronized all ResourceDestinationMakerImpl methods to ensure atomic nodeHash+nodes updates, (2) added lifecycleLock in ManagementNodeManagerImpl to serialize heartbeat reconciliation with event callbacks, (3) added two-round delayed confirmation before removing nodes from hash ring to avoid race with NodeJoin events. Resolves: ZSTAC-77711 Change-Id: I3d33d53595dd302784dff17417a5b25f2d0f3426

<fix>[network]: filter reserved IPs in getFreeIp See merge request zstackio/zstack!9170

<fix>[ceph]: apply over-provisioning ratio when releasing snapshot capacity See merge request zstackio/zstack!9162

<fix>[network]: add null check for L3 network system tags in API interceptor See merge request zstackio/zstack!9169

@host

The mdsUrls field in ExternalPrimaryStorage config contains user:password@host format credentials. Add desensitization to mask credentials as ***@host in API/CLI output. Resolves: ZSTAC-80664 Change-Id: I94bdede5a1b52eb039de70efb5458693484405f7

Add ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT constant to CloudOperationsErrorCode for use in premium volumebackup module. Resolves: ZSTAC-82195 Change-Id: Ibc405876e1171b637cf76b91a6822574fb6e7811

<fix>[core]: synchronize consistent hash ring to prevent dual-MN race condition See merge request zstackio/zstack!9154

SyncTaskFuture constructor calls Context.current() unconditionally, triggering ServiceLoader for ContextStorageProvider even when telemetry is disabled. If sentry-opentelemetry-bootstrap jar is on classpath, ServiceLoader fails with "not a subtype" due to ClassLoader isolation in Tomcat, throwing ServiceConfigurationError (extends Error) that escapes all catch(Exception) blocks. 1. Why is this change necessary? MN startup crashes with ORG_ZSTACK_CORE_WORKFLOW_10001 because Context.current() triggers ServiceLoader unconditionally in SyncTaskFuture constructor, even when telemetry is disabled. 2. How does it address the problem? Only call Context.current() when isTelemetryEnabled() returns true, matching the existing guard pattern used in other DispatchQueueImpl code paths (lines 351, 1069). 3. Are there any side effects? None. When telemetry is disabled, parentContext was never used. # Summary of changes (by module): - core/thread: conditionalize Context.current() in SyncTaskFuture Related: ZSTAC-82275 Change-Id: I5c0e1f15769c746c630028a29df8cf1815620608

<fix>[thread]: guard Context.current() with telemetry check See merge request zstackio/zstack!9202

Resolves: ZSTAC-82195 Change-Id: I3d5e91d09d7c088d3c53e3839f8b32f4bce32dec

<fix>[loadBalancer]: block SLB deletion during grayscale upgrade See merge request zstackio/zstack!9187

<fix>[volumebackup]: add backup cancel timeout error code See merge request zstackio/zstack!9200

@nologging

<fix>[core]: add @nologging to sensitive config fields See merge request zstackio/zstack!9171

…plit-brain When a management node departs, its VM skip-trace entries were immediately removed. If VMs were still being started by kvmagent, the next VM sync would falsely detect them as Stopped and trigger HA, causing split-brain. Fix: transfer departed MN skip-trace entries to an orphaned set with 10-minute TTL instead of immediate deletion. VMs in the orphaned set remain skip-traced until the TTL expires or they are explicitly continued, preventing false HA triggers during MN restart scenarios. Resolves: ZSTAC-80821 Change-Id: I3222e260b2d7b33dc43aba0431ce59a788566b34

…anup Resolves: ZSTAC-80821 Change-Id: I59284c4e69f5d2ee357b1836b7c243200e30949a

Resolves: ZSTAC-77544 Change-Id: I1f711bff9c1e87a8cbf6a2eb310ca6086f0f99ba

Resolves: ZSTAC-80821 Change-Id: Ia9a9597feceb96b3e6e22259e2d0be7bde8ae499

<fix>[conf]: use absolute path for ansible version check during upgrade See merge request zstackio/zstack!9356

…ostic Resolves: ZSTAC-82318 Change-Id: If68d507a3eeb3f48458d9fbf61c357339688d43b

setProcessedBytes/setTotalBytes/setSpeedBytesPerSecond renamed to setProcessed/setTotal/setSpeed in unit-agnostic refactor. Add setUnit("bytes") to preserve semantic clarity. Related: ZSTAC-82318 Change-Id: Ied29ac01d488062a03d792378432522fbf0a523f Co-Authored-By: Claude Opus 4.6 (1M context) <ye.zou@zstack.io>

…e_exporter Resolves: ZSTAC-82619 Change-Id: I581385808fe4d942369b98f175eb8fbbdb382a0d

…od (ZSTAC-71075) Resolves: ZSTAC-71075 Change-Id: I21ebd76ba332995252f67f4cacf78022587a215a

<feature>[longjob]: standardize progress detail fields to be unit-agnostic See merge request zstackio/zstack!9358

fix(ai): ZSTAC-68709 add eval task queuing with concurrency control See merge request zstackio/zstack!9295

…alization CloudBusGson.httpGson does not disableHtmlEscaping(), causing single quotes in error details to be escaped as \u0027. While valid JSON, the SDK deserializes these literally, breaking string assertions in tests like BatchAddBareMetal2ChassisCase that check error.details with contains("'reboot' must be 'No'"). Restore JSONObjectUtil.toJsonString() for the sendReplyResponse error path while keeping the centralized localization logic. Co-Authored-By: ye.zou <ye.zou@zstack.io>

APIImpact DBImpact GlobalConfigImpact Resolves: ZSTAC-81969 Change-Id: I736e77646266696e646271766d7170627378796b

When vSwitchType is LinuxBridge, physicalInterface must not be null or empty. Add interceptor check in L2NetworkApiInterceptor and unit test in AttachL2NetworkCase. ZSTAC-83300 Change-Id: I333a5bac866354c366051fece84e447f089e2306 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ZSTAC-81797: iSCSI fencer use platform compact hostId instead of storage bdc id See merge request zstackio/zstack!9274

<fix>[conf]: retry rm -rf virtualenv to avoid race with zstack_service_exporter See merge request zstackio/zstack!9365

<fix>[network]: set nic ip out of l3 cidr scope See merge request zstackio/zstack!9374

Resolves: ZSTAC-81969 Change-Id: I7176617669636d7362616973787a68766c6d6579

…id loader init - GlobalErrorCodeI18nServiceImpl: add final fallback to error.code/empty string when both details and description are null, guaranteeing message is never null - Platform: use existing loader field instead of getComponentLoader() to avoid triggering ComponentLoader creation during early startup; move message fallback outside catch block so it always runs Co-Authored-By: ye.zou <ye.zou@zstack.io> Change-Id: I8571f05657dc2173cc232b511f505eedc68d714e

<fix>[vmScheduling]: change GET scheduling APIs from POST to GET method (ZSTAC-71075) See merge request zstackio/zstack!9373

<feature>[errorcode]: simplify i18n — guarantee message is never null See merge request zstackio/zstack!9354

Add libvirt.tls.enabled GlobalConfig and useTls field in MigrateVmCmd to support TLS-encrypted libvirt connections for migration and V2V. Resolves: ZSTAC-81343 Change-Id: I391fa36c0dd63c25c5d85d102bc3579c8eb3d685

<fix>[docs]: move to zstack/docs See merge request zstackio/zstack!9379

<fix>[l2network]: validate physicalInterface for LinuxBridge See merge request zstackio/zstack!9375

<feature>[kvm]: add libvirt TLS config See merge request zstackio/zstack!9317

…VMHost 1. CallBackNetworkChecker.stopAnsible() creates Ssh object but never closes it, leaking JSch Session thread on every call. 2. KVMHost "check-host-is-taken-over" flow has the same pattern. When a host continuously fails reconnect (e.g. sharedblock VG error), each reconnect cycle leaks 1-2 SSH sessions. Over days this exhausts heap memory (18000+ threads observed) causing MN OOM and Unknown status. Fix: add finally { ssh.close() } to both code paths. Resolves: ZSTAC-83305 Resolves: ZSTAC-82731 Change-Id: I786e766e79776d6b7a75786d776278696c76666f

<fix>[core,kvm]: fix SSH session leak in CallBackNetworkChecker and KVMHost See merge request zstackio/zstack!9398

when user define ip/netmask/gateway, then gateway must be in the cidr of ip/netamsl Resolves: ZSTAC-83321 Change-Id: I78616376707476666a7a72786a7062766272686b

<fix>[compute]: fix user define param error See merge request zstackio/zstack!9387

coderabbitai · 2026-03-20T06:56:35Z

Walkthrough

新增与修改大量功能与接口，覆盖外部服务配置、国际化错误码、长作业进度结构、网络 IPAM 与静态 IP 管理、KVM/Libvirt TLS 支持、数据库 schema 迁移与函数扩展、多个插件与后端行为调整，以及文档与安装脚本向 Python3.11 迁移等多处变更。

Changes

Cohort / File(s)	Summary
新增扩展点与头部 API `header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java`	新增公共接口 `BeforeCallZWatchExtensionPoint`，定义 `supports(Class<?>)` 与 `beforeCallZWatch(Class<?>, List<String>)`。
外部服务配置（新增 API/VO/Inventory/事件） `core/src/main/java/org/zstack/core/externalservice/...`, `header/src/main/java/org/zstack/header/core/external/service/...`, `externalservice/src/main/java/org/zstack/externalservice/cronjob/CronJobImpl.java`	新增 ExternalService 配置相关实体、Inventory、VO、REST 消息与事件（增删查改）、管理实现、节点下发/回调与文档，含 Apply/Result 类型与节点广播逻辑。
错误码国际化与本地化支持 `core/src/main/java/org/zstack/core/errorcode/GlobalErrorCodeI18nService.java`, `GlobalErrorCodeI18nServiceImpl.java`, `LocaleUtils.java`, `core/src/main/java/org/zstack/core/global/CoreGlobalProperty.java`, `core/src/main/java/org/zstack/core/rest/RESTApiController.java`, `rest/src/main/java/org/zstack/rest/RestServer.java`	新增全局错误码 i18n 服务与 locale 解析工具，管理端/REST 层注入并在多处响应/日志中本地化 ErrorCode 消息；新增全局属性用于调优。
长作业进度标准化 `header/src/main/java/org/zstack/header/core/progress/LongJobProgressDetail*.java`, `TaskProgressInventory.java`, `LongJobProgressNotificationMessage.java`, `core/src/main/java/org/zstack/core/progress/ProgressReportService.java`	新增 LongJobProgressDetail 与 Builder，从 opaque 解析并在进度消息/库存中暴露结构化进度详情。
网络与 IPAM / 静态 IP 重构多文件修改 `compute/.../StaticIpOperator.java`, `VmInstanceBase.java`, `VmInstanceApiInterceptor.java`, `VmNicManagerImpl.java`, `VmSystemTags.java`, `network/.../IpRangeHelper.java`, `L3NetworkManagerImpl.java`, `L3BasicNetwork.java`, `DhcpExtension.java`, `NetworkServiceManager*.java`, `plugin/flatNetworkProvider/...`, `plugin/eip/...`, `plugin/loadBalancer/...`, `plugin/portForwarding/...`, `plugin/virtualRouterProvider/...`	大幅重构静态 IP 解析/验证/系统标签（包含 DNS 支持）、IP 覆盖/前缀与网关解析、新的帮助器/上下文类、IP-range/CIDR 检查与若干拦截器/过滤逻辑调整，影响 DHCP、EIP、LoadBalancer、PortForwarding、VR、平面网络等模块。
Compute 与 VM 行为调整 `compute/.../HostAllocatorChain.java`, `HostAllocatorManagerImpl.java`, `HostBase.java`, `HostManagerImpl.java`, `VmCapabilitiesJudger.java`, `VmCascadeExtension.java`, `VmInstanceBase.java`, `VmNicManagerImpl.java`, `VmQuotaOperator.java`, `VmInstanceState.java`, `VmSystemTags.java`	内存/容量返回改为警告并夹紧、主机连接/日志 null-safe 处理、增加跳过架构检查类型、VM 状态同步增强、允许无 L3 的宿主机 dry-run、IP 覆盖与 DNS 在 VM 流程中的持久化、配额变更校验等。
数据库 schema 与迁移脚本 `conf/db/upgrade/V5.5.12__schema.sql`, `conf/db/upgrade/beforeMigrate.sql`	新增/修改表、字段与索引（包括 UsedIpVO.prefixLen、ExternalServiceConfigurationVO 等）、回填脚本、调整函数 `Json_getKeyValue` 为 MEDIUMTEXT，并更新外键行为为 SET NULL（含幂等处理）。
Core/平台与并发修正 `core/src/main/java/org/zstack/core/Platform.java`, `ResourceDestinationMakerImpl.java`, `DatabaseFacade*.java`, `CallBackNetworkChecker.java`, `SentryInitHelper.java`, `TelemetryGlobalProperty.java`, `LogSafeGson.java`, `ProgressReportService.java`	错误信息增强（formatArgs/message）、i18n 本地化集成、资源目标 maker 加锁与防御拷贝、多回调生命周期支持、Ansible/SSH 资源清理改进、Sentry 调试与日志桥接、新全局属性等。
安装脚本与工具链迁移到 Python3.11 `conf/tools/install.sh`, `core/src/main/java/org/zstack/core/ansible/AnsibleFacadeImpl.java`	将虚拟环境与 pip 迁移到 Python3.11，新增 ensure_python3_venv，更新依赖版本（ansible、pycryptodome 等）、安装与 bash-completion 路径适配。
KVM / Libvirt TLS 与部署 `plugin/kvm/.../KVMHost.java`, `KVMHostDeployArguments.java`, `KVMHostFactory.java`, `KVMGlobalConfig.java`, `KVMAgentCommands.java`	增加 Libvirt TLS 配置/全局开关、部署参数 tlsCertIps、迁移参数（useTls、srcHostManagementIp）、CA 初始化与持久化逻辑、全局配置新增项与默认值。
插件与后端细节修正多处 plugin 修改（ceph、expon、cbd、zbs、iscsi、localstorage 等）	多个后端的 null-safe 修正、URL scheme/port 默认、HTTP 调用 retry/tryNext 支持、路径转换行为调整、镜像缓存策略参数传递等。
安全、RBAC、LDAP 与授权修正 `identity/AuthorizationManager.java`, `header/.../RBACInfo.java`, `plugin/ldap/LdapUtil.java`	增强会话非空检查并抛出明确错误、扩展外部服务相关管理员权限列表、LDAP 服务选择改为有意义的 OperationFailureException。
文档与生成脚本 `docs/modules/network/...`, `rest/src/main/resources/scripts/GoInventory.groovy`, `GoTestTemplate.groovy`	新增 L3 IPAM 文档页与导航，Go SDK 测试模版生成器新增（自动产出单元/集成测试样板）。
若干小修与 API/消息扩展多处 header 新增字段/消息（如 DNS 字段、AllocateIpMsg 的网关/前缀、UsedIpVO 前缀属性、长作业/任务 API 示例更新等）	多个 API message/VO/inventory 增加字段（dnsAddresses、netmask/gateway/ipv6Prefix、prefixLen 等）以及对应的文档与示例更新，影响序列化与交互契约。

代码审查工作量评估

🎯 5 (Critical) | ⏱️ ~120 分钟

兔子的诗

🐰 新接口在枝头鸣，
配置、语言、网与迁移齐飞，
小改大攻无所惧，
兔子跳跃验一遍——
编译测试都要跑！ 🥕

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch sync/chao.he/fix-ZSTAC-80202-zaku-loading@@2

…r encountering errors in Zaku cluster. Resolves: ZSTAC-80202 Change-Id: I7778676171646874706164777869707279776172

coderabbitai

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)

compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java (1)
56-64: ⚠️ Potential issue | 🟠 Major

空检查应在更新 UsedIpVO 之前执行

当前第 58 行在检查 VmNicVO 是否存在之前就更新了 UsedIpVO 的 vmNicUuid 字段。如果 vmNicUUid 对应的 NIC 不存在，UsedIpVO 将被设置为一个无效的引用，导致数据不一致。

建议将空检查移至 SQL 更新之前：
🔧 建议修复方案
 `@Override`
 public void afterAddIpAddress(String vmNicUUid, String usedIpUuid) {
-    /* update UsedIpVO */
-    SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update();
-
     VmNicVO nic = Q.New(VmNicVO.class).eq(VmNicVO_.uuid, vmNicUUid).find();
     if (nic == null) {
         logger.debug(String.format("VmNic[uuid:%s] not found, skip afterAddIpAddress", vmNicUUid));
         return;
     }
+
+    /* update UsedIpVO */
+    SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java` around
lines 56 - 64, In afterAddIpAddress, don't update UsedIpVO before verifying the
NIC exists: first fetch VmNicVO via Q.New(VmNicVO.class).eq(VmNicVO_.uuid,
vmNicUUid).find() and if it is null log and return, and only after that call
SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid,
vmNicUUid).update(); this prevents setting vmNicUuid to an invalid reference.
conf/tools/install.sh (1)
143-151: ⚠️ Potential issue | 🟠 Major

zstack-dashboard 的升级路径还会复用旧的 Python 2 venv。

这里只有“目录不存在才创建”的判断。升级场景下，如果 /var/lib/zstack/virtualenv/zstack-dashboard 已经是旧的 Python 2 环境，后面的 pip install 仍会继续落在旧解释器里，这次 3.11 迁移对存量节点就不生效了。这里也应该复用前面新加的 ensure_python3_venv。
🛠️ 建议修正
-    if [ ! -d "$UI_VIRENV_PATH" ]; then
-        python3.11 -m venv $UI_VIRENV_PATH
-        if [ $? -ne 0 ]; then
-            rm -rf $UI_VIRENV_PATH
-            exit 1
-        fi
-    fi
+    ensure_python3_venv "$UI_VIRENV_PATH"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conf/tools/install.sh` around lines 143 - 151, The current install logic only
creates the venv at UI_VIRENV_PATH when the directory is missing, so upgrades
can keep an old Python2 venv; call the shared ensure_python3_venv routine (or
perform a version check on UI_VIRENV_PATH and recreate it if its python is not
3.11) before running pip installs. Specifically, replace the "if [ ! -d
\"$UI_VIRENV_PATH\" ]" creation-only flow by invoking ensure_python3_venv (or
checking the venv's python binary via UI_VIRENV_PATH/bin/python and recreating
the venv if its version != 3.11) so the zstack-dashboard venv is guaranteed to
use Python 3.11.
core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java (1)
80-92: ⚠️ Potential issue | 🟠 Major

getNodeInfo() 在缓存未命中时返回 null

第 89 行使用 nodes.put() 的返回值（前一个值），而非新创建的 NodeInfo 对象。当缓存未命中时，put() 返回 null，导致方法返回 null。第 893 行的调用链 getNodeInfo(nodeUuid).getNodeIP() 会在 null 值上触发 NullPointerException，虽然被 try-catch 捕获，但导致 setHostName() 未被执行，主机名设置被无声忽略。

应改为：
             nodeHash.add(nodeUuid);
-            info = nodes.put(nodeUuid, new NodeInfo(vo));
+            info = new NodeInfo(vo);
+            nodes.put(nodeUuid, info);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java`
around lines 80 - 92, In getNodeInfo(String nodeUuid) the code stores new
NodeInfo via nodes.put(...) but returns the put() result (the previous value),
so on cache miss it returns null; fix by constructing the new NodeInfo (e.g. new
NodeInfo(vo)), assign it to a local variable, put that instance into nodes
(nodes.put(nodeUuid, newInfo) or putIfAbsent as appropriate), add nodeUuid to
nodeHash, and return the newInfo; update the method so it always returns the
actual NodeInfo instance (reference getNodeInfo) instead of the put() return
value.
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java (3)
1280-1282: ⚠️ Potential issue | 🟠 Major

静态 DNS 写入发生在网络服务重放之后，Running VM 这次请求里拿不到新 DNS。

这两个分支都会先执行 VmApplyNetworkServiceOnChangeIPFlow，然后才在 done/success 回调里 setStaticDns(...)。对运行中的 VM 来说，本次 change-IP 流程只会把 tag 存下来，DNS 不会在当前请求里立即下发。建议像 changeVmNicNetwork() 里的 SetStaticIp.setDns() 一样，把 DNS 写入前移到 flow 内、放在 apply flow 之前，并补上失败回滚。

Based on learnings: NetworkServiceManagerImpl.getVmNicDns() reads VM NIC DNS from the STATIC_DNS system tag keyed by l3NetworkUuid.

Also applies to: 3707-3717, 3763-3766
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java` around lines
1280 - 1282, The static DNS is being written after
VmApplyNetworkServiceOnChangeIPFlow finishes so a Running VM won't get the DNS
in this request; modify the change-IP path in VmInstanceBase so that setting the
STATIC_DNS tag (what setStaticDns does) is done inside
VmApplyNetworkServiceOnChangeIPFlow before the apply step (mirroring
changeVmNicNetwork -> SetStaticIp.setDns behavior) and add a rollback that
removes/restores the previous STATIC_DNS tag on flow failure; update references
to NetworkServiceManagerImpl.getVmNicDns to continue reading the tag and ensure
both the primary branch here and the other occurrences around the specified
blocks (3707-3717, 3763-3766) apply the same change.
1088-1110: ⚠️ Potential issue | 🟠 Major

同 IP 更新时，overrideInfo 会被这段短路逻辑直接吞掉。

现在 setIpamStaticIp() 已经把 netmask/gateway/ipv6Prefix/ipv6Gateway 传进来了，但只要请求里的 IP 和当前 NIC IP 相同，Line 1106-1110 就会直接返回成功，Line 1147-1155 的覆盖字段根本不会落到 AllocateIpMsg。这样“只改网关/前缀、不改 IP”的请求会表现为成功但实际不生效。建议只有在 IP 和这些 override 字段都未变化 时才走这个 fast path。

Based on learnings: fillIpv4Parameters() / fillIpv6Parameters() intentionally persist user-supplied netmask / gateway / ipv6Prefix / ipv6Gateway when those fields are provided.

Also applies to: 1147-1155, 3747-3753
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java` around lines
1088 - 1110, The fast-path in changeVmIp currently returns success when the
requested IPs equal the NIC's IPs, which discards overrideInfo
(netmask/gateway/ipv6Prefix/ipv6Gateway) so updates that only change those
fields never reach AllocateIpMsg; modify the check in changeVmIp (and the
analogous checks around the blocks that handle AllocateIpMsg at lines
referenced) to compare not only IPs (targetNic.getUsedIps() /
staticIpMap.values()) but also the corresponding override fields from
overrideInfo (e.g., netmask, gateway, ipv6Prefix, ipv6Gateway) against the NIC’s
current values (use fillIpv4Parameters/fillIpv6Parameters semantics to detect
when user supplied values are present); only short-circuit and call
completion.success() when both IPs and all provided override fields are
unchanged, otherwise proceed so setIpamStaticIp and AllocateIpMsg receive and
persist the overrideInfo.
143-179: ⚠️ Potential issue | 🟠 Major

NoState 只补了 tracer 收敛路径，checkState() 失败回收链路里仍会把它当异常抛掉。

stopVm()、rebootVm()、destroy()、migrate 等失败后的收敛逻辑都会回到 checkState(...)。那里目前只接受 Running/Stopped/Paused，一旦主机返回 NoState 就会直接抛 CloudRuntimeException，外层 completion 不会继续走完，这类请求仍然可能维持 loading。
🔧 建议补齐 `checkState()` 的 `NoState` 分支
-                } else {
+                } else if (VmInstanceState.NoState.toString().equals(state)) {
+                    changeVmStateInDb(VmInstanceStateEvent.noState, () -> self.setHostUuid(hostUuid));
+                } else {
                     throw new CloudRuntimeException(String.format(
                             "CheckVmStateOnHypervisorMsg should only report states[Running, Paused or Stopped]," +
                                     "but it reports %s for the vm[uuid:%s] on the host[uuid:%s]", state, self.getUuid(), hostUuid));
                 }
Also applies to: 1655-1659
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java` around lines
143 - 179, The checkState(...) method in VmInstanceBase currently only handles
Running/Stopped/Paused and throws on any other string; add a branch to handle
the hypervisor "NoState" response so it doesn't escalate to a
CloudRuntimeException: in VmInstanceBase.checkState, after retrieving state
(String state = r.getStates().get(self.getUuid())), add an else-if for
VmInstanceState.NoState.toString() that calls
changeVmStateInDb(VmInstanceStateEvent.unknown) and completes the
NoErrorCompletion (completion.done()), mirroring the existing null/error
handling paths so stopVm/rebootVm/destroy/migrate convergence won't be
interrupted.

🧹 Nitpick comments (15)

core/src/main/java/org/zstack/core/db/UpdateQueryImpl.java (1)

55-59: LGTM! 空值安全检查修复正确。

此修改正确地处理了 val 为 null 时的情况，避免了在格式化错误消息时抛出 NullPointerException。

可选的小优化：当前 val.getClass() 会输出类似 class java.lang.String 的格式，如果希望更简洁可以使用 val.getClass().getSimpleName()。

♻️ 可选优化

         if ((op == Op.IN || op == Op.NOT_IN) && !(val instanceof Collection)) {
             throw new CloudRuntimeException(String.format("for operation IN or NOT IN, a Collection value is expected, but %s got",
-                    val == null ? "null" : val.getClass()));
+                    val == null ? "null" : val.getClass().getSimpleName()));
         }

,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/zstack/core/db/UpdateQueryImpl.java` around lines 55 -
59, The current null-safe check in UpdateQueryImpl.condAnd correctly avoids NPE
but the error message uses val.getClass() which prints full class name with
"class" prefix; change the message to use val.getClass().getSimpleName() (still
guarding for null) so the thrown CloudRuntimeException shows a cleaner type
name; update the String.format call in the condAnd method to use getSimpleName()
when val != null.

header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java (1)

80-89: 列表元素类型假设可能存在风险。

当前代码通过 String.valueOf(url) 将列表元素转换为字符串。如果 mdsUrls 或 mdsInfos 中的元素是复杂对象（如嵌套的 Map），String.valueOf 会调用 toString() 方法，无法正确脱敏内部可能包含的凭证信息。

建议增加类型检查，对非字符串类型的元素做适当处理：

♻️ 可选的类型检查改进

 private static void desensitizeUrlList(Map config, String key) {
     Object urls = config.get(key);
     if (urls instanceof List) {
         List<String> desensitized = new ArrayList<>();
         for (Object url : (List) urls) {
-            desensitized.add(desensitizeUrl(String.valueOf(url)));
+            if (url instanceof String) {
+                desensitized.add(desensitizeUrl((String) url));
+            } else {
+                // Skip non-string elements or handle complex objects appropriately
+                desensitized.add(String.valueOf(url));
+            }
         }
         config.put(key, desensitized);
     }
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`
around lines 80 - 89, desensitizeUrlList currently assumes each list element can
be stringified safely; if elements are nested Maps/Lists (e.g., mdsUrls/mdsInfos
entries with credential fields) toString() will not properly desensitize
secrets. Update desensitizeUrlList (and reuse desensitizeUrl) to type-check each
element: if element is String call desensitizeUrl and add result; if element is
Map iterate its entries and desensitize any String values (or recursively call a
helper to handle nested Map/List), if element is List process recursively, and
for other types either skip or convert safely; finally put the processed list
back into config under the same key. Ensure helper names (desensitizeUrlList,
desensitizeUrl) are used so changes are easy to find.

core/src/main/java/org/zstack/core/db/DatabaseFacadeImpl.java (1)

909-921: 建议：clz 非空但 EntityInfo 未找到时添加警告日志。

与 installEntityLifeCycleCallback 方法相比存在不一致行为：

安装时：若 info == null 会触发断言失败
卸载时：若 info == null 静默返回

虽然卸载操作采用宽容策略是合理的（避免清理时抛出异常），但建议添加警告日志以便排查问题。

♻️ 建议的改进

     `@Override`
     public void uninstallEntityLifeCycleCallback(Class clz, EntityEvent evt, EntityLifeCycleCallback cb) {
         if (clz != null) {
             EntityInfo info = entityInfoMap.get(clz);
             if (info != null) {
                 info.uninstallLifeCycleCallback(evt, cb);
+            } else {
+                logger.warn(String.format("cannot find EntityInfo for the class[%s] when uninstalling lifecycle callback", clz));
             }
         } else {
             for (EntityInfo info : entityInfoMap.values()) {
                 info.uninstallLifeCycleCallback(evt, cb);
             }
         }
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/zstack/core/db/DatabaseFacadeImpl.java` around lines
909 - 921, The uninstallEntityLifeCycleCallback currently silently returns when
clz != null but entityInfoMap.get(clz) yields null; add a warning log in that
branch to mirror installEntityLifeCycleCallback's assert behavior for
visibility: in DatabaseFacadeImpl.uninstallEntityLifeCycleCallback, when clz !=
null and EntityInfo info == null, call the logger (e.g., use existing logger in
this class) to log a warning that no EntityInfo was found for the given class
before returning; keep the tolerant behavior (no exception) but emit the warning
so missing entries are discoverable.

rest/src/main/resources/scripts/GoTestTemplate.groovy (1)

31-31: 注释编号重复

Line 28 已使用 // 2. 注释，这里的 // 2. Per-resource test files 应改为 // 3.。

📝 建议修复

-        // 2. Per-resource test files
+        // 3. Per-resource test files

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@rest/src/main/resources/scripts/GoTestTemplate.groovy` at line 31, The
comment numbering is duplicated: change the comment string "// 2. Per-resource
test files" to "// 3. Per-resource test files" so it follows the earlier "// 2."
comment; update the comment text near the existing "// 2. Per-resource test
files" occurrence in GoTestTemplate.groovy (look for that exact comment) to use
"// 3." instead.

compute/src/main/java/org/zstack/compute/host/HostBase.java (1)

1443-1446: 防御性空检查：考虑 "unknown" 带来的去重冲突风险

此处增加了对 self 的空检查以避免 NPE，这是合理的防御性编程。但使用固定字符串 "unknown" 作为去重键可能导致问题：

如果多个不同上下文的连接请求同时遇到 self == null 的情况，它们将共享相同的去重签名 "connect-host-unknown"，导致本应独立执行的请求被错误去重。

建议：

若 self 为 null 是不合法状态，考虑在方法入口处提前失败并记录日志

若为合法边界情况，可考虑使用 msg.getHostUuid() 替代 "unknown"
♻️ 建议的改进方案
 `@Override`
 protected String getDeduplicateString() {
-    return String.format("connect-host-%s", self == null ? "unknown" : self.getUuid());
+    return String.format("connect-host-%s", self != null ? self.getUuid() : msg.getHostUuid());
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/host/HostBase.java` around lines
1443 - 1446, The getDeduplicateString() method currently returns
"connect-host-unknown" when self is null, which can cause unrelated requests to
collide; update the logic in HostBase.getDeduplicateString() to avoid a fixed
"unknown" key: either fail fast with a logged error if self == null (treating it
as an invalid state), or build the dedupe key from a more specific identifier
such as the incoming message's host UUID (e.g., use msg.getHostUuid() when
available) so concurrent null-self cases produce distinct keys; ensure to
reference self and msg in the chosen branch and keep the log/context clear.

header/src/main/java/org/zstack/header/vm/VmAbnormalLifeCycleStruct.java (1)

37-44: 建议：统一使用 Objects.equals() 以提高代码一致性和安全性。

该文件中存在混合使用 .equals() 和 Objects.equals() 的情况。虽然部分枚举值已使用 Objects.equals()（如第 20、75、83 行），但大多数仍使用 .equals() 进行主机 UUID 比较（第 42、50、58、67、107、115、123、131、139、155 行等）。其中第 67 行的显式 null 检查表明这些 getter 方法确实可能返回 null。

为了保证代码的一致性和健壮性，建议在后续迭代中统一改为 Objects.equals()。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@header/src/main/java/org/zstack/header/vm/VmAbnormalLifeCycleStruct.java`
around lines 37 - 44, Replace direct .equals() comparisons of host UUIDs with
null-safe Objects.equals() in the VmAbnormalLifeCycleStruct matching logic;
specifically update the match(...) implementation in the
VmPausedFromUnknownStateHostNotChanged enum constant (and other enum constants
inside VmAbnormalLifeCycleStruct that compare
getCurrentHostUuid()/getOriginalHostUuid()) to call
Objects.equals(struct.getCurrentHostUuid(), struct.getOriginalHostUuid())
instead of struct.getCurrentHostUuid().equals(struct.getOriginalHostUuid()) so
comparisons are consistent and null-safe.

plugin/iscsi/src/main/java/org/zstack/iscsi/kvm/KvmIscsiNodeServer.java (1)

247-248: 建议去掉硬编码 999，改为具名常量。

Line 247 的 999 是魔法值，后续维护时很难理解其语义和边界。建议提取为具名常量（类常量或至少方法内 final 变量）。

♻️ 建议修改

-                            ref = new ExternalHostIdGetter(999).getOrAllocateHostIdRef(
+                            final int externalHostIdType = 999;
+                            ref = new ExternalHostIdGetter(externalHostIdType).getOrAllocateHostIdRef(
                                     param.getHostUuid(), param.getPrimaryStorage().getUuid());

As per coding guidelines, "避免使用魔法值（Magic Value）：直接使用未经定义的数值或字符串应替换为枚举或常量。"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@plugin/iscsi/src/main/java/org/zstack/iscsi/kvm/KvmIscsiNodeServer.java`
around lines 247 - 248, Replace the magic number 999 passed to
ExternalHostIdGetter in the call to getOrAllocateHostIdRef with a named
constant; define a descriptive constant (e.g., DEFAULT_EXTERNAL_HOST_ID or
EXTERNAL_HOST_ID_BASE) as a private static final field in KvmIscsiNodeServer (or
as a final local variable inside the method if scope-local) and use that
constant in place of 999 when constructing
ExternalHostIdGetter(param.getHostUuid(), param.getPrimaryStorage().getUuid())
so the purpose and bounds of the value are clear.

core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java (1)

58-58: 建议移除硬编码队列大小。

Line 58 的 1000 属于魔法值，建议改为 TelemetryGlobalProperty 配置项或命名常量，便于环境调优和后续维护。
♻️ 建议修改
-                    options.setMaxQueueSize(1000);
+                    options.setMaxQueueSize(TelemetryGlobalProperty.SENTRY_MAX_QUEUE_SIZE);
As per coding guidelines: “避免使用魔法值（Magic Value）：直接使用未经定义的数值或字符串应替换为常量或枚举。”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java` at line
58, 在 SentryInitHelper 中的 options.setMaxQueueSize(1000) 使用了魔法值
1000；请将该硬编码值替换为可配置或有意义命名的常量，例如从 TelemetryGlobalProperty（新增属性如
sentryMaxQueueSize）读取或定义类常量（e.g. DEFAULT_SENTRY_MAX_QUEUE_SIZE），并在
SentryInitHelper 中用该常量/属性替代原有的 1000，确保有默认值、配置注入或静态常量注释以便后续调优和维护。

plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java (2)

470-515: 检查 Shell 命令执行结果以确保 CA 初始化成功

ShellUtils.run() 方法在命令执行失败时不会抛出异常，而是返回退出码。当前代码未检查 openssl 命令的执行结果，可能导致在 CA 生成失败时继续执行后续逻辑，产生难以排查的问题。

建议的修改方案

-            if (!caFile.exists() || !keyFile.exists()) {
-                ShellUtils.run(String.format(
-                        "openssl genrsa -out %s/cakey.pem 4096", CA_DIR));
-                ShellUtils.run(String.format(
-                        "openssl req -new -x509 -days 3650 -key %s/cakey.pem " +
-                        "-out %s/cacert.pem -subj '/O=ZStack/CN=ZStack Libvirt CA'",
-                        CA_DIR, CA_DIR));
+            if (!caFile.exists() || !keyFile.exists()) {
+                int ret = ShellUtils.run(String.format(
+                        "openssl genrsa -out %s/cakey.pem 4096", CA_DIR));
+                if (ret != 0) {
+                    logger.warn(String.format("Failed to generate CA private key, exit code: %d", ret));
+                    return;
+                }
+                ret = ShellUtils.run(String.format(
+                        "openssl req -new -x509 -days 3650 -key %s/cakey.pem " +
+                        "-out %s/cacert.pem -subj '/O=ZStack/CN=ZStack Libvirt CA'",
+                        CA_DIR, CA_DIR));
+                if (ret != 0) {
+                    logger.warn(String.format("Failed to generate CA certificate, exit code: %d", ret));
+                    return;
+                }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java` around lines 470
- 515, The initLibvirtTlsCA method currently calls ShellUtils.run(...) for
openssl and file-permission commands but does not check return values; update
initLibvirtTlsCA to capture the integer return from each ShellUtils.run call
(especially the openssl genrsa and openssl req calls and chown/chmod
invocations), verify success (zero) before proceeding, and handle failures by
logging an error and aborting the CA initialization (e.g., throw or return
early) so subsequent FileUtils.readFileToString and JsonLabel.createIfAbsent
calls for caFile and keyFile are not executed with missing/invalid files; ensure
any cleanup or descriptive logger.warn/logger.error includes the failing command
and its exit code.

498-507: 使用带 Charset 参数的 FileUtils 方法

FileUtils.readFileToString(File) 和 FileUtils.writeStringToFile(File, String) 方法已被标记为 deprecated，建议使用带 Charset 参数的版本以明确指定字符编码。

建议的修改

+import java.nio.charset.StandardCharsets;
...
-            String ca = FileUtils.readFileToString(caFile).trim();
-            String key = FileUtils.readFileToString(keyFile).trim();
+            String ca = FileUtils.readFileToString(caFile, StandardCharsets.UTF_8).trim();
+            String key = FileUtils.readFileToString(keyFile, StandardCharsets.UTF_8).trim();
...
-            FileUtils.writeStringToFile(caFile, caInv.getLabelValue());
-            FileUtils.writeStringToFile(keyFile, keyInv.getLabelValue());
+            FileUtils.writeStringToFile(caFile, caInv.getLabelValue(), StandardCharsets.UTF_8);
+            FileUtils.writeStringToFile(keyFile, keyInv.getLabelValue(), StandardCharsets.UTF_8);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java` around lines 498
- 507, The FileUtils usages around caFile/keyFile should be updated to the
non-deprecated Charset-aware overloads: when reading use
FileUtils.readFileToString(caFile, StandardCharsets.UTF_8) and
FileUtils.readFileToString(keyFile, StandardCharsets.UTF_8), and when writing
use FileUtils.writeStringToFile(caFile, caInv.getLabelValue(),
StandardCharsets.UTF_8) and FileUtils.writeStringToFile(keyFile,
keyInv.getLabelValue(), StandardCharsets.UTF_8); make these changes in the block
that reads into ca/key, calls new JsonLabel().createIfAbsent(LIBVIRT_TLS_CA_KEY,
ca) and createIfAbsent(LIBVIRT_TLS_PRIVATE_KEY, key), and then overwrites the
local files so encoding is explicit and deprecation warnings are removed.

core/src/main/java/org/zstack/core/Platform.java (1)

986-1007: LocaleUtils 类名冲突的处理方式值得注意。

Line 999 使用了完全限定名 org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE，这是因为 Line 4 已经导入了 org.apache.commons.lang.LocaleUtils。当前的处理方式是正确的，但可以考虑为其中一个类使用别名导入以提高可读性。

错误码的国际化回退逻辑设计良好：先尝试本地化，失败时优雅降级到 details 或 description。
♻️ 可选：使用静态导入简化代码
+import static org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE;
 
 // 然后在使用处:
-i18nService.localizeErrorCode(result, org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE);
+i18nService.localizeErrorCode(result, DEFAULT_LOCALE);
或者重命名 Apache Commons 的导入：
-import org.apache.commons.lang.LocaleUtils;
+import org.apache.commons.lang.LocaleUtils as ApacheLocaleUtils;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/zstack/core/Platform.java` around lines 986 - 1007,
The code currently imports org.apache.commons.lang.LocaleUtils which conflicts
with org.zstack.core.errorcode.LocaleUtils used in Platform, making the
fully-qualified reference org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE
verbose; fix by removing the conflicting import and either add a static import
for org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE and call DEFAULT_LOCALE
directly in Platform.localizeErrorCode(result, DEFAULT_LOCALE), or keep the
Apache import and instead fully-qualify the Apache Commons usages elsewhere so
the Platform class can use a simple static import; update imports accordingly
and ensure the call in Platform (localizeErrorCode) uses the unambiguous
DEFAULT_LOCALE symbol.

header/src/main/java/org/zstack/header/core/external/service/APIAddExternalServiceConfigurationMsg.java (1)

53-57: audit() 方法实现可以更健壮。

当前实现在 rsp.isSuccess() 为 true 但 evt.getInventory() 为 null 时会抛出 NPE。虽然根据已有的 learnings，失败时不会设置 inventory 且 isSuccess() 会为 false，但添加空值检查可以提高防御性。

♻️ 建议的改进

 `@Override`
 public Result audit(APIMessage msg, APIEvent rsp) {
     APIAddExternalServiceConfigurationEvent evt = (APIAddExternalServiceConfigurationEvent) rsp;
-    return new Result(rsp.isSuccess() ? evt.getInventory().getUuid(): "", ExternalServiceConfigurationVO.class);
+    return new Result(rsp.isSuccess() && evt.getInventory() != null ? evt.getInventory().getUuid() : "", ExternalServiceConfigurationVO.class);
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/core/external/service/APIAddExternalServiceConfigurationMsg.java`
around lines 53 - 57, The audit() in APIAddExternalServiceConfigurationMsg
should avoid a potential NPE when rsp.isSuccess() is true but evt or
evt.getInventory() is null; update the method to cast rsp to
APIAddExternalServiceConfigurationEvent (evt) then defensively check evt != null
and evt.getInventory() != null before calling evt.getInventory().getUuid(),
returning an empty string when either is null while still constructing the
Result with ExternalServiceConfigurationVO.class.

conf/db/upgrade/V5.5.12__schema.sql (1)

112-124: 外键修改操作缺少幂等性检查

直接 DROP FOREIGN KEY 在约束不存在时会失败。建议参考 Lines 141-179 的 ModifyUsedIpVOForeignKey 存储过程模式，先检查约束是否存在再执行删除操作，以确保升级脚本可以安全重复执行。

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@conf/db/upgrade/V5.5.12__schema.sql` around lines 112 - 124, The ALTER TABLE
drops for BareMetal2InstanceVO currently assume the foreign keys
fkBareMetal2InstanceVOGatewayVO and fkBareMetal2InstanceVOGatewayVO1 exist; make
the operation idempotent by first checking information_schema (or using the same
pattern as the ModifyUsedIpVOForeignKey procedure) to detect and only DROP the
constraint when present, and similarly only ADD each FOREIGN KEY when it does
not already exist; reference the BareMetal2InstanceVO table and those constraint
names (fkBareMetal2InstanceVOGatewayVO, fkBareMetal2InstanceVOGatewayVO1) and
mirror the existence-check logic from ModifyUsedIpVOForeignKey to ensure safe
repeated executions of the upgrade script.

core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java (2)

418-435: 错误信息缺少错误码

第 429 行和第 434 行的 operr(...) 调用缺少预定义的错误码，与文件中其他错误处理方式不一致。

♻️ 建议添加错误码

                 try{
                     service.externalConfig(serviceType);
                     completion.success(serviceType);
                 } catch (Exception e) {
-                    completion.fail(operr("failed to apply external service configuration for type [%s]: %s", serviceType, e.getMessage()));
+                    completion.fail(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_XXXXX, "failed to apply external service configuration for type [%s]: %s", serviceType, e.getMessage()));
                 }
                 return;
             }
         }
-        completion.fail(operr("unable to find external service type [%s]", serviceType));
+        completion.fail(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_XXXXX, "unable to find external service type [%s]", serviceType));
     }

请在 CloudOperationsErrorCode 中定义相应的错误码。

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java`
around lines 418 - 435, The oprerr(...) calls in
regenerateExternalServiceConfiguration (class ExternalServiceManagerImpl, method
regenerateExternalServiceConfiguration) are missing explicit error codes; update
both failure paths (the catch block after service.externalConfig(serviceType)
and the final completion.fail for missing service) to call oprerr with a
CloudOperationsErrorCode value (e.g.,
CLOUD_OPERATION_EXTERNAL_SERVICE_APPLY_FAILED and
CLOUD_OPERATION_EXTERNAL_SERVICE_NOT_FOUND), then add those constants to
CloudOperationsErrorCode enum/class with appropriate messages so the
completion.fail carries an error code and message; ensure references to
serviceType remain in the formatted message and keep the surrounding logic
(services map, ExternalService.externalConfig, and completion) unchanged.

259-274: 错误信息缺少错误码

第 263 行的 operr(...) 调用缺少预定义的错误码，与文件中其他错误处理方式不一致（其他地方使用如 ORG_ZSTACK_CORE_EXTERNALSERVICE_10000 等错误码）。

♻️ 建议添加错误码

         if (vo == null) {
-            completion.fail(operr("unable to find external service configuration with uuid [%s]", msg.getUuid()));
+            completion.fail(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_XXXXX, "unable to find external service configuration with uuid [%s]", msg.getUuid()));
             return;
         }

请在 CloudOperationsErrorCode 中定义相应的错误码。

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java`
around lines 259 - 274, The error handling in
ExternalServiceManagerImpl.updateExternalServiceConfiguration uses operr(...)
without a CloudOperationsErrorCode, making it inconsistent; add a new error code
constant (e.g. ORG_ZSTACK_CORE_EXTERNALSERVICE_10001) to
CloudOperationsErrorCode and update the failure call in
updateExternalServiceConfiguration to use
operr(CloudOperationsErrorCode.ORG_ZSTACK_CORE_EXTERNALSERVICE_10001, "unable to
find external service configuration with uuid [%s]", msg.getUuid()); ensure the
new enum member follows the existing naming/numbering convention and
compile-time imports are adjusted so ExternalServiceManagerImpl can reference
CloudOperationsErrorCode.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java`:
- Around line 369-377: The code may NPE when msg.getSystemTags() is null because
addAll(resolvedTags) is outside the null check; inside VmInstanceApiInterceptor
ensure system tags are initialized before adding: check msg.getSystemTags() and
if null call msg.setSystemTags(new ArrayList<>()) (or otherwise create an empty
list), then perform removeIf and addAll using msg.getSystemTags() and
resolvedTags so addAll never runs on a null list.
- Around line 409-426: validateDnsAddresses currently validates entries without
trimming, so values like " 1.2.3.4 " fail; update validateDnsAddresses to first
map dnsAddresses to trimmed strings (e.g., dnsAddresses =
dnsAddresses.stream().map(String::trim).collect...) and remove any empty strings
produced by trimming before performing the size check and IPv4/IPv6 checks; use
the same symbols (validateDnsAddresses, dnsAddresses,
NetworkUtils.isIpv4Address, IPv6NetworkUtils.isIpv6Address,
VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER) and keep the same error behavior
(throw ApiMessageInterceptionException with existing error codes) after
trimming/filtering.

In `@core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java`:
- Around line 62-91: In the anonymous ILogger implementation passed to
options.setLogger in SentryInitHelper, guard against null and formatting
exceptions: check if args is null before accessing args.length, and wrap
String.format(...) in a try/catch to fall back to a safe message if formatting
fails; likewise, change the logging for SentryLevel.ERROR to use
logger.error(...) (not logger.warn) so error-level events are not downgraded;
apply these fixes in the three logger methods
(log(SentryLevel,String,Object...), log(SentryLevel,String,Throwable), and
log(SentryLevel,Throwable,String,Object...)) to ensure no NPEs or format
exceptions and correct severity mapping.

In `@network/src/main/java/org/zstack/network/l3/IpRangeHelper.java`:
- Around line 303-304: isIpOutsideL3NetworkCidr currently just negates
isIpInL3NetworkCidr which causes it to return true for L3s with no
NormalIpRangeVO; update isIpOutsideL3NetworkCidr to first check for any
NormalIpRangeVO for the given l3Uuid and return false when none exist, otherwise
return the negation of isIpInL3NetworkCidr(ip, l3Uuid); reference the methods
isIpOutsideL3NetworkCidr and isIpInL3NetworkCidr and the NormalIpRangeVO query
used elsewhere in this class to locate and implement the guard.

In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java`:
- Around line 1126-1130: The current removal of reserved IPs from
freeIpInventorys in L3BasicNetwork (using reservedIpRanges and
NetworkUtils.isInRange) can cause the returned list to be shorter than
msg.getLimit() because you simply remove matches instead of filling the result
to the requested limit; change the logic so reserved IP filtering happens during
accumulation (skip adding any freeIp that matches reservedIpRanges) or, if you
must filter after collection, continue iterating the candidate pool to append
additional non-reserved freeIp entries until you reach msg.getLimit(); update
the code that populates freeIpInventorys (the accumulation loop that feeds
freeIpInventorys and any caller that relies on msg.getLimit()) to enforce this
behavior.

In `@network/src/main/java/org/zstack/network/service/DhcpExtension.java`:
- Around line 347-350: The current early filter in DhcpExtension (checking
ip.getIpRangeUuid() == null and continuing) incorrectly excludes NICs whose
ipRangeUuid is null but whose IP still falls inside the L3 CIDR and should be
processed by makeDhcpStruct(); change the logic to only skip when the IP is
truly outside the L3 CIDR (i.e., ip.getIpRangeUuid() == null AND the IP is not
within the L3 network/CIDR), otherwise keep the NIC so makeDhcpStruct() can
generate DHCP entries; update the condition around ip.getIpRangeUuid() in the
loop (the block that currently does continue) to verify CIDR inclusion before
skipping.

In `@plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java`:
- Around line 421-427: The stream filter currently calls
IpRangeHelper.isIpInL3NetworkCidr(...) inside the loop (causing N+1 DB queries);
instead collect all L3 network UUIDs from ret's nics' UsedIp objects, perform a
single batch query to load each L3 network's CIDR(s) into a Map<String,String>
(keyed by l3NetworkUuid) and then replace the in-loop call with a local lookup
against that map (e.g., lookup cidr = l3CidrMap.get(ip.getL3NetworkUuid()) and
test the IP against the cached CIDR via IpRangeHelper or a local CIDR check), so
EipManagerImpl's nic filtering uses the cached CIDRs and avoids per-UsedIp DB
calls.

In `@plugin/expon/src/main/java/org/zstack/expon/ExponStorageController.java`:
- Around line 125-129: In ExponStorageController, validate and normalize the
extracted scheme before assigning to clientConfig.scheme and before any
HttpUrl.Builder.scheme(String) usage: read uri.getScheme(), convert to
lower-case, treat null/empty as "https", accept only "http" or "https" and
assign that normalized value to clientConfig.scheme; if the scheme is any other
value (e.g., "ftp"), throw a clear, user-friendly exception (or return a
validation error) instead of passing it through, so downstream
HttpUrl.Builder.scheme(String) is never given an unsupported protocol and we
avoid leaking raw IllegalArgumentException details.

In `@plugin/iscsi/src/main/java/org/zstack/iscsi/kvm/KvmIscsiNodeServer.java`:
- Around line 247-257: The code may NPE when
ExternalHostIdGetter.getOrAllocateHostIdRef(...) returns null; in
KvmIscsiNodeServer where you create KvmSetupSelfFencerCmd and assign ref (call
to ExternalHostIdGetter.getOrAllocateHostIdRef), add a null/invalid check (if
ref == null || ref.getHostId() == 0) immediately after obtaining ref and call
trigger.fail(...) with an appropriate message and return to stop further
processing; mirror the null-check pattern used in ExternalPrimaryStorage and
KvmCbdNodeServer to avoid dereferencing ref when setting cmd.hostId.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java`:
- Around line 5823-5831: The TLS SAN list is built from HostSystemTags.EXTRA_IPS
which may not be populated yet (refresh happens in
createCollectHostFactsFlow()/saveHostExtraIps()), causing first-time connections
to miss migration IPs; change KVMHost code that sets
deployArguments.setTlsCertIps(...) to use the IPs discovered during the current
probe/host-facts (the in-memory detected extra IP list from the host probe) or
move the host facts collection earlier so you combine
getSelf().getManagementIp() with the freshly detected extra IPs (instead of
relying on HostSystemTags.EXTRA_IPS) before calling
deployArguments.setTlsCertIps(...) so the generated cert SANs include migration
network IPs on first connect.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KvmVmSyncPingTask.java`:
- Around line 463-477: The nodeLeft flow removes entries from vmApis and
vmsToSkip which causes a possible NPE in the VM_CONTINUE_TRACE_PATH callback
(lines ~150-159) where it does vmsToSkip.get(managementNodeId).remove(...); make
that callback null-safe by checking that vmsToSkip.get(managementNodeId) is
non-null before calling remove (or use a safe lookup + conditional remove), and
when you successfully obtain the vmUuid in that callback also remove any
orphaned entry from orphanedSkipVms for that vmUuid so orphan entries are
cleaned up promptly; keep all changes local to the VM_CONTINUE_TRACE_PATH
handler and reference the nodeLeft-related maps vmApis, vmsToSkip, and
orphanedSkipVms.

In
`@plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java`:
- Around line 641-645: The stream filter in LoadBalancerBase that calls
IpRangeHelper.isIpInL3NetworkCidr per NIC creates an N+1 query risk (seen where
nicVOS is filtered and again at the other site); replace per-NIC calls with a
shared helper that accepts a collection of L3 network UUIDs, does a single bulk
fetch of L3 CIDR(s) (via the appropriate DAO or batch IpRangeHelper API), builds
an in-memory map of l3Uuid -> CIDR(s), and then performs IP-in-CIDR checks
against that map to filter nicVOS; implement this new method (e.g.,
ensureIpsInL3Cidrs(Collection<NetworkNicVO> nics) or similar) in
LoadBalancerBase and call it from both places instead of calling
IpRangeHelper.isIpInL3NetworkCidr for each NIC.

In
`@plugin/portForwarding/src/main/java/org/zstack/network/service/portforwarding/PortForwardingApiInterceptor.java`:
- Around line 153-162: Move the "IP outside L3 CIDR" validation block in
PortForwardingApiInterceptor so it runs before the early-return that handles the
attach path; specifically, place the
UsedIpVO/IpRangeHelper.isIpOutsideL3NetworkCidr check (the block using
dbf.findByUuid(nicUuid) and throwing ApiMessageInterceptionException) ahead of
the return that short-circuits handling for APIAttachPortForwardingRuleMsg (the
early return around the attach path), so the CIDR check executes for attach
requests as well.

In
`@plugin/portForwarding/src/main/java/org/zstack/network/service/portforwarding/PortForwardingManagerImpl.java`:
- Around line 369-378: The per-NIC call to IpRangeHelper.isIpInL3NetworkCidr in
PortForwardingManagerImpl (after filterVmNicByIpVersion) causes an N+1 query;
instead, batch-fetch the L3 network CIDRs for all distinct l3NetworkUuid values
using l3Mgr (or the L3 network DAO), build a map of l3Uuid->CIDR(s) in memory,
then replace the stream filter so each VmNicInventory checks its IP against the
preloaded CIDR(s) using IpRangeHelper (or a pure in-memory helper) without
additional DB calls; update the logic around filtered =
filtered.stream().filter(...) to use the map and avoid per-nic queries.

In `@plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsStorageController.java`:
- Around line 182-185: HttpCaller.doSyncCall currently only retries when the
returned response has ret.isSuccess()==false, but RESTFacadeImpl.syncRawJson
throws OperationFailureException on timeouts/connection errors which bypass that
check; update HttpCaller.doSyncCall to catch OperationFailureException (and
related IO/connection timeout exceptions) around the syncRawJson call, treat
these exceptions as a failed call that honors setTryNext(true), log the
exception, and proceed to the next MDS endpoint before ultimately rethrowing if
all endpoints fail so timeouts/connection failures trigger the same MDS-switch
retry logic as unsuccessful responses.

In
`@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java`:
- Around line 381-390: The nodeDie() path is calling nodeLifeCycle.nodeLeft(...)
outside the lifecycleLock, causing a race where a stale nodeLeft can run
concurrently with a later NodeJoin handling; move the local nodeLeft call and
its event dispatch inside the same synchronized(lifecycleLock) block used by the
NodeJoin/NodeLeft canonical callback so DB/hash-ring reconciliation and callback
handling are serialized. Concretely, update the nodeDie() method to acquire
lifecycleLock before invoking nodeLifeCycle.nodeLeft(...) and any related
suspectedMissingFromDb updates or event dispatch logic (mirror the pattern used
around nodeLifeCycle.nodeJoin(...) and the existing NodeLeft branch) so both
canonical callbacks and fencing-generated nodeLefts use the same lock. Ensure
the same change is applied for the other referenced region (lines around the
second occurrence noted, e.g., the block covering 877-925) so all nodeLeft paths
are synchronized on lifecycleLock.
- Around line 890-898: The heartbeat thread in ManagementNodeManagerImpl must
not call destinationMaker.getNodeInfo (which triggers
ResourceDestinationMakerImpl.getNodeInfo -> dbf.findByUuid) because this
bypasses heartBeatDBSource; change the code around nodeLifeCycle.nodeLeft(inv)
to fetch hostname only from the in-memory joinedManagementNodes cache (use the
joinedManagementNodes map/structure available in ManagementNodeManagerImpl) and,
if there's no cached entry for nodeUuid, leave inv.hostName empty instead of
calling destinationMaker; remove the try/catch that calls
destinationMaker.getNodeInfo to avoid hitting the ordinary DB path.

In `@rest/src/main/java/org/zstack/rest/RestServer.java`:
- Around line 411-416: Deep-copy the ErrorCode before localizing to avoid
mutating shared cached objects: in callWebHook, sendResponse, and
sendReplyResponse create a deep clone of the ErrorCode (clone/copy the result of
evt.getError, response.getError, reply.getError), pass that clone to
i18nService.localizeErrorCode(..., locale), then set the localized clone back
onto the event/response/reply (instead of localizing the original). Ensure the
copy is a full deep copy (message, cause, details, children) so no shared
references remain.

---

Outside diff comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java`:
- Around line 1280-1282: The static DNS is being written after
VmApplyNetworkServiceOnChangeIPFlow finishes so a Running VM won't get the DNS
in this request; modify the change-IP path in VmInstanceBase so that setting the
STATIC_DNS tag (what setStaticDns does) is done inside
VmApplyNetworkServiceOnChangeIPFlow before the apply step (mirroring
changeVmNicNetwork -> SetStaticIp.setDns behavior) and add a rollback that
removes/restores the previous STATIC_DNS tag on flow failure; update references
to NetworkServiceManagerImpl.getVmNicDns to continue reading the tag and ensure
both the primary branch here and the other occurrences around the specified
blocks (3707-3717, 3763-3766) apply the same change.
- Around line 1088-1110: The fast-path in changeVmIp currently returns success
when the requested IPs equal the NIC's IPs, which discards overrideInfo
(netmask/gateway/ipv6Prefix/ipv6Gateway) so updates that only change those
fields never reach AllocateIpMsg; modify the check in changeVmIp (and the
analogous checks around the blocks that handle AllocateIpMsg at lines
referenced) to compare not only IPs (targetNic.getUsedIps() /
staticIpMap.values()) but also the corresponding override fields from
overrideInfo (e.g., netmask, gateway, ipv6Prefix, ipv6Gateway) against the NIC’s
current values (use fillIpv4Parameters/fillIpv6Parameters semantics to detect
when user supplied values are present); only short-circuit and call
completion.success() when both IPs and all provided override fields are
unchanged, otherwise proceed so setIpamStaticIp and AllocateIpMsg receive and
persist the overrideInfo.
- Around line 143-179: The checkState(...) method in VmInstanceBase currently
only handles Running/Stopped/Paused and throws on any other string; add a branch
to handle the hypervisor "NoState" response so it doesn't escalate to a
CloudRuntimeException: in VmInstanceBase.checkState, after retrieving state
(String state = r.getStates().get(self.getUuid())), add an else-if for
VmInstanceState.NoState.toString() that calls
changeVmStateInDb(VmInstanceStateEvent.unknown) and completes the
NoErrorCompletion (completion.done()), mirroring the existing null/error
handling paths so stopVm/rebootVm/destroy/migrate convergence won't be
interrupted.

In `@compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java`:
- Around line 56-64: In afterAddIpAddress, don't update UsedIpVO before
verifying the NIC exists: first fetch VmNicVO via
Q.New(VmNicVO.class).eq(VmNicVO_.uuid, vmNicUUid).find() and if it is null log
and return, and only after that call SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid,
usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update(); this prevents setting
vmNicUuid to an invalid reference.

In `@conf/tools/install.sh`:
- Around line 143-151: The current install logic only creates the venv at
UI_VIRENV_PATH when the directory is missing, so upgrades can keep an old
Python2 venv; call the shared ensure_python3_venv routine (or perform a version
check on UI_VIRENV_PATH and recreate it if its python is not 3.11) before
running pip installs. Specifically, replace the "if [ ! -d \"$UI_VIRENV_PATH\"
]" creation-only flow by invoking ensure_python3_venv (or checking the venv's
python binary via UI_VIRENV_PATH/bin/python and recreating the venv if its
version != 3.11) so the zstack-dashboard venv is guaranteed to use Python 3.11.

In
`@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java`:
- Around line 80-92: In getNodeInfo(String nodeUuid) the code stores new
NodeInfo via nodes.put(...) but returns the put() result (the previous value),
so on cache miss it returns null; fix by constructing the new NodeInfo (e.g. new
NodeInfo(vo)), assign it to a local variable, put that instance into nodes
(nodes.put(nodeUuid, newInfo) or putIfAbsent as appropriate), add nodeUuid to
nodeHash, and return the newInfo; update the method so it always returns the
actual NodeInfo instance (reference getNodeInfo) instead of the put() return
value.

---

Nitpick comments:
In `@compute/src/main/java/org/zstack/compute/host/HostBase.java`:
- Around line 1443-1446: The getDeduplicateString() method currently returns
"connect-host-unknown" when self is null, which can cause unrelated requests to
collide; update the logic in HostBase.getDeduplicateString() to avoid a fixed
"unknown" key: either fail fast with a logged error if self == null (treating it
as an invalid state), or build the dedupe key from a more specific identifier
such as the incoming message's host UUID (e.g., use msg.getHostUuid() when
available) so concurrent null-self cases produce distinct keys; ensure to
reference self and msg in the chosen branch and keep the log/context clear.

In `@conf/db/upgrade/V5.5.12__schema.sql`:
- Around line 112-124: The ALTER TABLE drops for BareMetal2InstanceVO currently
assume the foreign keys fkBareMetal2InstanceVOGatewayVO and
fkBareMetal2InstanceVOGatewayVO1 exist; make the operation idempotent by first
checking information_schema (or using the same pattern as the
ModifyUsedIpVOForeignKey procedure) to detect and only DROP the constraint when
present, and similarly only ADD each FOREIGN KEY when it does not already exist;
reference the BareMetal2InstanceVO table and those constraint names
(fkBareMetal2InstanceVOGatewayVO, fkBareMetal2InstanceVOGatewayVO1) and mirror
the existence-check logic from ModifyUsedIpVOForeignKey to ensure safe repeated
executions of the upgrade script.

In `@core/src/main/java/org/zstack/core/db/DatabaseFacadeImpl.java`:
- Around line 909-921: The uninstallEntityLifeCycleCallback currently silently
returns when clz != null but entityInfoMap.get(clz) yields null; add a warning
log in that branch to mirror installEntityLifeCycleCallback's assert behavior
for visibility: in DatabaseFacadeImpl.uninstallEntityLifeCycleCallback, when clz
!= null and EntityInfo info == null, call the logger (e.g., use existing logger
in this class) to log a warning that no EntityInfo was found for the given class
before returning; keep the tolerant behavior (no exception) but emit the warning
so missing entries are discoverable.

In `@core/src/main/java/org/zstack/core/db/UpdateQueryImpl.java`:
- Around line 55-59: The current null-safe check in UpdateQueryImpl.condAnd
correctly avoids NPE but the error message uses val.getClass() which prints full
class name with "class" prefix; change the message to use
val.getClass().getSimpleName() (still guarding for null) so the thrown
CloudRuntimeException shows a cleaner type name; update the String.format call
in the condAnd method to use getSimpleName() when val != null.

In
`@core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java`:
- Around line 418-435: The oprerr(...) calls in
regenerateExternalServiceConfiguration (class ExternalServiceManagerImpl, method
regenerateExternalServiceConfiguration) are missing explicit error codes; update
both failure paths (the catch block after service.externalConfig(serviceType)
and the final completion.fail for missing service) to call oprerr with a
CloudOperationsErrorCode value (e.g.,
CLOUD_OPERATION_EXTERNAL_SERVICE_APPLY_FAILED and
CLOUD_OPERATION_EXTERNAL_SERVICE_NOT_FOUND), then add those constants to
CloudOperationsErrorCode enum/class with appropriate messages so the
completion.fail carries an error code and message; ensure references to
serviceType remain in the formatted message and keep the surrounding logic
(services map, ExternalService.externalConfig, and completion) unchanged.
- Around line 259-274: The error handling in
ExternalServiceManagerImpl.updateExternalServiceConfiguration uses operr(...)
without a CloudOperationsErrorCode, making it inconsistent; add a new error code
constant (e.g. ORG_ZSTACK_CORE_EXTERNALSERVICE_10001) to
CloudOperationsErrorCode and update the failure call in
updateExternalServiceConfiguration to use
operr(CloudOperationsErrorCode.ORG_ZSTACK_CORE_EXTERNALSERVICE_10001, "unable to
find external service configuration with uuid [%s]", msg.getUuid()); ensure the
new enum member follows the existing naming/numbering convention and
compile-time imports are adjusted so ExternalServiceManagerImpl can reference
CloudOperationsErrorCode.

In `@core/src/main/java/org/zstack/core/Platform.java`:
- Around line 986-1007: The code currently imports
org.apache.commons.lang.LocaleUtils which conflicts with
org.zstack.core.errorcode.LocaleUtils used in Platform, making the
fully-qualified reference org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE
verbose; fix by removing the conflicting import and either add a static import
for org.zstack.core.errorcode.LocaleUtils.DEFAULT_LOCALE and call DEFAULT_LOCALE
directly in Platform.localizeErrorCode(result, DEFAULT_LOCALE), or keep the
Apache import and instead fully-qualify the Apache Commons usages elsewhere so
the Platform class can use a simple static import; update imports accordingly
and ensure the call in Platform (localizeErrorCode) uses the unambiguous
DEFAULT_LOCALE symbol.

In `@core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java`:
- Line 58: 在 SentryInitHelper 中的 options.setMaxQueueSize(1000) 使用了魔法值
1000；请将该硬编码值替换为可配置或有意义命名的常量，例如从 TelemetryGlobalProperty（新增属性如
sentryMaxQueueSize）读取或定义类常量（e.g. DEFAULT_SENTRY_MAX_QUEUE_SIZE），并在
SentryInitHelper 中用该常量/属性替代原有的 1000，确保有默认值、配置注入或静态常量注释以便后续调优和维护。

In
`@header/src/main/java/org/zstack/header/core/external/service/APIAddExternalServiceConfigurationMsg.java`:
- Around line 53-57: The audit() in APIAddExternalServiceConfigurationMsg should
avoid a potential NPE when rsp.isSuccess() is true but evt or evt.getInventory()
is null; update the method to cast rsp to
APIAddExternalServiceConfigurationEvent (evt) then defensively check evt != null
and evt.getInventory() != null before calling evt.getInventory().getUuid(),
returning an empty string when either is null while still constructing the
Result with ExternalServiceConfigurationVO.class.

In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`:
- Around line 80-89: desensitizeUrlList currently assumes each list element can
be stringified safely; if elements are nested Maps/Lists (e.g., mdsUrls/mdsInfos
entries with credential fields) toString() will not properly desensitize
secrets. Update desensitizeUrlList (and reuse desensitizeUrl) to type-check each
element: if element is String call desensitizeUrl and add result; if element is
Map iterate its entries and desensitize any String values (or recursively call a
helper to handle nested Map/List), if element is List process recursively, and
for other types either skip or convert safely; finally put the processed list
back into config under the same key. Ensure helper names (desensitizeUrlList,
desensitizeUrl) are used so changes are easy to find.

In `@header/src/main/java/org/zstack/header/vm/VmAbnormalLifeCycleStruct.java`:
- Around line 37-44: Replace direct .equals() comparisons of host UUIDs with
null-safe Objects.equals() in the VmAbnormalLifeCycleStruct matching logic;
specifically update the match(...) implementation in the
VmPausedFromUnknownStateHostNotChanged enum constant (and other enum constants
inside VmAbnormalLifeCycleStruct that compare
getCurrentHostUuid()/getOriginalHostUuid()) to call
Objects.equals(struct.getCurrentHostUuid(), struct.getOriginalHostUuid())
instead of struct.getCurrentHostUuid().equals(struct.getOriginalHostUuid()) so
comparisons are consistent and null-safe.

In `@plugin/iscsi/src/main/java/org/zstack/iscsi/kvm/KvmIscsiNodeServer.java`:
- Around line 247-248: Replace the magic number 999 passed to
ExternalHostIdGetter in the call to getOrAllocateHostIdRef with a named
constant; define a descriptive constant (e.g., DEFAULT_EXTERNAL_HOST_ID or
EXTERNAL_HOST_ID_BASE) as a private static final field in KvmIscsiNodeServer (or
as a final local variable inside the method if scope-local) and use that
constant in place of 999 when constructing
ExternalHostIdGetter(param.getHostUuid(), param.getPrimaryStorage().getUuid())
so the purpose and bounds of the value are clear.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java`:
- Around line 470-515: The initLibvirtTlsCA method currently calls
ShellUtils.run(...) for openssl and file-permission commands but does not check
return values; update initLibvirtTlsCA to capture the integer return from each
ShellUtils.run call (especially the openssl genrsa and openssl req calls and
chown/chmod invocations), verify success (zero) before proceeding, and handle
failures by logging an error and aborting the CA initialization (e.g., throw or
return early) so subsequent FileUtils.readFileToString and
JsonLabel.createIfAbsent calls for caFile and keyFile are not executed with
missing/invalid files; ensure any cleanup or descriptive
logger.warn/logger.error includes the failing command and its exit code.
- Around line 498-507: The FileUtils usages around caFile/keyFile should be
updated to the non-deprecated Charset-aware overloads: when reading use
FileUtils.readFileToString(caFile, StandardCharsets.UTF_8) and
FileUtils.readFileToString(keyFile, StandardCharsets.UTF_8), and when writing
use FileUtils.writeStringToFile(caFile, caInv.getLabelValue(),
StandardCharsets.UTF_8) and FileUtils.writeStringToFile(keyFile,
keyInv.getLabelValue(), StandardCharsets.UTF_8); make these changes in the block
that reads into ca/key, calls new JsonLabel().createIfAbsent(LIBVIRT_TLS_CA_KEY,
ca) and createIfAbsent(LIBVIRT_TLS_PRIVATE_KEY, key), and then overwrites the
local files so encoding is explicit and deprecation warnings are removed.

In `@rest/src/main/resources/scripts/GoTestTemplate.groovy`:
- Line 31: The comment numbering is duplicated: change the comment string "// 2.
Per-resource test files" to "// 3. Per-resource test files" so it follows the
earlier "// 2." comment; update the comment text near the existing "// 2.
Per-resource test files" occurrence in GoTestTemplate.groovy (look for that
exact comment) to use "// 3." instead.

coderabbitai · 2026-03-20T08:47:35Z

compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java

+        // Remove any existing netmask/gateway/prefix tags, then add resolved ones
+        if (msg.getSystemTags() != null) {
+            msg.getSystemTags().removeIf(tag ->
+                    VmSystemTags.IPV4_NETMASK.isMatch(tag) || VmSystemTags.IPV4_GATEWAY.isMatch(tag)
+                            || VmSystemTags.IPV6_PREFIX.isMatch(tag) || VmSystemTags.IPV6_GATEWAY.isMatch(tag));
+        }
+        if (!resolvedTags.isEmpty()) {
+            msg.getSystemTags().addAll(resolvedTags);
+        }


⚠️ Potential issue | 🟡 Minor

潜在的空指针异常风险

当 msg.getSystemTags() 为 null 时，第 376 行的 msg.getSystemTags().addAll(resolvedTags) 会抛出 NPE。虽然第 370 行检查了非空，但第 376 行的调用在该检查块之外。

🐛 建议修复

// Remove any existing netmask/gateway/prefix tags, then add resolved ones if (msg.getSystemTags() != null) { msg.getSystemTags().removeIf(tag -> VmSystemTags.IPV4_NETMASK.isMatch(tag) || VmSystemTags.IPV4_GATEWAY.isMatch(tag) || VmSystemTags.IPV6_PREFIX.isMatch(tag) || VmSystemTags.IPV6_GATEWAY.isMatch(tag)); + if (!resolvedTags.isEmpty()) { + msg.getSystemTags().addAll(resolvedTags); + } + } else if (!resolvedTags.isEmpty()) { + msg.setSystemTags(new ArrayList<>(resolvedTags)); } - if (!resolvedTags.isEmpty()) { - msg.getSystemTags().addAll(resolvedTags); - }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java` around lines 369 - 377, The code may NPE when msg.getSystemTags() is null because addAll(resolvedTags) is outside the null check; inside VmInstanceApiInterceptor ensure system tags are initialized before adding: check msg.getSystemTags() and if null call msg.setSystemTags(new ArrayList<>()) (or otherwise create an empty list), then perform removeIf and addAll using msg.getSystemTags() and resolvedTags so addAll never runs on a null list.

coderabbitai · 2026-03-20T08:47:35Z

compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java

+    private void validateDnsAddresses(List<String> dnsAddresses) {
+        if (dnsAddresses == null || dnsAddresses.isEmpty()) {
+            return;
        }

-        final Map<String, NicIpAddressInfo> nicNetworkInfo = new StaticIpOperator().getNicNetworkInfoBySystemTag(msg.getSystemTags());
-        NicIpAddressInfo nicIpAddressInfo = nicNetworkInfo.get(msg.getDestL3NetworkUuid());
-        if (nicIpAddressInfo != null) {
-            if (!nicIpAddressInfo.ipv4Address.isEmpty() && Q.New(UsedIpVO.class).eq(UsedIpVO_.ip, nicIpAddressInfo.ipv4Address).eq(UsedIpVO_.l3NetworkUuid, msg.getDestL3NetworkUuid()).isExists()) {
-                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_VM_10109, "the static IP[%s] has been occupied on the L3 network[uuid:%s]", nicIpAddressInfo.ipv4Address, msg.getDestL3NetworkUuid()));
-            }
-            if (!nicIpAddressInfo.ipv6Address.isEmpty() && Q.New(UsedIpVO.class).eq(UsedIpVO_.ip, IPv6NetworkUtils.getIpv6AddressCanonicalString(nicIpAddressInfo.ipv6Address)).eq(UsedIpVO_.l3NetworkUuid, msg.getDestL3NetworkUuid()).isExists()) {
-                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_VM_10110, "the static IP[%s] has been occupied on the L3 network[uuid:%s]", nicIpAddressInfo.ipv6Address, msg.getDestL3NetworkUuid()));
+        if (dnsAddresses.size() > VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER) {
+            throw new ApiMessageInterceptionException(argerr(
+                    ORG_ZSTACK_COMPUTE_VM_10321, "at most %d DNS addresses are allowed, but got %d",
+                    VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER, dnsAddresses.size()));
+        }
+
+        for (String dns : dnsAddresses) {
+            if (!NetworkUtils.isIpv4Address(dns) && !IPv6NetworkUtils.isIpv6Address(dns)) {
+                throw new ApiMessageInterceptionException(argerr(
+                        ORG_ZSTACK_COMPUTE_VM_10322, "invalid DNS address[%s], must be a valid IPv4 or IPv6 address", dns));
            }
        }
    }


⚠️ Potential issue | 🟡 Minor

DNS 地址应在校验前进行 trim 处理

根据编码规范，来自 Message 的参数应进行 trim 处理，因为用户可能在浏览器复制粘贴的数据带有空格或换行符。当前实现直接校验未经 trim 的 DNS 地址，可能导致有效地址被误判为无效。

🛡️ 建议修复

private void validateDnsAddresses(List<String> dnsAddresses) { if (dnsAddresses == null || dnsAddresses.isEmpty()) { return; } if (dnsAddresses.size() > VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER) { throw new ApiMessageInterceptionException(argerr( ORG_ZSTACK_COMPUTE_VM_10321, "at most %d DNS addresses are allowed, but got %d", VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER, dnsAddresses.size())); } for (String dns : dnsAddresses) { - if (!NetworkUtils.isIpv4Address(dns) && !IPv6NetworkUtils.isIpv6Address(dns)) { + String trimmedDns = dns != null ? dns.trim() : null; + if (trimmedDns == null || trimmedDns.isEmpty() || + (!NetworkUtils.isIpv4Address(trimmedDns) && !IPv6NetworkUtils.isIpv6Address(trimmedDns))) { throw new ApiMessageInterceptionException(argerr( ORG_ZSTACK_COMPUTE_VM_10322, "invalid DNS address[%s], must be a valid IPv4 or IPv6 address", dns)); } } }

As per coding guidelines: "注意检查来自 Message 的参数是否做过 trim，用户可能在浏览器上复制粘贴的数据带有空格、换行符等"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

private void validateDnsAddresses(List<String> dnsAddresses) {

if (dnsAddresses == null || dnsAddresses.isEmpty()) {

return;

}

final Map<String, NicIpAddressInfo> nicNetworkInfo = new StaticIpOperator().getNicNetworkInfoBySystemTag(msg.getSystemTags());

NicIpAddressInfo nicIpAddressInfo = nicNetworkInfo.get(msg.getDestL3NetworkUuid());

if (nicIpAddressInfo != null) {

if (!nicIpAddressInfo.ipv4Address.isEmpty() && Q.New(UsedIpVO.class).eq(UsedIpVO_.ip, nicIpAddressInfo.ipv4Address).eq(UsedIpVO_.l3NetworkUuid, msg.getDestL3NetworkUuid()).isExists()) {

throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_VM_10109, "the static IP[%s] has been occupied on the L3 network[uuid:%s]", nicIpAddressInfo.ipv4Address, msg.getDestL3NetworkUuid()));

}

if (!nicIpAddressInfo.ipv6Address.isEmpty() && Q.New(UsedIpVO.class).eq(UsedIpVO_.ip, IPv6NetworkUtils.getIpv6AddressCanonicalString(nicIpAddressInfo.ipv6Address)).eq(UsedIpVO_.l3NetworkUuid, msg.getDestL3NetworkUuid()).isExists()) {

throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_VM_10110, "the static IP[%s] has been occupied on the L3 network[uuid:%s]", nicIpAddressInfo.ipv6Address, msg.getDestL3NetworkUuid()));

if (dnsAddresses.size() > VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER) {

throw new ApiMessageInterceptionException(argerr(

ORG_ZSTACK_COMPUTE_VM_10321, "at most %d DNS addresses are allowed, but got %d",

VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER, dnsAddresses.size()));

}

for (String dns : dnsAddresses) {

if (!NetworkUtils.isIpv4Address(dns) && !IPv6NetworkUtils.isIpv6Address(dns)) {

throw new ApiMessageInterceptionException(argerr(

ORG_ZSTACK_COMPUTE_VM_10322, "invalid DNS address[%s], must be a valid IPv4 or IPv6 address", dns));

}

}

}

private void validateDnsAddresses(List<String> dnsAddresses) {

if (dnsAddresses == null || dnsAddresses.isEmpty()) {

return;

}

if (dnsAddresses.size() > VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER) {

throw new ApiMessageInterceptionException(argerr(

ORG_ZSTACK_COMPUTE_VM_10321, "at most %d DNS addresses are allowed, but got %d",

VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER, dnsAddresses.size()));

}

for (String dns : dnsAddresses) {

String trimmedDns = dns != null ? dns.trim() : null;

if (trimmedDns == null || trimmedDns.isEmpty() ||

(!NetworkUtils.isIpv4Address(trimmedDns) && !IPv6NetworkUtils.isIpv6Address(trimmedDns))) {

throw new ApiMessageInterceptionException(argerr(

ORG_ZSTACK_COMPUTE_VM_10322, "invalid DNS address[%s], must be a valid IPv4 or IPv6 address", dns));

}

}

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java` around lines 409 - 426, validateDnsAddresses currently validates entries without trimming, so values like " 1.2.3.4 " fail; update validateDnsAddresses to first map dnsAddresses to trimmed strings (e.g., dnsAddresses = dnsAddresses.stream().map(String::trim).collect...) and remove any empty strings produced by trimming before performing the size check and IPv4/IPv6 checks; use the same symbols (validateDnsAddresses, dnsAddresses, NetworkUtils.isIpv4Address, IPv6NetworkUtils.isIpv6Address, VmInstanceConstant.MAXIMUM_NIC_DNS_NUMBER) and keep the same error behavior (throw ApiMessageInterceptionException with existing error codes) after trimming/filtering.

coderabbitai · 2026-03-20T08:47:35Z

core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java

+                        options.setLogger(new io.sentry.ILogger() {
+                            @Override
+                            public void log(SentryLevel level, String message, Object... args) {
+                                String formatted = args.length > 0 ? String.format(message, args) : message;
+                                switch (level) {
+                                    case ERROR: logger.warn("[Sentry] " + formatted); break;
+                                    case WARNING: logger.warn("[Sentry] " + formatted); break;
+                                    default: logger.info("[Sentry] " + formatted); break;
+                                }
+                            }
+
+                            @Override
+                            public void log(SentryLevel level, String message, Throwable throwable) {
+                                switch (level) {
+                                    case ERROR: logger.warn("[Sentry] " + message, throwable); break;
+                                    case WARNING: logger.warn("[Sentry] " + message, throwable); break;
+                                    default: logger.info("[Sentry] " + message, throwable); break;
+                                }
+                            }
+
+                            @Override
+                            public void log(SentryLevel level, Throwable throwable, String message, Object... args) {
+                                String formatted = args.length > 0 ? String.format(message, args) : message;
+                                log(level, formatted, throwable);
+                            }
+
+                            @Override
+                            public boolean isEnabled(SentryLevel level) {
+                                return true;
+                            }


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash fd 'pom.xml|build.gradle|build.gradle.kts|gradle.properties' -t f \ | xargs -r rg -n 'io\.sentry|sentry'

Repository: MatheMatrix/zstack

Length of output: 1821

🏁 Script executed:

# Find and read the SentryInitHelper.java file find . -name "SentryInitHelper.java" -type f

Repository: MatheMatrix/zstack

Length of output: 130

🏁 Script executed:

# Read the file to understand the actual implementation cat -n core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java | sed -n '50,100p'

Repository: MatheMatrix/zstack

Length of output: 3276

🏁 Script executed:

# Check if there are any other custom ILogger implementations in the codebase for comparison rg "ILogger" --type java -B 2 -A 10

Repository: MatheMatrix/zstack

Length of output: 1755

🌐 Web query:

Sentry Java 8.16 ILogger interface contract message args null safety

💡 Result:

In Sentry Java SDK 8.16.0, the io.sentry.ILogger contract is (per its Javadoc annotations):

level: @NotNull → implementations may assume it’s non-null.

message: @NotNull → implementations may assume it’s non-null.

args: @nullable Object... → implementations must be null-safe for the varargs array itself (i.e., callers are allowed to pass null instead of an empty array). The element nullability isn’t specified, so implementations should also tolerate args containing null values.

throwable (where present): @nullable → must be null-safe.

Source: ILogger Javadoc for 8.16.0 shows these nullability annotations on each overload. [1]

Sources

Sentry Java 8.16.0 Javadoc — io.sentry.ILogger: https://javadoc.io/static/io.sentry/sentry/8.16.0/io/sentry/ILogger.html

Sentry 自定义日志回调存在空指针和格式化异常风险，且 ERROR 被降级为 WARN。

根据 Sentry Java 8.16 ILogger 契约，args 参数为 @Nullable，但第 65、84 行直接调用 args.length 而未做空检查，会导致 NPE。此外，String.format() 也可能因格式模板与参数不匹配而抛异常。同时第 67、76 行把 ERROR 级别记录为 warn，弱化故障可观测性。

修复建议

`@Override` public void log(SentryLevel level, String message, Object... args) { - String formatted = args.length > 0 ? String.format(message, args) : message; + String formatted = safeFormat(message, args); switch (level) { - case ERROR: logger.warn("[Sentry] " + formatted); break; + case ERROR: logger.error("[Sentry] " + formatted); break; case WARNING: logger.warn("[Sentry] " + formatted); break; default: logger.info("[Sentry] " + formatted); break; } } `@Override` public void log(SentryLevel level, Throwable throwable, String message, Object... args) { - String formatted = args.length > 0 ? String.format(message, args) : message; + String formatted = safeFormat(message, args); log(level, formatted, throwable); } + + private String safeFormat(String message, Object... args) { + if (args == null || args.length == 0) { + return message; + } + try { + return String.format(message, args); + } catch (Exception e) { + return message; + } + }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/zstack/core/telemetry/SentryInitHelper.java` around lines 62 - 91, In the anonymous ILogger implementation passed to options.setLogger in SentryInitHelper, guard against null and formatting exceptions: check if args is null before accessing args.length, and wrap String.format(...) in a try/catch to fall back to a safe message if formatting fails; likewise, change the logging for SentryLevel.ERROR to use logger.error(...) (not logger.warn) so error-level events are not downgraded; apply these fixes in the three logger methods (log(SentryLevel,String,Object...), log(SentryLevel,String,Throwable), and log(SentryLevel,Throwable,String,Object...)) to ensure no NPEs or format exceptions and correct severity mapping.

coderabbitai · 2026-03-20T08:47:35Z

network/src/main/java/org/zstack/network/l3/IpRangeHelper.java

+    public static boolean isIpOutsideL3NetworkCidr(String ip, String l3Uuid) {
+        return !isIpInL3NetworkCidr(ip, l3Uuid);


⚠️ Potential issue | 🟠 Major

isIpOutsideL3NetworkCidr() 在无 range 的 L3 上返回值反了。

这里直接对 isIpInL3NetworkCidr() 取反，所以当 L3 没有任何 NormalIpRangeVO 时会返回 true。新调用方把这个 helper 当成 “no-range => false” 来用，这会把 no-IP-range/no-IPAM 场景误判成 outside。

💡 建议修正

public static boolean isIpOutsideL3NetworkCidr(String ip, String l3Uuid) { - return !isIpInL3NetworkCidr(ip, l3Uuid); + if (ip == null || l3Uuid == null) { + return false; + } + + Integer ipVersion = null; + if (NetworkUtils.isIpv4Address(ip)) { + ipVersion = IPv6Constants.IPv4; + } else if (IPv6NetworkUtils.isIpv6Address(ip)) { + ipVersion = IPv6Constants.IPv6; + } + + if (ipVersion == null) { + return false; + } + + boolean hasRange = Q.New(NormalIpRangeVO.class) + .eq(NormalIpRangeVO_.l3NetworkUuid, l3Uuid) + .eq(NormalIpRangeVO_.ipVersion, ipVersion) + .isExists(); + return hasRange && !isIpInL3NetworkCidr(ip, l3Uuid); }

Based on learnings "In LoadBalancerApiInterceptor.java (plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerApiInterceptor.java), the outside-CIDR check via IpRangeHelper.isIpOutsideL3NetworkCidr does NOT require a prior hasNormalIpRange guard before being called. Per shixin.ruan, IpRangeHelper.isIpOutsideL3NetworkCidr already handles the no-IP-range case (returns false when no ranges exist), so the additional no-IPAM pre-check is unnecessary."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@network/src/main/java/org/zstack/network/l3/IpRangeHelper.java` around lines 303 - 304, isIpOutsideL3NetworkCidr currently just negates isIpInL3NetworkCidr which causes it to return true for L3s with no NormalIpRangeVO; update isIpOutsideL3NetworkCidr to first check for any NormalIpRangeVO for the given l3Uuid and return false when none exist, otherwise return the negation of isIpInL3NetworkCidr(ip, l3Uuid); reference the methods isIpOutsideL3NetworkCidr and isIpInL3NetworkCidr and the NormalIpRangeVO query used elsewhere in this class to locate and implement the guard.

coderabbitai · 2026-03-20T08:47:35Z

network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java

+        Set<ReservedIpRangeVO> reservedIpRanges = self.getReservedIpRanges();
+        if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) {
+            freeIpInventorys.removeIf(freeIp -> reservedIpRanges.stream().anyMatch(
+                    r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp())));
+        }


⚠️ Potential issue | 🟠 Major

过滤完直接返回，会漏掉后面的可用地址。

这里 removeIf 之后没有继续补齐结果；只要前面收集到的候选里包含 reserved IP，API 就可能返回少于 msg.getLimit() 的地址，甚至空列表，即使后面还有可用 IP。过滤应前移到累加阶段，或者在过滤后继续向后取满。

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java` around lines 1126 - 1130, The current removal of reserved IPs from freeIpInventorys in L3BasicNetwork (using reservedIpRanges and NetworkUtils.isInRange) can cause the returned list to be shorter than msg.getLimit() because you simply remove matches instead of filling the result to the requested limit; change the logic so reserved IP filtering happens during accumulation (skip adding any freeIp that matches reservedIpRanges) or, if you must filter after collection, continue iterating the candidate pool to append additional non-reserved freeIp entries until you reach msg.getLimit(); update the code that populates freeIpInventorys (the accumulation loop that feeds freeIpInventorys and any caller that relies on msg.getLimit()) to enforce this behavior.

coderabbitai · 2026-03-20T08:47:36Z

...rding/src/main/java/org/zstack/network/service/portforwarding/PortForwardingManagerImpl.java

+                List<VmNicInventory> filtered = l3Mgr.filterVmNicByIpVersion(nicInvs, IPv6Constants.IPv4);
+
+                // Filter out NICs whose primary IP is outside L3 CIDR
+                filtered = filtered.stream().filter(nic -> {
+                    String nicIp = nic.getIp();
+                    String nicL3 = nic.getL3NetworkUuid();
+                    return nicIp == null || IpRangeHelper.isIpInL3NetworkCidr(nicIp, nicL3);
+                }).collect(Collectors.toList());
+
+                return filtered;


⚠️ Potential issue | 🟠 Major

逐 NIC 调用 CIDR 检查会产生 N+1 查询，建议改为批量判定。

Line 372-376 里每个 NIC 都调用一次 IpRangeHelper.isIpInL3NetworkCidr，在候选 NIC 较多时会显著拉高查询延迟。

💡建议改法（先批量拉取，再内存过滤）

- filtered = filtered.stream().filter(nic -> { - String nicIp = nic.getIp(); - String nicL3 = nic.getL3NetworkUuid(); - return nicIp == null || IpRangeHelper.isIpInL3NetworkCidr(nicIp, nicL3); - }).collect(Collectors.toList()); + Set<String> l3Uuids = filtered.stream() + .map(VmNicInventory::getL3NetworkUuid) + .filter(Objects::nonNull) + .collect(Collectors.toSet()); + + Map<String, List<String>> l3Cidrs = Q.New(NormalIpRangeVO.class) + .select(NormalIpRangeVO_.l3NetworkUuid, NormalIpRangeVO_.networkCidr) + .in(NormalIpRangeVO_.l3NetworkUuid, l3Uuids) + .eq(NormalIpRangeVO_.ipVersion, IPv6Constants.IPv4) + .listTuple() + .stream() + .collect(Collectors.groupingBy( + t -> t.get(0, String.class), + Collectors.mapping(t -> t.get(1, String.class), Collectors.toList()) + )); + + filtered = filtered.stream().filter(nic -> { + if (nic.getIp() == null) { + return true; + } + return l3Cidrs.getOrDefault(nic.getL3NetworkUuid(), Collections.emptyList()) + .stream().anyMatch(cidr -> NetworkUtils.isIpv4InCidr(nic.getIp(), cidr)); + }).collect(Collectors.toList());

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugin/portForwarding/src/main/java/org/zstack/network/service/portforwarding/PortForwardingManagerImpl.java` around lines 369 - 378, The per-NIC call to IpRangeHelper.isIpInL3NetworkCidr in PortForwardingManagerImpl (after filterVmNicByIpVersion) causes an N+1 query; instead, batch-fetch the L3 network CIDRs for all distinct l3NetworkUuid values using l3Mgr (or the L3 network DAO), build a map of l3Uuid->CIDR(s) in memory, then replace the stream filter so each VmNicInventory checks its IP against the preloaded CIDR(s) using IpRangeHelper (or a pure in-memory helper) without additional DB calls; update the logic around filtered = filtered.stream().filter(...) to use the map and avoid per-nic queries.

coderabbitai · 2026-03-20T08:47:36Z

plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsStorageController.java

+            GetVolumeClientsRsp rsp = new HttpCaller<>(GET_VOLUME_CLIENTS_PATH, cmd, GetVolumeClientsRsp.class,
+                    null, TimeUnit.SECONDS, 30, true)
+                    .setTryNext(true)
+                    .syncCall();


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # 目的：确认 ZbsPrimaryStorageMdsBase.syncCall 在超时/连接失败时， # 是返回失败响应，还是直接抛异常。 # 期望：返回失败响应；否则这里的 setTryNext(true) 不会真正切到下一个 MDS。 echo "===== ZbsPrimaryStorageMdsBase.syncCall definition =====" fd 'ZbsPrimaryStorageMdsBase.java' | while read -r f; do echo "FILE: $f" rg -n -C8 '\bsyncCall\s*\(' "$f" done echo "===== HttpCaller.doSyncCall in ZbsStorageController =====" fd 'ZbsStorageController.java' | while read -r f; do echo "FILE: $f" rg -n -C12 '\bdoSyncCall\s*\(' "$f" done

Repository: MatheMatrix/zstack

Length of output: 3129

🏁 Script executed:

# Find and examine ZbsPrimaryStorageMdsBase class fd 'ZbsPrimaryStorageMdsBase.java' -type f | head -5

Repository: MatheMatrix/zstack

Length of output: 232

🏁 Script executed:

# Examine the syncCall method in ZbsPrimaryStorageMdsBase rg -n 'syncCall' plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsPrimaryStorageMdsBase.java -A 20

Repository: MatheMatrix/zstack

Length of output: 44

🏁 Script executed:

# Check if there's exception handling or try-catch in the method rg -n 'try|catch|throw' plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsPrimaryStorageMdsBase.java -B 2 -A 3

Repository: MatheMatrix/zstack

Length of output: 829

🏁 Script executed:

# Find the ZbsPrimaryStorageMdsBase.java file find . -name 'ZbsPrimaryStorageMdsBase.java' 2>/dev/null

Repository: MatheMatrix/zstack

Length of output: 141

🏁 Script executed:

# Get file size and read it f=$(find . -name 'ZbsPrimaryStorageMdsBase.java' 2>/dev/null | head -1) if [ -f "$f" ]; then wc -l "$f" echo "=== Full file content ===" cat "$f" fi

Repository: MatheMatrix/zstack

Length of output: 18792

🏁 Script executed:

# Find ZbsMdsBase.java find . -name 'ZbsMdsBase.java' 2>/dev/null

Repository: MatheMatrix/zstack

Length of output: 127

🏁 Script executed:

# Get ZbsMdsBase file and examine it f=$(find . -name 'ZbsMdsBase.java' 2>/dev/null | head -1) if [ -f "$f" ]; then wc -l "$f" echo "=== Full file ===" cat "$f" fi

Repository: MatheMatrix/zstack

Length of output: 5717

🏁 Script executed:

# Find RESTFacade and examine syncJsonPost find . -name 'RESTFacade.java' 2>/dev/null | head -3

Repository: MatheMatrix/zstack

Length of output: 123

🏁 Script executed:

# Search for syncJsonPost implementation rg -n 'syncJsonPost' --type java -A 15 | head -100

Repository: MatheMatrix/zstack

Length of output: 10408

🏁 Script executed:

# Read RESTFacade.java to find syncJsonPost method cat ./header/src/main/java/org/zstack/header/rest/RESTFacade.java | head -300

Repository: MatheMatrix/zstack

Length of output: 5768

🏁 Script executed:

# Search for syncJsonPost method specifically rg -n 'syncJsonPost' ./header/src/main/java/org/zstack/header/rest/RESTFacade.java -A 10

Repository: MatheMatrix/zstack

Length of output: 2969

🏁 Script executed:

# Find the implementation of RESTFacade find . -name '*RESTFacade*.java' -type f | grep -i impl

Repository: MatheMatrix/zstack

Length of output: 123

🏁 Script executed:

# Search for implementations of syncJsonPost rg -l 'implements RESTFacade' --type java

Repository: MatheMatrix/zstack

Length of output: 121

🏁 Script executed:

# Get file size and read RESTFacadeImpl.java f='./core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java' wc -l "$f"

Repository: MatheMatrix/zstack

Length of output: 128

🏁 Script executed:

# Read RESTFacadeImpl and search for syncJsonPost implementation f='./core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java' rg -n 'syncJsonPost' "$f" -A 20 | head -150

Repository: MatheMatrix/zstack

Length of output: 7101

🏁 Script executed:

# Search for the actual syncJsonPost implementation that handles the HTTP call f='./core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java' rg -n 'public.*syncJsonPost.*String.*returnClass' "$f" -A 30 | head -100

Repository: MatheMatrix/zstack

Length of output: 4472

🏁 Script executed:

# Search for syncJson method which might be the underlying implementation f='./core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java' rg -n 'private.*syncJson|public.*syncJson' "$f" -A 25 | head -150

Repository: MatheMatrix/zstack

Length of output: 7133

🏁 Script executed:

# Find the syncJson method implementation f='./core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java' rg -n 'private.*<T>.*syncJson\(' "$f" -A 40 | head -150

Repository: MatheMatrix/zstack

Length of output: 44

🏁 Script executed:

# Also search for syncJson without the type parameter f='./core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java' sed -n '798,900p' "$f"

Repository: MatheMatrix/zstack

Length of output: 4910

需要修复：同步重试逻辑无法处理超时/连接异常。

当前实现中，HttpCaller.doSyncCall() 只在 ret.isSuccess() == false 时才会触发 tryNext 重试。但底层 RESTFacadeImpl.syncRawJson() 在超时或连接失败时会直接抛出 OperationFailureException 异常，而非返回失败响应对象。该异常会绕过 if (!ret.isSuccess()) 检查，导致 setTryNext(true) 对超时/连接异常无效，程序会立即在第一个 MDS 上失败。

建议在 HttpCaller.doSyncCall() 中添加异常捕获逻辑，确保超时和连接异常也能触发 MDS 切换重试。

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsStorageController.java` around lines 182 - 185, HttpCaller.doSyncCall currently only retries when the returned response has ret.isSuccess()==false, but RESTFacadeImpl.syncRawJson throws OperationFailureException on timeouts/connection errors which bypass that check; update HttpCaller.doSyncCall to catch OperationFailureException (and related IO/connection timeout exceptions) around the syncRawJson call, treat these exceptions as a failed call that honors setTryNext(true), log the exception, and proceed to the next MDS endpoint before ultimately rethrowing if all endpoints fail so timeouts/connection failures trigger the same MDS-switch retry logic as unsuccessful responses.

coderabbitai · 2026-03-20T08:47:36Z

portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java

+            synchronized (lifecycleLock) {
+                if (LifeCycle.NodeJoin.toString().equals(d.getLifeCycle())) {
+                    // Clear from suspected set since the node is confirmed alive
+                    suspectedMissingFromDb.remove(d.getInventory().getUuid());
+                    nodeLifeCycle.nodeJoin(d.getInventory());
+                } else if (LifeCycle.NodeLeft.toString().equals(d.getLifeCycle())) {
+                    nodeLifeCycle.nodeLeft(d.getInventory());
+                } else {
+                    throw new CloudRuntimeException(String.format("unknown lifecycle[%s]", d.getLifeCycle()));
+                }


⚠️ Potential issue | 🟠 Major

lifecycleLock 还没覆盖 nodeDie() 这条 node-left 路径。

这里把 canonical callback 和 DB/hash-ring reconcile 都串行化了，但同文件 Line 773-781 的 nodeDie() 仍然在锁外直接执行 nodeLifeCycle.nodeLeft(...)。节点刚恢复并广播 NodeJoin 时，旧的 fencing 线程仍可能补一个 stale nodeLeft 把它再次踢出 hash ring，这个竞态还没有完全收口。建议把 nodeDie() 里的本地 nodeLeft 和事件派发也纳入同一串行机制。

Also applies to: 877-925

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java` around lines 381 - 390, The nodeDie() path is calling nodeLifeCycle.nodeLeft(...) outside the lifecycleLock, causing a race where a stale nodeLeft can run concurrently with a later NodeJoin handling; move the local nodeLeft call and its event dispatch inside the same synchronized(lifecycleLock) block used by the NodeJoin/NodeLeft canonical callback so DB/hash-ring reconciliation and callback handling are serialized. Concretely, update the nodeDie() method to acquire lifecycleLock before invoking nodeLifeCycle.nodeLeft(...) and any related suspectedMissingFromDb updates or event dispatch logic (mirror the pattern used around nodeLifeCycle.nodeJoin(...) and the existing NodeLeft branch) so both canonical callbacks and fencing-generated nodeLefts use the same lock. Ensure the same change is applied for the other referenced region (lines around the second occurrence noted, e.g., the block covering 877-925) so all nodeLeft paths are synchronized on lifecycleLock.

coderabbitai · 2026-03-20T08:47:36Z

portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java

+                                ManagementNodeInventory inv = new ManagementNodeInventory();
+                                inv.setUuid(nodeUuid);
+                                try {
+                                    inv.setHostName(destinationMaker.getNodeInfo(nodeUuid).getNodeIP());
+                                } catch (Exception e) {
+                                    logger.warn(String.format("cannot get node info for node[uuid:%s], use empty hostname", nodeUuid));
+                                }
+
+                                nodeLifeCycle.nodeLeft(inv);


⚠️ Potential issue | 🟠 Major

不要在 heartbeat 线程里通过 destinationMaker 间接走普通数据源。

Line 739-740 已经约束这个 task 只能使用 heartBeatDBSource，但这里 cache miss 时会落到 ResourceDestinationMakerImpl.getNodeInfo() 的 dbf.findByUuid(...)。DB 抖动时，这会绕过 heartbeat 的重连逻辑，把健康检查卡在普通 dbf 上。这里最好只用 joinedManagementNodes 的缓存 hostname，拿不到就留空。

🛠️ 建议修正

- try { - inv.setHostName(destinationMaker.getNodeInfo(nodeUuid).getNodeIP()); - } catch (Exception e) { - logger.warn(String.format("cannot get node info for node[uuid:%s], use empty hostname", nodeUuid)); - } + ManagementNodeInventory knownInv = joinedManagementNodes.get(nodeUuid); + if (knownInv != null) { + inv.setHostName(knownInv.getHostName()); + }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java` around lines 890 - 898, The heartbeat thread in ManagementNodeManagerImpl must not call destinationMaker.getNodeInfo (which triggers ResourceDestinationMakerImpl.getNodeInfo -> dbf.findByUuid) because this bypasses heartBeatDBSource; change the code around nodeLifeCycle.nodeLeft(inv) to fetch hostname only from the in-memory joinedManagementNodes cache (use the joinedManagementNodes map/structure available in ManagementNodeManagerImpl) and, if there's no cached entry for nodeUuid, leave inv.hostName empty instead of calling destinationMaker; remove the try/catch that calls destinationMaker.getNodeInfo to avoid hitting the ordinary DB path.

coderabbitai · 2026-03-20T08:47:36Z

rest/src/main/java/org/zstack/rest/RestServer.java

+            // localize with webhook caller's locale (message already populated by Platform.err)
+            String locale = resolveLocale();
+            if (!LocaleUtils.DEFAULT_LOCALE.equals(locale)) {
+                i18nService.localizeErrorCode(evt.getError(), locale);
+            }
            response.setError(evt.getError());


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail fd 'GlobalErrorCodeI18nService.java' -x sh -c 'echo "===== {} ====="; sed -n "1,240p" "{}"' rg -n -C3 'localizeErrorCode\(' --glob '*.java' fd 'AsyncRestApiStore.java|AsyncRestQueryResult.java|RequestData.java' -x sh -c 'echo "===== {} ====="; sed -n "1,260p" "{}"' rg -n -C3 'class AsyncRestApiStore|query\(|complete\(' --glob '*.java'

Repository: MatheMatrix/zstack

Length of output: 50376

在调用 localizeErrorCode() 之前需要深拷贝 ErrorCode 对象。

GlobalErrorCodeI18nServiceImpl.localizeErrorCode() 会直接修改入参的 ErrorCode 对象（通过 setMessage()、setCause() 等 setter 方法）。如果这些错误对象来自缓存或被多个请求复用，不同请求的 Accept-Language 会污染错误信息，导致后续请求获得错误的本地化消息。

应该先深拷贝 ErrorCode，再对副本进行本地化处理，最后将本地化后的副本设置到响应中。

涉及的位置：

callWebHook() 第 414 行：i18nService.localizeErrorCode(evt.getError(), locale);

sendResponse() 第 932 行：i18nService.localizeErrorCode(response.getError(), locale);

sendReplyResponse() 第 1443 行：i18nService.localizeErrorCode(reply.getError(), locale);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@rest/src/main/java/org/zstack/rest/RestServer.java` around lines 411 - 416, Deep-copy the ErrorCode before localizing to avoid mutating shared cached objects: in callWebHook, sendResponse, and sendReplyResponse create a deep clone of the ErrorCode (clone/copy the result of evt.getError, response.getError, reply.getError), pass that clone to i18nService.localizeErrorCode(..., locale), then set the localized clone back onto the event/response/reply (instead of localizing the original). Ensure the copy is a full deep copy (message, cause, details, children) so no shared references remain.

shan.wu and others added 30 commits December 22, 2025 18:07

<fix>[migration]: fix failed to start vm after ceph to ceph offline m…

3547791

…igration. support starting vm without nic in other clusters. Resolves/Related: ZSTAC-80468 Change-Id: I6370646f7a796265677a6a656c716f6867706d69

<fix>[vm]: use max of virtual and actual size for root disk when no d…

6b4521b

…isk offering Resolves: ZSTAC-74683 Change-Id: Id0339ed0221e92e506f60745cde972cc3ee6d9ae

<fix>[ceph]: apply over-provisioning ratio when releasing snapshot ca…

a84a36e

…pacity Resolves: ZSTAC-79709 Change-Id: I45a2133bbb8c51c25ae3549d59e588976192a08d

<fix>[loadBalancer]: block SLB deletion during grayscale upgrade

f19223a

Resolves: ZSTAC-78989 Change-Id: I0fe3a56ab724978944c69afadaab7ff7353e4c0f

<fix>[i18n]: improve snapshot error message for unattached volume

24d4f3b

Resolves: ZSTAC-82153 Change-Id: Ib51c2e21553277416d1a9444be55aca2aa4b2fc4

<fix>[compute]: add null check for VmNicVO in afterDelIpAddress and a…

f563992

…fterAddIpAddress to prevent NPE during rollback Resolves: ZSTAC-81741 Change-Id: I53bcf20a10306afc7b6172da294d347b74e6c41f

<fix>[network]: filter reserved IPs from GetFreeIp API results

6545350

Resolves: ZSTAC-81182 Change-Id: Id1bb642154dc66ae9995dcc4d9fc00cdce9bcaf8

Merge branch 'fix/ZSTAC-80620' into '5.5.12'

76490a5

<fix>[vm]: add Destroying->Stopped state transition See merge request zstackio/zstack!9156

Merge branch 'fix/ZSTAC-82153' into '5.5.12'

461e8a2

<fix>[i18n]: improve snapshot error message for unattached volume See merge request zstackio/zstack!9192

Merge branch 'fix/ZSTAC-80595' into '5.5.12'

32e1e94

<fix>[zbs]: reduce mds connect timeout and enable tryNext for volume clients See merge request zstackio/zstack!9153

Merge branch 'fix/ZSTAC-74683' into '5.5.12'

addec8c

<fix>[vm]: use max of virtual and actual size for root disk allocation See merge request zstackio/zstack!9155

Merge branch 'fix/ZSTAC-81182' into '5.5.12'

be53c72

<fix>[network]: filter reserved IPs in getFreeIp See merge request zstackio/zstack!9170

Merge branch 'fix/ZSTAC-79709' into '5.5.12'

e1dee9f

<fix>[ceph]: apply over-provisioning ratio when releasing snapshot capacity See merge request zstackio/zstack!9162

Merge branch 'fix/ZSTAC-81741' into '5.5.12'

3bd062b

<fix>[network]: add null check for L3 network system tags in API interceptor See merge request zstackio/zstack!9169

<fix>[volumebackup]: add backup cancel timeout error code

f41558d

Add ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT constant to CloudOperationsErrorCode for use in premium volumebackup module. Resolves: ZSTAC-82195 Change-Id: Ibc405876e1171b637cf76b91a6822574fb6e7811

Merge branch 'fix/ZSTAC-77711' into '5.5.12'

673be94

<fix>[core]: synchronize consistent hash ring to prevent dual-MN race condition See merge request zstackio/zstack!9154

Merge branch 'bugfix/ZSTAC-82275' into '5.5.12'

72ce6ef

<fix>[thread]: guard Context.current() with telemetry check See merge request zstackio/zstack!9202

<fix>[volumebackup]: add backup cancel timeout error code

a9a3994

Resolves: ZSTAC-82195 Change-Id: I3d5e91d09d7c088d3c53e3839f8b32f4bce32dec

Merge branch 'fix/ZSTAC-78989' into '5.5.12'

34bceb1

<fix>[loadBalancer]: block SLB deletion during grayscale upgrade See merge request zstackio/zstack!9187

Merge branch 'fix/ZSTAC-82195' into '5.5.12'

3e02188

<fix>[volumebackup]: add backup cancel timeout error code See merge request zstackio/zstack!9200

Merge branch 'fix/ZSTAC-80664' into '5.5.12'

799a84f

<fix>[core]: add @nologging to sensitive config fields See merge request zstackio/zstack!9171

<fix>[kvm]: use CAS remove to fix TOCTOU race in orphaned skip VM cle…

7a6d5d7

…anup Resolves: ZSTAC-80821 Change-Id: I59284c4e69f5d2ee357b1836b7c243200e30949a

<fix>[zbs]: sync MDS node statuses to DB when reconnect fails

62e3db5

Resolves: ZSTAC-77544 Change-Id: I1f711bff9c1e87a8cbf6a2eb310ca6086f0f99ba

<fix>[kvm]: configurable orphan skip timeout

bec4623

Resolves: ZSTAC-80821 Change-Id: Ia9a9597feceb96b3e6e22259e2d0be7bde8ae499

MaJin1996 and others added 25 commits March 16, 2026 05:30

Merge branch 'fix/ZSTAC-82619' into '5.5.12'

327b1cc

<fix>[conf]: use absolute path for ansible version check during upgrade See merge request zstackio/zstack!9356

<feature>[longjob]: standardize progress detail fields to be unit-agn…

39a9a84

…ostic Resolves: ZSTAC-82318 Change-Id: If68d507a3eeb3f48458d9fbf61c357339688d43b

<fix>[conf]: retry rm -rf virtualenv to avoid race with zstack_servic…

ed8c99b

…e_exporter Resolves: ZSTAC-82619 Change-Id: I581385808fe4d942369b98f175eb8fbbdb382a0d

<fix>[vmScheduling]: change GET scheduling APIs from POST to GET meth…

d27439a

…od (ZSTAC-71075) Resolves: ZSTAC-71075 Change-Id: I21ebd76ba332995252f67f4cacf78022587a215a

Merge branch 'fix/ZSTAC-82318-phase2' into '5.5.12'

b154bc1

<feature>[longjob]: standardize progress detail fields to be unit-agnostic See merge request zstackio/zstack!9358

Merge branch 'fix/ZSTAC-68709-eval-queue@@2' into '5.5.12'

3b4d09e

fix(ai): ZSTAC-68709 add eval task queuing with concurrency control See merge request zstackio/zstack!9295

<fix>[network]: set nic ip out of l3 cidr scope

5ffa56e

APIImpact DBImpact GlobalConfigImpact Resolves: ZSTAC-81969 Change-Id: I736e77646266696e646271766d7170627378796b

Merge branch 'fix/ZSTAC-81797@@2' into '5.5.12'

262e92a

ZSTAC-81797: iSCSI fencer use platform compact hostId instead of storage bdc id See merge request zstackio/zstack!9274

Merge branch 'fix/ZSTAC-82619' into '5.5.12'

80b93d6

<fix>[conf]: retry rm -rf virtualenv to avoid race with zstack_service_exporter See merge request zstackio/zstack!9365

Merge branch 'shixin-ZSTAC-83150@@2' into '5.5.12'

d993609

<fix>[network]: set nic ip out of l3 cidr scope See merge request zstackio/zstack!9374

<fix>[docs]: move to zstack/docs

76044ca

Resolves: ZSTAC-81969 Change-Id: I7176617669636d7362616973787a68766c6d6579

Merge branch 'fix/ZSTAC-71075@@2' into '5.5.12'

8f31252

<fix>[vmScheduling]: change GET scheduling APIs from POST to GET method (ZSTAC-71075) See merge request zstackio/zstack!9373

Merge branch 'feature/errorcode-i18n-simplify' into '5.5.12'

d0ccc44

<feature>[errorcode]: simplify i18n — guarantee message is never null See merge request zstackio/zstack!9354

<feature>[kvm]: add libvirt TLS config

3feb9e9

Add libvirt.tls.enabled GlobalConfig and useTls field in MigrateVmCmd to support TLS-encrypted libvirt connections for migration and V2V. Resolves: ZSTAC-81343 Change-Id: I391fa36c0dd63c25c5d85d102bc3579c8eb3d685

Merge branch 'shixin-ZSTAC-81969' into '5.5.12'

69ae62a

<fix>[docs]: move to zstack/docs See merge request zstackio/zstack!9379

Merge branch 'shixin-ZSTAC-83300' into '5.5.12'

e0a8246

<fix>[l2network]: validate physicalInterface for LinuxBridge See merge request zstackio/zstack!9375

Merge branch 'ZSTAC-81343@@2' into '5.5.12'

f15be32

<feature>[kvm]: add libvirt TLS config See merge request zstackio/zstack!9317

Merge branch 'fix/ZSTAC-83305' into '5.5.12'

b42842c

<fix>[core,kvm]: fix SSH session leak in CallBackNetworkChecker and KVMHost See merge request zstackio/zstack!9398

<fix>[compute]: fix user define param error

046d023

when user define ip/netmask/gateway, then gateway must be in the cidr of ip/netamsl Resolves: ZSTAC-83321 Change-Id: I78616376707476666a7a72786a7062766272686b

Merge branch 'shixin-ZSTAC-83321' into '5.5.12'

2b7d21c

<fix>[compute]: fix user define param error See merge request zstackio/zstack!9387

chao.he added 2 commits March 20, 2026 16:22

<fix>[plugin-premium]: GPU/VM page keeps loading when shutting down o…

e2956c7

…r encountering errors in Zaku cluster. Resolves: ZSTAC-80202 Change-Id: I7778676171646874706164777869707279776172

<fix>[plugin-premium]: GPU/VM page keeps loading when shutting down o…

107fb32

…r encountering errors in Zaku cluster. Resolves: ZSTAC-80202 Change-Id: I7778676171646874706164777869707279776172

MatheMatrix force-pushed the sync/chao.he/fix-ZSTAC-80202-zaku-loading@@2 branch from 63007bb to 107fb32 Compare March 20, 2026 08:23

coderabbitai bot reviewed Mar 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

<fix>[plugin-premium]: GPU/VM page keeps loading when shutting down or...#3549

<fix>[plugin-premium]: GPU/VM page keeps loading when shutting down or...#3549
ZStack-Robot wants to merge 153 commits into5.5.6from
sync/chao.he/fix-ZSTAC-80202-zaku-loading@@2

ZStack-Robot commented Mar 20, 2026

Uh oh!

coderabbitai bot commented Mar 20, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

coderabbitai bot Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

		public static boolean isIpOutsideL3NetworkCidr(String ip, String l3Uuid) {
		return !isIpInL3NetworkCidr(ip, l3Uuid);

Conversation

ZStack-Robot commented Mar 20, 2026

Uh oh!

coderabbitai bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

代码审查工作量评估

兔子的诗

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

coderabbitai bot commented Mar 20, 2026 •

edited

Loading