
<fix>[core]: synchronize consistent hash ring to prevent dual-MN race condition #3332

Closed
MatheMatrix wants to merge 11 commits into 5.5.6 from sync/ye.zou/fix/ZSTAC-77711

Conversation

@MatheMatrix
Owner

Summary

  • ZSTAC-77711: with dual MNs, the consistent hash rings diverge; messages get routed to the wrong MN and UI tasks hang
  • Root cause: methods such as nodeJoin/nodeLeft/iJoin and makeDestination are unsynchronized, so the heartbeat thread and the event thread modify nodeHash and nodes concurrently
  • Fix: add a synchronized lock to every method that reads or writes nodeHash/nodes; getManagementNodesInHashRing/getAllNodeInfo return defensive copies (a minimal sketch of this locking pattern follows this list)
  • Additionally fix the nodes.put() return-value bug in getNodeInfo
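
The sketch below illustrates the locking pattern under simplified assumptions. It is not the actual ResourceDestinationMakerImpl code: the hash function, field types, and node payload are placeholders, and only the method names and the nodeHash/nodes fields mirror the PR. The point is that every method touching nodeHash or nodes holds the same monitor, and readers receive defensive copies.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.SortedMap;
    import java.util.TreeMap;

    // Illustrative only: not the real ResourceDestinationMakerImpl.
    class HashRingLockingSketch {
        private final SortedMap<Integer, String> nodeHash = new TreeMap<>(); // hash -> node UUID
        private final Map<String, String> nodes = new HashMap<>();           // node UUID -> info

        public synchronized void nodeJoin(String nodeUuid, String nodeInfo) {
            nodeHash.put(nodeUuid.hashCode(), nodeUuid);
            nodes.put(nodeUuid, nodeInfo);
        }

        public synchronized void nodeLeft(String nodeUuid) {
            nodeHash.remove(nodeUuid.hashCode());
            nodes.remove(nodeUuid);
        }

        public synchronized String makeDestination(String resourceUuid) {
            if (nodeHash.isEmpty()) {
                throw new IllegalStateException("no management node in the hash ring");
            }
            // walk clockwise: first node at or after the resource's hash, else wrap to the first node
            SortedMap<Integer, String> tail = nodeHash.tailMap(resourceUuid.hashCode());
            return tail.isEmpty() ? nodeHash.get(nodeHash.firstKey()) : tail.get(tail.firstKey());
        }

        // defensive copy so callers never iterate the live ring while nodeJoin/nodeLeft mutate it
        public synchronized List<String> getManagementNodesInHashRing() {
            return new ArrayList<>(nodeHash.values());
        }
    }

Holding a single monitor for both collections is what keeps nodeHash and nodes mutating atomically relative to each other, which is exactly the inconsistency described above.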

Files Changed

  • ResourceDestinationMakerImpl.java — synchronized lock on all methods

Resolves: ZSTAC-77711

sync from gitlab !9154

@coderabbitai

coderabbitai bot commented Feb 12, 2026

Walkthrough

Adds synchronization protection and two-round missing-node confirmation to the management-node hash ring, node-info access, and lifecycle handling; adds device-address sync calls at key VM lifecycle points; adjusts the VM root-disk allocation size calculation and adds a Destroying → Stopped state-machine transition; and adds an optional "try the next MDS" retry switch for ZBS HTTP calls.

Changes

Cohort / File(s) | Summary

Resource destination and node-info synchronization
core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java
Makes every method that touches the node hash ring or the node cache synchronized (nodeJoin/nodeLeft/iAmDead/iJoin/makeDestination/isManagedByUs/getManagementNodesInHashRing/getNodeInfo/getAllNodeInfo/getManagementNodeCount/isNodeInCircle), returns collection copies, and backfills the cache from the DB when an entry is missing. Areas of concern: concurrency overhead, lock scope, and deadlock risk.

Lifecycle and heartbeat alignment
portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java
Adds lifecycleLock to serialize lifecycle callbacks with heartbeat reconciliation, and a suspectedMissingFromDb set so a node is removed from the hash ring only after two rounds of confirmation; clears the suspect list on nodeJoin and compares DB state against the hash ring across two rounds during reconciliation. Areas of concern: event latency, lock contention, and consistency boundaries.

VM device-address info sync
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java
Adds a private method syncVmDevicesAddressInfo(String vmUuid) and calls it on several VM lifecycle paths (start, InstantStart completion, reboot, recovery, restart from pause/start, etc.) to send a device-address sync request to the host and log the result. Areas of concern: handling of asynchronous message failures and performance.

Root-disk allocation size calculation
header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java
When rootDiskOffering is null, getRootDiskAllocateSize now returns the maximum of the image's virtual size and actual size (Math.max(virtualSize, actualSize)) instead of the virtual size alone.

New state-machine transition
header/src/main/java/org/zstack/header/vm/VmInstanceState.java
Adds a state-machine transition from Destroying to Stopped on receipt of a stopped event.

ZBS HTTP call retry switch
plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsStorageController.java
Adds a setTryNext(boolean) method to the inner HttpCaller<T> class to enable "try the next MDS" retries; the GetVolumeClients call path now uses this HttpCaller with tryNext enabled and may accumulate several errors before failing (a generic sketch of this retry pattern follows the table). Areas of concern: error-aggregation logic and retry semantics.
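
A generic sketch of the "try the next MDS" semantics, assuming a list of MDS addresses and one HTTP round trip per address. The real HttpCaller<T> in ZbsStorageController has a different signature and error types; only the setTryNext(boolean) name comes from the PR.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Function;

    // Sketch only: when tryNext is enabled, errors are aggregated and the next
    // MDS address is attempted; otherwise the caller fails on the first error.
    class TryNextCallerSketch<T> {
        private boolean tryNext = false;

        public TryNextCallerSketch<T> setTryNext(boolean tryNext) {
            this.tryNext = tryNext;
            return this;
        }

        public T call(List<String> mdsAddresses, Function<String, T> httpCall) {
            List<String> errors = new ArrayList<>();
            for (String mds : mdsAddresses) {
                try {
                    return httpCall.apply(mds);          // success on any MDS wins
                } catch (RuntimeException e) {
                    errors.add(String.format("mds[%s]: %s", mds, e.getMessage()));
                    if (!tryNext) {
                        break;                           // legacy behavior: fail fast
                    }
                    // tryNext enabled: keep the error and move on to the next MDS
                }
            }
            throw new RuntimeException("all MDS calls failed: " + errors);
        }
    }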

Sequence Diagram(s)

sequenceDiagram
    participant Heartbeat as HeartbeatReconciler
    participant Manager as ManagementNodeManagerImpl
    participant DB as Database
    participant HashRing as NodeHashRing

    Heartbeat->>Manager: start reconciliation()
    Manager->>Manager: synchronized(lifecycleLock)
    Manager->>HashRing: list nodes in hash ring
    loop for each node
        Manager->>DB: query node by uuid
        alt DB has node
            DB-->>Manager: node exists
            Manager->>HashRing: ensure node present
            Manager->>Manager: suspectedMissingFromDb.remove(node)
        else DB missing node
            DB-->>Manager: not found
            Manager->>Manager: mark node in suspectedMissingFromDb (first detection)
            alt still missing on next reconciliation
                Manager->>HashRing: nodeLeft(node)
                Manager->>Manager: suspectedMissingFromDb.remove(node)
            end
        end
    end
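A minimal sketch of the two-round confirmation shown in the diagram above. The names suspectedMissingFromDb, lifecycleLock, nodeJoin, and nodeLeft come from the PR; the NodeDb/HashRing interfaces here are placeholders, not ZStack APIs.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Sketch only: a node is removed from the hash ring only after being missing
    // from the DB in two consecutive reconciliation rounds.
    class TwoRoundReconcilerSketch {
        interface NodeDb { boolean exists(String nodeUuid); }
        interface HashRing { List<String> listNodes(); void nodeLeft(String nodeUuid); }

        private final Object lifecycleLock = new Object();
        private final Set<String> suspectedMissingFromDb = new HashSet<>();
        private final NodeDb db;
        private final HashRing ring;

        TwoRoundReconcilerSketch(NodeDb db, HashRing ring) {
            this.db = db;
            this.ring = ring;
        }

        void reconcile() {
            synchronized (lifecycleLock) {                         // serialize with lifecycle callbacks
                for (String nodeUuid : ring.listNodes()) {
                    if (db.exists(nodeUuid)) {
                        suspectedMissingFromDb.remove(nodeUuid);   // node is healthy again
                    } else if (suspectedMissingFromDb.remove(nodeUuid)) {
                        ring.nodeLeft(nodeUuid);                   // missing twice in a row: remove
                    } else {
                        suspectedMissingFromDb.add(nodeUuid);      // first miss: only mark as suspect
                    }
                }
            }
        }

        void onNodeJoin(String nodeUuid) {
            synchronized (lifecycleLock) {
                suspectedMissingFromDb.remove(nodeUuid);           // a joining node clears its suspicion
            }
        }
    }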

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🐰 The lock taps softly at the edge of the ring,
I call the roll of the nodes one by one;
Only after two rounds of wind and rain do I let go,
Device addresses quietly go ask the host,
In the spring grass, the system is calm again. 🥕


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

Check name Status Explanation Resolution
Title check ❌ Error Pull request title is 79 characters long, exceeding the maximum requirement of 72 characters. Shorten the title to 72 characters or less while maintaining clarity. Consider: '[core]: sync hash ring to prevent dual-MN race condition'
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The pull request description is related to the changeset and clearly describes the issue, root cause, and fixes applied to ResourceDestinationMakerImpl.java and related files.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into 5.5.6

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.40.5)
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java


AlanJager and others added 6 commits February 12, 2026 12:02
…isk offering

Resolves: ZSTAC-74683

Change-Id: Id0339ed0221e92e506f60745cde972cc3ee6d9ae
When anti-split-brain check selects a disconnected MDS node, the HTTP
call now times out after 30s instead of 5+ minutes, and automatically
retries the next available MDS via tryNext mechanism.

Resolves: ZSTAC-80595

Change-Id: I1be80f1b70cad1606eb38d1f0078c8f2781e6941
When MN restarts during a destroy operation, the hypervisor may report
the VM as Stopped. Without this transition, the state machine throws
an exception and the VM stays stuck in Destroying state forever.

Resolves: ZSTAC-80620

Change-Id: I037edba70d145a44a88ce0d3573089182fedb162
Resolves: ZSTAC-82153

Change-Id: Ib51c2e21553277416d1a9444be55aca2aa4b2fc4
add syncVmDeviceInfo when vm is running

Resolves: ZSTAC-67275

Change-Id: I746e63786a677676676b6d6265657063666b7677
<fix>[compute]: add syncVmDeviceInfo when vm is running

See merge request zstackio/zstack!9197
@MatheMatrix MatheMatrix force-pushed the sync/ye.zou/fix/ZSTAC-77711 branch from e8732a5 to 312bd83 Compare February 15, 2026 17:13

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java (1)

80-93: ⚠️ Potential issue | 🔴 Critical

Fix the null return from getNodeInfo caused by using put's return value.

Map.put returns the previous value, so the current code assigns null to info and returns it, which is functionally wrong. Create the NodeInfo first, then put it into the map, and return the new object.

🔧 Suggested fix
         if (info == null) {
             ManagementNodeVO vo = dbf.findByUuid(nodeUuid, ManagementNodeVO.class);
             if (vo == null) {
                 throw new ManagementNodeNotFoundException(nodeUuid);
             }

-            nodeHash.add(nodeUuid);
-            info = nodes.put(nodeUuid, new NodeInfo(vo));
+            info = new NodeInfo(vo);
+            nodeHash.add(nodeUuid);
+            nodes.put(nodeUuid, info);
         }

         return info;

gitlab and others added 5 commits February 16, 2026 00:50
<fix>[vm]: add Destroying->Stopped state transition

See merge request zstackio/zstack!9156
<fix>[i18n]: improve snapshot error message for unattached volume

See merge request zstackio/zstack!9192
<fix>[zbs]: reduce mds connect timeout and enable tryNext for volume clients

See merge request zstackio/zstack!9153
<fix>[vm]: use max of virtual and actual size for root disk allocation

See merge request zstackio/zstack!9155
…talling

In dual management node scenarios, concurrent modifications to the
consistent hash ring from heartbeat reconciliation and canonical event
callbacks can cause NodeHash/Nodes inconsistency, leading to message
routing failures and task timeouts.

Fix: (1) synchronized all ResourceDestinationMakerImpl methods to
ensure atomic nodeHash+nodes updates, (2) added lifecycleLock in
ManagementNodeManagerImpl to serialize heartbeat reconciliation with
event callbacks, (3) added two-round delayed confirmation before
removing nodes from hash ring to avoid race with NodeJoin events.

Resolves: ZSTAC-77711

Change-Id: I3d33d53595dd302784dff17417a5b25f2d0f3426
@MatheMatrix MatheMatrix force-pushed the sync/ye.zou/fix/ZSTAC-77711 branch from 312bd83 to 26b8b1a Compare February 16, 2026 02:22

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
header/src/main/java/org/zstack/header/vm/VmInstanceState.java (1)

166-172: ⚠️ Potential issue | 🟠 Major

Consider a compatibility switch for the new state transition (Line 166-172)
The new Destroying → Stopped transition changes existing state-machine behavior and may affect users who depend on the old behavior. Consider a version/configuration switch, or enabling it only for new installations and specific upgrade paths, so that existing behavior is not changed outright.

As per coding guidelines, "do not directly change behavior produced by existing code; even if it is a bug, users may depend on it. Consider one behavior for upgraded users and another for new installations, controlled by a switch."
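
A hypothetical illustration of the suggested switch; ZStack's actual global-config mechanism and VmInstanceState implementation are not shown here. The only point is that the Destroying → Stopped transition is registered when a compatibility flag allows it, so the old behavior is preserved otherwise.

    import java.util.EnumMap;
    import java.util.Map;

    // Sketch only: a transition table gated by a compatibility flag.
    class GatedTransitionSketch {
        enum State { Destroying, Stopped }
        enum Event { stopped }

        private final Map<State, Map<Event, State>> transitions = new EnumMap<>(State.class);

        GatedTransitionSketch(boolean allowDestroyingToStopped) {
            if (allowDestroyingToStopped) {
                transitions
                    .computeIfAbsent(State.Destroying, s -> new EnumMap<>(Event.class))
                    .put(Event.stopped, State.Stopped);
            }
        }

        State next(State current, Event event) {
            Map<Event, State> byEvent = transitions.get(current);
            if (byEvent == null || !byEvent.containsKey(event)) {
                throw new IllegalStateException(current + " cannot handle event " + event);
            }
            return byEvent.get(event);
        }
    }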

core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java (1)

80-93: ⚠️ Potential issue | 🟠 Major

Fix the NullPointerException caused by misusing the return value of Map.put().

Line 89, info = nodes.put(nodeUuid, new NodeInfo(vo)), uses the return value of put(), but put() returns the previous value (null when the entry is cached for the first time) rather than the newly inserted object. So info is assigned null on the first call and the method returns null. Downstream chained calls such as getNodeInfo(nodeUuid).getNodeIP() in ManagementNodeManagerImpl and CloudBusImpl3 then throw a NullPointerException.

Store the newly created NodeInfo in its own variable and return that object:

Suggested fix
             nodeHash.add(nodeUuid);
-            info = nodes.put(nodeUuid, new NodeInfo(vo));
+            NodeInfo newInfo = new NodeInfo(vo);
+            nodes.put(nodeUuid, newInfo);
+            info = newInfo;
🧹 Nitpick comments (2)
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java (1)

8476-8496: Suggest caching hostUuid to avoid drift in async logging.
Re-reading self.getHostUuid() inside the async callback may not match the value used at send time, producing misleading logs. Cache it at the start of the method and reuse it.

♻️ Suggested change
 private void syncVmDevicesAddressInfo(String vmUuid) {
-    if (self.getHostUuid() == null) {
+    String hostUuid = self.getHostUuid();
+    if (hostUuid == null) {
         return;
     }
     SyncVmDeviceInfoMsg msg = new SyncVmDeviceInfoMsg();
     msg.setVmInstanceUuid(vmUuid);
-    msg.setHostUuid(self.getHostUuid());
-    bus.makeTargetServiceIdByResourceUuid(msg, HostConstant.SERVICE_ID, msg.getHostUuid());
+    msg.setHostUuid(hostUuid);
+    bus.makeTargetServiceIdByResourceUuid(msg, HostConstant.SERVICE_ID, hostUuid);
     bus.send(msg, new CloudBusCallBack(msg) {
         @Override
         public void run(MessageReply reply) {
             if (!reply.isSuccess()) {
                 logger.warn(String.format("Failed to sync vm device info for vm[uuid:%s], %s",
                     vmUuid, reply.getError()));
             } else {
                 logger.debug(String.format("Sent SyncVmDeviceInfoMsg for vm[uuid:%s] on host[uuid:%s]",
-                    vmUuid, self.getHostUuid()));
+                    vmUuid, hostUuid));
             }
         }
     });
 }
header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java (1)

848-855: NullPointerException risk; defensive handling needed

When rootDiskOffering is null, the code directly calls this.getImageSpec().getInventory().getSize() and getActualSize(). There are two NPE risks:

  1. getInventory() returns null: the ImageSpec inventory field may be uninitialized, and getInventory() performs no null check
  2. Unboxing NPE: size and actualSize in ImageInventory are both Long (nullable); unboxing to a primitive throws an NPE

Suggested defensive checks:

♻️ Suggested improvement
 public long getRootDiskAllocateSize() {
     if (rootDiskOffering == null) {
+        ImageInventory inv = this.getImageSpec().getInventory();
+        if (inv == null) {
+            return 0L;
+        }
-        long virtualSize = this.getImageSpec().getInventory().getSize();
-        long actualSize = this.getImageSpec().getInventory().getActualSize();
+        long virtualSize = inv.getSize();
+        long actualSize = inv.getActualSize();
         return Math.max(virtualSize, actualSize);
     }
     return rootDiskOffering.getDiskSize();
 }

@MatheMatrix MatheMatrix deleted the sync/ye.zou/fix/ZSTAC-77711 branch February 16, 2026 13:25
