Skip to content

<fix>[securitygroup]: remove strict sequential priority restriction#3344

Open
MatheMatrix wants to merge 27 commits into5.5.6from
sync/ye.zou/fix/ZSTAC-79067
Open

<fix>[securitygroup]: remove strict sequential priority restriction#3344
MatheMatrix wants to merge 27 commits into5.5.6from
sync/ye.zou/fix/ZSTAC-79067

Conversation

@MatheMatrix
Copy link
Owner

Resolves: ZSTAC-79067\n\nRemove overly strict sequential priority validation for security group rules API.

sync from gitlab !9173

…isk offering

Resolves: ZSTAC-74683

Change-Id: Id0339ed0221e92e506f60745cde972cc3ee6d9ae
When anti-split-brain check selects a disconnected MDS node, the HTTP
call now times out after 30s instead of 5+ minutes, and automatically
retries the next available MDS via tryNext mechanism.

Resolves: ZSTAC-80595

Change-Id: I1be80f1b70cad1606eb38d1f0078c8f2781e6941
When MN restarts during a destroy operation, the hypervisor may report
the VM as Stopped. Without this transition, the state machine throws
an exception and the VM stays stuck in Destroying state forever.

Resolves: ZSTAC-80620

Change-Id: I037edba70d145a44a88ce0d3573089182fedb162
…pacity

Resolves: ZSTAC-79709

Change-Id: I45a2133bbb8c51c25ae3549d59e588976192a08d
@coderabbitai
Copy link

coderabbitai bot commented Feb 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

放宽安全组规则优先级验证为仅要求正整数并保留全局上限,并发与稳定性改进(管理节点/调度/资源目的地同步化)、若干存储/网络/VM细节修正、测试用例调整以匹配新优先级行为,以及新增错误码与数据脱敏逻辑。

Changes

Cohort / File(s) Summary
安全组 API 验证逻辑
plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java
移除“从1开始且连续”校验,改为仅检查优先级为正整数(>=1)并保留对全局最大值的检查,更新错误提示。
相关测试用例
test/src/test/groovy/.../securitygroup/AddSecurityGroupRuleOptimizedCase.groovy, test/src/test/groovy/.../securitygroup/ChangeSecurityGroupRuleCase.groovy
调整断言:不再期待非连续优先级抛出异常,改为允许添加/更新并在操作后校验规则存在;添加行为变更注释。
管理节点与并发控制
core/src/main/java/.../cloudbus/ResourceDestinationMakerImpl.java, portal/src/main/java/.../ManagementNodeManagerImpl.java, core/src/main/java/.../thread/DispatchQueueImpl.java
增加 synchronized / lifecycleLock 及两阶段缺失节点检测以防竞态;调度任务在 telemetry 情况下保留或不保留父追踪上下文(Conditional tracing),提高并发安全与心跳/事件一致性。
VM/网络小修正
compute/src/main/java/.../VmNicManagerImpl.java, header/src/main/java/.../vm/VmInstanceSpec.java, header/src/main/java/.../vm/VmInstanceState.java, network/src/main/java/.../l3/L3BasicNetwork.java
添加 VmNic 空值保护,root 磁盘大小回退取 virtual 与 actual 的较大值,Destroying 状态新增 stopped 转换;APIGetFreeIp 结果过滤掉 ReservedIpRange 范围内的 IP。
存储与网络服务改动
plugin/ceph/src/main/java/.../CephPrimaryStorageBase.java, plugin/zbs/src/main/java/.../ZbsStorageController.java, header/src/main/java/.../storage/addon/primary/ExternalPrimaryStorageInventory.java
快照删除容量释放改为按比率调用;ZBS 的 GetVolumeClients 改为可重试的 HttpCaller 调用并新增 setTryNext(bool);为外部主存储脱敏 mdsUrls/mdsInfos。
负载均衡与错误码
plugin/loadBalancer/src/main/java/.../LoadBalancerApiInterceptor.java, utils/src/main/java/.../clouderrorcode/CloudOperationsErrorCode.java
拦截删除 SLB 场景以在灰度升级期间阻断删除操作(受全局配置控制);新增备份取消超时错误码常量。

Sequence Diagram(s)

sequenceDiagram
    participant Heartbeat as Heartbeat/Reconciler
    participant Manager as ManagementNodeManager
    participant DB as Database
    participant HashRing as HashRing

    Note over Heartbeat,Manager: 周期性心跳/重建流程
    Heartbeat->>Manager: 检测到节点列表差异
    Manager->>DB: 查询 ManagementNodeVO(可能缺失)
    alt 节点在 DB 中存在
        Manager->>HashRing: schedule nodeJoin(on node's thread)
        HashRing-->>Manager: 更新/加入节点信息
    else 节点在 DB 中缺失(第一次)
        Manager->>Manager: 标记 suspectedMissingFromDb(暂存)
    else 节点在 DB 中缺失(连续检测)
        Manager->>HashRing: 从 hash ring 移除节点
        HashRing-->>Manager: 触发 nodeLeft 流程
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰🌿 优先级变得随意些,
正数为准不作堆,
节点心跳更谨慎,
存储呼叫会重试,
小兔轻跳庆改回。

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed 标题遵循 [scope]: 格式,共67个字符,清晰准确地描述了PR的主要变化:移除安全组规则的严格顺序优先级限制。
Description check ✅ Passed 描述相关联地指向了具体的问题单号 ZSTAC-79067,说明了变更目的(移除过度严格的顺序优先级验证),并注明了来源。

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sync/ye.zou/fix/ZSTAC-79067

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.40.5)
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java
utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java

Comment @coderabbitai help to get the list of available commands and usage tips.

Resolves: ZSTAC-78989

Change-Id: I0fe3a56ab724978944c69afadaab7ff7353e4c0f
Resolves: ZSTAC-82153

Change-Id: Ib51c2e21553277416d1a9444be55aca2aa4b2fc4
…fterAddIpAddress to prevent NPE during rollback

Resolves: ZSTAC-81741

Change-Id: I53bcf20a10306afc7b6172da294d347b74e6c41f
Resolves: ZSTAC-81182

Change-Id: Id1bb642154dc66ae9995dcc4d9fc00cdce9bcaf8
@MatheMatrix MatheMatrix force-pushed the sync/ye.zou/fix/ZSTAC-79067 branch from bb2a0da to f7b3159 Compare February 13, 2026 10:47
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java`:
- Around line 491-493: The priority validation in SecurityGroupApiInterceptor
currently only enforces ao.getPriority() > 0 and misses an upper bound; update
the check in the method that validates ao.getPriority() to also ensure
ao.getPriority() <= SECURITY_GROUP_RULES_NUM_LIMIT and throw an
ApiMessageInterceptionException with a new appropriate error code (instead of
ORG_ZSTACK_NETWORK_SECURITYGROUP_10042) when the value exceeds the limit, using
a clear error message that includes the provided priority and the max allowed
(reference SECURITY_GROUP_RULES_NUM_LIMIT and the validation block around
ao.getPriority()).

Comment on lines +491 to +493
if (ao.getPriority() < 1) {
throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SECURITYGROUP_10042, "could not update security group rule priority, because priority[%d] must be positive", ao.getPriority()));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

⚠️ 更新优先级缺少最大值上限校验。
当前仅校验 > 0,会绕过 SECURITY_GROUP_RULES_NUM_LIMIT 的上限(新增/变更路径已限制)。建议补齐上限检查并使用合适的新错误码。

🔧 建议补充上限校验
             if (ao.getPriority() < 1) {
                 throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SECURITYGROUP_10042, "could not update security group rule priority, because priority[%d] must be positive", ao.getPriority()));
             }
+            if (ao.getPriority() > SecurityGroupGlobalConfig.SECURITY_GROUP_RULES_NUM_LIMIT.value(Integer.class)) {
+                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SECURITYGROUP_10042, "could not update security group rule priority, because priority[%d] exceeds the maximum limit[%d]", ao.getPriority(), SecurityGroupGlobalConfig.SECURITY_GROUP_RULES_NUM_LIMIT.value(Integer.class)));
+            }
🤖 Prompt for AI Agents
In
`@plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java`
around lines 491 - 493, The priority validation in SecurityGroupApiInterceptor
currently only enforces ao.getPriority() > 0 and misses an upper bound; update
the check in the method that validates ao.getPriority() to also ensure
ao.getPriority() <= SECURITY_GROUP_RULES_NUM_LIMIT and throw an
ApiMessageInterceptionException with a new appropriate error code (instead of
ORG_ZSTACK_NETWORK_SECURITYGROUP_10042) when the value exceeds the limit, using
a clear error message that includes the provided priority and the max allowed
(reference SECURITY_GROUP_RULES_NUM_LIMIT and the validation block around
ao.getPriority()).

gitlab and others added 8 commits February 16, 2026 00:50
<fix>[vm]: add Destroying->Stopped state transition

See merge request zstackio/zstack!9156
<fix>[i18n]: improve snapshot error message for unattached volume

See merge request zstackio/zstack!9192
<fix>[zbs]: reduce mds connect timeout and enable tryNext for volume clients

See merge request zstackio/zstack!9153
<fix>[vm]: use max of virtual and actual size for root disk allocation

See merge request zstackio/zstack!9155
…talling

In dual management node scenarios, concurrent modifications to the
consistent hash ring from heartbeat reconciliation and canonical event
callbacks can cause NodeHash/Nodes inconsistency, leading to message
routing failures and task timeouts.

Fix: (1) synchronized all ResourceDestinationMakerImpl methods to
ensure atomic nodeHash+nodes updates, (2) added lifecycleLock in
ManagementNodeManagerImpl to serialize heartbeat reconciliation with
event callbacks, (3) added two-round delayed confirmation before
removing nodes from hash ring to avoid race with NodeJoin events.

Resolves: ZSTAC-77711

Change-Id: I3d33d53595dd302784dff17417a5b25f2d0f3426
<fix>[network]: filter reserved IPs in getFreeIp

See merge request zstackio/zstack!9170
<fix>[ceph]: apply over-provisioning ratio when releasing snapshot capacity

See merge request zstackio/zstack!9162
<fix>[network]: add null check for L3 network system tags in API interceptor

See merge request zstackio/zstack!9169
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy (1)

612-617: 建议添加断言验证规则添加结果。

当前代码块添加了混合规则(ingress + egress)在优先级 12,但没有断言验证规则是否成功添加。与上方 594-600 行的处理方式(有 assert sg3.rules.find { it.priority == 13 } != null)保持一致会更好。

♻️ 建议的改进
         // ZSTAC-79067: mixed ingress+egress add at non-consecutive priority is now allowed
         sg3 = addSecurityGroupRule {
             securityGroupUuid = sg3.uuid
             rules = [rule_12, ingressRule]
             priority = 12
         }
+        assert sg3.rules.find { it.dstIpRange == "2.2.2.2-2.2.2.10" && it.priority == 12 } != null
+        assert sg3.rules.find { it.type == "Ingress" && it.dstPortRange == "12-13" } != null
test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy (1)

234-236: 存在未使用的变量 rule_1

方法 testChangeRulePriorityError 中,第 232 行获取了 rule_1 变量,但由于移除了连续优先级限制的测试断言,该变量现在未被使用。建议清理未使用的代码。

♻️ 建议的清理
     void testChangeRulePriorityError() {
         sg3 = querySecurityGroup {
             conditions = ["uuid=${sg3.uuid}"]
         }[0]

         assert sg3 != null
-        SecurityGroupRuleInventory rule_1 = sg3.rules.find { it.type == "Ingress" && it.priority == 1 && it.ipVersion == 4 }

         // ZSTAC-79067: priorities beyond current rule count are now allowed
         // (removed consecutive priority restriction)
     }

AlanJager and others added 10 commits February 16, 2026 17:11
The mdsUrls field in ExternalPrimaryStorage config contains
user:password@host format credentials. Add desensitization to
mask credentials as ***@host in API/CLI output.

Resolves: ZSTAC-80664

Change-Id: I94bdede5a1b52eb039de70efb5458693484405f7
Add ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT constant to
CloudOperationsErrorCode for use in premium volumebackup module.

Resolves: ZSTAC-82195

Change-Id: Ibc405876e1171b637cf76b91a6822574fb6e7811
<fix>[core]: synchronize consistent hash ring to prevent dual-MN race condition

See merge request zstackio/zstack!9154
SyncTaskFuture constructor calls Context.current() unconditionally,
triggering ServiceLoader for ContextStorageProvider even when
telemetry is disabled. If sentry-opentelemetry-bootstrap jar is
on classpath, ServiceLoader fails with "not a subtype" due to
ClassLoader isolation in Tomcat, throwing ServiceConfigurationError
(extends Error) that escapes all catch(Exception) blocks.

1. Why is this change necessary?
MN startup crashes with ORG_ZSTACK_CORE_WORKFLOW_10001 because
Context.current() triggers ServiceLoader unconditionally in
SyncTaskFuture constructor, even when telemetry is disabled.

2. How does it address the problem?
Only call Context.current() when isTelemetryEnabled() returns
true, matching the existing guard pattern used in other
DispatchQueueImpl code paths (lines 351, 1069).

3. Are there any side effects?
None. When telemetry is disabled, parentContext was never used.

# Summary of changes (by module):
- core/thread: conditionalize Context.current() in SyncTaskFuture

Related: ZSTAC-82275
Change-Id: I5c0e1f15769c746c630028a29df8cf1815620608
<fix>[thread]: guard Context.current() with telemetry check

See merge request zstackio/zstack!9202
<fix>[loadBalancer]: block SLB deletion during grayscale upgrade

See merge request zstackio/zstack!9187
<fix>[volumebackup]: add backup cancel timeout error code

See merge request zstackio/zstack!9200
<fix>[core]: add @nologging to sensitive config fields

See merge request zstackio/zstack!9171
…or security group rules

Resolves: ZSTAC-79067

Change-Id: I76c6d17b02f87f5836506e2c79be5538b3b0d89f
Resolves: ZSTAC-79067

Change-Id: Ib001cad155814a0018b5675fa375f63e8fc9c0ff
…onsecutive priority

Resolves: ZSTAC-79067

Change-Id: I88361b5f9bcf7ca6f0f9faccfa0f2e843d06e4cc
@MatheMatrix MatheMatrix force-pushed the sync/ye.zou/fix/ZSTAC-79067 branch from ed4530a to e5b952b Compare February 17, 2026 11:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java (1)

56-64: ⚠️ Potential issue | 🟠 Major

数据一致性问题:UsedIpVO 更新在 VmNicVO 空检查之前执行

第 58 行在检查 VmNicVO 是否存在之前,已经将 vmNicUuid 设置到了 UsedIpVO 上。如果后续第 61-64 行发现 nic 为 null 并提前返回,UsedIpVO 中将保留一个指向不存在 NIC 的无效引用,导致数据不一致。

建议将空检查移到 UsedIpVO 更新之前,或者在发现 NIC 不存在时清理 UsedIpVOvmNicUuid 字段。

🛠️ 建议修复方案
 `@Override`
 public void afterAddIpAddress(String vmNicUUid, String usedIpUuid) {
-    /* update UsedIpVO */
-    SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update();
-
     VmNicVO nic = Q.New(VmNicVO.class).eq(VmNicVO_.uuid, vmNicUUid).find();
     if (nic == null) {
         logger.debug(String.format("VmNic[uuid:%s] not found, skip afterAddIpAddress", vmNicUUid));
         return;
     }
+
+    /* update UsedIpVO */
+    SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java` around
lines 56 - 64, The UsedIpVO update is performed before checking whether the
VmNicVO exists in afterAddIpAddress, which can leave UsedIpVO.vmNicUuid pointing
to a non-existent NIC; move the VmNicVO null-check (the
Q.New(VmNicVO.class)...find() block) above the
SQL.New(UsedIpVO.class)...update() call and only perform the UsedIpVO update if
nic != null, or alternatively, if you prefer to keep the update first, add
compensating logic to clear UsedIpVO.vmNicUuid when nic == null (use
UsedIpVO_.vmNicUuid = null via SQL.New/ update) so UsedIpVO does not reference a
missing VmNicVO.
header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java (1)

115-117: ⚠️ Potential issue | 🔴 Critical

setConfig 方法绕过了脱敏处理,存在凭据泄露风险

构造函数在处理 config 时调用了 desensitizeConfig() 进行脱敏(移除 mdsUrlsmdsInfos 中URL的用户名密码),但 setConfig() 方法直接赋值而未进行脱敏。由于该类是 API 响应对象(@inventory 注解),外部通过 setConfig() 设置的配置会直接在 API 响应中暴露凭据。

建议的修复
     public void setConfig(LinkedHashMap config) {
         this.config = config;
+        desensitizeConfig(this.config);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`
around lines 115 - 117, The setConfig(LinkedHashMap config) method currently
assigns directly to the config field and bypasses the existing desensitization
logic; change setConfig to pass the incoming config through the class's
desensitizeConfig(...) routine (the same function called from the constructor)
before assigning to this.config so that mdsUrls/mdsInfos credentials are
stripped for any config provided via API; reference the
ExternalPrimaryStorageInventory class, its setConfig method, the
desensitizeConfig method, and the config field when making the change.
core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java (1)

80-93: ⚠️ Potential issue | 🔴 Critical

严重 Bug:nodes.put() 返回的是旧值而非新值

第 89 行存在逻辑错误。Map.put(key, value) 方法返回的是该 key 之前关联的旧值,而非新插入的值。当缓存未命中时(info == null),nodes.put() 会返回 null(因为之前没有该 key 的映射),导致方法最终返回 null 而不是新创建的 NodeInfo 对象。

🐛 修复建议
     `@Override`
     public synchronized NodeInfo getNodeInfo(String nodeUuid) {
         NodeInfo info = nodes.get(nodeUuid);
         if (info == null) {
             ManagementNodeVO vo = dbf.findByUuid(nodeUuid, ManagementNodeVO.class);
             if (vo == null) {
                 throw new ManagementNodeNotFoundException(nodeUuid);
             }

             nodeHash.add(nodeUuid);
-            info = nodes.put(nodeUuid, new NodeInfo(vo));
+            info = new NodeInfo(vo);
+            nodes.put(nodeUuid, info);
         }

         return info;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java`
around lines 80 - 93, The method getNodeInfo incorrectly returns the result of
nodes.put(nodeUuid, new NodeInfo(vo)) which is the previous value (often null)
instead of the newly created NodeInfo; fix by instantiating a new NodeInfo
(e.g., NodeInfo newInfo = new NodeInfo(vo)), adding the nodeUuid to nodeHash,
inserting newInfo into nodes with nodes.put(nodeUuid, newInfo) (or
nodes.putIfAbsent and handle accordingly), and then return newInfo; update
references to ManagementNodeVO, dbf.findByUuid, nodeHash.add, and nodes to
implement this change.
🧹 Nitpick comments (6)
header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java (1)

848-855: 逻辑改进合理,但存在潜在的空指针风险。

使用 Math.max(virtualSize, actualSize) 是更安全的做法,能确保为根磁盘分配足够的空间(特别是处理稀疏镜像或特殊格式时)。

但需注意:若 getImageSpec().getInventory() 返回 null,第 850-851 行会抛出 NullPointerException。虽然这是原有代码就存在的风险,但建议考虑添加防御性检查。

🛡️ 可选的防御性改进
 public long getRootDiskAllocateSize() {
     if (rootDiskOffering == null) {
+        ImageInventory inventory = this.getImageSpec().getInventory();
+        if (inventory == null) {
+            return 0;
+        }
-        long virtualSize = this.getImageSpec().getInventory().getSize();
-        long actualSize = this.getImageSpec().getInventory().getActualSize();
+        long virtualSize = inventory.getSize();
+        long actualSize = inventory.getActualSize();
         return Math.max(virtualSize, actualSize);
     }
     return rootDiskOffering.getDiskSize();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java` around lines
848 - 855, In VmInstanceSpec.getRootDiskAllocateSize(), add defensive null
checks for getImageSpec() and getImageSpec().getInventory() before accessing
getSize()/getActualSize() to avoid NullPointerException; if either is null,
return a sensible fallback (e.g., 0 or rootDiskOffering.getDiskSize() when
available) or document the behavior, ensuring the code still uses
Math.max(virtualSize, actualSize) when both values are present; reference the
method getRootDiskAllocateSize(), getImageSpec(), getInventory(), virtualSize,
and actualSize when applying the checks and fallback.
test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy (1)

234-236: 测试方法 testChangeRulePriorityError 可能需要补充验证逻辑。

当前方法仅保留了注释说明行为变更,但没有任何实际的断言或验证。建议补充对新行为的正向验证,例如:

  • 验证超过全局上限 (SECURITY_GROUP_RULES_NUM_LIMIT) 的优先级会抛出错误
  • 验证正整数优先级能够正常设置
♻️ 建议补充测试验证
     void testChangeRulePriorityError() {
         sg3 = querySecurityGroup {
             conditions = ["uuid=${sg3.uuid}"]
         }[0]

         assert sg3 != null
         SecurityGroupRuleInventory rule_1 = sg3.rules.find { it.type == "Ingress" && it.priority == 1 && it.ipVersion == 4 }

         // ZSTAC-79067: priorities beyond current rule count are now allowed
         // (removed consecutive priority restriction)
+
+        // Verify that priority exceeding global limit (101) should still fail
+        expect(AssertionError) {
+            changeSecurityGroupRule {
+                uuid = rule_1.uuid
+                priority = 101
+            }
+        }
+
+        // Verify that valid non-consecutive priority should succeed
+        changeSecurityGroupRule {
+            uuid = rule_1.uuid
+            priority = 10
+        }
+        assert Q.New(SecurityGroupRuleVO).eq(SecurityGroupRuleVO_.uuid, rule_1.uuid).find().priority == 10
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy`
around lines 234 - 236, The test method testChangeRulePriorityError in
ChangeSecurityGroupRuleCase.groovy currently only documents behavior change; add
concrete assertions: (1) attempt to set a priority greater than
SECURITY_GROUP_RULES_NUM_LIMIT and assert an error/exception is thrown (or API
returns failure) to validate the global upper bound, and (2) set a valid
positive priority (e.g., a value within limits) and assert the change succeeds
and the rule's priority is updated (check the returned rule or query the
security group rules). Locate and update the testChangeRulePriorityError method
to perform these two checks using the existing test helpers/assertion utilities
used elsewhere in the class.
test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy (1)

613-618: 建议补充添加后的验证断言。

当前代码仅执行了 addSecurityGroupRule 但没有验证结果。建议添加断言确认规则成功创建且优先级正确。

♻️ 建议补充验证
         // ZSTAC-79067: mixed ingress+egress add at non-consecutive priority is now allowed
         sg3 = addSecurityGroupRule {
             securityGroupUuid = sg3.uuid
             rules = [rule_12, ingressRule]
             priority = 12
         }
+
+        // Verify rules were added successfully
+        assert sg3.rules.find { it.dstIpRange == "2.2.2.2-2.2.2.10" && it.priority == 12 } != null
+        assert sg3.rules.find { it.type == "Ingress" && it.dstPortRange == "12-13" } != null
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy`
around lines 613 - 618, 缺少对 addSecurityGroupRule 调用结果的断言验证;在调用 sg3 =
addSecurityGroupRule { securityGroupUuid = sg3.uuid; rules = [rule_12,
ingressRule]; priority = 12 } 之后,添加断言以验证新规则已被创建且优先级为
12。具体修复:调用返回后使用现有检查方法(例如查询安全组规则列表或使用 findSecurityGroupRule/getSecurityGroup 等
helper)确认包含 rule_12 和 ingressRule 并且其 priority 字段为 12,同时断言返回的 sg3.rules
或查询结果大小与预期一致并抛出失败说明;参考符号:addSecurityGroupRule、sg3、rule_12、ingressRule、priority。
network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java (1)

1078-1083: 筛选后可能返回少于 limit 的结果

当前先按 limit 拉取,再过滤保留段,最终数量可能小于 msg.getLimit() 即使仍有可用 IP。若 API 语义期望“尽量填满 limit”,考虑在获取阶段过滤或过滤后继续补齐。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java` around lines
1078 - 1083, 当前实现先按 msg.getLimit() 从 DB 拉取 freeIpInventorys 再对 reservedIpRanges
过滤(使用 reservedIpRanges、freeIpInventorys、NetworkUtils.isInRange),导致返回结果可能少于
msg.getLimit()。修复方法:在 L3BasicNetwork 相关方法中把过滤条件下推到查询层或在内存过滤后补齐结果——比如在 DB 查询中添加排除
reservedIpRanges 的条件,或实现一个循环/分页逻辑:累积非保留 IP 到结果集中,若不足 msg.getLimit()
则继续查询下一批直到满足或无更多条目(使用 msg.getLimit()、reservedIpRanges、freeIpInventorys
标识位置以定位并修改代码)。
header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java (1)

80-89: 建议:增加对列表元素为 null 的处理

当列表中包含 null 元素时,String.valueOf(url) 会返回字面量字符串 "null",这可能不是预期的行为。建议添加 null 检查以保持原值或跳过处理。

♻️ 建议的修复
     private static void desensitizeUrlList(Map config, String key) {
         Object urls = config.get(key);
         if (urls instanceof List) {
             List<String> desensitized = new ArrayList<>();
             for (Object url : (List) urls) {
-                desensitized.add(desensitizeUrl(String.valueOf(url)));
+                if (url == null) {
+                    desensitized.add(null);
+                } else {
+                    desensitized.add(desensitizeUrl(String.valueOf(url)));
+                }
             }
             config.put(key, desensitized);
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`
around lines 80 - 89, The desensitizeUrlList method currently converts each list
element with String.valueOf(url), which turns nulls into the literal "null";
update desensitizeUrlList to check for null elements in the List retrieved from
config.get(key) and either preserve the null value (add null to desensitized) or
skip null entries instead of calling String.valueOf; call desensitizeUrl only
for non-null entries and then put the new List back into config under the same
key so desensitized values and original nulls are handled correctly.
portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java (1)

890-898: 建议:为 nodeLeft 创建的 ManagementNodeInventory 数据不完整

当节点从数据库中消失时,创建的 ManagementNodeInventory 只包含 uuid 和可能的 hostname,缺少其他字段(如 stateheartBeat 等)。下游的 nodeLeft 处理逻辑如果依赖这些字段可能会出现问题。

💡 建议:考虑从 joinedManagementNodes 获取完整的 inventory
                             if (suspectedMissingFromDb.contains(nodeUuid)) {
                                 // Second consecutive detection — confirmed missing, remove from hash ring
                                 logger.warn(String.format("management node[uuid:%s] confirmed missing from database for two consecutive" +
                                         " heartbeat cycles, removing from hash ring", nodeUuid));
-                                ManagementNodeInventory inv = new ManagementNodeInventory();
-                                inv.setUuid(nodeUuid);
-                                try {
-                                    inv.setHostName(destinationMaker.getNodeInfo(nodeUuid).getNodeIP());
-                                } catch (Exception e) {
-                                    logger.warn(String.format("cannot get node info for node[uuid:%s], use empty hostname", nodeUuid));
-                                }
+                                ManagementNodeInventory inv = joinedManagementNodes.get(nodeUuid);
+                                if (inv == null) {
+                                    inv = new ManagementNodeInventory();
+                                    inv.setUuid(nodeUuid);
+                                    try {
+                                        inv.setHostName(destinationMaker.getNodeInfo(nodeUuid).getNodeIP());
+                                    } catch (Exception e) {
+                                        logger.warn(String.format("cannot get node info for node[uuid:%s], use empty hostname", nodeUuid));
+                                    }
+                                }

                                 nodeLifeCycle.nodeLeft(inv);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java`
around lines 890 - 898, The ManagementNodeInventory passed to
nodeLifeCycle.nodeLeft is incomplete (only uuid/hostname) which can break
downstream logic; modify the code that creates the inventory (currently using
new ManagementNodeInventory() and destinationMaker.getNodeInfo(nodeUuid)) to
first try to retrieve the complete inventory from the joinedManagementNodes
cache or source of truth (e.g., joinedManagementNodes.get(nodeUuid) / DB lookup)
and fall back to the minimal inventory only if none found; ensure fields like
state, lastHeartBeat/heartBeat, hostName, and any other relevant metadata on
ManagementNodeInventory are populated (or set to explicit defaults such as
state=Left/Unknown and heartBeat=null) before calling
nodeLifeCycle.nodeLeft(inv).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java`:
- Around line 1078-1083: reservedIpRanges may mix IPv4/IPv6 and comparing across
versions can throw or wrongly remove entries; update the removal logic in
L3BasicNetwork so for each free IP (freeIpInventorys -> freeIp) you only check
reserved ranges with matching ip version: filter reservedIpRanges by
r.getIpVersion().equals(freeIp.getIpVersion()) before calling
NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()), then
remove if any match. Target the block referencing reservedIpRanges,
freeIpInventorys/freeIp and NetworkUtils.isInRange.

---

Outside diff comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java`:
- Around line 56-64: The UsedIpVO update is performed before checking whether
the VmNicVO exists in afterAddIpAddress, which can leave UsedIpVO.vmNicUuid
pointing to a non-existent NIC; move the VmNicVO null-check (the
Q.New(VmNicVO.class)...find() block) above the
SQL.New(UsedIpVO.class)...update() call and only perform the UsedIpVO update if
nic != null, or alternatively, if you prefer to keep the update first, add
compensating logic to clear UsedIpVO.vmNicUuid when nic == null (use
UsedIpVO_.vmNicUuid = null via SQL.New/ update) so UsedIpVO does not reference a
missing VmNicVO.

In
`@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java`:
- Around line 80-93: The method getNodeInfo incorrectly returns the result of
nodes.put(nodeUuid, new NodeInfo(vo)) which is the previous value (often null)
instead of the newly created NodeInfo; fix by instantiating a new NodeInfo
(e.g., NodeInfo newInfo = new NodeInfo(vo)), adding the nodeUuid to nodeHash,
inserting newInfo into nodes with nodes.put(nodeUuid, newInfo) (or
nodes.putIfAbsent and handle accordingly), and then return newInfo; update
references to ManagementNodeVO, dbf.findByUuid, nodeHash.add, and nodes to
implement this change.

In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`:
- Around line 115-117: The setConfig(LinkedHashMap config) method currently
assigns directly to the config field and bypasses the existing desensitization
logic; change setConfig to pass the incoming config through the class's
desensitizeConfig(...) routine (the same function called from the constructor)
before assigning to this.config so that mdsUrls/mdsInfos credentials are
stripped for any config provided via API; reference the
ExternalPrimaryStorageInventory class, its setConfig method, the
desensitizeConfig method, and the config field when making the change.

---

Duplicate comments:
In
`@plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java`:
- Around line 491-493: The current validation in SecurityGroupApiInterceptor
only checks ao.getPriority() < 1; add an upper-bound check against
SECURITY_GROUP_RULES_NUM_LIMIT and throw the same
ApiMessageInterceptionException via argerr (using
ORG_ZSTACK_NETWORK_SECURITYGROUP_10042) when ao.getPriority() >
SECURITY_GROUP_RULES_NUM_LIMIT so the priority must be within
1..SECURITY_GROUP_RULES_NUM_LIMIT; locate the logic around ao.getPriority(),
ApiMessageInterceptionException, argerr and SECURITY_GROUP_RULES_NUM_LIMIT and
mirror the error/message style used in
APIChangeSecurityGroupRuleMsg/APIAddSecurityGroupRuleMsg.

---

Nitpick comments:
In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`:
- Around line 80-89: The desensitizeUrlList method currently converts each list
element with String.valueOf(url), which turns nulls into the literal "null";
update desensitizeUrlList to check for null elements in the List retrieved from
config.get(key) and either preserve the null value (add null to desensitized) or
skip null entries instead of calling String.valueOf; call desensitizeUrl only
for non-null entries and then put the new List back into config under the same
key so desensitized values and original nulls are handled correctly.

In `@header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java`:
- Around line 848-855: In VmInstanceSpec.getRootDiskAllocateSize(), add
defensive null checks for getImageSpec() and getImageSpec().getInventory()
before accessing getSize()/getActualSize() to avoid NullPointerException; if
either is null, return a sensible fallback (e.g., 0 or
rootDiskOffering.getDiskSize() when available) or document the behavior,
ensuring the code still uses Math.max(virtualSize, actualSize) when both values
are present; reference the method getRootDiskAllocateSize(), getImageSpec(),
getInventory(), virtualSize, and actualSize when applying the checks and
fallback.

In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java`:
- Around line 1078-1083: 当前实现先按 msg.getLimit() 从 DB 拉取 freeIpInventorys 再对
reservedIpRanges 过滤(使用
reservedIpRanges、freeIpInventorys、NetworkUtils.isInRange),导致返回结果可能少于
msg.getLimit()。修复方法:在 L3BasicNetwork 相关方法中把过滤条件下推到查询层或在内存过滤后补齐结果——比如在 DB 查询中添加排除
reservedIpRanges 的条件,或实现一个循环/分页逻辑:累积非保留 IP 到结果集中,若不足 msg.getLimit()
则继续查询下一批直到满足或无更多条目(使用 msg.getLimit()、reservedIpRanges、freeIpInventorys
标识位置以定位并修改代码)。

In
`@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java`:
- Around line 890-898: The ManagementNodeInventory passed to
nodeLifeCycle.nodeLeft is incomplete (only uuid/hostname) which can break
downstream logic; modify the code that creates the inventory (currently using
new ManagementNodeInventory() and destinationMaker.getNodeInfo(nodeUuid)) to
first try to retrieve the complete inventory from the joinedManagementNodes
cache or source of truth (e.g., joinedManagementNodes.get(nodeUuid) / DB lookup)
and fall back to the minimal inventory only if none found; ensure fields like
state, lastHeartBeat/heartBeat, hostName, and any other relevant metadata on
ManagementNodeInventory are populated (or set to explicit defaults such as
state=Left/Unknown and heartBeat=null) before calling
nodeLifeCycle.nodeLeft(inv).

In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy`:
- Around line 613-618: 缺少对 addSecurityGroupRule 调用结果的断言验证;在调用 sg3 =
addSecurityGroupRule { securityGroupUuid = sg3.uuid; rules = [rule_12,
ingressRule]; priority = 12 } 之后,添加断言以验证新规则已被创建且优先级为
12。具体修复:调用返回后使用现有检查方法(例如查询安全组规则列表或使用 findSecurityGroupRule/getSecurityGroup 等
helper)确认包含 rule_12 和 ingressRule 并且其 priority 字段为 12,同时断言返回的 sg3.rules
或查询结果大小与预期一致并抛出失败说明;参考符号:addSecurityGroupRule、sg3、rule_12、ingressRule、priority。

In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy`:
- Around line 234-236: The test method testChangeRulePriorityError in
ChangeSecurityGroupRuleCase.groovy currently only documents behavior change; add
concrete assertions: (1) attempt to set a priority greater than
SECURITY_GROUP_RULES_NUM_LIMIT and assert an error/exception is thrown (or API
returns failure) to validate the global upper bound, and (2) set a valid
positive priority (e.g., a value within limits) and assert the change succeeds
and the rule's priority is updated (check the returned rule or query the
security group rules). Locate and update the testChangeRulePriorityError method
to perform these two checks using the existing test helpers/assertion utilities
used elsewhere in the class.

Comment on lines +1078 to +1083

Set<ReservedIpRangeVO> reservedIpRanges = self.getReservedIpRanges();
if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) {
freeIpInventorys.removeIf(freeIp -> reservedIpRanges.stream().anyMatch(
r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp())));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

避免跨 IP 版本比较引发异常或误删

reservedIpRanges 可能同时包含 IPv4/IPv6 范围,但当前对每个 free IP 都遍历所有范围;当版本不一致时,底层比较可能抛异常或产生错误过滤。建议先按 ipVersion 过滤再做 isInRange

🔧 建议修改
-        if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) {
-            freeIpInventorys.removeIf(freeIp -> reservedIpRanges.stream().anyMatch(
-                    r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp())));
-        }
+        if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) {
+            freeIpInventorys.removeIf(freeIp -> {
+                boolean isIpv4 = NetworkUtils.isIpv4Address(freeIp.getIp());
+                int ipVersion = isIpv4 ? IPv6Constants.IPv4 : IPv6Constants.IPv6;
+                return reservedIpRanges.stream()
+                        .filter(r -> r.getIpVersion() == ipVersion)
+                        .anyMatch(r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()));
+            });
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java` around lines
1078 - 1083, reservedIpRanges may mix IPv4/IPv6 and comparing across versions
can throw or wrongly remove entries; update the removal logic in L3BasicNetwork
so for each free IP (freeIpInventorys -> freeIp) you only check reserved ranges
with matching ip version: filter reservedIpRanges by
r.getIpVersion().equals(freeIp.getIpVersion()) before calling
NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()), then
remove if any match. Target the block referencing reservedIpRanges,
freeIpInventorys/freeIp and NetworkUtils.isInRange.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants