<fix>[securitygroup]: remove strict sequential priority restriction#3344
<fix>[securitygroup]: remove strict sequential priority restriction#3344MatheMatrix wants to merge 27 commits into5.5.6from
Conversation
…isk offering Resolves: ZSTAC-74683 Change-Id: Id0339ed0221e92e506f60745cde972cc3ee6d9ae
When anti-split-brain check selects a disconnected MDS node, the HTTP call now times out after 30s instead of 5+ minutes, and automatically retries the next available MDS via tryNext mechanism. Resolves: ZSTAC-80595 Change-Id: I1be80f1b70cad1606eb38d1f0078c8f2781e6941
When MN restarts during a destroy operation, the hypervisor may report the VM as Stopped. Without this transition, the state machine throws an exception and the VM stays stuck in Destroying state forever. Resolves: ZSTAC-80620 Change-Id: I037edba70d145a44a88ce0d3573089182fedb162
…pacity Resolves: ZSTAC-79709 Change-Id: I45a2133bbb8c51c25ae3549d59e588976192a08d
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough放宽安全组规则优先级验证为仅要求正整数并保留全局上限,并发与稳定性改进(管理节点/调度/资源目的地同步化)、若干存储/网络/VM细节修正、测试用例调整以匹配新优先级行为,以及新增错误码与数据脱敏逻辑。 Changes
Sequence Diagram(s)sequenceDiagram
participant Heartbeat as Heartbeat/Reconciler
participant Manager as ManagementNodeManager
participant DB as Database
participant HashRing as HashRing
Note over Heartbeat,Manager: 周期性心跳/重建流程
Heartbeat->>Manager: 检测到节点列表差异
Manager->>DB: 查询 ManagementNodeVO(可能缺失)
alt 节点在 DB 中存在
Manager->>HashRing: schedule nodeJoin(on node's thread)
HashRing-->>Manager: 更新/加入节点信息
else 节点在 DB 中缺失(第一次)
Manager->>Manager: 标记 suspectedMissingFromDb(暂存)
else 节点在 DB 中缺失(连续检测)
Manager->>HashRing: 从 hash ring 移除节点
HashRing-->>Manager: 触发 nodeLeft 流程
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.40.5)plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.javautils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.javaComment |
Resolves: ZSTAC-78989 Change-Id: I0fe3a56ab724978944c69afadaab7ff7353e4c0f
Resolves: ZSTAC-82153 Change-Id: Ib51c2e21553277416d1a9444be55aca2aa4b2fc4
…fterAddIpAddress to prevent NPE during rollback Resolves: ZSTAC-81741 Change-Id: I53bcf20a10306afc7b6172da294d347b74e6c41f
Resolves: ZSTAC-81182 Change-Id: Id1bb642154dc66ae9995dcc4d9fc00cdce9bcaf8
bb2a0da to
f7b3159
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java`:
- Around line 491-493: The priority validation in SecurityGroupApiInterceptor
currently only enforces ao.getPriority() > 0 and misses an upper bound; update
the check in the method that validates ao.getPriority() to also ensure
ao.getPriority() <= SECURITY_GROUP_RULES_NUM_LIMIT and throw an
ApiMessageInterceptionException with a new appropriate error code (instead of
ORG_ZSTACK_NETWORK_SECURITYGROUP_10042) when the value exceeds the limit, using
a clear error message that includes the provided priority and the max allowed
(reference SECURITY_GROUP_RULES_NUM_LIMIT and the validation block around
ao.getPriority()).
| if (ao.getPriority() < 1) { | ||
| throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SECURITYGROUP_10042, "could not update security group rule priority, because priority[%d] must be positive", ao.getPriority())); | ||
| } |
There was a problem hiding this comment.
当前仅校验 > 0,会绕过 SECURITY_GROUP_RULES_NUM_LIMIT 的上限(新增/变更路径已限制)。建议补齐上限检查并使用合适的新错误码。
🔧 建议补充上限校验
if (ao.getPriority() < 1) {
throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SECURITYGROUP_10042, "could not update security group rule priority, because priority[%d] must be positive", ao.getPriority()));
}
+ if (ao.getPriority() > SecurityGroupGlobalConfig.SECURITY_GROUP_RULES_NUM_LIMIT.value(Integer.class)) {
+ throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SECURITYGROUP_10042, "could not update security group rule priority, because priority[%d] exceeds the maximum limit[%d]", ao.getPriority(), SecurityGroupGlobalConfig.SECURITY_GROUP_RULES_NUM_LIMIT.value(Integer.class)));
+ }🤖 Prompt for AI Agents
In
`@plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java`
around lines 491 - 493, The priority validation in SecurityGroupApiInterceptor
currently only enforces ao.getPriority() > 0 and misses an upper bound; update
the check in the method that validates ao.getPriority() to also ensure
ao.getPriority() <= SECURITY_GROUP_RULES_NUM_LIMIT and throw an
ApiMessageInterceptionException with a new appropriate error code (instead of
ORG_ZSTACK_NETWORK_SECURITYGROUP_10042) when the value exceeds the limit, using
a clear error message that includes the provided priority and the max allowed
(reference SECURITY_GROUP_RULES_NUM_LIMIT and the validation block around
ao.getPriority()).
<fix>[vm]: add Destroying->Stopped state transition See merge request zstackio/zstack!9156
<fix>[i18n]: improve snapshot error message for unattached volume See merge request zstackio/zstack!9192
<fix>[zbs]: reduce mds connect timeout and enable tryNext for volume clients See merge request zstackio/zstack!9153
<fix>[vm]: use max of virtual and actual size for root disk allocation See merge request zstackio/zstack!9155
…talling In dual management node scenarios, concurrent modifications to the consistent hash ring from heartbeat reconciliation and canonical event callbacks can cause NodeHash/Nodes inconsistency, leading to message routing failures and task timeouts. Fix: (1) synchronized all ResourceDestinationMakerImpl methods to ensure atomic nodeHash+nodes updates, (2) added lifecycleLock in ManagementNodeManagerImpl to serialize heartbeat reconciliation with event callbacks, (3) added two-round delayed confirmation before removing nodes from hash ring to avoid race with NodeJoin events. Resolves: ZSTAC-77711 Change-Id: I3d33d53595dd302784dff17417a5b25f2d0f3426
<fix>[network]: filter reserved IPs in getFreeIp See merge request zstackio/zstack!9170
<fix>[ceph]: apply over-provisioning ratio when releasing snapshot capacity See merge request zstackio/zstack!9162
<fix>[network]: add null check for L3 network system tags in API interceptor See merge request zstackio/zstack!9169
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy (1)
612-617: 建议添加断言验证规则添加结果。当前代码块添加了混合规则(ingress + egress)在优先级 12,但没有断言验证规则是否成功添加。与上方 594-600 行的处理方式(有
assert sg3.rules.find { it.priority == 13 } != null)保持一致会更好。♻️ 建议的改进
// ZSTAC-79067: mixed ingress+egress add at non-consecutive priority is now allowed sg3 = addSecurityGroupRule { securityGroupUuid = sg3.uuid rules = [rule_12, ingressRule] priority = 12 } + assert sg3.rules.find { it.dstIpRange == "2.2.2.2-2.2.2.10" && it.priority == 12 } != null + assert sg3.rules.find { it.type == "Ingress" && it.dstPortRange == "12-13" } != nulltest/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy (1)
234-236: 存在未使用的变量rule_1。方法
testChangeRulePriorityError中,第 232 行获取了rule_1变量,但由于移除了连续优先级限制的测试断言,该变量现在未被使用。建议清理未使用的代码。♻️ 建议的清理
void testChangeRulePriorityError() { sg3 = querySecurityGroup { conditions = ["uuid=${sg3.uuid}"] }[0] assert sg3 != null - SecurityGroupRuleInventory rule_1 = sg3.rules.find { it.type == "Ingress" && it.priority == 1 && it.ipVersion == 4 } // ZSTAC-79067: priorities beyond current rule count are now allowed // (removed consecutive priority restriction) }
The mdsUrls field in ExternalPrimaryStorage config contains user:password@host format credentials. Add desensitization to mask credentials as ***@host in API/CLI output. Resolves: ZSTAC-80664 Change-Id: I94bdede5a1b52eb039de70efb5458693484405f7
Add ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT constant to CloudOperationsErrorCode for use in premium volumebackup module. Resolves: ZSTAC-82195 Change-Id: Ibc405876e1171b637cf76b91a6822574fb6e7811
<fix>[core]: synchronize consistent hash ring to prevent dual-MN race condition See merge request zstackio/zstack!9154
SyncTaskFuture constructor calls Context.current() unconditionally, triggering ServiceLoader for ContextStorageProvider even when telemetry is disabled. If sentry-opentelemetry-bootstrap jar is on classpath, ServiceLoader fails with "not a subtype" due to ClassLoader isolation in Tomcat, throwing ServiceConfigurationError (extends Error) that escapes all catch(Exception) blocks. 1. Why is this change necessary? MN startup crashes with ORG_ZSTACK_CORE_WORKFLOW_10001 because Context.current() triggers ServiceLoader unconditionally in SyncTaskFuture constructor, even when telemetry is disabled. 2. How does it address the problem? Only call Context.current() when isTelemetryEnabled() returns true, matching the existing guard pattern used in other DispatchQueueImpl code paths (lines 351, 1069). 3. Are there any side effects? None. When telemetry is disabled, parentContext was never used. # Summary of changes (by module): - core/thread: conditionalize Context.current() in SyncTaskFuture Related: ZSTAC-82275 Change-Id: I5c0e1f15769c746c630028a29df8cf1815620608
<fix>[thread]: guard Context.current() with telemetry check See merge request zstackio/zstack!9202
<fix>[loadBalancer]: block SLB deletion during grayscale upgrade See merge request zstackio/zstack!9187
<fix>[volumebackup]: add backup cancel timeout error code See merge request zstackio/zstack!9200
<fix>[core]: add @nologging to sensitive config fields See merge request zstackio/zstack!9171
…or security group rules Resolves: ZSTAC-79067 Change-Id: I76c6d17b02f87f5836506e2c79be5538b3b0d89f
Resolves: ZSTAC-79067 Change-Id: Ib001cad155814a0018b5675fa375f63e8fc9c0ff
…onsecutive priority Resolves: ZSTAC-79067 Change-Id: I88361b5f9bcf7ca6f0f9faccfa0f2e843d06e4cc
ed4530a to
e5b952b
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java (1)
56-64:⚠️ Potential issue | 🟠 Major数据一致性问题:
UsedIpVO更新在VmNicVO空检查之前执行第 58 行在检查
VmNicVO是否存在之前,已经将vmNicUuid设置到了UsedIpVO上。如果后续第 61-64 行发现nic为 null 并提前返回,UsedIpVO中将保留一个指向不存在 NIC 的无效引用,导致数据不一致。建议将空检查移到
UsedIpVO更新之前,或者在发现 NIC 不存在时清理UsedIpVO的vmNicUuid字段。🛠️ 建议修复方案
`@Override` public void afterAddIpAddress(String vmNicUUid, String usedIpUuid) { - /* update UsedIpVO */ - SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update(); - VmNicVO nic = Q.New(VmNicVO.class).eq(VmNicVO_.uuid, vmNicUUid).find(); if (nic == null) { logger.debug(String.format("VmNic[uuid:%s] not found, skip afterAddIpAddress", vmNicUUid)); return; } + + /* update UsedIpVO */ + SQL.New(UsedIpVO.class).eq(UsedIpVO_.uuid, usedIpUuid).set(UsedIpVO_.vmNicUuid, vmNicUUid).update();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java` around lines 56 - 64, The UsedIpVO update is performed before checking whether the VmNicVO exists in afterAddIpAddress, which can leave UsedIpVO.vmNicUuid pointing to a non-existent NIC; move the VmNicVO null-check (the Q.New(VmNicVO.class)...find() block) above the SQL.New(UsedIpVO.class)...update() call and only perform the UsedIpVO update if nic != null, or alternatively, if you prefer to keep the update first, add compensating logic to clear UsedIpVO.vmNicUuid when nic == null (use UsedIpVO_.vmNicUuid = null via SQL.New/ update) so UsedIpVO does not reference a missing VmNicVO.header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java (1)
115-117:⚠️ Potential issue | 🔴 Critical
setConfig方法绕过了脱敏处理,存在凭据泄露风险构造函数在处理
config时调用了desensitizeConfig()进行脱敏(移除mdsUrls和mdsInfos中URL的用户名密码),但setConfig()方法直接赋值而未进行脱敏。由于该类是 API 响应对象(@inventory 注解),外部通过setConfig()设置的配置会直接在 API 响应中暴露凭据。建议的修复
public void setConfig(LinkedHashMap config) { this.config = config; + desensitizeConfig(this.config); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java` around lines 115 - 117, The setConfig(LinkedHashMap config) method currently assigns directly to the config field and bypasses the existing desensitization logic; change setConfig to pass the incoming config through the class's desensitizeConfig(...) routine (the same function called from the constructor) before assigning to this.config so that mdsUrls/mdsInfos credentials are stripped for any config provided via API; reference the ExternalPrimaryStorageInventory class, its setConfig method, the desensitizeConfig method, and the config field when making the change.core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java (1)
80-93:⚠️ Potential issue | 🔴 Critical严重 Bug:
nodes.put()返回的是旧值而非新值第 89 行存在逻辑错误。
Map.put(key, value)方法返回的是该 key 之前关联的旧值,而非新插入的值。当缓存未命中时(info == null),nodes.put()会返回null(因为之前没有该 key 的映射),导致方法最终返回null而不是新创建的NodeInfo对象。🐛 修复建议
`@Override` public synchronized NodeInfo getNodeInfo(String nodeUuid) { NodeInfo info = nodes.get(nodeUuid); if (info == null) { ManagementNodeVO vo = dbf.findByUuid(nodeUuid, ManagementNodeVO.class); if (vo == null) { throw new ManagementNodeNotFoundException(nodeUuid); } nodeHash.add(nodeUuid); - info = nodes.put(nodeUuid, new NodeInfo(vo)); + info = new NodeInfo(vo); + nodes.put(nodeUuid, info); } return info; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java` around lines 80 - 93, The method getNodeInfo incorrectly returns the result of nodes.put(nodeUuid, new NodeInfo(vo)) which is the previous value (often null) instead of the newly created NodeInfo; fix by instantiating a new NodeInfo (e.g., NodeInfo newInfo = new NodeInfo(vo)), adding the nodeUuid to nodeHash, inserting newInfo into nodes with nodes.put(nodeUuid, newInfo) (or nodes.putIfAbsent and handle accordingly), and then return newInfo; update references to ManagementNodeVO, dbf.findByUuid, nodeHash.add, and nodes to implement this change.
🧹 Nitpick comments (6)
header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java (1)
848-855: 逻辑改进合理,但存在潜在的空指针风险。使用
Math.max(virtualSize, actualSize)是更安全的做法,能确保为根磁盘分配足够的空间(特别是处理稀疏镜像或特殊格式时)。但需注意:若
getImageSpec().getInventory()返回null,第 850-851 行会抛出NullPointerException。虽然这是原有代码就存在的风险,但建议考虑添加防御性检查。🛡️ 可选的防御性改进
public long getRootDiskAllocateSize() { if (rootDiskOffering == null) { + ImageInventory inventory = this.getImageSpec().getInventory(); + if (inventory == null) { + return 0; + } - long virtualSize = this.getImageSpec().getInventory().getSize(); - long actualSize = this.getImageSpec().getInventory().getActualSize(); + long virtualSize = inventory.getSize(); + long actualSize = inventory.getActualSize(); return Math.max(virtualSize, actualSize); } return rootDiskOffering.getDiskSize(); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java` around lines 848 - 855, In VmInstanceSpec.getRootDiskAllocateSize(), add defensive null checks for getImageSpec() and getImageSpec().getInventory() before accessing getSize()/getActualSize() to avoid NullPointerException; if either is null, return a sensible fallback (e.g., 0 or rootDiskOffering.getDiskSize() when available) or document the behavior, ensuring the code still uses Math.max(virtualSize, actualSize) when both values are present; reference the method getRootDiskAllocateSize(), getImageSpec(), getInventory(), virtualSize, and actualSize when applying the checks and fallback.test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy (1)
234-236: 测试方法testChangeRulePriorityError可能需要补充验证逻辑。当前方法仅保留了注释说明行为变更,但没有任何实际的断言或验证。建议补充对新行为的正向验证,例如:
- 验证超过全局上限 (
SECURITY_GROUP_RULES_NUM_LIMIT) 的优先级会抛出错误- 验证正整数优先级能够正常设置
♻️ 建议补充测试验证
void testChangeRulePriorityError() { sg3 = querySecurityGroup { conditions = ["uuid=${sg3.uuid}"] }[0] assert sg3 != null SecurityGroupRuleInventory rule_1 = sg3.rules.find { it.type == "Ingress" && it.priority == 1 && it.ipVersion == 4 } // ZSTAC-79067: priorities beyond current rule count are now allowed // (removed consecutive priority restriction) + + // Verify that priority exceeding global limit (101) should still fail + expect(AssertionError) { + changeSecurityGroupRule { + uuid = rule_1.uuid + priority = 101 + } + } + + // Verify that valid non-consecutive priority should succeed + changeSecurityGroupRule { + uuid = rule_1.uuid + priority = 10 + } + assert Q.New(SecurityGroupRuleVO).eq(SecurityGroupRuleVO_.uuid, rule_1.uuid).find().priority == 10 }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy` around lines 234 - 236, The test method testChangeRulePriorityError in ChangeSecurityGroupRuleCase.groovy currently only documents behavior change; add concrete assertions: (1) attempt to set a priority greater than SECURITY_GROUP_RULES_NUM_LIMIT and assert an error/exception is thrown (or API returns failure) to validate the global upper bound, and (2) set a valid positive priority (e.g., a value within limits) and assert the change succeeds and the rule's priority is updated (check the returned rule or query the security group rules). Locate and update the testChangeRulePriorityError method to perform these two checks using the existing test helpers/assertion utilities used elsewhere in the class.test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy (1)
613-618: 建议补充添加后的验证断言。当前代码仅执行了
addSecurityGroupRule但没有验证结果。建议添加断言确认规则成功创建且优先级正确。♻️ 建议补充验证
// ZSTAC-79067: mixed ingress+egress add at non-consecutive priority is now allowed sg3 = addSecurityGroupRule { securityGroupUuid = sg3.uuid rules = [rule_12, ingressRule] priority = 12 } + + // Verify rules were added successfully + assert sg3.rules.find { it.dstIpRange == "2.2.2.2-2.2.2.10" && it.priority == 12 } != null + assert sg3.rules.find { it.type == "Ingress" && it.dstPortRange == "12-13" } != null🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy` around lines 613 - 618, 缺少对 addSecurityGroupRule 调用结果的断言验证;在调用 sg3 = addSecurityGroupRule { securityGroupUuid = sg3.uuid; rules = [rule_12, ingressRule]; priority = 12 } 之后,添加断言以验证新规则已被创建且优先级为 12。具体修复:调用返回后使用现有检查方法(例如查询安全组规则列表或使用 findSecurityGroupRule/getSecurityGroup 等 helper)确认包含 rule_12 和 ingressRule 并且其 priority 字段为 12,同时断言返回的 sg3.rules 或查询结果大小与预期一致并抛出失败说明;参考符号:addSecurityGroupRule、sg3、rule_12、ingressRule、priority。network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java (1)
1078-1083: 筛选后可能返回少于 limit 的结果当前先按
limit拉取,再过滤保留段,最终数量可能小于msg.getLimit()即使仍有可用 IP。若 API 语义期望“尽量填满 limit”,考虑在获取阶段过滤或过滤后继续补齐。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java` around lines 1078 - 1083, 当前实现先按 msg.getLimit() 从 DB 拉取 freeIpInventorys 再对 reservedIpRanges 过滤(使用 reservedIpRanges、freeIpInventorys、NetworkUtils.isInRange),导致返回结果可能少于 msg.getLimit()。修复方法:在 L3BasicNetwork 相关方法中把过滤条件下推到查询层或在内存过滤后补齐结果——比如在 DB 查询中添加排除 reservedIpRanges 的条件,或实现一个循环/分页逻辑:累积非保留 IP 到结果集中,若不足 msg.getLimit() 则继续查询下一批直到满足或无更多条目(使用 msg.getLimit()、reservedIpRanges、freeIpInventorys 标识位置以定位并修改代码)。header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java (1)
80-89: 建议:增加对列表元素为 null 的处理当列表中包含 null 元素时,
String.valueOf(url)会返回字面量字符串"null",这可能不是预期的行为。建议添加 null 检查以保持原值或跳过处理。♻️ 建议的修复
private static void desensitizeUrlList(Map config, String key) { Object urls = config.get(key); if (urls instanceof List) { List<String> desensitized = new ArrayList<>(); for (Object url : (List) urls) { - desensitized.add(desensitizeUrl(String.valueOf(url))); + if (url == null) { + desensitized.add(null); + } else { + desensitized.add(desensitizeUrl(String.valueOf(url))); + } } config.put(key, desensitized); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java` around lines 80 - 89, The desensitizeUrlList method currently converts each list element with String.valueOf(url), which turns nulls into the literal "null"; update desensitizeUrlList to check for null elements in the List retrieved from config.get(key) and either preserve the null value (add null to desensitized) or skip null entries instead of calling String.valueOf; call desensitizeUrl only for non-null entries and then put the new List back into config under the same key so desensitized values and original nulls are handled correctly.portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java (1)
890-898: 建议:为nodeLeft创建的ManagementNodeInventory数据不完整当节点从数据库中消失时,创建的
ManagementNodeInventory只包含uuid和可能的hostname,缺少其他字段(如state、heartBeat等)。下游的nodeLeft处理逻辑如果依赖这些字段可能会出现问题。💡 建议:考虑从 joinedManagementNodes 获取完整的 inventory
if (suspectedMissingFromDb.contains(nodeUuid)) { // Second consecutive detection — confirmed missing, remove from hash ring logger.warn(String.format("management node[uuid:%s] confirmed missing from database for two consecutive" + " heartbeat cycles, removing from hash ring", nodeUuid)); - ManagementNodeInventory inv = new ManagementNodeInventory(); - inv.setUuid(nodeUuid); - try { - inv.setHostName(destinationMaker.getNodeInfo(nodeUuid).getNodeIP()); - } catch (Exception e) { - logger.warn(String.format("cannot get node info for node[uuid:%s], use empty hostname", nodeUuid)); - } + ManagementNodeInventory inv = joinedManagementNodes.get(nodeUuid); + if (inv == null) { + inv = new ManagementNodeInventory(); + inv.setUuid(nodeUuid); + try { + inv.setHostName(destinationMaker.getNodeInfo(nodeUuid).getNodeIP()); + } catch (Exception e) { + logger.warn(String.format("cannot get node info for node[uuid:%s], use empty hostname", nodeUuid)); + } + } nodeLifeCycle.nodeLeft(inv);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java` around lines 890 - 898, The ManagementNodeInventory passed to nodeLifeCycle.nodeLeft is incomplete (only uuid/hostname) which can break downstream logic; modify the code that creates the inventory (currently using new ManagementNodeInventory() and destinationMaker.getNodeInfo(nodeUuid)) to first try to retrieve the complete inventory from the joinedManagementNodes cache or source of truth (e.g., joinedManagementNodes.get(nodeUuid) / DB lookup) and fall back to the minimal inventory only if none found; ensure fields like state, lastHeartBeat/heartBeat, hostName, and any other relevant metadata on ManagementNodeInventory are populated (or set to explicit defaults such as state=Left/Unknown and heartBeat=null) before calling nodeLifeCycle.nodeLeft(inv).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java`:
- Around line 1078-1083: reservedIpRanges may mix IPv4/IPv6 and comparing across
versions can throw or wrongly remove entries; update the removal logic in
L3BasicNetwork so for each free IP (freeIpInventorys -> freeIp) you only check
reserved ranges with matching ip version: filter reservedIpRanges by
r.getIpVersion().equals(freeIp.getIpVersion()) before calling
NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()), then
remove if any match. Target the block referencing reservedIpRanges,
freeIpInventorys/freeIp and NetworkUtils.isInRange.
---
Outside diff comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmNicManagerImpl.java`:
- Around line 56-64: The UsedIpVO update is performed before checking whether
the VmNicVO exists in afterAddIpAddress, which can leave UsedIpVO.vmNicUuid
pointing to a non-existent NIC; move the VmNicVO null-check (the
Q.New(VmNicVO.class)...find() block) above the
SQL.New(UsedIpVO.class)...update() call and only perform the UsedIpVO update if
nic != null, or alternatively, if you prefer to keep the update first, add
compensating logic to clear UsedIpVO.vmNicUuid when nic == null (use
UsedIpVO_.vmNicUuid = null via SQL.New/ update) so UsedIpVO does not reference a
missing VmNicVO.
In
`@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java`:
- Around line 80-93: The method getNodeInfo incorrectly returns the result of
nodes.put(nodeUuid, new NodeInfo(vo)) which is the previous value (often null)
instead of the newly created NodeInfo; fix by instantiating a new NodeInfo
(e.g., NodeInfo newInfo = new NodeInfo(vo)), adding the nodeUuid to nodeHash,
inserting newInfo into nodes with nodes.put(nodeUuid, newInfo) (or
nodes.putIfAbsent and handle accordingly), and then return newInfo; update
references to ManagementNodeVO, dbf.findByUuid, nodeHash.add, and nodes to
implement this change.
In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`:
- Around line 115-117: The setConfig(LinkedHashMap config) method currently
assigns directly to the config field and bypasses the existing desensitization
logic; change setConfig to pass the incoming config through the class's
desensitizeConfig(...) routine (the same function called from the constructor)
before assigning to this.config so that mdsUrls/mdsInfos credentials are
stripped for any config provided via API; reference the
ExternalPrimaryStorageInventory class, its setConfig method, the
desensitizeConfig method, and the config field when making the change.
---
Duplicate comments:
In
`@plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java`:
- Around line 491-493: The current validation in SecurityGroupApiInterceptor
only checks ao.getPriority() < 1; add an upper-bound check against
SECURITY_GROUP_RULES_NUM_LIMIT and throw the same
ApiMessageInterceptionException via argerr (using
ORG_ZSTACK_NETWORK_SECURITYGROUP_10042) when ao.getPriority() >
SECURITY_GROUP_RULES_NUM_LIMIT so the priority must be within
1..SECURITY_GROUP_RULES_NUM_LIMIT; locate the logic around ao.getPriority(),
ApiMessageInterceptionException, argerr and SECURITY_GROUP_RULES_NUM_LIMIT and
mirror the error/message style used in
APIChangeSecurityGroupRuleMsg/APIAddSecurityGroupRuleMsg.
---
Nitpick comments:
In
`@header/src/main/java/org/zstack/header/storage/addon/primary/ExternalPrimaryStorageInventory.java`:
- Around line 80-89: The desensitizeUrlList method currently converts each list
element with String.valueOf(url), which turns nulls into the literal "null";
update desensitizeUrlList to check for null elements in the List retrieved from
config.get(key) and either preserve the null value (add null to desensitized) or
skip null entries instead of calling String.valueOf; call desensitizeUrl only
for non-null entries and then put the new List back into config under the same
key so desensitized values and original nulls are handled correctly.
In `@header/src/main/java/org/zstack/header/vm/VmInstanceSpec.java`:
- Around line 848-855: In VmInstanceSpec.getRootDiskAllocateSize(), add
defensive null checks for getImageSpec() and getImageSpec().getInventory()
before accessing getSize()/getActualSize() to avoid NullPointerException; if
either is null, return a sensible fallback (e.g., 0 or
rootDiskOffering.getDiskSize() when available) or document the behavior,
ensuring the code still uses Math.max(virtualSize, actualSize) when both values
are present; reference the method getRootDiskAllocateSize(), getImageSpec(),
getInventory(), virtualSize, and actualSize when applying the checks and
fallback.
In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java`:
- Around line 1078-1083: 当前实现先按 msg.getLimit() 从 DB 拉取 freeIpInventorys 再对
reservedIpRanges 过滤(使用
reservedIpRanges、freeIpInventorys、NetworkUtils.isInRange),导致返回结果可能少于
msg.getLimit()。修复方法:在 L3BasicNetwork 相关方法中把过滤条件下推到查询层或在内存过滤后补齐结果——比如在 DB 查询中添加排除
reservedIpRanges 的条件,或实现一个循环/分页逻辑:累积非保留 IP 到结果集中,若不足 msg.getLimit()
则继续查询下一批直到满足或无更多条目(使用 msg.getLimit()、reservedIpRanges、freeIpInventorys
标识位置以定位并修改代码)。
In
`@portal/src/main/java/org/zstack/portal/managementnode/ManagementNodeManagerImpl.java`:
- Around line 890-898: The ManagementNodeInventory passed to
nodeLifeCycle.nodeLeft is incomplete (only uuid/hostname) which can break
downstream logic; modify the code that creates the inventory (currently using
new ManagementNodeInventory() and destinationMaker.getNodeInfo(nodeUuid)) to
first try to retrieve the complete inventory from the joinedManagementNodes
cache or source of truth (e.g., joinedManagementNodes.get(nodeUuid) / DB lookup)
and fall back to the minimal inventory only if none found; ensure fields like
state, lastHeartBeat/heartBeat, hostName, and any other relevant metadata on
ManagementNodeInventory are populated (or set to explicit defaults such as
state=Left/Unknown and heartBeat=null) before calling
nodeLifeCycle.nodeLeft(inv).
In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/AddSecurityGroupRuleOptimizedCase.groovy`:
- Around line 613-618: 缺少对 addSecurityGroupRule 调用结果的断言验证;在调用 sg3 =
addSecurityGroupRule { securityGroupUuid = sg3.uuid; rules = [rule_12,
ingressRule]; priority = 12 } 之后,添加断言以验证新规则已被创建且优先级为
12。具体修复:调用返回后使用现有检查方法(例如查询安全组规则列表或使用 findSecurityGroupRule/getSecurityGroup 等
helper)确认包含 rule_12 和 ingressRule 并且其 priority 字段为 12,同时断言返回的 sg3.rules
或查询结果大小与预期一致并抛出失败说明;参考符号:addSecurityGroupRule、sg3、rule_12、ingressRule、priority。
In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/securitygroup/ChangeSecurityGroupRuleCase.groovy`:
- Around line 234-236: The test method testChangeRulePriorityError in
ChangeSecurityGroupRuleCase.groovy currently only documents behavior change; add
concrete assertions: (1) attempt to set a priority greater than
SECURITY_GROUP_RULES_NUM_LIMIT and assert an error/exception is thrown (or API
returns failure) to validate the global upper bound, and (2) set a valid
positive priority (e.g., a value within limits) and assert the change succeeds
and the rule's priority is updated (check the returned rule or query the
security group rules). Locate and update the testChangeRulePriorityError method
to perform these two checks using the existing test helpers/assertion utilities
used elsewhere in the class.
|
|
||
| Set<ReservedIpRangeVO> reservedIpRanges = self.getReservedIpRanges(); | ||
| if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) { | ||
| freeIpInventorys.removeIf(freeIp -> reservedIpRanges.stream().anyMatch( | ||
| r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()))); | ||
| } |
There was a problem hiding this comment.
避免跨 IP 版本比较引发异常或误删
reservedIpRanges 可能同时包含 IPv4/IPv6 范围,但当前对每个 free IP 都遍历所有范围;当版本不一致时,底层比较可能抛异常或产生错误过滤。建议先按 ipVersion 过滤再做 isInRange。
🔧 建议修改
- if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) {
- freeIpInventorys.removeIf(freeIp -> reservedIpRanges.stream().anyMatch(
- r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp())));
- }
+ if (reservedIpRanges != null && !reservedIpRanges.isEmpty()) {
+ freeIpInventorys.removeIf(freeIp -> {
+ boolean isIpv4 = NetworkUtils.isIpv4Address(freeIp.getIp());
+ int ipVersion = isIpv4 ? IPv6Constants.IPv4 : IPv6Constants.IPv6;
+ return reservedIpRanges.stream()
+ .filter(r -> r.getIpVersion() == ipVersion)
+ .anyMatch(r -> NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()));
+ });
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java` around lines
1078 - 1083, reservedIpRanges may mix IPv4/IPv6 and comparing across versions
can throw or wrongly remove entries; update the removal logic in L3BasicNetwork
so for each free IP (freeIpInventorys -> freeIp) you only check reserved ranges
with matching ip version: filter reservedIpRanges by
r.getIpVersion().equals(freeIp.getIpVersion()) before calling
NetworkUtils.isInRange(freeIp.getIp(), r.getStartIp(), r.getEndIp()), then
remove if any match. Target the block referencing reservedIpRanges,
freeIpInventorys/freeIp and NetworkUtils.isInRange.
Resolves: ZSTAC-79067\n\nRemove overly strict sequential priority validation for security group rules API.
sync from gitlab !9173