@liuxuezhao
Contributor

For a second remap, if the spare target is DOWN2UP, the fs_down2up flag must be set so that the shard's po_rebuilding flag can be set at the end.
One example case:
Target A is DOWN; its rebuild completed and its status changed to DOWNOUT.
Target B is DOWN; its rebuild started but had not completed when the admin ran reintegration, so its status changed to UP with the DOWN2UP flag.

In the object layout calculation, a shard is initially located on Target A and the first remap moves it to Target B, but it still needs a second remap. In this case the fs_down2up flag, which was not set during the first remap, must be set. Otherwise the shard's po_rebuilding flag cannot be set, and reads will be served from the shard at an invalid location.

This bug could cause data corruption (most likely by causing shard loss).
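
The scenario above can be sketched as a small standalone model. This is only an illustration under assumptions, not the DAOS placement code: the types `tgt`, `failed_shard` and `shard`, the helper `tgt_needs_remap`, the function `remap_shard` and its `apply_fix` switch are all hypothetical, and the rule that an UP target carrying DOWN2UP needs another remap is inferred from the example. Only the flag names fs_down2up and po_rebuilding come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for pool-map target states (hypothetical names). */
enum tgt_status { TGT_UPIN, TGT_UP, TGT_DOWN, TGT_DOWNOUT };

struct tgt {
	enum tgt_status status;
	bool            down2up; /* reintegrated before its rebuild finished */
	int             spare;   /* index of this target's spare, -1 if none */
};

/* Per-shard remap record; field names echo the description. */
struct failed_shard {
	int             fs_tgt;     /* target the record currently describes */
	enum tgt_status fs_status;  /* status of that target */
	bool            fs_down2up; /* latched when a DOWN2UP target is involved */
};

struct shard {
	int  po_target;     /* final location, -1 if no usable spare exists */
	bool po_rebuilding; /* readers must skip the shard while this is set */
};

/* Assumption: DOWN, DOWNOUT and DOWN2UP targets all need (another) remap. */
static bool tgt_needs_remap(const struct tgt *t)
{
	return t->status == TGT_DOWN || t->status == TGT_DOWNOUT || t->down2up;
}

/*
 * Follow the spare chain from the original target. With apply_fix set, a
 * DOWN2UP spare that forces a further remap latches fs_down2up; with it
 * cleared, the flag only reflects the original target (the buggy behavior).
 */
struct shard remap_shard(const struct tgt *tgts, size_t ntgts, int orig,
			 bool apply_fix)
{
	struct shard out = { .po_target = -1, .po_rebuilding = false };
	struct failed_shard fs;
	size_t hops;

	assert(tgts != NULL && orig >= 0 && (size_t)orig < ntgts);
	if (!tgt_needs_remap(&tgts[orig])) {
		out.po_target = orig; /* healthy target, nothing to remap */
		return out;
	}
	fs.fs_tgt = orig;
	fs.fs_status = tgts[orig].status;
	fs.fs_down2up = tgts[orig].down2up;

	/* One pass per remap; bounded so a cyclic spare chain terminates. */
	for (hops = 0; hops < ntgts; hops++) {
		int spare = tgts[fs.fs_tgt].spare;

		if (spare < 0 || (size_t)spare >= ntgts)
			return out; /* no spare left, shard stays unplaced */
		if (!tgt_needs_remap(&tgts[spare])) {
			out.po_target = spare;
			out.po_rebuilding = fs.fs_status == TGT_DOWN || fs.fs_down2up;
			return out;
		}
		/* The spare is unusable too: describe it and remap again. */
		if (apply_fix && tgts[spare].down2up)
			fs.fs_down2up = true;
		fs.fs_tgt = spare;
		fs.fs_status = tgts[spare].status;
	}
	return out;
}
```

In this model, with Target A as DOWNOUT pointing at spare B (UP with DOWN2UP) and B pointing at a healthy spare C, `remap_shard(..., true)` places the shard on C with po_rebuilding set. With `apply_fix` cleared the shard still lands on C but po_rebuilding stays clear, so a reader would treat the location as valid.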

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

Signed-off-by: Xuezhao Liu <xuezhao.liu@hpe.com>
@liuxuezhao requested review from a team as code owners on February 10, 2026 05:59
@github-actions

Errors: Unable to load ticket data
https://daosio.atlassian.net/browse/DAOS-18487
