DAOS-19036 vos: store MBS information when allocate active DTX entry #18400

Closed
Nasf-Fan wants to merge 1 commit into master from Nasf-Fan/DAOS-19036

@Nasf-Fan Nasf-Fan commented Jun 2, 2026

Originally, the MBS (membership) information for an active DTX entry was stored only once the DTX became ready (prepared) locally. If the CPU yields after the DTX entry is allocated but before it is prepared, another caller may find the entry with empty MBS information, which can trigger a server-side segmentation fault or mislead the caller into taking the wrong action.

This patch stores the MBS information when the DTX entry is allocated.

The patch also explicitly cleans up a non-prepared DTX after a modification failure, so that no stale active DTX entry is left behind.
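
The patch itself is not shown in this conversation, so the following is only a minimal C sketch of the two ideas described above: attach the MBS before the entry becomes visible to lookups, and drop a non-prepared entry after a failed modification. Every name here (`sketch_mbs`, `sketch_dtx_entry`, `sketch_dtx_alloc`, `sketch_dtx_lookup`, `sketch_dtx_prepare`, `sketch_dtx_cleanup`) and the fixed-size table are hypothetical stand-ins, not the real DAOS/VOS types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the real DAOS/VOS structures. */
struct sketch_mbs {
	unsigned int count;    /* number of participants */
	unsigned int ranks[4]; /* participant ranks (illustrative only) */
};

struct sketch_dtx_entry {
	bool               in_use;   /* slot is visible to lookups */
	bool               prepared; /* DTX is ready (prepared) locally */
	unsigned long      xid;      /* illustrative transaction ID */
	struct sketch_mbs *mbs;      /* membership, set at allocation time */
};

#define SKETCH_TABLE_SIZE 8
static struct sketch_dtx_entry sketch_table[SKETCH_TABLE_SIZE];

/*
 * Allocate an active entry and attach a private copy of the MBS before the
 * entry is marked in_use, so a lookup that runs between allocation and
 * prepare never sees an entry whose mbs pointer is NULL.
 */
struct sketch_dtx_entry *
sketch_dtx_alloc(unsigned long xid, const struct sketch_mbs *mbs)
{
	assert(mbs != NULL);
	for (size_t i = 0; i < SKETCH_TABLE_SIZE; i++) {
		struct sketch_dtx_entry *e = &sketch_table[i];
		struct sketch_mbs       *copy;

		if (e->in_use)
			continue;
		copy = malloc(sizeof(*copy));
		if (copy == NULL)
			return NULL;
		*copy       = *mbs;
		e->xid      = xid;
		e->mbs      = copy;
		e->prepared = false;
		e->in_use   = true; /* publish only after the MBS is attached */
		return e;
	}
	return NULL; /* table full */
}

/* Find an active entry by ID; returns NULL if there is none. */
struct sketch_dtx_entry *
sketch_dtx_lookup(unsigned long xid)
{
	for (size_t i = 0; i < SKETCH_TABLE_SIZE; i++) {
		if (sketch_table[i].in_use && sketch_table[i].xid == xid)
			return &sketch_table[i];
	}
	return NULL;
}

/* Mark the entry as prepared once the local modification succeeded. */
void
sketch_dtx_prepare(struct sketch_dtx_entry *e)
{
	e->prepared = true;
}

/*
 * After a modification failure, remove a non-prepared entry so that no
 * stale active entry remains. Prepared entries are left untouched.
 */
void
sketch_dtx_cleanup(struct sketch_dtx_entry *e)
{
	if (e == NULL || !e->in_use || e->prepared)
		return;
	free(e->mbs);
	*e = (struct sketch_dtx_entry){0};
}
```

In this sketch, a lookup issued right after `sketch_dtx_alloc()` and before `sketch_dtx_prepare()` already sees a non-NULL `mbs`, and calling `sketch_dtx_cleanup()` on the failure path makes a later lookup for the same ID return NULL. How the real code orders these steps around a CPU yield is an assumption here.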

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

github-actions Bot commented Jun 2, 2026

Ticket title is 'Argonne Daos_user : Engine ranks 590, 593, and 596 entered Errored state unexpectedly'
Status is 'In Progress'
Labels: 'ALCF'
https://daosio.atlassian.net/browse/DAOS-19036

@Nasf-Fan Nasf-Fan force-pushed the Nasf-Fan/DAOS-19036 branch from d71cfd3 to 77d248e Compare June 2, 2026 11:51
Originally, the MBS (membership) information for an active DTX entry
was stored only once the DTX became ready (prepared) locally. If the
CPU yields after the DTX entry is allocated but before it is prepared,
another caller may find the entry with empty MBS information, which
can trigger a server-side segmentation fault or mislead the caller
into taking the wrong action.

This patch stores the MBS information when the DTX entry is allocated.

The patch also explicitly cleans up a non-prepared DTX after a
modification failure, so that no stale active DTX entry is left
behind.

Signed-off-by: Fan Yong <fan.yong@hpe.com>
@Nasf-Fan Nasf-Fan force-pushed the Nasf-Fan/DAOS-19036 branch from 77d248e to 01a237e Compare June 2, 2026 15:56

Nasf-Fan commented Jun 4, 2026

Replaced by #18428

@Nasf-Fan Nasf-Fan closed this Jun 4, 2026
@Nasf-Fan Nasf-Fan deleted the Nasf-Fan/DAOS-19036 branch June 4, 2026 04:13