Jlo/posix default #17516
base: master
Change the default to POSIX container in the daos utility when no container type is specified on the command line. Signed-off-by: Johann Lombardi <johann.lombardi@hpe.com>
Errors are: component not formatted correctly, ticket number prefix incorrect, PR title is malformatted. See https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments. Unable to load ticket data.
  # method when creating/destroying containers
  self.path = BasicParameter(None)
- self.type = BasicParameter(None)
+ self.type = BasicParameter("NONE")
I wrote a script to find all the configs where we do not specify the type, and there are 42 of them. Although it's more changes, I think it's better overall. Are you okay with me pushing that instead of setting the default here?
sure, please feel free to push it in this PR.
Thanks. Pushed. It should cover almost all the cases, if not all
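The script mentioned above is not shown in the thread. A minimal sketch of the idea, assuming each test config parses into nested mappings and that a container section names its type under a `type` key (both the key name and the sample data are assumptions for illustration, not taken from the PR):

```python
from typing import Any, Iterator


def containers_missing_type(node: Any, path: str = "") -> Iterator[str]:
    """Yield the path of every 'container' mapping that has no 'type' key."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if key == "container" and isinstance(value, dict) and "type" not in value:
                yield child
            # Keep descending: container sections can be nested in variants.
            yield from containers_missing_type(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from containers_missing_type(item, f"{path}[{index}]")


# Hypothetical stand-in for one parsed config; a real scan would load each
# YAML file with a YAML parser and pass the result in.
sample_config = {
    "pool": {"size": "1G"},
    "container": {"control_method": "daos"},
    "variants": [
        {"container": {"type": "POSIX"}},
        {"container": {"oclass": "SX"}},
    ],
}

missing = list(containers_missing_type(sample_config))
print(missing)
```

Each reported path is a place where the config would need an explicit type to keep its previous behavior once the default changes.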
  # method when creating/destroying containers
  self.path = BasicParameter(None)
- self.type = BasicParameter(None)
+ self.type = BasicParameter("NONE")
Leaving this set as the type None results in a daos container create command that excludes the --type argument. Setting the default to the "NONE" string results in a daos container create command with the --type=NONE argument. Based upon the commit message, don't we want the first option, where the --type argument is excluded?
If --type is excluded, the default container type will now be POSIX.
So if we leave this as None we will need to update all tests that currently create a container but do not specify a type. And IMO that's what we should do, instead of setting a different default value here
Correct, it will pass the string "NONE" which will restore the old default. I am fine with Dalton's approach.
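The difference between Python `None` and the string `"NONE"` discussed in this thread can be sketched as follows. `build_create_cmd` and the pool label `pool0` are hypothetical names used only for illustration; this is not the test harness's actual command class.

```python
from typing import List, Optional


def build_create_cmd(pool: str, cont_type: Optional[str] = None) -> List[str]:
    """Build a 'daos container create' argument list (hypothetical helper)."""
    cmd = ["daos", "container", "create", pool]
    if cont_type is not None:
        # Only an explicit value produces a --type argument.
        cmd.append(f"--type={cont_type}")
    return cmd


omitted = build_create_cmd("pool0")           # Python None: no --type at all
explicit = build_create_cmd("pool0", "NONE")  # string "NONE": --type=NONE
print(" ".join(omitted))
print(" ".join(explicit))
```

With this PR, the first form would create a POSIX container, while the second keeps the previous non-POSIX behavior.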
I'm a little confused by the approach of changing all the config files. Do the tests actually need the type to be NONE for the purpose of the test itself, or is it some outdated expectation? If the latter, shouldn't the tests be fixed to expect the new default? (Or if it doesn't matter to the test, maybe not compare the layout type at all?)
Tests using NONE type are working with non-POSIX containers. I.e. using the daos_ API directly. So they don't want to use POSIX. But as a side note, I do think most of those tests should eventually be updated to use POSIX instead
Ah, I see. That makes sense if the test requires direct API access. As an aside, I think the NONE type is a bit of a misnomer--should be called "API" or something like that.
Signed-off-by: Dalton Bohning <dalton.bohning@hpe.com>
Features: container Signed-off-by: Dalton Bohning <dalton.bohning@hpe.com>
Features: container Signed-off-by: Dalton Bohning <dalton.bohning@hpe.com>
  container:
    cont_types:
-     - ""
+     - "NONE"
Shouldn't we still test "" in addition to "NONE"?
- - "NONE"
+ - ""
+ - "NONE"
We also might want to clarify https://github.com/daos-stack/daos/blob/master/src/tests/ftest/dfuse/container_type.py#L43, now that "NONE" should create a POSIX container.
Either way https://github.com/daos-stack/daos/blob/master/src/tests/ftest/dfuse/container_type.py#L48-L49 will need to be updated to support setting "NONE".
I fixed the python, but the purpose of this test is to verify dfuse behavior with POSIX and non-POSIX, so I don't think there is any need to create a container with an empty type
Features: container Signed-off-by: Dalton Bohning <dalton.bohning@hpe.com>
+ // Use POSIX container type by default, if none is specified
+ typeProp := createPropList.MustAddEntryByType(daos.ContainerPropLayoutType)
  if hasType() {
-     typeProp := createPropList.MustAddEntryByType(daos.ContainerPropLayoutType)
      typeProp.SetValue(uint64(cmd.Type.Type))
+ } else {
+     typeProp.SetValue(uint64(C.DAOS_PROP_CO_LAYOUT_POSIX))
Would be better to add these C layout types as Go constants in src/control/lib/daos/container_property.go, and then use the constant here.
I wouldn't block on this -- we're still making some changes around the way we organize the cgo code.
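For readers following along in the Python test code, the defaulting rule in the Go change above can be mirrored roughly like this. The `Layout` enum and `resolve_layout` are illustrative stand-ins, not DAOS APIs, and the enum values are not the real property values; the enum only echoes the suggestion to use named constants rather than raw C values.

```python
from enum import Enum
from typing import Optional


class Layout(Enum):
    """Illustrative stand-ins for the layout types; not the real DAOS values."""
    NONE = "NONE"    # non-POSIX use, e.g. tests calling the daos_ API directly
    POSIX = "POSIX"


def resolve_layout(requested: Optional[str]) -> Layout:
    """Return the layout to set on the new container."""
    if requested is None:
        # No --type on the command line: fall back to POSIX.
        return Layout.POSIX
    # An explicit type always wins, including an explicit "NONE".
    return Layout(requested.upper())


print(resolve_layout(None))
print(resolve_layout("NONE"))
```

The layout property is always set under this scheme; only the source of the value differs.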
Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17516/4/execution/node/826/log
Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17516/5/execution/node/679/log
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17516/5/testReport/
NLT is failing to create containers. The stacktrace doesn't give much info, but I do see some calls in |
utils/node_local_test.py
  # pylint: disable-next=too-many-arguments
- def create_cont(conf, pool=None, ctype=None, label=None, path=None, oclass=None, dir_oclass=None,
+ def create_cont(conf, pool=None, ctype='NONE', label=None, path=None, oclass=None, dir_oclass=None,
Do we even need a default here at all? It seems all callers either pass ctype or path so I would think we don't need to do this
I'm going to try reverting this line and see if/which tests fail
Features: container Signed-off-by: Dalton Bohning <dalton.bohning@hpe.com>
Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17516/6/execution/node/710/log
Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17516/7/execution/node/826/log
try:
    rc.json = json.loads(rc.stdout.decode('utf-8'))
except json.JSONDecodeError:
    print("Failed to decode json output")
    print(f"command={exec_cmd}")
    print(rc.stdout.decode('utf-8'))
    raise
I'm not sure what's going on, but from the log, when this fails, there is no stdout
https://jenkins-3.daos.hpc.amslabs.hpecorp.net/blue/rest/organizations/jenkins/pipelines/daos-stack/pipelines/daos/branches/PR-17516/runs/7/nodes/483/log/?start=0
[2026-02-10T01:26:32.526Z] Failed to decode json output
[2026-02-10T01:26:32.526Z] command=['/opt/daos/bin/daos', '--json', 'container', 'create', 'NLT', '--type', 'POSIX', '--properties', 'cksum:off,srv_cksum:off,rd_fac:0']
[2026-02-10T01:26:32.526Z]
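The empty line at the end of the log is consistent with the command writing nothing to stdout: `json.loads` raises `JSONDecodeError` on an empty string, so the decode error may be masking an earlier failure. A minimal sketch of separating the two cases, where `parse_json_stdout` is a hypothetical helper and the `{"status": 0}` payload is made up for illustration:

```python
import json
from typing import Any, Optional


def parse_json_stdout(stdout: bytes, returncode: int) -> Optional[Any]:
    """Decode command output as JSON, reporting empty output separately."""
    text = stdout.decode("utf-8")
    if not text.strip():
        # json.loads("") raises JSONDecodeError, which hides the real cause,
        # so report the return code instead of attempting to decode.
        print(f"no stdout to decode (returncode={returncode})")
        return None
    return json.loads(text)


result_empty = parse_json_stdout(b"", 1)
result_ok = parse_json_stdout(b'{"status": 0}', 0)
print(result_empty, result_ok)
```

Logging the return code and stderr alongside the command would likely make this kind of NLT failure easier to diagnose.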
DAOS-17946 cont: use POSIX container type by default in daos utility
Change the default to POSIX container in the daos utility when no container
type is specified on the command line.
Signed-off-by: Johann Lombardi <johann.lombardi@hpe.com>