cgmgr: use NewSystemd from createSandboxCgroup #6196
It has been reported that sometimes pod creation fails with an error like this one:

> cgroup: error creating cgroup path /pod_123.slice/pod_123-456.slice/crio-0dd29e75c4072e6d8227a338500c5a9a0cae2b41215c136150640c21e3e07fdf.scope: write /sys/fs/cgroup/pod_123.slice/pod_123-456.slice/cgroup.subtree_control: no such file or directory"

The error comes from containers/common/pkg/cgroups.New, which eventually calls createCgroupv2Path, which does Mkdir immediately followed by WriteFile, which fails with ENOENT.

It seems that this is caused by systemd, which sees an unknown cgroup hierarchy and removes it (in this case, in between Mkdir and WriteFile).

The solution is to not create paths when using the systemd driver, since systemd is going to create those for us. Use cgroups.NewSystemd when appropriate.

Signed-off-by: Kir Kolyshkin <[email protected]>
CI failures (2 of the same failures in 3 different tests) appear to be unrelated (they are the same in any recent PR, e.g. #6193).
saschagrunert
left a comment
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: kolyshkin, saschagrunert
The history here is a bit convoluted. Originally, runc created the cgroup for the infra container. cAdvisor was built to assume the cgroup for the infra container would be created, and it uses this to find the network metrics for the pod.

When we dropped the infra container, cri-o needed to make this cgroup so cAdvisor could still find the network metrics. However, systemd didn't like the way we did it, and would remove the cgroup mid pod creation, which was fixed in cri-o#6196. This actually caused the cgroup to not be created at all, which then caused the networking metrics to not be gathered at all.

Thus, we do need to create a cgroupfs cgroup underneath the systemd cgroup. Attempt to use libcontainer to do this, as even when it creates a cgroup with cgroupfs, it sets the `name=systemd` controller, which would allow systemd to be aware of the cgroup, even if it isn't managing it.

Signed-off-by: Peter Hunt <[email protected]>
The history here is a bit convoluted. Originally, runc created the cgroup for the infra container. cAdvisor was built to assume the cgroup for the infra container would be created, and it uses this to find the network metrics for the pod.

When we dropped the infra container, cri-o needed to make this cgroup so cAdvisor could still find the network metrics. However, systemd didn't like the way we did it, and would remove the cgroup mid pod creation, which was fixed in cri-o#6196. This actually caused the cgroup to not be created at all, which then caused the networking metrics to not be gathered at all.

Thus, we do need to create a cgroup underneath the systemd cgroup. Attempt to use a slice for this, as systemd won't require a process be underneath it.

Signed-off-by: Peter Hunt <[email protected]>
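It has been reported that sometimes pod creation fails with an
error like this one:
> cgroup: error creating cgroup path /pod_123.slice/pod_123-456.slice/crio-0dd29e75c4072e6d8227a338500c5a9a0cae2b41215c136150640c21e3e07fdf.scope: write /sys/fs/cgroup/pod_123.slice/pod_123-456.slice/cgroup.subtree_control: no such file or directory"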
The error comes from containers/common/pkg/cgroups.New, which
eventually calls createCgroupv2Path, which does Mkdir immediately
followed by WriteFile, which fails with ENOENT.
It seems that this is caused by systemd, which sees an unknown cgroup
hierarchy and removes it (in this case, in between Mkdir and WriteFile).
The solution is to not create paths when using the systemd driver, since
systemd is going to create those for us.
Use cgroups.NewSystemd when appropriate.
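
As a rough sketch of the approach (not the actual change in this PR), the choice between the two constructors could look like the following. The `creator` type, `pickCreator`, and the manager strings are assumptions for illustration; the real signatures of `cgroups.New` and `cgroups.NewSystemd` are deliberately not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// creator stands in for a constructor such as cgroups.New or
// cgroups.NewSystemd. It is a hypothetical simplification.
type creator func(path string) error

// pickCreator returns the systemd-aware constructor when the cgroup
// manager is systemd, and the cgroupfs one otherwise, so that paths are
// only created by hand when systemd is not expected to create them.
func pickCreator(manager string, cgroupfs, systemd creator) creator {
	if strings.EqualFold(manager, "systemd") {
		return systemd
	}
	return cgroupfs
}

func main() {
	viaCgroupfs := func(path string) error {
		fmt.Println("creating path directly:", path)
		return nil
	}
	viaSystemd := func(path string) error {
		fmt.Println("leaving path creation to systemd:", path)
		return nil
	}

	create := pickCreator("systemd", viaCgroupfs, viaSystemd)
	if err := create("/pod_123.slice/pod_123-456.slice"); err != nil {
		fmt.Println("error:", err)
	}
}
```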
What type of PR is this?
/kind bug
/kind flake
What this PR does / why we need it:
See above.
Which issue(s) this PR fixes:
None
Special notes for your reviewer:
None
Does this PR introduce a user-facing change?