CNI DEL not called on node reboot #4727

@caseydavenport

Description

I'm seeing a behavior where, upon rebooting an OpenShift node using CRI-O, the CNI plugin does not appear to be called to tear down resources for the previously running pods on that node.

Basically, I am making note of the running pod sandbox IDs, rebooting the node with sudo reboot now (without draining the pods, effectively simulating a failure scenario) and letting the node come back up. When it does, I see the pods get launched again with new sandboxes. However, the old sandboxes never seem to get a CNI DEL.

Similarly, I can see cache files in /var/lib/cni/results for both the new and old sandboxes. Does CRI-O make any attempt to release these resources on node reboot?

Steps to reproduce the issue:

  1. Run some pods on a node. Note the pod sandbox container IDs.
  2. Reboot the node and see that the pods are re-launched with new pod sandboxes.
  3. See no evidence in the logs that the old sandbox IDs were released.
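
As a complement to checking the logs in step 3, the leftover cache entries can be checked directly. The following is a minimal sketch, not CRI-O or CNI tooling: the helper names `cached_sandbox_ids` and `leaked_sandbox_ids` are hypothetical, and it assumes each file in /var/lib/cni/results is JSON with a top-level "containerId" field. If the cache format on your version differs, the parsing needs adjusting.

```python
import json
from pathlib import Path


def cached_sandbox_ids(cache_dir="/var/lib/cni/results"):
    """Return the set of container IDs recorded in the CNI result cache.

    Assumption: each cache file is JSON with a top-level "containerId"
    field. Files that do not parse, or lack that field, are skipped.
    """
    ids = set()
    for path in Path(cache_dir).iterdir():
        if not path.is_file():
            continue
        try:
            entry = json.loads(path.read_text())
        except (OSError, ValueError):
            # Unreadable or non-JSON file: ignore it rather than guess.
            continue
        if isinstance(entry, dict) and entry.get("containerId"):
            ids.add(entry["containerId"])
    return ids


def leaked_sandbox_ids(ids_before_reboot, cache_dir="/var/lib/cni/results"):
    """Return pre-reboot sandbox IDs that still have a cache entry."""
    return set(ids_before_reboot) & cached_sandbox_ids(cache_dir)
```

Calling `leaked_sandbox_ids` after the reboot with the IDs noted in step 1 would return the old sandboxes whose cache entries are still present, which is what I observe here.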

Describe the results you received:

CNI DEL is not called.

Describe the results you expected:

I expect CNI DEL to be called on the old sandbox IDs, providing a chance for CNI plugins to release any associated resources.

Additional information you deem important (e.g. issue happens only occasionally):

I haven't looked at the code yet, but this seems very reproducible in my environment, and from the logs it looks like no effort is even made to release the resources. It may seem counter-intuitive to need this, given that a node reboot will tear down namespaces, etc., but there are other resources which persist across a node reboot and thus need to be released even if the original sandbox is no longer running.

Output of crio --version:

crio version 1.18.2-18.rhaos4.5.git754d46b.el8
Version:    1.18.2-18.rhaos4.5.git754d46b.el8
GoVersion:  go1.13.4
Compiler:   gc
Platform:   linux/amd64
Linkmode:   dynamic

Additional environment details (AWS, VirtualBox, physical, etc.):

OpenShift, IPI in AWS using Calico CNI.
