
Conversation

@dcbw
Contributor

@dcbw dcbw commented Jun 23, 2021

Ongoing sandbox requests cannot be (or are not) canceled by kubelet, leading to a situation where short-lived pods (especially Kubernetes e2e tests for stateful sets) cause overlapping sandbox requests. If the CNI plugin needs to wait for network state to converge, it's pointless to wait for a sandbox whose pod has been deleted, so the plugin should cancel the request and return to the runtime. However, it's impossible to do that race-free without the pod UID the sandbox was created for: there is a gap between when kubelet requests the sandbox creation and when the plugin gets the pod object from the apiserver, and if the pod is deleted and recreated in that gap, the plugin retrieves information for the new pod, not the pod the sandbox was created for.

Passing the pod UID to the plugin allows the plugin to cancel the operation when the pod UID retrieved from the apiserver during plugin operation does not match the one the sandbox was created for.

@trozet @haircommander @mrunalp

/kind feature

CNI plugins are now passed a K8S_POD_UID environment variable containing the pod UID this sandbox was started for.
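
For illustration, a plugin could read the new value from CNI_ARGS roughly like this (a minimal sketch assuming the standard libcni argument helpers; the struct and helper names here are illustrative, not part of this PR):

package plugin

import (
    "github.com/containernetworking/cni/pkg/skel"
    "github.com/containernetworking/cni/pkg/types"
)

// K8sArgs mirrors the Kubernetes-specific CNI_ARGS keys; the field names must
// match the keys exactly. K8S_POD_UID is the value added by this change.
type K8sArgs struct {
    types.CommonArgs
    K8S_POD_NAME      types.UnmarshallableString
    K8S_POD_NAMESPACE types.UnmarshallableString
    K8S_POD_UID       types.UnmarshallableString
}

// expectedPodUID returns the UID of the pod the sandbox was created for,
// as passed by the runtime in CNI_ARGS.
func expectedPodUID(args *skel.CmdArgs) (string, error) {
    k8sArgs := &K8sArgs{}
    if err := types.LoadArgs(args.Args, k8sArgs); err != nil {
        return "", err
    }
    return string(k8sArgs.K8S_POD_UID), nil
}

The plugin can then compare this value against the UID of the pod object it later fetches from the apiserver and bail out if they differ.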

@dcbw dcbw requested review from mrunalp and runcom as code owners June 23, 2021 03:47
@openshift-ci openshift-ci bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jun 23, 2021
@openshift-ci openshift-ci bot requested a review from saschagrunert June 23, 2021 03:47
@codecov

codecov bot commented Jun 23, 2021

Codecov Report

Merging #5026 (f1b5f58) into master (fa01253) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head f1b5f58 differs from pull request most recent head 6e8d370. Consider uploading reports for the commit 6e8d370 to get more accurate results

@@           Coverage Diff           @@
##           master    #5026   +/-   ##
=======================================
  Coverage   41.73%   41.74%           
=======================================
  Files         108      108           
  Lines       10157    10158    +1     
=======================================
+ Hits         4239     4240    +1     
  Misses       5470     5470           
  Partials      448      448           

Member

@saschagrunert saschagrunert left a comment


LGTM

@openshift-ci
Contributor

openshift-ci bot commented Jun 23, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 23, 2021
@saschagrunert
Member

/retest
/override ci/prow/e2e-gcp

@openshift-ci
Contributor

openshift-ci bot commented Jun 23, 2021

@saschagrunert: Overrode contexts on behalf of saschagrunert: ci/prow/e2e-gcp

Details

In response to this:

/retest
/override ci/prow/e2e-gcp

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dcbw
Contributor Author

dcbw commented Jun 23, 2021

kata-jenkins failure is:

07:21:15 #   Warning  FailedCreatePodSandBox  4s (x7 over 89s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = CreateContainer failed: unrecognised machinetype: pc: unknown
07:21:15 # pod "handlers" deleted
07:21:15 Failed at 69: bats "${K8S_TEST_ENTRY}"

which seems unrelated

@dcbw
Contributor Author

dcbw commented Jun 23, 2021

e2e-gcp failure is:

fail [k8s.io/[email protected]/test/e2e/storage/ubernetes_lite_volumes.go:163]: Unexpected error:
    <*errors.errorString | 0xc00263f220>: {
        s: "PersistentVolumeClaims [pvc-1] not all in phase Bound within 5m0s",
    }
    PersistentVolumeClaims [pvc-1] not all in phase Bound within 5m0s

Also seems like a flake.

@dcbw dcbw force-pushed the ocicni-pass-pod-uid branch from 4a56951 to efd50f2 Compare June 23, 2021 14:15
@dcbw
Contributor Author

dcbw commented Jun 23, 2021

Updated with tests in network.bats

@dcbw dcbw force-pushed the ocicni-pass-pod-uid branch from efd50f2 to ef97723 Compare June 23, 2021 14:52
dcbw added 2 commits June 23, 2021 10:23
To allow passing pod UID to plugins.

Signed-off-by: Dan Williams <[email protected]>
This allows plugins to more correctly cancel long-running sandbox
operations when the pod is deleted/re-created in the Kube API
while the call is ongoing.

Signed-off-by: Dan Williams <[email protected]>
@dcbw dcbw force-pushed the ocicni-pass-pod-uid branch from ef97723 to 6e8d370 Compare June 23, 2021 15:24
@dcbw
Contributor Author

dcbw commented Jun 23, 2021

/retest

@openshift-ci
Contributor

openshift-ci bot commented Jun 23, 2021

@dcbw: The following test failed, say /retest to rerun all failed tests:

Test name: ci/openshift-jenkins/e2e_crun_cgroupv2
Commit: 6e8d370
Details: link
Rerun command: /test e2e_cgroupv2

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@haircommander
Member

Ongoing sandbox requests cannot be (or are not) canceled by kubelet, leading to a situation where short-lived pods (especially Kubernetes e2e tests for stateful sets) cause overlapping sandbox requests

Weird, this sounds like a kubelet bug. I would expect cri-o to fail to create a duplicate pod while the first request is ongoing, and to synchronously wait on the CNI plugin, thus preventing duplicate calls.

This change is fine by me, but I fear we're putting a band-aid on a bigger wound.
/lgtm

/override ci/prow/e2e-gcp

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jun 23, 2021

@haircommander: Overrode contexts on behalf of haircommander: ci/prow/e2e-gcp

Details

In response to this:

Ongoing sandbox requests cannot be (or are not) canceled by kubelet, leading to a situation where short-lived pods (especially Kubernetes e2e tests for stateful sets) cause overlapping sandbox requests

Weird, this sounds like a kubelet bug. I would expect cri-o to fail to create a duplicate pod while the first request is ongoing, and to synchronously wait on the CNI plugin, thus preventing duplicate calls.

This change is fine by me, but I fear we're putting a band-aid on a bigger wound.
/lgtm

/override ci/prow/e2e-gcp

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-robot openshift-merge-robot merged commit a8af5b8 into cri-o:master Jun 23, 2021
@dcbw
Contributor Author

dcbw commented Jun 24, 2021

Weird, this sounds like a kubelet bug. I would expect cri-o to fail to create a duplicate pod while the first request is ongoing, and to synchronously wait on the CNI plugin, thus preventing duplicate calls.

This change is fine by me, but I fear we're putting a band-aid on a bigger wound.

@haircommander the scenarios are something like this:

Scenario 1: pod recreated during sandbox wait

  1. pod created, kubelet notices, asks CRI to create the sandbox
  2. CRI creates sandbox, execs CNI plugin
  3. CNI plugin gets pod from apiserver, starts setting up networking
  4. something deletes and recreates the pod
  5. CNI plugin waiting for networking to converge

Now in this scenario, the plugin could create a pod watch for delete events. But that's not race-proof, since the pod could be deleted and recreated between steps 2 and 3 (see scenario 2) and the sandbox would be for an old pod (which kubelet would subsequently tear down at some point in the future). The pod UID lets the plugin notice that the pod instance it gets in (3) or (5) is different and exit early.

Scenario 2: pod deleted during sandbox init

  1. pod created, kubelet notices, asks CRI to create the sandbox
  2. CRI creates sandbox, execs CNI plugin
  3. something deletes and recreates the pod
  4. CNI plugin gets pod from apiserver, starts setting up networking
  5. CNI plugin still waiting for networking

In this scenario, the CNI plugin gets the new pod instance, which is still wrong for this sandbox setup. The pod UID immediately tells the plugin that its pod is gone and it can exit early.

In all cases, kubelet will just tear the sandbox down anyway, and what we're trying to prevent is waiting longer than necessary before noticing that this sandbox is useless.
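
To make the early exit concrete, the convergence wait inside a plugin could look roughly like this (a sketch assuming client-go; the function and its parameters are illustrative, not code from this PR):

package plugin

import (
    "context"
    "fmt"
    "time"

    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// waitForNetwork polls until networking has converged, but gives up early if
// the pod the sandbox was created for is gone or was recreated with a
// different UID (which the plugin can now detect via K8S_POD_UID).
func waitForNetwork(ctx context.Context, client kubernetes.Interface,
    namespace, name, sandboxPodUID string, converged func() bool) error {
    ticker := time.NewTicker(2 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-ticker.C:
            if converged() {
                return nil
            }
            pod, err := client.CoreV1().Pods(namespace).Get(ctx, name, metav1.GetOptions{})
            if apierrors.IsNotFound(err) || (err == nil && string(pod.UID) != sandboxPodUID) {
                // The sandbox's pod was deleted (or deleted and recreated);
                // stop waiting and let the runtime tear the sandbox down.
                return fmt.Errorf("pod %s/%s (uid %s) no longer exists", namespace, name, sandboxPodUID)
            }
        }
    }
}

Other apiserver errors are simply retried on the next tick here; a real plugin would add its own backoff and error handling.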

I also tried a variant of this that asks the CRI for the sandbox metadata during the call, but that fails because ListPodSandbox only lists completed sandbox setups, not in-progress ones :(


I suppose the real fix would be to allow a sandbox delete + CNI DEL while an existing add is still in progress, or a CANCEL operation that kubelet could execute to tell the CRI and plugins to stop the request. But that's a much longer arc to make happen (though it should happen, and at least on the CNI side we are working on that via gRPC).

@dcbw
Contributor Author

dcbw commented Jun 24, 2021

/cherry-pick release-1.22
/cherry-pick release-1.21

@openshift-cherrypick-robot

@dcbw: new pull request could not be created: failed to create pull request against cri-o/cri-o#release-1.22 from head openshift-cherrypick-robot:cherry-pick-5026-to-release-1.22: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"No commits between cri-o:release-1.22 and openshift-cherrypick-robot:cherry-pick-5026-to-release-1.22"}],"documentation_url":"https://docs.github.com/rest/reference/pulls#create-a-pull-request"}

Details

In response to this:

/cherry-pick release-1.22
/cherry-pick release-1.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dcbw
Contributor Author

dcbw commented Jun 24, 2021

/cherry-pick release-1.21

@openshift-cherrypick-robot

@dcbw: #5026 failed to apply on top of branch "release-1.21":

Applying: vendor: bump ocicni to 4ea5fb8752cfe
Using index info to reconstruct a base tree...
M	go.mod
M	go.sum
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
Auto-merging go.sum
CONFLICT (content): Merge conflict in go.sum
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 vendor: bump ocicni to 4ea5fb8752cfe
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Details

In response to this:

/cherry-pick release-1.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@haircommander
Member

Thanks for the explanation @dcbw, makes sense!
