@haircommander
Member

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

none

@openshift-ci openshift-ci bot added the release-note-none, dco-signoff: yes, and kind/cleanup labels Nov 1, 2021
@openshift-ci openshift-ci bot requested a review from sameo November 1, 2021 19:44
@openshift-ci openshift-ci bot added the approved label Nov 1, 2021
@codecov

codecov bot commented Nov 1, 2021

Codecov Report

Merging #5433 (0ce45a3) into main (6ee64e9) will increase coverage by 0.00%.
The diff coverage is 77.27%.

@@           Coverage Diff           @@
##             main    #5433   +/-   ##
=======================================
  Coverage   43.46%   43.47%           
=======================================
  Files         118      118           
  Lines       11848    11851    +3     
=======================================
+ Hits         5150     5152    +2     
  Misses       6206     6206           
- Partials      492      493    +1     

@haircommander
Member Author

For reference, we have a goroutine stack dump with a ridiculous number of goroutines stuck on this line.

This looks more like the goroutine failing to receive when chControl is closed, but adding the buffer would also let the function make forward progress. I.e., I suspect something funky in the Go runtime, but maybe we can work around it?

@haircommander
Member Author

/retest

Member

@saschagrunert saschagrunert left a comment

LGTM

@haircommander
Member Author

/test e2e-agnostic

@saschagrunert
Member

/retest-required

@haircommander
Member Author

haircommander commented Nov 2, 2021

So I didn't actually know why this would work, but I guessed it would. I figured it out chatting with @rphillips today.
Looking at https://github.com/cri-o/cri-o/blob/main/internal/oci/oci.go#L139-L146:

			case <-chControl:
				return
			default:
				// Check if the container is stopped
				if err := impl.UpdateContainerStatus(ctx, c); err != nil {
					done <- err
					return
				}
				if c.State().Status == ContainerStateStopped {
					done <- nil
					return
				}

If chControl is closed while UpdateContainerStatus is running and the goroutine then has a result to report, it blocks forever sending on done, when it should be checking chControl. Buffering the channel lets that send complete, so the goroutine can make forward progress and return (without really complicating the code that much).

@haircommander haircommander changed the title from "oci: Cleanup channel handling in WaitContainerStateStopped" to "oci: Fix a couple of deadlocks in container stop code" Nov 3, 2021
@haircommander
Member Author

/retest

@haircommander
Member Author

@cri-o/cri-o-maintainers PTAL

@haircommander
Member Author

/retest

Member

@saschagrunert saschagrunert left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Nov 8, 2021
@openshift-ci
Contributor

openshift-ci bot commented Nov 8, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [haircommander,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Contributor

openshift-ci bot commented Nov 8, 2021

@haircommander: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/openshift-jenkins/e2e_crun_cgroupv2 | 0ce45a3 | link | false | /test e2e_cgroupv2 |
| ci/openshift-jenkins/integration_crun_cgroupv2 | 0ce45a3 | link | false | /test integration_cgroupv2 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@haircommander
Member Author

/cherry-pick release-1.22

@openshift-cherrypick-robot

@haircommander: once the present PR merges, I will cherry-pick it on top of release-1.22 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-1.22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-robot openshift-merge-robot merged commit e93843f into cri-o:main Nov 8, 2021
@openshift-cherrypick-robot

@haircommander: #5433 failed to apply on top of branch "release-1.22":

Applying: oci: make some channels buffered
Applying: oci: always close chControl
Applying: oci: fix deadlock in container stop code
Using index info to reconstruct a base tree...
M	internal/oci/container.go
M	internal/oci/runtime_oci.go
Falling back to patching base and 3-way merge...
Auto-merging internal/oci/runtime_oci.go
Auto-merging internal/oci/container.go
CONFLICT (content): Merge conflict in internal/oci/container.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0003 oci: fix deadlock in container stop code
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

@haircommander
Member Author


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note.

5 participants