network stop: don't segfault if sandbox isn't created yet #4258
// cleaning up a failed sandbox creation.
// We don't need to create the file, as there will be no
// sandbox to restore
if !s.created {
nit: this should probably come before infra := s.InfraContainer() line.
fixed
Would be nice to amend the first commit message with an explanation of why this was reverted. It's nice to be able to read git log without referring to PRs.
Codecov Report
@@            Coverage Diff             @@
##           master    #4258      +/-   ##
==========================================
- Coverage   38.59%   38.57%   -0.02%
==========================================
  Files         111      111
  Lines        8893     8894       +1
==========================================
- Hits         3432     3431       -1
- Misses       5077     5079       +2
  Partials      384      384
This reverts commit ef07f71.

Commit ef07f71 (henceforth called revert-1) was a revert of 83169c5.

revert-1 was needed because there was a chance that cri-o segfaulted because of an expectation of ordering. If a RunPodSandbox request failed after the sandbox was created, but before the infra container was created, the cleanup func called to clean up the network, networkStop, would attempt to write a file to the infra container's Dir(). Since the infra container doesn't exist yet, that infraContainer.Dir() call would segfault.

This revert (revert-2) is the first in a two-commit series, the second of which will fix that segfault, thus allowing us to revert revert-1.

Signed-off-by: Peter Hunt <[email protected]>
If we create the network before we have an infra container, but fail to fully create a sandbox, we attempt to clean up the network. Calling networkStop() causes CRI-O to place a file in the sandbox's infra container's directory, thus allowing us to restore the fact that the network had been stopped.

The problem is, we don't have an infra container directory, so the call segfaults.

Instead, check if the sandbox has finished creating before attempting to create the file. If it hasn't, there will be no sandbox to restore, so we don't really need the temp file.

Another option would be to wire it so that the sandbox has access to the infraContainer.Dir() without actually having an infra container. That requires another item in libsandbox.New(), which I find cumbersome. Further, I think sandbox creation code is itching for a refactor, which can include that fix if we find it desirable. In the meantime, this workaround is sufficient.

Signed-off-by: Peter Hunt <[email protected]>
87a3747 to fecf1a1
LGTM
/retest
kolyshkin left a comment
LGTM
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: haircommander, kolyshkin, saschagrunert
/lgtm
/retest
@haircommander: The following test failed, say
What type of PR is this?
/kind bug
What this PR does / why we need it:
If we create the network before we have an infra container, but fail to fully create a sandbox,
we attempt to clean up the network. Calling networkStop() causes CRI-O to place a file in the
sandbox's infra container's directory, thus allowing us to restore the fact that the network had been stopped.
The problem is, we don't have an infra container directory, so the call segfaults.
Instead, check if the sandbox has finished creating before attempting to create the file. If it hasn't, there will be
no sandbox to restore, so we don't really need the temp file.
Another option would be to wire it so that the sandbox has access to the infraContainer.Dir() without actually having an infra container.
That requires another item in libsandbox.New(), which I find cumbersome. Further, I think sandbox creation code is itching for a refactor,
which can include that fix if we find it desirable. In the meantime, this workaround is sufficient.
This PR reverts #4244 (which was itself a revert, so this un-reverts the original change), and also fixes #4240 (comment)
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?