oci: return IsAlive error instead of logging #4149
Codecov Report
@@            Coverage Diff             @@
##           master    #4149      +/-   ##
==========================================
- Coverage   40.74%   40.72%   -0.02%
==========================================
  Files         111      111
  Lines        9499     9496       -3
==========================================
- Hits         3870     3867       -3
  Misses       5253     5253
  Partials      376      376
2f015d5 to 2d94193
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: haircommander
/retest
 	}

-	return true
+	return errors.Wrapf(err, "checking if PID of %s is running failed", c.id)
Are we missing the return nil case?
If the supplied err is nil, Wrapf returns nil:
https://godoc.org/github.com/pkg/errors#Wrap
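The nil passthrough discussed in this thread can be illustrated with a small stdlib-only sketch. The `wrapf` helper and the `errNoSuchProcess` value below are hypothetical; `wrapf` only mimics the documented contract of `errors.Wrapf` from github.com/pkg/errors (nil in, nil out) and is not that library's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSuchProcess is an illustrative error value, not one from the PR.
var errNoSuchProcess = errors.New("no such process")

// wrapf is a hypothetical stand-in that mimics the documented nil behavior
// of github.com/pkg/errors.Wrapf: a nil error stays nil, a non-nil error
// gets the formatted message prepended.
func wrapf(err error, format string, args ...interface{}) error {
	if err == nil {
		return nil
	}
	return fmt.Errorf("%s: %w", fmt.Sprintf(format, args...), err)
}

func main() {
	// Success path: there is nothing to wrap, so the caller sees nil.
	fmt.Println(wrapf(nil, "checking if PID of %s is running failed", "abc123") == nil)

	// Failure path: the original error is annotated and returned.
	fmt.Println(wrapf(errNoSuchProcess, "checking if PID of %s is running failed", "abc123"))
}
```

Because the helper returns nil for a nil input, a function can end with a single wrapped return and still report success, which is why no separate `return nil` branch is needed.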
server/container_execsync.go (outdated)
-	if !c.IsAlive() {
-		return nil, fmt.Errorf("container is not created or running")
+	if err := c.IsAlive(); err != nil {
+		return nil, errors.Wrapf(err, "container is not created or running")
Let's return the same codes.NotFound as the first check above.
fixed!
When a container has been stopped, but the rest of its pod is still stopping, the kubelet still runs exec probes.
IsAlive() correctly identifies that the container has been stopped and logs an error, but in reality this is expected.
Instead of logging the error, return it from IsAlive (and also ExecSync), and let the kubelet report it if it thinks it'll be problematic.

This fixes superfluous errors like this:
"Checking if PID of 4a81020e858fbdd1ee6a271190ab36aec1940489386e177f33c2e62afa309580 is running failed: PID running but not the original container. PID wrap may have occurred"

Signed-off-by: Peter Hunt <[email protected]>
2d94193 to ec69e86
-	if !c.IsAlive() {
-		return nil, fmt.Errorf("container is not created or running")
+	if err := c.IsAlive(); err != nil {
+		return nil, status.Errorf(codes.NotFound, "container is not created or running: %v", err)
Can you include the container id as well?
it's included in the return value of IsAlive()
should I move that up a level?
No, it's fine!
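As a rough illustration of why the container ID does not need repeating in ExecSync: formatting the inner error with `%v` carries whatever IsAlive put in its message. Everything below is a stdlib-only sketch under assumptions. `notFoundError`, `isAlive`, and `execSyncCheck` are hypothetical names, and `notFoundError` merely stands in for the gRPC status error with `codes.NotFound` that the reviewed code returns.

```go
package main

import "fmt"

// notFoundError is a hypothetical stand-in for a gRPC status error carrying
// codes.NotFound; the reviewed code builds the real one with status.Errorf.
type notFoundError struct{ msg string }

func (e *notFoundError) Error() string { return "NotFound: " + e.msg }

// isAlive is an illustrative stub: it always fails and names the container
// ID in its message, the way the reviewed IsAlive does.
func isAlive(id string) error {
	return fmt.Errorf("checking if PID of %s is running failed", id)
}

// execSyncCheck formats the inner error with %v, so the ID set by isAlive
// ends up in the outer message without the caller adding it again.
func execSyncCheck(id string) error {
	if err := isAlive(id); err != nil {
		return &notFoundError{msg: fmt.Sprintf("container is not created or running: %v", err)}
	}
	return nil
}

func main() {
	fmt.Println(execSyncCheck("abc123"))
}
```

The trade-off is that the ID appears only as part of a message string; moving it up a level would make it available even if the inner message changed.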
/retest
/lgtm
@haircommander: The following tests failed, say
/retest
I will wait until #4153 merges before cherry-picking, so I don't have to retest a bajillion times against a failing integration_rhel.
/cherry-pick release-1.19
@haircommander: new pull request created: #4157
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
When a container has been stopped, but the rest of its pod is still stopping, the kubelet still runs exec probes.
IsAlive() correctly identifies that the container has been stopped and logs an error, but in reality this is expected.
Instead of logging the error, return it from IsAlive (and also ExecSync), and let the kubelet report it if it thinks it'll be problematic.
This fixes superfluous errors like this:
"Checking if PID of 4a81020e858fbdd1ee6a271190ab36aec1940489386e177f33c2e62afa309580 is running failed: PID running but not the original container. PID wrap may have occurred"
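The shape of the change described above can be sketched as follows. This is a simplified, stdlib-only illustration and not the PR's code: the `container` type, its fields, `errNotRunning`, and the lowercase `isAlive` and `execSync` functions are all hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// container is a hypothetical, heavily simplified stand-in for the runtime's
// container type; only the fields this sketch needs are modeled.
type container struct {
	id      string
	running bool
}

// errNotRunning is an illustrative sentinel, not an error from the real code.
var errNotRunning = errors.New("process is not running")

// isAlive reports liveness through its return value instead of logging.
// nil means alive; a non-nil error says why not and names the container ID.
func (c *container) isAlive() error {
	if !c.running {
		return fmt.Errorf("checking if PID of %s is running failed: %w", c.id, errNotRunning)
	}
	return nil
}

// execSync refuses to run in a dead container and hands the reason back to
// the caller, which decides whether it is worth reporting.
func execSync(c *container, cmd string) (string, error) {
	if err := c.isAlive(); err != nil {
		return "", fmt.Errorf("container is not created or running: %w", err)
	}
	return "ran " + cmd, nil
}

func main() {
	stopped := &container{id: "4a81020e", running: false}
	if _, err := execSync(stopped, "cat /healthz"); err != nil {
		// The caller (the kubelet, in the PR's scenario) chooses how to surface this.
		fmt.Println("exec probe failed:", err)
	}
}
```

With this shape the runtime itself stays quiet for the expected "container already stopped" case, and only the caller's handling determines whether anything is logged.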
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?