oci: do not use conmon for exec sync #4943
d7f1150 to
424374e
Compare
Codecov Report
@@            Coverage Diff             @@
##           master    #4943      +/-   ##
==========================================
+ Coverage   42.94%   43.38%   +0.44%
==========================================
  Files         107      107
  Lines        9933     9825     -108
==========================================
- Hits         4266     4263       -3
+ Misses       5217     5112     -105
  Partials      450      450
internal/oci/runtime_oci.go
Outdated
	Stdout:   stdout,
	Stderr:   stderr,
	ExitCode: exitCode,
}, waitErr
This block doesn't match the existing code block and that may break container restarts.
Please check if the kubelet will tolerate this change.
there's something awry with the code. I am working on getting critests passing, and that will likely involve changing this block
I just spent nearly a whole day looking for:
  // gather exit code from waitErr
  exitCode := int32(0)
  if waitErr != nil {
- 	if exitError, ok := err.(*exec.ExitError); ok {
+ 	if exitError, ok := waitErr.(*exec.ExitError); ok {
  		exitCode = int32(exitError.ExitCode())
  	}
  }
😢
Oh, nice catch
6e02a4b to
98e83f9
Compare
/retest
ac2d146 to
9a6598b
Compare
9a6598b to
9e6b798
Compare
/test integration_fedora
goodness, this is a rabbit hole. I think this passes critest now; it's hard to tell because I keep getting rate limited. no idea what's up with the openshift jenkins tests either... edit: openshift jenkins didn't seem to like the
/override ci/prow/e2e-agnostic
@saschagrunert: Overrode contexts on behalf of saschagrunert: ci/prow/e2e-agnostic, ci/prow/e2e-gcp
saschagrunert
left a comment
Just two nits, otherwise LGTM
internal/oci/runtime_oci.go
Outdated
	Stderr:   stderrBuf,
	ExitCode: -1,
	Err:      err,
log.Errorf(ctx, "failed to get pid (%d) or pgid (%d) from file %s: %v", ctrPid, ctrPgid, pidFile, err)
- log.Errorf(ctx, "failed to get pid (%d) or pgid (%d) from file %s: %v", ctrPid, ctrPgid, pidFile, err)
+ log.Errorf(ctx, "Failed to get pid (%d) or pgid (%d) from file %s: %v", ctrPid, ctrPgid, pidFile, err)
  return
fixed!
internal/oci/runtime_oci.go
Outdated
		log.Errorf(ctx, "Failed to kill process after timeout: %v", err)
	}
}
fixed!
It is not really needed, and causes unnecessary overhead. Plus, we drop an unkillable child from the mix allowing cri-o to keep track of its children better. Signed-off-by: Peter Hunt <[email protected]>
Signed-off-by: Peter Hunt <[email protected]>
9e6b798 to
eebef46
Compare
/retest
saschagrunert
left a comment
/lgtm
/hold
@mrunalp @haircommander feel free to lift the hold when ready.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: haircommander, saschagrunert
For ultra-low-latency workloads, it is helpful to guarantee that the container entry point, and any other foreign process run in the container (kubectl exec...) are always scheduled on fixed and predictable cpu in the container cpuset (e.g. not in a random one). This patch implements this optional behaviour depending on container annotations. Signed-off-by: Francesco Romani <[email protected]>
/retest
@haircommander: The following test failed, say
/hold cancel
/retest Please review the full test history for this PR and help us cut down flakes.
/cherry-pick release-1.20
@saschagrunert: #4943 failed to apply on top of branch "release-1.20"
/cherry-pick release-1.21
@haircommander: new pull request created: #4962
What type of PR is this?
/kind design
What this PR does / why we need it:
Conmon is not really needed for exec sync, and it causes unnecessary overhead. Dropping it also removes an unkillable child from the mix,
allowing cri-o to keep track of its children better.
Signed-off-by: Peter Hunt <[email protected]>
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?