oci: fix a performance regression from execs #5136
Force-pushed from 22c7d13 to 1440701.
internal/oci/oci.go (outdated)
- func New(c *config.Config) *Runtime {
+ func New(c *config.Config) (*Runtime, error) {
  	execNotifyDir := filepath.Join(c.ContainerAttachSocketDir, "exec-pid-dir")
  	if err := os.MkdirAll(execNotifyDir, 0o755); err != nil {
G301: Expect directory permissions to be 0750 or less
(at-me in a reply with help or ignore)
Force-pushed from b2def49 to 657d53c.
Codecov Report
@@            Coverage Diff             @@
##           master    #5136      +/-   ##
==========================================
+ Coverage   43.93%   44.08%   +0.15%
==========================================
  Files         110      111       +1
  Lines       11453    11484      +31
==========================================
+ Hits         5032     5063      +31
+ Misses       5944     5939       -5
- Partials      477      482       +5
/test e2e-gcp
Just curious why you're creating notify in CRI-O rather than fixing upstream?
I'm not sure which upstream you're talking about. I would really like to get a timeout feature into runc, but there is a lot of nuance to the behavior here ("if an exec takes more than x seconds, SIGKILL it" is not a very generic requirement). So we do the next best thing: our best in CRI-O.
Maybe I misunderstood this part, but it looked like you dropped github.com/rjeczalik/notify and replaced it with your own spin on notify.
  	return nil, err
  }
  go func() {
  	defer watcher.Close()
This could be put after the err check on NewWatcher.
I left it here because we want to close it after we receive <-done, not when the enclosing function exits.
  	defer close(eiCh)
  	defer notify.Stop(eiCh)
  	for {
  		select {
Compared with the container exit monitor, we are missing a case for watcher.Errors.
saschagrunert left a comment
Just a nit, otherwise LGTM
Ah, not quite. I introduced a second library in the initial commit because it seemed to work better. I reverted that addition in this PR.
Signed-off-by: Peter Hunt <[email protected]>
to prevent excessive RSS from being used in execs
Signed-off-by: Peter Hunt <[email protected]>
/retest
/lgtm
/retest-required Please review the full test history for this PR and help us cut down flakes.
saschagrunert left a comment
/lgtm
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: haircommander, saschagrunert
What type of PR is this?
/kind bug
What this PR does / why we need it:
It turns out that starting a new inotify watcher for each exec probe was really inefficient. This PR replaces it with a single watcher shared by all execs, which makes the RSS added per exec much smaller (roughly the overhead of exec.Cmd, which is unavoidable).
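The idea of one watcher shared by all execs can be sketched as a small dispatcher: a single goroutine reads every event and wakes whichever exec registered for that file name. This is only an illustration under stated assumptions. The `execWatcher` type, its methods, the keying by file name, and the plain channel standing in for the inotify event stream are all hypothetical and do not reproduce the PR's implementation.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sync"
)

// execWatcher routes events from ONE shared event source to the exec that is
// waiting on a given file name, instead of creating a watcher per exec.
type execWatcher struct {
	mu      sync.Mutex
	waiters map[string]chan struct{}
}

// newExecWatcher starts the single dispatcher goroutine. The events channel
// is a stand-in for the paths reported by a real file watcher.
func newExecWatcher(events <-chan string) *execWatcher {
	w := &execWatcher{waiters: make(map[string]chan struct{})}
	go func() {
		for path := range events {
			w.notify(filepath.Base(path))
		}
	}()
	return w
}

// register should be called before the exec starts so no event is missed.
// The returned channel is closed when an event for name arrives.
func (w *execWatcher) register(name string) <-chan struct{} {
	ch := make(chan struct{})
	w.mu.Lock()
	w.waiters[name] = ch
	w.mu.Unlock()
	return ch
}

// notify wakes the waiter for name, if any, and forgets it.
func (w *execWatcher) notify(name string) {
	w.mu.Lock()
	ch, ok := w.waiters[name]
	delete(w.waiters, name)
	w.mu.Unlock()
	if ok {
		close(ch)
	}
}

// pending reports how many execs are still waiting.
func (w *execWatcher) pending() int {
	w.mu.Lock()
	defer w.mu.Unlock()
	return len(w.waiters)
}

func main() {
	events := make(chan string)
	w := newExecWatcher(events)

	a := w.register("exec-a")
	b := w.register("exec-b")

	// Events may arrive in any order; each one wakes only its own exec.
	events <- "/run/exec-pid-dir/exec-b"
	<-b
	events <- "/run/exec-pid-dir/exec-a"
	<-a
	fmt.Println("both execs notified, pending:", w.pending())
}
```

The per-exec cost in this sketch is one map entry and one channel, while the watcher and its goroutine are created once.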
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?