server/metrics: Update seccomp notifier metrics to reduce cardinality #6456
Signed-off-by: Peter Fern <[email protected]>
Hi @pdf. Thanks for your PR. I'm waiting for a cri-o member to verify that this patch is reasonable to test.
  }

- metrics.Instance().MetricContainersSeccompNotifierCountTotalInc(ctr.Name(), usedSyscalls)
+ metrics.Instance().MetricContainersSeccompNotifierCountTotalInc(ctr.Name(), syscall)
Above, we pull notifier.UsedSyscalls, which got us usedSyscalls. I don't see us finding all of the syscalls that were reported. Shouldn't we call this function for each syscall?
This is called in a loop, every time a seccomp notification arrives on the channel:
Lines 746 to 750 in 093d680
    for {
        msg := <-s.seccompNotifierChan
        ctx := msg.Ctx()
        id := msg.ContainerID()
        syscall := msg.Syscall()
usedSyscalls accumulates syscalls from each iteration in the same loop:
Line 764 in 093d680
    notifier.AddSyscall(syscall)
So we do call the function for each syscall: on each iteration we just increment the metric count for the specific syscall name received, i.e.:
| iteration | msg.Syscall() | metric values |
|---|---|---|
| 0 | swapoff | {name="...", syscall="swapoff"} = 1 |
| 1 | swapoff | {name="...", syscall="swapoff"} = 2 |
| 2 | chroot | {name="...", syscall="swapoff"} = 2, {name="...", syscall="chroot"} = 1 |
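The table above can be reproduced with a minimal sketch, assuming a plain map stands in for the real metric. The names labelKey, counterVec, inc, and the container name "ctr" are hypothetical and are not part of cri-o or the Prometheus client:

```go
package main

import "fmt"

// labelKey is a hypothetical stand-in for the label set {name, syscall}.
type labelKey struct {
	name    string
	syscall string
}

// counterVec is a hypothetical, simplified counter vector keyed by label set.
type counterVec map[labelKey]int

// inc increments the series for one container name and one syscall.
func (c counterVec) inc(name, syscall string) {
	c[labelKey{name: name, syscall: syscall}]++
}

func main() {
	counts := counterVec{}
	// Same notification sequence as the table: swapoff, swapoff, chroot.
	for i, syscall := range []string{"swapoff", "swapoff", "chroot"} {
		counts.inc("ctr", syscall)
		fmt.Printf("iteration %d: swapoff=%d chroot=%d\n", i,
			counts[labelKey{"ctr", "swapoff"}],
			counts[labelKey{"ctr", "chroot"}])
	}
}
```

Each notification touches exactly one series, so the number of series grows with the number of distinct syscalls rather than with the number of notifications.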
/ok-to-test
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6456      +/-   ##
==========================================
- Coverage   43.59%   43.57%   -0.03%
==========================================
  Files         123      123
  Lines       14390    14390
==========================================
- Hits         6273     6270       -3
- Misses       7437     7439       +2
- Partials      680      681       +1
Looks like the original tests may not have been functional: all the exit code tests for commands that should make forbidden syscalls are failing, e.g. cri-o/test/seccomp_notifier.bats, line 36 in 093d680.
Is the test runner perhaps not restricting these calls?
That's a known failure in RHEL e2e that we've been investigating (and subsequently ignoring in order to get the release out). The tests are working in Fedora. I personally have no idea what's going on, and those who could possibly look into it are already on holiday break.
/approve /override ci/prow/e2e-gcp-ovn
@haircommander: Overrode contexts on behalf of haircommander: ci/prow/ci-cgroupv2-e2e, ci/prow/ci-rhel-integration, ci/prow/e2e-gcp-ovn
Thanks a bunch @pdf for taking this on!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: haircommander, pdf
Signed-off-by: Peter Fern [email protected]
What type of PR is this?
/kind bug
What this PR does / why we need it:
As discussed in #6418, this change reduces the cardinality of the seccomp notifier metrics by adjusting the metric output format.
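A rough sketch of why the label choice matters for cardinality follows. It compares labelling each notification with only its own syscall against labelling it with everything observed so far. The comma-joined history is an assumed format chosen for illustration and is not necessarily what the earlier metric emitted; both function names are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// seriesPerSyscall counts distinct label values when each notification is
// labelled with only the syscall it reports.
func seriesPerSyscall(events []string) int {
	seen := map[string]struct{}{}
	for _, syscall := range events {
		seen[syscall] = struct{}{}
	}
	return len(seen)
}

// seriesAccumulated counts distinct label values when each notification is
// labelled with the full history seen so far. The comma-joined history is an
// assumption for illustration only.
func seriesAccumulated(events []string) int {
	seen := map[string]struct{}{}
	var history []string
	for _, syscall := range events {
		history = append(history, syscall)
		seen[strings.Join(history, ",")] = struct{}{}
	}
	return len(seen)
}

func main() {
	events := []string{"swapoff", "swapoff", "chroot", "swapoff"}
	fmt.Println("accumulated label series:", seriesAccumulated(events))
	fmt.Println("per-syscall label series:", seriesPerSyscall(events))
}
```

Under this assumption, the accumulated label produces a new series for every notification, while the per-syscall label is bounded by the number of distinct syscall names.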
Which issue(s) this PR fixes:
Fixes #6422
Special notes for your reviewer:
I wasn't able to sort out running the test suite locally in time for release, so I'm marking this as WIP to make use of the CI test runners.
Does this PR introduce a user-facing change?
This change does modify the metric output format; however, the previous metrics have not been included in any prior release.