oci_linux: fix working set calculation #4068
Commit 512fdb2 mistakenly used the value of total_inactive_file from
the top-level cgroup, thus the working set value was either wrong
(too low) or invalid (negative), for example:

> Unable to account working set stats: total_inactive_file (1572753409) > memory usage (585728)" file="oci/oci_linux.go:93"

We need to use total_inactive_file and memory.usage_in_bytes from
the same cgroup, otherwise it does not make any sense.

While at it
- promote the above message from debug to warning;
- optimize getTotalInactiveFile() by using HasPrefix() rather than Contains().

Signed-off-by: Kir Kolyshkin <[email protected]>
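For readers following along, here is a minimal Go sketch of the calculation described in the commit message. It is not the code from oci/oci_linux.go; the names `inactiveFile` and `workingSet` are hypothetical, and the memory.stat contents are passed in as a string so the example runs without a cgroup filesystem. It assumes both inputs were read from the same cgroup, which is the point of the fix.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// inactiveFile scans the contents of a memory.stat file and returns the
// value stored under key (for example "total_inactive_file" on cgroup v1).
// Hypothetical helper, written for illustration only.
func inactiveFile(stat, key string) (uint64, error) {
	prefix := key + " "
	sc := bufio.NewScanner(strings.NewReader(stat))
	for sc.Scan() {
		line := sc.Text()
		// HasPrefix only looks at the start of the line, so it does less
		// work than Contains and cannot match the key mid-line.
		if strings.HasPrefix(line, prefix) {
			return strconv.ParseUint(strings.TrimSpace(line[len(prefix):]), 10, 64)
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("%q not found in memory.stat", key)
}

// workingSet subtracts the inactive file bytes from the memory usage.
// Both values must come from the same cgroup; if they do not, the
// subtraction can underflow, which is reported as an error here.
func workingSet(usage, inactive uint64) (uint64, error) {
	if inactive > usage {
		return 0, fmt.Errorf("total_inactive_file (%d) > memory usage (%d)", inactive, usage)
	}
	return usage - inactive, nil
}

func main() {
	// Made-up memory.stat excerpt for a single container cgroup.
	stat := "cache 4096\ninactive_file 1024\ntotal_inactive_file 2048\n"

	inactive, err := inactiveFile(stat, "total_inactive_file")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	ws, err := workingSet(585728, inactive)
	if err != nil {
		fmt.Println("warning:", err)
		return
	}
	fmt.Println("working set bytes:", ws)
}
```

Calling `workingSet(585728, 1572753409)` with the two numbers from the log line above returns the error path, which is the mismatch this PR removes by reading both values from the container's own cgroup.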
Found while fixing test/stats.bats in PR #4064 (see #4064 (comment)), going to test it there.
Codecov Report
@@            Coverage Diff             @@
##           master    #4068      +/-   ##
==========================================
- Coverage   41.18%   41.17%   -0.02%
==========================================
  Files         109      109
  Lines        8998     9001       +3
==========================================
  Hits         3706     3706
- Misses       4955     4958       +3
  Partials      337      337
/retest LGTM
giuseppe
left a comment
LGTM
/retest
It's done in the same manner as for v1, except that for cgroupv2 there are no
total_* counters in memory.stat. My assumption is that the counters now
include subcgroup counters, although I am only 95% sure about that
even after digging into the cgroupv2 docs.
Here's the empirical way I used to check the above assumption:
find /sys/fs/cgroup/system.slice/ -name memory.stat -type f \
| xargs grep ^inactive_file 2>/dev/null \
| awk '$2 > 0 {printf "%20d %s\n",$2,$1}' \
| sort -nr
It shows that the top-level cgroup's value is higher than that of its
sub-cgroups for both system.slice and user.slice.
PS I have also checked that it is done the same way in containerd/cri.
Signed-off-by: Kir Kolyshkin <[email protected]>
Force-pushed from 1ece911 to dca8285.
/lgtm
/retest
saschagrunert
left a comment
/test integration_crun
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: giuseppe, kolyshkin, mrunalp, saschagrunert
/retest
LGTM
/kind bug
What this PR does / why we need it:
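Commit 512fdb2 (PR #3115) mistakenly used the value of total_inactive_file
from the top-level cgroup, thus the working set value was either
wrong (too low) or invalid (negative), for example:
> Unable to account working set stats: total_inactive_file (1572753409) > memory usage (585728)" file="oci/oci_linux.go:93"
We need to use total_inactive_file and memory.usage_in_bytes from
the same cgroup, otherwise it does not make any sense.
While at it
- promote the above message from debug to warning;
- optimize getTotalInactiveFile() by using HasPrefix() rather than Contains().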
Which issue(s) this PR fixes:
None
Special notes for your reviewer:
Does this PR introduce a user-facing change?