Support Intel RDT #4830
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: marquiz The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @marquiz. Thanks for your PR. I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
cool! thanks for doing this work @marquiz. It is my understanding that this currently doesn't totally work (the rdt found in the annotation doesn't seem to be going anywhere). that's fine to start, just wanted to check. Is it standard for all containers to have the same rdt profile for a host? or would they want the configuration to look different? (I don't know anything about rdt) If they may look different, it may behoove us to have multiple profiles, found in a directory, rather than a single file. seccomp does this currently, where if you specify I have a few review comments and nits, and we'll likely want some tests to go along with this one day. I would be interested in adding support independent of corresponding CRI changes. our |
|
/ok-to-test |
|
Thanks for taking a look at this @haircommander !
It does work. We inspect the annotation(s) and set
RDT enables QoS control of cache and/or memory bandwidth by providing class-based allocation of these resources. In other words, it provides a set of classes of service (CLOSes) to which cache lines and/or memory bandwidth are allocated. PIDs are then assigned to one of those classes, which limits their resource usage. The maximum number of classes is fairly limited (by HW), e.g. to 16.
The OCI container runtime (runc) does not read any configuration file. It just assigns the container (PIDs) to the class/CLOS specified in the container runtime spec. With this PR, classes/CLOSes are pre-configured at CRI-O startup by goresctrl. Or probably better, "may be configured", as the class/CLOS configuration could also be done out-of-band, manually or with some other tool.
I'm not sure the model of independently configured profiles, as with e.g. seccomp/apparmor, is applicable here. At least not in the case of goresctrl. The classes are inter-dependent: you may e.g. reserve 50% for class A exclusively and give the remaining 50% to B and C so that C only gets 60% of that share. The set of available classes/CLOSes is specified in a single configuration file. Some references:
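To make the inter-dependence of classes concrete, here is a minimal Go sketch of the share arithmetic from the A/B/C example above. It is only an illustration: the `absoluteShare` helper and the percentages are hypothetical and are not goresctrl's configuration format or API.

```go
package main

import "fmt"

// absoluteShare is a hypothetical helper: it converts a class's share of a
// pool (in percent) into a share of the whole resource (in percent).
func absoluteShare(poolPercent, classPercent float64) float64 {
	return poolPercent * classPercent / 100
}

func main() {
	// Class A gets 50% of the resource exclusively.
	a := absoluteShare(100, 50)

	// B and C share whatever A does not reserve.
	sharedPool := 100 - a

	// B may use the whole shared pool; C is limited to 60% of it.
	b := absoluteShare(sharedPool, 100)
	c := absoluteShare(sharedPool, 60)

	fmt.Printf("A: %.0f%% (exclusive)\n", a)
	fmt.Printf("B: up to %.0f%% (shared pool)\n", b)
	fmt.Printf("C: up to %.0f%% (60%% of the shared pool)\n", c)
}
```

Changing A's reservation changes what B and C can get, which is why the classes are hard to describe as independent per-profile files.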
I didn't add any, yet, as the integration is so "thin". We could basically test the config parsing error cases and parsing of the annotation(s) but that's about it.
Yeah, I added the container annotation to the list of I think that the last patch (adding Pod annotations) really should not go into CRI-O. I'll submit that against K8s/kubelet if this approach looks feasible to the container runtimes (I submitted a PR against containerd as well: containerd/containerd#5439) |
|
/retest |
Codecov Report
@@ Coverage Diff @@
## master #4830 +/- ##
==========================================
- Coverage 44.27% 44.13% -0.14%
==========================================
Files 112 113 +1
Lines 11563 11640 +77
==========================================
+ Hits 5119 5137 +18
- Misses 5957 6013 +56
- Partials 487 490 +3 |
|
Rebased |
saschagrunert
left a comment
Looks good from my point of view. Can we test this somehow in our CI?
|
Other than @saschagrunert 's comments, LGTM |
|
@marquiz please rebase again 🙃 |
Use goresctrl for parsing RDT related container and pod annotations. In practice, from the users' perspective, this patch adds support for a container annotation and two pod annotations for controlling the RDT class on the CRI level.
Container annotation that can be used by the CRI client:
"io.kubernetes.cri.rdt-class"
Pod annotations for specifying the RDT class on the K8s podspec level:
"rdt.resources.beta.kubernetes.io/pod" (pod-wide default for all containers)
"rdt.resources.beta.kubernetes.io/container.<container_name>" (container-specific overrides)
Annotations are intended as an intermediate step before the CRI API supports RDT.
Signed-off-by: Markus Lehtonen <[email protected]>
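The annotation lookup described in this commit message could be sketched roughly as follows. This is not the code from the PR or from goresctrl: `resolveRdtClass` is a hypothetical helper, and the precedence of the CRI container annotation over the pod annotations is an assumption made for illustration. Only the annotation keys, and the rule that the container-specific pod annotation overrides the pod-wide default, come from the commit message.

```go
package main

import "fmt"

const (
	// Annotation keys as quoted in the commit message.
	containerAnnotation = "io.kubernetes.cri.rdt-class"
	podAnnotation       = "rdt.resources.beta.kubernetes.io/pod"
	podContainerPrefix  = "rdt.resources.beta.kubernetes.io/container."
)

// resolveRdtClass is a hypothetical helper that picks the RDT class for a
// container. Assumed lookup order: CRI container annotation, then the
// container-specific pod annotation, then the pod-wide default.
func resolveRdtClass(containerName string, containerAnnotations, podAnnotations map[string]string) (string, bool) {
	if class, ok := containerAnnotations[containerAnnotation]; ok {
		return class, true
	}
	if class, ok := podAnnotations[podContainerPrefix+containerName]; ok {
		return class, true
	}
	if class, ok := podAnnotations[podAnnotation]; ok {
		return class, true
	}
	// No annotation present: leave the container in the default class.
	return "", false
}

func main() {
	// "bronze" and "gold" are made-up class names.
	pod := map[string]string{
		podAnnotation:              "bronze",
		podContainerPrefix + "app": "gold",
	}
	for _, name := range []string{"app", "sidecar"} {
		class, ok := resolveRdtClass(name, nil, pod)
		fmt.Printf("container=%s class=%q found=%v\n", name, class, ok)
	}
}
```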
Signed-off-by: Markus Lehtonen <[email protected]>
fafbef7 to
d0baf9c
Compare
Np, this was an easy one.
👍 |
|
/retest |
|
/retest |
|
/retest |
|
/assign @saschagrunert |
saschagrunert
left a comment
/lgtm
/unhold
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: marquiz, saschagrunert The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@marquiz: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/retest |
|
/override ci/prow/e2e-gcp agnostic passed |
|
@haircommander: Overrode contexts on behalf of haircommander: ci/prow/e2e-gcp DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
OCI runtime-spec has supported Intel RDT (resctrl pseudo-filesystem) for a while already. This RFC PR adds two independent parts of RDT support to CRI-O.
The first one is integration with github.com/intel/goresctrl, which enables a flexible class-based configuration mechanism. The concept of the integration is partly inspired by seccomp et al.: the CRI-O config only specifies a path to an external configuration file, and the file structure and the logic of applying the configuration are owned by the library. This integration will give users an easy option for configuring RDT in tandem with the container runtime.
The second part adds support for a container annotation for controlling the RDT class/CLOS of containers. The idea is that Kubelet can utilize this annotation before the CRI API properly supports RDT. There's also a patch that adds the respective Pod annotations for controlling the RDT class (per-pod and per-container); this is for testing/demonstration purposes of K8s integration.
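As a rough illustration of where the class name ends up, the sketch below builds the relevant fragment of an OCI runtime config with the `linux.intelRdt.closID` field from the runtime spec. The struct types, the `marshalRdtConfig` helper, and the class name "gold" are hypothetical stand-ins so the example stays self-contained; real code would normally use the types from the runtime-spec Go package, and this is not the code from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal local mirrors of the relevant OCI runtime-spec sections.
// Only the closID field is modeled here.
type intelRdt struct {
	ClosID string `json:"closID,omitempty"`
}

type linuxSection struct {
	IntelRdt *intelRdt `json:"intelRdt,omitempty"`
}

type runtimeConfig struct {
	Linux *linuxSection `json:"linux,omitempty"`
}

// marshalRdtConfig is a hypothetical helper that returns the JSON fragment
// asking the OCI runtime to place the container in the given class.
func marshalRdtConfig(class string) (string, error) {
	cfg := runtimeConfig{
		Linux: &linuxSection{
			IntelRdt: &intelRdt{ClosID: class},
		},
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	fragment, err := marshalRdtConfig("gold")
	if err != nil {
		panic(err)
	}
	fmt.Println(fragment)
}
```

The class itself still has to exist in the resctrl filesystem, whether it was created by goresctrl at startup or configured out-of-band.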
My take is that the next steps – e.g. how to split this PR, how to proceed with CRI API etc – depend on how this PR is received.
What type of PR is this?
/kind feature
What this PR does / why we need it:
The OCI runtime spec has had support for RDT for a long time already. This PR opens up the path to CRI support. Integration with goresctrl is a light and simple extension point for better user experience.
Which issue(s) this PR fixes:
Special notes for your reviewer:
This (RFC) PR consists of two separate pieces (support RDT annotation(s) and integration with goresctrl) that could be split into separate PRs.
Related runc PR:
opencontainers/runc#2920
Sibling PR against containerd:
containerd/containerd#5439
/hold
Does this PR introduce a user-facing change?