
Conversation


@dubstack dubstack commented Jul 16, 2016

This PR is linked to the upstream issue #27204 for introducing pod level cgroups into Kubernetes.

@derekwaynecarr @vishh @Random-Liu PTAL.
I have also documented the reasons behind each design decision.
I would also like suggestions and discussion on some comments that I will add to the PR.

Please note that only the second commit is unique to this PR.



@dubstack added the do-not-merge label Jul 16, 2016
@dubstack added this to the v1.4 milestone Jul 16, 2016
@dubstack added the release-note, area/kubelet, and sig/node labels Jul 16, 2016
@k8s-github-robot added the size/XL label Jul 16, 2016
@dubstack force-pushed the inject-pod branch 7 times, most recently from 5d0416a to 511cd28 Jul 20, 2016 03:08
@k8s-github-robot added and removed the needs-rebase label Jul 23, 2016
@dubstack changed the title from "[WIP] Inject pod cgroup creation and deletion in the Kubelet" to "Inject pod cgroup creation and deletion in the Kubelet" Jul 28, 2016
@dubstack removed the do-not-merge label Jul 28, 2016
@dubstack
Author

@Random-Liu @derekwaynecarr This PR is ready for review.

maxImagesInNodeStatus = 50

// podCgroupNamePrefix is the prefix of all pod cgroup names
podCgroupNamePrefix = "pod#"
Member

@Random-Liu Random-Liu Jul 29, 2016


It may not be a good idea to define the same constant twice in different places. :)

Contributor


+1

@k8s-github-robot added the size/XL label and removed the size/XXL label Aug 23, 2016
@derekwaynecarr
Member

@vishh @dchen1107 - please confirm this is out of the 1.4 milestone? what's the motivation to do it in 1.4 versus wait for 1.5? who is really going to run with this enabled at this time?

}
for i := range dirInfo {
if dirInfo[i].IsDir() && strings.HasPrefix(dirInfo[i].Name(), podCgroupNamePrefix) {
podUID := strings.TrimPrefix(dirInfo[i].Name(), podCgroupNamePrefix)
Member


can this go into a utility function of some kind so it can be unit tested?
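
One way to act on this suggestion is to move the prefix check and the UID extraction out of the loop and into a small pure function. The following is only a sketch: `podUIDsFromEntries`, `dirEntry`, and `fakeEntry` are hypothetical names that do not appear in the PR, and `dirEntry` is assumed to cover just the two methods the quoted loop calls on each `dirInfo` element.

```go
package main

import "strings"

// podCgroupNamePrefix mirrors the constant quoted earlier in this review.
const podCgroupNamePrefix = "pod#"

// dirEntry is a hypothetical interface covering only the two methods the
// quoted loop uses; os.FileInfo values satisfy it.
type dirEntry interface {
	Name() string
	IsDir() bool
}

// podUIDsFromEntries is a hypothetical helper: it returns the pod UIDs
// encoded in the names of directories that carry the pod cgroup prefix.
// It touches no filesystem, so it can be unit tested with fake entries.
func podUIDsFromEntries(entries []dirEntry) []string {
	var uids []string
	for _, e := range entries {
		if !e.IsDir() || !strings.HasPrefix(e.Name(), podCgroupNamePrefix) {
			continue
		}
		uids = append(uids, strings.TrimPrefix(e.Name(), podCgroupNamePrefix))
	}
	return uids
}

// fakeEntry is a test double for dirEntry.
type fakeEntry struct {
	name string
	dir  bool
}

func (f fakeEntry) Name() string { return f.name }
func (f fakeEntry) IsDir() bool  { return f.dir }
```

A unit test can then pass a slice of `fakeEntry` values (a prefixed directory, a prefixed regular file, an unrelated directory) and check that only the prefixed directories yield UIDs.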

@vishh
Contributor

vishh commented Aug 23, 2016

As of now, this PR is not even ready since the pod level cgroups are
causing tests to flake randomly. Given the experimental nature of it, I
don't see why we would restrict it in v1.4 if this PR were to be
hypothetically ready to be merged.

}
return kl.containerRuntime.KillPod(pod, p, gracePeriodOverride)
// cache the pod cgroup Name for reducing the cpu resource limits of the pod cgroup once the pod is killed
pcm := kl.containerManager.NewPodContainerManager()
Member


doesn't this logic need to be protected by pod level cgroups being enabled?
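
A guard of the kind this comment asks about could look roughly like the sketch below. Everything in it is an assumption made for illustration: `kubeletSketch`, `podCgroupLimiter`, `ReduceCPULimits`, and `reduceLimitsAfterKill` are hypothetical names, not the kubelet's actual types or methods. Only the idea of a `CgroupsPerQOS` setting and of reducing the pod cgroup's CPU limits after the kill comes from the quoted diff.

```go
package main

// podCgroupLimiter is a hypothetical, narrowed-down stand-in for whatever
// the pod container manager exposes for lowering a pod cgroup's CPU limits.
type podCgroupLimiter interface {
	ReduceCPULimits(cgroupName string) error
}

// kubeletSketch is a hypothetical reduction of the kubelet to the two
// fields this example needs.
type kubeletSketch struct {
	cgroupsPerQOS bool // assumed to reflect the cgroups-per-qos flag
	limiter       podCgroupLimiter
}

// reduceLimitsAfterKill only touches the pod cgroup when pod-level cgroups
// are enabled; otherwise it is a no-op.
func (k *kubeletSketch) reduceLimitsAfterKill(cgroupName string) error {
	if !k.cgroupsPerQOS {
		return nil
	}
	return k.limiter.ReduceCPULimits(cgroupName)
}

// recordingLimiter is a test double that records every call it receives.
type recordingLimiter struct {
	calls []string
}

func (r *recordingLimiter) ReduceCPULimits(cgroupName string) error {
	r.calls = append(r.calls, cgroupName)
	return nil
}
```

An alternative that keeps the call site unconditional is to have the manager constructor return a no-op implementation when the flag is off; either way, the property under review is that nothing touches cgroups while the feature is disabled.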

@derekwaynecarr
Member

@vishh - i want the feature, the pr helps me in my own work, the concern was if the call-points are all vetted based on the presence of the flag. it wasn't clear to me in the current pr if that was the case, but i could have missed something in the mock managers that are returned. I was just surprised to see this still tagged 1.4 milestone since I thought it was out.

@derekwaynecarr
Member

@dubstack - I started a branch relative to this PR that integrates with systemd, will post something in next day or so for you to review. may have more comments based on that effort.

@vishh
Contributor

vishh commented Aug 23, 2016

@derekwaynecarr Got it. I tried to stabilize this PR yesterday and it required debugging individual test failures. Just an FYI!

@k8s-bot

k8s-bot commented Aug 24, 2016

GCE e2e build/test passed for commit 64c3e88.

@dubstack
Author

@derekwaynecarr @vishh This PR needs more work. The tests are flaking when limits are being applied on the pod cgroups. I haven't been able to understand what exactly is going wrong. Will have to investigate further. Besides that, some tests are failing with just the pod cgroups enabled, which should be fairly easy to resolve. @derekwaynecarr has raised some points which would need to be addressed as well.
Let's drop the v1.4 milestone.

@derekwaynecarr modified the milestones: v1.5, v1.4 Aug 24, 2016
@derekwaynecarr
Member

Just a heads up on something I found when using this PR, but have not fully tracked yet. --cgroup-root is not defaulting to / so when the experimental flag is enabled, you need to also specify cgroup-root. I will track down in the branch I am working on the reason why...

@k8s-github-robot

@dubstack PR needs rebase

@k8s-github-robot added the needs-rebase label Aug 25, 2016
@dubstack
Author

I plan to get this PR ready to review by EOD 28 Sept.

@derekwaynecarr
Member

@dubstack -- please keep me informed. this is a p0 release blocker for 1.5 and so it needs quick iteration. i may start peeling non-controversial aspects in smaller prs.

fs.StringVar(&s.SystemCgroups, "system-cgroups", s.SystemCgroups, "Optional absolute name of cgroups in which to place all non-kernel processes that are not already inside a cgroup under `/`. Empty for no container. Rolling back the flag requires a reboot. (Default: \"\").")

//fs.BoolVar(&s.CgroupsPerQOS, "cgroups-per-qos", s.CgroupsPerQOS, "Enable creation of QoS cgroup hierarchy, if true top level QoS and pod cgroups are created.")
fs.BoolVar(&s.CgroupsPerQOS, "experimental-cgroups-per-qos", s.CgroupsPerQOS, "Enable creation of QoS cgroup hierarchy, if true top level QoS and pod cgroups are created.")
Member


just call this cgroups-per-qos , it needs to be functional in 1.5, and we can state its support level separate from the flag name.
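
For illustration, the rename suggested here amounts to registering the same field under the shorter flag name. The sketch below uses the standard library `flag` package rather than the flag library the kubelet itself uses, and `kubeletFlags` and `newFlagSet` are hypothetical names introduced only for this example; the help string is copied from the quoted diff.

```go
package main

import "flag"

// kubeletFlags is a hypothetical stand-in for the options struct that the
// quoted diff refers to as s.
type kubeletFlags struct {
	CgroupsPerQOS bool
}

// newFlagSet registers the option under the name suggested in the review,
// without the experimental- prefix. The support level would then be stated
// in documentation rather than encoded in the flag name.
func newFlagSet(s *kubeletFlags) *flag.FlagSet {
	fs := flag.NewFlagSet("kubelet-sketch", flag.ContinueOnError)
	fs.BoolVar(&s.CgroupsPerQOS, "cgroups-per-qos", s.CgroupsPerQOS,
		"Enable creation of QoS cgroup hierarchy, if true top level QoS and pod cgroups are created.")
	return fs
}
```

Parsing `--cgroups-per-qos=true` with the returned flag set sets `CgroupsPerQOS` on the struct; leaving the flag out keeps whatever default the struct already held.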

@derekwaynecarr
Member

@dubstack -- i am going to carve out parts of this PR into smaller PRs so we can start getting things merged. i will cc you on those prs for awareness.

@derekwaynecarr
Member

ok, i have started cleaning up the delete path in a separate PR.

there are a number of issues where container manager details bled up into the kubelet that made this difficult with other cgroup drivers.

@derekwaynecarr
Member

I am closing this PR in favor of the set of PRs I will open shortly. There are a lot of assumptions in this PR that are wrong in the delete path once you plug in a cgroup driver.

xingzhou pushed a commit to xingzhou/kubernetes that referenced this pull request Dec 15, 2016
Automatic merge from submit-queue

Unblock iterative development on pod-level cgroups

In order to allow forward progress on this feature, it takes the commits from kubernetes#28017 kubernetes#29049 and then it globally disables the flag that allows these features to be exercised in the kubelet. The flag can be re-added to the kubelet when it's actually ready.

/cc @vishh @dubstack @kubernetes/rh-cluster-infra
Labels
area/kubelet, needs-rebase, release-note, sig/node, size/XL

8 participants