Endpoints with TolerateUnready annotation, should list Pods in state terminating #37093

Merged
k8s-github-robot merged 1 commit into kubernetes:master from simonswine:fix-tolerate-unready-endpoints-pods-terminating
Jan 3, 2017
Conversation

@simonswine
Contributor

@simonswine simonswine commented Nov 18, 2016

What this PR does / why we need it:

We use preStop lifecycle hooks to gracefully remove a node from a cluster. The hook is potentially long-running, and once it has fired, DNS resolution for the soon-to-be-stopped Pod fails, which causes the hook itself to fail.

Special notes for your reviewer:

Would be great to backport that to 1.4, 1.3

Release note:

Endpoints that tolerate unready Pods now also list Pods in state Terminating

@bprashanth



@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-label-needed labels Nov 18, 2016
@k8s-ci-robot
Contributor

Jenkins kops AWS e2e failed for commit c44e81f3a5e3bd7da9358790efeea8bc3d69168b. Full PR test history.

The magic incantation to run this job again is @k8s-bot kops aws e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins GCI GCE e2e failed for commit c44e81f3a5e3bd7da9358790efeea8bc3d69168b. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@simonswine simonswine force-pushed the fix-tolerate-unready-endpoints-pods-terminating branch from c44e81f to d282f47 Compare November 18, 2016 16:03
@k8s-ci-robot
Contributor

Jenkins GCE e2e failed for commit c44e81f3a5e3bd7da9358790efeea8bc3d69168b. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins GCE etcd3 e2e failed for commit c44e81f3a5e3bd7da9358790efeea8bc3d69168b. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@bprashanth
Contributor

I'm fine doing this, the annotation effectively means: keep all service entries and dns records for the pod around for as long as you possibly can (ignore readiness, ignore deletion grace etc). I don't think it can make 1.5 though, since we're well past the feature freeze date and this isn't a stabilization fix. Maybe we can just fold this behavior into #25283? We need to graduate the tolerate-unready somehow, anyway.

@simonswine
Contributor Author

@bprashanth: tbh I just expected that behaviour when I wrote my preStop hooks, and I was pretty surprised when I found out DNS is gone and this is the reason for the balance to fail. So I would see it more as a fix than a bug.

Anyhow, I won't be around much for the next few weeks. @mattbates, can you have a look at this PR?

@mattbates

@bprashanth just following up here. Re @simonswine's comment, can we progress this PR as a fix and merge, cherry-picking into 1.3 and 1.4 too ideally?

Contributor

@bprashanth bprashanth left a comment

I'm fine with the PR; it's low risk, though I don't know if it's going to make 1.5 because of the timing. It's an alpha feature that we will very likely remodel before beta (#25283), so I don't think we should be cherry-picking it into older releases.

The original annotation was to tolerate "unreadiness". This PR bends the definition of "unreadiness" from "ignore failing readiness probes" to "ignore failing readiness probes AND deletion timestamps". A slightly more correct way to do this would be to "ignore readiness AND deletion timestamps on not ready pods", but given that the feature is going to change soon, I'm not sure the distinction matters.

 		continue
 	}
-	if pod.DeletionTimestamp != nil {
+	if !tolerateUnreadyEndpoints && pod.DeletionTimestamp != nil {

please augment the comment above the annotation definition with:
// Endpoints of Services bearing this annotation retain their DNS
// records and continue receiving traffic for the Service from the moment
// the kubelet starts all containers in the pod and marks it "Running", till the
// kubelet stops all containers and deletes the pod from the apiserver.
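For readers unfamiliar with how such an annotation turns into the `tolerateUnreadyEndpoints` boolean, here is a rough Go sketch. The annotation key is an assumption (the thread only calls it "TolerateUnready"; the alpha key is believed to be `service.alpha.kubernetes.io/tolerate-unready-endpoints`), and `toleratesUnready` is a hypothetical helper, not the controller's actual function.

```go
package main

import (
	"fmt"
	"strconv"
)

// Assumed annotation key; it is not spelled out anywhere in this thread.
const tolerateUnreadyAnnotation = "service.alpha.kubernetes.io/tolerate-unready-endpoints"

// toleratesUnready reports whether a Service's annotations opt in to
// tolerating unready endpoints. Treating missing or unparsable values as
// false is a choice made for this sketch.
func toleratesUnready(annotations map[string]string) bool {
	v, ok := annotations[tolerateUnreadyAnnotation]
	if !ok {
		return false
	}
	b, err := strconv.ParseBool(v)
	if err != nil {
		return false
	}
	return b
}

func main() {
	optedIn := map[string]string{tolerateUnreadyAnnotation: "true"}
	fmt.Println(toleratesUnready(optedIn))
	fmt.Println(toleratesUnready(map[string]string{}))
}
```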

@k8s-github-robot k8s-github-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Dec 10, 2016
@simonswine simonswine force-pushed the fix-tolerate-unready-endpoints-pods-terminating branch from d282f47 to e68f748 Compare December 20, 2016 17:53
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 20, 2016
@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 20, 2016
@smarterclayton
Contributor

Please add a test and then this LGTM. You can add it to test/e2e/services.go under It("should create endpoints for unready pods"). There is an image, network-tester, that can arbitrarily delay shutdown; it takes a flag -delay-shutdown, which is the number of seconds to hold before graceful shutdown.

@simonswine simonswine force-pushed the fix-tolerate-unready-endpoints-pods-terminating branch from e68f748 to a92a0d1 Compare December 28, 2016 16:36
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 28, 2016
…n state terminating

* Otherwise it prevents long running task in a preStop hook to succeed,
that require DNS resolution
@simonswine simonswine force-pushed the fix-tolerate-unready-endpoints-pods-terminating branch from a92a0d1 to b44de1e Compare January 3, 2017 13:00
@simonswine
Contributor Author

@smarterclayton thanks for your input on how to prevent test flakes. I've implemented it as you suggested by modifying the annotation of the service and waiting for the test wget to time out.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 3, 2017
@smarterclayton smarterclayton added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 3, 2017
@smarterclayton
Contributor

LGTM thanks

@deads2k
Contributor

deads2k commented Jan 3, 2017

@k8s-bot test this

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 39092, 39126, 37380, 37093, 39237)

@smarterclayton
Contributor

After reflecting on this a bit more, I think it should be possible for a consumer to request that terminating pods continue to be in the load balancer rotation independent of the annotation. Many applications that can control their shutdown (like one with a very long graceful shutdown period) want to keep taking traffic. I think that's orthogonal to the "ready immediately" setting. It should be possible in shutdown to control when traffic is diverted away, and that is not automatically at the very end of termination, nor at the very beginning. I'll spawn an issue, but it may be that tolerate unready becomes a policy (EndpointInclusionPolicy) or a set of orthogonal flags.
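To make the "set of orthogonal flags" idea concrete, here is one purely hypothetical shape it could take in Go. Nothing here exists in Kubernetes: the `endpointInclusion` type, its fields, and `podState` are invented for illustration, and the comment above does not commit to any particular design.

```go
package main

import "fmt"

// endpointInclusion is a hypothetical pair of independent switches.
// Neither the type nor its fields exist in Kubernetes.
type endpointInclusion struct {
	PublishNotReady    bool // include pods whose readiness probe is failing
	PublishTerminating bool // include pods that have a deletion timestamp
}

// podState is a hypothetical summary of the two pod properties discussed.
type podState struct {
	Ready       bool
	Terminating bool
}

// includes reports whether a pod in the given state would be listed.
// Each flag is consulted on its own, which is what makes them orthogonal.
func (e endpointInclusion) includes(p podState) bool {
	if p.Terminating && !e.PublishTerminating {
		return false
	}
	if !p.Ready && !e.PublishNotReady {
		return false
	}
	return true
}

func main() {
	// A consumer that wants traffic during a long graceful shutdown, but
	// still wants readiness probes respected.
	drainAware := endpointInclusion{PublishTerminating: true}
	fmt.Println(drainAware.includes(podState{Ready: true, Terminating: true}))
	fmt.Println(drainAware.includes(podState{Ready: false, Terminating: false}))
}
```

Under this sketch, the behaviour merged in this PR would correspond to setting both flags together, while the case described above (keep serving during shutdown without "ready immediately") sets only the terminating flag.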

@simonswine
Contributor Author

@smarterclayton: I think this sounds like the unready handling should have more states than just true and false before being promoted to a spec field in the Service object. Is there an issue to track this effort?

And another thing: do you think this could be cherry-picked into 1.5, or is this not seen as a bugfix as such? If so, I think I can't initiate this, as I am not able to add and remove labels.

@smarterclayton
Contributor

I did not open an issue yet.

It's reasonable to backport, tagging.

@k8s-cherrypick-bot

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

@smarterclayton smarterclayton added this to the v1.5 milestone Jan 17, 2017
@saad-ali saad-ali added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jan 20, 2017
k8s-github-robot pushed a commit that referenced this pull request Jan 20, 2017
…7093-upstream-release-1.5

Automatic merge from submit-queue

Automated cherry pick of #37093

Cherry pick of #37093 on release-1.5.

#37093: Fix: With TolerateUnready set, endpoints are still listed for
@k8s-cherrypick-bot

Commit found in the "release-1.5" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

// create a headless Service just for the StatefulSet, and clients shouldn't
// be using this Service for anything so unready endpoints don't matter.
// Endpoints of these Services retain their DNS records and continue
// receiving traffic for the Service from the moment the kubelet starts all

To make self-hosted etcd reliable (an important part of the self-hosted k8s effort), we want DNS to be resolvable from the pod initialization phase (init container) onward. The current implementation does not prevent us from doing that. Basically, I hope that DNS can be resolved from the time the pod gets its IP until the Pod terminates.

Labels

cherry-pick-approved - Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
lgtm - "Looks good to me", indicates that a PR is ready to be merged.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
size/L - Denotes a PR that changes 100-499 lines, ignoring generated files.
