kubelet flushes KUBE-HOSTPORTS and KUBE-MARK-MASQ when pods start #32415

@thockin

Observed: hitting a service nodePort fails intermittently, while the service cluster IP works 100% of the time.

Debug: tcpdump shows the SYN being sent, but no SYN-ACK returned. I noticed that in the error case the source IP was 127.0.0.1 - clearly wrong. We proved that the KUBE-MARK-MASQ chain was being flushed, so the packets were never marked for masquerade and were not getting SNAT'ed. We proved it was kubelet that was flushing, and kube-proxy that eventually restored it (yay for rectification loops!).
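For anyone trying to confirm the same symptom, here is a rough sketch of a check for an emptied chain. It assumes you have captured the text output of `iptables-save -t nat` on the node; the helper name `count_rules` and the sample dumps are made up for illustration, not taken from kubelet or kube-proxy.

```python
from typing import Optional


def count_rules(nat_dump: str, chain: str) -> Optional[int]:
    """Count '-A <chain>' rules in iptables-save style text.

    Returns None when the chain is never declared in the dump.
    """
    declared = False
    count = 0
    for line in nat_dump.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == ":" + chain:
            # Chain declaration line, e.g. ":KUBE-MARK-MASQ - [0:0]"
            declared = True
        elif fields[0] == "-A" and len(fields) > 1 and fields[1] == chain:
            count += 1
    return count if declared else None


# Illustrative dumps only; real output has many more chains and rules.
HEALTHY = """*nat
:KUBE-MARK-MASQ - [0:0]
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
COMMIT
"""

FLUSHED = """*nat
:KUBE-MARK-MASQ - [0:0]
COMMIT
"""

if __name__ == "__main__":
    for name, dump in (("healthy", HEALTHY), ("flushed", FLUSHED)):
        print(name, count_rules(dump, "KUBE-MARK-MASQ"))
```

A count of 0 for a chain that is still declared matches the flushed state described above. Polling this while pods start is one way to catch the window before kube-proxy repairs the chain.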

We found that the hostport code in kubenet erroneously flushes those chains when starting a pod. After that, it can take up to several minutes for kube-proxy to hit its own sync loop and fix the problem.

The fix is easy - don't flush those chains. @freehan is working on the fix right now.
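To illustrate the idea (this is not the actual patch), here is a minimal sketch of building an `iptables-restore` payload that only declares chains the caller owns. It assumes the payload is applied with `iptables-restore --noflush`, where, as I understand it, a `:CHAIN` declaration line resets that chain, so declaring a chain owned by another component wipes its rules. The chain name `EXAMPLE-HOSTPORT-CHAIN` and both helpers are hypothetical.

```python
from typing import Iterable, List, Set


def build_restore_input(owned_chains: Iterable[str], rules: List[str]) -> str:
    """Build nat-table restore text that declares only chains we own."""
    lines = ["*nat"]
    for chain in owned_chains:
        # Declaring a chain here is what resets it, so list only our own.
        lines.append(f":{chain} - [0:0]")
    lines.extend(rules)
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"


def declared_chains(restore_input: str) -> Set[str]:
    """Return the chain names that the restore text declares."""
    return {
        line.split()[0][1:]
        for line in restore_input.splitlines()
        if line.startswith(":")
    }


# Hypothetical example: our rule jumps to KUBE-MARK-MASQ but never declares it.
payload = build_restore_input(
    ["EXAMPLE-HOSTPORT-CHAIN"],
    ["-A EXAMPLE-HOSTPORT-CHAIN -p tcp --dport 8080 -j KUBE-MARK-MASQ"],
)

if __name__ == "__main__":
    print(payload, end="")
    print(sorted(declared_chains(payload)))
```

The point of the split is that a rule may still jump to a chain maintained by someone else; it just must not redeclare it. A check like `declared_chains` could serve as a guard in a unit test.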

@fabioy @timstclair for 1.3.x
@pwittrock for 1.4.x
@spxtr for reporting it concretely enough to repro

Labels

area/kubelet
priority/critical-urgent (Highest priority. Must be actively worked on as someone's top priority right now.)
