e2e flake following resize test: Cannot start container <id>: no available ip addresses on network #8758

@justinsb

Running e2e tests on GCE, I saw a failure on "should provide DNS for the cluster". The test reported this error:

INFO: event for dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf: {scheduler } scheduled: Successfully assigned dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf to e2e-test-justinsb-minion-gbm3
INFO: event for dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf: {kubelet e2e-test-justinsb-minion-gbm3} pulled: Successfully pulled image "gcr.io/google_containers/pause:0.8.0"
INFO: event for dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf: {kubelet e2e-test-justinsb-minion-gbm3} created: Created with docker id 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320
INFO: event for dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf: {kubelet e2e-test-justinsb-minion-gbm3} failed: Failed to start with docker id 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320 with error: API error (500): Cannot start container 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320: no available ip addresses on network

INFO: event for dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf: {kubelet e2e-test-justinsb-minion-gbm3} failedSync: Error syncing pod, skipping: API error (500): Cannot start container 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320: no available ip addresses on network

...

This happened when the DNS test ran immediately after "ResizeNodes / should be able to delete nodes."

I SSHed into the minion and saw that the kubelet had restarted, but /var/log/kubelet.log showed these errors around that time:

I0524 15:40:54.847548    3009 manager.go:230] Starting recovery of all containers
I0524 15:40:54.851445    3009 manager.go:235] Recovery completed
I0524 15:40:54.853060    3009 status_manager.go:56] Starting to sync pod status with apiserver
I0524 15:40:54.853078    3009 kubelet.go:1596] Starting kubelet main sync loop.
E0524 15:40:54.859528    3009 kubelet.go:1518] error getting node: node e2e-test-justinsb-minion-gbm3 not found
E0524 15:40:54.866972    3009 kubelet.go:2089] Cannot get host IP: cannot get node: node e2e-test-justinsb-minion-gbm3 not found
I0524 15:40:54.866995    3009 manager.go:1347] Need to restart pod infra container for "fluentd-elasticsearch-e2e-test-justinsb-minion-gbm3_default" because it is not found
I0524 15:40:54.868450    3009 provider.go:91] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
I0524 15:40:54.868585    3009 provider.go:91] Refreshing cache for provider: *gcp_credentials.dockerConfigKeyProvider
I0524 15:40:54.869219    3009 config.go:119] body of failing http response: &{0xc208391140 {0 0} false <nil> 0x5ca9a0 0x5ca930}
E0524 15:40:54.869256    3009 metadata.go:109] while reading 'google-dockercfg' metadata: http status code: 404 while fetching url http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg
I0524 15:40:54.869274    3009 provider.go:91] Refreshing cache for provider: *gcp_credentials.dockerConfigUrlKeyProvider
I0524 15:40:54.871380    3009 config.go:119] body of failing http response: &{0xc2083913c0 {0 0} false <nil> 0x5ca9a0 0x5ca930}
E0524 15:40:54.871408    3009 metadata.go:121] while reading 'google-dockercfg-url' metadata: http status code: 404 while fetching url http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg-url
I0524 15:40:55.146249    3009 kubelet.go:1779] Recording NodeReady event message for node e2e-test-justinsb-minion-gbm3
I0524 15:40:55.146291    3009 kubelet.go:731] Attempting to register node e2e-test-justinsb-minion-gbm3
I0524 15:40:55.146353    3009 event.go:203] Event(api.ObjectReference{Kind:"Node", Namespace:"", Name:"e2e-test-justinsb-minion-gbm3", UID:"e2e-test-justinsb-minion-gbm3", APIVersion:"", ResourceVersion:"", FieldPath:""}): reason: 'NodeReady' Node e2e-test-justinsb-minion-gbm3 status is now: NodeReady
I0524 15:40:55.175255    3009 kubelet.go:751] Successfully registered node e2e-test-justinsb-minion-gbm3
I0524 15:40:55.175270    3009 kubelet.go:764] Starting node status updates
I0524 15:40:55.516960    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"fluentd-elasticsearch-e2e-test-justinsb-minion-gbm3", UID:"66cf140c101765011818758029a443b7", APIVersion:"v1beta3", ResourceVersion:"", FieldPath:"implicitly required container POD"}): reason: 'pulled' Successfully pulled image "gcr.io/google_containers/pause:0.8.0"
I0524 15:40:55.602978    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"fluentd-elasticsearch-e2e-test-justinsb-minion-gbm3", UID:"66cf140c101765011818758029a443b7", APIVersion:"v1beta3", ResourceVersion:"", FieldPath:"implicitly required container POD"}): reason: 'created' Created with docker id 434e69d5640632250ec29a1565daa2a3740664eaae36f339907c2b2038cc1fcc
I0524 15:40:55.750577    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"fluentd-elasticsearch-e2e-test-justinsb-minion-gbm3", UID:"66cf140c101765011818758029a443b7", APIVersion:"v1beta3", ResourceVersion:"", FieldPath:"implicitly required container POD"}): reason: 'started' Started with docker id 434e69d5640632250ec29a1565daa2a3740664eaae36f339907c2b2038cc1fcc
I0524 15:41:01.785023    3009 server.go:588] POST /stats/container/: (3.123839ms) 0 [[Go 1.1 package http] 10.245.1.5:50852]
I0524 15:41:02.206085    3009 manager.go:1347] Need to restart pod infra container for "dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf_e2e-tests-dns-227bd4b9-c484-4584-8875-512747f44b24" because it is not found
I0524 15:41:02.208733    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"e2e-tests-dns-227bd4b9-c484-4584-8875-512747f44b24", Name:"dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf", UID:"481fc3ad-022b-11e5-b444-42010af0abb3", APIVersion:"v1beta3", ResourceVersion:"11857", FieldPath:"implicitly required container POD"}): reason: 'pulled' Successfully pulled image "gcr.io/google_containers/pause:0.8.0"
I0524 15:41:02.303923    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"e2e-tests-dns-227bd4b9-c484-4584-8875-512747f44b24", Name:"dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf", UID:"481fc3ad-022b-11e5-b444-42010af0abb3", APIVersion:"v1beta3", ResourceVersion:"11857", FieldPath:"implicitly required container POD"}): reason: 'created' Created with docker id 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320
E0524 15:41:02.343079    3009 manager.go:1515] Failed to create pod infra container: API error (500): Cannot start container 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320: no available ip addresses on network
; Skipping pod "dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf_e2e-tests-dns-227bd4b9-c484-4584-8875-512747f44b24"
I0524 15:41:02.343306    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"e2e-tests-dns-227bd4b9-c484-4584-8875-512747f44b24", Name:"dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf", UID:"481fc3ad-022b-11e5-b444-42010af0abb3", APIVersion:"v1beta3", ResourceVersion:"11857", FieldPath:"implicitly required container POD"}): reason: 'failed' Failed to start with docker id 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320 with error: API error (500): Cannot start container 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320: no available ip addresses on network
E0524 15:41:02.347866    3009 pod_workers.go:108] Error syncing pod 481fc3ad-022b-11e5-b444-42010af0abb3, skipping: API error (500): Cannot start container 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320: no available ip addresses on network
I0524 15:41:02.347955    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"e2e-tests-dns-227bd4b9-c484-4584-8875-512747f44b24", Name:"dns-test-4c6fa09e-022b-11e5-aab4-00224d56fdcf", UID:"481fc3ad-022b-11e5-b444-42010af0abb3", APIVersion:"v1beta3", ResourceVersion:"11857", FieldPath:""}): reason: 'failedSync' Error syncing pod, skipping: API error (500): Cannot start container 7a4b23f8a0ca05b35c284a7e606f5c378792567b8ffcac756bf3ff67fc895320: no available ip addresses on network
I0524 15:41:05.262557    3009 container_bridge.go:32] Attempting to recreate cbr0 with address range: 10.245.0.1/24
I0524 15:41:05.337978    3009 container_bridge.go:62] Recreated cbr0 and restarted docker
W0524 15:41:11.461827    3009 manager.go:1527] Failed to pull image "gcr.io/google_containers/fluentd-elasticsearch:1.5" from pod "fluentd-elasticsearch-e2e-test-justinsb-minion-gbm3_default" and container "fluentd-elasticsearch": [unexpected EOF, dial unix /var/run/docker.sock: no such file or directory]
I0524 15:41:11.461855    3009 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"fluentd-elasticsearch-e2e-test-justinsb-minion-gbm3", UID:"66cf140c101765011818758029a443b7", APIVersion:"v1beta3", ResourceVersion:"", FieldPath:"spec.containers{fluentd-elasticsearch}"}): reason: 'failed' Failed to pull image "gcr.io/google_containers/fluentd-elasticsearch:1.5": [unexpected EOF, dial unix /var/run/docker.sock: no such file or directory]

Things I think are suspicious:

  • 404 while fetching url http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg
  • It does somehow figure out that it should recreate cbr0 with address range: 10.245.0.1/24, but it then spews a lot of errors while Docker restarts (unable to reach docker.sock)
  • I feel we probably shouldn't start Docker at all until it has a valid cbr0

Labels

  • kind/bug: Categorizes issue or PR as related to a bug.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
