i/o timeout, when trying to communicate with the Kubernetes API server #7559

@Archisman-Mridha

Description

What happened:

We were upgrading one of our KOps-managed clusters from Kubernetes v1.31.11 to v1.32.8. Mid-upgrade, one of the CoreDNS pods (managed by the KOps CoreDNS addon) got scheduled onto an already-upgraded Kubernetes worker node and started erroring out with the following:

[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.Service: Get "https://100.64.0.1:443/api/v1/services?limit=500&resourceVersion=0": dial tcp 100.64.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.EndpointSlice: Get "https://100.64.0.1:443/apis/discovery.k8s.io/v1/endpointslices?limit=500&resourceVersion=0": dial tcp 100.64.0.1:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
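
As a side note, something like the following minimal client-go sketch (illustrative only; it assumes the pod runs with a service account that is allowed to list Services, the same way CoreDNS's kubernetes plugin does) could be run from a debug pod on the affected node to check whether the same Service list call times out outside of CoreDNS:

    package main

    import (
        "context"
        "fmt"
        "log"
        "time"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/rest"
    )

    func main() {
        // Use the in-cluster service account credentials, like CoreDNS does.
        cfg, err := rest.InClusterConfig()
        if err != nil {
            log.Fatalf("in-cluster config: %v", err)
        }
        clientset, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            log.Fatalf("building clientset: %v", err)
        }

        // Mirror the reflector's initial list call, but with a short timeout
        // so a blackholed route fails fast instead of hanging.
        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        defer cancel()

        svcs, err := clientset.CoreV1().Services("").List(ctx, metav1.ListOptions{Limit: 500})
        if err != nil {
            log.Fatalf("listing services against %s: %v", cfg.Host, err)
        }
        fmt.Printf("listed %d services via %s\n", len(svcs.Items), cfg.Host)
    }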

What you expected to happen:

I expected CoreDNS not to throw that error :).

How to reproduce it (as minimally and precisely as possible):

Spin up a Kubernetes v1.31.11 cluster using KOps v1.31.0, and then try upgrading the cluster to Kubernetes v1.32.8 using KOps v1.32.0.

Anything else we need to know?:

The surprising part is that if I uninstalled the KOps CoreDNS addon and installed the upstream CoreDNS Helm chart instead, it ran fine.

The application version for both is the same: v1.11.3. However, the CoreDNS container image of the KOps CoreDNS addon comes from registry.k8s.io, whereas that of upstream CoreDNS comes from docker.io.
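
Since the application version is identical and only the image source and Corefile details differ, a plain TCP dial against the kubernetes Service VIP from both pods (or from a debug pod on the same node) might help rule a network-path problem in or out. A rough sketch, with the address taken from the logs above (adjust it for your cluster):

    package main

    import (
        "fmt"
        "net"
        "os"
        "time"
    )

    func main() {
        // 100.64.0.1:443 is the kubernetes Service VIP seen in the CoreDNS logs;
        // pass a different address as the first argument if yours differs.
        addr := "100.64.0.1:443"
        if len(os.Args) > 1 {
            addr = os.Args[1]
        }
        conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
        if err != nil {
            fmt.Printf("dial %s failed: %v\n", addr, err)
            os.Exit(1)
        }
        conn.Close()
        fmt.Printf("dial %s succeeded\n", addr)
    }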

Environment:

  • the version of CoreDNS: v1.11.3
  • Corefile:

This is the KOps CoreDNS addon's Corefile:

    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local. in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }

And this is the upstream CoreDNS Helm chart's:

    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus 0.0.0.0:9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
  • OS (e.g: cat /etc/os-release): Ubuntu 24.04
