Allow nodename to be != hostname, use AWS instance ID on AWS #9728

justinsb · 2015-06-12T19:12:35Z

We currently assume that a node's name is the hostname; that is natural on GCE where the name of an instance becomes the hostname, and where you refer to instances by name. But other clouds (notably AWS) typically have synthetic instance names, and it makes more sense on those not to assume that nodename == hostname.

This has been a long time coming because we assumed in a lot of places that the node name was resolvable, but those have all been fixed now (I believe!)

The first commit here should be the most controversial: it differentiates between a nodename and a hostname.

The second commit allows a cloud provider to specify a different name for its own node, though nobody actually uses this.

The third commit wires up the previous commit to work with AWS; so AWS nodes use the instance id for the name. This also means we don't need to query for the PublicDnsName any more, but can just retrieve instances by id.

The fourth commit just adds AWS to the list of clouds where the master kubelet does not register as a node.

k8s-bot · 2015-06-12T22:12:53Z

GCE e2e build/test passed for commit e0d5b6a8a5fc51bfb7c814b9e8453e072b9f0664.

justinsb · 2015-06-12T22:38:15Z

I broke out the fourth commit into #9747; we should be doing that anyway, and I suspect that should merge faster than this.

justinsb · 2015-06-16T03:56:08Z

@dchen1107 what do you think? This should be low-impact/no-impact other than to AWS. I would really like to get this into 1.0, for two reasons:

1. I would like the node names on AWS to be the AWS instance ids, rather than the private DNS names (which are fragile and aren't particularly natural identifiers on AWS). I think we'll end up doing this eventually, and it will make upgrading a running cluster really hard if nodes change names.
2. It unblocks #9720 (Refactor Routes, and dynamically configure minion CIDRs on AWS), which enables dynamic CIDRs on AWS, which in turn enables using the equivalent of managed-instance-groups.

If it's just too late to get this into 1.0, I can probably rework #9720 to work in terms of the existing instance names, but I was pretty careful to make it really safe for other clouds (and I think it might even make the code a little clearer to draw a distinction between node name and host name)

dchen1107 · 2015-06-16T05:43:49Z

@justinsb I like what you proposed here distinguishing nodename from hostname. I am reviewing it now.

dchen1107 · 2015-06-16T05:46:13Z

ok to test

Also this one breaks contrib/mesos. cc/ @jdef

dchen1107 · 2015-06-16T05:50:06Z

cmd/kubelet/app/server.go

Why do we still want to keep hostname? Why not replace hostname with nodename completely, but still keep the hostname_override flag and mark it as deprecated? Then introduce another flag called nodename_override and, if it is not specified, make it equal to hostname_override?

I think you are breaking ppl using hostname_override now. Should we treat the value of flag hostname_override as node_name here?

I don't think I am breaking hostname_override, except maybe on AWS, where if it is broken I want to break it :-)

The logic is this: we build Hostname, just as before: hostname from the OS, but can be overridden by hostname_override. If we have a CloudProvider, we ask it to determine the node-name, but all the implementations apart from AWS simply echo back the maybe-overridden hostname.
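The logic described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `NodeNamer` interface, the two provider types, and `resolveNames` are hypothetical names introduced here; only the `CurrentNodeName(hostname)` shape and the hostname-override behaviour come from the discussion.

```go
package main

import "fmt"

// NodeNamer is a hypothetical, trimmed-down stand-in for the part of the
// cloud provider interface discussed in this thread.
type NodeNamer interface {
	// CurrentNodeName maps the (possibly overridden) hostname to a node name.
	CurrentNodeName(hostname string) (string, error)
}

// echoProvider models the non-AWS providers: the node name is the hostname.
type echoProvider struct{}

func (echoProvider) CurrentNodeName(hostname string) (string, error) {
	return hostname, nil
}

// instanceIDProvider models the AWS behaviour: the hostname argument is
// ignored and the instance ID is returned instead.
type instanceIDProvider struct{ instanceID string }

func (p instanceIDProvider) CurrentNodeName(hostname string) (string, error) {
	return p.instanceID, nil
}

// resolveNames returns (hostname, nodeName). The hostname comes from the OS
// unless an override is given; the node name defaults to the hostname and is
// only changed when a cloud provider is configured.
func resolveNames(osHostname, hostnameOverride string, cloud NodeNamer) (string, string, error) {
	hostname := osHostname
	if hostnameOverride != "" {
		hostname = hostnameOverride
	}
	nodeName := hostname
	if cloud != nil {
		name, err := cloud.CurrentNodeName(hostname)
		if err != nil {
			return "", "", fmt.Errorf("determining node name: %v", err)
		}
		nodeName = name
	}
	return hostname, nodeName, nil
}

func main() {
	h, n, _ := resolveNames("ip-10-0-0-12", "", instanceIDProvider{instanceID: "i-0123abcd"})
	fmt.Println(h, n)
	h, n, _ = resolveNames("worker-1", "node-a", echoProvider{})
	fmt.Println(h, n)
}
```

In this sketch the hostname is still available separately (for example for the self-signed cert subject), while only the node name varies by provider.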

That is why we pass hostname in to CurrentNodeName. It is admittedly a weird argument to pass in - it feels a little out of place. Maybe it would be cleaner if we didn't pass in anything, but only changed the nodeName if CurrentNodeName returned non-empty? But then that makes the function even more specialized. I think post V1 we will want a function which returns the CurrentNode() (see https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/kubelet/kubelet.go#L721-L735)

hostname (& hostname_override) do determine the SSL self-signed cert subject, so I think we have to keep this.

I do agree that we will likely end up with a nodename_override (or just nodename?) but I'm hoping that maybe we can avoid it - that it will only be needed on bare-metal, and on bare-metal the hostname is the logical identifier for nodes.

This is possibly optimistic, but I'd rather avoid introducing another flag until we have someone that needs it!

If you'd prefer, I can create nodename_override, but I don't think anyone will have to use it today.

[EDIT] ~~in contrib/mesos/pkg/executor/service/service.go set kcfg.NodeName = kcfg.Hostname.~~
ignore prior statement

also, some unit tests may need updating in contrib/mesos/pkg/executor/executor_test.go

@justinsb I agree we shouldn't need nodename_override. The only reason I suggested it above is that I thought we could completely remove the hostname* terms from our source base, since I think they are very misleading. But you are right that today we need hostname for the SSL cert.

dchen1107 · 2015-06-16T05:57:13Z

I reviewed the first two commits, which relate to the kubelet and the GCE cloud provider, and they look pretty sane to me except for the two comments above.

justinsb · 2015-06-16T12:33:54Z

@dchen1107 posted a slightly rambling reply, but I think we need hostname for the SSL cert, and nodename will be == hostname (as today) everywhere except AWS, where I want it to be different. I agree that nodename_override will probably happen eventually, but am hoping we don't need it.

@jdef I'm not particularly familiar with the mesos code, but if you point me in the right direction of what code this might impact I can try to fix it / avoid problems in future.

jdef · 2015-06-16T13:01:20Z

cmd/kubelet/app/server.go

need similar change in contrib/mesos/pkg/executor/service/service.go call to NewMainKubelet

jdef · 2015-06-16T13:10:26Z

you can build/test the contrib/mesos code if you export KUBERNETES_CONTRIB=mesos prior to running make or hack/build-go.sh

jdef · 2015-06-16T13:22:14Z

I think there's probably room for improvement w/ respect to the cloudprovider APIs. The APIs are used by apiserver, kube-controller-manager and now kubelet. Things seem a bit over-exposed: cloudprovider is a pretty broad grab-bag. Adding CurrentNodeName further highlights this "kitchen sink" approach. It's probably too late for severely refactoring these but I think it's worthy of consideration post-v1.

For example, perhaps it makes sense to more cleanly separate the APIs that expose "everything that I might want to do-with/know-about the cloud I'm on" from "things I want to know about the node I'm on".
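One way to picture that separation is two narrow interfaces, so that each component depends only on what it needs. This is a rough sketch only: `CloudInventory`, `LocalNode`, `fakeCloud`, and the two helper functions are hypothetical names for illustration, not existing cloudprovider APIs.

```go
package main

import "fmt"

// CloudInventory is a hypothetical "what do I know about the cloud I'm on"
// interface, the kind of thing a controller might consume.
type CloudInventory interface {
	ListInstances() ([]string, error)
}

// LocalNode is a hypothetical "what do I know about the node I'm on"
// interface, the only part a kubelet-like component would need.
type LocalNode interface {
	CurrentNodeName(hostname string) (string, error)
}

// fakeCloud implements both interfaces, as a single provider might.
type fakeCloud struct {
	instances []string
	self      string
}

func (c fakeCloud) ListInstances() ([]string, error) {
	return c.instances, nil
}

func (c fakeCloud) CurrentNodeName(hostname string) (string, error) {
	if c.self != "" {
		return c.self, nil
	}
	return hostname, nil
}

// kubeletNodeName depends only on the narrow, node-local interface.
func kubeletNodeName(n LocalNode, hostname string) (string, error) {
	return n.CurrentNodeName(hostname)
}

// controllerInstanceCount depends only on the cluster-wide interface.
func controllerInstanceCount(inv CloudInventory) (int, error) {
	instances, err := inv.ListInstances()
	if err != nil {
		return 0, err
	}
	return len(instances), nil
}

func main() {
	cloud := fakeCloud{instances: []string{"i-a", "i-b"}, self: "i-a"}
	name, _ := kubeletNodeName(cloud, "ip-10-0-0-1")
	count, _ := controllerInstanceCount(cloud)
	fmt.Println(name, count)
}
```

A provider can still implement both interfaces; the difference is that callers declare which half they actually rely on.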

k8s-bot · 2015-06-16T22:38:51Z

GCE e2e build/test failed for commit 4058adf3e170f64c76f497304e8a11e2ef36afe0.

dchen1107 · 2015-06-16T22:52:42Z

cc/ @roberthbailey

thockin · 2015-06-17T02:35:07Z

The implication of not doing this PR-set right now is that the v1.0 release will use "bad" names for nodes, but will anything actually not work?

I'm afraid we will have to cross the bridge of "how to upgrade when the upgrade uses different names for nodes" anyway, which takes some of the urgency off this largish patch set. Convince me this is worth the risk at this point?

justinsb · 2015-06-17T04:01:39Z

I think the risk is low for non-AWS clouds: I've tried to separate it out and make it (I hope) clearly a no-op for anything non-AWS. Though there are two variables (nodename and hostname), they are equal everywhere but on AWS.

For AWS, there is obviously more risk, but the upside is that we get to use the natural identifier (for AWS users, who expect the instance ID), and that we dump matching-by-primary-dns-name. I believe I have made that work, but it is matching on a secondary index without the same uniqueness constraints as the instance id has.
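The difference between the two lookups can be illustrated with a small in-memory sketch. Everything here is hypothetical (the `instance` type, the sample data, and both helpers); it does not call the AWS API, and the duplicate DNS name in the test data exists only to show what a lookup on a non-unique key has to handle.

```go
package main

import (
	"errors"
	"fmt"
)

// instance is a hypothetical record holding the two identifiers discussed.
type instance struct {
	ID         string
	PrivateDNS string
}

var (
	errNotFound  = errors.New("instance not found")
	errAmbiguous = errors.New("multiple instances match")
)

// findByID looks up by the primary identifier, so the first match is the
// only possible match.
func findByID(instances []instance, id string) (instance, error) {
	for _, inst := range instances {
		if inst.ID == id {
			return inst, nil
		}
	}
	return instance{}, errNotFound
}

// findByPrivateDNS looks up by a secondary attribute and therefore has to
// deal with the possibility of more than one match.
func findByPrivateDNS(instances []instance, name string) (instance, error) {
	var matches []instance
	for _, inst := range instances {
		if inst.PrivateDNS == name {
			matches = append(matches, inst)
		}
	}
	switch len(matches) {
	case 0:
		return instance{}, errNotFound
	case 1:
		return matches[0], nil
	default:
		return instance{}, errAmbiguous
	}
}

func main() {
	instances := []instance{
		{ID: "i-1111", PrivateDNS: "ip-10-0-0-5.ec2.internal"},
		{ID: "i-2222", PrivateDNS: "ip-10-0-0-5.ec2.internal"}, // illustrative duplicate
		{ID: "i-3333", PrivateDNS: "ip-10-0-0-6.ec2.internal"},
	}
	_, err := findByPrivateDNS(instances, "ip-10-0-0-5.ec2.internal")
	fmt.Println("by DNS name:", err)
	inst, err := findByID(instances, "i-2222")
	fmt.Println("by ID:", inst.ID, err)
}
```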

There are some other PRs that also rely on this, including the big one: running the minion nodes in an auto scaling group (#9720 then #9921).

I have tried to make it acceptable risk to other clouds. It is higher risk on AWS, but we unlock huge pieces of (last-minute) functionality.

We can also have those PRs without this, though I would have to rework them so they are not based on it; they would have to go through the translation layer. If we can't have this one, but the others are a possibility, let's decide that quickly so I can "de-base" them :-)

This will allow us to use a nodeName that is not the hostname, for example on clouds where the hostname is not the natural identifier for a node.

…hostname

The EC2 instance id is the canonical node name on EC2.

justinsb · 2015-06-17T04:41:02Z

Rebased to trigger up-to-date e2e

k8s-bot · 2015-06-17T04:56:38Z

GCE e2e build/test passed for commit 77e1bd3.

This removes dependency on kubernetes#9728

justinsb · 2015-06-17T14:35:36Z

I am also working on a "plan B": breaking the dependency of dynamic CIDRs on this PR, in #9940

I probably should have done that in the first place!

bgrant0607 · 2015-06-17T20:22:23Z

See also #2462.

cc @cjcullen

bgrant0607 · 2015-06-17T20:23:55Z

More background in #7775 (comment)

bgrant0607 · 2015-06-17T20:31:42Z

Without having looked at the details in this PR, I'm strongly in favor of clearly distinguishing the node name from the address/hostname the apiserver can use to reach the node for the proxy functionality. The current --hostname_override flag is problematic.

Regarding changing node naming schemes later: I'd guess this would require removing a node from the cluster, upgrading its Kubelet to a version using the new naming scheme, and then allowing it to self-register again.

dchen1107 · 2015-06-17T20:36:23Z

I agree with @justinsb that this PR is low risk for non-AWS cloud providers, and it aligns with our effort to distinguish the node identifier from the hostname too.

dchen1107 · 2015-06-17T20:53:27Z

LGTM, but still prefer @roberthbailey or @cjcullen take a final look before merge

roberthbailey · 2015-06-18T17:08:55Z

This seems reasonable to me.

dchen1107 · 2015-06-18T17:14:13Z

@roberthbailey Thanks. @satnam6502 You can merge this one now. Thanks!

Allow nodename to be != hostname, use AWS instance ID on AWS

satnam6502 · 2015-06-18T18:27:46Z

I am going to revert this to see if it fixes the unit tests.

satnam6502 · 2015-06-18T18:31:07Z

I think this PR causes our unit tests to fail.

Running tests for APIVersion: v1beta3 with etcdPrefix: registry
+++ [0618 18:23:08] Running unit tests without code coverage
ok      github.com/GoogleCloudPlatform/kubernetes/cmd/genutils  0.008s
ok      github.com/GoogleCloudPlatform/kubernetes/cmd/hyperkube 0.135s
ok      github.com/GoogleCloudPlatform/kubernetes/cmd/kube-apiserver/app    0.117s
ok      github.com/GoogleCloudPlatform/kubernetes/examples  0.174s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/admission 0.081s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api   1.384s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/endpoints 0.102s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/errors    0.061s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/latest    0.057s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/meta  0.083s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/resource  0.059s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/rest  0.089s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/testapi   0.061s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/v1    0.063s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/v1beta3   0.056s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/api/validation    0.255s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/apiserver 1.431s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/auth/authenticator/bearertoken    0.057s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/auth/authorizer/abac  0.042s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/auth/handlers 0.057s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client    0.693s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/cache  0.347s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/chaosclient    0.025s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/clientcmd  0.207s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/clientcmd/api  0.083s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/portforward    0.109s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/record 0.167s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/remotecommand  0.073s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/client/testclient 0.089s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/clientauth    0.057s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/aws 0.097s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/gce 0.083s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/mesos   0.080s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/nodecontroller  0.078s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/openstack   0.109s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/ovirt   0.030s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/rackspace   0.058s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/routecontroller 0.102s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/servicecontroller   0.093s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider/vagrant 0.056s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/controller    9.712s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/controller/framework  4.302s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/conversion    0.799s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/conversion/queryparams    0.058s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/credentialprovider    2.053s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/credentialprovider/gcp    0.261s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/fieldpath 0.074s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/fields    0.015s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/healthz   0.073s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/httplog   0.061s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubectl   0.222s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubectl/cmd   0.120s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubectl/cmd/config    0.292s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubectl/cmd/util  0.218s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubectl/resource  0.065s
W0618 18:23:24.766280    1084 docker.go:259] found a container with the "k8s" prefix, but too few fields (2): "k8s_unidentified"
I0618 18:23:24.766395    1084 container_gc.go:126] Removing unidentified dead container "/k8s_unidentified" with ID "2876"
I0618 18:23:24.766610    1084 image_manager.go:254] [ImageManager]: Removing image "image-0" to free 1024 bytes
I0618 18:23:24.766674    1084 image_manager.go:254] [ImageManager]: Removing image "image-0" to free 1024 bytes
I0618 18:23:24.766732    1084 image_manager.go:254] [ImageManager]: Removing image "image-0" to free 1024 bytes
I0618 18:23:24.766769    1084 image_manager.go:254] [ImageManager]: Removing image "image-0" to free 1024 bytes
I0618 18:23:24.766833    1084 image_manager.go:202] [ImageManager]: Disk usage on "" () is at 95% which is over the high threshold (90%). Trying to free 150 bytes
I0618 18:23:24.766844    1084 image_manager.go:254] [ImageManager]: Removing image "image-0" to free 450 bytes
I0618 18:23:24.766877    1084 image_manager.go:202] [ImageManager]: Disk usage on "" () is at 95% which is over the high threshold (90%). Trying to free 150 bytes
I0618 18:23:24.766887    1084 image_manager.go:254] [ImageManager]: Removing image "image-0" to free 50 bytes
W0618 18:23:24.768256    1084 kubelet.go:542] Data dir for pod "bothpod" exists in both old and new form, using new
W0618 18:23:24.768587    1084 kubelet.go:593] Data dir for pod "newpod", container "bothctr" exists in both old and new form, using new
I0618 18:23:24.768860    1084 plugins.go:56] Registering credential provider: .dockercfg
E0618 18:23:24.768918    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.770355    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.770821    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.770851    1084 kubelet.go:1591] error getting node: node  not found
I0618 18:23:24.779181    1084 plugins.go:56] Registering credential provider: .dockercfg
E0618 18:23:24.779279    1084 kubelet.go:1514] Pod "_": HostPort is already allocated, ignoring: [[0].port: duplicate value '81/']
E0618 18:23:24.779539    1084 kubelet.go:1514] Pod "newpod_foo": HostPort is already allocated, ignoring: [[0].port: duplicate value '80/']
E0618 18:23:24.779585    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.779815    1084 kubelet.go:1591] error getting node: node  not found
--- FAIL: TestHandleNodeSelector (0.00s)
    kubelet_test.go:2318: status of pod "podA_foo" is not found in the status map
E0618 18:23:24.780071    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.780282    1084 kubelet.go:1514] Pod "pod2_": HostPort is already allocated, ignoring: [[0].port: duplicate value '80/']
E0618 18:23:24.780327    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.780350    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.783221    1084 kubelet.go:1829] Error updating node status, will retry: error getting node "": Node "" not found
E0618 18:23:24.783243    1084 kubelet.go:1829] Error updating node status, will retry: error getting node "": Node "" not found
E0618 18:23:24.783253    1084 kubelet.go:1829] Error updating node status, will retry: error getting node "": Node "" not found
E0618 18:23:24.783266    1084 kubelet.go:1829] Error updating node status, will retry: error getting node "": Node "" not found
E0618 18:23:24.783275    1084 kubelet.go:1829] Error updating node status, will retry: error getting node "": Node "" not found
E0618 18:23:24.784170    1084 kubelet.go:1204] Deleting mirror pod "foo_ns" because it is outdated
E0618 18:23:24.784416    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.785641    1084 kubelet.go:1591] error getting node: node  not found
I0618 18:23:24.887432    1084 kubelet.go:778] Node  was previously registered
W0618 18:23:24.887519    1084 kubelet.go:847] Port name conflicted, "fooContainer-foo" is defined more than once
W0618 18:23:24.887534    1084 kubelet.go:847] Port name conflicted, "fooContainer-TCP:80" is defined more than once
E0618 18:23:24.888143    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.888700    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.889885    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.890320    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.890932    1084 kubelet.go:1591] error getting node: node  not found
E0618 18:23:24.891697    1084 kubelet.go:1591] error getting node: node  not found
I0618 18:23:24.892054    1084 plugins.go:56] Registering credential provider: .dockercfg
I0618 18:23:24.942399    1084 plugins.go:56] Registering credential provider: .dockercfg
I0618 18:23:24.942569    1084 plugins.go:56] Registering credential provider: .dockercfg
I0618 18:23:24.992844    1084 plugins.go:56] Registering credential provider: .dockercfg
W0618 18:23:24.992937    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_foo_new_12345678_0"
W0618 18:23:24.992953    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_foo_new_12345678_0"
W0618 18:23:24.992968    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.992982    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993010    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_foo_new_12345678_0"
W0618 18:23:24.993036    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_foo_new_12345678_0"
W0618 18:23:24.993048    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993058    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993087    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_98765_0"
W0618 18:23:24.993099    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_98765_0"
W0618 18:23:24.993110    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993119    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993139    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_98765_0"
W0618 18:23:24.993158    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_98765_0"
W0618 18:23:24.993170    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993179    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993201    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_12345678_0"
W0618 18:23:24.993211    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_12345678_0"
W0618 18:23:24.993222    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993231    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993252    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_12345678_0"
W0618 18:23:24.993264    1084 docker.go:269] invalid container hash "hash123" in container "k8s_bar.hash123_bar_new_12345678_0"
W0618 18:23:24.993275    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
W0618 18:23:24.993285    1084 docker.go:269] invalid container hash "hash123" in container "k8s_POD.hash123_foo_new_12345678_0"
I0618 18:23:24.993581    1084 plugins.go:56] Registering credential provider: .dockercfg
E0618 18:23:24.993610    1084 kubelet.go:1591] error getting node: node  not found
I0618 18:23:24.993643    1084 runonce.go:66] waiting for 1 pods
I0618 18:23:24.993709    1084 runonce.go:130] Container "bar" not running: api.ContainerState{Waiting:(*api.ContainerStateWaiting)(0xc2080292c0), Running:(*api.ContainerStateRunning)(nil), Terminated:(*api.ContainerStateTerminated)(nil)}
I0618 18:23:24.993729    1084 runonce.go:104] pod "foo" containers not running: syncing
W0618 18:23:24.994252    1084 docker.go:259] found a container with the "k8s" prefix, but too few fields (5): "k8s_net_foo.new.test_abcdefgh_42"
I0618 18:23:24.994282    1084 runonce.go:114] pod "foo" containers synced, waiting for 1ms
W0618 18:23:24.995410    1084 docker.go:259] found a container with the "k8s" prefix, but too few fields (5): "k8s_net_foo.new.test_abcdefgh_42"
E0618 18:23:24.995434    1084 manager.go:742] Error examining the container: parse docker container name "/k8s_net_foo.new.test_abcdefgh_42" error: Docker container name "k8s_net_foo.new.test_abcdefgh_42" has less parts than expected [k8s net foo.new.test abcdefgh 42]
W0618 18:23:24.995493    1084 docker.go:259] found a container with the "k8s" prefix, but too few fields (5): "k8s_net_foo.new.test_abcdefgh_42"
I0618 18:23:24.995511    1084 runonce.go:101] pod "foo" containers running
I0618 18:23:24.995524    1084 runonce.go:76] started pod "foo"
I0618 18:23:24.995532    1084 runonce.go:82] 1 pods started
W0618 18:23:25.242734    1084 connection.go:126] Stream rejected: Unable to parse '' as a port: strconv.ParseUint: parsing "": invalid syntax
W0618 18:23:25.244365    1084 connection.go:126] Stream rejected: Unable to parse 'abc' as a port: strconv.ParseUint: parsing "abc": invalid syntax
W0618 18:23:25.245993    1084 connection.go:126] Stream rejected: Unable to parse '-1' as a port: strconv.ParseUint: parsing "-1": invalid syntax
W0618 18:23:25.247588    1084 connection.go:126] Stream rejected: Unable to parse '65536' as a port: strconv.ParseUint: parsing "65536": value out of range
W0618 18:23:25.252922    1084 connection.go:126] Stream rejected: Port '0' must be greater than 0
FAIL
FAIL    github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet   0.553s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/config    2.139s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/container 0.048s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/dockertools   0.041s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/envvars   0.046s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/lifecycle 0.029s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/network   0.028s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/network/exec  1.106s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/prober    0.027s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/labels    0.023s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/master    0.083s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/namespace 0.034s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/probe/exec    0.009s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/probe/http    1.016s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/probe/tcp 0.021s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/proxy 7.142s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/proxy/config  0.056s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/componentstatus  0.074s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/controller/etcd  2.159s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/endpoint/etcd    0.059s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/etcd 0.154s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/event    0.044s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/generic  0.071s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/generic/etcd 0.053s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/generic/rest 0.051s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/limitrange   0.035s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/minion   0.046s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/minion/etcd  0.137s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/namespace    0.035s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/namespace/etcd   0.062s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/persistentvolume/etcd    0.057s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/persistentvolumeclaim/etcd   0.038s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/pod/etcd 0.177s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/podtemplate/etcd 0.043s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/resourcequota    0.040s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/resourcequota/etcd   0.148s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/secret/etcd  0.042s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service  0.039s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service/allocator    0.011s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service/allocator/etcd   0.141s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service/ipallocator  0.036s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service/ipallocator/controller   0.029s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service/ipallocator/etcd 0.035s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/service/portallocator    0.030s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/registry/serviceaccount/etcd  0.035s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/resourcequota 0.038s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/runtime   0.159s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/securitycontext   0.064s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/service   0.078s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/serviceaccount    0.457s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/tools 0.083s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util  3.517s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/config   0.053s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/errors   0.017s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/exec 0.075s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/fielderrors  0.028s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/flushwriter  0.023s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/httpstream/spdy  0.972s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/iptables 0.037s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/mount    0.017s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/proxy    0.034s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/slice    0.014s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/strategicpatch   0.054s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/wait 0.016s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/workqueue    0.091s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/util/yaml 0.019s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume    0.040s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/aws_ebs    0.048s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/empty_dir  0.066s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/gce_pd 0.067s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/git_repo   0.077s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/glusterfs  0.070s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/host_path  0.084s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/iscsi  0.070s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/nfs    0.066s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/persistent_claim   0.055s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/rbd    0.406s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volume/secret 0.058s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/volumeclaimbinder 0.059s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/watch 0.037s
ok      github.com/GoogleCloudPlatform/kubernetes/pkg/watch/json    0.049s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/admit    0.049s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/deny 0.055s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/exec/denyprivileged  0.029s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/limitranger  0.048s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/namespace/autoprovision  0.029s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/namespace/exists 0.028s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/namespace/lifecycle  0.043s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/resourcequota    0.095s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/securitycontext/scdeny   0.057s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/admission/serviceaccount   1.405s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/auth/authenticator/password/allow  0.025s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/auth/authenticator/password/passwordfile   0.288s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/auth/authenticator/request/basicauth   0.046s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/auth/authenticator/request/union   0.030s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/auth/authenticator/request/x509    0.098s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/auth/authenticator/token/tokenfile 0.160s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler  0.031s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/algorithm    0.027s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/algorithm/predicates 0.038s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/algorithm/priorities 0.029s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/algorithmprovider    0.028s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/api/validation   0.034s
ok      github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/factory  0.041s
ok      github.com/GoogleCloudPlatform/kubernetes/third_party/forked/reflect    0.012s
ok      github.com/GoogleCloudPlatform/kubernetes/third_party/golang/expansion  0.023s
!!! Error in hack/test-go.sh:159
  'go test "${goflags[@]:+${goflags[@]}}" ${KUBE_RACE} ${KUBE_TIMEOUT} "${@+${@/#/${KUBE_GO_PACKAGE}/}}" "${testargs[@]:+${testargs[@]}}"' exited with status 1
Call stack:
  1: hack/test-go.sh:159 runTests(...)
  2: hack/test-go.sh:221 main(...)
Exiting with status 1
!!! Error in build/../build/common.sh:447
  '"${docker_cmd[@]}" "$@"' exited with status 1
Call stack:
  1: build/../build/common.sh:447 kube::build::run_build_command(...)
  2: build/release.sh:35 main(...)
Exiting with status 1

justinsb · 2015-06-18T19:22:20Z

Sorry about the problem - not sure how that got past my testing, but I guess Shippable wasn't a flake after all :-(

Fix coming soon

justinsb · 2015-06-18T20:52:10Z

Actually that was a different commit where I thought Shippable was a flake. (Which I am now double-checking ;-) ) Trying to figure it out...

Fix of reverted #9728

This is a partial reversion of kubernetes#9728, and should fix kubernetes#10612. 9728 used the AWS instance id as the node name. But proxy, logs and exec all used the node name as the host name for contacting the minion. It is possible to resolve a host to the IP, and this fixes logs. But exec and proxy also require an SSL certificate match on the hostname, and this is harder to fix. So the sensible fix seems to be a minimal reversion of the changes in kubernetes#9728, and we can revisit this post 1.0.

Most of our communications from apiserver -> nodes used nodeutil.GetNodeHostIP, but a few places didn't - and this meant that the node name needed to be resolvable _and_ we needed to populate valid IP addresses. Fix the last few places that used the NodeName.

Issues: kubernetes#18525, kubernetes#9451, kubernetes#9728, kubernetes#17643, kubernetes#11543, kubernetes#22063, kubernetes#2462, kubernetes#22109, kubernetes#22770, kubernetes#32286
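
To make the point of the two commit messages above concrete, here is a minimal sketch of dialing a node by a reported IP address instead of by its name. It is not the real nodeutil code: the `Node` and `NodeAddress` types are simplified stand-ins, `hostIPForNode` is a hypothetical helper, and the preference order (internal IP first, then external IP) is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeAddressType, NodeAddress and Node are simplified stand-ins for the
// API types; they are assumptions for this sketch, not the real definitions.
type NodeAddressType string

const (
	NodeInternalIP NodeAddressType = "InternalIP"
	NodeExternalIP NodeAddressType = "ExternalIP"
)

type NodeAddress struct {
	Type    NodeAddressType
	Address string
}

type Node struct {
	Name      string // may be a synthetic ID that does not resolve in DNS
	Addresses []NodeAddress
}

// hostIPForNode (hypothetical) returns an address to dial for the node.
// It never falls back to n.Name, so the name does not need to be resolvable.
// The order InternalIP, then ExternalIP is an assumption of this sketch.
func hostIPForNode(n Node) (string, error) {
	for _, want := range []NodeAddressType{NodeInternalIP, NodeExternalIP} {
		for _, a := range n.Addresses {
			if a.Type == want && a.Address != "" {
				return a.Address, nil
			}
		}
	}
	return "", errors.New("node " + n.Name + " has no usable IP address")
}

func main() {
	n := Node{
		Name: "i-12345678", // example instance-ID-style name
		Addresses: []NodeAddress{
			{Type: NodeExternalIP, Address: "203.0.113.7"},
			{Type: NodeInternalIP, Address: "10.0.0.5"},
		},
	}
	ip, err := hostIPForNode(n)
	fmt.Println(ip, err)
}
```

Note that this only covers reaching the node. As the first message says, exec and proxy also needed the TLS certificate to match the host being contacted, which a lookup like this does not address.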

googlebot added the cla: yes label Jun 12, 2015

ArtfulCoder assigned dchen1107 Jun 12, 2015

justinsb mentioned this pull request Jun 13, 2015

Refactor Routes, and dynamically configure minion CIDRs on AWS #9720

Merged

dchen1107 reviewed Jun 16, 2015

jdef reviewed Jun 16, 2015

cmd/kubelet/app/server.go

jdef Jun 16, 2015

need similar change in contrib/mesos/pkg/executor/service/service.go call to NewMainKubelet

justinsb force-pushed the aws_id_as_name branch from e0d5b6a to 4058adf Compare June 16, 2015 22:18

justinsb force-pushed the aws_id_as_name branch from 4058adf to 77e1bd3 Compare June 17, 2015 04:40

justinsb added 4 commits June 17, 2015 00:40

For kubelet, differentiate between the nodeName and the hostname

c28cdfb

This will allow us to use a nodeName that is not the hostname, for example on clouds where the hostname is not the natural identifier for a node.

Allow cloud providers to return a node identifier different from the …

efaead8

…hostname

AWS: Use the instance id as the node name

c89b0cd

The EC2 instance id is the canonical node name on EC2.

NodeName != HostName: Fixes for contrib/mesos

77e1bd3
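
The idea behind the first three commits above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `NodeNamer`, `awsNamer` and `resolveNodeName` are hypothetical names, and the sketch assumes the kubelet falls back to the hostname when no cloud provider supplies a different identifier.

```go
package main

import "fmt"

// NodeNamer is a hypothetical interface: a cloud provider may map the local
// hostname to the identifier it considers natural for the node.
type NodeNamer interface {
	NodeName(hostname string) (string, error)
}

// awsNamer is a hypothetical AWS-style implementation that ignores the
// hostname and returns the instance ID it was configured with.
type awsNamer struct {
	instanceID string
}

func (a awsNamer) NodeName(hostname string) (string, error) {
	if a.instanceID == "" {
		return "", fmt.Errorf("instance id unknown for host %q", hostname)
	}
	return a.instanceID, nil
}

// resolveNodeName keeps nodeName and hostname as separate concepts: with no
// provider the two coincide, otherwise the provider decides the node name.
func resolveNodeName(hostname string, namer NodeNamer) (string, error) {
	if namer == nil {
		return hostname, nil
	}
	return namer.NodeName(hostname)
}

func main() {
	hostname := "ip-10-0-0-5.ec2.internal" // example value

	name, err := resolveNodeName(hostname, nil)
	fmt.Println(name, err)

	name, err = resolveNodeName(hostname, awsNamer{instanceID: "i-12345678"})
	fmt.Println(name, err)
}
```

Keeping the fallback in one place means providers that are happy with nodename == hostname (the GCE case described in the PR summary) need no changes.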

justinsb added a commit to justinsb/kubernetes that referenced this pull request Jun 17, 2015

Don't assume name == instanceID

6d0e4fc

This removes dependency on kubernetes#9728

justinsb mentioned this pull request Jun 17, 2015

[WIP] Version of #9720 without the dependency on #9728 #9940

Closed

dchen1107 added the lgtm label Jun 18, 2015

satnam6502 added a commit that referenced this pull request Jun 18, 2015

Merge pull request #9728 from justinsb/aws_id_as_name

790ca23

Allow nodename to be != hostname, use AWS instance ID on AWS

satnam6502 merged commit 790ca23 into kubernetes:master Jun 18, 2015

yujuhong mentioned this pull request Jun 18, 2015

TestHandleNodeSelector test flake #10036

Closed

satnam6502 mentioned this pull request Jun 18, 2015

Revert "Allow nodename to be != hostname, use AWS instance ID on AWS" #10047

Merged

satnam6502 removed the lgtm label Jun 18, 2015

justinsb mentioned this pull request Jun 18, 2015

Fix of reverted #9728 #10057

Merged

satnam6502 added a commit that referenced this pull request Jun 18, 2015

Merge pull request #10057 from justinsb/aws_id_as_name_2

4c13f89

Fix of reverted #9728

antoineco mentioned this pull request Jul 1, 2015

kubectl can not resolve EC2 instance IDs (AWS) #10612

Closed

justinsb mentioned this pull request Jul 2, 2015

WIP: Don't assume that NodeName == Node host name #10663

Closed

justinsb mentioned this pull request Jul 3, 2015

WIP: AWS: Use private dns name for node name again #10699

Merged

justinsb mentioned this pull request Sep 29, 2016

Use nodeutil.GetHostIP consistently when talking to nodes #33718

Merged
