Fix nil pointer issue when getting metrics from volume mounter#34251
Conversation
|
Jenkins unit/integration failed for commit f8baaa6. Full PR test history. The magic incantation to run this job again is |
| podVolumes := kl.volumeManager.GetMountedVolumesForPod( | ||
| volumetypes.UniquePodName(podUID)) | ||
| for outerVolumeSpecName, volume := range podVolumes { | ||
| if volume.Mounter == nil { |
There was a problem hiding this comment.
nit: Add a brief comment/TODO explaining why mounter could be nil
|
LGTM mod @yujuhong's comment |
Currently it is possible that the mounter object stored in Mounted Volume data structure in the actual state of kubelet volume manager is nil if this information is recovered from state sync process. This will cause nil pointer issue when calculating stats in volume_stat_calculator. A quick fix is to not return the volume if its mounter is nil. A more complete fix is to also recover mounter object when reconstructing the volume data structure which will be addressed in PR kubernetes#33616
f8baaa6 to
b2b0409
Compare
|
To reproduce the issue:
Manually test the fix and it is working. |
|
LGTM. Thanks |
|
Jenkins GKE smoke e2e failed for commit b2b0409. Full PR test history. The magic incantation to run this job again is |
|
Automatic merge from submit-queue |
|
This has merged, @jingxu97 can you please create a cherry pick for this to the 1.4 branch. |
|
opened cherrypick here: #34296 |
|
Thanks @jessfraz |
|
cherry-picked to 1.4 in #34296 so removing cherry-pick candidate |
…ck-of-#33520-kubernetes#34251-origin-release-1.4 Automatic merge from submit-queue Automated cherry pick of kubernetes#33520 kubernetes#34251 origin release 1.4 Cherry pick of kubernetes#33520 kubernetes#34251 on release-1.4. kubernetes#33520: nodefs becomes imagefs on GCI since kubelet cannot identify kubernetes#34251: Fix nil pointer issue when getting metrics from volume
Automatic merge from submit-queue
Fix nil pointer issue when making mounts for container
When rebooting one of the nodes in my colleague's cluster, two panics were discovered:
```
E1216 04:07:00.193058 2394 runtime.go:52] Recovered from panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:58
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:51
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:41
/usr/local/go/src/runtime/asm_amd64.s:472
/usr/local/go/src/runtime/panic.go:443
/usr/local/go/src/runtime/panic.go:62
/usr/local/go/src/runtime/sigpanic_unix.go:24
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1313
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1473
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go:1495
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go:2125
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/dockertools/docker_manager.go:2093
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1971
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:530
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:171
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:154
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:215
/usr/local/go/src/runtime/asm_amd64.s:1998
E1216 04:07:00.275030 2394 runtime.go:52] Recovered from panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:58
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:51
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/runtime/runtime.go:41
/usr/local/go/src/runtime/asm_amd64.s:472
/usr/local/go/src/runtime/panic.go:443
/usr/local/go/src/runtime/panic.go:62
/usr/local/go/src/runtime/sigpanic_unix.go:24
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/server/stats/volume_stat_caculator.go:98
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/server/stats/volume_stat_caculator.go:63
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:86
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/util/wait/wait.go:87
/usr/local/go/src/runtime/asm_amd64.s:1998
```
kubectl version
```
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.8", GitCommit:"693ef591120267007be359f97191a6253e0e4fb5", GitTreeState:"clean", BuildDate:"2016-09-28T03:03:21Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.8", GitCommit:"693ef591120267007be359f97191a6253e0e4fb5", GitTreeState:"clean", BuildDate:"2016-09-28T02:52:25Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
```
The second panic had already been fixed by #33616 and #34251. Not sure what caused the first nil pointer issue and whether it has been fixed yet in the master branch. Just fix it by ignoring the nil pointer when making mounts.
cc @jingxu97 @yujuhong
Currently it is possible that the mounter object stored in Mounted
Volume data structure in the actual state of kubelet volume manager is
nil if this information is recovered from state sync process. This will
cause nil pointer issue when calculating stats in volume_stat_calculator.
A quick fix is to not return the volume if its mounter is nil. A more
complete fix is to also recover mounter object when reconstructing the
volume data structure which will be addressed in PR #33616
This change is