Container cannot be accessed via kubectl exec when pod is in Terminating state #7160

@Ezetowers

What happened?

Containers belonging to a pod cannot be accessed via the kubectl exec command (or via crictl exec on the node where the container is running) while the pod is in Terminating state.

What did you expect to happen?

We expect to be able to access containers even if their pod is in Terminating state. We have verified that Docker allows this under the same conditions.

How can we reproduce it (as minimally and precisely as possible)?

  1. Deploy a pod whose container ignores SIGTERM when the pod is deleted. Set the spec parameter terminationGracePeriodSeconds to a very high value so that the kubelet does not send SIGKILL to the pod's container:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-example
spec:
  containers:
  - name: nginx-example
    image: nginx
    imagePullPolicy: IfNotPresent
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
  terminationGracePeriodSeconds: 3600
  2. Wait for the pod to be running, then delete it (in this example, run kubectl delete pod nginx-example). The delete command blocks until the pod is gone, but SIGTERM is sent to the container immediately. Because the container ignores SIGTERM, the command has to be cancelled manually with Ctrl+C, after which the pod remains in Terminating state:
# kubectl get pods nginx-example
NAME            READY   STATUS        RESTARTS   AGE
nginx-example   1/1     Terminating   0          14m
  3. Try to access the pod's container with kubectl exec -it nginx-example -- bash. The command hangs, as kubectl delete does, except that Ctrl+C has no effect. The same behaviour occurs when crictl exec is run against the container on the host where it is running.
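
Since the interactive exec cannot be interrupted with Ctrl+C, it may be more convenient to probe the pod non-interactively under a deadline. The following is a minimal sketch, not part of the original reproduction: probe_exec and the KUBECTL variable are hypothetical names, and it assumes the kubectl client process can still be terminated by a signal from timeout(1).

```shell
# Hypothetical helper: run a trivial command in the pod under a deadline so a
# hung exec does not block the terminal. KUBECTL can be overridden (for
# example, to point at a wrapper script).
KUBECTL="${KUBECTL:-kubectl}"

probe_exec() {
  local pod="$1" deadline="${2:-10}" rc
  # timeout(1) sends SIGTERM after $deadline seconds and exits with 124;
  # -k 2 follows up with SIGKILL two seconds later, which yields 137.
  timeout -k 2 "$deadline" "$KUBECTL" exec "$pod" -- true
  rc=$?
  case "$rc" in
    0)       echo "exec ok" ;;
    124|137) echo "exec timed out after ${deadline}s" ;;
    *)       echo "exec failed (exit $rc)" ;;
  esac
  return "$rc"
}
```

For example, probe_exec nginx-example 10 would be expected to report a timeout against the Terminating pod described above, and "exec ok" against a healthy one.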

Anything else we need to know?

A stack trace of the goroutines in the crio service is attached to this bug report:

crio-goroutine-stacks-2023-07-21T213047Z.log
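
For anyone reproducing this, a dump like the one attached can most likely be produced by sending SIGUSR1 to the crio process, which, to the best of our knowledge, writes the goroutine stacks to /tmp/crio-goroutine-stacks-<timestamp>.log. The sketch below assumes that behaviour and that directory; dump_crio_stacks and latest_stack_dump are hypothetical helper names.

```shell
# Hypothetical helper: print the newest goroutine dump in a directory.
# Relies on the timestamped file names sorting chronologically, which is how
# the shell orders the glob expansion.
latest_stack_dump() {
  local dir="${1:-/tmp}" f latest=""
  for f in "$dir"/crio-goroutine-stacks-*.log; do
    [ -e "$f" ] && latest="$f"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Hypothetical helper: ask crio for a dump and print the resulting file name.
# Assumption: crio handles SIGUSR1 by writing the dump under /tmp.
dump_crio_stacks() {
  local pid
  pid="$(pidof -s crio)" || { echo "crio does not appear to be running" >&2; return 1; }
  kill -USR1 "$pid" || return 1
  sleep 1   # give crio a moment to write the file
  latest_stack_dump /tmp
}
```

Run dump_crio_stacks as root on the affected node while the exec is hung, so that the blocked goroutine is captured in the dump.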

In addition, we have verified that the process running inside the container of the Terminating pod works as expected. Running nsenter -a -t <process-pid> lets us enter all the namespaces of the process, roughly mimicking the crictl exec command. The problem seems similar to the one reported in #6865: crio appears to take a lock associated with the container state, which prevents the operation (crictl exec in this case) from being executed.
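
The nsenter check can be wrapped as a stopgap while exec is blocked. This is a sketch under stated assumptions: enter_container, CRICTL and NSENTER are hypothetical names, the PID lookup assumes crictl inspect exposes the process ID as .info.pid, and we have not confirmed that crictl inspect itself stays responsive in this state, hence the timeout. If the lookup hangs, the PID can be found on the host with ps and passed to nsenter directly.

```shell
# Hypothetical helper: enter a container's namespaces without going through
# crictl exec. Tool paths can be overridden via CRICTL and NSENTER.
CRICTL="${CRICTL:-crictl}"
NSENTER="${NSENTER:-nsenter}"

enter_container() {
  local cid="$1" pid
  shift
  # Assumption: the container's init PID is available as .info.pid.
  # The timeout guards against inspect blocking like exec does.
  pid="$(timeout 10 "$CRICTL" inspect --output go-template \
        --template '{{.info.pid}}' "$cid")" || return 1
  [ -n "$pid" ] || return 1
  # -a enters all namespaces of the target process, as in the manual check.
  "$NSENTER" -a -t "$pid" -- "$@"
}
```

For example, enter_container <container-id> ps aux runs ps inside the container's namespaces. Unlike crictl exec, this bypasses the runtime entirely, so it needs root on the host.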

CRI-O and Kubernetes version

crio version 1.25.2
Version:        1.25.2
GitCommit:      unknown
GitCommitDate:  unknown
GitTreeState:   clean
BuildDate:      2023-03-06T07:45:59Z
GoVersion:      go1.19
Compiler:       gc
Platform:       linux/amd64
Linkmode:       dynamic
BuildTags:
  rpm_crashtraceback
  exclude_graphdriver_btrfs
  btrfs_noversion
  exclude_graphdriver_devicemapper
  libdm_no_deferred_remove
  seccomp
  containers_image_openpgp
LDFlags:          -X github.com/cri-o/cri-o/internal/pkg/criocli.DefaultsPath= -X  github.com/cri-o/cri-o/internal/version.buildDate=2023-03-06T07:45:59Z -X  github.com/cri-o/cri-o/internal/version.gitCommit=1d7407e62446d25ca4fa77c9f6853143ec994d15 -X  github.com/cri-o/cri-o/internal/version.version=1.25.2 -X  github.com/cri-o/cri-o/internal/version.gitTreeState=clean  -B 0x4b9fcda34660fd3501556a3366899982947c7308 -extldflags '-Wl,-z,relro  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld ' -compressdwarf=false
SeccompEnabled:   true
AppArmorEnabled:  false
Dependencies:
~ k8s(yul1) kubectl version --output=json
{
  "clientVersion": {
    "major": "1",
    "minor": "25",
    "gitVersion": "v1.25.2",
    "gitCommit": "5835544ca568b757a8ecae5c153f317e5736700e",
    "gitTreeState": "clean",
    "buildDate": "2022-09-21T14:33:49Z",
    "goVersion": "go1.19.1",
    "compiler": "gc",
    "platform": "darwin/amd64"
  },
  "kustomizeVersion": "v4.5.7",
  "serverVersion": {
    "major": "1",
    "minor": "25",
    "gitVersion": "v1.25.7",
    "gitCommit": "723bcdb232300aaf5e147ff19b4df7ec8a20278d",
    "gitTreeState": "clean",
    "buildDate": "2023-02-22T13:58:23Z",
    "goVersion": "go1.19.6",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}

OS version

# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.8"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="8.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Oracle Linux Server 8.8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:8:8:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://github.com/oracle/oracle-linux"

ORACLE_BUGZILLA_PRODUCT="Oracle Linux 8"
ORACLE_BUGZILLA_PRODUCT_VERSION=8.8
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=8.8

# uname -a
Linux yul1-r13-u17 4.18.0-477.13.1.el8_8.x86_64 #1 SMP Tue May 30 16:09:32 PDT 2023 x86_64 x86_64 x86_64 GNU/Linux

Additional environment details (AWS, VirtualBox, physical, etc.)

Physical

Labels

good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
kind/feature: Categorizes issue or PR as related to a new feature.