[BUG] v2 RWX workload IO timed out after Longhorn components are deleted and restarted #13217

Description

@yangchiu

Describe the Bug

The robot test case Test Longhorn Components Recovery fails on master-head (longhorn-manager 94237fc) and v1.12.x-head (longhorn-manager f57c766) with v2 volumes. After the Longhorn components are deleted and restarted, IO from the v2 RWX workload gets stuck and times out.

https://ci.longhorn.io/job/private/job/longhorn-e2e-test/8083/

Checked e2e-test-deployment-1-676c6b7f47-2w7r5 file data.txt checksum failed. Got /data/data.txt checksum = md5sum: can't open '/data/data.txt': Operation timed out
# kubectl exec -it e2e-test-deployment-1-676c6b7f47-2w7r5 -- /bin/sh
/git $ cd /data
ls
# gets stuck
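
To check whether the mount is hung without tying up an interactive shell, the listing can be wrapped in a time limit. This is a minimal sketch and not part of the test suite: it assumes GNU coreutils `timeout` is available on the machine running kubectl, and `run_bounded` is a hypothetical helper name.

```shell
# Hypothetical helper: run a command with a time limit and report the outcome.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the limit is hit.
run_bounded() {
    limit="$1"
    shift
    rc=0
    # `|| rc=$?` records the exit status without tripping `set -e` in a calling script.
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0)   echo "ok" ;;
        124) echo "timed out" ;;
        *)   echo "failed (exit $rc)" ;;
    esac
}

# Example use against the pod from this report (15 seconds is an arbitrary limit):
#   run_bounded 15 kubectl exec e2e-test-deployment-1-676c6b7f47-2w7r5 -- ls /data
```

A "timed out" result only means the local kubectl client was stopped after the limit. The `ls` process inside the pod may still be blocked on the volume.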

To Reproduce

Run robot test case Test Longhorn Components Recovery:

-t "Test Longhorn Components Recovery" -v RETRY_COUNT:259200 -v DATA_ENGINE:v2

Expected Behavior

N/A

Support Bundle for Troubleshooting

supportbundle_12ed26ad-b565-4db0-b89b-e7daa08255cd_2026-05-27T06-51-01Z.zip

Environment

  • Longhorn version: master-head, v1.12.x-head
  • Impacted volume (PV):
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): kubectl
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: v1.35.4+k3s1
    • Number of control plane nodes in the cluster:
    • Number of worker nodes in the cluster:
  • Node config
    • OS type and version: sles 16.0
    • Kernel version:
    • CPU per node:
    • Memory per node:
    • Disk type (e.g. SSD/NVMe/HDD):
    • Network bandwidth between the nodes (Gbps):
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal):
  • Number of Longhorn volumes in the cluster:

Additional context

No response

Workaround and Mitigation

No response

Metadata

Labels

  • area/resilience: System or volume resilience
  • area/v2-data-engine: v2 data engine (SPDK)
  • area/volume-rwx: Volume RWX related
  • kind/bug
  • priority/1: Highly recommended to implement or fix in this release (managed by PO)
  • reproduce/always: 100% reproducible
  • require/backport: Require backport. Only used when the specific versions to backport have not been defined.
  • require/qa-review-coverage: Require QA to review coverage
  • severity/2: Function working but has a major issue w/o workaround (a major incident with significant impact)

Projects

Status
Closed
