Use CSI driver to determine unique name for migrated in-tree plugins #101423
@codablock: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @codablock. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
/retest
Hey @codablock, thanks for submitting this PR! I am trying to understand it. Is there any symptom you are running into that is caused by this?
I found that the attach_detach_controller will indeed mark the volume attachment as uncertain for the migrated PV.
But I did not see the volume get unmounted from the pod. It will try to detach, but it can't because the volume is still mounted.
So I would be really interested in what circumstances lead to the detach you mentioned. You also mentioned the fix is tested, so please let me know if there is a good way to test it. Thanks a lot!
This will not work. You cannot find csiPluginName in the volumePluginMgr. For csi plugins, the plugin name is "kubernetes.io/csi". So for the csi migration case, in order to get the unique volume name, we need the csi plugin and the translated volumeSpec. The following two lines should be able to do the work.
plugin, err = adc.volumePluginMgr.FindAttachablePluginByName("kubernetes.io/csi")
volumeSpec, err = csimigration.TranslateInTreeSpecToCSI(volumeSpec, "", adc.intreeToCSITranslator)
I think the only problematic plugin is the azurefile inline plugin, since it requires podNamespace, which we are not able to get from the VA... But for all other plugins this should be fine.
@andyzhangx FYI
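For readers following along, here is a rough sketch of how those two calls might slot into processVolumeAttachments. It is an illustration based on this thread, not the exact PR diff, and it assumes the adc fields, the csimigration helpers and volumeutil.GetUniqueVolumeNameFromSpec carry the names used in the discussion.
plugin, err := adc.volumePluginMgr.FindAttachablePluginBySpec(volumeSpec)
if err != nil || plugin == nil {
    continue
}
if adc.csiMigratedPluginManager.IsMigrationEnabledForPlugin(plugin.GetPluginName()) {
    // For a migrated in-tree PV, the unique volume name must come from the CSI
    // plugin and the translated spec, otherwise it will not match the name the
    // reconciler tracks and the attachment is wrongly treated as uncertain.
    plugin, err = adc.volumePluginMgr.FindAttachablePluginByName("kubernetes.io/csi")
    if err != nil {
        continue
    }
    // podNamespace is left empty; per the discussion this only matters for the
    // azurefile inline case.
    volumeSpec, err = csimigration.TranslateInTreeSpecToCSI(volumeSpec, "", adc.intreeToCSITranslator)
    if err != nil {
        continue
    }
}
volumeName, err := volumeutil.GetUniqueVolumeNameFromSpec(plugin, volumeSpec)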
I'll look into this later today or tomorrow when I find time.
@Jiawei0227 I observed the same as you: it tries to detach the volume for quite some time and keeps failing, until at some point it ignores the "still in use" state; I don't understand when and why it does this. I had this happen ~10 times on multiple clusters, always leading to Pods crashing in all kinds of ways. Maybe there is some short window where the volume is not considered "in use"? At the time of a Pod restart, maybe? Unfortunately I can't give more details as I don't have the logs anymore. I also manually migrated the PVs to CSI by snapshotting, deleting and manually restoring.
/assign @gnufied
I did some more testing this morning and confirmed the issue exists. Basically, the detach will be issued here when the following condition is not met:
Now, either after the timeout (which is 6 min) or when there is a node update event from kubelet, the detach goes through. After the volume is detached, thanks to our csi-attacher we have logic here to check the VA and the actual attach status of the disk: https://github.com/kubernetes-csi/external-attacher/blob/c6ce4016cae099974630257e1b8208727b719daa/pkg/controller/csi_handler.go#L182 This will issue a ControllerPublishVolume, which attaches the volume again. So the unavailability of the disk may only be a few seconds, depending on attach/detach speed, but it can cause severe data loss. For the fix, I think the following will do the trick, and for the Azurefile case it is okay because the azurefile unique name is the same whether or not podNamespace is passed.
This should definitely be cherry-picked to previous releases.
/assign
processVolumeAttachments currently only tries to find PVs with uncertain attachment state. This will however lead to false positives for CSI-migrated PVs, as the unique volume name does not match between the original and the migrated PV. This commit falls back to using the unique name of the CSI PV when migration for the in-tree plugin is enabled. The described false positives are an issue because later the attach/detach controller tries to detach the volume, as it thinks the volume should not be attached to the node, ignoring that the CSI driver still thinks that the same volume must be (and is) attached to this node. This causes Pod-mounted volumes to lose the backing EBS storage and then run into all kinds of follow-up errors (IO related).
Force-pushed from ca05be0 to b023c60
@Jiawei0227 ahh, now I understand, and you're most likely right... had I looked at the logs I would have seen the error message. I force-pushed your suggested code.
/retest
/lgtm Thanks a lot for raising the issue and submitting this fix!
Hi @codablock and @Jiawei0227, thanks for finding and fixing this bug! We have been experiencing these exact issues, with kube-controller-manager in our clusters seemingly randomly detaching volumes after enabling the CSIMigrationAWS feature gate. This whole time we were wondering whether we had performed the migration procedure wrong and running through all possible scenarios. Due to the severity of this bug (literally detaching disks from nodes) we also agree with pushing out a cherry-pick ASAP.
We found this not to be the case. Some of our migrated volumes (5%) were permanently unattached and receiving IO errors, and they never got reconciled. We are running k8s 1.18.16 and csi-attacher 3.0.0.
Sorry to hear that... the disk might be broken if it is force-detached while you are writing data. I will work on the cherry-pick ASAP.
Sounds good. Also, a correction: we experienced this bug after upgrading our cluster to v1.18.16, and it seems this commit was cherry-picked in, introducing the
Overall the changes look good. I think we can add some unit tests to validate migration.
err)
continue
}
inTreePluginName := plugin.GetPluginName() |
nit - the variable name inTreePluginName is not correct. It could be the CSI or the in-tree plugin name, depending on the PV linked to the VA object.
There is also adc.csiMigratedPluginManager.IsMigratable(volumeSpec), if you want to use it.
Good point! Let's change the var name to pluginName for now. And I think IsMigrationEnabledForPlugin works just fine here.
I think this code needs to use IsMigratable(). The volumeSpec here could already be CSI, and it may not be safe to call TranslateInTreeSpecToCSI() on CSI volumes.
If it is already CSI, then IsMigrationEnabledForPlugin will return false and it will not translate the spec, so I think it should be okay?
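As a side note for readers, here is a minimal sketch of the two checks being discussed. The names follow the csimigration package as referenced in this thread; the exact return signatures are an assumption.
// IsMigrationEnabledForPlugin takes an in-tree plugin name and reports whether
// migration is switched on for it; for "kubernetes.io/csi" it returns false,
// so an already-CSI spec never gets translated.
if adc.csiMigratedPluginManager.IsMigrationEnabledForPlugin(plugin.GetPluginName()) {
    // reached only for in-tree plugins with migration enabled
}

// IsMigratable inspects the volume spec itself and reports whether it is an
// in-tree volume that has a CSI translation at all.
if migratable, err := adc.csiMigratedPluginManager.IsMigratable(volumeSpec); err == nil && migratable {
    // spec is an in-tree volume that can be translated to CSI
}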
nodeName,
inTreePluginName,
err)
continue |
I think it should be possible to cover this via a unit test in Test_ADC_VolumeAttachmentRecovery.
@codablock Can you help add a unit test case here to verify it is working as expected? If it ends up being too complicated, we can follow up in the next PR.
Let's merge this and fix outstanding items in a follow-up. /lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: codablock, gnufied. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/retest
@codablock will you have time to add some unit tests for this? I think the PR is good as it is, and thank you for debugging and fixing it. But some tests will give us greater confidence with the cherry-picks we need to perform. For this reason, I am putting this on hold for a bit. /hold
/retest
I unfortunately won't have the required time to implement the unit tests. Also, I don't feel confident enough with the test system in Go, and especially in Kubernetes, to be able to produce good tests on the first try.
Could someone confirm whether this issue was introduced in 1.20 or 1.21? We first saw volumes get detached after upgrading to 1.21.0, and I thought it was related to the following change in the 1.21 release notes.
I see that PR #96617 was backported to 1.20, 1.19 and 1.18.
Thanks. We were on 1.20.2, and the backport arrived in 1.20.3, which explains why we didn't see the issue. Here are the versions where the PR was backported. |
I can confirm that this bug also appears in Kubernetes 1.18.18, which I noticed when upgrading from Kubernetes 1.17.x. |
Thanks for reporting this bug and making a fix for it.
@Jiawei0227: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
From the commit:
The initial behaviour was observed on 1.20 and the fix was tested on the current release-1.20 branch.
What type of PR is this?
/kind bug
What this PR does / why we need it:
See commit message from above.
Which issue(s) this PR fixes:
I was unable to find any known/open issues regarding this.
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: