Filesystem remove() function can fail when run concurrently #27578
Comments
I'm not sure we can do much here. Do you have an idea for solving this at the Filesystem component level? Otherwise, would you be able to put a lock around the race-condition-sensitive parts?
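For illustration, a lock around the removal could look roughly like the sketch below. It uses only native PHP (`flock()` and SPL iterators) rather than the Filesystem component, and the names `removeTree()` and `removeWithLock()` are hypothetical helpers, not part of Symfony or Drush. It also assumes that every process touching the directory, writers included, takes the same lock, and that advisory locks behave reliably on the filesystem in question, which may not hold on shared filesystems.

```php
<?php
// Hypothetical helper: recursively delete a directory tree, children first.
function removeTree(string $dir): void
{
    if (!is_dir($dir)) {
        return;
    }
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );
    foreach ($items as $item) {
        // Symlinks are unlinked, never followed.
        if ($item->isDir() && !$item->isLink()) {
            rmdir($item->getPathname());
        } else {
            unlink($item->getPathname());
        }
    }
    rmdir($dir);
}

// Hypothetical helper: hold an exclusive advisory lock while removing $dir.
// The lock file must live OUTSIDE the directory being removed.
function removeWithLock(string $dir, string $lockFile): bool
{
    $handle = fopen($lockFile, 'c');
    if ($handle === false) {
        return false;
    }
    try {
        if (!flock($handle, LOCK_EX)) {
            return false;
        }
        removeTree($dir);
        return true;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

A caller would use something like `removeWithLock($cacheDir, $cacheDir . '.lock')`, and code that writes into the cache would need to acquire the same lock for this to help.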
Here's an example of how I'm going to attempt to solve this for Drush: However, instead of every upstream library trying to implement that in its own way, I think it might make sense to implement it in the Filesystem, especially since Symfony itself has suffered tremendously from this same bug, which many people consider to still be unresolved: #2600. I think roughly the same approach could be applied as for Drush (basically, just drop a semaphore file in the directory that's about to be removed, and don't remove directories with that file). If you're worried about adding complexity to the existing
Through some testing I think I've discovered a little more about this problem. I don't think the problem is multiple concurrent calls to This could be a problem regardless of the underlying filesystem, but it's exacerbated on shared file systems simply because the deletion process takes longer and gives other processes more time to interfere. I'd really love to see a way for Symfony to work around this. I'm assuming this kind of problem wouldn't happen if you simply used |
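The interference described above can be illustrated in a single process. The sketch below is not a real concurrency test: it simply plays the steps in the order in which two processes could interleave, using a throwaway directory under the system temp directory (the file names are made up for the example).

```php
<?php
// Illustration only: one process plays both roles to show the interleaving.
$dir = sys_get_temp_dir() . '/remove_race_' . bin2hex(random_bytes(4));
mkdir($dir);
file_put_contents($dir . '/old.cache', 'stale');

// Step 1: the "remover" deletes every entry it found in the directory.
foreach (array_diff(scandir($dir), ['.', '..']) as $name) {
    unlink($dir . '/' . $name);
}

// Step 2: "another process" writes a new file before the remover finishes.
file_put_contents($dir . '/new.cache', 'fresh');

// Step 3: the remover calls rmdir(), which fails because the directory
// is no longer empty.
$removed = @rmdir($dir);
$error = error_get_last();
echo $removed
    ? "removed\n"
    : 'rmdir failed: ' . ($error['message'] ?? 'unknown error') . "\n";
```

The slower the deletion in step 1, the wider the window for step 2, which would be consistent with the problem showing up more often on shared filesystems.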
IIRC there is no way to do that without recursively removing the directory tree (except by using a native command). I could imagine, though, that we could mitigate the issue by renaming the directory to be removed first, couldn't we?
@xabbuh good idea, I tested a PoC of that approach and it seems to more or less work. You'd need to rename the directory in-place (as opposed to, e.g., moving it to a tmp directory), since it turns out PHP has problems moving directories across devices, and even the Unix
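A minimal sketch of the rename-in-place idea, written in plain PHP rather than against the Filesystem component. This is not the PoC mentioned above; `removeViaRename()`, `removeTree()`, and the `.removing.` suffix are hypothetical names chosen for the example. The directory is renamed to a sibling in the same parent, so the rename stays on one device, and the renamed copy is then deleted recursively.

```php
<?php
// Hypothetical helper: recursively delete a directory tree, children first.
function removeTree(string $dir): void
{
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );
    foreach ($items as $item) {
        if ($item->isDir() && !$item->isLink()) {
            rmdir($item->getPathname());
        } else {
            unlink($item->getPathname());
        }
    }
    rmdir($dir);
}

// Hypothetical helper: rename the directory to a unique sibling name first,
// so other processes using the original path no longer write into the tree
// being deleted.
function removeViaRename(string $dir): void
{
    if (!is_dir($dir)) {
        return;
    }
    $doomed = rtrim($dir, '/\\') . '.removing.' . bin2hex(random_bytes(4));
    if (!@rename($dir, $doomed)) {
        if (!is_dir($dir)) {
            return; // Another process renamed or removed it first.
        }
        throw new RuntimeException(sprintf('Failed to rename "%s".', $dir));
    }
    removeTree($doomed);
}
```

This narrows the window rather than closing it: a process that already holds an open handle inside the directory, or that resolved the path before the rename, could still write into the renamed tree.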
Hey, thanks for your report! |
Hello? This issue is about to be closed if nobody replies. |
This is an actual issue. Our AppVeyor builds fail randomly because of this.
Yeah definitely still an issue. I think the two most recently proposed solutions are still valid: rename the directory in-situ before deleting it, so other processes are less likely to write to it, or use OS-native commands ( |
@danepowell would you like to give it a try in a PR? |
I'm no longer in a position where I deal with this on a regular basis, so I'm afraid I can't commit much time to testing and validation. But here's at least a proof of concept solution: #39984 |
Symfony version(s) affected: 3.4.11
Description
If you invoke the Filesystem remove() method on a directory multiple times concurrently, it can fail on certain filesystems with the error:
Failed to remove directory "/home/foo/.drush/cache/bar": rmdir(/home/foo/.drush/cache/bar): Directory not empty
How to reproduce
I regularly encounter this using Drush and a Gluster filesystem. I have a script that calls `drush cc drush`, which invokes remove() on a directory that's hosted on Gluster. Here's where Drush calls remove(): https://github.com/drush-ops/drush/blob/0e2abf43ad0d2f398a7afb23772c556a906d840d/src/Cache/FileCache.php#L132
When my script runs multiple times in parallel, it frequently fails with the above error.
I'm not totally sure if this is due to a race condition with Gluster, or if it's a race condition within Symfony that only becomes apparent when disk i/o is heavily throttled (as on a shared filesystem).
I've also seen this happen (albeit somewhat less frequently) on a mounted EC2 filesystem, so it's not just a Gluster problem.
Other folks have reported similar issues running Symfony's internal cache clear on shared filesystems, although I don't know if they were running cache clears concurrently: #2600