S3 Dag Bundle won't remove files removed on remote S3 #64775
Hello. I'm using the S3 Dag Bundle and it works perfectly except for one thing: files removed remotely (in the S3 bucket) are not removed in Airflow.

My source S3 bucket contains some config files in addition to the DAG files. I use these config files to generate DAGs dynamically, so the number of DAGs has to be the same as the number of YAML files.

Is there any way to clean up files that were removed from the remote S3 bucket? Any workaround would help. Ideally the bundle would allow this to be enabled with an option.
Replies: 2 comments
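The current source says stale local files should be deleted. `S3DagBundle.refresh()` calls `S3Hook.sync_to_local_dir(..., delete_stale=True)`, and `sync_to_local_dir()` has a stale-file cleanup path that removes local files not present in the current S3 object list.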
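
To illustrate the idea only (this is a sketch of the general technique, not the provider's actual implementation): a stale-file pass compares the local files against the current remote key list and deletes whatever is missing remotely. The helper names `find_stale_files` and `delete_stale_files` are hypothetical, and `remote_keys` is assumed to hold object keys relative to the bundle's prefix.

```python
from pathlib import Path


def find_stale_files(local_dir: Path, remote_keys: set[str]) -> list[Path]:
    """Return local files whose relative path is not among the remote keys."""
    stale = []
    for path in sorted(local_dir.rglob("*")):
        # Compare using POSIX-style relative paths, which is how S3 keys look.
        if path.is_file() and path.relative_to(local_dir).as_posix() not in remote_keys:
            stale.append(path)
    return stale


def delete_stale_files(local_dir: Path, remote_keys: set[str]) -> list[Path]:
    """Delete stale local files and return the paths that were removed."""
    removed = find_stale_files(local_dir, remote_keys)
    for path in removed:
        path.unlink()
    return removed
```

Called with a local directory and the set of keys currently in the bucket, `delete_stale_files` returns the paths it removed, which makes it easy to log what was cleaned up.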
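So I would first check the exact Airflow and `apache-airflow-providers-amazon` versions in the environment. If your installed provider version predates the `delete_stale=True` behavior in `S3Hook.sync_to_local_dir()`, upgrading the Amazon provider may be the fix.

If you are already on a version with that code, then this is probably worth a small reproducible bug report: bucket prefix, bundle config, provider version, whether the deleted file is under the configured prefix, and the relevant debug logs.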
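
One way to get the exact versions is to query the package metadata from the same Python environment that Airflow runs in. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the distribution name `apache-airflow` is assumed to be how Airflow is installed in your environment.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions worth including in a bug report.
for name in ("apache-airflow", "apache-airflow-providers-amazon"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it on the component that parses the bundle, since different Airflow components can have different provider versions installed.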
Hello @cookesan. Thanks for your help.