Fix: Handle Multi-Part Extensions Like .tar.gz in File Upload Validation #11043
🎉 Snyk checks have passed. No issues have been found so far.
✅ security/snyk check is complete. No issues have been found.
✅ license/snyk check is complete. No issues have been found.
base_name, extension = os.path.splitext(filename.lower())
if base_name.endswith(".tar"):
    extension = ".tar" + extension
elif base_name.endswith(".coffee"):
Hey @moutayam!
Thank you for the PR!
I've never heard of the .coffee.* extension. Could we remove it from the special-case handling here?
Hi @kajarenc ,
Thank you for reviewing the PR! 😊
You’re absolutely right—the .coffee.* case isn’t a common extension pattern. The example I referenced comes from CoffeeScript’s Literate Programming support, where code can be written in Markdown files with extensions like .coffee.md or .litcoffee. However, this is a very niche use case, and I agree it’s not worth adding special handling for it here.
I’m happy to remove the .coffee.* logic to keep the code focused on the most relevant extensions (like .tar.gz).
Thanks again!
Yeah, great, thank you!
I also found an interesting case: when '.gz' is in the allowed types but tar.gz is not, the check fails.
I fixed that as well. Let's wait for another teammate's review.
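The case described above can be reproduced with a small sketch. It assumes the earlier approach from this PR, where `os.path.splitext` is used and a `.tar` suffix on the base name is folded back into the extension. The function name `extension_with_tar_prefix` is hypothetical and is not Streamlit's code.

```python
import os


def extension_with_tar_prefix(filename: str) -> str:
    """Hypothetical helper mirroring the earlier approach in this PR."""
    base_name, extension = os.path.splitext(filename.lower())
    # os.path.splitext only splits off the last suffix,
    # so "archive.tar.gz" yields ("archive.tar", ".gz").
    if base_name.endswith(".tar"):
        extension = ".tar" + extension
    return extension


allowed_types = [".gz"]
extension = extension_with_tar_prefix("archive.tar.gz")
print(extension)                   # .tar.gz
print(extension in allowed_types)  # False, although ".gz" is allowed
```

With an exact membership test, the combined `.tar.gz` extension never matches an allowed `.gz`, which is likely why a suffix-based check was suggested later in the review.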
allowed_as_multipart = False
base_name, extension = os.path.splitext(filename.lower())

# Handle the special case of popular multipart extension tar.gz
# for all other extensions, we just check the last one
if filename.lower().endswith(".tar.gz") and ".tar.gz" in allowed_types:
    allowed_as_multipart = True

if allowed_types and extension not in allowed_types and not allowed_as_multipart:
can we simplify this logic and just use .endswith(...)?
simplified!
if allowed_types and extension not in allowed_types:
base_name, extension = os.path.splitext(filename.lower())

if not any(filename.endswith(allowed_type) for allowed_type in allowed_types):
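As a rough illustration of the simplified check, here is a minimal standalone sketch. The function name `is_allowed_filename` is hypothetical. It also assumes that an empty `allowed_types` means "no restriction" (as the earlier `if allowed_types and ...` condition suggests), that allowed types are already lowercase with a leading dot, and that the filename is lowercased before matching.

```python
from typing import Sequence


def is_allowed_filename(filename: str, allowed_types: Sequence[str]) -> bool:
    """Hypothetical helper: True if the filename ends with an allowed type."""
    if not allowed_types:
        # Assumption: an empty list means no restriction is enforced.
        return True
    lowered = filename.lower()
    # A suffix match covers single and multi-part extensions alike.
    return any(lowered.endswith(allowed_type) for allowed_type in allowed_types)


print(is_allowed_filename("archive.tar.gz", [".tar.gz"]))  # True
print(is_allowed_filename("archive.tar.gz", [".gz"]))      # True
print(is_allowed_filename("notes.txt", [".tar.gz"]))       # False
```

Because the match is on the suffix, a `.tar.gz` file passes whether the allowed list contains `.gz` or `.tar.gz`, with no special case for either.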
@kajarenc Do we validate that the allowed types start with a dot? I am wondering if we should handle this case so that an allowed_type like tar.gz wouldn't allow myfiletar.gz.
@kmcgrady good point!
The answer is yes! The normalize_upload_file_type helper function always normalizes allowed types to start with a ., even if the user did not specify it.
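To make the concern concrete, here is a small sketch of why the leading dot matters for a suffix match. `normalize_type` is a hypothetical stand-in written for this example. It is not the actual `normalize_upload_file_type` implementation, whose details are not shown in this thread.

```python
def normalize_type(file_type: str) -> str:
    """Hypothetical stand-in: lowercase the type and ensure a leading dot."""
    file_type = file_type.lower()
    return file_type if file_type.startswith(".") else "." + file_type


# Without normalization, a bare "tar.gz" also matches "myfiletar.gz".
print("myfiletar.gz".endswith("tar.gz"))                   # True

# With the leading dot enforced, only a real ".tar.gz" suffix matches.
print("myfiletar.gz".endswith(normalize_type("tar.gz")))   # False
print("myfile.tar.gz".endswith(normalize_type("tar.gz")))  # True
```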
Describe your changes
This PR addresses the issue where enforce_filename_restriction fails to handle multi-part extensions like .tar.gz.
Changes made:
.tar.gz).
Fixes #<11041>
GitHub Issue Link (if applicable)
#11041
Testing Plan
The existing unit tests you’ve written comprehensively cover all relevant scenarios, including valid and invalid file extensions, multi-part extensions, and edge cases (e.g., no extension, malformed filenames).
No additional tests are required because the scope of the changes is limited to ensuring proper handling of file extensions, which is fully validated by the provided unit tests (Python).
Contribution License Agreement
By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.