Contributor

@moutayam moutayam commented Apr 6, 2025

Describe your changes

This PR addresses the issue where enforce_filename_restriction fails to handle multi-part extensions like .tar.gz.

Changes made:

  • Updated the function to combine base names and extensions for known multi-part extensions (e.g., .tar.gz).
  • Added test cases to ensure the fix works for both single and multi-part extensions.
  • Validated the solution through manual testing and edge-case scenarios.
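
For context, here is a minimal sketch of the behavior behind the issue: `os.path.splitext` returns only the final extension, so a check built on it never sees `.tar.gz` as a unit. The helper name `last_extension` is hypothetical and is not part of the codebase under review.

```python
import os


def last_extension(filename: str) -> str:
    """Hypothetical helper: return only the final extension, lowercased."""
    # os.path.splitext splits at the last dot in the final path component.
    return os.path.splitext(filename.lower())[1]


# The multi-part extension is cut down to its last component.
tar_gz_result = last_extension("archive.tar.gz")  # ".gz", not ".tar.gz"
png_result = last_extension("photo.PNG")          # ".png"
no_ext_result = last_extension("README")          # ""
```

A membership test such as `extension in allowed_types` therefore cannot match an allowed type of `.tar.gz` without extra handling.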

Fixes #11041

GitHub Issue Link (if applicable)

#11041

Testing Plan

  • The unit tests in this PR cover the relevant scenarios: valid and invalid file extensions, multi-part extensions, and edge cases (e.g., no extension, malformed filenames).

  • No additional tests are needed because the change is limited to file-extension handling, which those unit tests (Python) already exercise.


Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

Contributor

snyk-io bot commented Apr 6, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found.

license/snyk check is complete. No issues have been found.

base_name, extension = os.path.splitext(filename.lower())
if base_name.endswith(".tar"):
    extension = ".tar" + extension
elif base_name.endswith(".coffee"):
Collaborator

Hey @moutayam!

Thank you for the PR!

I have never heard of a .coffee.* extension. Could we remove it from the special-case handling here?

Contributor Author

Hi @kajarenc ,

Thank you for reviewing the PR! 😊

You’re absolutely right: the .coffee.* case isn’t a common extension pattern. The example I referenced comes from CoffeeScript’s Literate Programming support, where code can be written in Markdown files with extensions like .coffee.md or .litcoffee. However, this is a very niche use case, and I agree it’s not worth adding special handling for it here.

I’m happy to remove the .coffee.* logic to keep the code focused on the most relevant extensions (like .tar.gz).
Thanks again!

Collaborator

Yeah, great, thank you!

I also found an interesting case: when '.gz' is in the allowed types but '.tar.gz' is not, the check fails.

I fixed that as well. Let me wait for another teammate's review.
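
One reading of the case described above, as a hedged sketch: if the check first glues `.tar` onto the extension (as in the earlier snippet) and then tests membership, a `.tar.gz` file is rejected even though `.gz` is allowed. The function name `combined_extension` is hypothetical and only mirrors the snippet under review.

```python
import os


def combined_extension(filename: str) -> str:
    """Hypothetical stand-in for the earlier special-case logic."""
    base_name, extension = os.path.splitext(filename.lower())
    if base_name.endswith(".tar"):
        # Treat ".tar" + last extension as one multi-part extension.
        extension = ".tar" + extension
    return extension


allowed_types = [".gz"]  # ".tar.gz" is deliberately absent
ext = combined_extension("backup.tar.gz")

# Membership test on the combined extension rejects the file,
# although its last extension (".gz") is in allowed_types.
rejected = ext not in allowed_types
```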

@kajarenc kajarenc added security-assessment-completed Security assessment has been completed for PR impact:users PR changes affect end users change:bugfix PR contains bug fix implementation labels Apr 14, 2025
Comment on lines 62 to 70
allowed_as_multipart = False
base_name, extension = os.path.splitext(filename.lower())

# Handle the special case of popular multipart extension tar.gz
# for all other extensions, we just check the last one
if filename.lower().endswith(".tar.gz") and ".tar.gz" in allowed_types:
    allowed_as_multipart = True

if allowed_types and extension not in allowed_types and not allowed_as_multipart:
Contributor

Can we simplify this logic and just use .endswith(...)?

Collaborator

simplified!

if allowed_types and extension not in allowed_types:
base_name, extension = os.path.splitext(filename.lower())

if not any(filename.endswith(allowed_type) for allowed_type in allowed_types):
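
A minimal sketch of the suffix-based idea in the lines above, for illustration only. The helper `is_allowed` and its return-a-bool shape are assumptions and not the merged implementation; lowercasing the filename before comparing is also an assumption made here.

```python
from typing import Sequence


def is_allowed(filename: str, allowed_types: Sequence[str]) -> bool:
    """Hypothetical helper: True if filename ends with any allowed type."""
    if not allowed_types:
        # No restriction configured: accept everything.
        return True
    lowered = filename.lower()
    # Suffix matching handles single and multi-part extensions uniformly.
    return any(lowered.endswith(allowed_type) for allowed_type in allowed_types)
```

With this shape, `is_allowed("backup.tar.gz", [".tar.gz"])` and `is_allowed("backup.tar.gz", [".gz"])` both pass, while `is_allowed("notes.txt", [".gz"])` does not, so no special case for `.tar.gz` is needed.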
Collaborator

@kajarenc Do we validate that the allowed types start with a dot? I am wondering if we should handle this case so that an allowed_type like tar.gz wouldn't allow myfiletar.gz.

Collaborator

@kajarenc kajarenc Apr 15, 2025

@kmcgrady good point!

The answer is yes! The normalize_upload_file_type helper function always normalizes allowed types to start with a `.`, even if the user did not specify one.
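
To illustrate why the leading dot matters for a suffix check, here is a small sketch. `ensure_leading_dot` is a hypothetical stand-in written for this example; it is not the real `normalize_upload_file_type` helper, whose exact signature and behavior are not shown in this thread.

```python
def ensure_leading_dot(file_type: str) -> str:
    """Hypothetical stand-in: prefix the type with '.' if it is missing."""
    return file_type if file_type.startswith(".") else f".{file_type}"


raw_type = "tar.gz"
normalized_type = ensure_leading_dot(raw_type)

# Without the dot, a suffix check wrongly accepts "myfiletar.gz".
matches_raw = "myfiletar.gz".endswith(raw_type)
# With the dot, the same filename is rejected and a real archive is accepted.
matches_normalized = "myfiletar.gz".endswith(normalized_type)
matches_archive = "myfile.tar.gz".endswith(normalized_type)
```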

@kajarenc kajarenc merged commit ab9c338 into streamlit:develop Apr 16, 2025
33 checks passed