Merged
bonus: reorder hffs args to improve caching
lhoestq committed Oct 15, 2025
commit ad4d8479e4f16b2ce07a88f800e711d4e035cc78
2 changes: 1 addition & 1 deletion src/datasets/download/download_config.py
@@ -75,7 +75,7 @@ def copy(self) -> "DownloadConfig":
     def __setattr__(self, name, value):
         if name == "token" and getattr(self, "storage_options", None) is not None:
             if "hf" not in self.storage_options:
-                self.storage_options["hf"] = {"token": value, "endpoint": config.HF_ENDPOINT}
+                self.storage_options["hf"] = {"endpoint": config.HF_ENDPOINT, "token": value}
Member

Did you need to make this change? It seems weird, since dicts aren't ordered in Python.

Member Author

Because the fsspec cache, which maps a filesystem's arguments to the cached instance, is sensitive to the order of those arguments ^^'

With these changes, every instance of HfFileSystem in datasets uses the same order.
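
To illustrate the point, here is a minimal sketch of how a cache key can depend on argument order even though the dicts compare equal. It assumes the key is derived by hashing the string form of the keyword arguments (to my understanding this is roughly what fsspec's instance cache does, but treat that as an assumption). `cache_token` and the placeholder values are hypothetical and are not part of datasets or fsspec.

```python
from hashlib import md5


def cache_token(**kwargs):
    """Hypothetical stand-in: build a cache key from the string form of the kwargs."""
    return md5(str(kwargs).encode()).hexdigest()


# Same keys and values, different insertion order (placeholder values).
a = {"token": "hf_xxx", "endpoint": "https://huggingface.co"}
b = {"endpoint": "https://huggingface.co", "token": "hf_xxx"}

# Dict equality ignores insertion order.
same_content = a == b

# str() follows insertion order, so the hashed keys differ.
same_token = cache_token(**a) == cache_token(**b)
```

Under that assumption, two call sites passing the same options in a different order would each get their own cached filesystem instance, which is what a fixed order avoids.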

             elif getattr(self.storage_options["hf"], "token", None) is None:
                 self.storage_options["hf"]["token"] = value
         super().__setattr__(name, value)
2 changes: 1 addition & 1 deletion src/datasets/utils/file_utils.py
@@ -895,8 +895,8 @@ def _prepare_single_hop_path_and_storage_options(
         storage_options["headers"] = {"Accept-Encoding": "identity", **headers}
     elif protocol == "hf":
         storage_options = {
-            "token": token,
             "endpoint": config.HF_ENDPOINT,
+            "token": token,
Member

Same here, weird that you need this :S

             **storage_options,
         }
     if storage_options: