MAINT: improve organization of dataset fetch functions (refactoring) #785
Separates each dataset fetching function into its own file for better organization and maintainability.
This is almost ready to be reviewed. I just have a question: Should I update the blog post about Datasets and Seed Prompts since, after the changes I've made in this PR, it will no longer be up to date? It's about the following paragraph specifically: 2025_02_11.md#loading-datasets-with-seed-prompts. I'll absolutely update the User guide for Datasets. Just wondering whether I should also modify the blog post.
Awesome! We usually don't update blog posts substantially, but this is easy enough of a fix that I'm inclined to make the change. CC @eugeniavkim I would replace
with
romanlutz
left a comment
Thank you! This is perfect!
I see the checks are failing. I've run I'll try to fix the problem tomorrow
There might be a naming collision since fetch_examples is both the file and function name. But that's just a guess.
That's right, renaming did the trick! Thank you so much!
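The thread does not say which check failed, so the following is only a minimal sketch of how this kind of collision can arise in Python. The package name demo_datasets and the stub function body are hypothetical and are not taken from the PR; the sketch only shows that re-exporting a function from a module of the same name rebinds the package attribute from the submodule to the function.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk; every name here is hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_datasets"
pkg.mkdir()

# The module file and the function inside it share the name `fetch_examples`.
(pkg / "fetch_examples.py").write_text(
    "def fetch_examples():\n"
    "    return ['example prompt']\n"
)

# Re-exporting the function rebinds the package attribute `fetch_examples`
# from the submodule to the function.
(pkg / "__init__.py").write_text(
    "from demo_datasets.fetch_examples import fetch_examples\n"
)

sys.path.insert(0, str(root))
demo_datasets = importlib.import_module("demo_datasets")

# The package attribute is now the function, not the module.
attr_is_function = callable(demo_datasets.fetch_examples)

# The submodule itself is still registered in sys.modules.
submodule = sys.modules["demo_datasets.fetch_examples"]

# Walking the dotted path by attribute access no longer reaches the module,
# so looking up the function "inside the module" this way fails.
try:
    demo_datasets.fetch_examples.fetch_examples
    nested_lookup_ok = True
except AttributeError:
    nested_lookup_ok = False
```

Giving the file a name that differs from the function it defines avoids the rebinding, which is consistent with the fix reported above.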
Fantastic @paulinek13 !!! Thanks once again for a great contribution.
Description
Related issue: #775
This PR refactors the dataset fetching functions to improve their organization and maintainability as the codebase grows and new datasets are introduced.
The main changes:
- Split the functions from fetch_example_datasets.py into separate files (similar to how converters are handled)
- […] (tests/unit/datasets)
- […] (__init__.py) and docs (api.rst)
Other modifications:
- […] fetch_babelscape_alert_dataset and fetch_librAI_do_not_answer_dataset
- Renamed fetch_example_datasets.py to fetch_examples.py
- […] .pre-commit-config.yaml
- […] /doc files: doc/code/datasets/0_dataset.md, doc/code/datasets/2_fetch_dataset.ipynb, doc/code/datasets/2_fetch_dataset.py
Close #775
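As an illustration of the one-function-per-file layout described above, here is a minimal sketch. It assumes a hypothetical package split_demo with hypothetical module file names and stub function bodies; only the two function names come from the PR description, and nothing here reflects the project's real file layout or fetch logic.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package; the package and module file names are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "split_demo"
pkg.mkdir()

# One module per fetch function. The bodies are stubs, not real fetchers.
(pkg / "babelscape_alert.py").write_text(
    "def fetch_babelscape_alert_dataset():\n"
    "    return ['stub ALERT prompt']\n"
)
(pkg / "librai_do_not_answer.py").write_text(
    "def fetch_librAI_do_not_answer_dataset():\n"
    "    return ['stub Do-Not-Answer prompt']\n"
)

# The package __init__ re-exports each function so callers keep one import path.
(pkg / "__init__.py").write_text(
    "from split_demo.babelscape_alert import fetch_babelscape_alert_dataset\n"
    "from split_demo.librai_do_not_answer import fetch_librAI_do_not_answer_dataset\n"
    "\n"
    "__all__ = [\n"
    "    'fetch_babelscape_alert_dataset',\n"
    "    'fetch_librAI_do_not_answer_dataset',\n"
    "]\n"
)

sys.path.insert(0, str(root))
split_demo = importlib.import_module("split_demo")

# Callers import from the package and never need to know the file layout.
prompts = split_demo.fetch_babelscape_alert_dataset()
```

With this arrangement, adding a dataset means adding one file and one re-export line, and the module names differ from the function names they contain.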