feat: Adds update_dataset_from_dir #430

Merged

nicolastomeo
Contributor

No description provided.

This pull request has been linked to Shortcut Story #833032: Ingest into dataset from directory.

@nicolastomeo nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch 4 times, most recently from 8916a0f to 398a17d Compare February 5, 2024 17:44
Contributor

@ntamas92 ntamas92 left a comment

Nice work!! Small comments regarding the usage of the functions.

@@ -1285,15 +1296,88 @@ def create_dataset_from_dir(

     if len(items) == 0:
         print(f"Did not find any items in {dirname}")
-        return None
+        return existing_dataset
Contributor

I'm wondering if it would actually be better to create the dataset in this case as well. Even though there were no items in the directory, we wanted to create the dataset in the first place.

Contributor Author

I guess that's a question for @jean-lucas, since the original behaviour is not to create a dataset if the directory is empty or nothing was found.

Contributor

That's a good point. I think the dataset should still be created; the original behaviour was misleading.

Contributor Author

Addressed
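
The behaviour agreed on in this thread could be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not the SDK's actual code: only the name `create_dataset_from_dir` and the printed message come from the diff, while the `Dataset` dataclass, the `find_images` helper, the `name` parameter, and the list of image extensions are hypothetical stand-ins.

```python
import os
from dataclasses import dataclass, field
from typing import List

# Assumed set of extensions, for illustration only.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


@dataclass
class Dataset:
    """Hypothetical stand-in for the SDK's dataset object."""
    name: str
    items: List[str] = field(default_factory=list)


def find_images(dirname: str) -> List[str]:
    """Hypothetical helper: list image files directly under dirname."""
    return sorted(
        os.path.join(dirname, filename)
        for filename in os.listdir(dirname)
        if filename.lower().endswith(IMAGE_EXTENSIONS)
    )


def create_dataset_from_dir(dirname: str, name: str) -> Dataset:
    # Create the dataset up front, so the caller always gets one back.
    dataset = Dataset(name=name)
    items = find_images(dirname)
    if len(items) == 0:
        print(f"Did not find any items in {dirname}")
        # The original behaviour returned here without creating a dataset.
        return dataset
    dataset.items.extend(items)
    return dataset
```

With this shape, an empty directory yields an empty dataset rather than `None`, so callers do not need a separate `None` check before using the result.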

skip_size_warning=skip_size_warning,
)

def update_dataset_from_dir(
Contributor

Maybe this one would be better placed on the dataset class itself. I think it's a better workflow to create or fetch a dataset explicitly and then ingest items into it.

Contributor

Agreed, I think that's better encapsulation

Contributor

I think that's a good suggestion. It would avoid the user having to copy the dataset id.

I think we can even rename it to:

dataset.add_images_from_dir(dirname)

Contributor Author

Addressed
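
The workflow suggested in this thread (fetch or create a dataset explicitly, then ingest into it) might look something like the sketch below. Only the method name `add_images_from_dir(dirname)` comes from the review comment; the `Dataset` class, its constructor, the returned count, the recursive walk, and the extension filter are assumptions made for illustration and may differ from what was merged.

```python
import os
from typing import List


class Dataset:
    """Hypothetical stand-in for the SDK's dataset class."""

    def __init__(self, dataset_id: str) -> None:
        self.id = dataset_id
        self.items: List[str] = []

    def add_images_from_dir(self, dirname: str) -> int:
        """Add image files found under dirname; return how many were added."""
        added = 0
        for root, _dirs, files in os.walk(dirname):
            for filename in sorted(files):
                # Assumed extension filter, for illustration only.
                if filename.lower().endswith((".jpg", ".jpeg", ".png")):
                    self.items.append(os.path.join(root, filename))
                    added += 1
        return added
```

Because the method lives on the dataset object, a call such as `dataset.add_images_from_dir("path/to/images")` needs no dataset ID argument, which is the convenience the reviewers point out.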

Contributor

@gatli gatli left a comment

Looking good - can you also add a method to the CLI? See cli/* - this is a perfect use case for a CLI tool. LMK if you need an intro into click and rich.

Contributor

@jean-lucas jean-lucas left a comment

Overall LGTM; I tried it out and it worked well.
I agree with @ntamas92's comments.

Before merging though, let's update the changelog and version.

@nicolastomeo nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch from db6b48a to 54e7972 Compare February 6, 2024 11:04
@nicolastomeo nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch from 54e7972 to 5c9519e Compare February 6, 2024 11:48
@nicolastomeo
Contributor Author

> Looking good - can you also add a method to the CLI? See cli/* - this is a perfect use case for a CLI tool. LMK if you need an intro into click and rich.

Yes, but I think it's better to do it in a different PR.
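
For reference, a command of the kind requested could look roughly like this. The review mentions click and rich for the project's CLI; this sketch deliberately swaps in the standard library's `argparse` so it runs without extra dependencies. The program name, the two positional arguments, and the `ingest_dir` placeholder are all hypothetical and are not taken from the repository.

```python
import argparse
import os
from typing import List, Optional


def ingest_dir(dataset_id: str, dirname: str) -> int:
    """Hypothetical placeholder for the real ingestion call; it only counts files."""
    return sum(len(files) for _root, _dirs, files in os.walk(dirname))


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="add-images-from-dir")
    parser.add_argument("dataset_id", help="ID of an existing dataset")
    parser.add_argument("dirname", help="directory to ingest from")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    if not os.path.isdir(args.dirname):
        print(f"{args.dirname} is not a directory")
        return 1
    count = ingest_dir(args.dataset_id, args.dirname)
    print(f"Found {count} files in {args.dirname} for dataset {args.dataset_id}")
    return 0
```

Returning an exit code from `main` rather than calling `sys.exit` inside it keeps the command easy to exercise from tests.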

@nicolastomeo nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch from 6731c6e to ef147e0 Compare February 6, 2024 15:07
Contributor

@jean-lucas jean-lucas left a comment

LGTM 👍

@nicolastomeo nicolastomeo merged commit 05435fc into master Feb 6, 2024
@nicolastomeo nicolastomeo deleted the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch February 6, 2024 15:41
4 participants