feat: Adds update_dataset_from_dir #430

nicolastomeo · 2024-02-05T17:28:15Z

No description provided.

shortcut-integration · 2024-02-05T17:28:19Z

This pull request has been linked to Shortcut Story #833032: Ingest into dataset from directory.

ntamas92

Nice work!! Small comments regarding the usage of the functions.

ntamas92 · 2024-02-06T09:42:25Z

nucleus/__init__.py

@@ -1285,15 +1296,88 @@ def create_dataset_from_dir(

        if len(items) == 0:
            print(f"Did not find any items in {dirname}")
-            return None
+            return existing_dataset


I'm wondering whether it would actually be better to create the dataset in this case as well. Even though there were no items in the directory, we wanted to create the dataset in the first place.

I guess that's a question for @jean-lucas, since the original behaviour is not to create a dataset if the directory is empty or nothing was found.

That's a good point; I think the dataset should still be created. The original behaviour was misleading.
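
To make the agreed behaviour concrete, here is a minimal sketch in which the dataset is created before the directory is scanned, so an empty directory still yields a dataset rather than `None`. `FakeDataset`, `find_items`, and the extension list are hypothetical stand-ins for illustration, not the actual nucleus implementation.

```python
import os
from dataclasses import dataclass, field
from typing import List

# Hypothetical extension list; the real client may accept a different set.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a dataset object."""
    name: str
    items: List[str] = field(default_factory=list)


def find_items(dirname: str) -> List[str]:
    """Collect image paths under dirname, sorted so results are deterministic."""
    found = []
    for root, _dirs, files in os.walk(dirname):
        for filename in files:
            if filename.lower().endswith(IMAGE_EXTENSIONS):
                found.append(os.path.join(root, filename))
    return sorted(found)


def create_dataset_from_dir(dirname: str, name: str) -> FakeDataset:
    # The dataset is created up front, even if nothing is found afterwards.
    dataset = FakeDataset(name=name)
    items = find_items(dirname)
    if len(items) > 0:
        dataset.items.extend(items)
    else:
        print(f"Did not find any items in {dirname}")
    return dataset
```

With this shape, callers always get a dataset back and can check `dataset.items` to see whether anything was ingested.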

ntamas92 · 2024-02-06T09:48:13Z

nucleus/__init__.py

+            skip_size_warning=skip_size_warning,
+        )
+
+    def update_dataset_from_dir(


Maybe this one would be better placed on the dataset class itself. I think it's a better workflow to create or fetch a dataset explicitly and then ingest items into it.

Agreed, I think that's better encapsulation

I think that's a good suggestion. It would avoid the user having to copy the dataset id.

I think we can even rename it to:

dataset.add_images_from_dir(dirname)
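
A rough sketch of what the suggested method could look like when it lives on the dataset object. The `Dataset` class here is a hypothetical stand-in, not the nucleus class. The method name follows the later commit title "Refactor using dataset.add_items_from_dir", and the `allowed_extensions` parameter is an assumption for illustration.

```python
import os
from typing import List, Sequence


class Dataset:
    """Hypothetical stand-in for the dataset class discussed above."""

    def __init__(self, dataset_id: str):
        self.id = dataset_id
        self.items: List[str] = []

    def add_items_from_dir(
        self,
        dirname: str,
        allowed_extensions: Sequence[str] = (".jpg", ".jpeg", ".png"),
    ) -> int:
        """Append every matching file under dirname and return how many were added."""
        added = 0
        for root, _dirs, files in os.walk(dirname):
            for filename in sorted(files):
                if filename.lower().endswith(tuple(allowed_extensions)):
                    self.items.append(os.path.join(root, filename))
                    added += 1
        return added
```

Usage then reads as "fetch the dataset, then ingest into it", with no dataset id to copy around: `dataset = Dataset("ds_hypothetical")` followed by `dataset.add_items_from_dir("./images")`.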

gatli

Looking good - can you also add a method to the CLI? See cli/* - this is a perfect use case for a CLI tool. LMK if you need an intro into click and rich.

gatli · 2024-02-06T10:02:10Z

nucleus/__init__.py

+            skip_size_warning=skip_size_warning,
+        )
+
+    def update_dataset_from_dir(


Agreed, I think that's better encapsulation

jean-lucas

Overall LGTM; tried it out and it worked well.
I agree with @ntamas92's comments.

Before merging, though, let's update the changelog and version.

nicolastomeo · 2024-02-06T11:49:10Z

> Looking good - can you also add a method to the CLI? See cli/* - this is a perfect use case for a CLI tool. LMK if you need an intro into click and rich.

Yes, but I think it's better to do it in a different PR.
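
For whoever picks up that follow-up PR, a minimal sketch of the command's shape. The review suggests click and rich; this example deliberately uses the standard library's `argparse` instead so it runs without extra dependencies. The program name `add-from-dir`, the argument names, and the dry-run behaviour are all assumptions, and the real ingestion call is left out.

```python
import argparse
import os
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical command name and arguments, for illustration only.
    parser = argparse.ArgumentParser(
        prog="add-from-dir",
        description="Add items from a local directory to an existing dataset.",
    )
    parser.add_argument("dataset_id", help="ID of the dataset to ingest into")
    parser.add_argument("dirname", help="Directory to scan for items")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    if not os.path.isdir(args.dirname):
        print(f"{args.dirname} is not a directory")
        return 1
    # A real command would fetch the dataset and ingest the directory here;
    # this sketch only counts files and reports what it would do.
    n_files = sum(len(files) for _root, _dirs, files in os.walk(args.dirname))
    print(f"Would add {n_files} file(s) from {args.dirname} to dataset {args.dataset_id}")
    return 0
```

Returning an exit code from `main` keeps the command easy to test without spawning a subprocess.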

nucleus/dataset.py

jean-lucas

LGTM 👍

nicolastomeo requested review from jean-lucas, ntamas92 and gatli February 5, 2024 17:28

nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch 4 times, most recently from 8916a0f to 398a17d Compare February 5, 2024 17:44

ntamas92 reviewed Feb 6, 2024

gatli reviewed Feb 6, 2024

jean-lucas approved these changes Feb 6, 2024

nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch from db6b48a to 54e7972 Compare February 6, 2024 11:04

nicolastomeo added 4 commits February 6, 2024 12:48

feat: Adds update_dataset_from_dir

7ce2914

_create_or_update_dataset_from_dir always create dataset

5c6adde

update test

bbb5da4

Refactor using dataset.add_items_from_dir

5c9519e

nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch from 54e7972 to 5c9519e Compare February 6, 2024 11:48

nicolastomeo requested review from ntamas92, gatli and jean-lucas February 6, 2024 15:05

Updates CHANGELOG and bump version

ef147e0

nicolastomeo force-pushed the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch from 6731c6e to ef147e0 Compare February 6, 2024 15:07

jean-lucas reviewed Feb 6, 2024

nucleus/dataset.py (outdated, resolved)

jean-lucas approved these changes Feb 6, 2024

nicolastomeo added 2 commits February 6, 2024 16:33

Check for items bigger than zero

f0898ca

Merge branch 'master' into nicolastomeo/sc-833032/ingest-into-dataset…

617f615

…-from-directory

nicolastomeo merged commit 05435fc into master Feb 6, 2024

nicolastomeo deleted the nicolastomeo/sc-833032/ingest-into-dataset-from-directory branch February 6, 2024 15:41
