Create a torch dataset out of a selection of samples #5

@andandandand

Goal

Enable selection of samples in the HyperView interface, and provide an option to export the selected samples as a Torch-compatible dataset. This will support both prototyping and downstream ML workflows.

Requirements

  • User should be able to select samples from the data view.
  • Provide export functionality (button/menu) for selected samples.
  • Exported dataset should be in a format readily usable by PyTorch (torch.utils.data.Dataset).
  • Document the data schema and any requirements for serialization (e.g., images, labels, metadata).
  • Ensure compatibility with common ML data loading operations (e.g., batching, transforms).
  • Example usage should be part of the documentation.
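As a starting point for the schema discussion, here is a minimal sketch of a map-style dataset over an exported folder. Everything about the layout is an assumption made for illustration, not an existing HyperView format: the `manifest.json` file, its `samples` list with `id`, `file`, `label`, and `metadata` keys, and the class name `ExportedSelection` are all hypothetical. Each sample is assumed to be stored as one tensor written with `torch.save`.

```python
import json
from pathlib import Path

import torch
from torch.utils.data import Dataset


class ExportedSelection(Dataset):
    """Map-style dataset over a hypothetical export folder.

    Assumed layout (illustrative only, not a HyperView format):
        export_dir/manifest.json -> {"samples": [{"id", "file", "label", "metadata"}, ...]}
        export_dir/<file>        -> one tensor per sample, written with torch.save
    """

    def __init__(self, export_dir, transform=None):
        self.root = Path(export_dir)
        with open(self.root / "manifest.json") as f:
            self.samples = json.load(f)["samples"]
        # Optional callable applied to each sample tensor on access.
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        record = self.samples[index]
        data = torch.load(self.root / record["file"])
        if self.transform is not None:
            data = self.transform(data)
        # Integer labels are turned into a tensor by DataLoader's default collate.
        return data, record["label"]
```

Because the class only implements `__len__` and `__getitem__`, it can be passed directly to `torch.utils.data.DataLoader` for batching, and the `transform` argument covers per-sample preprocessing. Image exports would need a decoding step in `__getitem__` instead of `torch.load`.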

Suggested Implementation Steps

  1. Add a selection mechanism to the sample view (e.g., checkboxes, multi-select).
  2. Implement an export option in the UI for the selected samples.
  3. On export, package the selected samples into a Torch-compatible dataset object, and serialize it (e.g., as .pt or a folder structure).
  4. Provide sample code (Python) for loading/exporting the dataset and for a minimal training loop using the exported dataset.
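Steps 3 and 4 could look roughly like the sketch below, which takes the single-file `.pt` route. It assumes the selected samples are already available as a feature tensor and an integer label tensor, and that the UI hands over the selected row indices. The function names `export_selection`, `load_selection`, and `train` are hypothetical, and the linear classifier is only a stand-in model.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def export_selection(features, labels, selected_ids, path):
    """Save only the selected rows to a single .pt file."""
    index = torch.tensor(selected_ids, dtype=torch.long)
    payload = {"features": features[index], "labels": labels[index], "ids": index}
    torch.save(payload, path)


def load_selection(path):
    """Rebuild a TensorDataset from a file written by export_selection."""
    payload = torch.load(path)
    return TensorDataset(payload["features"], payload["labels"])


def train(dataset, epochs=5, lr=0.1):
    """Minimal training loop: linear classifier, SGD, cross-entropy loss."""
    num_features = dataset.tensors[0].shape[1]
    num_classes = int(dataset.tensors[1].max()) + 1
    model = nn.Linear(num_features, num_classes)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(dataset, batch_size=4, shuffle=True)
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```

A single `.pt` file is convenient for small selections that fit in memory. A folder structure with one file per sample is likely the better fit for large image selections, at the cost of needing a custom `Dataset` class to read it.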

Documentation

  • Add documentation to the repo on the selection/export workflow.
  • Include code snippets for using the exported dataset in PyTorch.

Acceptance Criteria

  • Users can select samples and export them as a Torch dataset.
  • Exported dataset loads in PyTorch with its labels and metadata intact.
  • Documentation and example code are available.
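The "loadable with correct metadata" criterion could be checked with a round-trip test along these lines. This is a sketch under assumptions: the payload keys `features`, `labels`, and `metadata` are hypothetical, and metadata is assumed to be a list of plain dictionaries holding strings and numbers.

```python
import os
import tempfile

import torch


def round_trip_ok(features, labels, metadata):
    """Save a selection to .pt, reload it, and compare every field."""
    payload = {"features": features, "labels": labels, "metadata": metadata}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "selection.pt")
        torch.save(payload, path)
        loaded = torch.load(path)
    return (
        torch.equal(loaded["features"], features)
        and torch.equal(loaded["labels"], labels)
        and loaded["metadata"] == metadata
    )
```

Keeping metadata to plain Python containers and primitives means the file does not depend on any custom classes being importable at load time.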

Metadata

  • Assignees: No one assigned
  • Labels: enhancement (New feature or request)
  • Type: No type
  • Project status: Backlog
  • Milestone: No milestone
  • Relationships: None yet
  • Development: No branches or pull requests