GeDa is a Python package that helps you to Get the Data for your project easily.
Install with pip:

    pip install geda

Use a data provider class directly:

    from geda.data_providers.voc import VOCSemanticSegmentationDataProvider

    root = "<directory>/<to>/<store>/<data>"  # e.g. "data/VOC"
    dataprovider = VOCSemanticSegmentationDataProvider(root)
    dataprovider.get_data()

Or use the get_data function:

    from geda import get_data

    root = "<directory>/<to>/<store>/<data>"  # e.g. "data/VOC"
    dataprovider = get_data(name="VOC_SemanticSegmentation", root=root)
    dataprovider.get_data()

The get_data function currently supports the following names: MNIST, DUTS, NYUDv2, VOC_InstanceSegmentation, VOC_SemanticSegmentation, VOC_PersonPartSegmentation, VOC_Main, VOC_Action, VOC_Layout, MPII, COCO_Keypoints
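Because get_data selects a dataset by its string name, a misspelled name is only noticed when the call is made. The sketch below shows one way to normalise a user-supplied name against the supported names before calling get_data. The helper resolve_name and the constant SUPPORTED_NAMES are hypothetical and not part of GeDa; only the names themselves are taken from the list above, and the sketch assumes GeDa expects them spelled exactly as listed.

```python
# Hypothetical helper, not part of GeDa: only the names come from the README.
SUPPORTED_NAMES = [
    "MNIST", "DUTS", "NYUDv2",
    "VOC_InstanceSegmentation", "VOC_SemanticSegmentation",
    "VOC_PersonPartSegmentation", "VOC_Main", "VOC_Action", "VOC_Layout",
    "MPII", "COCO_Keypoints",
]

# Lookup table from lowercase spelling to the canonical name.
_BY_LOWERCASE = {name.lower(): name for name in SUPPORTED_NAMES}


def resolve_name(name: str) -> str:
    """Return the canonical spelling of a dataset name, ignoring case."""
    try:
        return _BY_LOWERCASE[name.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown dataset {name!r}; expected one of {SUPPORTED_NAMES}"
        ) from None
```

The resolved name can then be passed on, for example get_data(name=resolve_name("voc_semanticsegmentation"), root=root).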
Calling dataprovider.get_data() runs the data through the following pipeline:
- Download the data from the source (specified by the _URLS variable in each module)
- Unzip the files if needed (when tar, zip or gz files were downloaded)
- Move the files to the <root>/raw directory
- Find the split ids (file basenames or indices, depending on the dataset)
- Arrange files, i.e. move (or copy) files from the <root>/raw directory to task-specific directories
- [Optional] Create labels in a specific format (e.g. YOLO)
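The pipeline steps above can be illustrated with a minimal, standard-library-only sketch. This is not GeDa's implementation: run_pipeline, make_toy_archive, the images/<split> target layout and the toy archive contents are all assumptions made for the example, and the optional label-creation step is left out. A local file:// URL stands in for a real download source so the sketch runs offline.

```python
import shutil
import tarfile
import tempfile
import urllib.request
from pathlib import Path


def run_pipeline(url: str, root: Path, splits: dict) -> dict:
    """Download -> unpack -> move to raw -> find split ids -> arrange files."""
    root.mkdir(parents=True, exist_ok=True)
    raw = root / "raw"
    raw.mkdir(exist_ok=True)

    # 1. Download the archive (in GeDa the URL would come from _URLS).
    archive = root / Path(url).name
    urllib.request.urlretrieve(url, str(archive))

    # 2. Unpack into a scratch directory (tar archives only in this sketch).
    unpacked = root / "_unpacked"
    with tarfile.open(archive) as tar:
        tar.extractall(unpacked)

    # 3. Move the unpacked entries into <root>/raw.
    for entry in unpacked.iterdir():
        shutil.move(str(entry), str(raw / entry.name))
    unpacked.rmdir()

    # 4. Find the split ids: here, one file basename per line in a text file.
    ids = {
        split: (raw / listing).read_text().split()
        for split, listing in splits.items()
    }

    # 5. Arrange: copy each image into a task-specific <root>/images/<split>.
    for split, names in ids.items():
        target = root / "images" / split
        target.mkdir(parents=True, exist_ok=True)
        for name in names:
            shutil.copy(raw / "JPEGImages" / f"{name}.jpg", target)
    return ids


def make_toy_archive(workdir: Path) -> str:
    """Build a tiny tar file with fake images and split lists; return its file:// URL."""
    src = workdir / "src"
    (src / "JPEGImages").mkdir(parents=True)
    (src / "ImageSets").mkdir()
    for name in ("a", "b", "c"):
        (src / "JPEGImages" / f"{name}.jpg").write_bytes(b"fake image")
    (src / "ImageSets" / "train.txt").write_text("a\nb\n")
    (src / "ImageSets" / "val.txt").write_text("c\n")
    archive = workdir / "toy.tar"
    with tarfile.open(archive, "w") as tar:
        for entry in src.iterdir():
            tar.add(entry, arcname=entry.name)
    return archive.as_uri()


# Run the sketch end to end inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    url = make_toy_archive(workdir)
    splits = {"train": "ImageSets/train.txt", "val": "ImageSets/val.txt"}
    print(run_pipeline(url, workdir / "data" / "toy", splits))
```

Copying rather than moving in the last step keeps <root>/raw intact, which matches the "move (or copy)" wording of the arrange step.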
Resulting directory structure after calling get_data(name="VOC_SemanticSegmentation", root="data/VOC"):
.
└── data
    └── VOC
        ├── raw
        │   ├── Annotations
        │   ├── ImageSets
        │   ├── JPEGImages
        │   ├── SegmentationClass
        │   └── SegmentationObject
        ├── SegmentationClass
        │   ├── annots
        │   ├── images
        │   ├── labels
        │   └── masks
        └── trainval_2012.tar
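To confirm that a download produced the layout shown in the tree above, a small check along the following lines may help. The function missing_dirs and the constant EXPECTED_DIRS are hypothetical names introduced for this sketch and are not part of GeDa; the directory names are copied from the tree.

```python
from pathlib import Path

# Sub-directories taken from the tree above, relative to the dataset root.
EXPECTED_DIRS = [
    "raw/Annotations", "raw/ImageSets", "raw/JPEGImages",
    "raw/SegmentationClass", "raw/SegmentationObject",
    "SegmentationClass/annots", "SegmentationClass/images",
    "SegmentationClass/labels", "SegmentationClass/masks",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

Calling missing_dirs("data/VOC") returns an empty list when every directory from the tree is present.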
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.