Open
Hi @ximinng,
Thanks for releasing the SVGX datasets. A few small questions about the datasets:
According to the README:
Available Datasets on Hugging Face:
- xingxm/SVGX-Core-250k: Core pretraining data (250k examples).
- xingxm/SVGX-SFT-1M: Supervised fine-tuning data (1M examples).
However, based on how it is used in the code, I presume SVGX-Core-250k is the dataset actually needed for supervised fine-tuning. Could you clarify this discrepancy?
Also, could you explain what exactly SVGX_SFT_GEN_51k, SVGX_SFT_GEN_basic, and SVGX_SFT_UN_25k in SVGX-SFT-1M are? I've noticed that many SVGs in the SVGX-SFT-1M dataset appear to be broken.
e.g.
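A minimal sketch of how broken SVGs could be counted, assuming "broken" means the string does not parse as XML with an `svg` root element. The helper name `is_well_formed_svg` and the sample strings are hypothetical and are not taken from the dataset or the repository code.

```python
import xml.etree.ElementTree as ET

# Namespace prefix ElementTree attaches to tags in namespaced SVG documents.
SVG_NS = "{http://www.w3.org/2000/svg}"


def is_well_formed_svg(svg_text: str) -> bool:
    """Return True if svg_text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        # Truncated or otherwise malformed markup ends up here.
        return False
    # Accept both namespaced and un-namespaced <svg> roots.
    return root.tag in ("svg", SVG_NS + "svg")


# Hypothetical samples: one complete SVG and one truncated mid-tag.
samples = [
    '<svg xmlns="http://www.w3.org/2000/svg"><path d="M0 0L1 1"/></svg>',
    '<svg xmlns="http://www.w3.org/2000/svg"><path d="M0 0L1 1"',
]

broken = [s for s in samples if not is_well_formed_svg(s)]
print(f"{len(broken)} of {len(samples)} samples failed to parse")
```

This only checks XML well-formedness. An SVG can parse cleanly and still render incorrectly, so it would undercount other kinds of breakage.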