OpenDCAI

Define the future of Data-centric AI together

We are dedicated to advancing research and open-source tools in Data-Centric Artificial Intelligence (DCAI).

Our goal is to develop effective and efficient DCAI systems and algorithms that support and enhance the performance of AI models and applications.

Newly Released Works

🔥 2025/6/29: Our DCAI system DataFlow has been released!

Pinned

  1. DataFlow Public

    Easy data preparation with the latest LLM-based operators and pipelines.

    Python · 1.4k stars · 97 forks

  2. MyScaleDB Public

    Forked from OriginHubAI/MyScaleDB

    AI Database for unified, scalable SQL + vector data management, search and analytics

    C++ · 37 stars · 1 fork

  3. DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 30 stars · 9 forks

Repositories

Showing 10 of 22 repositories
  • DataFlex-Doc Public

    DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 1 star · 6 forks · 0 issues · 0 pull requests · Updated Oct 30, 2025
  • DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 30 stars · 9 forks · 0 issues · 1 pull request · Updated Oct 30, 2025
  • Text2VectorSQL Public

    Official implementation of Text2VectorSQL: Towards a Unified Interface for Vector Search and SQL Queries

    Python · 4 stars · 0 forks · 0 issues · 0 pull requests · Updated Oct 30, 2025
  • DataFlow-Agent Public
    Python · Apache-2.0 · 3 stars · 1 fork · 0 issues · 0 pull requests · Updated Oct 30, 2025
  • DataFlow-Doc Public

    Documentation for DataFlow, a data-centric AI system for LLMs.

    Python · 8 stars · 24 forks · 4 issues · 0 pull requests · Updated Oct 29, 2025
  • DataFlow Public

    Easy data preparation with the latest LLM-based operators and pipelines.

    Python · Apache-2.0 · 1,432 stars · 97 forks · 9 issues · 1 pull request · Updated Oct 30, 2025
  • SciAgent Public

    SciAgent is an agent system for reasoning over scientific tasks.

    Python · MIT · 9 stars · 2 forks · 0 issues · 0 pull requests · Updated Oct 21, 2025
  • MorphoBench Public
    Python · MIT · 9 stars · 0 forks · 0 issues · 0 pull requests · Updated Oct 20, 2025
  • DataFlow-MM Public

    DataFlow-MM provides multimedia operators for DataFlow. We aim to prepare data for multimodal large language models.

    Python · Apache-2.0 · 10 stars · 12 forks · 1 issue · 2 pull requests · Updated Oct 15, 2025
  • DataFlow-MM-Doc Public

    Documentation for DataFlow-MM

    Python · 1 star · 4 forks · 0 issues · 3 pull requests · Updated Oct 8, 2025
