Stars
The absolute trainer to light up AI agents.
Apache Fluss is a streaming storage system built for real-time analytics.
🐘 Elasticsearch real-time search and analytics natively integrated with Hadoop
Free and Open Source, Distributed, RESTful Search Engine
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
ClickHouse® is a real-time analytics database management system
Apache Doris is an easy-to-use, high performance and unified analytics database.
Apache Spark - A unified analytics engine for large-scale data processing
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
Upserts, Deletes And Incremental Processing on Big Data.