Stars
Empowering everyone to build reliable and efficient software.
Axiom is a set of reusable and extensible components designed to be compatible with Velox. Its primary purpose is to simplify the process of building front-ends for query execution powered by Velox.
An extensible, state-of-the-art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LF AI & Data, part of the Linux Foundation.
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing.
Real-time analytics on Postgres tables
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
Pretrain, finetune and serve LLMs on Intel platforms with Ray
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
Power CLI and Workflow manager for LLMs (core package)
Data Agent Ready Warehouse: one for Analytics, Search, AI, and Python Sandbox, rebuilt from scratch. Unified architecture on your S3.
ClickBench: a Benchmark For Analytical Databases
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
LingoDB: A new analytical database system that blurs the lines between databases and compilers.
A modular acceleration toolkit for big data analytic engines
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
The Fastest Distributed Database for Transactional, Analytical, and AI Workloads.
JDK main-line development https://openjdk.org/projects/jdk
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics and AI.