Stars
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
A syntax-highlighting pager for git, diff, grep, and blame output
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Distributed pushdown cache for DataFusion
A SQL query equivalence prover in Rust aiming for high performance and wide SQL feature coverage.
Open, Multi-modal Catalog for Data & AI
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Scripts to make specific datasets cleaner and more convenient
Lean 4 programming language and theorem prover
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Fully open reproduction of DeepSeek-R1
An Open Standard for lineage metadata collection
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
An extensible, state-of-the-art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Linux Foundation.
Composable building blocks to build Llama Apps
WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices,…
Several Coding Patterns for Solving Data Structures and Algorithms Problems during Interviews
The full-stack edge platform for your edge-oriented applications.
Substation is a toolkit for routing, normalizing, and enriching security event and audit logs.
Distributed query engine providing simple and reliable data processing for any modality and scale