Stars
pg_lake: Postgres with Iceberg and data lake access
Flowistry is an IDE plugin for Rust that helps you focus on relevant code.
The official Go SDK for Model Context Protocol servers and clients. Maintained in collaboration with Google.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others
Apache Druid: a high-performance real-time analytics database.
Apache Pinot - A realtime distributed OLAP datastore
Apache Spark - A unified analytics engine for large-scale data processing
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
AEPs help developers and organizations build clear, consistent network APIs and clients by providing an extensible set of design guidelines.
Generates AEP-compliant REST/proto APIs from a resource model. See the main README.md for usage.
An implementation-oriented cookbook for compiler writers.
TPC-H benchmark data generation in pure Rust
My solution to the SIGMOD 2025 contest (non-registered).
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Practical quantum-secure key encapsulation from generic lattices
Borgo is a statically typed language that compiles to Go.
A parsing/linking engine for protobuf; the guts for a pure Go replacement of protoc.
Go support for Google's protocol buffers
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.