Stars
Apache Spark - A unified analytics engine for large-scale data processing
Scala 2 compiler and standard library. Scala 2 bugs at https://github.com/scala/bug; Scala 3 at https://github.com/scala/scala3
CMAK is a tool for managing Apache Kafka clusters
A fault tolerant, protocol-agnostic RPC system
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
REST job server for Apache Spark
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics and AI.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
scopt / scopt
Forked from jstrachan/scopt. Command line options parsing for Scala
A simple-build-tool (sbt) plugin/processor for creating IntelliJ IDEA project files
A collection of open source Apache 2.0 Kafka Connectors maintained by Lenses.io.
Livy is an open source REST interface for interacting with Apache Spark from anywhere
A connector for Spark that allows reading from and writing to a Redis cluster
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Non-blocking, Reactive Redis driver for Scala (with Sentinel support)
This project provides Apache Spark SQL, RDD, DataFrame and Dataset examples in the Scala language
Get up and running quickly with Scala for Apache Kafka
Connect Spark to HBase for reading and writing data with ease
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
A library based on delta for Spark and MLSQL
A custom Spark SQL Hint optimizer that addresses JOIN data skew caused by hot-spot data