Stars
Production-Grade Container Scheduling and Management
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
rohan-uptycs / hudi
Forked from apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
rohan-uptycs / spark
Forked from apache/spark
Apache Spark - A unified analytics engine for large-scale data processing
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
Base classes to use when writing tests with Spark
Data pipelines for cloud config and security data. Build cloud asset inventory, CSPM, FinOps, and vulnerability management solutions. Extract from AWS, Azure, GCP, and 70+ cloud and SaaS sources.
Apache Spark - A unified analytics engine for large-scale data processing
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficiently execute streaming, machine learning or SQL workloads that…
Collection of middlewares created by the community
Golang implementation of the Raft consensus protocol