Stars
The Lineage Analysis system for FlinkSQL supports advanced syntax such as Watermark, UDTF, CEP, Windowing TVFs, and CTAS.
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Includes notes on using Apache Spark, with drill down on Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark. Also tools for stress testing, measuring CPUs' performa…
Apache DolphinScheduler is a modern data orchestration platform for quickly creating high-performance workflows with low code.
xgSama / chunjun
Forked from DTStack/chunjun
Based on Apache Flink. Supports data synchronization/integration.
Taier is a big data development platform for job submission, scheduling, operation and maintenance, and metrics display.