Stars
Open, Multi-modal Catalog for Data & AI
psyoblade / docker-stacks
Forked from jupyter/docker-stacks
Ready-to-run Docker images containing Jupyter applications
The open source frontend for GitBook doc sites
Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.
Spark ClickHouse Connector built on the DataSourceV2 API
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
📚 Parameterize, execute, and analyze notebooks
Maxwell's daemon, a MySQL-to-JSON Kafka producer
📚 A repo for reading and summarizing software development textbooks
Includes notes on using Apache Spark, with drill down on Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark. Also tools for stress testing, measuring CPUs' performa…
The Internals of Spark Structured Streaming
An open protocol for secure data sharing
Spark: The Definitive Guide's Code Repository
Life-cycle: Internal working of HDFS, SQOOP, HIVE, SPARK, HBASE, KAFKA with code.
Admin UI for administration of spring boot applications
Eclipse Temurin™ build scripts - common across all releases/versions