Kyligence Inc. @Kyligence
- Shanghai
- http://7mming7.github.io
Starred repositories
Apache DataFusion Comet Spark Accelerator
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Apache Doris is an easy-to-use, high performance and unified analytics database.
JSON-Schema + fake data generators
RocketMQ integration for Apache Flink. This module includes the RocketMQ source and sink, which allow a Flink job to either write messages into a topic or read from topics.
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
🚀 LeetCode From Zero To One & curated problem lists & solution write-ups & algorithm templates & practice roadmap, continuously updated...
Leetcode questions (Company-wise, Paradigm-wise and much more)
HTTP Connector for Apache Flink. Provides sources and sinks for DataStream, Table and SQL APIs.
Anti OCR, Free Texts (refuse to be OCR'd, set text free). Converts text into images that machines cannot recognize but humans can read.
Free, simple, and intuitive online database diagram editor and SQL generator.
A collection of best resources to learn System Design, Software architecture, and prepare for System Design Interviews
A native Rust library for Delta Lake, with bindings into Python
Apache Paimon Rust: the Rust implementation of Apache Paimon.
Blazingly fast analytics database that will rapidly devour all of your data.
Open, Multi-modal Catalog for Data & AI
DuckDB is an analytical in-process SQL database management system
Extremely fast Query Engine for DataFrames, written in Rust
AI-Native Data Warehouse. Blazing analytics, fast search, geo insights, vector AI. Built for multimodal analytics. Open-source Snowflake alternative. https://databend.com
A Spark plugin for reading and writing Excel files
Streaming data platform. Real-time stream processing, low-latency serving, and Iceberg table management.
This is a library for SQL optimizing/rewriting including Materialized View rewrite
A performance profiler for Minecraft clients, servers, and proxies.
GoReplay is an open-source tool for capturing and replaying live HTTP traffic into a test environment in order to continuously test your system with real data. It can be used to increase confidence…
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)