- Berlin, Germany
- 16:23 (UTC +01:00)
- @venkat_systems
- https://venkat-systems.bearblog.dev
- https://venkat.eu
- mini-sglang Public
  Forked from sgl-project/mini-sglang. A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
  Python · Updated Feb 14, 2026
- sglang Public
  Forked from sgl-project/sglang. SGLang is a fast serving framework for large language models and vision language models.
  Python · Apache License 2.0 · Updated Feb 14, 2026
- vllm.rs Public
  Forked from guoqingbao/vllm.rs. Minimalist vLLM implementation in Rust.
  Rust · Updated Feb 13, 2026
- vllm-omni Public
  Forked from vllm-project/vllm-omni. A framework for efficient model inference with omni-modality models.
  Python · Apache License 2.0 · Updated Feb 13, 2026
- vllm Public
  Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs.
  Python · Apache License 2.0 · Updated Feb 13, 2026
- dynamo Public
  Forked from ai-dynamo/dynamo. A Datacenter Scale Distributed Inference Serving Framework.
  Rust · Other · Updated Feb 11, 2026
- Mooncake Public
  Forked from kvcache-ai/Mooncake. Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
  C++ · Apache License 2.0 · Updated Feb 10, 2026
- kvcached Public
  Forked from ovg-project/kvcached. Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond.
  Python · Apache License 2.0 · Updated Feb 4, 2026
- restate Public
  Forked from restatedev/restate. Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
  Rust · Other · Updated Feb 3, 2026
- tigerbeetle Public
  Forked from tigerbeetle/tigerbeetle. The financial transactions database designed for mission critical safety and performance.
  Zig · Apache License 2.0 · Updated Feb 3, 2026
- iggy Public
  Forked from apache/iggy. Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed.
  Rust · Apache License 2.0 · Updated Feb 3, 2026
- slatedb Public
  Forked from slatedb/slatedb. A cloud native embedded storage engine built on object storage.
  Rust · Apache License 2.0 · Updated Feb 3, 2026
- rustfs Public
  Forked from rustfs/rustfs. 2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…
  Rust · Apache License 2.0 · Updated Feb 3, 2026
- Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
  Zig · Apache License 2.0 · Updated Feb 2, 2026
- bun Public
  Forked from oven-sh/bun. Incredibly fast JavaScript runtime, bundler, test runner, and package manager, all in one.
  Zig · Other · Updated Feb 1, 2026
- TensorRT-LLM Public
  Forked from NVIDIA/TensorRT-LLM. TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
  Python · Other · Updated Jan 31, 2026
- nixl Public
  Forked from ai-dynamo/nixl. NVIDIA Inference Xfer Library (NIXL).
  C++ · Other · Updated Jan 28, 2026
- llama.cpp Public
  Forked from ggml-org/llama.cpp. LLM inference in C/C++.
  C++ · MIT License · Updated Jan 28, 2026
- mistral.rs Public
  Forked from EricLBuehler/mistral.rs. Blazingly fast LLM inference.
  Rust · MIT License · Updated Jan 28, 2026
- FlashMoE Public
  Forked from osayamenja/FlashMoE. Distributed MoE in a Single Kernel [NeurIPS '25].
  Cuda · Other · Updated Jan 26, 2026
- yali Public
  Speed-of-Light SW efficiency by using ultra low-latency primitives for comms collectives.
- nano-vllm Public
  Forked from GeeeekExplorer/nano-vllm. Nano vLLM.
  Python · MIT License · Updated Jan 21, 2026
- ome Public
  Forked from sgl-project/ome. Open Model Engine (OME): Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton.
  Go · Apache License 2.0 · Updated Jan 21, 2026
- NeetCode-Solutions Public
  Forked from mdmzfzl/NeetCode-Solutions. My solutions in C++ and Python for problems on NeetCode.io.
  C++ · Apache License 2.0 · Updated Jan 18, 2026
- disruptor-rs Public
  Forked from nicholassm/disruptor-rs. Low latency inter-thread communication library in Rust inspired by the LMAX Disruptor.
  Rust · MIT License · Updated Jan 7, 2026
- luminal Public
  Forked from luminal-ai/luminal. Deep learning at the speed of light.
  Rust · Apache License 2.0 · Updated Nov 27, 2025
- reference-kernels Public
  Forked from gpu-mode/reference-kernels. Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!
  Python · MIT License · Updated Nov 11, 2025
- pplx-garden Public
  Forked from perplexityai/pplx-garden. Perplexity open source garden for inference technology.
  Rust · MIT License · Updated Nov 7, 2025
- modular Public
  Forked from modular/modular. The Modular Platform (includes MAX & Mojo).
  Mojo · Other · Updated Nov 6, 2025
- text-generation-inference Public
  Forked from huggingface/text-generation-inference. Large Language Model Text Generation Inference.
  Python · Apache License 2.0 · Updated Oct 11, 2025