- Berlin, Germany
- 16:23 (UTC +01:00)
- @venkat_systems
- https://venkat-systems.bearblog.dev
- https://venkat.eu
- mini-sglang Public
  Forked from sgl-project/mini-sglang. A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
  Python · Updated Feb 14, 2026
- sglang Public
  Forked from sgl-project/sglang. SGLang is a fast serving framework for large language models and vision language models.
  Python · Apache License 2.0 · Updated Feb 14, 2026
- vllm.rs Public
  Forked from guoqingbao/vllm.rs. Minimalist vLLM implementation in Rust.
  Rust · Updated Feb 13, 2026
- vllm-omni Public
  Forked from vllm-project/vllm-omni. A framework for efficient model inference with omni-modality models.
  Python · Apache License 2.0 · Updated Feb 13, 2026
- vllm Public
  Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs.
  Python · Apache License 2.0 · Updated Feb 13, 2026
- dynamo Public
  Forked from ai-dynamo/dynamo. A Datacenter Scale Distributed Inference Serving Framework.
  Rust · Other · Updated Feb 11, 2026
- Mooncake Public
  Forked from kvcache-ai/Mooncake. Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
  C++ · Apache License 2.0 · Updated Feb 10, 2026
- kvcached Public
  Forked from ovg-project/kvcached. Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond.
  Python · Apache License 2.0 · Updated Feb 4, 2026
- restate Public
  Forked from restatedev/restate. Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
  Rust · Other · Updated Feb 3, 2026
- tigerbeetle Public
  Forked from tigerbeetle/tigerbeetle. The financial transactions database designed for mission critical safety and performance.
  Zig · Apache License 2.0 · Updated Feb 3, 2026
- iggy Public
  Forked from apache/iggy. Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed.
  Rust · Apache License 2.0 · Updated Feb 3, 2026
- slatedb Public
  Forked from slatedb/slatedb. A cloud native embedded storage engine built on object storage.
  Rust · Apache License 2.0 · Updated Feb 3, 2026
- rustfs Public
  Forked from rustfs/rustfs. 2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…
  Rust · Apache License 2.0 · Updated Feb 3, 2026
- Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
  Zig · Apache License 2.0 · Updated Feb 2, 2026
- bun Public
  Forked from oven-sh/bun. Incredibly fast JavaScript runtime, bundler, test runner, and package manager, all in one.
  Zig · Other · Updated Feb 1, 2026
- TensorRT-LLM Public
  Forked from NVIDIA/TensorRT-LLM. TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
  Python · Other · Updated Jan 31, 2026
- nixl Public
  Forked from ai-dynamo/nixl. NVIDIA Inference Xfer Library (NIXL).
  C++ · Other · Updated Jan 28, 2026
- llama.cpp Public
  Forked from ggml-org/llama.cpp. LLM inference in C/C++.
  C++ · MIT License · Updated Jan 28, 2026
- mistral.rs Public
  Forked from EricLBuehler/mistral.rs. Blazingly fast LLM inference.
  Rust · MIT License · Updated Jan 28, 2026
- FlashMoE Public
  Forked from osayamenja/FlashMoE. Distributed MoE in a Single Kernel [NeurIPS '25].
  Cuda · Other · Updated Jan 26, 2026
- yali Public
  Speed-of-Light SW efficiency by using ultra low-latency primitives for comms collectives.
- nano-vllm Public
  Forked from GeeeekExplorer/nano-vllm. Nano vLLM.
  Python · MIT License · Updated Jan 21, 2026
- ome Public
  Forked from sgl-project/ome. Open Model Engine (OME): Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton.
  Go · Apache License 2.0 · Updated Jan 21, 2026
- NeetCode-Solutions Public
  Forked from mdmzfzl/NeetCode-Solutions. My solutions in C++ and Python for problems on NeetCode.io.
  C++ · Apache License 2.0 · Updated Jan 18, 2026
- disruptor-rs Public
  Forked from nicholassm/disruptor-rs. Low latency inter-thread communication library in Rust inspired by the LMAX Disruptor.
  Rust · MIT License · Updated Jan 7, 2026
- luminal Public
  Forked from luminal-ai/luminal. Deep learning at the speed of light.
  Rust · Apache License 2.0 · Updated Nov 27, 2025
- reference-kernels Public
  Forked from gpu-mode/reference-kernels. Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!
  Python · MIT License · Updated Nov 11, 2025
- pplx-garden Public
  Forked from perplexityai/pplx-garden. Perplexity open source garden for inference technology.
  Rust · MIT License · Updated Nov 7, 2025
- modular Public
  Forked from modular/modular. The Modular Platform (includes MAX & Mojo).
  Mojo · Other · Updated Nov 6, 2025
- text-generation-inference Public
  Forked from huggingface/text-generation-inference. Large Language Model Text Generation Inference.
  Python · Apache License 2.0 · Updated Oct 11, 2025