Costa Huang vwxyzjn

Exploiting physical rewards @periodiclabs. Prev: RL @allenai @huggingface.

1.7k followers · 127 following

@huggingface
Philadelphia, PA
https://costa.sh
@vwxyzjn

Achievements

x4 x3 x3

Stars

radixark / miles

Python 610 56 Updated Dec 19, 2025

wejoncy / sfllm

Super fast serving stack for LLM on Windows/Linux/Macos

Cuda 9 1 Updated Dec 17, 2025

amulil / cleanvllm

A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.

Python 33 4 Updated Jun 22, 2025

RDMA-Rust / rdma-rust.github.io

HTML 4 Updated Nov 18, 2025

AlongWY / TransformerEngine_wheels

wheels for TransformerEngine

Python 5 2 Updated Nov 27, 2025

llm-d / llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,199 267 Updated Dec 19, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,451 475 Updated Dec 19, 2025

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python 1,147 199 Updated Dec 19, 2025

flexagoon / ream

Python 26 1 Updated Jul 31, 2025

cohere-ai / cohere-terrarium

A simple Python sandbox for helpful LLM data agents

Python 299 50 Updated Jun 18, 2024

PrimeIntellect-ai / prime-rl

Async RL Training at Scale

Python 948 162 Updated Dec 19, 2025

joerick / pyinstrument

🚴 Call stack profiler for Python. Shows you why your code is slow!

Python 7,539 256 Updated Nov 17, 2025

McGill-NLP / nano-aha-moment

Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"

Jupyter Notebook 567 54 Updated Oct 7, 2025

hendrycks / apps

APPS: Automated Programming Progress Standard (NeurIPS 2021)

Python 497 67 Updated Jun 19, 2024

amir20 / dozzle

Realtime log viewer for containers. Supports Docker, Swarm and K8s.

Go 10,558 456 Updated Dec 19, 2025

SzymonOzog / GPU_Programming

Python 86 8 Updated Nov 11, 2025

huggingface / Math-Verify

Python 1,043 49 Updated Jul 2, 2025

deepseek-ai / smallpond

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,871 430 Updated Mar 5, 2025

allenai / olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Python 16,226 1,254 Updated Dec 19, 2025

codingfisch / flashrl

Fast reinforcement learning 💨

Cython 28 1 Updated Jul 15, 2025

nebius / kvax

A FlashAttention implementation for JAX with support for efficient document mask computation and context parallelism.

Python 152 10 Updated Nov 11, 2025

PrimeIntellect-ai / verifiers

Our library for RL environments + evals

Python 3,646 454 Updated Dec 19, 2025

pytorch / torchtitan

A PyTorch native platform for training generative AI models

Python 4,856 644 Updated Dec 19, 2025

joey00072 / nanoGRPO

nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)

Python 136 9 Updated May 8, 2025

AI-Hypercomputer / JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

Python 396 59 Updated Jun 10, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,926 918 Updated Dec 15, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,948 288 Updated May 15, 2025

keraJLi / rejax

Hardware-Accelerated Reinforcement Learning Algorithms in pure Jax!

Python 251 19 Updated Oct 31, 2025

open-thought / reasoning-gym

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,278 106 Updated Dec 15, 2025

jingyaogong / minimind

🚀🚀 Train a 26M-parameter GPT from scratch in just 2h! 🌏

Python 35,844 4,232 Updated Dec 14, 2025

Lists (5)

🔥 CleanRL-supported Projects

Fancy Tech

🔮 Future tech

🚀 My stack

🔨 My tools
