- Hangzhou, Zhejiang, China
- https://khotyn.com/blog
- @khotyn
Starred repositories
A simple, performant and scalable Jax LLM!
An open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM
Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…
Intelligent Router for Mixture-of-Models
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Open Source Landscapes and Insights Produced by AntOSS
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
verl: Volcano Engine Reinforcement Learning for LLMs
iTerm2 is a terminal emulator for Mac OS X that does amazing things.
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
CUDA Templates and Python DSLs for High-Performance Linear Algebra
FlashMLA: Efficient Multi-head Latent Attention Kernels
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
SGLang is a fast serving framework for large language models and vision language models.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Module, Model, and Tensor Serialization/Deserialization
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
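AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks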
An unprofessional open-source Chinese font derived from Fontworks' Klee One.