Stars
verl: Volcano Engine Reinforcement Learning for LLMs
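CRS: a self-hosted Claude Code mirror and one-stop open-source relay service. It provides unified access to Claude, OpenAI, Gemini, and Droid subscriptions, supports shared ("carpool") use to split costs more efficiently, and works seamlessly with native tools.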
Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam.
Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.
Simple & Scalable Pretraining for Neural Architecture Research
Zotero BabelDOC plugin, for Immersive Translate Pro members.
EleutherAI / nanoGPT-mup
Forked from karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
When it comes to optimizers, it's always better to be safe than sorry
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus Agent Tools, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae…
Minimalistic large language model 3D-parallelism training
Official Repo for Open-Reasoner-Zero
Writing AI Conference Papers: A Handbook for Beginners
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
A Synthetic Dataset for Personal Attribute Inference (NeurIPS'24 D&B)
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
llama3 implementation one matrix multiplication at a time
A collection of modern/faster/saner alternatives to common unix commands.