- NYU Shanghai
- Shanghai/Suzhou/New York
- https://zephyr271828.github.io/
- in/yufeng-felix-xu
Stars
A course for systems engineers on LLM inference serving on Apple Silicon: build a tiny vLLM + Qwen.
Survey of Small Language Models from Penn State, ...
TPU inference for vLLM, with unified JAX and PyTorch support.
RL fine-tuning of the qwen3-base family of models on gsm8k using verl: is there an RL power law on downstream tasks?
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows.
MineContext is your proactive, context-aware AI partner (Context-Engineering + ChatGPT Pulse).
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels.
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
Hackable and optimized Transformers building blocks, supporting a composable construction.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
🥢 Cook like Laoxiangji 🐔. The main part was completed in 2024; not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report," summarized, edited, and organized. CookLikeHOC.
A Python tool for converting files and office documents to Markdown.
📰 Must-read papers and blogs on Speculative Decoding ⚡️
[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
slime is an LLM post-training framework for RL Scaling.
TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.
A curated list of neural network pruning resources.
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
An Awesome List of Agentic Models trained with Reinforcement Learning.
[ICML 2022] The official implementation of DWBC in "Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations"
Train transformer language models with reinforcement learning.