- AWS AI
- Santa Clara, CA
- https://ydtydr.github.io/
Stars
Autocomp: AI Code Optimizer for Tensor Accelerators
GPT-Prompt-Hub is an open-source, community-driven repository for collecting, sharing, and refining custom GPT prompts
verl: Volcano Engine Reinforcement Learning for LLMs
ROCm / Megatron-LM
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer models at scale.
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
A tool for debugging convergence issues and testing new algorithms and recipes for training LLMs with NVIDIA libraries such as Transformer Engine, Megatron-LM, and NeMo.
Kimi K2 is the large language model series developed by the Moonshot AI team.
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
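A minimal sketch of the FP16xINT4 idea behind such kernels, not the kernel itself: weights are stored as packed 4-bit integers plus per-group FP16 scales and dequantized just before the matmul. Function names, the nibble packing order, and the group size are illustrative assumptions.

```python
import torch

def dequant_int4(packed: torch.Tensor, scales: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """packed: (out, in//2) uint8, two 4-bit values per byte.
    scales: (out, in//group_size) FP16 per-group scales."""
    lo = (packed & 0x0F).to(torch.int8) - 8        # low nibble, centered to [-8, 7]
    hi = (packed >> 4).to(torch.int8) - 8          # high nibble
    w = torch.stack((lo, hi), dim=-1).flatten(-2)  # (out, in) int values
    w = w.to(torch.float16).view(w.shape[0], -1, group_size)
    return (w * scales.unsqueeze(-1)).view(w.shape[0], -1)

out_f, in_f, g = 256, 256, 128
packed = torch.randint(0, 256, (out_f, in_f // 2), dtype=torch.uint8)
scales = torch.rand(out_f, in_f // g, dtype=torch.float16) * 0.1
x = torch.randn(4, in_f)                           # small batch, where the ~4x win applies
w = dequant_int4(packed, scales, g)
y = x @ w.float().t()                              # cast up for the CPU demo matmul
```

The real kernel fuses dequantization into the matmul so the INT4 weights never materialize in FP16 in global memory; this sketch only shows the numerical transformation.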
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and expert parallelism (EP; e.g., GPU-driven)
Official implementation for "Training LLMs with MXFP4"
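An illustrative simulation of the MXFP4 block format this work trains with, not the repo's training code: blocks of 32 values share one power-of-two scale, and each element is rounded to the nearest FP4 (E2M1) grid point. The scale-selection rule here is a simplified assumption.

```python
import torch

FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes

def quantize_mxfp4(x: torch.Tensor, block: int = 32) -> torch.Tensor:
    xb = x.reshape(-1, block)
    # Shared power-of-two scale per block, chosen so the block max lands near FP4's max (6.0)
    amax = xb.abs().amax(dim=1, keepdim=True).clamp(min=1e-12)
    scale = torch.exp2(torch.floor(torch.log2(amax / 6.0)))
    # Round each scaled element to the nearest FP4 magnitude, keeping the sign
    mag = (xb / scale).abs().unsqueeze(-1)
    idx = (mag - FP4_GRID).abs().argmin(dim=-1)
    q = FP4_GRID[idx] * xb.sign()
    return (q * scale).reshape(x.shape)

x = torch.randn(4, 64)
print((x - quantize_mxfp4(x)).abs().mean())  # error introduced by the simulated format
```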
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
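A hand-rolled sketch of the overlap pattern such libraries fuse at the kernel level: start an async all-gather for the next layer's sharded weight while the current layer's matmul runs. Names and shapes are illustrative; this assumes `torch.distributed` has already been initialized across `world_size` ranks.

```python
import torch
import torch.distributed as dist

def overlapped_layers(x, w0, w1_shard, world_size):
    """x: (B, D); w0: (D, D) local weight; w1_shard: (D // world_size, D) row shard."""
    w1_parts = [torch.empty_like(w1_shard) for _ in range(world_size)]
    handle = dist.all_gather(w1_parts, w1_shard, async_op=True)  # communication starts now
    y = x @ w0                       # layer-0 compute overlaps the in-flight gather
    handle.wait()                    # block only at the point w1 is actually needed
    w1 = torch.cat(w1_parts, dim=0)  # (D, D) reassembled weight
    return y @ w1
```

Fused libraries go further by interleaving communication and compute at tile granularity inside one kernel, rather than relying on separate streams as this sketch does.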
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Minimalistic large language model 3D-parallelism training
Video+code lecture on building nanoGPT from scratch
FlashMLA: Efficient Multi-head Latent Attention Kernels
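A plain-PyTorch sketch of the Multi-head Latent Attention (MLA) idea these kernels accelerate: the KV cache stores one small latent vector per token, which is up-projected to per-head keys and values at attention time. Shapes and weight names are illustrative, not FlashMLA's API.

```python
import math
import torch

B, T, D = 2, 16, 512       # batch, sequence, model dim
H, Dh, Dc = 8, 64, 128     # heads, head dim, latent (cached) dim

W_dkv = torch.randn(D, Dc) / math.sqrt(D)      # down-projection: only this output is cached
W_uk  = torch.randn(Dc, H * Dh) / math.sqrt(Dc)
W_uv  = torch.randn(Dc, H * Dh) / math.sqrt(Dc)
W_q   = torch.randn(D, H * Dh) / math.sqrt(D)

x = torch.randn(B, T, D)
c_kv = x @ W_dkv                                   # (B, T, Dc): the entire KV cache
q = (x @ W_q).view(B, T, H, Dh).transpose(1, 2)    # (B, H, T, Dh)
k = (c_kv @ W_uk).view(B, T, H, Dh).transpose(1, 2)
v = (c_kv @ W_uv).view(B, T, H, Dh).transpose(1, 2)
attn = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(Dh), dim=-1)
out = attn @ v                                     # (B, H, T, Dh)
# Cache cost per token: Dc floats vs. 2 * H * Dh for standard MHA (128 vs. 1024 here).
```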
Official inference framework for 1-bit LLMs
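A minimal sketch of the ternary ("1.58-bit") absmean weight quantization behind BitNet-style 1-bit LLMs: weights are scaled by their mean absolute value and rounded to {-1, 0, +1}, so matmuls reduce to additions and subtractions. Illustrative only, not the official inference kernels.

```python
import torch

def absmean_ternary(w: torch.Tensor):
    scale = w.abs().mean().clamp(min=1e-8)
    q = (w / scale).round().clamp(-1, 1)  # ternary weights in {-1, 0, +1}
    return q, scale

w = torch.randn(256, 256)
q, s = absmean_ternary(w)
x = torch.randn(4, 256)
y = (x @ q.t()) * s                       # rescale to recover the output magnitude
```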
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
The best OSS video generation models, created by Genmo
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
Real-time face swap and one-click video deepfake with only a single image