Fast and memory-efficient exact attention
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at small to medium batch sizes of 16-32 tokens.
Renderer for the harmony response format to be used with gpt-oss
Tensor library & inference framework for machine learning
A CLI tool for managing Claude instances with git worktree
Distributed attention for linearly scalable training on ultra-long-context, heterogeneous data
Simple & Scalable Pretraining for Neural Architecture Research
Fused Qwen3 MoE layer for faster training, compatible with HF Transformers, LoRA, 4-bit quant, Unsloth
A Tree Search Library with Flexible API for LLM Inference-Time Scaling
MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user develop their prompts into full models.
slime is an LLM post-training framework for RL Scaling.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Manage resources and move them between hardware contexts
Efficient implementations of state-of-the-art linear attention models
Tenstorrent Blackhole P100/P150 card RISC-V Linux demo
A collection of formalized statements of conjectures in Lean.