Stars
[NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding
An efficient implementation of the NSA (Native Sparse Attention) kernel
Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …)
🚀 Efficient implementations of state-of-the-art linear attention models
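For orientation, the core trick these kernels optimize is the linear-attention recurrence S_t = S_{t-1} + k_t v_t^T with readout o_t = S_t^T q_t. A minimal NumPy sketch of that recurrence (illustrative only; the library itself ships fused Triton kernels):

```python
import numpy as np

def linear_attention(q, k, v):
    """q, k, v: (seq_len, d) arrays. Runs in O(seq_len * d^2) with O(d^2) state."""
    seq_len, d = q.shape
    S = np.zeros((d, d))           # running sum of k_t v_t^T outer products
    out = np.zeros_like(v)
    for t in range(seq_len):
        S += np.outer(k[t], v[t])  # fold the new key/value pair into the state
        out[t] = S.T @ q[t]        # read the state out with the current query
    return out

q = k = v = np.random.randn(8, 4)
print(linear_attention(q, k, v).shape)  # (8, 4)
```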
beep boop: personal website hosted at tinabmai.com
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
Residual Quantization Autoencoder, used for interpreting LLMs
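The "residual quantization" part is standard multi-stage vector quantization: each stage quantizes the reconstruction error left by the previous one. A minimal sketch with illustrative codebooks (not this repo's interpretability pipeline):

```python
import numpy as np

def residual_quantize(x, codebooks):
    """x: (d,) vector. codebooks: list of (num_codes, d) arrays. Returns one index per stage."""
    residual = x.copy()
    codes = []
    for cb in codebooks:
        idx = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))  # nearest code
        codes.append(idx)
        residual -= cb[idx]  # the next stage only sees the leftover error
    return codes

rng = np.random.default_rng(0)
print(residual_quantize(rng.standard_normal(8),
                        [rng.standard_normal((16, 8)) for _ in range(3)]))
```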
Code repo for a paper on a general theory of associative memory models
Recipes to scale inference-time compute of open models
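The simplest such recipe is best-of-N sampling against a reward model; the sketch below shows the shape of the idea, with `generate` and `score` as hypothetical stand-ins for an LLM sampler and a verifier (the actual recipes are more elaborate):

```python
def best_of_n(prompt, generate, score, n=8):
    """Spend inference compute by sampling n candidates and keeping the best-scored one.
    `generate` and `score` are hypothetical stand-ins for an LLM sampler and a reward model."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
```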
Curated list of datasets and tools for post-training.
A toolkit for describing model features and intervening on those features to steer behavior.
Recreating and refactoring weareninja.com's "space warp" effect
Entropy-Based Sampling and Parallel CoT Decoding
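The underlying idea: use the entropy of the next-token distribution to decide how to decode. A toy sketch (threshold and branching are illustrative, not the repo's exact policy):

```python
import numpy as np

def entropy_gated_sample(logits, threshold=2.0, rng=np.random.default_rng()):
    z = logits - logits.max()               # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    entropy = -np.sum(p * np.log(p + 1e-12))
    if entropy < threshold:
        return int(np.argmax(p))            # confident: decode greedily
    return int(rng.choice(len(p), p=p))     # uncertain: sample (or branch a parallel CoT)
```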
Repository for Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
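For reference, vanilla integrated gradients (which the paper generalizes from straight-line paths to paths respecting the data manifold's Riemannian geometry) attributes features by averaging gradients along a path from a baseline. A sketch with a hypothetical `grad_fn` callback:

```python
import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=50):
    """grad_fn(z) -> dF/dz (hypothetical model-gradient callback)."""
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.mean([grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * avg_grad  # Riemann-sum approximation of the path integral
```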
A library for making RepE control vectors
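RepE-style control vectors are commonly built as the difference of mean hidden states over contrastive prompt sets, then added into the residual stream at inference. A sketch of that construction (hypothetical inputs, not this library's API):

```python
import numpy as np

def control_vector(pos_states, neg_states):
    """pos_states, neg_states: (num_prompts, d) hidden states from one layer."""
    v = pos_states.mean(axis=0) - neg_states.mean(axis=0)
    return v / np.linalg.norm(v)  # unit direction; scale it at injection time

# At inference, steer by adding alpha * v into the layer's residual stream.
```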
A "Thorn in a HaizeStack" test for evaluating long-context adversarial robustness.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
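Its appeal is that the whole training loop fits on one screen; stripped to a skeleton (`model` and `get_batch` are stand-ins, and the real repo adds mixed precision, an LR schedule, and checkpointing), it looks roughly like:

```python
import torch

def train(model, get_batch, steps=1000, lr=3e-4, device="cuda"):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        x, y = (t.to(device) for t in get_batch())  # (B, T) tokens and shifted targets
        logits = model(x)                           # (B, T, vocab_size)
        loss = torch.nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), y.view(-1))
        opt.zero_grad(set_to_none=True)
        loss.backward()
        opt.step()
```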
Simple, powerful and flexible site generation framework with everything you love from Next.js.
Orchestrate zero-shot computer vision models
LlamaIndex is the leading framework for building LLM-powered agents over your data.
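The canonical quickstart, for flavor (recent versions; assumes a local ./data folder of documents and an OpenAI API key for the default embeddings/LLM):

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()  # ingest local files
index = VectorStoreIndex.from_documents(documents)     # embed and index them
query_engine = index.as_query_engine()                 # turn the index into a RAG endpoint
print(query_engine.query("What did the author do growing up?"))
```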