ModelCloud.ai
- Earth/Epoch 2.0
- https://modelcloud.ai
- @qubitium
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Dec 17, 2025
transformers Public
Forked from huggingface/transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Python Apache License 2.0 Updated Dec 12, 2025
huggingface_hub Public
Forked from huggingface/huggingface_hub
The official Python client for the Hugging Face Hub.
Python Apache License 2.0 Updated Nov 12, 2025
flash-linear-attention Public
Forked from fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
Python MIT License Updated Nov 1, 2025
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
MLIR MIT License Updated Oct 25, 2025
BitBLAS Public
Forked from microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
nanochat Public
Forked from karpathy/nanochat
The best ChatGPT that $100 can buy.
Python MIT License Updated Oct 22, 2025
accelerate Public
Forked from huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
Python Apache License 2.0 Updated Oct 14, 2025
xet-core Public
Forked from huggingface/xet-core
xet client tech, used in huggingface_hub
Rust Apache License 2.0 Updated Oct 3, 2025
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Oct 3, 2025
lm-evaluation-harness Public
Forked from EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Python MIT License Updated Sep 26, 2025
h2 Public
Forked from python-hyper/h2
Pure-Python HTTP/2 protocol implementation
Python MIT License Updated Sep 20, 2025
tokenizers Public
Forked from huggingface/tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Rust Apache License 2.0 Updated May 27, 2025
threadpoolctl Public
Forked from joblib/threadpoolctl
Python helpers to limit the number of threads used in native libraries that handle their own internal threadpool (BLAS and OpenMP implementations)
Python BSD 3-Clause "New" or "Revised" License Updated May 8, 2025
datasets Public
Forked from huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Python Apache License 2.0 Updated May 3, 2025
mav Public
Forked from attentionmech/mav
model activation visualiser
Python MIT License Updated Mar 28, 2025
pytorch Public
Forked from ROCm/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python Other Updated Mar 17, 2025
clod-code Public
Forked from qpwo/clod-code
rot13 version of claw code
Grammatical Framework Updated Mar 12, 2025
ethos-paper Public
Forked from ipolharvard/ethos-paper
Jupyter Notebook MIT License Updated Mar 8, 2025
QQQ Public
Forked from HandH1998/QQQ
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
Python Updated Feb 18, 2025
GPTQModel Public
Forked from 1096125073/GPTQModel
Production ready LLM model compression/quantization toolkit with accelerated inference support for both cpu/gpu via HF, vLLM, and SGLang.
Python Apache License 2.0 Updated Jan 20, 2025
sglang Public
Forked from sgl-project/sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
evalplus Public
Forked from evalplus/evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
Python Apache License 2.0 Updated Dec 22, 2024
unsloth Public
Forked from unslothai/unsloth
5X faster 60% less memory QLoRA finetuning
Python Apache License 2.0 Updated Aug 30, 2024
auto-round Public
Forked from intel/auto-round
SOTA Weight-only Quantization Algorithm for LLMs
Python Apache License 2.0 Updated Jul 23, 2024
hqq Public
Forked from dropbox/hqq
Official implementation of Half-Quadratic Quantization (HQQ)
Python Apache License 2.0 Updated Jul 22, 2024
AutoGPTQ Public
Forked from AutoGPTQ/AutoGPTQ
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.