A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,846 527 Updated Oct 24, 2025

NVIDIA / FasterTransformer

Transformer-related optimization, including BERT and GPT

C++ 6,331 921 Updated Mar 27, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 16,003 2,244 Updated Oct 24, 2025

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 13,940 3,184 Updated Oct 24, 2025

currentslab / awesome-vector-search

A collection of vector-search-related libraries, services, and research papers

1,527 103 Updated Aug 6, 2024

DeMoriarty / TorchPQ

Approximate nearest neighbor search with product quantization on GPU in PyTorch and CUDA

Cuda 228 22 Updated Dec 12, 2023

criteo / autofaiss

Automatically create Faiss k-NN indices with optimal similarity search parameters.

Python 873 79 Updated May 21, 2024

facebookresearch / bitsandbytes

Library for 8-bit optimizers and quantization routines.

780 48 Updated Aug 18, 2022

DwangoMediaVillage / pqkmeans

Fast and memory-efficient clustering

Jupyter Notebook 262 44 Updated Oct 16, 2023

HaoZeSun2016 / HNSW-HAMMING

HNSW library with Hamming distance and uint32 coding

C++ 4 1 Updated Sep 29, 2019

deepspeedai / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 40,494 4,589 Updated Oct 24, 2025

ScottTilley / tianwen1

Doppler data from TIANWEN-1

Jupyter Notebook 10 1 Updated Jan 28, 2023

LoveSJ benywon

Stars

baichuan-inc / Baichuan-M1-14B

lucidrains / ring-attention-pytorch

pymupdf / PyMuPDF

mindspore-ai / mindspore

baichuan-inc / Baichuan2

facebookresearch / xformers

ouwei2013 / baichuan13b.cpp

facebookresearch / nougat

Dao-AILab / flash-attention

haonan-li / CMMLU

llmeval / LLMEval-2

baichuan-inc / Baichuan-13B

RUCAIBox / LLMSurvey

pleisto / yuren-baichuan-7b

baichuan-inc / Baichuan-7B

ExpressAI / AI-Gaokao

LAION-AI / Open-Assistant

openai / following-instructions-human-feedback

NVIDIA / TransformerEngine