Ready-to-use ML training recipes to help you build and deploy models on Baseten.

Python 38 3 Updated Jan 28, 2026

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,561 650 Updated Jan 28, 2026

FlashInfer: Kernel Library for LLM Serving

Python 4,803 672 Updated Jan 28, 2026

[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding

Python 137 9 Updated Dec 4, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,655 541 Updated Jan 29, 2026

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 943 46 Updated Oct 29, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,649 413 Updated Jan 29, 2026

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,698 192 Updated Jun 25, 2024

Python 29 3 Updated May 24, 2025

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,859 248 Updated Jan 29, 2026

Entropy Based Sampling and Parallel CoT Decoding

Python 3,435 324 Updated Nov 13, 2024

Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild

Zig 3,080 113 Updated Jan 29, 2026

📰 Must-read papers and blogs on Speculative Decoding ⚡️

1,108 63 Updated Jan 24, 2026

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,148 659 Updated Aug 10, 2024

Numbers every LLM developer should know

4,279 140 Updated Jan 16, 2024

A guidance language for controlling large language models.

Jupyter Notebook 21,227 1,143 Updated Jan 28, 2026

Tips and tricks for working with Large Language Models like OpenAI's GPT-4.

9,446 509 Updated Oct 23, 2023

Port of OpenAI's Whisper model in C/C++

C++ 46,256 5,164 Updated Jan 21, 2026

Tensor library for machine learning

C++ 13,896 1,457 Updated Jan 13, 2026

A collection of libraries to optimise AI model performances

Python 8,354 629 Updated Jul 22, 2024

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,700 384 Updated Jan 12, 2026

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …

Cuda 978 225 Updated Jan 29, 2026

An R package implementing the UMAP dimensionality reduction method.

R 353 31 Updated Dec 21, 2025

Uniform Manifold Approximation and Projection

Python 8,081 858 Updated Jan 21, 2026

C++ port of the UMAP algorithm

C++ 72 14 Updated Dec 15, 2025

A library for efficient similarity search and clustering of dense vectors.

C++ 38,919 4,203 Updated Jan 29, 2026

Asahi Linux documentation

Dockerfile 2,079 91 Updated Jan 17, 2026

CUDA-accelerated GIS and spatiotemporal algorithms

Cuda 698 164 Updated Jul 28, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 34,737 3,381 Updated Jan 29, 2026

cuML - RAPIDS Machine Learning Library

C++ 5,104 614 Updated Jan 29, 2026