blackwell
Here are 22 public repositories matching this topic...
🔧 Fine-tune large language models efficiently on NVIDIA DGX Spark with LoRA adapters and optimized quantization for high performance.
Updated Jan 11, 2026 - Python
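This entry describes a LoRA fine-tuning workflow, which in practice is usually built on Hugging Face PEFT. Below is a minimal sketch of attaching a LoRA adapter to a causal LM; the model name, rank, and target modules are illustrative assumptions, not taken from the repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.2-1B"  # placeholder model, not from the repo
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,                                  # adapter rank (illustrative)
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the LoRA weights are trainable
```

From here the wrapped model drops into any standard training loop; the quantization settings mentioned in the description are hardware- and library-specific and are not shown.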
GEN3C: Generative Novel 3D Captions - Adapted for NVIDIA Blackwell GPU architecture (sm_120). Includes automatic GPU detection, CPU-based T5 text encoding for Blackwell compatibility, and full backward compatibility with older GPUs.
Updated Oct 23, 2025 - Jupyter Notebook
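The automatic GPU detection mentioned above can be implemented with PyTorch's compute-capability query. A minimal sketch of the assumed policy of routing T5 text encoding to CPU on sm_120 devices; the threshold and fallback rule are assumptions, not the repository's exact logic.

```python
import torch

def pick_t5_device() -> str:
    """Assumed policy: run the T5 text encoder on CPU for sm_120-class Blackwell GPUs."""
    if not torch.cuda.is_available():
        return "cpu"
    major, minor = torch.cuda.get_device_capability(0)
    if (major, minor) >= (12, 0):   # sm_120-class parts report compute capability 12.x
        return "cpu"                # fall back to CPU for Blackwell compatibility
    return "cuda"                   # older GPUs keep the encoder on the GPU

print(f"T5 text encoder device: {pick_t5_device()}")
```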
Repository for Campbells-Luggs-Blackwells family history web site
Updated Jul 23, 2022 - HTML
📊 Summarize merged PRs daily with vLLM, ensuring you stay updated on key changes and enhancements in your projects.
Updated Jan 11, 2026
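A daily PR-summarization job like this one typically combines a feed of merged PRs with vLLM's offline inference API. A minimal sketch, assuming a placeholder model and hard-coded PR titles; a real pipeline would fetch them from the GitHub API.

```python
from vllm import LLM, SamplingParams

# Placeholder input; a real job would pull merged PRs from the GitHub API.
merged_prs = [
    "Fix KV-cache eviction bug in the scheduler",
    "Add FP8 quantization path for Blackwell GPUs",
]
prompt = "Summarize today's merged pull requests:\n- " + "\n- ".join(merged_prs)

llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")        # placeholder model
params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```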
🚀 Build and explore OpenAI's GPT-OSS model from scratch in Python, unlocking the mechanics of large language models.
Updated Jan 11, 2026 - Python
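Projects that rebuild a GPT-style model from scratch usually start with a single causal self-attention block. A minimal PyTorch sketch of that building block; the dimensions are illustrative and unrelated to the actual GPT-OSS configuration.

```python
import math
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention, the core mechanism of a GPT-style model."""
    def __init__(self, d_model: int = 64):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        T, d = x.shape[-2], x.shape[-1]
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = (q @ k.transpose(-2, -1)) / math.sqrt(d)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))   # each token attends only to the past
        return self.proj(torch.softmax(scores, dim=-1) @ v)

x = torch.randn(1, 8, 64)                  # (batch, tokens, d_model)
print(CausalSelfAttention()(x).shape)      # torch.Size([1, 8, 64])
```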
📦 A fully automated method for installing Nvidia drivers on Arch Linux
Updated Jan 11, 2026 - Shell
An empirical study benchmarking LLM inference with KV-cache offloading using vLLM and LMCache on NVIDIA GB200 with high-bandwidth NVLink-C2C.
Updated Dec 20, 2025 - Python
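A benchmark like this generally measures end-to-end generation throughput with and without KV-cache offloading enabled. A minimal timing sketch using vLLM's offline API; the model is a placeholder, and the LMCache connector configuration is intentionally omitted because its exact options vary by version.

```python
import time
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")      # placeholder model
params = SamplingParams(max_tokens=128)
prompts = ["Explain NVLink-C2C in one paragraph."] * 32  # small synthetic batch

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / elapsed:.1f} generated tokens/s over {elapsed:.1f}s")
```

Running the same script with and without the offloading connector gives the comparison such a study is after.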
LLM fine-tuning with LoRA + NVFP4/MXFP8 on NVIDIA DGX Spark (Blackwell GB10)
Updated Dec 22, 2025 - Python
Cross-platform FlashAttention-2 Triton implementation for Turing+ GPUs with custom configuration mode
Updated Dec 16, 2025 - Python
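A common way to validate a custom attention kernel like this is to compare it against PyTorch's built-in scaled_dot_product_attention. A sketch of such a check; the flash_attn_triton import and the `attention` signature below are hypothetical stand-ins for whatever entry point the implementation actually exposes.

```python
import torch
import torch.nn.functional as F
# from flash_attn_triton import attention   # hypothetical import; the real module name will differ

q, k, v = (torch.randn(1, 4, 128, 64, device="cuda", dtype=torch.float16) for _ in range(3))

reference = F.scaled_dot_product_attention(q, k, v, is_causal=True)
# custom = attention(q, k, v, causal=True)                        # hypothetical signature
# print(torch.allclose(custom, reference, atol=1e-2, rtol=1e-2))  # numerical agreement check
print(reference.shape)  # torch.Size([1, 4, 128, 64])
```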
Pre-built wheels for llama-cpp-python across platforms and CUDA versions
Updated Nov 9, 2025
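Once a matching wheel is installed, usage follows the standard llama-cpp-python API. A minimal sketch, assuming a locally downloaded GGUF checkpoint; the path is a placeholder.

```python
from llama_cpp import Llama

# Path is a placeholder; point it at any local GGUF checkpoint.
llm = Llama(model_path="./models/model.gguf", n_gpu_layers=-1)  # -1 offloads all layers to the GPU
out = llm("Q: What is the Blackwell GPU architecture? A:", max_tokens=64)
print(out["choices"][0]["text"])
```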
RTX 5090 & RTX 5060 Docker container with PyTorch + TensorFlow. First fully-tested Blackwell GPU support for ML/AI. CUDA 12.8, Python 3.11, Ubuntu 24.04. Works with RTX 50-series (5090/5080/5070/5060) and RTX 40-series.
Updated Jul 8, 2025 - Shell
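A container like this is usually sanity-checked by confirming that PyTorch sees the GPU and reports a Blackwell compute capability. A minimal check that could be run inside the container (the TensorFlow side would be verified separately):

```python
import torch

assert torch.cuda.is_available(), "CUDA not visible inside the container"
name = torch.cuda.get_device_name(0)
major, minor = torch.cuda.get_device_capability(0)
print(f"{name}: compute capability {major}.{minor}")   # RTX 50-series parts report 12.x

# Quick matmul to confirm kernels actually launch on the device.
x = torch.randn(1024, 1024, device="cuda")
print((x @ x).sum().item())
```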
One-command vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs (sm_121 architecture)
Updated Oct 28, 2025 - Shell
Prebuilt DeepSpeed wheels for Windows with NVIDIA GPU support. Supports GTX 10-series through RTX 50-series GPUs. Compiled with PyTorch 2.7/2.8 and CUDA 12.8.
Updated Aug 18, 2025
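After installing one of these wheels, a minimal DeepSpeed run wraps an ordinary PyTorch model with deepspeed.initialize. A sketch with an illustrative ZeRO stage-2 config; the model and all settings are placeholders.

```python
import torch
import deepspeed

model = torch.nn.Linear(512, 512)   # placeholder model
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
print(f"DeepSpeed engine initialized on: {engine.device}")
```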
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
Updated Nov 11, 2025 - C++
Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere
Updated Jan 11, 2026 - Python
TensorRT LLM provides an easy-to-use Python API to define large language models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also includes components for building Python and C++ runtimes that orchestrate inference execution in a performant way.
Updated Jan 11, 2026 - Python
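The high-level Python API referred to above can be exercised in a few lines. A minimal sketch, assuming a placeholder Hugging Face model id; engine-build options and quantization settings are omitted.

```python
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")   # placeholder model id
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What does NVFP4 quantization change?"], params)
for out in outputs:
    print(out.outputs[0].text)
```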