Codestin Search App

TensorRT-LLM

Public

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

cuda pytorch moeblackwell llm-serving

Python

•

Other

•2k•12k•521•472•Updated

Dec 27, 2025

TransformerEngine

Public

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

python machine-learning deep-learninggpu cuda pytorch jax fp8 fp4

Python

•

Apache License 2.0

•589•3k•284•101•Updated

Dec 27, 2025

cudaqx

Public

Accelerated libraries for quantum-classical computing built on CUDA-Q.

C++

•

Other

•41•72•28•18•Updated

Dec 27, 2025

aistore

Public

AIStore: scalable storage for AI applications

kubernetes high-performance distributed-storagehigh-availability object-storage multi-cloud batch-jobs s3-compatible multipart-upload ml-training

Go

•

MIT License

•231•1.7k•1•0•Updated

Dec 27, 2025

cccl

Public

CUDA Core Compute Libraries

cpp hpc gpumodern-cpp parallel-computing cuda nvidia gpu-acceleration cuda-kernels gpu-computing

C++

•

Other

•309•2.1k•1.1k•201•Updated

Dec 27, 2025

stdexec

Public

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++

•

Apache License 2.0

•222•2.2k•115•12•Updated

Dec 26, 2025

linux

Public

OpenBMC Linux kernel source tree

C

•

Other

•60k•7•0•0•Updated

Dec 26, 2025

Fuser

Public

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++

•

Other

•74•368•210•219•Updated

Dec 26, 2025

Megatron-LM

Public

Ongoing research training transformer models at scale

transformers model-para large-language-models

Python

•

Other

•3.4k•15k•339•253•Updated

Dec 26, 2025

nsmd

Public

MCTP VDM-based Nvidia System Management API

C++

•

Apache License 2.0

•1•4•1•0•Updated

Dec 26, 2025

bmcweb

Public

A do everything Redfish, KVM, GUI, and DBus webserver for OpenBMC

C++

•

Apache License 2.0

•175•5•0•0•Updated

Dec 26, 2025

garak

Public

the LLM vulnerability scanner

ai vulnerability-assessment security-scannersllm-security llm-evaluation

Python

•

Apache License 2.0

•737•6.7k•265•40•Updated

Dec 26, 2025

NVTX

Public

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++

•

Other

•66•491•3•4•Updated

Dec 26, 2025

OSMO

Public

The developer-first platform for scaling complex Physical AI workloads across heterogeneous compute—unifying training GPUs, simulation clusters, and edge devices in a simple YAML

Python

•

Apache License 2.0

•6•61•23•13•Updated

Dec 26, 2025

JAX-Toolbox

Public

JAX-Toolbox

Python

•

Apache License 2.0

•68•369•80•40•Updated

Dec 26, 2025

doca-platform

Public

DOCA Platform manages provisioning and service orchestration for Bluefield DPUs

Go

•

Apache License 2.0

•16•64•0•0•Updated

Dec 26, 2025

cuda-quantum

Public

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

python cpp quantumquantum-computing hacktoberfest quantum-programming-language quantum-algorithms quantum-machine-learning unitaryhack

C++

•

Other

•316•876•406•85•Updated

Dec 26, 2025

barney

Public

A Scalable (and Optionally, Data-Parallel) ANARI Multi-GPU Path Tracer

C++

•

Apache License 2.0

•4•21•2•0•Updated

Dec 26, 2025

TensorRT-Incubator

Public

Experimental projects related to TensorRT

MLIR

•22•116•37•12•Updated

Dec 25, 2025

warp

Public

A Python framework for accelerated simulation, data generation and spatial computing.

python gpu cudanvidia gpu-acceleration differentiable-programming nvidia-warp

Python

•

Apache License 2.0

•403•6k•178•3•Updated

Dec 25, 2025

Model-Optimizer

Public

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

Python

•

Apache License 2.0

•226•1.7k•56•57•Updated

Dec 25, 2025

nvshmem

Public

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmers to perform one-sided communication from within CUDA kernels and on CUDA streams.

python deep-learning cppcuda nvidia communciations

C++

•

Other

•48•427•20•13•Updated

Dec 25, 2025

mig-parted

Public

MIG Partition Editor for NVIDIA GPUs

Go

•

Apache License 2.0

•54•235•22•19•Updated

Dec 25, 2025

KAI-Scheduler

Public

KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale

Go

•

Apache License 2.0

•127•1k•24•64•Updated

Dec 25, 2025

NV-Kernels

Public

Ubuntu kernels which are optimized for NVIDIA server systems

C

•

Other

•49•72•0•7•Updated

Dec 25, 2025

NVSentinel

Public

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go

•

Apache License 2.0

•32•138•32•46•Updated

Dec 25, 2025

edk2

Public

NVIDIA fork of tianocore/edk2

C

•

Other

•16•25•0•15•Updated

Dec 25, 2025

gpu-operator

Public

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes

kubernetes gpu cudanvidia

Go

•

Apache License 2.0

•431•2.5k•94•67•Updated

Dec 25, 2025

recsys-examples

Public

Examples for Recommenders - easy to train and deploy on accelerated infrastructure.

pytorch recommender-system recommendersgenerative-recommenders

Python

•

Other

•39•195•38•8•Updated

Dec 25, 2025

TileGym

Public

Helpful kernel tutorials and examples for tile-based GPU programming

Python

•

Other

•29•501•0•3•Updated

Dec 25, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NVIDIA Corporation

All

All

645 repositories

TensorRT-LLM

TransformerEngine

cudaqx

aistore

cccl

stdexec

linux

Fuser

Megatron-LM

nsmd

bmcweb

garak

NVTX

OSMO

JAX-Toolbox

doca-platform

cuda-quantum

barney

TensorRT-Incubator

warp

Model-Optimizer

nvshmem

mig-parted

KAI-Scheduler

NV-Kernels

NVSentinel

edk2

gpu-operator

recsys-examples

TileGym

All

All

Repositories list

645 repositories