    Repositories list

    • TensorRT-LLM

      TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
      Python
      2k12k519486 Updated Dec 25, 2025
    • gpu-operator

      Public
      NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
      Go
      4312.5k9467 Updated Dec 25, 2025
    • recsys-examples

      Public
      Examples for Recommenders - easy to train and deploy on accelerated infrastructure.
      Python
      39193388 Updated Dec 25, 2025
    • Model-Optimizer

      Public
      A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
      Python
      2221.7k5657 Updated Dec 25, 2025
    • TileGym

      Public
      Helpful kernel tutorials and examples for tile-based GPU programming
      Python
      2849102 Updated Dec 25, 2025
    • doca-platform

      Public
      DOCA Platform manages provisioning and service orchestration for Bluefield DPUs
      Go
      166400 Updated Dec 25, 2025
    • nccl

      Public
      Optimized primitives for collective multi-GPU communication
      C++
      1.1k4.3k19073 Updated Dec 25, 2025
    • skyhook

      Public
      A Kubernetes Operator to manage Node OS customizations.
      Go
      33500 Updated Dec 25, 2025
    • spark-rapids-examples

      Public
      A repo for all spark examples using Rapids Accelerator including ETL, ML/DL, etc.
      Jupyter Notebook
      62164223 Updated Dec 25, 2025
    • NVFlare

      Public
      NVIDIA Federated Learning Application Runtime Environment
      Python
      2268511516 Updated Dec 25, 2025
    • cuEquivariance

      Public
      cuEquivariance is a math library that is a collective of low-level primitives and tensor ops to accelerate widely-used models, like DiffDock, MACE, Allegro and NEQUIP, based on equivariant neural networks. Also includes kernels for accelerated structure prediction.
      Python
      23335135 Updated Dec 25, 2025
    • stdexec

      Public
      `std::execution`, the proposed C++ framework for asynchronous and parallel programming.
      C++
      2222.2k11413 Updated Dec 24, 2025
    • Fuser

      Public
      A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
      C++
      74368210218 Updated Dec 24, 2025
    • warp

      Public
      A Python framework for accelerated simulation, data generation and spatial computing.
      Python
      4046k1783 Updated Dec 24, 2025
    • Megatron-LM

      Public
      Ongoing research training transformer models at scale
      Python
      3.4k15k338255 Updated Dec 24, 2025
    • spark-rapids-tools

      Public
      User tools for Spark RAPIDS
      Scala
      47652621 Updated Dec 24, 2025
    • spark-rapids-ml

      Public
      Spark RAPIDS MLlib – accelerate Apache Spark MLlib with GPUs
      Jupyter Notebook
      3186311 Updated Dec 24, 2025
    • VisRTX

      Public
      NVIDIA OptiX based implementation of ANARI
      C++
      38271120 Updated Dec 24, 2025
    • nv-ingest

      Public
      NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
      Python
      2802.8k10136 Updated Dec 24, 2025
    • numbast

      Public
      Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings.
      Python
      18552710 Updated Dec 24, 2025
    • KAI-Scheduler

      Public
      KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale
      Go
      1261k2441 Updated Dec 24, 2025
    • nsmd

      Public
      MCTP VDM-based Nvidia System Management API
      C++
      1410 Updated Dec 24, 2025
    • bionemo-framework

      Public
      BioNeMo Framework: For building and adapting AI models in drug discovery at scale
      Jupyter Notebook
      10860861111 Updated Dec 24, 2025
    • JAX-Toolbox

      Public
      JAX-Toolbox
      Python
      683698040 Updated Dec 24, 2025
    • makani

      Public
      Massively parallel training of machine-learning based weather and climate models
      Python
      6334234 Updated Dec 24, 2025
    • edk2

      Public
      NVIDIA fork of tianocore/edk2
      C
      1625015 Updated Dec 24, 2025
    • spark-rapids-jni

      Public
      RAPIDS Accelerator JNI For Apache Spark
      Cuda
      7852869 Updated Dec 24, 2025
    • OSMO

      Public
      The developer-first platform for scaling complex Physical AI workloads across heterogeneous compute—unifying training GPUs, simulation clusters, and edge devices in a simple YAML
      Python
      6612213 Updated Dec 24, 2025
    • nvidia-container-toolkit

      Public
      Build and run containers leveraging NVIDIA GPUs
      Go
      4553.9k12331 Updated Dec 24, 2025
    • k8s-device-plugin

      Public
      NVIDIA device plugin for Kubernetes
      Go
      7683.6k7544 Updated Dec 24, 2025