yaox12
🐳
Slacking

Organizations

@NVIDIA


FlashInfer: Kernel Library for LLM Serving

Cuda 3,958 541 Updated Oct 25, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 19,380 3,159 Updated Oct 25, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 60,998 10,768 Updated Oct 25, 2025

Line-by-line profiling for Python

Python 3,134 130 Updated Oct 17, 2025

NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 228 34 Updated Oct 24, 2025

A tool to convert HDR files to Adaptive HDR (Gain Map HDR) and ISO HDR formats in HEIC

Swift 84 4 Updated Oct 25, 2025

Fast and memory-efficient exact attention

Python 20,159 2,084 Updated Oct 25, 2025

Simple, safe way to store and distribute tensors

Python 3,488 274 Updated Oct 25, 2025

CUDA Core Compute Libraries

C++ 1,986 282 Updated Oct 25, 2025

A new markup-based typesetting system that is powerful and easy to learn.

Rust 47,307 1,284 Updated Oct 24, 2025

Ongoing research training transformer models at scale

Python 13,943 3,186 Updated Oct 25, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 151,603 30,932 Updated Oct 25, 2025

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 358 68 Updated Oct 25, 2025

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++ 460 63 Updated Oct 21, 2025

A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.

Python 3,927 199 Updated Oct 14, 2025

NumPy and SciPy on Multi-Node Multi-GPU systems

Python 935 86 Updated Oct 24, 2025

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,594 241 Updated May 21, 2025

nanobind: tiny and efficient C++/Python bindings

C++ 3,095 257 Updated Oct 17, 2025
Rust 28 17 Updated Jul 3, 2025

A guide that teaches you to enable hardware HEVC decoding & encoding for Chrome / Edge, or build a custom version of Chromium / Electron that supports hardware & software HEVC decoding and hardware HEVC…

JavaScript 1,386 70 Updated Sep 10, 2025

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,237 385 Updated Aug 12, 2025

Unified Collective Communication Library

C 278 117 Updated Oct 24, 2025

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,480 489 Updated Oct 23, 2025

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 175 30 Updated Oct 23, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,851 529 Updated Oct 25, 2025

WholeGraph - large scale Graph Neural Networks

Cuda 105 36 Updated Nov 25, 2024

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,310 192 Updated Feb 7, 2024

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

C++ 4,985 765 Updated Feb 8, 2024

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,657 606 Updated Oct 24, 2025

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python 1,064 129 Updated Apr 17, 2024