yaox12

🐳

Slacking

Xin Yao yaox12

114 followers · 34 following

Stars

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 3,958 541 Updated Oct 25, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 19,380 3,159 Updated Oct 25, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 60,998 10,768 Updated Oct 25, 2025

pyutils / line_profiler

Line-by-line profiling for Python

Python 3,134 130 Updated Oct 17, 2025

NVIDIA / nvidia-resiliency-ext

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 228 34 Updated Oct 24, 2025

chemharuka / toGainMapHDR

A tool to convert HDR files to Adaptive HDR (Gain Map HDR) and ISO HDR formats in HEIC

Swift 84 4 Updated Oct 25, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 20,159 2,084 Updated Oct 25, 2025

huggingface / safetensors

Simple, safe way to store and distribute tensors

Python 3,488 274 Updated Oct 25, 2025

NVIDIA / cccl

CUDA Core Compute Libraries

C++ 1,986 282 Updated Oct 25, 2025

typst / typst

A new markup-based typesetting system that is powerful and easy to learn.

Rust 47,307 1,284 Updated Oct 24, 2025

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 13,943 3,186 Updated Oct 25, 2025

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 151,603 30,932 Updated Oct 25, 2025

NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 358 68 Updated Oct 25, 2025

NVIDIA / NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++ 460 63 Updated Oct 21, 2025

asottile / pyupgrade

A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.

Python 3,927 199 Updated Oct 14, 2025

nv-legate / cupynumeric

NumPy and SciPy on Multi-Node Multi-GPU systems

Python 935 86 Updated Oct 24, 2025

CVCUDA / CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,594 241 Updated May 21, 2025

wjakob / nanobind

nanobind: tiny and efficient C++/Python bindings

C++ 3,095 257 Updated Oct 17, 2025

suo / lintrunner

Rust 28 17 Updated Jul 3, 2025

StaZhu / enable-chromium-hevc-hardware-decoding

A guide that teaches you to enable hardware HEVC decoding & encoding for Chrome / Edge, or build a custom version of Chromium / Electron that supports hardware & software HEVC decoding and hardware HEVC…

JavaScript 1,386 70 Updated Sep 10, 2025

arogozhnikov / einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,237 385 Updated Aug 12, 2025

openucx / ucc

Unified Collective Communication Library

C 278 117 Updated Oct 24, 2025

openucx / ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,480 489 Updated Oct 23, 2025

NVIDIA-Merlin / HierarchicalKV

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 175 30 Updated Oct 23, 2025

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,851 529 Updated Oct 25, 2025

rapidsai / wholegraph

WholeGraph - large scale Graph Neural Networks

Cuda 105 36 Updated Nov 25, 2024

NVIDIA / libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,310 192 Updated Feb 7, 2024

NVIDIA / thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

C++ 4,985 765 Updated Feb 8, 2024

llvm / torch-mlir

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,657 606 Updated Oct 24, 2025

pytorch / torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python 1,064 129 Updated Apr 17, 2024
