rhdong
🎯 Focusing
  • NVIDIA Corporation
  • Santa Clara, California

Organizations

@tensorflow @rapidsai @NVIDIA-Merlin

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 17,280 1,380 Updated Feb 8, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 70,357 13,460 Updated Feb 15, 2026

cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it

Python 683 143 Updated Feb 3, 2026

Fast and memory-efficient exact attention

Python 22,259 2,383 Updated Feb 14, 2026

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 156,501 32,079 Updated Feb 15, 2026

Lean Algorithmic Trading Engine by QuantConnect (Python, C#)

C# 16,466 4,332 Updated Feb 13, 2026

The Triton backend for TensorRT.

C++ 86 35 Updated Feb 9, 2026

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,693 2,315 Updated Feb 13, 2026

Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.

C++ 669 192 Updated Feb 15, 2026

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Rust 10,473 1,034 Updated Feb 11, 2026

📄 Configuration files that enhance Cursor AI editor experience with custom rules and behaviors

MDX 37,850 3,197 Updated Oct 24, 2025

Enhanced MCP server for interactive user feedback and command execution in AI-assisted development, featuring dual interface support (Web UI and Desktop Application) with intelligent environment de…

JavaScript 3,568 326 Updated Jun 29, 2025

Rules and Knowledge to work better with agents such as Claude Code or Cursor

Shell 5,573 501 Updated Dec 31, 2025

A Lucene codec for vector search and clustering on the GPU

Java 7 11 Updated Feb 4, 2026

XLA Launcher is a high-performance, lightweight C++ library designed to provide a simple interface for loading and executing computation graphs represented in the StableHLO format.

C++ 3 1 Updated Aug 1, 2025

Dockerfile templates for creating RAPIDS Docker Images

Shell 84 55 Updated Feb 12, 2026

Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ

Python 1,727 90 Updated Feb 4, 2026

A distributed high-performance dynamic lookuptable-style Embedding designed for recommendation, search, CTR and advertising systems. Supports GPU, CPU, remote distributed KV (such as Redis), SSD, a…

4 3 Updated Nov 26, 2024

A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA.

C++ 29 3 Updated Feb 10, 2026

A minimal GPU design in Verilog to learn how GPUs work from the ground up

SystemVerilog 11,513 1,006 Updated Aug 18, 2024

NVIDIA Inference Xfer Library (NIXL)

C++ 885 241 Updated Feb 13, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 12,885 2,101 Updated Feb 15, 2026

Demos for deep learning

Python 739 151 Updated Dec 4, 2024

Implementation of bitonic mergesort for the GPU

C++ 2 1 Updated Aug 17, 2018

Distributed transactional key-value database, originally created to complement TiDB

Rust 16,525 2,244 Updated Feb 15, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,098 856 Updated Feb 15, 2026

NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, con…

Python 2,845 294 Updated Feb 13, 2026

A probabilistic programming library for Bayesian deep learning and generative models, based on TensorFlow

Python 2,218 417 Updated Dec 17, 2022

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 23,534 4,446 Updated Feb 15, 2026

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

Go 42,763 3,826 Updated Feb 13, 2026