Stars
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
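A minimal sketch of typical tiktoken usage (the encoding name "cl100k_base" is from tiktoken's docs; the sample string is arbitrary):

```python
import tiktoken

# Load a named BPE encoding; "cl100k_base" is the encoding used by
# several OpenAI chat models.
enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("hello world")   # text -> list of token ids
text = enc.decode(tokens)            # token ids -> text
assert text == "hello world"
```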
A high-throughput and memory-efficient inference and serving engine for LLMs
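A short sketch of vLLM's offline inference API (the model id and prompt are arbitrary examples):

```python
from vllm import LLM, SamplingParams

# Load a model and generate with nucleus sampling; any Hugging Face
# causal-LM id should work here.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```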
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
Fast and memory-efficient exact attention
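A sketch of the fused kernel's Python entry point, per the flash-attention docs (shapes and sizes are arbitrary examples):

```python
import torch
from flash_attn import flash_attn_func

# q, k, v are (batch, seqlen, num_heads, head_dim) and must be fp16 or
# bf16 tensors on a CUDA device for the fused kernel.
q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

out = flash_attn_func(q, k, v, causal=True)  # same shape as q
```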
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models for text, vision, audio, and multimodal tasks, for both inference and training.
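The pipeline API is the usual one-liner entry point (the task name and input are arbitrary examples):

```python
from transformers import pipeline

# pipeline() hides tokenization, model loading, and decoding behind
# a single callable.
classifier = pipeline("sentiment-analysis")
print(classifier("This library makes inference one line of code."))
```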
Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
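A sketch of a minimal Lean algorithm in Python, following the QCAlgorithm pattern from QuantConnect's docs (ticker, dates, and cash are arbitrary examples):

```python
from AlgorithmImports import *

class BuyAndHold(QCAlgorithm):
    def Initialize(self):
        self.SetStartDate(2022, 1, 1)
        self.SetCash(100000)
        self.AddEquity("SPY", Resolution.Daily)

    def OnData(self, data):
        if not self.Portfolio.Invested:
            self.SetHoldings("SPY", 1.0)  # allocate 100% of the portfolio to SPY
```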
The Triton backend for TensorRT.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
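A sketch of the classic ONNX-to-engine build flow from TensorRT's Python API (details shift between major versions; the file path is a placeholder):

```python
import tensorrt as trt

# Parse an ONNX model and build a serialized engine from it.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
engine = builder.build_serialized_network(network, config)  # bytes to save or deserialize
```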
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
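A skeleton of a Python-backend model following the TritonPythonModel interface from the backend's docs; the tensor names "INPUT0"/"OUTPUT0" and the echo logic are placeholder assumptions, and the file only runs inside a Triton model repository:

```python
# model.py
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            out0 = pb_utils.Tensor("OUTPUT0", in0.as_numpy())  # echo the input back
            responses.append(pb_utils.InferenceResponse(output_tensors=[out0]))
        return responses
```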
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
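A minimal sketch of the tokenizers API (the model id and sample sentence are arbitrary examples):

```python
from tokenizers import Tokenizer

# Load a pretrained tokenizer from the Hugging Face Hub.
tok = Tokenizer.from_pretrained("bert-base-uncased")

enc = tok.encode("Fast tokenizers, written in Rust.")
print(enc.tokens)  # subword strings
print(enc.ids)     # corresponding vocabulary ids
```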
📄 Configuration files that enhance Cursor AI editor experience with custom rules and behaviors
Enhanced MCP server for interactive user feedback and command execution in AI-assisted development, featuring dual interface support (Web UI and Desktop Application) with intelligent environment de…
Rules and Knowledge to work better with agents such as Claude Code or Cursor
A Lucene codec for vector search and clustering on the GPU
XLA Launcher is a high-performance, lightweight C++ library designed to provide a simple interface for loading and executing computation graphs represented in the StableHLO format.
Dockerfile templates for creating RAPIDS Docker Images
Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
A distributed high-performance dynamic lookup-table-style Embedding designed for recommendation, search, CTR and advertising systems. Supports GPU, CPU, remote distributed KV (such as Redis), SSD, a…
A std::execution-style runtime context and high-performance RPC transport built on OpenUCX, including CUDA/ROCM/... devices with RDMA.
A minimal GPU design in Verilog to learn how GPUs work from the ground up
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
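Recent releases document a high-level LLM API; a sketch assuming tensorrt_llm.LLM and SamplingParams as described there, with an arbitrary model id:

```python
from tensorrt_llm import LLM, SamplingParams

# The LLM class builds or loads a TensorRT engine for the model on first use.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(temperature=0.8, max_tokens=64)

for out in llm.generate(["Explain KV caching in one sentence."], params):
    print(out.outputs[0].text)
```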
Implementation of bitonic mergesort for the GPU
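For illustration, the same compare-and-swap network in pure Python (the repo itself targets the GPU); every stage is data-independent, which is why bitonic sort parallelizes so well. The list length must be a power of two:

```python
def bitonic_sort(a):
    """In-place bitonic mergesort; len(a) must be a power of two."""
    n = len(a)
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j > 0:       # compare distance within each merge stage
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

assert bitonic_sort([7, 3, 6, 1, 8, 2, 5, 4]) == [1, 2, 3, 4, 5, 6, 7, 8]
```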
Distributed transactional key-value database, originally created to complement TiDB
A Datacenter Scale Distributed Inference Serving Framework
NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, con…
A probabilistic programming library for Bayesian deep learning and generative models, built on TensorFlow
SGLang is a high-performance serving framework for large language models and multimodal models.
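SGLang exposes an OpenAI-compatible HTTP API; a sketch assuming a server was launched locally with `python -m sglang.launch_server --model-path <model>` (port 30000 is its documented default):

```python
import openai

# Point the standard OpenAI client at the local SGLang server.
client = openai.OpenAI(base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="default",  # SGLang serves the launched model under this name
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(resp.choices[0].message.content)
```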
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
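A sketch of the MilvusClient quickstart flow from pymilvus (collection name, dimension, and the local Milvus Lite file are arbitrary examples):

```python
import random
from pymilvus import MilvusClient

# MilvusClient can target a local Milvus Lite file or a server URI.
client = MilvusClient("milvus_demo.db")
client.create_collection(collection_name="demo", dimension=8)

vecs = [{"id": i, "vector": [random.random() for _ in range(8)]} for i in range(100)]
client.insert(collection_name="demo", data=vecs)

hits = client.search(
    collection_name="demo",
    data=[[random.random() for _ in range(8)]],  # one query vector
    limit=3,
)
print(hits)
```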