yuhong-zhong
164 results for source starred repositories
C++ 58 21 Updated May 31, 2025

Query-Adaptive Vector Search

C++ 59 12 Updated Nov 3, 2025

Run any GUI app in the terminal❗

TypeScript 6,773 153 Updated Oct 26, 2025
C 5 Updated Nov 4, 2025

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 3,491 348 Updated Nov 3, 2025

Infrastructure that's powering E2B Cloud.

Go 699 182 Updated Nov 4, 2025

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,841 245 Updated Oct 22, 2025

Code search MCP for Claude Code. Make entire codebase the context for any coding agent.

TypeScript 4,365 383 Updated Sep 16, 2025

Golang implementation of the Raft consensus protocol

Go 8,812 1,041 Updated Nov 3, 2025

A vector search SQLite extension that runs anywhere!

C 6,355 240 Updated Jan 24, 2025

Trae Agent is an LLM-based agent for general purpose software engineering tasks.

Python 9,880 1,021 Updated Sep 24, 2025

AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.

Rust 3,362 291 Updated Sep 23, 2025

Efficient Compute-Communication Overlap for Distributed LLM Inference

Python 61 4 Updated Oct 31, 2025

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 910 44 Updated Oct 29, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,720 1,510 Updated Nov 4, 2025

A lightweight design for computation-communication overlap.

Cuda 183 8 Updated Oct 10, 2025

NCCL Tests

Cuda 1,323 326 Updated Nov 3, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,210 104 Updated Oct 17, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,161 82 Updated Aug 28, 2025

Fast and memory-efficient exact attention

Python 20,333 2,110 Updated Nov 3, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 4,013 558 Updated Nov 4, 2025

An open source, self-hosted implementation of the Tailscale control server

Go 32,506 1,731 Updated Nov 2, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 800 73 Updated Nov 4, 2025

Optimized primitives for collective multi-GPU communication

C++ 4,204 1,059 Updated Nov 4, 2025

[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents

Python 440 72 Updated Nov 3, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 699 177 Updated Nov 4, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,208 420 Updated Nov 4, 2025
Jupyter Notebook 124 12 Updated Nov 11, 2024

🙌 OpenHands: Code Less, Make More

Python 64,680 7,863 Updated Nov 4, 2025
Python 1 1 Updated May 12, 2025