jokerwyt
🚧 Working

Highlights

  • Pro

Starred repositories

Rust bindings for the Python interpreter

Rust 15,195 925 Updated Jan 19, 2026

Programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 97,238 10,771 Updated Jan 19, 2026

Perplexity open source garden for inference technology

Rust 343 28 Updated Dec 25, 2025

Yet Another Document Translator

Python 7,506 588 Updated Jan 16, 2026

Fil-C: completely compatible memory safety for C and C++

2,899 57 Updated Jan 21, 2026

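⚡ Clash for Lab is a proxy tool for bypassing network restrictions, designed for lab environments; it needs no sudo privileges and installs cleanly with a one-click script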

Shell 259 12 Updated Dec 11, 2025

A programmer's guide to living longer

34,704 2,375 Updated May 19, 2025

A Throughput-Optimized Pipeline Parallel Inference System for Large Language Models

Python 46 2 Updated Dec 24, 2025
C++ 340 33 Updated Jan 4, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 642 143 Updated Jan 22, 2026

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Python 2,577 166 Updated Jan 22, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 8,908 1,069 Updated Jan 20, 2026

Practical GPU Sharing Without Memory Size Constraints

C 299 32 Updated Mar 28, 2025

Optimized primitives for collective multi-GPU communication

C++ 4,398 1,117 Updated Jan 9, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

Python 4,783 407 Updated Jan 21, 2026

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 377 46 Updated Jan 21, 2026

Questions to ask the interviewer at the end of a technical interview

18,373 1,386 Updated Mar 4, 2024

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 797 86 Updated Nov 28, 2025

A lightweight design for computation-communication overlap.

Python 212 10 Updated Jan 20, 2026

Python bindings for UCX

Python 140 64 Updated Sep 18, 2025

Tile primitives for speedy kernels

Cuda 3,096 229 Updated Jan 17, 2026

Perplexity GPU Kernels

C++ 554 75 Updated Nov 7, 2025

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,552 514 Updated Jan 22, 2026

NVIDIA Inference Xfer Library (NIXL)

C++ 830 227 Updated Jan 22, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,614 522 Updated Jan 22, 2026

Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton

Python 39 1 Updated Feb 13, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,226 89 Updated Aug 28, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,812 792 Updated Jan 22, 2026

Development repository for the Triton language and compiler

MLIR 18,210 2,516 Updated Jan 22, 2026