SGLang is a high-performance serving framework for large language models and multimodal models.

Python 22,245 4,011 Updated Jan 11, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 67,238 12,510 Updated Jan 11, 2026

HIPIFY: Convert CUDA to Portable C++ Code

C++ 646 102 Updated Jan 11, 2026

Community maintained hardware plugin for vLLM on Ascend

Python 1,553 716 Updated Jan 11, 2026

Visualizer for neural network, deep learning and machine learning models

JavaScript 32,159 3,059 Updated Jan 11, 2026

A debugging and profiling tool that can trace and visualize Python code execution

Python 7,503 471 Updated Jan 11, 2026

GitHub mirror of the triton-lang/triton repo.

MLIR 119 33 Updated Jan 11, 2026

Development repository for the Triton language and compiler

MLIR 18,083 2,495 Updated Jan 11, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,205 2,996 Updated Jan 11, 2026

The Modular Platform (includes MAX & Mojo)

Mojo 25,433 2,760 Updated Jan 10, 2026

PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle)

C++ 101 210 Updated Jan 10, 2026

Tile primitives for speedy kernels

Cuda 3,044 222 Updated Jan 10, 2026

Fast and memory-efficient exact attention

Python 21,540 2,271 Updated Jan 10, 2026

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 114,139 12,098 Updated Jan 10, 2026

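PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)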

C++ 23,565 5,917 Updated Jan 10, 2026

Ascend TileLang adapter

C++ 181 61 Updated Jan 10, 2026

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,514 702 Updated Jan 10, 2026

PaddleFormers is an easy-to-use model zoo of pre-trained large language models, built on PaddlePaddle.

Python 12,948 2,159 Updated Jan 10, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 4,583 385 Updated Jan 10, 2026

The IBM/charts repository provides helm charts for IBM and Third Party middleware.

Smarty 310 412 Updated Jan 10, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 16,331 1,198 Updated Jan 10, 2026

FlashInfer: Kernel Library for LLM Serving

Python 4,565 636 Updated Jan 10, 2026

Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.

Python 309 58 Updated Jan 10, 2026

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,214 349 Updated Jan 10, 2026

TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels

Python 183 15 Updated Jan 10, 2026

The new Windows Terminal and the original Windows console host, all in the same place!

C++ 101,326 9,015 Updated Jan 9, 2026

Apache Spark - A unified analytics engine for large-scale data processing

Scala 42,615 28,996 Updated Jan 9, 2026

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,797 95 Updated Jan 9, 2026

Train transformer language models with reinforcement learning.

Python 16,918 2,412 Updated Jan 9, 2026

Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner as markdown

TypeScript 85,235 8,488 Updated Jan 9, 2026