
Zero-Config Code Flow for Claude Code & Codex

TypeScript 3,333 247 Updated Oct 31, 2025

The best ChatGPT that $100 can buy.

Python 34,814 3,929 Updated Oct 30, 2025

Build a Claude Code–like CLI coding agent from scratch.

Python 52 6 Updated Sep 21, 2025

🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation feedback, cross-platform NVIDIA/AMD, Kernelbook + KernelBench

Python 92 2 Updated Oct 9, 2025

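An AI model aggregation, management, relay, and distribution system. It converts multiple large models into a unified calling format, supports OpenAI, Claude, Gemini, and other formats, and can be used by individuals or enterprises for internal management and channel distribution. 🍥 The next-generation LLM gateway and AI asset management system supports multiple languages.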

JavaScript 11,896 2,302 Updated Nov 1, 2025

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,184 83 Updated Sep 22, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,324 236 Updated Nov 1, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,048 1,889 Updated Nov 1, 2025

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 81,116 9,008 Updated Nov 1, 2025

3x Faster Inference; Unofficial implementation of EAGLE Speculative Decoding

Python 78 14 Updated Jul 3, 2025

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 452 103 Updated Oct 30, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,160 82 Updated Aug 28, 2025

Bitcoin Core integration/staging tree

C++ 86,558 38,142 Updated Oct 31, 2025

The official Python SDK for Model Context Protocol servers and clients

Python 19,765 2,704 Updated Oct 31, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 2,926 214 Updated Nov 1, 2025

A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment…

Python 1,489 190 Updated Nov 1, 2025

Ongoing research training transformer models at scale

Python 14,029 3,220 Updated Nov 1, 2025

Fast, Flexible and Portable Structured Generation

C++ 1,335 95 Updated Oct 20, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,873 305 Updated Mar 10, 2025

My learning notes/codes for ML SYS.

Python 4,031 243 Updated Oct 6, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,929 285 Updated May 15, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,994 554 Updated Nov 1, 2025

Lightweight Kubernetes

Go 31,181 2,533 Updated Oct 31, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 151,914 31,007 Updated Oct 31, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 94,434 25,732 Updated Nov 1, 2025

Making large AI models cheaper, faster and more accessible

Python 41,220 4,536 Updated Oct 13, 2025

Open deep learning compiler stack for CPU, GPU and specialized accelerators

Python 12,776 3,689 Updated Nov 1, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 39,626 6,851 Updated Nov 1, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,008 1,828 Updated Nov 1, 2025

Super-fast Structured Outputs

Rust 577 38 Updated Oct 20, 2025