Python 888 347 Updated Sep 14, 2025

best way to save what you love

Svelte 37,019 3,065 Updated Oct 13, 2025

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ 902 103 Updated Jul 10, 2025

Tech Enthusiast Weekly, published every Friday

77,929 3,680 Updated Oct 24, 2025

🤯 Lobe Chat - an open-source, modern design AI chat framework. Supports multiple AI providers (OpenAI / Claude 4 / Gemini / DeepSeek / Ollama / Qwen), Knowledge Base (file upload / RAG ), one click…

TypeScript 67,230 13,895 Updated Oct 28, 2025

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

TypeScript 66,705 7,072 Updated Oct 28, 2025

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 18,772 1,279 Updated Oct 21, 2025

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 2,061 122 Updated Jun 1, 2023

DataComp for Language Models

HTML 1,381 128 Updated Sep 9, 2025

TLLM_QMM extracts the quantized kernel implementations from NVIDIA's TensorRT-LLM, removes the NVInfer dependency, and exposes an easy-to-use PyTorch module. We modified the dequantization and weight preproc…

C++ 16 2 Updated Jul 5, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,092 966 Updated Jul 1, 2024

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,323 277 Updated Jul 17, 2025

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 684 62 Updated Aug 14, 2025

Golang Version Manager

Go 2,512 243 Updated Sep 17, 2025

Archive of historical WeChat versions

Shell 718 64 Updated Oct 22, 2025

This is our own implementation of 'Layer Selective Rank Reduction'

Python 239 28 Updated May 26, 2024

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,404 200 Updated Oct 24, 2025

Official implementations for paper: Anydoor: zero-shot object-level image customization

Python 4,184 371 Updated Apr 8, 2024

High-speed Large Language Model Serving for Local Deployment

C++ 8,372 448 Updated Aug 2, 2025

FaceChain is a deep-learning toolchain for generating your Digital-Twin.

Jupyter Notebook 9,490 891 Updated Jun 6, 2025

Yuan 2.0 Large Language Model

Python 689 85 Updated Jul 11, 2024

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 59,126 7,176 Updated Oct 4, 2025

Fast and memory-efficient exact attention

Python 20,218 2,089 Updated Oct 28, 2025

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 19 3 Updated Jul 20, 2023

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

Python 15,434 2,267 Updated Aug 15, 2025

TigerBot: A multi-language multi-task LLM

Python 2,257 189 Updated Dec 28, 2024

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,859 533 Updated Oct 28, 2025

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Python 4,983 739 Updated Jan 21, 2025

SoftVC VITS Singing Voice Conversion

Python 27,712 5,066 Updated Nov 11, 2023