Stars
A highly optimized LLM inference acceleration engine for Llama and its variants.
🤯 Lobe Chat - an open-source, modern design AI chat framework. Supports multiple AI providers (OpenAI / Claude 4 / Gemini / DeepSeek / Ollama / Qwen), Knowledge Base (file upload / RAG ), one click…
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
OCR, layout analysis, reading order, table recognition in 90+ languages
800,000 step-level correctness labels on LLM solutions to MATH problems
TLLM_QMM extracts the quantized kernel implementations from NVIDIA's TensorRT-LLM, removes the NVInfer dependency, and exposes an easy-to-use PyTorch module. We modified the dequantization and weight preproc…
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Model Compression Toolbox for Large Language Models and Diffusion Models
Archive of historical WeChat versions
This is our own implementation of 'Layer Selective Rank Reduction'
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Official implementation of the paper "AnyDoor: Zero-shot Object-level Image Customization"
High-speed Large Language Model Serving for Local Deployment
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Fast and memory-efficient exact attention
Ongoing research training transformer language models at scale, including: BERT & GPT-2
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
TigerBot: A multi-language multi-task LLM
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…
A VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion
SoftVC VITS Singing Voice Conversion