Stars
A lightweight, powerful framework for multi-agent workflows
AI application DSL workflows built independently on Dify, free to use; whether for personal needs or learning purposes, they can open up an intelligent journey full of possibilities.
Sharing some useful Dify DSL workflows, suitable both for personal use and for learning.
Production-ready platform for agentic workflow development.
A higher-performance OpenAI LLM service than vLLM serve: a pure C++ high-performance OpenAI LLM service implemented with GRPS + TensorRT-LLM + Tokenizers.cpp, supporting chat and function calls, AI agents…
This is the RAG Modules repo, which collects various modules from the RAG ecosystem.
[SIGIR 2024] Boosting Conversational Question Answering with Fine-Grained Retrieval-Augmentation and Self-Check
RAGChecker: A Fine-grained Framework For Diagnosing RAG
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Retrieval and Retrieval-augmented LLMs
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
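As a rough illustration of that Python API, here is a minimal sketch using the high-level LLM entry point found in recent TensorRT-LLM releases; the import path, constructor arguments, and the example model name are assumptions and vary across versions.

```python
# Minimal sketch of TensorRT-LLM's high-level Python API (recent releases).
# Import path, arguments, and model name are assumptions; check your version's docs.
from tensorrt_llm import LLM, SamplingParams

# Builds or loads a TensorRT engine for the given Hugging Face checkpoint.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

params = SamplingParams(temperature=0.8, max_tokens=64)
for out in llm.generate(["What does TensorRT-LLM optimize?"], params):
    print(out.outputs[0].text)
```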
The Triton TensorRT-LLM Backend
The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity".
FinGLM: dedicated to building an open, non-profit, and long-lasting financial LLM project, using open source to advance "AI + finance".
canghongjian / vllm
Forked from vllm-project/vllm. ChatGLM2 support for vLLM.
Llama Chinese community: aggregates the latest Llama learning resources in real time and builds the best open-source ecosystem for Chinese Llama LLMs; fully open source and commercially usable.
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain with language models such as ChatGLM, Qwen, and Llama; a local-knowledge-based LLM application.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A high-throughput and memory-efficient inference and serving engine for LLMs
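For context, this is vLLM's standard offline-batching quickstart; the small OPT checkpoint is only an example model.

```python
# Offline batched inference with vLLM; facebook/opt-125m is only an example checkpoint.
from vllm import LLM, SamplingParams

prompts = ["Hello, my name is", "The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

llm = LLM(model="facebook/opt-125m")              # loads weights and allocates the paged KV cache
outputs = llm.generate(prompts, sampling_params)  # prompts are batched continuously on the GPU

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```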
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
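A minimal sketch of serving a chat model with LMDeploy's pipeline API; the model identifier is an example and details may differ between releases.

```python
# Minimal LMDeploy usage sketch; the model identifier is an example only.
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")   # downloads the model and starts the inference engine
responses = pipe(["Hi, please introduce yourself.", "What is quantization?"])
print(responses)
```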
The official GitHub page for the survey paper "A Survey of Large Language Models".
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
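A sketch of what that conversational workflow looks like with PandasAI's 2.x SmartDataframe API (entry points differ in other releases); the CSV path, LLM backend, and question are assumptions.

```python
# Conversational data analysis with PandasAI 2.x; file name and question are placeholders.
import pandas as pd
from pandasai import SmartDataframe
from pandasai.llm import OpenAI

df = pd.read_csv("sales.csv")                       # any tabular source: SQL, CSV, parquet, ...
sdf = SmartDataframe(df, config={"llm": OpenAI()})  # wraps the frame with an LLM backend

print(sdf.chat("Which region had the highest revenue last quarter?"))
```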
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
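As a rough sketch of that API, quantization boils down to loading a model with a quantize config, calibrating on a few tokenized examples, and saving the result; the model name and the single calibration sentence below are placeholders.

```python
# GPTQ post-training quantization with AutoGPTQ; the model name and the single
# calibration example are placeholders (real calibration needs many samples).
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

examples = [tokenizer("AutoGPTQ is an easy-to-use quantization package based on GPTQ.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

model.quantize(examples)               # runs the GPTQ algorithm layer by layer
model.save_quantized("opt-125m-4bit")  # reload with AutoGPTQForCausalLM.from_quantized(...)
```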
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
4-bit quantization of LLaMA using GPTQ.