Python retrieval-augmented-generation

Open-source Python projects categorized as retrieval-augmented-generation

Top 23 Python retrieval-augmented-generation Projects

  1. ragflow

    RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
  3. storm

    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

    Project mention: Code Explanation: "STORM: Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking" | dev.to | 2025-03-08

    Note: this explanation only covers the knowledge_storm in the storm repo because it aligns with my interests.

  4. LightRAG

    [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    Project mention: 🍥 Hands-on Experience with LightRAG | dev.to | 2025-10-27

    LightRAG examples: https://github.com/HKUDS/LightRAG/tree/main/examples

  5. llmware

    Unified framework for building enterprise RAG pipelines with small, specialized models

    Project mention: How I Learned Generative AI in Two Weeks (and You Can Too): Part 3 - Prompts & Models | dev.to | 2025-05-14

    Notebook for example 3: prompts and models

  6. txtai

    💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26

    GitHub: https://github.com/neuml/txtai

  7. FlagEmbedding

    Retrieval and Retrieval-augmented LLMs

    Project mention: BGE-Reasoner: An open-source framework for reasoning-intensive retrieval | news.ycombinator.com | 2025-08-27
  8. memvid

    Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.

    Project mention: Friday Links #30 — JavaScript Updates, Tools, and Inspiration | dev.to | 2025-10-17

  10. RAG-Anything

    "RAG-Anything: All-in-One RAG Framework"

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26

    GitHub: https://github.com/HKUDS/RAG-Anything

  11. Agent-S

    Agent S: an open agentic framework that uses computers like a human

    Project mention: Show HN: Agent S: an open agentic framework that uses computers | news.ycombinator.com | 2025-05-01
  12. R2R

    SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26

    Citations: Community references, https://github.com/SciPhi-AI/R2R

  13. TaskingAI

    The open source platform for AI-native application development.

  14. AutoRAG

    AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

  15. cognita

    RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

    Project mention: Lists of open-source frameworks for building RAG applications | dev.to | 2025-01-02

    Ideal For: Enterprises seeking a robust framework for large-scale AI applications.

  16. langroid

    Harness LLMs with Multi-Agent Programming

    Project mention: Using Claude Code to modernize a forgotten Linux kernel driver | news.ycombinator.com | 2025-09-07

    > using these tools as a massive force multiplier…

    Even before tools like CC it was the case that LLMs enabled venturing into projects/areas that would be intimidating otherwise. But Claude-Code (and codex-cli as of late) has made this massively more true.

    For example I recently used CC to do a significant upgrade of the Langroid LLM-Agent framework from Pydantic V1 to V2, something I would not have dared to attempt before CC:

    https://github.com/langroid/langroid/releases/tag/0.59.0

    I also created nice collapsible html logs [2] for agent interactions and tool-calls, inspired by @badlogic/Zechner’s Claude-trace [3] (which incidentally is a fantastic tool!).

    [2] https://github.com/langroid/langroid/releases/tag/0.57.0

    [3] https://github.com/badlogic/lemmy/tree/main/apps/claude-trac...

    And added a DSL to specify agentic task termination conditions based on event-sequence patterns:

    https://langroid.github.io/langroid/notes/task-termination/

    Needless to say, the docs are also made with significant CC assistance.

  17. LEANN

    RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

    Project mention: First lightweight local semantic search MCP for Claude Code | news.ycombinator.com | 2025-08-15

    @Berkeley SkyLab, we’re the first to bring semantic search to Claude Code with a fully local index in a novel, lightweight structure. Check it out at LEANN (https://github.com/yichuan-w/LEANN).

  18. MemOS

    Build memory-native AI agents with Memory OS, an open-source framework for long-term memory, retrieval, and adaptive learning in large language models. Agent Memory | Memory System | Memory Management | Memory MCP | MCP System | LLM Memory | Agents Memory System (by MemTensor)

    Project mention: MemOS: Treating "memory" as a first-class resource for LLMs | news.ycombinator.com | 2025-08-18
  19. fastembed

    Fast, Accurate, Lightweight Python library to make State of the Art Embedding

  20. colpali

    The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

    Project mention: Integrating Vision-Language Models into Agentic RAG Systems with ColPali | dev.to | 2025-03-31

    To learn more about ColPali, refer to the official documentation. I would also recommend reading the 9-part blog series on RAG on DailyDoseofDS by Avi Chawla and Akshay Pachaar.

  21. raptor

    The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

    Project mention: Everything About Graph RAG | dev.to | 2025-04-20

    3.2. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval (Stanford Univ, 2024)

  22. raglite

    🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL

    Project mention: Show HN: RAGLite – A Python package for the unhobbling of RAG | news.ycombinator.com | 2024-12-19
  23. rag-demystified

    An LLM-powered advanced RAG pipeline built from scratch

  24. AnglE

    Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard (by SeanLee97)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python retrieval-augmented-generation related posts

  • 🍥 Hands-on Experience with LightRAG

    1 project | dev.to | 27 Oct 2025
  • Wikipedia as a Graph

    5 projects | news.ycombinator.com | 29 Aug 2025
  • 6 Weeks of Claude Code

    6 projects | news.ycombinator.com | 2 Aug 2025
  • How I Learned Generative AI in Two Weeks (and You Can Too): Part 3 - Prompts & Models

    1 project | dev.to | 14 May 2025
  • Show HN: Toller – A Python library for robust async calls

    2 projects | news.ycombinator.com | 13 May 2025
  • Everything About Graph RAG

    4 projects | dev.to | 20 Apr 2025
  • Integrating Vision-Language Models into Agentic RAG Systems with ColPali

    2 projects | dev.to | 31 Mar 2025

Index

What are some of the best open-source retrieval-augmented-generation projects in Python? This list will help you:

# Project Stars
1 ragflow 67,441
2 storm 27,602
3 LightRAG 22,597
4 llmware 14,448
5 txtai 11,800
6 FlagEmbedding 10,831
7 memvid 10,372
8 RAG-Anything 10,061
9 Agent-S 8,126
10 R2R 7,430
11 TaskingAI 5,346
12 AutoRAG 4,399
13 cognita 4,277
14 langroid 3,759
15 LEANN 4,367
16 swirl-search 2,922
17 MemOS 2,986
18 fastembed 2,488
19 colpali 2,307
20 raptor 1,374
21 raglite 1,102
22 rag-demystified 854
23 AnglE 559
