Thanks to visit codestin.com
Credit goes to github.com

Skip to content

SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.

License

Notifications You must be signed in to change notification settings

aimaster-dev/SmartRAG

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🚀 SmartRAG

SmartRAG is a terminal-based Retrieval-Augmented Generation (RAG) system built using LangGraph. It routes user queries through a custom flow that includes message history, query transformation, and document retrieval from a vector store.

🔗 GitHub: https://github.com/aimaster-dev/SmartRAG


🧠 Features

  • LangGraph-powered RAG pipeline
  • Smart routing of user queries
  • PDF and Markdown ingestion support
  • Optional webpage-to-PDF and PDF-to-Markdown conversion
  • OpenAI GPT integration for natural language responses

🗂️ Project Structure

SmartRAG/
├── architecture/         # LangGraph RAG workflow logic
├── data/                 # Processed markdown or PDF content
├── modules/              # Core logic for query handling & doc processing
├── main.py               # Entry point
└── processDocs.py        # Document preprocessing script

⚙️ Setup

Follow the steps below to get SmartRAG up and running:

1. Clone the Repo

git clone https://github.com/aimaster-dev/SmartRAG.git
cd SmartRAG

2. Create Virtual Environment

python3.12 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt
choco install wkhtmltopdf  # for HTML to PDF conversion (Windows only)

4. Environment Setup

Copy and edit the .env file:

cp .env.example .env

Edit .env to include:

OPENAI_API_KEY=your_openai_key
URLS=url1,url2         # Optional: URLs to fetch as PDF
GET_WEB_PAGES_TO_PDF=True
CONVERT_PDF_TO_MD=True
INTERMEDIATE_PDF_DIR=./pdfs
DATA_DIR=./data

5. Process Your Documents

python modules/processDocs.py

⚠️ Make sure to update .env parameters based on your use case.

6. Run SmartRAG

python main.py

🔍 How It Works

  1. User query is passed into a LangGraph workflow.
  2. Message history is cached and contextually enriched.
  3. If needed, input is transformed for better retrieval.
  4. Documents are pulled from a vector store using similarity search.
  5. GPT model generates a context-aware answer.

🖼️ Architecture Overview

📄 Vector Store Creation

Vector DB

🧠 RAG Pipeline

RAG Flow


🤝 Contributing

We welcome contributions!

  • Fork the repo
  • Create a feature branch
  • Submit a pull request

Got a big idea? Open an issue to discuss it first.


📬 Contact

For questions, feedback, or collaboration ideas — feel free to open an issue or reach out through GitHub!


About

SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages