SmartRAG is a terminal-based Retrieval-Augmented Generation (RAG) system built using LangGraph. It routes user queries through a custom flow that includes message history, query transformation, and document retrieval from a vector store.
🔗 GitHub: https://github.com/aimaster-dev/SmartRAG
- LangGraph-powered RAG pipeline
- Smart routing of user queries
- PDF and Markdown ingestion support
- Optional webpage-to-PDF and PDF-to-Markdown conversion
- OpenAI GPT integration for natural language responses
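To illustrate the ingestion features above, here is a minimal sketch of a Markdown loader that splits files into overlapping chunks for retrieval. The function names and chunk sizes are illustrative assumptions, not SmartRAG's actual parameters:

```python
from pathlib import Path

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks for retrieval."""
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text), 1), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
    return chunks

def load_markdown_dir(data_dir: str) -> dict[str, list[str]]:
    """Read every .md file under data_dir and chunk it (illustrative)."""
    docs = {}
    for path in Path(data_dir).glob("**/*.md"):
        docs[path.name] = chunk_text(path.read_text(encoding="utf-8"))
    return docs
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both sides, a common default in RAG pipelines.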
```
SmartRAG/
├── architecture/      # LangGraph RAG workflow logic
├── data/              # Processed Markdown or PDF content
├── modules/           # Core logic for query handling & doc processing
├── main.py            # Entry point
└── processDocs.py     # Document preprocessing script
```
Follow the steps below to get SmartRAG up and running:
```
git clone https://github.com/aimaster-dev/SmartRAG.git
cd SmartRAG
python3.12 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
choco install wkhtmltopdf  # for HTML-to-PDF conversion (Windows only)
```

Copy and edit the .env file:

```
cp .env.example .env
```

Edit .env to include:
```
OPENAI_API_KEY=your_openai_key
URLS=url1,url2              # Optional: URLs to fetch as PDF
GET_WEB_PAGES_TO_PDF=True
CONVERT_PDF_TO_MD=True
INTERMEDIATE_PDF_DIR=./pdfs
DATA_DIR=./data
```

Then preprocess the documents:

```
python modules/processDocs.py
```
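If you want to sanity-check the configuration from code, a minimal stdlib-only `.env` reader could look like this. SmartRAG itself may load the file differently (e.g. via python-dotenv); this parser is an illustrative sketch:

```python
import os

def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and
    anything after an inline '#' comment."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()
            if not line or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Example: export the parsed values so downstream scripts see them.
# for key, value in load_env().items():
#     os.environ.setdefault(key, value)
```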
⚠️ Make sure to update the `.env` parameters based on your use case.
Run the app:

```
python main.py
```

- User query is passed into a LangGraph workflow.
- Message history is cached and contextually enriched.
- If needed, input is transformed for better retrieval.
- Documents are pulled from a vector store using similarity search.
- GPT model generates a context-aware answer.
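The flow above can be sketched as a plain-Python pipeline. The real project wires these steps as LangGraph nodes and uses a genuine vector store and GPT model; the bag-of-words similarity and stubbed answer step below are illustrative stand-ins:

```python
from collections import Counter
from math import sqrt

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for embedding search)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def transform_query(query: str, history: list[str]) -> str:
    """Enrich the query with recent message history for better retrieval."""
    return " ".join(history[-2:] + [query])

def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
    """Similarity search over an in-memory document store."""
    return sorted(store, key=lambda doc: cosine_sim(query, doc), reverse=True)[:k]

def answer(query: str, history: list[str], store: list[str]) -> str:
    """Full flow: transform -> retrieve -> (stubbed) generate."""
    enriched = transform_query(query, history)
    context = retrieve(enriched, store)
    history.append(query)  # cache the message history
    # A real system would call a GPT model here with the retrieved context.
    return f"Answer based on: {context}"

store = ["LangGraph builds stateful agent graphs",
         "Vector stores enable similarity search",
         "wkhtmltopdf converts web pages to PDF"]
print(answer("how does similarity search work", [], store))
```

Swapping the toy similarity for real embeddings and the stub for an LLM call gives the same shape as the LangGraph workflow described above.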
We welcome contributions!
- Fork the repo
- Create a feature branch
- Submit a pull request
Got a big idea? Open an issue to discuss it first.
For questions, feedback, or collaboration ideas — feel free to open an issue or reach out through GitHub!