
Understanding LLMs & RAG for Enterprise AI

Presented by: Shekar Kaki

AI Engineer | Data Engineer | Python Developer | Application Architect
🌐 Portfolio | 💼 LinkedIn


First, let's understand what an LLM is.

1. What is an LLM?

A Large Language Model (LLM) is an advanced AI system trained to understand and generate human-like text.
It learns from massive public datasets such as Wikipedia, books, and websites.

Key capabilities:

  • Answering questions
  • Summarizing content
  • Translating languages
  • Engaging in human-like conversation

Examples: ChatGPT, Google Gemini, Claude, LLaMA


2. The Limitations of LLMs in Enterprises

While LLMs are powerful, they come with limitations in business environments:

  • No access to internal documents, company policies, or private databases
  • Cannot provide accurate, up-to-date answers about internal or proprietary information
  • Lack of context about your organization limits their usefulness in real-world applications

3. Why RAG is Needed

Retrieval-Augmented Generation (RAG) bridges the gap between LLMs and enterprise knowledge:

  • Retrieves relevant internal data in real-time
  • Augments the LLM's responses with accurate, business-specific context
  • Delivers precise and trustworthy answers based on your own content

Result: Smarter, enterprise-ready AI that truly understands your business.

How Enterprise RAG Works
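In code terms, the flow looks roughly like this. A minimal conceptual sketch (the function and object names are illustrative, not code from this repository; vector_store and llm stand for LangChain-style objects such as a Chroma store and an Ollama model):

def answer_with_rag(question, vector_store, llm):
    # 1. Retrieve: find the stored chunks most similar to the question
    docs = vector_store.similarity_search(question, k=5)
    context = "\n\n".join(doc.page_content for doc in docs)
    # 2. Augment: add the retrieved enterprise context to the prompt
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # 3. Generate: the LLM answers grounded in your own content
    return llm.invoke(prompt)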

rag_local_pdfs — Retrieval-Augmented Generation with Local PDFs

This project implements a Retrieval-Augmented Generation (RAG) system using local PDF documents as the knowledge base. Built with LangChain, ChromaDB, and a local LLM via Ollama, this setup enables question-answering over your own documents using efficient vector search and contextual responses from LLMs.


Features

  • PDF Loader: Extracts text from PDF files in your local data/ directory.
  • Text Chunking: Splits text into overlapping chunks using RecursiveCharacterTextSplitter to preserve context (a short sketch follows this list).
  • Vector Store (ChromaDB): Stores and indexes embeddings for fast similarity search.
  • Local LLM via Ollama: Generates contextual answers using lightweight models like Mistral, all on your local machine.
  • Testing Suite: Validate system accuracy using test queries.
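
As a rough illustration of the chunking step, this is how RecursiveCharacterTextSplitter is commonly used (the chunk sizes are illustrative, not necessarily the values in populate_database.py; on older LangChain versions the import is from langchain.text_splitter instead):

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,    # illustrative value; tune for your documents
    chunk_overlap=80,  # overlap keeps sentences that straddle a boundary in context
)

# split_text works on a raw string; populate_database.py would call
# splitter.split_documents(...) on the Document objects loaded from data/.
sample_text = "Your PDF text would go here. " * 100
chunks = splitter.split_text(sample_text)
print(len(chunks), "chunks")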

Project Structure

rag_local_pdfs/
│
├── chroma/                    # Chroma vector store (created when you run populate_database.py)
├── data/                      # Local PDFs for ingestion
├── media/                     # Concept images, documents
├── get_embedding_function.py  # Defines embedding logic
├── populate_database.py       # Loads, chunks, embeds, and stores PDFs
├── query_data.py              # Queries ChromaDB and returns LLM response
├── requirements.txt           # Python dependencies
└── README.md
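
For orientation, a minimal sketch of what get_embedding_function.py could look like, assuming a local Ollama embedding model such as nomic-embed-text (the actual file may use a different model or provider):

from langchain_community.embeddings import OllamaEmbeddings

def get_embedding_function():
    # Assumes the Ollama server is running locally and the embedding model
    # has been pulled first, e.g. `ollama pull nomic-embed-text`.
    return OllamaEmbeddings(model="nomic-embed-text")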

Setup Instructions

1. Clone the repository

git clone https://github.com/shekar369/rag_local_pdfs.git
cd rag_local_pdfs

2. Create virtual environment & install dependencies

python -m venv venv
source venv/bin/activate   # on Windows use venv\Scripts\activate
pip install -r requirements.txt

3. Add PDFs to data/ folder

Put your PDF files in the data/ directory for ingestion.

4. Install and run the Ollama server to run the LLM locally

Install Ollama from https://ollama.com, then run the command below to download and start the model:

ollama run llama3.2

Make sure Ollama is running in the background: open http://127.0.0.1:11434/ and you should see "Ollama is running" on the page.
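
If you prefer, the same check can be done from Python (a quick sanity check, not part of the project scripts; requires the requests package):

import requests

# The Ollama server answers "Ollama is running" on its root endpoint.
print(requests.get("http://127.0.0.1:11434/").text)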

5. Populate the vector database

python populate_database.py

Sample illustration showing how data flows from the source documents to the vector database and on to the response returned to the user.
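
For reference, a simplified sketch of what this step does (module paths vary slightly between LangChain versions, chunk sizes are illustrative, and the real populate_database.py may also assign chunk IDs or reset the store):

from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from get_embedding_function import get_embedding_function

# 1. Load every PDF in data/
documents = PyPDFDirectoryLoader("data").load()

# 2. Split the pages into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=80)
chunks = splitter.split_documents(documents)

# 3. Embed the chunks and persist them in the local chroma/ store
Chroma.from_documents(chunks, get_embedding_function(), persist_directory="chroma")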

6. Query your documents

python query_data.py "What is the summary of the document?"

Sample illustration showing how the response flows from the vector database back to the user.

A successful setup will return a response summarizing the PDF you added.
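
Under the hood, the query path looks roughly like this (a simplified sketch assuming the chroma/ store and a local llama3.2 model; the actual query_data.py may use a prompt template and also print source references):

import sys

from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma
from get_embedding_function import get_embedding_function

question = sys.argv[1]

# Load the persisted vector store and retrieve the most relevant chunks
db = Chroma(persist_directory="chroma", embedding_function=get_embedding_function())
docs = db.similarity_search(question, k=5)
context = "\n\n".join(doc.page_content for doc in docs)

# Ask the local model, grounding its answer in the retrieved context
llm = Ollama(model="llama3.2")
print(llm.invoke(f"Answer based only on this context:\n{context}\n\nQuestion: {question}"))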

Local LLM Support

Uses Ollama for running models like:

  • llama3.2
  • gemma

Change the model name in query_data.py if you prefer a different LLM, as shown below.
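
For example, assuming the script builds the model with LangChain's Ollama wrapper, switching to gemma is a one-line change (pull the model first with ollama pull gemma; the exact variable name in the script may differ):

from langchain_community.llms import Ollama

# llm = Ollama(model="llama3.2")   # default assumed in this README
llm = Ollama(model="gemma")        # any locally pulled Ollama model works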

📌 Future Enhancements

  • GUI or Streamlit interface
  • Multi-file PDF summarization
  • Real-time chat over documents
  • RAG + Agent framework (e.g., LangGraph or CrewAI)
