AI Engineer | Data Engineer | Python Developer | Application Architect
🌐 Portfolio | 💼 LinkedIn
A Large Language Model (LLM) is an advanced AI system trained to understand and generate human-like text.
It learns from massive public datasets such as Wikipedia, books, and websites.
Key capabilities:
- Answering questions
- Summarizing content
- Translating languages
- Engaging in human-like conversation
Examples: ChatGPT, Google Gemini, Claude, LLaMA
While LLMs are powerful, they come with limitations in business environments:
- No access to internal documents, company policies, or private databases
- Cannot provide accurate, up-to-date answers about internal or proprietary information
- Lack of context about your organization limits their usefulness in real-world applications
Retrieval-Augmented Generation (RAG) bridges the gap between LLMs and enterprise knowledge:
- Retrieves relevant internal data in real-time
- Augments the LLM's responses with accurate, business-specific context
- Delivers precise and trustworthy answers based on your own content
Result: Smarter, enterprise-ready AI that truly understands your business
This project implements a Retrieval-Augmented Generation (RAG) system using local PDF documents as the knowledge base. Built with LangChain, ChromaDB, and a local LLM via Ollama, this setup enables question-answering over your own documents using efficient vector search and contextual responses from LLMs.
- PDF Loader: Extracts text from PDF files in your local `data/` directory.
- Text Chunking: Splits text into overlapping chunks using `RecursiveCharacterTextSplitter` to preserve context (see the sketch below this list).
- Vector Store (ChromaDB): Stores and indexes embeddings for fast similarity search.
- Local LLM via Ollama: Generates contextual answers using lightweight models like Mistral, all on your local machine.
- Testing Suite: Validates system accuracy using test queries.
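The ingestion side of the pipeline might look roughly like the sketch below. This is illustrative, not the actual code: it assumes the `langchain-community` and `langchain-text-splitters` packages, and the chunk sizes are example values; the real logic lives in populate_database.py.

```python
# Illustrative sketch of the load-and-chunk step (the real code is in populate_database.py).
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def load_and_chunk(data_path: str = "data"):
    # Load every PDF in the data/ directory into LangChain Document objects.
    documents = PyPDFDirectoryLoader(data_path).load()
    # Split into overlapping chunks so each chunk keeps some surrounding context.
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,     # characters per chunk (illustrative value)
        chunk_overlap=80,   # overlap between consecutive chunks
        length_function=len,
    )
    return splitter.split_documents(documents)
```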
```
rag_local_pdfs/
│
├── chroma/                     # Vector store; created when populate_database.py runs
├── data/                       # Local PDFs for ingestion
├── media/                      # Concept images, documents
├── get_embedding_function.py   # Defines embedding logic
├── populate_database.py        # Loads, chunks, embeds, and stores PDFs
├── query_data.py               # Queries ChromaDB and returns LLM response
├── requirements.txt            # Python dependencies
└── README.md
```
```bash
cd rag_local_pdfs
python -m venv venv
source venv/bin/activate   # on Windows use venv\Scripts\activate
pip install -r requirements.txt
```
Put your PDF files in the `data/` directory for ingestion.
Install Ollama from https://ollama.com, then run the command below to download the model:

```bash
ollama run llama3.2
```
Make sure Ollama is running in the background: opening http://127.0.0.1:11434/ in your browser should show "Ollama is running".
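Optionally, a quick Python check (not part of this repo) can confirm the server is reachable:

```python
# Optional sanity check: confirm the local Ollama server responds.
import urllib.request

with urllib.request.urlopen("http://127.0.0.1:11434/") as resp:
    print(resp.read().decode())   # expected output: "Ollama is running"
```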
```bash
python populate_database.py
```
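Conceptually, this step embeds the chunks with a locally served embedding model and persists them in ChromaDB. The sketch below is illustrative: the embedding model name is an assumption (any Ollama-served embedding model works), and the real logic lives in get_embedding_function.py and populate_database.py.

```python
# Illustrative sketch of the populate step (the real code is in populate_database.py).
# The embedding model name is an assumption; see get_embedding_function.py for the actual choice.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

def get_embedding_function():
    # Any embedding model served by Ollama works; "nomic-embed-text" is just an example.
    return OllamaEmbeddings(model="nomic-embed-text")

def add_to_chroma(chunks, persist_dir: str = "chroma"):
    # Embed each chunk and index it in the persistent Chroma store.
    db = Chroma(persist_directory=persist_dir, embedding_function=get_embedding_function())
    db.add_documents(chunks)
    return db
```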
```bash
python query_data.py "What is the summary of the document?"
```
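Under the hood, the query step retrieves the chunks most similar to the question and asks the local model to answer from that context. A minimal sketch, assuming LangChain's Ollama wrapper and the repo's get_embedding_function module (the prompt wording and `k` value are illustrative; the real code is in query_data.py):

```python
# Minimal sketch of the query flow (the real code is in query_data.py).
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma

from get_embedding_function import get_embedding_function  # repo module

def query_rag(question: str, persist_dir: str = "chroma") -> str:
    db = Chroma(persist_directory=persist_dir, embedding_function=get_embedding_function())
    # Retrieve the top-k chunks most similar to the question to ground the answer.
    results = db.similarity_search(question, k=5)
    context = "\n\n---\n\n".join(doc.page_content for doc in results)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    # The model name must match one already pulled with `ollama run` / `ollama pull`.
    return Ollama(model="llama3.2").invoke(prompt)
```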
**Local LLM Support**
Uses Ollama for running models like:
- llama3.2
- gemma
Change the model name in query_data.py if you prefer a different LLM (see the example below).
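For example, assuming query_data.py instantiates the model through LangChain's Ollama wrapper, switching models is a one-line change (the model must already be pulled locally):

```python
from langchain_community.llms import Ollama

llm = Ollama(model="gemma")   # e.g. instead of Ollama(model="llama3.2")
```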
- GUI or Streamlit interface
- Multi-file PDF summarization
- Real-time chat over documents
- RAG + Agent framework (e.g., LangGraph or CrewAI)