Offline Doc Chat is a privacy-first, local-only RAG (Retrieval-Augmented Generation) application that lets you chat with your personal documents without requiring any internet connection.


📽️ Demo GIFs

Below is a step-by-step walkthrough of the app:

(Steps 1–6 are shown as demo GIFs in the repository.)

🧪 Quick Start

Follow these simple steps to get started with Offline Doc Chat:

  1. Set Your Ollama Endpoint and Model: Navigate to the Settings panel and configure the connection to your locally hosted Ollama instance.

  2. Upload Your Documents: Drag and drop your files into the upload area. The system will automatically process and embed them in the background.

  3. Start Asking Questions: Once processing is complete, you can ask natural language questions about your documents — all without internet access!
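Before configuring the endpoint in step 1, it can help to confirm that Ollama is actually reachable. The sketch below uses only the standard library and assumes Ollama's default endpoint (`http://localhost:11434`) and its `GET /api/tags` route for listing local models:

```python
# Sanity-check a local Ollama endpoint before entering it in Settings.
# Assumes the default endpoint and Ollama's GET /api/tags route.
import json
import urllib.request


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags JSON response."""
    return [m["name"] for m in tags_response.get("models", [])]


def list_local_models(endpoint: str = "http://localhost:11434") -> list[str]:
    """Return the models served by a locally running Ollama instance."""
    with urllib.request.urlopen(f"{endpoint}/api/tags") as resp:
        return model_names(json.load(resp))


if __name__ == "__main__":
    print(list_local_models())
```

If this prints an empty list, pull a model first (e.g. with `ollama pull`) before pointing the app at the endpoint.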


⚙️ Settings

You can fine-tune the entire RAG (Retrieval-Augmented Generation) pipeline via Settings > Show Advanced Options. This gives you full control over model behavior and retrieval strategy.

🔧 Ollama Configuration

| Setting | Description | Default |
|---|---|---|
| Ollama Endpoint | URL of your locally hosted Ollama API instance | `http://localhost:11434` |
| Model | LLM used to generate chat completions | (User-defined) |
| System Prompt | Initial prompt used to condition the LLM | (See source code) |
| Top K | Number of most similar documents retrieved per query | 3 |
| Chat Mode | LlamaIndex chat mode used for retrieval | Best |
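As a rough illustration, these settings can be captured in a small configuration object. The class and field names below are hypothetical, chosen only to mirror the table; the default values come from the table itself:

```python
from dataclasses import dataclass


@dataclass
class OllamaSettings:
    """Illustrative container mirroring the settings table above.

    The class and field names are hypothetical; only the default values
    come from the table. `model` has no default because it is user-defined.
    """
    model: str                                # user-defined LLM name
    endpoint: str = "http://localhost:11434"  # local Ollama API URL
    system_prompt: str = ""                   # real default lives in the source code
    top_k: int = 3                            # most similar documents per query
    chat_mode: str = "best"                   # LlamaIndex chat mode


settings = OllamaSettings(model="llama3")
```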

🧠 Embeddings

| Setting | Description | Default |
|---|---|---|
| Embedding Model | Model used to vectorize uploaded files | `bge-large-en-v1.5` |
| Chunk Size | Text length per chunk, controlling embedding granularity | 1024 |

🔍 How Local RAG Works

Local RAG is powered by llama-index and uses its `SimpleDirectoryReader()` to process a wide variety of document types (e.g. `.pdf`, `.md`, `.ipynb`). Here's what happens under the hood:

🧾 File Processing & Embedding

  • Each file is split into smaller logical units. For example, a multi-page PDF is separated into one document per page.
  • These documents are then chunked based on the configured chunk_size, with optional chunk_overlap to preserve context.
  • The resulting chunks are embedded using the selected model and stored for fast retrieval.
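The chunking step above can be sketched in plain Python. This toy splitter works on characters rather than tokens (LlamaIndex's real splitters are token- and sentence-boundary aware), but it shows how `chunk_size` and `chunk_overlap` interact:

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 0) -> list[str]:
    """Split text into fixed-size character chunks with optional overlap.

    A character-based toy for illustration only; production splitters
    respect token counts and sentence boundaries.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

For example, `chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)` yields `['abcd', 'cdef', 'efgh', 'ghij']`: each chunk repeats the last two characters of its predecessor, which is what preserves context across chunk boundaries.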

🛠️ Key Parameters for Customization

| Parameter | Description |
|---|---|
| `chunk_size` | Size of each embedded chunk; smaller chunks generally yield more precise, finer-grained retrieval |
| `chunk_overlap` | Amount of text shared between adjacent chunks; helps maintain contextual flow across chunk boundaries |

You can tweak these settings to balance between embedding precision, retrieval speed, and system performance.

💡 Tip: Experimenting with different chunk_size and chunk_overlap values can significantly affect answer quality. A smaller chunk size may be better for detailed queries, while larger chunks can improve speed.
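To see why Top K and chunking matter together, here is a toy retrieval loop over chunks. It uses a trivial bag-of-words "embedding" and cosine similarity purely for illustration; the real app uses a neural embedding model such as `bge-large-en-v1.5`:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(t.strip(".,?!").lower() for t in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


chunks = [
    "Ollama serves local language models.",
    "Chunk size controls embedding granularity.",
    "The retriever returns the most similar chunks.",
]
print(retrieve("Which chunks are most similar?", chunks, top_k=1))
```

With smaller chunks, each retrieved hit is more focused but carries less surrounding context, which is exactly the trade-off the tip above describes.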
