Below is a step-by-step walkthrough of the app:
*(Screenshots: Steps 1-6 of the app walkthrough.)*
Follow these simple steps to get started with Offline Doc Chat:
1. **Set Your Ollama Endpoint and Model**: Navigate to the Settings panel and configure the connection to your locally hosted Ollama instance (a quick connectivity check is sketched just after this list).
2. **Upload Your Documents**: Drag and drop your files into the upload area. The system will automatically process and embed them in the background.
3. **Start Asking Questions**: Once processing is complete, you can ask natural language questions about your documents — all without internet access!
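If the app can't connect, you can sanity-check the endpoint outside the app. The short sketch below (assuming Python with `requests` installed) calls Ollama's `/api/tags` route, which lists the models available on the local instance:

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # default endpoint (see the settings table below)

# Ollama's /api/tags route returns the models pulled onto the local instance.
resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()
print("Available models:", [m["name"] for m in resp.json().get("models", [])])
```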
You can fine-tune the entire RAG (Retrieval-Augmented Generation) pipeline via Settings > Show Advanced Options. This gives you full control over model behavior and retrieval strategy.
| Setting | Description | Default |
|---|---|---|
| Ollama Endpoint | URL to your locally hosted Ollama API instance | http://localhost:11434 |
| Model | LLM to use for generating chat completions | (User-defined) |
| System Prompt | Initial prompt used to condition the LLM | (See source code) |
| Top K | Number of most similar documents to retrieve per query | 3 |
| Chat Mode | LlamaIndex chat mode used for retrieval | Best |
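As a reference for what these options control, here is a minimal llama-index sketch that wires up the same settings. The model name, upload path, system prompt, and test question are placeholder assumptions, and the explicit context chat engine stands in for the app's default "best" chat mode, which picks an engine automatically:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.chat_engine import ContextChatEngine
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# "Ollama Endpoint" and "Model" settings; the model name is a placeholder.
Settings.llm = Ollama(model="llama3", base_url="http://localhost:11434")
# Local embedder so nothing goes over the network (see the embedding table below).
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")

# "./uploads" is a hypothetical stand-in for the app's upload directory.
documents = SimpleDirectoryReader("./uploads").load_data()
index = VectorStoreIndex.from_documents(documents)

# "Top K" maps to similarity_top_k; "System Prompt" conditions the engine.
chat_engine = ContextChatEngine.from_defaults(
    retriever=index.as_retriever(similarity_top_k=3),
    system_prompt="Answer using only the uploaded documents.",
)
print(chat_engine.chat("Summarize the key points of my documents."))
```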
Embedding settings:

| Setting | Description | Default |
|---|---|---|
| Embedding Model | Model used to vectorize uploaded files | bge-large-en-v1.5 |
| Chunk Size | Text length per chunk to improve embedding granularity | 1024 |
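A matching sketch for the embedding side, assuming the model is loaded through llama-index's `HuggingFaceEmbedding` wrapper (the app itself may resolve the model differently):

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# "Embedding Model" setting -- loaded here via HuggingFace, which is an
# assumption; the app may fetch the model through another route.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")

# "Chunk Size" setting; chunk_overlap is llama-index's companion knob,
# discussed in the section below.
Settings.chunk_size = 1024
Settings.chunk_overlap = 20
```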
Local RAG is powered by llama-index and uses its `SimpleDirectoryReader()` to process a wide variety of document types (e.g. `.pdf`, `.md`, `.ipynb`). Here's what happens under the hood:
- Each file is split into smaller logical units. For example, a multi-page PDF is separated into one document per page.
- These documents are then chunked based on the configured `chunk_size`, with optional `chunk_overlap` to preserve context.
- The resulting chunks are embedded using the selected model and stored for fast retrieval.
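Taken together, the steps above correspond roughly to the following llama-index sketch; the `./uploads` path is a hypothetical stand-in for wherever the app keeps uploaded files:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# 1. Load files: SimpleDirectoryReader yields one Document per logical unit,
#    e.g. one Document per PDF page.
documents = SimpleDirectoryReader("./uploads").load_data()

# 2. Chunk the documents with the configured size and overlap.
splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
nodes = splitter.get_nodes_from_documents(documents)

# 3. Embed the chunks (using Settings.embed_model) and store them for retrieval.
index = VectorStoreIndex(nodes)
```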
| Parameter | Description |
|---|---|
| `chunk_size` | Determines the size of each embedded chunk. Smaller chunks can improve retrieval precision for detailed queries. |
| `chunk_overlap` | Defines the text overlap between adjacent chunks. Helps maintain contextual flow. |
You can tweak these settings to balance embedding precision, retrieval speed, and overall system performance.
💡 Tip: Experimenting with different `chunk_size` and `chunk_overlap` values can significantly affect answer quality. A smaller chunk size may be better for detailed queries, while larger chunks can improve speed.
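To run that experiment yourself, a rough sketch like the one below rebuilds the index at a few chunk sizes and prints the answers side by side (the sizes, overlap rule, and test question are arbitrary examples):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

documents = SimpleDirectoryReader("./uploads").load_data()
question = "What does the contract say about termination?"  # placeholder query

# Rebuild the index at a few chunk sizes and compare the answers.
for chunk_size in (256, 512, 1024):
    splitter = SentenceSplitter(chunk_size=chunk_size, chunk_overlap=chunk_size // 8)
    index = VectorStoreIndex(splitter.get_nodes_from_documents(documents))
    answer = index.as_query_engine(similarity_top_k=3).query(question)
    print(f"--- chunk_size={chunk_size} ---\n{answer}\n")
```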