Thanks to visit codestin.com
Credit goes to github.com

Skip to content
/ Snappy Public template

Snappy: A vision-first document retrieval using ColPali embeddings - Search PDFs with FastAPI, Next.js 16, Qdrant, and React 19.2

License

Notifications You must be signed in to change notification settings

athrael-soju/Snappy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Snappy_light_readme


Snappy - Vision-Grounded Document Retrieval

GitHub Release GitHub Stars GitHub Forks GitHub Issues License: MIT

CI/CD CodeQL Code Quality Pre-commit

FastAPI Next.js React Python TypeScript Qdrant MinIO Docker

Snappy pairs a FastAPI backend, a ColPali embedding service, and a Next.js frontend to deliver vision-first retrieval over PDFs. Each page is rasterized, embedded as multivectors, and stored alongside images so you can search by how documents look rather than only extracted text.

TL;DR

  • Vision-focused retrieval and chat with ColPali multivector embeddings, MinIO image storage, and Qdrant search.
  • Streaming responses, live indexing progress, and a schema-driven configuration UI to keep changes safe.
  • One Docker Compose stack or individual services for local development and production-style deployments.

Table of Contents


Showcase

Snappy.mp4

Architecture

---
config:
  theme: neutral
  layout: elk
---
flowchart TB
  subgraph Frontend["Next.js Frontend"]
    UI["Pages (/upload, /search, /chat, /configuration, /maintenance)"]
    CHAT["Chat API Route"]
  end

  subgraph Backend["FastAPI Backend"]
    API["REST Routers"]
  end

  subgraph Services["Supporting Services"]
    QDRANT["Qdrant"]
    MINIO["MinIO"]
    COLPALI["ColPali Embedding API"]
    OPENAI["OpenAI Responses API"]
  end

  USER["Browser"] <--> UI
  UI --> API
  API --> QDRANT
  API --> MINIO
  API --> COLPALI
  CHAT --> API
  CHAT --> OPENAI
  CHAT -- SSE --> USER
Loading

Head to backend/docs/architecture.md and backend/docs/analysis.md for a deeper walkthrough of the indexing and retrieval flows.


Quick Start

Using pre-built images? Skip to Option A for the fastest deployment using the pre-built containers from GitHub Container Registry.

1. Prepare environment files

cp .env.example .env
cp frontend/.env.example frontend/.env.local

Add your OpenAI API key to frontend/.env.local and review the backend defaults in .env.

2. Start the ColPali embedding service

From colpali/ pick one profile:

# GPU profile (CUDA + flash-attn tooling)
docker compose --profile gpu up -d --build

# CPU profile (no GPU dependencies)
docker compose --profile cpu up -d --build

Only start one profile at a time to avoid port clashes. The first GPU build compiles flash-attn; subsequent builds reuse the cached wheel.


Option A - Run with Pre-built Docker Images

Use the pre-built images from GitHub Container Registry for instant deployment:

# Pull pre-built images
docker pull ghcr.io/athrael-soju/Snappy/backend:latest
docker pull ghcr.io/athrael-soju/Snappy/frontend:latest
docker pull ghcr.io/athrael-soju/Snappy/colpali-cpu:latest

# Create minimal docker-compose.yml (see docs/DOCKER_IMAGES.md)
# Then start services
docker compose up -d

Available images:

  • backend:latest - FastAPI backend (amd64/arm64)
  • frontend:latest - Next.js frontend (amd64/arm64)
  • colpali-cpu:latest - CPU embedding service (amd64/arm64)
  • colpali-gpu:latest - GPU embedding service (amd64 only)

Full guide: See docs/DOCKER_IMAGES.md for complete documentation on using pre-built images, version tags, configuration, and production deployment examples.


Option B - Run the full stack with Docker Compose (Build from Source)

At the project root:

docker compose up -d --build

Services will come online at:

Update .env and frontend/.env.local if you need to expose different hostnames or ports.


Option C - Run services locally

  1. In backend/, install dependencies and launch FastAPI:

    python -m venv .venv
    source .venv/bin/activate  # Windows: .venv\Scripts\Activate.ps1
    pip install -U pip setuptools wheel
    pip install -r backend/requirements.txt
    uvicorn backend:app --host 0.0.0.0 --port 8000 --reload
  2. Start Qdrant and MinIO (via Docker or your preferred deployment).

  3. In frontend/, install and run the Next.js app:

    yarn install --frozen-lockfile
    yarn dev
  4. Keep the ColPali service from step 2 running (Docker or uvicorn colpali/app.py).


Highlights

  • Page-level vision retrieval powered by ColPali multivector embeddings; no OCR pipeline to maintain.
  • Streaming chat responses from the OpenAI Responses API with inline visual citations so you can see each supporting page.
  • Pipelined indexing with live Server-Sent Events progress updates and optional MUVERA-assisted first-stage search.
  • Runtime configuration UI backed by a typed schema, with reset and draft flows that make experimentation safe.
  • Docker Compose profiles for ColPali (GPU or CPU) plus an all-in-one stack for local development.

Use Cases

Snappy excels at retrieval scenarios where visual layout, formatting, and appearance matter as much as textual content:

  • Legal Document Analysis - Search case files, contracts, and legal briefs by visual layout, annotations, and document structure without relying on OCR accuracy.
  • Medical Records Retrieval - Find patient charts, diagnostic reports, and medical forms by handwritten notes, stamps, diagrams, and visual markers that traditional text search misses.
  • Financial Auditing and Compliance - Locate invoices, receipts, financial statements, and compliance documents by visual characteristics like logos, stamps, signatures, and table layouts.
  • Academic Research and Papers - Search scientific papers, technical documents, and research archives by figures, tables, equations, charts, and visual presentation; ideal for literature reviews.
  • Archive and Document Management - Retrieve historical documents, scanned archives, and legacy records by visual appearance, preserving context that text extraction destroys.
  • Engineering and Technical Documentation - Find blueprints, schematics, technical drawings, and specification sheets by visual elements, diagrams, and layout patterns.
  • Media and Publishing - Search newspaper archives, magazine layouts, and published materials by visual design, page composition, and formatting.
  • Educational Content - Organize and retrieve textbooks, lecture notes, and educational materials by visual structure, highlighting, and annotations.

Frontend Experience

The Next.js 16 frontend with React 19.2 keeps things fast and friendly: real-time streaming, responsive layouts, and design tokens (text-body-*, size-icon-*) that make extending the UI consistent. Configuration and maintenance pages expose everything the backend can do, while upload/search/chat give you the workflows you need day to day.


Environment Variables

Backend highlights

  • COLPALI_URL, COLPALI_API_TIMEOUT
  • QDRANT_EMBEDDED, QDRANT_URL, QDRANT_COLLECTION_NAME, QDRANT_PREFETCH_LIMIT, QDRANT_MEAN_POOLING_ENABLED, optional quantisation toggles
  • MINIO_URL, MINIO_PUBLIC_URL, credentials, bucket naming, IMAGE_FORMAT, IMAGE_QUALITY
  • MUVERA_ENABLED and related settings (requires fastembed[postprocess] in your environment)
  • LOG_LEVEL, ALLOWED_ORIGINS, UVICORN_RELOAD

All schema-backed settings (and defaults) are documented in backend/docs/configuration.md. Runtime updates via /config/update are ephemeral; update .env for persistence.

Frontend highlights (frontend/.env.local)

  • NEXT_PUBLIC_API_BASE_URL (defaults to http://localhost:8000)
  • OPENAI_API_KEY, OPENAI_MODEL, optional OPENAI_TEMPERATURE, OPENAI_MAX_TOKENS

API Overview

Area Endpoint(s) Notes
Meta GET /health Service and dependency status
Retrieval GET /search?q=...&k=5 Page-level search (defaults to 10 when k omitted)
Indexing POST /index Background indexing job (multipart PDF upload)
GET /progress/stream/{job_id} Real-time progress (SSE)
POST /index/cancel/{job_id} Cancel an active job
Maintenance GET /status Collection/bucket statistics
POST /initialize, DELETE /delete Provision or tear down collection + bucket
POST /clear/qdrant, /clear/minio, /clear/all Data reset helpers
Configuration GET /config/schema, /config/values Expose runtime schema and values
POST /config/update, /config/reset Runtime configuration management

Chat streaming lives in frontend/app/api/chat/route.ts. The route calls the backend search endpoint, invokes the OpenAI Responses API, and streams Server-Sent Events to the browser. The backend does not proxy OpenAI calls.


Troubleshooting

  • ColPali timing out? Increase COLPALI_API_TIMEOUT or run the GPU profile for heavy workloads.
  • Progress bar stuck? Ensure Poppler is installed and check backend logs for PDF conversion errors.
  • Missing images? Verify MinIO credentials/URLs and confirm next.config.ts allows the domains you expect.
  • CORS issues? Replace wildcard ALLOWED_ORIGINS entries with explicit URLs before exposing the API publicly.
  • Config changes vanish? /config/update modifies runtime state only-update .env for anything you need to keep after a restart.
  • Upload rejected? The uploader currently accepts PDFs only. Adjust max size, chunk size, or file count limits in the "Uploads" section of the configuration UI.

backend/docs/configuration.md and backend/CONFIGURATION_GUIDE.md cover advanced troubleshooting and implementation details.


Developer Notes

  • Background indexing uses FastAPI BackgroundTasks. For larger deployments consider a dedicated task queue.
  • MinIO worker pools auto-size based on hardware. Override only when you have specific throughput limits.
  • TypeScript types and Zod schemas regenerate from the OpenAPI spec (yarn gen:sdk, yarn gen:zod) to keep the frontend in sync.
  • Pre-commit hooks (autoflake, isort, black, pyright) keep the codebase tidy-run them before contributing.
  • Version management: Uses Release Please + Conventional Commits for automated releases. See VERSIONING.md for details.

Documentation

  • backend/README.md - FastAPI backend guide
  • frontend/README.md - Next.js frontend guide
  • colpali/README.md - ColPali embedding service guide
  • backend/docs/configuration.md - Configuration reference
  • VERSIONING.md - Release and version workflow

Further Reading

  • backend/docs/analysis.md - vision vs. text RAG comparison
  • backend/docs/architecture.md - collection, indexing, and search deep dive
  • colpali/README.md - details on the standalone embedding service

License

MIT License - see LICENSE.


Acknowledgements

Snappy builds on the work of: