diff --git a/docs/docs/examples/examples/image_search.md b/docs/docs/examples/examples/image_search.md
index fa4fbe81..d86908a0 100644
--- a/docs/docs/examples/examples/image_search.md
+++ b/docs/docs/examples/examples/image_search.md
@@ -1,5 +1,5 @@
---
-title: Index Images with ColPali
+title: Image Search App with ColPali and FastAPI
description: Build image search index with ColPali and FastAPI
sidebar_class_name: hidden
slug: /examples/image_search
@@ -10,46 +10,45 @@ sidebar_custom_props:
tags: [vector-index, multi-modal]
---
-import { GitHubButton, YouTubeButton } from '../../../src/components/GitHubButton';
+import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/components/GitHubButton';
## Overview
+CocoIndex supports native integration with ColPali: with just a few lines of code, you can embed and index images with ColPali’s late-interaction architecture. We also build a lightweight image search application with FastAPI.
-CocoIndex now supports native integration with ColPali — enabling multi-vector, patch-level image indexing using cutting-edge multimodal models. With just a few lines of code, you can now embed and index images with ColPali’s late-interaction architecture, fully integrated into CocoIndex’s composable flow system.
-
-## Why ColPali for Indexing?
+## ColPali
**ColPali (Contextual Late-interaction over Patches)** is a powerful model for multimodal retrieval.
It fundamentally rethinks how documents—especially visually complex or image-rich ones—are represented and searched. Instead of reducing each image or page to a single dense vector (as in traditional bi-encoders), ColPali breaks an image into many smaller patches, preserving local spatial and semantic structure. Each patch receives its own embedding, which together form a multi-vector representation of the complete document.
+
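+At query time, late interaction compares every query-token vector against every patch vector and keeps only the best match per query token; summing those maxima gives the MaxSim score used for retrieval later in this example. A minimal sketch of the idea, for intuition only (in this example, Qdrant performs this scoring server-side):
+
+```python
+import numpy as np
+
+def maxsim_score(query_vectors: np.ndarray, patch_vectors: np.ndarray) -> float:
+    """MaxSim late-interaction score.
+
+    query_vectors: (num_query_tokens, dim); patch_vectors: (num_patches, dim).
+    Both are assumed L2-normalized, so dot products are cosine similarities.
+    """
+    similarities = query_vectors @ patch_vectors.T  # (tokens, patches)
+    return float(similarities.max(axis=1).sum())    # best patch per token, summed
+```
+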
-## Declare an Image Indexing Flow with CocoIndex
-
-
-In this example, we will use CocoIndex to index images with ColPali, and Qdrant to store and retrieve the embeddings.
-
-This flow illustrates how we’ll process and index images using ColPali:
+## Flow Overview
+
1. Ingest image files from the local filesystem
2. Use **ColPali** to embed each image into patch-level multi-vectors
3. Optionally extract image captions using an LLM
4. Export the embeddings (and optional captions) to a Qdrant collection
-Check out the full working code [here](https://github.com/cocoindex-io/cocoindex/blob/main/examples/image_search/colpali_main.py).
+## Setup
+- [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
-:star: Star [CocoIndex on GitHub](https://github.com/cocoindex-io/cocoindex) if you like it!
+- Make sure Qdrant is running:
+ ```
+ docker run -d -p 6334:6334 -p 6333:6333 qdrant/qdrant
+ ```
-### 1. Ingest the Images
+## Add Source
We start by defining a flow to read `.jpg`, `.jpeg`, and `.png` files from a local directory using `LocalFile`.
```python
-
@cocoindex.flow_def(name="ImageObjectEmbeddingColpali")
def image_object_embedding_flow(flow_builder, data_scope):
data_scope["images"] = flow_builder.add_source(
@@ -60,41 +59,36 @@ def image_object_embedding_flow(flow_builder, data_scope):
),
refresh_interval=datetime.timedelta(minutes=1),
)
-
```
The `add_source` function sets up a table with fields like `filename` and `content`. Images are automatically re-scanned every minute.
+
-### 2. Process Each Image and Collect the Embedding
-### 2.1 Embed the Image with ColPali
+## Process Each Image and Collect the Embedding
We use CocoIndex's built-in `ColPaliEmbedImage` function, which returns a **multi-vector representation** for each image. Each patch receives its own vector, preserving spatial and semantic information.
+
+
```python
img_embeddings = data_scope.add_collector()
with data_scope["images"].row() as img:
img["embedding"] = img["content"].transform(cocoindex.functions.ColPaliEmbedImage(model="vidore/colpali-v1.2"))
+ collect_fields = {
+ "id": cocoindex.GeneratedField.UUID,
+ "filename": img["filename"],
+ "embedding": img["embedding"],
+ }
+ img_embeddings.collect(**collect_fields)
```
-This transformation turns the raw image bytes into a list of vectors — one per patch — that can later be used for **late interaction search**.
-
-
-### 3. Collect and Export the Embeddings
+This transformation turns the raw image bytes into a list of vectors — one per patch — that can later be used for **late interaction search**. We then collect each embedding together with the filename and a generated UUID.
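+
+Unlike pipelines that store one global vector per image, the value collected here is a multi-vector: conceptually `Vector[Vector[Float32, N]]`, where the outer dimension is the number of patches and the inner dimension `N` is the model's hidden size. A quick way to sanity-check that shape during development (a hypothetical standalone helper, not part of the flow):
+
+```python
+def describe_embedding(embedding: list[list[float]]) -> str:
+    """Summarize a ColPali multi-vector: outer dim = patches, inner dim = hidden size."""
+    num_patches = len(embedding)
+    hidden_size = len(embedding[0]) if embedding else 0
+    return f"{num_patches} patches x {hidden_size} dims per patch"
+
+# Toy example with a 3-patch, 4-dimensional embedding:
+print(describe_embedding([[0.1, 0.2, 0.3, 0.4]] * 3))  # -> "3 patches x 4 dims per patch"
+```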
-Once we’ve processed each image, we collect its metadata and embedding and send it to Qdrant.
-
-```python
-collect_fields = {
- "id": cocoindex.GeneratedField.UUID,
- "filename": img["filename"],
- "embedding": img["embedding"],
-}
-img_embeddings.collect(**collect_fields)
-```
+
-Then we export to Qdrant using the `Qdrant` target:
+## Export the Embeddings
```python
img_embeddings.export(
@@ -107,7 +101,7 @@ img_embeddings.export(
This creates a vector collection in Qdrant that supports **multi-vector fields** — required for ColPali-style late interaction search.
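+
+If you want to inspect or pre-create the collection yourself, a multi-vector field with MaxSim comparison can be configured directly through the Qdrant client, as sketched below. This is illustrative only; the collection name and vector size (128 for `vidore/colpali-v1.2`) are assumptions based on this example.
+
+```python
+from qdrant_client import QdrantClient, models
+
+client = QdrantClient(url="http://localhost:6333")
+client.create_collection(
+    collection_name="ImageObjectEmbeddingColpali",  # assumed name; match QDRANT_COLLECTION used by the app
+    vectors_config={
+        "embedding": models.VectorParams(
+            size=128,  # ColPali patch-vector dimension (model dependent)
+            distance=models.Distance.COSINE,
+            multivector_config=models.MultiVectorConfig(
+                comparator=models.MultiVectorComparator.MAX_SIM
+            ),
+        )
+    },
+)
+```
+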
-### 4. Enable Real-Time Indexing
+## Enable Real-Time Indexing
To keep the image index up to date automatically, we wrap the flow in a `FlowLiveUpdater`:
@@ -124,50 +118,43 @@ async def lifespan(app: FastAPI):
This keeps your vector index fresh as new images arrive.
+## FastAPI Application
-## What’s Actually Stored?
-
-Unlike typical image search pipelines that store one global vector per image, ColPali stores:
+We build a simple FastAPI application to query the index.
```python
-Vector[Vector[Float32, N]]
+app = FastAPI(lifespan=lifespan)
+
+app.add_middleware(
+ CORSMiddleware,
+ allow_origins=["*"],
+ allow_credentials=True,
+ allow_methods=["*"],
+ allow_headers=["*"],
+)
+# Serve images from the 'img' directory at /img
+app.mount("/img", StaticFiles(directory="img"), name="img")
```
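+
+The search endpoint below reads a Qdrant client from `app.state.qdrant_client`. One way to wire that up is inside the `lifespan` hook shown earlier (a sketch; the URL is the default local Qdrant HTTP endpoint):
+
+```python
+from qdrant_client import QdrantClient
+
+# Inside the lifespan hook, alongside starting the live updater:
+app.state.qdrant_client = QdrantClient(url="http://localhost:6333")
+```
+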
-Where:
-
-- The outer dimension is the **number of patches**
-- The inner dimension is the **model’s hidden size**
+## Query the Index
-This makes the index **multi-vector ready**, and compatible with late-interaction query strategies — like MaxSim or learned fusion.
+We use `ColPaliEmbedQuery` to embed the query text into a multi-vector format.
-
-## Real-Time Indexing with Live Updater
-
-You can also attach CocoIndex’s `FlowLiveUpdater` to your FastAPI or any Python app to keep your ColPali index synced in real time:
+
```python
-from fastapi import FastAPI
-from contextlib import asynccontextmanager
-
-@asynccontextmanager
-async def lifespan(app: FastAPI):
- load_dotenv()
- cocoindex.init()
- image_object_embedding_flow.setup(report_to_stdout=True)
- app.state.live_updater = cocoindex.FlowLiveUpdater(image_object_embedding_flow)
- app.state.live_updater.start()
- yield
-
+@cocoindex.transform_flow()
+def text_to_colpali_embedding(
+ text: cocoindex.DataSlice[str],
+) -> cocoindex.DataSlice[list[list[float]]]:
+ return text.transform(
+ cocoindex.functions.ColPaliEmbedQuery(model=COLPALI_MODEL_NAME)
+ )
```
-
-## Retrivel and application
-
-Refer to this example on Query and application building:
-https://cocoindex.io/blogs/live-image-search#3-query-the-index
-
-Make sure we use ColPali to embed the query
+Then we build a search API to query the index.
```python
+# --- Search API ---
@app.get("/search")
def search(
q: str = Query(..., description="Search query"),
@@ -175,40 +162,107 @@ def search(
) -> Any:
# Get the multi-vector embedding for the query
query_embedding = text_to_colpali_embedding.eval(q)
+ print(
+ f"🔍 Query multi-vector shape: {len(query_embedding)} tokens x {len(query_embedding[0]) if query_embedding else 0} dims"
+ )
+ # Search in Qdrant with multi-vector MaxSim scoring using query_points API
+ search_results = app.state.qdrant_client.query_points(
+ collection_name=QDRANT_COLLECTION,
+ query=query_embedding, # Multi-vector format: list[list[float]]
+ using="embedding", # Specify the vector field name
+ limit=limit,
+ with_payload=True,
+ )
+
+ print(f"📈 Found {len(search_results.points)} results with MaxSim scoring")
+
+ return {
+ "results": [
+ {
+ "filename": result.payload["filename"],
+ "score": result.score,
+ "caption": result.payload.get("caption"),
+ }
+ for result in search_results.points
+ ]
+ }
```
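+
+Once the backend is running (see the run instructions below), you can exercise the endpoint directly. A minimal sketch using the `requests` package (assumed installed; the query text is an arbitrary example):
+
+```python
+import requests
+
+resp = requests.get(
+    "http://localhost:8000/search",
+    params={"q": "a dog playing in the snow", "limit": 5},
+)
+for hit in resp.json()["results"]:
+    print(hit["filename"], hit["score"])
+```
+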
-Full working code is available [here](https://github.com/cocoindex-io/cocoindex/blob/main/examples/image_search/colpali_main.py).
+## Run the Application
+
+- Install dependencies:
+ ```
+ pip install -e .
+ pip install 'cocoindex[colpali]' # Adds ColPali support
+ ```
-Check it out for yourself! It is fun :) In this image search example, the results look better compared to [using CLIP](http://localhost:3000/blogs/live-image-search) with a single dense vector (1D embedding).
-ColPali produces richer and more fine-grained retrieval.
+- Configure model (optional):
+ ```sh
+ # All ColVision models supported by colpali-engine are available
+ # See https://github.com/illuin-tech/colpali#list-of-colvision-models for the complete list
+ # ColPali models (colpali-*) - PaliGemma-based, best for general document retrieval
+ export COLPALI_MODEL="vidore/colpali-v1.2" # Default model
+ export COLPALI_MODEL="vidore/colpali-v1.3" # Latest version
-## Built with Flexibility in Mind
+ # ColQwen2 models (colqwen-*) - Qwen2-VL-based, excellent for multilingual text (29+ languages) and general vision
+ export COLPALI_MODEL="vidore/colqwen2-v1.0"
+ export COLPALI_MODEL="vidore/colqwen2.5-v0.2" # Latest Qwen2.5 model
-Whether you’re working on:
+ # ColSmol models (colsmol-*) - Lightweight, good for resource-constrained environments
+ export COLPALI_MODEL="vidore/colSmol-256M"
-- Visual RAG
-- Multimodal retrieval systems
-- Fine-grained visual search tools
-- Or want to bring image understanding to your AI agent workflows
+ # Any other ColVision models from https://github.com/illuin-tech/colpali are supported
+ ```
-[CocoIndex](https://github.com/cocoindex-io/cocoindex) + ColPali gives you a modular, modern foundation to build from.
+- Run ColPali Backend:
+ ```
+ uvicorn colpali_main:app --reload --host 0.0.0.0 --port 8000
+ ```
+ :::warning
+ Note that recent Nvidia GPUs (such as the RTX 5090) are not supported by the stable PyTorch version up to 2.7.1.
+ :::
-## Connect to Any Data Source — and Keep It in Sync
+ If you get this error:
-One of CocoIndex’s core strengths is its ability to connect to your existing data sources and automatically keep your index fresh.
-Beyond local files, CocoIndex natively supports source connectors including:
+ ```
+ The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 sm_80 sm_86 sm_90 compute_37.
+ ```
+
+  You can install the nightly PyTorch build (see https://pytorch.org/get-started/locally/ for options):
+
+ ```sh
+ pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu129
+ ```
+- Run Frontend:
+ ```
+ cd frontend
+ npm install
+ npm run dev
+ ```
+
+ Go to `http://localhost:5173` to search. The frontend works with both backends identically.
+
+ 
+
+## CLIP Model & Comparison with ColPali
+We also have a similar application built with the CLIP model.
+
+
+
+In general,
+- CLIP: Faster, good for general image-text matching
+- ColPali: More accurate for document images and text-heavy content, supports multi-vector late interaction for better precision
+
+## Connect to Any Data Source
+
+One of CocoIndex’s core strengths is its ability to connect to your existing data sources and automatically keep your index fresh. Beyond local files, CocoIndex natively supports source connectors including:
- Google Drive
- Amazon S3 / SQS
- Azure Blob Storage
-See documentation [here](https://cocoindex.io/docs/ops/sources).
+
Once connected, CocoIndex continuously watches for changes — new uploads, updates, or deletions — and applies them to your index in real time.
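+
+For example, switching this flow from the local `img` directory to an S3 bucket is typically just a change to the `add_source` call. The sketch below is hypothetical: the exact parameter names for each connector are described in the sources documentation, and the ones shown here are assumptions.
+
+```python
+# Hypothetical sketch: read images from S3 instead of the local directory.
+# Parameter names are assumptions; see https://cocoindex.io/docs/ops/sources for the exact spec.
+data_scope["images"] = flow_builder.add_source(
+    cocoindex.sources.AmazonS3(
+        bucket_name="my-image-bucket",
+        included_patterns=["*.jpg", "*.jpeg", "*.png"],
+    ),
+    refresh_interval=datetime.timedelta(minutes=1),
+)
+```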
-
-## Support us
-
-We’re constantly adding more examples and improving our runtime.
-If you found this helpful, please ⭐ star [CocoIndex on GitHub](https://github.com/cocoindex-io/cocoindex) and share it with others.
\ No newline at end of file
diff --git a/docs/static/img/examples/image_search/cover.png b/docs/static/img/examples/image_search/cover.png
index db360205..e75e4678 100644
Binary files a/docs/static/img/examples/image_search/cover.png and b/docs/static/img/examples/image_search/cover.png differ
diff --git a/docs/static/img/examples/image_search/embedding.png b/docs/static/img/examples/image_search/embedding.png
new file mode 100644
index 00000000..6b87d5d2
Binary files /dev/null and b/docs/static/img/examples/image_search/embedding.png differ
diff --git a/docs/static/img/examples/image_search/flow.png b/docs/static/img/examples/image_search/flow.png
new file mode 100644
index 00000000..06ea6aab
Binary files /dev/null and b/docs/static/img/examples/image_search/flow.png differ
diff --git a/docs/static/img/examples/image_search/multi_modal_architecture.png b/docs/static/img/examples/image_search/multi_modal_architecture.png
new file mode 100644
index 00000000..2b317013
Binary files /dev/null and b/docs/static/img/examples/image_search/multi_modal_architecture.png differ
diff --git a/docs/static/img/examples/image_search/result.png b/docs/static/img/examples/image_search/result.png
new file mode 100644
index 00000000..c56da124
Binary files /dev/null and b/docs/static/img/examples/image_search/result.png differ