Vespa + FastAPI demo that now covers both lexical BM25 and dense semantic retrieval with a hybrid (RRF) ranking option.
- Part 1: Set up Vespa search engine, ingest data, and query via curl (https://www.youtube.com/watch?v=lfoOtjLhKh8)
- Part 2: Build an interactive web UI with FastAPI and modern frontend (https://www.youtube.com/watch?v=83k0gnqxE_s)
- Part 3: A no-nonsense, applied intro to BM25 (https://www.youtube.com/watch?v=TW9vHU1GpU4)
- Part 4: Hybrid search (lexical + dense) (https://www.youtube.com/watch?v=BXvCxG_H31M)
- Part 5: What is reciprocal rank fusion? (https://youtu.be/2uBcjEecr38)
More coming soon!
Questions or requests? Open a GitHub issue.
- bm25.py: Vespa application package with multiple BM25 rank profiles for lexical experiments.
- hybrid.py: Vespa package with BM25 + HNSW vectors and three rank profiles (bm25, semantic, and fusion, which uses reciprocal rank fusion).
- feed.py: Deploys the hybrid package to a local Vespa Docker container, writes the app to ./vespa_app_hybrid, encodes documents with all-MiniLM-L6-v2, and streams FineWeb into Vespa.
- ui.py + templates/ + static/: FastAPI-powered UI that lets you pick ranking modes and handles query embedding for semantic/hybrid searches.
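For readers new to reciprocal rank fusion, here is a minimal standalone sketch of the idea behind the fusion profile. It is not the code from hybrid.py, where Vespa does the fusing inside the rank profile. The function name `reciprocal_rank_fusion`, the sample document ids, and the constant `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one scored list.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Only rank positions matter, so BM25 and vector
    scores never need to be put on the same scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a lexical and a semantic retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

In this toy input, doc_b ends up first because it ranks near the top of both lists, while documents seen by only one retriever fall to the bottom.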
- Python 3.10+
- Docker or Podman (for Vespa deployment)
- uv package manager (recommended)
- Network access to pull HuggingFace FineWeb and the SentenceTransformer model
Install dependencies:
uv sync
The feed script launches a Vespa container with 8 GB memory, writes the Vespa app files to vespa_app_hybrid/, and streams FineWeb with on-the-fly embeddings.
python feed.py
Notes:
- Dataset: HuggingFaceFW/fineweb, split CC-MAIN-2025-08 (streaming).
- Embeddings: all-MiniLM-L6-v2 (uses mps on Apple Silicon by default).
- Stop with Ctrl+C once you have enough documents indexed.
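The encode-while-streaming step can be pictured roughly as follows. This is a sketch under assumptions, not the contents of feed.py: the helper `to_vespa_docs`, the field names `text` and `embedding`, and the row keys `id` and `text` are hypothetical, and a toy encoder stands in for the real model so the snippet runs without downloads.

```python
def to_vespa_docs(rows, encode, batch_size=32):
    """Lazily turn dataset rows into feed documents, encoding in batches.

    `rows` is any iterable of dicts with "id" and "text" keys (assumed).
    `encode` maps a list of strings to a sequence of vectors.
    """
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield from _encode_batch(batch, encode)
            batch = []
    if batch:  # flush the final partial batch
        yield from _encode_batch(batch, encode)


def _encode_batch(batch, encode):
    vectors = encode([row["text"] for row in batch])
    for row, vector in zip(batch, vectors):
        yield {
            "id": row["id"],
            "fields": {
                "text": row["text"],
                # Plain floats keep the document JSON-serializable.
                "embedding": [float(x) for x in vector],
            },
        }


def toy_encode(texts):
    """Stand-in encoder for illustration only: a 2-d vector per text."""
    return [[float(len(text)), 0.0] for text in texts]


sample_rows = [
    {"id": "a", "text": "x"},
    {"id": "b", "text": "yy"},
    {"id": "c", "text": "zzz"},
]
docs = list(to_vespa_docs(sample_rows, toy_encode, batch_size=2))
print(docs[0])
```

To approximate the real pipeline, `encode` could be the `encode` method of a `SentenceTransformer("all-MiniLM-L6-v2")` instance and `rows` a streaming FineWeb iterator. Because the generator is lazy, stopping early with Ctrl+C leaves already-fed documents in place.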
curl -X POST http://localhost:8080/search/ \
-H "Content-Type: application/json" \
-d '{
"yql": "select * from sources * where userQuery() limit 10",
"query": "python programming",
"ranking": {"profile": "bm25"}
}'
Start the FastAPI server:
uvicorn ui:app --reload
Open http://localhost:8000 and choose a ranking mode:
- fusion: hybrid RRF over BM25 + ANN semantic scores
- semantic: dense vector only
- bm25: lexical only
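The three modes differ mainly in the query body sent to Vespa. The sketch below shows one plausible way to build that body per mode; it is not taken from ui.py. The helper `build_query_body`, the tensor field name `embedding`, and the query tensor name `q` are assumptions that would need to match the schema and rank profiles in hybrid.py.

```python
def build_query_body(query, mode, embedding=None, hits=10):
    """Build a JSON body for POST /search/ for one of the ranking modes."""
    if mode not in ("bm25", "semantic", "fusion"):
        raise ValueError(f"unknown ranking mode: {mode}")

    body = {"query": query, "ranking": {"profile": mode}}

    if mode == "bm25":
        # Lexical only: same request shape as the curl example.
        body["yql"] = f"select * from sources * where userQuery() limit {hits}"
        return body

    if embedding is None:
        raise ValueError(f"mode {mode!r} needs a query embedding")

    # Approximate nearest-neighbor clause over the (assumed) tensor field.
    ann = f"({{targetHits:{hits}}}nearestNeighbor(embedding, q))"
    # Hybrid retrieves the union of lexical and ANN matches.
    where = ann if mode == "semantic" else f"userQuery() or {ann}"
    body["yql"] = f"select * from sources * where {where} limit {hits}"
    body["input.query(q)"] = [float(x) for x in embedding]
    return body


# Hypothetical 3-d embedding; all-MiniLM-L6-v2 produces 384 dimensions.
example = build_query_body("python programming", "fusion", [0.1, 0.2, 0.3])
print(example["yql"])
```

Retrieving with `userQuery() or nearestNeighbor(...)` in fusion mode gives the rank profile both lexical and semantic candidates to fuse.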