Benchmark framework for comparing Milvus and Weaviate vector databases on Kubernetes.
```bash
git clone https://github.com/hungngodev/VectorDBBench.git
cd VectorDBBench
pip install poetry && poetry install
```

Build and push the benchmark image:

```bash
docker build -t hungngodev/vectordbbench:latest .
docker push hungngodev/vectordbbench:latest
```

Download datasets to shared NFS storage:

```bash
./prepare_datasets.sh /mnt/nfs/shared/datasets
```

Run the benchmark matrix:

```bash
export NS=marco
export HOST_DATA_DIR=/mnt/nfs/shared/datasets
export HOST_RESULTS_DIR=/mnt/nfs/shared/results
export CASE_TYPE=Performance768D1M  # or Performance768D100K
./scripts/run_config_matrix.sh
```

Aggregate results and generate figures:

```bash
python scripts/aggregate_results.py --dir /mnt/nfs/shared/results --output analysis/all_results.csv
cd analysis && python generate_figures.py
```

To work with the cluster directly, point kubectl at it:

```bash
export KUBECONFIG=/path/to/kubeconfig
kubectl config use-context swarm
kubectl get pods -n marco
```

Databases are deployed in the marco namespace:
| Database | Service URL | Port | Configuration |
|---|---|---|---|
| Milvus | milvus.marco.svc.cluster.local | 19530 | Distributed architecture, 1 querynode |
| Weaviate | weaviate.marco.svc.cluster.local | 8080 | Single monolithic instance |
Note: Although Milvus uses a distributed architecture (separate coordinator, data node, index node, query node), we run with 1 querynode for fair comparison with Weaviate's single instance.
In the Raft scaling experiment, Weaviate was deployed as a 3-node Raft cluster while Milvus remained at 1 querynode.
Key findings:
- Weaviate's Raft consensus provides fault tolerance only, not search parallelism
- Each search query is still processed by a single node
- Load balancing must be implemented separately (e.g., via Kubernetes Ingress or a custom load balancer)
- Milvus's querynode can be independently scaled for search parallelism
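Because each query is answered by a single Weaviate node, spreading queries across replicas has to happen in the client or at an ingress. A minimal client-side round-robin sketch — the per-pod hostnames below are illustrative assumptions, not actual service names from this deployment:

```python
import itertools

# Hypothetical per-pod endpoints; in practice these would be resolved from a
# headless Service or listed explicitly (names here are assumed).
endpoints = [
    "weaviate-0.weaviate.marco.svc.cluster.local:8080",
    "weaviate-1.weaviate.marco.svc.cluster.local:8080",
    "weaviate-2.weaviate.marco.svc.cluster.local:8080",
]

# Round-robin iterator: each search request takes the next endpoint in turn.
rr = itertools.cycle(endpoints)

def next_endpoint():
    """Pick the endpoint for the next search request."""
    return next(rr)

if __name__ == "__main__":
    # Six requests cycle through the three replicas twice.
    picks = [next_endpoint() for _ in range(6)]
    print(picks)
```

A Kubernetes ClusterIP Service gives a similar effect at the connection level, but long-lived client connections can still pin all traffic to one pod, which is why per-request balancing like this can matter.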
Milvus (Helm values):
```bash
# View current config
helm get values milvus -n marco

# Update querynode replicas (NOTE: kept at 1 for fair comparison)
helm upgrade milvus milvus/milvus -n marco --set queryNode.replicas=1
```

Weaviate (Helm values):

```bash
# View current config
helm get values weaviate -n marco

# Update replica count (Raft consensus for fault tolerance)
helm upgrade weaviate semitechnologies/weaviate -n marco --set replicas=3
```

To modify HNSW parameters (M, efConstruction, efSearch), edit the benchmark scripts:
```bash
# In scripts/run_config_matrix.sh
M_VALUES="4 8 16 32 64 128"
EF_VALUES="128 192 256 384 512 768"
```

To monitor the cluster:

```bash
# Check pod status
kubectl get pods -n marco -w

# View logs
kubectl logs -f deployment/milvus-querynode -n marco

# Resource usage
kubectl top pods -n marco
```

Benchmark results and analysis are in the analysis/ directory:
- `RESEARCH_REPORT_v2.md` - Performance comparison report
- `all_results_*.csv` - Raw benchmark data
- `*.png` - Visualization figures
`scripts/run_all_nohup.sh` - Run the full benchmark detached (recommended):

```bash
HOST_DATA_DIR=/mnt/nfs/shared/datasets \
HOST_RESULTS_DIR=/mnt/nfs/shared/results \
CPU=16 MEM=64Gi \
bash scripts/run_all_nohup.sh
```

Logs are written to `run_all.log`. Monitor with `tail -f run_all.log`.
`scripts/run_config_matrix.sh` - Core benchmark runner with configurable parameters:
| Environment Variable | Default | Description |
|---|---|---|
| `NS` | marco | Kubernetes namespace |
| `HOST_DATA_DIR` | (empty) | Path to cached datasets on NFS |
| `HOST_RESULTS_DIR` | (empty) | Path to save results on NFS |
| `CASE_TYPE` | Performance768D1M | Benchmark case (Performance768D100K, Performance768D1M) |
| `K` | 100 | Number of nearest neighbors to retrieve |
| `EF_CONSTRUCTION` | 360 | HNSW efConstruction (fixed for index quality) |
| `NUM_CONCURRENCY` | 1,2,4,8,16,32 | Client concurrency levels |
| `CONCURRENCY_DURATION` | 60 | Seconds per concurrency level |
| `CPU` / `MEM` | 16 / 64Gi | Pod resource limits |
HNSW Parameter Matrices (edit in script):
```bash
# Milvus/Weaviate: M and efSearch values
milvus_m=(4 8 16 32 64 128 256)
milvus_ef=(128 192 256 384 512 640 768 1024)
weav_m=(4 8 16 32 64 128 256)
weav_ef=(128 192 256 384 512 640 768 1024)
```

`scripts/run_all_and_cleanup.sh`
Orchestrates the entire benchmark pipeline:
- Runs `run_config_matrix.sh` to execute all benchmark jobs
- Calls `aggregate_results.py` to combine JSON results into CSV
- Cleans up individual JSON files after aggregation

```bash
NS=marco RESULT_ROOT=/mnt/nfs/shared/results OUTPUT=all_results.csv \
bash scripts/run_all_and_cleanup.sh
```

`scripts/aggregate_results.py`
Combines individual JSON result files into a single CSV for analysis.
```bash
python scripts/aggregate_results.py --root /mnt/nfs/shared/results --output all_results.csv
```

Output columns: `db`, `task_label`, `concurrency`, `qps`, `latency_p99`, `recall`, `load_duration`, etc.
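Once aggregated, the CSV can be sliced however you need — for example, picking the highest-QPS configuration per database subject to a recall floor. A sketch using the column names listed above (the rows here are made up for illustration):

```python
import csv
import io

# A tiny stand-in for all_results.csv (column names from the aggregator's
# output; the data rows are invented for illustration only).
sample = io.StringIO(
    "db,task_label,concurrency,qps,latency_p99,recall,load_duration\n"
    "milvus,m16-ef128,8,1200.5,0.012,0.97,310\n"
    "milvus,m32-ef256,8,900.0,0.015,0.99,450\n"
    "weaviate,m16-ef128,8,1100.0,0.013,0.96,290\n"
    "weaviate,m32-ef256,8,850.0,0.016,0.98,400\n"
)

def best_qps(rows, min_recall=0.95):
    """Highest-QPS row per database among rows meeting a recall floor."""
    best = {}
    for row in rows:
        if float(row["recall"]) < min_recall:
            continue
        db = row["db"]
        if db not in best or float(row["qps"]) > float(best[db]["qps"]):
            best[db] = row
    return best

winners = best_qps(csv.DictReader(sample))
for db, row in sorted(winners.items()):
    print(db, row["task_label"], row["qps"])
```

In practice you would open the real CSV path instead of the inline sample; the recall floor keeps low-quality (low-ef) configurations from winning on raw throughput alone.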
`scripts/cleanup_bench.sh`
Deletes all benchmark jobs and pods (prefixed with `vdb-` or `vectordb-bench`) from the cluster.

```bash
NS=marco bash scripts/cleanup_bench.sh
```

`scripts/stop_and_clean.sh`
Emergency stop: kills local benchmark scripts AND deletes all Kubernetes jobs in the marco namespace.

```bash
bash scripts/stop_and_clean.sh
```