VecDB

A very simple vector embedding database. You can think of it as a hash table that lets you find items similar to the one you're searching for.

Why!

I'm a database enthusiast, and this is a for-fun learning project that could be used in production ;).

P.S: I like to re-invent the wheel in my free time, because it is my free time!

Data Model

I'm using the {key => value} model:

key should be a unique value that represents the item.

value should be the vector itself (a list of floats).

Configurations

By default, vecdb looks for config.yml in the current working directory. You can override this by passing your own file path with the --config /path/to/config.yml flag.

# http server related configs
server:
  # the address to listen on in the form of '[host]:port'
  listen: "0.0.0.0:3000"

# storage related configs
store:
  # the driver you want to use
  # currently vecdb supports "bolt", which is based on boltdb, the in-process embedded database
  driver: "bolt"
  # the arguments required by the driver
  # for bolt, it requires a key called `database` that points to the path where you want to store the data.
  args:
    database: "./vec.db"

# embeddings related configs
embedder:
  # whether to enable the embedder and all endpoints using it or not
  enabled: true
  # the driver you want to use, currently vecdb supports gemini
  driver: gemini
  # the arguments required by the driver
  # currently gemini driver requires `api_key` and `text_embedding_model`
  args:
    # by default vecdb will replace anything between ${..} with the actual value from the ENV var
    api_key: "${GEMINI_API_KEY}"
    text_embedding_model: "text-embedding-004"

Components

Raw Vectors Layer (low-level)
- Send VectorWriteRequest to POST /v1/vectors/write when you have a vector and want to store it.
- Send VectorSearchRequest to POST /v1/vectors/search when you have a vector and want to list the keys of all similar vectors, ordered by cosine similarity in descending order.

Embedding Layer (optional)
- Send TextEmbeddingWriteRequest to POST /v1/embeddings/text/write when you have text and want vecdb to build and store the vector for you using the configured embedder (gemini for now).
- Send TextEmbeddingSearchRequest to POST /v1/embeddings/text/search when you have text and want vecdb to build a vector from it and return the keys of similar vectors, ordered by cosine similarity in descending order.

Requests

VectorWriteRequest

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "key": "product-id-1", // should be unique and represent a valid value in your main data store (example: the row id in your mysql/postgres ... etc)
  "vector": [1.929292, 0.3848484, -1.9383838383, ... ] // the vector you want to store
}

VectorSearchRequest

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "vector": [1.929292, 0.3848484, -1.9383838383, ... ], // you will get a list ordered by cosine-similarity in descending order
  "min_cosine_similarity": 0.0, // the higher the value, the fewer results you will get
  "max_result_count": 10 // max vectors to return (vecdb will first order by cosine similarity then apply the limit)
}
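
The way `min_cosine_similarity` and `max_result_count` interact (threshold, order by similarity, then limit) can be sketched in Go as follows. The `match` type and `selectMatches` function are hypothetical, the similarities are assumed to be already computed, and treating the threshold as inclusive is an assumption rather than documented vecdb behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// match pairs a stored key with its cosine similarity to the query vector.
type match struct {
	Key        string
	Similarity float64
}

// selectMatches drops matches below minSimilarity, sorts the rest by
// similarity in descending order, then keeps at most maxCount of them.
func selectMatches(all []match, minSimilarity float64, maxCount int) []match {
	kept := make([]match, 0, len(all))
	for _, m := range all {
		if m.Similarity >= minSimilarity { // inclusive threshold (assumption)
			kept = append(kept, m)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].Similarity > kept[j].Similarity
	})
	if maxCount >= 0 && len(kept) > maxCount {
		kept = kept[:maxCount]
	}
	return kept
}

func main() {
	scored := []match{
		{"product-id-1", 0.42},
		{"product-id-2", 0.91},
		{"product-id-3", 0.10},
		{"product-id-4", 0.77},
	}
	// Keeps product-id-2 and product-id-4, in that order.
	for _, m := range selectMatches(scored, 0.4, 2) {
		fmt.Println(m.Key, m.Similarity)
	}
}
```

Because the limit is applied after sorting, a small `max_result_count` always returns the best matches rather than an arbitrary subset.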

TextEmbeddingWriteRequest

Only available if you set embedder.enabled to true.

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "key": "product-id-1", // should be unique and represent a valid value in your main data store (example: the row id in your mysql/postgres ... etc)
  "content": "This is some text representing the product" // this will be converted to a vector using the configured embedder 
}

TextEmbeddingSearchRequest

Only available if you set embedder.enabled to true.

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "content": "A Product Text", // you will get a list ordered by cosine-similarity in descending order
  "min_cosine_similarity": 0.0, // the higher the value, the fewer results you will get
  "max_result_count": 10 // max vectors to return (vecdb will first order by cosine similarity then apply the limit)
}

alash3al/vecdb
