Introduction to Vector Embeddings and Vector Databases
How is Google able to differentiate between these
apples?
Vector Embeddings
What are vector embeddings?
Embeddings are words represented as numbers.
Similar words, like "cat" and "kitten", have similar numbers.
Think of embeddings as a way to
translate words into numbers, so
computers can understand and
work with them. Each word gets
its own special set of numbers,
called a "vector," that captures
its meaning.
Visualizing Vector Embeddings
[Figure: words plotted as points in vector space. Labels: Tower, Building, Skyscraper, Dog, Cat, Goat.]
Math on Vector Embeddings
Since embeddings are numbers, you can do math on them:
King - Man + Woman ≈ Queen
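The arithmetic above can be tried on a toy example. The sketch below uses hand-made 3-number vectors. The values and the royalty/maleness/femaleness dimensions are invented for illustration and do not come from a real embedding model. It checks which remaining word lands closest to King - Man + Woman.

```python
import math

# Hypothetical toy embeddings: [royalty, maleness, femaleness].
# Real embeddings have hundreds of dimensions learned from data.
embeddings = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.5, 0.5],
}

def analogy(a, b, c):
    """Return the vector a - b + c, computed element by element."""
    return [x - y + z
            for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]

target = analogy("king", "man", "woman")

# Skip the three input words, then pick the word whose vector is closest
# to the target (Euclidean distance).
candidates = [w for w in embeddings if w not in {"king", "man", "woman"}]
nearest = min(candidates, key=lambda w: math.dist(embeddings[w], target))
print(nearest)
```

With these toy values the nearest word is queen. With real embeddings the result is only approximately equal to the Queen vector, which is why a nearest-neighbor lookup is used instead of an exact match.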
Testing out Vector Embeddings in code
As you can see here, a similarity score is produced when comparing each pair of embeddings.
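A minimal sketch of such a comparison, assuming the score is cosine similarity (a common choice, though the slide does not say which measure was used). The three vectors are made-up toy values, not output from a real embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings for illustration only.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.9, 0.15]
tower = [0.1, 0.05, 0.95]

cat_vs_kitten = cosine_similarity(cat, kitten)
cat_vs_tower = cosine_similarity(cat, tower)

print(f"cat vs kitten: {cat_vs_kitten:.3f}")
print(f"cat vs tower:  {cat_vs_tower:.3f}")
```

The cat/kitten score comes out much higher than the cat/tower score, matching the idea that similar words have similar numbers.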
Vector Databases
What are vector databases?
Vector databases are like specialized storage systems designed to
handle and quickly find information that's represented as vectors.
Think of them as super-efficient filing cabinets for storing and
searching through large amounts of vector data.
How do vector databases work?
1. Storing vectors:
When we have data (like text), we convert this data into vectors that capture the important information about each item.
These vectors are then stored in the vector database, much like putting files into drawers.
2. Similarity Search:
One of the main jobs of a vector database is to quickly find items that are similar to a given query.
If you have a vector for a specific item (like a search query), the database can find other vectors (items) that are close to it in the vector space.
3. Fast retrieval:
Vector databases are designed to handle
searches very quickly, even when there are
millions of vectors.
They use clever indexing methods to make the
search process super fast, like having a very
efficient librarian who knows exactly where every
book is.
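To give a feel for what a "clever indexing method" can look like, here is a rough sketch of one family of techniques, locality-sensitive hashing with random hyperplanes. This is an assumption chosen for illustration. The slides do not say which index is meant, and production vector databases often use other structures such as HNSW graphs or IVF. All function names below are hypothetical.

```python
import random
from collections import defaultdict

def make_hyperplanes(num_planes, dim, seed=0):
    """Create random hyperplanes (as normal vectors) with a fixed seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]

def bucket_key(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(v * p for v, p in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )

def build_index(vectors, hyperplanes):
    """Group item ids into buckets; vectors pointing the same way share a bucket."""
    index = defaultdict(list)
    for item_id, vector in vectors.items():
        index[bucket_key(vector, hyperplanes)].append(item_id)
    return index

def find_candidates(query, index, hyperplanes):
    """Look only inside the query's bucket instead of scanning every vector."""
    return index.get(bucket_key(query, hyperplanes), [])

# Toy data: "a" and "b" point in the same direction, "c" points the opposite way.
vectors = {"a": [1.0, 2.0], "b": [2.0, 4.0], "c": [-1.0, -2.0]}
planes = make_hyperplanes(num_planes=4, dim=2)
index = build_index(vectors, planes)
print(find_candidates([2.0, 4.0], index, planes))
```

The trade-off is that a bucket lookup can miss some true neighbors, so indexes like this give approximate results in exchange for speed.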
Testing out Vector Databases in code
As you can see, when we run the search query at the end, it returns the correct document as the response, as well as a confidence score and metadata.
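A tiny in-memory stand-in shows the same flow of storing documents, running a query, and getting back a document with a score and metadata. Everything here is an assumption for illustration: `ToyVectorStore` is a hypothetical class, not a real vector database client, and `embed` is a simple word-count vector over a small fixed vocabulary, not a real embedding model.

```python
import math

# Hypothetical fixed vocabulary; each position in a vector counts one word.
VOCAB = ["apple", "fruit", "pie", "recipe", "fresh", "iphone", "macbook", "release"]

def embed(text):
    """Toy embedding: count how often each vocabulary word appears."""
    words = text.lower().split()
    return [words.count(term) for term in VOCAB]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0  # all-zero vectors score 0

class ToyVectorStore:
    """Stores (document, vector, metadata) records and searches them by brute force."""

    def __init__(self):
        self.records = []

    def add(self, document, metadata):
        self.records.append(
            {"document": document, "vector": embed(document), "metadata": metadata}
        )

    def query(self, text, top_k=1):
        query_vector = embed(text)
        hits = [
            {
                "document": r["document"],
                "score": cosine_similarity(query_vector, r["vector"]),
                "metadata": r["metadata"],
            }
            for r in self.records
        ]
        hits.sort(key=lambda hit: hit["score"], reverse=True)
        return hits[:top_k]

store = ToyVectorStore()
store.add("Apple pie recipe with fresh apple fruit", {"topic": "food"})
store.add("Apple iPhone and MacBook release", {"topic": "tech"})

results = store.query("fresh fruit recipe", top_k=2)
for hit in results:
    print(hit)
```

The food document ranks first because it shares words with the query, while the tech document scores zero. A real setup would swap `embed` for an embedding model and the brute-force scan for an indexed search.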