Closed
Create the index after the table has some data
Waiting for data before creating the index on a vector column, and periodically rebalancing it, has been a common stumbling block for users on Supabase.
What do you think about adding a "goodness of fit" estimating utility function? Something like:
create function vector_index_balance_estimate(index_name name, n_samples bigint) that
- samples a fraction of the column's vectors
- computes k-means centroids for the sample
- computes the sample's average distance to cluster centroid
- compares that with the average distance of the same points to their centroids in the index
- returns the difference as a %
so the result could be interpreted as "recomputing this index will make points approximately x% closer to their cluster centroid".
That'd give users the ability to quickly check how much drift they've experienced and potentially put their index maintenance in a cron job.
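To make the proposal concrete, here is a rough sketch of the estimate in plain Python, outside the database. It is not how pgvector computes anything: `kmeans`, `average_distance`, and `balance_estimate` are hypothetical helper names, the index's current centroids are assumed to be available as a list of vectors, and each point is assumed to belong to its nearest centroid under Euclidean distance.

```python
import math
import random


def average_distance(points, centroids):
    """Mean Euclidean distance from each point to its nearest centroid."""
    total = sum(min(math.dist(p, c) for c in centroids) for p in points)
    return total / len(points)


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    # Start from k distinct sample points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def balance_estimate(sample, index_centroids, iterations=20, seed=0):
    """Estimated % by which a rebuild would shrink the average
    distance-to-centroid for the sampled points (hypothetical metric)."""
    fresh = kmeans(sample, len(index_centroids), iterations, seed)
    current = average_distance(sample, index_centroids)
    rebuilt = average_distance(sample, fresh)
    if current == 0:
        return 0.0
    return 100.0 * (current - rebuilt) / current


# Example: the index was built when data sat near the origin,
# but the table has since drifted to two clusters further away.
rng = random.Random(1)
sample = [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(100)]
sample += [[rng.gauss(10, 0.5), rng.gauss(10, 0.5)] for _ in range(100)]
stale_centroids = [[0.0, 0.0], [1.0, 1.0]]
print(f"estimated improvement: {balance_estimate(sample, stale_centroids):.1f}%")
```

A value near 0 would suggest the existing centroids still fit the data, while a large positive value would suggest a rebuild is worthwhile. A scheduled job could compare the value against a threshold before reindexing.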
gregnr