Estimating ivfflat balance / need to recompute index #105

@olirice


Create the index after the table has some data

Waiting for data before creating the index on a vector column, and then periodically rebalancing it, has been a common stumbling block for users on Supabase.

What do you think about adding a "goodness of fit" estimation utility function? Something like:

create function vector_index_balance_estimate(index_name name, n_samples bigint)

that

  • samples a fraction of the column's vectors
  • computes k-means centroids for the sample
  • computes the sample's average distance to its cluster centroid
  • compares that average with the average distance of the same points to their centroids in the existing index
  • returns the difference as a %

so the result could be interpreted as "recomputing this index will make points approximately x% closer to their cluster centroid".
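To make the proposal concrete, here is a rough sketch of the computation in plain Python. It is not pgvector code: `balance_estimate`, `kmeans` and the other names are hypothetical, Euclidean distance is assumed, and the index's current centroids are assumed to be available as a plain list (how to read them out of an ivfflat index is not covered here). It also assumes the sample has at least as many points as there are centroids.

```python
import math
import random


def mean_nearest_distance(points, centroids):
    """Average Euclidean distance from each point to its closest centroid."""
    total = sum(min(math.dist(p, c) for c in centroids) for p in points)
    return total / len(points)


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def balance_estimate(vectors, index_centroids, n_samples, seed=0):
    """Percent by which a rebuild would shrink the mean distance to centroid.

    Hypothetical helper: `index_centroids` stands in for the centroids
    stored in the existing index.
    """
    rng = random.Random(seed)
    sample = rng.sample(vectors, min(n_samples, len(vectors)))
    fresh = kmeans(sample, k=len(index_centroids), seed=seed)
    current = mean_nearest_distance(sample, index_centroids)
    rebuilt = mean_nearest_distance(sample, fresh)
    if current == 0:
        return 0.0
    return 100.0 * (current - rebuilt) / current
```

One caveat with this approach: the fresh centroids are fitted to the same sample they are scored on, so the estimate is probably biased upward for small samples. Scoring on a held-out part of the sample might be a fairer comparison.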


That'd give users the ability to quickly check how much drift they've experienced and potentially put their index maintenance in a cron job.
