vectorize: optimize VectorSumCenter and HalfvecSumCenter #860
The functions VectorSumCenter and HalfvecSumCenter were not being vectorized by the compiler. A few slight changes will allow these optimizations to take place and get a performance boost by utilizing SIMD instructions. This optimization helps improve performance of vector operations in IVF index building and updating.
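As a rough illustration of the kind of "slight change" that can unblock auto-vectorization, here is a minimal sketch. It is not the code from this PR: `DemoVector`, `DEMO_MAX_DIM`, and `demo_sum_center` are hypothetical names. It assumes the obstacle is a loop bound read through a pointer on every iteration, which the compiler may have to reload when type-based alias analysis is disabled (for example with `-fno-strict-aliasing`).

```c
#include <assert.h>

#define DEMO_MAX_DIM 8			/* hypothetical upper bound, for the sketch only */

/* Hypothetical stand-in for a vector value: a length plus its components. */
typedef struct DemoVector
{
	int			dim;
	float		x[DEMO_MAX_DIM];
} DemoVector;

/* Adds the components of vec into the running per-dimension sums in acc. */
void
demo_sum_center(const DemoVector *vec, float *acc)
{
	/*
	 * Copy the bound into a local once. If the loop condition read vec->dim
	 * directly, a store through acc might have to be treated as possibly
	 * changing it, which can keep the loop from being vectorized.
	 */
	int			dim = vec->dim;

	assert(dim >= 0 && dim <= DEMO_MAX_DIM);

	for (int k = 0; k < dim; k++)
		acc[k] += vec->x[k];
}
```

Calling `demo_sum_center` once per input vector accumulates an element-wise sum in `acc`, which is the shape of loop a compiler can map onto SIMD add instructions.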
Hi @binarycleric, thanks for the PR. Can you share data on how this impacts the k-means stage of the index build time?
@ankane Sure thing. I was just about to publish a test when I saw your comment. With this commit we're seeing a significant drop in
Given the following SQL script:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE OR REPLACE FUNCTION generate_random_floats(size integer)
RETURNS float[] AS $$
BEGIN
    RETURN (
        SELECT array_agg(random() * 2 - 1)
        FROM generate_series(1, size)
    );
END;
$$ LANGUAGE plpgsql;

CREATE TABLE t (val vector(1536));
INSERT INTO t (val)
SELECT generate_random_floats(1536)
FROM generate_series(1, 10000);
CREATE INDEX ON t USING ivfflat (val vector_l2_ops) WITH (lists = 1);

CREATE TABLE t2 (val halfvec(1536));
INSERT INTO t2 (val)
SELECT generate_random_floats(1536)
FROM generate_series(1, 10000);
CREATE INDEX ON t2 USING ivfflat (val halfvec_l2_ops) WITH (lists = 1);

DROP FUNCTION generate_random_floats(integer);
DROP TABLE t;
DROP TABLE t2;
```

Without autovectorization in Sums.
With autovectorization in Sums.
I see that my change introduced the following compiler warning. I'm looking into whether there is anything I can do to address it. It looks like any kind of change to use native ARM functions breaks vectorization. Despite the warning, this commit does seem to improve performance (at least on ARM, which is what I have available to test).
ankane
left a comment
Some results from my testing for the k-means stage (with 1,000 lists):
| Platform | Dataset | Type | Before (sec) | After (sec) |
|---|---|---|---|---|
| Linux x86-64 | SIFT | vector | 7.3 | 7.2 |
| Mac arm64 | SIFT | vector | 3.52 | 3.48 |
| Mac arm64 | SIFT | halfvec | 4.33 | 4.25 |
| Linux x86-64 | GIST | vector | 36.0 | 34.8 |
| Mac arm64 | GIST | vector | 14.6 | 14.1 |
| Mac arm64 | GIST | halfvec | 15.2 | 14.5 |
It's not a big difference (especially when you factor in the rest of the index build time), but seems fine to include. Added a few comments inline.
Updated to remove
Great, thanks!
* vectorize: optimize VectorSumCenter and HalfvecSumCenter
  The functions VectorSumCenter and HalfvecSumCenter were not being vectorized by the compiler. A few slight changes will allow these optimizations to take place and get a performance boost by utilizing SIMD instructions. This optimization helps improve performance of vector operations in IVF index building and updating.
* Removing const, commenting that it is only vectorized on ARM
Verified that the loops are being vectorized by checking with:
PG_CFLAGS += -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize
Before
After
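To try the same remark flags outside the extension build, a small standalone file can be compiled with clang. This is a sketch under assumptions, not part of the PR: the file name `demo_remarks.c` and both function names are hypothetical. The first loop is an element-wise accumulation that clang will typically report as vectorized. The second is a floating-point reduction, which compilers usually leave scalar without options such as `-ffast-math`, because reordering the additions can change rounding.

```c
#include <assert.h>

/*
 * Compile with clang to print the optimizer's remarks
 * (the file name is hypothetical):
 *
 *   clang -O2 -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize \
 *         -c demo_remarks.c
 */

/* Element-wise accumulation: acc[i] += src[i] for each i in [0, n). */
void
demo_accumulate(float *restrict acc, const float *restrict src, int n)
{
	assert(n >= 0);

	/* restrict promises the arrays do not overlap, so no runtime check is needed. */
	for (int i = 0; i < n; i++)
		acc[i] += src[i];
}

/* Strict left-to-right sum of n floats; each addition depends on the previous one. */
float
demo_total(const float *src, int n)
{
	float		total = 0.0f;

	for (int i = 0; i < n; i++)
		total += src[i];

	return total;
}
```

The `-Rpass-analysis=loop-vectorize` remarks are the more useful of the two when a loop is skipped, since they state the reason the vectorizer gave up.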