@dian-lun-lin dian-lun-lin commented Aug 23, 2025

Pull Request resolved: #4450

This pull request introduces support for Intel ScalableVectorSearch (SVS), integrating Intel's proprietary LVQ and LeanVec technologies in binary form. The following index types are now supported: IndexSVSVamana, IndexSVSVamanaLVQ, IndexSVSVamanaLeanVec, and IndexSVSFlat. IndexSVSVamana and IndexSVSFlat use the SVS open-source float32 implementation.

Key features and enhancements include:

  • Implemented search, add, and remove_ids
  • Implemented filter search and range_search
  • Enabled SVS open-source fp16 and scalar quantization implementation
  • Enabled factory methods
  • Enabled save/load
  • Enabled Python bindings
  • Enabled a fallback mechanism that falls back to 8-bit scalar quantization if LVQ/LeanVec is used on non-Intel CPUs
  • Added examples in both Python and C++ under the tutorial/ directory
  • Added the FAISS_ENABLE_SVS flag, which allows users to optionally enable SVS

TODOs:

  • Support more indices such as IndexSVSIVF and IndexSVSFlat with LVQ and LeanVec
  • Documentation

// we only know the number of bytes at the very end.
// One solution would be to write a footer and read it in index_read(),
// but since the file ends anyway after index_write() is done, we just
// assume EOF means end of binary SVS blob.
Contributor

Would it not be a problem if the SVSVamana index is embedded into another index, e.g., as the coarse quantizer of an IndexIVF? In that case, the index does not end at EOF.

That's correct. I found a simple solution that removes the EOF assumption entirely. So this is no longer a concern.
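The thread does not say which solution was adopted. One common way to avoid relying on EOF is to prefix the binary blob with its length, so a reader consumes exactly that many bytes and leaves the rest of the stream for the enclosing index. The sketch below is only an illustration of that idea: `write_blob` and `read_blob` are hypothetical helper names, not Faiss or SVS functions, and it assumes the blob size is known before writing (for example because the blob was first assembled in a buffer).

```python
import io
import struct


def write_blob(stream, payload: bytes) -> None:
    """Write an 8-byte little-endian length followed by the payload."""
    stream.write(struct.pack("<Q", len(payload)))
    stream.write(payload)


def read_blob(stream) -> bytes:
    """Read exactly one length-prefixed payload; later bytes stay unread."""
    header = stream.read(8)
    if len(header) != 8:
        raise EOFError("truncated blob header")
    (size,) = struct.unpack("<Q", header)
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("truncated blob payload")
    return payload


# Hypothetical scenario: the blob is embedded in a larger index file,
# so more data follows it and EOF cannot mark the end of the blob.
buf = io.BytesIO()
write_blob(buf, b"svs-index-bytes")
buf.write(b"trailing-data-of-outer-index")

buf.seek(0)
inner = read_blob(buf)
rest = buf.read()
```

With the length stored up front, the nested case raised above (a coarse quantizer inside an IndexIVF) reads back cleanly because the reader stops at the blob boundary.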

FAISS_THROW_IF_NOT(impl);
FAISS_THROW_IF_NOT(k > 0);

auto queries = svs::data::ConstSimpleDataView<float>(x, n, d);
Contributor

How is the Flat data represented in memory? As a row-major array?
In that case, would it be possible to implement knn_L2sqr and knn_innerproduct with SVS?

Author

Yes, it is a row-major array. ConstSimpleDataView is just a non-owning wrapper around x and does not copy the data.

We also plan to support FP16, int8 scalar quantization, LVQ, and LeanVec on Flat, all of which require our own distance computations to work properly.

What would be the motivation for implementing knn_L2sqr and knn_innerproduct with SVS? I think using only impl->search() is a cleaner way to hide those computation details.

svs::threads::StaticPartition(n),
[&](auto is, auto SVS_UNUSED(tid)) {
    for (auto i : is) {
        labels[i] = ntotal + i;
Contributor

NIT: how useful is it to parallelize this loop?

Author

My intuition was it could be useful if n is very large. Now I'm thinking std::iota should be enough. I'll make the changes.

float* distances,
idx_t* labels,
const SearchParameters* params) const {
FAISS_THROW_IF_NOT(impl);
Contributor

IIUC, !impl means that the index is empty, in which case the labels should all be set to -1 and the distances to +inf.

Author

Thank you for pointing this out. I'll make the corresponding changes.
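The convention described above can be sketched in a few lines: for an empty index, every one of the n queries gets k results with label -1 and distance +inf, instead of an exception. This is a plain-Python illustration, not the code from the PR; `empty_search_results` is a hypothetical helper name, and nested lists stand in for the distance and label buffers.

```python
import math


def empty_search_results(n: int, k: int):
    """Return (distances, labels) for n queries against an empty index.

    Every slot is filled with the "no result" markers named in the review:
    label -1 and distance +inf.
    """
    if n < 0 or k <= 0:
        raise ValueError("n must be >= 0 and k must be > 0")
    distances = [[math.inf] * k for _ in range(n)]
    labels = [[-1] * k for _ in range(n)]
    return distances, labels


# Two queries, three neighbours requested each.
distances, labels = empty_search_results(2, 3)
```

Callers can then treat a label of -1 uniformly as "no neighbour found", whether the index is empty or simply has fewer than k vectors.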

FAISS_THROW_IF_NOT_MSG(
impl, "Cannot serialize: SVS index not initialized.");

// Write index to temporary files and concatenate the contents
Contributor

Would it be possible to do I/O without temp files?

This requires some larger changes in SVS, which we would like to contribute separately at a later point.


# Test round-trip serialization preserves exact type
with tempfile.NamedTemporaryFile() as f:
    faiss.write_index(index, f.name)
Contributor

same

#endif

#ifdef FAISS_ENABLE_SVS
options += "SVS ";
Contributor

excellent

import tempfile
import faiss


Contributor

if "SVS" not in faiss.get_compile_options().split():
    sys.exit()

Added a skip condition similar to other tests.
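The exact skip condition added to the PR is not shown in this thread. A minimal sketch of one way to express it with unittest follows. `svs_enabled`, `COMPILE_OPTIONS`, and `TestSVSPlaceholder` are hypothetical names introduced for illustration; the only Faiss call used is `faiss.get_compile_options()`, the same one as in the suggestion above, and the sketch treats a missing Faiss installation as "no SVS".

```python
import unittest


def svs_enabled(compile_options: str) -> bool:
    """True if the space-separated compile options contain the SVS token."""
    return "SVS" in compile_options.split()


try:
    import faiss
    COMPILE_OPTIONS = faiss.get_compile_options()
except ImportError:
    # Assumption for this sketch: without Faiss there is nothing to test.
    COMPILE_OPTIONS = ""


@unittest.skipUnless(svs_enabled(COMPILE_OPTIONS), "Faiss was built without SVS")
class TestSVSPlaceholder(unittest.TestCase):
    def test_svs_flag_present(self):
        # Only runs when the build reports SVS support.
        self.assertTrue(svs_enabled(COMPILE_OPTIONS))
```

Compared with calling sys.exit() at import time, a skip decorator keeps the test module importable and makes the test runner report the tests as skipped.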

Contributor

??

Sorry, added by mistake.


if(FAISS_ENABLE_SVS)
include(FetchContent)
set(SVS_URL "https://github.com/intel/ScalableVectorSearch/releases/download/v1.0.0-dev/svs-shared-library-1.0.0-NIGHTLY-20250820.tar.gz")
Contributor

This is a binary package, right? For which platform?

Author

Yes, this binary package is for Linux.

To clarify: how critical is it to support all platforms supported by FAISS? Since this is the first round of integration, is it OK to focus on Linux only?

rfsaliev and others added 26 commits September 9, 2025 15:08
* Added index factory datatypes: 'FP16', 'SQ8'
* Update CPP and Python tests
delete accidentally added file
Add Float16 and Scalar Quantization INT8 support to IndexSVSVamana
fix(svs-io): Don't rely on EOF when loading SVS indices
chore(test-svs): remove need for temp files
chore(test-svs): skip tests if compiled without SVS
* basic testing idselector & range_search


---------

Co-authored-by: Rafik Saliev
[SVS] Add support for IDSelector in IndexSVSVamana::search() and implement IndexSVSVamana::range_search()
Revise based on comments and enable clang format
Successfully merging this pull request may close these issues.

Integration of Intel Scalable Vector Search (SVS) with FAISS
6 participants