Member

@Kerollmops Kerollmops commented Nov 10, 2025

This PR extends the new settings indexer so that it can handle updates to searchable attributes.

The new indexer is designed to significantly reduce disk writes. It reads documents directly from LMDB rather than duplicating the dataset into a temporary file and reprocessing it from scratch each time a setting changes. We compute the difference between the old and new settings and apply only the necessary changes. This second iteration adds support for searchable attributes, following the earlier support for embedders.
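As a rough illustration of the "compute the difference" idea, here is a minimal sketch of a searchable-attributes delta. `SearchableDelta` and its methods are hypothetical names invented for this example and are not part of milli. The sketch also assumes the attributes can be compared as plain sets, ignoring their order and any other metadata.

```rust
use std::collections::BTreeSet;

/// Hypothetical summary of how the searchable attributes changed.
/// (Not a milli type; the name is an assumption for illustration.)
#[derive(Debug, PartialEq, Eq)]
struct SearchableDelta {
    added: BTreeSet<String>,
    removed: BTreeSet<String>,
}

impl SearchableDelta {
    /// Compares the old and new attribute lists as sets.
    fn between(old: &[&str], new: &[&str]) -> Self {
        let old: BTreeSet<String> = old.iter().map(|s| s.to_string()).collect();
        let new: BTreeSet<String> = new.iter().map(|s| s.to_string()).collect();
        SearchableDelta {
            added: new.difference(&old).cloned().collect(),
            removed: old.difference(&new).cloned().collect(),
        }
    }

    /// True if the field was added to or removed from the searchable list.
    fn touches(&self, field: &str) -> bool {
        self.added.contains(field) || self.removed.contains(field)
    }

    /// True if nothing changed, in which case no work is needed.
    fn is_empty(&self) -> bool {
        self.added.is_empty() && self.removed.is_empty()
    }
}

fn main() {
    let delta = SearchableDelta::between(&["title", "overview"], &["title", "genres"]);
    // Only "overview" and "genres" would need to be processed; "title" is untouched.
    println!("{delta:?}");
    println!("touches title: {}", delta.touches("title"));
    println!("is empty: {}", delta.is_empty());
}
```

With a delta like this, an extractor could skip every field for which `touches` returns false instead of re-tokenizing the whole document.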

To Do

  • Start in Settings::execute:L1600.
  • Start by adding the exact fields to the MetadataBuilder and FieldIdMapWithMetadata.
  • Use Index::fields_ids_map_with_metadata to get old settings.
  • Check if default embedder template is correctly impacted.
  • Two phases: write in wtxn + extract & merge.
  • Create and call update_searchable_configs in settings.rs:L1633.
  • Just call L1467 + L1479 (Maybe not needed?).
  • Look at the SettingsDelta in indexer/mod.rs:L235.
  • Actual extraction of stuff in extract_all_settings_changes:L394.
  • In settings_change_extract, the Extractor (Ex) can be the Ex of the classic indexer.
  • Be careful about the db_fields_ids_map (could be outdated).
  • Take inspiration from the SettingsChangeEmbeddingExtractor.
  • Note that disableOnAttributes corresponds to exact_word_docids.
  • word_fid_docids and fid_word_count_docids can only be impacted by added/removed searchable fields.
  • In insert_del_u32 we should touch the word_fid_docids and the fid_word_count_docids if the current field has been added or deleted from the list (we can add a boolean to help).
  • For WordDocidsSettingsExtractor and WordPairProximityDocidsSettingsExtractor: if a field that has changed in the searchable attributes is part of the current document, tokenize it; otherwise, skip it.
  • For WordPairProximityDocidsSettingsExtractor, copy and paste the WordPairProximityDocidsExtractor logic.
  • Force reindexing if proximity precision changes:
    • If from byAttribute -> byWord: Reindex all fields. Boolean that says: do everything!
    • If from byWord -> byAttribute: Clear word_pair_proximity_docids and skip this extractor.
  • For prefix_search:
    • Enabled -> Disabled: Clear the word_prefix_docids, word_prefix_field_docids, word_prefix_position_docids databases and skip post processing of prefixes.
    • Disabled -> Enabled: Reindex all fields. Boolean that says: do everything! (Post processing).
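
The proximity precision and prefix_search transitions listed above can be sketched as a small decision function. This is only an illustration under stated assumptions: `ProximityPrecision`, `ReindexPlan`, and `plan_reindex` are defined locally for the example and are hypothetical, not the types or functions used in the PR, and prefix search is reduced to a boolean.

```rust
/// Local stand-in for the proximity precision setting (assumption for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProximityPrecision {
    ByWord,
    ByAttribute,
}

/// Hypothetical set of flags describing what the settings indexer must do.
#[derive(Debug, Default, PartialEq, Eq)]
struct ReindexPlan {
    /// The "do everything" boolean: re-extract every searchable field.
    reindex_all_fields: bool,
    /// Clear the word_pair_proximity_docids database.
    clear_word_pair_proximity: bool,
    /// Skip the word pair proximity extractor entirely.
    skip_word_pair_proximity_extractor: bool,
    /// Clear word_prefix_docids, word_prefix_field_docids and word_prefix_position_docids.
    clear_prefix_databases: bool,
    /// Skip the post processing of prefixes.
    skip_prefix_post_processing: bool,
}

fn plan_reindex(
    old_precision: ProximityPrecision,
    new_precision: ProximityPrecision,
    old_prefix_enabled: bool,
    new_prefix_enabled: bool,
) -> ReindexPlan {
    use ProximityPrecision::{ByAttribute, ByWord};
    let mut plan = ReindexPlan::default();

    match (old_precision, new_precision) {
        // byAttribute -> byWord: every field must be reindexed.
        (ByAttribute, ByWord) => plan.reindex_all_fields = true,
        // byWord -> byAttribute: drop the database and skip the extractor.
        (ByWord, ByAttribute) => {
            plan.clear_word_pair_proximity = true;
            plan.skip_word_pair_proximity_extractor = true;
        }
        // No change: nothing to force.
        _ => {}
    }

    match (old_prefix_enabled, new_prefix_enabled) {
        // Enabled -> Disabled: clear the prefix databases, skip post processing.
        (true, false) => {
            plan.clear_prefix_databases = true;
            plan.skip_prefix_post_processing = true;
        }
        // Disabled -> Enabled: reindex all fields (post processing runs).
        (false, true) => plan.reindex_all_fields = true,
        _ => {}
    }

    plan
}

fn main() {
    let plan = plan_reindex(
        ProximityPrecision::ByWord,
        ProximityPrecision::ByAttribute,
        true,
        false,
    );
    println!("{plan:#?}");
}
```

Keeping the decisions in one plain struct makes it easy to check each transition in isolation before wiring the flags into the extractors.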

@Kerollmops Kerollmops added the "no db change" label (The database didn't change) Nov 10, 2025
@Kerollmops Kerollmops changed the title from "Add some comments as guidance" to "Support the searchable attributes in the new Settings Indexer" Nov 12, 2025
@Kerollmops Kerollmops force-pushed the new-searchable-settings-indexer branch from 81b1c2e to 9736c0e Compare November 13, 2025 11:31
@Kerollmops Kerollmops force-pushed the new-searchable-settings-indexer branch from a6bda2e to cd45451 Compare November 13, 2025 16:33
@Kerollmops Kerollmops force-pushed the new-searchable-settings-indexer branch from ee5c8b8 to 24ee086 Compare November 13, 2025 17:09
2 participants