Open
Description
Background:
- We use SQLite's FTS5 functionality for search. It works pretty okay. It's small and fast and (somewhat) easy to use. However, it has almost no guardrails. If you use it wrong, it will silently fail and corrupt the index. I have lost many evenings to this.
- When multiple terms are entered into the search field, the default operation is AND, meaning all terms must appear in the results.
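A minimal sketch of the expected AND behaviour, using Python's built-in sqlite3 module and assuming it was built with FTS5 enabled. Only the table name pages_fts comes from this report; the title and body columns, the sample rows, and the search helper are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: only the table name pages_fts comes from the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages_fts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO pages_fts(title, body) VALUES (?, ?)",
    [
        ("linux_distributions", "overview page"),
        ("linux_mint_mate", "desktop notes"),
        ("linux_mint", "install notes"),
        ("arch_linux", "rolling release"),
        ("ubuntu_mate", "desktop notes"),
    ],
)


def search(query):
    """Run an FTS5 query against every column and return the matching titles."""
    rows = conn.execute(
        "SELECT title FROM pages_fts WHERE pages_fts MATCH ? ORDER BY title",
        (query,),
    )
    return [row[0] for row in rows]


# Two bare terms separated by whitespace are combined with an implicit AND.
implicit_and = search("linux mate")
explicit_and = search("linux AND mate")
reversed_and = search("mate linux")
explicit_or = search("linux OR mate")

print(implicit_and)
print(explicit_or)
```

With the default unicode61 tokenizer the underscore acts as a separator, so a title such as linux_mint_mate is indexed as the three tokens linux, mint, and mate. In this sketch the implicit AND, the explicit AND, and the reversed query all return the same single row, while the OR query returns all five.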
I have observed on my personal instance that when searching for multiple terms, the body results all look correct, but the title results include titles that do NOT contain all of the terms being searched for. For example, if I search for linux mate, I get these results:
- linux_distributions
- linux_mint_mate
- linux_mint
- arch_linux
The only result in this case should be the second result, "linux_mint_mate" because it's the only title with both terms.
I have noticed that reversing the query (searching for mate linux) returns a different set of results, which it should not:
- ubuntu_mate
- linux_mint_mate
Things I have ruled out so far:
- Two separate queries are used to get the title and body results. They are identical except for which column they search, and the maximum number of terms returned. It doesn't make sense to me why they would behave any differently.
- It's not a matter of the AND mistakenly getting turned into an OR somehow, because running the query with an explicit OR returns a great many other pages.
- This happens on a database which has been exported and then imported again, which rules out some latent corruption in the pages_fts table.
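One further thing that may be worth ruling out, offered as a guess rather than a diagnosis: in FTS5 query syntax, a column filter written as title : linux mate applies only to the first term, leaving the second term free to match any column. The sketch below shows that this produces order-dependent title results, and that wrapping the terms in parentheses scopes the filter to all of them. The schema, the sample rows, and the way the query string is built are all assumptions; the actual query used by the application is not shown in this report.

```python
import sqlite3

# Hypothetical schema and data: only the table name pages_fts comes from the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages_fts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO pages_fts(title, body) VALUES (?, ?)",
    [
        ("linux_distributions", "covers mint, mate and others"),
        ("linux_mint_mate", "desktop notes"),
        ("linux_mint", "also ships a mate edition"),
        ("arch_linux", "the mate desktop can be installed"),
        ("ubuntu_mate", "a linux distribution"),
    ],
)


def search(query):
    """Run a raw FTS5 query string and return the matching titles."""
    rows = conn.execute(
        "SELECT title FROM pages_fts WHERE pages_fts MATCH ? ORDER BY title",
        (query,),
    )
    return [row[0] for row in rows]


# Filter binds to the first term only: the second term may match the body.
loose_forward = search("title : linux mate")
loose_reversed = search("title : mate linux")

# Parentheses apply the column filter to every term in the expression.
scoped_forward = search("title : (linux mate)")
scoped_reversed = search("title : (mate linux)")

print(loose_forward)
print(loose_reversed)
print(scoped_forward)
```

In this sketch the two loosely scoped queries return different rows depending on term order, while the parenthesised queries return only linux_mint_mate in both orders. If the title query is built differently from this, the sketch does not apply.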