@florian-huber florian-huber commented Aug 15, 2025

New fast similarity score computation inspired by the Flash Entropy from Li and Fiehn (2023).

The approach first builds an m/z index (_LibraryIndex) that contains ALL fragments of the dataset, ordered by m/z. This allows much faster computation than, for instance, CosineGreedy or ModifiedCosine.
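To illustrate why a sorted m/z index speeds things up, here is a minimal sketch of the general idea. It is not the `_LibraryIndex` implementation from this PR; the function names `build_mz_index` and `query_peak` and the data layout are assumptions made for illustration only. All fragments are concatenated into flat arrays, sorted once by m/z, and each query peak then finds its candidates with two binary searches instead of a loop over every library spectrum.

```python
import numpy as np


def build_mz_index(spectra):
    """Build a flat fragment index sorted by m/z (hypothetical helper).

    spectra: list of (mz_values, intensities) pairs, one pair per spectrum.
    Returns three aligned arrays: m/z, intensity, and source spectrum id.
    """
    mz = np.concatenate([np.asarray(s[0], dtype=float) for s in spectra])
    intensity = np.concatenate([np.asarray(s[1], dtype=float) for s in spectra])
    spec_id = np.concatenate([np.full(len(s[0]), i) for i, s in enumerate(spectra)])
    order = np.argsort(mz, kind="stable")  # sort all fragments once by m/z
    return mz[order], intensity[order], spec_id[order]


def query_peak(index, query_mz, tolerance):
    """Return (spectrum ids, intensities) of all fragments within the tolerance."""
    mz, intensity, spec_id = index
    # Two binary searches bound the window of matching fragments.
    lo = np.searchsorted(mz, query_mz - tolerance, side="left")
    hi = np.searchsorted(mz, query_mz + tolerance, side="right")
    return spec_id[lo:hi], intensity[lo:hi]


# Toy library of three spectra (made-up values).
library = [
    ([100.0, 200.0], [1.0, 0.5]),
    ([100.005, 300.0], [0.8, 0.2]),
    ([150.0], [1.0]),
]
index = build_mz_index(library)
ids, intensities = query_peak(index, 100.0, tolerance=0.01)
```

In this sketch a single lookup costs O(log N) plus the number of hits, where N is the total number of fragments in the library, which is the property that makes index-based scoring attractive for large all-vs-all comparisons.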

In the current implementation it is possible to choose different score types:

  • score_type can be spectral entropy or cosine
  • matching_mode can be fragment, neutral_loss or hybrid (hybrid being a combination of both fragments and neutral losses)

Modified cosine scores are then obtained by combining the cosine score type with the hybrid matching mode.
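The three matching modes can be sketched as follows. This is an illustrative assumption about how candidate peak pairs could be selected, not the code from this PR; `candidate_pairs` is a hypothetical helper, and the neutral loss of a fragment is taken here as precursor m/z minus fragment m/z. The sketch only finds candidate pairs; the actual score additionally requires assigning each peak at most once and weighting by intensity.

```python
import numpy as np


def candidate_pairs(mz_a, precursor_a, mz_b, precursor_b, tolerance,
                    matching_mode="fragment"):
    """List index pairs (i, j) of peaks that may match (hypothetical helper)."""
    mz_a = np.asarray(mz_a, dtype=float)
    mz_b = np.asarray(mz_b, dtype=float)

    # Fragment matching: the m/z values agree within the tolerance.
    fragment = np.abs(mz_a[:, None] - mz_b[None, :]) <= tolerance

    # Neutral loss matching: (precursor m/z - fragment m/z) agrees within the tolerance.
    loss_a = precursor_a - mz_a
    loss_b = precursor_b - mz_b
    neutral_loss = np.abs(loss_a[:, None] - loss_b[None, :]) <= tolerance

    if matching_mode == "fragment":
        mask = fragment
    elif matching_mode == "neutral_loss":
        mask = neutral_loss
    elif matching_mode == "hybrid":
        mask = fragment | neutral_loss  # either criterion is sufficient
    else:
        raise ValueError(f"Unknown matching_mode: {matching_mode}")
    return [(int(i), int(j)) for i, j in np.argwhere(mask)]


# Two made-up spectra whose precursors differ by 14 Da.
args = ([100.0, 150.0], 200.0, [100.0, 164.0], 214.0, 0.01)
fragment_pairs = candidate_pairs(*args, matching_mode="fragment")
loss_pairs = candidate_pairs(*args, matching_mode="neutral_loss")
hybrid_pairs = candidate_pairs(*args, matching_mode="hybrid")
```

In this toy case the peak at 150.0 pairs with the peak at 164.0 only through their shared neutral loss of 50.0, which is the kind of shifted match that modified cosine accounts for.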

I quickly compared CosineGreedy with FlashSimilarity (cosine mode) and Blink (see #829). The comparison shows that FlashSimilarity does not scale as well as Blink for the presented task (all-vs-all comparisons), but is MUCH faster than CosineGreedy, by about 100x. In addition, and unlike Blink, FlashSimilarity is not an approximation: it computes the exact same scores as CosineGreedy.

image

FlashSimilarity now also computes modified cosine scores (exactly the same as the current ModifiedCosine) about 100x faster.
image

@florian-huber florian-huber changed the title Flash entropy FlashSimilarity for faster flash entropy, cosine, modified cosine computation Sep 4, 2025
@florian-huber florian-huber merged commit 557feb9 into master Oct 6, 2025
10 checks passed
@florian-huber florian-huber deleted the flash_entropy branch October 6, 2025 15:28