@Adamtaranto
Collaborator

This PR adds simple distance metrics for comparing count tables, starting with Jaccard and cosine similarity.

Closes #39

This initial commit adds a cosine() method that calculates the cosine similarity between two KmerCountTables.
I have attempted to use Rayon to make the hash lookups multithreaded.
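
For readers following the logic check requested below, here is a minimal single-threaded Python sketch of the calculation. It is not the code in this PR: it assumes each count table can be modeled as a plain dict mapping a k-mer (or its hash) to a count, the function name `cosine_similarity` is hypothetical, and returning 0.0 for an empty table is an assumed convention.

```python
import math


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two count tables modeled as dicts (assumption)."""
    # Only k-mers present in both tables contribute to the dot product,
    # so iterate over the smaller table and look each k-mer up in the larger one.
    small, large = sorted((counts_a, counts_b), key=len)
    dot = sum(count * large.get(kmer, 0) for kmer, count in small.items())

    # Euclidean norm of each table's count vector.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # Assumed convention: similarity with an empty table is 0.0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


table_a = {"AAA": 2, "AAC": 1}
table_b = {"AAA": 1, "CCC": 3}
print(cosine_similarity(table_a, table_b))
```

The per-k-mer lookups inside the dot product are independent of one another, which is what makes that loop a natural candidate for parallelizing.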

@ctb can you check the logic on this + look over the tests?

I've benchmarked it against the scipy cosine function, allowing for a 0.001% margin (I'm not sure where the difference is coming from).

In the test test_cosine_similarity_identical_tables(), my function returns 0.999999999 instead of 1. I don't know why this is happening.
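
One possible contributor, offered as a hedged guess rather than a diagnosis of this PR's code, is floating-point rounding: each square root is rounded to the nearest representable double, so multiplying the two norms does not always reproduce the dot product exactly, even for identical tables. The sketch below uses made-up counts to show the pattern and two common ways to handle it in a test. Rounding of this kind only affects the last few bits, so a larger gap would point to another cause.

```python
import math

# Made-up counts standing in for one table compared with itself.
counts = [3, 1, 4, 1, 5, 9, 2, 6]

dot = sum(c * c for c in counts)   # exact, since Python ints do not round
norm = math.sqrt(dot)              # rounded to the nearest double
similarity = dot / (norm * norm)   # may differ from 1.0 in the last bits

print(similarity)

# Option 1: compare with a tolerance instead of exact equality.
assert math.isclose(similarity, 1.0, rel_tol=1e-9)

# Option 2: clamp to the valid range before any later use (e.g. acos).
clamped = min(1.0, max(-1.0, similarity))
```

In a pytest test, `pytest.approx(1.0)` serves the same purpose as the `math.isclose` check.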

@Adamtaranto Adamtaranto added the enhancement New feature or request label Sep 22, 2024
@Adamtaranto Adamtaranto self-assigned this Sep 22, 2024
@ctb
Contributor

ctb commented Sep 23, 2024

see #48 - I added a bit to the tests. Overall this looks great, thank you!!

ctb and others added 2 commits September 23, 2024 12:44
…f paranoia thrown in (#48)

* test symmetric

* Style fixes by Ruff

---------

Co-authored-by: ctb <[email protected]>
@Adamtaranto Adamtaranto merged commit c0ff7fe into main Sep 23, 2024
15 checks passed
@Adamtaranto Adamtaranto deleted the dev_dist_metrics branch September 23, 2024 03:06
@ctb ctb mentioned this pull request Sep 23, 2024
Development

Successfully merging this pull request may close these issues.

support Jaccard and angular similarity calculations

2 participants