Add semantic similarity to top level interface #1063

Open
o-love wants to merge 2 commits into main from semantic-similarity

@o-love commented Feb 6, 2026

Connects the similarity threshold of memmachine to the similarity argument for semantic memory.
This allows users to filter by similarity when using semantic memory.

Copilot AI left a comment

Pull request overview

Connects the top-level query_search “similarity threshold” parameter to the semantic-memory search call so callers can filter semantic results by similarity/distance.

Changes:

  • Passes score_threshold through to semantic memory search as min_distance
  • Updates the mock-based test to assert the new threshold is forwarded to both episodic and semantic managers
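
The forwarding check described in the second bullet can be sketched with `unittest.mock.AsyncMock`. This is a minimal illustration, not the PR's actual test: the `Facade` class, its `search` methods, and every argument name other than `score_threshold` and `min_distance` are hypothetical stand-ins.

```python
import asyncio
from unittest.mock import AsyncMock


class Facade:
    """Hypothetical stand-in for the top-level interface (not the real class)."""

    def __init__(self, episodic, semantic):
        self._episodic = episodic
        self._semantic = semantic

    async def query_search(self, query, *, limit=10, score_threshold=None):
        # Forward the same threshold to both memory backends.
        episodic = await self._episodic.search(
            query, limit=limit, score_threshold=score_threshold
        )
        semantic = await self._semantic.search(
            query, limit=limit, min_distance=score_threshold
        )
        return episodic, semantic


async def check_forwarding():
    episodic = AsyncMock()
    semantic = AsyncMock()
    facade = Facade(episodic, semantic)

    await facade.query_search("favorite food", limit=5, score_threshold=0.7)

    # Each backend must have been awaited exactly once with the threshold.
    episodic.search.assert_awaited_once_with(
        "favorite food", limit=5, score_threshold=0.7
    )
    semantic.search.assert_awaited_once_with(
        "favorite food", limit=5, min_distance=0.7
    )
    return True


if __name__ == "__main__":
    asyncio.run(check_forwarding())
```

Asserting on the awaited call arguments, rather than only on the return value, is what catches a threshold that is accepted at the top level but silently dropped before reaching one backend.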

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/memmachine/main/test_memmachine_mock.py | Strengthens assertions to verify the threshold is forwarded into both memory backends. |
| src/memmachine/main/memmachine.py | Wires the threshold into the semantic search call via min_distance. |

Comment thread src/memmachine/main/memmachine.py Outdated
Comment thread tests/memmachine/main/test_memmachine_mock.py
Comment thread tests/memmachine/main/test_memmachine_mock.py
Comment thread src/memmachine/main/memmachine.py Outdated
@o-love force-pushed the semantic-similarity branch from 2787739 to 4aed52f on February 6, 2026 22:37

@edwinyyyu left a comment

Currently, the score threshold is usable with the reranker score in episodic memory.

min_distance (interpreted as minimum distance) means that we are including all of the worst results and excluding all of the best results. Lower distance is better. Higher similarity is better. Please rename.
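
To illustrate the point, here is a small sketch of the two readings of a distance threshold. The function names and the sample hits are made up for this example and do not come from the PR.

```python
def filter_min_distance(hits, min_distance):
    """Keep hits whose distance is at least min_distance (the literal reading)."""
    return [(item, dist) for item, dist in hits if dist >= min_distance]


def filter_max_distance(hits, max_distance):
    """Keep hits whose distance is at most max_distance (the usual intent)."""
    return [(item, dist) for item, dist in hits if dist <= max_distance]


# Hypothetical (item, distance) pairs; lower distance means a closer match.
hits = [("a", 0.1), ("b", 0.5), ("c", 0.9)]

print(filter_min_distance(hits, 0.5))  # [('b', 0.5), ('c', 0.9)]  drops the best match
print(filter_max_distance(hits, 0.5))  # [('a', 0.1), ('b', 0.5)]  drops the worst match
```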

To share score thresholds between different memory types using different databases, we should convert all distances/similarities to the same type of score: either use the same embedder (and, for embedders, normalize the results from database queries so they use the same formula for computation), or use the same reranker (easier).

See #1066 for a reference of formulas used for score computation.
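
One way the normalization idea could look is sketched below. The formulas are common conventions chosen for illustration and are an assumption; they are not taken from #1066, and the function names are hypothetical. Each function maps a raw database value to a score in [0, 1] where higher is better, so a single threshold can be compared against any of them.

```python
def score_from_cosine_similarity(similarity):
    """Map cosine similarity in [-1, 1] to a score in [0, 1]."""
    return (similarity + 1.0) / 2.0


def score_from_cosine_distance(distance):
    """Map cosine distance (1 - similarity, in [0, 2]) to a score in [0, 1]."""
    return 1.0 - distance / 2.0


def score_from_euclidean_distance(distance):
    """Map a non-negative Euclidean distance to a score in (0, 1]."""
    return 1.0 / (1.0 + distance)


# The two cosine forms agree: a similarity of 0.5 is a distance of 0.5.
print(score_from_cosine_similarity(0.5))  # 0.75
print(score_from_cosine_distance(0.5))    # 0.75
```

Note that scores derived from different metrics are only on the same scale, not directly comparable in meaning, which is why the comment suggests a shared embedder or a shared reranker.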

Comment thread src/memmachine/main/memmachine.py Outdated
@o-love force-pushed the semantic-similarity branch from 4aed52f to 00bff33 on February 10, 2026 21:51
@o-love requested review from Copilot and edwinyyyu on February 10, 2026 21:51

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.


Comment thread src/memmachine/semantic_memory/semantic_session_manager.py
Comment thread src/memmachine/semantic_memory/storage/neo4j_semantic_storage.py
Comment thread src/memmachine/semantic_memory/storage/neo4j_semantic_storage.py
session_data=session_data,
set_metadata=set_metadata,
limit=limit,
distance_threshold=score_threshold,

@edwinyyyu commented Feb 10, 2026

We interpret score as higher is better right now.

Please either rename or redefine score or distance, and ensure that it works with rerankers which return 0-1.

A higher_is_better bool is now on SimilarityMetric.
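
A rough sketch of how a direction flag on the metric might be used when applying a threshold follows. Only the names `SimilarityMetric` and `higher_is_better` come from this thread; the enum members, the `label` field, and the `meets_threshold` method are assumptions made for illustration and may not match the real class.

```python
from enum import Enum


class SimilarityMetric(Enum):
    # Hypothetical members; each value is (label, higher_is_better).
    COSINE = ("cosine", True)
    DOT = ("dot", True)
    EUCLIDEAN = ("euclidean", False)

    def __init__(self, label, higher_is_better):
        self.label = label
        self.higher_is_better = higher_is_better

    def meets_threshold(self, value, threshold):
        """Return True if value is at least as good as threshold for this metric."""
        if threshold is None:
            return True  # no threshold means no filtering
        if self.higher_is_better:
            return value >= threshold
        return value <= threshold


print(SimilarityMetric.COSINE.meets_threshold(0.8, 0.7))     # True
print(SimilarityMetric.EUCLIDEAN.meets_threshold(0.8, 0.7))  # False
```

Keeping the direction on the metric means the caller never has to know whether a given backend reports distances or similarities. A reranker score in the 0-1 range would behave like the `higher_is_better=True` case.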

o-love commented Apr 6, 2026

Suggestions were opened about unifying the similarity system.
Waiting for the unified vector store interface to be available so it can be studied.

@sscargal requested a review from srinivas-aji on May 20, 2026 23:52