@kemingy kemingy commented Aug 18, 2025

We can't know in advance which retrieval method a user relies on, so the retrieval method has to be passed in as an argument.

It's the user's responsibility to guarantee that the source table is not updated. This is also due to a limitation of the retrieval methods: it's much easier for users to work directly on their Table than to wrap a snapshot into their retrieval logic.
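
To illustrate the "pass the retrieval method as an argument" idea, here is a minimal sketch. It is not the API added in this PR: the `Retriever` alias, `mean_recall`, and `toy_retrieve` are hypothetical names, and recall@k is assumed as the metric only for the sake of the example.

```python
from typing import Callable, Sequence

# Hypothetical type alias: a retrieval method maps a query to ranked record ids.
Retriever = Callable[[str], Sequence[int]]


def mean_recall(
    queries: Sequence[str],
    truth_ids: Sequence[Sequence[int]],
    retrieve: Retriever,
    top_k: int = 10,
) -> float:
    """Average recall@top_k of `retrieve` against the expected ids per query."""
    if len(queries) != len(truth_ids):
        raise ValueError("queries and truth_ids must have the same length")
    scores = []
    for query, expected in zip(queries, truth_ids):
        if not expected:
            continue  # a query without expected ids has no defined recall
        retrieved = set(list(retrieve(query))[:top_k])
        scores.append(len(retrieved & set(expected)) / len(expected))
    return sum(scores) / len(scores) if scores else 0.0


# Toy retriever standing in for the user's own logic over their table.
def toy_retrieve(query: str) -> list[int]:
    index = {"a": [1, 9], "b": [4, 5]}
    return index.get(query, [])


score = mean_recall(["a", "b"], [[1, 2], [3]], toy_retrieve, top_k=2)
```

Because the evaluator only sees a callable, it has no way to notice that the underlying table changed between building the expected ids and running the evaluation, which is why keeping the table stable is left to the caller.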

@kemingy kemingy requested a review from Copilot August 18, 2025 10:05
Copilot AI left a comment

Pull Request Overview

This PR adds support for generating groundtruth data and evaluating retrieval methods on it, consolidating embedding models under a unified API and adding utility functions for type checking.

  • Adds a GroundTruth class to generate evaluation datasets from retrieval results and measure performance metrics
  • Consolidates embedding models by unifying BaseTextEmbedding and BaseMultiModalEmbedding into a single BaseEmbedding class
  • Adds utility functions for type checking iterables and extracting iterator types
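
As a rough sketch of what type-checking helpers of this kind can look like, the following uses only the standard `typing` introspection functions. The names `is_list_of` and `iterator_item_type` are hypothetical and are not taken from `vechord/utils.py`.

```python
import collections.abc
from typing import Any, Iterator, Optional, get_args, get_origin

# Origins treated as "iterator-like" in this sketch (an assumption, not vechord's list).
_ITERATOR_ORIGINS = (
    collections.abc.Iterator,
    collections.abc.Iterable,
    collections.abc.Generator,
)


def is_list_of(annotation: Any, item_type: Optional[type] = None) -> bool:
    """Return True if `annotation` is `list[...]`, optionally with a given item type."""
    if get_origin(annotation) is not list:
        return False
    args = get_args(annotation)
    return item_type is None or (len(args) == 1 and args[0] is item_type)


def iterator_item_type(annotation: Any) -> Optional[Any]:
    """Return the yielded type of an Iterator/Iterable/Generator annotation, else None."""
    if get_origin(annotation) in _ITERATOR_ORIGINS:
        args = get_args(annotation)
        return args[0] if args else None
    return None


checks = (
    is_list_of(list[int], int),
    is_list_of(list[str], int),
    iterator_item_type(Iterator[str]),
)
```

Helpers like these let a pipeline decide from a function's return annotation alone whether a step yields one record, a list of records, or a stream of them.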

Reviewed Changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vechord/utils.py | Adds utility functions for type checking lists/iterables and extracting nested types |
| vechord/registry.py | Removes duplicate type checking functions, imports them from utils module |
| vechord/pipeline.py | Updates to use unified embedding API and adds transaction context management |
| vechord/model/web.py | Updates documentation to clarify optional steps parameter |
| vechord/groundtruth.py | New module implementing groundtruth generation and evaluation functionality |
| vechord/evaluate.py | Updates to support list-based truth IDs and InputType enum usage |
| vechord/embedding.py | Consolidates embedding classes into unified BaseEmbedding interface |
| tests/test_run.py | Adds tests for running pipelines with spacy embedding |
| tests/test_groundtruth.py | Adds tests for groundtruth generation and evaluation |
| tests/conftest.py | Fixes test fixture parameter handling |

Signed-off-by: Keming <[email protected]>
@kemingy kemingy merged commit 8bfd7cc into tensorchord:main Aug 20, 2025
7 checks passed
@kemingy kemingy deleted the groundtruth branch August 20, 2025 03:55