feat: update implementation of QAAccuracy to use Transform-based approach #234
Conversation
```python
    SemanticRobustnessConfig,
    get_perturbation_transform,
    get_model_responses_from_perturbed_inputs,
    get_model_outputs_from_perturbed_inputs,
```
I see `use_ray` is still present in this file; could you please remove it? Also check whether it is present in any other files.
Will handle this in a separate PR dedicated to GSR changes.
```python
target_output_key = self.transform.target_output_key
model_output_key = self.transform.model_output_key
sample = {target_output_key: target_output, model_output_key: model_output}
pipeline = TransformPipeline([self.transform])
```
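For context, the pattern under discussion routes `evaluate_sample` through the same pipeline machinery that `evaluate` uses: build a one-record dict keyed by the transform's configured keys, then execute the pipeline on it. Below is a minimal, self-contained sketch of that pattern; the `Transform`, `TransformPipeline`, and `execute_record` definitions here are simplified stand-ins for illustration, not fmeval's actual classes, and `exact_match` is a hypothetical scoring function.

```python
# Toy stand-ins mimicking (but not reproducing) a Transform-based design.
from typing import Callable, Dict, List


class Transform:
    """Reads input keys from a record dict and writes one output key."""

    def __init__(self, input_keys: List[str], output_key: str,
                 fn: Callable[..., float]):
        self.input_keys = input_keys
        self.output_key = output_key
        self.fn = fn

    def __call__(self, record: Dict) -> Dict:
        record[self.output_key] = self.fn(*(record[k] for k in self.input_keys))
        return record


class TransformPipeline:
    """Applies each transform to a single record, in order."""

    def __init__(self, transforms: List[Transform]):
        self.transforms = transforms

    def execute_record(self, record: Dict) -> Dict:
        for transform in self.transforms:
            record = transform(record)
        return record


def exact_match(target: str, output: str) -> float:
    # Hypothetical metric: 1.0 on an exact (whitespace-stripped) match.
    return 1.0 if target.strip() == output.strip() else 0.0


def evaluate_sample(target_output: str, model_output: str) -> Dict:
    # Mirrors the diff above: construct a one-record sample keyed by the
    # transform's keys, then run the same pipeline evaluate() would use.
    transform = Transform(["target_output", "model_output"],
                          "exact_match_score", exact_match)
    sample = {"target_output": target_output, "model_output": model_output}
    return TransformPipeline([transform]).execute_record(sample)


print(evaluate_sample("Paris", "Paris"))
```

The trade-off debated in this thread is visible even in the sketch: the pipeline indirection adds a few hops for a single-sample call, but keeps batch and single-sample evaluation on one code path.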
It feels like this makes evaluate_sample more complicated. Was this concern raised earlier? I understand it is too late to discuss this now, but can you share the conclusion or thread if this was discussed before?
It does make things slightly more complicated, but I think it's best to keep things consistent across all algos, so that it's clear that evaluate_sample and evaluate both execute the algo's pipeline. Plus, in terms of raw lines of code, there's hardly any increase.
One of the major tenets of evaluate_sample has been to keep it simple and readable, and I feel we are moving against that by adding transforms to the method. Yeah, let's capture a backlog SIM for it; we can discuss with the team then and make a call.
I am in favor of not having so many hops for the customer in the evaluate_sample code.
Description of changes:
Title
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.