feat: support average='binary' in precision_score() #2080
bigframes/ml/metrics/_metrics.py
Outdated
if y_true.drop_duplicates().count() != 2 or y_pred.drop_duplicates().count() != 2:
    raise ValueError(
        "Target is multiclass but average='binary'. Please choose another average setting."
    )

total_labels = set(
    y_true.drop_duplicates().to_list() + y_pred.drop_duplicates().to_list()
)
Should probably avoid drop_duplicates; it has overhead from trying to preserve ordering. Try unique(keep_order=False) instead. Also try to minimize the query count.
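A minimal sketch of the suggestion, not the code that landed in the PR: compute each series' distinct labels once and reuse them for both the binary check and total_labels. Plain Python sequences and set() stand in for bpd.Series and unique(keep_order=False) here, and the helper name binary_labels is hypothetical.

```python
from typing import Hashable, Sequence, Set


def binary_labels(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Set[Hashable]:
    """Validate a binary target and return the union of observed labels."""
    # Deduplicate each series exactly once. In this sketch set() stands in
    # for an order-agnostic distinct operation on the real series type.
    true_labels = set(y_true)
    pred_labels = set(y_pred)

    # Same condition as the diff under review: both series must contain
    # exactly two distinct labels.
    if len(true_labels) != 2 or len(pred_labels) != 2:
        raise ValueError(
            "Target is multiclass but average='binary'. "
            "Please choose another average setting."
        )

    # Reuse the already computed sets instead of deduplicating again.
    return true_labels | pred_labels
```

With this shape, the distinct labels are materialized once per series rather than once for the check and again for total_labels.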
Code updated. This is the execution output: https://screenshot.googleplex.com/9aFGAUSHzuPDPtB.
It's weird that no query job links are provided.
bigframes/ml/metrics/_metrics.py
Outdated
def _precision_score_binary_pos_only(
    y_true: bpd.Series, y_pred: bpd.Series, pos_label: int | float | bool | str
) -> float:
    if y_true.drop_duplicates().count() != 2 or y_pred.drop_duplicates().count() != 2:
This may create extra queries with y_true.drop_duplicates().to_list() in line 340. We may want to merge them.
Can you take a look at how many queries are created when running this function?
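To illustrate what merging could look like, here is a rough sketch in plain Python that gathers the distinct labels and the positive-class counts in a single pass over the data. It is only an analogy for issuing one combined query: the function name precision_binary_single_pass is hypothetical, lists stand in for bpd.Series, and returning 0.0 when nothing is predicted positive is an assumption rather than behavior confirmed by the PR.

```python
from typing import Hashable, Sequence


def precision_binary_single_pass(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    pos_label: Hashable = 1,
) -> float:
    """Precision for pos_label, validating the binary target in the same pass."""
    true_labels, pred_labels = set(), set()
    true_pos = 0
    false_pos = 0

    # One pass collects everything: distinct labels plus TP/FP counts.
    for t, p in zip(y_true, y_pred):
        true_labels.add(t)
        pred_labels.add(p)
        if p == pos_label:
            if t == pos_label:
                true_pos += 1
            else:
                false_pos += 1

    # Same condition as the diff under review.
    if len(true_labels) != 2 or len(pred_labels) != 2:
        raise ValueError(
            "Target is multiclass but average='binary'. "
            "Please choose another average setting."
        )

    predicted_pos = true_pos + false_pos
    # Assumption: report 0.0 when pos_label is never predicted.
    return true_pos / predicted_pos if predicted_pos else 0.0
```

For example, with y_true = [1, 0, 1, 1] and y_pred = [1, 1, 0, 1], three rows are predicted positive and two of them are correct, so the precision is 2/3.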
This is the result: https://screenshot.googleplex.com/9aFGAUSHzuPDPtB. It feels weird because no query jobs are printed out.
Local execution? @TrevorBergeron
Fixes #437198383 🦕