@sycai sycai commented Sep 12, 2025

Fixes #437198383 🦕

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Sep 12, 2025
@sycai sycai marked this pull request as ready for review September 12, 2025 22:03
@sycai sycai requested review from a team as code owners September 12, 2025 22:03
@sycai sycai changed the title from "feat: support 'binary' for precision_score" to "feat: support average='binary' in precision_score()" Sep 12, 2025
Comment on lines 334 to 341
    if y_true.drop_duplicates().count() != 2 or y_pred.drop_duplicates().count() != 2:
        raise ValueError(
            "Target is multiclass but average='binary'. Please choose another average setting."
        )

    total_labels = set(
        y_true.drop_duplicates().to_list() + y_pred.drop_duplicates().to_list()
    )

Should probably avoid drop_duplicates: it has overhead from trying to preserve ordering. Try unique(keep_order=False) instead. Also, try to minimize the query count.
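
To illustrate the suggestion, here is a minimal plain-Python sketch, not the bigframes implementation. It assumes the inputs are ordinary in-memory sequences, so a `set` stands in for an unordered unique. The function name `precision_binary_sketch` and the `pos_label` membership check are hypothetical. The point is that the distinct labels are collected once per input and then reused for both the binary check and the combined label set.

```python
from typing import Hashable, Sequence


def precision_binary_sketch(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable], pos_label: Hashable
) -> float:
    # Collect the distinct labels once per input. Order does not matter here,
    # which is why an unordered unique is enough for this check.
    true_labels = set(y_true)
    pred_labels = set(y_pred)

    # Reuse the same sets for the binary check (mirrors the condition in the diff).
    if len(true_labels) != 2 or len(pred_labels) != 2:
        raise ValueError(
            "Target is multiclass but average='binary'. "
            "Please choose another average setting."
        )

    # Assumption: the combined label set is used to validate pos_label.
    total_labels = true_labels | pred_labels
    if pos_label not in total_labels:
        raise ValueError(f"pos_label={pos_label!r} is not a valid label.")

    # Precision for the positive class: true positives / predicted positives.
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if t == pos_label and p == pos_label
    )
    pred_pos = sum(1 for p in y_pred if p == pos_label)
    return true_pos / pred_pos if pred_pos else 0.0
```

For example, `precision_binary_sketch([1, 0, 1, 1], [1, 1, 0, 1], pos_label=1)` counts three predicted positives, two of which are correct.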

Code updated. This is the execution output: https://screenshot.googleplex.com/9aFGAUSHzuPDPtB.

It's weird that no query job links are provided.

def _precision_score_binary_pos_only(
    y_true: bpd.Series, y_pred: bpd.Series, pos_label: int | float | bool | str
) -> float:
    if y_true.drop_duplicates().count() != 2 or y_pred.drop_duplicates().count() != 2:

This may create extra queries with y_true.drop_duplicates().to_list() in line 340. We may want to merge them.

Can you take a look at how many queries are created when running this function?
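
One rough way to reason about the count is with a stand-in object that tallies every materialization. The sketch below is hypothetical: `CountingSeries`, `labels_separate`, and `labels_merged` are illustrative names, not bigframes APIs, and whether each call maps to exactly one query job in BigQuery DataFrames is not established in this thread.

```python
class CountingSeries:
    """Hypothetical stand-in for a remote series; counts materializations."""

    def __init__(self, values):
        self._values = list(values)
        self.materializations = 0

    def distinct_count(self) -> int:
        # Each call is treated as one round trip.
        self.materializations += 1
        return len(set(self._values))

    def distinct_values(self) -> list:
        # Each call is treated as one round trip.
        self.materializations += 1
        return list(set(self._values))


def labels_separate(y_true: CountingSeries, y_pred: CountingSeries) -> set:
    # Pattern from the diff: count the distinct values, then fetch them again.
    if y_true.distinct_count() != 2 or y_pred.distinct_count() != 2:
        raise ValueError("Target is multiclass but average='binary'.")
    return set(y_true.distinct_values() + y_pred.distinct_values())


def labels_merged(y_true: CountingSeries, y_pred: CountingSeries) -> set:
    # Fetch the distinct values once and derive the counts locally.
    true_labels = y_true.distinct_values()
    pred_labels = y_pred.distinct_values()
    if len(true_labels) != 2 or len(pred_labels) != 2:
        raise ValueError("Target is multiclass but average='binary'.")
    return set(true_labels + pred_labels)
```

With binary inputs, the separate pattern records four materializations in total across both series, while the merged pattern records two.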

This is the result: https://screenshot.googleplex.com/9aFGAUSHzuPDPtB. It feels weird because no query jobs are printed out.

Local execution? @TrevorBergeron

@sycai sycai merged commit 920f381 into main Sep 17, 2025
18 of 25 checks passed
@sycai sycai deleted the sycai_precision_score_binary branch September 17, 2025 21:49