ENH allows to overwrite read_csv parameter in fetch_openml #26433

glemaitre · 2023-05-25T14:52:59Z

Reopening #25488

Context

As pointed out in #25878 (comment), pandas introduced a breaking change between 1.X and 2.X: None is now considered a missing value by default. To avoid being affected by this change, and to keep the same behaviour regardless of the installed pandas version, we can set the default na_values passed to read_csv to the previous 1.X values and announce a future change of the default.

However, to silence this FutureWarning, users need to provide the future default na_values themselves, so we need to expose read_csv_kwargs.
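The deprecation flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR or from #26436: the helper name `resolve_na_values` and both marker lists are hypothetical, and the lists are a shortened stand-in for the real pandas defaults.

```python
import warnings

# Hypothetical, shortened marker lists for illustration only; the real
# pandas defaults contain more entries than shown here.
LEGACY_NA_VALUES = ["", "NA", "NaN", "NULL", "nan", "null"]
FUTURE_NA_VALUES = LEGACY_NA_VALUES + ["None"]


def resolve_na_values(read_csv_kwargs=None):
    """Return read_csv keyword arguments with an explicit na_values entry.

    Hypothetical helper: if the caller did not choose na_values, pin the
    legacy markers and emit a FutureWarning announcing the change.
    """
    kwargs = dict(read_csv_kwargs or {})
    if "na_values" not in kwargs:
        warnings.warn(
            "The default na_values will change in a future release; pass "
            "na_values explicitly to silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        kwargs["na_values"] = LEGACY_NA_VALUES
    # Disable the pandas built-in markers so the result does not depend
    # on the installed pandas version.
    kwargs.setdefault("keep_default_na", False)
    return kwargs
```

Calling `resolve_na_values()` warns and pins the legacy markers, while `resolve_na_values({"na_values": FUTURE_NA_VALUES})` is silent because the caller has made an explicit choice.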

This PR is the part that exposes read_csv_kwargs to users.
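As a minimal sketch of what "overwriting" a read_csv parameter could mean, user-supplied entries can simply take precedence over the loader's own defaults. The function name `build_read_csv_kwargs` and the default values below are assumptions made for illustration; they are not taken from sklearn/datasets/_openml.py.

```python
def build_read_csv_kwargs(read_csv_kwargs=None):
    """Merge user overrides on top of loader defaults (hypothetical helper)."""
    # Illustrative defaults only; the real loader may use different ones.
    defaults = {
        "header": None,
        "na_values": ["?"],
        "comment": "%",
    }
    # Later entries win, so any key the user passes replaces the default.
    return {**defaults, **(read_csv_kwargs or {})}


# A user override replaces only the keys it names.
merged = build_read_csv_kwargs({"na_values": ["?", "None"]})
print(merged)
```

From the user's side, the override would be passed as `fetch_openml(..., read_csv_kwargs={"na_values": [...]})`. Keys that are not overridden keep their default values.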

thomasjpfan

I left some minor comments about the docstring, otherwise LGTM.

sklearn/datasets/_openml.py

glemaitre · 2023-05-25T15:59:38Z

I am going to open a follow-up PR to illustrate what this PR allows.

Co-authored-by: Thomas J. Fan <[email protected]>

glemaitre · 2023-05-25T18:47:34Z

I opened #26436, which shows how useful this feature will be for coping with the pandas breaking change.

adrinjalali

Otherwise LGTM.

sklearn/datasets/_openml.py

Co-authored-by: Adrin Jalali <[email protected]>

…arn#26433) Co-authored-by: Thomas J. Fan <[email protected]> Co-authored-by: Adrin Jalali <[email protected]>

ENH allows to overwrite read_csv parameter in fetch_openml

1c26cb2

github-actions bot added the module:datasets label May 25, 2023

update pr number

e5802ff

thomasjpfan approved these changes May 25, 2023


Apply suggestions from code review

e600588

Co-authored-by: Thomas J. Fan <[email protected]>

glemaitre marked this pull request as draft May 25, 2023 17:30

glemaitre mentioned this pull request May 25, 2023

DEPR announce change of default na_values in fetch_openml #26436

Closed

glemaitre marked this pull request as ready for review May 25, 2023 18:46

adrinjalali reviewed Jun 1, 2023


Update sklearn/datasets/_openml.py

25eb07c

Co-authored-by: Adrin Jalali <[email protected]>

adrinjalali enabled auto-merge (squash) June 7, 2023 12:02

adrinjalali merged commit 586f431 into scikit-learn:main Jun 7, 2023

manudarmi pushed a commit to primait/scikit-learn that referenced this pull request Jun 12, 2023

ENH allows to overwrite read_csv parameter in fetch_openml (scikit-le…

0ed4374

…arn#26433) Co-authored-by: Thomas J. Fan <[email protected]> Co-authored-by: Adrin Jalali <[email protected]>

REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023

ENH allows to overwrite read_csv parameter in fetch_openml (scikit-le…

5e92694

…arn#26433) Co-authored-by: Thomas J. Fan <[email protected]> Co-authored-by: Adrin Jalali <[email protected]>
