
Member

@thomasjpfan thomasjpfan commented Feb 28, 2022

Reference Issues/PRs

Fixes #22628

What does this implement/fix? Explain your changes.

This makes LabelEncoder.transform convert y to the same dtype as classes_ before encoding it.

Any other comments?

On main, LabelEncoder already encodes nan in fit_transform:

import numpy as np
from sklearn.preprocessing import LabelEncoder

print(LabelEncoder().fit_transform([1, np.nan, 2]))
# [0 2 1]

This PR makes the behavior consistent with object dtypes.
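
To illustrate the idea of converting `y` to the dtype of `classes_` before encoding, here is a minimal NumPy-only sketch. It is not the scikit-learn implementation: `encode` is a hypothetical helper, and the lookup via `np.searchsorted` is an assumption made for the example.

```python
import numpy as np


def encode(classes, y):
    """Hypothetical helper: cast y to the dtype of classes, then look up positions."""
    # Cast first, so integer input is compared against float classes as floats.
    y = np.asarray(y, dtype=classes.dtype)
    # classes is sorted, so the insertion index is the code of each known value.
    return np.searchsorted(classes, y)


# Fitting on data that contains nan yields float classes: [1., 2., nan]
classes = np.unique(np.asarray([1, np.nan, 2]))

# Integer input is converted to float64 before the lookup.
codes = encode(classes, [2, 1])
print(classes.dtype, codes)
```

The sketch only covers values that are present in `classes`; handling unseen labels is left out.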

@kabartay

Thanks. This seems really helpful.

Member

@betatim betatim left a comment


LGTM.

@thomasjpfan thomasjpfan added the "Quick Review" label (For PRs that are quick to review) Oct 25, 2022
Member

@jjerphan jjerphan left a comment


LGTM. Thank you, @thomasjpfan.

Just a comment.

Comment on lines +1151 to +1152
dtype : data-type, default=None
Data type for `y`.
Member


What is the behavior of column_or_1d when dtype=None? Is it this one?

Suggested change
- dtype : data-type, default=None
-     Data type for `y`.
+ dtype : data-type, default=None
+     Data type for `y`.
+     When dtype is None, dtype is inferred from the elements of `y`.

Member Author


Yea, it is inferred from the elements of y:

XREF: np.asarray docs
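
For illustration, a small sketch of that inference using `np.asarray` directly. This is plain NumPy; that `column_or_1d` passes `dtype` through to `np.asarray` unchanged is an assumption here, not something verified against the diff.

```python
import numpy as np

# dtype=None (the default): NumPy infers the dtype from the elements.
ints = np.asarray([1, 2, 3])          # integer dtype
floats = np.asarray([1, np.nan, 2])   # nan forces a floating-point dtype
strings = np.asarray(["a", "b"])      # unicode string dtype

# An explicit dtype overrides the inference.
forced = np.asarray([1, 2, 3], dtype=np.float64)

print(ints.dtype.kind, floats.dtype, strings.dtype.kind, forced.dtype)
```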

@jjerphan jjerphan merged commit 9c4f023 into scikit-learn:main Oct 28, 2022
jjerphan pushed a commit to samronsin/scikit-learn that referenced this pull request Oct 28, 2022
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2022
andportnoy pushed a commit to andportnoy/scikit-learn that referenced this pull request Nov 5, 2022


Development

Successfully merging this pull request may close these issues.

LabelEncoder not transforming nans as expected.

5 participants