Handle missing values in OneHotEncoder #11996

Closed
@jnothman

Description

A minimal implementation might translate a NaN in the input to a row of NaNs in the output. I believe this would be the default behaviour most consistent with other preprocessing tools, and reasonably backwards-compatible, but other core devs might disagree (see #10465 (comment)).

NaN should also be excluded from the categories identified in fit.

A handle_missing parameter might allow NaN in input to be represented in the output in one of the following ways:

  • replaced with a row of NaNs as above
  • replaced with a row of zeros
  • represented with a separate one-hot column
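
To make the three options concrete, here is a minimal NumPy sketch for a single numeric column. It is not scikit-learn code: the function name `one_hot_with_missing` and the option strings `"all-missing"`, `"all-zero"` and `"category"` are hypothetical names chosen only for illustration. It also excludes NaN from the learned categories, as suggested above.

```python
import numpy as np


def one_hot_with_missing(column, handle_missing="all-missing"):
    """Hypothetical helper (not part of scikit-learn) for one numeric column.

    The option names below are assumptions made for this sketch.
    """
    column = np.asarray(column, dtype=float)
    missing = np.isnan(column)

    # NaN is excluded from the categories identified from the data.
    categories = np.unique(column[~missing])

    # Compare every value with every category. NaN never compares equal,
    # so rows with a missing value start out as all zeros.
    encoded = (column[:, None] == categories[None, :]).astype(float)

    if handle_missing == "all-missing":
        # Option 1: a missing input becomes a row of NaNs.
        encoded[missing] = np.nan
    elif handle_missing == "all-zero":
        # Option 2: a missing input becomes a row of zeros (already the case).
        pass
    elif handle_missing == "category":
        # Option 3: append a separate indicator column for missingness.
        encoded = np.hstack([encoded, missing[:, None].astype(float)])
    else:
        raise ValueError(f"unknown handle_missing: {handle_missing!r}")

    return categories, encoded


categories, encoded = one_hot_with_missing([1.0, np.nan, 2.0, 1.0], "category")
print(categories)
print(encoded)
```

With `"category"` the output gains one extra column, while the other two options keep the output width equal to the number of non-missing categories.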

A missing_values parameter might allow the user to configure what object is a placeholder for missingness (e.g. NaN, None, etc.).
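
One subtlety with a configurable placeholder is that NaN does not compare equal to itself, so NaN, None and ordinary sentinel values each need a different test. The sketch below shows one way this could look; `missing_mask` is a hypothetical helper written for illustration, not an existing scikit-learn function.

```python
import numpy as np


def missing_mask(column, missing_values=np.nan):
    """Hypothetical helper: boolean mask of entries equal to the placeholder."""
    column = np.asarray(column, dtype=object)

    if missing_values is None:
        # None is matched by identity.
        return np.array([v is None for v in column], dtype=bool)

    if isinstance(missing_values, float) and np.isnan(missing_values):
        # NaN != NaN, so an equality test would never match.
        return np.array(
            [isinstance(v, float) and np.isnan(v) for v in column], dtype=bool
        )

    # Any other placeholder (e.g. "?" or -1) is matched by equality.
    return np.array([v == missing_values for v in column], dtype=bool)


print(missing_mask(["a", None, "b"], missing_values=None))
print(missing_mask(["a", float("nan"), "b"]))
print(missing_mask(["a", "?", "b"], missing_values="?"))
```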

See #10465 for background

Labels

Moderate: Anything that requires some knowledge of conventions and best practices
