ENH: support DataFrames in OneHot/OrdinalEncoder without converting to array #12147

Closed
@jorisvandenbossche

Description

Left-over to do from #9151 (comment)

The idea is to support DataFrames without converting them to a contiguous array. This conversion is not needed: the transformer encodes the input column by column anyway, so it would be rather easy to preserve the dtype of each column.

This would avoid converting a potentially mixed-dtype DataFrame (e.g. ints and object strings) to a full object array.
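To illustrate the conversion in question, here is a small sketch. The DataFrame and its column names are made up for the example; it only compares converting the whole frame against converting each column on its own, and does not show how the encoders are implemented.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype input: one integer column, one string column.
df = pd.DataFrame({"n": [1, 2, 3], "s": ["a", "b", "a"]})

# Converting the whole frame forces a single common dtype (object here).
as_array = np.asarray(df)

# Converting column by column keeps each column's own dtype.
per_column = [df[name].to_numpy() for name in df.columns]

print(as_array.dtype)
print([col.dtype for col in per_column])
```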

This can introduce a slight change in behaviour: it can change the dtype of `categories_` in certain edge cases, e.g. when you had a mixture of float and int columns.
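A rough sketch of that edge case, using `np.unique` as a stand-in for however the categories are actually collected (an assumption made for illustration, as are the column names):

```python
import numpy as np
import pandas as pd

# Hypothetical input mixing an int column and a float column.
df = pd.DataFrame({"i": [1, 2, 1], "f": [0.5, 1.5, 0.5]})

# Whole-frame conversion upcasts the int column to float.
whole = np.asarray(df)
cats_whole = [np.unique(whole[:, j]) for j in range(whole.shape[1])]

# Per-column conversion leaves the int column as ints.
cats_per_col = [np.unique(df[name].to_numpy()) for name in df.columns]

print([c.dtype for c in cats_whole])
print([c.dtype for c in cats_per_col])
```

The category values are the same in both cases; only the dtype of the first column's categories differs.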

(Note that this does not yet necessarily mean adding special handling for certain pandas dtypes such as the categorical dtype, see #12086. As an initial step, we could still do a `check_array` on each column / coerce each column to a numpy array.)
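A minimal sketch of what such a per-column check could look like. The helper name `check_columns` is hypothetical and is not part of scikit-learn; only `sklearn.utils.check_array` is an existing function, and passing `dtype=None` is an assumption about how one might keep each column's dtype.

```python
import numpy as np
import pandas as pd
from sklearn.utils import check_array


def check_columns(X):
    """Hypothetical helper: validate each DataFrame column separately."""
    columns = []
    for name in X.columns:
        # Coerce the single column to a numpy array, then validate it as 1-D.
        # dtype=None asks check_array to keep the column's dtype as-is.
        col = check_array(np.asarray(X[name]), ensure_2d=False, dtype=None)
        columns.append(col)
    return columns


df = pd.DataFrame({"n": [1, 2, 3], "s": ["a", "b", "a"]})
cols = check_columns(df)
print([c.dtype for c in cols])
```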
