API improve the remainder index dtype to be consistent with transformers #27657
- if all columns in inputs are provided as column names, so are remainder columns
- if all columns in inputs are provided as boolean masks, so are remainder columns
- otherwise remainder columns are int indices (as before)
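The three rules above can be sketched in plain Python. This is only an illustration of the rule, not the code from this PR: the helper names `_kind` and `format_remainder`, and the assumption that each column specification is a plain list, are hypothetical.

```python
def _kind(spec):
    """Classify one column specification (hypothetical helper)."""
    if isinstance(spec, str):
        return "name"
    items = list(spec)
    if items and all(isinstance(c, str) for c in items):
        return "name"
    if items and all(isinstance(c, bool) for c in items):
        return "mask"
    return "int"


def format_remainder(remainder_idx, specs, feature_names):
    """Express the remainder columns in the same format as the inputs.

    remainder_idx: integer positions of the remainder columns
    specs: column specifications given for the other transformers
    feature_names: all column names of the input, in order
    """
    kinds = {_kind(s) for s in specs}
    if kinds == {"name"}:
        # Every input used column names, so the remainder does too.
        return [feature_names[i] for i in remainder_idx]
    if kinds == {"mask"}:
        # Every input used boolean masks, so the remainder is a mask.
        chosen = set(remainder_idx)
        return [i in chosen for i in range(len(feature_names))]
    # Mixed or integer inputs: keep integer indices, as before.
    return list(remainder_idx)


names = ["a", "b", "c", "d"]
by_name = format_remainder([2, 3], [["a"], ["b"]], names)
by_mask = format_remainder(
    [2, 3], [[True, False, False, False], [False, True, False, False]], names
)
mixed = format_remainder([2, 3], [["a"], [1]], names)
```

With these inputs the remainder comes back as names, as a boolean mask, and as integer indices respectively.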
TODO silence when they are accessed by ColumnTransformer itself
As discussed IRL with @glemaitre, we want to avoid showing spurious warnings, as many users may not directly use the columns of the remainder entry in transformers_. The solution we wrote is to store the columns in a
glemaitre
left a comment
A first round of comments before looking at the tests.
    The remainder columns warning (if it exists) is disabled.
    """
    return _with_dtype_warning_enabled_set_to(False, transformers)
Can't we call this function directly in the code instead?
I changed it to call this function directly. I had introduced the enabled and disabled wrappers because I thought they made it a bit easier to figure out what was happening without needing to check the _RemaindersColsList when reading the ColumnTransformer code.
I see but I think this is good enough now.
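For readers following the thread, the general idea (a list of remainder columns that warns when user code reads it, but can be silenced when ColumnTransformer reads it internally) could look roughly like the sketch below. It is not the PR's implementation: the class name `WarnOnAccessList`, the `warning_enabled` attribute, and the warning text are all assumptions made for illustration.

```python
import warnings


class WarnOnAccessList(list):
    """A list that can emit a FutureWarning when an item is read.

    Hypothetical stand-in for the list type discussed in this thread.
    """

    def __init__(self, columns, *, warning_enabled=True):
        super().__init__(columns)
        self.warning_enabled = warning_enabled

    def __getitem__(self, index):
        if self.warning_enabled:
            # Placeholder message, not the wording used in the PR.
            warnings.warn(
                "The format of the remainder columns will change.",
                FutureWarning,
                stacklevel=2,
            )
        return super().__getitem__(index)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cols = WarnOnAccessList([2, 3])
    first = cols[0]               # user-style access: warns
    cols.warning_enabled = False  # what internal code would do first
    second = cols[1]              # silenced access: no warning
    n_warnings = len(caught)
```

Keeping the toggle on the list object means the rest of the code can treat the remainder columns as an ordinary list, and only the access path decides whether a warning is shown.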
glemaitre
left a comment
Otherwise LGTM
jeremiedbb
left a comment
Let's try to fit it in 1.5. Please change the target version to 1.5, the target version to change the default to 1.7 and the target version to remove the old behavior to 1.9.
jeremiedbb
left a comment
LGTM. Thanks @jeromedockes
Reference Issues/PRs
closes #27533
What does this implement/fix? Explain your changes.
As described in issue #27533, this modifies the format of the columns of the last item of the ColumnTransformer's transformers_ attribute, i.e. of the item that corresponds to the "remainder". They used to always be indices (integers); now they match the format that was used for the transformers parameter, if it was consistent across all transformers.
This is controlled by the force_int_remainder_cols parameter (better name suggestions welcome :) ): when it is True, the old behavior is kept and a FutureWarning is emitted; when it is False, the new behavior is applied.