MAINT compatibility for scikit-learn 1.4 #883
    self,
    "feature_names_in_",
    # by default, we will use "x0", "x1", ...
    np.asarray([f"x{i}" for i in range(X_array.shape[1])], dtype=object),
Since we did not need the feature names at that point, this was probably fine. But we ended up with inconsistent names: scikit-learn/scikit-learn#27801
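For reference, the default naming used in the hunk above can be tried in isolation. This is only a sketch: `default_feature_names` is a hypothetical helper name introduced here for illustration (it is not part of the PR), and it assumes `X_array` is a 2D NumPy array.

```python
import numpy as np


def default_feature_names(X_array):
    """Hypothetical helper: build placeholder names "x0", "x1", ... for a 2D array."""
    n_features = X_array.shape[1]
    # dtype=object keeps the names as plain Python strings, as in the diff above
    return np.asarray([f"x{i}" for i in range(n_features)], dtype=object)


X_array = np.zeros((4, 3))  # 4 samples, 3 features
names = default_feature_names(X_array)
print(names)
```

The names depend only on the number of columns, so any input without real column names gets the same placeholders.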
Actually, we might have a scikit-learn bug here. I have to check if we should forward the .set_output to the underlying FunctionTransformer if this is set on the ColumnTransformer.
This is a bug.
Thanks a lot @glemaitre!
OK good enough workaround for
* MAINT compatibility for scikit-learn 1.4
* iter
* use _safe_set_output
* iter
* use string column names
* avoid doctest failure due to sklearn 1.4 change

---------

Co-authored-by: Jerome Dockes
* update readme install section (#864)
* [DOC] Multiple small improvements (#867)
* Small doc improvements
* Fix typos in AggTarget and AggJoiner
* Fix bulleted lists in TableVectorizer doc
* Update skrub/_table_vectorizer.py
  Co-authored-by: Jérôme Dockès
* Change _capitals to _aux in Joiner docstring
* add whitespace so printed dataframe columns align
* running ci to reproduce the error

---------

Co-authored-by: Jérôme Dockès

* MAINT make sure pandas doesn't insert a "key_0" column in merge result (#878)
* MAINT scikit-learn estimator check should not be run yet #879 because refrenced pr has not been merged before 1.4
* MAINT compatibility with pandas >= 2.2 (#880)
* make pandas output consistent in tests
* bypass pyarrow warning
* ignore warning

---------

Co-authored-by: Guillaume Lemaitre

* MAINT avoid tie breaking in joiner examples (#882)
* MAINT compatibility for scikit-learn 1.4 (#883)
* MAINT compatibility for scikit-learn 1.4
* iter
* use _safe_set_output
* iter
* use string column names
* avoid doctest failure due to sklearn 1.4 change

---------

Co-authored-by: Jerome Dockes

* DOC add imports in examples to have self-contained snippet (#875)
  Co-authored-by: Guillaume Lemaitre
* DOC more consistent formatting in docstring (#881)
* Small doc improvements
* Fix typos in AggTarget and AggJoiner
* Fix bulleted lists in TableVectorizer doc
* Iter
* Uniformize keys
* Format docstrings
* Draft multiaggjoiner and multijoiner
* Revamp aggjoiner like joiner
* Make new aggjoiner work
* Iter
* Add multijoiner and multiaggjoiner to __init__.py
* Update assembling doc
* Draft multiaggjoiner checks
* Add multijoiner and multiaggjoiner to api reference
* Remove comments
* Docstrings
* Write example for multiaggjoiner
* Check case where aux_table is 'X'
* Change TypeError into ValueError
* Change 'list' into 'iterable' in joiner docstring following PR 742 discussion
* Update aggjoiner tests
* Implement multiaggjoiner transform
* Implement multiaggjoiner transform, fix fit
* Don't check duplicate columns when aggregating on the main table
* Test that aggjoiner doesn't work with multiple tables
* Iter tests
* Check column names after aggjoiner aggregation
* Check that all cols are present in aux_table
* Fix typo aux_key into _aux_key
* Check suffix in aggjoiner, pass all tests
* Transform test df into a fixture, support polars in aggjoiner test
* Format aggjoiner using utils
* Iter multiaggjoiner
* Add new estimators to numpydoc xref
* Update public vs private attributes for aggjoiner
* Check operations, check missing columns in multiaggjoiner
* Update docstrings
* Iter check operations
* Draft tests for cols and operations check in multiaggjoiner
* Iter
* Iter checks multiaggjoiner
* Implement checks in aggjoiner transform, format joiner
* Check dataframes in multiaggjoiner
* Iter _check_keys
* Iter _check_keys, _check_cols and _check_operations
* Make multiaggjoiner work
* Split aggjoiner tests into smaller chunks
* Test cols, operations, suffixes in multiaggjoiner
* Small doc changes
* Uniformize keys order
* Update skrub/_dataframe/_pandas.py
  Co-authored-by: Guillaume Lemaitre
* Update skrub/_dataframe/_polars.py
  Co-authored-by: Guillaume Lemaitre
* Update skrub/_dataframe/_polars.py
  Co-authored-by: Guillaume Lemaitre
* Update skrub/_join_utils.py
  Co-authored-by: Guillaume Lemaitre
* Update _join_utils.py
* Update _pandas.py

---------

Co-authored-by: Guillaume Lemaitre

* MAINT ignore deprecation warning arising from pandas after numpy 1.25 (#887)
* import parse_version from sklearn.utils.fixes (#892)
* temporarily skip ken embedding download tests (#901)
* changelog

---------

Co-authored-by: Vincent M
Co-authored-by: Théo Jolivet
Co-authored-by: Guillaume Lemaitre
Fixes some compatibility issues with scikit-learn 1.4.