Estimator check for dtype preservation for regressors #22682

Open
@ogrisel

Description

Describe the workflow you want to enable

As discussed in #22663 (comment), we should have a common test that checks that the predict method of regressors preserves the dtype, similarly to check_transformer_preserve_dtypes for transformers.

More specifically I would expect the following to hold most of the time:

>>> dtype = np.float32
>>> reg = Regressor().fit(X_train.astype(dtype), y_train.astype(dtype))
>>> assert reg.predict(X_test.astype(dtype)).dtype == dtype

If X_train, y_train, and X_test do not all share the same dtype, I would be in favor of leaving the behavior undefined (and unchecked).
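To make the expected property concrete, here is a minimal runnable sketch. `ToyMeanRegressor` is a hypothetical stand-in written only for this illustration (it is not a scikit-learn class), and the random data is made up. Only the case where all inputs share one dtype is exercised, in line with leaving mixed-dtype behavior unchecked.

```python
import numpy as np


class ToyMeanRegressor:
    """Hypothetical regressor for illustration: always predicts the training mean."""

    def fit(self, X, y):
        # ndarray.mean() on a float32 array returns a float32 scalar.
        self.mean_ = y.mean()
        return self

    def predict(self, X):
        # Build the output explicitly in the dtype of X so it is preserved.
        return np.full(X.shape[0], self.mean_, dtype=X.dtype)


rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(20, 3)), rng.normal(size=(5, 3))
y_train = rng.normal(size=20)

# Same-dtype case only: X_train, y_train and X_test are all cast to `dtype`.
for dtype in (np.float32, np.float64):
    reg = ToyMeanRegressor().fit(X_train.astype(dtype), y_train.astype(dtype))
    y_pred = reg.predict(X_test.astype(dtype))
    assert y_pred.dtype == dtype
```

A regressor that built its output with `np.full(X.shape[0], float(self.mean_))` instead would silently return float64 for float32 input, which is the kind of regression the proposed common test is meant to catch.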

This could be coupled with the existing preserves_dtype estimator tag.

Describe your proposed solution

  • Implement a new check_regressor_preserve_dtypes function, next to check_transformer_preserve_dtypes.
  • Make sure it does nothing if the regressor does not define the preserves_dtype estimator tag.
  • Identify a few regressors where this property holds, add the tag for float64 and float32, and check that the common test passes with:
pytest -vk "check_regressor_preserve_dtypes and RegressorClassName"
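The steps above could be sketched roughly as follows. This is an assumption-laden illustration, not the actual scikit-learn implementation: the tag is read from a plain `preserves_dtype` class attribute as a stand-in for the real estimator-tag lookup, the function signature is guessed, and both toy regressors are hypothetical.

```python
import copy

import numpy as np


def check_regressor_preserve_dtypes(name, regressor):
    """Sketch of the proposed check; returns the dtype names that were verified."""
    # Assumption: the tag is a plain attribute here; scikit-learn has its own tag API.
    dtype_names = getattr(regressor, "preserves_dtype", None)
    if not dtype_names:
        return []  # no tag defined: the check does nothing

    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(30, 4)), rng.normal(size=30)
    checked = []
    for dtype_name in dtype_names:
        dtype = np.dtype(dtype_name)
        # X and y share the same dtype; mixed dtypes are left unchecked.
        X_cast, y_cast = X.astype(dtype), y.astype(dtype)
        reg = copy.deepcopy(regressor).fit(X_cast, y_cast)
        y_pred = reg.predict(X_cast)
        assert y_pred.dtype == dtype, (
            f"{name}: predict returned {y_pred.dtype} for {dtype} input"
        )
        checked.append(dtype_name)
    return checked


class TaggedRegressor:
    """Hypothetical regressor that declares and honors the tag."""

    preserves_dtype = ["float64", "float32"]

    def fit(self, X, y):
        self.mean_ = y.mean()
        return self

    def predict(self, X):
        return np.full(X.shape[0], self.mean_, dtype=X.dtype)


class UntaggedRegressor(TaggedRegressor):
    """Hypothetical regressor without the tag; it always returns float64."""

    preserves_dtype = None

    def predict(self, X):
        return np.full(X.shape[0], float(self.mean_))


assert check_regressor_preserve_dtypes("Tagged", TaggedRegressor()) == ["float64", "float32"]
assert check_regressor_preserve_dtypes("Untagged", UntaggedRegressor()) == []
```

Returning the list of verified dtypes is a choice made for this sketch so the skip path is observable; a real common check would more likely return nothing and rely on the assertion alone.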

Once done, create a meta-issue to track all regressors that should be labeled to preserve dtypes, similarly to what the following issue does for transformers:

Labels

New Feature; float32 (Issues related to support for 32bit data); module:test-suite (everything related to our tests)
