TransformedTargetRegressor forces 1d y shape to regressor #26530

@Daniel3009

Describe the bug

I experience the following error when using TransformedTargetRegressor with my skorch model:
ValueError: The target data shouldn't be 1-dimensional but instead have 2 dimensions, with the second dimension having the same size as the number of regression targets (usually 1). Please reshape your target data to be 2-dimensional (e.g. y = y.reshape(-1, 1).

After checking the source code, I found the following unexpected behavior, which makes little sense:

If TransformedTargetRegressor is fitted with a 2d y, the y handed to the regressor is still squeezed to 1d.

y should keep the same shape on the way in and on the way out of TransformedTargetRegressor, or there should be an init argument to disable the shape change.
(Yes, internally it gets cast to 2d, but I'm talking about the inputs and outputs.)

https://github.com/scikit-learn/scikit-learn/blob/364c77e04/sklearn/compose/_target.py#L20
TransformedTargetRegressor-->fit

        if y.ndim == 1:
            y_2d = y.reshape(-1, 1)
        else:
            y_2d = y
        self._fit_transformer(y_2d)

[...]

        if y_trans.ndim == 2 and y_trans.shape[1] == 1:
            y_trans = y_trans.squeeze(axis=1)

But in the end it is squeezed back to 1d, which causes issues for models that expect a 2d y.
y was 2d in the beginning for a reason.

The following code would solve this:

        if y_trans.ndim == 2 and y_trans.shape[1] == 1 and y.ndim == 1:  # only squeeze back to 1d if y was 1d
            y_trans = y_trans.squeeze(axis=1)

This could only cause a problem when the input y was 2d for some reason but the regressor needs it to be 1d.
For that case an attribute would be nice:

        if y_trans.ndim == 2 and y_trans.shape[1] == 1 and self.output_dim == 1:
            y_trans = y_trans.squeeze(axis=1)

Also, in TransformedTargetRegressor-->predict the result is not squeezed after the estimator's prediction unless the original y passed to fit was 1d; only in that case is it squeezed.

So the result looks as expected, but only if the regressor takes a 1d y.
If the estimator expects a 2d y, the code fails.

Steps/Code to Reproduce

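import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.preprocessing import MinMaxScaler

regressor = TransformedTargetRegressor(
    transformer=MinMaxScaler()
)
# y is deliberately 2d, shape (10, 1)
X, y = np.random.rand(10, 10), np.expand_dims(np.random.rand(10), 1)
regressor.fit(X, y)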

Expected Results

The shape of y stays the same as the input, OR there is an attribute that lets the user choose (1d or original) or (1d or 2d).

input | internal | output
2d -> 2d -> 2d
1d -> 2d -> 1d
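
The expected shape handling in the table above can be tried in isolation with plain NumPy. This is only a sketch of the proposed squeeze rule, not the scikit-learn implementation: `transform_target` and its `scale` argument are hypothetical names, and the multiplication is a placeholder for the real `transformer.transform` call.

```python
import numpy as np


def transform_target(y, scale=2.0):
    """Hypothetical stand-in for the fit-time target handling."""
    y = np.asarray(y)
    # Internally the target is always handled as 2d.
    y_2d = y.reshape(-1, 1) if y.ndim == 1 else y
    # Placeholder for transformer.transform(y_2d).
    y_trans = y_2d * scale
    # Proposed rule: squeeze back only when the caller passed a 1d y.
    if y_trans.ndim == 2 and y_trans.shape[1] == 1 and y.ndim == 1:
        y_trans = y_trans.squeeze(axis=1)
    return y_trans


print(transform_target(np.random.rand(10)).shape)     # 1d in, 1d out
print(transform_target(np.random.rand(10, 1)).shape)  # 2d in, 2d out
```

With this rule a 1d y still reaches the regressor as 1d, while a y that was explicitly reshaped to (-1, 1) keeps its second dimension.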

Actual Results

The regressor receives just a 1d array even though y was specifically set to 2d.
(I don't know how to extract these results without a debugger.)

It works for this example because the default regressor is used, but other models might need the second dimension of y, because it was specifically reshaped with (-1, 1).

input | internal | output
2d -> 2d -> 1d
1d -> 2d -> 1d

The first row is the problem: the regressor passed to TransformedTargetRegressor fails if it expects a 2d array, which is why a 2d y was given in the first place.
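
One way to see what the inner regressor receives without a debugger is to pass a small regressor that records the shape of the y it is fitted on. The sketch below assumes a hypothetical helper class, `ShapeRecorder`, that is not part of scikit-learn; it only relies on `TransformedTargetRegressor` cloning the regressor and exposing the fitted copy as `regressor_`. Based on the squeeze quoted earlier, the 2d case would be expected to report a 1d shape on the version listed in this report, but that depends on the installed version, so no output is claimed here.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.compose import TransformedTargetRegressor
from sklearn.preprocessing import MinMaxScaler


class ShapeRecorder(RegressorMixin, BaseEstimator):
    """Hypothetical helper: remembers the shape of the y passed to fit."""

    def fit(self, X, y):
        self.y_shape_ = np.shape(y)      # what the wrapper handed over
        self.mean_ = np.mean(y, axis=0)  # trivial constant "model"
        return self

    def predict(self, X):
        # Predict the training mean, matching the dimensionality of y.
        return np.full((len(X),) + np.shape(self.mean_), self.mean_)


def seen_y_shape(y):
    """Fit the wrapper on random X and return the shape the regressor saw."""
    X = np.random.rand(len(y), 10)
    wrapper = TransformedTargetRegressor(
        regressor=ShapeRecorder(), transformer=MinMaxScaler()
    )
    wrapper.fit(X, y)
    return wrapper.regressor_.y_shape_


print("1d y ->", seen_y_shape(np.random.rand(10)))
print("2d y ->", seen_y_shape(np.random.rand(10, 1)))
```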

Versions

System:
    python: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]
executable: /anaconda/envs/azureml_py310_sdkv2/bin/python
   machine: Linux-5.15.0-1017-azure-x86_64-with-glibc2.31

Python dependencies:
      sklearn: 1.1.3
          pip: 22.1.2
   setuptools: 61.2.0
        numpy: 1.23.2
        scipy: 1.9.0
       Cython: None
       pandas: 1.4.3
   matplotlib: 3.6.2
       joblib: 1.2.0
threadpoolctl: 3.1.0

Labels: Bug, good first issue (easy with clear instructions to resolve)
