[MRG] Added inverse_transform for pls base object and test #15304

jiwidi · 2019-10-20T14:47:50Z

Reference Issues/PRs

This PR is the same as #15289 but with squeezed commits since I failed with docstring syntax.

What does this implement/fix? Explain your changes.

With this PR I'm adding a function inverse_transform to the base _PLS object and the consecuent test. This function is already implemented in other decomposition methods like PCA here.

The function transforms back to the original space by multiplying the transformed data with the x_loadings as X_original = np.matmul(X, self.x_loadings_.T) and denormalizes the result with the model std and mean.

This function allows you to transform back data to the original space (this transformation will only be exact if n_components=n_features). This transformation is widely used to compute the Squared Prediction Error of our _PLS model. This metric is famous for its use in industry scenarios where PLS acts as a statistical model to control processes where time takes a big act (papers on this: 1, 2 )

Following Sklearn _PLS example this is how the function should be used:

from sklearn.cross_decomposition import PLSRegression
X = [[0., 0., 1.], [1.,0.,0.], [2.,2.,2.], [2.,5.,4.]]
Y = [[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]]
pls2 = PLSRegression(n_components=2)
pls2.fit(X, Y)


t = [0., 0., 1.]
Y_pred = pls2.transform([t])
X_reconstructed = pls2.inverse_transform(Y_pred) # Value will be [0.02257852, 0.11391906, 0.87805722]

And a example to showcase the correctness of the function with n_components==n_features

from sklearn.cross_decomposition import PLSRegression
X = [[0., 0., 1.], [1.,0.,0.], [2.,2.,2.], [2.,5.,4.]]
Y = [[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]]
pls2 = PLSRegression(n_components=3)
pls2.fit(X, Y)


t = [0., 0., 1.]
Y_pred = pls2.transform([t])
X_reconstructed = pls2.inverse_transform(Y_pred) # Value will be [0, 0, 1]

Any other comments?

I have been developing software for multivariate statistical process control for some time now and Sklearn implementation of PLS has been widely used in this field. I always thought the _PLS was lacking from this method while PCA had it and decided to make a contribution for it :)

jiwidi · 2019-10-20T14:50:06Z

ping @NicolasHug @adrinjalali Since I have seen you guys both touch _pls files recently.

adrinjalali

Some nitpicks, otherwise looks good to me, thanks @jiwidi

sklearn/cross_decomposition/_pls_.py

adrinjalali · 2019-10-21T16:20:48Z

sklearn/cross_decomposition/_pls_.py

+        Notes
+        This transformation will only be exact if n_components=n_features
+        -----


Notes ----- This transformation will only be exact if n_components=n_features

adrinjalali · 2019-10-21T16:22:06Z

sklearn/cross_decomposition/_pls_.py

+        check_is_fitted(self)
+        X = check_array(X, copy=False, dtype=FLOAT_DTYPES)
+        # From pls space to original space
+        X_original = np.matmul(X, self.x_loadings_.T)


Suggested change

X_original = np.matmul(X, self.x_loadings_.T)

np.matmul(X, self.x_loadings_.T, out=X)

this way you can do it in place.

You should also add a test for this case.

So I came up with this:

def inverse_transform(self, X, copy=True): check_is_fitted(self) X = check_array(X, copy=copy, dtype=FLOAT_DTYPES) # From pls space to original space np.matmul(X, self.x_loadings_.T, out=X) # Denormalize X *= self.x_std_ X += self.x_mean_ return X

And the extra test:

# Check inplace usage plsca.inverse_transform(Xr,copy=False) assert_array_almost_equal(Xr, X, err_msg="inverse_transform failed")

I tested it and it works with both inplace. Just checking this is what you meant before commiting it :)

yeah this looks good to me, I think.

jiwidi · 2019-10-23T09:41:59Z

@adrinjalali newbie question here. All the last commits have failed to run the test tho I can not replicate the same error on local (they pass here). Is there any way to debug this in an easier way rather than just committing and make the test run every time? Thanks!

adrinjalali · 2019-10-23T15:53:48Z

The failing tests are all on python 3.5. You should create a similar environment and see why they fail. You can see the environment on top of the failing tests. For example:

##[section]Starting: Test Library
==============================================================================
Task         : Command line
Description  : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
Version      : 2.151.2
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
==============================================================================
Generating script.
Script contents:
build_tools/azure/test_script.sh
========================== Starting Command Output ===========================
[command]/bin/bash --noprofile --norc /home/vsts/work/_temp/d1d9cf55-3eb0-4aa7-af95-8ce64d06a52e.sh
Python 3.5.2
numpy 1.11.0
scipy 0.17.0
pandas not installed
2 CPUs
Package             Version                Location           
------------------- ---------------------- -------------------
atomicwrites        1.3.0                  
attrs               19.3.0                 
blinker             1.3                    
chardet             2.3.0                  
cloud-init          19.2                   
command-not-found   0.3                    
configobj           5.0.6                  
coverage            4.5.4                  
cryptography        1.2.3                  
cycler              0.9.0                  
Cython              0.29.13                
decorator           4.0.6                  
idna                2.0                    
importlib-metadata  0.23                   
Jinja2              2.8                    
joblib              0.11                   
jsonpatch           1.10                   
jsonpointer         1.9                    
language-selector   0.1                    
MarkupSafe          0.23                   
matplotlib          1.5.1                  
more-itertools      7.2.0                  
numpy               1.11.0                 
oauthlib            1.0.3                  
pathlib2            2.3.5                  
Pillow              3.1.2                  
pip                 19.3.1                 
pkg-resources       0.0.0                  
pluggy              0.13.0                 
prettytable         0.7.2                  
py                  1.8.0                  
pyasn1              0.1.9                  
pycurl              7.43.0                 
pygobject           3.20.0                 
PyJWT               1.3.0                  
pyparsing           2.0.3                  
pyserial            3.0.1                  
pytest              3.8.1                  
pytest-cov          2.8.1                  
python-apt          1.1.0b1+ubuntu0.16.4.5 
python-dateutil     2.4.2                  
python-debian       0.1.27                 
python-systemd      231                    
pytz                2014.10                
PyYAML              3.11                   
requests            2.9.1                  
scikit-learn        0.22.dev0              /home/vsts/work/1/s
scipy               0.17.0                 
setuptools          41.4.0                 
six                 1.10.0                 
ssh-import-id       5.5                    
ufw                 0.35                   
unattended-upgrades 0.1                    
urllib3             1.13.1                 
virtualenv          15.0.1                 
WALinuxAgent        2.2.40                 
wheel               0.33.6                 
zipp                0.6.0

jiwidi · 2019-10-24T09:24:45Z

@adrinjalali I have been thinking about the inplace support for this function. It's a different behavior from the transform() inplace, where if you select copy=False the only change reflected to X would be the normalization:

def transform(self, X, Y=None, copy=True):
        """Apply the dimension reduction learned on the train data.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training vectors, where n_samples is the number of samples and
            n_features is the number of predictors.

        Y : array-like of shape (n_samples, n_targets)
            Target vectors, where n_samples is the number of samples and
            n_targets is the number of response variables.

        copy : boolean, default True
            Whether to copy X and Y, or perform in-place normalization.

        Returns
        -------
        x_scores if Y is not given, (x_scores, y_scores) otherwise.
        """
        check_is_fitted(self)
        X = check_array(X, copy=copy, dtype=FLOAT_DTYPES)
        # Normalize
        X -= self.x_mean_
        X /= self.x_std_
        # Apply rotation
        x_scores = np.dot(X, self.x_rotations_)
        if Y is not None:
            Y = check_array(Y, ensure_2d=False, copy=copy, dtype=FLOAT_DTYPES)
            if Y.ndim == 1:
                Y = Y.reshape(-1, 1)
            Y -= self.y_mean_
            Y /= self.y_std_
            y_scores = np.dot(Y, self.y_rotations_)
            return x_scores, y_scores

        return x_scores

In inverse_transform the denormalization is computed on the matrix result and not in the X parameter.

    def inverse_transform(self, X, copy=True):
        """Transform data back to its original space.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_components)
            New data, where n_samples is the number of samples
            and n_components is the number of pls components.
        copy : bool, default=True
            Whether to copy X, or perform in-place normalization.

        Returns
        -------
        X_original array-like of shape (n_samples, n_features)

        Notes
        -----
        This transformation will only be exact if n_components=n_features
        """
        check_is_fitted(self)
        X = check_array(X, copy=copy, dtype=FLOAT_DTYPES)
        # From pls space to original space
        np.matmul(X, self.x_loadings_.T, out=X)

        # Denormalize
        X *= self.x_std_
        X += self.x_mean_
        return X

And by contrast in PCA inverse_transform they seem to not support the inplace functionality.

It doesn't feel natural to support inplace functionality here if it isn't supported in other decomposition methods. What do you think? Or you see value to this inplace feature? (I'll be happy to work in a PR to add it for PCA).

Coming back to the tests, I was able to reproduce the failure test. This is happening because np.matmul(X, self.x_loadings_.T, out=X) is following the known error with numpy==1.11.0 where if the out variable is the same as input it will return a zero matrix (tested by myself).

With numpy==1.17 the test pass as this bug was fixed. Is there a specific reason why numpy==1.11 is being used? Should I push for a solution that solves that bug or ignore the inplace feature?

I believe to have found a solution to bypass this bug and it would be by just hard copying X.
Rewrite the matmul: to np.matmul(X.copy(), self.x_loadings_.T, out=X)

And thanks again for the feedback, I'm learning some new stuff with this PR :)

adrinjalali · 2019-10-24T14:35:22Z

Numpy 1.11 is the oldest numpy we support. We'll be dropping support for that version soon, but not yet.

As you suggest, I recommend you remove the copy parameter, and always create a new array then. It would also avoid the issue, since you won't be using the out parameter anyway.

jiwidi · 2019-10-24T14:48:57Z

Numpy 1.11 is the oldest numpy we support. We'll be dropping support for that version soon, but not yet.

As you suggest, I recommend you remove the copy parameter, and always create a new array then. It would also avoid the issue, since you won't be using the out parameter anyway.

This will mean dropping inplace support as well. There will be no reason to keep the copy parameter. You okay with this?

adrinjalali · 2019-10-24T14:53:10Z

sklearn/cross_decomposition/_pls_.py

        """
        check_is_fitted(self)
-        X = check_array(X, copy=copy, dtype=FLOAT_DTYPES)
+        X = check_array(X, copy=True, dtype=FLOAT_DTYPES)


you don't need to copy here. Just leave the copy as default.

Does it make sense to accept the copy parameter even if it will not affect the parameter X? I was thinking on removing copy from the function arg list and only accept X (and making it copy=True x default)

Wether copy is true or false will not affect the execution or trace it leaves on the parameters.

yes that's what I meant.

jiwidi · 2019-10-25T08:27:33Z

@adrinjalali Last commit should be ready to merge version I believe

adrinjalali

Otherwise LGTM. You also need a whats_new entry for this one.

sklearn/cross_decomposition/_pls_.py

Co-Authored-By: Adrin Jalali <[email protected]>

jiwidi · 2019-10-25T12:50:35Z

Otherwise LGTM. You also need a whats_new entry for this one.

You mean a whats_new entry on the PR text? Doesn't What does this implement/fix? Explain your changes. Refer to that?

adrinjalali · 2019-10-25T13:25:21Z

Sorry for not being clear in my previous comment. I meant an entry in the v0.22.rst file like the other ones.

…n into pls_inverse-transform

jiwidi · 2019-10-25T14:25:08Z

Sorry for not being clear in my previous comment. I meant an entry in the v0.22.rst file like the other ones.

no problem. It should ready to merge now :)

adrinjalali · 2019-10-25T14:26:52Z

doc/whats_new/v0.22.rst

+- |Feature| :class:`cross_decomposition._PLS` Has a new function :func:`cross_decomposition._PLS.inverse_transform` to transform 
+  data to the original space`. :pr:`15304` by :user:`Jaime Ferrando Huertas <jiwidi>`.
+


Please keep the line length <80.

Also, we should point to the public API here, i.e. list the classes which inherit from _PLS instead.

something like this maybe?

- |Feature| :class:`cross_decomposition.PLSCanonical` and :class:`cross_decomposition.PLSRegression` have a new function `inverse_transform` :func:`cross_decomposition.PLSCanonical.inverse_transform` :func:`cross_decomposition.PLSRegression.inverse_transform` to transform data to the original space`. :pr:`15304` by :user:`Jaime Ferrando Huertas <jiwidi>`.

Yeah, I'd do something like:

- |Feature| :class:`cross_decomposition.PLSCanonical` and :class:`cross_decomposition.PLSRegression` have a new function ``inverse_transform`` to transform data to the original space`. :pr:`15304` by :user:`Jaime Ferrando Huertas <jiwidi>`.

adrinjalali

Thanks @jiwidi for your prompt responses. Now we need to wait for another maintainer to do a second review, and hopefully it'll get in soon :)

jiwidi · 2019-10-25T15:21:55Z

Thanks @jiwidi for your prompt responses. Now we need to wait for another maintainer to do a second review, and hopefully it'll get in soon :)

Thank you for all the feedback! :)

jnothman

Otherwise LGTM!

jnothman · 2019-10-27T22:03:44Z

sklearn/cross_decomposition/_pls_.py

+
+        Returns
+        -------
+        x_reconstructed array-like of shape (n_samples, n_features)


Suggested change

x_reconstructed array-like of shape (n_samples, n_features)

x_reconstructed : array-like of shape (n_samples, n_features)

You removed this colon...

my bad, fixed.

Co-Authored-By: Joel Nothman <[email protected]>

…rse-transform

…n into pls_inverse-transform

jnothman

I don't think reverting the inclusion of the colon will fix CI... It seems something is wrong with installation at the moment, not your fault

jnothman · 2019-10-28T09:52:32Z

sklearn/utils/_unittest_backport.py

@@ -0,0 +1,224 @@
+"""
+This is a backport of assertRaises() and assertRaisesRegex from Python 3.5.4


Why has this been added?

This is my fault, I pulled from origin and messed things up... Tha's why I tried reverting to the commit before all of this happened

I removed the files now, sorry for the mess

jnothman · 2019-10-28T11:18:26Z

sklearn/cross_decomposition/_pls_.py

+
+        Returns
+        -------
+        x_reconstructed array-like of shape (n_samples, n_features)


You removed this colon...

Added inverse_Transform for pls base object and test

82012ff

jiwidi changed the title ~~Added inverse_Transform for pls base object and test~~ [WIP] Added inverse_Transform for pls base object and test Oct 20, 2019

adrinjalali reviewed Oct 21, 2019

View reviewed changes

Update docstring

7aede4b

adrinjalali reviewed Oct 21, 2019

View reviewed changes

Jaime Ferrando Huertas added 4 commits October 23, 2019 07:54

Added inplace for pls inverse_transform and test

7a9bb59

Update test

c30685b

Update pls for inplace and new test

0c77e96

update pls inverse_Transform

bf20b44

Update inverse_transform to support np 1.11

cb6c143

removed pls inplace execution and test

8f6edde

adrinjalali reviewed Oct 24, 2019

View reviewed changes

Removed inplace for pls inverse_transform and consequent test

af9a0f1

adrinjalali reviewed Oct 25, 2019

View reviewed changes

sklearn/cross_decomposition/_pls_.py Outdated Show resolved Hide resolved

adrinjalali changed the title ~~[WIP] Added inverse_Transform for pls base object and test~~ Added inverse_Transform for pls base object and test Oct 25, 2019

Remove unnecessary parameter on pls_.py inverse_transform

3f6c7bc

Co-Authored-By: Adrin Jalali <[email protected]>

Jaime Ferrando Huertas added 2 commits October 25, 2019 14:00

Add _PLS.inverse_transform as feature for v0.22

7dc4c6e

Merge branch 'pls_inverse-transform' of github.com:jiwidi/scikit-lear…

80521c1

…n into pls_inverse-transform

adrinjalali reviewed Oct 25, 2019

View reviewed changes

Add _PLS.inverse_transform as feature for v0.22

a750bab

adrinjalali approved these changes Oct 25, 2019

View reviewed changes

adrinjalali changed the title ~~Added inverse_Transform for pls base object and test~~ [MRG] Added inverse_Transform for pls base object and test Oct 25, 2019

jnothman approved these changes Oct 27, 2019

View reviewed changes

jnothman changed the title ~~[MRG] Added inverse_Transform for pls base object and test~~ [MRG] Added inverse_transform for pls base object and test Oct 27, 2019

jiwidi and others added 4 commits October 27, 2019 23:34

update pls inverse_transform docstring

6827b10

Co-Authored-By: Joel Nothman <[email protected]>

Merge branch 'master' of github.com:jiwidi/scikit-learn into pls_inve…

298d4d0

…rse-transform

Merge branch 'pls_inverse-transform' of github.com:jiwidi/scikit-lear…

a3934b2

…n into pls_inverse-transform

Revert to a750bab

46b3423

jnothman reviewed Oct 28, 2019

View reviewed changes

Removed a modified file from pull request

94c25d4

jnothman requested changes Oct 28, 2019

View reviewed changes

update inverse_transform docstring

82f070a

jiwidi requested a review from jnothman October 28, 2019 11:28

jnothman approved these changes Oct 28, 2019

View reviewed changes

jnothman merged commit 89b15e1 into scikit-learn:master Oct 28, 2019

thomasjpfan mentioned this pull request Apr 23, 2022

Implement inverse_transform method for cross_decomposition ? #6439

Closed

	X_original = np.matmul(X, self.x_loadings_.T)
	np.matmul(X, self.x_loadings_.T, out=X)

		- \|Feature\| :class:`cross_decomposition._PLS` Has a new function :func:`cross_decomposition._PLS.inverse_transform` to transform
		data to the original space`. :pr:`15304` by :user:`Jaime Ferrando Huertas <jiwidi>`.

	x_reconstructed array-like of shape (n_samples, n_features)
	x_reconstructed : array-like of shape (n_samples, n_features)

		@@ -0,0 +1,224 @@
		"""
		This is a backport of assertRaises() and assertRaisesRegex from Python 3.5.4

Uh oh!

[MRG] Added inverse_transform for pls base object and test #15304

[MRG] Added inverse_transform for pls base object and test #15304

Uh oh!

Conversation

jiwidi commented Oct 20, 2019

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

Uh oh!

jiwidi commented Oct 20, 2019

Uh oh!

adrinjalali left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jiwidi commented Oct 23, 2019

Uh oh!

adrinjalali commented Oct 23, 2019

Uh oh!

jiwidi commented Oct 24, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adrinjalali commented Oct 24, 2019

Uh oh!

jiwidi commented Oct 24, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jiwidi commented Oct 25, 2019

Uh oh!

adrinjalali left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jiwidi commented Oct 25, 2019

Uh oh!

adrinjalali commented Oct 25, 2019

Uh oh!

jiwidi commented Oct 25, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

adrinjalali left a comment

Choose a reason for hiding this comment

Uh oh!

jiwidi commented Oct 25, 2019

Uh oh!

jnothman left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jiwidi commented Oct 24, 2019 •

edited

Loading

jiwidi commented Oct 24, 2019 •

edited

Loading

jiwidi commented Oct 25, 2019 •

edited

Loading