[MRG+1] Ensures that partial_fit for sklearn.decomposition.IncrementalPCA uses float division #9492

24 changes: 23 additions & 1 deletion doc/whats_new.rst
@@ -11,6 +11,18 @@ Version 0.20 (under development)
Changed models
--------------

The following estimators and functions, when fit with the same data and
parameters, may produce different models from the previous version. This often
occurs due to changes in the modelling logic (bug fixes or enhancements), or in
random sampling procedures.

- :class:`decomposition.IncrementalPCA` in Python 2 (bug fix)

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we
cannot assure that this list is complete.)

Changelog
---------

@@ -24,6 +36,16 @@ Classifiers and regressors
  via ``n_iter_no_change``, ``validation_fraction`` and ``tol``. :issue:`7071`
  by `Raghav RV`_

Bug fixes
.........

Decomposition, manifold learning and clustering

- Fixed a bug where the ``partial_fit`` method of
  :class:`decomposition.IncrementalPCA` used integer division instead of float
  division on Python 2 versions. :issue:`9492` by
  :user:`James Bourbeau <jrbourbeau>`.


Version 0.19
============
@@ -160,7 +182,7 @@ Model selection and evaluation
:issue:`8120` by `Neeraj Gangwar`_.

- Added a scorer based on :class:`metrics.explained_variance_score`.
  :issue:`9259` by `Hanmin Qin <https://github.com/qinhanmin2014>`_.

Miscellaneous

1 change: 1 addition & 0 deletions sklearn/decomposition/incremental_pca.py
@@ -4,6 +4,7 @@
# Giorgio Patrini
# License: BSD 3 clause

from __future__ import division
import numpy as np
from scipy import linalg
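
The single added import changes what `/` means for integer operands on Python 2: without it the quotient is floored, with it the quotient is a float. The snippet below is a minimal sketch of that difference, not code from the pull request. It runs on Python 3, where `/` is already true division, and uses `//` to mimic the old Python 2 behaviour; the variable names are hypothetical.

```python
# Illustrative values only: counts of samples seen so far and in a new batch.
n_samples_seen = 5
n_new_samples = 7
n_total = n_samples_seen + n_new_samples

# True division: what `/` does on Python 3, and on Python 2 once
# `from __future__ import division` is in effect for the module.
true_ratio = n_samples_seen / n_total

# Floor division: what `/` did for two ints on Python 2 without that import.
floored_ratio = n_samples_seen // n_total

print(true_ratio)     # a fraction strictly between 0 and 1
print(floored_ratio)  # the fractional part is discarded
```

Any weight computed as a ratio of integer sample counts would collapse in the floored case, which is the kind of silent error the import guards against.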

24 changes: 24 additions & 0 deletions sklearn/decomposition/tests/test_incremental_pca.py
@@ -273,3 +273,27 @@ def test_whitening():
assert_almost_equal(X, Xinv_ipca, decimal=prec)
assert_almost_equal(X, Xinv_pca, decimal=prec)
assert_almost_equal(Xinv_pca, Xinv_ipca, decimal=prec)


def test_incremental_pca_partial_fit_float_division():
    # Test to ensure float division is used in all versions of Python
    # (non-regression test for issue #9489)

    rng = np.random.RandomState(0)
    A = rng.randn(5, 3) + 2
    B = rng.randn(7, 3) + 5

    pca = IncrementalPCA(n_components=2)
    pca.partial_fit(A)
    # Set n_samples_seen_ to be a floating point number instead of an int
    pca.n_samples_seen_ = float(pca.n_samples_seen_)
    pca.partial_fit(B)
    singular_vals_float_samples_seen = pca.singular_values_

    pca2 = IncrementalPCA(n_components=2)
    pca2.partial_fit(A)
    pca2.partial_fit(B)
    singular_vals_int_samples_seen = pca2.singular_values_

    np.testing.assert_allclose(singular_vals_float_samples_seen,
                               singular_vals_int_samples_seen)
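
The test relies on one idea: if true division is used throughout, an integer sample count and the same count stored as a float must give identical results. The sketch below illustrates that idea on a simpler quantity, a running mean merged across two batches. It is an assumption-laden illustration, not scikit-learn's implementation; the diff does not show which expression in `partial_fit` was affected, and `combine_means` is a hypothetical helper.

```python
import math


def combine_means(mean_a, n_a, mean_b, n_b, true_division=True):
    """Hypothetical helper: merge two batch means, weighted by sample counts."""
    total = n_a + n_b
    if true_division:
        w_a, w_b = n_a / total, n_b / total
    else:
        # Mimics Python 2 `/` on two ints: the weights are floored.
        w_a, w_b = n_a // total, n_b // total
    return w_a * mean_a + w_b * mean_b


batch_a = [1.0, 2.0, 3.0, 4.0, 6.0]              # 5 samples
batch_b = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]   # 7 samples

mean_a = sum(batch_a) / len(batch_a)
mean_b = sum(batch_b) / len(batch_b)
expected = sum(batch_a + batch_b) / (len(batch_a) + len(batch_b))

# Integer counts and float counts agree when true division is used.
merged_int = combine_means(mean_a, 5, mean_b, 7)
merged_float = combine_means(mean_a, 5.0, mean_b, 7.0)

# Integer counts with floored weights lose both contributions entirely.
merged_floored = combine_means(mean_a, 5, mean_b, 7, true_division=False)

print(merged_int, merged_float, merged_floored)
```

Comparing the int-count and float-count paths, as the test above does with `singular_values_`, catches the bug without hard-coding any expected numbers.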