IncrementalPCA.partial_fit doesn't use float division in python 2 #9489

Closed
@kaqqy

Description

The partial_fit method in IncrementalPCA performs integer division instead of float division on this line:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/incremental_pca.py#L249
Under Python 2, `/` between two integers truncates the result, so IncrementalPCA produces incorrect output.

np.sqrt((self.n_samples_seen_ * n_samples) /

should be changed to

np.sqrt(float(self.n_samples_seen_ * n_samples) /
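To illustrate how much the truncation matters, here is a minimal sketch of the arithmetic in isolation. It is not the scikit-learn source: the function names are hypothetical, and the denominator is assumed to be the total number of samples seen so far (the quoted line is cut off before the divisor). The sample counts match the reproduction below (2 rows in the first batch, 3 in the second). Floor division (`//`) is used to emulate what `/` does on two integers in Python 2, so the sketch behaves the same on either interpreter.

```python
import math


def scale_with_int_division(n_seen, n_batch, n_total):
    # Hypothetical helper: `//` reproduces Python 2's `/` on two ints,
    # which discards the fractional part before the square root.
    return math.sqrt((n_seen * n_batch) // n_total)


def scale_with_float_division(n_seen, n_batch, n_total):
    # Hypothetical helper: casting the numerator to float, as proposed
    # above, forces true division on both Python 2 and Python 3.
    return math.sqrt(float(n_seen * n_batch) / n_total)


n_seen, n_batch = 2, 3          # rows in A, rows in B from the repro
n_total = n_seen + n_batch      # assumed divisor: total samples seen

truncated = scale_with_int_division(n_seen, n_batch, n_total)
exact = scale_with_float_division(n_seen, n_batch, n_total)

print("integer division:", truncated)
print("float division:  ", exact)
```

With these counts the integer version computes sqrt(6 // 5) = sqrt(1), while the float version computes sqrt(1.2), so the two scale factors differ. Another common way to get true division in Python 2 is to put `from __future__ import division` at the top of the module, which changes the behavior of `/` for the whole file rather than for one expression.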

Steps/Code to Reproduce

Running the following code under Python 2 and Python 3 gives different outputs.

from __future__ import print_function
import numpy as np
from sklearn.decomposition import IncrementalPCA

A = np.array([[1, 2, 4], [5, 3, 6]])
B = np.array([[6, 7, 3], [5, 2, 1], [3, 5, 6]])
C = np.array([[3, 2, 1]])

ipca = IncrementalPCA(n_components=2)
ipca.partial_fit(A)
ipca.partial_fit(B)
print(ipca.transform(C))

Expected Results

Python 3 output:
[[-1.48864923 -3.15618645]]

Actual Results

Python 2 output:
[[-1.9943712 -2.86487266]]

Versions

Linux-4.10.0-27-generic-x86_64-with-Ubuntu-16.04-xenial
('Python', '2.7.12 (default, Nov 19 2016, 06:48:10) \n[GCC 5.4.0 20160609]')
('NumPy', '1.13.1')
('SciPy', '0.19.1')
('Scikit-Learn', '0.18.2')
