Commit 0888111

Make standardization of input optional in mlab.PCA
In principal component analysis, standardization of input data should be a user choice, depending on the problem at hand. The current version always standardizes, without offering a choice to the user. Therefore, I added an option to the PCA class where the user can specify whether the input data are to be standardized before performing the PCA. When *standardize* is set to False, only centering is performed, without dividing by sigma. The default is to standardize, to remain compatible with the current version.
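The two preprocessing modes described above can be sketched in plain NumPy. This is only an illustration, not the `mlab` code itself: the standalone `center` function and the toy array are assumptions made for the example, mirroring the behaviour the commit message describes.

```python
import numpy as np


def center(x, mu, sigma, standardize=True):
    """Subtract mu; divide by sigma only when standardize is True.

    Hypothetical standalone helper mirroring the behaviour described
    in the commit message; it is not part of matplotlib.
    """
    if standardize:
        return (x - mu) / sigma
    return x - mu


# Toy data: 4 observations x 2 dimensions on very different scales.
a = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0],
              [4.0, 700.0]])
mu = a.mean(axis=0)
sigma = a.std(axis=0)

standardized = center(a, mu, sigma, standardize=True)
centered_only = center(a, mu, sigma, standardize=False)
```

With `standardize=True` every column ends up with zero mean and unit standard deviation; with `standardize=False` the columns have zero mean but keep their original spread, so the large-scale column continues to dominate.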
1 parent fcdb600 commit 0888111

1 file changed: +9 -3 lines changed


lib/matplotlib/mlab.py

Lines changed: 9 additions & 3 deletions
@@ -1653,14 +1653,16 @@ def prepca(P, frac=0):
 
 
 class PCA:
-    def __init__(self, a):
+    def __init__(self, a, standardize=True):
         """
         compute the SVD of a and store data for PCA.  Use project to
         project the data onto a reduced set of dimensions
 
         Inputs:
 
           *a*: a numobservations x numdims array
+          *standardize*: True if input data are to be standardized. If False, only centering will be
+          carried out.
 
         Attrs:
 
@@ -1694,6 +1696,7 @@ def __init__(self, a):
         self.numrows, self.numcols = n, m
         self.mu = a.mean(axis=0)
         self.sigma = a.std(axis=0)
+        self.standardize = standardize
 
         a = self.center(a)
 
@@ -1745,8 +1748,11 @@ def project(self, x, minfrac=0.):
 
 
     def center(self, x):
-        'center the data using the mean and sigma from training set a'
-        return (x - self.mu)/self.sigma
+        'center and optionally standardize the data using the mean and sigma from training set a'
+        if self.standardize:
+            return (x - self.mu)/self.sigma
+        else:
+            return (x - self.mu)
 
 
 
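To see why the choice matters for the SVD step mentioned in the docstring, the following sketch compares the fraction of variance captured by each principal axis with and without standardization. It uses `numpy.linalg.svd` directly and does not call `mlab.PCA`; the function name `variance_fractions` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def variance_fractions(a, standardize=True):
    """Return the fraction of total variance along each principal axis.

    Hypothetical helper: preprocesses the data the way the patched
    center() does, then takes the singular values of the result.
    """
    mu = a.mean(axis=0)
    sigma = a.std(axis=0)
    x = (a - mu) / sigma if standardize else (a - mu)
    # Singular values only; squared, they are proportional to variances.
    s = np.linalg.svd(x, full_matrices=False, compute_uv=False)
    var = s ** 2
    return var / var.sum()


rng = np.random.default_rng(0)
# Two independent columns; the second is on a 100x larger scale.
a = np.column_stack([rng.normal(size=200), 100.0 * rng.normal(size=200)])

frac_std = variance_fractions(a, standardize=True)
frac_raw = variance_fractions(a, standardize=False)
```

Without standardization the large-scale column accounts for almost all of the variance, so the first axis dominates; with standardization the two columns contribute roughly equally. Which behaviour is appropriate depends on whether the columns share comparable units.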
