Datanormalization Details

data.Normalization (normalization) details

Uploaded by

tadeuszlabuz78

data.Normalization (clusterSim)

Types of variable normalization formulas

A. Variable (column) normalization


Variable (column) normalization can be applied to any data matrix.
Step 1. Selection of objects and variables: data matrix $[x_{ij}]$

Step 2. Selection of variable normalization formula, depending on the variable scale level:

  Variable scale level   Normalization formulas                           Transformed variable scale level
  Ratio                  n6, n6a, n7, n8, n9, n9a, n10, n11               Ratio
  Ratio                  n1, n2, n3, n3a, n4, n5, n5a, n12, n12a, n13     Interval
  Interval               n1, n2, n3, n3a, n4, n5, n5a, n12, n12a, n13     Interval

Formula names:
  n1 – standardization
  n2 – positional standardization
  n3 – unitization
  n3a – positional unitization
  n4 – unitization with zero minimum
  n5 – normalization in range [–1, 1]
  n5a – positional normalization in range [–1, 1]
  n6 – quotient transformation
  n6a – positional quotient transformation
  n7 – quotient transformation
  n8 – quotient transformation
  n9 – quotient transformation
  n9a – positional quotient transformation
  n10 – quotient transformation
  n11 – quotient transformation
  n12 – normalization
  n12a – positional normalization
  n13 – normalization with zero being the central point
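The scale-level rules in the table above can be expressed as a small lookup. The following is only an illustrative sketch in Python: the names `QUOTIENT_FORMULAS`, `OTHER_FORMULAS` and `transformed_scale` are hypothetical and are not part of clusterSim.

```python
# Hypothetical helper, not part of clusterSim: encodes the scale-level table.
QUOTIENT_FORMULAS = ("n6", "n6a", "n7", "n8", "n9", "n9a", "n10", "n11")
OTHER_FORMULAS = ("n1", "n2", "n3", "n3a", "n4", "n5", "n5a",
                  "n12", "n12a", "n13")


def transformed_scale(input_scale, formula):
    """Return the scale level of a variable after normalization.

    input_scale is "ratio" or "interval"; formula is a code such as "n1".
    Raises ValueError for combinations the table does not list.
    """
    if input_scale not in ("ratio", "interval"):
        raise ValueError(f"unknown scale level: {input_scale!r}")
    if formula in QUOTIENT_FORMULAS:
        # Quotient transformations are listed only for ratio-scale input
        # and keep the ratio scale.
        if input_scale != "ratio":
            raise ValueError(f"{formula} is listed only for ratio-scale variables")
        return "ratio"
    if formula in OTHER_FORMULAS:
        # All remaining formulas yield an interval-scale variable.
        return "interval"
    raise ValueError(f"unknown formula: {formula!r}")
```

For example, `transformed_scale("ratio", "n1")` reports that a standardized ratio-scale variable is treated as interval-scaled, while `transformed_scale("ratio", "n8")` keeps the ratio scale.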

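(n1) $z_{ij} = (x_{ij} - \bar{x}_j) / s_j$
(n2) $z_{ij} = (x_{ij} - med_j) / mad_j$
(n3) $z_{ij} = (x_{ij} - \bar{x}_j) / r_j$
(n3a) $z_{ij} = (x_{ij} - med_j) / r_j$
(n4) $z_{ij} = [x_{ij} - \min_i \{x_{ij}\}] / r_j$
(n5) $z_{ij} = (x_{ij} - \bar{x}_j) / \max_i |x_{ij} - \bar{x}_j|$
(n5a) $z_{ij} = (x_{ij} - med_j) / \max_i |x_{ij} - med_j|$
(n6) $z_{ij} = x_{ij} / s_j$
(n6a) $z_{ij} = x_{ij} / mad_j$
(n7) $z_{ij} = x_{ij} / r_j$
(n8) $z_{ij} = x_{ij} / \max_i \{x_{ij}\}$
(n9) $z_{ij} = x_{ij} / \bar{x}_j$
(n9a) $z_{ij} = x_{ij} / med_j$
(n10) $z_{ij} = x_{ij} / \sum_{i=1}^{n} x_{ij}$
(n11) $z_{ij} = x_{ij} / \sqrt{\sum_{i=1}^{n} x_{ij}^2}$
(n12) $z_{ij} = (x_{ij} - \bar{x}_j) / \sqrt{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}$
(n12a) $z_{ij} = (x_{ij} - med_j) / \sqrt{\sum_{i=1}^{n} (x_{ij} - med_j)^2}$
(n13)¹ $z_{ij} = (x_{ij} - m_j) / (r_j / 2)$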

where: $x_{ij}$ ($z_{ij}$) – i-th observation on j-th variable (i-th normalized observation on j-th variable),
$\bar{x}_j$ ($s_j$) – mean (standard deviation) for j-th variable,
$med_j = med_i (x_{ij})$ – median for j-th variable,
$mad_j = mad_i (x_{ij})$ – median absolute deviation for j-th variable,
$r_j = \max_i \{x_{ij}\} - \min_i \{x_{ij}\}$ – range for j-th variable,
$m_j = (\max_i \{x_{ij}\} + \min_i \{x_{ij}\}) / 2$ – mid-range for j-th variable.
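Every formula above has the form $z_{ij} = (x_{ij} - A_j) / B_j$, where $A_j$ is a location value (zero for the quotient transformations) and $B_j$ is a scale value. The sketch below uses that observation to implement all of them for a single column in plain Python. It is an illustration only, not the clusterSim implementation: the function name `normalize_column` is hypothetical, the standard deviation is assumed to be the sample version (divisor n - 1), and the median absolute deviation is assumed to be unscaled (no consistency constant), since the text above does not specify either.

```python
import math
import statistics


def normalize_column(col, kind):
    """Normalize one variable (column) as z = (x - center) / scale."""
    col = [float(x) for x in col]
    mean = statistics.fmean(col)
    med = statistics.median(col)
    lo, hi = min(col), max(col)
    r = hi - lo                      # range r_j
    mid = (hi + lo) / 2              # mid-range m_j
    sd = statistics.stdev(col)       # assumption: sample SD (divisor n - 1)
    # assumption: raw median of absolute deviations, no scaling constant
    mad = statistics.median([abs(x - med) for x in col])

    # formula code -> (center A_j, scale B_j)
    params = {
        "n1": (mean, sd),
        "n2": (med, mad),
        "n3": (mean, r),
        "n3a": (med, r),
        "n4": (lo, r),
        "n5": (mean, max(abs(x - mean) for x in col)),
        "n5a": (med, max(abs(x - med) for x in col)),
        "n6": (0.0, sd),
        "n6a": (0.0, mad),
        "n7": (0.0, r),
        "n8": (0.0, hi),
        "n9": (0.0, mean),
        "n9a": (0.0, med),
        "n10": (0.0, sum(col)),
        "n11": (0.0, math.sqrt(sum(x * x for x in col))),
        "n12": (mean, math.sqrt(sum((x - mean) ** 2 for x in col))),
        "n12a": (med, math.sqrt(sum((x - med) ** 2 for x in col))),
        "n13": (mid, r / 2),
    }
    if kind not in params:
        raise ValueError(f"unknown formula: {kind!r}")
    center, scale = params[kind]
    # A zero scale (e.g. a constant column) raises ZeroDivisionError here.
    return [(x - center) / scale for x in col]
```

With the column `[1, 2, 3, 4, 10]`, `normalize_column(col, "n4")` maps the minimum to 0 and the maximum to 1, and `normalize_column(col, "n13")` maps them to -1 and 1. The column needs at least two observations, because the sample standard deviation is computed up front.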

B. Object (row) normalization


The same normalization formulas can be applied to objects (rows) as to variables (columns). Object (row) normalization makes sense only when all variables are expressed in the same unit. This is often the case, for instance, with structural data.
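As a minimal sketch of row normalization, the Python example below applies the quotient transformation n10 (division by the sum) to each object (row) of a small matrix. The data and the names `quotient_by_sum` and `normalize_rows` are hypothetical, chosen only to illustrate structural data measured in one unit.

```python
def quotient_by_sum(values):
    """Formula n10 applied to one vector: each value divided by the total."""
    total = sum(values)
    return [v / total for v in values]


def normalize_rows(matrix, normalize):
    """Apply a one-vector normalization function to every object (row)."""
    return [normalize(list(row)) for row in matrix]


# Hypothetical structural data: spending of two objects in three categories,
# all expressed in the same currency unit.
spending = [
    [200.0, 300.0, 500.0],
    [50.0, 25.0, 25.0],
]

shares = normalize_rows(spending, quotient_by_sum)
```

Each row of `shares` then sums to 1, so objects with very different totals become comparable through their structure. Applying the same function column-wise would instead require the columns, for example via `zip(*spending)`.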

¹ http://www.benetzkorn.com/2011/11/data-normalization-and-standardization/ (1.06.2014).
