0% found this document useful (0 votes)

5 views27 pages

Analysis of Data

Chapter 4 focuses on multivariate analysis techniques, outlining key concepts and methods such as multiple regression analysis, discriminant analysis, factor analysis, and ANOVA. It emphasizes the importance of analyzing multiple statistical outcome variables simultaneously to understand complex data relationships. The chapter also discusses the classification of multivariate techniques into dependence and interdependence methods, along with practical applications and considerations for effective analysis.

Uploaded by

shweta26bhatia

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

5 views27 pages

Analysis of Data

Uploaded by

shweta26bhatia

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 27

Chapter

MULTIVARIATE ANALYSIS TECHNIQUES

Objectives
The objectives of this lesson are to:
z Concept of Multivariate Data Analysis
z Techniques of Multivariate Analysis
z Multiple Regression Analysis
z Discriminated Analysis
z Factor Analysis
z ANOVA
Structure:
4.1 Introduction
4.2 Multivariate Data Analysis
4.3 Multivariate Analysis Techniques
4.4 Multiple Regression Analysis
4.5 Discriminated Analysis
4.6 Factor Analysis
4.7 ANOVA
4.8 Summary
4.9 Self Assessment Questions

4.1 INTRODUCTION
Multivariate analysis is based in observation and analysis of more than one statistical outcome
variable at a time. In design and analysis, the technique is used to perform trade studies across multiple
dimensions while taking into account the effects of all variables on the responses of interest. The
development of multivariate methods emerged to analyze large databases and increasingly complex
data. Since the best way to represent the knowledge of reality is the modeling, we should use multivariate
statistical methods. Multivariate methods are designed to simultaneously analyze data sets, i.e., the
analysis of different variables for each person or object studied. Keep in mind at all times that all
variables must be treated accurately reflect the reality of the problem addressed. There are different
types of multivariate analysis and each one should be employed according to the type of variables to
analyze: dependent, interdependence and structural methods.
126 Research Methodology

Notes
4.2 MULTIVARIATE DATA ANALYSIS
Multivariate analysis (MVA) is based on the statistical principle of multivariate statistics, which
involves observation and analysis of more than one statistical outcome variable at a time. In design and
analysis, the technique is used to perform trade studies across multiple dimensions while taking into
account the effects of all variables on the responses of interest. Uses for multivariate analysis include:
i) Design for capability (also known as capability-based design).
ii) Inverse design, where any variable can be treated as an independent variable.
iii) Analysis of Alternatives (AoA), the selection of concepts to fulfill a customer need.
iv) Analysis of concepts with respect to changing scenarios.
v) Identification of critical design drivers and correlations across hierarchical levels.
Multivariate analysis can be complicated by the desire to include physics-based analysis to calculate
the effects of variables for a hierarchical "system-of-systems." Often, studies that wish to use multivariate
analysis are stalled by the dimensionality of the problem. These concerns are often eased through the
use of surrogate models, highly accurate approximations of the physics-based code. Since surrogate
models take the form of an equation, they can be evaluated very quickly. This becomes an enabler for
large-scale MVA studies: while a Monte Carlo simulation across the design space is difficult with physics-
based codes, it becomes trivial when evaluating surrogate models, which often take the form of response
surface equations.

4.3 MULTIVARIATE ANALYSIS TECHNIQUES

Multivariate analysis techniques which can be conveniently classified into two broad categories
viz., dependence methods and interdependence methods. This sort of classification depends upon the
question: Are some of the involved variables dependent upon others? If the answer is ‘yes’, we have
dependence methods; but in case the answer is ‘no’, we have interdependence methods. Two more
questions are relevant for understanding the nature of multivariate techniques. Firstly, in case some
variables are dependent, the question is how many variables are dependent? The other question is,
whether the data are metric or non-metric? This means whether the data are quantitative, collected on
interval or ratio scale, or whether the data are qualitative, collected on nominal or ordinal scale. The
technique to be used for a given situation depends upon the answers to all these very questions. The
category are included techniques like multiple regression analysis, multiple discriminant analysis,
multivariate analysis of variance and canonical analysis

4.4 MULTIPLE REGRESSION ANALYSIS

Multiple regression is the most commonly utilized multivariate technique. It examines the relationship
between a single metric dependent variable and two or more metric independent variables. The technique
relies upon determining the linear relationship with the lowest sum of squared variances; therefore,
assumptions of normality, linearity, and equal variance are carefully observed. The beta coefficients
(weights) are the marginal impacts of each variable, and the size of the weight can be interpreted
directly. Multiple regression is often used as a forecasting tool.
Assumptions:
Regression residuals must be normally distributed.
Multivariate Analysis Techniques 127

A linear relationship is assumed between the dependent variable and the independent variables. Notes
The residuals are homoscedastic and approximately rectangular-shaped.
Absence of multicollinearity is assumed in the model, meaning that the independent variables are
not too highly correlated.
At the center of the multiple linear regression analysis is the task of fitting a single line through a
scatter plot. More specifically the multiple linear regression fits a line through a multi-dimensional space
of data points. The simplest form has one dependent and two independent variables. The dependent
variable may also be referred to as the outcome variable or regressand. The independent variables may
also be referred to as the predictor variables or regressors.
There are 3 major uses for multiple linear regression analysis. First, it might be used to identify the
strength of the effect that the independent variables have on a dependent variable.
Second, it can be used to forecast effects or impacts of changes. That is, multiple linear regression
analysis helps us to understand how much will the dependent variable change when we change the
independent variables. For instance, a multiple linear regression can tell you how much GPA is expected
to increase (or decrease) for every one point increase (or decrease) in IQ.
Third, multiple linear regression analysis predicts trends and future values. The multiple linear
regression analysis can be used to get point estimates.
The Multiple Regression Model
In general, the multiple regression equation of Y on X1, X2, …, Xk is given by:
Y = b0 + b1 X1 + b2 X2 + …………………… + bk Xk

Interpreting Regression Co-efficients

Here b0 is the intercept and b1, b2, b3, …, bk are analogous to the slope in linear regression
equation and are also called regression coefficients. They can be interpreted the same way as slope.
Thus if bi = 2.5, it would indicates that Y will increase by 2.5 units if Xi increased by 1 unit.
The appropriateness of the multiple regression model as a whole can be tested by the F-test in the
ANOVA table. A significant F indicates a linear relationship between Y and at least one of the X's.
1. Discriminant Analysis
Discriminant analysis is the regression based statistical technique that is used in determining the
particular classification or group for an item of data or an object belongs to on the basis of its characteristics
or essential features. It differs from group building techniques such as cluster analysis in that the
classifications or groups to choose from must be known in advance.
Purposes of Discriminant Analysis
Discriminant Analysis undertakes the same task as multiple linear regressions by predicting an
outcome. However, multiple linear regressions is limited to cases where the dependent variable on the Y
axis is an interval variable so that the combination of predictors will, through the regression equation,
produce estimated mean population numerical Y values for given values of weighted combinations of X
values.
2. Factor Analysis
A type of analysis used to discern the underlying dimensions or regularity in phenomena. Its general
purpose is to summarize the information contained in a large number of variables into a smaller number
of factors.
128 Research Methodology

Notes Factor analysis attempts to identify underlying variables or factors, that explain the pattern of
correlations within a set of observed variables. Factor analysis is often used in data reduction to identify
a small number of factors that explain most of the variance that is observed in a much larger number of
manifest variables.
3. Cluster Analysis
A body of techniques with the purpose of classifying individuals or objects into a small number of
mutually exclusive groups, ensuring that there will be as much likeness within groups and as much
difference among groups as possible.
Cluster analysis is a collection of statistical methods which identifies groups of samples that behave
similarly or show similar characteristics. In common parlance it is also called look-a-like groups.
Concept of Cluster Analysis
Cluster analysis is a collection of statistical methods, which identifies groups of samples that behave
similarly or show similar characteristics. In common parlance it is also called look-a-like groups. The
simplest mechanism is to partition the samples using measurements that capture similarity or distance
between samples. In this way, clusters and groups are interchangeable words. Often in market research
studies, cluster analysis is also referred to as a segmentation method. In neural network concepts,
clustering method is called unsupervised learning. Typically in clustering methods, all the samples with in
a cluster is considered to be equally belonging to the cluster. If each observation has its unique probability
of belonging to a group and the application is interested more about these probabilities than we have to
use multinomial models.
Cluster analysis is a class of statistical techniques that can be applied to data that exhibit “natural”
groupings. Cluster analysis sorts through the raw data and groups them into clusters. A cluster is a group
of relatively homogeneous cases or observations. Objects in a cluster are similar to each other. They are
also dissimilar to objects outside the cluster, particularly objects in other clusters.
Explanation

Clustering and segmentation basically partition the database so that each partition or group is
similar according to some criteria or metric. Clustering according to similarity is a concept which appears
in many disciplines. If a measure of similarity is available there are a number of techniques for forming
clusters. Membership of groups can be based on the level of similarity between members and from this
the rules of membership can be defined. Another approach is to build set functions that measure some
Multivariate Analysis Techniques 129

property of partitions i.e. groups or subsets as functions of some parameter of the partition. This latter Notes
approach achieves what is known as optimal partitioning.
Many data mining applications make use of clustering according to similarity for example to segment
a client/customer base. Clustering according to optimization of set functions is used in data analysis e.g.
when setting insurance tariffs the customers can be segmented according to a number of parameters
and the optimal tariff segmentation achieved.
Clustering/segmentation in databases are the processes of separating a data set into components
that reflect a consistent pattern of behaviour. Once the patterns have been established they can then be
used to “deconstruct” data into more understandable subsets and also they provide sub-groups of a
population for further analysis or action which is important when dealing with very large databases. For
example, a database could be used for profile generation for target marketing where previous response
to mailing campaigns can be used to generate a profile of people who responded and this can be used to
predict response and filter mailing lists to achieve the best response.
Simple Cluster Analysis
In cases of one or two measures, a visual inspection of the data using a frequency polygon or
scatter plot often provides a clear picture of grouping possibilities. For example, the following is the
data from the “Example Assignment” of the cluster analysis homework assignment.

The relative frequency polygon appears as follows:

It is fairly clear from this picture that two subgroups, the first including X, Y, and Z and the second
including everyone else except describe the data fairly well. When faced with complex multivariate
data, such visualization procedures are not available and computer programs assist in assigning objects
to groups. The following text describes the logic involved in cluster analysis algorithms.
Steps in Doing a Cluster Analysis
A common approach to doing a cluster analysis is to first create a table of relative similarities or
differences between all objects and second to use this information to combine the objects into groups.
The table of relative similarities is called a proximities matrix. The method of combining objects into
groups is called a clustering algorithm. The idea is to combine objects that are similar to one another into
separate groups.
The Proximities Matrix
Cluster analysis starts with a data matrix, where objects are rows and observations are columns.
From this beginning, a table is constructed where objects are both rows and columns and the numbers in
130 Research Methodology

Notes the table are measures of similarity or differences between the two observations. For example, given
the following data matrix:
X1 X 2 X 3 X4 X5
O1
O2
O3
O4
A proximities matrix would appear as follows:
O1 O2 O3 O 4
O1
O2
O3
O4
The difference between a proximities matrix in cluster analysis and a correlation matrix is that a
correlation matrix contains similarities between variables (X1, X2) while the proximities matrix contains
similarities between observations (O1, O2).
The researcher has dual problems at this point. The first is a decision about what variables to
collect and include in the analysis. Selection of irrelevant measures will not aid in classification. For
example, including the number of legs an animal has would not help in differentiating cats and dogs,
although it would be very valuable in differentiating between spiders and insects.
The second problem is how to combine multiple measures into a single number, the similarity
between the two observations. This is the point where univariate and multivariate cluster analysis separate.
Univariate cluster analysis groups are based on a single measure, while multivariate cluster analysis is
based on multiple measures.
Univariate Measures
A simpler version of the problem of how to combine multiple measures into a measure of difference
between objects is how to combine a single observation into a measure of difference between objects.
Consider the following scores on a test for four students:
Student Score
X 11
Y 11
Z 13
A 18
The proximities matrix for these four students would appear as follows:
X Y Z A
X
Y
Z
A
Multivariate Analysis Techniques 131

The entries of this matrix will be described using a capital “D”, for distance with a subscript Notes
describing which row and column. For example, D34 would describe the entry in row 3, column 4, or in
this case, the intersection of Z and A.
One means of filling in the proximities matrix is to compute the absolute value of the difference
between scores. For example, the distance, D, between Z and A would be |13-18| or 5. Completing the
proximities matrix using the example data would result in the following:
X Y Z A
X 0 0 2 7
Y 0 0 2 7
Z 2 2 0 5
A 7 7 5 0
A second means of completing the proximities matrix is to use the squared difference between the
two measures. Using the example above D 34 , the distance between Z and A, would be (13-18)2 or 25.
This distance measure has the advantage of being consistent with many other statistical measures, such
as variance and the least squares criterion and will be used in the examples that follow. The example
proximities matrix using squared differences as the distance measure is presented below.
X Y Z A
X 0 0 4 49
Y 0 0 4 49
Z 4 4 0 25
A 49 49 25 0
Note that both example proximities matrices are symmetrical. Symmetrical means that row and
column entries can be interchanged or that the numbers are the same on each half of the matrix defined
by a diagonal running from top left to bottom right.
Other distance measures have been proposed and are available with statistical packages. For
example, SPSS/WIN provides the following options for distance measures.
Some of these options themselves contain options. For example, Minkowski and Customized are
really many different possible measures of distance.
Multivariate Measures
When more than one measure is obtained for each observation, then some method of combining
the proximities matrices for different measures must be found. Usually the matrices are summed in a
combined matrix. For example: given the following scores.
X1 X2
O1 25 11
O2 33 11
O3 34 13
O4 35 18
The two proximities matrices resulting from squared Euclidean distance that result could be summed
to produce a combined distance matrix.
132 Research Methodology

Notes O1 O2 O3 O4
O1 0 64 81 100
O2 64 0 1 4
O3 81 1 0 1
O4 100 4 1 0
+
O1 O2 O3 O4
O1 0 0 4 49
O2 0 0 4 49
O3 4 4 0 25
O4 49 49 25 0
=
O1 O2 O3 O4
O1 0 64 85 149
O2 64 0 5 53
O3 85 5 0 26
O4 149 53 26 0
Note that each corresponding cell is added. With more measures there are more matrices to be
added together.
This system works reasonably well if the measures share similar scales. One measure can overwhelm
the other if the measures use different scales. Consider the following scores.
X1 X2
O1 25 11
O2 33 21
O3 34 33
O4 35 48
The two proximities matrices resulting from squared Euclidean distance that result could be summed
to produce a combined distance matrix.
O1 O2 O3 O4
O1 0 64 81 100
O2 64 0 1 4
O3 81 1 0 1
O4 100 4 1 0
+
Multivariate Analysis Techniques 133

O1 O2 O3 O4 Notes

O1 0 100 484 49
O2 100 0 144 729
O3 484 144 0 225
O4 1369 729 225 0
=
O1 O2 O3 O4
O1 0 164 485 153
O2 164 0 145 733
O3 565 145 0 226
O4 1469 733 226 0
It can be seen that the second measure overwhelms the first in the combined matrix.
For this reason the measures are optionally transformed before they are combined. For example,
the previous data matrix might be converted to standard scores before computing the separated distance
matrices.
X1 X2 Z1 Z2
O1 25 11 -1.48 -1.08
O2 33 21 .27 -.45
O3 34 33 .49 .30
O4 35 48 .71 1.24
The two proximities matrices resulting from squared Euclidean distance that result from the standard
scores could be summed to produce a combined distance matrix.
O1 O2 O3 O4
O1 0 3.06 3.88 4.80
O2 3.06 0 .05 .19
O3 3.88 .05 0 .05
O4 4.80 .19 .05 0
+
O1 O2 O3 O4
O1 0 .40 1.90 5.38
O2 .40 0 .56 2.86
O3 1.9 .56 0 .88
O4 5.38 2.86 .88 0
=
134 Research Methodology

Notes O1 O2 O3 O4
O1 0 3.46 5.78 10.18
O2 3.46 0 .61 3.05
O3 5.78 .61 0 .93
O4 10.18 3.05 .93 0
The point is that the choice of whether to transform the data and the choice of distance metric can
result in vastly different proximities matrices.
4. Multidimensional Scaling
A statistical technique that measures objects in multidimensional space on the basis of respondents’
judgments of the similarity of objects.
5. Multivariate Analysis of Variance (MANOVA)
A statistical technique that provides a simultaneous significance test of mean difference between
groups for two or more dependent variables.

4.5 DISCRIMINATED ANALYSIS

Discriminant analysis is the regression based statistical technique that is used in determining the
particular classification or group for an item of data or an object belongs to on the basis of its characteristics
or essential features. It differs from group building techniques such as cluster analysis in that the
classifications or groups to choose from must be known in advance.
Purposes of Discriminant Analysis
Discriminant Analysis undertakes the same task as multiple linear regressions by predicting an
outcome. However, multiple linear regressions is limited to cases where the dependent variable on the Y
axis is an interval variable so that the combination of predictors will, through the regression equation,
produce estimated mean population numerical Y values for given values of weighted combinations of X
values.
Assumptions of Discriminant Analysis
The major underlying assumptions of discriminant analysis are:
i) The observations are a random sample.
ii) Each predictor variable is normally distributed.
iii) Each of the allocations for the dependent categories in the initial classification is correctly
classified.
iv) There must be at least two groups or categories with each case belonging to only one group so
that the groups are mutually exclusive and collectively exhaustive.
v) Each group or category must be well defined, clearly differentiated from any other group(s)
and natural. Putting a median split on an attitude scale is not a natural way to form groups.
Partitioning quantitative variables is only justifiable if there are easily identifiable gaps at the
points of division.
vi) The attribute is used to separate the groups should discriminate quite clearly between the
groups so that group or category overlap is clearly non-existent or minimal.
Multivariate Analysis Techniques 135

vii) Group sizes of the dependent should not be grossly different and should be at least times the Notes
number of independent variables.

4.6 FACTOR ANALYSIS

Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of
correlations within a set of observed variables. Factor analysis is often used in data reduction to identify
a small number of factors that explain most of the variance that is observed in a much larger number of
manifest variables. Factor analysis can also be used to generate hypotheses regarding causal mechanisms
or to screen variables for subsequent analysis.
Factor analysis is used to analyze large numbers of dependent variables to detect certain aspects
of the independent variables affecting those dependent variables without directly analyzing the independent
variables. It enables an analyst to reduce the number of elements to be studied and to observe how they
are interlinked. Factor analysis techniques are used in constructing factor models.
Factor analysis is based on a model that supposes that correlations between pairs of measured
variables can be explained by the connections of the measured variables to a small number of non-
measurable but meaningful variables, which are termed factors.
The factor analysis procedure offers a high degree of flexibility:
(i) Seven methods of factor extraction are available.
(ii) Five methods of rotation are available, including direct oblimin and promax for nonorthogonal
rotations.
(iii) Three methods of computing factor scores are available and scores can be saved as variables
for further analysis.
Meaning of Factor Analysis
Factor analysis is a statistical technique that can uncover relationship patterns underlying hundreds
of interacting phenomenon such as changes in interest rates, inflation and/or oil prices.

Objectives of Factor Analysis

The aims of factor analysis are to:
(i) Identify the number of factors;
(ii) Define the factors as functions of the measured variables;
(iii) Study the factors which have been defined.

4.7 ANOVA
Analysis of Variance (ANOVA) is a collection of statistical models and their associated procedures,
in which the observed variance in a particular variable is partitioned into components attributable to
different sources of variation. In its simplest form ANOVA provides a statistical test of whether or not
the means of several groups are all equal, and therefore generalizes t-test to more than two groups.
Doing multiple two-sample t-tests would result in an increased chance of committing a type I error. For
this reason, ANOVAs are useful in comparing two, three or more means.
An important technique for analyzing the effect of categorical factors on a response is to perform
an Analysis of Variance. An ANOVA decomposes the variability in the response variable amongst the
different factors. Depending upon the type of analysis, it may be important to determine: (a) which
136 Research Methodology

Notes factors have a significant effect on the response, and/or (b) how much of the variability in the response
variable is attributable to each factor.
Statgraphics Centurion provides several procedures for performing an analysis of variance:
1. One-Way ANOVA - used when there is only a single categorical factor. This is equivalent to
comparing multiple groups of data.
2. Multifactor ANOVA - used when there is more than one categorical factor, arranged in a
crossed pattern. When factors are crossed, the levels of one factor appear at more than one
level of the other factors.
3. Variance Components Analysis - used when there are multiple factors, arranged in a hierarchical
manner. In such a design, each factor is nested in the factor above it.
4. General Linear Models - used whenever there are both crossed and nested factors, when
some factors are fixed and some are random, and when both categorical and quantitative
factors are present.
One-Way ANOVA
A one-way analysis of variance is used when the data are divided into groups according to only one
factor. The questions of interest are usually: (a) Is there a significant difference between the groups and
(b) If so, which groups are significantly different from which others? Statistical tests are provided to
compare group means, group medians, and group standard deviations. When comparing means, multiple
range tests are used, the most popular of which is Tukey's HSD procedure. For equal size samples,
significant group differences can be determined by examining the means plot and identifying those
intervals that do not overlap.
Multifactor ANOVA
When more than one factor is present and the factors are crossed, a multifactor ANOVA is
appropriate. Both main effects and interactions between the factors may be estimated. The output
includes an ANOVA table and a new graphical ANOVA from the latest edition of Statistics for
Experimenters by Box, Hunter and Hunter (Wiley, 2005). In a graphical ANOVA, the points are scaled
so that any levels that differ by more than exhibited in the distribution of the residuals are significantly
different.
Variance Components Analysis
A Variance Components Analysis is most commonly used to determine the level at which variability
is being introduced into a product. A typical experiment might select several batches, several samples
from each batch and then run replicates tests on each sample. The goal is to determine the relative
percentages of the overall process variability that is being introduced at each level.
Assumptions of ANOVA
The analysis of variance has been studied from several approaches, the most common of which
use a linear model that relates the response to the treatments and blocks. Even when the statistical
model is nonlinear, it can be approximated by a linear model for which an analysis of variance may be
appropriate.
(1) The model is correctly specified.
(2) The Hij’s are normally distributed.
(3) The Hij’s have mean zero and a common variance, V 2 .

(4) The Hij’s are independent across observations.

Multivariate Analysis Techniques 137

With multiple populations, detection of violations of these assumptions requires examining the residuals Notes
rather than the Y-values themselves.
Illustration - 1
The following are measurements of performance obtained after training 4 groups by different
methods:

Method 1: 17 19 18 15 21 19 16 14
Method 2: 21 23 20 19 19

Method 3: 20 16 21 17 19 16 16
Method 4: 13 15 16 17 13 16
Find out whether there is a significant overall differences between these 4 groups in terms of their
performance after training ( D = 0.05).
Solution:
Let, the null hypothesis be that different methods of training do not result difference in performance
after training.

1 2 3 4
17 21 20 13
19 23 16 15
18 20 21 16
15 19 17 17
21 19 19 13
19 16 16
16 16
14
By coding of data (i.e., add, subtract, multiply or divide all observations by a number), can
simplify the task. Let us subtract 15 from all observations, we get

1 2 3 4
2 6 5 -2
4 8 1 0
3 5 6 1
0 4 2 2
6 4 4 -2
4 1 1
1 1
-1
T1 = 19 n1 = 8
T2 = 27 n2 = 5
138 Research Methodology

Notes T3 = 20 n3 = 7
T4 = 0 n4 = 6
T = 66 N = 26

T2
Correction factor = where T = total of all observations
N
= no. of all observations

66 2
= 167.54
26
Sum of squares between samples:

Tj2 T2
SSB = ¦n j

N

F 19 2
272 202 02 I
= G 8 JK – 167.54
H 5 7 6

= (45.125 + 145.8 + 57.14 + 0) – 167.54 = 80.525

Sum of squares withinsamples:

¦T 2

¦
j
SSW = X 2ij
n2

= (22 + 42 + 02 + 62 + 42 + 12 + 12 + 62 + 82 + 52 + 42 + 42 + 52 + 12 + 62 +

F 19
2
272 202 02 I
22 + 42 + 12 + 12 + 22 + 02 + 12 + 22 + 22 + 12) GH 8
5

7

6 JK
= 338 – 248.07 = 89.93
ANOVA Table

Sources of Sum of Degrees of (df) Mean square F-ratio variation

squares (SS) Freedom MS = SS/df

80.525 26.84
Between samples 80.525 (k – 1) = (4 – 1) = 3 = 26.84 = 6.56
3 4.09

89.930
Within samples 89.930 (n – k) = (26 – 4) = 22 = 4.09
22
Total 170.455 (n – 1) = (26 – 1) = 25
F-ratio calculated = 6.56
F-ratio from table for v1 = 3 and v2 = 22 at 5% level of significance is 3.05
Since, Fcalculated > Ftable , to reject the null hypothesis, which means there is a significant
overall difference between 4 groups in terms of performance after training.
Multivariate Analysis Techniques 139

Illustration - 2 Notes
Three methods are used in the production process test. At 5% level of significance test
whether the three methods can be considered to be equivalent as far as output are concerned.

Method I 70 72 75 80 53
Method II 100 110 108 112 120 107
Method III 60 65 57 84 87 73
Solution:
Let the null hypothesis be that there is no significant difference between the three methods.

Method I II III
70 100 60
72 110 65
75 108 57
80 112 84
53 120 87
107 73

Total 350 657 426

T2
Correction factor =
N
where, T - sum of all observations
N - no. of observations
Here, T1 = 350 T2 = 657. T3 = 426, T = 1433
n1 = 5 n2 = 6, n3 = 6 N = 17
Sum of squares between samples:
Tj2 T2
SSB = ¦n j

N

F 350 2
657 2 426 2 I F 1433 I 2

= GH 5
6

6 JK GH 17 JK
= 24,500 + 71,941.5 + 30,246 – 1,20,793.5 = 5894
Sum of squares within samples:

¦T 2

¦
j
SSW = X 2ij
n2

= (702 + 722 + 752 + 802 + 832 + 1002 + 1102 + 1082 + 1122 + 1202 + 1072 +
F 350 2
657 2 426 2 I
602 + 652 + 572 + 842 +872 + 732 GH 5
6

6 JK
= 1,32,183 – 1,26,687.50 = 5195.5
140 Research Methodology

Notes ANOVA Table

Sources of Sum of (SS) Degrees of Mean square MS F-ratio variation

squares Freedom d.f

5894 2954
Between samples 5894 (k – 1) = (3 – 1) = 2 = 2947 = 7.51
2 392.54

5495.5
Within samples 5495.5 (n – k) = (17 – 3) = 14 = 392.54
14
6279.00 (n – 1) = (17 – 1) = 16
F-ratio calculated = 32.4
F-ratio from table for v1 = 2 and v2 = 14 at 5% level C1 significance = 3.74
Since Fcalculated > Ftable , reject the null hypothesis which means there is a significant
difference between the three methods.
Illustration - 3
The following table gives the monthly sales in rupees (in thousands) of a certain firm in three
different states of 4 different salesmen.

Salesmen
States 1 2 3 4
A 10 8 8 14
B 14 16 10 8
C 18 12 12 14
Test whether:
i. Sales between salesmen are significant
ii. Sales between states are significant.
Solution:
Two Way ANOVA:
Let, the first null hypothesis be that sales between salesmen are insignificant and second null
hypothesis be that sales between states are in significant.
i.e., H0 (1) : Sales between salesmen are insignificant
H0 (2) : Sales between states are insignificant’
By coding the data, we can simplify the task. Let us subtract 12 from all the observations and
we get:

Salesmen Total
2 4 4 2 8
State 2 4 2 4 0
6 0 0 2 8

Total 6 0 6 0 0
Multivariate Analysis Techniques 141

Correction factor: Notes

T2 02
= =0
N 12
Where, T - total of all samples
N - no. of samples
Total sum of squares:

Tj2
SST = ¦ X 2ij
N
= (22 + 22 + 62 + 42 + 42 + 02 + 42 + 22 + 02 + 22 + 42 + 22) – 0 = 120
Sum of squares between columns (i.e., between salesmen):

Tj2 T2 F6 2
02 ( 6)2 02 I
SSC = ¦n j

N = GH 3
3

3

3 JK – 0 = 24
Sum of squares between rows (i.e., between states):

Ti2 T2 F
( 8) 2 02 82 I
SSR = ¦n i

N
= GH 4

4

4 JK – 0 = 32
Sum of squares of residual or error:
SSres = SST – (SSC + SSR) = 120 – (24 + 32) = 64

ANOVA Table

Sources of Sum of Degrees of Freedom Mean square F-ratio

variation squares (SS)

24 10.67
Between 24 (c – 1) = (4 – 1) = 3 =8 = 1.33
3 8
samples
32 16
Between States 32 (r – 1) = (3 – 1) = 2 = 16 = 1.50
2 10.67

64
Residual or error 64 (c – 1) (r – 1) = (3) (2) = 6 = 10.67
6
Total 120 (n – 1) = (12 – 1) = 11
Note:
Greater var iance
F-ratio = Smaller var iance

Table values of F at 5% level of significance

F(6, 3) = 8.94; F(2, 6) = 5.14.
i) Calculated F(6, 3) = 1.33 < Table F(2, 6) = 8.94.
142 Research Methodology

Notes Hence, conclude that null hypothesis holds good and there is no significant difference between
the salesmen.
ii) Calculated F(2, 6) = 1.5 < Table F(2, 6) = 5.14
Hence null hypothesis is accepted and conclude the there is no significant difference between
the states
Illustration - 4
The following table shows the lifetimes in hours of samples from three different types of
television tables manufactured by a company. Determine whether there is a difference between the
three types of significance level of 0.01.

Sample 1407 411 409

Sample 2404 406 408 405 402

Sample 3410 408 406 408

Let, X = 406 be the change of scale

Ti Ti2 Ti2 /x
S1 1 5 3 9 81 27
S2 –2 0 2 –1 –4 –5 25 5
S3 4 2 0 2 8 64 16

T = 12 48

T2 122
CF = = = 12
N 12

SS = 6 6 x 2ij – CF = 12 + 52 + 32 + 22 + 22 + 22 + 12 + 42 + 42 + 22 + 22 – 12
= 1 + 25 + 9 + 4 + 4 + 4 + 1 + 16 + 16 + 4 + 4 – 12 = 76

Ti2
SSR = 6 CF
n
= 48 – 12 = 36
SSE = SS – SSR = 76 – 36 = 40
ANOVA Table

SV SS df MS F ratio
B/w rows 36 2 18 F = 4.0909

Error 40 9 44
F(2, 5) Table value = 8.02
? F < FD
Accept H0
i.e., there is no significant differences between the 3 samples.
Multivariate Analysis Techniques 143

Illustration - 5 Notes
A research company has designed three different systems to clear up oil spills. The following
table contains the results, measured by how much surface area (in square meters) is cleared in 1
hour. The data were found by testing each method in several trials. Are the three systems equally
effective? Use the 0.05 level of significance.

System A: 55 60 63 56 59 55
System B: 57 53 64 49 62

System C: 66 52 61 57
Solution:
Let, us change the origin
X – 55

Ti Ti2 Ti2 /n
System A: 0 5 8 1 4 0 18 324 54
System B: 2 2 9 6 7 10 100 20
System C: 11 3 6 2 16 256 64

44 138

T2 44 2 1936
CF = = = = 129.07
N 15 15

SS = 6 6 x 2ij – CF
= 25 + 64 + 1 + 16 + 4 + 4 + 81 + 36 + 49 + 121 + 9 + 36 + 4 – 129.07
= 450 – 129.7 = 320.93

6Ti2
= SSR = CF
n
= 138 – 129.07 = 8.93
SSE = SS – SSk
= 320.93 – 8.93 = 312
ANOVA Table

SV SS df MS F ratio
B/w system 8.93 2 4.465 F = 5.823

w/n system 312 12 26

Table value FD = 19.43

F < FD
Accept H0
i.e., There is no significant difference between the system
144 Research Methodology

Notes Illustration - 6
The following table shows the yields per acre of four different plant crops grown on lots
treated with three different types of fertilizer. Determine at the 0.05 significance level whether
there is a difference in yield per acre
i) due to the fertilizers and
ii) due to the crops

Table Crop I Crop II Crop III Crop IV

Fertilizer A 4.5 6.4 7.2 6.7
Fertilizer B 8.8 7.8 9.6 7.0

Fertilizer C 5.9 6.8 5.7 5.2

Solution:

I II III IV Ti Ti2 Ti2 /n

A 4.5 6.4 7.2 6.7 24.8 615.04 153.76
B 8.8 7.8 9.6 7.0 33.2 1102.24 275.56
C 5.9 6.8 5.7 5.2 23.6 556.96 139.24
Tj 19.2 21 22.5 18.9 81.6 568.56

Tj2 368.64 441 506.25 357.21

Tj2
122.88 147 168.75 119.07 557.7
n

C.F =
T2
=
b81.6g 2

=
6658.56
= 554.88
N 12 12

SS = 6 6 x 2ij – CF = 577.96 – 554.88 = 23.08

6Ti2
SSR = – CF = 568.56 – 554.88 = 13.68
n

Tj2
SSC = 6 – CF = 557.7 – 554.88 = 2.82
k
SSE = SS – SSR – SSC = 23.08 – 13.68 – 2.82 = 6.58
ANOVA Table

SV SS df MS
B/w Rows 13.68 2 6.84 F1 = 6.218

B/w Column 2.82 3 0.94 F2 = 1.170

Residual error 6.58 6 1.10
Multivariate Analysis Techniques 145

FD1 = 5.14 Notes

FD 2 = 8.94

F1 > FD1 F2 < FD 2

Reject H0 :
There is a significant difference in yield due to fertilizers and there is no significant difference
between the crops.
Illustration - 7
The following data are the out puts per day from three machines when operated by four
mechanics.

Mechanics Machine
A B C
1 44 48 38
2 37 40 36
3 45 38 32
4 40 44 44
Test whether:
(i) Mean productivity is same for machines.
(ii) Mean productivity is same for mechanics.
Solution:
A 2-way ANOVA technique will enable us to solve and answer the question asked.
Let us take null hypothesis that
i) There is no significant difference between the machines productivity.
ii) There is no significant difference between the mechanics productivity.
Let us code the data by subtracting 40 from all observations to simplify the task.
Machines Total
4 8 2 10
Mechanics 3 0 4 7
5 2 8 5
0 4 4 8
Total 6 10 10 6
Correction factor:

T2 62
= =3
N 12
Where, T - total of all observations
N - No. of observations
146 Research Methodology

Notes Total sum of squares:

Tj2
SST = ¦ X 2ij
N
= (42 + 82 + 22 + 32 + 0 + 42 + 52 + 22 + 82 + 02 + 42 + 42) – 3 = 231
Sum of squares between columns (i.e., between machines):

Tj2 T2 F6 2
102 ( 10) 2 I
SSC = ¦n j

N = GH 4
4

4 JK 3 = 56

Sum of squares between rows (i.e., between mechanics)

F T I T F 10 2 2 2
( 7) 2 ( 5)2 82 I
SSR = ¦ G n J N = G 3 JK = 76.33
i

H K H i 3 3 3

Sum of squares of residual or error:

SSres = SST – (SSC + SSR) = 231 – (56 + 76.33) = 98.67
ANOVA Table

Sources of Sum of Degree of Mean squares MS-ratio variation

squares SS freedom d.f

56 28
Between machines 56 (c – 1) = 2 = 28 = 1.7
2 16.45

76.33 25.44
Between mechanics 76.33 (r – 1) = 3 = 25.44 = 1.55
3 16.45

98.67
Residual or error 98.67 (c – 1) (r – 1) = 6 = 16.45
6
Total 231 (n – 1) = 11
Table values of F ratio at 5% level of significance:
F(2, 6) = 5.14
F(3, 6) = 4.76
(i) Calculated F(2, 6) = 1.7 < Table F(2, 6) = 5.14.
Hence, null hypothesis is accepted i.e., there is no significant difference between machines
which means the mean productivity is same for machines.
(ii) Calculated F(3, 6) = 1.55 < Table F(3, 6) = 4.76.
Hence, null hypothesis is accepted i.e., there is no significant difference between mechanics which
means the mean productivity is same for mechanics.
Illustration - 8
Set up an ANOVA table for the following information relating to three drugs testing to judge the
effectiveness in reducing blood pressure for three different groups of people.
Multivariate Analysis Techniques 147

Amount of BP reduction in mm of mercury: Notes

Group of people Drug

X Y Z
A 14 10 11
15 9 11
B 12 7 10
11 8 11
C 10 11 8
11 11 7
Do the drug act differently? Are the different group of people affected differently? Is the
interaction term significant? Table D = 0.05.
Solution:
As repeated values are given in the table, this is a case of 2 way ANOVA with interaction.
(Interaction is the measure of inter-relationship among two different classifications).
Let the null hypothesis be that:
i) There is no significant difference between drugs.
ii) There is no significant difference between groups of people.
iii) The interaction term is insignificant

Drug Total
Groups of people 14 10 11 70
15 9 11
12 7 10 59
11 8 11
10 11 8 58
11 11 7
Total 73 56 58 187
Correction factor:

T2 1872
= = 1942.72
N 18
Total sum of squares:

T2
SSC = ¦ X 2ij
N
= (142 + 152 + 122 + 112 + 102 + 92 + 102 + 92 + 72 + 82 + 112 + 112 + 112
+ 112 + 102 + 112 + 82 + 72) – 1942.72
= 76.28
148 Research Methodology

Notes Sum of squares between columns (i.e., between drugs)

T2
SST =  X 2ij 
N
= (142 + 152 + 122 + 112 + 102 + 92 + 102 + 92 + 72 + 82 + 112 + 112 + 112
+ 112 + 102 + 112 + 82 + 72) – 1942.72 = 76.28
Sum of squares between rows (i.e., between people):

Ti2 T 2 F 70 2
592 582 I
  = G 6   JK – 1942.72 = 14.78
SSR =
ni N H 6 6

Sum of squares within samples:

 dX i 2
SSW = ij  Xw where X w – mean within samples

= (14 – 14.5)2 + (15 – 14.5)2 + (10 – 9.5)2 + (9 – 9.5)2 + (11 – 11)2 + (11 – 11)2 +
(12 – 11.5)2 + (11 – 11.5)2 + (7 – 7.5)2 + (8 – 7.5)2 + (10 – 10.5)2 + (11 – 10.5)2
+ (10 – 10.5)2 + (11 – 10.5)2 + (11 – 11)2 + (11 – 11)2 + (8 – 7.5)2 + (7 – 7.5)2
= 3.50
Sum of squares for interaction variation:
SSI = SST – (SSC + SSR + SSW) = 76.28 – (28.77 + (14.78 + 3.50) = 29.23
ANOVA Table

Sources of Sum of Degree of Mean square Ms F-ratio variation

squares SS freedom d.f

28.77 14.385
Between Drugs 28.77 (c – 1) = 2 = 14.385 = 36.9
2 0.389

14.78 7.390
Between groups 14.78 (r – 1) = 2 = 7.390 = 19.0 of people
2 0.389

29.23 7.308
Interaction 29.23 17 – 2 – 2 – 9 = 4 = 7.308 = 18.8
4 0.389

3.5
Within samples 3.50 (n – rc) = 9 = 0.389 (error)
9
Total 76.28 (n – 1) = 17
Table value of F-ratios at 5% level of significance F(2, 9) = 4.26; F(2, 9) = 3.63
i) Calculated F(2, 9) = 36.9 > Table F(2, 9) = 4.26.
Hence, null hypothesis is rejected which means the drugs act differently.
ii) Calculated F(2, 9) = 19.0 > Table F(2, 9) = 4.26.
Hence, null hypothesis is rejected which means the different groups of people are affected
differently.
Multivariate Analysis Techniques 149

iii) Calculated F(4, 9) = 18.8 > Table F(4, 9) = 3.63. Notes

Hence, null hypothesis is rejected which means the interaction term is significant.
Illustration - 9
Is the interaction variation significant in case of the following information concerning mileage
based on different brands of gasoline and cars?

Brands of gasoline
W X Y Z
Cars A 13 12 12 11
11 10 11 13
B 12 10 11 9
13 11 12 10
C 14 11 13 10
13 10 14 8
Solution:
T2
Correction factor
N
T - Sum of all observations
N - No. of all observations.

W X Y Z Total
A 13 12 12 11 93
11 10 11 13
B 12 10 11 9 88
13 11 12 10
C 14 11 13 10 93
13 10 14 8
Total 76 64 73 61 274
Here,
T1 = 76, T2 = 64, T3 = 73, T4 = 61, T = 274
n1 = 6, n2 = 6, n3 = 6, n4 = 6, N = 24
T2 274 2
Correction factor = = = 3128.17
N 24
Total sum of squires:
T2
SST = ¦ X 2ij
N
= (132 + 112 + 122 + 132 + 122 + 102 + 102 + 112 + 102 + 122 + 112 + 112 +
122 + 132 + 142 + 112 + 132 + 92 + 102 + 102 + 82) – 3128.17
= 3184 – 3128.17 = 55.83
150 Research Methodology

Notes Sum of squares between columns (i.e., between bas oline):

Tj2 T2 F
762 64 2 732 612 I
SSC = ¦n j

N = 6

6GH
6

6 JK – 3128.17
= 3,153.67 – 3,128.17 = 25.50
Sum of squares between rows (i.e., between cars):

Ti2 T 2 932 882 932 F I

SSR = ¦ ni

N
= 8 8 8 GH JK – 3,128.17
= 3130.25 – 3128.17 = 2.08
Sum of squares within samples:

¦ dX i 2
SSW = ij Xw

= (13 – 12)2 + (11 – 12)2 + (12 – 12.5)2 + (13 – 12.5)2 + (14 – 13.5)2 + (12 – 11)2 + (10 – 11)2
+ (10 – 10.5)2 + (11 – 10.5)2 + (10- 10.5)2 + (12 – 11.5)2 + (11 – 11.5)2 + (11 – 11.5)2 + (12 –
11.5)2 + (13 – 13.5)2 + (14 – 13.5)2 + (11 – 12)2 + (13 – 12)2 + (9 – 9.5)2 + (10 – 9.5)2 + (10 –
9)2 + (8 – 9)2 = 12
Sum of squares for interaction variation:
SSI = SST – (SSC + SSR + SSW) = 55.83 – (25.50 + 2.08 + 12) = 16.25

4.8 SUMMARY
Multivariate analysis is based in observation and analysis of more than one statistical outcome
variable at a time. In design and analysis, the technique is used to perform trade studies across multiple
dimensions while taking into account the effects of all variables on the responses of interest.
Multivariate analysis (MVA) is based on the statistical principle of multivariate statistics, which
involves observation and analysis of more than one statistical outcome variable at a time. In design and
analysis, the technique is used to perform trade studies across multiple dimensions while taking into
account the effects of all variables on the responses of interest.
Multivariate analysis techniques which can be conveniently classified into two broad categories
viz., dependence methods and interdependence methods.
Multiple regression is the most commonly utilized multivariate technique. It examines the relationship
between a single metric dependent variable and two or more metric independent variables.
Discriminant analysis is the regression based statistical technique that is used in determining the
particular classification or group for an item of data or an object belongs to on the basis of its characteristics
or essential features. It differs from group building techniques such as cluster analysis in that the
classifications or groups to choose from must be known in advance.
Cluster analysis is a collection of statistical methods which identifies groups of samples that behave
similarly or show similar characteristics. In common parlance it is also called look-a-like groups.
A statistical technique that measures objects in multidimensional space on the basis of respondents’
judgments of the similarity of objects.
Multivariate Analysis Techniques 151

A statistical technique that provides a simultaneous significance test of mean difference between Notes
groups for two or more dependent variables.
Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of
correlations within a set of observed variables. Factor analysis is often used in data reduction to identify
a small number of factors that explain most of the variance that is observed in a much larger number of
manifest variables. Factor analysis can also be used to generate hypotheses regarding causal mechanisms
or to screen variables for subsequent analysis.
Analysis of Variance (ANOVA) is a collection of statistical models and their associated procedures,
in which the observed variance in a particular variable is partitioned into components attributable to
different sources of variation.

4.9 SELF ASSESSMENT QUESTIONS

1. What is Multivariate Data Analysis? Discuss the Techniques of Multivariate Analysis.
2. Explain the Multiple Regression Analysis.
3. What is Discriminated Analysis? Explain the purposes and assumptions of Discriminant Analysis.
4. What do you mean by Factor Analysis? What are its objectives?
5. What is ANOVA? Explain the assumptions of ANOVA.
*****

Statistics For Chemical Engineers
No ratings yet
Statistics For Chemical Engineers
122 pages
Biostatistics Series Module 10 Multivariate Methods
No ratings yet
Biostatistics Series Module 10 Multivariate Methods
9 pages
Anova and Ancova
No ratings yet
Anova and Ancova
35 pages
Multivariate Statistics Unit 1 JJ 09-01-2025
No ratings yet
Multivariate Statistics Unit 1 JJ 09-01-2025
27 pages
Article TikTok Education
No ratings yet
Article TikTok Education
7 pages
MLBB Game Experimental Research Final Paper
No ratings yet
MLBB Game Experimental Research Final Paper
13 pages
Research Methodology - Multi Variate Analysis 13 10 23
No ratings yet
Research Methodology - Multi Variate Analysis 13 10 23
17 pages
Business Research Methods: Multivariate Analysis
No ratings yet
Business Research Methods: Multivariate Analysis
34 pages
Ics054 Unit 2a
No ratings yet
Ics054 Unit 2a
8 pages
Multivariate Analysis for Execs
No ratings yet
Multivariate Analysis for Execs
8 pages
Notes of DA Unit-II
No ratings yet
Notes of DA Unit-II
91 pages
Multivariate Analysis: Are Some of The Variables Dependent On Others?
100% (2)
Multivariate Analysis: Are Some of The Variables Dependent On Others?
16 pages
Multivariate Research Assignment
No ratings yet
Multivariate Research Assignment
6 pages
BRM chp09
No ratings yet
BRM chp09
41 pages
Introduction
No ratings yet
Introduction
27 pages
Bivariate Analysis: Research Methodology Digital Assignment Iii
No ratings yet
Bivariate Analysis: Research Methodology Digital Assignment Iii
6 pages
Choosing Appropriate Statistical Tool - PDF
No ratings yet
Choosing Appropriate Statistical Tool - PDF
48 pages
Unit-Iii 3.1 Regression Modelling
100% (1)
Unit-Iii 3.1 Regression Modelling
7 pages
Business Research Methods Guide
No ratings yet
Business Research Methods Guide
13 pages
Multi Variate Analysis Techniques
No ratings yet
Multi Variate Analysis Techniques
4 pages
(Suraj B.Atnure) : Title of The Project
No ratings yet
(Suraj B.Atnure) : Title of The Project
15 pages
Unit 3 Research Methods
No ratings yet
Unit 3 Research Methods
16 pages
Multi Variate Analysis
No ratings yet
Multi Variate Analysis
4 pages
Multivariate Analysis Techniques Guide
No ratings yet
Multivariate Analysis Techniques Guide
26 pages
Unit-3 Research Methods-MCA
No ratings yet
Unit-3 Research Methods-MCA
15 pages
5th Module SDS
No ratings yet
5th Module SDS
13 pages
Classification of Multivariate Techniques
No ratings yet
Classification of Multivariate Techniques
25 pages
SImulated Exam 1 - Testlet 2 - Questions
No ratings yet
SImulated Exam 1 - Testlet 2 - Questions
37 pages
Unit Iv
No ratings yet
Unit Iv
15 pages
Cashew Nut Shell Liquid Expeller Evaluation
No ratings yet
Cashew Nut Shell Liquid Expeller Evaluation
15 pages
Unit 1
No ratings yet
Unit 1
24 pages
Pb12mat 21-22
No ratings yet
Pb12mat 21-22
29 pages
Multivariate Analysis Guide
No ratings yet
Multivariate Analysis Guide
7 pages
Multivariate Analysis Guide
No ratings yet
Multivariate Analysis Guide
45 pages
Finance
No ratings yet
Finance
43 pages
Statistics Exam Review
No ratings yet
Statistics Exam Review
90 pages
Multivariate Analysis Guide
No ratings yet
Multivariate Analysis Guide
7 pages
M7 Tax Computations and Credits - Questions
No ratings yet
M7 Tax Computations and Credits - Questions
23 pages
Multivariate Statistical Analysis: Professor Dr. Muhammad Mohsin Butt Department of Marketing School of Business Studies
No ratings yet
Multivariate Statistical Analysis: Professor Dr. Muhammad Mohsin Butt Department of Marketing School of Business Studies
8 pages
Multivariate Analysis
No ratings yet
Multivariate Analysis
23 pages
Interpretaion & Report Writing
No ratings yet
Interpretaion & Report Writing
27 pages
Mutivariate and Baysian
No ratings yet
Mutivariate and Baysian
21 pages
Multivariate Analysis Basics
No ratings yet
Multivariate Analysis Basics
13 pages
Multivariate Analysis Guide
No ratings yet
Multivariate Analysis Guide
7 pages
Multivariate Analysis Techniques
No ratings yet
Multivariate Analysis Techniques
4 pages
Unit 16
No ratings yet
Unit 16
7 pages
Bivariate
No ratings yet
Bivariate
8 pages
Multivariate Statistics
No ratings yet
Multivariate Statistics
6 pages
Whati S "Mul T I Vari at E"?
No ratings yet
Whati S "Mul T I Vari at E"?
22 pages
Multivariate Analysis Insights
No ratings yet
Multivariate Analysis Insights
13 pages
Chapter 13 Multivariate Analysis Techniques
No ratings yet
Chapter 13 Multivariate Analysis Techniques
58 pages
01 Multivariate Analysis
100% (1)
01 Multivariate Analysis
40 pages
SImulated Exam 1 - Testlet 2 - Answers
No ratings yet
SImulated Exam 1 - Testlet 2 - Answers
41 pages
One Way ANOVA
No ratings yet
One Way ANOVA
11 pages
Multivariate Analysis An Overview
No ratings yet
Multivariate Analysis An Overview
9 pages
CH 24 Multi Variate
No ratings yet
CH 24 Multi Variate
34 pages
Multiple Regression Analysis
No ratings yet
Multiple Regression Analysis
41 pages
What Is Multivariate Analysis
No ratings yet
What Is Multivariate Analysis
7 pages
Chapter-24 Multivariate Statistical Analysis
No ratings yet
Chapter-24 Multivariate Statistical Analysis
80 pages
Design and Analysis of Experiments 2
100% (1)
Design and Analysis of Experiments 2
8 pages
M7 Tax Computations and Credits - Answers
No ratings yet
M7 Tax Computations and Credits - Answers
23 pages
Multi Variance Analyses
No ratings yet
Multi Variance Analyses
12 pages
Pertemuan 1 SNN
No ratings yet
Pertemuan 1 SNN
37 pages
Answers To End-Of-Chapter Questions: 2222222 (1) in Your Own Words, Define Multivariate Analysis
No ratings yet
Answers To End-Of-Chapter Questions: 2222222 (1) in Your Own Words, Define Multivariate Analysis
9 pages
Introduction to Data Analysis
No ratings yet
Introduction to Data Analysis
8 pages
SMDE - (US) Experts Sesion Multivariate Analysis
No ratings yet
SMDE - (US) Experts Sesion Multivariate Analysis
4 pages
Eleven Multivariate Analysis Techniques
No ratings yet
Eleven Multivariate Analysis Techniques
4 pages
02 Ekta Rani
No ratings yet
02 Ekta Rani
11 pages
Ultra-High-Performance Concrete With Local High Unburned Carbon Fly Ash
No ratings yet
Ultra-High-Performance Concrete With Local High Unburned Carbon Fly Ash
10 pages
M2 Gross Income Part 1 - Answers
No ratings yet
M2 Gross Income Part 1 - Answers
68 pages
Servuction Model 1
No ratings yet
Servuction Model 1
160 pages
Effect of Freezing Rate and Storage Time On Qualit
No ratings yet
Effect of Freezing Rate and Storage Time On Qualit
6 pages
M2 Gross Income Part 1 - Questions
No ratings yet
M2 Gross Income Part 1 - Questions
64 pages
Agile Cost Estimation Insights
No ratings yet
Agile Cost Estimation Insights
15 pages
Analysis of VArience
No ratings yet
Analysis of VArience
18 pages
Introduction To Research
No ratings yet
Introduction To Research
5 pages
SImulated Exam 1 - Testlet 3 - Answers
No ratings yet
SImulated Exam 1 - Testlet 3 - Answers
4 pages
2007 IJOPMDatta Allen Christopher
No ratings yet
2007 IJOPMDatta Allen Christopher
47 pages
M6 Section 199A Qualified Business Income Deduction - Answers
No ratings yet
M6 Section 199A Qualified Business Income Deduction - Answers
7 pages
Business Research Methods: Multivariate Analysis
No ratings yet
Business Research Methods: Multivariate Analysis
34 pages
Decision Making and Accounting
No ratings yet
Decision Making and Accounting
4 pages
Effectiveness of Labour Welfare in Improving Employee Productivity at Midas Treades PVT LTD
No ratings yet
Effectiveness of Labour Welfare in Improving Employee Productivity at Midas Treades PVT LTD
16 pages
Tourism Destination Marketing: A Case Study of Lumbini Nepal
No ratings yet
Tourism Destination Marketing: A Case Study of Lumbini Nepal
14 pages
Business Analytics Presentation
No ratings yet
Business Analytics Presentation
11 pages
Inventory Inaccuracy and Performance of Collaborative Supply Chain Practices
No ratings yet
Inventory Inaccuracy and Performance of Collaborative Supply Chain Practices
27 pages
Advanced Statistical Methods
No ratings yet
Advanced Statistical Methods
39 pages
2024 Philosophical Psychology
No ratings yet
2024 Philosophical Psychology
28 pages
BRM Model Paper With Solution - 2022-23
No ratings yet
BRM Model Paper With Solution - 2022-23
16 pages
ANOVA Analysis for IT Professionals
No ratings yet
ANOVA Analysis for IT Professionals
10 pages
Spirulina Microalgae for CO2 Reduction
No ratings yet
Spirulina Microalgae for CO2 Reduction
10 pages
INST627 Spring2018 Syllabus
No ratings yet
INST627 Spring2018 Syllabus
6 pages
ANOVA Test
No ratings yet
ANOVA Test
4 pages
BU - Assignment 2 PDF
No ratings yet
BU - Assignment 2 PDF
2 pages
Multi Variate
No ratings yet
Multi Variate
4 pages

Analysis of Data

Uploaded by

Analysis of Data

Uploaded by

Chapter

MULTIVARIATE ANALYSIS TECHNIQUES

4.3 MULTIVARIATE ANALYSIS TECHNIQUES

4.4 MULTIPLE REGRESSION ANALYSIS

Interpreting Regression Co-efficients

The relative frequency polygon appears as follows:

4.5 DISCRIMINATED ANALYSIS

4.6 FACTOR ANALYSIS

Objectives of Factor Analysis

(4) The Hij’s are independent across observations.

= (45.125 + 145.8 + 57.14 + 0) – 167.54 = 80.525

Sources of Sum of Degrees of (df) Mean square F-ratio variation

Total 350 657 426

Notes ANOVA Table

Sources of Sum of (SS) Degrees of Mean square MS F-ratio variation

Correction factor: Notes

Sources of Sum of Degrees of Freedom Mean square F-ratio

Table values of F at 5% level of significance

Sample 1407 411 409

Sample 3410 408 406 408

w/n system 312 12 26

Table value FD = 19.43

Table Crop I Crop II Crop III Crop IV

Fertilizer C 5.9 6.8 5.7 5.2

I II III IV Ti Ti2 Ti2 /n

Tj2 368.64 441 506.25 357.21

SS = 6 6 x 2ij – CF = 577.96 – 554.88 = 23.08

B/w Column 2.82 3 0.94 F2 = 1.170

FD1 = 5.14 Notes

F1 > FD1 F2 < FD 2

Notes Total sum of squares:

Sum of squares between rows (i.e., between mechanics)

Sum of squares of residual or error:

Sources of Sum of Degree of Mean squares MS-ratio variation

Amount of BP reduction in mm of mercury: Notes

Group of people Drug

Notes Sum of squares between columns (i.e., between drugs)

Sum of squares within samples:

Sources of Sum of Degree of Mean square Ms F-ratio variation

iii) Calculated F(4, 9) = 18.8 > Table F(4, 9) = 3.63. Notes

Notes Sum of squares between columns (i.e., between bas oline):

Ti2 T 2 932 882 932 F I

4.9 SELF ASSESSMENT QUESTIONS

You might also like