DATA ANALYSIS
DESCRIPTIVE ANALYSIS OF
UNIVARIATE, BIVARIATE AND
MULTIVARIATE DATA
Univariate Analysis – In univariate analysis, one variable is
analysed at a time.
Bivariate Analysis – In bivariate analysis, two variables are
analysed together and examined for any possible association
between them.
Multivariate Analysis – In multivariate analysis, more than two
variables are analysed at a time.
DESCRIPTIVE ANALYSIS OF
UNIVARIATE DATA
Frequency distribution & percentage distribution (for
Nominal scale)
Analysis of multiple responses (for Nominal scale)
Analysis of ordinal scaled questions
Grouping of large data sets
Data File
DESCRIPTIVE ANALYSIS OF BIVARIATE
DATA
Preparation of cross-tables
To interpret a cross-table, it is necessary to identify the
dependent and independent variables.
Percentages should be computed in the direction of the
independent variable.
There is no hard and fast rule as to where the dependent
and independent variables are placed. Either can be
taken in rows or in columns.
CROSS-TABULATION
Cross-tabulation is a procedure for studying the
relationship between two or more variables.
Describes two or more variables simultaneously.
Shows the joint distribution of two or more variables with a limited number of
categories or distinct values.
To learn how the dependent variable (DV) varies from subgroup to subgroup.
To allow the inspection of differences and to make comparisons.
To determine the form of the relationship between two variables.
The resulting table is also called a contingency table.
TWO VARIABLES CROSS-TABULATION
Purchase of Fashion Clothing by Marital Status

Purchase of              Current Marital Status
Fashion Clothing         Married      Unmarried
High                     31%          52%
Low                      69%          48%
Column total             100%         100%
Number of respondents    700          300
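
As a rough illustration of computing percentages in the direction of the independent variable (here marital status, placed in columns), the sketch below rebuilds the column percentages from cell counts. The counts are not given on the slide; they are derived from the percentages and respondent totals in the table (31% of 700 = 217, 52% of 300 = 156), and the function name is hypothetical.

```python
# Cell counts derived from the table's percentages and column totals
# (an assumption of this sketch; the slide reports percentages only).
counts = {
    "Married": {"High": 217, "Low": 483},
    "Unmarried": {"High": 156, "Low": 144},
}


def column_percentages(table):
    """Percentage of each purchase level within each marital-status column."""
    result = {}
    for status, cells in table.items():
        total = sum(cells.values())
        result[status] = {level: 100 * n / total for level, n in cells.items()}
    return result


percentages = column_percentages(counts)
for status, cells in percentages.items():
    print(status, {level: round(p) for level, p in cells.items()})
```

Each column sums to 100%, so the two marital-status groups can be compared directly even though their sizes differ.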
CROSS TABULATION
Is product usage related to interest in outdoor activities?
Is income related to purchase of fashion clothing?
Is age related to preference for fast food?
Is gender related to consumption of ice cream?
Is product ownership related to income?
THREE VARIABLES CROSS-TABULATION
PURCHASE OF FASHION CLOTHING BY MARITAL STATUS AND GENDER

Purchase of                            Gender
Fashion Clothing          Male                      Female
                    Married   Not Married     Married   Not Married
High                35%       40%             25%       60%
Low                 65%       60%             75%       40%
Column totals       100%      100%            100%      100%
Number of cases     400       120             300       180
In the two-variable table, 52% of unmarried respondents fell in the high-purchase
category, as opposed to 31% of the married respondents. Before concluding that unmarried
respondents purchase more fashion clothing than those who are married, a third
variable, the buyer's gender, was introduced into the analysis.
In the case of females, 60% of the unmarried fall in the high-purchase category, as
compared to 25% of those who are married. On the other hand, the percentages are
much closer for males, with 40% of the unmarried and 35% of the married falling in
the high-purchase category.
Hence, the introduction of gender (third variable) has refined the relationship
between marital status and purchase of fashion clothing (original variables).
Unmarried respondents are more likely to fall in the high purchase category than
married ones, and this effect is much more pronounced for females than for males.
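
The refinement described above can be checked with a small sketch. It takes the case counts and high-purchase shares from the three-variable table, collapses them over gender to recover the two-variable percentages, and then compares the married/not-married gap within each gender. The function and variable names are hypothetical.

```python
# (gender, marital status) -> (number of cases, share in the high-purchase category)
# Values are read from the three-variable table.
groups = {
    ("Male", "Married"): (400, 0.35),
    ("Male", "Not married"): (120, 0.40),
    ("Female", "Married"): (300, 0.25),
    ("Female", "Not married"): (180, 0.60),
}


def high_share_by_marital_status(table):
    """Collapse over gender: percentage of high purchasers per marital status."""
    cases, high = {}, {}
    for (_gender, status), (n, share) in table.items():
        cases[status] = cases.get(status, 0) + n
        high[status] = high.get(status, 0) + n * share
    return {status: 100 * high[status] / cases[status] for status in cases}


def gap_by_gender(table):
    """Not-married minus married high-purchase share, in percentage points."""
    return {
        gender: 100 * (table[(gender, "Not married")][1] - table[(gender, "Married")][1])
        for gender in ("Male", "Female")
    }


collapsed = high_share_by_marital_status(groups)
gaps = gap_by_gender(groups)
print({status: round(p) for status, p in collapsed.items()})
print({gender: round(g) for gender, g in gaps.items()})
```

The collapsed percentages round to the 31% and 52% of the two-variable table, while the within-gender gaps differ sharply, which is the pattern the third variable reveals.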
STEPS INVOLVED IN HYPOTHESIS TESTING
1. Formulate H0 and H1.
2. Select an appropriate test.
3. Choose the level of significance, α.
4. Collect data and calculate the test statistic.
5. Either (a) determine the probability associated with the test statistic and
compare it with the level of significance, α, or (b) determine the critical value
of the test statistic (TSCR) and determine whether the calculated test statistic
falls into the rejection or non-rejection region.
6. Reject or do not reject H0.
7. Draw the marketing research conclusion.
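
The two decision routes in the steps above (comparing a probability with the level of significance, or comparing the test statistic with a critical value) lead to the same conclusion. The sketch below shows this for a chi-square test with one degree of freedom, where the upper-tail probability can be written with the standard library's `math.erfc`. The test statistic value of 5.0 is hypothetical, and the function names are illustrative only.

```python
import math

ALPHA = 0.05           # chosen level of significance
CRITICAL_DF1 = 3.841   # chi-square critical value for df = 1 at alpha = 0.05


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


def decide_by_probability(statistic, alpha=ALPHA):
    """Route (a): reject H0 when the probability is below alpha."""
    return "reject H0" if p_value_df1(statistic) < alpha else "do not reject H0"


def decide_by_critical_value(statistic, critical=CRITICAL_DF1):
    """Route (b): reject H0 when the statistic exceeds the critical value."""
    return "reject H0" if statistic > critical else "do not reject H0"


test_statistic = 5.0  # hypothetical value, for illustration only
print(decide_by_probability(test_statistic))
print(decide_by_critical_value(test_statistic))
```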
STATISTICS ASSOCIATED WITH
CROSS-TABULATION CHI-SQUARE
The chi-square statistic is used to determine whether a systematic association
exists between the two variables in a cross-tabulation.
An important characteristic of the chi-square statistic is the number of degrees
of freedom (df) associated with it: df = (r - 1) x (c - 1), where r is the number
of rows and c is the number of columns in the table.
The null hypothesis (H0) of no association between the two variables will be
rejected only when the calculated value of the test statistic is greater than the
critical value of the chi-square distribution with the appropriate degrees of
freedom.
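
A minimal sketch of this decision rule follows. The critical values are the standard chi-square table entries for a 0.05 level of significance; the choice of 0.05, the test statistic of 4.2, and the function names are assumptions made for illustration.

```python
# Standard chi-square critical values at the 0.05 level of significance,
# keyed by degrees of freedom (only df = 1..4 are listed in this sketch).
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def degrees_of_freedom(rows, columns):
    """df = (r - 1) x (c - 1) for a cross-table with r rows and c columns."""
    return (rows - 1) * (columns - 1)


def reject_h0(chi_square, rows, columns):
    """Reject H0 only if the statistic exceeds the critical value for its df."""
    df = degrees_of_freedom(rows, columns)
    return chi_square > CRITICAL_VALUES_05[df]


# The same hypothetical statistic leads to different decisions for
# different table sizes, because the degrees of freedom differ.
print(degrees_of_freedom(2, 2), reject_h0(4.2, 2, 2))
print(degrees_of_freedom(2, 3), reject_h0(4.2, 2, 3))
```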
The chi-square statistic ($\chi^2$) is used to test the
statistical significance of the observed association in
a cross-tabulation.
The expected frequency for each cell can be
calculated by using a simple formula:
$$f_e = \frac{n_r \, n_c}{n}$$
where $n_r$ = total number in the row
$n_c$ = total number in the column
$n$ = total sample size
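
The expected-frequency formula can be sketched for the marital status table as follows. The observed counts are derived from the reported percentages and respondent totals (217 and 483 married, 156 and 144 unmarried), which is an assumption of this sketch, and the function name is hypothetical.

```python
# Observed counts: rows are purchase level (High, Low),
# columns are marital status (Married, Unmarried).
# Derived from the table's percentages and totals, not reported directly.
observed = [
    [217, 156],
    [483, 144],
]


def expected_frequencies(table):
    """Expected count per cell: row total * column total / sample size."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[n_r * n_c / n for n_c in column_totals] for n_r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(cell, 1) for cell in row])
```

The expected counts keep the same row and column totals as the observed table; they describe what the cells would look like if purchase level and marital status were unrelated.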
CHI-SQUARE FORMULA
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
where $f_o$ = observed frequency
$f_e$ = expected frequency
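
A short sketch of the chi-square computation, applied to the marital status table, is given below. As before, the observed counts are derived from the table's percentages and totals rather than reported on the slide, and the function name is hypothetical.

```python
# Observed counts (rows: High, Low; columns: Married, Unmarried),
# derived from the reported percentages and respondent totals.
observed = [
    [217, 156],
    [483, 144],
]


def chi_square_statistic(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * column_totals[j] / n  # expected frequency
            statistic += (f_o - f_e) ** 2 / f_e
    return statistic


chi_square = chi_square_statistic(observed)
print(round(chi_square, 1))
```

For a 2 x 2 table there is one degree of freedom, so the result would be compared with the critical value 3.841 at the 0.05 level of significance.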
EXAMPLE-SPSS
1. Testing the association between educational background of PGDM
students and their performance in terms of grades. (.sav data file)
2. Testing the association between age and consumer choice of soft drink.
(Data file 2)
3. Testing the association of income and gender with fast food preference.
(Data file 3)
CROSS-TABULATION IN PRACTICE
When conducting cross-tabulation analysis in practice, it is useful to proceed along the
following steps.
1. Test the null hypothesis that there is no association between the variables using the
chi-square statistic. If the null hypothesis is not rejected, there is no evidence of a
relationship.
2. If H0 is rejected, determine the strength of the association using an appropriate
statistic (phi coefficient, contingency coefficient, Cramer's V, lambda coefficient, or
other statistics).
3. If H0 is rejected, interpret the pattern of the relationship by computing the percentages in
the direction of the independent variable, across the dependent variable.
4. If the variables are treated as ordinal rather than nominal, use tau b, tau c, or gamma as
the test statistic. If H0 is rejected, determine the strength of the association from the
magnitude, and the direction of the relationship from the sign, of the test statistic.
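
Three of the strength-of-association statistics named in step 2 can be computed directly from the chi-square value. The sketch below uses their standard definitions. The inputs (a chi-square of about 39.6 with n = 1000, which is what the marital status table gives when its counts are derived from the reported percentages) are an assumption for illustration, and the function names are hypothetical.

```python
import math


def phi_coefficient(chi_square, n):
    """Phi coefficient, intended for 2 x 2 tables."""
    return math.sqrt(chi_square / n)


def contingency_coefficient(chi_square, n):
    """Contingency coefficient C, usable for tables of any size."""
    return math.sqrt(chi_square / (chi_square + n))


def cramers_v(chi_square, n, rows, columns):
    """Cramer's V, which adjusts phi for tables larger than 2 x 2."""
    return math.sqrt(chi_square / (n * min(rows - 1, columns - 1)))


chi_square, n = 39.6, 1000  # illustrative inputs, see the note above
phi = phi_coefficient(chi_square, n)
c = contingency_coefficient(chi_square, n)
v = cramers_v(chi_square, n, rows=2, columns=2)
print(round(phi, 3), round(c, 3), round(v, 3))
```

For a 2 x 2 table Cramer's V equals phi. Values near 0.2 would suggest an association that is statistically significant but not strong.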