Data Preparation

The document outlines the data preparation process for marketing research, detailing steps such as checking questionnaires, editing, coding, and cleaning data. It emphasizes the importance of statistical adjustments and selecting appropriate data analysis strategies, including univariate and multivariate techniques. Additionally, it covers hypothesis testing procedures, including formulating hypotheses, selecting tests, and determining significance levels.

Uploaded by

anubhavj2001

Data Preparation

Data Preparation Process

Prepare Preliminary Plan of Data Analysis

Check Questionnaire

Edit

Code

Transcribe

Clean Data

Statistically Adjust the Data

Select Data Analysis Strategy


Questionnaire Checking

A questionnaire returned from the field may be
unacceptable for several reasons.
• Parts of the questionnaire may be incomplete.
• The pattern of responses may indicate that
the respondent did not understand or follow
the instructions.
• One or more pages are missing.
• The questionnaire is received after the
preestablished cutoff date.
• The questionnaire is answered by someone
who does not qualify for participation.
Editing

• Editing is the review of the questionnaires with the
objective of increasing accuracy. It consists of
screening questionnaires to identify incomplete,
inconsistent, or ambiguous responses.
Editing

Treatment of Unsatisfactory Results


• Returning to the Field – The questionnaires
with unsatisfactory responses may be returned
to the field, where the interviewers recontact the
respondents.
• Assigning Missing Values – If returning the
questionnaires to the field is not feasible, the
editor may assign missing values to
unsatisfactory responses.
• Discarding Unsatisfactory Respondents –
In this approach, the respondents with
unsatisfactory responses are simply discarded.
Coding

Coding means assigning a code, usually a number, to each
possible response to each question. The code includes an
indication of the column position (field) and data record it will
occupy.
Codebook

A codebook contains coding instructions and
the necessary information about variables in the
data set. A codebook generally contains the
following information:
• column number
• record number
• variable number
• variable name
• question number
• instructions for coding
Coding Questionnaires

• The respondent code and the record number
appear on each record in the data.
Restaurant Preference

ID PREFERENCE QUALITY QUANTITY VALUE SERVICE INCOME
1 2 2 3 1 3 6
2 6 5 6 5 7 2
3 4 4 3 4 5 3
4 1 2 1 1 2 5
5 7 6 6 5 4 1
6 5 4 4 5 4 3
7 2 2 3 2 3 5
8 3 3 4 2 3 4
9 7 6 7 6 5 2
10 2 3 2 2 2 5
11 2 3 2 1 3 6
12 6 6 6 6 7 2
13 4 4 3 3 4 3
14 1 1 3 1 2 4
15 7 7 5 5 4 2
16 5 5 4 5 5 3
17 2 3 1 2 3 4
18 4 4 3 3 3 3
19 7 5 5 7 5 5
20 3 2 2 3 3 3
SPSS Variable View of the Data of Table
Codebook Excerpt

Column  Variable  Variable    Question  Coding
Number  Number    Name        Number    Instructions
1       1         ID                    1 to 20 as coded
2       2         Preference  1         Input the number circled.
                                        1 = Weak Preference
                                        7 = Strong Preference
3       3         Quality     2         Input the number circled.
                                        1 = Poor
                                        7 = Excellent
4       4         Quantity    3         Input the number circled.
                                        1 = Poor
                                        7 = Excellent
5       5         Value       4         Input the number circled.
                                        1 = Poor
                                        7 = Excellent
6       6         Service     5         Input the number circled.
                                        1 = Poor
                                        7 = Excellent
Codebook Excerpt (Cont.)

Column  Variable  Variable    Question  Coding
Number  Number    Name        Number    Instructions
7       7         Income      6         Input the number circled.
                                        1 = Less than $20,000
                                        2 = $20,000 to $34,999
                                        3 = $35,000 to $49,999
                                        4 = $50,000 to $74,999
                                        5 = $75,000 to $99,999
                                        6 = $100,000 or more
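
A codebook of this kind maps naturally onto a small data structure. The sketch below is one possible representation, not the format used by SPSS or any other package: the dictionary layout and the names CODEBOOK and decode_record are assumptions made for illustration, while the column order and value labels follow the excerpt above.

```python
# Value labels taken from the codebook excerpt.
SCALE_1_TO_7 = {1: "Poor", 7: "Excellent"}
INCOME_LABELS = {
    1: "Less than $20,000",
    2: "$20,000 to $34,999",
    3: "$35,000 to $49,999",
    4: "$50,000 to $74,999",
    5: "$75,000 to $99,999",
    6: "$100,000 or more",
}

# One entry per column, in column order (hypothetical layout).
CODEBOOK = [
    {"column": 1, "name": "ID", "question": None, "labels": {}},
    {"column": 2, "name": "Preference", "question": 1,
     "labels": {1: "Weak Preference", 7: "Strong Preference"}},
    {"column": 3, "name": "Quality", "question": 2, "labels": SCALE_1_TO_7},
    {"column": 4, "name": "Quantity", "question": 3, "labels": SCALE_1_TO_7},
    {"column": 5, "name": "Value", "question": 4, "labels": SCALE_1_TO_7},
    {"column": 6, "name": "Service", "question": 5, "labels": SCALE_1_TO_7},
    {"column": 7, "name": "Income", "question": 6, "labels": INCOME_LABELS},
]


def decode_record(line):
    """Turn one whitespace-separated data record into a name -> code mapping."""
    values = [int(token) for token in line.split()]
    if len(values) != len(CODEBOOK):
        raise ValueError("record does not match the codebook layout")
    return {field["name"]: value for field, value in zip(CODEBOOK, values)}


# First record of the restaurant preference data.
record = decode_record("1 2 2 3 1 3 6")
income_label = INCOME_LABELS[record["Income"]]
print(record)
print(income_label)
```

Keeping the labels next to the column definitions means the same structure can drive both data entry checks and the labelling of output tables.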
Data Cleaning

Data cleaning includes consistency checks and
treatment of missing responses.
Consistency Checks

Consistency checks identify data that are out of
range, logically inconsistent, or have extreme
values.

• Extreme values should be closely examined.
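
The out-of-range part of a consistency check can be sketched in a few lines. The valid ranges below come from the codebook (1 to 7 for the rating items, 1 to 6 for income); the names VALID_RANGES and out_of_range and the "bad" record are hypothetical and only illustrate the idea. Logical inconsistencies and extreme values would need separate, study-specific rules.

```python
# Valid code ranges per variable, as stated in the codebook.
VALID_RANGES = {
    "Preference": (1, 7),
    "Quality": (1, 7),
    "Quantity": (1, 7),
    "Value": (1, 7),
    "Service": (1, 7),
    "Income": (1, 6),
}


def out_of_range(record):
    """Return the names of variables whose code lies outside its valid range."""
    return [
        name
        for name, (low, high) in VALID_RANGES.items()
        if not low <= record[name] <= high
    ]


# Respondent 1 from the restaurant preference data.
good = {"Preference": 2, "Quality": 2, "Quantity": 3,
        "Value": 1, "Service": 3, "Income": 6}
# Hypothetical keying errors: a 9 on a 7-point scale and a 0 for income.
bad = dict(good, Quality=9, Income=0)

print(out_of_range(good))  # []
print(out_of_range(bad))   # ['Quality', 'Income']
```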


Missing Responses

• Missing responses represent values of a variable
that are unknown, either because respondents
provided ambiguous answers or because their answers
were not properly recorded.
Treatment of Missing Responses

• Substitute a Neutral Value – A neutral value,
typically the mean response to the variable, is
substituted for the missing responses.
• In casewise deletion, cases, or respondents, with any
missing responses are discarded from the analysis.
• In pairwise deletion, instead of discarding all cases
with any missing values, the researcher uses only the
cases or respondents with complete responses for each
calculation.
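
The three treatments above differ in how many cases survive. The sketch below applies each one to a small made-up data set in which None marks a missing response; the ratings and the function names are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical ratings from five respondents; None marks a missing response.
data = {
    "quality": [2, 5, None, 2, 6],
    "value": [1, 5, 4, None, 5],
    "service": [3, 7, 5, 2, 4],
}


def substitute_mean(values):
    """Replace each missing response with the mean of the observed responses."""
    observed = [v for v in values if v is not None]
    neutral = mean(observed)
    return [neutral if v is None else v for v in values]


def casewise_delete(table):
    """Keep only respondents with no missing response on any variable."""
    n_cases = len(next(iter(table.values())))
    keep = [i for i in range(n_cases)
            if all(column[i] is not None for column in table.values())]
    return {name: [column[i] for i in keep] for name, column in table.items()}


def pairwise_complete(table, a, b):
    """Keep the respondents with complete responses on the two variables used."""
    return [(x, y) for x, y in zip(table[a], table[b])
            if x is not None and y is not None]


print(substitute_mean(data["quality"]))                         # mean of 2, 5, 2, 6 fills the gap
print(len(casewise_delete(data)["service"]))                    # respondents left for every analysis
print(len(pairwise_complete(data, "quality", "service")))       # respondents left for this pair only
```

In this example casewise deletion keeps three of five respondents, while pairwise deletion keeps four for the quality and service pair, which is why pairwise deletion can give different sample sizes for different calculations.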
Statistically Adjusting the Data: Weighting

• In weighting, each case or respondent in the
database is assigned a weight to reflect its
importance relative to other cases or
respondents.

• Weighting is most widely used to make the
sample data more representative of a target
population on specific characteristics.

• Yet another use of weighting is to adjust the
sample so that greater importance is attached
to respondents with certain characteristics.
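
One common way to make a sample more representative is to weight each group by the ratio of its population share to its sample share. The slides do not prescribe a formula, so the sketch below is an assumption about how such weights might be computed; the group names and shares are invented.

```python
from collections import Counter


def representativeness_weights(sample_groups, population_share):
    """Weight for each group = population share / sample share."""
    n = len(sample_groups)
    sample_share = {g: count / n for g, count in Counter(sample_groups).items()}
    return {g: population_share[g] / share for g, share in sample_share.items()}


# Hypothetical sample: 2 low-income and 8 high-income respondents,
# drawn from a population that is split evenly between the two groups.
sample_groups = ["low"] * 2 + ["high"] * 8
population_share = {"low": 0.5, "high": 0.5}

weights = representativeness_weights(sample_groups, population_share)
weighted_n = sum(weights[g] for g in sample_groups)

print(weights)     # under-represented group gets a weight above 1
print(weighted_n)  # weights of this form keep the total sample size unchanged
```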
Selecting a Data Analysis Strategy

Earlier Steps (1, 2, & 3) of the Marketing Research Process
Known Characteristics of the Data
Properties of Statistical Techniques
Background and Philosophy of the Researcher

Data Analysis Strategy
Univariate Techniques

• Statistical techniques in which the analysis is
based on only one variable at a time
A Classification of Univariate Techniques

Univariate Techniques
  Metric Data (Interval, Ratio)
    One Sample
      * t test
      * Z test
    Two or More Samples
      Independent
        * Two-Group t test
        * Z test
        * One-Way ANOVA
      Related
        * Paired t test
  Non-Metric Data (Nominal, Ordinal)
    One Sample
      * Frequency
      * Chi-Square
    Two or More Samples
      Independent
        * Chi-Square
        * Mann-Whitney
        * Median
      Related
        * Sign
        * Wilcoxon
        * McNemar
        * Chi-Square
Multivariate Techniques

• Statistical techniques suitable for analyzing data
when there are two or more measurements on each
element and the variables are analyzed simultaneously

• Multivariate techniques are concerned with the
simultaneous relationships among two or more
phenomena
A Classification of Multivariate Techniques

Multivariate Techniques
  Dependence Technique
    One Dependent Variable
      * Cross-Tabulation
      * Analysis of Variance and Covariance
      * Multiple Regression
      * 2-Group Discriminant/Logit
      * Conjoint Analysis
    More Than One Dependent Variable
      * Multivariate Analysis of Variance
      * Canonical Correlation
      * Multiple Discriminant Analysis
      * Structural Equation Modeling and Path Analysis
  Interdependence Technique
    Variable Interdependence
      * Factor Analysis
      * Confirmatory Factor Analysis
    Interobject Similarity
      * Cluster Analysis
      * Multidimensional Scaling
Frequency Distribution and Hypothesis Testing

Frequency Distribution

• In a frequency distribution, one variable is
considered at a time.

• A frequency distribution for a variable produces a
table of frequency counts, percentages, and
cumulative percentages for all the values
associated with that variable.
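
As a minimal sketch of such a table, the code below builds counts, percentages, and cumulative percentages for the Preference column of the restaurant preference data shown earlier. The function name frequency_distribution is an assumption; statistical packages produce the same table with their own commands.

```python
from collections import Counter

# Preference codes for respondents 1 to 20 in the restaurant preference data.
preference = [2, 6, 4, 1, 7, 5, 2, 3, 7, 2, 2, 6, 4, 1, 7, 5, 2, 4, 7, 3]


def frequency_distribution(values):
    """Return (value, count, percent, cumulative percent) rows in value order."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / n
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows


rows = frequency_distribution(preference)
print("Value  Count  Percent  Cumulative")
for value, count, percent, cumulative in rows:
    print(f"{value:>5}  {count:>5}  {percent:>7.1f}  {cumulative:>10.1f}")
```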
Frequency Histogram

[Figure: histogram with Frequency (0 to 8) on the vertical axis and Familiarity (2 to 7) on the horizontal axis]
Steps Involved in Hypothesis Testing

Step 1: Formulate H0 and H1
Step 2: Select Appropriate Test
Step 3: Choose Level of Significance, α
Step 4: Collect Data and Calculate Test Statistic
Step 5: Determine Probability Associated with Test Statistic, or Determine Critical Value of Test Statistic (TSCR)
Step 6: Compare Probability with Level of Significance, α, or Determine if TSCAL Falls into (Non) Rejection Region
Step 7: Reject or Do Not Reject H0
Step 8: Draw Marketing Research Conclusion
A General Procedure for Hypothesis Testing
Step 1: Formulate the Hypothesis

• A null hypothesis is a statement of the status quo,
one of no difference or no effect. If the null
hypothesis is not rejected, no changes will be made.

• An alternative hypothesis is one in which some
difference or effect is expected. Accepting the
alternative hypothesis will lead to changes in
opinions or actions.
A General Procedure for Hypothesis Testing
Step 1: Formulate the Hypothesis

• In marketing research, the null hypothesis is
formulated in such a way that its rejection
leads to the acceptance of the desired
conclusion. The alternative hypothesis
represents the conclusion for which evidence
is sought.

H0: π ≤ 0.40
H1: π > 0.40
A General Procedure for Hypothesis Testing
Step 2: Select an Appropriate Test

• The test statistic measures how close the
sample has come to the null hypothesis.
• The test statistic often follows a well-known
distribution, such as the normal, t, or
chi-square distribution.
A General Procedure for Hypothesis Testing
Step 3: Choose a Level of Significance

Type I Error

• Type I error occurs when the sample results
lead to the rejection of the null hypothesis when
it is in fact true.
• The probability of type I error (α) is also called
the level of significance.

• Null Hypothesis: Person is not guilty of the crime

• Person is judged as guilty when the person
actually did not commit the crime
(convicting an innocent person)
A General Procedure for Hypothesis Testing
Step 3: Choose a Level of Significance

Type II Error
• Type II error occurs when, based on the sample
results, the null hypothesis is not rejected when it
is in fact false.

• The probability of type II error is denoted by β.

• Null Hypothesis: Person is not guilty of the crime

• Person is judged not guilty when they actually did
commit the crime (letting a guilty person go free)

A General Procedure for Hypothesis Testing
Step 3: Choose a Level of Significance

Power of a Test
• The power of a test is the probability (1 - β) of
rejecting the null hypothesis when it is false and
should be rejected.
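
To make the link between α, β, and power concrete, the sketch below computes the power of an upper-tailed Z test for a proportion under the normal approximation. The null value 0.40 echoes the hypotheses stated earlier; the sample size, the alternative value 0.45, and the function name are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_proportion(pi0, pi1, n, alpha):
    """Power of the Z test of H0: pi <= pi0 against H1: pi > pi0 when pi = pi1."""
    z = NormalDist()
    se_null = sqrt(pi0 * (1 - pi0) / n)  # standard error if H0 is true
    se_alt = sqrt(pi1 * (1 - pi1) / n)   # standard error if the true value is pi1
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    p_critical = pi0 + z.inv_cdf(1 - alpha) * se_null
    # Probability of exceeding that cut-off when the true proportion is pi1.
    return 1 - z.cdf((p_critical - pi1) / se_alt)


power = power_upper_tailed_proportion(pi0=0.40, pi1=0.45, n=30, alpha=0.05)
beta = 1 - power  # probability of a type II error at pi1 = 0.45

print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true proportion equals the null value, the rejection probability returned by this function equals α, and it rises as the true proportion moves further above 0.40 or as the sample size grows.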



A General Procedure for Hypothesis Testing
Step 4: Collect Data and Calculate Test Statistic

• The required data are collected and the value
of the test statistic computed.
A General Procedure for Hypothesis Testing
Step 5: Determine the Probability (Critical Value)

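• The probability associated with the test statistic,
or the critical value of the test statistic, is
determined using standard normal tables.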
A General Procedure for Hypothesis Testing
Steps 6 & 7: Compare the Probability (Critical Value) and Make the Decision

• If the probability associated with the calculated or observed
value of the test statistic (TSCAL) is less than the level of
significance (α), the null hypothesis is rejected.

• Alternatively, if the absolute calculated value of the test
statistic (|TSCAL|) is greater than the absolute critical value of
the test statistic (|TSCR|), the null hypothesis is rejected.
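
The two decision rules can be seen side by side in a short sketch. It tests H0: π ≤ 0.40 against H1: π > 0.40 with a Z test; the sample of 17 successes out of 30 respondents and the function name are hypothetical, and the standard library's NormalDist stands in for the standard normal tables. Because the test is upper-tailed, TSCAL is compared with TSCR directly rather than in absolute value.

```python
from math import sqrt
from statistics import NormalDist


def upper_tailed_proportion_test(successes, n, pi0, alpha):
    """Z test of H0: pi <= pi0 against H1: pi > pi0, reporting both decision rules."""
    z = NormalDist()
    p_hat = successes / n
    ts_cal = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)  # calculated test statistic
    p_value = 1 - z.cdf(ts_cal)                         # probability approach
    ts_cr = z.inv_cdf(1 - alpha)                        # critical value approach
    return {
        "TSCAL": ts_cal,
        "p_value": p_value,
        "TSCR": ts_cr,
        "reject_by_probability": p_value < alpha,
        "reject_by_critical_value": ts_cal > ts_cr,
    }


# Hypothetical sample: 17 of 30 respondents show the behaviour of interest.
result = upper_tailed_proportion_test(successes=17, n=30, pi0=0.40, alpha=0.05)
for key, value in result.items():
    print(key, value)
```

Both rules are equivalent for a given test, so the two decision flags always agree; reporting the probability simply shows how strong the evidence against H0 is.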
A General Procedure for Hypothesis Testing
Step 8: Marketing Research Conclusion

• The conclusion reached by hypothesis testing
must be expressed in terms of the marketing
research problem.
A Broad Classification of Hypothesis Tests

Hypothesis Tests
  Tests of Association
  Tests of Differences
    * Distributions
    * Means
    * Proportions
    * Median/Rankings
Hypothesis Testing Related to Differences

• Parametric tests assume that the variables of interest are measured on
at least an interval scale.

• Nonparametric tests assume that the variables are measured on a
nominal or ordinal scale.
A Classification of Hypothesis Testing Procedures for Examining Differences

Hypothesis Tests
  Parametric Tests (Metric Tests)
    One Sample
      * t test
      * Z test
    Two or More Samples
      Independent Samples
        * Two-Group t test
        * Z test
      Paired Samples
        * Paired t test
  Non-parametric Tests (Nonmetric Tests)
    One Sample
      * Chi-Square
      * K-S
      * Runs
      * Binomial
    Two or More Samples
      Independent Samples
        * Chi-Square
        * Mann-Whitney
        * Median
        * K-S
      Paired Samples
        * Sign
        * Wilcoxon
        * McNemar
        * Chi-Square
