📘 Chapter 1: Research Methodology — An Introduction
🔍 1.1 Meaning of Research
Research is a systematic and scientific inquiry into a problem to gain new knowledge.
It moves from the known to the unknown.
Definitions:
o Redman & Mory: “Systematized effort to gain new knowledge.”
o Clifford Woody: Defining & redefining problems, collecting & analyzing data,
drawing conclusions.
🎯 1.2 Objectives of Research
1. Exploratory: Gaining familiarity with a phenomenon.
2. Descriptive: Accurately portraying characteristics.
3. Diagnostic: Determining frequency of occurrence or association.
4. Hypothesis-Testing: Testing a causal relationship.
🧠 1.3 Motivation in Research
Common motives: Academic degrees, problem-solving, intellectual joy, societal
contribution, professional recognition.
🔄 1.4 Types of Research
Descriptive: Observes and records present phenomena
Analytical: Uses available information for critical evaluation
Applied: Solves practical problems
Fundamental: Develops theories/generalizations
Quantitative: Deals with measurable variables
Qualitative: Deals with subjective assessment
Conceptual: Based on theories or concepts
Empirical: Based on observation/experimentation
🧰 1.5 Research Approaches
Quantitative: Inferential, experimental, simulation.
Qualitative: Opinions, attitudes, and behavior insights.
📈 1.6 Significance of Research
Policy making: Helps governments in planning.
Business: Aids in decision-making, marketing, forecasting.
Social sciences: Understands and solves social problems.
🔧 1.7 Research Methods vs. Methodology
Research Methods: Techniques/tools used for data collection (e.g., surveys,
experiments).
Methodology: Underlying logic, explanation of chosen methods and their rationale.
🧪 1.8 Research & Scientific Method
Based on empirical evidence, objectivity, reliability, and logical reasoning.
Scientific method ensures repeatability and generalization.
📚 1.9 Importance of Knowing How Research is Done
Enables critical evaluation.
Builds logical and systematic thinking.
Helps in decision-making and application.
🔁 1.10 Research Process (Steps)
1. Define the problem
2. Literature review
3. Formulate hypothesis
4. Design research
5. Select sample
6. Collect data
7. Analyze data
8. Test hypothesis
9. Interpret & generalize
10. Write report
✅ 1.11 Criteria of Good Research
Clearly defined objectives.
Systematic, logical, unbiased.
Proper analysis and valid conclusions.
Transparent reporting.
📘 Chapter 2: Defining the Research Problem
🔍 2.1 What is a Research Problem?
A research problem refers to an issue or concern that needs a solution.
It's the foundation of any research—if defined incorrectly, the whole study may go off-track.
It involves selecting an area where a gap in knowledge exists and framing it in a way that is
researchable.
🧭 2.2 Selecting the Problem
Choosing the right problem is crucial and depends on:
1. Researcher’s interest
2. Knowledge and expertise
3. Availability of data
4. Time and resources
5. Feasibility of conducting the study
💡 Tip: A good problem should be relevant, novel, researchable, and clearly defined.
⚙️ 2.3 Necessity of Defining the Problem
Why you must define the problem carefully:
Ensures proper direction of the research.
Helps determine the research design, sampling methods, and data analysis techniques.
Prevents ambiguity and confusion.
🛠️ 2.4 Technique Involved in Defining a Problem
📌 Steps:
1. Understand the problem thoroughly
o Discuss with peers or experts
o Gather background information
2. Survey literature
o Helps identify gaps and avoid duplication
3. Develop theoretical framework
o Use existing theories to support the problem
4. Rephrase into operational terms
o Make it specific, measurable, and clear
🧪 2.5 An Illustration
Example: Studying the impact of social media on college students’ academic performance.
Broad Topic: Impact of social media
Narrowed Research Problem: How does daily use of Instagram affect college students’ GPA in
their final year?
Notice:
It specifies the platform (Instagram)
The target group (college students)
The parameter to measure (GPA)
✅ 2.6 Conclusion
The success of research lies in defining the problem clearly.
It sets the direction for the entire research process.
🧠 Quick Recap
Section | Key Points
What is a research problem? | A question needing a solution, based on knowledge gaps
How to select it? | Based on interest, feasibility, resources, and scope
Why define it? | Provides clarity, direction, and structure
How to define it? | Understand → Review Literature → Reframe Clearly
Example | Social media use → academic performance link
📘 Chapter 3: Research Design
🧾 3.1 Meaning of Research Design
A research design is a blueprint or detailed plan for how the research will be conducted.
It lays out the methods and procedures for collecting, measuring, and analyzing data.
It ensures that the study is efficient, logical, and aligned with research objectives.
🧩 It answers: How will the research be done? What data? What methods? What tools?
❓ 3.2 Need for Research Design
Provides clarity and structure to your work.
Prevents wastage of time, money, and effort.
Ensures valid and reliable results.
Helps avoid biases and ensures objectivity.
⭐ 3.3 Features of a Good Design
A good research design should:
1. Be flexible (especially for exploratory research)
2. Ensure minimum bias and maximum reliability
3. Have a clear structure and purpose
4. Be efficient in terms of resources (time, cost, effort)
🧠 3.4 Important Concepts Relating to Research Design
Concept | Meaning
Dependent Variable | The outcome being studied (effect)
Independent Variable | The factor that causes the change (cause)
Extraneous Variable | Variables that may affect the outcome but are not the main focus
Control | Techniques to minimize effects of extraneous variables
Confounded Relationship | When the effect of the independent variable is mixed with extraneous ones
Research Hypothesis | Prediction of a relationship between variables
Experimental & Non-Experimental | Whether variables are manipulated or only observed
Treatment | Different conditions applied to experimental units
🧪 3.5 Different Research Designs
🔍 A. Exploratory Research Design
Objective: Explore unknown problems, gain insights.
Flexible & informal.
Methods: Literature survey, expert interviews, focus groups.
📊 B. Descriptive Research Design
Objective: Describe characteristics/facts accurately.
Structured and pre-planned.
Methods: Surveys, observation, case studies.
🧪 C. Experimental Research Design
Objective: Establish causal relationships.
Involves control and manipulation.
Ideal for hypothesis testing.
📏 3.6 Basic Principles of Experimental Design
According to R.A. Fisher, three key principles:
1. Replication: Repeating treatment to estimate variability.
2. Randomization: Random assignment of subjects to treatments.
3. Local Control: Grouping similar experimental units together to reduce variation.
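As a rough illustration of randomization and replication, the sketch below assigns experimental units to treatments at random so that every treatment is repeated the same number of times. The function name `randomize_crd`, the plot labels, and the fixed seed are assumptions made for this example and do not come from the source.

```python
import random

def randomize_crd(units, treatments, seed=None):
    """Randomly assign units to treatments in (near-)equal numbers."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)          # randomization: order is left to chance
    assignment = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # replication: dealing units in turn repeats every treatment
        assignment[treatments[i % len(treatments)]].append(unit)
    return assignment

plots = [f"plot{i}" for i in range(1, 13)]   # 12 hypothetical units
design = randomize_crd(plots, ["A", "B", "C"], seed=42)
for treatment, assigned in design.items():
    print(treatment, sorted(assigned))
```

Local control would add one more step: group similar units into blocks first, then run the same random assignment inside each block.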
📋 3.7 Common Types of Experimental Designs
Design Type | Description
CRD (Completely Randomized Design) | Treatments assigned randomly; suitable for homogeneous units
RBD (Randomized Block Design) | Blocks of similar units; more accurate comparisons
Latin Square Design | Controls for two extraneous variables; treatments arranged in rows and columns
Factorial Design | Examines effects of two or more variables simultaneously
📝 Appendix – Developing a Research Plan
A solid research plan includes:
Statement of the problem
Objectives
Review of literature
Hypotheses
Variables involved
Sample design
Data collection methods
Analysis methods
Time & budget estimates
🔁 Summary Table
Section | Key Ideas
Meaning | Blueprint for conducting research
Need | Ensures clarity, validity, and saves resources
Good Design | Flexible, objective, reliable, efficient
Key Concepts | Variables, hypothesis, control, treatment
Types | Exploratory, Descriptive, Experimental
Experimental Principles | Replication, Randomization, Local Control
📘 Chapter 4: Sampling Design
🔍 4.1 Census vs. Sample Survey
Basis | Census | Sample Survey
Coverage | Entire population | A portion (sample)
Accuracy | High (if done correctly) | Dependent on sample & method
Cost & Time | High | Lower
Practicality | Less practical | Highly practical for large populations
⚙️ 4.2 Implications of a Sample Design
A sample design is the plan for selecting a sample from the population. It determines:
How many items to include
Which items to include
How they are selected (randomly, purposely, etc.)
📋 4.3 Steps in Sampling Design
1. Define the population
2. Specify the sampling frame (list of all elements in the population)
3. Determine the sampling unit (e.g., individuals, schools, regions)
4. Select sampling method (probability or non-probability)
5. Determine sample size
6. Execute the sampling process
🎯 4.4 Criteria of Selecting a Sampling Procedure
A good sampling design should:
Give a small sampling error
Be economical
Be practical
Be systematic and objective
Be sufficiently representative
✅ 4.5 Characteristics of a Good Sample Design
Representative of the population
Reliable (consistent results)
Flexible (can be adjusted)
Efficient (accurate with minimum cost)
🧪 4.6 Types of Sampling Designs
🧮 A. Probability Sampling (each unit has a known, non-zero chance of being selected)
Method | Description
Simple Random Sampling | Every unit has an equal chance (e.g., lottery)
Systematic Sampling | Select every k-th unit after a random start
Stratified Sampling | Population divided into strata, samples drawn from each
Cluster Sampling | Population divided into clusters; clusters randomly chosen
Multistage Sampling | Sampling in stages (e.g., states → districts → villages)
Area Sampling | Geographical-based cluster sampling
🧠 B. Non-Probability Sampling (selection based on researcher's judgment or convenience)
Method | Description
Convenience Sampling | Units chosen based on ease of access
Judgment Sampling | Researcher selects "representative" units
Quota Sampling | Like stratified, but selection within strata is non-random
🎲 4.7 How to Select a Random Sample?
Lottery method
Random number tables
Computer-generated random numbers
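The computer-generated route can be sketched with Python's standard `random` module. The three helpers below illustrate simple random, systematic, and stratified selection as described in section 4.6. The function names, the population of 100 numbered units, the urban/rural strata, and the seed are all assumptions chosen for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Every unit has the same chance of selection (without replacement)."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick every k-th unit after a random start within the first k units."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum."""
    return {
        name: rng.sample(units, max(1, round(len(units) * fraction)))
        for name, units in strata.items()
    }

rng = random.Random(7)                    # seeded only for repeatability
population = list(range(1, 101))          # hypothetical sampling frame
srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
strata = {"urban": list(range(1, 61)), "rural": list(range(61, 101))}
stratified = stratified_sample(strata, 0.10, rng)
print(srs, systematic, stratified, sep="\n")
```

Drawing the same fraction from each stratum is only one possible allocation rule; other allocations would change `stratified_sample` but not the overall idea.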
🌌 4.8 Random Sampling from an Infinite Universe
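In an infinite universe the elements cannot be listed, so a sample counts as random when each observation is drawn independently and under the same probabilities.
Typical cases: repeated throws of a die, or items coming off a continuous production line.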
🔁 4.9 Complex Random Sampling Designs
These are combinations of multiple sampling techniques, used to:
Improve efficiency
Reduce cost
Control for multiple variables
📌 4.10 Conclusion
Sampling is essential when a census is impractical.
The right sampling design enhances accuracy, cost-effectiveness, and reliability.
The choice of sampling technique depends on the nature of the research, resources, and
population structure.
📊 Summary Table
Topic | Key Points
Census vs. Sample | Census is full coverage; sample is a subset
Sample Design | Blueprint for selecting a sample
Good Design | Representative, economical, reliable
Probability Sampling | Random, systematic, stratified, cluster
Non-Probability Sampling | Convenience, judgment, quota
Complex Designs | Mix of techniques for better results
📘 Chapter 5: Measurement and Scaling Techniques
📏 5.1 Measurement in Research
Measurement is the process of assigning numbers or symbols to characteristics according to a
set of rules.
In research, especially in social sciences, we often measure non-physical attributes like
satisfaction, attitude, awareness, etc.
🔍 Example: Assigning a score of “5” to someone who is "highly satisfied" in a customer satisfaction
survey.
🎚️ 5.2 Measurement Scales
There are four basic types of measurement scales:
Scale Type | Description | Example
Nominal | Categories without order | Gender (Male/Female), Religion
Ordinal | Categories with order | Class rank (1st, 2nd, 3rd)
Interval | Equal intervals, no true zero | Temperature (°C, °F)
Ratio | Equal intervals with true zero | Income, Height, Age
⚠️ 5.3 Sources of Error in Measurement
Common issues that lead to unreliable or invalid results:
1. Respondent errors – lies, misunderstandings
2. Instrument errors – faulty scales or questions
3. Situational factors – environment, timing
4. Data processing errors – incorrect coding or entry
✅ 5.4 Tests of Sound Measurement
To ensure measurement quality, it should be:
Test | Meaning
Validity | Are we measuring what we intend to?
Reliability | Are the results consistent?
Practicability | Is it economically feasible and easy to administer?
🔨 5.5 Technique of Developing Measurement Tools
Steps involved:
1. Conceptualization – Define what you want to measure
2. Operationalization – Decide how to measure it
3. Design the instrument – e.g., questionnaire, checklist
4. Test and refine – Pre-test/pilot study
📐 5.6 Scaling
Scaling is assigning numbers or symbols to measure the intensity of a concept (e.g.,
satisfaction).
It helps translate qualitative responses into quantitative data.
🧱 5.7 Scale Classification Bases
Basis | Types
Subject orientation | Rating vs. ranking scales
Data comparability | Comparative vs. non-comparative
Response form | Single vs. multiple items
Scale properties | Nominal, ordinal, interval, ratio
📊 5.8 Important Scaling Techniques
1. Rating Scales
Respondent assigns a value to an object.
Types:
o Graphic rating scale (mark on a line)
o Itemized rating scale (Likert, semantic differential)
2. Ranking Scales
Respondent compares items and ranks them in order.
3. Likert Scale
Measures level of agreement (Strongly agree → Strongly disagree)
Usually a 5- or 7-point scale
4. Semantic Differential Scale
Bipolar adjectives on either side (e.g., Happy — Sad)
Used to measure connotative meaning
5. Thurstone Scale
Judges rate statements; used for attitude measurement
6. Guttman Scale
Cumulative scale; if you agree with one item, you agree with the previous ones.
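To show how a Likert scale turns verbal answers into numbers, here is a minimal scoring sketch for a 5-point scale. The label wording (in particular the "Neutral" midpoint), the codes 1 to 5, and the sample answers are assumptions for this example.

```python
# Assumed coding for a 5-point agreement scale
LIKERT_5 = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,          # assumed midpoint label
    "Agree": 4,
    "Strongly agree": 5,
}

def likert_total(responses):
    """Summated score: add the numeric codes of one respondent's answers."""
    return sum(LIKERT_5[r] for r in responses)

answers = ["Agree", "Strongly agree", "Neutral", "Agree"]   # one hypothetical respondent
total = likert_total(answers)
average = total / len(answers)
print(total, average)
```

Negatively worded statements would normally be reverse-coded before summing; that step is left out here to keep the sketch short.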
🧪 5.9 Scale Construction Techniques
Steps:
1. Define the concept
2. Generate items/statements
3. Select scaling method (Likert, Guttman, etc.)
4. Pilot test
5. Assess reliability and validity
6. Finalize the scale
🧠 Summary Table
Concept | Description
Measurement | Assigning numbers/symbols to properties
Scales | Levels of measurement (Nominal, Ordinal, etc.)
Errors | Respondent, Instrument, Environment
Validity & Reliability | Measures of quality and consistency
Scaling Techniques | Likert, Semantic Differential, Ranking, etc.
📘 Chapter 6: Methods of Data Collection
📂 6.1 Collection of Primary Data
Primary data is data collected directly by the researcher for the specific purpose of their study.
👀 6.2 Observation Method
The researcher observes behavior, events, or phenomena directly.
Suitable when respondents cannot communicate or when honesty is crucial.
🔍 Types:
1. Structured vs. Unstructured – Predefined vs. open-ended
2. Participant vs. Non-participant – Involvement vs. passive watching
3. Controlled vs. Uncontrolled – Lab vs. natural setting
📌 Example: Studying shopper behavior in a supermarket.
🗣️ 6.3 Interview Method
A direct method involving face-to-face interaction with respondents.
📋 Types:
Type | Description
Personal interview | Formal and structured
Focused interview | Based on specific experiences
Clinical interview | Probes deeper into personal feelings
Non-directive interview | Unstructured and exploratory
🔍 Used in: Psychology, Market research, HR studies
📝 6.4 Collection of Data through Questionnaires
A written list of questions sent to respondents.
Respondents answer without interviewer involvement.
✅ Advantages:
Low cost, covers wide area, no interviewer bias
❌ Disadvantages:
Low response rate, unclear answers, literacy required
🗂️ 6.5 Collection of Data through Schedules
Similar to questionnaires but filled by enumerators, not respondents.
✅ Advantages:
Suitable for illiterate respondents
More accurate and complete
📌 Common in large government or census surveys.
🆚 6.6 Difference between Questionnaires and Schedules
Criteria | Questionnaires | Schedules
Filled by | Respondents | Enumerators
Cost | Low | Higher
Literacy required | Yes | Not necessary
Accuracy | May be low | Generally high
📦 6.7 Other Methods of Data Collection
📞 1. Telephonic Interviews
Fast, flexible, but limited to short questions and willing respondents.
💬 2. Mailed Questionnaires
Sent through post, suitable for educated respondents.
🎓 3. Electronic Methods
Online forms, email surveys — modern, cost-efficient, fast.
🧾 4. Projective Techniques
Used in psychology (e.g., Rorschach inkblots, sentence completion)
🎭 5. Depth Interviews
In-depth probing to uncover hidden motives/feelings.
📚 6.8 Collection of Secondary Data
Secondary data is already existing data collected for other purposes.
📂 Sources:
Published: Books, journals, reports, newspapers
Unpublished: Diaries, company records, internal memos
🔍 Use when primary data collection is not feasible, or when existing data are needed for comparison.
❓ 6.9 Selection of Appropriate Data Collection Method
Depends on:
Nature of the study
Time and budget
Literacy of respondents
Geographical spread
📌 Example: Use schedules in rural areas with low literacy, questionnaires in corporate studies.
🧪 6.10 Case Study Method
An in-depth study of a single unit (person, group, institution).
Provides rich detail, but findings from a single unit cannot readily be generalized.
Common in clinical psychology, management, and sociology.
📎 Appendices
✍️ (i) Guidelines for Constructing Questionnaires/Schedules
Keep it simple, logical, and relevant
Avoid leading questions
Pretest before final use
💬 (ii) Guidelines for Successful Interviewing
Be courteous and professional
Build rapport
Ensure clarity and neutrality
🧪 (iii) Difference between Survey and Experiment
Survey | Experiment
Observes without intervention | Involves manipulation and control
Natural setting | Often lab setting
Descriptive | Causal/explanatory
🧠 Summary Table
Method | Best for | Pros | Cons
Observation | Natural behavior | Real-time, unbiased | Limited scope
Interview | In-depth info | Clarifications possible | Time-consuming
Questionnaire | Large, literate samples | Low cost | Poor response rate
Schedules | Illiterate populations | Accurate | Costly, needs training
Secondary Data | Historical/large-scale trends | Quick, low cost | May be outdated
📘 Chapter 7: Processing and Analysis of Data
🧹 7.1 Processing Operations
Data processing involves converting raw data into a usable form. Key stages include:
Stage | Description
Editing | Correcting errors, ensuring consistency
Coding | Assigning numerical or symbolic codes to responses
Classification | Grouping data into categories
Tabulation | Summarizing data into tables for analysis
Transcription | Transferring data to analysis-ready formats (manual or digital)
🔍 Example: Assigning codes to “Yes” = 1, “No” = 0 in a survey.
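The coding and tabulation stages can be sketched in a few lines. The example below applies the Yes = 1, No = 0 codebook and then counts each code. The raw answers, the choice to treat blank or unexpected answers as missing (`None`), and the function name `code_responses` are assumptions for illustration.

```python
from collections import Counter

CODEBOOK = {"Yes": 1, "No": 0}

def code_responses(raw, codebook=CODEBOOK, missing=None):
    """Tidy up spacing and capitalisation (editing), then map answers to codes (coding)."""
    return [codebook.get(r.strip().capitalize(), missing) for r in raw]

raw = ["Yes", "no", " YES ", "No", "", "maybe"]   # hypothetical raw answers
coded = code_responses(raw)
table = Counter(coded)                            # tabulation: frequency of each code
print(coded)
print(table)
```

Keeping unmatched answers as an explicit missing value, rather than guessing a code, makes incomplete or ambiguous responses visible during editing.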
⚠️ 7.2 Some Problems in Processing
Incomplete data
Inconsistent or contradictory responses
Errors in transcription or entry
Ambiguity in open-ended responses
🛠️ Solution: Rigorous editing and pretesting of data collection tools.
🔎 7.3 Elements/Types of Analysis
Type | Description
Descriptive Analysis | Summarizing data (mean, median, mode, percentages)
Inferential Analysis | Drawing conclusions beyond data (hypothesis testing, confidence intervals)
Causal Analysis | Identifying cause-effect relationships
Predictive Analysis | Forecasting trends or behaviors
📊 7.4 Statistics in Research
Statistics help organize, analyze, and interpret numerical data.
📌 7.5 Measures of Central Tendency
Measure | Description | Formula / When to use
Mean | Arithmetic average | (Σx)/n
Median | Middle value | Useful for skewed data
Mode | Most frequent value | Best for categorical data
📐 7.6 Measures of Dispersion
Measure | Description
Range | Difference between highest & lowest values
Variance | Average squared deviation from mean
Standard Deviation | Square root of variance; indicates spread of data
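The measures of central tendency and dispersion above can be computed with Python's built-in `statistics` module. The eight data values are made up for this example. `pvariance` and `pstdev` divide by n, which matches the "average squared deviation" wording; `statistics.variance` and `statistics.stdev` would divide by n - 1 instead.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations

mean = statistics.mean(data)             # (sum of x) / n
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
data_range = max(data) - min(data)       # highest minus lowest
variance = statistics.pvariance(data)    # average squared deviation from the mean
std_dev = statistics.pstdev(data)        # square root of the variance

print(mean, median, mode, data_range, variance, std_dev)
```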
↔️ 7.7 Measures of Asymmetry (Skewness)
Measures the lack of symmetry in data.
o Positive skew: Tail on the right (mean > median)
o Negative skew: Tail on the left (mean < median)
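The notes do not name a particular skewness coefficient, so the sketch below assumes the moment coefficient (third central moment divided by the 1.5 power of the second). The data sets are invented; the left-tailed set is simply the mirror image of the right-tailed one.

```python
import statistics

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(data)
    m = statistics.fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]      # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]      # mirror image: tail on the left

for name, values in [("right", right_tailed), ("left", left_tailed)]:
    print(name, round(skewness(values), 3),
          "mean:", statistics.fmean(values), "median:", statistics.median(values))
```

For these two data sets the sign of the coefficient agrees with the mean-versus-median rule stated above: positive where the mean exceeds the median, negative where it falls below.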
🔗 7.8 Measures of Relationship
Measure | Use
Correlation Coefficient (r) | Strength & direction of linear relationship
Covariance | Measures how two variables vary together
📌 Values of r range from -1 (perfect negative) to +1 (perfect positive)
📉 7.9 Simple Regression Analysis
Predicts the value of a dependent variable based on one independent variable.
Equation: Y = a + bX
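A small sketch of how r and the regression constants a and b in Y = a + bX could be computed by least squares. The variable names (`hours`, `score`) and the five data pairs are hypothetical; the formulas used are the usual sums of squares and cross-products.

```python
import math

def _sums(x, y):
    """Means plus sums of squares and cross-products around the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy

def pearson_r(x, y):
    """Correlation coefficient r, always between -1 and +1."""
    _, _, sxy, sxx, syy = _sums(x, y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares estimates (a, b) for Y = a + bX."""
    mx, my, sxy, sxx, _ = _sums(x, y)
    b = sxy / sxx
    a = my - b * mx
    return a, b

hours = [1, 2, 3, 4, 5]        # independent variable X (hypothetical)
score = [2, 4, 5, 4, 5]        # dependent variable Y (hypothetical)
r = pearson_r(hours, score)
a, b = fit_line(hours, score)
predicted = a + b * 6          # predicted Y for X = 6
print(round(r, 4), round(a, 2), round(b, 2), round(predicted, 2))
```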
🔁 7.10 Multiple Correlation and Regression
Involves two or more independent variables.
Used when a single factor isn't enough to predict the outcome.
🔄 7.11 Partial Correlation
Measures the relationship between two variables while controlling for others.
🔤 7.12 Association in Case of Attributes
Used when variables are categorical (e.g., gender, education level).
Techniques | Use
Contingency tables | Show frequency distributions
Chi-square test | Test independence between variables
🧮 7.13 Other Measures
Index numbers (e.g., price indices)
Time-series analysis
Ratios and proportions
📋 Appendix: Summary Chart for Data Analysis
Kothari provides a visual chart outlining which statistical tools to use for different types of data
and objectives (available in the book’s appendix).
🧠 Summary Table
Topic | Key Ideas
Data Processing | Editing, coding, classifying, tabulating
Analysis Types | Descriptive, inferential, causal, predictive
Central Tendency | Mean, median, mode
Dispersion | Standard deviation, range
Relationship | Correlation, regression
Attributes | Analyzed using Chi-square and contingency tables
📘 Chapter 8: Sampling Fundamentals
🧩 8.1 Need for Sampling
Sampling is used when:
o Population is large
o Time and cost constraints exist
o Data collection from the whole population is impractical
It allows researchers to make inferences about the population based on the sample.
📚 8.2 Some Fundamental Definitions
Term | Description
Population/Universe | The entire group of interest
Sample | A subset of the population used for study
Statistic | A value derived from the sample (e.g., sample mean)
Parameter | A value that describes a characteristic of the population
Sampling Error | Difference between sample statistic and population parameter
🔁 8.3 Important Sampling Distributions
These are theoretical models showing how a sample statistic behaves:
1. Normal Distribution:
o Bell-shaped, symmetric
o Most sampling distributions approach normality (Central Limit Theorem)
2. Chi-square Distribution:
o Used for variance testing and tests of independence
3. t-distribution:
o Used when sample size is small and population standard deviation is unknown
4. F-distribution:
o Used to compare two variances, e.g., in ANOVA
🧠 8.4 Central Limit Theorem (CLT)
For large samples (n ≥ 30 is a common rule of thumb), the sampling distribution of the mean is approximately normal.
This holds regardless of the shape of the population distribution, provided the population has a finite variance.
📌 CLT justifies the use of parametric tests even with unknown population shapes.
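A quick simulation can make the theorem concrete. The sketch below draws repeated samples of size 30 from a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen purely as an assumption) and looks at how the sample means behave. The number of replicates and the seed are also arbitrary choices.

```python
import random
import statistics

def draw_exponential(rng):
    """One observation from a skewed population with mean 1 and SD 1."""
    return rng.expovariate(1.0)

def sample_means(draw, n, replicates, rng):
    """Mean of each of `replicates` independent samples of size n."""
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(replicates)
    ]

rng = random.Random(0)                      # seeded only for repeatability
means = sample_means(draw_exponential, n=30, replicates=2000, rng=rng)

centre = statistics.fmean(means)            # should sit near the population mean, 1
spread = statistics.stdev(means)            # should sit near 1 / sqrt(30)
print(round(centre, 3), round(spread, 3), round(1 / 30 ** 0.5, 3))
```

A histogram of `means` would look roughly bell-shaped even though individual observations are far from normal, which is the point of the theorem.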
📊 8.5 Sampling Theory
Helps us understand:
o How samples behave
o How to estimate population parameters
o How to calculate error margins and confidence intervals
The goal is to make accurate inferences about a population based on sample data.
🧪 8.6 Sandler’s A-Test
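An alternative to the paired t-test for two related (dependent) samples, typically used when:
o Sample size is small
o Observations come in matched pairs (e.g., before-and-after scores)
A = Σd² / (Σd)², where d is the difference within each pair.
Simpler to compute than the t-test and leads to the same decision, but less commonly used.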
➕ 8.7 Concept of Standard Error (SE)
Standard deviation of a sampling distribution
Measures the variability of a sample statistic from sample to sample
🧮 Formula:
SE of mean = σ / √n
Where:
σ = population standard deviation
n = sample size
📌 A smaller SE means more precise estimates.
📏 8.8 Estimation
Estimation is used to infer population parameters from sample data.
Types:
Type | Description
Point Estimation | Single value (e.g., sample mean)
Interval Estimation | Range within which the parameter lies (confidence interval)
📐 8.9 Estimating the Population Mean (μ)
If σ is known: $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$
If σ is unknown and n is small: $\bar{x} \pm t \frac{s}{\sqrt{n}}$
Where:
$\bar{x}$ = sample mean
s = sample SD
z / t = critical value (from standard normal or t-distribution table)
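A minimal sketch of both interval estimates, assuming the usual form "sample mean ± critical value × standard error". The sample values, the known σ of 2.0, and the function names are invented for illustration. Python's standard library has no t-distribution, so the t critical value (2.776 for 4 degrees of freedom at 95% confidence) is passed in as if read from a table.

```python
import math
from statistics import NormalDist, mean, stdev

def z_interval(sample, sigma, confidence=0.95):
    """Interval for the mean when the population SD (sigma) is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)      # e.g. about 1.96 for 95%
    half_width = z * sigma / math.sqrt(len(sample))     # z times the standard error
    m = mean(sample)
    return m - half_width, m + half_width

def t_interval(sample, t_crit):
    """Interval for the mean when sigma is unknown; t_crit comes from a t table."""
    half_width = t_crit * stdev(sample) / math.sqrt(len(sample))
    m = mean(sample)
    return m - half_width, m + half_width

sample = [48, 52, 50, 49, 51]                           # hypothetical measurements
z_low, z_high = z_interval(sample, sigma=2.0)
t_low, t_high = t_interval(sample, t_crit=2.776)        # 4 df, 95% confidence
print(round(z_low, 3), round(z_high, 3))
print(round(t_low, 3), round(t_high, 3))
```

The t interval is wider here because the critical value is larger than 1.96, reflecting the extra uncertainty of a small sample.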
📊 8.10 Estimating Population Proportion
Formula for confidence interval of proportion:
$\hat{p} \pm z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Where:
$\hat{p}$ = sample proportion
n = sample size
z = critical value (e.g., 1.96 for 95% confidence)
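As an illustration, the sketch below computes a normal-approximation interval for a proportion, assuming the common form "sample proportion ± z × sqrt(p(1 - p)/n)". The counts (40 successes out of 100) and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                                 # sample proportion
    z = NormalDist().inv_cdf(0.5 + confidence / 2)        # about 1.96 for 95%
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_interval(successes=40, n=100)
print(round(low, 3), round(high, 3))
```

This approximation is usually considered reasonable only for fairly large samples; that caveat is general statistical practice rather than something stated in these notes.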
📏 8.11 Sample Size and Its Determination
Key Factors:
1. Desired precision
2. Confidence level
3. Population variability
4. Cost and time constraints
📌 Methods of Sample Size Determination
1. Based on Precision Rate and Confidence Level:
$n = \left( \frac{Z \sigma}{E} \right)^2$
Where:
Z = z-value (from table)
σ = standard deviation
E = desired margin of error
2. Based on Bayesian Statistics:
Uses prior knowledge or belief to update sample size requirements.
Less commonly used in practical research unless prior distributions are known.
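For the first method (precision rate and confidence level), a short sketch of the calculation is given below, assuming the standard form n = (Zσ/E)² rounded up to the next whole unit. The values σ = 15 and E = 3 are made up for the example.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin of error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)       # always round up

n_required = sample_size_for_mean(sigma=15, margin=3)
n_tighter = sample_size_for_mean(sigma=15, margin=1.5)
print(n_required, n_tighter)
```

Halving the margin of error roughly quadruples the required sample, which is why precision is usually traded off against cost and time.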
🧠 Summary Table
Topic | Key Ideas
Sampling Need | Time, cost, practicality
Sampling Terms | Sample, population, SE, error
Distributions | Normal, t, chi-square, F
Central Limit Theorem | Sample means ≈ normal for large n
Estimation | Point and interval; mean and proportion
Sample Size | Based on precision, confidence, variability
📘 Chapter 9: Testing of Hypotheses – I (Parametric Tests)
📌 9.1 What is a Hypothesis?
A hypothesis is a proposed statement about a population parameter.
It can be tested using sample data.
It may be accepted or rejected based on statistical evidence.
🔍 9.2 Basic Concepts in Hypothesis Testing
Term | Description
Null Hypothesis (H₀) | Assumes no effect or no difference
Alternative Hypothesis (H₁) | Assumes a significant effect or difference
Type I Error (α) | Rejecting H₀ when it is true (false positive)
Type II Error (β) | Failing to reject (accepting) H₀ when it is false (false negative)
Level of Significance (α) | Probability of Type I error (commonly 0.05 or 0.01)
Power of Test | Probability of correctly rejecting a false H₀ (1 - β)
Test Statistic | A value computed from the sample and used to decide whether to reject H₀
🔁 9.3 Procedure for Hypothesis Testing
1. State H₀ and H₁
2. Choose a level of significance (α)
3. Select the appropriate test statistic
4. Define the rejection region
5. Compute the test statistic from sample data
6. Compare and conclude (reject or fail to reject H₀)
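The six steps can be traced in code. The sketch below assumes the simplest case, a two-tailed one-sample z-test with known σ; the figures (sample mean 52, hypothesized mean 50, σ = 5, n = 25) and the function name are invented for illustration.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    # Steps 1-2: H0: mu = mu0 versus H1: mu != mu0, at significance level alpha
    # Step 3: sigma is known, so the test statistic is z
    # Step 4: two-tailed rejection region, |z| > z_crit
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: compute the test statistic from the sample data
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Step 6: compare and conclude
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"z": z, "z_crit": z_crit, "p_value": p_value,
            "reject_h0": abs(z) > z_crit}

result = one_sample_z_test(sample_mean=52, mu0=50, sigma=5, n=25)
print(result)
```

The p-value is reported alongside the decision because it shows how close the result is to the chosen significance level.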
🔀 9.4 Flow Diagram of Hypothesis Testing
Start → Set Hypotheses → Choose α → Select Test → Calculate Statistic → Compare →
Conclude
🔋 9.5 Measuring the Power of a Test
Power = 1 - β
High power means the test is likely to detect a real effect.
🔍 9.6 Important Parametric Tests
These tests assume that:
The population follows a normal distribution
The variables are measurable
Interval or ratio scale data is used
📊 9.7 Hypothesis Testing of Means
z-test (σ known, or large sample): $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
t-test (σ unknown, small sample): $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom
Where $\mu_0$ is the hypothesized population mean.
🔁 9.8 Testing Differences Between Means
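$t = \frac{\bar{x}_1 - \bar{x}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, with $n_1 + n_2 - 2$ degrees of freedom
Where $S_p$ is the pooled standard deviation.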
🔁 9.9 Paired Sample t-test
Used when samples are related/dependent (e.g., before-and-after studies)
$t = \frac{\bar{d}}{s_d / \sqrt{n}}$, with $n - 1$ degrees of freedom (n = number of pairs)
Where:
$\bar{d}$ = mean of the differences
$s_d$ = standard deviation of differences
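The two-sample statistics from sections 9.8 and 9.9 can be sketched as follows, assuming the usual pooled-variance and paired-difference forms of the t statistic. The data and function names are hypothetical. Each function returns the statistic and its degrees of freedom, which would then be compared with a t table.

```python
import math
from statistics import mean, stdev, variance

def pooled_t(x, y):
    """Independent samples: t = (mean_x - mean_y) / (Sp * sqrt(1/n1 + 1/n2))."""
    n1, n2 = len(x), len(y)
    # Pooled SD: weighted average of the two sample variances
    sp = math.sqrt(((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2))
    t = (mean(x) - mean(y)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def paired_t(before, after):
    """Related samples: t = mean(d) / (sd(d) / sqrt(n)) on the pairwise differences."""
    d = [a - b for a, b in zip(after, before)]
    t = mean(d) / (stdev(d) / math.sqrt(len(d)))
    return t, len(d) - 1

t_ind, df_ind = pooled_t([10, 12, 14], [7, 9, 11])
before = [72, 75, 68, 80, 77]       # hypothetical scores before training
after = [75, 78, 70, 85, 79]        # the same five people afterwards
t_pair, df_pair = paired_t(before, after)
print(round(t_ind, 3), df_ind, round(t_pair, 3), df_pair)
```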
🔢 9.10 Hypothesis Testing of Proportions
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
Where $\hat{p}$ is the sample proportion and $p_0$ is the hypothesized population proportion.
🧮 9.11 Hypothesis Testing for Variance
$F = \frac{s_1^2}{s_2^2}$ (larger variance in numerator), where $s_1^2$ and $s_2^2$ are the two sample variances
🔗 9.12 Testing Correlation Coefficients
Test whether correlation (r) is significantly different from 0
$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$, with $n - 2$ degrees of freedom
⚠️ 9.13 Limitations of Hypothesis Testing
Can be misleading if:
o Assumptions (e.g., normality) are violated
o Poor sample quality
o Misinterpretation of p-values
Doesn’t measure effect size, only statistical significance
🧠 Summary Table
Test | Use
z-test | Large samples, known σ
t-test | Small samples, unknown σ
Paired t-test | Dependent samples
Chi-square | Variance test, independence
F-test | Comparing variances
Proportion test | Single or two-sample proportion
Correlation test | Significance of r
📘 Chapter 10: Chi-square Test
🔍 10.1 Chi-square as a Test for Comparing Variance
The Chi-square (χ²) test is used:
o To test whether a sample variance differs from a hypothesized population variance
o In goodness of fit and independence of attributes
🧮 Formula (for testing variance):
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
Where:
$s^2$ = sample variance
$\sigma^2$ = population variance
$n$ = sample size
📌 If calculated χ² > critical χ² → reject null hypothesis
📦 10.2 Chi-square as a Non-parametric Test
Does not require assumptions about population parameters or distribution.
Used for testing:
1. Goodness of fit (how well an observed distribution matches an expected one)
2. Independence in contingency tables (e.g., gender vs. preference)
📐 10.3 Conditions for the Application of χ² Test
1. Data should be in frequencies, not percentages
2. Categories must be mutually exclusive
3. Expected frequency (E) should be ≥ 5 in each cell
4. Observations must be independent
5. A sufficiently large sample size is recommended
🧮 10.4 Steps Involved in Applying Chi-square Test
1. State hypotheses (H₀: observed = expected)
2. Compute expected frequencies for each category
3. Use formula:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Where:
$O$ = Observed frequency
$E$ = Expected frequency
4. Calculate degrees of freedom (df):
o Goodness of fit: $df = n - 1$, where $n$ is the number of categories
o Contingency table: $df = (r - 1)(c - 1)$, where $r$ and $c$ are the numbers of rows and columns
5. Compare with critical value from χ² distribution table
6. Draw conclusion (reject or accept H₀)
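The steps above can be sketched for both uses of the test, taking χ² as the sum of (O - E)²/E over all cells. The observed counts are invented. The critical value 3.841 (α = 0.05, 1 degree of freedom) is the standard table value, hard-coded here because the standard library has no χ² distribution.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Sum of (O - E)^2 / E over categories; df = number of categories - 1."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

def chi_square_independence(table):
    """Chi-square for an r x c table of counts; df = (r - 1)(c - 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

gof_stat, gof_df = chi_square_goodness_of_fit([18, 22, 20], [20, 20, 20])
ind_stat, ind_df = chi_square_independence([[20, 30], [30, 20]])

CRITICAL_05_DF1 = 3.841                     # table value: alpha = 0.05, 1 df
reject_independence = ind_stat > CRITICAL_05_DF1
print(gof_stat, gof_df, ind_stat, ind_df, reject_independence)
```

Both functions expect raw frequencies, not percentages, in line with the conditions listed in section 10.3.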
🔧 10.6 Yates’ Correction (for 2x2 tables with small samples)
Adjustment for continuity when expected frequencies are low (but > 5)
📌 Helps reduce error due to overestimation of significance
🔄 10.7 Converting χ² to Measures of Association
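✅ Phi Coefficient (ϕ)
For 2x2 tables: $\phi = \sqrt{\chi^2 / N}$
✅ Contingency Coefficient (C)
For larger tables: $C = \sqrt{\chi^2 / (\chi^2 + N)}$
Where $N$ is the total number of observations.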
🧠 10.8 Characteristics of χ² Test
Non-parametric
Based on frequencies and not measurements
Additive across categories
One-tailed test (right tail)
⚠️ 10.9 Cautions While Using χ² Test
Not suitable for very small samples
May give misleading results if:
o Expected frequencies are too small
o Data are percentages/proportions instead of raw counts
Should not be used when observations are not independent
🧠 Summary Table
Feature | Details
Test type | Non-parametric
Best for | Frequencies, categories
Formula | $\chi^2 = \sum \frac{(O - E)^2}{E}$
Degrees of freedom | (r - 1)(c - 1) for tables
Uses | Goodness of fit, test of independence
Measures | Phi (ϕ), Contingency coefficient (C)
📘 Chapter 11: Analysis of Variance and Covariance
Analysis of Variance (ANOVA):
ANOVA is used when we want to test the significance of differences between more than two group
means. It extends the t-test which compares only two means.
Principle:
Total variation in the data is divided into:
o Between-group variation (due to different treatments/groups)
o Within-group variation (due to chance or random error)
These variations are compared using the F-ratio:
$F = \frac{\text{Mean Square Between Groups}}{\text{Mean Square Within Groups}}$
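A minimal one-way ANOVA sketch shows how the total variation is split and how the F-ratio is formed. The three small groups are invented data. The resulting F would still need to be compared with an F table for the given degrees of freedom.

```python
from statistics import fmean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    # Between-group variation: group means around the grand mean
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]      # three hypothetical treatments
f_ratio, df_b, df_w = one_way_anova(groups)
print(round(f_ratio, 3), df_b, df_w)
```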
Types of ANOVA:
One-way ANOVA: Examines the effect of a single factor on a dependent variable.
Two-way ANOVA: Examines the effect of two independent variables and their interaction.
Latin Square Design: A special design used when two non-treatment variables (like rows and
columns) need to be controlled along with the treatment variable.
Analysis of Covariance (ANCOVA):
Why ANCOVA?
Useful when a covariate (Z) affects the dependent variable (Y), and we want to adjust for that.
For example, comparing three teaching methods but adjusting for student IQ.
Technique:
Involves linear regression to adjust Y by removing the influence of Z.
Then, a usual ANOVA is conducted on the adjusted values.
This gives a more accurate comparison of the groups.
Assumptions:
1. Linear relationship between covariate and dependent variable.
2. Homogeneity of regression slopes.
3. Random and independent sampling.
📘 Chapter 12: Nonparametric or Distribution-Free Tests
What are Nonparametric Tests?
Do not require assumptions about population distribution (like normality).
Ideal for ordinal or nominal data.
Based on ranks or signs rather than raw values.
Types and Examples:
1. Sign Tests: Compare medians using the direction (+ or -) of differences.
2. Wilcoxon Signed Rank Test: More sensitive than the sign test because it also uses the magnitude of differences.
3. Rank Sum Test (Mann-Whitney U Test): Compares two independent samples.
4. Kruskal-Wallis Test: Nonparametric version of ANOVA.
5. Run Test: Checks for randomness in a sequence.
6. Chi-Square Test: For testing independence in categorical data.
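As one concrete case from the list, a two-sided sign test can be sketched with an exact binomial calculation: under H₀, plus and minus signs are equally likely. The paired differences are invented, and dropping zero differences is a common convention assumed here rather than something stated in the notes.

```python
from math import comb

def sign_test(differences):
    """Two-sided sign test; returns (n_plus, n_minus, p_value). Zeros are ignored."""
    n_plus = sum(1 for d in differences if d > 0)
    n_minus = sum(1 for d in differences if d < 0)
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # P(at most k of the rarer sign) when each sign has probability 1/2
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)

diffs = [3, 3, 2, 5, 2, 1, -1, 4, 2, 0]     # hypothetical after-minus-before differences
n_plus, n_minus, p_value = sign_test(diffs)
print(n_plus, n_minus, p_value)
```

Only the direction of each difference is used, which is why the test needs no assumption about the shape of the population distribution.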
Advantages:
Can handle small samples.
Useful when assumptions of parametric tests are not met.
Easier to understand and apply.
📘 Chapter 13: Multivariate Analysis Techniques
What is Multivariate Analysis?
Refers to statistical methods that analyze multiple variables simultaneously.
Often used in fields like marketing, psychology, and education where multiple measures are
recorded per observation.
Why Use It?
To reduce complexity, identify patterns, and understand relationships between variables.
Two Main Categories:
1. Dependence Techniques: When one or more variables depend on others.
Multiple Regression
Discriminant Analysis
Multivariate Analysis of Variance (MANOVA)
Canonical Correlation
2. Interdependence Techniques: No dependent variable.
Factor Analysis: Reduces data to a few underlying factors.
Cluster Analysis: Groups observations based on similarity.
Multidimensional Scaling (MDS): Visualizes similarity between objects.
Latent Structure Analysis
Important Concepts:
Explanatory vs Criterion Variables: Like Independent vs Dependent.
Observable vs Latent Variables: Latent variables are not directly measured, e.g., intelligence.
📘 Chapter 14: Interpretation and Report Writing
Interpretation:
It's the process of making sense of the data and drawing meaningful conclusions.
Helps link findings to theory and make generalizations.
Essential in turning raw data into knowledge.
Why Important?
Even accurate analysis is useless if wrongly interpreted.
Helps identify patterns, implications, and leads to future research.
Report Writing:
Purpose:
To communicate findings effectively to others.
Might be aimed at policymakers, academics, or the general public.
Types of Reports:
1. Technical Report: Detailed, data-heavy, for academic/professional use.
2. Popular Report: Simplified, focused on practical implications, intended for general readers.
Report Structure:
Title Page
Abstract or Summary
Introduction (Problem + Objectives)
Methodology
Data Analysis and Results
Interpretation and Conclusions
Bibliography and Appendices
Index (optional)
Oral Presentation:
Useful for quick decision-making.
Should be supplemented with a written report.
📘 Chapter 15: The Computer: Its Role in Research
Introduction:
Computers have revolutionized how research is done.
Useful for data storage, statistical analysis, and report preparation.
Types of Computers:
1. Digital Computers: Use binary numbers, suitable for most applications.
2. Analog Computers: Handle continuous data, used in engineering.
Generations of Computers:
1st Gen: Vacuum tubes
2nd Gen: Transistors
3rd Gen: Integrated Circuits (ICs)
4th Gen: Microprocessors (present use)
5th Gen: Artificial Intelligence (under development).
Applications in Research:
Data entry and coding
Running statistical software
Graphical representation
Simulations and modeling
Efficient storage and retrieval
Limitations:
Computers don't think; they follow instructions.
Results depend on the quality of input.
High setup costs for small projects.