Research Methodologies

Uploaded by

M.sekhar

About IIMM

“Indian Institute of Materials Management (IIMM)”, with its headquarters at Navi Mumbai, is a
Professional Body of Materials Management classified under Engineering & Technology Group under
Apprenticeship Act, 1961 and is recognised by ISTE, MHRD.

Through its wide network of 56 branches and 19 chapters having around 9500 members drawn
from public and private sectors, IIMM is dedicated to the promotion of the profession of Materials
Management through its multifarious activities including Educational Programs approved by AICTE
(Post Graduate Diploma in Materials Management and Post Graduate Diploma in Supply Chain
Management & Logistics), Seminars, National Conferences, Regional Conferences, Workshops,
In-house training programs, Consultancy & Research Programs.

To have an effective global interaction, the Institute is a charter member of International Federation
of Purchasing and Supply Management (IFPSM), Helsinki, Finland which has its roots in over
44 member countries.

In furtherance of its objectives, IIMM brings out a monthly journal, “Materials Management Review”
comprising latest Articles and Research Papers in the field of Materials, Logistics, Purchase, Inventory,
Supply Chain Management and latest Technological Innovations like Artificial Intelligence,
Blockchain, Cloud Computing and Internet of Things.

The Institute has its Centre for Research in Materials Management (CRIMM) at Kolkata, which
is engaged in promotion of research activities in collaboration with industries for furthering the
advancement of the profession of Materials and Supply Chain Management.

The Institute is dedicated to societal and environmental considerations through Sustainable
Procurement, Green Purchasing and Life Cycle Consideration, which are part of our course curriculum.
The aim and objective of the Institute is to update and upgrade the skills and knowledge of professionals
so as to ensure inclusive and sustainable development.
RESEARCH METHODOLOGY

Indian Institute of Materials Management


Plot No. 102 & 104, Sector-15,
Institutional Area, CBD Belapur, Navi Mumbai – 400614
Ph.: 0222 757 1022/0222 756 5592,
E-mail: [email protected], Website: iimm.org
Research Methodology

© Copyright 2024 Publisher

ISBN: 978-93-91540-95-1

This book may not be duplicated in any way without the express written consent of the
publisher, except in the form of brief excerpts or quotations for the purposes of review.
The information contained herein is for the personal use of the reader and may not be
incorporated in any commercial programs, other books, databases, or any kind of software
without written consent of the publisher. Making copies of this book or any portion,
for any purpose other than your own is a violation of copyright laws. The author and
publisher have used their best efforts in preparing this book and believe that the content is
reliable and correct to the best of their knowledge. The publisher makes no representation
or warranties with respect to the accuracy or completeness of the contents of this book.
Table of Contents

Chapter 1: Fundamentals of Research
Chapter 2: Defining and Formulating a Research Problem
Chapter 3: Research Design
Chapter 4: Sampling
Chapter 5: Measurement and Scaling
Chapter 6: Data Collection Techniques
Chapter 7: Introduction to Questionnaire Designing
Chapter 8: Data Processing and Analysis
Chapter 9: Concept of Hypothesis
Chapter 10: Parametric Tests
Chapter 11: Non-Parametric Tests
Chapter 12: Report Writing


Course Outcomes
The course on “Research Methodology” aims to equip students with a comprehensive understanding of
research design, data collection, and analysis techniques. It provides practical skills for developing,
conducting, and presenting research, enabling students to tackle complex research problems and contribute
valuable insights across various fields. The book comprises the following twelve chapters:
Chapter 1: Fundamentals of Research - The chapter provides an overview of research, starting with an
introduction to the fundamental principles of research. It then delves into the concept of research, explaining
its importance and applications. The chapter concludes by outlining the research process, detailing the steps
involved in conducting systematic and structured research.
Chapter 2: Defining and Formulating a Research Problem - The chapter begins with the concept of
management dilemmas and their significance in the research context. Then it covers the importance and
methodology of conducting a literature review. The chapter concludes with an in-depth discussion on
formulating and understanding research problems.
Chapter 3: Research Design - This chapter introduces the concept of research design, highlighting its
necessity and key features. It categorises the various types of research designs, providing examples and
applications for each. The chapter concludes by detailing the essential components of an effective research
design, enabling students to structure and plan their research projects systematically.
Chapter 4: Sampling - This chapter elucidates the concept of sampling, emphasising its importance in research
methodology. Then it discusses common errors in measurement and sampling, as well as non-sampling
errors that can impact research validity. The chapter concludes by exploring various methods of sampling,
equipping students with the knowledge to select appropriate sampling techniques for their research projects.
Chapter 5: Measurement and Scaling - This chapter provides an understanding of measurement concepts
and their application in research. Then the chapter introduces various scaling techniques, explaining their
importance and use in quantifying variables. At the end, the chapter discusses bases of scale classification
and techniques of scale construction.
Chapter 6: Data Collection Techniques - This chapter begins with the concept of data collection. Then it
explores the methods of data collection. At the end, the chapter explores the factors affecting the selection of
data collection methods.
Chapter 7: Introduction to Questionnaire Designing - This chapter introduces the concept of questionnaire
designing. Then it covers the different types of questions used in questionnaire designing, and the steps
involved in designing a questionnaire. At the end, the chapter discusses designing of an effective questionnaire.
Chapter 8: Data Processing and Analysis - This chapter provides an overview of data processing and
analysis techniques essential for interpreting research data. Then it covers the concepts of data processing
and analysis, including measures of central tendency, dispersion, skewness, and relationship. At the end, the
chapter also explores various charts and visual tools used in data analysis.
Chapter 9: Concept of Hypothesis - This chapter delves into the fundamentals of hypothesis formation and
testing in research. It begins with defining what constitutes a hypothesis and progresses to the process of
hypothesis testing. At the end, the chapter discusses the procedure for testing hypotheses, including the steps
and techniques involved.
Chapter 10: Parametric Tests - This chapter explores various types of hypothesis testing methods and their
applications. Then it covers parametric tests and their use in statistical analysis, including one-sample tests
and the scenarios in which they are applicable. The chapter also discusses two-sample tests and provides an
overview of Analysis of Variance (ANOVA).
Chapter 11: Non-Parametric Tests - This chapter begins with the concept of non-parametric tests. It covers
the applications of non-parametric tests, including the sign test, rank correlation, rank sum test, and chi-
square test. By the end of this chapter, students will be able to apply non-parametric tests to analyse data
that does not meet parametric test assumptions, interpreting results accurately for diverse research scenarios.
Chapter 12: Report Writing - This chapter covers the essentials of writing a research proposal and report. It
begins with an exploration of how to draft an effective research proposal, followed by guidelines for writing
a comprehensive research report. At the end, the chapter explains the integral parts of a research report,
including structure and content.

CHAPTER 1

Fundamentals of Research
Table of Contents
1.1 Introduction
1.2 Concept of Research
1.2.1 Characteristics of a Good Research
1.2.2 Types of Research
1.2.3 Research Approach
1.2.4 Significance of Research
1.2.5 Applying Research in Different Fields of Management
1.2.6 Problems Encountered by a Researcher
1.2.7 Ethics in Research
1.2.8 Managers and Research
Self Assessment Questions
1.3 Research Process
Self Assessment Questions
1.4 Summary
1.5 Key Words
1.6 Case Study
1.7 Exercise
1.8 Answers for Self Assessment Questions
1.9 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:
• Elucidate the concept of research
• List the characteristics of good research
• Discuss various types of research
• Explain the role of ethics in research
• List the steps involved in the research process

1.1 INTRODUCTION
In simple words, the term ‘Research’ is associated with the act of seeking out
information and knowledge on a specific topic or subject. In other words, research
refers to the art of systematic and careful investigation into a specific field. It is
this systematic character that makes research an art of scientific investigation. Research
is of significant importance in various fields, such as business, economics and
politics. It is regarded as a powerful and essential tool that leads human
beings towards progress.
Research is conducted to serve a varied range of purposes, such as increasing the
knowledge of the researcher and developing or revising theories based on observed
facts. For instance, organisations use research to make well-informed decisions
about the products and services they deal in or to devise new strategies. Significant
management decisions, such as pricing decisions, new product launches and undertaking
new projects, require research to be conducted to find the probable state of the
circumstances and the most feasible and appropriate strategies that can be designed
and formulated in the given conditions.

A research study begins with a review of the available literature, followed by
defining the research problem. The research problem must be stated in a clear
and concise way. Secondly, authentic and accessible sources of information are
identified. Thirdly, the design of the research is decided, which gives direction to
the research study. Further, the data (observations recorded in numeric, textual or
any other form to be referenced easily) are collected and organised for easy analysis
by researchers. Based on the analysis of the data, the research report is prepared,
which comprises inferences on the given research problem and also contains the
research findings.

This chapter will help you in understanding the concept of research. You will study
the characteristics of good research and the types of research. Further, various research
approaches and the significance of research are also discussed. The latter section of this
chapter will describe the problems encountered by a researcher and ethics in research.
Towards the end, you will learn about the research process.

1.2 CONCEPT OF RESEARCH


Research is ‘search for knowledge’. It refers to an intellectual activity that comprises
a systematic investigation about new findings or an activity that is aimed at gaining
new knowledge of already existing researched facts. The term ‘research’ has
been defined by various authors in different ways. A few major definitions of research
are as follows:

According to Clifford Woody, “Research comprises defining and redefining
problems, formulating hypothesis or suggested solutions; collecting, organising and
evaluating data; making deductions and reaching conclusions; and, at last, carefully
testing the conclusions to determine whether they fit the formulating hypothesis.”

In the words of Redman and Mory, “Research is a careful and systematised effort to
gain new knowledge.”

D. Slesinger and M. Stephenson define research as “Manipulation of things, concepts
or symbols for the purpose of generalising to extend, correct or verify knowledge,
whether that knowledge aids in the construction of theory or in the practice of art.”

In the words of W.S. Monroe, “Research may be defined as a method of studying
problems whose solutions are to be derived partly or wholly from facts.”

Hence, in a broader perspective, the concept of research has different meanings.
In simple terms, research is a process of collecting, analysing and interpreting
the relevant information about any topic. The primary objective of performing
research is to explore answers to questions in a scientific manner. At a broader level,
the objectives of research are to:
• Achieve new insights: Research is done with the objective of understanding a
  phenomenon or exploring something new. For instance, a study may be carried out
  to explore eating habits and their effect on the growth of children aged 11 to 14.
• Portray characteristics: Research is done with the objective of describing the
  characteristics of a specific individual, situation or group. For instance, a study
  may be carried out to find the characteristics of a solar cell.
• Determine frequency: Research may also be done with the objective of determining
  the frequency with which an event occurs. For example, a study may be carried out
  to determine the frequency with which Huntington’s disease occurs in humans.
• Test hypotheses: Research is also conducted with the objective of testing a
  hypothesised relationship between the variables considered for the study. For example,
  a research study may be carried out to test the relationship between polarity and
  stability.
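
The last two objectives can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the data are invented, and the function names `relative_frequency` and `pearson_r` are hypothetical helpers, not part of this book. It shows how the frequency of an event and the strength of a linear relationship between two variables might be computed. Formal hypothesis testing is treated in Chapters 9 to 11.

```python
from math import sqrt


def relative_frequency(observations, event):
    """Return the share of observations in which the event occurred."""
    if not observations:
        raise ValueError("at least one observation is required")
    return sum(1 for item in observations if item == event) / len(observations)


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented survey answers: did each of ten respondents report the event?
responses = ["yes", "no", "no", "yes", "no", "no", "no", "yes", "no", "no"]
freq = relative_frequency(responses, "yes")

# Invented paired measurements of two study variables.
hours_studied = [1, 2, 3, 4, 5]
test_score = [52, 55, 61, 64, 70]
r = pearson_r(hours_studied, test_score)

print(f"Relative frequency of 'yes': {freq:.2f}")
print(f"Correlation between the two variables: {r:.3f}")
```

A coefficient close to +1 or -1 suggests a strong linear relationship in the sample. Whether that relationship holds in the population is a question for a formal test.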
The above-mentioned objectives are common to all research.
Additionally, individual research studies can have diverse objectives according to their
own specific nature. For instance, research conducted for marketing purposes
will also concentrate on the following objectives:
• Product development
• Cost reduction
• Inventory control
• New product launching
• Profitability improvement
• Productivity improvement
Similarly, research in human resource development will have objectives such
as developing new tools, concepts or theories, which may help in enhancing the
skills and talent of human resources in an organisation.

1.2.1 CHARACTERISTICS OF A GOOD RESEARCH


There are various forms of research, and some characteristics are common to
all of them. Fundamentally, to be effective, research should have the
following characteristics:
• Directed: Research should be directed towards arriving at some solution to a
  problem.
• Systematic: Research should be properly structured and should not be based upon
  intuition and guesswork.
• Clear purpose: Research should be carried out with certain clearly defined
  objectives.
• Empirical: Research should be based on actual data that is derived from
  observation and experience. However, in the case of research relating to abstract
  concepts, the concepts can be measured by constructs.
• Data-driven: Research encompasses gathering new data from primary sources
  (first-hand data) or using existing data (secondary sources) for a new purpose.
• Logical: Research ought to be guided by clear and logical reasoning.
  Fundamentally, logical reasoning is of two types: induction and deduction. The
  process of moving from the specific to the general is induction, whereas the process
  of moving from the general to the specific is deduction. The use of logical reasoning
  makes the research more significant.
• Elaboration: The research procedure should be explained and detailed properly.
  Elaboration is required to maintain continuity, which helps another researcher
  who takes up the same topic for further advancement.
• Efficient analysis: The facts and figures accumulated in research ought to be
  correctly investigated and analysed by utilising a suitable method.
• Requires time: Research ought to be a patient and unhurried activity. Proper time
  should be given to conduct any research so as to get a logical result.
• Carefully designed: Research must follow carefully designed procedures that
  apply diligent analysis. Sensibly designed research also requires data and
  information to be carefully collected and recorded.

1.2.2 TYPES OF RESEARCH


The classification of research can be based on various deciding factors, such as
the technique, purpose, availability of time and other resources, capacity, type of
investigation, and statistical content. A broad categorisation of the types of research is
shown in Figure 1:

• Basic research
• Applied research
• Descriptive research
• Causal research
• Conceptual research
• Empirical research
• Qualitative research
• Quantitative research
• Other types of research

Figure 1: Types of Research

The types of research are explained as follows:
• Basic research: Basic research is done to expand the knowledge
  of a subject. It is also known as pure, theoretical or fundamental research. As
  basic research is inquisitive in nature, its outcomes in most cases
  do not carry any immediate commercial value. It aims to gather
  knowledge for the sake of knowledge. Basic research is mainly concerned with
  generalisations and with the formulation of a theory. It can be conducted in either of
  the following two ways:
  ◦ Discovery of a new theory: Basic research may be an entirely new discovery,
    the knowledge of which has not existed so far.
  ◦ Development of the existing theory: Existing theories are always based on
    assumptions, which provides scope for changing or formulating new sets
    of assumptions and adding new dimensions to the existing theory by doing
    further research work.
  Basic research is useful in developing new scientific ideas and various ways of
  thinking. Some examples of basic research are as follows:
  ◦ Research concerning natural phenomena, such as the Big Bang theory and climate
    change
  ◦ Investigation related to basic science
  ◦ Research related to human behaviour
• Applied research: Applied research aims to solve practical problems of the world.
  It is also known as action research. Applied research aims to provide solutions
  (conclusions) to problems concerning society or business. It is conducted to test
  the basic assumptions, the empirical content or the very validity of a theory under
  the given conditions. Applied research may explore ways to:
  ◦ Treat a disease
  ◦ Identify social, economic or political trends
  ◦ Improve agricultural productivity
  ◦ Curb or reduce carbon emissions
  ◦ Improve energy efficiency
  ◦ Reduce inflation
• Descriptive research: Descriptive research aims to describe the characteristics of a
  phenomenon. It includes surveys and fact-finding enquiries of different kinds.
  In descriptive research, the researcher only describes the phenomenon.
  Descriptive research can answer what, where, when and how questions, but not
  why questions. Descriptive research may be conducted in the following situations:
  ◦ To describe the inflation rate in India in the past 20 years
  ◦ To know how India’s housing market changed over the past 10 years
  ◦ To know the most popular news channels among middle-aged people
• Causal research: Causal research is also known as explanatory research. It is one
  step ahead of descriptive research as it aims to investigate cause-effect
  relationships. For instance, if a descriptive study describes the inflation rates of the
  past 20 years in India without explaining their negative
  or positive impact on the Indian economy, a causal study would thoroughly
  investigate the causes of those rates. In cause-effect analysis, data can be analysed
  in different ways, such as by comparing the inflation rates of different years and giving
  reasons for high or low inflation.
• Conceptual research: Conceptual research aims to explore new concepts or ideas
  (theories) and upgrade or redefine existing concepts. It is more concerned with
  ideas and is commonly used by philosophers and thinkers. Conceptual research is
  conducted by analysing already available information on the concerned topic. No
  practical experiment is done in conceptual research.
• Empirical research: Empirical research aims to gain knowledge through experience
  or observation. In this research, the researcher needs to investigate an established theory on the
  basis of a predefined hypothesis. After investigating the theory, the
  researcher arrives at some conclusions or predictions. These predictions are then
  verified with a suitable experiment. Depending on the results of the experiment, the
  theory on which the predictions are based is supported or revised.
The concept of empirical research can be explained with the help of an example. Assume
that the topic of observation is ‘Does learning to play musical instruments have a
long-term effect on the brain development of children?’
Now suppose that the hypothesis for this topic is:
‘Brain development in children aged between 2 and 6 speeds up when children
play musical instruments.’
On the basis of this hypothesis, a researcher draws some predictions, such as that
children who play musical instruments will show faster brain development than children
who do not. After that, the researcher would conduct a suitable experiment to test
the predictions. On the basis of the result of the experiment, the hypothesis would be
supported or revised. For example, if the researcher finds that the brain development of children
who play musical instruments is indeed faster, the hypothesis would
be supported; otherwise, it would be revised.
• Qualitative research: Qualitative research is concerned with getting a deep
  understanding of a qualitative phenomenon, that is, a phenomenon that
  relates to quality or kind. For example, if the researcher wants to know the cause
  of the rising disrespect of the youth towards elders, he/she would have to look
  deeply at different aspects, such as changing lifestyles, increasing stress among the
  youth and the attitude of people towards the nuclear family. This research tries
  to find out the why and how rather than the what, when and where of a phenomenon.
  The aim of qualitative research is to discover the fundamental ideas, desires and
  motives by using methods such as in-depth interviews.
• Quantitative research: Quantitative research aims to study a phenomenon that is
  expressed in terms of quantity. Some examples of quantitative research are as
  follows:
  ◦ A research study that compares the average rainfall in Uttar Pradesh in the
    month of June with that in July.
  ◦ A research study that aims to show the percentage of each component of the
    earth’s atmosphere.
• Other types of research: In addition to the types of research mentioned
  above, there are some other types of research, which are explained as
  follows:
  ◦ One-time research: This refers to research that is carried out only once.
  ◦ Longitudinal research: This refers to observational research that is
    performed for the same purpose repeatedly over a period of time on the same
    group of subjects.
  ◦ Laboratory research: This refers to research that is done in a laboratory. It is
    also known as simulation research. Research in the natural sciences,
    such as Physics, Chemistry and Biology, is an example of laboratory
    research. For example, studying the reaction of one chemical with another is
    laboratory research.
  ◦ Field-setting research: This refers to research that cannot be done in a
    laboratory. Research conducted on topics of economics, such as demand,
    supply, product and price, is an example of field research.
  ◦ Historical research: This refers to research in which the researcher either
    takes the help of historical sources to conduct fresh research or studies past
    events. For example, research on the outcome of the Revolt of 1857 may be
    considered historical research.
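
The rainfall example given under quantitative research can be sketched in a few lines of Python. The figures below are invented for illustration and are not real observations for Uttar Pradesh or any other region. The helper `average_by_month` is a hypothetical name introduced here. The sketch only shows how a phenomenon expressed as a quantity can be summarised and compared.

```python
from statistics import mean

# Invented monthly rainfall totals in millimetres for five years.
# These numbers are placeholders, not measured data.
rainfall_mm = {
    "June": [95.0, 110.5, 88.0, 102.3, 99.2],
    "July": [250.1, 231.4, 270.8, 244.0, 262.7],
}


def average_by_month(data):
    """Return the mean rainfall for each month in the data."""
    return {month: mean(values) for month, values in data.items()}


averages = average_by_month(rainfall_mm)

# The month with the highest average in this invented data set.
wetter_month = max(averages, key=averages.get)

for month, value in averages.items():
    print(f"{month}: {value:.1f} mm on average")
print(f"Higher average rainfall: {wetter_month}")
```

A descriptive study would stop at reporting these averages. A causal study would go on to ask why the two months differ.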

1.2.3 RESEARCH APPROACH


A research approach refers to a plan or procedure that consists of assumptions to be
considered and the detailed methods of data collection, analysis and interpretation
to be used while performing research. Depending on the nature of the research
problem being addressed, different organisations use different research approaches.
Broadly, there are three types of research approaches, namely quantitative approach,
qualitative approach and pragmatic approach (mixed methods). The types of
research approach are shown in Figure 2:

• Quantitative research approach
  ◦ Inferential approach
  ◦ Experimental approach
  ◦ Simulation approach
• Qualitative research approach
• Pragmatic research approach

Figure 2: Types of Research Approach

A brief description of these approaches is as follows:
• Quantitative research approach: It refers to the generation of data in a quantitative
  form, which can be subjected to quantitative analysis. The quantitative analysis
  should be rigorous and done in a formal and rigid manner. The subtypes of the
  quantitative approach are as follows:
  ◦ Inferential approach: This approach is used where a sample (a subset of
    the population on whom the experiments are conducted) of the population (the
    target group of people under investigation) is observed or studied. The
    primary aim of this approach is to form a database from which some characteristics
    of the population under study can be inferred.
  ◦ Experimental approach: This approach is useful for research in which some
    variables of a research study are manipulated to observe their effects on other
    variables.
  ◦ Simulation approach: This approach is useful for conducting research in which
    an artificial environment is created to generate the relevant information and
    data. This approach is quite useful in the modern world. For example, training
    of pilots is conducted in a simulated environment.

• Qualitative research approach: The qualitative research approach deals with the
  subjective evaluation of attitudes, opinions and actions. This approach generates
  results in a non-quantitative form. This research is based on the researcher’s
  insights and impressions. Usually, the techniques used in qualitative research
  involve focus group interviews, projective techniques and depth interviews.
• Pragmatic research approach: This research approach is also known as the mixed
  method. The research conducted in this approach involves collecting both
  quantitative and qualitative data to conduct inquiries, integrating the two forms
  of data, and using different designs that may involve philosophical assumptions
  and theoretical frameworks. The central premise of this form of research is
  that by combining both qualitative and quantitative approaches, a more complete
  understanding of a research problem is achieved than with either approach alone.
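
The inferential approach described above can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions. The population is artificially generated, which also makes it a tiny instance of the simulation approach. The function name `estimate_population_mean` is hypothetical. The interval uses the normal approximation with z = 1.96, which is a common simplification chosen here for brevity and not a method prescribed by this chapter.

```python
import random
from math import sqrt
from statistics import mean, stdev


def estimate_population_mean(sample, z=1.96):
    """Return the sample mean and an approximate 95% confidence interval.

    Assumes a reasonably large simple random sample, so that the normal
    approximation (z = 1.96) is acceptable.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    centre = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    return centre, (centre - z * standard_error, centre + z * standard_error)


# A fixed seed makes the illustration repeatable.
rng = random.Random(42)

# Invented population: monthly spend of 10,000 hypothetical customers.
population = [rng.gauss(500, 80) for _ in range(10_000)]

# Only a sample of 200 customers is actually observed.
sample = rng.sample(population, 200)

estimate, (lower, upper) = estimate_population_mean(sample)

print(f"Sample mean: {estimate:.1f}")
print(f"Approximate 95% interval: {lower:.1f} to {upper:.1f}")
print(f"True population mean (known only because it is simulated): {mean(population):.1f}")
```

In a real study the population mean is unknown, which is why the characteristic has to be inferred from the sample.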

1.2.4 SIGNIFICANCE OF RESEARCH


Hudson Maxim has expressed that “All progress is born of inquiry. Doubt is often
better than overconfidence, for it leads to inquiry, and inquiry leads to invention.”
The main motive of a research study is to find the hidden truth. Research leads
to progress because it is done to solve problems, expand knowledge or explore a
new phenomenon. Both society and business face various problems, which
researchers try to solve through research.
Research has greatly influenced the fields of business and the economy.
Organisations use research to solve operational problems, and it is also
useful to the government as it assists in framing economic and development policies.
The growing competition and complexities in business have necessitated research
in marketing. This need has given rise to a new field of research called marketing
research. Marketing research is basically the methodical gathering, recording and
analysing of facts about business problems.

Apart from the role of research in marketing, the significance of research in the field
of business is noteworthy for an organisation because it helps:
• Identify and define opportunities
• Define, monitor and refine strategies
• Identify economic and business objectives
• Identify policy objectives
• Develop products
• Identify objectives of human resource development
• Identify promotional objectives
• Identify market objectives
• Identify customer satisfaction objectives
The significance of research for social scientists is reflected in studying social
relationships and in seeking answers to various social problems. Research in the
social sciences is driven by two main motives: firstly, knowledge
for its own sake and, secondly, knowledge for what it can contribute
to the practical concerns of society.

The significance of research can also be appreciated for the following purposes:
• For students, writing a master’s or Ph.D. thesis may mean better career
  opportunities. It also helps them attain a high position in the social structure.
• For professionals in research jobs, such as survey
  research assistants, research associates and research faculty, research may mean a
  source of livelihood.
• Philosophers and thinkers bring new ideas and insights to light after conducting
  research.
• Literary men and women may use research as a means for the development of
  new styles and creative work.
• Analysts and intellectuals use the concepts of research for the generalisation of
  new theories.

1.2.5 APPLYING RESEARCH IN DIFFERENT FIELDS OF MANAGEMENT


Application of research in different fields of management may be done in the
following ways:
• Theory-building research is done with the aim of developing management theories.
Such research improves the understanding and knowledge of the management process.
• Theory-testing research is done with the aim of testing theories of management.
The testing relies on observations and measurements, which guide the decision
whether to accept or reject a theory.
• Problem-centered/practical research is primarily done with the aim of investigating
a practical problem, question or issue in a specific organisation or management
context with a view to resolving the problem and, subsequently, making
recommendations for the procedure to be followed. The research is conducted to
find and propose solutions to real-life management problems.
Various functional areas in ‘management’ where research is used are as follows:
• Application of Research in Marketing: The application of research in the field of
marketing is done for the following reasons:
  ◦ For decision-making
  ◦ For doing market research
  ◦ For doing survey on demand
  ◦ For conducting product research
  ◦ For customer research
  ◦ For sales research
  ◦ For promotional research
  ◦ For risk management on collaboration
  ◦ For research on market development
  ◦ For research on marketing and reach of competitors
  ◦ For research on the formation of marketing strategy
  ◦ For research to build up a competitive advantage
• Application of Research in Finance: The application of research in the field of
finance is done for the following purposes:
  ◦ Portfolio management
  ◦ Risk perception
  ◦ Financial crisis management
  ◦ Research to assess the perception of mutual fund investors
  ◦ Investment analysis
  ◦ Break-even analysis
  ◦ Capital budgeting
  ◦ Ratio analysis
  ◦ Decision-making
  ◦ Financial planning for salaried employees
  ◦ Strategies for tax savings
  ◦ Research on investment pattern and preference of retail investors
• Application of Research in HR: The application of research in the field of HR is
done for the following purposes:
  ◦ Recruitment and selection
  ◦ Training and development
  ◦ Manpower planning
  ◦ Labour welfare study
  ◦ Leadership style
  ◦ Administrative roles
  ◦ Performance appraisal system
  ◦ Research on MBO
  ◦ Comparative approach
  ◦ Problem identification
  ◦ Conflict management
  ◦ Research on statistical approach
• Application of Research in Production: The application of research in the field of
production is done for the following reasons:
  ◦ Planning production
  ◦ Supply chain management
  ◦ Testing new products
  ◦ Guaranteeing adequate distribution
  ◦ Prototype development
  ◦ In-house research is required for professional and self-development of the
    workers through training and mentoring
  ◦ New technology approach
  ◦ Undertaking research can help a company avoid future failure
  ◦ Studying the competition and competitors
  ◦ R&D for full utilisation of the machines
  ◦ Strategic module for overall production and distribution
  ◦ Operational module for production and sales synchronisation

1.2.6 PROBLEMS ENCOUNTERED BY A RESEARCHER
Conducting research requires several elements to be managed and arranged. Some
elements are difficult to manage, while others are difficult to arrange. Research is
carried out by a single individual or a group/organisation/institution, but it also
requires the acceptance/approval of several others, such as guides, supervisors,
defense committee members, interviewees and focus group members. The smooth flow of
research also depends on whether it is conducted in a developing or a developed
nation. In developing nations, research is still at an early stage, while developed
nations have sufficient facilities and resources to carry on research. Researchers,
particularly in a developing nation, face the following problems:
• Lack of scientific training: The lack of scientific training in the methodology of
research is a great impediment for researchers in developing nations. There is a
scarcity of capable and experienced researchers. Numerous researchers conduct
research without any prior experience and without a sure grasp of research methods.
As a result, not all research is methodologically sound.
Some researchers and their guides, due to the lack of specific research training, treat
research as a scissors-and-paste job without doing any actual analysis of the collected
material. The outcome is that the results of such research do not
reflect reality. Such careless research underlines the need for a systematic
study of research methodology. A researcher must be properly equipped with all
methodological aspects of research prior to starting any research project. As such,
efforts should be made to provide short-duration intensive courses to meet the
requirement of a systematic research methodology.
• Insufficient interaction: The lack of interaction among various research and
non-research organisations causes problems for researchers. There is inadequate
communication between the sole researcher/university research departments
on one side and business organisations/government departments/research
institutions on the other side. Due to the lack of interaction and proper contacts
among researchers, a large amount of primary data remains unused.
In order to overcome this problem, efforts should be made to develop a satisfactory
link among all concerned (institutions, organisations, researchers, etc.) for better
and more realistic research. Certain systematic mechanisms, such as university-
industry interaction programmes, must be developed so that researchers can get
ideas and encouragement from experienced practitioners.
• Lack of secrecy: The lack of confidentiality about the usage of information often
creates problems for researchers. Business organisations in a developing country
do not have much confidence that the information they share with researchers
will not be misused. Information secrecy is very important for any
business organisation. This concern discourages business organisations from sharing
information and proves a barrier to researchers. Consequently, researchers
must build confidence among business organisations that the
information/data obtained from them will not be misused.
• Identification of research problems: Researchers often face the problem of
identifying research problems appropriately. In the absence of adequate
information, researchers often choose research problems and conduct
research studies that overlap with previous research. This results in
duplication and wastes resources. The solution to this problem is to create
a proper list of subjects, with research problem topics and the places where the
research is done. Such lists need to be revised and updated at regular intervals and
made available to all prospective researchers for appropriate identification of
research problems.
• Lack of assistance: Researchers often face an absence of support
in terms of time, funds and proper direction for research. This leads to unnecessary
delays in the completion of research studies. This difficulty can be lessened by
providing sufficient and timely assistance to researchers.
• Lack of resources: Deficiency of resources leads to the wastage of the energy and
efforts of researchers. In many places, libraries do not function satisfactorily
and researchers have to spend a lot of their precious time searching for books,
journals and reports. In many libraries, especially those away from cities and
state capitals, it is difficult to obtain copies of old acts/rules, reports and other
government journals and publications. This creates a big obstacle in research
work.
• Code of conduct: There is no pre-defined code of conduct for researchers.
This sometimes results in inter-university and inter-departmental rivalries.
Thus, a code of conduct for researchers needs to be developed which, if followed
sincerely, can solve this problem.

1.2.7 ETHICS IN RESEARCH


Ethics refers to a branch of philosophy that distinguishes between right and wrong.
Ethics helps in deciding whether an action is right or wrong. Most people develop
a sense of right and wrong in their childhood. Ethical development is regarded as a
long-term and continual process. Although some argue that ethics is merely
common sense, ethical norms vary from one individual to another.
Hence, different people may interpret ethical norms in different ways.
For example, if experimental research involves children as respondents, then
the parents of the children must be informed about it and their prior permission
should be taken. If the parents are not informed and their consent is not obtained, the
research would be deemed unethical.

The objectives which underline the necessity of adhering to ethics in a research are
as follows:
• Ethics in research should be obeyed to protect the interests of participants involved
in a research.
• Ethics in research should be followed to make sure that research is carried out in
a manner that serves interests of individuals, groups and/or society as a whole.
• Ethics in research helps scrutinise specific research for its ethical soundness keeping
in consideration issues, such as management of risk, protection of confidentiality
and process of informed consent.
The primary ethical values related to research are shown in Figure 3:
[Figure 3: Ethics in Research. The diagram places five ethical values around the
centre "Ethics in Research": Honesty, Objectivity, Integrity, Confidentiality and
Social Responsibility.]

The ethical values related to research are described as follows:


• Honesty: It refers to the truthfulness of the researcher in collecting and presenting
data. A researcher should never fabricate the gathered data or misinterpret the
data to arrive at the desired conclusion.
• Objectivity: It implies that a researcher should not be biased in research design,
data collection, interpretation, analysis and other aspects of the research.
• Integrity: It implies that a researcher should be sincere in his/her actions and should
keep his/her promises.
• Confidentiality: It requires that secret information used in the research, such as
military secrets, papers and personnel records, be kept private.
• Social responsibility: It implies that a researcher should try to increase social welfare
through his/her research study. In addition, the researcher should not harm
society or the environment in any way while conducting research. For example, if
the research involves animals, the researcher should give them proper care and
respect.
A researcher should adhere to the following five major principles of research ethics:
1. Do good (Beneficence)
2. Do no harm (Non-maleficence)
3. Obtain informed consent from research participants
4. Do not use deceptive practices
5. Research participants should have the right to withdraw from the research at any
point of time
As ethical norms and standards are important for research, many universities and
government organisations, such as the National Institutes of Health (NIH), the National
Science Foundation (NSF) and the Food and Drug Administration (FDA), have adopted
and implemented rules and procedures related to research ethics.

1.2.8 MANAGERS AND RESEARCH
Managers equipped with the basic knowledge of research are at an advantage
compared to managers who do not have any idea about it. Managers of an
organisation often need to conduct research to address various problems. For
instance, stable demand for products and costs higher than the allotted budget are
some of the business problems managers face. Growing competition and complexities
in the business environment are the main reasons behind the occurrence of business
problems. With the help of thoughtful research methods, managers can prevent
unwanted situations before they get out of control.

The management of an organisation should think of hiring professional researchers
or consultants for problem-solving because managers may handle
only minor problems efficiently, whereas serious problems need handling
by professional researchers. Research conducted by professionals should be
based upon an effective and fruitful interaction between managers and researchers,
which is possible only if managers understand the fundamentals of research. Managers’
knowledge of the research process, research design, data collection and data
interpretation helps them determine whether the solutions recommended by the
professional researchers are feasible or not.

Additionally, managers require the knowledge of research for various
reasons, which are as follows:
• Differentiating good research from bad research
• Making important business decisions
• Combining experience and scientific knowledge to take decisions
• Forecasting and planning for future uncertainties that are controllable
Self Assessment Questions
1. Research may also be done with the objective to determine the ___________
with which an event occurs.
2. The research should be based on actual data that is derived from _____________
and experience.
3. Basic research is not useful in developing new scientific ideas and various
ways of thinking. (True/False)
4. The primary aim of ___________ approach is to infer some characteristics of
the population under study by forming a proper database.
5. A ___________ research is basically the methodical gathering, recording and
analysing of facts about the business problems.
6. The lack of confidentiality about the usage of information never creates
problems for researchers. (True/False)
7. Ethics in research should be obeyed to protect the interests of participants
involved in a research. (True/False)
1.3 RESEARCH PROCESS
A process refers to the act of doing something efficiently through an acknowledged
set of actions. The research process is the series of steps needed to carry out
research competently. Every research study requires a pre-determined process to be
followed for the following reasons:
• A research process is required to achieve the desired results from the research.
Most research studies are done with unambiguous goals that can be turned into
specific outcomes only when a well-defined process is followed.
• A researcher is required to complete a research study within a specific time. In the
absence of a well-defined process, researchers
may not be able to complete their research on time.
• A research process is required to conduct research in a competent and successful
manner. A pre-planned procedure supports competence in the research study.
Steps in a research process may vary as per the subject and need of research.

Figure 4 shows the fundamental steps of a research process:

[Figure 4: Fundamental Steps of a Research Process. Defining Research Problem →
Reviewing Literature → Formulating Hypothesis → Designing Research → Collecting
Data → Analysing Data → Preparing Reports]

The steps of a research process are closely interrelated. It is not essential to follow
the research steps in strict order. However, this order of steps provides a useful
guideline to the researcher. These steps are discussed as follows:
• Step 1: Defining Research Problem: The first step refers to the identification
of a problem whose solution can be attained by research. In simple terms, a
research problem is the matter that the investigator/researcher wants
to investigate. At this stage, the researcher usually feels confused and doubtful.
Research comes into existence through the efforts made by the researcher to resolve
these doubts and confusions. Basically, two steps are involved in defining a research
problem:
  ◦ Knowing the problem correctly
  ◦ Expressing the problem in meaningful terms
• Step 2: Reviewing Literature: The second step refers to a way of developing a
proper understanding of the research problem. Usually, two types of literature
may be reviewed by a researcher, i.e., conceptual literature, which comprises
concepts and theories, and empirical literature, which comprises empirical
studies done earlier on the same or a similar topic. It is important for a researcher to
review the literature properly to achieve the following:
  ◦ Develop and refine research ideas
  ◦ Improve subject knowledge
  ◦ Clarify the research questions
  ◦ Highlight research possibilities that have been overlooked
  ◦ Avoid repeating work that has already been done
  ◦ Gain insight into research approaches, strategies and methods
• Step 3: Formulating Hypothesis: The third step relates to a tentative assumption
made by the researcher about the expected result of the research. It provides the
crucial point for the research and helps the researcher be on the right track.
• Step 4: Designing Research: The fourth step is deciding the type of research
design that should be followed for conducting the research study. The research
design selected is based upon the type of research problem and the scope of the
research study undertaken. The preparation of the research design enables the
researcher to yield maximal information from the research conducted.
• Step 5: Collecting Data: The fifth step relates to gathering information, which
is crucial for any research study. There are essentially two types of data: primary
data and secondary data. Primary data is collected through experiments or surveys. In
the case of surveys, data can be collected through:
  ◦ Observation
  ◦ Interviews
  ◦ Telephone interviews
  ◦ Feedback forms
  ◦ Schedules
  ◦ Questionnaires
Secondary data is information that has already been collected by
some other researcher. Examples of such data are biographies, diaries, records, and
published material. In order to complete a research study successfully, accurate and
suitable information is essential.
• Step 6: Analysing Data: The sixth step of the research process is transforming
and refining data to highlight useful information. Data can be summarised and
presented through tabulation, bar diagrams and pie charts.
Statistical techniques, such as correlation, regression and time series analysis, are
also used for data analysis. After the data analysis, the researcher is in a position to
test the hypothesis formulated in step three. The researcher can check the validity of
the hypothesis by using statistical tests, such as the chi-square test, t-test and
F-test.
• Step 7: Preparing Reports: The seventh and last step of the research process is the
stage in which the researcher presents the completed work in a report.
Report writing should be done with great care by keeping in
view the proper layout of the report.
The main text of a report should include:
  ◦ Preface
  ◦ Summary of the researcher’s findings
  ◦ Main report
  ◦ Conclusion
At the end, supporting tables, questionnaires, and other documents used in the research
study should be given in the form of appendices. The research report also needs to
contain a bibliography, i.e., a list of the literature consulted by the researcher.
All the seven steps of a research process mentioned above are discussed in detail
in further chapters.
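
As a rough illustration of the hypothesis testing mentioned in Step 6, the sketch below computes a chi-square statistic by hand for a small two-by-two table. The counts are hypothetical illustration data, not results from any study, and the function name chi_square_statistic is an assumption introduced here. The value 3.841 is the standard critical value of the chi-square distribution at the 5% level with one degree of freedom.

```python
def chi_square_statistic(observed):
    """Return the chi-square statistic for a table of observed counts.

    `observed` is a list of rows; expected counts are derived from the
    row and column totals under the assumption of independence.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Hypothetical survey counts (rows: two respondent groups,
# columns: "yes" / "no" answers). These numbers are made up.
observed = [[30, 20],
            [20, 30]]

# Critical value for a 2x2 table (1 degree of freedom) at the 5% level.
CRITICAL_VALUE_5_PERCENT_DF1 = 3.841

statistic = chi_square_statistic(observed)
reject_null = statistic > CRITICAL_VALUE_5_PERCENT_DF1

print(f"chi-square statistic: {statistic:.3f}")
print("reject the null hypothesis of independence:", reject_null)
```

If the statistic exceeds the critical value, the null hypothesis that the two variables are independent is rejected at the 5% level. In practice a researcher would more likely rely on a statistical package than on hand-written code.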
Self Assessment Questions
8. ____________ provides the crucial point for the research and helps the
researcher be on the right track.
9. The preparation of a research design enables a researcher to yield minimal
information from the research conducted. (True/False)
10. Report writing should be done with great care by keeping in view the proper
__________ of report.

Activity
Make a list of 10 research companies running in India. Analyse their process of
research and prepare a short report with the information collected.

1.4 SUMMARY

• Research is a process of collecting, analysing and interpreting the relevant
information about any topic. The primary objective of performing research is to
explore answers to questions in a scientific manner.
• Research is done with the objective of determining a phenomenon or exploring
something new.
• Research is also conducted with the objective of testing hypothesised relationships
between the variables considered for study.
• To be effective, research should be directed, systematic, empirical, data-driven,
logical and carefully designed, with a clear purpose, elaboration and efficient
analysis.
• The classification of research can be based on various decisive factors, such as
the technique, reason, accessibility of time and other assets, capacity, type of
investigation, and statistical substance. Various forms of research include basic
research, applied research, descriptive research, causal research, conceptual
research, empirical research, qualitative research and quantitative research,
among others.
• Depending on the nature of the research problem being addressed, different
organisations use different research approaches. Broadly, there are three types
of research approaches, namely the quantitative approach, the qualitative approach
and the pragmatic approach (mixed methods).
• The main motive of doing a research study is to find the hidden truth. Research
leads to progress as research is done to solve problems, expand knowledge or
explore a new phenomenon.
• The significance of research for social scientists is reflected in studying social
relationships and in seeking answers to various social problems.
• Problems encountered by a researcher include lack of scientific training, insufficient
interaction, lack of secrecy, identification of research problems, lack of assistance,
lack of resources, and code of conduct.
• The ethical values related to research are honesty, objectivity, integrity,
confidentiality and social responsibility.
• Managers of an organisation often need to conduct research so as to address
various problems.
• Research process implies the series of events needed to carry out research
competently. The main steps to conduct a research include defining research
problem, reviewing literature, formulating hypothesis, designing research,
collecting data, analysing data and preparing reports.

1.5 KEY WORDS


• Research: It is a search for knowledge.
• Empirical literature: It is an interdisciplinary field of research that is carried out in
areas, such as psychology, sociology and philosophy.
• Objectivity: It implies that the researcher should not be biased in research
design, data collection, interpretation, analysis and other aspects of
research.
• Hypothesis: It is a proposition made on the basis of limited evidence for further
investigation.
• Simulation: It is a scientific modelling of natural systems with an aim to understand
their functioning.
• Honesty: It refers to the truthfulness with which the researcher collects and
presents data.

1.6 CASE STUDY: RESEARCH PROCESS FOLLOWED IN SURVEY LIMITED

Survey Limited, one of the top research companies, was started in Hyderabad, India
in the 1990s. Since then, it has picked up research topics and conducted research on
major disturbing social issues, such as child marriage, dowry, and honor killing.
Now, the company wants to conduct a research study on the increasing effect of
alcohol on adolescents and youth. For this purpose, the company states its research
problem as:

Alcoholic beverages are having a negative effect on the health of adolescents and youth.

In order to understand the topic clearly and deeply, the company surveys and
reviews already available research papers and theses, which include conceptual as
well as empirical literature.

It also takes help from various books and journals. This in-depth review and survey
of the available material enables Survey Limited to develop a clearer understanding for
formulating its research hypothesis. Survey Limited formulates its research hypothesis
as:

Westernisation and changing lifestyle are responsible for the increasing use of
alcoholic beverages.

After formulating its research hypothesis clearly, Survey Limited chalks out the complete
research design within which the research will be carried out. It collects secondary
data from books and journals and primary data through observation and personal
interviews. The collected data is then analysed critically using various statistical
tools, such as bar diagrams, pie charts, tables, and time series. Survey Limited presents
a final report of its work that also includes strategies to reduce the effects of alcohol.

QUESTIONS
1. What did Survey Limited do for a clear and deep understanding of the research
topic?
(Hint: The company surveyed and reviewed already available research papers
and theses.)
2. How did the in-depth review and survey of the available material help Survey Limited
in conducting research?
(Hint: To develop a clearer understanding for formulating the research hypothesis.)
3. What tools were used to analyse the collected data?
(Hint: Various statistical tools, such as bar diagrams, pie charts, tables, and time series.)
4. State the research process followed in Survey Limited.
(Hint: Defining the topic of research, reviewing literature to gain more
understanding about the topic, and so on)
5. What sources were used by Survey Limited for data collection?
(Hint: Primary as well as secondary sources)

1.7 EXERCISE
1. Explain the characteristics of good research.
2. What are the objectives of research?
3. Discuss the importance of research.
4. What ethical norms are required to be followed while conducting research?
5. What are the problems encountered by a researcher in the research process?
6. What are the applications of research in various fields of management?
7. Explain the research process in detail.
8. Explain various forms of research.

1.8 ANSWERS FOR SELF ASSESSMENT QUESTIONS

Topic                  Q. No.   Answer
Concept of Research    1.       frequency
                       2.       observation
                       3.       False
                       4.       inferential
                       5.       marketing
                       6.       False
                       7.       True
Research Process       8.       Formulating hypothesis
                       9.       False
                       10.      layout

1.9 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
• Welman, J., Kruger, F., & Mitchell, B. (2005). Research Methodology. Cape Town:
Oxford University Press.
• Kothari, C. (2004). Research Methodology. New Delhi: New Age International (P)
Ltd.
• Goddard, W., & Melville, S. (2011). Research Methodology. Kenwyn, South Africa:
Juta & Co.
• Lancaster, G. (2012). Research Methods. Taylor & Francis.

E-REFERENCES
• What is Research Definition, Types, Methods & Examples. (2020). Retrieved 7
March 2020, from https://www.questionpro.com/blog/what-is-research/
• Materials, U., Aptitude, R., & Notes, S. (2020). Steps Involved In Research Process
| Research Aptitude Notes. Retrieved 7 March 2020, from
https://ugcnetpaper1.com/research-process/

CHAPTER 2

Defining and Formulating a Research Problem

Table of Contents
2.1 Introduction
2.2 Management Dilemma
    Self Assessment Questions
2.3 Literature Review
    2.3.1 Importance of a Literature Review
    2.3.2 Functions of a Literature Review
    2.3.3 Process of a Literature Review
    2.3.4 How to Write a Literature Review
    2.3.5 Types of Sources for Review
    Self Assessment Questions
2.4 Concept of a Research Problem
    2.4.1 The Need for Defining a Research Problem
    2.4.2 Conditions and Components of a Research Problem
    2.4.3 Identifying a Research Problem
    2.4.4 Formulating a Research Problem
    Self Assessment Questions
2.5 Summary
2.6 Key Words
2.7 Case Study
2.8 Exercise
2.9 Answers for Self Assessment Questions
2.10 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:


• Define the term ‘management dilemma’
• Describe the importance and functions of a literature review
• Explain the ways of writing a literature review
• Discuss the types of sources for review
• Describe the concept of a research problem
• Discuss the conditions and components of a research problem

2.1 INTRODUCTION
In the previous chapter, you studied the concept of research. The chapter
discussed the characteristics and types of research. The latter section of the chapter
described the problems encountered by a researcher. The chapter concluded with
an explanation of the research process.

A management dilemma occurs when the decision-makers of an organisation,
i.e., executives and managers, encounter a complex situation where they have to
choose between two or more options. Their decision will influence the organisation’s
profitability, competitiveness, stakeholders’ wealth, etc. Here comes the need for
dilemma management. The decisions of the management must be based on
facts generated from research. To conduct research, a researcher
defines the area of study and formulates a research problem. A research problem is
an issue, a contradiction, or a gap that a researcher is willing to address. To find the
answer to the problem, the researcher conducts a literature review to learn about
previous research on a similar topic or area of study. A literature review helps in
developing the knowledge base, outlining the research questions and identifying the
findings of existing research conducted by other researchers on the same topic
of study. It is important to find out the exact problem faced by the management before
conducting research, as a problem rightly explained is half
solved. If the researcher has recognised more than one problem, then the problem
must be selected on the basis of priority, financial condition and time limit.
Researchers must familiarise themselves with the selected problem by studying the
available literature.

In this chapter, you will study the management dilemma. Next, you will learn about
the literature review, its functions and process. Further, the chapter will describe
the concept of a research problem. Towards the end, the chapter will briefly discuss
formulating a research problem.

2.2 MANAGEMENT DILEMMA


A dilemma is a tough choice in a complicated situation where managers
have to choose between two or more alternatives. The word ‘dilemma’ combines
‘di’, meaning ‘double’, and ‘lemma’, meaning ‘proposition’ or ‘subject’.
Suppose Priyamvadha goes to Pacific Mall and has to choose between a red dress and
a blue dress. We cannot say that she is in a dilemma. However, if a fire breaks out on
her floor of a residential building, her cat and dog are inside the room, and she can
save only one of them, then this can be considered an awful dilemma.

Therefore, we can say that the management dilemma is a complex situation faced by
executives or managers when they have to achieve two or more goals at a particular
time. It becomes difficult for them to prioritise one goal over the others. As an
executive or a manager of an organisation, people are likely to face management
dilemmas on a regular basis. For example, the marketing head of XYZ Organisation
is in a dilemma because a few months ago, a competitor announced
that it will launch a new product very soon, which is now under the development
stage. XYZ Organisation has also publicly announced that it will launch the same type of
product. The launch of the new product is in 2 months, and the development team
informs the marketing head that its version of the product will not match
the standard of the competitor’s product. The team needs more than a year to create
a product matching the competitor’s standards, and it will be too late to start
afresh. This type of a condition forces the marketing head to decide
whether to launch the product or postpone it. Business management
is full of such problems, called management dilemmas, that need detailed business
research and study.

A management dilemma is usually the symptom of a problem that requires a business
decision, which can be related to:
€€ Increase in the overall costs
€€ Decline in the sales
€€ Increase in the number of defects in a product
€€ Rise in customer complaints post purchase of a product
€€ Conflicts among employees
€€ Low motivation levels
€€ Performance issues
€€ High absenteeism rate
€€ Resignation of key employees
It is necessary for an organisation to manage dilemmas in order to take the best business
decisions. Dilemma management helps in resolving complex issues in a
systematic manner. The following points must be kept in mind while facing a dilemma:
€€ Address the dilemma; do not try to avoid it: In an organisation, dilemmas can
arise because of a lack of foresight. A manager must acknowledge that
there is an issue and try to find a solution to it promptly. If the manager tries to
avoid the issue, it will only escalate. The manager can begin by evaluating what
the underlying problem is and look for an appropriate solution. This will assist in
preparing for future complexities.
€€ Think productively: In an organisation, it is necessary for managers and executives
to think in a productive manner and analyse the situation from all angles. Managers
must create willingness to work and help every employee in resolving issues. The
goal is to pay more attention to the dilemma and find the best possible solutions.
Management must avoid defensive and reactionary behaviours.
€€ Review action: After addressing the issue, it is good to review the actions. This
helps in finding out whether the actions taken were successful or whether there is any other
way to solve the problem. Managers must examine the work critically and find
out the hidden mistakes.
€€ Develop the environment for dilemma management: An organisation must
establish an environment within the workplace for dilemma management rather
than its avoidance. Managers and executives must understand their mistakes,
learn from them and take corrective actions.
Self Assessment Questions
1. A __________ is referred to as a tough choice in a complicated situation where
managers have to choose between more than one alternative.
2. Dilemma management helps in resolving complex issues in a systematic
manner. (True/False)
3. Management must avoid __________ and reactionary behaviours at the
time of resolving complex issues.

2.3 LITERATURE REVIEW
Literature is the assembly of scholarly writings on a certain topic. A literature
review is a document that helps researchers examine the published information
related to a particular subject area. A literature review covers a particular time
period. It is not limited to the summarisation of the sources; rather, it combines
both summary and synthesis of sources. A summary is a recap or brief
outline of the important information in a source, whereas a synthesis is the
reshuffling or reorganisation of that information. A literature review helps in providing
an outline of current knowledge that allows the writer or researcher to recognise the
relevant theories, methods, and gaps in past and present research. A literature
review is conducted by collecting, evaluating and analysing books, journals and articles
(publications) related to the research problem. To understand how the
management dilemma can be solved, researchers examine a wide variety of journals, books,
and articles related to the business research problem. Researchers then present a summary
and critical evaluation of the literature that fits their field of study to the
interested parties. This process is known as the literature review.

A good literature review provides a clear picture of the current knowledge on the
research subject. The objectives of a literature review are:
€€ To conduct a survey in the area or subject of study
€€ To synthesise the information into a summary
€€ To perform a critical analysis of the collected information by recognising the gaps
in the existing knowledge
€€ To show limitations of research theories and review controversial areas
€€ To present the literature in a proper manner
Let us understand the importance of a literature review.

2.3.1 IMPORTANCE OF A LITERATURE REVIEW


A researcher must not confuse a literature review with a research paper; the two are
different. A literature review does not give new ideas related to the
topic of study or research problem. It just summarises and synthesises the ideas of
others or existing literature. Research papers, however, develop new arguments and
are based on original research. Conducting a literature review is just like doing
homework and getting an idea about the topic in advance. It is important to conduct
a literature review because:
€€ Literature review brings clarity regarding the subject of study and helps in
understanding the subject.
€€ Literature review helps in familiarising the researcher with the research
methodologies used by others for finding the answers to the related research
problems.
€€ Literature review helps identify which methodologies in previous research
have been most beneficial in analysing a topic.
€€ Literature review makes the researcher aware of the pitfalls and problems faced by
others and helps in choosing the correct methodology to solve the problem.
€€ Literature review also helps broaden the knowledge base in the related research
area which the researcher wants to study.
€€ Literature review helps in assessing the researcher’s present knowledge for
conducting the study.
€€ Literature review helps identify the experts on a researcher’s topic of study. For
example, if any person has written 20 articles on a research topic related to your
research subject, then he/she is likely to be knowledgeable about that topic. This
person’s written work could be a key resource for consultation in your research.
€€ Literature review helps in avoiding duplication and plagiarism.
€€ Literature review helps make a comparison between the findings of the researcher
and others.
Now, let us understand the functions of a literature review:

2.3.2 FUNCTIONS OF A LITERATURE REVIEW


Literature review helps the researcher create a link or bond with the readers and
build trust. It also helps eliminate the chances of repeating similar research
that has already been published. It saves the time, money and other resources invested by the researcher
in conducting research. A literature review gives a theoretical background of the
research subject and establishes the relation between what a researcher is proposing
to examine and what has already been studied.

The main functions of a literature review are shown in Figure 1:

Providing a Context to your Research
Giving a Shape to your Research Problem
Ensuring Novelty in your Research
Showing the Contribution of the Findings
Formulating a Research Hypothesis

Figure 1: Functions of a Literature Review

Let us discuss the functions of a literature review:


€€ Providing a context to your research: A literature review helps place your research
in the context of what is already known about the topic. It answers questions such
as:
zz How does your research answer the management
dilemma/question in comparison to the answers given by other researchers?
zz What contribution has your research work made?
zz What are the differences between your findings and the findings of other
researchers?
€€ Giving shape to your research problem: By understanding the topic better, you
will be able to conceptualise your research problem clearly and precisely. You will
also be able to understand the relationship between your research problem and
the work done in your research area.
€€ Ensuring novelty in your research: A literature review ensures that you
do not ‘reinvent the wheel’. In other words, you save the effort of trying to rediscover
something that is already known or published. This will ensure that you bring
new and significant contributions to your field of research.
€€ Showing the contribution of the findings: It enables you to show how your
findings have contributed to the existing body of knowledge in your profession.
€€ Formulating a research hypothesis: Researchers read and review the available
literature related to the research topic. The review may include reading articles,
books, cases or other research papers. After the literature review is completed,
the researchers gain a sufficient amount of information regarding their study
topic, which helps them in narrowing down or limiting it and expressing it in
the form of a research question. The research hypothesis is constructed using the
research question. Therefore, a literature review helps in formulating the research
hypothesis.

2.3.3 PROCESS OF A LITERATURE REVIEW


A literature review helps the researcher prepare well for conducting research. It
shows the originality and relevance of the research problem and justifies the proposed
research methodology. Research methodology involves the techniques that are used
to recognise, select, process, and examine information about the research topic.
Figure 2 shows the process of a literature review:

Search the existing literature in your field of interest
Review the literature obtained
Develop a theoretical framework
Write up the literature review

Figure 2: Process of a Literature Review

Let us understand the process of literature review in detail:


1. Search the existing literature in your field of interest: First, search what has
already been done on the chosen topic of interest. To search the existing literature,
compile a bibliography and/or a list of references, which is a list of books on the
topic of interest. To save time, the researcher can go through the following sources:
zz Indices of journals on your research topic
zz Abstracts of articles on your research topic
zz Citation indices
zz Digital libraries
2. Review the literature obtained: After the researcher has identified the relevant
journals and books, he/she must start reading them. Evaluate them critically
to compile themes and issues that are associated with the research topic. The
researcher must note down the main points to create a rough framework or theme
of the research.
Do a critical evaluation of the literature to:
zz Identify the proposed theories, critiques and methodologies (sample size, data
used, measurement methods)
zz Assess whether the knowledge relevant to your theoretical framework has
been confirmed beyond doubt
zz Discover different perspectives among researchers and write down your
opinions about their validity
zz Find the gaps that are present in the existing body of knowledge
3. Develop a theoretical framework: A literature review can be a time-consuming
task. Therefore, it is important to set the boundary and parameters for a research
work. Sort the information obtained from the literature sources according to
the theoretical framework. This will enable the researcher to focus the literature
search. In other words, the theoretical framework will provide a foundation and
a guide for further reading. It is quite possible that the framework will need to
change as the researcher reads further. However, this is part of the research process.
4. Write up the literature review: The final step is to compile and write up all the
literature read and reviewed. To do so, the researcher must perform the following
steps:
a. Start the review with a theme or points
b. Organise and list all the themes to discuss and relate. This will give a structure
to the literature review
c. Identify and describe various theories relevant to the field of research
d. Describe the gaps that exist in the body of knowledge in the field
e. Explain the recent advances and current trends in the field of research
f. Compare and evaluate findings based on:
i. Assumptions of research
ii. Theories related to the topic of research
iii. Hypotheses
iv. Research designs applied
v. Variables selected
vi. Potential future work speculated by the researchers
g. Acknowledge, cite and quote sources of research. Give credit to the works
of other researchers. Quote their work to show how your research contradicts
or contributes to their work. This will make the literature review more
comprehensive and precise.

2.3.4 HOW TO WRITE A LITERATURE REVIEW


After reviewing the existing body of knowledge on the topic of research, the researcher
has created a theoretical framework for the area of research. The researcher is now ready
to write a literature review. How should the researcher go about it? Some strategies of
writing a literature review are shown in Figure 3:

Find a Focus
State the Focus
Present Information
Compose your Study
Revise or Review Information

Figure 3: Strategies of Writing a Literature Review

The strategies of writing a literature review are as follows:


€€ Find a focus: A literature review is generally organised around ideas, and not just
sources. While reading the existing body of knowledge on the topic, the researcher should
consider the following questions and pick a theme on which to focus and organise the review:
zz Which themes connect the sources together?
zz Do they present single or multiple solutions?
zz Are there any gaps in the existing themes?
zz How effectively do they present the material?
zz Do they reveal a trend in the field?
€€ State the focus: The researcher writes a simple statement in the literature review
that tells readers what to expect. Some examples are as follows:
zz The current trend in treatment for cancer combines surgery, medicine and
natural healing.
zz Popular media is acquiring academic consideration.
€€ Present information: The researcher organises the information to present in the
following way:
zz Cover the basic categories: A literature review contains the following three
basic categories:
99 Introduction: It gives a quick idea of the topic of literature review, such as
central theme.
99 Body: It contains discussion of sources. It can be organised chronologically,
thematically or methodologically (discussed further).
99 Conclusions/recommendations: It provides the conclusion the researcher
has drawn from reviewing literature.
zz Organise the body: Once the researcher has the basic categories in place,
consider how to organise the sources within the body of the review. Table 1
shows the ways to organise sources of a literature review:

Table 1: Ways to Organise Sources of a Literature Review

Organisation Method    Description
Chronologically        Sources are organised according to when they were
                       published.
By publication         Sources are organised by publication in a chronological
                       manner if the order shows an important trend.
Thematically           Sources are organised around a research topic or a
                       problem, rather than time progression. However, time
                       progression may still be an essential factor in this case.
Methodologically       Sources are organised by the methods of the research or
                       by author.

zz Consider additional sections: Sometimes the researcher might need to add
additional sections for the study, such as:
99 Current situation: This provides the necessary information for the readers
to understand the topic or focus of the literature review.
99 History: This presents the chronological progression of the field.
99 Methods and/or standards: This presents the criteria used to select the
sources in the literature review.
€€ Compose your study: After organising the basic categories, the researcher is ready
to write the review. Some guidelines to follow during the writing are given below:
zz Refer to several other sources when making a point. Back up your point with
suitable evidence.
zz Selectively highlight only the most important points in each source. Your
points must directly relate to the review’s focus.
zz Avoid using any direct quotes. This is because the survey nature of the literature
review does not allow for in-depth discussion or detailed quotes from the text.
However, if you do want to use quotes to emphasise a point, then use short
quotes sparingly.
zz Summarise and synthesise your sources within each paragraph and throughout
the review.
zz Maintain your own voice by starting and ending a paragraph with your own
ideas and own words.
zz When paraphrasing a source that is not your own, remember to represent the
author’s information/opinions accurately and in your own words.
€€ Revise or review information: Finally, the researcher must revise the review.
Make sure that it follows the outline. Rewrite the language of the review to present
information in the most concise manner possible. Avoid unnecessary jargon
or slang; use familiar terminology. The researcher must verify that sources are
documented and format the review appropriately.
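
The organisation methods listed in Table 1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the Source record, its fields (author, year, theme, method) and the four sample entries are hypothetical and do not come from any real bibliography.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one source in the review."""
    author: str
    year: int
    theme: str
    method: str


def chronological(sources):
    """Chronological organisation: order sources by year of publication."""
    return sorted(sources, key=lambda s: s.year)


def group_by(sources, attribute):
    """Thematic or methodological organisation: group sources by the given
    attribute ('theme' or 'method'), keeping each group in date order."""
    groups = defaultdict(list)
    for source in chronological(sources):
        groups[getattr(source, attribute)].append(source)
    return dict(groups)


# Hypothetical sample data, for illustration only.
SOURCES = [
    Source("Rao", 2015, "pricing", "survey"),
    Source("Mehta", 2009, "customer retention", "case study"),
    Source("Iyer", 2018, "pricing", "case study"),
    Source("Das", 2012, "customer retention", "survey"),
]

for theme, items in group_by(SOURCES, "theme").items():
    print(theme, [f"{s.author} ({s.year})" for s in items])
```

Grouping by theme while keeping each group in date order reflects the remark in Table 1 that time progression may still matter in a thematic review.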

2.3.5 TYPES OF SOURCES FOR REVIEW


The literature includes peer-reviewed articles, books, dissertations and conference
papers. Literature review sources can be divided into two categories: primary
and secondary sources. The primary sources are original and
provide first-hand information. Some of the examples of primary sources are:
€€ Reports
€€ Theses
€€ E-mails
€€ Conference proceedings
€€ Organisation reports
€€ Unpublished manuscript sources
€€ Some government publications
The secondary sources are non-original and provide second-hand information. Some
of the examples of secondary sources are:
€€ Journals
€€ Books
€€ Newspapers
€€ Some (secondary) government publications
Self Assessment Questions
4. A literature review creates a rapport with the readers and builds trust.
(True/False)
5. The secondary sources are non-original and provide __________.
6. Which among the following is not an example of primary sources?
a. E-mails
b. Organisation reports
c. Reports
d. Newspapers

2.4 CONCEPT OF A RESEARCH PROBLEM


A research problem is referred to as a statement which is about an area of concern,
a condition that needs improvement, a difficulty to be eliminated, or a troublesome
query that exists in scholarly literature, in theory, or in practice that requires
meaningful understanding and deliberate investigation. In social science, the research
problem is in the form of a question. A research problem does not explain the way to
do something, offer a vague or broad proposition, or show a value question.

2.4.1 THE NEED FOR DEFINING A RESEARCH PROBLEM


It is important to formulate a research problem carefully to clearly indicate what you
intend to achieve through research. It is said that a process well begun is half done.
A well-formulated research problem makes the research process easier and more
focussed. It helps the researcher:
€€ Separate the irrelevant data from the relevant data
€€ Keep the research work on track
€€ Ensure efficient and focussed literature review and other studies
€€ Keep the research centred around the problem

2.4.2 CONDITIONS AND COMPONENTS OF A RESEARCH PROBLEM


A research problem exists if the following four conditions are met:
€€ There must be a problem whose solution is presently not known.
€€ There must be an individual, group or organisation to which the problem can be
attributed.
€€ There must be at least two courses of action which a researcher can pursue.
€€ There must be at least two feasible outcomes of the course of action. Out of the two
outcomes, one outcome should be preferable to the other.
On the basis of these conditions, the components of a research problem are:
€€ Individual, group or institution: There must be somebody to whom the research
problem can be attributed. It may be an individual, a group or an institution. The
individual/group/organisation is the one that is facing the problem or difficulty. At
times, these individuals or the group may themselves be researchers.
€€ Research objectives: There must be a purpose for which the research is conducted.
Every research study is carried out to meet some predefined objectives.
€€ Environment: It refers to the surrounding in which a problem exists. Environment
is of three types: economic, social and political. A problem pertaining to the
study of inflation would come under the economic environment and a problem
pertaining to studying the effects of child marriage on the health of women would
come under the social environment.

2.4.3 IDENTIFYING A RESEARCH PROBLEM


Identifying or selecting a research problem is a difficult and time-consuming task.
While doing so, a researcher should consider the following factors:
€€ Personal interest: This is the chief motivation to select a research problem.
Academic research is a time- and effort-consuming process. A researcher can
consistently pursue it only if he/she is personally interested in resolving the
problem. The interest of a researcher further depends on other factors, such as
educational background, professional and personal experience and outlook.
€€ Knowledge and competence: The selection of a research problem depends on the
researcher’s knowledge in the field of interest and his/her capability to perform
research successfully. The qualification of the researcher, and his/her training and
experience must match the research problem.
€€ Availability of resources: Academic research usually involves large-scale data
collection, extensive travel, a lot of time and finance. If sufficient resources, such as
time and money, are available to research a problem, then the problem is selected.
€€ Relative importance: If a problem is relatively important and urgent, then the
research must be conducted to solve that problem first.
€€ Usefulness and significance: The practical usefulness of a problem is also a major
motivation for a researcher to attend to it.
€€ Timeline of the problem: Some problems take little time to be resolved, while
others take a considerable time. So, the time taken to complete research work is
also an important criterion to select a problem.
€€ Data availability: A researcher would select a problem for which sufficient and
relevant data is available.
€€ Novelty: If a problem is around a current topic of interest, then it is more likely
to be picked up for research. Any findings would invite immediate publicity and
funding for the researcher.

2.4.4 FORMULATING A RESEARCH PROBLEM


The next step after identifying a research problem is to formulate it in a form
amenable to research. It means specifying the research problem in detail and
narrowing it down to a workable size. In this step, all questions and sub-questions,
which a researcher wants to answer by his/her research, are specified. In addition,
the scope and boundaries of investigation are determined. While formulating a
research problem, the researcher should clearly state the assumptions. Formulating
a research problem is a three-step process. These steps are shown in Figure 4:

Defining the Research Problem
Identifying the Variables
Evaluating the Research Problem

Figure 4: Steps to Formulate a Research Problem

Let us understand these steps in detail:


1. Defining the research problem: The first step of formulating a research problem
is to state the problem in the form of a question or statement to make it clearer
and more understandable. A good statement must clearly mention what exactly you
want to solve or determine by the research study. You also need to describe the
theoretical basis and background of the study. Major issues and elements of the
research should be divided into sub-parts for better understanding. It is also
important to state the problem in a manner that indicates the relationship between
two or more variables.
2. Identifying the variables: It is very important to identify the
variables involved in a research study because it helps in stating the problem in a
more precise manner. In other words, all variables involved in a research problem
should be defined in such a manner that they can be measured or expressed
quantitatively or qualitatively.
3. Evaluating the research problem: The third step in formulating a research problem
is to evaluate it in terms of originality, importance and feasibility. These factors are
discussed as follows:
zz Originality: The research problem should be unique. Any topic on which a
lot of research has already been done should be avoided because it would be
difficult to highlight anything new in that topic. However, in some cases, you
may decide to research a previously researched topic to verify its conclusions,
explain and elaborate the conclusions in a more effective manner, and solve
some of the inconsistencies of the previous research.
zz Importance: The research study should be significant enough to either become
the basis of any new theory or pose some problems for further research. In
addition, the research study should also have some practical applications.
zz Feasibility: This refers to the chances of conducting the research successfully. You
should take up a problem that is feasible for you to research. A
research problem may not be feasible because of the following reasons:
99 Lack of skills and competencies of the researcher
99 Lack of interest and enthusiasm of the researcher
99 High cost involved in the research study
99 Time constraint
99 Administrative constraints, such as lack of cooperation from administrative
authorities
Self Assessment Questions
7. A well-formulated __________ makes the research process easier and more
focussed.
8. The first step of formulating a research problem is to identify the variables.
(True/False)
9. __________ is the chief motivation to select a research problem.
10. Formulating a research problem is a __________-step process. Choose the
correct answer.
a. three
b. four
c. five
d. six

2.5 SUMMARY
€€ A dilemma is referred to as a tough choice in a complicated situation where
managers have to choose between more than one alternative. The word ‘dilemma’
is created by combining prefix ‘di’ and suffix ‘lemma’, where ‘di’ means ‘double’
and ‘lemma’ means ‘proposition’ or ‘subject’.
€€ A management dilemma is usually the symptom of a problem that requires a
business decision, which can be related to an increase in the overall costs, decline
in sales, conflicts among employees, etc.
€€ A literature review is a document that is prepared after conducting search and
evaluation according to the subject or chosen topic area. It examines the published
information related to the particular subject area about which the writer is writing.
€€ A good literature review provides a clear picture of the current knowledge on the
research subject. Conducting literature review is just like doing homework and
getting an idea about the topic in advance.
€€ Literature review also helps broaden the knowledge base in the related research
area in which researchers want to study.
€€ A literature review gives the theoretical background of the research subject.
€€ The steps to conduct the literature review process include searching the existing
literature in your field of interest, reviewing the literature obtained, developing a
theoretical framework, and writing up the literature review.
€€ Literature review sources can be divided into two categories: primary and
secondary sources.
€€ The primary sources are original and provide first-hand information. The
secondary sources are non-original and provide second-hand information.
€€ A research problem is referred to as a statement which is about an area of concern,
a condition that needs improvement, a difficulty to be eliminated, or a troublesome
query that exists in scholarly literature, in theory, or in practice that requires
meaningful understanding and deliberate investigation.
€€ It is important to formulate a research problem carefully to indicate what you
intend to achieve through research.

2.6 KEY WORDS


€€ Dilemma: The tough choice in a complicated situation where managers have to
choose between more than one alternative
€€ Journals: Scholarly publications that consist of articles written by researchers,
professors and other experts
€€ Literature: The assembly of scholarly writings on a certain topic
€€ Primary sources: Sources which are original and provide first-hand information
€€ Secondary sources: Non-original sources which provide second-hand
information

2.7 CASE STUDY: MANAGEMENT DILEMMA OF WALMART INC.


Walmart Inc. was founded by Sam Walton. It was incorporated in 1969. Walmart is
a multinational retail corporation that is based in America and operates discount
department stores, grocery stores, and a chain of hypermarkets. Walmart is
headquartered in Bentonville, Arkansas.

Walmart is considered the largest retail organisation, operating
warehouses and department stores globally. Research was conducted to
find a solution to the management dilemma of Walmart stores in the US, which
faced a reduction in sales during harsh economic times, with a 2.6 percent reduction
in store visits. Researchers used the Management-Research Question Hierarchy
(MRQH) to identify the management’s dilemma. MRQH is a process of sequential
question formulation that helps researchers find solutions to a specific situation or
management dilemma.

It was found that during the initial 5 months, there was a drop of 82.8 million in
customer visits, while Walmart’s competitors such as the Dollar General Corp and the
Kroger Co. increased their sales. This was identified and defined as the existing
problem, which had to be solved promptly. Walmart needed to make
sure that its stores addressed all the existing demands of their customers to ensure
customer retention.

The solutions suggested by the researchers for Walmart stores were as follows:
€€ Management must recreate the organisation’s leadership in terms of price and
delivery as per the customer’s needs.
€€ Management must focus on delivering high-quality products at reasonable or
reduced prices in every season.
€€ Management must emphasise offering a different range of products to the
customer and offer more choices.
Source: https://ivypanda.com/essays/wal-marts-management-dilemma/

QUESTIONS
1. What dilemma was faced by the management of Walmart?
(Hint: Sales reduction, increase in competitor’s sales)
2. What is Management-Research Question Hierarchy (MRQH)?
(Hint: Sequential question formulation, management’s problem solution)
3. What is a dilemma?
(Hint: Difficulty in making choice, complex situation)
4. Who were the competitors of Walmart?
(Hint: Dollar General Corp, Kroger Co.)
5. What were the solutions recommended by the researchers to solve the management’s
dilemma?
(Hint: Product quality, reduced price, more range of products)

2.8 EXERCISE
1. What do you understand by the term ‘management dilemma’?
2. What is the importance of conducting a literature review in research?
3. Define research problem.
4. Explain the conditions and components of a research problem.
5. What are the steps to formulate a research problem?
6. What points must be kept in mind while managing a dilemma?
7. Describe the types of sources for review.
8. Explain the process of conducting a literature review.
9. List down the ways to organise the sources of a literature review.
10. What factors must be considered while identifying a research problem?

2.9 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                           Q. No.   Answer
Management Dilemma              1.       dilemma
                                2.       True
                                3.       defensive
Literature Review               4.       True
                                5.       second-hand information
                                6.       d. Newspapers
Concept of a Research Problem   7.       research problem
                                8.       False
                                9.       Personal interest
                                10.      a. three

2.10 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
€€ Kothari, C. (2019). Research Methodology. [S.l.]: New Age International.
€€ Goddard, W., & Melville, S. (2011). Research Methodology. Kenwyn, South Africa:
Juta & Co.

E-REFERENCES
€€ (2020). Retrieved 3 March 2020, from
http://newhorizonindia.edu/nhc_kasturinagar/wp-content/uploads/2018/01/IV-BBA-BRM-1.pdf
€€ (2020). Retrieved 3 March 2020, from
http://www.crectirupati.com/sites/default/files/lecture_notes/BRM_notes.pdf
€€ 7 Basic Steps in Formulating a Research Problem | Research Idea. (2020). Retrieved
3 March 2020, from
https://www.campuscareerclub.com/steps-in-formulating-a-research-problem/
€€ (2020). Retrieved 3 March 2020, from
https://www.manaraa.com/upload/43ef7b58-5c8a-4371-8aea-699609cd2aaf.pdf
CHAPTER 3

Research Design

Table of Contents
3.1 Introduction
3.2 The Concept of Research Design
Self Assessment Questions
3.3 The Need and Features of Research Design
Self Assessment Questions
3.4 Types of Research Design
3.4.1 Research Design for Exploratory (Formulative) Research Studies
3.4.2 Research Design for Descriptive Studies
3.4.3 Research Design for Experimental Studies
Self Assessment Questions
3.5 The Components of Research Design
Self Assessment Questions
3.6 Summary
3.7 Key Words
3.8 Case Study
3.9 Exercise
3.10 Answers for Self Assessment Questions
3.11 Suggested Books and e-References

LEARNING OBJECTIVES


After studying this chapter, you will be able to:
€€ Explain the concept of research design
€€ List the needs and features of a research design
€€ Discuss various types of research design
€€ Explain the components of a research design

3.1 INTRODUCTION
In the previous chapter, the use of research for handling a management dilemma was
discussed. The chapter covered the importance, functions and process of a
literature review, and then described how to write a literature review and
the types of sources for review. Further, the need for defining a research problem
and the conditions and components of a research problem were discussed. The chapter
concluded with an explanation of how to formulate a research problem.

The preparation of the design of a research project, generally known as a research
design, is one of the crucial stages in the success of the project. A research
design is a blueprint that is followed as a guide during the complete research
study. It creates the framework for the collection, measurement and analysis of
data. The overall purpose of any research is to seek an answer to a research
problem. The successful completion of any research project depends on how well its
research design fits its research problem.

A research design is, therefore, a comprehensive plan, framework and strategy
for conducting research. It forms the basis of every research study and offers vital
information to the researcher, such as the research topic, data type, data sources and
methods of data collection.

This chapter will help you in understanding the concept of research design. You will
study the need and features of a research design. Further, various types of research
design are also discussed. Towards the end, you will learn about the components of
research design.

3.2 THE CONCEPT OF RESEARCH DESIGN


Once the research problem has been identified and the literature review has been
done, the next step is to frame a research design. Any sort of research needs a design
before data collection and analysis begin. A research design is framed to ensure
that the information collected from the research will enable the researcher to
answer the research problem satisfactorily. Typically, the researcher must first
consider what information needs to be collected to answer the research problem.
A research design is a systematic approach that a researcher uses to handle a
research problem efficiently. It provides insights into ‘how’ to conduct
research using a particular methodology. It combines various components and data
to arrive at a feasible outcome.

The decisions concerning what, where, when, how much, and by what means
regarding an investigation or a research study constitute a research design. Some
definitions of a research design by different experts are given as follows:

According to Claire Selltiz and others, “A research design is the arrangement of
conditions for collection and analysis of data in a manner that aims to combine relevance to
the research purpose with economy in procedure.”

According to David J. Luck and Ronald S. Rubin, “A research design is the determination
and statement of the general research approach or strategy adopted for the particular project.
It is the heart of planning. If the design adheres to the research objective, it will ensure that
the client’s needs are served.”

According to Kerlinger, “A research design is the plan, structure and strategy of
investigation conceived so as to obtain answers to research questions and to control variance.”

According to Green and Tull, “A research design is the specification of methods and
procedures for acquiring the information needed. It is the overall operational pattern or
framework of the project that stipulates what information is to be collected from which source
by what procedures.”

In other words, a research design is a complete guide and provides answers to the
following questions:
€€ What is the research all about?
€€ Why is the research required?
€€ Where will the research be conducted?
€€ What type of data is required?
€€ Where can the required data be found?
€€ What is the time-period of research?
€€ What will be the sample design?
€€ What techniques of data collection will be used?
€€ How will the data be analysed?
€€ What will be the style of report preparation?
Self Assessment Questions
1. ______________ provides insights into ‘how’ to conduct research using a
particular methodology.
2. The decisions concerning what, where, when, how much, by what means
regarding an investigation or a research study constitute a research design.
(True/False)

3.3 THE NEED AND FEATURES OF RESEARCH DESIGN


All researchers need a research design for conducting research. The need for a
research design is explained by the following points:
€€ To facilitate smooth research operations: A research design is needed because
it enables the smooth functioning of various research operations. Thus, it makes
research as efficient as possible, yielding maximal information with minimal
expenditure of effort, time and money.
€€ To plan data collection and analysis: A research design is needed to plan
data collection and analysis in advance of conducting a research project.
€€ To plan availability of research resources: A research design should be prepared
keeping in mind the objective and available resources of the research. A research
design is needed to plan in advance the availability of staff, time and money.
Groundwork of the research design should be done with great care as any error in
it may upset the entire research project.
€€ To attain reliable results: A research design lays a firm foundation for the entire
structure of the research work, and this has a great bearing on the trustworthiness
of the results arrived at by the end of the research work.
€€ To conduct useful research: Negligence in designing the research project may
result in worthless research efforts and pointless research outcomes. Before
starting research operations, it is vitally important to
prepare an appropriate and efficient design.
€€ To organise research ideas: A research design is needed by the researcher to
organise the ideas of research in a form that enables researchers to look for
flaws and inadequacies in the research problem. A research design can even be given
to others for their comments and critical evaluation.

A research design helps researchers plan research methods well in
advance, select appropriate tools for data collection and run the research project
smoothly.

Usually, a good research design minimises bias and maximises the
trustworthiness of the data collected and analysed. The design that gives the least
experimental error is regarded as the best design in scientific research. Similarly,
a design is considered to be efficient and appropriate if it gives the maximum
information and provides an opportunity to consider various aspects of a problem.
A good research design should possess the following characteristics:
€€ Reliability: A research design should yield consistent results throughout a
series of measurements.
€€ Objectivity: A research design should allow the use of uniform measuring
instruments. An impartial measuring instrument enables every observer or judge
recording the performance to record the data precisely and give a uniform
report. Objectivity also implies that research methods are
judged by the degree of agreement between the final scores assigned to different
individuals by more than one independent observer. This supports the fairness
and transparency of the collected data, which is further analysed and interpreted
to get information.
€€ Validity: A research design should specify measuring devices or
instruments that measure only what they are expected to measure. For instance,
an intelligence test conducted to measure the Intelligence Quotient (IQ) should
measure only intelligence and nothing else. The questionnaire for the IQ test should
be framed accordingly.
€€ Adequate information: A research design should provide adequate information
so that the research problem can be analysed from a wide perspective. A sound
research design should consider the following important factors:
zz The exact research problem to be studied
zz The main purpose of the research
zz The procedure of finding information
zz The accessibility of adequate and skilled manpower
zz The availability of enough financial resources for carrying out the research
€€ Generalisability: This implies how well the data collected from the sample
can be used to draw generalisations that are relevant to the larger
group from which the sample is drawn. A research design helps
researchers generalise their findings provided that due care is taken in defining
the population, selecting the sample, choosing appropriate statistical analysis, etc.,
while preparing the research design. A research problem whose findings are to be
generalised should have the following characteristics:
zz The problem should be clearly formulated.
zz The population should be clearly defined.
zz The most suitable techniques of sample selection should be used to form an
appropriate sample.
zz Suitable statistical analysis should be carried out.
zz The findings of the research study should be capable of generalisation.
€€ Other features: A good research design should have other features too, such as
flexibility, adaptability, efficiency and economy. A good research design should
offer maximum reliability and generalisability with minimum bias.
Self Assessment Questions
3. ___________ in designing the research project may result in the execution of
the futile research exercise.
4. A research design cannot be given to others for their comments and critical
evaluation. (True/False)

3.4 TYPES OF RESEARCH DESIGN


Research can be conducted in various ways and under diverse conditions. The type
of research defines the type of research design needed. It is quite possible that a
research design may be appropriate for one type of research, while it may not be
suitable for another type of research. For instance, the marketing department of an
organisation should conduct exploratory research to identify the potential areas
of growth. In case the marketing department wants to see the impact of different
packaging on the sales of a product, then experimental research should be conducted.

So, for exploratory research, the research design should be flexible enough to
accommodate continuous changes. On the other hand, if the research is diagnostic, a
flexible research design is not appropriate because this type of research demands
precision, accuracy, minimum bias and reliability. Therefore, the research design
must be rigid (not flexible) in this case.

Prior to deciding the research design for a particular type of research, the following
questions must be asked:
€€ What is the nature of the research problem to be studied?
€€ Which techniques of data collection and analysis will be used in conducting the
research?
€€ Which conditions must be met for the selected methods of data collection
and analysis to be applied?

Depending on the type of research study to be conducted, the types of research
design are shown in Figure 1:
Figure 1: Types of Research Design (research design for exploratory (formulative)
research studies; research design for descriptive studies; research design for
experimental studies)

Each type of research design is discussed in the following subsections.

3.4.1 RESEARCH DESIGN FOR EXPLORATORY (FORMULATIVE) RESEARCH STUDIES
The other name for exploratory studies is formulative studies. The prime objective of
this type of research study is to formulate a problem for more precise investigation.
Typically, these studies are undertaken in the absence of enough information
regarding a problem or situation. An exploratory study places more emphasis on
a problem or situation to gain familiarity with its different aspects. A researcher
conducts an exploratory study when only some facts are known about a problem or
situation and there is a need to know more about it. The key emphasis in such studies
is on the discovery of ideas and insights.

For example, a restaurant chain might undertake an exploratory study to find out
different ways to improve the quality of customer service in its
restaurants without making any major investments. The researcher, in this
case, will initially have only a little information regarding the current
quality of customer service that is to be researched.
More information can be gained only through an exploratory study. The researcher,
along with the research team formed for the purpose, may interview the existing
customers of the restaurant, review the available literature and consult experienced
people in the field.

An exploratory study, by its very nature, considers different aspects of a situation
or topic. Thus, the research design for an exploratory study must be flexible enough
to consider all aspects of the research problem. Usually, the following methods are
used in the research design for exploratory studies:
€€ Literature review: The most important and fruitful method of formulating
a problem with precision is the review of literature. If the problem has been
formulated earlier, then the available literature can be reviewed to test it for
its significance and usefulness. If the problem has not been formulated earlier,
then the literature has to be reviewed for formulating it. Reviewing the available
literature helps the researcher in applying the already developed theories and
concepts to the subject of research.
€€ Experience survey: This implies surveying people who have real-world
knowledge of the topic of the proposed research or of related topics.
Experienced people can prove helpful by providing significant
and innovative ideas for the research. An experience survey can be conducted by
scheduling interviews with the experienced people.
For conducting interviews, one should prepare a set of methodical questions to be
put to the experienced people. However, one should also give the respondents a
sufficient chance to raise questions and satisfy their concerns. An experience
survey makes a research more practical, feasible and applicable. One may use
either of the aforementioned methods for conducting an exploratory research.
However, one should ensure that the research design is flexible enough to include
different aspects of a problem.
€€ Analysis of ‘insight-stimulating’ examples: This is also considered a fruitful
and vital method for suggesting research hypotheses. It is particularly
appropriate in areas where there is little experience to serve as a guide. This
technique consists of the rigorous study of selected cases of the phenomenon
in which one is interested. The existing records, if any, may be examined,
unstructured interviewing may take place, or some other approach may be adopted
for analysing the ‘insight-stimulating’ examples. The attitude of the researcher,
the intensity of the study and the ability of the researcher to draw together diverse
information into a unified interpretation are the main features that make this
method an appropriate procedure for evoking insights.

3.4.2 RESEARCH DESIGN FOR DESCRIPTIVE STUDIES


Descriptive studies describe facts and situations as they are. Such studies are
concerned with describing the characteristics of a particular individual or group.
They are concerned with the ‘what’ and not with the ‘how’ and ‘why’ of a research
problem. Research studies related to specific predictions, narration of facts and
characteristics of human beings are examples of descriptive research. For instance,
a research study that aims to describe or list the major features of the
organisational culture at the Infosys office in Hyderabad, India is a descriptive study.

As the aim of a descriptive study is to obtain accurate and complete information,
the researcher should be very careful about the data and methods to be used. For
descriptive studies, the research design should not be flexible as was the case with
exploratory studies. It should be rigid and free from any bias. While finalising the
research design for descriptive and diagnostic studies, the researcher should
focus on the following points:
€€ Objectives of the research study
€€ Clearly defining the hypothesis
€€ Techniques of data collection
€€ Sample selection
€€ Place and time-period of data collection
€€ Data processing
€€ Data analysis
€€ Report presentation

Therefore, in the descriptive and diagnostic studies, the primary requirement of the
research design is the clarity of objectives. It means that the researcher should be
clear about the type of study undertaken and the reasons behind the study. After
that, the techniques of data collection should be selected.

There are various methods of data collection, such as interviews, observations and
questionnaires. The researcher should select any of these methods according to the
requirements of the research study, but the collected data should be free from any
bias and ambiguity. The researcher should also ensure that the data collection
method used results in the least number of errors.

The time and place of data collection should also be chosen carefully. For instance, if
the researcher wants to survey the effects of recession, only the data of the recession
period is to be considered. In the same manner, if the researcher wants to survey the
effects of water scarcity on the lives of people, then the researcher should approach
those areas that face acute water shortage. Thus, choosing the time and place of data
collection requires discretion on the researcher’s part.

The collected data must be properly analysed using appropriate statistical and
software tools. Finally, the report of the study is presented in detail. The report must
be presented in a simple and planned manner to explain the findings to the people
concerned in an effective way.

Generally, a descriptive research design combines the following:

€€ Overall design: It is rigid, to protect against bias and maximise
reliability.
€€ Sampling design: It follows a probability sampling design.
€€ Statistical design: A pre-planned design for analysis is used.
€€ Observational design: Well-thought-out and structured data collection instruments
are used.
€€ Operational design: Decisions about operational procedures are
taken in advance.

3.4.3 RESEARCH DESIGN FOR EXPERIMENTAL STUDIES


This research study is also known as the hypothesis-testing research study. In this
form of research study, some variables of interest are manipulated to observe their
effects on other variables. The simplest example of an experimental research is
conducting a laboratory test. An experimental research is considered to be successful
if the researcher confirms that a change in the dependent variable is due only to the
change in the independent variable.

It is important for experimental research to establish the cause and effect of a
phenomenon. For example, a researcher conducts research to understand the effect
of food on cholesterol and finds that most heart patients are non-vegetarians or
have diabetes. These factors are treated as causes that can result in a heart attack
(the effect).

Professor R.A. Fisher developed research designs for experimental studies when
he was working at the Centre of Agriculture Research in England (Rothamsted
Experimental Station). Initially, he conducted various agricultural experiments by
dividing an agricultural field into blocks and carrying out a separate experiment on
every block. He found that the data collected from these separate experimental blocks
was reliable. This encouraged him to develop experimental designs for scientific
investigations as well. An experimental research design is a blueprint within which an
experimental study is conducted. The purpose of an experiment is to decide:
€€ Whether the observed differences among the treatments (or sets of experimental
conditions) included in the experiment are due to chance only
€€ Whether the extent of these differences is of practical importance
To decide the above two points, the following three principles of experimental design
can be used:
€€ Principle of replication: It implies that an experiment should be repeated more
than once, so that each treatment is applied to more than one experimental unit. This
ultimately helps to increase the statistical accuracy of the experiment.
€€ Principle of randomisation: It implies that the plan or design of an experiment
should allow the variations caused by extraneous factors to be combined under the
general heading of ‘chance’. A better estimate of the experimental error is achieved
through application of the principle of randomisation.
€€ Principle of local control: It aims to reduce the experimental error by conducting
the experiment more efficiently. As per this principle, the extraneous factor, a
known source of variability, is made to vary deliberately over as wide a range as
needed. This is done so that the variability it causes can be measured and
eliminated from the experimental error.
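
The first question an experiment must settle, whether the observed differences among
treatments are due to chance only, can be illustrated with a small computation. The
Python sketch below is illustrative only: it assumes a two-treatment experiment with
replicated units, uses hypothetical scores, and applies an exact permutation
(randomisation) test, which is one of several techniques that could be used. The
function name exact_permutation_p_value and the data are assumptions made for this
example.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(treated, control):
    """Two-sided exact permutation test for a difference in group means.

    Returns the share of all possible regroupings of the pooled scores whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))

    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_treated units as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(mean(group_a) - mean(group_b))
        if diff >= observed - 1e-12:  # small tolerance for float comparison
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical scores from replicated units under two treatments (assumed data).
treated_scores = [12, 14, 15, 16]
control_scores = [8, 9, 10, 11]
p_value = exact_permutation_p_value(treated_scores, control_scores)
print(f"p-value: {p_value:.4f}")
```

With these hypothetical scores, only 2 of the 70 possible regroupings give a difference
as large as the observed one, so the p-value is about 0.029. A small value of this kind
suggests that the difference is unlikely to be due to chance alone. Whether the
difference is of practical importance remains a separate judgement.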
There are multiple ways to categorise experimental research designs. A basic way to
categorise them is as follows:
€€ Formal experimental research designs: These designs use comparatively more
refined and precise forms of data analysis.
€€ Informal experimental research designs: These designs use less sophisticated
forms of data analysis.
Another common way in which they are categorised is:
€€ Basic designs: Basic designs refer to those designs that include only one
independent variable. The main types of basic designs are shown in Figure 2:

Figure 2: Types of Basic Designs (pre-test–post-test control group design;
post-test-only control group design)

zz Pre-test-post-test control group design: It is also called the randomised
pre-test-post-test design or the classical controlled experimental design. In such
experimental designs, the subjects are assigned to the experimental (treatment)
and control (no treatment) groups using random numbers.
The researcher or the experimenter controls the timing of administering the
treatment. Both groups are kept in the same environment except that the
experimental group receives the treatment, whereas the control group does
not. The notations generally used in a basic design are:
99 R: Random assignment
99 T: Treatment
99 O: Observation, outcome or effect
Table 1 presents the symbolic representation of the pre-test-post-test control
group design:

Table 1: Pre-Test, Post-Test Control Group Design

Group | Pre-test (First observation of the dependent variable) | Treatment (T) | Post-test (Second observation of the dependent variable)
Experimental Group (E) | O1 (Average score of the experimental group on the dependent variable) | T | O2 (Average score of the experimental group on the dependent variable)
Control Group (C) | O3 (Average score of the control group on the dependent variable) | No-T | O4 (Average score of the control group on the dependent variable)

In such an experiment, the changes that are observed in the values of the
dependent variable in the experimental group (O2 – O1) are attributed to the
treatment. However, the control group’s scores may also differ between the two
observations, i.e., (O4 – O3). The difference between O3 and O4 is the change in the
value of the dependent variable that may occur even in the absence of any
treatment.
zz Post-test-only control group design: In a post-test-only control group design,
the researcher randomly assigns subjects to the experimental and control
groups. In such a design, the pre-test is not administered. The experimental
group is exposed to a treatment, whereas no treatment is administered to the
control group. Table 2 presents the symbolic representation of the post-test-
only control group design:

Table 2: Post-Test-Only Control Group Design

Group | Treatment (T) | Post-test (First observation of the dependent variable)
Experimental Group (E) | T | O1 (Average score of the experimental group on the dependent variable)
Control Group (C) | No-T | O2 (Average score of the control group on the dependent variable)

The post-test-only control group design is used for research where it is not
possible to administer a pre-test due to any (ethical/practical)
reason. The main benefit of this design is that it is very simple to implement
and has a low error propagation percentage. The main disadvantage of this
design is that it is highly vulnerable to threats to internal validity.
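
The notation of Tables 1 and 2 can be turned into a small worked computation. The
Python sketch below is illustrative only and uses hypothetical scores. It assumes a
common way of summarising the two basic designs: for the pre-test-post-test control
group design, the change in the control group (O4 – O3) is subtracted from the change
in the experimental group (O2 – O1); for the post-test-only control group design, the
two post-test averages (O1 and O2 in Table 2) are compared directly. The function names
and data are assumptions made for this example.

```python
from statistics import mean


def pretest_posttest_effect(exp_pre, exp_post, ctl_pre, ctl_post):
    """Estimate the treatment effect as (O2 - O1) - (O4 - O3) (Table 1 notation)."""
    o1, o2 = mean(exp_pre), mean(exp_post)
    o3, o4 = mean(ctl_pre), mean(ctl_post)
    return (o2 - o1) - (o4 - o3)


def posttest_only_effect(exp_post, ctl_post):
    """Estimate the treatment effect as O1 - O2 (Table 2 notation)."""
    return mean(exp_post) - mean(ctl_post)


# Hypothetical scores on the dependent variable (assumed data).
exp_pre = [50, 52, 48, 50]    # experimental group before treatment
exp_post = [60, 62, 58, 60]   # experimental group after treatment
ctl_pre = [49, 51, 50, 50]    # control group, first observation
ctl_post = [52, 54, 53, 53]   # control group, second observation

effect_with_pretest = pretest_posttest_effect(exp_pre, exp_post, ctl_pre, ctl_post)
effect_posttest_only = posttest_only_effect(exp_post, ctl_post)
print(effect_with_pretest, effect_posttest_only)
```

In this example the experimental group improves by 10 points and the control group by
3 points, so the estimated effect of the treatment is 7 points. The post-test-only
comparison also gives 7 here, but only because the two hypothetical groups start from
the same average score.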
€€ Statistical designs: Statistical designs refer to those experimental designs in which
there are two or more independent variables. The main types of statistical designs
are as follows:
zz Completely randomised (C.R.) design: The C.R. design refers to the design
in which subjects (experimental units) are randomly assigned to
treatments. Out of the three basic principles of experimental design, this
design includes only two (the principle of randomisation and the principle
of replication). In complete randomisation, every subject has an equal
probability of being assigned to any treatment. For example, if you wish to test
eight subjects under two treatments (A and B), every subject has an equal chance
of being assigned to either treatment. C.R. designs may
be analysed using ANOVA, an independent t-test, or non-parametric tests
depending upon the number of treatments. A two-group randomised design is
the simplest form of C.R. design. In this design, two randomisations (selecting
the items randomly), namely random sampling and random assignment, take
place. Random sampling refers to selecting a sample from the population.
Random assignment refers to assigning subjects selected from the sample to
an experimental group and a control group. The diagrammatic representation
of the two-group simple randomised design is shown in Figure 3:

Figure 3: Two-Group Simple Randomised Design (elements shown: population, random
selection, sample, experimental group with treatment A, control group with
treatment B, independent variable)

The two-group simple randomised design is very simple to implement. The
variations due to extraneous variables can be controlled using the control
group.
Let us understand the concept of the two-group simple randomised design
with the help of an example. Suppose you are conducting a study to compare
two groups of students from a college. In this example, the students of the college
represent the complete population. On the basis of random sampling, students are
selected out of the population and randomly assigned to two groups, that is, an
experimental group and a control group. One group is given training, whereas
the other group is not. Here, it can be hypothesised that the group that has received
the training (experimental group) is in a better position as compared to the
other group (control group). This assumption/hypothesis can be tested using a
two-group simple randomised design.
zz Randomised block design: In this design, all three principles of experimental
design can be applied. The randomised block design refers to the design that is
used when you want to eliminate uncontrolled variations. These variations are
caused by a variable called a blocking variable or nuisance variable. For example,
a doctor wants to treat a patient with a specifically prepared medicine. In this
case, the nuisance factor may be the time of giving the medicine to the patient
or the room temperature. These factors affect the outcome but are not of prime
interest to the doctor.
Numerous nuisance variables exist in all experiments. One can eliminate their
effect on the research study by a technique called blocking. For example, in
a study of school students, one can expect greater homogeneity among the students of
the same class than among the students of the entire school in terms of
knowledge and skills. In this case, a class is a block that can help in reducing
variation in the research.
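
The difference between the two statistical designs described above lies in how subjects
are assigned to treatments. The Python sketch below is illustrative only: it assumes
equal-sized treatment groups, uses hypothetical subject and class labels, and relies on
Python's standard random module. The function names completely_randomised and
randomised_block are assumptions made for this example.

```python
import random


def completely_randomised(subjects, treatments, seed=None):
    """Shuffle all subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        assignment[treatments[i % len(treatments)]].append(subject)
    return assignment


def randomised_block(blocks, treatments, seed=None):
    """Randomise separately within each block.

    `blocks` maps a block label (for example, a class) to its subjects, so
    every treatment is represented inside every block.
    """
    rng = random.Random(seed)
    result = {}
    for label, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        groups = {t: [] for t in treatments}
        for i, subject in enumerate(shuffled):
            groups[treatments[i % len(treatments)]].append(subject)
        result[label] = groups
    return result


# Eight hypothetical subjects under two treatments (A and B).
subjects = [f"S{i}" for i in range(1, 9)]
cr_plan = completely_randomised(subjects, ["A", "B"], seed=42)

# The same subjects grouped into two hypothetical classes (blocks).
classes = {"Class 1": subjects[:4], "Class 2": subjects[4:]}
block_plan = randomised_block(classes, ["A", "B"], seed=42)

print(cr_plan)
print(block_plan)
```

In the completely randomised plan, a treatment group may by chance draw most of its
subjects from one class. In the randomised block plan, each class contributes two
subjects to treatment A and two to treatment B, so differences between classes do not
get mixed up with differences between treatments.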
Self Assessment Questions


5. The type of ___________ defines the type of research design needed.
6. An experimental study exerts more emphasis on a problem or situation to gain
familiarity with its different aspects. (True/False)
7. Research studies related to specific predictions, narration of facts and
characteristics of human beings are examples of ___________ research.
8. The simplest example of an experimental research is conducting a __________
test.

3.5 THE COMPONENTS OF RESEARCH DESIGN


Various components that constitute a research design are as follows:
€€ Overall design: This component of a research design is concerned with a clear
statement of how rigid or flexible the research study is to be.
€€ Sampling design: This component of a research design defines the population to
be studied. It also deals with the method of selecting
samples for research.
€€ Observational design: This component of a research design relates to the conditions
under which the observations related to the research problem are to be made.
€€ Statistical design: This vital component of a research design answers
the question of how many items are to be observed for the research. It
also specifies how the data is to be gathered and analysed to arrive at the
relevant information.
€€ Operational design: This component of a research design deals with the techniques
of carrying out the procedures related to sampling design, observational design
and statistical design.
The important concepts related to a research design, which are also useful in framing
various components of the research design, are explained through the following
points:
€€ Variable: It refers to a parameter that keeps changing with time and space. The
parameter or the variable can take on different quantitative values. Examples of
the variables are income, expenditure and weight that keep on fluctuating from
time to time. Various forms of variables are as follows:
zz Dependent variable: It refers to the variable that the researcher measures
as the outcome. A dependent variable is affected by the changes in an
independent variable.
zz Independent variable: It refers to the variable that causes a change in a
dependent variable. Independent variables can be controlled. Researchers
manipulate the independent variable to measure its impact on the dependent
variable(s).
zz Extraneous variables: These are independent variables, also called
confounding variables, that are not directly related to the research but
may affect dependent variables. For example, a researcher wants to study
the relationship between the price of a commodity and the demand of that
commodity. In this case, consider that ‘demand’ is a dependent variable and
‘price’ is an independent variable. When the price is low, the demand increases
and when the price becomes higher, the demand decreases. However, the
dependent variable demand may also be affected by other factors, such as
income and taste of customers. These other factors constitute extraneous
variables. These variables need to be controlled.
zz Control variables: Control variables are those extraneous variables that can
potentially affect the research experiment, but which the researchers keep
the same (or controlled) during experiments. This ensures that the experiments
are conducted in a fair environment and are not affected by the extraneous
variables.
€€ Factors, outcomes, levels and treatments: In an experiment, a factor is a
variable that is manipulated or controlled by the researcher in order to study its
impact on the outcome.
The observation of the variable of interest yields the outcome. Each factor can take
two or more values, called factor levels. In a single-factor study, these factor
levels are the treatments.
Factors may be qualitative or quantitative in nature. For instance, in a crop
experiment the factors may include soil quality, type of seeds, and type and amount
of fertiliser, and the outcome is the observed yield. A research study may use one or
more factors. In a single-factor study, the treatments correspond to the factor
levels, so the number of treatments equals the number of factor levels. In n-factor
studies, the treatments correspond to the combinations of the factor levels, and the
number of treatments is the product of the numbers of levels of all the factors. For
instance, if one factor has 3 levels and the other has 4 levels, then the number of
treatments is 3 × 4 = 12. Assume that a researcher is studying the impact of
remuneration on job motivation. This is a single-factor study wherein remuneration
is the factor and the different amounts of remuneration are the factor levels.
Similarly, if a researcher is studying the impact of gender and ethnicity on income,
then it is a two-factor study wherein gender and ethnicity are the factors. Here,
gender can be male, female or transgender (three levels) and ethnicity can be
Dalits, Punjabis, Khasis, Bengalis, Jats, Rajputs, etc.
€€ Experimental unit/group: It denotes the unit/group to which a treatment is
applied in a single trial of experiment. The experimental unit may be a plot of land,
a patient in hospital, a group of operators or a set of machines. For example, one
can compare a patient in a private ward with a patient in a general ward, in terms
of the treatment they receive in the same hospital. In this case, the two patients are
the experimental units.

€€ Response: It denotes the result of an experiment following a treatment. The
response may be the yield of a process, the purity of a chemical, or any quantitative
or qualitative expression.
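
The rule that the number of treatments in a multi-factor study is the product of the numbers of factor levels can be checked with a short Python sketch. The factor names and levels below are hypothetical and are chosen only to reproduce the 3 × 4 case mentioned in the text.

```python
from itertools import product
from math import prod

def treatments(factors):
    """Return every treatment as a tuple holding one level per factor."""
    return list(product(*factors.values()))

# Hypothetical factors and levels for a crop-yield experiment.
factors = {
    "seed_type": ["A", "B", "C"],          # 3 levels
    "fertiliser_kg": [0, 10, 20, 30],      # 4 levels
}
combos = treatments(factors)
expected = prod(len(levels) for levels in factors.values())
print(len(combos), expected)   # both are 3 x 4 = 12
```

With a single factor, the same function simply returns one treatment per factor level.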

Self Assessment Questions


9. ____________ component of a research design decides about the population to
be studied.
10. A dependent variable is not affected by the changes in an independent variable.
(True/False)

Activity
Prepare a PowerPoint presentation on different types of research designs and
their usage in the real world.

3.6 SUMMARY
€€ A research design is framed with the purpose to ensure that the information
collected from research will enable the researcher to answer the research problem
satisfactorily.
€€ A research design is needed because it enables the smooth functioning of various
research operations.
€€ A research design is needed to plan in advance the availability of staff, time and
money.
€€ A research design should yield consistent results across a series of measurements,
that is, it should be reliable.
€€ A research design should use a measuring device or instrument that measures
what it is expected to measure, that is, it should be valid.
€€ Research can be conducted in various ways and under diverse conditions.
€€ Exploratory research should have a flexible research design to accommodate
continuous changes.
€€ A researcher conducts an exploratory study when some facts are known about a
problem or situation and there is a need to know more about it.
€€ An exploratory study, by its very nature, considers different aspects of a situation
or topic.
€€ The aim of descriptive studies is to describe the facts and situations as they are.
€€ For descriptive studies, research design should not be flexible as was the case with
exploratory studies. It should be rigid and free from any bias.
€€ An experimental research is considered to be successful if the researcher confirms
that a change in the dependent variable is only due to the change of the independent
variable.
€€ Various components which constitute a research design are overall design,
sampling design, observational design, statistical design and operational design.

3.7 KEY WORDS


€€ Factor: It is a quantitative or qualitative independent variable.
€€ Causal relationship: It is the cause-and-effect relationship between two variables.
€€ Experimental unit: It is an object from which data are taken for an experiment.
€€ Experiment: It is the test done to check a statement or assumption made by the
researcher.
€€ Nuisance variable: It is a measurable quantity that cannot be controlled and affects
a dependent variable.
€€ Random assignment: It is a method by which subjects are assigned to experimental
and control groups without any bias.

3.8 CASE STUDY: RANDOMISED DESIGN FOR BUS FARE REDUCTION


A bus transport organisation’s senior manager wants to know the effect of a fare
reduction of ₹ 5, ₹ 10 and ₹ 15 on the number of passengers. The senior
manager sets up a completely randomised design as follows:

To study the effect of the price reduction, the senior manager takes 24 routes and
randomly assigns 8 routes to treatment A (reduction of ₹ 5), 8 routes to treatment
B (reduction of ₹ 10) and 8 routes to treatment C (reduction of ₹ 15). The tabular
representation of the design is as follows:

Table A: Completely Randomised Design Table

Routes             | Number of Travellers Earlier | Treatment | Number of Travellers After
Group 1 (8 routes) | X1                           | A         | X2
Group 2 (8 routes) | X3                           | B         | X4
Group 3 (8 routes) | X5                           | C         | X6

The preceding table shows the observations made by the researcher before the
treatment, termed X1, X3 and X5 for the three fare reductions, and the
observations made after the treatment, termed X2, X4 and X6.

Thus, by comparing X2 with X1, X4 with X3 and X6 with X5, the effect of the fare
reduction will be clear to the manager.
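
The random assignment used in this case study can be sketched in Python. This is a minimal illustration, not the manager's actual procedure: the route names, the function names and the seed are hypothetical. The first function shuffles the 24 routes and splits them into three groups of 8; the second computes the average change in travellers for a group once before-and-after counts are available.

```python
import random

def completely_randomised_design(units, treatments, seed=None):
    """Shuffle the units and split them into equal-sized treatment groups."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {
        t: shuffled[i * size:(i + 1) * size]
        for i, t in enumerate(treatments)
    }

def mean_change(before, after, routes):
    """Average change in travellers (after minus before) over the given routes."""
    return sum(after[r] - before[r] for r in routes) / len(routes)

routes = [f"route_{i:02d}" for i in range(1, 25)]      # 24 hypothetical routes
groups = completely_randomised_design(routes, ["A", "B", "C"], seed=42)
for treatment, assigned in groups.items():
    print(treatment, len(assigned), assigned)
```

Calling mean_change for each group with the observed counts corresponds to comparing X2 with X1, X4 with X3 and X6 with X5.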

QUESTIONS
1. How many bus routes were considered by the bus transport organisation’s senior
manager for conducting the bus fare reduction study?
(Hint: 24 routes)
2. Which of the designs should be considered better: randomised block design or
completely randomised design? Give reasons.
(Hint: Completely randomised design because every subject has an equal opportunity
of being assigned to any of the treatments)
3. How have the 24 routes been assigned to three groups?
(Hint: 8 routes were randomly assigned to each treatment)
4. How has the effect of fare reduction been analysed by the manager?
(Hint: By comparing X2 and X1, X4 and X3, X6 and X5)
5. How was the comparison of travellers made by the manager?
(Hint: By comparing the number of travellers prior to treatment and the number of
travellers after the treatment)

3.9 EXERCISE
1. What is a research design? List the questions asked to structure a complete research
design.
2. Explain the needs of a research design.
3. What are the features of a research design? Explain.
4. Explain the components of a research design.
5. Discuss the significant concepts used in framing various components of a research
design.
6. What do you mean by research design for exploratory research studies?
7. Explain the research design for descriptive studies.
8. Describe the research design for experimental studies.

3.10 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                                    | Q. No. | Answer
The Concept of Research Design           | 1.     | Research Design
                                         | 2.     | True
The Need and Features of Research Design | 3.     | Negligence
                                         | 4.     | False
Types of Research Design                 | 5.     | research
                                         | 6.     | False
                                         | 7.     | descriptive
                                         | 8.     | laboratory
The Components of Research Design        | 9.     | Sampling design
                                         | 10.    | False

3.11 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
€€ Welman, J., Kruger, F., & Mitchell, B. (2005). Research Methodology. Cape Town:
Oxford University Press.
€€ Lancaster, G. (2012). Research Methods. Taylor & Francis.
€€ Kothari, C. (2004). Research Methodology. New Delhi: New Age International (P)
Ltd.
€€ Goddard, W., & Melville, S. (2011). Research Methodology. Kenwyn, South Africa:
Juta & Co.

E-REFERENCES
€€ Elements of Research Design - SAGE Research Methods. (2020). Retrieved 17
March 2020, from https://methods.sagepub.com/book/handbook-of-research-design-social-measurement/n9.xml
€€ Research Guides: Organizing Your Social Sciences Research Paper: Types of
Research Designs. (2020). Retrieved 17 March 2020, from https://libguides.usc.edu/writingguide/researchdesigns
CHAPTER 4
Sampling
Table of Contents
4.1 Introduction
4.2 Concept of Sampling
4.2.1 Census versus Sample Survey
4.2.2 Sampling Design Process
4.2.3 Characteristics of a Good Sample Design
4.2.4 Determining Sample Size
Self Assessment Questions
4.3 Errors in Measurement and Sampling Errors
Self Assessment Questions
4.4 Non-Sampling Errors
Self Assessment Questions
4.5 Methods of Sampling
4.5.1 Probability Sampling Methods
4.5.2 Non-Probability Sampling Methods
Self Assessment Questions
4.6 Summary
4.7 Key Words
4.8 Case Study
4.9 Exercise
4.10 Answers for Self Assessment Questions
4.11 Suggested Books and e-References

LEARNING OBJECTIVES


After studying this chapter, you will be able to:
€€ Explain the concept of sampling
€€ Describe the characteristics of a good sample design
€€ Discuss the errors in measurement and sampling errors
€€ Explain the concept of non-sampling errors
€€ Describe the probability and non-probability sampling methods

4.1 INTRODUCTION
In the previous chapter, you studied the concept of research design. The chapter
discussed the features and types of research design. The chapter concluded with the
components of research design.

Sampling is the process of choosing a subset of units from the whole population
related to the area of study for the purpose of conducting research. Researchers use
sampling when it is not feasible to study every single element of the target
population. Population refers to the collection of elements, individuals, items and
objects about which the researcher desires to collect information. A population can
be finite or infinite. The population is finite when it has a fixed number of items
or elements; for example, the number of people working in an organisation or the
number of students in a school. The population is treated as infinite when it has no
fixed number of items or elements and the researcher has no idea of their number;
for example, the total number of stars in the sky.

The researcher must make a methodological plan for obtaining a sample from the
target population. This plan is called the sample design, and the number of items or
elements in the sample is known as the sample size. The researcher can use the
census method or the sample survey method for collecting information, but the
sample survey method may result in inaccuracy, which is called sampling error.

In this chapter, you will study the concept of sampling and how to determine a
sample size. The chapter will also describe the errors in measurement and sampling
errors. Towards the end, the chapter briefly describes the methods of sampling.

4.2 CONCEPT OF SAMPLING


A sample can be a group of people, items/elements or objects selected out of the
population for conducting research. The sample must be drawn in such a way that it
represents the characteristics of the population, so that the research findings can be
generalised to the entire population. Researchers use sampling when the size of the
population is large, as it helps in reducing the time, expense and effort needed to
collect information.

Figure 1 shows how samples are taken from the population:

(Diagram labels: Population, Sample)

Figure 1: Population and Sample


According to P.V. Young, “A statistical sample is a miniature picture or cross-section of
the entire group or aggregate from which the sample is taken.”

According to Goode and Hatt, “A sample, as the name implies, is a smaller representative
of a larger whole.”

Let us understand the population and sample with the help of some examples:
1. Manyata is a professor of psychology in the University of Delhi. She is interested in
studying the level of stress that B. Tech. students encounter during finals. Manyata
plans to conduct a sample survey, sent out around finals time, in which some
students rank their level of stress during finals on a scale of 1 to 5. Manyata
needs to select students for conducting her survey.
Once the students are selected for the survey, the selected group of students is called
the survey sample.
2. Dr. Suyash, the chancellor of a university, wants to collect the feedback of students
on a grading system. It is not practically possible to take feedback from each
and every student. A sample is therefore chosen and the general feedback is
inferred from it. A representative sample improves the exactness and accuracy
of the results.
Researchers may use two primary methods of data collection, i.e., the census method
and sampling method. Let us understand these methods in detail.

4.2.1 CENSUS VERSUS SAMPLE SURVEY


The census method of data collection is the method in which researchers study all the
elements or items of the population. The Government of India (GOI) conducts the
Census of India every 10 years. A census is also called ‘Complete Enumeration’. It
gives thorough information covering many aspects of the problem, but it is a
time-consuming and expensive method of data collection. In the sample survey
method, instead of studying all the elements of the population, some representative
elements are selected from the target population; this selection is called a sample.
To select a sample, researchers must first determine the target population. Table 1
shows the difference between census and sample survey:

Table 1: Difference between Census and Sample Survey

Census | Sample Survey
In this method, a researcher studies every unit or element of a target population. | In this method, only a few elements of the target population are studied, not all.
This method involves complete enumeration. | This method involves partial enumeration.
This method of data collection is very time-consuming. | This method is quicker than the census method.
This method incurs more cost. | This method incurs less cost.
The results are free from sampling error because each member is surveyed. | The results are subject to sampling error because only a few items from a large population are surveyed.
This method is good for heterogeneous (high variability types) data. | This method is good for homogeneous (similar type) data.
This method is highly reliable. | This method is comparatively less reliable.

4.2.2 SAMPLING DESIGN PROCESS
A sampling design is considered a road map that provides the foundation or
basis for the selection of a survey sample. Figure 2 shows the steps involved in the
sampling design process:

Recognise the target population as per the area of study

Choose a sampling frame

Define sampling units

Specify the techniques of sampling

Determine the sample size

Implement the sampling plan

Figure 2: Steps in the Sampling Design Process

Let us understand these steps in detail:


1. Recognise the target population as per the area of study: The first step is to identify
the target population to which the researcher is interested in generalising the
findings. The group of individuals or items from which the sample is drawn
is called the target population.
2. Choose a sampling frame: The next step is to select a sampling frame. A sampling
frame is a list of all those elements or items within a target population which can
be sampled. For example, Geeta includes 4 schools near her house in her sampling
frame for conducting her study.
3. Define sampling units: The next step is to define the sampling units, that is, the
parts into which the population is split. For example, if a researcher wants to survey
the entire nation, the sampling units may be states, districts, blocks and villages.
4. Specify the techniques of sampling: The next step is to choose the technique
of sampling. There are two types of sampling techniques, i.e., probability and
non-probability techniques. When the sampling frame is the same as the target
population (approximately), the researcher must use a random sampling technique
for choosing the sample. But when the sampling frame is not representing the
target population, the researcher must select a non-random sampling technique.
5. Determine the sample size: The number of elements/items that a sample has
is called the sample size. A researcher should decide the sample size carefully.
It should be neither too large nor too small. Before selecting a sample size, the
following points should be considered:
zz Flexibility: The sample size should have the ability to adapt to changes to
some extent when required.
zz Representativeness: The sample should represent the whole population.
zz Precision: The sample should provide the desired accuracy.
zz Reliability: The sample should be free from errors.
zz Population variance: The deviation in the items of the population is called
population variance.
Choose the sample size with the population variance in mind. For an extremely
diverse population, choose a large sample; if there is little diversity in the
population, a smaller sample is sufficient.

6. Implement the sampling plan: The last step is to implement the sampling plan
after identifying the target population, choosing sampling frame, specifying
sampling technique and determining the sample size.

4.2.3 CHARACTERISTICS OF A GOOD SAMPLE DESIGN


Researchers must know the characteristics of a good sample design to obtain better
and more accurate results. A good sample design satisfies the following conditions:
€€ Sampling design must produce a representative sample.
€€ Sampling design must result in small sampling errors.
€€ Sampling design must be feasible in the context of available funds.
€€ Sampling design results should be applicable to the whole population.
€€ Sampling design should control systematic bias as far as possible.

4.2.4 DETERMINING SAMPLE SIZE


Researchers must take care of the following points while determining the sample
size:
€€ Homogeneity (similarity) or heterogeneity (dissimilarity) of the population:
While determining the sample size, researchers must consider the nature of
the universe/population. When the universe or population is homogeneous,
a small sample can represent the behaviour of the entire universe or population.
When the universe or population is heterogeneous (dissimilar) in nature, elements
must be selected from each heterogeneous unit, so a larger sample is needed.
€€ Class intervals: If a large number of class intervals is to be created, then the
sample size should be larger, as every class of the population has to be represented.
With small samples, there is a chance that some classes are not represented at all.
€€ Research study nature: The sample size depends on the nature of the study. For
an intensive study conducted over a long duration, large samples are selected. But
for a technical study, the selection of a large number of respondents may result in
increasing complexity while collecting information.
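
The link between heterogeneity and sample size can be illustrated with a commonly used textbook approximation for estimating a mean, n = (z × sigma / E)², where sigma is the assumed population standard deviation, E is the acceptable margin of error and z is the standard normal value for the chosen confidence level. This formula is not given in this chapter; it is shown here only as a sketch, it assumes a large population and simple random sampling, and the input values are hypothetical.

```python
import math

def sample_size_for_mean(sigma, margin, z=1.96):
    """Approximate sample size needed to estimate a population mean.

    sigma:  assumed population standard deviation
    margin: largest acceptable error in the estimated mean
    z:      standard normal value for the confidence level (1.96 for about 95%)
    """
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: same margin of error, different variability.
print(sample_size_for_mean(sigma=5, margin=1))    # fairly homogeneous population
print(sample_size_for_mean(sigma=20, margin=1))   # more heterogeneous population
```

Under these assumptions, quadrupling the standard deviation raises the required sample size about sixteen-fold, which is consistent with the advice to take larger samples from more diverse populations.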

Self Assessment Questions


1. Researchers use _________________ when the size of the population is large as
it helps in decreasing the time taken to collect information, reduce expenses,
efforts, etc.
2. A sample size is considered as a road map which provides the foundation or
basis for the selection of a sample survey. (True/False)
3. ________________ method is quicker than the census method.
4.3 ERRORS IN MEASUREMENT AND SAMPLING ERRORS
An error is the disparity between the measured value and the correct or
exact value. Following are the types of errors in measurement:
€€ Systematic errors: These errors are also known as systematic bias. Systematic
errors are consistent and recur because of defective equipment
or inappropriate experiment design.
€€ Gross errors: These are errors in analysis, calculation and recording. Gross
errors occur due to human mistakes that lead to inconsistencies in the research
data, for example, when the researcher reads or records values that differ from
the actual data. These errors can be detected and corrected by reviewing and
revisiting the research report.
€€ Random errors: These errors are random and unpredictable in nature. Random errors
occur due to a large number of variables that are beyond the researcher’s control
and affect the outcome of the study. Random errors are of two types: sampling
errors and non-sampling errors.
The researcher must be able to recognise the sources of errors in the measurement
and should minimise them. Some important reasons that may cause errors are:
€€ Errors due to the interviewer’s attitude: These errors occur because of the biased
attitude of the interviewer. The interviewer can encourage or discourage certain
viewpoints of respondents by rephrasing questions.
€€ Errors due to respondent’s reluctance: These errors occur due to the reluctance of
respondents to respond to questions. The respondents may feel reluctant to answer
questions correctly because of fatigue, hunger or ill-health. The respondents may
also commit errors because of the lack of knowledge.
€€ Errors due to ineffectiveness of the instrument: These errors occur because of
an ineffective measuring instrument, such as a questionnaire. If the questionnaire
has too few answer choices, complex language, poor printing or non-essential
questions, then it cannot get the desired outcomes from the respondents.
€€ Errors due to the situational factors: These errors occur because of the situational
factors. Any condition that puts a strain on the interviewer can cause an adverse
impact on the rapport between the interviewer and the respondent.
Some important methods that reduce sampling errors are discussed as follows:
€€ Increasing sample size: Increasing the sample size will reduce sampling errors. If
the sample size is equal to the complete population, the sampling error
is zero.
€€ Stratification: This refers to dividing the given population into homogeneous and
non-overlapping units or sub-groups (known as strata) to make the sample more
representative. Grouping is done on the basis of one or more common attributes.
Sampling error is the error that occurs in the data collection process as an
outcome of taking a sample from the target population rather than using the entire
population.

It is a statistical error that arises because a sample does not perfectly represent
the entire population, so the statistics computed from the sample are not exactly
equal to those of the population. A sample statistic may have a value close to, or
occasionally exactly the same as, that of the entire population. For example, if a
researcher wants to analyse the average production of wheat in a village during a
specific year, then the researcher needs to find out the target population. In this
case, the population will be the farmers. There are 8,000 farmers in the village who
produce wheat. Out of these 8,000 farmers, the researcher selects 800 farmers and
calculates the average production of wheat based on the data (figures) given by
these 800 farmers. It is almost certain that this sample average will differ slightly
from the true average of all 8,000 farmers. This difference is known as sampling
error. Such errors arise because only a small section of the population has been
selected as the sample. A small sample cannot reproduce the population values
exactly, but these errors can be brought down with a better sampling design.
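
The wheat example can be simulated in Python to see sampling error directly. The yields below are randomly generated, hypothetical figures, not real data, and the names used are illustrative. The sketch draws simple random samples of different sizes from 8,000 simulated farmers and reports how far each sample mean lies from the population mean.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical yields (in quintals) for the 8,000 farmers in the example.
population = [rng.gauss(40, 8) for _ in range(8000)]

def sampling_error(population, n, rng):
    """Difference between a simple random sample's mean and the population mean."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) - statistics.mean(population)

# Sample sizes from a small sample up to the complete population.
for n in (80, 800, 8000):
    print(n, round(sampling_error(population, n, rng), 3))
```

When the sample size equals the population size, the sample is the whole population and the sampling error is zero; smaller samples typically show larger differences, although any single draw may be close by chance.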

Self Assessment Questions


4. ________________ error occurs due to a large number of variables that are
beyond a researcher’s control and affect the outcome of the study.
5. The researcher must be able to recognise the sources of errors in the
measurement and should minimise them. (True/False)
6. Which among the following is also known as a systematic bias?
a. Systematic b. Random
c. Gross d. None of these

4.4 NON-SAMPLING ERRORS


These errors are not caused by the act of sampling. Some examples of non-sampling
errors are population mis-specification error, data-processing error,
respondent error, and non-response error. Non-sampling errors may occur even if
all the elements of a given population are considered for a study. In other words,
non-sampling errors arise as a result of factors other than sample selection. They
may also be caused by selection bias, ambiguous population specification, sampling
frame error, the physical environment, inadequacy of enumerators, etc. For example,
a population contains 1,000 elements. The researcher intends to find the average
income of the population. Even if the researcher considers all 1,000 elements to find
the average income, he/she may get inaccurate results because of non-sampling
errors.

A few of the main reasons are given as follows:


€€ Improper division of sampling units of a population: The following example
explains why improper division of sampling units causes non-sampling errors.
Suppose an organisation wants to conduct a skill development program for its
employees who do not have the required skills to perform job duties. For this, the
organisation needs to find out the number of employees who need to attend the
program. To do so, the employee population of the organisation is divided into a
skilled population and an unskilled population. However, this may not be a precise
division of the employee population. The reason is that some people are involved
in non-technical jobs, some are involved in technical jobs and some people work
on multiple projects simultaneously. In addition, some people may be more
qualified, but they perform unskilled jobs. Thus, the division of the employee
population into skilled and unskilled workers is not clear cut, which may lead to
non-sampling errors.
€€ Poor response of respondents: This makes it difficult for the researcher to derive
accurate results. Usually, respondents show reluctance in supplying accurate
information about their income, age and current level of education and skills. The
incorrect information provided by respondents leads to inaccurate results, even if
each element of the population is considered.
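
A small Python simulation shows why a complete enumeration does not remove non-sampling error. The incomes are randomly generated, hypothetical values, and the assumption that every respondent understates income by 10% is made purely for illustration.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical true monthly incomes for all 1,000 elements of the population.
true_income = [rng.uniform(10_000, 50_000) for _ in range(1000)]

def reported(income, understate=0.10):
    """Model a respondent who understates income by a fixed share (an assumption)."""
    return income * (1 - understate)

# A census of reported incomes: every element is covered, so there is no sampling error.
census_of_reports = [reported(x) for x in true_income]

true_mean = statistics.mean(true_income)
census_mean = statistics.mean(census_of_reports)
print(round(true_mean), round(census_mean), round(census_mean - true_mean))
```

The gap between the two means comes entirely from respondent error, so increasing the sample size, even up to a full census, does nothing to reduce it.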
Self Assessment Questions
7. ____________________ errors do not occur because of taking a sample for data
collection.
8. The poor response of respondents makes it easy for the researcher to derive
accurate results. (True/False)
9. Non-sampling errors may also be caused because of:
a. Selection bias
b. Ambiguous population specification
c. Sampling frame error
d. All of these

4.5 METHODS OF SAMPLING

There are two methods of sampling, i.e., probability and non-probability sampling
methods. These sampling methods are shown in Figure 3:

Probability Sampling Vs Non-Probability Sampling

Figure 3: Probability and Non-Probability Sampling


Source: https://towardsdatascience.com/sampling-techniques-a4e34111d808

Researchers select a sampling method based on the requirements of their
chosen research topic. Sampling methods must give:
€€ Precision and accuracy
€€ Extra information about the target population
€€ Expenditure recognition
Let us understand the concept of probability and non-probability sampling.
4.5.1 PROBABILITY SAMPLING METHODS
Probability sampling is the method of sampling in which every item of the target
population has a known chance of being selected as part of the sample; in its
simplest form, this chance is equal for all items. The probability sampling method
is alternatively known as random sampling. Everyday examples of random selection
are tossing a coin and picking one chit out of five chits. Figure 4 shows the types of
probability sampling:

(Diagram labels: Simple Random Sampling, Stratified Sampling, Cluster Sampling,
Systematic Sampling, Multi-stage Sampling)

Figure 4: Types of Probability Sampling

Let us understand the types of probability sampling in detail:


€€ Simple random sampling: In simple random sampling, each element or item has the
same chance of getting selected. This method is used when the researchers
do not have any prior information regarding the target population; for
example, the random selection of 20 employees from a manufacturing organisation
of 50 employees. Each employee has the same chance of getting selected as a part
of the sample. Here, the probability of a given employee being picked on the first
draw is 1/50, and the probability of being included in the sample of 20 is 20/50.
Figure 5 shows simple random sampling:

Figure 5: Simple Random Sampling


Source: https://towardsdatascience.com/sampling-techniques-a4e34111d808

€€ Stratified sampling: In the stratified sampling method, the elements or items of
the target population are divided into smaller sub-groups called strata. These
sub-groups are formed on the basis of similar characteristics shared by the
elements or items within each sub-group. After forming the sub-groups, the researcher
selects elements randomly from each of these strata. Figure 6 shows stratified
sampling:

Figure 6: Stratified Sampling (the population divided into Stratum 1, Stratum 2 and Stratum 3)


Source: https://towardsdatascience.com/sampling-techniques-a4e34111d808

€€ Cluster sampling: In cluster sampling, the entire population is divided into groups
or clusters. After that, some of these clusters are selected by random sampling.
All elements of the selected clusters are included in the sample, leaving out
all the elements of the non-selected clusters. For example, a population has been
divided into 10 clusters named a, b, c, d, e, f, g, h, i and j. The researcher requires
only 3 clusters for his/her sample out of 10 clusters. Suppose 3 clusters, namely a,
i and d, are selected randomly. In the sample, all elements from these 3 clusters
would be included. Cluster sampling looks similar to stratified sampling, but in
stratified sampling, elements or items are selected from all sub-groups, whereas in
cluster sampling, the clusters themselves are selected and all elements or items of
the selected clusters are included in the sample.
€€ Systematic sampling: In systematic sampling, a sample is selected from the
sampling frame at regular intervals. In this type of sampling, the elements in
the sampling frame are numbered consecutively. The sampling fraction is then
calculated as the ratio of the required sample size to the total population, and
its inverse gives the sampling interval (frequency). A random number no greater than
the interval is chosen as the starting point, and, starting with this number, every
element at the sampling interval is drawn. For example, if we need to select 100
units out of 1,000 available units, the sampling fraction is 1/10 and the interval
would be 10. It means that if a researcher starts with a randomly selected element
at number 3 (in the sampling frame), then he/she would next choose the 13th and
23rd elements, and so on up to the 993rd. Systematic sampling is used in cases when
the researcher has a complete list of all the members of the population.
€€ Multi-stage sampling: In multi-stage sampling, the population is partitioned
into various clusters, and multiple clusters are again divided and grouped into
different strata on the basis of similarity. One or more clusters can then be
randomly selected by the researcher from each stratum. The researcher repeats this
process until the clusters cannot be divided any further. For instance, the
world can be divided into countries, countries into states, states into
cities, and cities into urban and rural areas, and the areas having similar features
can be merged to form strata. Figure 7 shows multi-stage sampling:
Figure 7: Multi-Stage Sampling (legend: cluster, stratum and item; population items numbered 1 to 10)
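
The single-stage methods described above can be sketched in Python as follows. This is a minimal illustration and not a prescribed procedure: the function names, the fixed seed, the stratum sizes and the cluster contents are assumptions made for the example. Only the 20-of-50, 100-of-1,000 and 3-of-10-clusters figures come from the text.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n distinct elements; every element is equally likely to be included."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start within the first interval, then take every k-th element."""
    k = len(frame) // n          # sampling interval (inverse of the sampling fraction)
    start = rng.randrange(k)     # random 0-based start position in [0, k)
    return [frame[i] for i in range(start, start + k * n, k)]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep all of their elements."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]


rng = random.Random(42)  # fixed seed only so that the illustration is repeatable

# Simple random sampling: 20 employees out of 50.
employees = list(range(1, 51))
srs = simple_random_sample(employees, 20, rng)

# Systematic sampling: 100 units out of 1,000, so the interval is 10.
frame = list(range(1, 1001))
sys_sample = systematic_sample(frame, 100, rng)

# Stratified sampling: three hypothetical strata, 5 elements drawn from each.
strata = {
    "Stratum 1": list(range(0, 30)),
    "Stratum 2": list(range(30, 70)),
    "Stratum 3": list(range(70, 100)),
}
strat = stratified_sample(strata, 5, rng)

# Cluster sampling: 3 of the 10 clusters a..j, each with 4 hypothetical elements.
clusters = {name: [f"{name}{i}" for i in range(1, 5)] for name in "abcdefghij"}
clus = cluster_sample(clusters, 3, rng)

print(sorted(srs))
print(sys_sample[:3])
print(strat)
print(clus)
```

The contrast between the last two functions mirrors the distinction drawn in the text: the stratified version draws a few elements from every sub-group, whereas the cluster version draws a few sub-groups and keeps every element in them.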

4.5.2 NON-PROBABILITY SAMPLING METHODS


Non-probability sampling is also known as non-random sampling. In this sampling
method, samples are collected in such a manner that not all individuals or elements
in the population get an equal chance of being selected. The major advantage of
this sampling is that it is inexpensive. However, the results obtained from
non-probability sampling cannot be generalised confidently to the entire population
because an unknown proportion of the entire population was not sampled. Figure 8
shows the types of non-probability sampling methods:

Figure 8: Types of Non-Probability Sampling Methods (convenience, purposive, quota, judgement and snowball sampling)

Let us understand the types of non-probability sampling:


€€ Convenience sampling: In this type of non-probability sampling, the researcher
selects those elements or subjects from the target population that are easily
accessible to him/her. For example, in a college, volunteers are required to organise
a tree plantation camp. The strength of the college is 2,000 and the number of
volunteers required is 50. In this case, the easiest way to select volunteers is on
the basis of their accessibility: the researcher can select as volunteers those
students who are easily accessible to him/her. Convenience sampling helps in
conducting pilot studies by enabling the researcher to obtain the basic data.
However, convenience sampling has a sampling bias as the researcher selects the
sample according to his/her own convenience. Since the sample is not truly
representative of the population, the results of the study cannot be generalised
to the whole population.
€€ Purposive sampling: Purposive sampling is a non-probability sampling method
wherein the sample is chosen purposively on the basis of certain characteristics
of a population and the objective of the study. For example, a researcher wants to
gather the opinion of working mothers about the conditions at their workplaces. In
this case, the researcher would contact only those women who are mothers and
working. Women not falling in this category would not be surveyed. There
are two sub-types of purposive sampling, namely quota sampling and judgement
sampling.
€€ Quota sampling: In this sampling method, the given population is first divided into
non-overlapping sub-groups, such as men/women/children or Indian/American/European
or salary from ₹ 20,000 to 30,000 p.m./salary from ₹ 30,001 to 40,000 p.m. The
sub-groups are formed in such a manner that together they replicate the composition
of the population. Thereafter, a sample is formed by selecting members
from each sub-group according to the proportion of each sub-group in the total
population. This is called proportionate quota sampling. In case the members from
each sub-group are selected based on a criterion other than proportion, it is called
non-proportionate quota sampling. In quota sampling, the researcher ensures that
every sub-group of the population is represented in the sample.
€€ Judgement sampling: In judgement sampling, the elements or units from the
population are selected on the recommendation of experts in the field of the research
work being carried out. The experts are asked to select the units that
should be included in the sample so that the sample is truly representative of the
population. Usually, the expert selects such elements for the sample as can provide
the best information on the research subject. In judgement sampling, the reliability
of the sample directly depends on the expert’s judgement. Quota sampling can also be
considered as a type of judgement sampling because the elements that are chosen
for each quota depend upon the judgement of the interviewer/researcher.
€€ Snowball sampling: Snowball sampling is also known as chain sampling or
referral sampling. This type of sampling is used in research where it is difficult to
identify or locate the units or elements to be included in the sample. In the snowball
technique, the researcher first picks up one or more subjects (to be included in the
sample) and then asks them to recommend or refer other subjects that conform
to the criteria for being included in the sample. This process of referral is repeated
with the new subjects till the required number of subjects in the sample is reached.
This method of sampling is called snowball sampling because the process is akin
to the process of rolling a snowball downhill. The initial snowball size (sample
subjects) keeps on increasing in size till the snowball reaches a flat surface (the
desired sample size is achieved). Snowball sampling is used in those cases where
there is no list of population of interest or when the subjects refrain from identifying
themselves socially or due to the secretiveness or illegality of the organisation for
which they work.
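
The arithmetic behind proportionate quota sampling can be sketched as below. This is only an illustration under stated assumptions: the function name `proportionate_quotas`, the population counts and the sample size of 50 are hypothetical, and the rounding rule (largest remainder) is one reasonable choice among several. The sketch computes only the size of each quota; how respondents are picked within a quota is still left to the researcher, which is what keeps the method non-random.

```python
def proportionate_quotas(group_sizes, sample_size):
    """Allocate a sample across sub-groups in proportion to their population share."""
    total = sum(group_sizes.values())
    # Exact (possibly fractional) share of the sample for each sub-group.
    raw = {g: sample_size * size / total for g, size in group_sizes.items()}
    # Round down first, then hand out the leftover units to the sub-groups
    # with the largest fractional remainders so the quotas sum to sample_size.
    quotas = {g: int(share) for g, share in raw.items()}
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(raw, key=lambda g: raw[g] - quotas[g], reverse=True)
    for g in by_remainder[:leftover]:
        quotas[g] += 1
    return quotas


# Hypothetical population split by monthly salary band (counts are illustrative only).
population = {"20,000-30,000": 1200, "30,001-40,000": 800}
quotas = proportionate_quotas(population, 50)
print(quotas)
```

With these illustrative counts, the first band makes up 60% of the population and therefore receives 60% of the 50 sample places.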

Self Assessment Questions


10. In _______________ sampling, population is partitioned into various clusters
and multiple clusters are again divided and grouped into different strata on
the basis of similarity.
11. Which of the following is also known as chain sampling or referral sampling?
a. Judgement sampling
b. Snowball sampling
c. Quota sampling
d. Purposive sampling
12. Convenience sampling helps in conducting pilot studies by facilitating the
researcher to obtain the basic data. (True/False)

Activity
Find out the advantages and disadvantages of probability and non-probability
sampling methods.

4.6 SUMMARY
€€ A sample can be a group of people, items/elements or objects selected out of the
population for conducting research.
€€ Sampling is the process of choosing a subset of subject matter or units from the
whole population for the purpose of conducting a research.
€€ Researchers may use two primary methods of data collection, i.e., the Census
method and Sampling method.
€€ Census method of data collection is the method in which researchers study all
elements or items of the population. In the sample survey method, a few elements
of the target population are studied, and not all.
€€ A sample design is considered as a road map which provides the foundation or
basis for the selection of a sample survey.
€€ A good sample design must be error-free and must produce a representative
sample.
€€ An error is a fault or the disparity between the evaluated value and the correct
or exact value. An error may occur due to biased attitude of the interviewer,
respondent’s reluctance, ineffectiveness of instrument, etc.
€€ Sampling error is the error or mistake that occurs in a data collection process as
an outcome of taking a sample from target population rather than using the entire
population.
€€ Non-sampling errors may also be caused because of selection bias, ambiguous
population specification, sampling frame error, processing error, respondent
errors, non-response error, physical environment, inadequacy of enumerators, etc.
€€ There are two types of sampling methods, i.e., probability and non-probability
sampling methods.
€€ Probability sampling is the method of sampling in which the probability of selecting
each item from the target population as a sample is equal. In the non-probability
sampling method, samples are collected in such a manner that all individuals or
elements in the population do not get equal chances of being selected.

4.7 KEY WORDS


€€ Sampling: The process of obtaining elements from the entire population
€€ Sample design: The methodological plan to obtain a sample from population
€€ Probability sampling: A sampling technique wherein the samples are gathered in
a process that gives all the individuals in the population an equal chance of being
selected
€€ Census: The method in which the researcher studies all elements or items of the
population
€€ Sampling errors: The statistical error that occurs when a researcher does not select
a sample that represents the entire population of data

4.8 CASE STUDY: IRS-RANDOM SAMPLING ERROR


This is the case of a random sampling issue between the Internal Revenue Service (IRS)
and the owner of a tax preparation service in a federal court, where the IRS was
the plaintiff and the owner was the defendant. Simple random sampling is a method
of sampling where all the elements of the population have equal chances of being
selected as a sample. The key highlight of the case is that the IRS, in its statistical
analysis, inappropriately used its sample to draw inferences about the population as a
whole. The defendant was responsible for filing 24,399 federal income tax returns for
the tax year 2003. According to the IRS, it reviewed 345 tax returns out of 24,399, and
313 of them needed additional tax assessment, which means that 91% of the sampled
returns owed additional tax to the IRS, for a variety of reasons. The IRS calculated from
these 345 returns that the actual tax loss directly due to these returns being improperly
prepared by the defendant(s) was in excess of $1.1 million (United States v. Brier, et al.,
pg. 3). The IRS further stated that if this loss rate were applied to all 24,399 returns,
then the estimated loss to the United States government would be in excess of $85 million
for the years 2003 through 2007 (United States v. Brier, et al., pg. 5). Thus, the IRS was
seeking damages close to 85 million dollars.
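
The figures quoted above can be checked with a few lines of arithmetic. The sketch below is not the IRS's own calculation, which the case study does not reproduce; it simply scales the sample figures to the full set of returns for a single year, treating the "in excess of $1.1 million" loss as a lower bound. The variable names are assumptions made for the illustration.

```python
sampled_returns = 345      # returns reviewed by the IRS
flagged_returns = 313      # sampled returns needing additional tax assessment
all_returns = 24_399       # returns filed by the defendant for tax year 2003
sample_loss = 1_100_000    # "in excess of $1.1 million", used here as a lower bound

# Share of the sampled returns that owed additional tax.
flag_rate = flagged_returns / sampled_returns

# Average loss per sampled return, then a naive scale-up to every return.
# The scale-up is only defensible if the 345 returns represent all 24,399.
loss_per_return = sample_loss / sampled_returns
naive_estimate = loss_per_return * all_returns

print(f"Flag rate: {flag_rate:.1%}")
print(f"Average loss per sampled return: ${loss_per_return:,.0f}")
print(f"Naive extrapolation to all returns: ${naive_estimate:,.0f}")
```

The flag rate works out to roughly 90.7%, which the case rounds to 91%. The scale-up step is the one that depends entirely on the sample being representative of the whole population.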

There were two major sampling selection errors made by the IRS analyst:
€€ The 345 tax returns were chosen from returns that had a Schedule C attached.
€€ The statistical inference and finding were made by evaluating these 345 samples
only.
Therefore, any inferences from the study could not be generalised to the whole
population. The IRS made a basic mistake in sample selection, performed the wrong
calculation and, ultimately, drew inaccurate conclusions. This affected the
credibility of the IRS. The case shows that any person with a basic knowledge of
statistics and research methodology can catch and highlight errors in inferences
drawn from inaccurate mathematical analysis and poor sample selection techniques.

QUESTIONS
1. Who is/are the plaintiff and defendant in the above case study?
(Hint: IRS and owner of a tax preparation service)
2. What type of errors were found in the inferences made by IRS?
(Hint: Mathematical, sampling errors)
3. According to you, what are the sampling selection errors?
(Hint: Small sample out of large population)
4. What are the reasons behind the occurrence of errors in sampling?
(Hint: Interviewer’s attitude, knowledge, sample selection techniques, respondent’s
skills)
5. Describe some ways that can help in reducing errors in sampling.
(Hint: Careful assessment of population, sample selection, sample size)

4.9 EXERCISE
1. What is sampling design?
2. Explain the steps involved in sampling design.
3. Describe the types of errors in measurement. How can these errors be minimised?
4. What is the difference between census and sample survey?
5. Explain the sampling errors in detail.
6. Describe the methods of sampling.
7. Write a short note on:
a. Stratified sampling
b. Cluster sampling
c. Snowball sampling
8. Why does non-sampling error occur?

4.10 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topics                                       S. No.   Answer
Concept of Sampling                          1.       sampling
                                             2.       False
                                             3.       Sample survey
Errors in Measurement and Sampling Errors    4.       Random
                                             5.       True
                                             6.       a. Systematic
Non-Sampling Errors                          7.       Non-sampling
                                             8.       False
                                             9.       d. All of these
Methods of Sampling                          10.      multi-stage
                                             11.      b. Snowball sampling
                                             12.      True

4.11 SUGGESTED BOOKS AND E-REFERENCES
SUGGESTED BOOKS
€€ Goddard, W. and Melville, S. (2011). Research Methodology. Kenwyn, South Africa:
Juta & Co.
€€ Welman, J., Kruger, F. and Mitchell, B. (2005). Research Methodology. Cape Town:
Oxford University Press.

E-REFERENCES
€€ How Simple Random Samples Work. (2020). Retrieved 18 March 2020, from
https://www.investopedia.com/terms/s/simple-random-sample.asp
€€ Types of sampling methods | Statistics (article) | Khan Academy. (2020). Retrieved
18 March 2020, from
https://www.khanacademy.org/math/statistics-probability/designing-studies/sampling-methods-stats/a/sampling-methods-review
€€ Sampling in Statistics: Different Sampling Methods, Types & Error - Statistics
How To. (2020). Retrieved 18 March 2020, from
https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/sampling-in-statistics/
€€ Skye Wills, T. (2020). Sampling Design. Retrieved 18 March 2020, from
http://ncss-tech.github.io/stats_for_soil_survey/chapters/3_sampling/3_sampling.html

CHAPTER 5

Measurement and Scaling
Table of Contents
5.1 Introduction
5.2 Concept of Measurement
5.2.1 Measurement Scales
5.2.2 Developing Measurement Tools
5.2.3 Basic Criteria of a Good Measurement Tool
Self Assessment Questions
5.3 Concept of Scaling Techniques
5.3.1 Types of Scaling Techniques
5.3.2 Bases of Scale Classification
5.3.3 Techniques of Scale Construction
Self Assessment Questions
5.4 Summary
5.5 Key Words
5.6 Case Study
5.7 Exercise
5.8 Answers for Self Assessment Questions
5.9 Suggested Books and e-References
LEARNING OBJECTIVES


After studying this chapter, you will be able to:
€€ Explain the concept of measurement
€€ Discuss the process of developing measurement tools
€€ List the basic criteria of a good measurement tool
€€ Explain the types of scaling techniques
€€ Describe the bases of scale classification

5.1 INTRODUCTION
In the previous chapter, the concept of sampling has been explained. The chapter
discussed census versus sample survey, developing sample design/sampling process
and characteristics of a good sample design. The chapter also described errors in
measurement and sampling processes as well as non-sampling errors. The chapter
concluded with an explanation of methods of sampling.

Measurement and scaling both play an important role in the daily lives of all
human beings. Measurement applies to physical objects whose properties can be
expressed in some measuring unit. The absence of measurement would make even daily
activities very difficult. For instance, the construction of a house or an office
cannot be imagined without measuring its length, width or height. Even the purchase
of rice, flour, milk or any other groceries is not possible in the absence of
measurement units, such as kilograms, grams and litres. Dimensions are required
even for buying clothes.

Scaling is the procedure of creating a continuous sequence of values upon which
the objects to be measured are placed or rated on the basis of certain rules.

Unlike physical objects, abstract concepts, such as happiness, motivation and
success, are difficult to measure using a standardised measure. This is because
these are subjective concepts and everybody holds different views about them.
Hence, scaling is used to measure such abstract, non-physical entities, which
cannot be measured directly.

Both measurement and scaling are essential parts of a research study. In any research,
data collection is not possible without measurement and scaling. The reliability and
authenticity of a research result are related to accurate and specific data collection.
It is said that ‘a research is as good as the data that is used for research’ and quality
data requires a well-defined measurement and scaling.

This chapter will help you in understanding the concept of measurement. You will
study the measurement scales, the development of measurement tools and the basic
criteria of a good measurement tool. Further, the concept of scaling techniques is
also discussed. The latter sections of the chapter will discuss the types of scaling
techniques and bases of scale classification. Towards the end, you will learn about
the techniques of scale construction.

5.2 CONCEPT OF MEASUREMENT

Measurement is the process through which numbers are assigned to observations or
objects on the basis of pre-determined rules. Technically, measurement can also be
defined as a process of mapping aspects of a domain onto aspects of a range
according to some rule of correspondence. It is quite easy to assign numbers to
some characteristics of objects, but quite difficult for others. For example,
physical weight, biological age or a person’s financial assets can
be easily measured, whereas measuring things like social conformity, intelligence
or marital adjustment is very difficult and needs closer attention. In other words,
quantitative features, like weight, height, etc., can be measured directly with
some pre-defined standard unit of measurement, but it is not that easy to measure
properties like motivation to succeed, ability to stand stress, etc.

Overall, measurement can be thought of as using a yardstick to determine
the characteristics of an object. Measurement standards are important
for measuring both qualitative and quantitative aspects. Earlier, measurement of
qualitative aspects was not easy, but, today, there exist standardised tools to measure
abstract concepts, such as intelligence, unity, honesty, bravery, success and stress.
High accuracy and confidence can be expected while measuring the quantitative
characteristics of an object.

5.2.1 MEASUREMENT SCALES
The psychologist Stanley Smith Stevens introduced ‘measurement scales’ in his article
‘On the Theory of Scales of Measurement’, published in 1946. The term measurement
scale refers to a classification used to describe the nature of the information
within the numerals assigned to variables. Scale characteristics determine the
level of measurement. Prior to the selection of a ‘measuring scale’, a researcher
must make efforts to understand the level of measurement. Stevens also mentioned
that all scientific measurement is done using four types of scales. These types of
measurement scales are shown in Figure 1:

Nominal Scale: Named variables
Ordinal Scale: Named and ordered variables
Interval Scale: Named and ordered variables having proportionate interval between them
Ratio Scale: Named and ordered variables having proportionate interval between them. It can also accommodate absolute zero.

Figure 1: Types of Measurement Scales

For the purpose of measurement in a research study, researchers need to
formulate some form of scale in a defined range and then map the characteristics
of objects from the domain onto the formulated scale. The scales of measurement are
considered in terms of their mathematical properties. Various forms of measurement
scales are discussed as follows:
€€ Nominal scale: It is the basic form of measurement scale used to classify, identify
or label individuals, companies, products, brands or other entities into categories.
The scale is also known as a categorical scale. Nominal scales are not numerically
significant: the numbers used in them have no arithmetical value and serve only
as labels for mutually exclusive categories. Numbers, such as 1, 2, 3, ...,
assigned to cricket players in a team, books in a library, or computers in an
Internet cafe or an office are based on a nominal scale. Such numbers do not support
any mathematical operations in the research study. Consider a cricket team in which
the 11 players are assigned numbers from 1 to 11. In this situation, finding the
average of 1 to 11 does not signify anything, as the numbers assigned to the team
members are simply labels and do not have any arithmetical value. The nominal scale
signifies the lowest level of measurement. Some of the characteristics of a nominal
scale are as follows:
zz A nominal scale does not have any arithmetic origin.
zz A nominal scale does not show any order or distance relationship.
zz A nominal scale distinguishes things by putting them into various groups.
zz A nominal scale is generally used to conduct surveys and ex-post-facto research
(a type of research that is used by researchers to predict the possible causes
behind an effect that has already occurred).
The nominal scale also has certain limitations, which are as follows:
zz There is no rank order.
zz There is no possibility of mathematical calculation and analysis.
zz There is a limitation of statistical implication as there is a possibility to express
the mode, but the calculation of the standard deviation and the mean is not
possible.
€€ Ordinal scale: This scale is also known as ranking scale. An ordinal scale only
specifies a greater than or less than value, but does not answer how much greater or
how much less. It allows the respondents to rank some alternatives based on some
common characteristics. It simply places events or objects in order by assigning
ranks. This scale is used as a measure to compare two or more entities. Let us
understand the ordinal scale with the help of an example. A company assigns
ranks to its objectives as shown in the following table:
Company Objectives        Ranks Assigned
Increase in sales         2
Increase in revenue       1
Increase in customers     3
Decrease in cost          4

In the above table, different ranks have been assigned to the company’s objectives.
It is clear that the company prefers an increase in sales to an increase in the
number of customers. However, it cannot be said that the company’s preference
for the increase in sales is two times higher than its preference for a
decrease in cost. Because it conveys order in addition to classification, the ordinal
scale is an improvement over the nominal scale. The main characteristics of the
ordinal scale are as follows:
zz The scale is not expressed in absolute terms.
zz The scale ranks things from the highest to the lowest.
zz The intervals between adjacent ranks are not necessarily equal.
zz Central tendency is measured with the use of a median.
zz Dispersion is measured by using percentile or quartile.
The limitation of the ordinal scale is that arithmetic operations, such as addition,
subtraction, division, etc., cannot be meaningfully performed on the assigned ranks.
€€ Interval scale: The interval scale is also known as the cardinal scale. It is based on
the principle of ‘equality of interval’, i.e., the intervals are assumed equal and are
used as the basis for making the units equal. In other words, in the interval scale,
the interval between successive positions is equal. The positions are separated by
equally spaced intervals. For example, a researcher wants to measure the
level of happiness among youths along a scale rated from 1 to 10. With the use of
an interval scale, the following conclusions can be made:
zz The most happy is represented by number 10 and the least happy is represented
by number 1.
zz Number 7 represents a higher level of happiness than number 6.
zz The difference in the level of happiness between 4 and 3 is the same as the
difference in the level of happiness between 7 and 8.
The characteristics of the interval scale include the following:
zz Interval scales are set arbitrarily and have no absolute zero.
zz The central tendency can be measured by using mean.
zz Dispersion can be measured by using standard deviation.
zz The test of significance is measured by using t-test (a type of hypothesis test
that allows comparison of means) and F-test (a test used to know if the variances
of two populations are equal).
The interval scale contains the features of the nominal and ordinal scales. However,
the key limitation of the interval scale is that the ratio of two observations cannot
be taken: it cannot be stated that number 4 represents double the happiness level
of number 2. This is because the scale lacks an absolute or true zero of measurement.
€€ Ratio scale: It is a scale that contains an absolute zero. On a ratio scale, the
zero point denotes the complete absence of the property being measured, so the scale
values signify the actual amounts of variables. Physical dimension measures, such as
weight, height and distance, fall under this category. Generally, all statistical
techniques are applicable to ratio scales, and all arithmetic operations that can be
performed on real numbers can also be performed on ratio scale values. For instance,
the zero point of a centimetre scale or metre scale denotes the absence of any
length, breadth or height. Arithmetical operations, such as multiplication and
division, can be easily carried out with the ratio scale.
The ratio scale is the most powerful measurement scale, as almost all statistical
operations are possible with its measurements, including those that cannot be performed
with other scales. For example, with the help of the ratio scale, it can be stated
that the weight or height of Ram is twice that of Shyam; such a statement is not
possible with any of the other scales. Some of the important characteristics of the
ratio scale are as follows:
zz The ratio scale has an absolute zero measurement.
zz The central tendency can be measured by using geometric and harmonic
means.
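
The link between the level of measurement and the statistics that are meaningful on it can be sketched with Python's standard `statistics` module. The data values below are invented purely for illustration (the ranks reuse the company-objectives table), and the choice of one summary statistic per scale follows the characteristics listed above.

```python
import statistics

# Nominal: jersey numbers are labels, so only counting and the mode are meaningful.
jersey_numbers = [7, 10, 7, 18, 45, 7]
most_common_label = statistics.mode(jersey_numbers)

# Ordinal: ranks carry order but not distance, so the median is the usual summary.
# median_low returns an actual rank instead of averaging the two middle ranks.
objective_ranks = [2, 1, 3, 4]
median_rank = statistics.median_low(objective_ranks)

# Interval: equal spacing is assumed, so the mean and standard deviation are meaningful.
happiness_scores = [4, 7, 6, 8, 5]
mean_score = statistics.mean(happiness_scores)
score_spread = statistics.stdev(happiness_scores)

# Ratio: a true zero makes ratios and the geometric mean meaningful.
weights_kg = [40.0, 80.0]
weight_ratio = weights_kg[1] / weights_kg[0]          # "twice as heavy" is a valid claim
geometric_mean_weight = statistics.geometric_mean(weights_kg)

print(most_common_label, median_rank, mean_score, score_spread)
print(weight_ratio, geometric_mean_weight)
```

Nothing in the language stops a mean being computed on jersey numbers or ranks; the restriction comes from the scale, not from the arithmetic, which is why the level of measurement has to be decided before the analysis.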

5.2.2 DEVELOPING MEASUREMENT TOOLS


Measurement tools are also known as assessment tools and are used for the
purpose of measuring or collecting data. Some examples of measurement tools are
scales, questionnaires, interviews, surveys and indexes. The procedure to develop
measurement tools comprises four stages. The stages are shown in Figure 2:
Concept Development → Specification of Concept Dimension → Selection of Indicators → Formation of Index

Figure 2: Measurement Tools Development

The stages of developing measurement tools are discussed as follows:


€€ Stage 1 – Concept development: This stage requires that researchers
develop a good understanding of the topic that they want to research. Concept
development is more evident in theoretical studies than in more pragmatic
research, which builds on pre-established fundamental concepts.
€€ Stage 2 – Specification of concept dimension: Once the concept has been developed
properly in the first stage, the researcher needs to identify the dimensions of the
concept. This task can be accomplished either by deduction, which means the use of a
more or less intuitive approach, or by empirical correlation of the individual
dimensions with the total concept and/or the other concepts. For example, on the
launch of a new product in the market, several dimensions, such as customers,
price, social responsibility, reputation of the organisation, and geographical areas
should be considered.
€€ Stage 3 – Selection of indicators: It is the third stage in the process of developing
measurement tools. Indicators are devices to measure the knowledge, opinions, choices,
expectations and feelings of respondents. Examples of indicators are scales and
questionnaires. As there is rarely a perfect measure of a concept, the researcher
should consider more than one indicator to obtain stable scores and improved
validity.
€€ Stage 4 – Formation of index: The last stage of the process of developing
measurement tools builds on the other three stages. A researcher takes into
account multiple dimensions of a particular concept and collects suitable indicators
for proper measurement. After that, these indicators are combined into a single
overall index. For example, the Body Mass Index (BMI), used by the National
Institutes of Health (NIH), combines individual indicators, such as weight and
height, to measure body fat.
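
The BMI example shows how two indicators are combined into one index. The text does not state the formula; the sketch below assumes the standard definition, weight in kilograms divided by the square of height in metres. The function name and the sample values are illustrative only.

```python
def body_mass_index(weight_kg, height_m):
    """Combine two indicators (weight and height) into one index: kg / m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Illustrative values only.
bmi = body_mass_index(weight_kg=70.0, height_m=1.75)
print(round(bmi, 1))
```

Neither weight nor height alone says much about body fat; the index is informative only because of the rule used to combine them, which is the point of Stage 4.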

5.2.3 BASIC CRITERIA OF A GOOD MEASUREMENT TOOL


A measurement tool should clearly and accurately indicate what the researcher
intends to measure. Additionally, it should be easy and efficient to use. There are
three basic criteria of good measurement, as shown in Figure 3:
Good Measurement Tool Criteria: Validity (Content Validity, Criterion-related Validity, Construct Validity), Reliability, Practicality

Figure 3: Criteria of Good Measurement

The fundamental criteria of good measurement are explained as follows:


€€ Validity: It denotes the extent to which an instrument measures what it is
intended to measure. This is needed to achieve the expected outcome from
a good measurement. Ascertaining the validity of a measuring instrument is not
an easy or quick task, and researchers have tried to assess validity in diverse
ways. However, there are three widely accepted criteria for assessing the
validity of a measuring instrument. These are discussed as follows:

zz Content validity: It is the extent to which the content of the measuring
instrument provides appropriate coverage of the topic. If the measuring instrument
contains a representative sample of the universe, the content validity
is considered good. Its determination is principally
judgemental and intuitive. A panel of persons can also determine content validity
by judging how well the measuring instrument meets the standards, but it
cannot be expressed numerically.
zz Criterion-related validity: It is the situation in which some criterion is used
to judge the validity of the measuring instrument. In other words, it is the
ability to predict some outcome or estimate the existence of some current
condition. Generally, it is assessed by comparing the instrument
with other instruments of the same type in which the researcher has more
confidence; for example, comparing the results of two IQ tests on a group of
four students.
Such validity reflects the success of measures used for empirical estimating
purposes. Criterion-related validity is a broad term that is
subdivided into the following:
99 Predictive validity: It refers to the usefulness of a test in predicting some
future performance.
99 Concurrent validity: It refers to the usefulness of a new test as compared
to a well-established test.
zz Construct validity: It is one of the most complex and abstract validity
criteria. It implies that there should be compatibility between a theoretical
concept and a measuring instrument. Technically, a measure is said to possess
construct validity to the degree that it conforms to predicted correlations
with other theoretical propositions. To establish construct validity, a set of other
propositions is associated with the results obtained from using the researcher’s
measurement instrument. If the measurements on the researcher’s
devised scale correlate in a predicted way with the other propositions, it can
be concluded that there is some construct validity.
€€ Reliability: This is another important criterion of good measurement. A
measuring instrument is reliable if it gives consistent outcomes.
A valid instrument is considered to be reliable, but a reliable instrument is not
necessarily a valid instrument. For example, suppose a scale is used to measure
the weight of objects, and it consistently shows all objects to be overweight by 2
kilos. In that case, the scale is reliable but not valid. The two important
aspects of reliability are stability and equivalence. Stability prevails when the same
object or person, measured repeatedly by the same instrument, gives consistent
results over a period of time. The equivalence aspect considers the errors
that can be introduced under different conditions. The two aspects should not be
confused: stability is concerned
with situational variations, whereas equivalence is concerned with variations due
to investigators and samples of items.
€€ Practicality: This feature of instruments of measurement implies that a good
measurement should be economical, interpretable and convenient. The economy
aspect is related to the cost of measurement. If a measurement is out
of reach because of its high cost, it is of no use to the researcher. The interpretability
aspect is especially important when persons other than those who designed the
test will interpret the results. The measuring instrument, to be interpretable, must
be accompanied by (a) comprehensive instructions for administering the test, (b)
scoring keys, (c) evidence about the reliability, and (d) guides for using the test
and for interpreting results. Convenience refers to the ease with which a measuring
instrument can be administered. For example, a questionnaire with sufficient
coverage of the topic and easy language can be administered with more ease.
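The weighing-scale example under reliability can be made concrete with a short Python sketch. It assumes that the true weights are known, which is rarely the case in practice; the function name and the data are hypothetical. The spread of the errors stands in for consistency (reliability), and their average stands in for systematic error (a threat to validity).

```python
from statistics import mean, pstdev


def describe_instrument(true_values, readings):
    """Summarise measurement error as systematic bias and spread."""
    errors = [reading - true for reading, true in zip(readings, true_values)]
    return {"bias": mean(errors), "spread": pstdev(errors)}


# Hypothetical objects with known true weights in kilos.
true_weights = [10.0, 25.0, 40.0, 55.0]

# A scale that always reads 2 kilos too high, as in the example above.
biased_scale = [weight + 2.0 for weight in true_weights]

result = describe_instrument(true_weights, biased_scale)
print(result)
```

A spread of zero with a bias of two kilos describes an instrument that is perfectly consistent yet consistently wrong, which is the sense in which a reliable instrument need not be valid.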

Self Assessment Questions


1. The term ‘measurement scale’ refers to a classification that describes the nature
of information within the numerals assigned to _____________.
2. ____________ allows the respondents to rank some alternatives based on some
common characteristics.
3. In the interval scale, the interval between successive positions is unequal.
(True/False)
4. _______________ operation can be easily carried out with ratio scale.
5. Predictive validity refers to the usefulness of a test in predicting some future
performance. (True/False)
5.3 CONCEPT OF SCALING TECHNIQUES
Researchers often face difficulties of valid measurement at the time of measuring
attitudes, opinions, physical concepts and institutional concepts. Certain specific
procedures are required, which may enable researchers to measure abstract
concepts more precisely. A scale denotes a continuum consisting of the highest point
and lowest point along with numerous intermediate points between the extreme
points. The scale-point positions are related such that, when the first point
is the highest point, the second point indicates a higher degree of the
given characteristic than the third point, and so on. Scaling defines the
procedures that are used for assigning numbers to various degrees of opinion,
attitudes and other concepts. It may be described as a ‘procedure for the assignment
of numbers to a property of objects to impart some of the characteristics of numbers
to the properties in question’.

Following are the ways of doing scaling:


€€ Making a judgement about some characteristic of an individual and then placing
the individual directly on a scale that has been defined in terms of that
characteristic.
€€ Framing questionnaires in such a way that the score of an individual’s responses
assigns him/her a place on the scale.
In practical usage, the commonly used attitude measurement scales are ordinal. These
scales are basically self-report inventories, with a list of favourable and unfavourable
statements towards the subject under study.

5.3.1 TYPES OF SCALING TECHNIQUES


A tool or mechanism that is used to differentiate individuals from one another on
the variables of interest of research is known as a scale. Various researchers use
different scales depending on the needs of their study. The different types of scales
are shown in Figure 4:

Types of Scaling Techniques
Comparative Scales: Ranking Scale (Paired Comparison, Rank Order Scale), Constant Sum Scale
Non-Comparative Scales: Continuous Rating Scale, Itemised Rating Scale (Summated Scale (Likert), Semantic Differential Scale, Guttman Scale)

Figure 4: Types of Scaling Techniques

The types of scales mentioned in Figure 4 are discussed as follows:


€€ Comparative scales: Comparative scales involve the direct comparison
of stimulus objects and yield data having rank-order or ordinal properties only. They
consist of scales wherein the researchers ask the respondents for their relative
preference between two or more objects. For instance, “Do you prefer Nescafe or
Bru?” Comparative scales comprise paired comparison, rank order, and constant
sum scale.
zz Ranking scale: Ranking scales are defined as scales that are used for making
relative judgements. Ranking scales are further divided into two approaches,
which are discussed as follows:
99 Paired comparison: The paired comparison approach of ranking scale
offers a way to make comparisons among objects. In the paired comparison
method, the respondents are asked to express their attitude by making
a choice between two objects, say between Real Juice and Tropicana,
according to some criterion. Fundamentally, if there are ‘n’ stimuli to
judge, the number of judgements required in a paired comparison is N
= n(n–1)/2. For example, if there are 8 products, then the respondents
need to make 8(8–1)/2 = 28 comparisons. If the number of comparisons
becomes quite large, there is a risk that the respondents may
be reluctant to take part in the research. In such a case, comparisons
can be reduced by applying the law of transitivity. This law says that if A
is preferred to B and B is preferred to C, then A is assumed to be
preferred to C. The limitations of paired comparison include:
 This technique is useful only when the number of brands is limited.
 This technique may give biased results as it is dependent upon the
order in which the objects are presented.
 This technique does not replicate a true market situation, which
involves selection from multiple alternatives.
99 Rank order scale: In this approach of ranking scale, respondents are asked
to give rank to their choices as per their preferences. The rank order scale
is commonly used to measure preferences for brands as well as attributes.
Following is an example of the rank order method:
Items    Choices/Preferences
A        6
B        3
C        2
D        5
E        1
F        4

In the given example, 6 items are shown. The respondent was asked to
rank the items as per his/her preferences. Item E is the most preferred and
item A is the least preferred by the respondent.
zz Constant sum scale: Constant sum scale is viewed as an ordinal scale because
of its comparative nature. In this form of scaling, the respondents are asked
to allocate a fixed number of units (points) among the different characteristics
of an object. The respondents have to rate each characteristic in
such a manner that the total number of units or points equals the total number
of units assigned by the researcher or the experimenter. Respondents assign
the number of units to each characteristic based on the importance of the
characteristic to them. If a characteristic holds no importance for the respondent,
he/she can assign zero units to it. For example, an HR professional
may create a constant sum scale that totals 100 marks, to know the relative
importance of different infrastructural features in an organisation, such
as drinking water, clean washroom, gymnasium, sports room, canteen, etc.
The respondents under study are instructed to assign marks to the
infrastructural features in such a way that the sum of all the marks allocated
to the infrastructural features of the organisation equals 100. After the
responses of all respondents have been noted, the points earned by
each attribute are totalled. The values arrived at through the constant sum scale
method can be used to draw conclusions or to support further research.
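Two of the comparative techniques described above involve simple arithmetic that can be sketched in Python: counting the judgements needed in a paired comparison, N = n(n–1)/2, and totalling the points in a constant sum scale. The function names and the respondent data below are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def paired_comparisons(items):
    """Return every unordered pair of items that a respondent must judge."""
    return list(combinations(items, 2))


def constant_sum_totals(allocations, total=100):
    """Add up the points each attribute received across respondents.

    Every allocation must sum to `total`; otherwise it is rejected.
    """
    totals = {}
    for allocation in allocations:
        if sum(allocation.values()) != total:
            raise ValueError(f"allocation must sum to {total}: {allocation}")
        for attribute, points in allocation.items():
            totals[attribute] = totals.get(attribute, 0) + points
    return totals


# Eight products require 8(8-1)/2 = 28 paired judgements.
products = [f"Product {i}" for i in range(1, 9)]
pairs = paired_comparisons(products)
print(len(pairs))

# Hypothetical allocations of 100 marks by two respondents.
allocations = [
    {"drinking water": 40, "clean washroom": 30, "canteen": 20,
     "gymnasium": 10, "sports room": 0},
    {"drinking water": 30, "clean washroom": 30, "canteen": 25,
     "gymnasium": 5, "sports room": 10},
]
totals = constant_sum_totals(allocations)
print(totals)
```

Rejecting allocations that do not add up to the fixed total is one possible design choice; a survey tool could instead ask the respondent to correct the entry.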
€€ Non-comparative scales: These are scales in which each object is measured
independently of the other objects under the same research study. Absolute
results are obtained for each object. Examples of non-comparative scales include
continuous rating scales, Likert scale, etc. They are generally divided into two
categories: continuous rating scale and itemised rating scale.
zz Continuous rating scale: In a continuous rating scale, the respondents are
asked to rate different objects on a continuum according to a certain criterion.
A continuum is a line running from one extreme value to the other extreme
value of the criterion. The rating is given by respondents by marking a point
on the continuum.
zz Itemised rating scale: In itemised rating scale, items are shown in the form
of ordered statements and the respondents are required to select the category
that best describes the concerned item. The respondents are asked to make
a choice according to their preferences or opinions. A brief description of
each category is associated with the itemised rating scales. The most common
itemised rating scales used by researchers include Likert scale (summated
scale), semantic differential scale, Thurstone and Guttman scale.
99 Summated scale (Likert): Summated scales are constructed by using the
item analysis approach. Such scales consist of a number of statements
that express either positive or adverse feelings towards any topic or idea.
The summated scale is most frequently used in studying social attitudes.
It follows the pattern developed by Likert; thus, the summated scale is
also termed as the Likert scale. Most commonly, a Likert scale contains
five degrees of response to a statement. Let us know more about the Likert scale with
the help of the following example (statement and options).
Statement: The Internet is creating a positive impact on children.
Response options:
(a) Strongly Agree (1)
(b) Agree (2)
(c) Neutral (3)
(d) Disagree (4)
(e) Strongly Disagree (5)
In the preceding example, there are five degrees of response for the given
statement. One extreme of the scale (Strongly Agree) shows the strongest approval of
the statement, whereas the other extreme (Strongly Disagree) indicates the strongest
disapproval of the statement. The middle points lie between these two extremes. Each
point on the scale has a numerical value. This example contains only
one statement, but more than one statement can be used in a Likert scale. In
the Likert scaling method, each response is assigned a numerical value.
The total score for each respondent is calculated by summing the values of his/her
responses to all the statements.

99 Semantic differential scale: Factor scales are developed using the
factor analysis approach. The semantic differential scale is an example
of a factor scale and was developed by Charles E. Osgood, G.J. Suci
and P.H. Tannenbaum. It measures the connotative meaning of objects,
events and concepts. The semantic differential scale comprises bipolar
adjectives, such as valuable-worthless and good-bad. The respondent
is asked to select his/her position between these two adjectives. Let us
understand the concept of the semantic differential scale with the help of
the following example. A semantic differential scale for analysing candidates
for a managerial position is shown in Table 1. Here, each pair of adjectives
(for example, successful and unsuccessful) is shown on two extremes. In between
these two extremes, scores (3, 2, 1, 0, –1, –2, and –3) are mentioned to
rate the candidates according to the level of traits possessed by them.
Successful-unsuccessful, progressive-regressive, and true-false represent
the evaluation factor. The potency factor is represented by the severe-lenient
and strong-weak pairs. The rest of the adjectives shown in Table
1 represent the activity factor. The semantic differential scale is widely
used in measuring the attitudes of different people. Table 1 shows
the semantic differential scale for rating candidates for a managerial
position on the basis of the given traits and scores:
Table 1: Semantic Differential Scale for Analysing Candidates for a Managerial Position
Successful      Unsuccessful
Progressive     Regressive
Active          Passive
Fast            Slow
Strong          Weak
Severe          Lenient
True            False
3   2   1   0   −1   −2   −3

99 Guttman scale: The Guttman scale, also known as cumulative scale,
consists of a series of statements to which respondents express their
agreement or disagreement. It is important to note that in the cumulative
scale, statements appear in the form of a cumulative series. It means that
if there are seven statements and the respondents agree with statement 4,
then they would also agree with statements 1, 2 and 3.
Let us understand the concept of cumulative scale with the help of the
following example:
 I can do the counting.
 I can do addition.
 I can do subtraction.
 I can do multiplication.
 I can do division.
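The scoring rules for two of the itemised rating scales above can be sketched in Python. The first function sums Likert responses using the coding from the example (Strongly Agree = 1 to Strongly Disagree = 5). The second checks whether a set of answers follows the cumulative pattern expected of a Guttman scale. Both function names and the sample answers are hypothetical, and the sketch assumes all statements are worded in the same direction.

```python
# Coding taken from the Likert example above.
LIKERT_CODES = {
    "Strongly Agree": 1,
    "Agree": 2,
    "Neutral": 3,
    "Disagree": 4,
    "Strongly Disagree": 5,
}


def likert_total(answers):
    """Sum the numerical values of one respondent's answers."""
    return sum(LIKERT_CODES[answer] for answer in answers)


def is_cumulative(agreements):
    """Check a Guttman pattern over statements ordered from easiest to hardest.

    Once a respondent disagrees with a statement, he/she should not
    agree with any later statement.
    """
    seen_disagreement = False
    for agrees in agreements:
        if agrees and seen_disagreement:
            return False
        if not agrees:
            seen_disagreement = True
    return True


# Hypothetical respondent answering three Likert statements.
score = likert_total(["Agree", "Strongly Agree", "Neutral"])
print(score)

# Counting, addition, subtraction, multiplication, division.
print(is_cumulative([True, True, True, False, False]))
print(is_cumulative([True, False, True, False, False]))
```

The second pattern is flagged because the respondent claims subtraction without addition, which breaks the cumulative series.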

5.3.2 BASES OF SCALE CLASSIFICATION


You have come to know that some numbers are assigned to measure abstract concepts.
These numbers are not assigned arbitrarily but on the basis of certain factors.

These are shown in Figure 5:

Subject Orientation
Response Form
Degree of Subjectivity
Scale Properties
Number of Dimensions
Scale Construction Approach

Figure 5: Bases of Scale Classification

A brief description of the bases of scale classification is as follows:
€€ Subject orientation: In this base of scaling, variations in the responses given by
different people are analysed and examined.


€€ Response form: It is the basis of scale classification in which variations across both
stimulus and subject are assessed. Based on the response form, two types of scales are
available, namely categorical and comparative. Categorical scales are also called
rating scales. For example, suppose there is a statement,
I cannot live without my mobile phone.
Response options are as follows:
zz Strongly disagree
zz Disagree
zz Agree
zz Strongly agree
On the other hand, comparative scales are known as ranking scales. For example:
Please rank the following options in the order of your preference.
zz Watching TV
zz Going out for a movie
zz Listening to songs
zz Morning walk
Use alphabetical letters to show your preference.

€€ Degree of subjectivity: It is the basis which classifies the scale according to whether
it measures personal preferences or non-preference judgements. In the first case, respondents
may be asked to give their personal opinion. For example:
Which of the following organisations do you favour the most?
zz Organisation A
zz Organisation B
zz Organisation C
zz Organisation D
In the second case, the respondent may be simply asked to decide the most
profitable organisation. It is clear that in the second case, there is no scope for
personal opinion.
€€ Scale properties: This is the base of scale classification according to which scales
can be classified as nominal, ordinal, interval, and ratio scales. These are already
discussed in the previous sections.
€€ Number of dimensions: It indicates the dimensions on the basis of which scales are
classified. Two types of scales are used: one-dimensional and multi-dimensional.
€€ Scale construction approach: It indicates scale-classification on the basis of
different approaches used.

5.3.3 TECHNIQUES OF SCALE CONSTRUCTION


Scales are used in almost all fields of research. However, they are used extensively in
studies related to psychology and social sciences. While measuring the attitudes of
individuals, a researcher normally follows the technique of preparing the attitude
scale in such a way that the score of an individual’s responses assigns him/her a place on the
scale. Under this approach, respondents express their agreement or disagreement
with a number of statements relevant to the issue. While developing such statements,
the following points should be taken care of:


€€ The statements must elicit responses, which are psychologically related to the
attitude being measured.
€€ The statements need to be such that they discriminate not only between extremes
of attitude, but also among individuals who differ slightly.
A brief description of scale construction approaches is as follows:
€€ Arbitrary approach: According to this approach, a scale is developed on an ad-hoc
basis, that is, for a specific purpose; therefore, it cannot be generalised. Arbitrary
scales are easy to develop and provide specific information about a particular
topic.
€€ Consensus approach: According to this approach, items to be included in the scale
are decided on the basis of consensus in a panel of judges. The items are evaluated
in terms of their relevance to topic and certainty in implication.
€€ Item analysis approach: According to this approach, items are developed in the
form of a test and given to respondents. After the test is completed, the total scores
are calculated for each respondent and the items are evaluated to determine which
questions discriminate between high and low scorers.
€€ Factor analysis approach: According to this approach, the correlation between
different items is established on the basis of a common factor.
€€ Cumulative scale approach: According to this approach, the approval of an item
representing an extreme position should also result in the approval of all items
indicating a less extreme position.
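The item analysis approach described above can be sketched in Python. The sketch assumes one common way of judging discrimination, which the text does not specify: respondents are ordered by total score, the top and bottom quarters are taken as the high and low groups, and each item is scored by the difference between the two groups' mean responses. The function name, the 25% cut-off and the data are hypothetical.

```python
def item_discrimination(scores, fraction=0.25):
    """Return, for each item, mean(high group) minus mean(low group).

    `scores` holds one list of item scores per respondent. Groups are the
    top and bottom `fraction` of respondents by total score (an assumption).
    """
    ranked = sorted(scores, key=sum)
    group_size = max(1, int(len(ranked) * fraction))
    low_group = ranked[:group_size]
    high_group = ranked[-group_size:]
    item_count = len(scores[0])
    return [
        sum(row[i] for row in high_group) / group_size
        - sum(row[i] for row in low_group) / group_size
        for i in range(item_count)
    ]


# Hypothetical responses of eight respondents to three items.
responses = [
    [5, 4, 3], [5, 5, 3], [4, 4, 3], [3, 3, 3],
    [2, 3, 3], [2, 2, 3], [1, 2, 3], [1, 1, 3],
]
differences = item_discrimination(responses)
print(differences)
```

In this made-up data, the third item receives the same answer from everyone, so its difference is zero and it would be a candidate for removal from the scale.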

Self Assessment Questions


6. ___________ defines the measures of assigning numbers to various degrees of
opinions, attitudes and other concepts.
7. The paired comparison approach of ranking scale restricts to make comparisons
among objects. (True/False)
8. In itemised rating scale, items are shown in the form of ordered statements
and the _______________ are required to select the category that best describes
the concerned item.
9. The summated scale is most frequently used in studying social attitudes.
(True/False)
10. The degree of subjectivity is the basis which develops the scale either by
measuring ______________ preferences or non-preference judgements.
Activity
Search the Internet for a research paper related to ‘measurement scales and their
use in presenting statistical data’ and prepare a report of 1,000 words.

5.4 SUMMARY
€€ Measurement is considered as a function of using a yardstick to determine the
characteristics of a physical object.
€€ For the purpose of measurement in a research study, researchers need to
formulate some form of scale in a defined range and then, accordingly, map the
characteristics of objects from the domain onto the formulated scale.
€€ Nominal scale is the basic form of measurement scale used to classify, identify or
label individuals, companies, products, brands or other entities into categories.
€€ Ordinal scale is also known as ranking scale. An ordinal scale only specifies a
greater than or less than value, but does not answer how much greater or how
much less.
€€ The interval scale is also known as cardinal scale. It is based on the principle of
‘equality of interval’, i.e., the intervals are assumed equal and are used as the basis
for making the units equal.
€€ Ratio scale contains an absolute zero. A value of zero on the ratio scale indicates
the absence of the characteristic being measured.
€€ Measurement tools are also termed as assessment tools and these are used for the
purpose of measuring or collecting data.
€€ The stages of developing measurement tools are concept development, specification
of concept dimension, selection of indicators and formation of index.
€€ A measurement tool should clearly and accurately indicate what the researcher
intends to measure.
€€ Validity denotes the extent to which an instrument measures what it is
intended to measure.
€€ Reliability is another important criterion of good measurement. The reliability
of a measuring instrument implies the consistent outcomes received through
measurement.
€€ The practicality feature of instruments of measurement implies that a good
measurement should be economical, interpretable and convenient.
€€ A scale denotes a continuum consisting of the highest point and lowest point along
with numerous intermediate points between the extreme points.
€€ A tool or mechanism that is used to differentiate individuals from one another on
the variables of interest of research is known as a scale.
€€ The types of scales include comparative scales and non-comparative scales.
Comparative scales include ranking scale and constant sum scale. Ranking scale
is divided into paired comparison and rank order scale. Non-comparative scales
include continuous rating scale and itemised rating scale. Itemised rating scale
includes summated scale (Likert), semantic differential scale and Guttman scale.
€€ Arbitrary scales are easy to develop and provide specific information about a
particular topic.
€€ The consensus approach implies that the items to be included in the scale are
decided on the basis of consensus by a panel of judges.

5.5 KEY WORDS


€€ Ad hoc: It implies a solution that is designed for a specific problem.
€€ Measurement: It implies a yardstick to determine the characteristic of any physical
object.
€€ Connotative: It implies the figurative meaning of a word.
€€ Non-preference judgement: It implies the style of judgement in which there is no
scope of personal bias.
€€ Scaling: It implies a branch of measurement that tries to measure abstract concepts.

5.6 CASE STUDY: CBA RANK ORDER SCALING TO KNOW CUSTOMER PREFERENCES

In Sri Lanka, a toothpaste company named CBA Ltd. is proposing the launch of a
new brand of toothpaste in its product line. Prior to launching the new product,
the management of the organisation thinks that it is imperative to gather material
information about customer preferences and the leading brands in the
toothpaste industry. The study of the existence of competitors in the market, desired
expectations of consumers and the preferences of customers will enable the company
to design its new product as per the market requirements.

For this purpose, CBA Ltd. conducted a small research study on a sample of 300 respondents,
using a questionnaire containing questions based on the rank order scale. The respondents
were presented with 10 toothpaste brands simultaneously and were asked to rank
or order them according to their own presumed criteria. The following form, along with
instructions, was given to the respondents:

Form for Preference for Toothpaste Brands


S. No.   Brand            Rank Order
1        Crest
2        Formula Action
3        Sensodyne
4        Pepsodent
5        Plus White
6        Oral B
7        Close Up
8        Antiplaqu
9        Ultra Brite
10       Colgate
Instructions to rank preferences:
zz Rank the several brands of toothpaste in the order of your preference.
zz Pick the brand that you prefer the most and assign it the number 1.
zz Assign the number 2 to the second most liked brand.
zz Continue with this process until all the brands have been ranked in order
of your preference.
zz The least preferred brand of toothpaste should be ranked 10.
zz Also, no two brands should be given the same rank.
zz The criteria of preference depend entirely on the respondent. There is
no right or wrong answer.

By compiling and analysing the information received from the survey, the company
could make an assessment that the product characteristics present in Close
Up were most valued by customers, followed by Ultra Brite and
Pepsodent. The price, durability, quality, functionality, packaging and other
features of the topmost brands gave the required market information to the company
for deciding the desired specifications in the new product to be developed.

The level of competition prevailing in the toothpaste market could also be assessed
by the company. Although the survey gave details on the most favoured and
least favoured brands, it could not reveal the distances between research objects
or the reasons for customers’ choices between different brands. It was felt that
the survey provided limited information for knowing about the criteria based on
which consumers accept or reject a product. It could not reveal why a product was
important or unimportant to the respondents.
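The compilation step in this case study can be sketched in Python. The sketch assumes one simple way of summarising rank order data, the average rank per brand, which the case study does not specify; the function name, the brand labels and the three rankings are made up for illustration and are not the survey's results.

```python
def mean_ranks(rankings):
    """Average the rank each brand received (1 = most preferred).

    Every ranking must give each brand a unique rank from 1 to n,
    as required by the instructions on the form.
    """
    brands = sorted(rankings[0])
    expected = list(range(1, len(brands) + 1))
    for ranking in rankings:
        if sorted(ranking) != brands or sorted(ranking.values()) != expected:
            raise ValueError("each brand needs a unique rank from 1 to n")
    return {
        brand: sum(ranking[brand] for ranking in rankings) / len(rankings)
        for brand in brands
    }


# Hypothetical rankings from three respondents.
rankings = [
    {"Brand A": 1, "Brand B": 2, "Brand C": 3},
    {"Brand A": 2, "Brand B": 1, "Brand C": 3},
    {"Brand A": 3, "Brand B": 1, "Brand C": 2},
]
averages = mean_ranks(rankings)
order = sorted(averages, key=averages.get)
print(order)
```

The resulting order says which brand tends to be preferred, but, as noted above, ordinal ranks carry no information about how far apart the brands are or why they were chosen.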
QUESTIONS

1. What other forms of scaling methods could be used by CBA Ltd.?
(Hint: No other scaling method can be used.)
2. What are the limitations of the measurement scale used by CBA Ltd.?
(Hint: Rank order scale yields ordinal data. It gives better results only when a
direct comparison is required between research objects.)
3. What had management thought to do before launching the new product?
(Hint: Gather material information such as customer choices and preferences)
4. What did CBA Ltd. management do to collect information from customers?
(Hint: Conducted a survey of 300 respondents)
5. How did CBA Ltd. collect the desired specifications for the new product to be
developed?
(Hint: Price, durability, quality, functionality, packaging and other features of the
topmost brands gave the required market information)

5.7 EXERCISE
1. Discuss the concept of measurement.
2. Explain the types of scales.
3. Explain comparative scales.
4. Describe various non-comparative scales.
5. What are the different bases of scale classification?
6. Explain the stages of developing measurement tools.
7. Enlist the basic criteria of a good measurement tool.

5.8 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                            Q. No.   Answer
Concept of Measurement           1.       variables
                                 2.       Ordinal scale
                                 3.       False
                                 4.       Arithmetical
                                 5.       True
Concept of Scaling Techniques    6.       Scaling
                                 7.       False
                                 8.       respondents
                                 9.       True
                                 10.      personal

5.9 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
€€ Khanzode, V.V. (2007). Research Methodology Techniques and Trends. Daryaganj,
New Delhi: APH Publishing Corporation.
€€ Redman, L.V., & Mory, A.V.H. (1923). The Romance of Research. Baltimore,
Maryland: Williams and Wilkins Company.
€€ Welman, J., Kruger, F., & Mitchell, B. (2005). Research Methodology. Cape Town:
Oxford University Press.
€€ Lancaster, G. (2012). Research Methods. Taylor & Francis.
€€ Kothari, C. (2004). Research Methodology. New Delhi: New Age International (P)
Ltd.

E-REFERENCES
€€ Measurement Scales in Research Methodology Tutorial 21 March 2020 - Learn
Measurement Scales in Research Methodology Tutorial (11478) | Wisdom Jobs
India. (2020). Retrieved 21 March 2020, from
https://www.wisdomjobs.com/e-university/research-methodology-tutorial-355/measurement-scales-11478.html
€€ What are Scaling Techniques? - Business Jargons. (2020). Retrieved 21 March 2020,
from https://businessjargons.com/scaling-techniques.html

CHAPTER 6

Data Collection Techniques
Table of Contents
6.1 Introduction
6.2 Data Collection
    6.2.1 Types of Data
    Self Assessment Questions
6.3 Methods of Data Collection
    6.3.1 Methods of Primary Data Collection
    6.3.2 Methods of Secondary Data Collection
    Self Assessment Questions
6.4 Factors Affecting the Selection of Data Collection Methods
    Self Assessment Questions
6.5 Summary
6.6 Key Words
6.7 Case Study
6.8 Exercise
6.9 Answers for Self Assessment Questions
6.10 Suggested Books and e-References

LEARNING OBJECTIVES


After studying this chapter, you will be able to:
€€ Outline the importance of data collection
€€ Differentiate between primary data and secondary data
€€ Explain the different methods of data collection
€€ Discuss the factors affecting the selection of data collection methods

6.1 INTRODUCTION
In the previous chapter, you studied the construction of measurement scales
and the different types of scaling techniques used for measurement of objects in research.
After completing this part of the research design, the next step is to collect data from
the respondents. This chapter focusses on methods of collection of data. Data can be
collected from two types of sources, i.e., primary or secondary.

Every researcher requires several data-gathering tools and techniques. Data
collection methods form an integral part of the research design. There are various
data collection methods and each has its merits and demerits. These tools vary
in design, complexity, interpretation and administration. Each data collection
tool is suitable for gathering a certain type of information. Using an appropriate
data collection method for the problem being researched greatly
enhances the value of the research study. Different tools available for data collection are
interviews, questionnaires, schedules, observation techniques, etc. The researcher
should select, from the available tools, the one that will best provide the data that
is sought for testing the research hypothesis. If an existing research tool does not
suit the purpose of research, then the researcher must modify the tool accordingly
or construct some other tool. Reliability and accuracy must be maintained in the
process of data collection.

This chapter begins by defining primary data and secondary data. Primary data
refers to information gathered first-hand by the researcher on the variables of
interest for the specific purpose of the research study. Secondary data, on the
other hand, refers to information gathered from already existing sources, such as
company records and government publications. The latter part of the chapter
discusses the various methods of primary and secondary data collection in detail.
Factors affecting the selection of data collection methods are described at the
end of the chapter.

6.2 DATA COLLECTION


Research can be carried out on any subject in any stream, including management,
computers, medical engineering, etc. However, every type of research requires data
to be collected from various sources. The process of gathering data for research
is known as data collection. Data collection can be defined as the process of
collecting information from all the relevant sources to find answers to the
research problem, test the hypothesis and evaluate the results.

No research can be carried out without sufficient, useful and relevant data. To obtain
accurate data, it is important for a researcher to approach the right source. For
instance, a researcher who wants to study the most prevalent disease
would approach doctors to collect data on patients suffering
from different types of diseases. After collecting the data, the researcher processes and
analyses it to obtain meaningful information.

6.2.1 TYPES OF DATA


Data is basically a collection of facts and figures retrieved from observations or
surveys. It is collected by a researcher keeping in view the objectives of the research
study.

There are mainly two types of data which are explained below:
• Primary data: Primary data is data that is collected fresh and for the first
  time. The researcher may collect this data directly from the respondents, either
  personally or through a team. Since this data has not been published anywhere
  and has not been altered by others, it tends to be more objective, authentic and
  relevant to the research objectives than other data. Primary data
  can be collected through field observations, surveys, questionnaires or
  experiments. It can cover a wide geographical area and a large population.
  The degree of accuracy of primary data is generally high because the data is specific to
  the researcher’s needs and relevant to the topic of the research study. Moreover,
  since primary data is current, it can give the researcher a realistic view of the
  topic under consideration.
• Secondary data: It refers to data that was collected in the past, but can be
  utilised in the present scenario/research work. Collecting secondary data
  requires less time than collecting primary data.
  For example, census reports and records collected by the Central Statistical
  Organisation (CSO) are secondary data.
Self Assessment Questions
1. ___________ refers to the data that does not have any prior existence and is
collected directly from the respondents.
2. ___________ refers to the data that has already been collected by other sources
and is readily available.

6.3 METHODS OF DATA COLLECTION


Selecting the right method for data collection is important to get reliable data. The
different methods of primary and secondary data collection are described in the
following sections.

6.3.1 METHODS OF PRIMARY DATA COLLECTION


There are various methods of primary data collection, as shown in Figure 1:

[Figure 1: Methods of Primary Data Collection. The figure shows two main methods, the
Observation Method and the Survey Method; the Survey Method is divided into the
Interview Method, Questionnaire Method, Sociometric Method and Schedule Method.]

Let us discuss these methods in detail.


• Observation method: In this method, the population of interest is observed to find
  out the relevant facts and figures. The observation method is a technique in which
  the observer collects data in the field by recording the
  behavioural patterns of people, objects and occurrences without communicating with or
  questioning them. It may be defined as the process of systematic viewing coupled with
  recording of the observed phenomenon. It is used for studying the dynamics of a
  given situation, for making frequency counts of target behaviours, or for reviewing
  any other behaviours as indicated by the evaluation needs.
  For example, site visits may be made to an after-school programme to document
  the interactions between the youth and staff present in the programme. This
  method is generally suitable when the researcher wants to gather currently
  prevailing real-life information for the research.
  It offers the following advantages:
  ◦ Observation can be started or stopped at any time
  ◦ It provides access to large sections of people
  It suffers from the following disadvantages:
  ◦ It is a time-consuming activity
  ◦ The result depends on the performance of the observer
  Observation is done using the following methods:
  ◦ Natural method: In this method, the researcher observes the behaviour of
    people without any intervention. It takes place when the people being
    observed have no idea that they are being observed.
    For example, the researcher observes bikes passing on the road to find out the
    most popular brand in the city. In addition, the researcher can observe the
    activities, movements, gestures and facial expressions of people.
    It offers the following advantages:
    ▪ It is the simplest method.
    ▪ It does not require the willingness of the people to report.
    It suffers from the following disadvantages:
    ▪ Not all occurrences may be open to observational studies.
    ▪ Researchers collecting natural observations usually do not obtain
      the informed consent of the people being observed, which
      raises ethical concerns.
  ◦ Contrived method: Under the contrived method, the observer creates a research
    setting in order to carry out the research. All the respondents are
    observed in this simulated environment. The researcher can control all the
    major aspects of the research environment. Therefore, data can be collected
    easily and quickly. It takes place when people are aware of their participation
    in the study, but do not know which aspects are being observed.
    For example, the reactions of a group of people to a particular situation,
    such as the impact of different types of bacteria and the resistance level of
    people, may be observed in a laboratory set-up.
    It offers the following advantages:
    ▪ It is easier than the natural method
    ▪ The researcher has full control over the research setting
    It suffers from the following disadvantages:
    ▪ The artificial environment may increase the frequency of certain
      behavioural patterns to be observed
    ▪ The contrived method is less natural than other forms of observation
  ◦ Direct method: In this method, the researcher waits for a particular event
    or behaviour to occur. As a result, it can take a long time to get a single response.
    For example, suppose the researcher is observing the sale of new products in an
    automobile showroom. In this case, the researcher has to wait until a
    customer comes into the showroom and asks for the new product. When the
    customer comes and sees the new product, he/she may or may not purchase it
    on the same day. In such a situation, the researcher has to wait until that customer
    comes back to buy the product. This method is used when other data collection
    procedures, such as surveys or questionnaires, are not effective or when the objective is
    to analyse an ongoing behaviour process.
    It offers the following advantages:
    ▪ The physical outcomes can be readily counted
    ▪ The method is easy to execute and complete
    It suffers from the following disadvantages:
    ▪ It is a time-consuming procedure
    ▪ It requires high diligence on the part of the observer
    In the direct method, the researcher directly observes and records the actions
    of the people under study. In indirect observation, by contrast, the
    researcher reports the event using documents, correspondence, diaries
    or organisational files. The researcher usually observes the effects of the
    behaviour, which is recorded using mechanical or electronic devices.
    For example, the calls between customer care executives and customers are
    recorded in various call centres for training and quality purposes.
  ◦ Structured method: In this method, the researcher knows what is to be
    observed.
    For example, if the researcher has to know about a particular brand of car,
    he/she would observe only that brand and would not pay any attention
    to other car brands. The structured method consumes less time and makes it
    easier for the researcher to analyse the data. It is used when the researcher
    specifies in detail what has to be observed and how measurements have to be
    recorded.
    It offers the following advantages:
    ▪ It simplifies and systematises the data-recording process.
    ▪ It is likely to produce quantitative data beneficial for analysing and
      comparing information.
    It suffers from the following disadvantages:
    ▪ Results are not detailed and in-depth.
    ▪ It is useful for studying small-scale interactions only.
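The frequency counts that a structured observation produces can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the brand names, the field log and the function name tally_observations are all hypothetical. The categories to be recorded are fixed before the session and anything outside them is ignored, which mirrors the way a structured observer pays no attention to other brands.

```python
from collections import Counter

# Assumed categories, decided before the observation session begins.
BRANDS_TO_RECORD = ("Brand A", "Brand B", "Brand C")


def tally_observations(observations):
    """Count how often each predefined category was observed."""
    counts = Counter({brand: 0 for brand in BRANDS_TO_RECORD})
    for brand in observations:
        if brand in BRANDS_TO_RECORD:  # ignore anything outside the plan
            counts[brand] += 1
    return counts


# Hypothetical field log: one entry per vehicle seen by the observer.
field_log = ["Brand A", "Brand C", "Brand A", "Brand D", "Brand B", "Brand A"]
counts = tally_observations(field_log)

for brand in BRANDS_TO_RECORD:
    print(f"{brand}: {counts[brand]}")
```

Because the recording scheme is fixed in advance, the output is directly comparable across observers and sessions, at the cost of discarding everything that was not planned for.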
  ◦ Unstructured method: In this method, the researcher does not know
    exactly what he/she has to observe. The unstructured method is used in exploratory
    research, where the researcher wants to search for all the aspects that
    can affect a particular problem.
    For example, the researcher observes the buying behaviour of people for
    different brands of the same product. He/she would study all the factors that
    can affect the buying decision of people. After that, he/she would analyse the
    buying decision for a particular brand. Under this method, the researcher
    enters the research field with some idea of what might be important, but not of
    what exactly will be observed.
    It offers the following advantages:
    ▪ The observer has the freedom to decide and observe everything that is
      relevant.
    ▪ It is more explorative than the structured method.
    It suffers from the following disadvantages:
    ▪ It is an unfocussed approach, with the investigator documenting as much
      as possible.
    ▪ It is more time-consuming than the structured approach.
  ◦ Mechanical method: In this method, the researcher uses devices to
    observe people’s responses. Examples of these devices are video cameras and
    audiometers. This method has applications in real-time scenarios, such as using voice
    pitch meters to measure emotional reactions, analysing traffic flows in an
    urban square, monitoring website traffic, etc.
    It offers the following advantages:
    ▪ It does not require the direct participation of the respondents.
    ▪ It is subject to a low level of observation bias.
    ▪ The method is more accurate than the natural method, and its
      recordings can be reviewed later for further detailed study.
    The main limitation of this method is the expense of advanced
    technology.
  In a nutshell, the observation method helps in getting non-biased responses
  from the respondents and, therefore, provides accurate data for the research.
  However, this method does not allow the researcher to evaluate past data;
  it can be used to study only present scenarios.
• Survey method: The essence of the survey method is questioning
  individuals on a certain topic and describing their responses. It is
  used to test concepts, reflect the attitudes of people, establish customer satisfaction
  levels, conduct market segmentation research, and so on. Surveys can be conducted
  faster and more cheaply than other methods of data collection,
  such as the observation method. However, they are subject to the human bias of the
  respondents and their unwillingness to provide information. The survey method is
  further categorised into four types, i.e., the interview method, questionnaire method,
  sociometric method and schedule method.
  ◦ Interview method: The interview method is basically used for an in-depth
    study of the research problem. In this method, the researcher asks the
    respondents to react or speak on a particular topic or situation. The
    researcher is thus in a better position to study the attitudes, motivation level
    and opinions of the respondents. However, the researcher should keep certain
    things in mind while conducting interviews. Sometimes, it is very difficult
    for the researcher to ask direct or personal questions because the respondents
    are not willing to answer such questions. Therefore, the researcher should
    make the interview environment comfortable to get answers to personal
    questions from the respondents. An interview offers the researcher an
    opportunity to uncover information that is not otherwise accessible using
    techniques such as questionnaires and observations. However, this method
    can be affected by subconscious bias because interviewees
    will reveal only the information that they are prepared to give about their
    perceptions and opinions of events.
    For example, in a job interview, recruiters usually try to get information
    about the work attitude of prospective employees. For instance, if an
    organisation has a work culture of continuous crisis situations,
    the employer must ask the candidate whether he/she will be able to
    perform under conditions of stress and, if yes, how.
    The interview method is further divided into the following sub-methods:
    ▪ Structured interviews: In these interviews, the researcher prepares the
      questions and decides their sequence before the interview. This method
      is used for validating results when the number of participants is quite
      large. Structured interviews are conducted using a set of previously
      decided questions, and the same set of questions is administered to all
      the participants. Structured interviews should be used for research
      in areas where the literature is highly developed, or after using an
      observational or some other less structured approach.
      For example, a structured interview can include questions such as:
      1. As an HR professional, how will you handle a situation of understaffing?
      2. What were your major achievements in your previous job?
      3. Which manager in your previous jobs was the best according to you, and
         why?
      4. Which organisations do you dream of working in at some point in
         time, and why?
      It offers the following advantages:
      - Structured interviews are easy to replicate because a fixed set of
        questions is used.
      - They are fairly quick to conduct, and the results obtained are
        representative of a large population.
      It suffers from the following disadvantages:
      - This method is not flexible because a fixed interview schedule has
        to be followed.
      - The answers lack detail, and closed questions generate only
        quantitative data.
    ▪ Unstructured interviews: In these interviews, questions are not
      predefined. The researcher asks questions according to the situation
      and environment of the interview. This method is used for probing
      a participant in more detail so as to assess and judge his/her responses.
      Unstructured interviews are carried out when the researcher wants
      to explore detailed information about the thoughts or behaviour of
      interviewees.
      It offers the following advantages:
      - This method is more flexible, as questions can be adapted and
        changed based on the respondents’ answers.
      - It generates qualitative data through the use of open questions.
      It suffers from the following disadvantages:
      - It may be time-consuming.
      - It is cost-intensive, as it includes the costs of employing and training
        the interviewers.
      - The participants are interviewed one at a time.
    ▪ Individual in-depth interviews: In these interviews, the researcher
      interviews one respondent at a time. These interviews prove useful in
      getting in-depth knowledge of the topic under study from each respondent.
      However, individual in-depth interviews are time-consuming.
      It offers the following advantages:
      - The interviewer can obtain more complete answers by probing
        whenever he/she has doubts.
      - The researcher can analyse the body language of the interviewees.
      It suffers from the following disadvantages:
      - It is time-consuming and costly.
      - The respondents may be self-conscious and may not answer
        truthfully.
    ▪ Focus group interviews: In these interviews, the researcher
      interviews a group of respondents at a time. The groups of respondents
      can be further classified into consumer panels. A consumer panel has
      8 to 12 members, who are given a topic for discussion.
      They are informed about the motive for conducting the interview, the various
      aspects that will be covered during the discussion, and the guidelines
      of the interview. Consumer panels are used to collect in-depth data from
      a group of people about their experiences and perceptions related to a
      specific matter. Group interviews are more structured and are, thus, easy to
      evaluate. In consumer panels, people are selected randomly and introduced
      to a new product, flavour or advertisement. Thereafter, they are asked
      to discuss their experiences with each other. This helps the researcher
      assess the interviewees’ responses to the product.
      For example, a group of 10 members of a sales team is interviewed and
      asked for their opinions on a particular sales strategy.
      It offers the following advantages:
      - It helps in collecting the inputs of multiple persons in one session.
      - Group interaction can provide in-depth discussion and greater
        insight.
      It suffers from the following disadvantages:
      - This method requires a skilled facilitator to conduct the interview.
      - Only a limited number of questions can be asked in group interviews.
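The random selection of consumer panel members described above can be sketched as follows. This is an illustrative assumption about how such a draw might be organised, not a procedure given in the text: the volunteer pool, the function name form_panels and the choice to leave surplus volunteers unassigned are all hypothetical. Only the panel size range of 8 to 12 comes from the chapter.

```python
import random

MIN_PANEL, MAX_PANEL = 8, 12  # panel size range given in the text


def form_panels(pool, panel_size=10, seed=None):
    """Randomly split a pool of volunteers into panels of `panel_size` members.

    Volunteers left over after the last full panel are not assigned.
    """
    if not MIN_PANEL <= panel_size <= MAX_PANEL:
        raise ValueError("panel_size should be between 8 and 12")
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    shuffled = list(pool)
    rng.shuffle(shuffled)
    full_panels = len(shuffled) // panel_size
    return [
        shuffled[i * panel_size:(i + 1) * panel_size]
        for i in range(full_panels)
    ]


# Hypothetical pool of 25 volunteers, identified only by a number.
volunteers = [f"Volunteer {n}" for n in range(1, 26)]
panels = form_panels(volunteers, panel_size=10, seed=1)

for number, panel in enumerate(panels, start=1):
    print(f"Panel {number}: {len(panel)} members")
```

Shuffling the whole pool before slicing gives every volunteer the same chance of landing in any panel, and no one is placed in two panels.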


    ▪ Telephonic interviews: The researcher conducts these interviews over
      the telephone. The researcher looks up the telephone numbers of
      people and contacts them to get information. Telephonic interviews are
      convenient for the researcher, as they save travelling cost and time.
      For example, many organisations today prefer to conduct a telephonic
      interview before calling candidates for a face-to-face interview. The usual
      questions in a telephonic interview include:
      1. Briefly describe yourself.
      2. Why do you want to change your job?
      3. How did you come to know about this job?
      4. What particular attributes of this position do you find interesting?
      It offers the following advantages:
      - It is cheaper and faster than the personal interview method.
      - Since there is no face-to-face contact, the respondents may be willing
        to give information that they might be reluctant to provide in a personal
        interview.
      It suffers from the following disadvantages:
      - Surveys have to be restricted to respondents who have telephone
        facilities.
      - Designing effective telephonic surveys is a tedious and challenging
        task.
    ▪ Computer-assisted interviews: As the name suggests, the researcher
      conducts these interviews with the help of computers. There are two types of
      computer-assisted interviews, namely Computer-Assisted Telephonic
      Interviews (CATI) and Computer-Assisted Personal Interviews (CAPI).
      In CATI, a computer system is connected to the telephone of the
      interviewer. The questions appear on the computer screen, and the
      interviewer asks those questions over the telephone. The interviewer
      enters the responses of the interviewees into the computer system. In CAPI,
      interviewees can administer their interviews themselves with the help of
      software installed on their systems and can enter their responses directly
      into the computer. This method is used for conducting
      business-to-business research at various trade shows or conventions.
      It offers the following advantages:
      - It makes it possible to implement surveys in a shorter period of
        time and at lower cost.
      - Data collection is not limited by the geographical or time constraints of
        interviewees.
      It suffers from the following disadvantages:
      - It requires expert knowledge of computer-aided tools and technology.
      - Respondents or interviewers must have access to a computer system.
    Interviews provide in-depth knowledge of the topic and help in getting
    responses from a large population. Moreover, the telephonic interview and
    CAPI techniques involve less cost and effort. However, interviews
    do not rule out the influence of the interviewer on the respondent.
  ◦ Questionnaire method: A questionnaire is the written form of an
    interview; however, there is one difference between the two.
    It is easier to code a questionnaire than an interview because the
    questions in a questionnaire are mostly in quantitative form, while the questions
    in an interview are mostly in qualitative or exploratory form. A questionnaire
    is a research instrument comprising a series of questions used
    for gathering information from the respondents. It is generally
    used to collect useful information from a large population in a short period of
    time.
    It offers the following advantages:
    ▪ Questionnaires are cheaper and require less effort on the part of
      the questioner than verbal or telephonic surveys.
    ▪ Data can be collected from a large number of people.
    ▪ Quantifiable answers provided by a standard questionnaire are easy to
      compile and analyse.
    It suffers from the following disadvantages:
    ▪ Questionnaires are limited by the fact that the respondents must be able
      to correctly understand and respond to the questions.
    ▪ Designing a good questionnaire requires a lot of effort and skill.
    A detailed explanation of the questionnaire method is given in Chapter 7 of
    this book.
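The remark that a questionnaire is easier to code than an interview can be made concrete with a short Python sketch. It assumes a closed question answered on a five-point agreement scale; the codebook, the sample answers and the function names code_response and summarise are hypothetical and serve only to illustrate the idea of turning answers into numbers.

```python
# Hypothetical codebook for a five-point agreement scale.
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_response(answer):
    """Return the numeric code for an answer, or None if blank or unrecognised."""
    return CODEBOOK.get(answer.strip().lower())


def summarise(answers):
    """Code all answers; return (valid codes, count of non-responses, mean code)."""
    codes = [code_response(a) for a in answers]
    valid = [c for c in codes if c is not None]
    missing = len(codes) - len(valid)
    mean = sum(valid) / len(valid) if valid else None
    return valid, missing, mean


# Made-up answers to one closed question; the empty string is a non-response.
raw_answers = ["Agree", "strongly agree", "", "Neutral", "agree "]
valid, missing, mean = summarise(raw_answers)
print(valid, missing, mean)
```

An open interview answer has no such ready-made codebook, which is why interview data usually needs interpretation before it can be counted.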
  ◦ Sociometric method: The sociometric method/test enables the researcher to
    analyse a social group or workgroup by studying attractions and repulsions
    among group members. In this method, a social group is taken and its
    members are asked to perform some activities in a particular situation. This
    method is used to understand the interaction, communication and choices of
    individuals in a group. The researcher uses the sociometric test to find out the
    relationship pattern within a group. On the basis of the choices of individuals,
    a sociogram or sociomatrix is built to study these patterns. The process of the
    sociometric method involves three steps, which are as follows:
    ▪ Introduction: The researcher informs the respondents how to perform the
      activities.
    ▪ Gathering information: The researcher asks the respondents
      questions.
    ▪ Drawing conclusions: The researcher interprets the responses of
      the group members and draws conclusions from the data collected.
    This method offers an insight into the flow of information within a social
    group. However, it requires a skilled
    researcher. This method is generally used to describe and evaluate social
    status, social structure or social development by measuring the extent of
    acceptance or rejection among individuals in the group. In short, it uses a graphical
    representation to study social relationships and social problems.
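A sociomatrix of the kind mentioned above can be sketched in Python as a simple table of who chose whom. The member names and their choices below are invented for illustration, and counting the choices received is only one possible way of reading acceptance from the matrix.

```python
# Hypothetical responses: each member names the colleagues he/she would
# most like to work with.
choices = {
    "Asha": ["Bala", "Chitra"],
    "Bala": ["Asha"],
    "Chitra": ["Asha", "Bala"],
    "Dev": ["Asha"],
}

members = sorted(choices)

# Sociomatrix: rows are the choosers, columns are the members chosen.
sociomatrix = [
    [1 if chosen in choices[chooser] else 0 for chosen in members]
    for chooser in members
]

# Choices received by each member (column sums) indicate acceptance.
received = {
    member: sum(row[col] for row in sociomatrix)
    for col, member in enumerate(members)
}

# Mutual choices: pairs of members who chose each other.
mutual_pairs = [
    (a, b)
    for i, a in enumerate(members)
    for j, b in enumerate(members)
    if i < j and sociomatrix[i][j] and sociomatrix[j][i]
]

print(received)
print(mutual_pairs)
```

A sociogram presents the same information as a diagram, with members as points and choices as arrows; a member who receives no choices, such as Dev in this made-up data, appears as an isolated point.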
  ◦ Schedule method: The schedule method is similar to the questionnaire method,
    as both methods contain a set of questions in written form. The main
    difference between the two is that in the schedule method, enumerators are
    appointed to conduct the research. These enumerators meet the respondents
    personally and fill in the questionnaires themselves. Sometimes the responses
    can also be filled in by the respondents, but in the presence of an enumerator, who
    can guide them if they face any problem. The schedule method is the method most
    often used by government agencies, research institutes and big organisations to
    make extensive enquiries on a certain issue. Compared with the questionnaire
    method, it increases both the number of responses and the chances
    of getting accurate responses. However, it consumes more time and
    involves more cost, as enumerators have to be appointed.
    It offers the following advantages:
    ▪ It reduces non-response to a negligible level,
      as opposed to the higher level of non-response in the questionnaire
      method.
    ▪ Information can also be collected from illiterate respondents.
    ▪ The identity of the respondents is known, which ensures that the intended
      respondents have filled in the answers.
    It suffers from the following disadvantages:
    ▪ It is costlier than the questionnaire method, as it requires field workers.
    ▪ This method is difficult to use if the researcher wants to cover a wide area.

6.3.2 METHODS OF SECONDARY DATA COLLECTION


As secondary data has been collected in the past, it can be retrieved from various
sources by the researcher. Following are the sources for collecting secondary data:
• Company records: They provide information in the form of balance sheets and
  sales records. This information is used to perform trend analysis of the data and
  forecast the overall growth of a company in the future. It also helps in deciding
  whether the company is on the right track to achieve its vision.
  Company records are maintained every year by the company itself.
• The Internet: It gives information about previous research done on the same
  topic. The Internet also provides a large amount of research-related data from different
  sources.
• Print media: It offers information that has been published. Print media includes
  newspapers, magazines, books, research papers and journals. The data collected
  from print media is used to get an overview of the present market situation and
  experts’ opinions on different topics.
• Census and other government records: They include a large amount of data on each and every
  individual of the state. This data contains personal information about respondents. It
  is mostly used by the government and big organisations. This type of data helps
  in conducting research on a large scale.
• Indirect method: In this method, the researcher observes behaviours that
  occurred in the past using recordings, journals, magazines, industry publications,
  etc. This method consumes less time and is less expensive than the
  other methods. Suppose a researcher needs to know the sales of a particular brand
  in a store. In this case, the data can be collected from registers showing the sales of
  different products in the store.
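The store register example, together with the trend analysis mentioned under company records, can be illustrated with a small Python sketch. The register entries, brand names and function names are hypothetical, and the average month-on-month change used here is a deliberately crude indicator chosen for brevity, not a forecasting method recommended by the text.

```python
from collections import defaultdict

# Hypothetical register entries: (month, brand, units sold).
register = [
    ("2020-01", "Brand X", 40),
    ("2020-01", "Brand Y", 25),
    ("2020-02", "Brand X", 46),
    ("2020-02", "Brand Y", 22),
    ("2020-03", "Brand X", 52),
]


def monthly_sales(records, brand):
    """Total units of one brand per month, in chronological order."""
    totals = defaultdict(int)
    for month, item, units in records:
        if item == brand:
            totals[month] += units
    return dict(sorted(totals.items()))


def average_monthly_change(series):
    """Mean month-on-month change: a rough trend indicator only."""
    values = list(series.values())
    if len(values) < 2:
        return None
    return (values[-1] - values[0]) / (len(values) - 1)


sales_x = monthly_sales(register, "Brand X")
trend_x = average_monthly_change(sales_x)
print(sales_x, trend_x)
```

Nothing new is collected here: the researcher only rearranges records that already exist, which is what makes secondary data quick and inexpensive to use.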
Self Assessment Questions
3. The sociometric method/test enables the researcher to analyse a social group
   or workgroup by studying attractions and repulsions among the group
   members. (True/False)
4. The schedule method is the same as the sociometric method as both the methods
   contain a set of questions in the written form. (True/False)
5. Which of the following provide(s) information in the form of balance sheets
   and sales records?
   a. Company records b. The Internet
   c. Print media d. Census

6.4 FACTORS AFFECTING THE SELECTION OF DATA COLLECTION METHODS
The selection of an appropriate method of data collection depends on a number of
factors which are as follows:
• The objective of research: It plays an important role in determining the method of
  data collection. It defines the motive for conducting the research, which, in turn, helps
  in knowing the type of data (quantitative or qualitative) that needs to be collected.
• The time frame for research: This is the duration within which the research needs
  to be completed. If the time frame to complete the research is less, the researcher
  would use data collection methods that are less time-consuming. However, if
  more time is available, the researcher can use data collection
  methods that take more time but provide relatively authentic data, such as an
  in-depth interview used for an exploratory study.
• Availability of resources/funds: If the researcher has sufficient funds to conduct
  the research, he/she can use expensive methods of data collection; otherwise,
  he/she has to look for economical methods.
• Precision: It refers to the measure of how close a result comes to its true value. If
  data collection is not done with precision, the findings of the research would
  not be valid.
• Skills of the researcher: These can make or break the whole effort of data collection.
  Selection of a skilled researcher is necessary because if the researcher is unskilled,
  he/she may not be able to select the right method of data collection.
• Size of sample: Different types of data collection methods are suitable for different
  sample sizes. Therefore, the researcher must select the type of data collection
  method based on the sample size. For example, it would be highly inconvenient to
  administer a questionnaire to the participants of a census survey.
Self Assessment Questions
6. Selection of a skilled researcher is necessary because if the researcher is
unskilled, he/she may not be able to select the right method of data collection.
(True/False)
7. A researcher having sufficient funds to conduct the research can use expensive
methods of data collection; otherwise, they should look for economical
methods. (True/False)
8. If the time period to complete the research is________, the researcher would
use data collection methods that are less time-consuming.

Activity
Suppose you are given the responsibility by your organisation for conducting
research on the popularity of baby food brands among consumers. Which data
collection method would you prefer to select for conducting the research?

6.5 SUMMARY


• The process of collecting data for research purposes is known as data collection.
• Primary data is data that does not have any prior existence and is collected
  directly from the respondents.
• Data that was collected in the past but can be utilised in the present
  scenario/research work is known as secondary data.
• The observation method is a technique in which the observer collects data
  in the field by recording the behavioural patterns of
  people, objects and occurrences without communicating or questioning.
• A questionnaire is a research instrument comprising a series of
  questions used for gathering information from the respondents.
• The selection of an appropriate method of data collection depends on a number of
  factors, such as the objectives of research, resource availability, etc.

6.6 KEY WORDS


• Observation: The process or action of closely monitoring something or
  someone.
• Survey: A general investigation of the experiences or opinions of a group of
  persons so as to record the facts or features.
• Enumerator: A person employed to collect and record data, for example, in taking
  a census of the population.
• Respondent: An individual who replies to something, especially one
  who gives information for a questionnaire or responds to an advertisement.
• Census: The process of systematically acquiring and recording information about the
  members of a population.

6.7 CASE STUDY: BUYERSYNTHESIS’S PRIMARY DATA COLLECTION FOR ABC (A NON-PROFIT ORGANISATION)

BuyerSynthesis is a consultancy organisation that provides primary data collection
services. It provides consumer insights to its various clients which include consumer-
facing companies, creative agencies and non-profit organisations. This organisation
was established in 2002 and is located in Denver, USA. The organisation helps
its clients by creating more effective marketing strategies and plans by better
understanding their buyers.

BuyerSynthesis believes that consumers are the most important factor in any business.
Therefore, organisations must become consumer-oriented. BuyerSynthesis helps carry the
voice of an organisation’s consumers to the organisation concerned, which can then plan
its marketing strategies accordingly.

The BuyerSynthesis team carries out primary research projects alongside its clients’
in-house teams.
Data Collection Techniques

BuyerSynthesis worked with an organisation, ABC (a non-profit organisation). The
management of ABC wanted to research ways in which it could refresh its image so as
to attract new-generation people without losing its loyal customers. The organisation
also wanted to bridge the gaps with its core audiences.

In order to carry out data collection for this, BuyerSynthesis started with an internal
audit of the marketing department of ABC so that they may assess the challenges and
the resources of ABC. This was essential in order to find out what aspects of marketing
required refurbishing and whether the recommendations of BuyerSynthesis would
be feasible for them or not.

To begin its research, BuyerSynthesis roped in numerous participants from ABC’s audience
to carry out its focus group research. The focus groups were segmented using three
categories, viz., generation, visitation frequency and the time of their last visit to ABC.

The focus groups were moderated and they discussed the following aspects:
• What did ABC mean to them?
• What changes in the organisation would they like to see?
• What could be the effect of innovations on them?
All the participants also narrated their recent and memorable experiences.

Focus group research helped BuyerSynthesis in gaining information regarding who
ABC’s audience was and what attributes were important for them. BuyerSynthesis
also recognised that the organisational members felt a high degree of personal
attachment with ABC and they deeply appreciated it.

On the basis of the research, BuyerSynthesis made certain recommendations which
helped ABC in enhancing the relationship between the organisation and its clients
while, at the same time, keeping the costs under control. This research led ABC to develop
innovative audience engagement and delivery plans. In addition, the process of
planning infrastructure improvement was also expedited.

QUESTIONS
1. Describe the nature of BuyerSynthesis as an organisation.
(Hint: BuyerSynthesis is a marketing research organisation and it helps its clients
by creating more effective marketing strategies and plans by better understanding
their buyers.)
2. What were the major topics that were discussed within the focus groups created
by BuyerSynthesis for ABC?
(Hint: The major topics that were discussed within the focus groups included what
ABC meant to them, what changes in the organisation they would like to see, etc.)
3. What was the first step adopted by BuyerSynthesis for collecting data for ABC?
(Hint: Data collection started with an internal audit of the marketing department of ABC.)
4. How did focus group research help BuyerSynthesis in gaining information about
ABC?
(Hint: It helped BuyerSynthesis gain information about who ABC’s audience was and what
attributes were important for them.)
5. How did the recommendations made by BuyerSynthesis help ABC?
(Hint: They enhanced the relationship between the organisation and its clients while,
at the same time, keeping the costs under control.)

6.8 EXERCISE
1. Define data collection and describe the different types of data collection in detail.
2. Explain the different methods of primary data collection.
3. How is data collected using the schedule method?
4. Explain the different methods of secondary data collection.
5. Which factors are to be considered while selecting the methods of data collection?

6.9 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                                                        Q. No.   Answer
Data Collection                                              1.       Primary data
                                                             2.       Secondary data
Methods of Data Collection                                   3.       True
                                                             4.       False
                                                             5.       a. Company records
Factors Affecting the Selection of Data Collection Methods   6.       True
                                                             7.       True
                                                             8.       less

6.10 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
• Biddle, J., & Emmett, R. Research in the History of Economic Thought and Methodology.
• Detterman, D. (1985). Research Methodology. Norwood, N.J.: Ablex.
• Goddard, W., & Melville, S. (2011). Research Methodology. Kenwyn, South Africa:
  Juta & Co.

E-REFERENCES
• Jha, G., & Jha, G. (2020). 4 Data Collection Techniques: Which One’s Right for You?
  - Atlan | Humans of Data. Retrieved 8 April 2020, from
  https://humansofdata.atlan.com/2017/08/4-data-collection-techniques-ones-right/
• Data Collection Methods - Research-Methodology. (2020). Retrieved 8 April 2020,
  from https://research-methodology.net/research-methods/data-collection/
CHAPTER 7

Introduction to Questionnaire Designing
Table of Contents
7.1 Introduction
7.2 Concept of Questionnaire Designing
    7.2.1 Features of a Well-Designed Questionnaire
    Self Assessment Questions
7.3 Types of Questions in Questionnaire Designing
    7.3.1 Errors in Responses
    Self Assessment Questions
7.4 Steps of Questionnaire Designing
    Self Assessment Questions
7.5 Designing of an Effective Questionnaire
    Self Assessment Questions
7.6 Summary
7.7 Key Words
7.8 Case Study
7.9 Exercise
7.10 Answers for Self Assessment Questions
7.11 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:
• Describe the concept of designing a questionnaire
• Identify the different types of questions used in questionnaire designing
• List the steps used in questionnaire designing
• Discuss how to design an effective questionnaire

7.1 INTRODUCTION
In the previous chapter, you studied the concept of data collection. The chapter
discussed the types of data. The latter section of the chapter described the methods
of data collection. The chapter concluded with the explanation of the factors affecting
the selection of data collection methods.

Businesses operate on facts and data. Without data, an organisation would have no
idea of where it stands and where it needs to go. One of the simplest, cheapest and
quickest ways to gather data is to create questionnaires. The design of a questionnaire
determines the success of data collection.

Creating a questionnaire is an art as well as a science. A well-designed questionnaire
has a better chance of inviting responses than a badly crafted one. While creating a
questionnaire, you must consider various factors, such as how many questions to ask,
whether to ask close-ended or open-ended questions, how to keep the wording of
questions simple and effective, how to create questions that invite correct responses
from respondents, how to place questions in the questionnaire, and when and where to
distribute questionnaires.

In this chapter, you will study the concept of questionnaire designing. Next, you
will learn about the types of questions in questionnaire designing. Further, the chapter
will describe the steps of questionnaire designing. Towards the end, the chapter will
briefly discuss designing an effective questionnaire.

7.2 CONCEPT OF QUESTIONNAIRE DESIGNING


Questionnaires are often designed for collecting standardised information about
behaviour, opinion, experience or preference of a group of respondents. As compared
to other forms of surveys, questionnaires are cheap and require less effort. Some
advantages of the questionnaire are:
• Economical: The cost of creating and implementing questionnaires is very low.
• Wide coverage: They are best suited to covering a large number of people. They make it
  possible to contact many people who could not otherwise be reached.
• Rapidity: They provide speedy results.
• Easy to implement: They are easy to plan, create and administer.
• Less pressure on respondents: Respondents can take their time to answer questions.
• Uniformity: They do not allow much variation in recording responses.
• Greater validity: Responses are interpreted without any bias or prejudice by the
  recorder.
• Anonymity: They ensure anonymity of the respondents.
Questionnaires also have their own set of disadvantages, as follows:
• Limited response: Questionnaires are applicable only to educated respondents; they
  cannot be administered to illiterate or semi-literate respondents. There is also a
  high possibility of respondents skipping questions.
• Lack of personal contact: Even the best-designed questionnaire may fail to elicit
  a suitable response due to a lack of proper personal contact, which may result in
  failure to interpret questions or plain indifference.
• Poor response: Questionnaires sent by email generally have a very poor response rate.
• Incomplete entries: Respondents may often leave out some crucial fields, making
  it difficult for the recorder to interpret their responses.
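
Poor response and incomplete entries can both be measured once questionnaires come back. The Python sketch below is illustrative only: the function names, the question labels and the figures (200 questionnaires emailed, 4 returned) are assumptions made for this example and do not come from any real survey. It computes the overall response rate and, for each question, the share of returned questionnaires in which that question was left blank.

```python
def response_rate(sent, returned):
    """Share of distributed questionnaires that were returned."""
    if sent <= 0:
        raise ValueError("sent must be a positive number")
    return returned / sent


def skip_rates(responses, questions):
    """For each question, the fraction of returned questionnaires
    in which it was left blank (None or an empty string)."""
    total = len(responses)
    rates = {}
    for question in questions:
        skipped = sum(1 for r in responses if not r.get(question))
        rates[question] = skipped / total if total else 0.0
    return rates


# Hypothetical data: 200 questionnaires emailed, only 4 returned.
returned_forms = [
    {"age": "20 to 39", "income": None, "satisfaction": "Agree"},
    {"age": "40 to 59", "income": "", "satisfaction": "Neutral"},
    {"age": "20 to 39", "income": "Medium", "satisfaction": None},
    {"age": "13 to 19", "income": "Low", "satisfaction": "Agree"},
]

rate = response_rate(sent=200, returned=len(returned_forms))
skips = skip_rates(returned_forms, ["age", "income", "satisfaction"])

print(f"Response rate: {rate:.1%}")
for question, share in skips.items():
    print(f"{question}: left blank in {share:.0%} of returned forms")
```

A question with a high skip rate (here the hypothetical income question) is a candidate for rewording or repositioning before the questionnaire is used again.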
Thus, it is important to design a proper questionnaire to gather complete, relevant
and meaningful data.

As an example, consider the following two questionnaires. Which one is well-designed?

Figure 1 shows a sample of a well-designed questionnaire:

Thank you for taking the time to fill in this questionnaire; you will remain anonymous. I just
need a sample of an audience (in this case media students) to use as an example for a
research project.

Are you: Male / Female
How old are you? ____ Years ____ Months
Who do you live with at home (be specific please)?
Do you have a part-time job? YES / NO    PAID / VOLUNTARY
If so, please describe what you do.
Do you get pocket money/allowance? Yes / No
If Yes, how much do you get per week?
What do you spend it on, generally?
Please list your top 3 favourite foods.
1.
2.
3.
When was the last time you saw a film, what was it?
How many hours a day would you spend watching, reading or listening to:
TV
Radio
Internet
Print (magazines/newspapers)
Please list your top 3 TV programmes.
1.
2.
3.
List in order of preference (1 being your most preferred, 5 being the least preferred)
which genres (types of programmes) you watch:
Sports
Soap
Sitcom
Documentaries
Film
You are going to help organise some kind of music event for your age group: what types
of music/bands would you want to play?

Figure 1: A Well-Designed Questionnaire

(Source: cs3240 Team 13)

Figure 2 shows a sample of a bad questionnaire:

Figure 2: A Bad Questionnaire

(Source: cs3240 Team 13)

Figure 1 shows a well-designed questionnaire, whereas Figure 2 shows a bad
questionnaire. Figure 1 depicts the characteristics of a good questionnaire because
of the following:
• It deals with a specific topic and target audience (i.e., media students).
• Its structure is clearer than that of Figure 2.
• It consists of complete and clear directions, and important terms are clarified.
• Its significance is clearly stated in the covering paragraph of the questionnaire itself.
• It has fewer private questions than Figure 2, which makes the respondents more
  comfortable in answering.
• It is properly arranged and visually appealing.
Figure 2 depicts a bad questionnaire because its structure is not clear and it contains
questions that people may not be comfortable answering.

7.2.1 FEATURES OF A WELL-DESIGNED QUESTIONNAIRE

The development of a questionnaire is a complex and laborious process and
requires verification of its usefulness prior to its implementation. A well-designed
questionnaire must have the following characteristics in order to achieve the
predefined objectives of research:
• Interest
• Precision
Let us discuss each feature.
• Interest: Respondents are more likely to complete a questionnaire that is
  interesting to them. Here are some tips to create an interesting questionnaire:
  o Visually appealing: Present the questionnaire in an appealing and engaging
    format. Try adding colour and images to convey your organisation’s
    brand and personality. Make your questionnaire usable and intuitive, with no
    possibility of doubt or confusion.
  o Intriguing and engaging options: Instead of boring choices like very satisfied,
    satisfied, etc., try using more interesting language, such as I love you guys,
    We’re still friends, I’m a little upset, and so on.
  o Make it brief: Ask only relevant questions. Value the time of the respondents.
    They will thank you for it.
• Precision: The questions included in a questionnaire must be to the point.
  Questions are considered precise when to-the-point questions bring the researcher
  correct answers. Table 1 indicates some dos and don’ts of a questionnaire design:

Table 1: Dos and Don’ts of a Questionnaire Design

Dos
1. Clearly define target respondents, their age, education level, etc.
2. Decide if your questionnaire should be anonymous or not.
3. Carefully research and draft questions so that they meet the purpose of the
   questionnaire and get the desired data.
4. Start your questionnaire with the most relevant questions and then follow naturally.
5. Create engaging questions throughout the questionnaire.
6. Word questions so that they are clear and easy to understand.
7. Give space for respondents to write their comments on topics not covered in the
   questionnaire.
8. Pilot test the questionnaire before launch.
9. Use multiple formats of the questionnaire: pen and paper, online, email,
   telephonic, etc.

Don’ts
1. Avoid leading questions, which subtly prompt the respondents to answer in a
   particular way. Such questions result in false or slanted information.
   Examples:
   Leading question: You are satisfied with our customer service, aren’t you?
   Non-leading question: How satisfied are you with our customer service?
   Leading question: Do you always consume fast food?
   Non-leading question: How frequently do you consume fast food?
2. Avoid technical terms or jargon.
   Jargon question: Which feature would you like baked into our new product?
   Non-jargon question: Which feature would you suggest to be included in our new
   product?
3. Avoid using terms that the respondents may not be familiar with.
   Bad question: Do you have a history of carcinomic cancer in your family? Yes/No
   Good question: Do you have a history of lung/prostate cancer in your family? Yes/No
4. Avoid making the questionnaire too lengthy.
5. Avoid repetitive questions.
6. Avoid double-barrelled questions, i.e., asking two questions in one line. For
   example, do not ask: Did this project teach you to discipline your child and manage
   your home finances?

Self Assessment Questions


1. What is the benefit of a questionnaire over other methods of conducting a
survey?
a. Personal rapport with the recorder
b. Easy to convey feelings and emotions
c. Speedy results
d. None of these
2. Questionnaires with lengthy, well-formed questions elicit more response than
those with to-the-point questions. (True/False)
3. A question that subtly prompts the respondents to answer in a particular way
is called a ___________ question.
a. Double-barrelled question
b. Focussed question
c. Repetitive question
d. Leading question

7.3 TYPES OF QUESTIONS IN QUESTIONNAIRE DESIGNING

You can add various types of questions in a questionnaire, including:
• Open-ended or close-ended questions
• Fixed alternative or multiple choice questions
• Dichotomous questions
• Rating scale (continuum) questions
• Agree-to-disagree scale questions
• Rank-ordering questions
• Projective method questions
Let us discuss each type.
• Open-ended vs close-ended questions: Open-ended questions enable respondents
  to elaborate their answers and express what they really want to say. Such questions
  are usually asked during interviews and are most useful in exploratory research.
  In open-ended (or unstructured) questions, respondents give answers in their
  own words, whereas in close-ended (or structured) questions, they choose
  from a limited number of choices provided to them. Open-ended questions
  take longer to administer and analyse. Close-ended questions help all
  respondents to interpret the questions in the same manner, and respondents are
  likely to find them less stressful. Close-ended questions may be multiple choice,
  dichotomous (yes/no) or rating scale questions.
Table 2 helps you to understand open-ended questions by comparing them with
close-ended ones:

Table 2: Close-Ended Questions vs Open-Ended Questions

Close-Ended Questions                               Open-Ended Questions
Do you like working with us?                        Tell us about your experience with our
  o Yes                                             organisation so far.
  o No
How satisfied are you with your current job role?   What do you expect from this appraisal?
  o Very satisfied
  o Somewhat satisfied
  o Somewhat unsatisfied
  o Very unsatisfied
How satisfied are you with your manager?            How will you describe your relationship
  o Very satisfied                                  with your manager?
  o Somewhat satisfied
  o Somewhat unsatisfied
  o Very unsatisfied

• Fixed alternative or multiple choice questions: These questions provide multiple-
  choice answers. They are usually asked when the possible responses
  are limited and clear, such as age, gender, etc. For example:
  1. How old are you?
     □ 12 or younger
     □ 13 to 19
     □ 20 to 39
     □ 40 to 59
     □ 60 to 79
     □ 80 or older
  2. Which product would you like to see in the showroom?
     □ Sports Utility Vehicle
     □ Sedan
     □ Hatchback
     □ Convertible
     □ All of these
• Dichotomous questions: These are also close-ended questions which can be
  answered as Yes/No, True/False or Agree/Disagree. Examples:
  1. Have you ever purchased a product or service from our website?
     a. Yes    b. No
  2. Do you intend to buy a new car within the next six months?
     a. Yes    b. No
• Rating scale/continuum questions: These are close-ended questions where you
  can assign weights to each answer choice on a scale. The commonly used rating
  scales are:
  o Likert rating scale: It is typically a five-, seven- or nine-point scale used to
    measure respondents’ agreement with a variety of statements. For example:
    The website has a user-friendly interface.
    a. Strongly disagree
    b. Disagree
    c. Neutral
    d. Agree
    e. Strongly agree
  o Graphic rating scale: This is a line on which respondents place a cross ‘X’ at
    any point. For example:
    The customer service person used check-back to confirm orders.
    Strongly Disagree   1   2   3   4   5   6   Strongly Agree
  o Itemised rating scale: This scale is similar to the graphic scale, except that
    there are a number of categories which can be marked. For example:
    Evaluate each of the following attributes of our product by checking the
    appropriate box.

                     Excellent   Very Good   Good   Average   Below Average   Poor
    1. Quality           □           □         □        □            □          □
    2. Size              □           □         □        □            □          □
    3. Durability        □           □         □        □            □          □
    4. Brand name        □           □         □        □            □          □

• Agree-to-disagree questions: In this type of question, respondents answer on a
  scale running from agreement to disagreement. For example:
  The sales representative spent enough time to explain the product features:
  o Strongly agree
  o Somewhat agree
  o Neutral
  o Somewhat disagree
  o Strongly disagree
• Rank-ordering questions: In this type of question, the respondent is asked to rank
  a set of items against each other. For example:
  Rank the following in order of importance from 1 to 4, where 1 is most important
  to you and 4 is least important to you.
  Speed of service □
  Ease of parking □
  Cleanliness □
  Friendliness of staff □
• Projective test questions: Projective tests are designed to develop an in-depth
  understanding of hidden motivations. These questions allow respondents to
  ‘project’ their own thoughts or attitude in the response. These questions can use
  techniques, such as word associations or fill in the blanks. They are difficult to
  analyse and are usually used in exploratory research. For example:
  Complete the following sentences with the first word or phrase that comes into
  your mind.
  1. My father seldom ___________.
  2. Most people don’t know that I am afraid of ___________.
  3. When I was a child, I ___________.
  4. When encountering frustration, I usually ___________.
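
Close-ended answers are usually converted into numbers before analysis. The Python sketch below shows one possible way of coding the Likert and rank-ordering examples above. The 1 to 5 coding, the function names and all the responses are assumptions made for illustration. Averaging Likert codes treats the scale as if the gaps between categories were equal, which is a common simplification rather than a rule.

```python
from statistics import mean

# Assumed coding for the five-point Likert example (1 = lowest agreement).
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def likert_mean(answers):
    """Average Likert score; blank or unrecognised answers are ignored."""
    scores = [LIKERT_CODES[a] for a in answers if a in LIKERT_CODES]
    return mean(scores) if scores else None


def mean_ranks(rankings):
    """Average rank of each item across respondents (lower = more important)."""
    items = rankings[0].keys()
    return {item: mean(r[item] for r in rankings) for item in items}


# Hypothetical answers to "The website has a user-friendly interface."
website_answers = ["Agree", "Strongly agree", "Neutral", "Agree", ""]

# Hypothetical rankings from three respondents (1 = most important).
rankings = [
    {"Speed of service": 1, "Ease of parking": 4,
     "Cleanliness": 2, "Friendliness of staff": 3},
    {"Speed of service": 2, "Ease of parking": 4,
     "Cleanliness": 1, "Friendliness of staff": 3},
    {"Speed of service": 1, "Ease of parking": 3,
     "Cleanliness": 2, "Friendliness of staff": 4},
]

average_score = likert_mean(website_answers)
average_ranks = mean_ranks(rankings)

print("Average Likert score:", average_score)
for item, rank in sorted(average_ranks.items(), key=lambda pair: pair[1]):
    print(f"{item}: mean rank {rank:.2f}")
```

Open-ended and projective answers cannot be coded this mechanically, which is one reason why they take longer to analyse.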

7.3.1 ERRORS IN RESPONSES


While creating questionnaires, you should be aware of the following errors which
may occur during responses:
• Telescoping error: This error occurs when people remember recent events as
  being more remote than they are (backward telescoping), or distant events as
  being more recent than they are (forward telescoping). These errors may lead to
  faulty marketing campaigns:
  o In case of backward telescoping, respondents may overstate their intention
    to buy a replacement product, as they remember a recent purchase as distant.
  o In case of forward telescoping, respondents may inaccurately recall the time of
    their last purchase.
• Recall loss: This error occurs when people forget that an event occurred at all. For
  events that happened in the distant past, recall loss dominates.
• Differences in responses: Sometimes responses may be inconsistent or inaccurate
  due to the following reasons:
  o Different response styles
  o Different personal factors, such as laziness, tiredness, etc.
  o Different situations, such as a crowded atmosphere
  o Differences in the administration of the questionnaire, such as the wording of questions
  o Differences due to a lack of clarity

Self Assessment Questions

4. Which question below is an open-ended question?
a. Are you satisfied with this product?
b. Did it act as expected?
c. What more were you expecting?
d. Will you purchase it?
5. Dichotomous questions are a type of ____________ questions.
6. Which is not a rating scale used in questions?
a. Comparative
b. Multiple choice
c. Graphic
d. Itemised
7. You want to rate three items against each other, with 1 as most important and
3 as least important. Which type of question should you create?
a. Rank-ordering question
b. Agree-to-disagree question
c. Itemised question
d. Open-ended question
8. What do you call an error when people remember events as being more recent
than they are?
a. Recency error
b. Recall loss
c. Halo effect
d. Telescoping error

7.4 STEPS OF QUESTIONNAIRE DESIGNING

The process of designing a questionnaire involves ten steps, as illustrated in Figure 3:

1. Initial Considerations → 2. Define the Target Audience → 3. Identify the Data Required →
4. Decide the Content and Format of the Question → 5. Select the Type of Questions →
6. Make a Plan of Statistical Analysis → 7. Design the Sequence and Layout of the Question →
8. Add Administrative Instructions → 9. Pilot Test and Revise → 10. Finalise and Implement

Figure 3: Steps of Questionnaire Designing

Let us discuss each step.

1. Initial considerations: Decide the purpose of your questionnaire. To do so, get
   familiar with the subject, do a literature review, formulate a hypothesis and then
   define the information required to test the hypothesis.
2. Define the target audience: Identify the target audience. Depending on it, you
   can choose whether the questionnaire should be administered to males/females, a
   particular ethnic group or race, people belonging to a particular country, or
   a group defined by any other such criterion.
3. Identify the data required: Make a list of the information/data required.
4. Decide the content and format of the question: Develop the questions as required.
   Decide on their phrasing and response format. A well-phrased question results
   in more accurate and useful data, as it can be easily understood by the target
   audience.
5. Select the type of questions: Choose the type of questions to be used. In exploratory
   studies, open-ended questions are used, whereas in quantitative ones, close-ended
   questions are used.
6. Make a plan of statistical analysis: This should include the statistical tests which
   you intend to use. It is helpful to draw a dummy table with the data of interest.
   This will be helpful in determining the type of results you wish to get.
7. Design the sequence and layout of the question: Design the sequence of questions
   and the layout of the questionnaire. Start with easy questions and then go on to
   the more difficult questions. Sensitive questions should be placed somewhere in
   the middle. Avoid putting the most important questions last.
8. Add administrative instructions: Add instructions for the administrator. Also,
   add definitions of keywords for the ease of participants.
9. Pilot test and revise: Conduct a pilot test and make revisions as necessary.
10. Finalise and implement: Finalise the questionnaire. Ensure that each question is
    clear, simple and brief, and the layout is clear. Finally, launch it on the appropriate
    media formats.
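
The dummy table mentioned in step 6 is simply an empty table whose rows and columns are fixed before any data is collected. The Python sketch below prints one such shell. The function name, the two variables chosen (age group against satisfaction) and the blank-cell marker are assumptions made for illustration.

```python
def dummy_table(row_label, rows, columns, blank="____"):
    """Return an empty cross-tabulation as plain text.

    Every cell holds a blank marker, so the table only fixes which
    breakdowns the analysis will report, not any results.
    """
    first_width = max([len(row_label)] + [len(r) for r in rows]) + 2
    col_widths = [max(len(c), len(blank)) + 2 for c in columns]

    header = row_label.ljust(first_width) + "".join(
        c.ljust(w) for c, w in zip(columns, col_widths)
    )
    lines = [header]
    for r in rows:
        cells = "".join(blank.ljust(w) for w in col_widths)
        lines.append(r.ljust(first_width) + cells)
    return "\n".join(lines)


# Hypothetical plan: satisfaction level broken down by age group.
age_groups = ["13 to 19", "20 to 39", "40 to 59"]
satisfaction = ["Satisfied", "Neutral", "Unsatisfied", "Total"]

table = dummy_table("Age group", age_groups, satisfaction)
print(table)
```

If a question in the draft questionnaire does not feed any cell of a planned table, it is worth asking whether that question is needed at all.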

Self Assessment Questions

9. What is the first step to design a questionnaire?
a. Define the target audience
b. Decide the statistical tests to be used
c. Define the purpose
d. Select the type of questions to be used
10. You want to administer a questionnaire to different groups simultaneously.
Which design should you use for your questionnaire?
a. Latitudinal design
b. Longitudinal design
c. Cross-purpose design
d. Cross-sectional design
11. Where should you place sensitive questions in a questionnaire?
a. In the beginning
b. In the middle
c. In the last
d. They should not be placed
12. The most important questions should be asked at the end of a questionnaire.
(True/False)

7.5 DESIGNING OF AN EFFECTIVE QUESTIONNAIRE

Finally, the effectiveness of your questionnaire depends on its layout: respondents
should be able to easily read, understand and answer each question you ask. Below
are some tips to consider:
• Introduction: Start your questionnaire with a brief introduction that:
  o Informs respondents of the purpose of the questionnaire
  o Explains how the information collected will be used
  o Assures respondents that their personal information will remain confidential
• Typeface: Use a clearly legible typeface. Allow for some blank space between
  questions.
• No breaks: Avoid breaks between question text or instructions to turn pages. Keep
  all text together for each question.
• Instructions: Give instructions in italics or bold font to distinguish them from the
  questions.
• Answer format: Arrange answers vertically under each question. If you need to
  place any explanatory text or definition, then place it in parentheses immediately
  after the question.
• Logical: As far as possible, the questionnaire should reflect some natural flow
  of thoughts, a sequence of events, or a logical conversation, depending upon the
  subject matter.
• Sensitive information: Sensitive topics, whether personal or societal, should be
  explored appropriately through indirect questions and are best placed
  at the end of the survey.
• Pilot study: Always pilot the questionnaire either with some colleagues or with people
  from the target audience. This will help in detecting any flaws prior to the main
  survey.
• Grouping: Section headings may be used appropriately, and similar questions
  related to a particular topic should be grouped together.
• Neutral language: The terminology used should be such that it does not lead the
  respondents to answer in one particular way, i.e., positive or negative.
• Brevity: Make use of relevant, clear, concise and efficient questions. A few clear and
  concise questions achieve the desired results better than too many questions.

Self Assessment Questions

13. What should not be used while designing a questionnaire?
a. Introduction stating the purpose of the questionnaire
b. Legible typeface
c. Blank space between questions
d. Breaks between question text
14. What can you use to distinguish instructions from the questions?
a. Black regular font
b. Bold or italicised font
c. Black underlined font
d. Red font
15. When should the important topics ideally be covered in a questionnaire?
a. In the beginning
b. In the middle
c. In the last
d. Somewhere between the middle and the last
16. A good idea is to start a questionnaire with specific questions and then move
on to general topics towards the end. (True/False)

Activity
Develop a complete questionnaire for a survey that you will administer to fellow
students in your university. Develop a topic for your questionnaire, determine
the set of constructs you want to measure in the questionnaire and draft each
questionnaire item. Make sure of the following requirements:
• The questionnaire will be administered to fellow students, so it should be
  appropriate for this population.
• The questionnaire must include the following:
  o An introduction describing the purpose of the survey
  o At least 15 questions
  o At least three open-ended questions
  o At least three close-ended questions
  o At least one potentially sensitive question

7.6 SUMMARY
• Questionnaires are used to collect statistical data from a group of respondents.
  They are economical, quick, easily implementable and cover a wide range of
  population. However, they have the disadvantage of inviting limited response.
  Therefore, it is important to design effective questionnaires.
• A well-designed questionnaire has the following features:
  o It is visually appealing, intriguing, engaging and brief.
  o It is to the point and free of unnecessary questions.
• You can add various types of questions in a questionnaire, depending on the
  purpose, question content and responses required:
  o Open-ended questions
  o Fixed alternative or multiple-choice questions
  o Dichotomous questions
  o Rating scale/continuum questions
  o Agree-to-disagree questions
  o Rank-ordering questions
  o Projective method questions
• Questionnaires are vulnerable to the following types of errors from respondents:
  o Telescoping error
  o Recall loss
  o Differences in responses
• There are ten steps to design a questionnaire:
  1. Initial considerations
  2. Define the target audience
  3. Identify data required
  4. Decide question content and format
  5. Select the type of questions
  6. Make a plan of statistical analysis
  7. Design the sequence and layout of the question
  8. Add administrative instructions
  9. Pilot test and revise
  10. Finalise and implement
• While designing an effective questionnaire, you should consider the following
  tips:
  o Start with a brief introduction
  o Use a legible typeface
  o Avoid breaks between question text/instructions
  o Give instructions in italics or bold
  o Arrange answers vertically under each question
  o Give easy questions in the beginning, which cover important topics of interest
  o Go from generic to specific questions
  o Use logical flow of questions
  o Use a transitional statement when switching to different topic areas
  o Be crisp and comprehensive

7.7 KEY WORDS


€€ Questionnaire: This is a set of written questions with a choice of answers for the
purpose of a survey.
€€ Face validity: It indicates whether a questionnaire appears to measure what it
claims to.
€€ Content validity: Content validity refers to the extent to which a questionnaire
fully measures the construct of interest.
€€ Construct validity: It measures the extent to which a questionnaire captures a
specific trait.
€€ Concurrent validity: It measures the degree to which a questionnaire compares to
a currently existing criterion.
€€ Predictive validity: It measures the degree to which a questionnaire predicts a
future criterion.
€€ Test-retest reliability: It measures how close the results are when measured
successively under the same conditions.
€€ Internal consistency reliability: It measures how consistently the different items of a
questionnaire measure the same underlying construct.
€€ Open-ended questions: These are unstructured questions which enable
respondents to elaborate their answers and express what they really want to say.
€€ Close-ended questions: These are structured questions where respondents
choose from a limited number of options provided to them. These can be
multiple-choice, dichotomous or rating-scale questions.

7.8 CASE STUDY: QUESTIONNAIRE DESIGNING FOR MARKET RESEARCH IN THE PET CARE INDUSTRY
The pet care industry is growing rapidly in India. From 2012 to 2017, the
industry grew at a Compound Annual Growth Rate (CAGR) of 23%.
It is likely to grow at a CAGR of at least 20% up to 2021–22. Dog food
accounted for the majority share of 80% by value in 2017. The growth of the
pet care industry can be attributed to the following factors:
€€ Rise in disposable income
€€ Change in consumption patterns
€€ Urban lifestyle
In such a setup, various pet care and grooming companies have sprung up. Buddy-Pets
is one such venture. It assists people who would like to take their pets along
when they go on a vacation. The founder, Amit Kumar, got the idea for the start-up
when he needed to step out of town for a break and could not find a suitable
boarding facility for his 5-month-old Labrador, Lucy. So, he decided to set up a
start-up to help like-minded people.

Buddy-Pets helps pet owners plan their vacations by providing a 24x7 boarding and day care
facility for pets. It also helps them find the right grooming and pet supplies services. It
even provides a pet-friendly environment where owners can come with their pets to dine
in, socialise and play in a garden cafe.

Buddy-Pets faces the challenge of drawing up a strategic marketing plan that attracts
venture funding from the right target group. It wants to position itself in the
operational gap in the current pet care setup, which mostly consists of pet shops,
clinics and grooming centres with referral tie-ups for boarding establishments. It
also wants to study customer preferences regarding pet care facilities. Thus, Buddy-Pets
wants to conduct market research to:
€€ Analyse customer preferences for the desired pet care facility in major cities,
including Delhi, Mumbai and Bangalore
€€ Identify and evaluate opportunities available in these cities
€€ Develop implementable marketing strategies
€€ Evaluate competitive dynamics from traditional pet shops and boarding facilities
The purpose of this research is to design a marketing strategy for Buddy-Pets.

QUESTIONS
1. What considerations should you keep in mind while designing a questionnaire for
the market research?
(Hint: Initial considerations, target audience, type of design, data required, type of
questions, tips, etc.)
2. What steps will you take to design a questionnaire?
(Hint: Initial considerations, define the target audience, make a plan of statistical
analysis, etc.)
3. What challenge does Buddy-Pets face?
(Hint: The challenge of drawing up a strategic marketing plan that attracts venture
funding from the right target group)
4. What was the purpose of the research in the case?
(Hint: To design a marketing strategy for Buddy-Pets)
5. How did Buddy-Pets want to position itself in the existing market?
(Hint: To position itself as an operational gap filler by studying customer
preferences for pet care facilities.)

7.9 EXERCISE
1. What are the attributes of a well-designed questionnaire?
2. List any five dos and don’ts of questionnaire design.
3. Explain any five types of questions which may be included in a questionnaire.
4. Describe the differences between open-ended and close-ended questions.
5. Enumerate the steps in designing a questionnaire.

7.10 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                                            Q. No   Answer

Concept of Questionnaire Designing               1.      c. Speedy results
                                                 2.      False
                                                 3.      d. Leading question
Types of Questions in Questionnaire Designing    4.      c. What more were you expecting?
                                                 5.      close-ended
                                                 6.      b. Multiple choice
                                                 7.      a. Rank-ordering question
                                                 8.      d. Telescoping error
Steps of Questionnaire Designing                 9.      c. Define the purpose
                                                 10.     d. Cross-sectional design
                                                 11.     b. In the middle
                                                 12.     False
Designing of an Effective Questionnaire          13.     d. Breaks between question text
                                                 14.     b. Bold or italicised font
                                                 15.     a. In the beginning
                                                 16.     False

7.11 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
€€ Boynton P.M., Greenhalgh T. Selecting, Designing, and Developing Your Questionnaire.
BMJ. 2004 May 29;328(7451):1312–5.
€€ Edwards P., Roberts I., Clarke M. et al. Increasing Response Rates to Postal
Questionnaires: Systematic Review. BMJ. 2002 May 18;324(7347):1183.
€€ Leung W.C. How to Design a Questionnaire. Student BMJ. 2001;9:187–9.

E-REFERENCES
€€ Questionnaire Design - Guidelines on how to design a good questionnaire.
(2020). Retrieved 9 April 2020, from https://www.managementstudyguide.com/questionnaire-design.htm
€€ (2020). Retrieved 9 April 2020, from https://www.researchgate.net/publication/300010607_Questionnaire_Design
€€ (2020). Retrieved 9 April 2020, from https://imotions.com/blog/design-a-questionnaire/

CHAPTER 8

Data Processing and Analysis
Table of Contents
8.1 Introduction
8.2 Concept of Data Processing
8.2.1 Editing
8.2.2 Coding
8.2.3 Classification
8.2.4 Data Entry
8.2.5 Tabulation
Self Assessment Questions
8.3 Concept of Data Analysis
Self Assessment Questions
8.4 Measures of Central Tendency
8.4.1 Mean
8.4.2 Median
8.4.3 Mode
Self Assessment Questions
8.5 Measures of Dispersion
8.5.1 Range
8.5.2 Mean Deviation
8.5.3 Standard Deviation
Self Assessment Questions
8.6 Measure of Skewness
Self Assessment Questions
8.7 Measures of Relationship
8.7.1 Correlation Analysis
8.7.2 Regression Analysis
Self Assessment Questions
8.8 Different Charts Used in Data Analysis
Self Assessment Questions
8.9 Summary
8.10 Key Words
8.11 Case Study
8.12 Exercise
8.13 Answers for Self Assessment Questions
8.14 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:
€€ Explain the concept of data processing
€€ Describe the concept of data analysis
€€ Discuss the measures of central tendency
€€ Explain the measures of skewness
€€ Discuss the measures of relationship
€€ Describe various charts used in data analysis

8.1 INTRODUCTION
In the previous chapter, you studied questionnaire design. Now, you will
learn the significance and ways of processing and analysing the data collected
through such questionnaires.

Data in its raw form conveys little useful information. It needs to be organised
properly to extract the relevant information and make it fit for research. This is
done with the help of data processing, which involves various steps, including editing,
coding, classification, data entry and tabulation.

After processing data, you need to analyse it to find answers to the research
problem. You can use various statistical measures, such as the measures of central
tendency, dispersion, skewness and relationship, to analyse data. The selection of a
measure depends upon the type of the research problem. For example, if you wish
to find out the average marks of students of class IX in English, then you would use
the measures of central tendency. However, if you want to know the relationship
between the eating habits of children and problems of obesity, then you would
use the measures of relationship. It is important to note that no single statistical
measure is complete in itself to analyse a data series. Therefore, you should use an
optimum combination of different measures to address the problem at hand in the
most effective manner. Any carelessness in data processing and data analysis can
result in erroneous research findings. Moreover, these data tasks form a major part
of research and consume considerable time and effort of the researcher. Therefore, it
is advisable to remain extra vigilant while processing and analysing data to make
the research as reliable as possible.

The chapter begins by explaining the concept of data processing and data analysis.
Next, it talks about the measures of central tendency, including mean, median
and mode. Information is also provided about the measures of dispersion and
the measures of skewness. It also explains the measures of relationship, including
correlation analysis, regression analysis and multiple regression. Towards the end,
the chapter discusses other statistical measures used for data analysis.

8.2 CONCEPT OF DATA PROCESSING


Data processing is a process of converting raw data (quantitative or qualitative)
into a form which is fit for analysis. The process involves various steps shown in
Figure 1:

Editing → Coding → Classification → Data Entry → Tabulation

Figure 1: Steps of Data Processing

Let us now discuss each step of data processing in the following section.

8.2.1 EDITING
Editing refers to reviewing the collected data to check whether it is valid. Data
is examined to detect errors and omissions. Errors are corrected, omitted data is filled
in, and the data is prepared for further processing. The edited data is retained for analysis.

The editor is responsible for ensuring that the data is accurate, uniform, as complete
as possible and acceptable for tabulation. Editing helps in filtering out ambiguous
information that can create a problem at the time of data analysis. Ambiguous
information can be in the form of biased or incorrect responses in a questionnaire,
and such information needs to be deleted.

8.2.2 CODING
Coding is the process of assigning codes to the data in the form of symbols,
characters and numbers. It helps the researcher in interpreting the data and deriving
accurate results. If the data is generated with the help of a questionnaire, it can be
coded either at the time of framing the questionnaire or after collecting the data.
€€ The data that is already coded is known as precoded data.
€€ The data that is coded at the time of data processing is known as postcoded data.
Generally, a questionnaire may contain the following types of questions:
€€ Interval-scale questions: An interval scale is any range of values that have a
relevant mathematical difference but no true zero. Any question where the
respondent must enter a temperature value is an interval scale question because
degrees are interval measurements. The data collected through interval-scale and
closed-ended questions is an example of precoded data.
€€ Closed-ended questions: These questions are those for which a researcher
provides respondents with options from which to choose a response.
€€ Open-ended questions: These questions are those which require more thought
and more than a simple one-word answer. The data collected through open-ended
questions is an example of postcoded data.
Apart from these, the questionnaire can also include questions based on nominal
scale, ordinal scale and ratio scale.

Precoded data has certain advantages over postcoded data:
€€ It is easier to code.
€€ It reduces the effort in data processing.
€€ It leads to fewer chances of human error during data processing.
Let us understand the concept of coding with the help of an example.

The following questionnaire aims to measure the comfort level of women in a job after
marriage. Questions 1 to 5 are multiple-choice questions (close-ended questions),
questions 6 to 14 are interval-scale questions and questions 15–16 are dichotomous
questions.

Questions 1–5: Tick all the options that apply to you.

1. Age Group (years)   a. 20-30      b. 30-40         c. 40-50                 d. 50 and above
2. Marital Status      a. Married    b. Unmarried     c. Divorced              d. Please specify… (for example, engaged, widowed or other)
3. Children            a. None       b. One           c. Two                   d. More than two
4. Working Status      a. Working    b. Non-working   c. Retired from the job  d. Searching for the job   e. On leave
5. Work Type           a. Full-time  b. Part-time     c. Not Applicable

Questions 6–14: Give the ratings in the following questions as per your choice.
The rating of 1 means the lowest and 5 means the highest.

6. My work gives me satisfaction more than anything.                  1 2 3 4 5
7. I am able to manage my professional and personal life perfectly.   1 2 3 4 5
8. I go for holidays with my family frequently.                       1 2 3 4 5
9. I am able to reach office on time.                                 1 2 3 4 5
10. I reach my home on time in the evening.                           1 2 3 4 5
11. I complete most of my work projects on time.                      1 2 3 4 5
12. I play with my children daily.                                    1 2 3 4 5
13. I reach home late at night.                                       1 2 3 4 5
14. I most often extend the deadline for submission of my projects.   1 2 3 4 5

Questions 15–16: Answer in Yes or No.

15. I have kept a maid for household work.                 [ ] Yes   [ ] No
16. I have kept babysitters to look after my children.     [ ] Yes   [ ] No   [ ] Not Applicable
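The idea of precoding can be illustrated with a small sketch. The codebook below is purely hypothetical: the field names and the numeric codes are assumptions made for illustration and are not taken from the questionnaire above. The sketch only shows how each close-ended answer is replaced by its code, while numeric ratings are kept as they are.

```python
# Hypothetical codebook: each close-ended answer option gets a numeric code.
CODEBOOK = {
    "marital_status": {"Married": 1, "Unmarried": 2, "Divorced": 3, "Other": 4},
    "work_type": {"Full-time": 1, "Part-time": 2, "Not Applicable": 3},
    "has_maid": {"Yes": 1, "No": 0},
}


def code_response(response):
    """Replace each precoded answer with its numeric code."""
    coded = {}
    for question, answer in response.items():
        if question in CODEBOOK:
            coded[question] = CODEBOOK[question][answer]
        else:
            # Ratings on the 1-5 scale are already numeric and are kept as is.
            coded[question] = answer
    return coded


# One made-up respondent.
sample = {
    "marital_status": "Married",
    "work_type": "Part-time",
    "has_maid": "No",
    "q6_rating": 4,
}
coded_sample = code_response(sample)
print(coded_sample)
```

Answers to open-ended questions cannot be coded this way in advance; their codes have to be assigned after the responses are read, which is why such data is postcoded.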

8.2.3 CLASSIFICATION
Classification refers to categorising the coded questions into different segments as
per their relevance. This is done to simplify data processing and analysis to a great
extent. It is important to note that variables in a segment possess certain similar
characteristics. For example, demographic information is a segment that includes
variables, such as age, education and work experience of the respondents.

Questions in a questionnaire can be classified into qualitative and quantitative
questions:
€€ Qualitative questions: The classification of qualitative questions is called statistics
of attributes. These attributes cannot be measured directly in numbers. However,
qualitative attributes can be quantified. Examples of attributes are honesty and
attitude of the respondents.
€€ Quantitative questions: The classification of quantitative questions is called
statistics of variables. These variables can be expressed in numeric form, such as
demographic factors including age and income.
These variables can be grouped in the form of class intervals. A class interval contains
a lower limit and an upper limit. The difference between the two limits is called class
magnitude. For example, in the class interval 25-35, 25 is the lower limit, 35 is
the upper limit and the class magnitude is 10.

Class intervals can be inclusive or exclusive.
€€ Inclusive class intervals: If the value of the upper limit is included in the class,
it is an inclusive class interval. For example, the value 35 would be
included in the inclusive class 25-35. Thus, the inclusive class intervals would be
25-35, 36-45, 46-55, and so on.
€€ Exclusive class intervals: If the value of the upper limit is not included in the class,
then it is known as an exclusive class interval. For example, the value
35 would not be included in the class 25-35, but it would be included in the class 35-45.
Thus, the exclusive class intervals would be 25-35, 35-45, 45-55, and so on.
Another important term to remember during classification is frequency. Frequency
is the number of observations that fall within a class interval. Table 1 shows the
number of respondents in each age group:

Table 1: Frequency Distribution

Age Group (Class Interval) Number of Respondents


25-35 10
35-45 4
45-55 7
55-65 2

In Table 1, 10 respondents are in the age group of 25-35. Thus, 10 is the frequency
of the class interval 25-35. When class intervals and frequencies are represented in a
tabular form, as in Table 1, such a representation is known as frequency distribution.
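A frequency distribution with exclusive class intervals can be built as sketched below. The ages are made-up values, and the helper name `class_interval` and its default lower limit and width are assumptions chosen to match the 25-35, 35-45 pattern used in the text.

```python
from collections import Counter


def class_interval(age, lower=25, width=10):
    """Return the exclusive class interval (lower limit, upper limit) containing age.

    The upper limit is not part of the class, so an age of 35 falls in 35-45.
    """
    start = lower + ((age - lower) // width) * width
    return (start, start + width)


# Made-up ages of nine respondents.
ages = [25, 29, 34, 35, 41, 45, 52, 55, 60]

# Count how many respondents fall in each class interval.
distribution = Counter(class_interval(age) for age in ages)

for (low, high), frequency in sorted(distribution.items()):
    print(f"{low}-{high}: {frequency}")
```

With inclusive class intervals, the boundary value 35 would instead be counted in the class 25-35, so the choice of convention has to be fixed before counting.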

8.2.4 DATA ENTRY

After classifying data, the researcher enters the data into a computer. If wrong data is
entered, the results would be inaccurate. Various statistical and database
management software packages are available for data entry, such as:
€€ Bio-Medical Data Package (BMDP)
€€ Statistical Programming Language (S-PLUS)
€€ Statistical Analysis System (SAS)
€€ Statistical Package for Social Sciences (SPSS)
Of these packages, SPSS is widely used by researchers for data entry.

8.2.5 TABULATION
Tabulation refers to presenting data in the form of a table so that it can be easily
analysed. In this stage, the frequencies of the dataset are also computed.

There are three types of frequencies, namely absolute frequency, relative frequency
and cumulative frequency.
€€ Absolute frequency is the actual number of respondents who have given a
particular response.
€€ Relative frequency is the absolute frequency expressed in relation to the total
frequency. It is the percentage of all respondents who have given a particular
response.
€€ Cumulative frequency is the percentage of all respondents who have given a
response equal to or less than a particular value.

There are two types of frequency distributions, which can be put into a tabular form:
1. Two-way frequency distribution: In this type of frequency distribution, two
variables can be analysed at a time. This frequency distribution is also known as
cross tabulation.
2. One-way frequency distribution: In this type of frequency distribution, a single
variable is analysed.
Table 2 shows an example of the one-way frequency distribution.

Table 2: One-Way Frequency Distribution

Age Group          Number of Persons (Frequency     Relative          Cumulative
(Class Interval)   or Absolute Frequency)           Frequency (%)     Frequency (%)

20-30              10                               17.86             17.86
30-40              14                               25.00             42.86
40-50              20                               35.71             78.57
50 and above       12                               21.43             100.00
Total              56                               100               100

In Table 2, age group is taken as a variable and different types of frequencies are
calculated. As already discussed, absolute frequency is the actual number of respondents
in a class interval. Relative frequency can be calculated by dividing the absolute
frequency by the total frequency and multiplying by 100. For example, in case of the 20-30 age group,
the absolute frequency is 10 and the total frequency is 56; therefore, the relative
frequency is 17.86 (10/56 × 100). Cumulative frequency can be calculated by adding
the relative frequency of the present class interval (whose cumulative frequency
we are calculating) to the relative frequencies of all the preceding class intervals. For
example, in case of the 20-30 and 30-40 age groups, the relative frequencies are 17.86
and 25.00, respectively. Therefore, the cumulative frequency in the case of the 30-40
age group is 42.86 (17.86 + 25.00).
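The relative and cumulative frequencies of Table 2 can be reproduced with a short sketch. The absolute frequencies are taken from Table 2; the variable names are illustrative assumptions.

```python
from itertools import accumulate

# Class intervals and absolute frequencies from Table 2.
labels = ["20-30", "30-40", "40-50", "50 and above"]
absolute = [10, 14, 20, 12]

total = sum(absolute)

# Relative frequency: share of each class in the total, as a percentage.
relative = [100 * frequency / total for frequency in absolute]

# Cumulative frequency: running total of the relative frequencies.
cumulative = list(accumulate(relative))

for label, f, r, c in zip(labels, absolute, relative, cumulative):
    print(f"{label:<14}{f:>4}{r:>8.2f}{c:>8.2f}")
```

The cumulative frequency of the last class interval is always 100 per cent, which is a quick check that no class has been left out.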

Self Assessment Questions

1. _______________ helps in filtering ambiguous information that can create a
problem at the time of analysis.
2. Ambiguous information can be in the form of biased or incorrect responses
given by the respondents. (True/False)
3. The data that is coded at the time of data processing is known as ____________
data.
4. Data entry refers to presenting data in the form of a table so that it can be
easily analysed. (True/False)

8.3 CONCEPT OF DATA ANALYSIS
After processing data, a researcher analyses it to retrieve meaningful information.
Data analysis is broadly classified into two types, as shown in Figure 2:

Types of Data Analysis
Descriptive analysis: univariate analysis, bivariate analysis, multivariate analysis
Inferential analysis: parametric tests, non-parametric tests

Figure 2: Types of Data Analysis

Let us now discuss each type in detail:

€€ Descriptive analysis: In this type of data analysis, the distribution patterns and
characteristics of different types of variables are analysed. There are three types of
descriptive analysis:
zz Univariate analysis: This analysis studies a single variable. Examples include
measures of central tendency, dispersion and skewness. However, sometimes
these measures can also be used for bivariate and multivariate analysis.
zz Bivariate analysis: In this analysis, two variables are studied. One variable can
be classified as independent and the other as dependent. Examples are rank
correlation, simple correlation and simple regression.
zz Multivariate analysis: In this analysis, more than two variables are studied.
Among the variables being studied, there can be more than two independent
variables and more than one dependent variable. Examples include multiple
correlations and regressions.
€€ Inferential analysis: In this type of data analysis, significance tests are used to
check the validity of a hypothesis for studying a problem. There are two types of
significance tests:
zz Parametric tests: These tests make assumptions about the parameters of the
population from which a sample is derived. Examples of parametric tests
include z-test and t-test.
zz Non-parametric tests: These tests do not make any assumptions about the
parameters of the population from which the sample is derived. An example
of a non-parametric test is the Kruskal-Wallis test.

Self Assessment Questions

5. Simple regression is which type of data analysis?
a. Univariate analysis
b. Bivariate analysis
c. Multivariate analysis
d. Inferential analysis
6. Descriptive analysis uses tests of significance to check the validity of a
hypothesis for studying a problem. (True/False)

8.4 MEASURES OF CENTRAL TENDENCY


The measures of central tendency are used to study the distribution pattern of a
dataset. These measures give a central value that represents the large chunk of data
analysed. The central value is nothing but the average of data collected.

Figure 3 displays the various measures of central tendency:

Measures of Central Tendency
Mean (including weighted mean, geometric mean and harmonic mean)
Median
Mode

Figure 3: Measures of Central Tendency

Let us now discuss each measure.

8.4.1 MEAN
Mean represents the value calculated after dividing the sum of observations by the
total number of observations (n) taken. It is also known as arithmetic mean.

The following formula is used to calculate mean:

$\bar{X} = \frac{\sum X_i}{n}$

Where, $\bar{X}$ = Symbol for mean

$\sum X_i$ = Sum of all observations, i.e., $X_1 + X_2 + \dots + X_n$

n = Number of observations

Let us understand the concept of arithmetic mean with the help of an example.

Suppose you want to find the average weight of a group of five friends. Table 3
shows the weight of each person in the group:

Table 3: Weights of Five Friends

People Weight (kg)


Jenny 35
Robert 40
Ella 34
Andy 39
Eliza 42

The average weight of five friends can be calculated as follows:

$\bar{X} = \frac{\sum X_i}{n}$

Where, $\bar{X}$ = Average weight of five friends

$\sum X_i$ = Sum of the weights of five friends = 190

n = 5

$\bar{X}$ = (35 + 40 + 34 + 39 + 42)/5

$\bar{X}$ = 190/5

$\bar{X}$ = 38 kg

Therefore, the average weight of five friends is 38 kg.
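The same calculation can be checked with a few lines of code. This is a minimal sketch using the weights from Table 3; the variable names are illustrative.

```python
from statistics import mean

# Weights (kg) of the five friends from Table 3.
weights = {"Jenny": 35, "Robert": 40, "Ella": 34, "Andy": 39, "Eliza": 42}

# Arithmetic mean: sum of the observations divided by their number.
total_weight = sum(weights.values())
average_weight = total_weight / len(weights)

print(total_weight)     # 190
print(average_weight)   # 38.0

# The standard library gives the same result.
library_mean = mean(list(weights.values()))
```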

You can calculate different types of mean:

€€ Weighted mean: This mean is calculated after considering the weight attached to
each item. The formula used to calculate weighted mean is as follows:
$\bar{X}_w = \frac{\sum W_i X_i}{\sum W_i}$
Where, $\bar{X}_w$ = Symbol for weighted mean
$X_i$ = Value of the ith item
$W_i$ = Weight assigned to the ith item
$\sum W_i$ = Sum of all the weights
Example of Weighted Mean
A school grades its students by using weighted mean scores as follows:
15% weightage is assigned for homework, 15% weightage is assigned for
extracurricular activities, and 70% weightage is assigned for the examination.
Aditya scored 60 marks, 70 marks and 55 marks for homework, extracurricular
activities and in examination, respectively. Find the weighted score of Aditya if
the total score is 100.
The weights (0.15, 0.15 and 0.70) add up to 1, so the weighted mean is calculated as follows:
Weighted Mean ($\bar{X}_w$) = (0.15 × 60) + (0.15 × 70) + (0.70 × 55)
= 9 + 10.5 + 38.5
= 58
€€ Geometric mean: Geometric mean represents the nth root of the product of all
the values or observations involved in a research. The formula used to calculate
geometric mean is as follows:

$\bar{X}_g = \sqrt[n]{X_1 \times X_2 \times X_3 \times \dots \times X_n}$

Where, $X_1, X_2, \dots, X_n$ = Observations
n = Number of observations
Example of Geometric Mean
You want to calculate the geometric mean of four observations: 10, 12, 10 and 11.
The calculation of geometric mean is shown as follows:
$X_1, X_2, X_3, X_4$ = 10, 12, 10, 11
n = 4

$\bar{X}_g = \sqrt[4]{X_1 \times X_2 \times X_3 \times X_4}$

$\bar{X}_g = \sqrt[4]{10 \times 12 \times 10 \times 11}$

$\bar{X}_g = \sqrt[4]{13200} \approx 10.72$
Therefore, the geometric mean of the four observations is approximately 10.72.
€€ Harmonic mean: Harmonic mean refers to the reciprocal of the average of the
reciprocals of the values in a data series (or observations). The formula to calculate
harmonic mean is as follows:
$\bar{X}_H = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_n}}$
Where, $\frac{1}{X_1}, \frac{1}{X_2}, \dots, \frac{1}{X_n}$ = Reciprocals of observations 1, 2, ..., n
n = Number of observations
Example of Harmonic Mean
Calculate the harmonic mean of four observations: 10, 12, 10 and 11.
Harmonic mean is calculated as:
Reciprocals of the observations = 1/10, 1/12, 1/10, 1/11
n = 4
Sum of the reciprocals = $\frac{1}{10} + \frac{1}{12} + \frac{1}{10} + \frac{1}{11} = \frac{247}{660}$

$\bar{X}_H = \frac{4}{247/660} = \frac{660 \times 4}{247} = \frac{2640}{247} \approx 10.69$

Therefore, the harmonic mean of the four observations is approximately 10.69. The
harmonic mean is used for quantities that combine through their reciprocals, such as
average speed over equal distances, capacitance in series or resistance in parallel.
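The three types of mean described above can be compared in one short sketch. The helper `weighted_mean` is an illustrative function written for this example; `geometric_mean` and `harmonic_mean` come from Python's standard `statistics` module (Python 3.8 or later is assumed).

```python
from statistics import geometric_mean, harmonic_mean


def weighted_mean(values, weights):
    """Sum of weight times value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Aditya's marks for homework, extracurricular activities and examination.
aditya_score = weighted_mean([60, 70, 55], [0.15, 0.15, 0.70])

# The four observations used in the geometric and harmonic mean examples.
observations = [10, 12, 10, 11]
g_mean = geometric_mean(observations)   # fourth root of 13200, about 10.72
h_mean = harmonic_mean(observations)    # 2640/247, about 10.69
a_mean = sum(observations) / len(observations)

print(round(aditya_score, 2), round(g_mean, 2), round(h_mean, 2), a_mean)
```

For positive observations that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean. The example reflects this ordering.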

8.4.2 MEDIAN
Median is defined as the central or mid-value of a dataset. Median divides a dataset
into two halves – one half contains the values greater than the mid-value (or median)
and the other half contains the values less than the mid-value.

Before calculating median, you need to arrange the dataset in the ascending or
descending order. The formula to calculate median is as follows:

n = Number of observations

If n is an odd number:

Median = Value of the $\left(\frac{n+1}{2}\right)$th observation

If n is an even number:

Median = [Value of the $\left(\frac{n}{2}\right)$th observation + Value of the $\left(\frac{n}{2}+1\right)$th observation]/2

Let us understand the concept of median with the help of an example.

A group of 17 people gave the following ratings to a book on a 5-point scale (where
1 is the lowest rating and 5 is the highest rating):

2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4

Now you want to find the central rating by using median. To do so, arrange the
data in the ascending order, as follows:

1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5

Since the number of observations is odd, the following formula will be used to
calculate median:

Median = Value of the $\left(\frac{n+1}{2}\right)$th observation

Median = Value of the $\left(\frac{17+1}{2}\right)$th observation

Median = Value of the 9th observation

Median = 3

Therefore, the median rating for the book is 3.

Now, if n is an even number, then we calculate median as the simple average of the
middle two numbers. In other words, median is the simple average of the n/2th and
(n/2 + 1)th terms.

Now, if a group of 20 people gave their ratings to a movie on a 5-point scale as:

2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4, 1, 2, 3

Where, 1 is the lowest rating and 5 is the highest rating

Now, to calculate the average rating using median, all the 20 observations are
arranged in ascending order as:

1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5

Here, median is the average of the middle two values, i.e., the values at the 10th and 11th
positions. This is calculated as:
Median = (3 + 3)/2 = 3
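Both cases, odd and even n, can be handled by one small function. The sketch below is illustrative: the function name `median` and the variable names are assumptions, and the two rating lists are the ones used in the examples above.

```python
def median(values):
    """Return the middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1)/2)th observation, which is index mid when counting from 0.
        return ordered[mid]
    # Even n: average of the (n/2)th and (n/2 + 1)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


book_ratings = [2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4]
movie_ratings = [2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4, 1, 2, 3]

book_median = median(book_ratings)     # 9th of 17 sorted values
movie_median = median(movie_ratings)   # average of the 10th and 11th of 20 sorted values
print(book_median, movie_median)
```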

8.4.3 MODE


Mode refers to the value that has the highest frequency in a data series.

According to Croxton and Cowden, the mode of a distribution is value at the point
around which the items tend to be most heavily concentrated. It may be regarded as the most
typical of a series of values.

Let us learn to calculate mode with the help of an example. Suppose the marks of
five friends in a science paper are 70, 90, 50, 70, and 30. You want to find the mode
of their marks.

You need to find the highest frequency of the present data to calculate mode. Here,
the number having the highest frequency is 70 as it occurs two times; therefore, the
mode of students’ marks is 70.
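Finding the mode amounts to counting how often each value occurs. The sketch below uses the marks from the example; the function name `modes` is illustrative, and returning an empty list when every value occurs only once is a convention assumed for this sketch.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs only once, so no mode is reported.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


marks = [70, 90, 50, 70, 30]
marks_mode = modes(marks)
print(marks_mode)   # [70]
```

Returning a list rather than a single number makes it visible when a series has more than one mode, for example `modes([1, 1, 2, 2, 3])` gives two values.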

Mode is the most important statistic for nominal data, where values are names
rather than numbers. In such cases, there is no concept of centre because there are no
numbers. In contrast, when you are dealing with continuous variables, the probability
that all the observations in a data sample are different from one another is 1. Therefore, mode
cannot be used for continuous variables unless the data is first grouped into class intervals.

Mode is not considered a true measure of central tendency for three reasons:
i. A data series need not have only one mode, because several values
in the data series can share the highest frequency.
ii. Mode does not consider all the observations to arrive at the central value of the data
series. Therefore, the results of mode are not always reliable.
iii. It is possible that every observation in a series occurs only once. In such cases,
mode does not exist.

Let us summarise mean, median and mode as follows:
€€ Mean: Mean represents the average value in a dataset.
€€ Median: Median represents the middle value in a dataset.
€€ Mode: Mode represents the most common value in a dataset.
The measures of central tendency used for different types of variables are shown in
Table 4 as follows:

Table 4: Types of Variables and Measures of Central Tendency

Types of Variables             Best Measure of Central Tendency

Nominal                        Mode
Ordinal                        Median and Mode
Interval/Ratio (not skewed)    Mean, Median and Mode
Interval/Ratio (skewed)*       Median and Mode

* For skewed data, median is better than mean

Self Assessment Questions


7. Mean represents the value that you get after dividing the sum of observations
by the total number of observations taken. (True/False)
8. _________________ mean represents the nth root of the product of all the
values or observations involved in the research.
9. Median can be defined as a central value that divides a dataset into two halves.
(True/False)

8.5 MEASURES OF DISPERSION


Using different measures of central tendency, you can find out the central value of a data series, but
these measures do not explain the scattering of values around that central value.
The measures of dispersion can be used to study how widely the values are spread around
the central value. Figure 4 shows the measures of dispersion:

Measures of Dispersion: Range, Mean Deviation, Standard Deviation

Figure 4: Measures of Dispersion

Let us now discuss each measure of dispersion.

8.5.1 RANGE
Range represents the difference between the highest value and the lowest value in a
data series. It is considered a rough measure of variability because it depends only on
the two extreme values and tends to vary with the size of the data series. When the highest (H) and/or the lowest (L) data point in a
data series changes, the range also changes.

The formula used to calculate range is as follows:

Range = (Highest value of data series – Lowest value of data series)

Let us learn to calculate range with the help of the preceding example in which a
group of 17 people rated a book on a 5-point scale, where 1 is the lowest rating and
5 is the highest rating. The ratings given by the 17 people are as follows:

2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4

Now, you want to calculate the range for the data series.

To do so, you need to find the highest and lowest values of the data series. In the
present case,

Highest value of data series = 5

Lowest value of data series = 1.

Therefore, the range would be:

Range = (Highest value of data series – lowest value of data series)

Range = (5 – 1)

Range = 4

Therefore, the range of the ratings given by 17 people to a book is 4.
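
The same calculation can be sketched in a few lines of Python. This is only an illustration of the formula above; the helper name value_range is an assumption made for this example and does not come from the text.

```python
# Ratings given by the 17 people in the example above
ratings = [2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4]


def value_range(values):
    """Return the highest value minus the lowest value of a data series."""
    return max(values) - min(values)


print(value_range(ratings))  # 4
```

Because only the two extreme values enter the calculation, changing any of the other 15 ratings would leave the result unchanged.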

8.5.2 MEAN DEVIATION


Mean deviation represents the extent of deviation of values from the mean.

According to Clark and Schkade, average deviation is the average amount of scatter of the items in a distribution from either the mean or the median, ignoring the signs of the deviations. The average that is taken of the scatter is an arithmetic mean, which accounts for the fact that this measure is often called the mean deviation.

Mean Deviation is used to measure variability across a data series.

The formula used to calculate Mean Deviation is as follows:



Mean Deviation (MD) = ∑|Xi – X|/n

Where Xi = Individual observation

X = Mean/Median/Mode

n = Number of observations

With the help of MD, you can also calculate the coefficient of MD. The coefficient of
MD refers to the relative measure of dispersion that can be calculated by dividing
MD with mean/median/mode.

The formula to calculate the coefficient of mean deviation is as follows:

Coefficient of MD = MD/X

Where

X = Mean/Median/Mode

Let us understand the concept of MD and the coefficient of MD with the help of an
earlier example in which you calculated the average weight of five friends.
Table 5 shows the data used for calculating mean deviation:

Table 5: Weights of Five Friends

People    Weight (kg)    |Xi – X|
Jenny     35             |35 – 38| = 3
Robert    40             |40 – 38| = 2
Ella      34             |34 – 38| = 4
Andy      39             |39 – 38| = 1
Eliza     42             |42 – 38| = 4
Total                    14

The calculation of MD is as follows:

X = (35 + 40 + 34 + 39 + 42)/5 = 190/5 = 38

Mean Deviation (M.D.) = ∑|Xi – X|/n

M.D. = 14/5

M.D. = 2.8

Coefficient of Mean Deviation = M.D./X

= 2.8/38

= 0.074

Therefore, the mean deviation of the weights of the five friends is 2.8. In other words, the weights deviate from the average weight by 2.8 kg on average. The relative measure of dispersion (the coefficient of MD) is 0.074.
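
The worked example can be checked with a short Python sketch. It uses the five weights from Table 5 and takes the deviations about the mean; the function name mean_deviation is a hypothetical helper introduced only for this illustration.

```python
# Weights (kg) of the five friends from Table 5
weights = {"Jenny": 35, "Robert": 40, "Ella": 34, "Andy": 39, "Eliza": 42}


def mean_deviation(values):
    """Average of the absolute deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


values = list(weights.values())
mean = sum(values) / len(values)   # 38.0
md = mean_deviation(values)        # 14 / 5 = 2.8
coefficient_of_md = md / mean      # relative measure of dispersion

print(mean, md, round(coefficient_of_md, 3))
```

The coefficient is unit-free, which is what makes it suitable for comparing the dispersion of series measured in different units.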

8.5.3 STANDARD DEVIATION


Standard Deviation is used to calculate the scattering of values in a given dataset.
The symbol used to represent standard deviation is sigma (σ). Standard Deviation
(SD) is the square root of variance of a data series. The formula used to calculate SD
is as follows:

For research where the entire population is considered,

$$\sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$$

where σ, the SD of the population, is a parameter of the population.

For research where only a sample is considered,

$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

where S, the SD of the sample, is a statistic of the sample.

Also note that the square of SD is called variance.

Population variance = σ2 and Sample variance = S2

A sample statistic is used to estimate the corresponding population parameter. S2, computed with the divisor n – 1, is an unbiased estimate of σ2.

If the observations are grouped into a frequency table, then the formulae for SD and variance change as follows:

$$\sigma = \sqrt{\frac{\sum f (X - \bar{X})^2}{n}}, \qquad \bar{X} = \frac{\sum fX}{\sum f}, \qquad n = \sum f$$

Therefore,

$$\sigma^2 = \frac{\sum f (X - \bar{X})^2}{n}$$

The coefficient of SD can be calculated by dividing SD by the mean of the series. It is a relative measure of dispersion.

Let us understand the concepts of SD, the coefficient of SD, and the coefficient of variation with the help of an example.

Suppose you want to calculate the standard deviation of the weights of five friends shown in the preceding example. Table 6 shows the data used to calculate the standard deviation, the coefficient of standard deviation, and the coefficient of variation:

Table 6: Weights of Five Friends

People Weight (kg) (Xi) (Xi – X) (Xi – X)2

Jenny 35 −3 9

Robert 40 2 4

Ella 34 −4 16

Andy 39 1 1

Eliza 42 4 16

Total Σ(Xi – X)2 =46

The calculation of standard deviation is as follows:

X = (35 + 40 + 34 + 39 + 42)/5 = 38

σ = √[∑(Xi – X)2/n]

= √(46/5) = √9.2

= 3.033

The calculation of coefficient of SD is as follows:

Coefficient of Standard Deviation = SD/X

= 3.03/38

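= 0.0798

The coefficient of variation expresses the same ratio as a percentage:

Coefficient of Variation = 0.0798 × 100 = 7.98%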
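
As a cross-check, the calculation can be sketched with Python's standard statistics module. The sketch treats the five weights as an entire population (divisor n), as the worked example does, and also shows the sample version (divisor n – 1) for comparison. The variable names are assumptions made for this illustration.

```python
import statistics

# Weights (kg) of the five friends from Table 6
weights = [35, 40, 34, 39, 42]

mean = statistics.mean(weights)            # 38
population_sd = statistics.pstdev(weights)  # divisor n, as in the worked example
sample_sd = statistics.stdev(weights)       # divisor n - 1

coefficient_of_sd = population_sd / mean
coefficient_of_variation = coefficient_of_sd * 100  # as a percentage

print(round(population_sd, 3))       # 3.033
print(round(sample_sd, 3))           # 3.391
print(round(coefficient_of_sd, 4))   # 0.0798
```

The sample SD is larger than the population SD because the smaller divisor compensates for estimating the mean from the same data.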

Self Assessment Questions


10. __________________ is used to study the scattered value near the mean value
of a data series.
11. Which formula is used for calculating the range of a data series?
a. Highest value of series – Lowest value of series
b. Lowest range – Highest range
c. Lowest value of series – Highest value of series
d. None of these
12. Coefficient of Mean Deviation = __________________
13. The symbol used to represent Standard Deviation is __________________.

8.6 MEASURE OF SKEWNESS


A frequency distribution can be represented by drawing a curve or a graph. The measure of skewness is used to study the shape of a curve that can be drawn by plotting the data of a frequency distribution on a graph.

As you have learned in the preceding sections, through a measure of central tendency,
you measure the concentration of values of a data series in the middle of a frequency
distribution. Through a measure of dispersion, you measure the scattering of values
near the middle value of the data series.

It may be possible that two data series, which are widely different in nature and
composition, have the same mean and standard deviation. However, when you plot
the data of such series on graphs, you obtain curves with different shapes. This
shows that the measures of central tendency and dispersion are not sufficient to
study the frequency distribution of a data series because they do not talk about the
shape of the frequency distribution curves. Therefore, you need skewness to gain
an understanding of the different shapes of various frequency distribution curves.

The measure of skewness is used when the concentration of values of a data series is
more on a single side that is either positive or negative.
Skewness can be classified as positive skewness and negative skewness. This is shown in Figure 5:

[Figure: two asymmetric frequency curves, one positively skewed and one negatively skewed, each marked with the positions of the mode, median and mean]

Figure 5: Positive Skewness and Negative Skewness – Asymmetric Distribution

If the curve of a frequency distribution is symmetrical, then skewness = 0 and mean = median = mode.

[Figure: Symmetric Distribution, with the mean, median and mode coinciding at the centre of the curve]

Positive skewness implies that the curve has a longer tail on the right side, with the values concentrated on the left, whereas negative skewness implies that the curve has a longer tail on the left side, with the values concentrated on the right. Skewness is calculated by taking the difference of mean and mode. In positive skewness, the values of the three measures of central tendency are in the following order:

Mean (X) > Median (M) > Mode (Z)

However, in the case of negative skewness, the values of these three measures of
central tendency are in the following order:

Mean (X) < Median (M) < Mode (Z)

The formula to calculate skewness is as follows:

Skewness = X – Z

For moderately asymmetrical curves,

Mode = 3 Median – 2 Mean or

Z = 3M – 2X

The coefficient of skewness is the relative measure of skewness that can be calculated by dividing skewness by standard deviation.

The formula used to calculate the coefficient of skewness is as follows:

Coefficient of skewness, Sk = (X – Z)/σ

Substituting Z = 3M – 2X gives Pearson’s coefficient of skewness:

Sk = {Mean – (3 Median – 2 Mean)}/σ

= [Mean – 3 Median + 2 Mean]/σ

Sk = 3(Mean – Median)/σ

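For a moderately skewed distribution in which there is more than one mode or no mode, the mode-based formula cannot be used. In such cases, you need to calculate skewness and the coefficient of skewness using the median-based form of Pearson’s coefficient or the method of moments.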

Let us now calculate skewness and the coefficient of skewness with the help of an example. Suppose you want to calculate the skewness and the coefficient of skewness of the data given in Table 7:

Table 7: Ages of Five Friends

People    Age (Years)    (Xi – X)    (Xi – X)2
Jenny     18             0.2         0.04
Robert    17             –0.8        0.64
Ella      18             0.2         0.04
Andy      17             –0.8        0.64
Eliza     19             1.2         1.44
Total     ∑Xi = 89                   ∑(Xi – X)2 = 2.80

The mean of age is calculated as follows:

Mean of Age, X = ∑Xi/n

X = 89/5

X = 17.8

The median of age is calculated as follows, after arranging the ages in ascending order (17, 17, 18, 18, 19):

Median, M = Value of ((n + 1)/2)th observation

M = ((5 + 1)/2)th observation

M = 3rd observation = 18

Since the data contains two modes (17 and 18), you do not consider mode in this case.

The SD of age is calculated as follows:

σ = √∑ (Xi – X)2/n

= √2.80/5 ≅ 0.75

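Skewness is calculated as follows:

Skewness = 3(X – M) = 3(17.8 – 18) = –0.6

The coefficient of skewness is calculated as follows:

Coefficient of skewness = –0.6/0.75 = –0.8

The skewness in the ages of five friends is –0.6 and the relative measure of skewness is –0.8. The negative sign indicates a negatively skewed distribution, which agrees with the mean (17.8) being less than the median (18).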
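
The median-based calculation for the ages in Table 7 can be sketched in Python as follows. Evaluating 3(X – M) for these ages gives a negative value, because the mean lies below the median. The variable names are assumptions made for this illustration, and the population SD (divisor n) is used, as in the worked example.

```python
import statistics

# Ages (years) of the five friends from Table 7
ages = [18, 17, 18, 17, 19]

mean = statistics.mean(ages)      # 17.8
median = statistics.median(ages)  # 18 (middle value of the sorted ages)
sd = statistics.pstdev(ages)      # square root of 2.80 / 5

# Median-based form, used here because the data has two modes
skewness = 3 * (mean - median)
coefficient_of_skewness = skewness / sd

print(round(skewness, 2), round(coefficient_of_skewness, 1))
```

Rounded to one decimal place, the coefficient agrees with the hand calculation that rounds the SD to 0.75 first.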

Self Assessment Questions


14. Negative skewness implies that the longer tail of the curve lies on the right side. (True/False)

8.7 MEASURES OF RELATIONSHIP


The measures of relationship study the relationship between two or more variables in a given data series. When you study the relationship between two variables in a population, it is known as bivariate population. When you study more than two variables in a population, it is known as multivariate population. The relationship among variables can be of two types – correlation and cause and effect. Based on these relationships, there are two types of analysis, as shown in Figure 6:

[Figure: Measures of Relationship, comprising Correlation Analysis and Regression Analysis]

Figure 6: Measures of Relationship

Let us now discuss each type of relationship among variables.

8.7.1 CORRELATION ANALYSIS


Correlation analysis is used to study the association between different types of
variables. It measures the extent to which one variable is linearly related to the other
variables.

Different tools are used to study the correlation pattern between variables. These include rank correlation and simple correlation.

Let us discuss each tool.

• Rank correlation: Rank correlation refers to the correlation between two data series in which the data is ranked. Generally, it is used when the data is qualitative in nature. It was given by Charles Spearman. Therefore, it is also known as Spearman’s coefficient of correlation. It calculates the degree of relationship between the ranks of two variables.
The formula to calculate rank correlation is as follows:
$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where, di = Difference between the ranks of the ith pair of observations
n = Number of pairs of observations
• Simple correlation: Simple correlation is used to find the degree of linear relationship between two variables. It is the most commonly used measure to describe the relationship between two linearly related variables. It was given by Karl Pearson. Therefore, it is also known as Karl Pearson’s coefficient of correlation. Simple correlation can be of three types, as given in Figure 7:

[Figure: three scatter plots showing Positive Correlation, Negative Correlation and No Correlation]

Figure 7: Types of Simple Correlation

The strength of association between two variables depends on the calculated value of the correlation coefficient and the sample size. The value of the correlation coefficient lies between –1 and +1.
◦ If the value of the correlation coefficient is close to –1 and the sample size is sufficiently large, then there is a strong negative correlation between two variables. For example, if the coefficient of correlation is –0.8, then there is a strong negative association between variables.
◦ If the value of the correlation coefficient is close to +1 and the sample size is sufficiently large, then there is a strong positive correlation between two variables. For example, if the coefficient of correlation is 0.8, then there is a strong positive association between variables.
◦ If the correlation coefficient is not close to –1 or +1 and the sample size is sufficiently large, then there is weak correlation between two variables. For example, if the coefficient of correlation is 0.3 or –0.3, then the association between variables is weak.
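
The rank correlation formula given earlier can be illustrated with a small Python sketch. The text provides no worked data for it, so the two lists of ranks below are purely hypothetical (two judges ranking the same five items, with no tied ranks), and the function name spearman_rho is an assumption made for this example.

```python
# Hypothetical ranks given by two judges to the same five items (no ties)
ranks_judge_a = [1, 2, 3, 4, 5]
ranks_judge_b = [2, 1, 4, 3, 5]


def spearman_rho(ranks_x, ranks_y):
    """Spearman's rank correlation for untied ranks: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(ranks_x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


rho = spearman_rho(ranks_judge_a, ranks_judge_b)
print(round(rho, 2))  # 0.8
```

Here the rank differences are –1, 1, –1, 1 and 0, so the sum of squared differences is 4 and the coefficient is 1 – 24/120 = 0.8, which indicates strong agreement between the two sets of ranks.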

The formula used to calculate simple correlation is as follows:

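$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, S_x S_y}$$

or

$$r_{x,y} = \frac{\mathrm{Cov}(X, Y)}{\mathrm{SD}(X)\,\mathrm{SD}(Y)}$$

$$r = \frac{\frac{1}{n}\sum X_i Y_i - \bar{X}\bar{Y}}{\sqrt{\left(\frac{1}{n}\sum X_i^2 - \bar{X}^2\right)\left(\frac{1}{n}\sum Y_i^2 - \bar{Y}^2\right)}}$$

$$r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}$$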

Where, Xi = ith value of X variable

X = Mean of X variable

Yi = ith value of Y variable

Y = Mean of Y variable

n = Number of pairs of observations

Sx = Standard deviation of X

Sy = Standard deviation of Y

Let us learn to calculate simple correlation between two variables with the help of an
example. Suppose you want to study the correlation between the age and weight of a
group of people to find out the relation between the two. Table 8 shows the required
data:

Table 8: Ages and Weights of a Group of People

Number of Observations    Age (Xi)    Weight (Yi)    Xi2     Yi2     XiYi
1                         18          35             324     1225    630
2                         20          38             400     1444    760
3                         25          50             625     2500    1250
4                         30          65             900     4225    1950
5                         35          70             1225    4900    2450
6                         24          50             576     2500    1200
7                         17          35             289     1225    595
8                         16          39             256     1521    624
9                         49          76             2401    5776    3724
10                        45          72             2025    5184    3240
11                        50          85             2500    7225    4250
12                        18          32             324     1024    576
13                        20          34             400     1156    680
14                        25          57             625     3249    1425
15                        24          50             576     2500    1200
16                        17          35             289     1225    595
17                        16          39             256     1521    624
18                        23          44             529     1936    1012
19                        22          45             484     2025    990
20                        34          60             1156    3600    2040
21                        36          65             1296    4225    2340
22                        31          63             961     3969    1953
23                        43          70             1849    4900    3010
24                        44          72             1936    5184    3168
25                        16          35             256     1225    560
Total                     ∑Xi = 698   ∑Yi = 1316     ∑Xi2 = 22458    ∑Yi2 = 75464    ∑XiYi = 40846

The calculation of correlation is as follows:

Correlation (r) = (n∑XiYi – ∑Xi∑Yi)/√{[n∑Xi2 – (∑Xi)2] × [n∑Yi2 – (∑Yi)2]}

r = (25 × 40846 – 698 × 1316)/√{(25 × 22458 – 698 × 698) × (25 × 75464 – 1316 × 1316)}

r = 102582/√(74246 × 154744)

r = 0.96
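
The computation above can be reproduced with a minimal Python sketch that applies the last form of the formula directly to the 25 pairs of observations in Table 8. The function name pearson_r is an assumption made for this illustration.

```python
import math

# Ages (X) and weights (Y) of the 25 people listed in Table 8
ages = [18, 20, 25, 30, 35, 24, 17, 16, 49, 45, 50, 18, 20,
        25, 24, 17, 16, 23, 22, 34, 36, 31, 43, 44, 16]
weights = [35, 38, 50, 65, 70, 50, 35, 39, 76, 72, 85, 32, 34,
           57, 50, 35, 39, 44, 45, 60, 65, 63, 70, 72, 35]


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(ages, weights)
print(round(r, 2))  # 0.96
```

A value this close to +1 indicates a strong positive linear association between age and weight in this group.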

8.7.2 REGRESSION ANALYSIS


Correlation need not necessarily imply causality. But it can be said that if correlation
between any two variables is very high, then it might be indicative of causality, i.e.,
a situation where one variable denotes the cause and the other variable denotes its
effect. For example, if X and Y are correlated, the causal relationship inferred from
correlation between them may indicate that X is a cause of Y, Y is a cause of X, or
both X and Y are caused by some other variable Z, etc.

Correlations are employed through methods such as regression analysis. In common parlance, regression analysis (whether simple or multiple) is also termed as causal analysis. Causality between different variables can be understood using causal analysis.

Cause and effect analysis is measured using simple regression or multiple regression.

Regression is one step ahead of correlation in identification of relationship between two variables. This is because regression allows for prediction of values within the given data range. In simple language, if we know X, we can predict Y and if we know Y, we can predict X. This is possible with the help of an equation called regression equation.

The variable Y is generally termed as dependent or criterion variable and the variable X is termed as independent or predictor variable. Regression equation is used to generally predict the values of Y based on the values of X. However, it cannot be rightly said that Y is caused by X. Before making such an interpretation, it is imperative for the researcher to thoroughly understand the variables under study and the circumstances or context under which they operate.

The regression equation can be written as below:

Y = α + βX

Where,

Y represents scores on Y variable

X represents scores on X variable

α represents regression constant in the sample

β represents regression coefficient in the sample

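α and β are calculated with the following formulae:

$$\beta = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$

$$\alpha = \frac{1}{n}\left(\sum Y - \beta \sum X\right)$$
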
Simple regression analysis is useful in a number of situations. For example, it is used in analysing the relationship between the number of consumers (independent variable) and the product sales of a month (dependent variable). The regression equation is fitted to the data with the use of the least squares method.

Let us take an example with data on the number of consumers and monthly sales for 10 observations (N), as shown in Table 9:

Table 9: Customers and Monthly Sales

S. No.    No. of Consumers (X) (in ‘00)    Monthly Sales (Y) (in ‘000)    XY         X2
1         2.0                              12                             24.0       4.0
2         3.4                              6                              20.4       11.6
3         6.2                              7                              43.4       38.4
4         7.6                              11                             83.6       57.8
5         6.5                              13                             84.5       42.3
6         8.2                              33                             270.6      67.2
7         7.6                              31                             235.6      57.8
8         9.3                              22                             204.6      86.5
9         3.1                              36                             111.6      9.6
10        8.1                              24                             194.4      65.6
Total     62.0                             195                            1,272.7    440.7

Now, the regression equation is given by:

Y = α + βX

Using the formula for β,

β = (10 × 1272.7 – 62 × 195)/(10 × 440.7 – 62 × 62) = 637 ÷ 563 = 1.1314

Using the formula for α,

α = (1/10)(195 – 1.1314 × 62) = 12.485

Thus, the regression equation for the above data is given as:

Y = 12.485 + 1.1314X

With this equation, the values of Y (monthly sales) can be computed for any given
value of X (no. of customers) as depicted in Table 10 below:

Table 10: Monthly Sales for Given Number of Customers

S. No.    No. of Consumers (X) (in ‘00)    Y = 12.485 + 1.1314X       Monthly Sales (Y) (in ‘000)
1         2.0                              12.485 + 1.1314 × 2.0      14.75
2         3.4                              12.485 + 1.1314 × 3.4      16.33
3         6.2                              12.485 + 1.1314 × 6.2      19.50
4         7.6                              12.485 + 1.1314 × 7.6      21.08
5         6.5                              12.485 + 1.1314 × 6.5      19.84
6         8.2                              12.485 + 1.1314 × 8.2      21.76
7         7.6                              12.485 + 1.1314 × 7.6      21.08
8         9.3                              12.485 + 1.1314 × 9.3      23.01
9         3.1                              12.485 + 1.1314 × 3.1      15.99
10        8.1                              12.485 + 1.1314 × 8.1      21.65
Total     62.0                                                        195.00
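
The least squares fit can also be sketched in Python using the data from Table 9. Note that the code squares the X values exactly, whereas Table 9 rounds each X2 to one decimal place, so the coefficients may differ from the hand calculation in the third or fourth decimal place. The function name fit_line is an assumption made for this illustration.

```python
# Data from Table 9: consumers (in hundreds) and monthly sales (in thousands)
consumers = [2.0, 3.4, 6.2, 7.6, 6.5, 8.2, 7.6, 9.3, 3.1, 8.1]
sales = [12, 6, 7, 11, 13, 33, 31, 22, 36, 24]


def fit_line(xs, ys):
    """Return (alpha, beta) of the least squares line Y = alpha + beta * X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    beta = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    alpha = (sum_y - beta * sum_x) / n
    return alpha, beta


alpha, beta = fit_line(consumers, sales)

# Predicted monthly sales for each observed number of consumers
predicted = [alpha + beta * x for x in consumers]

print(round(alpha, 2), round(beta, 3))
print(round(sum(predicted), 2))  # equals the total of observed sales
```

The predicted values always add up to the total of the observed Y values, which is a property of the least squares line and explains why the total in Table 10 matches the total monthly sales of 195.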

Self Assessment Questions


15. ______________ is the study of the association between different types of
variables.
16. Causal analysis is used to study the cause and effect relationship of two
variables. (True/False)

8.8 DIFFERENT CHARTS USED IN DATA ANALYSIS


Graphical illustrations are visually appealing and bring life to a report so as to give
the target audience refreshing breaks from the monotony caused by texts and tables.

If the research report contains many descriptive tables, it can be made more readable and attractive if the most important tables are presented through graphs and diagrams. In the graphical presentation, facts and figures are gathered first and then they are depicted in the form of graphs and charts to present the statistical information.

The most frequently used graphs and charts include the following:
• Bar chart: A bar chart represents categorical data with the help of rectangular bars, plotted vertically or horizontally. The heights or lengths of rectangular bars are proportional to the values represented by them. The data can be in the form of absolute frequencies or relative frequencies.
Figure 8 below shows a bar chart to depict the relative frequency/percentage of shortages of anti-inflammatory medicines in the rural health organisations:

[Figure: horizontal bar chart of shortages of anti-inflammatory medicines; x-axis: Percentage of health clinicals (0 to 60); bars: Never 31, Occasionally 11, Frequently 3, Rarely 55]

Figure 8: Relative Frequency of Shortages of Anti-inflammatory Medicines in Rural Health Organisations in Bar Chart

• Pie chart: A pie chart is a circular statistical graphic, divided into segments to illustrate the numerical proportions/relative frequency of a number of items. The arc length of each segment shows the proportionate quantity represented by it. Pie charts provide a quick overview of the data presented to the readers. All segments of the pie chart should add up to 100%.
Figure 9 shows a pie chart to depict the relative frequency/percentage of shortages of anti-inflammatory medicines in the rural health organisations:

[Figure: pie chart titled “Percentage of health clinicals”; segments: Rarely 55%, Never 31%, Occasionally 11%, Frequently 3%]

Figure 9: Relative Frequency of Shortages of Anti-inflammatory Medicines in Rural Health Organisations in Pie Chart

• Histogram: A histogram is a graphical representation of the frequency distribution of a continuous variable whose values are grouped into bins. Histograms are very similar to the bar charts used to show categorical data. The main difference between the two is that the bars of a histogram are connected to each other (so long as there is no gap in the data) to represent continuous data, whereas the bars in a bar chart are not connected as they represent different categorical entities. Figure 10 shows a histogram to depict the frequency of sales effected by different salespersons in a month, indicating how many salespersons fall within a particular sales range:

[Figure: histogram titled “Sales Patterns of Salespersons”; x-axis: Sales Range, with bins (32, 182), (182, 332), (332, 482), (482, 632), (632, 782) and (782, 932); y-axis: number of salespersons]

Figure 10: Absolute Frequency of Sales Effected by Different Salespersons in a Month (n=60)
• Line graph: A line graph or a line chart is generally used to visualise the value of a particular variable over time. Line graphs are useful to show the trend of numerical data over a period of time. Two or more distributions (each depicted by a separate line) can be shown in one graph as long as the difference between them is easily distinguishable. They also make it possible to compare the distributions of different groups, for example, age distribution between males and females. Figure 11 shows a line graph to depict the frequency of daily number of patients being treated at the rural health organisations in District Y:

[Figure: line graph; x-axis: Day Number (1 to 12); y-axis: Daily Number of Patients Undergoing Treatment (0 to 25)]

Figure 11: Daily Number of Patients Being Treated at the Rural Health Organisations in District Y in Line Chart

• Box and whisker plot: This is a method of graphically representing different groups of numerical data through their quartiles. The box plots can also have vertical lines extending from the boxes (called whiskers) to indicate the variability outside the upper and lower quartiles. For example, variability between sales patterns effected in Area X and Area Y is shown through box plots in Figure 12:

[Figure: box plots titled “Representation of sales in different areas”, comparing Sales in Area X and Sales in Area Y; y-axis: Sales (0 to 900)]

Figure 12: Sales Patterns of Food Grains Effected in Area X and Area Y

Self Assessment Questions

17. Bars in a histogram are not connected as they represent different categorical entities. (True/False)
18. In a boxplot, vertical lines extending from the boxes are called ______________.

8.9 SUMMARY
• A researcher collects any type of data, quantitative and qualitative, in raw form. After that, he/she needs to process the collected data to make it fit for analysis.
• Editing refers to reviewing the collected data to check whether it is valid or not. This helps in eliminating the extra information and retaining the relevant matter for analysis.
• When the data is generated with the help of a questionnaire, it can be coded either at the time of framing the questionnaire or after collecting the data.
• Classification refers to categorising the coded questions into different segments as per their relevance.
• Tabulation refers to presenting the data in the form of a table so that it can be analysed easily.
• Descriptive analysis is used to study the relationship pattern among variables.
• Inferential analysis uses various types of test of significance to check the validity of a hypothesis for studying a problem.
• The measures of central tendency are used to study the distribution pattern of a dataset.
• Mean represents the value received after dividing the sum of observations by the total number of observations.
• Median refers to the central value of the given dataset.
• Mode refers to the value that has the highest frequency in a data series.
• The measures of dispersion refer to the measures that are used to study the dispersed value near the mean value.
• Standard deviation is used to calculate the scattering of values in a given dataset.
• The measure of skewness is used to study the shape of the curve that can be drawn by plotting the data of a frequency distribution on a graph.
• The measures of relationship study the relationship between two or more variables in a given data series.

8.10 KEY WORDS


• Base period: This refers to the period that acts as a benchmark for measuring economic and financial data.
• Hypothesis: This is a proposed explanation of a phenomenon, which needs to be tested.
• Measures of central tendency: These measures are used to find the central value of a data series.
• Measures of dispersion: These measures are used to find the scattering of values around the mean value of a data series.
• Measures of relationship: These measures are used to find the relationship between different variables.
• Univariate analysis: This is the analysis of a single variable.

8.11 CASE STUDY: QUALITY STANDARDS IN A SERVICE SECTOR COMPANY

TPR Inc. was a multi-cuisine restaurant based in India. It had several outlets in the
major Indian cities. The restaurant management wanted to find out if its various
outlets were meeting the established standards of quality and customer service. It
hired a consultancy firm for the purpose.

The consultants collected a large amount of data with the help of questionnaires, interviews, and observations in the restaurant’s outlets. Then, they carefully followed the data processing steps to analyse it and retrieve relevant and meaningful information from it.

While processing the responses in the questionnaires, they found that quite a large number of questions were left unanswered. Instead of ignoring such questions, they proceeded in a systematic manner. Each questionnaire comprised a series of interval questions, closed-ended questions and open-ended questions.

In case of interval questions, they gave a mid-value to the unanswered questions. In case of open-ended questions, they went back to the customers and requested them to fill in the answers.

After retrieving sufficient data from the questionnaires, they classified the collected
data. To do so, they combined customers’ responses from different cities and then
sub-grouped them according to their cities. Next, they formed a table to analyse the
relationship between customers’ satisfaction and the sales of the company:

Calculating the Correlation between Customer Satisfaction and Sales of the Company

Number of Observations    Customer Satisfaction (Xi)    Sales of Company (Yi)    Xi2     Yi2     XiYi
1                         4                             5                        16      25      20
2                         6                             6                        36      36      36
3                         7                             6                        49      36      42
4                         8                             4                        64      16      32
5                         9                             6                        81      36      54
6                         10                            9                        100     81      90
7                         8                             10                       64      100     80
8                         7                             2                        49      4       14
9                         1                             3                        1       9       3
10                        2                             4                        4       16      8
11                        9                             9                        81      81      81
12                        8                             8                        64      64      64
13                        7                             9                        49      81      63
14                        10                            11                       100     121     110
15                        6                             5                        36      25      30
16                        9                             12                       81      144     108
17                        8                             15                       64      225     120
18                        10                            12                       100     144     120
19                        9                             16                       81      256     144
20                        8                             20                       64      400     160
21                        10                            20                       100     400     200
22                        4                             6                        16      36      24
23                        5                             8                        25      64      40
24                        10                            14                       100     196     140
25                        10                            19                       100     361     190
Total                     185                           239                      1525    2957    1973

The correlation between the customers’ satisfaction and the sales of the company is as follows:

Correlation (r) = (n∑XiYi – ∑Xi∑Yi)/√{[n∑Xi2 – (∑Xi)2] × [n∑Yi2 – (∑Yi)2]}

r = (25 × 1973 – 185 × 239)/√{(25 × 1525 – 185 × 185) × (25 × 2957 – 239 × 239)}

r = 5110/8095.41

r = 0.63

Since the correlation coefficient is positive and well above zero, it indicates that the relationship between the customers’ satisfaction and the sales is positive and moderately strong.
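
The figure can be cross-checked with a short Python sketch that uses the covariance form of the formula, r = Cov(X, Y)/(SD(X) × SD(Y)), on the 25 observations in the table. The variable names are assumptions made for this illustration, and population (divisor n) measures are used throughout so that the divisors cancel.

```python
import statistics

# Customer satisfaction (X) and sales (Y) from the case study table
satisfaction = [4, 6, 7, 8, 9, 10, 8, 7, 1, 2, 9, 8, 7,
                10, 6, 9, 8, 10, 9, 8, 10, 4, 5, 10, 10]
sales = [5, 6, 6, 4, 6, 9, 10, 2, 3, 4, 9, 8, 9,
         11, 5, 12, 15, 12, 16, 20, 20, 6, 8, 14, 19]

n = len(satisfaction)
mean_x = statistics.mean(satisfaction)
mean_y = statistics.mean(sales)

# Population covariance: average product of deviations from the means
covariance = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(satisfaction, sales)
) / n

r = covariance / (statistics.pstdev(satisfaction) * statistics.pstdev(sales))
print(round(r, 2))  # 0.63
```

Both forms of the formula are algebraically equivalent, so this agrees with the ratio 5110/8095.41 obtained from the raw sums.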

Similarly, the consultants studied the relationship between different variables, such
as quality of service and customer satisfaction, quality of service and established
standards, and so on. Finally, they concluded that the satisfaction level of the
restaurant’s customers was positive and strong. However, the restaurant’s service
level was far behind the established quality standards.

QUESTIONS
1. What are the different steps of data processing used in the case study?
(Hint: The consultants used all the steps of data processing, that is, first they extracted the relevant data. Then, they classified and organised the information and studied the relationship between variables.)
2. Which type of measure is used in analysing the table and what type of analysis is used?
(Hint: The measure of relationship is used to analyse the table.)
3. What was done to unanswered questions of the questionnaires filled by customers?
(Hint: Unanswered questions were not ignored and a systematic procedure was followed to retrieve sufficient data.)
4. How was the data retrieved from the questionnaires collected and classified?
(Hint: The customers’ responses from different cities were combined and then sub-grouped according to their cities.)
5. How was the relationship between customers’ satisfaction and the sales of the company derived?
(Hint: By forming a table and calculating correlation between customers’ satisfaction and the sales of the company)

8.12 EXERCISE
1. Explain the different steps of data processing.
2. What are the different types of data analysis?
3. What are the measures of central tendency? Why are they used?
4. What are the measures of dispersion? Why are they used?
5. What do you understand by ‘skewness’? What is the measure of skewness? What does its calculated value indicate?
6. What is the purpose of causal analysis?

8.13 ANSWERS FOR SELF ASSESSMENT QUESTIONS

Topic                                     Q. No.    Answer
Concept of Data Processing                1.        Editing
                                          2.        True
                                          3.        postcoded
                                          4.        False
Concept of Data Analysis                  5.        b. Bivariate analysis
                                          6.        False
Measures of Central Tendency              7.        True
                                          8.        Geometric
                                          9.        True
Measures of Dispersion                    10.       Measures of Dispersion
                                          11.       a. Highest value of series – Lowest value of series
                                          12.       MD/X
                                                    Where, MD = Mean Deviation
                                                    X = Mean/Median/Mode
                                          13.       Sigma (σ)
Measure of Skewness                       14.       False
Measures of Relationship                  15.       Correlation analysis
                                          16.       True
Different Charts Used in Data Analysis    17.       False
                                          18.       whiskers

8.14 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
• Cahoon, M. (1987). Research Methodology. Edinburgh: Churchill Livingstone.
• Panneerselvam, R. (2014). Research Methodology. Delhi: PHI Learning.
• Welman, J., Kruger, F., & Mitchell, B. (2005). Research Methodology. Cape Town: Oxford University Press.

E-REFERENCES
• What are Mean, Median, Mode and Range?. (2020). Retrieved 9 April 2020, from https://searchdatacenter.techtarget.com/definition/statistical-mean-median-mode-and-range
• Introduction to Correlation and Regression Analysis. (2020). Retrieved 9 April 2020, from http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Multivariable/BS704_Multivariable5.html

CHAPTER 9

Concept of Hypothesis
Table of Contents

9.1 Introduction
9.2 Defining Hypothesis
9.2.1 Characteristics of a Good Hypothesis
9.2.2 Types of Hypotheses
Self Assessment Questions
9.3 Hypothesis Testing
9.3.1 Null Hypothesis and Alternative Hypothesis
9.3.2 Decision Rule
9.3.3 Two-Tailed Test
9.3.4 One-Tailed Test
Self Assessment Questions
9.4 Procedure of Hypothesis Testing
Self Assessment Questions
9.5 Summary
9.6 Key Words
9.7 Case Study
9.8 Exercise
9.9 Answers for Self Assessment Questions
9.10 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:
• Explain the concept of hypothesis
• Describe various types of hypothesis
• Explain the use of null and alternative hypotheses in hypothesis testing
• Differentiate between two-tailed and one-tailed tests
• Describe the procedure of hypothesis testing

9.1 INTRODUCTION
In the previous chapter, you studied the concept of data processing. The chapter
discussed the concept of data analysis. The latter sections of the chapter described
the measures of central tendency, measures of dispersion and measures of skewness.
The chapter concluded with the explanation of the different charts used in data
analysis.

A hypothesis is an assumption made about a population parameter, which is then
verified using a sample statistic. It is a very useful tool to solve various
research problems and issues. A researcher first forms a hypothesis about a problem
and then tests it to check its validity by using statistical measures. The procedure of
using test statistics to check whether a hypothesis is true is known as hypothesis
testing.

Suppose a researcher is asked to check whether an organisation’s new advertisement
has resulted in enhanced sales or not. In this case, the researcher would first form
the hypothesis that the new advertisement has no impact on the organisation’s sales.
This hypothesis is known as the null hypothesis. After that, the researcher would form
another hypothesis, known as the alternative hypothesis, which states that the new
advertisement has a positive impact on the organisation’s sales. Then, the researcher
would analyse the data to find the relationship between the new advertisement and
the organisation’s sales. If he/she finds a statistically significant relationship between
the new advertisement and the sales, he/she would reject the null hypothesis and accept
the alternative hypothesis.

In the field of research, the concepts of hypothesis and hypothesis testing hold a very
special place. The formation of a hypothesis helps the researcher remain focussed on
the research problem. In addition, it gives direction to the research project by clearly
defining the scope of research. Hypothesis testing assists the researcher in deriving
realistic results, as it takes into consideration the errors due to sampling.

In this chapter, you will learn about the concept of hypothesis and explore the
characteristics and types of hypothesis. The chapter also provides information about
hypothesis testing, null and alternative hypotheses, decision rules, one-tailed test
and two-tailed test.

9.2 DEFINING HYPOTHESIS


A hypothesis is a proposed explanation given for an observed situation. It is a specific
prediction, which can be tested, about what you expect to happen in a study or
research. It represents a tentative relationship between two or more variables,
which is predicted by the researchers. For example, a study designed to look at the
relationship between stress and common cold might have a hypothesis that states,
“This study is designed to assess the hypothesis that people with high stress levels
will be more likely to catch common cold after being exposed to the virus than
people who have low stress levels.”

Some definitions of hypothesis by experts are given below:

According to Mouton, hypothesis is: A statement postulating a possible relationship
between two or more phenomena or variables.

According to Guy, hypothesis is: A statement describing a phenomenon or which specifies
a relationship between two or more phenomena.
Both hypothesis and problem statements arise from a predefined situation. However,
there is a difference between the two. A problem statement cannot be tested directly,
whereas a hypothesis, which is derived from the problem statement, can be. The hypothesis
is formulated after a problem has been stated and the researcher has done a detailed
theoretical study of the problem. It is formulated to solve the problem by testing it
with the help of various tests of significance.

9.2.1 CHARACTERISTICS OF A GOOD HYPOTHESIS


A hypothesis is a suppositional (assumed or tentative) statement regarding
the relationship that exists between two or more variables. Following are the
characteristics of a good hypothesis:

• Clear topic: A hypothesis should clearly define its topic. The topic should also be
meaningful.
• Precise: A hypothesis should be clear and specific to facilitate a deep and
comprehensive study, and enable researchers to draw reliable inferences on its
basis.
• Testable: A hypothesis should be capable of being tested. It should be specific
enough that the test results can either support or contradict it.
• Limited in scope: A hypothesis should be limited in scope, as narrower hypotheses
are generally more testable.
• Consistent: A hypothesis should be consistent with, and based on, previous research.

9.2.2 TYPES OF HYPOTHESES


There are six types of hypotheses, classified on the basis of their derivation
and formulation, as shown in Figure 1:

Types of Hypotheses
   On the Basis of Derivation: Inductive Hypothesis, Deductive Hypothesis
   On the Basis of Formulation: Directional Hypothesis, Non-directional Hypothesis,
   Null Hypothesis, Alternative Hypothesis

Figure 1: Types of Hypotheses

On the basis of derivation, there are two types of hypotheses, which are explained
as follows:
• Inductive hypothesis: In inductive hypothesis, you move from specific
observations to broad generalisations. First, you observe a phenomenon.
Then, you identify a pattern in your observations. After that, you form a hypothesis
to study the pattern. Finally, you form a theory on the basis of your study of
the pattern. The inductive hypothesis is used to conduct qualitative studies of
subjective variables. In this type of hypothesis, you should ask open-ended and
process-oriented questions.
• Deductive hypothesis: In this type of hypothesis, you move from a general
statement to a specific, logical conclusion. You start from a theory and, based on
it, you make a prediction of its consequences. In other words, you predict what
the observations should be if the theory were correct. Finally, analysis is done
to arrive at a conclusion whether the theory is rejected or accepted with respect
to the problem. In deductive hypothesis, research goes from general theory to
specific observation. In this type of hypothesis, you should ask closed-ended and
outcome-oriented questions.
On the basis of formulation, there are four types of hypotheses, which are explained
as follows:
• Directional hypothesis: This hypothesis states the direction of relationship
between two variables. In directional hypothesis, you use terms, such as more
than, less than, negative and positive. An example of the directional hypothesis is:
In an organisation, women are more productive than men.
• Non-directional hypothesis: In this hypothesis, the direction of relationship
between two variables is not specified. For example, an organisation wants to
get feedback from its employees about their job satisfaction level. In this example,
the test result can be positive or negative depending on the job satisfaction of the
employees.
• Null hypothesis: This hypothesis states that there is no relation between two variables
under study. It is denoted by H0. Null hypothesis is used as the first statement
in a hypothesis, which you (or the researcher) want to reject. For example, a null
hypothesis is: There is no relation between the number of years of experience held
by an individual and his/her performance. This null hypothesis would be tested for
rejection because it is generally held that experience and performance are related.
Therefore, researchers are more interested in disproving or rejecting the null
hypothesis.
• Alternative hypothesis: This hypothesis states that there is a relationship between
two variables under study. It is denoted by H1. It is used as the second statement
in a hypothesis that you want to accept. For example, an alternative hypothesis
can be: There is a relation between the qualification of an individual and better job
opportunities. Since these two variables are believed to be related, you would want to
accept this statement.
Before studying the concept of null hypothesis and alternative hypothesis in detail,
one must understand how the process of hypothesis testing works.

The researchers initially state the null hypothesis and alternative hypothesis. After
this, they conduct certain specific tests and, at the end of the tests, they make statements
regarding the likelihood that the null hypothesis is false.

Researchers make probability statements about a hypothesis being false rather than
about it being true; i.e., they are interested in rejecting the null hypothesis rather than
accepting it, because when they accept the null hypothesis they never know how much
Type II error they might be making.

Self Assessment Questions

1. __________ is a specific prediction, which can be tested, about what you expect
to happen in a study or research.
2. Match the following:
   1. Inductive hypothesis      i. General statement to a specific conclusion
   2. Deductive hypothesis      ii. Second statement in a hypothesis
   3. Null hypothesis           iii. From specific observations to broad generalisations
   4. Alternative hypothesis    iv. First statement in a hypothesis

   a. 1–iv, 2–iii, 3–ii, 4–i
   b. 1–iii, 2–iv, 3–i, 4–ii
   c. 1–iii, 2–i, 3–iv, 4–ii
   d. 1–i, 2–ii, 3–iii, 4–iv
9.3 HYPOTHESIS TESTING


Hypothesis testing is a process to make decisions for research problems by using
sample data. It is a logical method of taking and validating decisions.

In hypothesis testing, you take two statements:


i. The first statement states that there is no relationship or no difference between two
variables under study. You take this statement (also known as the null hypothesis) as
true. A null hypothesis statement involves equality (≤, ≥, or =) about a population
parameter.
ii. The second statement states that there is a relationship or difference between
two variables under study. You provisionally take this statement (also known as the
alternative hypothesis) as false until the sample evidence leads you to reject the null
hypothesis. The alternative hypothesis contradicts the null hypothesis and must
not involve equality (<, ≠, >).
After that, you test the null hypothesis to accept or reject it. The null hypothesis is
tested with the help of a level of significance. The level of significance is the probability
of rejecting the null hypothesis in a statistical test when it is true. It is expressed as a
percentage and is chosen by the researcher before the test; the corresponding critical
value is read from the table of the relevant test statistic. Examples of tests are the
t-test, z-test and F-test. You will learn about these tests in detail in the upcoming
chapters.

After the null hypothesis and alternative hypothesis have been stated, the researcher
sets the decision criteria, for which he/she needs to state the level of significance of
the test. If the null hypothesis is true, the sample mean will, on average, be equal to
the population mean.

The most commonly used levels of significance in statistics are 1%, 5% and 10%.
For example, 5% is the most commonly used level of significance in behavioural
studies; it implies that 5% of the area under the normal curve is used as the rejection
region for testing the hypothesis, and the critical value for this area is taken from the
table of the respective test statistic. For instance, the z-values for various levels of
significance are shown in Figure 2:

[Figure: normal curve centred on the mean value, with z-values −2.58, −1.96, −1.64,
+1.64, +1.96 and +2.58 marked on the X-axis. 90% of the area lies between −1.64 and
+1.64, 95% between −1.96 and +1.96, and 99% between −2.58 and +2.58.]

Figure 2: z-Values for the Levels of Significance

In Figure 2, the areas are expressed in percentages and the corresponding z-values are
given on the X-axis. Table 1 provides the levels of significance and their z-values:

Table 1: z-values of Levels of Significance

Level of Significance    z-value
1%                       ±2.58 (for two-tailed)
1%                       ±2.33 (for one-tailed)
5%                       ±1.96 (for two-tailed)
5%                       ±1.64 (for one-tailed)
10%                      ±1.64 (for two-tailed)
10%                      ±1.28 (for one-tailed)
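
The z-values in Table 1 can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function name critical_z is an illustrative assumption and does not come from the text.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_tailed: bool) -> float:
    """Return the positive critical z-value for a level of significance alpha.

    In a two-tailed test the rejection area alpha is split equally between
    both tails, so each tail holds alpha / 2.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


# Print the critical values for the three common levels of significance.
for alpha in (0.01, 0.05, 0.10):
    two = critical_z(alpha, two_tailed=True)
    one = critical_z(alpha, two_tailed=False)
    print(f"{alpha:.0%}: two-tailed {two:.2f}, one-tailed {one:.2f}")
```

Rounded to two decimals, these values agree with the table; the one-tailed 5% value is often quoted more precisely as 1.645.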

In hypothesis testing, the level of significance is very important as it helps you
in rejecting or accepting a null hypothesis. You should be careful while
determining the level of significance for a problem/topic. The reason is that you
may reject a true hypothesis on the basis of a level of significance. If the level of
significance is 5%, it implies that the probability of rejecting a true hypothesis is 0.05
(max).

After the level of significance has been set, the researcher then proceeds to compute
the test statistic, which basically describes how far a sample mean is from the
population mean. The greater the value of the test statistic, the farther is the sample
mean from the population mean described in the null hypothesis. Thereafter, on the
basis of the value of the test statistic, a decision is made.

If, assuming the null hypothesis is true, the probability of obtaining a sample mean at
least as extreme as the one observed is less than 5%, then we reject the null hypothesis.
On the contrary, if this probability is more than 5%, then the null hypothesis is retained.

9.3.1 NULL HYPOTHESIS AND ALTERNATIVE HYPOTHESIS


Null hypothesis represents the first statement of a hypothesis that is assumed to be
true. This statement indicates that there is no relationship between two variables
under study and that any relation observed is purely due to chance.
Alternative hypothesis represents the second statement of a hypothesis that is
assumed to be false. This statement indicates that there is a relationship between the
two variables.

Let us understand these through the following example:

Example 1: Suppose a researcher believes that a patient who takes physiotherapy
sessions two times instead of three times in a week post operation would take longer to
recover. Assume that the average recovery time after operation is 7 weeks. Then:

H0: The average recovery time after operation is less than or equal to 7 weeks.

H1: The average recovery time after operation is greater than 7 weeks.

From the preceding example, it is clear that H0 is the opposite of the statement
the researcher wants to study. Researchers always test H0 for significance, not H1,
because they are usually interested in disproving H0.

H0 and H1 are in the descriptive form. The researcher must convert them into the
quantitative form to test them.

In Example 1, the quantitative forms of H0 and H1 are as follows:

H0: μ ≤ 7

H1: μ > 7

Where,

μ = Population mean

You can also formulate a hypothesis for testing with the help of a benchmark. This
benchmark is a numerical value with which you compare your results in order to
test the hypothesis. This is one of the most widely used methods for framing
null and alternative hypotheses because it represents them in
quantitative form. This makes hypothesis testing easier.

For example, in a school, the average weight of every class is 100 (population mean).
You consider all sections of class 10 as a sample (assume there are 5 sections of
class 10) and calculate their average weight (sample mean). Now, you want to check
whether the sample mean is equal to the population mean or falls below it. In this case,
H0 and H1 would be as follows:

H0: X̄ = 100

H1: X̄ < 100

Where,

X̄ = Sample mean

μp = Population mean = 100

The researchers assume that the null hypothesis is true and proceed further to find
out various methods/possibilities to solve the problem. They try to reject the null
hypothesis.

A hypothesis can never be right or wrong. Rather, it is judged by what you want to
analyse. If a hypothesis is framed in such a way that it can answer your problem,
then it would be right.

9.3.2 DECISION RULE


Decision rule refers to the process or criteria that a researcher uses to decide whether
to accept or reject the null hypothesis. For example, a researcher forms a hypothesis
that the mean age of a population is equal to 30. The researcher then collects a sample
of observations to test this hypothesis. He/she will then create a decision criterion. For
instance, the researcher may decide to accept the hypothesis if the sample mean is
in the range of 10% on either side of 30; i.e., 30 ± 10% = between 27 and 33. It means
that the researcher would reject the hypothesis if the mean of the sample was below 27 or
above 33.
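
The decision criterion in this example can be written as a small function. This is only an illustrative sketch; the function name decide and its parameters are assumptions, and the sample means passed to it are hypothetical values.

```python
def decide(sample_mean: float,
           hypothesised_mean: float = 30.0,
           tolerance: float = 0.10) -> str:
    """Apply the decision rule: accept H0 if the sample mean lies within
    `tolerance` (as a fraction) on either side of the hypothesised mean."""
    lower = hypothesised_mean * (1 - tolerance)   # 27 for the example above
    upper = hypothesised_mean * (1 + tolerance)   # 33 for the example above
    return "accept H0" if lower <= sample_mean <= upper else "reject H0"


# Hypothetical sample means, chosen only to exercise the rule.
for mean in (31.5, 26.0, 34.2):
    print(mean, "->", decide(mean))
```

A rule of this kind is fixed before the sample is examined, so that the decision does not depend on what the researcher hopes to find.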

It is important to note that different types of errors may occur while testing a
hypothesis. Therefore, the researcher should take into consideration the possibilities
of these errors while taking decisions.

The decision grid, which helps the researcher in taking decisions, is shown in
Figure 3:

               Accept H0                  Reject H0
H0 (true)      Correct Decision           Type I Error (α error)
H0 (false)     Type II Error (β error)    Correct Decision

Figure 3: Decision Grid

As per the grid shown in Figure 3, if H0 is true and it is accepted, then the decision
is correct. If H0 is false and it is rejected, then also the decision is right. However, if
the decision is wrong, two types of errors can occur, which are explained as follows:

• Type I errors: These errors occur when the researcher rejects a null hypothesis (H0)
that was actually true. In this case, the decision taken by the researcher
is wrong. Type I errors are also known as the first kind of error or false positive.
The probability of these errors is represented by α.
• Type II errors: Type II errors occur when the researcher accepts a null hypothesis
(H0) that should have been rejected. In this case, the decision taken by the
researcher is wrong. Type II errors are also known as the second kind of error or
false negative. The probability of these errors is represented by β. The probability of
rejecting the null hypothesis when it is false = 1 – β and is called the power of test.
For a given sample size, if you minimise Type I errors, Type II errors would increase,
and vice versa. Therefore, you have to be very careful while minimising one type of error.
You must remember that both the types of errors can be limited using an appropriate
sample size.
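
The trade-off between α and β, and the effect of sample size, can be illustrated numerically. The sketch below assumes a one-tailed z-test of H0: μ = μ0 against H1: μ > μ0 with a known population SD; the function name type_ii_error and all numbers (a true difference of 1 unit, SD of 4) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha: float, effect: float, sigma: float, n: int) -> float:
    """Return beta for a one-tailed (upper-tail) z-test.

    effect = true mean minus hypothesised mean (assumed positive),
    sigma  = population standard deviation (assumed known),
    n      = sample size.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)          # critical z-value
    # H0 is (wrongly) accepted when the z-statistic falls below z_alpha.
    return NormalDist().cdf(z_alpha - effect * sqrt(n) / sigma)


# Larger samples reduce beta at a fixed alpha.
for n in (16, 64, 144):
    beta = type_ii_error(alpha=0.05, effect=1.0, sigma=4.0, n=n)
    print(f"n = {n:3d}: beta = {beta:.3f}, power = {1 - beta:.3f}")

# A stricter alpha increases beta at a fixed sample size.
print(type_ii_error(0.01, 1.0, 4.0, 64) > type_ii_error(0.05, 1.0, 4.0, 64))
```

Under these assumptions, β falls as n grows, while lowering α from 5% to 1% at the same n raises β.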

9.3.3 TWO-TAILED TEST


Two-tailed test is a part of non-directional hypothesis that talks about the relationship
between two variables, but does not explain anything about the direction of the
relationship.

For example, a company produces tennis balls and has laid down that each ball
should weigh 55 grams in order to get good ratings. Samples are drawn on an
hourly basis and checked against this ideal weight. In a given hour, 11 balls are checked
randomly; their mean weight is 55.006 grams with an SD of 0.029 grams. If
the mean weight differs significantly from 55 grams at the 1% level of significance, the
production line is shut down. Let us see if the production line should be shut down
in this case.

Here,

μp = 55 g

H0: μp = 55 g

H1: μp ≠ 55 g

α = 1% = 0.01

Therefore, α/2 = 0.005

p = 1 – (α/2) = 0.995

Degrees of freedom of the sample = n – 1 = 11 – 1 = 10

From the t-table, tp = 3.169.

Now, calculate tc, using the sample size n = 11:

$$t_c = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

$$t_c = \frac{55.006 - 55}{0.029 / \sqrt{11}}$$

$$t_c = \frac{0.006}{0.008744} = 0.686$$

The two-tailed test can be shown on a curve, as in Figure 4:

[Figure: curve with rejection regions (Reject H0) beyond tp = −3.169 and tp = +3.169
and the region between them marked "Fail to Reject H0"; tc = 0.686 lies in this
central region.]

Figure 4: Two-tailed Test
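
The calculation for the tennis ball example can be checked with a short script. This is a minimal sketch: the function name t_statistic is an assumption, and the critical value 3.169 is simply taken from the t-table value quoted in the text rather than computed.

```python
from math import sqrt


def t_statistic(sample_mean: float, hypothesised_mean: float,
                sample_sd: float, n: int) -> float:
    """One-sample t-statistic: (sample mean - mu) / (s / sqrt(n))."""
    return (sample_mean - hypothesised_mean) / (sample_sd / sqrt(n))


t_c = t_statistic(sample_mean=55.006, hypothesised_mean=55.0,
                  sample_sd=0.029, n=11)
t_p = 3.169  # tabulated value for df = 10 at the 1% level, two-tailed (from the text)

# Two-tailed rule: reject H0 only if tc falls outside the range -tp to +tp.
decision = "reject H0" if abs(t_c) > t_p else "fail to reject H0"
print(f"tc = {t_c:.3f}, decision: {decision}")
```

Since the calculated value lies well inside the range of –3.169 to +3.169, the production line would not be shut down.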

In Figure 4, at the 1% level of significance, the t value would be ±3.169. If the calculated
value of test statistics lies in between the range of –3.169 and +3.169, then H0 would
be accepted. However, if the calculated value of test statistics lies outside this range,
it would be rejected. Here, the rejection region is equally divided between the two tails
of the distribution (0.005 in the lower tail and 0.005 in the upper tail). In this example,
the null hypothesis is accepted.

9.3.4 ONE-TAILED TEST


One-tailed test is a part of directional hypothesis that talks not only about the
relationship between two variables, but also the direction of relationship. It is
considered when you want to test a hypothesis on either positive or negative side of
a normal curve. When the hypothesis testing involves rejection region only on one
side of the sampling distribution, it is called a one-tailed hypothesis test.

For example, assume that the null hypothesis states that mean weight of people is 60
kg or more. In this case, the alternative hypothesis would be that the mean weight of
people is less than 60 kg. Here, the rejection region is located on the left side of the
sampling distribution, i.e., it comprises sample means that are sufficiently below
60 kg.

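The one-tailed test can also be shown on a normal curve, as in Figure 5:

[Figure: normal curve centred on the mean value, with the critical value –1.64 marked
on the left. The area to the right of –1.64 is the acceptance region (if sample mean lies
in this area, accept H0); the area to the left of –1.64 is the rejection region (if sample
mean lies in this area, reject H0).]

Figure 5: One-tailed Test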

In Figure 5, at the 5% level of significance, the z value would be –1.64. If the z value
calculated from the sample mean is greater than –1.64, then H0 is accepted. Else, H0 is
rejected.

The area of the rejection region in each tail is α in a one-tailed test and α/2 in a
two-tailed test. For example:
• In one-tailed test, if the level of significance is 5%, then α is 5%. In this case, the
value of test statistics would be determined at 0.05.
• In two-tailed test, if the level of significance is 5%, then the value of test statistics
would be determined at 0.025 (α/2).

Self Assessment Questions


3. __________ is a process to make decisions for research problems by using the
available data.
4. Null hypothesis is tested with the help of the levels of significance. (True/
False)
5. __________ errors are also known as the first kind of error or false positive.
6. Which one of the following tests is a part of non-directional hypothesis?
a. Two-tailed test
b. One-tailed test
c. Both a and b
d. None of these

9.4 PROCEDURE OF HYPOTHESIS TESTING


Hypothesis testing is a step-by-step process that starts with the formulation of
hypothesis and ends with decision-making. The steps involved in hypothesis testing
are shown in Figure 6:

1. State H0 and H1
2. State the Level of Significance and the Nature of Tail-test (Two-tail Test or One-tail Test)
3. Decide on the Type of the Test of Significance
4. State the Decision Rule
5. Calculate the Test Statistics
6. Take a Decision

Figure 6: Process of Hypothesis Testing

Let us now discuss the process of hypothesis testing in detail.

1. State H0 and H1: In this step, null hypothesis and the alternative hypothesis are
framed.
For example, a research organisation wants to perform a significance test to
determine whether the mean weight of Indian children aged 5 is 20 kg or not (as
claimed by reports). In this case, H0 and H1 would be as follows:
H0: μp = 20 kg
H1: μp ≠ 20 kg
Where, μp = Population mean
2. State the level of significance: This refers to deciding the level of significance (α)
for the hypothesis test and whether the test is two-tailed or one-tailed. The most
commonly used level of significance is 5%, because it is considered neither too
lenient nor too strict for accepting or rejecting a hypothesis.
3. Decide on the type of the test of significance: The test of significance is used to
check the hypothesis at a given level of significance. There are various types of
tests of significance, such as t-test, z-test, and F-test. The selection of a test depends
on various factors, such as the sample size, variance and type of population. For
example, you typically use the t-test when the sample size is less than 30 (and the
population standard deviation is unknown) and the z-test when the sample size is
30 or more.
4. State the decision rule: It refers to determining the conditions under which the null
hypothesis is accepted or rejected. If the decision rule is not determined correctly,
then there are chances of committing Type I and Type II errors. Therefore, you
should be careful while making the decision rule.
5. Calculate the test statistics: It refers to ascertaining the value of test statistics to
accept or reject the hypothesis.
6. Take a decision: It refers to either accepting or rejecting H0 on the basis of the
calculated value of test statistics. If the calculated tail probability is equal to or smaller
than the α value (in one-tailed test) or smaller than α/2 (in two-tailed test), then the null
hypothesis is rejected. However, if the calculated probability is greater than this value,
then the null hypothesis is accepted. Rejecting H0 may lead to Type I error, whereas
accepting H0 may lead to Type II error.
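
The six steps above can be traced in a short script for the children's weight example. This is a sketch under stated assumptions: the sample figures (36 children, mean 20.45 kg, SD 1.8 kg) are hypothetical and are not taken from the text, and a z-test is used because the assumed sample size is at least 30.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: state H0 and H1 (H0: mu = 20 kg, H1: mu != 20 kg).
mu_0 = 20.0

# Step 2: state the level of significance and the nature of the tail-test.
alpha = 0.05          # two-tailed, so alpha / 2 lies in each tail

# Step 3: decide on the test; a z-test, since the hypothetical n >= 30.
n, sample_mean, sample_sd = 36, 20.45, 1.8   # hypothetical sample figures

# Step 4: state the decision rule: reject H0 if |z| exceeds the critical z.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: calculate the test statistic.
z_calculated = (sample_mean - mu_0) / (sample_sd / sqrt(n))

# Step 6: take a decision (the tail probability leads to the same result).
tail_probability = 1 - NormalDist().cdf(abs(z_calculated))
decision = "reject H0" if abs(z_calculated) > z_critical else "fail to reject H0"

print(f"z = {z_calculated:.2f}, critical z = {z_critical:.2f}")
print(f"tail probability = {tail_probability:.4f} vs alpha/2 = {alpha / 2}")
print("Decision:", decision)
```

With these assumed figures the calculated z is smaller than the critical value, so the claim of a 20 kg mean weight would not be rejected.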

Self Assessment Questions

7. Hypothesis testing is a step-by-step process that starts with the formulation of
hypothesis and ends with __________.
8. What does μp stand for in hypothesis testing?
   a. Sample mean
   b. Population mean
   c. Level of significance
   d. Coefficient of correlation
9. Which one of the following is the commonly used level of significance for
minimising Type I and Type II errors?
   a. 10%    b. 12%
   c. 5%     d. 7%
10. Which of the following are the types of test of significance?
   a. t-test    b. z-test
   c. F-test    d. All of these

Activity
Prepare a PowerPoint presentation on hypothesis and hypothesis testing.

9.5 SUMMARY
• Hypothesis is a proposed explanation given for an observed situation.
• Inductive hypothesis is a type of derivation hypothesis where you move from
specific observations to broad generalisations.
• Deductive hypothesis is a type of derivation hypothesis in which you move from
a general statement to a specific conclusion.
• Directional hypothesis refers to the formulation hypothesis that states the
direction of relationship between two variables.
• Non-directional hypothesis refers to the formulation hypothesis where the
direction of the relationship between two variables is not specified.
• Null hypothesis refers to the hypothesis in which there is no significant relation
between two variables under study. It is denoted by H0. It represents the first
statement of a hypothesis that is assumed to be true.
• Alternative hypothesis states that there is a relationship between two variables
under study. It is denoted by H1. It represents the second statement of a hypothesis
that is assumed to be false.
• The decision rule refers to the criteria that the researcher sets for deciding whether
to accept or reject the null hypothesis.
• Two-tailed test is a part of non-directional hypothesis that talks about the
relationship between two variables under study, but does not explain anything
about the direction of the relationship.
• One-tailed test is a part of directional hypothesis that talks not only about the
relationship between two variables under study, but also the direction of
relationship.
• Hypothesis testing is a step-by-step process that starts with the formulation of
hypothesis and ends with decision-making.

9.6 KEY WORDS

• Alternative hypothesis: The hypothesis that states that there is a relationship
between two variables under study.
• Deductive hypothesis: The type of hypothesis that moves from a general
statement to a specific conclusion.
• Directional hypothesis: The hypothesis that states the direction of relationship
between two variables under study.
• Non-directional hypothesis: The hypothesis where the direction of relationship
between two variables under study is not specified.
• Null hypothesis: The hypothesis that says there is no relationship between two
variables under study.

9.7 CASE STUDY: ONE-TAILED HYPOTHESIS TESTING FOR TESTING PRODUCTION QUALITY
AM Pvt. Ltd. is a manufacturer which produces alternators. The Indian government
has a policy that an alternator can be sold in the market if it can run at less than
71.1° C under stress test assuming 95% confidence. To test the quality, the samples
are chosen randomly on a daily basis by the quality department of the company.
On a particular date (say, D), 7 samples were drawn having a mean of 71.3° C and a
standard deviation of 0.214° C. The quality department of the company wants to find
out if there is any quality issue or not.

For testing the above-stated problem, the null hypothesis and alternative hypothesis
are stated as:

H0: μp ≤ 71.1°

H1: μp > 71.1°

Now, the researcher (quality department) finds the p and α values:

p = 95% = 0.95

Level of significance α = 1 – p = 0.05

Degree of freedom (df) for the sample = 7 – 1 = 6

Thereafter, the researcher finds out the value of t-statistic at 95% confidence and df
at 6 using t-table, which comes out to be 1.943.

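Now, the researcher calculates the value of t-statistic as:

$$t_c = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

$$t_c = \frac{71.3 - 71.1}{0.214 / \sqrt{7}}$$

$$t_c = \frac{0.2}{0.0809} = 2.47$$

The researcher draws a graph to represent the result:

[Figure: curve with 95% of the area in the "Fail to Reject" region to the left of
tp = 1.943 and 5% in the "Reject" region to its right; tc = 2.47 lies in the rejection
region.]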

It can be seen that the calculated t-value (tc) lies in the rejection region. Therefore, the
researcher rejects the null hypothesis. Rejecting the null hypothesis means that
the sample was not acceptable and it can be stated that there is some issue in the
production of alternators at AM Pvt. Ltd., which it must find out and resolve.

QUESTIONS
1. What would be the value of tc if the sample had 49 alternators in it?
(Hint:

$$t_c = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

$$t_c = \frac{71.3 - 71.1}{0.214 / \sqrt{49}}$$

$$t_c = \frac{0.2}{0.03057} = 6.54$$)

2. What would be the value of tc if the standard deviation was changed to 0.851?
(Hint:

$$t_c = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

$$t_c = \frac{71.3 - 71.1}{0.851 / \sqrt{7}}$$

$$t_c = \frac{0.2}{0.3216} = 0.622$$

In this case, the null hypothesis would have been accepted.)
3. State the policy of the Indian government for selling an alternator in the market.
(Hint: If an alternator can run at less than 71.1° C under stress test assuming 95%
confidence.)
4. How were the samples collected to test the quality?
(Hint: Samples were chosen randomly.)
5. What did the quality department of the company want to find out?
(Hint: The department wanted to find if there was any quality issue or not.)
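
The t-values in this case study and in the hints to Questions 1 and 2 can be recomputed with a few lines of code. This is a minimal sketch; the function name t_statistic and the scenario labels are illustrative assumptions, while the numbers come from the case study.

```python
from math import sqrt


def t_statistic(sample_mean: float, hypothesised_mean: float,
                sample_sd: float, n: int) -> float:
    """One-sample t-statistic: (sample mean - mu) / (s / sqrt(n))."""
    return (sample_mean - hypothesised_mean) / (sample_sd / sqrt(n))


# (standard deviation, sample size) for the base case and the two variations.
scenarios = {
    "Case study (s = 0.214, n = 7)": (0.214, 7),
    "Question 1 (s = 0.214, n = 49)": (0.214, 49),
    "Question 2 (s = 0.851, n = 7)": (0.851, 7),
}

for label, (sd, n) in scenarios.items():
    t_c = t_statistic(71.3, 71.1, sd, n)
    print(f"{label}: tc = {t_c:.2f}")
```

The comparison shows how the statistic grows with a larger sample and shrinks with a larger standard deviation. Note that the tabulated value 1.943 applies only to 6 degrees of freedom, so a sample of 49 would need a different value from the t-table.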

9.8 EXERCISE
1. Describe the hypothesis and its types in detail.
2. What are the characteristics of a good hypothesis?
3. Explain the hypothesis testing in detail.
4. Explain the following terms:
a. Null hypothesis
b. Two-tailed test
c. Decision rule

9.9 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                              Q. No.   Answer
Defining Hypothesis                1.       Hypothesis
                                   2.       c. 1–iii, 2–i, 3–iv, 4–ii
Hypothesis Testing                 3.       Hypothesis testing
                                   4.       True
                                   5.       Type I
                                   6.       a. Two-tailed test
Procedure of Hypothesis Testing    7.       decision-making
                                   8.       b. Population mean
                                   9.       c. 5%
                                   10.      d. All of these

9.10 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
• Cahoon, M. (1987). Research Methodology. Edinburgh: Churchill Livingstone.
• Detterman, D. (1985). Research Methodology. Norwood, N.J.: Ablex.
• Panneerselvam, R. (2014). Research Methodology. Delhi: PHI Learning.

E-REFERENCES
• Hypothesis Testing - Statistics Solutions. (2020). Retrieved 9 April 2020, from
https://www.statisticssolutions.com/hypothesis-testing/
• Chapter 11: Fundamentals of Hypothesis Testing - Statistics for LIS with Open
Source R. (2020). Retrieved 9 April 2020, from
http://www.statisticsforlis.org/chapter-11-fundamentals-of-hypothesis-testing/

CHAPTER 10
Parametric Tests
Table of Contents
10.1 Introduction
10.2 Types of Hypothesis Testing
Self Assessment Questions
10.3 Parametric Tests
Self Assessment Questions
10.4 One-Sample Test – Different Situations in Which One-Sample Test is Used
10.4.1 Exploring Case-I
10.4.2 Exploring Case-II
10.4.3 Exploring Case-III
10.4.4 Exploring Case-IV
10.4.5 Exploring Case-V
10.4.6 Exploring Case-VI
Self Assessment Questions
10.5 Two-Sample Tests
10.5.1 Differences Between Two Independent Samples
10.5.2 Differences Between Two Proportions
10.5.3 Comparing Two Related Samples
10.5.4 Study of Equality of Variances of Two Populations
Self Assessment Questions
10.6 Exploring ANOVA
10.6.1 One-Way ANOVA
10.6.2 Two-Way ANOVA
Self Assessment Questions
10.7 Summary
10.8 Key Words
10.9 Case Study
10.10 Exercise
10.11 Answers for Self Assessment Questions
10.12 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:


• Distinguish between parametric and non-parametric tests for testing hypotheses
• Describe the different types of parametric tests
• Explain the concepts of one-sample test and two-sample test
• Describe the concept of ANOVA

10.1 INTRODUCTION
In the previous chapter, you studied how to test a hypothesis to find the solution to a
research problem. To check the validity of a hypothesis, you can use two main types
of tests, namely parametric tests and non-parametric tests.

Parametric tests are statistical measures used in the analysis phase of research to
draw inferences and conclusions to solve a research problem. There are various
types of parametric tests, such as z-test, t-test and F-test. Selection of a particular
test for a research depends upon various factors, such as the type of population,
sample size, Standard Deviation (SD) and variance of population. It is important for
a researcher to identify the appropriate test to maintain the authenticity and validity
of research results.

In this chapter, you will learn about the concept of parametric tests. You will learn
about one-sample and two-sample tests. You will also learn to apply z-test, t-test and
F-test in different conditions and scenarios for one-sample and two-sample tests.

10.2 TYPES OF HYPOTHESIS TESTING


A hypothesis can be tested by using a large number of tests. Therefore, researchers
have found it more convenient to categorise these tests on the basis of their similarities
and differences. Hypothesis tests are divided into two types, as shown in Figure 1:

[Diagram: Types of Tests → Parametric Tests; Non-parametric Tests]

Figure 1: Types of Hypothesis Tests

• Parametric tests: In these tests, the researcher makes assumptions about the
  parameters of the population from which a sample is derived. An example of a
  parametric test is the z-test.
• Non-parametric tests: These are distribution-free tests of hypotheses. Here, the
  researcher does not make assumptions about the parameters of the population
  from which a sample is derived. An example of a non-parametric test is the
  Kruskal-Wallis test.

Self Assessment Questions
1. What do you call the hypothesis tests where the researcher makes assumptions
   about the parameters of the population from which a sample is derived?
   a. Non-parametric tests b. Parametric tests
   c. Chi-Square test d. Distribution-free tests

10.3 PARAMETRIC TESTS


In parametric tests, researchers assume certain properties of the parent population
from which samples are drawn. These assumptions include properties, such as the
sample size, type of population, mean and variance of population and distribution
of the variable.

For example, the t-test assumes that the variable under study is normally distributed
in the population. Researchers estimate the parameters of the population using various
test statistics. Then, they test the hypothesis by comparing the calculated value
with the benchmark value given in the problem. The dependent variable in parametric
tests is usually measured on an interval or ratio scale.

There are various types of parametric tests, as shown in Figure 2:

[Diagram: Parametric Tests → z-test; t-test; F-test]

Figure 2: Types of Parametric Tests

Let us now discuss each type of test in detail:


• z-test: This test is used to study the mean and proportion of samples having a
  sample size of more than 30. It involves comparison of the means of two different and
  unrelated samples drawn from the same population whose variance is known.
  The z-value (test statistic) is calculated for the given data and compared with
  the critical z-value at the level of significance decided earlier in the
  question/problem. After the comparison, the researcher decides whether or not to
  reject the null hypothesis. The z-test is used in the following cases:
  ◦ To compare the mean of a sample with the mean of a hypothesised population
    when the sample size is large and the population variance is known
  ◦ To test the significance of the difference between the means of two independent
    samples in the case of large samples or when the population variance is known
  ◦ To compare the proportion of a sample with the proportion of the population
• t-test: This test is used to study the mean of samples when the sample size is less
  than 30 and/or the population variance is unknown. It is based on the t-distribution. A
  t-distribution is a type of probability distribution that is appropriate for estimating
  the mean of a normally distributed population where the sample size is small and
  the population variance is unknown.
  The t-value (test statistic) is calculated for the given data and compared with
  the critical t-value at a specified level of significance and the relevant degrees of
  freedom to decide whether to reject the null hypothesis. The degrees of freedom are
  calculated by subtracting one from the number of observations. They are used to look up
  the t-value in the t-distribution table.
  Sometimes, the t-test is used to compare the means of two related samples when the
  sample size is small and the population variance is unknown. In such a situation,
  it is known as the paired t-test.
• F-test: This test is used to compare the variances of two samples under study.
  It involves computing the ratio of the two sample variances. The F-distribution
  is a right-skewed distribution that is used most commonly in Analysis of Variance
  (ANOVA). Here, the test statistic has an F-distribution.
  The F-value (test statistic) is calculated for the given data and compared with
  the critical F-value at the level of significance decided earlier in the
  question/problem. In an F-test, there are two independent degrees of freedom, one for
  the numerator and one for the denominator. The degrees of freedom (d.f.) of the two
  samples are calculated separately by subtracting one from the number of observations.
  After that, the critical F-value is read from the F-distribution table.

Parametric tests are further divided into two parts: one-sample tests and two-sample
tests. You will learn more about them in the next sections.

ASSUMPTIONS OF F-TEST
The F-distribution is asymmetric (right-skewed), with a minimum value of zero and no
upper bound.

Assumptions for using an F-test include:

1. Both the samples come from normal distributions.
2. Observations in each sample are selected randomly.

The F-statistic can never be negative as it is a ratio of two variances, each of which
is based on squared deviations.

The degrees of freedom for different tests are calculated in different ways as
follows:

Test                                      Degrees of Freedom
One-sample t-test                         n – 1, where n = sample size
Paired-data t-test                        n – 1, where n = number of pairs of data points
t-test for two independent populations    (n1 – 1) + (n2 – 1), where n1 and n2 are the sizes of the two samples
Chi-square test for independence          (r – 1)(c – 1), where r = number of levels of the first categorical
                                          variable and c = number of levels of the second categorical variable
Chi-square test for goodness of fit       n – 1, where n = number of levels of the single categorical variable
One-factor ANOVA (F-test)                 Degrees of freedom of numerator (dfn) = k – 1, and
                                          degrees of freedom of denominator (dfd) = N – k, where
                                          N = total number of data values in the experiment and
                                          k = number of groups
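The degrees-of-freedom rules in the table above can be sketched as small Python helpers. This is only an illustrative sketch; the function names are hypothetical and are not part of any statistics library.

```python
# Hypothetical helper functions mirroring the degrees-of-freedom table.

def df_one_sample_t(n):
    """One-sample t-test: n - 1, where n is the sample size."""
    return n - 1

def df_paired_t(n_pairs):
    """Paired-data t-test: number of pairs minus one."""
    return n_pairs - 1

def df_two_sample_t(n1, n2):
    """t-test for two independent samples: (n1 - 1) + (n2 - 1)."""
    return (n1 - 1) + (n2 - 1)

def df_chi_square_independence(r, c):
    """Chi-square test for independence: (r - 1)(c - 1)."""
    return (r - 1) * (c - 1)

def df_chi_square_goodness_of_fit(levels):
    """Chi-square goodness of fit: number of levels minus one."""
    return levels - 1

def df_one_factor_anova(total_n, k):
    """One-factor ANOVA: (numerator df, denominator df) = (k - 1, N - k)."""
    return k - 1, total_n - k
```

For instance, two independent samples of 10 observations each give 18 degrees of freedom, and a one-sample t-test on 25 observations gives 24.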

Self Assessment Questions

2. Which of the following parametric tests is used to study the mean and
   proportion of samples having a sample size more than 30?
   a. t-test b. F-test
   c. Chi-square test d. z-test
3. The ___________ is used to compare the mean of samples when the sample
   size is less than 30 and the population variance is unknown.
4. Which test is used to compare the significant difference between the variances
   of two samples under study?
   a. z-test
   b. Chi-square test
   c. t-test
   d. F-test
5. The degree of freedom is calculated by subtracting ___________ from the
   ________ for t-test.

10.4 ONE-SAMPLE TEST – DIFFERENT SITUATIONS IN WHICH ONE-SAMPLE TEST IS USED
In a one-sample test, the researcher compares the mean of a sample to a pre-specified
value and tests for deviation from that value. In this test, you can determine the
mean, variance and proportion of the sample and population with the help of z-test
and t-test.

The one-sample test is used in various situations as mentioned in Table 1:

Table 1: Cases of One-Sample Test

Case      Population            Sample Size  Population Variance  Sample Variance  Test
Case-I    Normal and finite,    Large        Unknown              Known            Two-tailed or one-tailed
          n/N < 0.05
Case-II   Normal and finite     Large        Known                Known            Two-tailed or one-tailed
Case-III  Normal and infinite   Large        Unknown              Known            Two-tailed or one-tailed
Case-IV   Population proportion and sample proportion are known
Case-V    Normal and infinite   Small        Unknown              Known            Two-tailed or one-tailed
Case-VI   Normal and finite     Small        Unknown              Known            Two-tailed or one-tailed

Let us study these cases in detail.


10.4.1 EXPLORING CASE-I
In this case, the population is normal and finite, the sample size is large and the
population variance is unknown. The researcher uses the following test statistic:

$$t = \frac{\bar{X} - \mu}{\dfrac{s}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$

Where,

μ = Population mean

N = Population size

n = Sample size

s = Standard Deviation of the sample

X̄ = Sample mean

10.4.2 EXPLORING CASE-II


In Case-II, the population is normal and finite, the sample size is large and the
population variance is known. In this case, the researcher uses the following test
statistic:

$$z = \frac{\bar{X} - \mu}{\dfrac{\sigma_p}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$

Where,

μ = Population mean

N = Population size

n = Sample size

σp = Standard Deviation of the population

X̄ = Sample mean

Let us understand the application of Case-II with the help of an example.

Example 1: The population mean diameter of all products produced by an
organisation is presumed to be 8 cm, with an SD of 2.5 cm. The size of the population
is 50. Now, the organisation has taken a random sample of 35 pieces of product A
to know whether the average diameter of this product is the same as or more than
that of the overall production. The mean diameter of the sample of product A is 10 cm.
Use 5% as the level of significance. Construct the hypothesis and carry out the test
of significance for this problem.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: The average diameter of product A is the same as that of the overall production of
all products combined.

H1: The average diameter of product A is more than that of the overall production of
all products combined.

Or,

H0: μs = 8 cm

H1: μs > 8 cm

Where, μs = Mean diameter of product A

Assumed Population mean (μ) = 8 cm

Population size (N) = 50

Sample size (n) = 35

Sample mean (X̄) = 10 cm

Standard Deviation of population (σp) = 2.5 cm

Since the population is finite, the standard error is multiplied by the finite
population correction factor, and the researcher uses the following formula for z-test
to test the hypothesis for significance:

$$z = \frac{\bar{X} - \mu}{\dfrac{\sigma_p}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$

$$z = \frac{10 - 8}{\dfrac{2.5}{\sqrt{35}} \sqrt{\dfrac{50 - 35}{50 - 1}}} = \frac{2}{\dfrac{2.5}{5.916} \sqrt{\dfrac{15}{49}}}$$

$$z = \frac{2}{0.4226 \times 0.5533} = \frac{2}{0.2338} = 8.55$$

The z-value for the 5% level of significance for one-tailed test is +1.64.

The graphical representation of the preceding solution is given in Figure 3:

[Figure: normal curve with the acceptance region to the left of +1.64 and the rejection
region to its right; the calculated value z = +8.55 falls in the rejection region]

Figure 3: Rejecting Calculated z-Value

In Figure 3, it can be observed that the calculated value of z lies in the rejection
region; therefore, H0 is rejected. This implies that the average diameter
of product A is more than that of the overall production.
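The calculation in Example 1 can be checked with a short Python sketch. It assumes the finite population correction is applied to the standard error (that is, in the denominator of the test statistic); the function name `z_finite_population` is hypothetical, and only the standard library is used.

```python
import math
from statistics import NormalDist

def z_finite_population(sample_mean, pop_mean, pop_sd, n, N):
    """z-statistic for a large sample drawn from a finite population.

    The standard error sigma/sqrt(n) is multiplied by the finite
    population correction sqrt((N - n) / (N - 1)).
    """
    fpc = math.sqrt((N - n) / (N - 1))
    standard_error = (pop_sd / math.sqrt(n)) * fpc
    return (sample_mean - pop_mean) / standard_error

# Values from Example 1
z = z_finite_population(sample_mean=10, pop_mean=8, pop_sd=2.5, n=35, N=50)

# One-tailed critical value at the 5% level (about 1.645; the text uses 1.64)
critical = NormalDist().inv_cdf(0.95)
reject_h0 = z > critical

print(round(z, 2), round(critical, 3), reject_h0)
```

Under these assumptions the statistic is far beyond the critical value, so the decision to reject H0 does not depend on rounding.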
10.4.3 EXPLORING CASE-III
In Case-III, the population is normal and infinite, the sample size is large and the
population variance is unknown. In this case, the following test statistic is used:

$$t = \frac{\bar{X} - \mu}{\sigma_s / \sqrt{n}}$$

Where,

μ = Population mean

n = Sample size

σs = Standard Deviation of sample

X̄ = Sample mean

Let us understand the application of Case-III with the help of an example.

Example 2: The rating given by 36 existing customers of an organisation from the
south part of a city to a newly launched product is as follows (1 being the lowest
rating and 10 being the highest rating):

5, 6, 10, 9, 8, 7, 2, 3, 8, 9, 7, 9, 10, 4, 3, 2, 10, 8, 9, 6, 2, 6, 5, 8, 9, 7, 7, 7, 7, 2, 4, 5, 5, 5, 10, 10

The marketers have the average rating from the whole city as 7.5. Now, the
organisation wants to know whether the south part also has the same rating. Use 5%
as the level of significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: The average rating of the south part of the city is the same as the average rating
of the city

H1: The average rating of the south part of the city is not the same as the average
rating of the city

Or,

H0: μp = 7.5

H1: μp ≠ 7.5

Where, μp = Mean rating given by the customers in the south part of the city

The data and the calculation part of the previous problem are shown in Table 2:

Table 2: Ratings Given by Customers

No. of        Rating Given by    Xi – X̄    (Xi – X̄)²
Observations  Customers (Xi)
1             6                  –1.4      1.96
2             7                  –0.4      0.16
3             10                 2.6       6.76
4             9                  1.6       2.56
5             8                  0.6       0.36
6             7                  –0.4      0.16
7             5                  –2.4      5.76
8             8                  0.6       0.36
9             8                  0.6       0.36
10            9                  1.6       2.56
11            7                  –0.4      0.16
12            9                  1.6       2.56
13            10                 2.6       6.76
14            4                  –3.4      11.56
15            8                  0.6       0.36
16            5                  –2.4      5.76
17            10                 2.6       6.76
18            8                  0.6       0.36
19            9                  1.6       2.56
20            6                  –1.4      1.96
21            6                  –1.4      1.96
22            6                  –1.4      1.96
23            8                  0.6       0.36
24            8                  0.6       0.36
25            9                  1.6       2.56
26            7                  –0.4      0.16
27            7                  –0.4      0.16
28            7                  –0.4      0.16
29            7                  –0.4      0.16
30            5                  –2.4      5.76
31            4                  –3.4      11.56
32            6                  –1.4      1.96
33            8                  0.6       0.36
34            5                  –2.4      5.76
35            10                 2.6       6.76
36            10                 2.6       6.76
Total         ∑Xi = 266          ∑(Xi – X̄) = –0.4    ∑(Xi – X̄)² = 106.56

Sample mean (X̄) = ∑Xi/n

X̄ = 266/36

X̄ = 7.39 (rounded to 7.4 for the deviations in Table 2)

Population mean (μp) = 7.5

Sample size (n) = 36

Since the standard deviation for the population is not given, the researcher needs to
calculate the SD for the sample.

Standard Deviation of sample (σs) = √(∑(Xi – X̄)²/(n – 1))

σs = √(106.56/35) = √3.045

σs = 1.745

The population is infinite; therefore, the researcher uses the following formula for
t-test to test the hypothesis for significance:

$$t = \frac{\bar{X} - \mu_p}{\sigma_s / \sqrt{n}}$$

$$t = \frac{7.39 - 7.5}{1.745 / \sqrt{36}} = \frac{-0.11}{0.2908} = -0.38$$

The t-value for the 5% level of significance for two-tailed test with 35 degrees of
freedom is ±2.03.

The graphical representation of the preceding solution is shown in Figure 4:

[Figure: t-distribution curve with the acceptance region between –2.03 and +2.03; the
calculated value t = –0.38 falls inside the acceptance region]

Figure 4: Position of Calculated t-Value

In Figure 4, it can be observed that the calculated t-value lies in the acceptance
region; therefore, H0 is not rejected. This implies that the average rating of the south
part of the city is the same as the average rating of the city.
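The one-sample calculation can be reproduced with a minimal Python sketch. It assumes the 36 ratings tabulated in Table 2 and takes the critical value of 2.03 from the text rather than computing it; the variable names are illustrative only.

```python
import math
from statistics import mean, stdev

# Ratings as tabulated in Table 2 (36 observations, total 266)
ratings = [6, 7, 10, 9, 8, 7, 5, 8, 8, 9, 7, 9, 10, 4, 8, 5, 10, 8,
           9, 6, 6, 6, 8, 8, 9, 7, 7, 7, 7, 5, 4, 6, 8, 5, 10, 10]

hypothesised_mean = 7.5
n = len(ratings)

sample_mean = mean(ratings)
sample_sd = stdev(ratings)            # uses n - 1 in the denominator
standard_error = sample_sd / math.sqrt(n)
t = (sample_mean - hypothesised_mean) / standard_error

T_CRITICAL = 2.03                     # two-tailed, 5% level, 35 d.f. (from the text)
reject_h0 = abs(t) > T_CRITICAL

print(round(sample_mean, 2), round(sample_sd, 3), round(t, 2), reject_h0)
```

Because the sketch works with the unrounded mean, the statistic may differ slightly in the second decimal place from a hand calculation, but it stays well inside the acceptance region.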
10.4.4 EXPLORING CASE-IV
In Case-IV, the population proportion and the observed sample proportion are known.
In such a situation, the researcher uses the following test statistic:

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}, \qquad \hat{p} = \frac{x}{n}$$

Where, p = Proportion of success of population (assumed)

n = Sample size

q = Proportion of failure of population (q = 1 – p)

x = Number of successes observed in the sample

p̂ (pronounced as p-hat) = Observed sample proportion

p̂ is an unbiased estimator of p

Example 3: According to a record of a college, the proportion of girl students
presumed in the college is 40%. The college principal conducted a survey of 3,000
students to validate the college record. Out of 3,000 students, 1,450 are girls and
the rest are boys. Now, the principal wants to check the authenticity of the survey
through the test of significance to know the degree of validity of the record. Use 5%
as the level of significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: The proportion of girl students observed in the survey is the same as in the
college record.

H1: The proportion of girl students observed in the survey is different from their
proportion in the college record.

Or,

H0: p = 0.40

H1: p ≠ 0.40

Where,

p = Probability of success, that is, the actual proportion of girls in the college

p = 0.40

q = 1 – 0.40

q = 0.60

Sample size (n) = 3000

Observed sample proportion (p̂) = 1450/3000

p̂ = 0.4833

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{0.4833 - 0.40}{\sqrt{\dfrac{0.40 \times 0.60}{3000}}}$$

z = 0.0833/0.009

z = 9.26

The z-value for the 5% level of significance for two-tailed test is ±1.96. The graphical
representation of the preceding solution is shown in Figure 5:

[Figure: normal curve with the acceptance region between –1.96 and +1.96 and rejection
regions in both tails; the calculated value z = 9.26 falls in the right-hand rejection region]

Figure 5: Calculated z-Value When the Proportion of Population and Sample Means are Given

In Figure 5, it can be observed that the z-value lies in the rejection region; therefore,
H0 is rejected. This implies that the proportion of girl students observed in the survey
is different from their proportion in the college record. Because the calculated z-value
is positive, the proportion of girls in the college is higher than the recorded 40%.
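The proportion test in Example 3 can be sketched in Python as follows. The function name `z_one_proportion` is hypothetical; the unrounded result is slightly larger than a hand calculation that rounds the denominator to 0.009.

```python
import math
from statistics import NormalDist

def z_one_proportion(successes, n, p):
    """z-statistic comparing an observed sample proportion with an assumed p."""
    p_hat = successes / n
    q = 1 - p
    standard_error = math.sqrt(p * q / n)
    return (p_hat - p) / standard_error

# Values from Example 3: 1,450 girls among 3,000 surveyed, recorded proportion 40%
z = z_one_proportion(successes=1450, n=3000, p=0.40)

# Two-tailed critical value at the 5% level (about 1.96)
critical = NormalDist().inv_cdf(0.975)
reject_h0 = abs(z) > critical

print(round(z, 2), round(critical, 2), reject_h0)
```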

10.4.5 EXPLORING CASE-V


In Case-V the population is normal and infinite, the sample size is small, and the
population variance is unknown. In this case, the researcher uses the following test
statistic:

t = (X– μ)/(σs/√n)

Where,

μ = Population mean

n = Sample size

σs = Standard deviation of sample

X = Sample mean

Let us understand the application of Case-V with the help of an example.

Example 4: A researcher wants to study the average income of a group of 25 people
working as marketing executives in different organisations (especially small and
medium enterprises). The salary of the sample of 25 marketing executives included
in a study sample is recorded as:

No. of Observations    Income (Lakhs)
1                      2
2                      1.9
3                      2
4                      2
5                      1.9
6                      2
7                      1.9
8                      2
9                      1.9
10                     2
11                     2
12                     1.9
13                     2
14                     1.9
15                     2
16                     2
17                     1.8
18                     2
19                     2
20                     2
21                     2
22                     1.9
23                     2
24                     1.9
25                     2

The average recorded package for the marketing executive post is ₹ 2 lakhs. The
researcher wants to know whether the average recorded package is valid for this
group or not. Use 5% as the level of significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: The average recorded package and the sample average income of group are the
same.

H1: The average recorded package and the sample average income of group are
different.

H0: μs = 2,00,000

H1: μs ≠ 2,00,000

Where,

μs = Mean income of the group

The data and the calculation part for this example are shown in Table 3:

Table 3: Income of People at the Marketing Executive's Post

No. of Observations    Income (Lakhs)    Xi – X̄     (Xi – X̄)²
1                      2                 0.04       0.0016
2                      1.9               –0.06      0.0036
3                      2                 0.04       0.0016
4                      2                 0.04       0.0016
5                      1.9               –0.06      0.0036
6                      2                 0.04       0.0016
7                      1.9               –0.06      0.0036
8                      2                 0.04       0.0016
9                      1.9               –0.06      0.0036
10                     2                 0.04       0.0016
11                     2                 0.04       0.0016
12                     1.9               –0.06      0.0036
13                     2                 0.04       0.0016
14                     1.9               –0.06      0.0036
15                     2                 0.04       0.0016
16                     2                 0.04       0.0016
17                     1.8               –0.16      0.0256
18                     2                 0.04       0.0016
19                     2                 0.04       0.0016
20                     2                 0.04       0.0016
21                     2                 0.04       0.0016
22                     1.9               –0.06      0.0036
23                     2                 0.04       0.0016
24                     1.9               –0.06      0.0036
25                     2                 0.04       0.0016
Total                  ∑Xi = 49                     ∑(Xi – X̄)² = 0.08

Population mean (μ) = 2 lakhs (assumed)

Sample size (n) = 25

Sample mean (X̄) = ∑Xi/n

X̄ = 49/25

X̄ = 1.96

Since the standard deviation for the population is unknown, the researcher needs to
calculate the standard deviation for the sample as follows:

Standard deviation of sample (σs) = √(∑(Xi – X̄)²/(n – 1))

σs = √(0.08/24)

σs = 0.058

The population is infinite; therefore, the researcher uses the following formula for
t-test to test the hypothesis for significance:

$$t = \frac{\bar{X} - \mu}{\sigma_s / \sqrt{n}}$$

t = (1.96 – 2)/(0.058/√25) = –0.04/0.0116

t = –3.45

Degree of freedom (d.f.) = n – 1

= 25 – 1

= 24

The t-value for the 5% level of significance for two-tailed test and 24 d.f. is ±2.064.
The graphical representation of the preceding solution is shown in Figure 6:

[Figure: t-distribution curve with the acceptance region between –2.064 and +2.064 and
rejection regions in both tails; the calculated value t = –3.45 falls in the left-hand
rejection region]

Figure 6: Position of Calculated t-Value

In Figure 6, it can be observed that the calculated t-value lies in the rejection region;
therefore, H0 is rejected. This implies that the average recorded package and the
sample average of the income of the group are different. Because the calculated t-value
is negative, the average income of this group is lower than the recorded package.
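The small-sample calculation in Example 4 can be checked with the following Python sketch, which assumes the 25 incomes listed in Table 3 and takes the critical value of 2.064 from the text. The variable names are illustrative only.

```python
import math
from statistics import mean, stdev

# Incomes (in lakhs) of the 25 marketing executives, as listed in Table 3
incomes = [2, 1.9, 2, 2, 1.9, 2, 1.9, 2, 1.9, 2, 2, 1.9, 2,
           1.9, 2, 2, 1.8, 2, 2, 2, 2, 1.9, 2, 1.9, 2]

recorded_package = 2.0                # hypothesised population mean, in lakhs
n = len(incomes)
degrees_of_freedom = n - 1

sample_mean = mean(incomes)
sample_sd = stdev(incomes)            # uses n - 1 in the denominator
t = (sample_mean - recorded_package) / (sample_sd / math.sqrt(n))

T_CRITICAL = 2.064                    # two-tailed, 5% level, 24 d.f. (from the text)
reject_h0 = abs(t) > T_CRITICAL

print(round(sample_mean, 2), round(sample_sd, 3), round(t, 2), reject_h0)
```

The unrounded statistic is close to –3.46; the hand calculation gives –3.45 because the standard error is rounded to 0.0116.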
10.4.6 EXPLORING CASE-VI
In Case-VI, the population is normal and finite, the sample size is small, and the
population variance is unknown. In this case, the researcher uses the following test
statistic:

$$t = \frac{\bar{X} - \mu}{\dfrac{\sigma_s}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$

Where,

μ = Population mean

N = Population size

n = Sample size

σs = Standard deviation of sample

X̄ = Sample mean

Let us understand the application of Case-VI with the help of an example.

Self Assessment Questions

6. One-sample test helps in determining ________________, _______________
   and __________ of a sample population with the help of z-test, t-test and chi-
   square test.
7. In the formula $z = (\bar{X} - \mu) / \left[(\sigma_s/\sqrt{n}) \sqrt{(N - n)/(N - 1)}\right]$, what does μ stand for?
   a. Sample size b. Population mean
   c. Standard deviation of sample d. Sample mean

10.5 TWO-SAMPLE TESTS

In a two-sample test, a researcher wants to study the relationship between two
samples drawn from two different or same populations. In this section, you will
learn about the application of z-test, t-test, and F-test in different situations. These
situations are as follows:
• Differences between two independent samples
• Differences between two proportions
• Comparing two related samples
• Equality of the variances of two populations
The two-sample test in different situations is discussed in detail in the upcoming
sections.

10.5.1 DIFFERENCES BETWEEN TWO INDEPENDENT SAMPLES


In this study, a researcher finds the relationship between two samples that are taken
from two independent groups in terms of their means. The samples are compared
to find out whether they are significantly different in terms of their mean value or
they are drawn from the same population. The formula and method of conducting
the two-sample test are different in different situations.

Table 4 lists the different situations for conducting two-sample tests:

Table 4: Situations to Find Differences between Two Samples (Two-Sample Tests)

Situation        Population    Sample Size    Population Variance    Test
Situation I      Normal        Large          Unknown                One-tailed or two-tailed
Situation II     Normal        Large          Known                  One-tailed or two-tailed
Situation III    Normal        Small          Unknown                One-tailed or two-tailed

These different situations with examples are discussed in the following sections.

Situation-I
In this situation, the population is normal, the sample size is large and the population
variance is unknown. The researcher can use either two-tailed test or one-tailed test
depending on the alternate hypothesis of the research. If the researcher wants to
compare the two samples drawn from two different populations, then he/she would
use the following test statistic:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{s1}^2}{n_1} + \dfrac{\sigma_{s2}^2}{n_2}}}$$

Where,

X̄1 = Sample mean of the first sample

X̄2 = Sample mean of the second sample

σs1 = Standard deviation of the first sample

σs2 = Standard deviation of the second sample

n1 = Sample size of the first sample

n2 = Sample size of the second sample

When, in any research problem, the values of the population variances are known, the
researcher should use the z-statistic, with the population variances in place of the
sample variances.

Example 5: A researcher wants to compare the popularity of Brand A and Brand B.
Therefore, he/she takes a sample of 35 people and asks them to rate the two brands
on a 10-point scale (10 being the highest and 1 being the lowest). Use 5% as the level
of significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: The popularity of Brand A and Brand B is the same.

H1: The popularity of Brand A and Brand B is different.

Or,

H0: μ1 = μ2

H1: μ1 ≠ μ2

Where,

μ1 = Population mean of Brand A

μ2 = Population mean of Brand B

The data and the calculation part of the preceding problem are shown in Table 5:

Table 5: Calculating the Popularity of Brand A and Brand B

No. of        Brand A    Brand B    (X1i – X̄1)    (X1i – X̄1)²    (X2i – X̄2)    (X2i – X̄2)²
Observations  (X1i)      (X2i)
1             7          9          –2            4              –0.4          0.16
2             8          9          –1            1              –0.4          0.16
3             9          9          0             0              –0.4          0.16
4             10         9          1             1              –0.4          0.16
5             10         9          1             1              –0.4          0.16
6             9          9          0             0              –0.4          0.16
7             10         9          1             1              –0.4          0.16
8             10         9          1             1              –0.4          0.16
9             10         9          1             1              –0.4          0.16
10            6          10         –3            9              0.6           0.36
11            9          10         0             0              0.6           0.36
12            8          10         –1            1              0.6           0.36
13            8          10         –1            1              0.6           0.36
14            9          9          0             0              –0.4          0.16
15            7          10         –2            4              0.6           0.36
16            9          10         0             0              0.6           0.36
17            10         10         1             1              0.6           0.36
18            8          10         –1            1              0.6           0.36
19            9          10         0             0              0.6           0.36
20            10         9          1             1              –0.4          0.16
21            10         9          1             1              –0.4          0.16
22            9          9          0             0              –0.4          0.16
23            9          9          0             0              –0.4          0.16
24            8          9          –1            1              –0.4          0.16
25            9          9          0             0              –0.4          0.16
26            9          9          0             0              –0.4          0.16
27            10         9          1             1              –0.4          0.16
28            10         9          1             1              –0.4          0.16
29            10         9          1             1              –0.4          0.16
30            10         9          1             1              –0.4          0.16
31            10         9          1             1              –0.4          0.16
32            9          10         0             0              0.6           0.36
33            9          10         0             0              0.6           0.36
34            8          10         –1            1              0.6           0.36
35            9          10         0             0              0.6           0.36
Total         ΣX1i = 315 ΣX2i = 328               Σ(X1i – X̄1)² = 36            Σ(X2i – X̄2)² = 8.2
Sample mean of Brand A (X̄1) = ∑X1i/n

X̄1 = 315/35

X̄1 = 9

Sample mean of Brand B (X̄2) = ∑X2i/n

X̄2 = 328/35

X̄2 = 9.37 ≈ 9.4

Standard Deviation of Sample A

$$\sigma_{s1} = \sqrt{\frac{\sum (X_{1i} - \bar{X}_1)^2}{n_1 - 1}} = \sqrt{\frac{36}{34}} = \sqrt{1.058} = 1.028$$

Standard Deviation of Sample B

$$\sigma_{s2} = \sqrt{\frac{\sum (X_{2i} - \bar{X}_2)^2}{n_2 - 1}} = \sqrt{\frac{8.2}{34}} = \sqrt{0.2411} = 0.491$$

Since the sample size is more than 30 and two samples are under study, the researcher
applies the following test:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{s1}^2}{n_1} + \dfrac{\sigma_{s2}^2}{n_2}}}$$

$$t = \frac{9 - 9.4}{\sqrt{\dfrac{(1.028)^2}{35} + \dfrac{(0.491)^2}{35}}} = \frac{-0.4}{\sqrt{\dfrac{1.056 + 0.241}{35}}}$$

$$t = \frac{-0.4}{\sqrt{\dfrac{1.297}{35}}} = \frac{-0.4}{\sqrt{0.0371}} = \frac{-0.4}{0.1926} = -2.08$$

The t-value for the 5% level of significance for two-tailed test is ±2.032 (degree of
freedom = 35 – 1 = 34).

The graphical representation of the preceding solution is shown in Figure 7:

[Figure: t-distribution curve with the acceptance region between –2.032 and +2.032 and
rejection regions in both tails; the calculated value t = –2.08 falls just inside the
left-hand rejection region]

Figure 7: Position of Calculated t-Value in Case of Two Samples

In Figure 7, it can be observed that the t-value lies in the rejection region; therefore,
H0 is rejected. The popularity of Brand A is not the same as the popularity of
Brand B.
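The two-sample statistic for Situation-I can be sketched in Python from summary values. The sketch assumes the rounded figures used in Example 5 (means 9 and 9.4, standard deviations 1.028 and 0.491) and takes the square root of the summed variance terms; the function name `two_sample_t` is hypothetical, and the critical value 2.032 is taken from the text.

```python
import math

def two_sample_t(mean1, mean2, sd1, sd2, n1, n2):
    """Test statistic for the difference between two independent sample means."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Summary values as rounded in Example 5
t = two_sample_t(mean1=9, mean2=9.4, sd1=1.028, sd2=0.491, n1=35, n2=35)

T_CRITICAL = 2.032                    # two-tailed, 5% level, 34 d.f. (from the text)
reject_h0 = abs(t) > T_CRITICAL

print(round(t, 2), reject_h0)
```

Because this statistic lies close to the critical value, the decision is sensitive to how the sample means are rounded before the calculation.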
Situation-II
In this situation, the population is normal, the sample size is large, and the population
variance is known. The samples are assumed to come from the same population. The
researcher can use either two-tailed test or one-tailed test depending on the alternate
hypothesis of the research. If the researcher wants to compare two samples drawn from
the same population, then he/she would use the following test statistic:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Where,

X̄1 = Sample mean of the first sample

X̄2 = Sample mean of the second sample

σp = Standard deviation of the population

n1 = Sample size of the first sample

n2 = Sample size of the second sample

Example 6: A researcher has collected two samples from various production houses
of an organisation. He has taken a sample of Product P from 500 production houses.
He has found that the average production of Product P is equal to 1,000 pieces/
month with a standard deviation of 13 pieces. He has also taken a sample of product
Q from 400 production houses. He finds that the average production of product Q is
1,200 pieces/month with a standard deviation of 15 pieces. The standard deviation
of the production houses of the organisation is 14. Is this the same organisation from
where the researcher has collected the samples? Use 5% as the level of significance.

Solution: The null hypothesis and alternative hypothesis are as follows:

H0: Population means of products P and Q are the same.

H1: Population means of products P and Q are different.

Or,

H0: μ1 = μ2

H1: μ1 ≠ μ2

Where, μ1 = Population mean of product P

μ2 = Population mean of product Q

Details are given as follows:

Sample mean of product P (X̄1) = 1000

Sample mean of product Q (X̄2) = 1200

Standard deviation of sample P (σs1) = 13

Standard deviation of sample Q (σs2) = 15

Standard deviation of population (σp) = 14

Number of observations of sample P (n1) = 500

Number of observations of sample Q (n2) = 400

Since the sample size is more than 30, the population variance is known, and two
samples are under study, the researcher would apply the following z-test:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

$$z = \frac{1000 - 1200}{\sqrt{(14)^2 \left(\dfrac{1}{500} + \dfrac{1}{400}\right)}} = \frac{-200}{\sqrt{196 \left(\dfrac{4 + 5}{2000}\right)}}$$

$$z = \frac{-200}{\sqrt{1764/2000}} = \frac{-200}{0.939}$$

z = –212.99

The z-value for the 5% level of significance for two-tailed test is ±1.96. The graphical
representation of the preceding solution is shown in Figure 8:

[Figure: normal curve with the acceptance region between –1.96 and +1.96 and rejection
regions in both tails; the calculated value z = –212.99 falls in the left-hand rejection
region]

Figure 8: Representation of Calculated z-Value in Case of Two Samples

In Figure 8, it can be observed that the z-value lies in the rejection region; therefore,
H0 is rejected. This implies that the population means of products P and Q are
different. It can be interpreted that the calculated z-value showing the difference
between means of two samples is statistically significant.
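The Situation-II statistic can be sketched in Python using the values from Example 6. The function name `two_sample_z_common_sd` is hypothetical; the unrounded result is about –212.96, while the hand calculation gives –212.99 because the denominator is rounded to 0.939.

```python
import math
from statistics import NormalDist

def two_sample_z_common_sd(mean1, mean2, pop_sd, n1, n2):
    """z-statistic for two sample means when the population SD is known."""
    standard_error = math.sqrt(pop_sd ** 2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error

# Values from Example 6
z = two_sample_z_common_sd(mean1=1000, mean2=1200, pop_sd=14, n1=500, n2=400)

# Two-tailed critical value at the 5% level (about 1.96)
critical = NormalDist().inv_cdf(0.975)
reject_h0 = abs(z) > critical

print(round(z, 2), reject_h0)
```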

Situation-III
In this situation, the population is normal, the sample size is small, and the population
variance is unknown. The researcher can use either two-tailed test or one-tailed test
on the basis of research problem and the alternative hypothesis. If the researcher
wants to compare two samples drawn from two different populations, then he/she
would use the following test statistic:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{SE}$$

Where,

SE = Standard Error

$$SE = S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Where Sp = Pooled standard deviation and

$$S_p = \sqrt{\frac{\sigma_1^2 (n_1 - 1) + \sigma_2^2 (n_2 - 1)}{(n_1 - 1) + (n_2 - 1)}}$$

$$\therefore t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left[\dfrac{\sigma_1^2 (n_1 - 1) + \sigma_2^2 (n_2 - 1)}{(n_1 - 1) + (n_2 - 1)}\right] \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Where

X̄1 = Sample mean of the first sample

X̄2 = Sample mean of the second sample

σ1 = Standard deviation of the first sample

σ2 = Standard deviation of the second sample

n1 = Sample size of the first sample

n2 = Sample size of the second sample

Example 7: The average sales volume of two cities, A and B, for an organisation in
10 retail outlets is 100 and 200, respectively. The standard deviation for A is 5.5 and
for B is 6.5. Test the hypothesis for the difference in sales of the two cities by using
5% as the level of significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: Average sale of city A is equal to the average sale of city B.

H1: Average sale of city A is not equal to the average sale of city B.

Or,

H0: μ1 = μ2

H1: μ1 ≠ μ2

Where,

μ1 = Population mean of city A

μ2 = Population mean of city B

Sample mean of city A (X̄1) = 100

Sample mean of city B (X̄2) = 200

Standard deviation of sample A (σ1) = 5.5

Standard deviation of sample B (σ2) = 6.5

Number of observations of samples A and B (n) = n1 = n2 = 10

Since the sample size is less than 30 and two samples are under study, the researcher
would apply the following t-test:

X1 − X 2
t=
 σ (n1 − 1) + σ 22 (n 2 − 1)   1
2
1 1
  + 
 ( n1 − 1) + (n 2 − 1)   n1 n 2 

t=
(100 − 200)
 (5.5) (10 − 1) + (6.5) (10 − 1)   1
2 2
1
  + 


(10 − 1) + (10 − 1)   10 10 

M
−100
t=
 9 (30.25 + 42.25)   1 
  
M
 18   5 
−100
=
7.25
II

−100 −100
= =
2.692 2.7
= − 37.03

The t-value at the 5% level of significance for a two-tailed test with 18 degrees of
freedom is ± 2.101. The graphical representation of the preceding solution is shown
in Figure 9:

Figure 9: Rejection of Calculated t-Value in Case of Two Samples (acceptance region between –2.101 and +2.101; the calculated value –37.03 lies in the rejection region)

In Figure 9, it can be observed that the t-value lies in the rejection region; therefore,
H0 is rejected. This implies that the average sales volume of City A is not equal to
the average sales volume of city B. It can be interpreted from the calculated t-value
that the difference between the means of the two samples is statistically significant.
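
The arithmetic of Example 7 can be checked with a short script. This is a minimal sketch, not part of the original text: the function name `pooled_t_statistic` is hypothetical, and the critical value is simply the tabulated figure quoted in the example.

```python
import math


def pooled_t_statistic(mean1, mean2, sd1, sd2, n1, n2):
    """Return (t, degrees of freedom) for the pooled two-sample t-test."""
    # Pooled variance: the two sample variances weighted by their d.f.
    pooled_var = (sd1 ** 2 * (n1 - 1) + sd2 ** 2 * (n2 - 1)) / ((n1 - 1) + (n2 - 1))
    # Standard error of the difference between the two sample means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, n1 + n2 - 2


# Summary figures from Example 7 (cities A and B).
t_value, df = pooled_t_statistic(100, 200, 5.5, 6.5, 10, 10)
critical_t = 2.101  # two-tailed 5% value for 18 d.f., as quoted in the text
reject_h0 = abs(t_value) > critical_t
print(f"t = {t_value:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```

Without intermediate rounding the statistic is about –37.14. The value –37.03 in the worked solution comes from rounding the standard error to 2.7. The decision to reject H0 is the same in both cases.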

10.5.2 DIFFERENCES BETWEEN TWO PROPORTIONS


In this study, a researcher examines two samples that are expressed in the form of
proportions and tests whether the two proportions are significantly different from
each other. The samples may be drawn from the same population or from different
populations. This study can also be used to compare the proportion of a sample with
that of a population.

The difference between the proportions of two samples belonging to two independent
groups can be tested if the population is normal, the sample size is large, and the
proportions of the samples are known. The researcher can use either a two-tailed test or
a one-tailed test, depending on the nature of the research question. If the researcher wants
to compare the proportions of two samples drawn from two different populations,
then he/she would use the following test statistic:
$$z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$

Where,
p1 = Proportion of success of the first sample

p2 = Proportion of success of the second sample



q1 = Proportion of failure of the first sample

q2 = Proportion of failure of the second sample

n1 = Sample size of the first sample

n2 = Sample size of the second sample

Example 8: In a college, there are two streams: science and commerce. The college
management wants to find out whether there is a significant difference between
the proportions of average students (students who are neither toppers nor laggards
with respect to study) of the two streams. Therefore, the management conducts a
survey and finds out that 350 out of 500 students of the science stream are under
the category of average students. In the case of the commerce stream, 300 students
out of 600 students are under the category of average students. Use 5% as the level of
significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: There is no difference between the proportions of average students of the science
and commerce streams in the college.

H1: There is a significant difference between the proportions of average students of
the science and commerce streams in the college.

Or,

H0: p1 = p2

H1: p1 ≠ p2

Where,

p1 = Proportion of success in the science stream

p2 = Proportion of success in the commerce stream

Proportion of success in the science stream, p1 = 350/500

p1 = 0.7

Proportion of failure in the science stream, q1 = 1 – p1 = 1 – 0.7

q1 = 0.3

Proportion of success in the commerce stream, p2 = 300/600

p2 = 0.5

Proportion of failure in the commerce stream, q2 = 1 – p2 = 1 – 0.5

q2 = 0.5

Sample size of science stream, (n1) = 500



Sample size of commerce stream, (n2) = 600

The test of significance used is:

$$z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$

$$z = \frac{0.7 - 0.5}{\sqrt{\frac{(0.7)(0.3)}{500} + \frac{(0.5)(0.5)}{600}}}$$

z = 0.2/0.029

z = 6.9

The z-value for the 5% level of significance for two-tailed test is ± 1.96.

The graphical representation of the preceding solution is shown in Figure 10:

Figure 10: Rejection of Calculated z-Value in Case of Two-Sample Proportions (acceptance region between –1.96 and +1.96; the calculated value 6.9 lies in the rejection region)

In Figure 10, it can be observed that the z-value lies in the rejection region; therefore,
H0 is rejected. This implies that there is a significant difference between the proportions
of average students of the science and commerce streams in the college. It can be
interpreted from the calculated z-value that the difference between the proportions of
the two samples is statistically significant.
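
The calculation in Example 8 can be reproduced as follows. This is an illustrative sketch only: the function name `two_proportion_z` is hypothetical, and the counts are those used in the worked solution (350 of 500 and 300 of 600).

```python
import math


def two_proportion_z(successes1, n1, successes2, n2):
    """z statistic for two proportions, each with its own variance term."""
    p1, p2 = successes1 / n1, successes2 / n2
    q1, q2 = 1 - p1, 1 - p2
    # Standard error built from the two separate sample proportions.
    se = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    return (p1 - p2) / se


z_value = two_proportion_z(350, 500, 300, 600)
critical_z = 1.96  # two-tailed 5% value quoted in the text
reject_h0 = abs(z_value) > critical_z
print(f"z = {z_value:.2f}, reject H0: {reject_h0}")
```

The script gives z of about 6.91, which agrees with the rounded value 6.9 in the worked solution.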
Example 9: In a sample of 700 engineering colleges from a state, littering by first-year
students was prevalent in 500 colleges. After the ban on littering in the same
state, it was found that 500 colleges out of 800 colleges were involved in littering.
Was the decrease in the proportion of colleges involved in littering significant?
Test the hypothesis at the 1% level of significance.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: There is no difference between the proportion of the number of engineering
colleges involved in littering before and after the ban on littering.

H1: There is a significant difference between the proportion of the number of
engineering colleges involved in littering before and after the ban on littering.

Or,

H0: p1 = p2

H1: p1 ≠ p2

Where,

p1 = Proportion of success in sample one

p2 = Proportion of success in sample two

Proportion of success in sample one, p1 = 500/700

p1 = 0.71
Proportion of failure in sample one, q1 = 1 – p1 = 1 – 0.71

q1 = 0.29

Proportion of success in sample two, p2 = 500/800

p2 = 0.625

Proportion of failure in sample two, q2 = 1 – p2 = 1 – 0.625

q2 = 0.375

Size of sample one, (n1) = 700

Size of sample two, (n2) = 800

The two samples are taken from the same population; therefore, you can calculate
the best estimate for proportion, which is the common value of proportion. The best
estimate for proportion (p0) for the two samples of colleges involved in littering can
be calculated as follows:

p0 = (n1p1 + n2p2)/(n1 + n2)

p0 = (700 × 0.71 + 800 × 0.625)/(700 + 800)

p0 = 0.66

q0 = 1 – 0.66

q0 = 0.34
The test of significance used is as follows:

$$z = \frac{p_1 - p_2}{\sqrt{\frac{p_0 q_0}{n_1} + \frac{p_0 q_0}{n_2}}}$$

$$z = \frac{0.71 - 0.625}{\sqrt{\frac{0.66 \times 0.34}{700} + \frac{0.66 \times 0.34}{800}}}$$

z = 0.085/0.024

z = 3.54

The z-value for the 1% level of significance for two-tailed test is ± 2.58. The graphical
representation of the preceding solution is shown in Figure 11:

Figure 11: The z-Value Calculated with the Help of the Best Estimate of Proportion (acceptance region between –2.58 and +2.58; the calculated value 3.54 lies in the rejection region)

In Figure 11, it can be observed that the z-value lies in the rejection region; therefore,
H0 is rejected. This implies that there is a significant difference between the proportions
of engineering colleges involved in littering before and after the ban. It can be
interpreted from the calculated z-value that the difference between the proportions of
the two samples is statistically significant.
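
When the two samples share a common (pooled) proportion, as in Example 9, the same check can be sketched as below. The function name `pooled_proportion_z` is hypothetical, and the script works from the raw counts rather than the rounded proportions.

```python
import math


def pooled_proportion_z(successes1, n1, successes2, n2):
    """Return (z, p0) using the best (pooled) estimate of the proportion."""
    p1, p2 = successes1 / n1, successes2 / n2
    # Best estimate p0 = (n1*p1 + n2*p2) / (n1 + n2), i.e. total successes
    # divided by total sample size.
    p0 = (successes1 + successes2) / (n1 + n2)
    q0 = 1 - p0
    se = math.sqrt(p0 * q0 / n1 + p0 * q0 / n2)
    return (p1 - p2) / se, p0


z_value, p0 = pooled_proportion_z(500, 700, 500, 800)
critical_z = 2.58  # two-tailed 1% value quoted in the text
reject_h0 = abs(z_value) > critical_z
print(f"p0 = {p0:.3f}, z = {z_value:.2f}, reject H0: {reject_h0}")
```

With unrounded inputs the statistic is about 3.66. The worked solution reports 3.54 because it rounds p1 to 0.71 and the standard error to 0.024. Both values exceed 2.58, so the conclusion is unchanged.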

10.5.3 COMPARING TWO RELATED SAMPLES


In this study, the researcher takes two related (paired) samples, for example,
measurements taken on the same units before and after an event. The researcher tests
whether there is a statistically significant difference between the means of the two sets
of measurements. This type of study is done to find out the impact of
certain policies on an entity, such as the impact made by introducing new human
resource policies on an organisation. To study the impact of changes, data is collected
before and after the occurrence of events. The difference between the paired observations
of the two samples (datasets) is calculated to test whether the samples show a positive or
negative impact of the changes.

If a researcher wants to compare two related samples, then he/she can use the
following test statistic:

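$$t = \frac{\bar{D}}{S_D / \sqrt{n}}$$

Where,

D̄ = Mean difference between the two samples

SD = Standard deviation of the differences

n = Sample size (number of pairs)

SD can be calculated by using the following formula, in which D denotes the difference within each pair of observations:

$$S_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1}}$$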
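
The paired-sample statistic can be illustrated with a small script. The data below are hypothetical scores invented for this sketch (they do not come from the text), and the function name `paired_t_statistic` is likewise an assumption.

```python
import math


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for two related samples."""
    diffs = [a - b for a, b in zip(after, before)]   # D for each pair
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d_sq = sum(d * d for d in diffs)
    mean_d = sum_d / n                               # mean difference
    # Standard deviation of the differences, using the sum-of-squares form.
    sd = math.sqrt((sum_d_sq - sum_d ** 2 / n) / (n - 1))
    return mean_d / (sd / math.sqrt(n)), n - 1


# Hypothetical scores for five units measured before and after a change.
before = [10, 12, 9, 14, 11]
after = [12, 15, 9, 17, 12]
t_value, df = paired_t_statistic(before, after)
print(f"t = {t_value:.3f}, d.f. = {df}")
```

For these illustrative figures the differences are 2, 3, 0, 3 and 1, which gives a mean difference of 1.8 and a t-value of about 3.09 with 4 degrees of freedom. The result is then compared with the tabulated t-value in the usual way.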

10.5.4 STUDY OF EQUALITY OF VARIANCES OF TWO POPULATIONS


In this study, a researcher takes two samples from two populations and finds whether
there is a significant difference between the two populations by comparing their
variances. The sample variances are known to the researcher. The researcher uses
the F-test to study the equality of variances of the two populations. If the researcher
wants to compare the variances of two different populations, then he/she would use
the following test statistic:

$$F = \frac{s_1^2}{s_2^2}$$

Where s1² is the larger of the two variances.

The variances of the two samples can be calculated using the following formulas:

$$s_1^2 = \text{Variance of the first sample} = \frac{\sum (X_{1i} - \bar{X}_1)^2}{n_1 - 1}$$

$$s_2^2 = \text{Variance of the second sample} = \frac{\sum (X_{2i} - \bar{X}_2)^2}{n_2 - 1}$$

Where,

X1i = Value of observation of the first sample

X2i = Value of observation of the second sample

X̄1 = Mean of the first sample

X̄2 = Mean of the second sample

n1 = Size of the first sample

n2 = Size of the second sample

Degrees of freedom for the first sample, v1 = n1 – 1

Degrees of freedom for the second sample, v2 = n2 – 1

The F-value is calculated by dividing the larger variance by the smaller variance.

Let us learn to calculate the equality of variances from two different populations
with the help of an example.

Example 10: A researcher studied two samples of a type of wheat produced from the
north region and the south region of a state. He took two samples of wheat – type A
(north region) and type B (south region). The sample size of type A wheat is 10 cities
and the sample size of type B wheat is 13 cities. The variances for two samples with
respect to gluten content are 5 and 4, respectively. The researcher wants to find out
whether the two populations have the same variance. Test this at the 5% significance
level.

Solution: The null hypothesis and the alternative hypothesis are as follows:

H0: The variance of the two populations is the same.

H1: The variance of the two populations is different.

Or,

$$H_0 : \sigma_1^2 = \sigma_2^2$$

$$H_1 : \sigma_1^2 \neq \sigma_2^2$$

Where,

σ1² = Variance of the population from which sample A is drawn

σ2² = Variance of the population from which sample B is drawn

We are given that the variances of samples A and B are 5 and 4, respectively.

Therefore, the test of significance used is:

$$F = \frac{\sigma_1^2}{\sigma_2^2}$$

F = 5/4

F = 1.25

Degree of Freedom for sample A = n1 – 1

= 10 – 1= 9

Degree of Freedom for sample B = n2 – 1

= 13 – 1

= 12

The larger variance (that of sample A) is placed in the numerator; therefore, v1 = 9 and
v2 = 12. In this case, the critical F-values for the two-tailed test are:

$$F_{\alpha/2} = F_{(0.025,\,9,\,12)} = 3.44$$

$$F_{1-\alpha/2} = F_{(0.975,\,9,\,12)} = \frac{1}{F_{(0.025,\,12,\,9)}} = \frac{1}{3.87} = 0.258$$

The graphical representation of the preceding solution is shown in Figure 12:

Figure 12: Position of Calculated F-Value (H0 is accepted between the lower limit 0.258 and the upper limit 3.44 and rejected outside these limits; the calculated value 1.25 lies in the acceptance region)

In Figure 12, it can be observed that the calculated F-value lies in the acceptance
region; therefore, H0 is accepted and H1 is rejected. This implies that there is no
significant difference between the variances in gluten content of the two populations.
It can be interpreted from the calculated F-value that the difference between the sample
variances is statistically insignificant, that is, the variances of the two populations
can be treated as equal.
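
The variance ratio and its degrees of freedom can be worked out with a few lines of code. This is a sketch under one stated assumption: each degrees-of-freedom value is taken from the sample that supplies the corresponding variance, which is the usual convention. The function name `variance_ratio` is hypothetical, and the critical values still have to be looked up in an F-table.

```python
def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, v1, v2) with the larger sample variance in the numerator."""
    if var_a >= var_b:
        # Sample A supplies the numerator, so v1 comes from sample A.
        return var_a / var_b, n_a - 1, n_b - 1
    # Otherwise sample B supplies the numerator.
    return var_b / var_a, n_b - 1, n_a - 1


# Example 10: variances 5 (n = 10) and 4 (n = 13).
f_value, v1, v2 = variance_ratio(5, 10, 4, 13)
print(f"F = {f_value}, v1 = {v1}, v2 = {v2}")
```

For Example 10 this gives F = 1.25 with 9 and 12 degrees of freedom for the numerator and denominator, respectively.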

Self Assessment Questions


8. In the study of differences between two samples, researchers try to find out
the relationship between two samples from different populations in terms of
their ___________.
9. A researcher takes two samples from the same population before and after a
change and compares them to find the impact of the change. What test statistic
will he/she use?
a. z = (X̄1 – X̄2)/√(σ1²/n1 + σ2²/n2)
b. t = (X̄1 – X̄2)/{√[((n1 – 1)σ1² + (n2 – 1)σ2²)/(n1 + n2 – 2)] × √(1/n1 + 1/n2)}
c. z = (X̄1 – X̄2)/√[σp²(1/n1 + 1/n2)]
d. t = D̄/(SD/√n)
10. In the given formula t = D̄/(SD/√n), what does D̄ stand for?
a. Mean difference between two samples
b. Standard deviation of sample
c. Sample size
d. Sample density

10.6 EXPLORING ANOVA



ANOVA is used to study and explain the amount of variation in more than two
samples or data sets. In a data set, two main types of variations can occur. One type
of variation occurs due to chance, while the other type of variation occurs due to
specific reasons. These variations are studied separately in ANOVA to identify the
actual cause of variation and help the researcher take effective decisions. There are
two main types of ANOVA. Let us learn about these in detail.

10.6.1 ONE-WAY ANOVA


One-way Analysis of Variance (ANOVA) is used to test whether the means of two or
more independent (unrelated) groups are statistically significantly different. A table
of variation, called the ANOVA table, is created in this test. Its general form is shown
in Table 6:

Table 6: General Table of ANOVA

| Source of Variation | Sum of Squares (SS) | Degrees of Freedom (d.f.) | Mean Square (MS) | F-ratio |
|---|---|---|---|---|
| Between samples | SSB = n1(X̄1 – X̄)² + n2(X̄2 – X̄)² + n3(X̄3 – X̄)² + …… = Σnj(X̄j – X̄)² | (k – 1) | SS between/(k – 1) | MS between/MS within |
| Within samples | SSE = Σ(X1i – X̄1)² + Σ(X2i – X̄2)² + Σ(X3i – X̄3)² + …… = ΣΣ(X – X̄j)² | (n – k) | SS within/(n – k) | |
| Total | SST = ΣΣ(X – X̄)² | (n – 1) | | |

X = Individual observation,
X̄j = Sample mean of the jth treatment (or group),
X̄ = Overall sample mean,
k = The number of treatments or independent comparison groups
and
n = Total number of observations or total sample size

The process of carrying out one-way ANOVA is as follows:

1. Calculate the mean of each sample by dividing the sum of its observations by the
number of observations in that sample:
X̄j = (X1j + X2j + …… + Xnj)/nj
Where, nj = number of observations in the jth sample
Means of samples are termed as X̄1, X̄2, X̄3, ……
2. Calculate the mean of all sample means with the help of the following formula:
X̄ = (X̄1 + X̄2 + …… + X̄k)/k
Where, k = number of samples
3. Calculate the variation between samples, known as SS between, with the help
of the following formula:
SS between = n1(X̄1 – X̄)² + n2(X̄2 – X̄)² + n3(X̄3 – X̄)² + ……
Where, n1, n2, …… = sample size of sample 1, sample 2, and so on
SS between is the sum of the squared deviations of the sample means from the mean of
the sample means, each weighted by its sample size. It measures the variation between
samples.
4. Divide SS between by its d.f., k – 1, to get the mean square between (MS between).
MS between is the mean of the variation between samples. The following formula is
used to calculate MS between:
MS between = SS between/(k – 1)
5. Calculate the variation within samples, known as SS within, with the help of the
following formula:
SS within = ∑(X1i – X̄1)² + ∑(X2i – X̄2)² + ∑(X3i – X̄3)² + ……
Where, X1i, X2i, X3i = observed values in a sample
X̄1, X̄2, X̄3 = means of corresponding samples
SS within is the sum of the squared deviations of the observed values from the
corresponding sample means. It measures the variation within samples.
6. Divide SS within by its d.f., n – k, to get the mean square within (MS within).
MS within is the mean of the variation that occurs within samples. The following
formula is used to calculate MS within:
MS within = SS within/(n – k)
Where, n = total of the sample size of all the samples, that is, n1 + n2 + ……
7. Add the squared deviations of all observations to get the total variation in samples.
The following formula is used to calculate the total variation:
Total variation = SST = ∑∑(X – X̄)²
To calculate total SS, the mean of the sample means is first subtracted from each
individual observation. After that, these deviations are squared and summed up. The
d.f. used in this case is n – 1.
8. Calculate the F-ratio with the help of the following formula:
F-ratio = MS between/MS within
The calculated value of the F-ratio is tested against the tabulated value of the F-ratio
(determined at a specified level of significance). If the calculated value lies within
the limits of the acceptance region, the null hypothesis is accepted and the alternate
hypothesis is rejected.
Let us understand the application of one-way ANOVA with the help of the following
example.

Example 11: The researcher observed the sale of a product of a particular brand in
six big retail houses in each of three cities. He/She wants to determine whether the mean
sale is the same across cities. Use the data shown in Table 7 to carry out one-way
ANOVA:

Table 7: Sales Data of the Product in Cities A, B and C

Retail Houses City A (in Lakhs) City B (in Lakhs) City C (in Lakhs)
1 3 6 9
2 8 9 8
3 4 8 6
4 9 5 7
5 6 7 5
6 7 4 7

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: The mean sale in the three cities is the same

H1: The mean sale of at least one city is different from that of the other two cities

First, calculate the mean sale of the three cities separately, as follows:

Mean for City A (X̄1) = (3 + 8 + 4 + 9 + 6 + 7)/6 = 6.17

Mean for City B (X̄2) = (6 + 9 + 8 + 5 + 7 + 4)/6 = 6.5

Mean for City C (X̄3) = (9 + 8 + 6 + 7 + 5 + 7)/6 = 7

Mean of the sample means (X̄) = (6.17 + 6.5 + 7)/3

X̄ = 6.6

SS between = n1(X̄1 – X̄)² + n2(X̄2 – X̄)² + n3(X̄3 – X̄)²

= 6(6.17 – 6.6)² + 6(6.5 – 6.6)² + 6(7 – 6.6)²

= 1.11 + 0.06 + 0.96

= 2.1

SS within = ∑(X1i – X̄1)² + ∑(X2i – X̄2)² + ∑(X3i – X̄3)²

= [(3 – 6.17)² + (8 – 6.17)² + (4 – 6.17)² + (9 – 6.17)² + (6 – 6.17)² + (7 – 6.17)² + (6 – 6.5)²
+ (9 – 6.5)² + (8 – 6.5)² + (5 – 6.5)² + (7 – 6.5)² + (4 – 6.5)² + (9 – 7)² + (8 – 7)² + (6 – 7)² +
(7 – 7)² + (5 – 7)² + (7 – 7)²]

= (10.05 + 3.35 + 4.71 + 8.01 + 0.03 + 0.7 + 0.25 + 6.25 + 2.25 + 2.25 + 0.25 + 6.25 + 4 + 1
+ 1 + 0 + 4 + 0) = 54.34

Total variation (SST) = [(3 – 6.6)² + (8 – 6.6)² + (4 – 6.6)² + (9 – 6.6)² + (6 – 6.6)² + (7 – 6.6)² + (6
– 6.6)² + (9 – 6.6)² + (8 – 6.6)² + (5 – 6.6)² + (7 – 6.6)² + (4 – 6.6)² + (9 – 6.6)² + (8 – 6.6)² +
(6 – 6.6)² + (7 – 6.6)² + (5 – 6.6)² + (7 – 6.6)²] = 56.48

The ANOVA table created after completing the preceding calculation is shown in Table 8:

Table 8: Calculation of ANOVA

| Source of Variation | SS | d.f. | MS | F-ratio | 5% F limit |
|---|---|---|---|---|---|
| Between Sample | 2.1 | (3 – 1) = 2 | 2.1/2 = 1.05 | 1.05/3.62 = 0.29 | 3.68 |
| Within Sample | 54.34 | (18 – 3) = 15 | 54.34/15 = 3.62 | | |
| Total | 56.48 | (18 – 1) = 17 | | | |
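
The figures in Table 8 can be re-derived from the raw data of Table 7 with a short script. This is a minimal sketch of the definitional method described in the steps above; the function name `one_way_anova` and the dictionary keys are hypothetical.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples."""
    k = len(groups)                               # number of samples
    n = sum(len(g) for g in groups)               # total number of observations
    means = [sum(g) / len(g) for g in groups]     # mean of each sample
    # With equal sample sizes, as in Table 7, the overall mean equals the
    # mean of the sample means.
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


city_a = [3, 8, 4, 9, 6, 7]
city_b = [6, 9, 8, 5, 7, 4]
city_c = [9, 8, 6, 7, 5, 7]
result = one_way_anova([city_a, city_b, city_c])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the script does not round the means, it gives SS between of about 2.11, SS within of about 54.33 and total SS of about 56.44, which differ slightly from the hand-rounded figures. The F-ratio is still about 0.29, well below the 5% limit of 3.68.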

You can check the F-table for significance with the help of a one-tailed test. The
graphical representation of the preceding solution is shown in Figure 13:

Figure 13: Graph Showing the Position of the Calculated F-Value (the calculated value 0.29 lies in the acceptance region, below the critical value 3.68)

Figure 13 shows that the calculated F-value lies in the acceptance region; therefore,
H0 is accepted and H1 is rejected. The value implies that the mean sale of the product is
almost the same in the three cities.

You can also use another method of ANOVA, which is performed with the help of a
correction factor. It is also termed the shortcut method. It is more convenient in the
case of non-integer values. The steps involved in this method are as follows:
1. Calculate the correction factor with the help of the following formula:
Correction factor = (T)²/n
Where, T = Summation of all the observed values in the samples
n = Total number of observations
2. Compute SS between by first taking the sum of observed values in each sample.
Thereafter, square each sum and divide it by the size of the respective sample.
Then, add the resultant values and subtract the correction factor from this total to
obtain the variation between samples. The following formula is used to calculate the
variation:
SS between = ∑(Tj)²/nj – (T)²/n
Where, Tj = Sum of the observed values of the jth sample = T1, T2, ……
nj = Sample size of the jth sample = n1, n2, ……
n = Sum of the sample sizes of the different samples
3. Divide SS between by its d.f., k – 1, to get MS between. The following formula is
used to calculate MS between:
MS between = SS between/(k – 1)
4. Calculate and add the squares of all individual values in the samples. From this sum,
subtract ∑(Tj)²/nj. The value obtained is termed SS within, or the variation within the
samples. The following formula is used to calculate SS within:
SS within = ∑Xij² – ∑(Tj)²/nj
Where, Xij² = Squares of all individual values in samples
5. Divide SS within by its d.f., n – k, to get MS within. The following formula is used to
calculate MS within:
MS within = SS within/(n – k)
Where, n = Total of the sample size of all the samples, that is, n1 + n2 + ……
6. Calculate total variation by taking the sum of squares of all individual values
in the samples and subtracting the correction factor from it. The following formula
is used to calculate the variation:
Total SS = ∑Xij² – (T)²/n
7. Calculate the F-ratio with the help of the following formula:
F-ratio = MS between/MS within
The calculated value of the F-ratio is tested against the tabulated F-value that is
determined at a specified level of significance. If the calculated value of the F-ratio
lies within the limits of the acceptance region, the null hypothesis is accepted and the
alternate hypothesis is rejected.
Let us learn the application of one-way ANOVA with the help of the correction factor
using Example 12.

Example 12: Using the sales data of Example 11 (Table 7), first calculate the correction
factor and then the various components of the ANOVA table.

The correction factor can be calculated as follows: Correction factor = (T)²/n

Where, T = summation of all the observed values in the three cities collectively

n = sum of the sample sizes of the different samples

Correction factor = (118)²/18

= 773.6

SS between = ∑(Tj)²/nj – (T)²/n

= (37 × 37)/6 + (39 × 39)/6 + (42 × 42)/6 – 773.6

= 228.17 + 253.5 + 294 – 773.6

= 2.1

SS within = ∑Xij² – ∑(Tj)²/nj

= (3)² + (8)² + (4)² + (9)² + (6)² + (7)² + (6)² + (9)² + (8)² + (5)² + (7)² + (4)² + (9)² + (8)² + (6)²
+ (7)² + (5)² + (7)² – 775.67

= 830 – 775.67

= 54.33

Total SS = 830 – 773.6

= 56.4

Apart from small rounding differences, the values of total SS, SS between and SS within
are the same in both methods of calculating ANOVA. Therefore, the ANOVA table would
also be the same.
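
The shortcut method of Example 12 can be sketched in code as well. The function name `anova_shortcut` and the returned keys are hypothetical; the data are the sales figures of Table 7.

```python
def anova_shortcut(groups):
    """Sums of squares for one-way ANOVA using the correction factor."""
    n = sum(len(g) for g in groups)                  # total observations
    grand_total = sum(sum(g) for g in groups)        # T
    correction_factor = grand_total ** 2 / n         # (T)^2 / n
    # Sum over samples of (sample total)^2 / (sample size).
    totals_term = sum(sum(g) ** 2 / len(g) for g in groups)
    sum_of_squares = sum(x ** 2 for g in groups for x in g)
    return {
        "correction_factor": correction_factor,
        "ss_between": totals_term - correction_factor,
        "ss_within": sum_of_squares - totals_term,
        "ss_total": sum_of_squares - correction_factor,
    }


sales = [
    [3, 8, 4, 9, 6, 7],   # City A
    [6, 9, 8, 5, 7, 4],   # City B
    [9, 8, 6, 7, 5, 7],   # City C
]
shortcut = anova_shortcut(sales)
for name, value in shortcut.items():
    print(f"{name}: {value:.2f}")
```

The unrounded correction factor is about 773.56, and the three sums of squares come to about 2.11, 54.33 and 56.44. These match the definitional method, since the two approaches are algebraically equivalent.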

10.6.2 TWO-WAY ANOVA


Two-way ANOVA is used when a researcher wants to test the differences between
groups that have been split on the basis of two attributes or independent variables or
factors. In two-way ANOVA, there are three possible null hypotheses. These are as follows:
1. There is no difference in the means of the first factor.
2. There is no difference in the means of the second factor.
3. There is no interaction between first and second factors.
For null hypotheses 1 and 2, the alternative hypothesis is: The means of the factor
concerned are not equal.
For null hypothesis 3, the alternative hypothesis is: There is an interaction between
first factor and second factor.

The steps involved in performing two-way ANOVA are as follows:
1. Calculate the correction factor with the help of the following formula:
Correction factor = (T)²/n
Where, T = summation of all the observed values in the samples
n = total number of observations
2. Compute SS between rows. To do so, first take the sum of observed values in each
row. Thereafter, square each sum and divide it by the number of observations in the
respective row. Then, add the resultant values and subtract the correction factor from
this total to obtain the variation between rows. The following formula is used to
calculate SS between rows:
SS between rows = ∑(Tj)²/nj – (T)²/n
Where, Tj = Sum of the observed values of the jth row = T1, T2, ……
nj = Number of observations in the jth row = n1, n2, ……
n = Total number of observations
3. Divide SS between rows by its d.f., r – 1, to get MS between rows, which is the mean
of the variation between rows. The following formula is used to calculate MS between
rows:
MS between rows = SS between rows/(r – 1)
Where, r = number of rows
4. Calculate SS between columns. To do so, first take the sum of observed values in
each column. Thereafter, square each sum and divide it by the number of observations
in the respective column. Then, add the resultant values and subtract the correction
factor from this total to obtain the variation between columns. The following formula
is used to calculate SS between columns:
SS between columns = ∑(Tj)²/nj – (T)²/n
Where, Tj = Sum of the observed values of the jth column = T1, T2, ……
nj = Number of observations in the jth column = n1, n2, ……
5. Divide SS between columns by its d.f., c – 1, to get MS between columns, which is
the mean of the variation between columns. The following formula is used to calculate
MS between columns:
MS between columns = SS between columns/(c – 1)
Where, c = number of columns
6. Calculate total variation by first taking the sum of squares of all individual values
in the samples. After that, subtract the correction factor from this sum of squares.
The following formula is used to calculate the variation:
Total SS = ∑Xij² – (T)²/n
7. Compute residual variation by adding SS between rows and SS between columns and
subtracting this sum from total SS. The following formula is used to calculate residual
variation:
Residual variation = Total SS – (SS between rows + SS between columns)
Divide the residual variation by its d.f. to get MS residual.
8. Calculate the F-ratio for rows and for columns with the help of the following formula:
F-ratio = MS between rows (or MS between columns)/MS residual
The calculated value of the F-ratio is tested against the tabulated F-value that is
determined at a specified level of significance. If the calculated value of the F-ratio
lies within the limits of the acceptance region, the null hypothesis is accepted and the
alternate hypothesis is rejected.
Let us understand the application of two-way ANOVA with the help of an example.
Example 13: Three respondents have rated three small cars of different brands on a
five-point scale (5 being the highest) with respect to their features. The ratings and
features are provided in Table 9:

Table 9: Ratings Given by Customers to Different Brands of Cars with Respect to their Features

| Respondent | Car | Mileage | Durability | Maintenance Cost | Technology | Price |
|---|---|---|---|---|---|---|
| 1 | Zen | 3 | 2 | 4 | 3 | 5 |
| 1 | i10 | 4 | 4 | 4 | 5 | 4 |
| 1 | Alto | 4 | 3 | 5 | 2 | 4 |
| 2 | Zen | 2 | 4 | 3 | 1 | 4 |
| 2 | i10 | 4 | 5 | 3 | 4 | 4 |
| 2 | Alto | 3 | 1 | 2 | 5 | 3 |
| 3 | Zen | 4 | 5 | 3 | 2 | 4 |
| 3 | i10 | 3 | 2 | 4 | 5 | 3 |
| 3 | Alto | 4 | 5 | 4 | 5 | 5 |

The researcher wants to know the difference between the brands in terms of features.

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: There is no difference in the means of the five features of the cars.

H1: The means of the five features are not equal.


Correction factor = (T)²/n

= (162 × 162)/45

= 583.2

SS between columns (i.e., between features) = (31 × 31)/9 + (31 × 31)/9 + (32 × 32)/9 +
(32 × 32)/9 + (36 × 36)/9 – 583.2

= 106.8 + 106.8 + 113.8 + 113.8 + 144 – 583.2

= 585.2 – 583.2

= 2

SS between rows (i.e., between respondents) = (56 × 56)/15 + (48 × 48)/15 + (58 × 58)/15 – 583.2

= 209.1 + 153.6 + 224.3 – 583.2

= 587 – 583.2

= 3.8

Total SS = (3)² + (4)² + (4)² + (2)² + (4)² + (3)² + (4)² + (4)² + (5)² + (3)² + (5)² + (2)² + (5)² +
(4)² + (4)² + (2)² + (4)² + (3)² + (4)² + (5)² + (1)² + (3)² + (3)² + (2)² + (1)² + (4)² + (5)² + (4)² +
(4)² + (3)² + (4)² + (3)² + (4)² + (5)² + (2)² + (5)² + (3)² + (4)² + (4)² + (2)² + (5)² + (5)² + (4)² +
(3)² + (5)² – 583.2

= 638 – 583.2

= 54.8

SS residual = Total SS – (SS between columns + SS between rows)

= 54.8 – (2 + 3.8)

= 49
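
The sums of squares above can be checked against Table 9 with the following sketch. It assumes, from the totals 56, 48 and 58 used in the calculation, that each row total is the sum of one respondent's 15 ratings. The variable names are hypothetical, and the script stops at the sums of squares because the degrees of freedom are read from the ANOVA table.

```python
# Ratings from Table 9: respondent -> car -> [mileage, durability,
# maintenance cost, technology, price].
ratings = {
    1: {"Zen": [3, 2, 4, 3, 5], "i10": [4, 4, 4, 5, 4], "Alto": [4, 3, 5, 2, 4]},
    2: {"Zen": [2, 4, 3, 1, 4], "i10": [4, 5, 3, 4, 4], "Alto": [3, 1, 2, 5, 3]},
    3: {"Zen": [4, 5, 3, 2, 4], "i10": [3, 2, 4, 5, 3], "Alto": [4, 5, 4, 5, 5]},
}

records = [row for cars in ratings.values() for row in cars.values()]
values = [x for row in records for x in row]
n = len(values)
correction_factor = sum(values) ** 2 / n

# Column totals: one total per feature, each over 9 ratings.
column_totals = [sum(row[j] for row in records) for j in range(5)]
ss_columns = sum(t ** 2 / len(records) for t in column_totals) - correction_factor

# Row totals: one total per respondent, each over 15 ratings.
row_totals = [sum(sum(row) for row in cars.values()) for cars in ratings.values()]
ss_rows = sum(t ** 2 / 15 for t in row_totals) - correction_factor

ss_total = sum(x ** 2 for x in values) - correction_factor
ss_residual = ss_total - (ss_columns + ss_rows)

print(column_totals, row_totals)
print(f"{ss_columns:.2f} {ss_rows:.2f} {ss_total:.2f} {ss_residual:.2f}")
```

Without intermediate rounding the script gives about 1.91 between columns, 3.73 between rows, 54.8 in total and 49.16 for the residual. These are close to the rounded values 2, 3.8, 54.8 and 49 in the worked solution.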

The ANOVA table created after the preceding calculation is shown in Table 10:

Table 10: Calculation of ANOVA for the Three Brands of Cars

| Source of Variation | SS | d.f. | MS | F-ratio | 5% F limit |
|---|---|---|---|---|---|
| Between columns | 2 | (5 – 1) = 4 | 2/4 = 0.5 | 0.5/6.125 = 0.08 | F(4,8) = 3.84 |
| Between rows | 3.8 | (3 – 1) = 2 | 3.8/2 = 1.9 | 1.9/6.125 = 0.31 | F(2,8) = 4.46 |
| Residual | 49 | (5 – 1) × (3 – 1) = 8 | 49/8 = 6.125 | | |
| Total | 54.8 | (45 – 1) = 44 | | | |

You can check the F-value for significance with the help of a one-tailed test. The
graphical representation of the preceding solution for the F-value at v1 = 4 and v2 = 8 is
shown in Figure 14:

Figure 14: Rejecting the Calculated F-Value (the calculated value 0.08 lies in the acceptance region, below the critical value 3.84)

The graphical representation of the preceding solution for the F-value at v1 = 2 and v2 = 8 is
shown in Figure 15:

Figure 15: Accepting the Calculated F-Value (the calculated value 0.31 lies in the acceptance region, below the critical value 4.46)

Figures 14 and 15 show that the calculated F-values lie in the acceptance region.
Therefore, H0 is accepted and H1 is rejected. The value implies that the ratings of the
cars do not differ significantly across the features.

Self Assessment Questions


11. ___________ is a parametric test that is used to study more than two
samples or data sets.
12. One-way ANOVA determines whether all samples have the same type of
variations. (True/False)
13. ___________ ANOVA is used when you need to determine the relation
between two attributes.

Activity
Search the Internet for more information about parametric tests and prepare a note of
1,000 words on parametric tests.
10.7 SUMMARY
• A hypothesis can be tested by using a large number of tests and these tests are
connected with each other in one way or another.
• In parametric tests, researchers make some assumptions about some properties of
the parent population from which samples are drawn. In non-parametric tests, no
such assumptions are made.
• The different types of parametric tests are z-test, t-test, Chi-square test and F-test.
• In a one-sample test, you study the relationship between a sample and the
population.
• In a two-sample test, you study the relationship between two samples drawn from
two different or same populations.
• ANOVA is used to study and explain more than two samples or data sets. It helps
in explaining the amount of variation in these data sets.
10.8 KEY WORDS
• Distribution pattern: A probability distribution pattern that is similar to normal
distribution and is used for testing hypothesis.
• F-test: A test that is used to check whether the variances of two samples under
study differ significantly.
• Non-parametric tests: The tests which do not make assumptions about the
parameters of the population from which a sample is derived.
• Parametric tests: The tests which make assumptions about the parameters of the
population from which a sample is derived.
• t-distribution: A type of probability distribution that is appropriate for estimating
the mean of a normally distributed population where the sample size is small and
population variance is unknown.
• t-test: A test that is used to study the means of samples having a sample size below
30 and unknown population variance.
• z-test: A test that is used to study the means and proportion of samples whose size
is more than 30.

10.9 CASE STUDY: NATIONAL MOTORS INC.


National Motors Inc. is a manufacturer of motor scooters. As a part of its operating
policy, the executives want to determine whether the customers’ and dealers’
satisfaction depends on warranty cards or not. To test this, the company has
withdrawn its warranty cards from the market. The marketing research department
of the company develops a questionnaire in a summated scale form to collect data
about customer satisfaction with and without the warranty card. The department
mails the questionnaire to a random sample of customers after they have received
warranty cards. The same questionnaire is then sent to the same set of customers
after their warranty cards have expired. The company also sends the questionnaire to
dealers who have provided their customers with warranty cards.

The customers and dealers have provided marks out of 100 for their satisfaction level.
The data collected by the marketing research department for customer satisfaction
and dealer satisfaction is given in Table A as follows:

Table A: Data Collected for Customer and Dealer Satisfaction

No. of          Customers’ Satisfaction     Customers’ Satisfaction     Dealers
Observations    When They Have              When They Do not Have
                Warranty Cards              Warranty Cards

1               74                          43                          92
2               81                          23                          42
3               35                          88                          54
4               59                          55                          59
5               90                          67                          83
6               33                          53                          30
7               82                          85                          34
8               68                          70                          54
9               56                          30                          39
10              46                          75                          65

After conducting the research, the company comes to the conclusion that warranty
cards do not have much impact on the customers’ and dealers’ satisfaction. A reason
behind this can be that the same type of warranty is given by the competitors of
National Motors Inc.

QUESTIONS
1. Find out the effect of warranty cards on the satisfaction of customers with the help
of data provided in the case study. Use 5% as the level of significance to test the
hypothesis.
(Hint: H0: The customer satisfaction before and after returning the warranty card
is the same.)
2. What should National Motors do to overcome this problem?
(Hint: The company can conduct a survey regarding the available warranty cards
in the entire motor scooters industry.)

3. Why did the marketing research department of National Motors Inc. develop a
questionnaire?
(Hint: In order to determine whether the customers’ and dealers’ satisfaction
depends on warranty cards or not.)
4. What was the base of providing marks to the questionnaire?
(Hint: Satisfaction level)
5. Was the questionnaire mailed to all customers and dealers?
(Hint: The department mailed the questionnaire to a random sample of customers
and to dealers who have provided their customers with warranty cards.)

10.10 EXERCISE
1. What are the two types of hypotheses tests?
2. Explain the different types of parametric tests.
3. Explore the following cases of one-sample tests:
a. Normal and infinite population, large sample size, known population variance
and two-tailed test or one-tailed test.
b. Normal and infinite population, small sample size, unknown population
variance and two-tailed test or one-tailed test.
4. Explain any two-sample tests along with examples.
5. Explain the concept of ANOVA in detail.

10.11 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                              Q. No.    Answer

Types of Hypothesis Testing        1.        b. Parametric tests
Parametric Tests                   2.        d. z-test
                                   3.        t-test
                                   4.        d. F-test
                                   5.        one observation, number of observations
One-Sample Test – Different        6.        means; variance; proportion
Situations in Which One-Sample
Test is Used
                                   7.        b. Population mean
Two-Sample Tests                   8.        means
                                   9.        d. t = D/(SD/√n)
                                   10.       a. Mean difference between two samples
Exploring ANOVA                    11.       ANOVA
                                   12.       True
                                   13.       Two-way

10.12 SUGGESTED BOOKS AND E-REFERENCES


SUGGESTED BOOKS
• Biddle, J., & Emmett, R. Research in the History of Economic Thought and Methodology.
• Panneerselvam, R. (2014). Research Methodology. Delhi: PHI Learning.

E-REFERENCES
• IMPORTANT PARAMETRIC TESTS in Research Methodology Tutorial 09 April
2020 - Learn IMPORTANT PARAMETRIC TESTS in Research Methodology
Tutorial (11529) | Wisdom Jobs India. (2020). Retrieved 9 April 2020, from
https://www.wisdomjobs.com/e-university/research-methodology-tutorial-355/important-parametric-tests-11529.html
• ANOVA Test: Definition, Types, Examples - Statistics How To. (2020). Retrieved
9 April 2020, from
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/anova/

CHAPTER 11

Non-Parametric Tests

Table of Contents
11.1 Introduction
11.2 Concept of Non-Parametric Tests
Self Assessment Questions


11.3 Sign Test
11.3.1 One Sample Sign Test
11.3.2 Two Sample Sign Test
11.3.3 Wilcoxon Matched Pairs Test/Signed Rank Test
Self Assessment Questions
11.4 Rank Correlation
Self Assessment Questions
11.5 Rank Sum Test
11.5.1 Mann-Whitney Test or U Test
11.5.2 Kruskal-Wallis Test
Self Assessment Questions
11.6 Chi-Square Test
11.6.1 Chi-Square Test for Goodness of Fit
11.6.2 Chi-Square Test for Independence
Self Assessment Questions
11.7 Summary
11.8 Key Words
11.9 Case Study
11.10 Exercise
11.11 Answers for Self Assessment Questions
11.12 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:


• Explain the concept of non-parametric test
• Discuss the sign tests
• Explain the concept of rank correlation
• Discuss the rank sum tests
• Elaborate the Wilcoxon matched pairs test
• Explain the concept of Chi-square test

11.1 INTRODUCTION
In the previous chapter on Parametric Tests, you learned about different types
of parametric tests used to check the validity of a hypothesis. You also studied
that parametric tests can be applied only if you know the population type and
population parameters, such as mean and variance. If this information is unavailable,
you cannot use parametric tests. In such a situation, you need non-parametric tests
to check the validity of a hypothesis and draw inferences.

Non-parametric tests are used when you do not have adequate information about
population type and parameters. These tests are widely used to study data given in
the form of ranks. Examples of non-parametric tests are sign tests, rank correlation,
rank sum tests, the Wilcoxon matched pairs test and the chi-square test. The selection
of the test depends on problem type, sample size and data. For example, rank
correlation is used to establish correlation between two ranked data sets. Researchers
should exercise caution while selecting a non-parametric test to ensure accurate and
precise results.

This chapter covers non-parametric tests and their types. It provides information
about one and two sample sign tests. It also elaborates on rank correlation and rank
sum tests, including the Mann-Whitney and Kruskal-Wallis tests. In addition, it
explains the Wilcoxon matched pairs test/signed rank test. Finally, the chapter also
sheds light on chi-square test for goodness of fit and chi-square test for independence.

11.2 CONCEPT OF NON-PARAMETRIC TESTS


As discussed earlier, non-parametric tests are not based on assumptions
about a population and its parameters. A researcher can use these tests without
taking into consideration population distribution and sample type. Non-parametric
tests are also known as distribution-free tests because they do not assume that
the given data follows a specific distribution. These tests are mainly used when
the test model does not specify any stringent conditions regarding the parameters
of the population from which a sample is drawn.

Let us understand the reason behind choosing a non-parametric test over a parametric
test with the help of a simple example. Suppose a researcher wants to find out the
preference of customers about the different brands of toothpaste available in the
market. He/she would ask customers to rank different brands according to their
preferences. The data collected would be in the rank form, on which parametric tests
cannot be performed. This is because a parametric test requires numeric values, such
as mean and variance, to test a hypothesis. Therefore, in this case, the researcher
would use a non-parametric test.

The different types of non-parametric tests are shown in Figure 1:

[Diagram: Non-parametric Tests, branching into Sign Test, Rank Correlation, Rank Sum Test, Matched Pairs Test and Chi-Square Test]

Figure 1: Non-Parametric Tests

Self Assessment Questions

1. Non-parametric tests are also known as __________ tests.
2. A researcher can use non-parametric tests without taking into consideration
population distribution and sample type. (True/False)

11.3 SIGN TEST


Sign test is considered one of the easiest non-parametric tests because it takes into
account only the plus and minus signs of observations in a sample. It does not
consider the magnitude of observations while analysing the data present in a sample.
Sign test can be used in place of some parametric tests, such as one-sample t-test and
paired t-test. It uses binomial distribution to test the validity of a hypothesis. There
are two types of sign tests, which are shown in Figure 2:

[Diagram: Sign Test, branching into One Sample Sign Test and Two Sample Sign Test]

Figure 2: Types of Sign Tests

Let us now discuss the types of sign tests in detail.

11.3.1 ONE SAMPLE SIGN TEST

One sample sign test is applied on a sample where the researcher does not assume
that the data is normally distributed. Under the null hypothesis, the probability of
getting a sample value less than the median value is equal to the probability of getting
a value greater than it. This implies that the proportion of success (p) and failure (q)
is equal, which means that p = q = 0.50. Therefore, it is also called the binomial sign
test. In one sample sign test, the researcher assigns positive (+) and negative (–) signs
to the sample values to test the hypothesis.

Here, the researcher usually tests the null hypothesis: M = M0 against an appropriate
alternate hypothesis.

Sign test is a hypothesis test for population median and not for population mean.

Here, three types of tests are possible as shown in Table 1:

Table 1: Three Types of Sign Tests for Population Median

Null Hypothesis     Alternate Hypothesis     Type of Test

H0: M = M0          H1: M > M0               Right-tailed test
H0: M = M0          H1: M < M0               Left-tailed test
H0: M = M0          H1: M ≠ M0               Two-tailed test

In any given sign test, each data value or observation is converted into a plus sign or
a minus sign. The allocation of + and – signs is done with reference to the hypothesised
median value (M0). Values that are greater than the median value are replaced by a plus
sign and the values that are less than the median value are replaced by a minus sign. The
values that are equal to the given median value are discarded or not considered. After
assigning the signs, the researcher may test the null hypothesis that the probability
of getting a plus sign (and, likewise, a minus sign) is 0.5.

The sign test can be performed by using two methods as follows:


• When the sample size is small, the test is carried out by calculating the binomial
probabilities using the binomial probabilities table.
• When np ≥ 5 and nq ≥ 5, normal distribution can be used as an approximation of
binomial distribution.
When n is large and when p is sufficiently large (i.e., p > 0.10), normal distribution
is used as an approximation of binomial distribution.

The mean and standard deviation of normal distribution are given as follows:

Mean µ = np

SD = σ = √(npq)

Let us understand the application of one sample sign test with the help of an example.

Example 1: The scores of 15 students in a class test of 20 marks are as follows: 19, 17,
16, 18, 17, 19, 20, 16, 16, 18, 11, 13, 14, 09 and 13.

Test the hypothesis that the median score of the students is equal to 15 against
the hypothesis that the median score is greater than 15. Use 5% level of
significance.

Solution: Null and alternate hypotheses are as follows:

H0: Median score of 15 students is 15

H1: Median score of 15 students is greater than 15

Or

H0: p = 0.5 H1: p > 0.5

The researcher assigns minus (–) sign to values of less than 15 and plus (+) sign to
values of greater than 15.

Observation 19 17 16 18 17 19 20 16 16 18 11 13 14 09 13
Sign + + + + + + + + + + – – – – –

The following result is obtained:

No. of + signs = 10

No. of – signs = 5

Number of observations = 15

It must be remembered that the test statistic is the larger of the number of + signs and
the number of – signs.

Now, we need to check whether 10 plus signs observed in the given 15 trials support
the null hypothesis that p = 0.5 or the alternate hypothesis that p > 0.5.

Now, we use the binomial probability table to find the probability of 10 or more successes
as follows:

P(X ≥ 10 | n = 15, p = 0.5) = P(X = 10) + P(X = 11) + …… + P(X = 15)

= 0.092 + 0.042 + 0.014 + 0.003 + 0.000 + 0.000

= 0.151

Since the value of one-tailed p is greater than α = 0.05, null hypothesis is accepted.
Note that here np = 15 (0.5) = 7.5.

Therefore, we can also use normal approximation to binomial distribution.

Z-statistic is calculated as:

Z = (X − np)/√(npq)

= (10 − 7.5)/√(15/4) = 2.5/1.93 = 1.295

The value of Z at 0.05 level of significance is +1.645. Since Z = 1.295 lies in the
acceptance region, the null hypothesis is accepted.

[Graph: normal curve with the acceptance region to the left of the critical value +1.645 and the rejection region to its right; the calculated value +1.295 lies in the acceptance region]

Figure 3: Graph Showing the Position of the Calculated Binomial Value

Figure 3 shows that the calculated value lies in the acceptance region.
Therefore, H0 is accepted. This implies that the median score of the students can be
taken as 15.
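
The calculation in Example 1 can also be checked with a few lines of code. The following is a minimal Python sketch and is not part of the original solution; the function name sign_test_upper_tail is hypothetical, and the scores are taken from the observation row of the sign table in the solution. It counts the signs, sums the exact binomial probabilities for p = 0.5 and computes the Z-statistic without a continuity correction.

```python
from math import comb, sqrt

def sign_test_upper_tail(values, m0):
    """One sample sign test of H0: median = m0 against H1: median > m0."""
    plus = sum(1 for v in values if v > m0)    # observations above m0
    minus = sum(1 for v in values if v < m0)   # observations below m0
    n = plus + minus                           # values equal to m0 are discarded
    # Exact binomial tail probability P(X >= plus | n, p = 0.5)
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    # Normal approximation: Z = (X - np) / sqrt(npq) with p = q = 0.5
    z = (plus - n * 0.5) / sqrt(n * 0.25)
    return plus, minus, p_value, z

# Scores as listed in the sign table of Example 1
scores = [19, 17, 16, 18, 17, 19, 20, 16, 16, 18, 11, 13, 14, 9, 13]
plus, minus, p_value, z = sign_test_upper_tail(scores, 15)
print(plus, minus, round(p_value, 3), round(z, 3))
```

The exact tail probability agrees with the table-based value of 0.151. The Z-statistic comes out at about 1.29; the value 1.295 in the worked solution arises from rounding √3.75 to 1.93.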

11.3.2 TWO SAMPLE SIGN TEST


In two sample sign test, the researcher tests two related samples. This test is the
non-parametric counterpart of the paired t-test. Researchers use this sign test when
data is given as pairs. In this test, the researcher assigns positive (+) and negative (–)
signs to values. These signs are allocated on the basis of the difference between the
values of first sample and second sample. If the difference is positive, the difference
value gets a plus (+) sign and if the difference is negative, the difference value gets a
minus (–) sign. If the values of two samples are equal, these values are discarded.

Thereafter, the researcher calculates the total plus and minus signs and divides
the number by the sample size. Then, standard error is calculated and limits are
determined. Finally, the hypothesis is tested against the calculated value of limit.

Let us understand the application of the two sample sign test with the help of an
example.

Example 2: Sales target achieved by two employees in a year is shown in Table 2:

Table 2: Data Showing Sales Done by Employees

Month    Employee 1 (in Lakhs)    Employee 2 (in Lakhs)

1        2                        1.5
2        2                        2.5
3        4                        3
4        1                        1
5        1                        1.5
6        2.5                      2.75
7        3                        2.5
8        3.5                      1
9        4                        3
10       1.5                      1.4
11       2                        4
12       3                        3

The researcher wants to find out whether the first employee is the better performer.
Use 5% as the level of significance.

Solution: Null hypothesis (H0) and alternate hypothesis (H1) are as follows:

H0: p = 1/2

H1: p > 1/2

Or

H0: Sales done by two employees is the same.

H1: Sales done by the first employee is more than that of the second employee.

The researcher assigns the plus (+) and minus (–) signs to the data shown in Table 3:

Table 3: Signs Allocated to the Data

Month    Employee 1 (in Lakhs) X    Employee 2 (in Lakhs) Y    Sign (X–Y)

1        2                          1.5                        +
2        2                          2.5                        –
3        4                          3                          +
4        1                          1                          0
5        1                          1.5                        –
6        2.5                        2.75                       –
7        3                          2.5                        +
8        3.5                        1                          +
9        4                          3                          +
10       1.5                        1.4                        +
11       2                          4                          –
12       3                          3                          0

According to Table 3, the number of + signs = 6

No. of – signs = 4

Number of observations = 10 (2 of the observations are 0; therefore, the researcher
does not consider them)

Now, we use binomial probability table to find the probability of 6 or more successes
as follows:

P(X ≥ 6 | n = 10, p = 0.5) = P(X = 6) + P(X = 7) + …… + P(X = 10)

= 0.205 + 0.117 + 0.044 + 0.010 + 0.001 = 0.377

Since this probability is greater than α = 0.05, the null hypothesis is accepted.

Note that here np = 10 (0.5) = 5

Therefore, we can also use normal approximation to binomial distribution. Z-statistic
is calculated as:

Z = (X − np)/√(npq)

Z = (6 − 5)/√(10/4)

Z = 1/1.581 = 0.632

The value of Z at 0.05 level of significance is +1.645. Since Z = 0.632 lies in
the acceptance region, the null hypothesis is accepted. This implies that the median sale
done by two employees is equal.
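
The sign allocation and the binomial probability in Example 2 can be reproduced as follows. This is a minimal Python sketch under the same assumptions as the worked solution (H0: p = 0.5, H1: p > 0.5); the function name paired_sign_test is hypothetical and is introduced only for illustration.

```python
from math import comb

def paired_sign_test(first, second):
    """Two sample (paired) sign test of H0: p = 0.5 against H1: p > 0.5."""
    # Keep only the pairs that differ; equal pairs are discarded
    diffs = [a - b for a, b in zip(first, second) if a != b]
    plus = sum(1 for d in diffs if d > 0)     # first value larger
    minus = len(diffs) - plus                 # second value larger
    n = len(diffs)
    # Exact binomial tail probability P(X >= plus | n, p = 0.5)
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, minus, n, p_value

# Monthly sales (in lakhs) from Table 2
employee_1 = [2, 2, 4, 1, 1, 2.5, 3, 3.5, 4, 1.5, 2, 3]
employee_2 = [1.5, 2.5, 3, 1, 1.5, 2.75, 2.5, 1, 3, 1.4, 4, 3]
plus, minus, n, p_value = paired_sign_test(employee_1, employee_2)
print(plus, minus, n, round(p_value, 3))
```

The sketch gives 6 plus signs, 4 minus signs and 10 usable pairs. The exact tail probability is 386/1024, or about 0.377, which includes the small term P(X = 10) ≈ 0.001 and is well above α = 0.05.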
11.3.3 WILCOXON MATCHED PAIRS TEST/SIGNED RANK TEST
The Wilcoxon matched pairs test/Signed rank test is a combination of sign and rank
tests and is used to compare a paired sample. It is used in place of paired t-test
when the distribution is not normal. The Wilcoxon matched pairs test is used when
the researcher wants to determine the direction and magnitude of difference in the
matched values. Steps to perform the test are mentioned below:

1. Determine the difference (di) between each pair of observed values.
2. Rank the absolute differences |di| in the ascending order (lowest to highest). If the
difference between two values is zero, the researcher needs to ignore those values.
3. Segregate the ranks according to the positive and negative signs of di values.
4. Add the ranks with negative and positive signs separately.
5. Determine the T-value by comparing the sums of ranks with negative sign and
positive sign. T is the smaller of the two sums: if the sum of ranks with positive sign
is more than the sum of ranks with a negative sign, the T-value would be equal to
the sum of ranks of negative sign, and vice versa.
Mean is calculated using the following formula:

Mean, µT = n (n + 1)/4

SD is calculated using the following formula:

Standard deviation, σT = √[n (n + 1) (2n + 1)/24]

Where, n = number of observations – number of ignored observations

The test statistic z can be calculated as follows:

Z = (T − µT)/σT

If the calculated z-value lies within the limits of acceptance region, the null hypothesis
is accepted and the alternate hypothesis is rejected.

Let us understand the application of the Wilcoxon matched pairs test/signed rank
test with the help an example.

Example 3: Two brands are ranked on a five-point scale (five being the highest).
The researcher wants to determine the difference between the satisfaction levels of
customers for two brands. The data for Brand A and Brand B and their difference is
provided in Table 4:

Table 4: Rating Given by Customers to Brand A and Brand B

No. of Respondents    Brand A    Brand B    Difference (di)

1                     2          2          0
2                     3          4          –1
3                     4          3          1
4                     1          2          –1
5                     2          5          –3
6                     5          4          1
7                     4          2          2
8                     3          4          –1
9                     4          3          1
10                    5          4          1
11                    2          4          –2
12                    4          3          1

Use the Wilcoxon matched pairs test with 5% level of significance.

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: Customer satisfaction for the two brands is same.

H1: Customer satisfaction for the two brands is different.

The researcher calculates the T statistic, as shown in Table 5:

Table 5: Calculation of Wilcoxon Matched Pairs Test

No. of         Brand A    Brand B    Difference    |di|    Sign    Rank |di|    Rank with Signs
Respondents                          (di)

1              2          2          0             0       0       –            –
2              3          4          –1            1       –       4.5          – 4.5
3              4          3          1             1       +       4.5          + 4.5
4              1          2          –1            1       –       4.5          – 4.5
5              2          5          –3            3       –       11           – 11
6              5          4          1             1       +       4.5          + 4.5
7              4          2          2             2       +       9.5          + 9.5
8              3          4          –1            1       –       4.5          – 4.5
9              4          3          1             1       +       4.5          + 4.5
10             5          4          1             1       +       4.5          + 4.5
11             2          4          –2            2       –       9.5          – 9.5
12             4          3          1             1       +       4.5          + 4.5

Total: Sum of Positive Ranks (W+) = 32
       Sum of Negative Ranks (W–) = 34
       T (smaller of W+ and W–) = 32

In this case, the researcher has neglected the first observation, as its difference is 0.
The ranking of differences is done from a smaller to a larger value. If there is a tie
between the ranks, the mean of ranks is taken and assigned to identical values. The T
statistic is equal to 32, which is the smaller of the sums of ranks with positive signs and
negative signs. The critical Z-value, with 5% level of significance and two-tailed test,
is ± 1.96.

Value of z-statistic is calculated as:

Z = (T − µT)/σT

Z = {32 − [11(11 + 1)/4]}/√{11(11 + 1)[2(11) + 1]/24}

= (32 − 33)/√(11 × 12 × 23/24) = −1/11.25 = −0.088

The graphical representation of the preceding solution is shown in Figure 4:

[Graph: normal curve with the acceptance region between –1.96 and +1.96 and a rejection region in each tail; the calculated value –0.088 lies in the acceptance region]

Figure 4: Position of the Calculated Value

Figure 4 shows that the calculated Z-value lies in the acceptance region; therefore,
H0 is accepted. This implies that customer satisfaction for two brands is the same.
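
The steps of the Wilcoxon matched pairs test can be sketched in Python as shown below. This is an illustrative sketch, not the book's own procedure in code form; the helper names mean_ranks and wilcoxon_signed_rank are hypothetical. It uses the normal approximation with µT and σT as defined above and applies no correction for ties.

```python
from math import sqrt

def mean_ranks(values):
    """Rank from lowest to highest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    # First position of v (1-based) plus half the number of additional ties
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def wilcoxon_signed_rank(first, second):
    # Step 1: differences, ignoring pairs whose difference is zero
    diffs = [a - b for a, b in zip(first, second) if a != b]
    # Step 2: rank the absolute differences in ascending order
    ranks = mean_ranks([abs(d) for d in diffs])
    # Steps 3 and 4: add the ranks of positive and negative differences
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Step 5: T is the smaller of the two sums
    t = min(w_plus, w_minus)
    n = len(diffs)
    mu_t = n * (n + 1) / 4
    sigma_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return w_plus, w_minus, t, (t - mu_t) / sigma_t

# Ratings from Table 4
brand_a = [2, 3, 4, 1, 2, 5, 4, 3, 4, 5, 2, 4]
brand_b = [2, 4, 3, 2, 5, 4, 2, 4, 3, 4, 4, 3]
w_plus, w_minus, t, z = wilcoxon_signed_rank(brand_a, brand_b)
print(w_plus, w_minus, t, round(z, 3))
```

For the data of Example 3, the sketch reproduces W+ = 32, W– = 34 and T = 32, with a Z-value of about –0.09, which lies well inside ±1.96.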

Self Assessment Questions


3. _________________ is considered as one of the easiest non-parametric
tests because it takes into account only the plus (+) and minus (–) signs of
observations in a sample.
4. In one sample sign test, the probability of getting a sample value less than or
greater than mean is equal. (True/False)
5. Which of the following tests is also known as paired sign test?
a. Sign test b. One sample sign test
c. Two sample sign test d. None of these

11.4 RANK CORRELATION


Rank correlation, also known as Spearman’s rank correlation coefficient, is used to
establish correlation between two data sets that can be ranked. Steps to calculate
rank correlation are mentioned below.

1. Assign ranks to the observations in each of the two data sets in the descending order.
If two or more values in a data set are identical, calculate mean rank and allocate
it to all identical values. For example, if third, fourth and fifth ranks have the same
value, take out their mean (3 + 4 + 5)/3 = 4 and allocate it as rank to those values.
2. Calculate the difference between ranks by subtracting the rank in the second data
set from the rank in the first data set. The difference is denoted as di.
3. Calculate the square of di.
4. Find the sum of squares of di.
5. Calculate Spearman’s rank correlation coefficient by using the following formula:

ρ = 1 − 6∑di²/[n(n² − 1)]

Where, di = Difference between ranks
n = Sample size
The value of Spearman’s rank correlation coefficient lies between +1 and –1,
where +1 indicates perfect positive correlation and –1 indicates perfect negative
correlation. The values that lie between +1 and –1 show different degrees of
correlation. The researcher can assess the value of rank correlation coefficient by
performing a hypothesis test. If the sample size is less than 30, the researcher needs to
use the tabulated value of Spearman’s rank correlation coefficient to test the value of
coefficient. Suppose the sample size (n) = 15 and ρ = 0.6364, which shows a reasonably
high degree of correlation between two data sets. The researcher wants to check the
value of ρ (rank correlation coefficient) to judge whether the correlation is actually
present or not. He/She forms a null hypothesis that there is no correlation between
the two data sets and tests it at 5% level of significance using two-tailed test. The
researcher checks the critical value for ρ in the table showing values of Spearman’s
rank correlation coefficient. The critical value of ρ is –0.5179 (lower limit) and +0.5179
(upper limit). The given value of ρ = 0.6364 is outside the acceptance region; therefore,
the researcher rejects the null hypothesis and concludes that there is a correlation
between two data sets.

Let us understand the application of Spearman’s rank correlation coefficient with
the help of an example.

Example 4: A researcher wants to test correlation between the IQ level and hours
spent on reading a newspaper per week. The data is provided in Table 6:

Table 6: Data Showing IQ Level and Hours Spent on Reading Newspaper

No. of Observations    IQ (X)    Hours Spent on Reading
                                 Newspaper (Y) Per Week

1                      105       6
2                      91        7
3                      99        24
4                      100       56
5                      99        29
6                      103       30
7                      97        20
8                      113       12
9                      112       10
10                     110       17
11                     94        16
12                     110       8
13                     112       9

Use rank correlation to find out correlation between the IQ level and hours spent on
reading a newspaper, with 5% level of significance.

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: There is no correlation between the IQ level and hours spent on reading the
newspaper every week.

H1: There is correlation between the IQ level and hours spent on reading a newspaper
every week.

Or

H0: ρ = 0

H1: ρ ≠ 0

Table 7 shows the calculation of rank correlation test:

Table 7: Calculation of Rank Correlation Test

No. of          IQ (X)    Hours Spent      Rank X    Rank Y    Difference            di²
Observations              in Reading                           (di = Rank X –
                          Newspaper (Y)                        Rank Y)

1               105       6                6         13        –7                    49
2               91        7                13        12        1                     1
3               99        24               9.5       4         5.5                   30.25
4               100       56               8         1         7                     49
5               99        29               9.5       3         6.5                   42.25
6               103       30               7         2         5                     25
7               97        20               11        5         6                     36
8               113       12               1         8         –7                    49
9               112       10               2.5       9         –6.5                  42.25
10              110       17               4.5       6         –1.5                  2.25
11              94        16               12        7         5                     25
12              110       8                4.5       11        –6.5                  42.25
13              112       9                2.5       10        –7.5                  56.25
Total                                                                                ∑di² = 449.5

The calculation of rank correlation is shown below:

ρ = 1 − 6∑di²/[n(n² − 1)]

ρ = 1 – {6 × 449.5/[13(13 × 13 – 1)]}

ρ = 1 – [2697/2184]

ρ = 1 – 1.235 = –0.235

The critical rank correlation value at 5% level of significance for n = 13
and two-tailed test is ± 0.484. The researcher can check the rank correlation value
for significance with the help of two-tailed test.

The calculated rank correlation value (–0.235) lies in the acceptance region; therefore,
H0 is accepted. This implies that there is no significant correlation between the IQ
level and number of hours spent on reading newspaper in a week. It can be interpreted
that reading a newspaper cannot increase your IQ level unless you analyse news.
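
The ranking and the coefficient in Example 4 can be reproduced with the following minimal Python sketch. The helper names mean_ranks_desc and spearman_rho are hypothetical. The sketch applies the formula ρ = 1 − 6∑di²/[n(n² − 1)] directly, as the worked solution does; note that this formula is exact only when there are no tied ranks and is an approximation otherwise.

```python
def mean_ranks_desc(values):
    """Rank 1 goes to the largest value; tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    # First position of v (1-based) plus half the number of additional ties
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def spearman_rho(x, y):
    rank_x = mean_ranks_desc(x)
    rank_y = mean_ranks_desc(y)
    # Sum of squared rank differences
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    n = len(x)
    rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return sum_d2, rho

# Data from Table 6
iq = [105, 91, 99, 100, 99, 103, 97, 113, 112, 110, 94, 110, 112]
hours = [6, 7, 24, 56, 29, 30, 20, 12, 10, 17, 16, 8, 9]
sum_d2, rho = spearman_rho(iq, hours)
print(sum_d2, round(rho, 3))
```

For the data of Table 6, the sketch gives ∑di² = 449.5 and ρ ≈ –0.235, in agreement with the calculation above.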

Self Assessment Questions


6. _________________, also known as Spearman’s rank correlation coefficient, is
used to establish correlation between two data sets that can be ranked.

11.5 RANK SUM TEST


Rank sum test is used to analyse ordinal data (data in the rank form) and calculate
the value of rank sum. First, observations of different samples are arranged in the
ascending order of value. Thereafter, these observations are ranked and the sum of
ranked observations is calculated. Finally, the sum is tested against the specified test
statistic value to test the hypothesis. There are two types of rank sum tests, as shown
in Figure 5:

[Diagram: Rank Sum Test, branching into Mann-Whitney Test and Kruskal-Wallis Test]

Figure 5: Types of Rank Sum Tests

Let us learn about these in detail.

11.5.1 MANN-WHITNEY TEST OR U TEST


Mann-Whitney test (or U test) is used to determine whether two independent samples
are drawn from the same population. The test is applied in general conditions. The
only requirement of the test is that the population should be continuous. However,
failure to fulfil this requirement does not have a huge impact on the result. In the
Mann-Whitney test, first the two samples are merged and arranged in increasing or
decreasing order. After that, the data in the merged sample is ranked from lowest to
highest. After rank allocation, the ranks are classified as R1 for sample 1 and R2 for
sample 2. After that, the total of ranks in R1 and R2 is determined. Finally, the U test
is applied in the following manner:

U1 = R1 − n1(n1 + 1)/2

U2 = R2 − n2(n2 + 1)/2

Where, U = Smaller of U1 and U2

n1 = Sample size of sample 1

n2 = Sample size of sample 2

R1 = Sum of the ranks of sample 1

R2 = Sum of the ranks of sample 2

The mean and SD are determined to calculate the limits of acceptance region. The
mean could be calculated with the help of the following formula:

µU = n1 n2/2

The formula for standard deviation is as follows:

σU = √[n1 n2 (n1 + n2 + 1)/12]

If the value of U test lies within the limits of the acceptance region, the null hypothesis
is accepted. However, if the calculated U value lies outside the limits of the acceptance
region, the null hypothesis is rejected and the alternate hypothesis is accepted. Let
us take an example to understand the application of the Mann-Whitney test.

Example 5: The production of Product A and Product B in a year is shown in Table 8:

Table 8: Production of Product A and Product B

No. of Respondents    Product A    Product B

1                     40           28
2                     35           30
3                     20           35
4                     36           40
5                     22           45
6                     26           21
7                     45           26
8                     50           28
9                     44           30
10                    47           44
11                    48           50
12                    25           49

The researcher wants to find out whether the two products are from the same production
house. Use the Mann-Whitney test (or U test) with 10% significance level.

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: MedA = MedB

H1: MedA ≠ MedB

The researcher merges the data of two products and arranges it in the increasing
order. Thereafter, he/she calculates R1 and R2 for Products A and B, respectively, as
shown in Table 9:

Table 9: Calculation for Mann-Whitney Test

S. No.    Product A Rank    Product B Rank

1         14.5              7.5
2         11.5              9.5
3         1                 11.5
4         13                14.5
5         3                 18.5
6         5.5               2
7         18.5              5.5
8         23.5              7.5
9         16.5              9.5
10        20                16.5
11        21                23.5
12        4                 22
          R1 = 152          R2 = 148

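The calculation of U statistic is as follows:

U1 = 152 − 12(13)/2 = 74

U2 = 148 − 12(13)/2 = 70

Therefore, U = 70

µU = n1 n2/2 = (12 × 12)/2 = 72

SD = σU = √[n1 n2 (n1 + n2 + 1)/12] = √[12 × 12 (12 + 12 + 1)/12]

SD = 5√12 = 17.3

The critical U-value at 10% level of significance and two-tailed test is 42.

Uα = 42

In the U test, H0 is rejected only when the calculated U is less than or equal to Uα.
Since U = 70 is greater than Uα = 42, the researcher accepts H0. The normal
approximation leads to the same conclusion: Z = (70 − 72)/17.3 = −0.116, which lies
within ± 1.645. This implies that there is no evidence that Products A and
B are from different production houses.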
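
The rank sums and the U statistic of Example 5 can be verified with the following minimal Python sketch. The helper names mean_ranks_asc and mann_whitney_u are hypothetical; the sketch follows the formulas for U1, U2, µU and σU given above and applies no correction for ties.

```python
from math import sqrt

def mean_ranks_asc(values):
    """Rank from lowest to highest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    # First position of v (1-based) plus half the number of additional ties
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def mann_whitney_u(sample1, sample2):
    n1, n2 = len(sample1), len(sample2)
    # Merge both samples and rank the merged data
    ranks = mean_ranks_asc(list(sample1) + list(sample2))
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    u = min(u1, u2)
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, r2, u, (u - mu_u) / sigma_u

# Data from Table 8
product_a = [40, 35, 20, 36, 22, 26, 45, 50, 44, 47, 48, 25]
product_b = [28, 30, 35, 40, 45, 21, 26, 28, 30, 44, 50, 49]
r1, r2, u, z = mann_whitney_u(product_a, product_b)
print(r1, r2, u, round(z, 3))
```

The sketch reproduces R1 = 152, R2 = 148 and U = 70. The corresponding Z-value is about –0.12, which shows that U is very close to its mean of 72 under the null hypothesis.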

11.5.2 KRUSKAL-WALLIS TEST


The Kruskal-Wallis test is equivalent to one-way ANOVA (explained in the previous
chapter) with only one difference that the former is based on ranks while the latter
is based on numerical values. The test is an extension of the Mann-Whitney U-test.
The Kruskal-Wallis test is used when there are more than two samples, whereas there
are two samples in the Mann-Whitney U-test. The Kruskal-Wallis test is used to
determine whether samples in a study are taken from the same population. In the test,
the data from different samples is merged and values are ranked in any order (low to
high or high to low). Ranks are classified as R1, R2, … and Rk, according to the samples
to which they belong. The test is performed in the following manner:

H = [12/(n(n + 1))] ∑ (Ri²/ni) – 3(n + 1)

Where,

n = Total sample size (n1 + n2 + … + nk)

Ri = Sum of the ranks of each sample separately, that is, R1, R2, …, Rk

ni = Size of each sample, that is, n1, n2, …, nk

k = Number of samples

Chi-square value is determined at d.f. k–1 and the specified level of significance
and the calculated H value is tested against it. If the H value lies within the limits of
acceptance region, the researcher accepts the null hypothesis and rejects the alternate
hypothesis. However, if the H value lies outside the limits of acceptance region, the
researcher rejects the null hypothesis and accepts the alternate hypothesis.

Let us understand the application of the Kruskal-Wallis test with the help of an
example.

Example 6: An organisation wants to purchase hundreds of different milling
machines. As these machines cost a lot, the organisation wants to check whether it
should purchase machines or not. Initially, it borrows four machines and randomly
assigns them to 20 technicians with similar skill sets. Each machine was put through
a series of tasks and rated using a standardised test. A high score indicates
better performance. The scores given by technicians to four machines are shown in
Table 10:

Table 10: Scores Given by Technicians to Four Machines

Machine 1    Machine 2    Machine 3    Machine 4

24           28           26           33
23           34           28           37
26           29           31           36
27           32           25           35
29           30           21           38

Perform the Kruskal-Wallis test to establish whether all four machines are equally
good. Use 5% level of significance.

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: All four machines are equally good. (This implies that Median1 = Median2 =
Median3 = Median4)

H1: At least two machines are different.

First, the researcher merges performance data for the four machines and arranges it
in an increasing order. Thereafter, he/she ranks the data and classifies ranks as R1,
R2, R3 and R4 for machines 1, 2, 3 and 4, respectively. Finally, the researcher takes out
the total of ranks in R1, R2, R3 and R4. The calculation is shown in Table 11:

Table 11: Allocation of Ranks to Scores Provided to Four Machines

No. of Observations    Machine Data    Ranks

1                      21              1
2                      23              2
3                      24              3
4                      25              4
5                      26              5.5
6                      26              5.5
7                      27              7
8                      28              8.5
9                      28              8.5
10                     29              10.5
11                     29              10.5
12                     30              12
13                     31              13
14                     32              14
15                     33              15
16                     34              16
17                     35              17
18                     36              18
19                     37              19
20                     38              20

After that, ranks are classified as R1, R2, R3 and R4 for machines 1, 2, 3 and 4,
respectively, as shown in Table 12:

Table 12: Calculation of Kruskal-Wallis Test

Machine 1    R1      Machine 2    R2      Machine 3    R3      Machine 4    R4

24           3       28           8.5     26           5.5     33           15
23           2       34           16      28           8.5     37           19
26           5.5     29           10.5    31           13      36           18
27           7       32           14      25           4       35           17
29           10.5    30           12      21           1       38           20
Total        28                   61                   32                   89

The calculation of the Kruskal-Wallis test is as follows:

$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$

Where, n = 20 and k = 4

R1 = 28, R2 = 61, R3 = 32, R4 = 89

n1 = n2 = n3 = n4 = 5

$$H = \frac{12}{20(20+1)} \left( \frac{28^2}{5} + \frac{61^2}{5} + \frac{32^2}{5} + \frac{89^2}{5} \right) - 3(20+1)$$

$$= (0.02857)(156.8 + 744.2 + 204.8 + 1584.2) - 63$$

$$= 76.86 - 63 \approx 13.86$$

d.f. = k – 1 = 4 – 1 = 3

Chi-square value at 5% level of significance and 3 d.f. is 7.815. You can check the
value for significance with the help of one-tailed test. The graphical representation
of the preceding solution is given in Figure 6:

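[Figure 6: Showing the Rejection of the Calculated Chi-square Value. The acceptance
region lies to the left of the critical value 7.815 and the rejection region to its
right; the calculated value of about 13.86 falls in the rejection region.]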

Figure 6 shows that the calculated value lies in the rejection region; therefore, H0
is rejected and H1 is accepted. This implies that the four machines are not equally
good. The test itself only establishes that at least one machine differs from the
others; however, machine number 4 has the highest rank total (89), which suggests
that it performs best.
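
The hand calculation in Example 6 can be cross-checked with a short script. The following is a minimal Python sketch and not part of the original example; the function names `average_ranks` and `kruskal_wallis_h` are illustrative choices. It pools the scores from Table 10, assigns average ranks to tied scores and evaluates the H formula without any correction for ties, mirroring the manual steps.

```python
from itertools import chain


def average_ranks(values):
    """Map each distinct value to its rank; tied values share their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of


def kruskal_wallis_h(groups):
    """Return (rank totals per group, H statistic) with no tie correction."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    rank_of = average_ranks(pooled)
    rank_sums = [sum(rank_of[x] for x in group) for group in groups]
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return rank_sums, h


# Scores from Table 10, one list per machine
machines = [
    [24, 23, 26, 27, 29],
    [28, 34, 29, 32, 30],
    [26, 28, 31, 25, 21],
    [33, 37, 36, 35, 38],
]

rank_sums, h = kruskal_wallis_h(machines)
print("Rank totals:", rank_sums)
print("H =", round(h, 2))
```

The sketch reproduces the rank totals 28, 61, 32 and 89 from Table 12 and gives H of about 13.86, which exceeds the critical value 7.815. Statistical libraries such as SciPy (`scipy.stats.kruskal`) apply a correction for ties, so they may report a slightly different value of H.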
Self Assessment Questions

7. _________________ is used to determine whether two independent samples
   are drawn from the same population.
8. _________________ test is used to determine whether the samples in the study
   are taken from the same population.
9. Chi-square value is determined at d.f. k–1 and the specified level of significance
   and the H value is tested against it. (True/False)

11.6 CHI-SQUARE TEST


Chi-square test is used to find out dependency between two attributes. It can also
be used to make comparisons between theoretical population (expected data) and
actual data (observed data). The formula used in chi-square test is as follows:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where, Oi = Observed frequency

Ei = Expected frequency

Expected frequency can be calculated with the help of the following formula:

Ei = Row total × Column total/Grand total (For test of independence)


If the calculated value of chi-square is greater than the critical value of χ², the null
hypothesis is rejected.

Figure 7 shows two types of chi-square tests that are mainly used to find out the
association between variables:

[Figure 7: Types of Chi-square Tests. Chi-square tests are divided into the chi-square
test for goodness of fit and the chi-square test for independence.]

Let us discuss Chi-square test for goodness of fit and chi-square test for independence
in detail.

11.6.1 CHI-SQUARE TEST FOR GOODNESS OF FIT


The test helps the researcher know whether, and to what extent, the theoretical
distribution (distribution of expected frequency) fits the observed data. In this
test, the researcher first finds out the expected frequency on the basis of the
theoretical distribution. Thereafter, he/she calculates the chi-square value with the
chi-square formula given earlier. The d.f. used is k–1, where k is the number of
categories. The critical chi-square value is determined at the specified level of
significance and d.f. If the calculated chi-square value lies within the acceptance
region, the researcher accepts the null hypothesis and rejects the alternate
hypothesis.

Let us understand the application of chi-square test with the help of an example.

Example 7: An FMCG company produces various products. Currently, this company
wants to launch four more products, namely A, B, C and D belonging to the same
category. However, before launching the products, the company wants to evaluate
the customer preferences for each product. For this, the company carries out a survey
of 1,000 customers and records their responses as shown in Table 13:

Table 13: Customers’ Responses

Product Number of Customers the Product is Preferred by


Product A 300
Product B 280
Product C 220
Product D 200

Test the hypothesis that the customers have no preference for any particular product.
Use 5% level of significance.
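Solution: H0: Customers have no preference for any particular product.

If customers have no preference, the 1,000 responses are expected to be spread equally
across the four products, so the expected frequency for each product is 1,000/4 = 250.
The expected frequency and observed frequency of customers’ responses are shown
in Table 14 as follows: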

Table 14: Expected Frequency and Observed Frequency of Customers’ Responses

Product Expected Frequency (Ei) Observed Frequency (Oi)


Product A 250 300
Product B 250 280
Product C 250 220
Product D 250 200

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

$$\chi^2 = \frac{(300 - 250)^2}{250} + \frac{(280 - 250)^2}{250} + \frac{(220 - 250)^2}{250} + \frac{(200 - 250)^2}{250}$$

$$\chi^2 = \frac{50^2 + 30^2 + 30^2 + 50^2}{250}$$

$$\chi^2 = \frac{6800}{250} = 27.2$$

Critical value of χ² at 5% level of significance with 3 degrees of freedom (k–1 = 3) is
7.81. Since our χ² is greater than the critical value, we reject H0.
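
The arithmetic of Example 7 can be reproduced with a few lines of code. The following is a minimal Python sketch under the assumptions of the example (equal expected frequencies under H0); the function name `chi_square_goodness_of_fit` and the constant `CRITICAL_VALUE` are illustrative, and the critical value is simply copied from the chi-square table rather than computed.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return the chi-square statistic: the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed preferences for products A, B, C and D (Table 13)
observed = [300, 280, 220, 200]

# Under H0 (no preference) the 1,000 customers are spread equally: 250 each
total = sum(observed)
expected = [total / len(observed)] * len(observed)

chi_square = chi_square_goodness_of_fit(observed, expected)

# Tabulated chi-square value at the 5% level with k - 1 = 3 degrees of freedom
CRITICAL_VALUE = 7.815
reject_h0 = chi_square > CRITICAL_VALUE

print("Chi-square =", round(chi_square, 2))
print("Reject H0:", reject_h0)
```

The statistic comes out as 27.2, well above the critical value, so the sketch reaches the same decision as the manual solution.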

11.6.2 CHI-SQUARE TEST FOR INDEPENDENCE


In chi-square test for independence, two attributes are tested to find out whether they
are associated with each other. For example, the researcher wants to know whether the
introduction of better/unique services helps increase the sales of an organisation.
In this case, the researcher is trying to establish a relation between two attributes:
better services and sales. In this test, the expected frequency is calculated first
and then the value of chi-square is ascertained. The d.f. used in this case is
(r–1)(c–1), where r equals the number of levels of the first attribute (rows) and c
equals the number of levels of the second attribute (columns). The critical chi-square
value is determined at the specified level of significance and d.f. If the calculated
chi-square value lies within the acceptance region, the null hypothesis is accepted and
the alternate hypothesis is rejected.

Let us understand the application of chi-square test with the help of an example.

Example 8: The researcher has the data for the preferences of men and women
regarding the joint and nuclear families, as shown in Table 15:

Table 15: Data for Preferences of Men and Women for Joint and Nuclear Families

            Joint Family    Nuclear Family    Total
Men         96              35                131
Women       170             360               530
Total       266             395               661

The researcher wants to find out whether the opinion of men and women about the
type of family is the same. Use 5% level of significance.

Solution: Null hypothesis and alternate hypothesis are as follows:

H0: The opinion of men and women about the type of family is the same.

H1: The opinion of men and women about the type of family is different.

The test statistic used for this data is chi-square test for independence. The following
equation is used for calculation:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where, Oi = Observed frequency

Ei = Expected frequency

Expected frequency can be calculated with the help of the following equation:

Ei = Row total × Column total/Grand total

In the current scenario, expected frequency can be calculated using the following
method:

E1 (Men, Joint Family) = 131 × 266/661 = 52.72

E2 (Men, Nuclear Family) = 131 × 395/661 = 78.28

E3 (Women, Joint Family) = 530 × 266/661 = 213.28

E4 (Women, Nuclear Family) = 530 × 395/661 = 316.72

After calculating the expected frequency and the square of differences between the
observed and expected frequency, Table 16 is created:

Table 16: Calculation of Chi-Square Test for Independence

Observation         Observed          Expected          Oi – Ei    (Oi – Ei)²    (Oi – Ei)²/Ei
                    Frequency (Oi)    Frequency (Ei)
Men
  Joint Family      96                52.72             43.28      1873.158      35.53032
  Nuclear Family    35                78.28             –43.28     1873.158      23.92895
Women
  Joint Family      170               213.28            –43.28     1873.158      8.782626
  Nuclear Family    360               316.72            43.28      1873.158      5.914241
Total                                                                            74.15614

Calculated value of chi-square = 74.16

d.f. = (r – 1) (c – 1) = (2 – 1) (2 – 1) = 1

The critical chi-square value at 5% level of significance and 1 d.f. is 3.841. The
calculated chi-square value is checked for significance with a one-tailed
(right-tailed) test. The graphical representation of the preceding solution is shown
in Figure 8:

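[Figure 8: Rejecting Chi-square Value. The acceptance region lies to the left of the
critical value 3.841 and the rejection region to its right; the calculated value
74.16 falls in the rejection region.]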

Figure 8 shows that the value lies in the rejection region; therefore, H0 is rejected.
This implies that there is a significant difference between the opinions of men and
women about the type of family.
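
The steps of Example 8 (expected frequencies, chi-square value and degrees of freedom) can be sketched in code. The following minimal Python example is an illustration only; the function names `expected_frequencies` and `chi_square_independence` are hypothetical. Because it does not round intermediate values, its result may differ from the hand calculation in the second decimal place.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom, expected frequencies)."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, expected


# Table 15: rows are Men and Women; columns are Joint Family and Nuclear Family
preferences = [
    [96, 35],
    [170, 360],
]

statistic, dof, expected = chi_square_independence(preferences)

for row in expected:
    print([round(e, 2) for e in row])
print("Chi-square =", round(statistic, 2), "with d.f. =", dof)
```

The expected frequencies match the values 52.72, 78.28, 213.28 and 316.72 computed above, and the statistic is far greater than the critical value 3.841, so H0 is rejected as in the manual solution.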
Self Assessment Questions

10. Expected frequency can be calculated using the formula ______________.
11. Chi-square test for independence refers to a test in which two attributes are
    tested to find out whether they are associated with each other. (True/False)

Activity

Prepare a PowerPoint presentation on ‘Non-Parametric Test’.

11.7 SUMMARY
• A researcher can use non-parametric tests without taking into consideration
  population distribution and sample type. Non-parametric tests are also known as
  distribution-free tests.
• Sign test is considered as one of the easiest non-parametric tests because it takes
  into account only the plus and minus signs of observations in a sample.
• One sample sign test is applied on a single sample taken from a symmetrical
  population.
• Two sample sign test is used to check whether two samples are related to each
  other. It is also known as paired sign test.
• Rank correlation, also known as Spearman’s rank correlation coefficient, is used to
  establish correlation between two data sets that can be ranked.
• Rank sum test is used to analyse ordinal data (in the rank form) and calculate the
  value of rank sum statistics. To conduct this test, observations need to be arranged
  in the ascending order.
• The Mann-Whitney test (or U test) is used to determine whether two independent
  samples are drawn from the same population. It is applied in general conditions
  and does not have any specific requirement.
• The Kruskal-Wallis test is similar to one-way ANOVA with only one difference
  that the former is based on ranks, while the latter is based on numerical values.
• The Wilcoxon matched pairs test/signed rank test is a combination of sign and
  rank tests. It is used to compare two paired samples.
• Chi-square test is used to find out dependency between two types of data. It can
  also be used to make comparisons between theoretical population (expected data)
  and actual data (observed data).

11.8 KEY WORDS


• Non-parametric tests: These tests do not require any information about the
  parameters of a population from which a sample is derived.
• Sign tests: These are based on signs, not on the magnitude of observed values.
• Rank correlation coefficient: This test is used to study correlation among the
  ranks of different data sets.
• Signed rank test: It is used to study both the direction and magnitude of
  differences between paired samples.
• Chi-square test of goodness of fit: It is used to find out how well a theoretical
  distribution (expected frequencies) fits the observed frequencies of categorical data.
• Chi-square test of independence: It is used to find out whether two attributes are
  associated with each other.
• Correction factor: It refers to the adjustment made in a calculation to control
  deviations in a sample or a method of measurement.

11.9 CASE STUDY: PROBLEM FACED BY PORTABLE GENERATOR INDUSTRY

History of Portable Generators

The economic liberalisation policy of 1985 increased foreign industrial collaborations
in India. There was a spurt in industrial tie-ups and, consequently, in industrial
output. Portable generators were one industrial segment in which foreign companies
collaborated with Indian companies to manufacture in India. For example,
Sri Ram Group entered into a Joint Venture (JV) with Honda of Japan to form Sri Ram
Honda. The JV had a capacity to build 500 portable generators a day. In addition,
the Birla group partnered with Yamaha of Japan to form Birla Yamaha, which
manufactured portable generators. However, some Indian companies entered the
portable generator industry independently with their local brands. For example,
Greaves Cotton produced portable generators under the brand name ‘Lombardini’.
Kirloskar Group introduced a 1.5 KVA portable generator and Enfield India
launched its generator under the brand name ‘Gee’. There were 50−60 local brands
with the capacity to produce 100 portable generators a day. By 1986, the total output
of the portable generator industry was 2.5 lakh units a month due to huge demand
from customers. However, this demand was short-lived and, by 1987, many units
closed down the production of generators. For example, Kirloskar Group withdrew
its 0.5 KVA portable generator from the market. Lombardini also disappeared from
the market. The two major competitors, Sri Ram Honda and Birla Yamaha, were
engaged in a price war.

Scenario of the Portable Generator Industry


In the portable generator industry, the rural market is emerging and requires
generators mainly to run pump sets in farms. This market has been totally ignored
by the two market leaders (Sri Ram Honda and Birla Yamaha). The leaders produce
expensive, good quality generators. These generators are light and fragile; therefore,
they cannot be used in farms. Local brands have so far satisfied the requirements of
the rural market. The market leaders have finally realised the importance of the rural
market. Until now, portable generators were marketed on factors such as low noise,
fuel efficiency and reliability. However, market requirements have changed over
time, so the leaders are conducting market research to understand the changed
requirements. Sri Ram Honda and Birla Yamaha hired researchers to study
the changing market scenario. The researchers divided the problem into two research
topics. The first research topic is to study the requirements of the rural market in
terms of technical feasibility and consumer preferences. The second topic is to compare
the two types of generators on the basis of their efficiency with respect to the rural
market. To study the first research topic, researchers collected two samples: one
from a market leader and another from a local marketer. Rural technicians
assigned scores to the two types of generators according to efficiency. The collected
data is as follows:

Scores given by Rural Technicians to Generators

No. of Technicians    Top Competitors (A)    Local Marketers (B)
1                     33                     45
2                     35                     30
3                     24                     35
4                     36                     40
5                     22                     45
6                     26                     41
7                     45                     46
8                     50                     52
9                     44                     49
10                    47                     44
11                    48                     50
12                    25                     42

The following table shows the preferences of rural and urban customers for two types
(branded and local) of generators:

Preferences of Rural and Urban People for Local and Branded Generators

                Top Competitors    Local Marketers    Total
Rural Market    100                150                250
Urban Market    120                99                 219
Total           220                249                469

The researchers concluded that the rural market is widely different from the urban
market. In addition, the efficiency of generators produced by top competitors is
almost the same as that of generators produced by local companies. Therefore, the
generators produced by top competitors to fulfil demand from urban markets can also
be introduced in the rural market. The two market leaders should market their products
very effectively in the rural market to capture the market share. Local marketers have
the first mover advantage in the rural market. The market leaders can make slight
changes in their generators to improve their capacity and promote their products as
specifically designed for the rural market.
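
The case study does not state which test led to the conclusion that the rural and urban markets differ. Assuming the researchers applied a chi-square test for independence to the preference table at the 5% level of significance (an assumption, not stated in the case), the calculation can be sketched in Python as follows. For a 2 × 2 table with cells a, b, c and d, the chi-square value without continuity correction can be written compactly as N(ad − bc)²/(row totals × column totals); the function name `chi_square_2x2` is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2 x 2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rural market: 100 prefer top competitors, 150 prefer local marketers
# Urban market: 120 prefer top competitors, 99 prefer local marketers
chi_square = chi_square_2x2(100, 150, 120, 99)

# Tabulated chi-square value at the 5% level with (2 - 1)(2 - 1) = 1 degree of freedom
CRITICAL_VALUE = 3.841

print("Chi-square =", round(chi_square, 2))
print("Markets differ at the 5% level:", chi_square > CRITICAL_VALUE)
```

Under these assumptions, the statistic is about 10.26, which exceeds 3.841 and is therefore consistent with the researchers' conclusion that preferences in the rural market differ from those in the urban market.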

QUESTIONS
1. What are the two research topics identified by researchers?
   (Hint: Studying the requirements of rural market in terms of technical feasibility
   and consumer preferences.)
2. What is the conclusion given by researchers in the case study?
   (Hint: The researchers concluded that the rural market is widely different from the
   urban market.)
3. How did the two market leaders (Sri Ram Honda and Birla Yamaha) conduct the
   market research?
   (Hint: Sri Ram Honda and Birla Yamaha hired researchers to study the changing
   market scenario.)
4. What strategy was followed by researchers to conduct research?
   (Hint: The researchers divided the problem into two research topics.)
5. How was the study of the first research topic conducted by researchers?
   (Hint: Researchers collected two samples: one from a market leader and another
   from a local marketer. Rural technicians have assigned scores to two types of
   generators according to efficiency.)

11.10 EXERCISE
1. Explain the concept of non-parametric test.
2. Describe rank correlation with the help of an example.
3. Explain the types of sign tests.
4. Discuss the concept of U test with the help of a diagram.
5. Write short notes on:
   a. Chi-square test
   b. Wilcoxon matched pairs/Signed rank test

11.11 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                              Q. No.    Answer
Concept of Non-Parametric Tests    1.        distribution-free
                                   2.        True
Sign Test                          3.        Sign test
                                   4.        True
                                   5.        c. Two sample sign test
Rank Correlation                   6.        Rank correlation
Rank Sum Test                      7.        Mann-Whitney test (or U test)
                                   8.        Kruskal-Wallis
                                   9.        True
Chi-Square Test                    10.       Ei = Row total × Column total/Grand total
                                   11.       True

11.12 SUGGESTED BOOKS AND E-REFERENCES
SUGGESTED BOOKS

• Biddle, J., & Emmett, R. Research in the History of Economic Thought and Methodology
• National Academies Press. (2009). Partnerships for Emerging Research Institutions.
  Washington, D.C.

E-REFERENCES

• Nonparametric Tests - Overview, Reasons to Use, Types. (2020). Retrieved 9 April
  2020, from https://corporatefinanceinstitute.com/resources/knowledge/other/nonparametric-tests/
• Using Chi-Square Statistic in Research - Statistics Solutions. (2020). Retrieved 9
  April 2020, from https://www.statisticssolutions.com/using-chi-square-statistic-in-research/

CHAPTER 12

Report Writing

Table of Contents

12.1 Introduction
12.2 Research Proposal
     Self Assessment Questions
12.3 Research Report
     12.3.1 Written Report
     12.3.2 Oral Presentations
     Self Assessment Questions
12.4 Integral Parts of a Report
     Self Assessment Questions
12.5 Summary
12.6 Key Words
12.7 Case Study
12.8 Exercise
12.9 Answers for Self Assessment Questions
12.10 Suggested Books and e-References

LEARNING OBJECTIVES

After studying this chapter, you will be able to:
• Explain the concept of research proposal
• Describe the research report
• Outline the importance of written reports
• Explain the concept of oral presentation
• Discuss the integral parts of a report

12.1 INTRODUCTION
In the previous chapter, you studied the non-parametric tests. The chapter discussed
the sign test and its types. The latter sections of the chapter described rank correlation
and rank sum test. The chapter concluded with the explanation of chi-square test.

Report writing is a process to document each and every step involved in the
research process. These steps are Introduction, Literature Review, Methodology,
Data Analysis and Interpretation, Conclusion and Recommendations. It helps the
researcher in checking whether the research is progressing in the right direction
or not. A research report serves as a reference for findings and recommendations
of a research in future. The research report consists of a written report and an oral
presentation. The written report states objectives, data, research methodology and
findings. The oral presentation helps the target audience in judging whether research
recommendations are feasible to address the research problem or not.

An organisation takes several crucial decisions on the basis of a research report
related to its functioning. If the report is not clear and concise, then the organisation
may misinterpret the research findings and take wrong decisions, which may prove
disastrous for the organisation. Therefore, the researcher should observe utmost care
and adopt a predetermined structure while writing a report to prevent the creeping
in of ambiguities in the report.

The chapter begins by explaining the concept of research proposal. Next, it provides
in-depth information about research report. Then the following topics are explained:
written report, audience of a report, types of reports and steps in writing a report.
The integral parts of a report are also discussed. Towards the end, the concept of oral
presentations is discussed.

12.2 RESEARCH PROPOSAL


A research proposal is a clearly outlined plan submitted by one party for acceptance
or rejection by another party. The first party wants the research to be conducted and
the second party actually conducts the research. The first party can be an organisation,
government body, or any other entity that has a problem, which can be solved only
through research. The second party can be a research agency, research institution,
or an independent researcher. The research proposal is a detailed description of the
research prepared by the second party to explain to the first party how the research
would be conducted and what the requirements of the research are.

The research proposal includes the following information:

• Purpose: This is the objective for which the research is to be conducted. It also
  provides information about the needs and significance of the research.
• Population: It refers to the universe from which the researcher takes samples.
• Research design: It is the layout of the research giving details of procedures
  required for conducting research. Research design defines the information needed,
  designs exploratory or descriptive phases of research, specifies measurement and
  scaling procedures, defines an appropriate data collection method, specifies
  sampling process and sample size, develops data analysis plan, etc.
• Methods of data collection: These are the techniques of collecting data for the
  research. Examples of data collection methods are questionnaires, observations
  and interviews.
• Tests of significance: These tests help in analysing the collected data. The
  researcher can use z-test, t-test or F-test depending on the sample size, the type of
  data and the research methodology.
• Time frame: It is the duration within which the research would be completed.
  The research proposal includes the tentative schedule to start and complete each
  activity in the research.
• Budget: It is the estimated cost to conduct the research. The research proposal
  should clearly indicate funds required for the research work.
An example of the research proposal is as follows:

RESEARCH PROPOSAL

Submitted to
Sales Manager: Vikas Kumar

Submitted by
Manali Batra, Senior Researcher
MSD Research Institute

Name: Manali Batra

Designation: Senior Researcher

Location of the Work: Max New York Life, Elegance Tower, Jasola Vihar

Working Days: Monday to Saturday

Working Hours: 9:30 am to 5:00 pm

Contact Number: +919XXXXX69

Time Frame for the Project: 2 months

Expected Cost of the Project: ₹ XXX thousand

(This includes the cost of project designing, travelling, administration and
reporting.)

Name of the Reporting Officer: Mr. Vikas Kumar

Designation: Sales Manager

Contact Number: +919XXXXX66

Title of the Project
Comparative Analysis of Max New York Life (MNYL) and HDFC Life Insurance:
A Detailed Study on MNYL

Objectives:
• To study and compare the sales process of MNYL and HDFC
• To study the policies and products of MNYL and HDFC
• To compare the customer satisfaction of both the companies
• To study the impact of advertisement on the sale of both the companies

Methodology
The research methodology of this project consists of:
1. Research design
   ○ Descriptive research design
   ○ Hypothesis testing
2. Data collection
   ○ Primary data – Questionnaire, in-depth interviews
   ○ Secondary data – the Internet, articles in different sources (print media),
     MNYL
3. Sample
   ○ Sampling – Purposive and convenience sampling
   ○ Sample size – 200
   ○ Sample population – Customers of MNYL and HDFC
4. Tools
   ○ Excel
   ○ SPSS

Importance of the Research Work
This study will help us in determining the sales process, products and policies of
MNYL and HDFC. It will also shed light on the impact of advertisement on the
sale of insurance companies.
In addition, the study will help us in comparing MNYL and HDFC to know which
one is doing well in the market and satisfying its customers.

Expected Outcomes
The study aims to obtain information about:
• The sales process of MNYL and HDFC
• Products and policies of two companies
• The effect of advertisement on sales
• The trend of sales in both the companies
• Comparison of customer satisfaction and expectations from respective companies

Limitations of Study
This study covers data analysis of MNYL and HDFC for only a limited period of
time from the financial years 2014–15 to 2018–19. Hence, the results are comparable
and representative for this period only.

Self Assessment Questions


1. A _______________ is an agreement between two parties. The first party
wants the research to be conducted and the second party actually conducts
the research.

12.3 RESEARCH REPORT


A research report is a crucial part of a research as it includes solutions and actionable
recommendations of the research problem. A research report is prepared by an
analyst or researcher who is a part of the research team. If the report is not made
properly, all efforts of the researcher would become useless. The research report can
be divided into two types, as shown in Figure 1:

[Figure 1: Parts of the Research Report. The types of research report are the written
report and the oral presentation.]

The written report is an official document giving the facts and information to the
interested readers in a presentable manner. The facts must be accurate, complete
and interpreted. The oral report, on the other hand, is a piece of face-to-face
communication presenting one’s research work in a seminar, workshop, etc. It helps
the researcher present his/her views more clearly in front of research stakeholders.
Since the reporter has to interact directly with the audience, any faltering during oral
presentation can leave a negative impact on the audience. However, an oral report
helps the researcher gather valuable suggestions and feedback from the research
stakeholders. As compared to an oral report, a written report is a permanent record
that can be used for reference again and again. Let us discuss the written reports and
the oral presentations in detail.

12.3.1 WRITTEN REPORT


A research report refers to the systematic and orderly presentation of a research
activity in a written form. While writing a report, the researcher should take into
consideration various aspects, such as specific objectives of the study, description
of the methods or techniques used, review of the data on which the study is based,
assumptions made in the course of study and presentation of the findings including
their limitations and supporting data.

A written report provides information about a subject or topic. In other words, it
provides readers with an insight into the topic on which the research work is carried
out, the time duration, the methodology adopted for research, and so on. In addition,
a written report helps in identifying alternative solutions to address a problem by
presenting the present and past findings and recommendations.

Audience of a Report
As already discussed, there are two parties involved in a research – the first party
wants the research to be conducted and the second party conducts the research. The
first party is called audience. The researcher should tailor the writing of research
report towards the specific requirements of the target audience. The length and
composition of a research report and the details provided in it vary as per the target
audience. This happens because organisations differ from one another in significant
ways.
The researcher should adapt his/her writing to three types of audiences, each of which
requires a different technique:
• High-tech peers: The research report should make use of the most professional/
  complex resources, along with jargon and technical terms, keeping in
  mind the expert-level knowledge of the audience.
• Low-tech peers: The research report should provide proper definitions for all the
  abbreviations/acronyms/technical terms used throughout the writing. This would
  enhance understanding where the audience is a mixture of laymen and professionals.
• Lay readers: The research report should use simple terms that are a lot easier to
  understand and interpret. There should be no use of abbreviations/acronyms.

Other Types of Reports


As already discussed, different types of audiences prefer different types of reports.
Broadly, reports are classified into two types – technical report and popular report.
The two types of reports are described as follows:
• Technical report: It lays emphasis on the method employed in conducting research,
  assumptions made during the research, details about the research topic, and the
  research findings and recommendations. Technical reports are full-fledged reports
  that are generally lengthy. These reports involve a detailed description of the
  research work. The target audience of technical reports are students, government
  bodies, special commissions and other organisations that need an in-depth analysis
  of the topic.
• Popular report: This report is non-technical in nature and is less comprehensive as
  the audience of this report is interested in knowing the results of the research,
  not the entire analysis. Therefore, the popular report focusses on the findings and
  recommendations of the research. It lays emphasis on simplicity and attractiveness
  in information presentation. The content of the popular report should be simple,
  clear and less technical in nature. Information should be explained with the help of
  simple charts and graphs instead of mathematical equations. The popular report
  should be attractive in terms of layout, fonts, figures, print and use of subheadings.

Steps in Writing a Report


The research report should be written in such a format that it is easily comprehensible
by the target audience. The report writing process involves sequential steps that are
described as follows:
260
1. Analysing the subject matter: It involves determining the kind of development
pattern to be adopted for writing the report for a particular research. Two kinds
Report Writing

of development patterns are mostly used in research reports: logical development Notes
and chronological development. In logical development, the researcher makes
logical decisions by using mental thoughts and links between one topic and the
other. Logical thinking is mostly based on the study that the researcher has done
during the research work. In logical development, the subject matter moves from
simple to complex. In chronological development, the subject matter is sequentially
structured.
2. Drawing the outline of the report: At this stage of report writing, the researcher
makes a structure or outline of the report. It consists of a brief description of the
topic to be covered in the report. This helps the researcher not to miss out any topic
to be studied in the report. The outline is also considered as the framework of the
report.
3. Preparing the rough draft: At this stage, the researcher starts writing the report.
The researcher organises his/her thoughts and mentions methods to be used for
data collection, analysis techniques, major findings of the research and limitations
faced by him/her during the study. The recommendations of the study are also
described in the rough draft.
4. Reviewing the rough draft: The researcher checks whether the report conveys
the intent of the research work to be carried out. In addition, at this stage, the
researcher also checks whether the report is apt for the target audience.
5. Preparing bibliography: Bibliography is a section of the report that contains sources
of secondary data collection. It includes names of books, journals, magazines and
other sources of print media from where the data is collected. It also contains the
Internet links used in the preparation of the research report. There is a proper
pattern to write the name of the source from where the data is collected.
Multiple styles of referencing can be used, such as APA citation, Harvard
referencing and MLA format, each having its unique rules for the structure of
references with respect to author name, book title, date, publisher name, etc. Let us
understand the pattern of mentioning data sources in bibliography with the help
of the following examples:
For books and pamphlets, the order of writing in APA referencing is as follows:
Last name of the author, initials of the first name (year). Title of the book (edition).
Place: Publisher name.
For example,
Sekaran, U., & Bougie, R. (2016). Research Methods for Business (4th ed.). New
York: Wiley.
For websites, the order of writing in APA referencing is as follows: Article title.
(year). Retrieved from URL.
For example,
4 Types of Research Methods For Start-Ups. (2019). Retrieved from
https://www.bl.uk/business-and-ip-centre/articles/4-basic-research-methods-for-business-start-ups
6. Making the final draft: At this stage of report writing, the researcher gives a final
touch to his/her report. The final report is prepared keeping in mind the objective
of the research. It should be simple, concise and convincing. At this stage, it is
checked whether all the portions of the research are covered or not.
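
The ordering of elements described in step 5 can be sketched as two small helper functions. This is only an illustration of the pattern shown in the examples above, not a complete implementation of APA style; the function names and parameters are hypothetical and chosen for this sketch.

```python
def format_book_apa(authors, year, title, edition, place, publisher):
    """Assemble a book reference in the order given in step 5.

    authors is a list of (last_name, initials) tuples,
    for example [("Sekaran", "U."), ("Bougie", "R.")].
    """
    names = [f"{last}, {initials}" for last, initials in authors]
    if len(names) > 1:
        # Join all but the last author with commas, then add "& last author".
        author_part = ", ".join(names[:-1]) + ", & " + names[-1]
    else:
        author_part = names[0]
    return f"{author_part} ({year}). {title} ({edition}). {place}: {publisher}."


def format_website_apa(article_title, year, url):
    """Assemble a website reference: Article title. (year). Retrieved from URL."""
    return f"{article_title}. ({year}). Retrieved from {url}"


# Reproduces the book example given in the text.
print(format_book_apa(
    [("Sekaran", "U."), ("Bougie", "R.")],
    2016,
    "Research Methods for Business",
    "4th ed.",
    "New York",
    "Wiley",
))
# Sekaran, U., & Bougie, R. (2016). Research Methods for Business (4th ed.). New York: Wiley.
```

Keeping each element as a separate argument makes it easy to check that no part of the reference (author, year, title, edition, place, publisher) has been left out before the bibliography is finalised.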

12.3.2 ORAL PRESENTATIONS


Most of the time, oral presentations are given with the help of PowerPoint
software, which facilitates data presentation in the form of graphs and charts. These
presentations are preferred by most organisations, as these are less time-consuming
and economical. Oral presentations can be given to a large audience in
a single instance, whereas written reports can be read by only one person in a single
instance.

The duration of an oral presentation is a maximum of 30 minutes. The researcher should
be able to explain his/her entire research work in the given time. He/she should
have convincing skills and be presentable enough to gain the attention of the target
audience. The researcher should also handle the queries of the audience patiently
and should be well-prepared for the presentation to minimise the chances of errors.
He/she should not get irritated or frustrated while answering the queries.

Self Assessment Questions


2. Which of the following are the most common purposes of writing a report?
a. Providing information b. Generating ideas
c. Finding solutions d. All of these
3. Which of the following types of audience needs only a one- or two-page report?
a. Mathematicians b. Business firms
c. Students of literature d. Chemists
4. Oral presentations can be given to a large audience in a single
instance, whereas written reports can be read by only one person in a single
instance. (True/False)
5. What is the maximum duration of an oral presentation?
a. 10 mins b. 25 mins
c. 30 mins d. 40 mins

12.4 INTEGRAL PARTS OF A REPORT


A research report contains many sections that provide segregated research
information. Every part of the report is written and described in a different format.
The length, data, objective and style of every part are different. The following points
explain different parts of a report:
• Title page: It includes the heading of the report. The report should have a
descriptive title that gives an overview of the research. The title page contains the
name of the sponsor of the study, the name of the researcher, and duration of the
research. Some examples of research titles are as follows:
  ◦ Study of the types of investors in the present scenario
  ◦ Factors affecting consumer preferences during shopping
  ◦ Impact of retail display and store design on customer-buying behaviour
• Preliminary pages: These pages include the acknowledgement or preface of the
report, which mentions the topic of the research and the person who authorised the
researcher to conduct it. They also contain the names of the people who
have contributed to the research. The preface talks about the subject matter of the
report.
• Executive summary: It contains a brief account of the introduction, body and
conclusion of the research. It gives an idea of every segment in the report. The
summary can come at the start or end of the report, depending on the type of
report and the way it is written.
• Introduction and objective: It contains the detailed background of the research
topic and the purpose of conducting the research. For example, if the research
is carried out on an organisation’s product, the introduction would include product
features and the background, profile, market and future plans of the organisation.
It can also involve the industry background, which includes information regarding
main players and the level of competition in the market.
• Body of the report: This part contains a detailed description of the research topic.
It also contains the methodology used in the research and the analysis of the collected
data.
• Findings, conclusion and recommendations: This part contains the major findings of
the research, along with the conclusion drawn from them and the recommendations
made on that basis.
• Bibliography and appendices: This part lists sources from where the research data
is collected. Bibliography contains sources of secondary data while appendices
contain the sources of primary data or some extra information about the research
topic. Appendices also contain the questionnaire or other sources of acquiring
data.
Self Assessment Questions
6. The _______________ contains the name of the sponsor of the research, the
name of the researcher and duration of the research.
7. Bibliography contains the sources of secondary data while appendices contain
the sources of primary data or some extra information about the research
topic. (True/False)
8. _______________ contain the questionnaire or other sources of acquiring data.

Activity
Prepare a PowerPoint presentation on research report-writing techniques.

12.5 SUMMARY
• A research proposal is an agreement between two parties. The first party
wants the research to be conducted and the second party actually conducts the
research.
• The research proposal includes purpose, population, research design, methods
of data collection, tests of significance, time frame and budget.
• A research report is a crucial part of a research as it includes solutions and
actionable recommendations for the research problem.
• A research report can be of two types, namely written report and oral presentation.
• Broadly, reports are classified into two types – technical report and popular report.
• Technical report lays emphasis on the method employed in conducting research,
assumptions made during the research, details about the research topic and the
research findings and recommendations.
• Popular report is non-technical in nature and is less comprehensive as the audience
of this report is interested in knowing the results of the research, not the entire
analysis.
• The report writing process involves sequential steps, which are analysing the
subject matter, drawing the outline of the report, preparing the rough draft,
reviewing the rough draft, preparing bibliography and making the final draft.
• Oral presentations are given with the help of PowerPoint software, which facilitates
data presentation in the form of graphs and charts.
• A research report contains many sections that provide segregated research
information, which are title page, preliminary pages, executive summary,
introduction and objective, body of the report, findings, conclusion and
recommendations, and bibliography and appendices.

12.6 KEY WORDS


• Final outline: The stage of the report writing in which the researcher makes a
structure or outline of the report.
• Population: The universe from which the researcher takes samples for the research.
• Review of the rough draft: The stage in which the researcher reviews his/her
report.
• Rough draft: The stage in which the researcher starts writing a report.

12.7 CASE STUDY: A NEW PRODUCT


ABC Company wants to launch a new product, PR Paint, in a new market. Therefore,
it hires a research agency to conduct research and present a short report of 1–2
pages. Through the research, the manager wants to know about the market and
the ways to enter the new market. The researchers prepare the following research
proposal to be submitted to the company:

RESEARCH PROPOSAL

Submitted to
Manager S.R. Dicosta

Submitted by
Veera Malhotra
Senior Researcher
RPS Research Institute

Name: Veera Malhotra

Designation: Senior Researcher

Location of the Work: New Delhi

Working Days: Monday to Saturday

Working Hours: 9:30 am to 5:00 pm

Contact Number: +919XXXXX69


Time Frame for the Project: One month
Name of the Reporting Officer: Mr. S.R. Dicosta
Designation: Sales Manager
Contact Number: +919XXXXX66
Title of the Project: Study of Paint Industry in Delhi
Objectives
The objectives of the study are as follows:
• To study the paint industry in Delhi
• To determine the prospective customers of PR Paint
• To compare the paint industry and the present industry of ABC Company
• To give recommendations about the launch of PR Paint
Methodology
The research methodology of this project consists of:
1. Research Design
  ◦ Descriptive research design
  ◦ Hypothesis testing
2. Data collection
  ◦ Primary data: Questionnaire and in-depth interviews
  ◦ Secondary data: The Internet and articles in different sources (print media)
3. Sample
  ◦ Sampling: Purposive and convenience sampling
  ◦ Sample size: 200
  ◦ Sample population: Customers of similar products in the market
4. Tools
  ◦ Excel
  ◦ SPSS
  ◦ Tableau
Importance of the Work
This study will help in knowing the competition level in the new market and how
the company will beat this competition and enter the market.
Expected Outcomes

From this project, we will come to know about the following:


• What is the size of the paint industry in Delhi?
• What are the products in this industry at present?
• What is the level of competition in the market?
• How to enter the market?
• What product development strategy will meet customers’ requirements?

After sending the research proposal, the researcher carries out the research and writes a
report once it is complete. The report is as follows:

Research Report
Study of Paint Industry in Delhi
Findings of the Study

The major findings of the research are as follows:


• The paint industry in Delhi is very large.
• The paint market is highly competitive because a huge variety of paints with
new colour combinations is available. However, there is scope to enter the
paint market with new and innovative ideas.
• Customers always want to use new colours for their offices and houses.
Recommendations of the Study
Some recommendations made to ABC Company are as follows:
• Introduction of a new product, PR Paint, in the new market could be a good
decision.
• Consider the requirements of consumers while introducing a new product in the
paint market. The consumers want the paint to give a smooth touch,
different shades and innovative colour combinations.
• Provide customers with a different range of colours, which helps in keeping the
product price higher.
Objectives of the Study

The objectives of the research are as follows:


• To study the paint industry in Delhi
• To study the market of PR Paint
• To compare the paint industry and the present industry of ABC Company
• To give recommendations about the launch of PR Paint
Data Collection

The data is collected from the following sources:


• Primary data: Includes the data collected by conducting interviews with
shopkeepers and customers. The shopkeepers were asked which company
was providing them better discounts and which products were preferred by
customers. The customers were asked about how and why they selected a
particular paint. Also, customers’ requirements in terms of expected product
design or features were analysed.
• Secondary data: Includes the data collected from the Internet, books and articles
on the paint industry in Delhi. The researcher also used documents of the
company to know about its background.
Results

The researcher tries to explain the results of the research with the help of SWOT
(Strengths, Weaknesses, Opportunities and Threats) analysis, which is presented
in the following table:

Strengths                                   Weaknesses
  ◦ Strong brand image                        ◦ Centralised structure
  ◦ Dedicated sales team                      ◦ Rigid department heads
  ◦ Value-added services

Opportunities                               Threats
  ◦ Large untapped market                     ◦ Presence of very strong competitors
  ◦ Distinguishable product                   ◦ Aggressive marketing by competitors
    (such as PR Paint)                        ◦ Various paints of good quality
  ◦ Unsatisfied customer
  ◦ New area of expansion

Table: SWOT Analysis
QUESTIONS
1. Which type of report is used in the case study?
(Hint: Popular report is used in the case study.)
2. What is the conclusion given by the researcher in the case study?
(Hint: The researcher concluded that the competition is very high in the paint
industry.)
3. How did the researcher explain the results of the research?
(Hint: With the help of SWOT analysis)
4. What did the researcher do prior to writing the research report?
(Hint: Prepared the research proposal and started the research)
5. What recommendations were given to ABC Company in the research report?
(Hint: The recommendations given were:
• Introduction of a new product, PR Paint, in the new market could be a good
decision.
• Consider the requirements of consumers while introducing a new product in
the paint market. The consumers want the paint to give a smooth
touch, different shades and innovative colour combinations.
• Provide customers with a different range of colours, which helps in keeping the
product price higher.)

12.8 EXERCISE


1. Explain the research proposal.
2. What do you mean by a research report?
3. Explain the concept of written report in detail.
4. Discuss the integral parts of a report.

12.9 ANSWERS FOR SELF ASSESSMENT QUESTIONS


Topic                        Q. No.   Answer
Research Proposal            1.       research proposal
Research Report              2.       d. All of these
                             3.       b. Business firms
                             4.       True
                             5.       c. 30 mins
Integral Parts of a Report   6.       title page
                             7.       True
                             8.       Appendices
12.10 SUGGESTED BOOKS AND E-REFERENCES
SUGGESTED BOOKS
• Biddle, J. & Emmett, R. Research in the History of Economic Thought and Methodology.
• Chandra, S. & Sharma, M. Research Methodology.
• National Academies Press. (2009). Partnerships for Emerging Research Institutions.
Washington, D.C.

E-REFERENCES
• How to Write a Research Proposal | Guide and Template. (2020). Retrieved 10
April 2020, from https://www.scribbr.com/research-process/research-proposal/
• Research Reports: Definition and How to Write Them | QuestionPro. (2020).
Retrieved 10 April 2020, from https://www.questionpro.com/blog/research-reports/

About IIMM
“Indian Institute of Materials Management (IIMM)”, with its headquarters at Navi Mumbai, is a
Professional Body of Materials Management classified under Engineering & Technology Group under
Apprenticeship Act, 1961 and is recognised by ISTE, MHRD.

Through its wide network of 56 branches and 19 chapters having around 9500 members drawn
from public and private sectors, IIMM is dedicated to the promotion of the profession of Materials
Management through its multifarious activities including Educational Programs approved by AICTE
(Post Graduate Diploma in Materials Management and Post Graduate Diploma in Supply Chain
Management & Logistics), Seminars, National Conferences, Regional Conferences, Workshops,
In-house training programs, Consultancy & Research Programs.

To have an effective global interaction, the Institute is a charter member of International Federation
of Purchasing and Supply Management (IFPSM), Helsinki, Finland which has its roots in over
44 member countries.

In furtherance of its objectives, IIMM brings out a monthly journal, “Materials Management Review”
comprising latest Articles and Research Papers in the field of Materials, Logistics, Purchase, Inventory,
Supply Chain Management and latest Technological Innovations like Artificial Intelligence, Block
Chain, Cloud Computing and Internet of Things.

The Institute has its Centre for Research in Materials Management (CRIMM) at Kolkata, which
is engaged in promotion of research activities in collaboration with industries for furthering the
advancement of the profession of Materials and Supply Chain Management.

The Institute is dedicated for the Societal & Environmental considerations through Sustainable
Procurement, Green Purchasing and Life Cycle Consideration, which are part of our course curriculum.
The aim & objective of the Institution is to update & upgrade the skills & knowledge of professionals
so as to ensure inclusive and sustainable development.

Indian Institute of Materials Management


Plot No. 102 & 104, Sector-15,
Institutional Area, CBD Belapur, Navi Mumbai – 400614
Ph.: 0222 757 1022/0222 756 5592,
E-mail: [email protected], Website: iimm.org
