Questionners and Data Analysis

Uploaded by

Awais A. Naveed

6107BEUG- Engineering Research Project

6205CIV- Research Project


Lecture 8

Survey & Interview and Data Analysis


In this session….

• We will discuss what survey research is and how to conduct an interview.

Survey Research

Introduction- What is a survey?

• A system for collecting information

• Purpose is to produce quantitative or numerical descriptions, trends and patterns about some aspects of a particular study population

• Main way of collecting survey information is by conducting a questionnaire

Survey Design

• Taking a view on the entire survey process is critical to the success of the research project- “total survey design”

• The procedures used to conduct a survey have a major effect on the accuracy of the resulting data

Sampling

• A census survey is when we gather information about everyone in a target population

• A sample survey is when we select a small subset of a population that is representative of the whole population

• Ideal sampling method is to allow all members of your target population to have the same chance of being selected to complete your survey

Key Sampling Decisions

The Sample Frame

• A carefully selected group that represents a target population and whose members have the chance to be selected to participate in the survey

• Comprehensiveness- how completely does it cover the target population?

• Accessibility- how likely is it that you can obtain details of the desired sample frame to enable you to conduct a sample survey?

Non- probability sampling

• Inclusion is deliberate: participants are not randomly selected from a sample frame

• If you choose this option- you must evaluate the positives and negatives associated with a non-probability sample, including the rationale for your chosen sample

Probability sampling

• Inclusion is determined by chance, usually by randomly selecting participants (often by computer) from a sampling frame

• Simple random sampling
• Systematic sampling
• Stratified sampling

Simple Random Sampling

• Assigning a number to each participant in the sample frame and randomising the sample frame list.

• Example:
You have a sample frame (population) of 1000 and you
want to randomly sample 100 participants. Obtain a list of
all participants and assign each participant in the sample
frame with a number. Randomise the sample frame so the
list is sorted into a randomised order. This can be achieved
using Microsoft Excel. Select the first 100 participants in
the randomised list.
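
The example above uses Microsoft Excel. As an alternative, the same steps can be sketched in Python. This is only an illustration: the function name `simple_random_sample` and the `seed` parameter are assumptions made here and are not part of the lecture.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Shuffle a copy of the sample frame and keep the first n entries."""
    rng = random.Random(seed)   # a seed only makes the example repeatable
    shuffled = list(frame)      # copy, so the original list is left unchanged
    rng.shuffle(shuffled)       # sort the list into a randomised order
    return shuffled[:n]         # select the first n participants


# Sample frame of 1000 numbered participants, desired sample of 100
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), "participants selected")
```

Every participant has the same chance of appearing in the first 100 positions of the shuffled list, which is what makes this a simple random sample.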

Systematic Sampling

• Work out a fraction based on desired sample size and the total sample frame

Example:

You have a sample frame (population) of 1000 and you
want to sample 100 participants. Obtain a list
of all participants and assign each participant a number.
Work out a fraction based on the sample frame and desired
sample size (100/1000). Hence, you would select 1 out of
every 10 persons within the sample frame.
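
A minimal Python sketch of the systematic selection described in this example is given below. The random starting position is an assumption added here (the slide only says to take 1 in every 10), and the function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th member of the frame, where k = frame size // n."""
    k = len(frame) // n                       # sampling interval, e.g. 1000 // 100 = 10
    start = random.Random(seed).randrange(k)  # assumed: random start within the first interval
    picked = [frame[i] for i in range(start, len(frame), k)]
    return picked[:n]


frame = list(range(1, 1001))                  # participants numbered 1..1000
sample = systematic_sample(frame, 100, seed=1)
print(sample[:5])                             # first few selected participant numbers
```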

Stratified sampling

• Useful if you want to get a proportionate number of key variables (e.g. gender, age, etc.)

• Example
Your sample frame has 400 participants and your desired sample size
is 100. However, 300 are male and 100 are female. Males therefore
represent 75% of the population and females 25%. Firstly, divide the
sample frame into two separate sub-sample frames (300 and 100).
Then, work out the sample size needed for each, based on the
representative percentages. Hence we would want our sample to
consist of 75 males (75/100) and 25 females (25/100) to ensure that
the percentages of the sample subgroups are representative of the
percentages of the sample frame (population). You can then use either
a simple random sample or systematic sample method within each
sub-sample frame to randomly determine this 75/25 split.
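
The proportional split in the example can be checked with a short Python sketch. The function name `proportional_allocation` is an assumption for illustration. Rounding is used for simplicity, so for other figures the allocations may not add up exactly to the desired sample size.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Share the sample between strata in proportion to their size in the frame."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)  # e.g. 100 * 300 / 400 = 75
        for name, size in strata_sizes.items()
    }


allocation = proportional_allocation({"male": 300, "female": 100}, 100)
print(allocation)
```

Each sub-sample frame is then sampled separately (simple random or systematic) to obtain the allocated number of participants.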
Saturation sampling

• Saturation sampling is an attempt to conduct a population census (i.e. give everyone in the sample frame the chance to complete the survey).

• Very common now for online surveys- able to overcome traditional barriers of survey implementation

• Ideal if you have access to every member of the target population (e.g. email address list)

• Non-response error can however be much higher

Sample size

• How big should my sample be?

• Common misconception that it should be a fraction of the sample frame

• The fraction of the target population included in a sample has virtually no impact on how well the sample is likely to describe the population

• A sample of 150 will describe a population of 15000 or 15 million with virtually the same degree of accuracy

Sample size

• Useful to calculate the margin of error you are prepared to accept from your sample

• Allows you to determine the level of confidence in your sample using a 95% confidence interval

• A margin of error of around +/- 5% is usually acceptable for many national polls and surveys

• Generally the margin of error decreases quite significantly as the sample grows to 150-200 responses. After that point further reductions in the margin of error are more modest
Sample size- Example

• Example of margin of error using a 95% confidence interval:

Through a sample size of 100 collected from a total sample frame of 200, a margin of error of +/- 7% at the 95% confidence level was calculated

This means that if 50% of respondents answer “yes” to a yes/no question, we can be 95% confident that the proportion of the total population answering “yes” (including those who did not participate in the survey) will lie between 43% and 57%
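
The slides do not state how the 7% figure was obtained. One common way to arrive at a figure of this size is the normal approximation for a proportion with a finite population correction, sketched below in Python. The formula choice, the function name `margin_of_error` and the default values (p = 0.5, z = 1.96 for 95% confidence) are assumptions made for this illustration.

```python
import math


def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p from a sample of size n."""
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # finite population correction, used when the sample is a large share of the frame
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


moe = margin_of_error(100, population=200)
print(f"Margin of error: +/- {moe:.1%}")

# The margin of error shrinks quickly at first and then more slowly as n grows
for n in (50, 100, 150, 200, 400, 1000):
    print(n, f"{margin_of_error(n):.1%}")
```

With 50% answering “yes”, a margin of about 7% gives the 43% to 57% range quoted in the example.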
How many surveys should I distribute?

• Rule of 25%- assume that about 25% of questionnaires are returned. Therefore for every 1 questionnaire returned you should send out 4 questionnaires
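
As a quick arithmetic check of the rule of 25%, the sketch below works out how many questionnaires to send for a target number of returns. The function name `questionnaires_to_send` is hypothetical, and the 25% response rate is the rule of thumb from the slide, not a measured value.

```python
import math


def questionnaires_to_send(responses_needed, response_rate=0.25):
    """Number of questionnaires to distribute for an assumed response rate."""
    return math.ceil(responses_needed / response_rate)


to_send = questionnaires_to_send(100)
print(to_send)  # questionnaires needed to expect about 100 returns
```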

Question design

• Survey research takes a reductionist approach using questions as measures

• Ensuring appropriate wording and structuring of questions will increase the effectiveness of the resulting data

• Making sure questions are well understood and answers are meaningful

Type of questions

• Closed questions – a list of acceptable responses is given to the respondent, reducing/ limiting choice of answer

• Open questions- acceptable responses are not provided to the respondent, giving freedom of choice in their answer

Closed questions

Respondents answer more reliably when response alternatives are given

Researcher can perform more reliably in interpreting the meaning of answers as they are pre-planned

Makes findings more analytically interesting as more people will have given each particular response

Open questions take time to quantify, input and analyse

Levels of measurement

Refers to how the response categories of a question relate to each other

There are three main levels of measurement, which are:

• Nominal:

• Used to distinguish between categories of a variable but cannot rank categories in any order

• E.g. Country of birth, sex, ethnicity

• Interval/ ratio data:

• Used when categories can be naturally ranked and quantified

• E.g. age (24 or under; 25-34; 35-44; 45-54; 55-64; 65 and over)
• Ordinal:

• Used when it is appropriate to order/rank categories along a single dimension.

• Likert (1932) had a major impact on introducing scaling techniques to measure questions- introduced the “Likert scale”

• Very common in survey research to use a 3 or 5 point Likert Scale, e.g.:

Levels of satisfaction of services – “Very satisfied”, “Fairly satisfied”, “Neither”, “Fairly dissatisfied”, “Very dissatisfied”

Creating Questions

• Come up with as many questions as possible; the list will be narrowed down later

• Make sure no questions are repeated

• It may be worthwhile having negatively worded questions to see if participants respond to these consistently or not

• Ask others for their opinion on your questions- do they understand what you are asking them?

Reliability and accuracy- Relevance
• What am I trying to find out and does this question help?

• Who is my intended audience and is this question relevant?

• Are my questions meaningful and understandable to all of my sample frame?

• Is the information returned useful or is it just “nice to know”?

• The crucial test of this may be to think “Can I act on, or do anything with, the information returned”?
Reliability and accuracy- Importance

• Should a question be mandatory or optional?

• If it is a mandatory question, can everyone answer it?

• You might need to add an option such as:

“Not Applicable”, “Don’t Know”

...but use these with caution!

Reliability and accuracy- Readability

• Are the possible responses to your questions consistent?

• Will it confuse the respondent?

• E.g. don’t mix possible responses such as:

• “Very good”, “good”, “not very helpful”, “Not at all helpful”

Reliability and accuracy- Ethics

• Have I phrased my questions correctly, and are the terms I have used politically correct?

• Have I ensured that my possible responses are equally balanced, e.g. using a 5-point scale?

Ethics

• If questions are of a sensitive nature then you will need to gain ethical approval from the University’s Research Ethics Committee

• For further information, please refer to the University’s ethics pages where all the codes of practice are available

• There is also lots of useful information to help you formulate your questions accordingly

• Please attend Ethics Testing in CANVAS

Data Collection Methods

• There are various types of data collection modes in order to undertake survey research:

• Mail
• Telephone
• Internet
• Email
• Face-to-face

Online surveys

• Speed: can be sent to many people from a selected distribution list and posted on a web page

• Economy: there is usually a cost to buy software, but it is free at the university. Economical if targeting a large and wide population

• Added content options: potential to add graphics such as images/video clips

• Expanded question types: provide a wide variety of question types to help you when designing the questionnaire

• Anonymity is preserved: there is no email address linked to a web response unless you ask for it

• Minimise data inputting: responses are accepted directly into a database, avoiding the need for subsequent data-entry as with traditional methods

• Minimise data validation: real-time analysis means that invalid responses can be easily monitored and captured

• Sampling: potential to use saturation sampling, although be cautious about the level of non-response

Online survey- disadvantages

• Limited population: user must be able to access the internet to complete the questionnaire

• Abandonment of survey: respondents can quit before finishing

• Dependence on software: requires researchers to use software to create and deploy questionnaires

Interviewing

• The collection of non-numerical data

• Can refer to an inductive approach, where theory is essentially generated through research

• Questions are open, allowing interviewees to provide their own answers that are not restricted to specific choices

Interviewing process

Thematising

• Process of bringing attention to the subject area

• Formulating research questions and clarifying the theory of the theme investigated

Face to face interviews

• Often most effective mode of interview inquiry

• Create an interpersonal situation where trust is established and disclosure becomes possible

• Using a Dictaphone can improve the accuracy and reliability of the data

Research Methodology

Quantitative Analysis
Research Type

• Quantitative
Research that produces continuous numerical data

• Qualitative
Research that produces non-numerical data
Research Type
• Quantitative- numerical data, generated by:
– Experimental research (laboratory based or field work based)
– Numerical meta-analysis research
• Qualitative- non-numerical data, generated by:
– Surveys
– Case studies
– Non-numerical meta-analysis
– Observational research
Quantitative Analysis

• Examines relationships among variables
Variable- a quantity that can be measured and whose value can change.

• Analysis is conducted using statistical procedures


Levels of Quantitative Analysis

• Univariate (One-dimensional)
– Can be presented by a histogram (frequency plot)
• Bivariate (Two-dimensional)
– Presented by a scatter plot of the dependent and independent variables
• Multivariate (Multi-dimensional)
– A scatter plot demonstrating all the variables is produced
– Can take the form of a 3D plot
Levels of Quantitative Analysis
[Figures: univariate example (voltage); bivariate scatter plots of voltage (V) against time (T), where a straight line indicates a dependent relationship, a +ve slope an increasing relationship and a -ve slope a decreasing relationship; multivariate scatter plot of voltage (V), time (T) and temperature, showing the effects of a third variable (temperature) on the dependent (voltage) and independent (time) variables.]
Bivariate (Two-dimensional)

• The strength of relationships in a scatter diagram can be measured using a Correlation Coefficient (r); its square, R², is also commonly reported

[Figure: scatter plots showing a strong positive correlation and a zero or very weak correlation]
Correlation Coefficient

The size of the coefficient, ignoring its sign, is between 0 (No Relationship) and 1 (Perfect Relationship) – this shows the strength of the relationship

The closer the size of the coefficient is to 1, the stronger the relationship; the closer it is to 0, the weaker the relationship

The sign of the coefficient is either positive or negative – this shows the direction of the relationship
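
To make strength and direction concrete, the sketch below computes the Pearson correlation coefficient by hand for two small data sets. The voltage and time values are made up for illustration and do not come from the lecture, and the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical measurements: time in seconds, voltage in volts
time_s = [0, 1, 2, 3, 4]
voltage_rising = [1.0, 2.1, 2.9, 4.2, 5.0]
voltage_falling = [5.0, 4.1, 3.0, 1.9, 1.0]

r_up = pearson_r(time_s, voltage_rising)     # close to +1: strong increasing relationship
r_down = pearson_r(time_s, voltage_falling)  # close to -1: strong decreasing relationship
print(round(r_up, 3), round(r_down, 3))
```

The sign gives the direction of the relationship and the size (between 0 and 1) gives its strength.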
Types of Trendlines
• Linear
• Exponential
• Logarithmic
• Power
etc.
Data Distribution

• Normally Distributed Data- Parametric

• Non-Normally Distributed Data- Non-Parametric
Parametric Data-
Normally Distributed

Correlation
Pearson correlation coefficient
Calculates the linear correlation coefficient for each pair of variables
Produces a P-value; P<0.05 indicates a statistically significant correlation between the pair of variables.
Parametric Data-
Normally Distributed

ANOVA
One-way & two-way analysis of variance (F-test)
One-way ANOVA compares the means of a number of group samples (three or more).
Two-way ANOVA compares groups classified by two different factors.
P<0.05 indicates a statistically significant difference between the group means.
Parametric Data-
Normally Distributed

Chi-square test
Test whether there is association between variable categories
Used for the evaluation of Un-paired groups (unrelated
samples).
Uses contingency tables.
Non-Parametric Data-
Non-Normally Distributed

Correlation
Spearman Rank correlation coefficient
Calculates the rank correlation coefficient for each pair of variables (based on the ranks of the values rather than the values themselves)
Produces a P-value; P<0.05 indicates a statistically significant correlation between the pair of variables.
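
The Spearman coefficient can be obtained by replacing each value with its rank and then computing the ordinary correlation coefficient on the ranks. The sketch below shows this under stated assumptions: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, tied values are given their average rank, and the data are made up.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend over a run of tied values
        shared_rank = (i + j) / 2 + 1   # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y always rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, a relationship that always increases gives a coefficient of 1 even when it is curved. The P-value mentioned on the slide is not computed here; statistical software would normally report it.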
Non-Parametric Data-
Non-Normally Distributed

Mann-Whitney U Test

Compares two independent sample groups using the ranks of the values rather than the means.
P<0.05 indicates a statistically significant difference between the two groups.
Non-parametric equivalent to the unpaired t-test
Wilcoxon Signed-Rank test
Non-parametric equivalent to paired t-test.
P<0.05 indicates a statistically significant difference between the paired measurements taken on the same samples.
Non-Parametric Data-
Non-Normally Distributed

Kruskal-Wallis Test
Compares a number of group samples (three or more) using the ranks of the values
P<0.05 indicates a statistically significant difference between the groups.
Non-parametric ANOVA
Extension of the Mann-Whitney U Test to more than two groups
Non-Parametric Data-
Non-Normally Distributed

Chi-square test- as with parametric data


Test whether there is association between variable categories
Used for the evaluation of Un-paired groups (unrelated samples).
Uses contingency tables.

Fisher’s Exact test


Alternative to Chi-square test for 2x2 contingency tables and small
sample size.
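
The slides do not give the chi-square calculation itself. A minimal sketch of the standard Pearson chi-square statistic for a contingency table is shown below. The observed counts are hypothetical, the function name `chi_square_statistic` is an assumption, and no continuity correction is applied.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # count expected in this cell if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical 2x2 contingency table of survey answers (rows: groups, columns: yes/no)
observed = [[30, 10],
            [20, 40]]
stat, dof = chi_square_statistic(observed)
print(round(stat, 2), dof)

# For 1 degree of freedom the 5% critical value is 3.841;
# a larger statistic corresponds to P<0.05 (an association between the categories)
print("association at the 5% level:", stat > 3.841)
```

For 2x2 tables with small counts, Fisher's Exact test is the alternative noted above, and statistical software such as Minitab or SPSS would normally be used for either test.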
Statistical Software
Minitab
User friendly - based on easy drop down menus
Data are entered in worksheet and can be copied from Excel.
Available on LJMU AppPlayer

SPSS
Popular statistical software.
Also based on drop down Menus and copying data from Excel.
Research Methodology

Qualitative Analysis
Qualitative Research

• Research that produces non-numerical data

• Non-numerical data is collected by:
– Surveys
– Case studies
– Non-numerical meta-analysis
– Observational research
Qualitative Approaches
• Traditionally engineering and scientific research has relied on Quantitative (experimental) Research.

• In recent years Qualitative Research methods have been increasingly recognised and applied in engineering and science to advance the understanding of basic causes, principles, and behaviours.
Qualitative Approaches- continued
• Three categories of approach to the analysis of Qualitative Data:

– Language based- focuses on how language is used and its meaning. Example- conversation analysis.
– Descriptive or Interpretive- develops a view of the participants and subjects investigated.
– Theory building- seeks to develop theory from the data collected during the study.
Analysis of Data
The researcher needs to establish categories, groups, and relationships between them from the data collected.
This can be achieved using-
Cluster analysis
Divides data into groups (clusters) based on similarity
Qualitative Analysis

• Types of Categorical Data
– Nominal
– Ordinal
– Interval
– Ratio
Qualitative Analysis - Types of Data

Nominal data
Variables that use labelling, without any value

Example-
- Do you have site safety certification?
□ Yes □ No □ Do not know

- What materials do you mainly use?
□ Timber □ Steel □ Concrete
Qualitative Analysis - Types of Data

Ordinal data
With ordinal variables, it is the order of the values that is important.

The difference between each value is not known.

The sequence makes sense in one order, or in exactly the opposite order.
Qualitative Analysis - Types of Data
Example-
How do you rate energy saving progress?

Rating   Excellent   Good   Moderate   Poor   Very Bad
Count    29          243    117        86     25

Can summarize numerically by giving scores to the categories:

Rating   Excellent   Good   Moderate   Poor   Very Bad
Score    5           4      3          2      1

- In each case a score of 4 is better than 3 or 2, but we don’t know, and cannot quantify, how much better it is.
Qualitative Analysis - Types of Data

Interval data
Interval data are numeric values in which we
know both the order and the exact differences
between the values.
Example-
Temperature- the difference between each value is the same. The
difference between 60 and 50 degrees is a measurable 10 degrees,
as is the difference between 80 and 70 degrees.

Time- is an interval scale in which the intervals are known, consistent, and measurable.
Qualitative Analysis - Types of Data

Ratio data
Numeric values in which the order and the exact
differences between the values are known
(interval) and have an absolute zero.
Example-
- Measurements of Height
- Measurements of Weight.
Qualitative Statistical Analysis
Descriptive statistics
• mean
The mean (average) is the most popular statistic.
It is found by adding the values for all the (non-missing) cases and dividing by the number of (non-missing) cases. Take care: the mean is sensitive to extreme high or low values.
Example-
Five people take a test. Their scores are
60, 62, 65, 68, and 95
The mean is 70
Qualitative Statistical Analysis
Descriptive statistics
• median
The median provides a measure of central
tendency - half the sample will be above it and
half the sample will be below it.
Example-
Five people take a test. Their scores are
60, 62, 65, 68, and 95
The median is 65
Qualitative Statistical Analysis
Descriptive statistics
• mode
Is the most common value or score- the one
that occurs most frequently.
It is possible to have more than one mode.
Example-
The following set of data has two modes: 12
and 16.
12 12 12 13 14 15 15 16 16 16 17 18
Qualitative Statistical Analysis

Descriptive statistics application


Nominal data: mode
Ordinal data: median and mode
Interval data: mean, median and mode
Ratio data: mean, median and mode
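
The three descriptive statistics can be checked against the worked examples above using Python's standard `statistics` module. This is only a sketch; the variable names are chosen here for illustration.

```python
import statistics

# Test scores from the mean and median examples
scores = [60, 62, 65, 68, 95]
mean_score = statistics.mean(scores)      # pulled upwards by the single high score of 95
median_score = statistics.median(scores)  # middle value of the sorted scores

# Data set from the mode example
values = [12, 12, 12, 13, 14, 15, 15, 16, 16, 16, 17, 18]
modes = statistics.multimode(values)      # all values that share the highest frequency

print(mean_score, median_score, modes)
```

Here the median (65) describes a typical score better than the mean (70), because the mean is affected by the one extreme value.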
Qualitative Data Collection
• Survey Research
A survey is a system for collecting information.
Its purpose is to produce numerical descriptions, trends and patterns about some aspects of a particular study population.
Generally information is collected through a sample of the population.
Qualitative Data Collection

Survey Conducted using

• Questionnaire - a predefined series of questions is used to collect information
– Paper questionnaire (postal delivery or handouts)
– Online (web-based) questionnaire
• Interview – researcher completes survey based on what the respondent says
– Interview in person.
– Interview by phone.
Qualitative Data Collection
Closed-Ended Questions-
– Provide a list of predetermined responses from which to choose an answer.
– The list of responses should cover all possible responses and their meanings should not overlap.
Open-Ended Questions-
– Survey respondents are asked to answer each question in their own words.
– Responses are usually categorized into a smaller list for statistical analysis.
Sampling

A census survey is when information is gathered about everyone in a target population
A sample survey is when we select a sample of a population that is representative of the whole population
Ideal sampling method is to allow all members of the target population to have the same chance of being selected to complete the survey
Sampling
Random samples
Each member of the population has an equal chance of being selected.
Selected members are excluded from further re-selection.
Non-random samples- obtained by
Systematic sampling
Stratified sampling
Cluster sampling
Convenience sampling
Snowball sampling
Sampling
Non-random samples
Systematic sampling
Every xth member of the population is sampled
x is the interval and is kept constant.
The interval can be determined by
population size / sample size (e.g. 1000/100 = every 10th member)
Stratified sampling
Used when the population occurs in distinct groups- example type
of company or construction
Sample is divided between the different groups
Sampling
Non-random samples
Cluster sampling
The population is divided into clusters (groups)
The clusters are selected randomly
The sample is represented by a Cluster
Each cluster can represent the population
Convenience sampling
Data collected from a sample that can be accessed
readily and conveniently
Population has no obvious indication of sample
Snowball sampling
Researcher collects data from a small source and
asks for further sources to build up a sample
Sample size
The larger the sample size the more
representative of the population
Sample size can be limited by the number of
respondents.
Useful to calculate the margin of error you are prepared to accept from the sample.
Can aim for a confidence level of 95%
this means 95 out of 100 samples will have the true population value within the range of precision.
Sampling error is the level of precision- the range in which the true value of the population is estimated to be.
Sample size
Example
Through a sample size of 100 and a calculated margin of error of +/- 7% at the 95% confidence level

Then if 50% of respondents answer “yes” to a yes/no question, one can be 95% confident that the proportion of the total population answering “yes” (including those who did not participate in the survey) will lie between 43% and 57%
Survey Guidelines

Introduction
Explain the reason for the survey
Instructions
Information to the participants on how to complete
the survey.
Statements
Clear wording and structure of questions will
increase the effectiveness of the resulting data
Should be short statements without lengthy
explanations
Survey Guidelines- continued
Should not have multiple themes
Respondents need to provide a single response
Statements should not impose the researcher's views on respondents
Next Lecture

• During next lecture we will discuss Dissertation Writing.
