BRM Unit 3 & 5 Data Analysis

The document discusses different types of data and methods of data analysis. It defines categorical and continuous data and different types of each. It also explains various steps and methods used for data preparation, summarization, and analysis including tabulation, graphical representation, descriptive and inferential statistics.

Uploaded by

Aman Singh


Dr. Urooj A Siddiqui
 Data – Raw Facts, especially numerical facts,
collected together for reference or
information.
 Data is collected on some particular
variable/s
 Data analysis is processing of data to derive
useful information
 Knowledge communicated concerning some
particular fact
 The created knowledge helps in APPLICATION /
DECISION MAKING
 Categorical:Qualitative
 Continuous: Quantitative

Data
   Categorical: Nominal, Ordinal
   Continuous: Interval, Ratio

 Variable: any phenomenon which takes at least two different values/observations
 Data: the set of values/observations collected on a variable is called data
 Nominal
 Ordinal
 Interval
 Ratio
1. Data Preparation / Initial Operations
 Editing / Cleaning
 Coding
 Classification
 Tabulation
 Graphical Representation

2. Summarizing Data / Data Analysis Operations
 Tables / Crosstab
 Graph / Figure
 Statistical Analysis
   1. Descriptive Methods
       Frequency, %age, Ratio
       Mean, Median, Standard Deviation (Variance)
   2. Inferential Methods
       Comparison (t/z-test/Anova)
       Association (chi square test)
       Correlation (r)
       Prediction / Regression (y = ax + b)
 Editing / Data Cleaning
 examining the collected raw data to detect any errors and omit/correct them if possible
 Coding
 assigning numerals to answers so that responses can be put into a limited number of categories
 Classification
 grouping of data on some basis (a large volume of raw data is reduced into homogeneous groups)
I. Attribute - on the basis of demographic attributes, e.g. gender, rural/urban, day scholar/hosteller
II. Class Interval - on the basis of some numeric range, e.g. 0-10, 10-20 etc.
I. Tabulation
 is the process of displaying raw data in tabular
form and summarising it for further analysis
 orderly arranging data in columns and rows
Tabulation is essential because
 It conserves space and reduces statements
 It facilitates the process of summation of
items, comparison, detection of errors and
omissions
 Basis for various statistical computations
Name     Gender  Caste   Age  Mob. No.    Edu   Yrs in school  IQ  Pain level  Temp of locality (deg cel)
Ram      M       Hindu   60   9450366367  NIL   0              16  Mild-0      -4
Akbar    M       Muslim  65   8004896712  HS    16             14  Mod-1       20
Sita     F       Hindu   309  9934876545  Int.  19             0   Mild-0      15
Shalini  F       Hindu   90   2542543598  HS    8              16  Mild-0      0
Mehnaj   F       Sikh    38   9458098734  UG    21             13  Severe-2    0
Ravi     M       Hindu   48   9412890112  PG    23             20  Mod-1       -1
Hari     M       Hindu   45   8796654398  Prim  12             10  Mod-1       30

Name  Gender  Caste  Age  Mob.No.     Edu level  Yrs in sch.  IQ  Pain level  Temp of locality (deg cel)
7     1       1      60   9450366367  1          0            16  0           4
2     1       2      65   8004896712  1          16           14  2           20
5     2       1      35   9934876545  2          19           0   0           15
4     2       1      90   2542543598  1          8            16  0           0
3     2       3      38   9458098734  3          21           13  3           0
6     1       1      48   9412890112  4          23           20  2           -1
1     1       1      45   8796654398  0          12           10  2           30

Nominal and Ordinal variables are called qualitative. Interval and Ratio variables are called quantitative.
 Single / Multi Variable Table - one or more variables (no interaction)

Raw data
Roll No.  Age (yr)
1         22
2         24
3         23
4         26
5         19
6         25
.         .
.         .
60        22

Single Variable Freq. Table
Age Group (years)  Freq.
Below 20           2
20-22              28
22-24              16
24-26              10
Above 26           4
Total              60

**Multiple Variable Table – as presented in the above slide
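The step from the raw roll number/age list to the age-group frequency table can be sketched in a few lines of Python. This is only an illustration: it uses the six ages visible in the raw table (the remaining rows are elided in the slide), and the boundary rule (lower limit inclusive, 26 counted in 24-26) is an assumption, since the slide does not say where a boundary value such as 22 belongs.

```python
from collections import Counter

# Ages of the six students visible in the raw table; the other rows are not shown.
ages = [22, 24, 23, 26, 19, 25]

def age_group(age):
    """Assign an age to a class interval (boundary rule is an assumption)."""
    if age < 20:
        return "Below 20"
    if age < 22:
        return "20-22"
    if age < 24:
        return "22-24"
    if age <= 26:
        return "24-26"
    return "Above 26"

# Tabulation: count how many observations fall in each class interval.
freq = Counter(age_group(a) for a in ages)

for group in ["Below 20", "20-22", "22-24", "24-26", "Above 26"]:
    print(f"{group:10s} {freq[group]}")
```

With all 60 rows available, the same function would reproduce a frequency table like the one on the slide, provided the boundary rule matches the one used there.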
 Crosstabs – interaction of two or more variables
Two Variable Interaction – Crosstab

             Gender
Age Group    Male  Female  Total
Below 20     1     1       2
20-22        18    10      28
22-24        9     7       16
24-26        7     3       10
Above 26     3     1       4
Total        38    22      60
Graphical Representation of Data
 Pie Chart
 Bar Graph
 Histogram
 Line Graph
 Scatter Plot
 Scatter Plot & Correlation
Pie Charts
 It is used to represent %ages, distribution of 1 variable at various levels
[Figure: pie chart "Sales (in mn)" by quarter (1st Qtr to 4th Qtr); slice labels 8.2 (58%), 3.2 (23%), 1.4 (10%), 1.2 (8%)]
Bar Chart
 It is used to represent 1 variable at various levels
 Levels can be year/ groups etc.
[Figure: bar chart "Sales" by year, 2018 to 2021; data labels 2.5, 3.5, 4.3 and 4.5]
Bar Chart
[Figure: clustered bar chart by year, 2018 to 2020, with four series (1st, 2nd, 3rd, 4th); data labels range from 1.8 to 4.4]
Histogram
 To show the distribution of a quantitative variable
[Figure: histogram of the roll no./age data; x-axis: Class Interval/Variable Unit (10 to 50), y-axis: Frequency]
Line Diagram
 To show change in a variable in a particular time period / on some reference range
[Figure: line diagram of Stock Price (₹ 5.60 to ₹ 7.40) over the Last 10 Days]
Line Diagram
 May also be used to compare 2 or more variables along the range
[Figure: line diagram comparing three series (Adani, Tata, Reliance) over points 1 to 8]
Scatter Plot
 It is used to express relationships between two variables
[Figure: scatter plot; x-axis: Adv Budget in 10 Lacs, y-axis: Sales in Crore]
Scatter Plot
 Trend Lines - Correlation

Income / day  No. of families
0-500         20
500-1000      30
1000-1500     50
1500-2000     70
2000-2500     40
2500-3000     30
3000-3500     10

[Figure: scatter plot with trend line; x-axis: Income (0 to 4000), y-axis: No. of families (0 to 80)]
     age (xi)  x - xi  (x - xi) sqr.
A    21        2       4
B    22        1       1
C    23        0       0
D    24        -1      1
E    25        -2      4
Sum            0       10

mean (x) = 23
Avg Sq (variance) = 2 (10 by 5), n = 5
SD (root of variance), s = 1.41
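The arithmetic in this table can be checked with a short Python sketch. It follows the slide in dividing the sum of squared deviations by n = 5; the variable names are illustrative only.

```python
import math

# Ages of the five people A to E from the table.
ages = [21, 22, 23, 24, 25]
n = len(ages)

mean = sum(ages) / n                              # mean x
deviations = [mean - x for x in ages]             # x - xi, sums to 0
squared_devs = [d ** 2 for d in deviations]       # (x - xi) sqr.

variance = sum(squared_devs) / n                  # average of squared deviations (divide by n, as on the slide)
sd = math.sqrt(variance)                          # SD = root of variance

print(mean, sum(deviations), sum(squared_devs), variance, round(sd, 2))
```

Dividing by n - 1 instead of n, as is usual when the five values are treated as a sample, would give a variance of 2.5 rather than 2.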
Raw data
Roll No.  Age (yr)
1         22
2         24
3         23
4         26
5         19
6         22
.         .
.         .
60        22

Age Group (years)  Freq.  Probability
Below 20           2      2/60
20-22              28     28/60
22-24              16     16/60
24-26              10     10/60
Above 26           4      4/60
Total              60

Mean = 23 (years)  (x - sample - known) (µ - population - unknown)
SD = 2 (years)     (s - sample - known) (σ - population - unknown)
A distribution of the frequencies of observations is known as a probability distribution

 Z - Normal Distribution/Test - Mean (µ), SD (σ)
 To compare means (1 or 2 means)
 t - Distribution/Test - Mean (x), SD (s)
 To compare means (1 or 2 means)
 Chi Square Distribution / Test
 To compare sample SD with population SD
 F Test
 To compare two sample variances
A freq. distribution with a bell-shaped curve and some known properties
 Parameters - Mean (µ), SD (σ)
 Known properties
 About 68% of values are within µ ± 1 SD
 About 95% of values are within µ ± 2 SD
 About 99.7% of values are within µ ± 3 SD

 95% CI = µ ± 2 × SD (range)

 Lower limit = µ - 2 × SD
 Upper limit = µ + 2 × SD
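[Figure: normal curve of age with mean 23 and SD 2; axis marked at 23, 21 and 25 (± 1 SD), 19 and 27 (± 2 SD), 17 and 29 (± 3 SD)]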
Example of our case
 95% CI = µ ± 2 × SD
 Lower limit = µ - 2 × SD, Upper limit = µ + 2 × SD
 LL = 23 - 2 × 2 = 19, UL = 23 + 2 × 2 = 27
 95% CI Range = 19-27 years
 About 95% of the students in the class are in the range of 19-27 yrs
 We are 95% confident that if we randomly select a student from the class, his/her age will be within this range (19-27 yrs)
 The reverse is Hypothesis Testing
 If the mean and SD of a population are known and some value is given, can we determine whether it belongs to this population or distribution?
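The range in this example, and the reverse question of whether a given value belongs to the distribution, can be sketched with Python's standard `statistics.NormalDist`. The sketch assumes the ages are normally distributed with mean 23 and SD 2, as in the example; the test values 24 and 30 are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

mean, sd = 23, 2                       # class mean and SD from the example

lower = mean - 2 * sd                  # lower limit
upper = mean + 2 * sd                  # upper limit

# Share of a normal distribution that falls inside mean ± 2 SD.
age_dist = NormalDist(mu=mean, sigma=sd)
coverage = age_dist.cdf(upper) - age_dist.cdf(lower)

def belongs(value):
    """Reverse question: is the value inside the expected mean ± 2 SD range?"""
    return lower <= value <= upper

print(lower, upper, round(coverage, 4))
print(belongs(24), belongs(30))        # hypothetical ages, for illustration only
```

The computed coverage is slightly above 95%, which is why µ ± 2 SD is used as a convenient rounding of the exact multiplier.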
[Figure: standard normal curve; z-axis marked at 0, ± 0.5, ± 1 and ± 1.5]
Finding Probability
 Calculate the z score (test statistic) of the observed value or hypothesized value with the formula
 Determine the p value associated with the particular z score at the selected significance level (5%)
 The p value can be seen in the tables of the particular test

When Population SD is KNOWN: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
When Population SD is UNKNOWN: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
 Two types of Hypothesis, Null - H0, Alternate - Ha
P Value Method
 Determine p value
 Compare with selected alpha level (0.05)
 p ≤ 0.05 – Reject Null
 p > 0.05 – Fail to Reject null / accept null
 This method is generally employed by data analysis software – Excel, SPSS

Table Value Method
 Calculate test statistic – TSCal
 Determine Critical value of test statistic at selected significance level – TSTab
 If TSCal ≥ TSTab – Reject Null
 If TSCal < TSTab – Fail to Reject null / accept null
 This method is generally employed when manual testing is done
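Both decision rules can be sketched side by side for a z test using Python's standard `statistics.NormalDist`. The function name `z_test_decision` and the two example z values are hypothetical, and the sketch assumes a two-tailed test, which the slides do not state explicitly.

```python
from statistics import NormalDist

def z_test_decision(z_cal, alpha=0.05):
    """Apply the p value method and the table value method to a calculated z (two-tailed)."""
    std_normal = NormalDist()                          # mean 0, SD 1

    # P value method: probability of a z at least this extreme if the null is true.
    p_value = 2 * (1 - std_normal.cdf(abs(z_cal)))
    reject_by_p = p_value <= alpha

    # Table value method: compare TSCal with the critical (table) value TSTab.
    z_tab = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_table = abs(z_cal) >= z_tab

    return p_value, z_tab, reject_by_p, reject_by_table

for z in (2.5, 1.0):                                   # hypothetical test statistics
    p, z_tab, by_p, by_table = z_test_decision(z)
    print(f"z={z}: p={p:.4f}, table value={z_tab:.2f}, reject by p={by_p}, reject by table={by_table}")
```

The two methods always agree, because the table value is simply the z score whose p value equals the chosen alpha level.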
RN  Gender (G)  Caste (C)  Age (A)  Mob.No.     No. of Classes (N)  Marks Obtained (M)  Specialization Opted (S)
1   1           1          22       9450366367  87                  72                  HR-3
2   1           2          24       8004896712  65                  68                  HR-3
3   2           1          26       9934876545  48                  56                  Fin.-2
4   2           1          21       2542543598  95                  83                  Mktg.-1
5   2           3          22       9458098734  65                  58                  Fin.-2
6   1           1          23       9412890112  74                  65                  Mktg.-1

• Mean & Variance (SD) – e.g. A, N, M – sample statistics – x, s

• Correlation – e.g. N-M, A-N, A-M – r
• Association between Gender and Sp. Opted (G and S) – chi square
Note: a sample characteristic is called a Statistic; a population characteristic is called a Parameter
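As an illustration of the correlation statistic r, the sketch below computes Pearson's r between No. of Classes (N) and Marks Obtained (M) for the six students in the table. The helper `pearson_r` is written out by hand for clarity and is not taken from the slides.

```python
import math

# Columns N and M for the six students shown in the table.
classes_attended = [87, 65, 48, 95, 65, 74]
marks_obtained = [72, 68, 56, 83, 58, 65]

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y divided by their individual variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(classes_attended, marks_obtained)
print(round(r, 3))
```

A value of r close to +1 indicates that students who attended more classes tended to obtain higher marks in this small sample; with only six rows it says little about the population.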
 Assume a population – N, µ, σ
 Now assume we take many samples of size n and calculate the mean of each sample
 x1, x2, x3, x4, x5, x6, . . . . . . . . x100
 Can we make a freq. distribution of these values and draw a curve?
 Now when we draw a distribution of these values we will have an average (x) and SD (s)
 This average is called the mean of means and is considered the mean of the population
 The SD of this distribution of sample means is calculated as $SE = \sigma / \sqrt{n}$, which is called the Standard Error
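The idea of drawing many samples and looking at the distribution of their means can be tried out with a small simulation. Everything in this sketch is an assumption made for illustration: a simulated population of 10,000 ages with mean near 23 and SD near 2, 1,000 samples of size 30, and a fixed random seed so the run is repeatable.

```python
import math
import random
import statistics

rng = random.Random(0)                                   # fixed seed for a repeatable run

# Simulated population (hypothetical): 10,000 ages around 23 with SD around 2.
population = [rng.gauss(23, 2) for _ in range(10_000)]
mu = statistics.fmean(population)                        # population mean
sigma = statistics.pstdev(population)                    # population SD

n = 30                                                   # sample size
sample_means = [
    statistics.fmean(rng.sample(population, n))          # mean of one random sample
    for _ in range(1_000)
]

mean_of_means = statistics.fmean(sample_means)           # centre of the sampling distribution
sd_of_means = statistics.pstdev(sample_means)            # spread of the sampling distribution
standard_error = sigma / math.sqrt(n)                    # theoretical value, sigma / sqrt(n)

print(round(mu, 2), round(mean_of_means, 2))
print(round(sd_of_means, 3), round(standard_error, 3))
```

In a run like this the mean of means sits very close to the population mean, and the SD of the sample means sits close to σ/√n rather than to σ itself.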
 Sample mean & their difference - z / t
 Sample correlation statistic– z / t (derived from r)
 Variance (SD2) – F
 Association – Chi Sqr.

 Central Limit Theorem

 If we collect many samples and draw the distribution of their means, the mean of this distribution is the population mean and its SD is $\sigma / \sqrt{n}$ (the Standard Error)
 We use CLT in Hypothesis Testing
 z - when σ is Known and sample size is ≥ 30
 t - when σ is Unknown and sample size < 30
 In sample estimation t test is employed

 Example - H0 & H1
 H0 – There is no difference b/w the means of two groups
 H1 – There is a significant difference b/w the means of two groups
 H0 – There is no difference b/w mean marks of males & females
 H1 – There is a significant difference b/w mean marks of males & females
 Hypothesis Testing steps
 Set Null Value (µ1 = µ2, µ1 - µ2 = 0) – Make Null Distribution – Calculate z / t sample test statistic – compare with table value/set p value – reject/accept null
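These steps can be sketched for the male/female marks example using the table value method. The marks are those of the six students in the coded table, under the assumption that gender code 1 means male and 2 means female (as in the earlier coded table); the pooled-variance t statistic and the table value 2.776 (two-tailed, alpha 0.05, df = 4) are standard choices, not something the slides specify.

```python
import math
from statistics import mean, variance

# Marks Obtained (M) split by Gender (G); assumes code 1 = male, 2 = female.
male_marks = [72, 68, 65]
female_marks = [56, 83, 58]

def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error

t_cal = pooled_t(male_marks, female_marks)   # calculated test statistic (TSCal)
t_tab = 2.776                                # table value for df = 4, alpha = 0.05, two-tailed

decision = "reject H0" if abs(t_cal) >= t_tab else "fail to reject H0"
print(round(t_cal, 3), decision)
```

With only three students per group the test has very little power, so this serves only to show the mechanics of the comparison.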
F Test
 Used to compare the variances of two samples
 Employed in ANOVA – analysis of variance
 When there are more than two groups and their means are to be compared
 Example
 Comparison of marks among three streams of students: arts, commerce and science
 H0 – There is no difference among mean marks of three groups
 H1 – There is a significant difference among mean marks of three groups

 Set Null Value (µ1 = µ2 = µ3) – Make Null Distribution – Calculate F test statistic – compare with table value/p value – reject/accept null
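A one-way ANOVA for the three-stream example can be sketched as follows. The marks are hypothetical values made up for illustration (the slides give no data for the streams), and 5.14 is the standard F table value for alpha 0.05 with 2 and 6 degrees of freedom.

```python
from statistics import fmean

# Hypothetical marks for three streams (not from the slides).
groups = {
    "arts": [60, 65, 70],
    "commerce": [62, 68, 74],
    "science": [75, 80, 85],
}

all_marks = [m for marks in groups.values() for m in marks]
grand_mean = fmean(all_marks)

# Variation between group means and variation within each group.
ss_between = sum(len(marks) * (fmean(marks) - grand_mean) ** 2 for marks in groups.values())
ss_within = sum((m - fmean(marks)) ** 2 for marks in groups.values() for m in marks)

df_between = len(groups) - 1                 # k - 1
df_within = len(all_marks) - len(groups)     # N - k

f_cal = (ss_between / df_between) / (ss_within / df_within)
f_tab = 5.14                                 # table value for df = (2, 6), alpha = 0.05

decision = "reject H0" if f_cal >= f_tab else "fail to reject H0"
print(round(f_cal, 3), decision)
```

F is the ratio of the between-group variance to the within-group variance, so a large F means the group means differ by more than the scatter inside the groups would explain.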
Test of Independence

 It is used to determine association between two categorical variables (nominal & ordinal)
 Example
 Gender (M/F) and Opted Specialization (Mktg./Fin./HR)
 Questions like 'is any specialisation preferred by females?' are answered
 H0 – There is no association b/w gender and opted specialisation
 H1 – There is a significant association b/w gender & opted specialisation
 Here, mean is not calculated; instead the frequency of categories is taken into consideration
 Actual Frequency and Expected Frequency
 Cross tabs are used to calculate actual & expected freq

Two Variable Interaction – Crosstab

                                  Gender
Opted Specialization  Total (60)  Male (40)  Female (20)
Mktg.                 30          20         8
Fin.                  15          10         2
HR                    15          10         10
Total                 60          40         20

 Hypothesis Testing steps

 Set Null Value (actual freq. = expected freq.) – Make Null
Distribution – Calculate chi sqr. sample test statistic –
compare with table value/set p value – reject/accept null
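The chi square calculation can be sketched from the Male and Female cells of the crosstab. Two assumptions are made here: the six cell values are treated as the actual (observed) frequencies, and the row totals are recomputed from those cells rather than taken from the Total column. The table value 5.991 is the standard chi square critical value for df = 2 at alpha 0.05.

```python
# Actual (observed) frequencies: Male and Female cells of the crosstab.
observed = {
    "Mktg.": {"Male": 20, "Female": 8},
    "Fin.": {"Male": 10, "Female": 2},
    "HR": {"Male": 10, "Female": 10},
}

# Row totals are recomputed from the cells (an assumption of this sketch).
row_totals = {spec: sum(cells.values()) for spec, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for gender, count in cells.items():
        col_totals[gender] = col_totals.get(gender, 0) + count
grand_total = sum(row_totals.values())

chi_sq = 0.0
for spec, cells in observed.items():
    for gender, actual in cells.items():
        # Expected frequency if gender and specialization were independent.
        expected = row_totals[spec] * col_totals[gender] / grand_total
        chi_sq += (actual - expected) ** 2 / expected

df = (len(row_totals) - 1) * (len(col_totals) - 1)   # (rows - 1) x (columns - 1)
chi_tab = 5.991                                      # table value for df = 2, alpha = 0.05

decision = "reject H0" if chi_sq >= chi_tab else "fail to reject H0"
print(round(chi_sq, 4), df, decision)
```

The statistic grows with the gap between actual and expected frequencies, so a value below the table value means the observed pattern is within what independence would plausibly produce.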
 Set Null and Alternate Hypothesis – H0 H1
 Select the null value
 Null – status quo, no difference, no effect
 Status quo – no change
 No difference – 0 difference
 No relationship – 0 effect / 0 correlation
 No association – 0 relationship (b/w nominal variab.)
 It is assumed that H0 is true in population
 Draw Null Distribution – find range of expected values
if null is true (µ ± 2.SE)
 Take observed value from sample and compare with
expected null values
 If observed value is among expected null range –
accept null
 If observed value is different from null range – reject
null
1. Univariate/Bi-variate
 Mean/Variance Estimation
 Z test
 T test
 Chi Square
 F Test

2. Multi-variate
 Correlation
 Regression
 Discriminant
 Cluster Analysis etc.
 Correlation
 Regression analysis
 1 dependent variable/DV (continuous)
 many independent variables/IV (continuous)
 Y = a.x1 +b.x2 +c.x3…….+.x.n

 Discriminant analysis
 1 dependent variable (categorical)
 many independent variables (continuous)
 Z (yes/no) = a.x1 +b.x2 +c.x3…….+.x.n
 Cluster analysis
 No DV/IV
 Used to group respondents/customers in
various cluster
 Employed in market segmentation

 Factor analysis
 No DV/IV
 Used to group variables in various cluster of
more condensed variables
