
Describing Data Numerically

Central Tendency: Arithmetic Mean, Median, Mode

Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Measures of Central Tendency
Overview
Central Tendency

• Mean: arithmetic average, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
• Median: midpoint of ranked values
• Mode: most frequently observed value
Arithmetic Mean
• The arithmetic mean (mean) is the most common measure of central tendency
• For a population of N values (x_i are the population values, N is the population size):
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
• For a sample of size n (x_i are the observed values, n is the sample size):
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
Arithmetic Mean

• The most common measure of central tendency
• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)

[Figure: two dot plots on a 0 to 10 axis]
Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
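The effect of a single extreme value on the mean can be checked directly. The sketch below is illustrative only; it uses Python's standard `statistics` module, and the variable names are not from the slides.

```python
from statistics import mean

# The two data sets from the slide: identical except for the largest value.
data = [1, 2, 3, 4, 5]
data_with_outlier = [1, 2, 3, 4, 10]

mean_plain = mean(data)                  # 15 / 5
mean_outlier = mean(data_with_outlier)   # 20 / 5

print(mean_plain, mean_outlier)
```

Replacing one value (5 with 10) moves the mean from 3 to 4, which is what "affected by extreme values" means in practice.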
Median
• In an ordered list, the median is the "middle" number (50% above, 50% below)

[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]

• Not affected by extreme values

Finding the Median

• The location of the median:
$$\text{Median position} = \frac{n+1}{2} \text{ position in the ordered data}$$
• If the number of values is odd, the median is the middle number
• If the number of values is even, the median is the average of the two middle numbers
• Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
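The position rule can be sketched in a few lines of Python. The function name `median_by_position` is hypothetical, and the second data set is made up to show the even case.

```python
import math

def median_by_position(values):
    """Return (position, median) using the (n + 1) / 2 position rule."""
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2                    # 1-based position, may end in .5
    lower = ranked[math.floor(position) - 1]  # value at or just below the position
    upper = ranked[math.ceil(position) - 1]   # value at or just above the position
    return position, (lower + upper) / 2

odd_case = median_by_position([1, 2, 3, 4, 10])   # n = 5, position 3
even_case = median_by_position([4, 1, 3, 2])      # n = 4, position 2.5
print(odd_case, even_case)
```

For an odd n the two lookups hit the same value; for an even n they hit the two middle values, whose average is the median.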
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes

[Figure: dot plot on a 0 to 14 axis with Mode = 9; dot plot on a 0 to 6 axis with no mode]
Review Example
• Five houses on a hill by the beach

House Prices:
$2,000,000
$500,000
$300,000
$100,000
$100,000
Review Example:
Summary Statistics

House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000 (Sum = $3,000,000)

• Mean: $3,000,000/5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
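These three summary statistics can be reproduced with Python's standard `statistics` module. This is a minimal sketch; `multimode` is used because it returns every most-frequent value rather than assuming there is exactly one.

```python
from statistics import mean, median, multimode

# House prices from the review example, in dollars.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(prices)        # sum / count
price_median = median(prices)    # middle value of the ranked data
price_modes = multimode(prices)  # list of the most frequent value(s)

print(price_mean, price_median, price_modes)
```

The single $2,000,000 house pulls the mean to twice the median, which is why the median is often preferred for house prices.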
Which measure of location
is the "best"?
• Mean is generally used, unless extreme values (outliers) exist
• Then the median is often used, since the median is not sensitive to extreme values
  • Example: median home prices may be reported for a region, since they are less sensitive to outliers
Shape of a Distribution

• Describes how data are distributed
• Measures of shape
  • Symmetric or skewed

Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
Geometric Mean
• Geometric mean
  • Used to measure the rate of change of a variable over time
$$\bar{x}_g = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}$$
• Geometric mean rate of return
  • Measures the status of an investment over time
$$\bar{r}_g = [(1 + x_1)(1 + x_2) \cdots (1 + x_n)]^{1/n} - 1$$
where x_i is the rate of return in time period i, written as a decimal, so that 1 + x_i is the growth factor for period i
Example

An investment of $100,000 rose to $150,000 at the end of year one and increased to $180,000 at the end of year two:

X1 = $100,000, X2 = $150,000, X3 = $180,000
Year 1: 50% increase; Year 2: 20% increase

What is the mean percentage return over time?

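Example
(continued)

Use the 1-year returns to compute the arithmetic mean and the geometric mean:

Arithmetic mean rate of return:
$$\bar{x} = \frac{50\% + 20\%}{2} = 35\%$$
Misleading result: growing $100,000 at 35% for two years gives $182,250, not $180,000.

Geometric mean rate of return, using the growth factors 1 + 0.50 and 1 + 0.20:
$$\bar{r}_g = [(1.50)(1.20)]^{1/2} - 1 = (1.80)^{1/2} - 1 = 1.3416 - 1 = 34.16\%$$
More accurate result: growing $100,000 at 34.16% for two years gives approximately $180,000.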
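A minimal Python sketch of this example, assuming each period's return is converted to a growth factor 1 + r before multiplying. The variable names are illustrative.

```python
# Investment value at the start, after year one, and after year two.
values = [100_000, 150_000, 180_000]

# One-year rates of return as decimals: 0.5 and 0.2.
returns = [(later - earlier) / earlier for earlier, later in zip(values, values[1:])]

arithmetic_mean_return = sum(returns) / len(returns)

# Multiply the growth factors (1 + r), take the n-th root, subtract 1.
growth = 1.0
for r in returns:
    growth *= 1 + r
geometric_mean_return = growth ** (1 / len(returns)) - 1

# Compounding at the geometric mean rate should reproduce the final value.
final_from_geometric = values[0] * (1 + geometric_mean_return) ** len(returns)
final_from_arithmetic = values[0] * (1 + arithmetic_mean_return) ** len(returns)

print(f"arithmetic: {arithmetic_mean_return:.2%}, geometric: {geometric_mean_return:.2%}")
```

Compounding at the geometric rate recovers the observed $180,000, while compounding at the arithmetic rate overshoots it, which is the sense in which the arithmetic mean is misleading here.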
2.2
Measures of Variability
Variation

Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation

• Measures of variation give information on the spread or variability of the data values.

[Figure: distributions with the same center, different variation]
Range
• Simplest measure of variation
• Difference between the largest and the smallest observations:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

Example:

[Figure: dot plot on a 0 to 14 axis]

Range = 14 - 1 = 13
Disadvantages of the Range
• Ignores the way in which data are distributed

[Figure: two dot plots on a 7 to 12 axis with different distributions]
Range = 12 - 7 = 5 in both cases

• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
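The outlier sensitivity can be reproduced with the two 25-value data sets above. This is a short sketch; `value_range` is a hypothetical helper name.

```python
def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

# Eleven 1s, eight 2s, four 3s and one 4, as on the slide.
common = [1] * 11 + [2] * 8 + [3] * 4 + [4]

typical = common + [5]         # last value is 5
with_outlier = common + [120]  # last value replaced by 120

print(value_range(typical), value_range(with_outlier))
```

Changing one observation out of 25 changes the range from 4 to 119, even though the other 24 values are identical.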
Interquartile Range

• Can eliminate some outlier problems by using the interquartile range
• Eliminate high- and low-valued observations and calculate the range of the middle 50% of the data
• Interquartile range = 3rd quartile - 1st quartile

$$\text{IQR} = Q_3 - Q_1$$
Interquartile Range

Example:
[Figure: ranked data split into four segments, each holding 25% of the values]
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70

Interquartile range = 57 - 30 = 27
Quartiles
• Quartiles split the ranked data into 4 segments with an equal number of values per segment

25% | Q1 | 25% | Q2 | 25% | Q3 | 25%

• The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
• Q2 is the same as the median (50% are smaller, 50% are larger)
• Only 25% of the observations are greater than the third quartile
Quartile Formulas

Find a quartile by determining the value in the appropriate position in the ranked data, where

First quartile position: Q1 = 0.25(n+1)
Second quartile position: Q2 = 0.50(n+1) (the median position)
Third quartile position: Q3 = 0.75(n+1)

where n is the number of observed values

Quartiles

• Example: Find the first quartile

Sample Ranked Data: 11 12 13 16 16 17 18 21 22 (n = 9)

Q1 is in the 0.25(9+1) = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values (12 and 13), so Q1 = 12.5
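The (n + 1) position rule can be sketched as follows. The slide only shows the halfway case, so interpolating linearly for any fractional position is an assumption of this sketch, and `quartile` is a hypothetical helper name.

```python
def quartile(values, q):
    """Value at position q * (n + 1) in the ranked data (1-based).

    Fractional positions are resolved by linear interpolation between the
    two neighbouring ranked values (an assumption; the slide only shows
    the halfway case).
    """
    ranked = sorted(values)
    n = len(ranked)
    position = q * (n + 1)
    lower = int(position)          # integer part of the position
    fraction = position - lower    # fractional part
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    return ranked[lower - 1] + fraction * (ranked[lower] - ranked[lower - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1 = quartile(data, 0.25)   # position 2.5
q2 = quartile(data, 0.50)   # position 5, the median
q3 = quartile(data, 0.75)   # position 7.5
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Other textbooks and software use slightly different position rules, so quartiles computed elsewhere may differ a little from these values on small data sets.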
Population Variance
• Average of squared deviations of values from the mean
• Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Where μ = population mean
N = population size
x_i = ith value of the variable x
Sample Variance
• Average (approximately) of squared deviations of values from the mean
• Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where $\bar{x}$ = arithmetic mean
n = sample size
x_i = ith value of the variable x
Population Standard Deviation
• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
• Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Sample Standard Deviation
• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
• Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Calculation Example:
Sample Standard Deviation

Sample Data (x_i): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16

$$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$$
$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$$
$$= \sqrt{\frac{130}{7}} = 4.3095$$

A measure of the "average" scatter around the mean
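The calculation can be recomputed step by step from the eight data values. This is a minimal sketch with illustrative variable names; the standard library's `statistics.stdev` is used only as a cross-check.

```python
from statistics import stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
x_bar = sum(data) / n                                  # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]  # 36, 16, 4, 1, 1, 4, 4, 64
sum_of_squares = sum(squared_deviations)
sample_variance = sum_of_squares / (n - 1)             # divide by n - 1, not n
sample_std = sample_variance ** 0.5

print(x_bar, sum_of_squares, round(sample_std, 4))
```

The squared deviations sum to 130, so the sample variance is 130/7 and the sample standard deviation is about 4.3095, matching `stdev(data)`.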
Measuring variation

[Figure: a narrow distribution (small standard deviation) and a wide distribution (large standard deviation)]

Comparing Standard Deviations

[Figure: three dot plots on an 11 to 21 axis, all with the same mean]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
Advantages of Variance and
Standard Deviation

• Each value in the data set is used in the calculation
• Values far from the mean are given extra weight (because deviations from the mean are squared)
Coefficient of Variation

• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Can be used to compare two or more sets of data measured in different units

$$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$$
Comparing Coefficient
of Variation
• Stock A:
  • Average price last year = $50
  • Standard deviation = $5
$$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
• Stock B:
  • Average price last year = $100
  • Standard deviation = $5
$$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its price
Chebychev's Theorem
• For any population with mean μ and standard deviation σ, and k > 1, the percentage of observations that fall within the interval
$$[\mu \pm k\sigma]$$
is at least
$$100\left[1 - \frac{1}{k^2}\right]\%$$
Chebychev's Theorem
(continued)
• Regardless of how the data are distributed, at least (1 - 1/k²) of the values will fall within k standard deviations of the mean (for k > 1)
• Examples:

At least | within
(1 - 1/1.5²) = 55.6% | k = 1.5 (μ ± 1.5σ)
(1 - 1/2²) = 75% | k = 2 (μ ± 2σ)
(1 - 1/3²) = 89% | k = 3 (μ ± 3σ)
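The bound is easy to tabulate for any k > 1. This is a minimal sketch; `chebyshev_lower_bound` is a hypothetical name.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is stated for k > 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {100 * chebyshev_lower_bound(k):.1f}%")
```

These are guaranteed minimums for any distribution; a particular data set will usually have more of its values inside each interval.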
The Empirical Rule
• If the data distribution is bell-shaped, then the interval:
  • μ ± 1σ contains about 68% of the values in the population or the sample

[Figure: bell curve with the central 68% between μ - 1σ and μ + 1σ]
The Empirical Rule

  • μ ± 2σ contains about 95% of the values in the population or the sample
  • μ ± 3σ contains almost all (about 99.7%) of the values in the population or the sample

[Figure: bell curves with the central 95% within μ ± 2σ and 99.7% within μ ± 3σ]
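The 68%, 95% and 99.7% figures can be checked against an exact normal distribution, taken here as the idealized "bell shape" (an assumption; real data are only approximately normal). The sketch uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * coverage_within(k):.2f}%")
```

For a normal distribution these coverages are well above the distribution-free lower bounds that hold for any population, which is why the rule applies only to bell-shaped data.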
2.3
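Weighted Mean

• The weighted mean of a set of data is
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{\sum_{i=1}^{n} w_i}$$
Where w_i is the weight of the ith observation and $\sum w_i$ is the total weight
• Use when data is already grouped into n classes, with w_i values in the ith class (the total weight is then the total number of observations)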
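A short sketch of the grouped-data case. The class values and counts below are made up for illustration, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical grouped data: the value 1 occurs twice, 2 three times, 3 five times.
class_values = [1, 2, 3]
class_counts = [2, 3, 5]

grouped_result = weighted_mean(class_values, class_counts)

# Same data written out observation by observation.
ungrouped = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
ungrouped_result = sum(ungrouped) / len(ungrouped)

print(grouped_result, ungrouped_result)
```

With class counts as weights, the weighted mean equals the ordinary mean of the ungrouped observations.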
2.4
The Sample Covariance
• The covariance measures the direction of the linear relationship between two variables (its magnitude depends on the units of x and y)
• The population covariance:
$$\text{Cov}(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
• The sample covariance:
$$\text{Cov}(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

• Only concerned with the linear relationship
• No causal effect is implied
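Interpreting Covariance

• Covariance between two variables:

Cov(x,y) > 0: x and y tend to move in the same direction
Cov(x,y) < 0: x and y tend to move in opposite directions
Cov(x,y) = 0: x and y have no linear relationship (independent variables have zero covariance, but zero covariance alone does not imply independence)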
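The three cases can be illustrated with small made-up data sets. In this sketch `sample_covariance` is a hypothetical helper, and the last case shows that a covariance of zero can occur even when y is completely determined by x.

```python
def sample_covariance(xs, ys):
    """Sum of (x_i - x_bar)(y_i - y_bar), divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

xs = [-2, -1, 0, 1, 2]

cov_same_direction = sample_covariance(xs, [2 * x for x in xs])   # y rises with x
cov_opposite = sample_covariance(xs, [-2 * x for x in xs])        # y falls as x rises
cov_curved = sample_covariance(xs, [x**2 for x in xs])            # y = x squared

print(cov_same_direction, cov_opposite, cov_curved)
```

In the last case y depends entirely on x, yet the covariance is zero because the relationship is not linear.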
Coefficient of Correlation
• Measures the relative strength of the linear relationship between two variables
• Population correlation coefficient:
$$\rho = \frac{\text{Cov}(x, y)}{\sigma_x \sigma_y}$$
• Sample correlation coefficient:
$$r = \frac{\text{Cov}(x, y)}{s_x s_y}$$
Features of
Correlation Coefficient, r
• Unit free
• Ranges between -1 and 1
• The closer to -1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
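A minimal sketch of the sample correlation coefficient, computed as the sample covariance divided by the product of the sample standard deviations. The data are made up for illustration and `sample_correlation` is a hypothetical helper name.

```python
from statistics import stdev

def sample_correlation(xs, ys):
    """r = sample covariance / (s_x * s_y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return covariance / (stdev(xs) * stdev(ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = sample_correlation(xs, ys)
r_rescaled = sample_correlation([100 * x for x in xs], ys)    # change the units of x
r_perfect_negative = sample_correlation(xs, [-x for x in xs])

print(round(r, 4), round(r_rescaled, 4), round(r_perfect_negative, 4))
```

Rescaling x leaves r unchanged, which is what "unit free" means, and an exact decreasing line gives r = -1, the lower end of the range.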
Scatter Plots of Data with Various
Correlation Coefficients

[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0 in the first row and r = +1, r = +.3, r = 0 in the second row]
Summary
• Described measures of central tendency
  • Mean, median, mode
• Illustrated the shape of the distribution
  • Symmetric, skewed
• Described measures of variation
  • Range, interquartile range, variance and standard deviation, coefficient of variation
• Discussed measures of grouped data
• Calculated measures of relationships between variables
  • Covariance and correlation coefficient
