Statistics for Finance and Business
Analytics
FR1203 and BA1201
Week 1: Introduction to Statistics
Copyright © 2023 Pearson Education Ltd. Slide - 1
Motivation: Decision Making in an
Uncertain Environment (1 of 2)
Everyday decisions are based on incomplete
information
Examples:
• Will the job market be strong when I graduate?
• Will the price of Apple stock be higher in six months than it
is now?
• Will the Bank of England lower the interest rate if the inflation
rate is predicted to decrease?
Decision Making in an Uncertain
Environment (2 of 2)
Data are used to assist decision making
• Statistics is a tool to help process, summarize, analyze,
and interpret data
Key Definitions
• A population is the collection of all items of interest or
under investigation
– N represents the population size
• A sample is an observed subset of the population
– n represents the sample size
• A parameter is a specific characteristic of a population
• A statistic is a specific characteristic of a sample
Population vs. Sample
Population Sample
Values calculated using Values computed from
population data are sample data are called
called parameters statistics
Examples of Populations
• Names of all registered voters in the UK
• Incomes of all families living in London
• Annual returns of all stocks traded on the London
Stock Exchange
• Exam score averages of all the students in your
university
Random Sampling
Simple random sampling is a procedure in which
• each member of the population is chosen strictly by
chance,
• each member of the population is equally likely to be
chosen,
• every possible sample of n objects is equally likely to be
chosen
The resulting sample is called a random sample
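As a minimal sketch of simple random sampling, the Python snippet below draws a sample without replacement and compares a parameter with a statistic. The population values, the seed, and the variable names are assumptions made up for illustration; they do not come from the slides.

```python
import random
from statistics import mean

# Hypothetical population of N = 72 made-up values (not from the slides).
population = [100 + 3 * i for i in range(72)]
N = len(population)

rng = random.Random(2023)  # fixed seed so the sketch is reproducible
n = 9
# sample() draws n distinct members without replacement, so every
# possible sample of n objects is equally likely to be chosen.
sample = rng.sample(population, n)

mu = mean(population)  # parameter: computed from the whole population
x_bar = mean(sample)   # statistic: computed from the sample only

print(f"N = {N}, population mean = {mu}")
print(f"n = {n}, sample mean = {x_bar:.2f}")
```

The sample mean will generally differ from the population mean, and it changes from one random sample to the next.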
Systematic Sampling (1 of 2)
For systematic sampling,
• Ensure that the population is arranged in a way that is not
related to the subject of interest
• Select every j-th item from the population, where j is the ratio
of the population size to the sample size, j = N/n
• Randomly select a number from 1 to j for the first item
selected
The resulting sample is called a systematic sample
Systematic Sampling (2 of 2)
Example:
Suppose you wish to sample n = 9 items from a population
of N = 72.
j = N/n = 72/9 = 8
Randomly select a number from 1 to 8 for the first item to
include in the sample; suppose this is item number 3.
Then select every 8th item thereafter
(items 3, 11, 19, 27, 35, 43, 51, 59, 67)
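The example above can be sketched in Python as follows. The function name systematic_sample and its parameters are hypothetical, and the sketch assumes that N is an exact multiple of n, as it is here.

```python
import random

def systematic_sample(population, n, start=None, rng=random):
    """Return every j-th item, with j = N // n, from a start in 1..j."""
    N = len(population)
    j = N // n  # assumes N is a multiple of n, as in the slide example
    if start is None:
        start = rng.randint(1, j)  # random start between 1 and j inclusive
    # Items are numbered from 1, so item k sits at index k - 1.
    return [population[k - 1] for k in range(start, N + 1, j)][:n]

items = list(range(1, 73))  # item numbers 1..72
picked = systematic_sample(items, n=9, start=3)
print(picked)  # [3, 11, 19, 27, 35, 43, 51, 59, 67]
```

Leaving start unset lets the function choose the first item at random, which is the step that keeps the procedure a chance-based one.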
Descriptive and Inferential Statistics
Two branches of statistics:
• Descriptive statistics
– Graphical and numerical procedures to summarize and
process data
• Inferential statistics
– Using data to make predictions, forecasts, and
estimates to assist decision making
Descriptive Statistics
• Collect data
– e.g., Survey
• Present data
– e.g., Tables and graphs
• Summarize data
– e.g., Sample mean = (ΣX_i) / n
Inferential Statistics
• Estimation
– e.g., Estimate the population
mean weight using the
sample mean weight
• Hypothesis testing
– e.g., Test the claim that the
population mean weight is
140 pounds
Inference is the process of drawing conclusions or
making decisions about a population based on sample
results
Section 1.2 Classification of Variables
Measurement Levels
Section 1.3-1.5 Graphical
Presentation of Data (1 of 2)
• Data in raw form are usually not easy to use for
decision making
• Some type of organization is needed
– Table
– Graph
• The type of graph to use depends on the variable
being summarized
Section 1.3 Tables and Graphs for
Categorical Variables
The Frequency Distribution Table
Summarize data by category
Example: Hospital Patients by Unit
Graph of Frequency Distribution
• Bar chart of patient data
Graphing Multivariate Categorical
Data (1 of 2)
• Side by side horizontal bar chart
Graphing Multivariate Categorical
Data (2 of 2)
• Stacked bar chart
Vertical Side-by-Side Chart Example
• Sales by quarter for three sales territories:
Bar and Pie Charts
• Bar charts and Pie charts are often used for
qualitative (categorical) data
• Height of bar or size of pie slice shows the
frequency or percentage for each category
Bar Chart Example
Pie Chart Example
Pareto Diagram
• Used to portray categorical data
• A bar chart, where categories are shown in
descending order of frequency
• A cumulative polygon is often shown in the same
graph
• Used to separate the “vital few” from the “trivial
many”
Pareto Diagram Example (1 of 3)
Example: 400 defective items are examined for
cause of defect:
Source of Manufacturing Error    Number of defects
Bad Weld                         34
Poor Alignment                   223
Missing Part                     25
Paint Flaw                       78
Electrical Short                 19
Cracked case                     21
Total                            400
Pareto Diagram Example (2 of 3)
Step 1: Sort the defect causes in descending order of frequency
Step 2: Determine % in each category
Source of Manufacturing Error    Number of defects    % of Total Defects
Poor Alignment                   223                  55.75
Paint Flaw                       78                   19.50
Bad Weld                         34                   8.50
Missing Part                     25                   6.25
Cracked case                     21                   5.25
Electrical Short                 19                   4.75
Total                            400                  100.00
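Steps 1 and 2 can be reproduced with a short Python sketch. The defect counts are taken from the table above; the variable names are assumptions, and the cumulative percentages are added because a Pareto diagram often includes a cumulative polygon.

```python
from itertools import accumulate

defects = {
    "Bad Weld": 34, "Poor Alignment": 223, "Missing Part": 25,
    "Paint Flaw": 78, "Electrical Short": 19, "Cracked case": 21,
}
total = sum(defects.values())

# Step 1: sort the causes in descending order of frequency.
ordered = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)
# Step 2: express each count as a percentage of the total.
percents = [100 * count / total for _, count in ordered]
# Running totals of the percentages give the cumulative polygon.
cumulative = list(accumulate(percents))

for (cause, count), pct, cum in zip(ordered, percents, cumulative):
    print(f"{cause:<18}{count:>5}{pct:>8.2f}{cum:>8.2f}")
```

In this data the first two causes (Poor Alignment and Paint Flaw) account for 75.25% of all defects, which is the "vital few" pattern a Pareto diagram is meant to expose.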
Pareto Diagram Example (3 of 3)
Step 3: Show results graphically
Section 1.4 Graphs to Describe Time-Series Data
• A line chart (time-series plot) is used to show the
values of a variable over time
• Time is measured on the horizontal axis
• The variable of interest is measured on the
vertical axis
Line Chart Example
Frequency Distributions
What is a Frequency Distribution?
• A frequency distribution is a list or a table…
• containing class groupings (categories or ranges
within which the data fall)...
• and the corresponding frequencies with which
data fall within each class or category
Why Use Frequency Distributions?
• A frequency distribution is a way to summarize
data
• The distribution condenses the raw data into a
more useful form...
• and allows for a quick visual interpretation of the
data
Class Intervals and Class Boundaries
• Each class grouping has the same width
• Determine the width of each interval by
w = interval width = (largest number − smallest number) / (number of desired intervals)
• Use at least 5 but no more than 15-20 intervals
• Intervals never overlap
• Round up the interval width to get desirable
interval endpoints
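A minimal sketch of the width rule, in Python. The function name interval_width, the round_to parameter, and the sample values are all hypothetical; rounding up to the next multiple of round_to is only one way to arrive at "desirable" endpoints.

```python
import math

def interval_width(values, k, round_to=1):
    """Range divided by k intervals, rounded up to a multiple of round_to."""
    raw = (max(values) - min(values)) / k
    return math.ceil(raw / round_to) * round_to

values = [3, 18, 7, 42, 25, 11]  # made-up data, range = 42 - 3 = 39
print(interval_width(values, 6))              # 39 / 6 = 6.5, rounds up to 7
print(interval_width(values, 6, round_to=5))  # rounds up to 10
```

Rounding up rather than to the nearest value guarantees that k intervals of the chosen width span the whole range of the data.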
Frequency Distribution Example (1 of 3)
Example: A manufacturer of insulation randomly
selects 20 winter days and records the daily high
temperature
data:
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Frequency Distribution Example (2 of 3)
• Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 − 12 = 46
• Select number of classes: 5 (usually between 5 and 15)
• Compute interval width: 46 / 5 = 9.2, then round up to 10
• Determine interval boundaries: 10 but less than 20, 20 but
less than 30, ..., 50 but less than 60
• Count observations & assign to classes
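The last two steps can be sketched in Python using the 20 temperatures from the example. The variable names are assumptions, and the starting boundary of 10 is a choice made so that the first class covers the minimum value of 12.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

width = 10  # 46 / 5 = 9.2, rounded up
start = 10  # first lower boundary, chosen at or below the minimum (12)
k = 5       # number of classes

# Each class is [lower, upper): "lower but less than upper".
classes = [(start + i * width, start + (i + 1) * width) for i in range(k)]
# Count the observations that fall in each class.
counts = [sum(lo <= t < hi for t in temps) for lo, hi in classes]

for (lo, hi), c in zip(classes, counts):
    print(f"{lo} but less than {hi}: {c}")
```

Because each class includes its lower boundary and excludes its upper one, the intervals never overlap and every observation is counted exactly once.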
Frequency Distribution Example (3 of 3)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Interval               Frequency    Relative Frequency    Percentage
10 but less than 20    3            .15                   15
20 but less than 30    6            .30                   30
30 but less than 40    5            .25                   25
40 but less than 50    4            .20                   20
50 but less than 60    2            .10                   10
Total                  20           1.00                  100
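The relative frequency and percentage columns follow directly from the frequencies: each frequency is divided by the total number of observations. A short Python sketch, with the frequencies copied from the table and variable names chosen for illustration:

```python
frequencies = {
    "10 but less than 20": 3,
    "20 but less than 30": 6,
    "30 but less than 40": 5,
    "40 but less than 50": 4,
    "50 but less than 60": 2,
}
n = sum(frequencies.values())  # total number of observations

# Relative frequency = frequency / n; percentage = 100 * frequency / n.
relative = {label: f / n for label, f in frequencies.items()}
percentage = {label: 100 * f / n for label, f in frequencies.items()}

for label, f in frequencies.items():
    print(f"{label:<22}{f:>4}{relative[label]:>8.2f}{percentage[label]:>6.0f}")
```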
Histogram
• A graph of the data in a frequency distribution is
called a histogram
• The interval endpoints are shown on the
horizontal axis
• The vertical axis is either frequency, relative
frequency, or percentage
• Bars of the appropriate heights are used to
represent the number of observations within each
class
Histogram Example
Questions for Grouping Data into
Intervals
• How wide should each interval be?
(How many classes should be used?)
• How should the endpoints of the intervals be
determined?
– Often answered by trial and error, subject to user
judgment
– The goal is to create a distribution that is neither too
"jagged" nor too "blocky"
– Goal is to appropriately show the pattern of variation in
the data
How Many Class Intervals?
• Many (narrow class intervals)
– May yield a very jagged distribution with gaps from empty classes
– Can give a poor indication of how frequency varies across classes
[Figure: histogram with narrow class intervals; horizontal axis: Temperature, vertical axis: Frequency]
• Few (wide class intervals)
– May compress variation too much and yield a blocky distribution
– Can obscure important patterns of variation
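The trade-off can be seen by regrouping the 20 temperatures from the frequency distribution example with very narrow and very wide classes. This is a rough Python sketch: the helper bin_counts and the widths of 2 and 25 are assumptions chosen to exaggerate the effect, and the bars are drawn as rows of asterisks rather than as a real chart.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

def bin_counts(values, start, width, k):
    """Count values in k classes [start + i*width, start + (i+1)*width)."""
    return [sum(start + i * width <= v < start + (i + 1) * width
                for v in values) for i in range(k)]

narrow = bin_counts(temps, start=10, width=2, k=25)  # 25 classes, 10 to 60
wide = bin_counts(temps, start=10, width=25, k=2)    # 2 classes, 10 to 60

for i, c in enumerate(narrow):
    print(f"{10 + 2 * i:>2} to {12 + 2 * i:<2} {'*' * c}")
print("empty narrow classes:", narrow.count(0))
print("wide class counts:", wide)
```

With width 2, nine of the 25 classes are empty and the picture is jagged; with width 25, the two classes hold 11 and 9 observations and almost all of the variation is hidden.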
The Cumulative Frequency
Distribution
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class                  Frequency    Percentage    Cumulative Frequency    Cumulative Percentage
10 but less than 20    3            15            3                       15
20 but less than 30    6            30            9                       45
30 but less than 40    5            25            14                      70
40 but less than 50    4            20            18                      90
50 but less than 60    2            10            20                      100
Total                  20           100
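The cumulative columns are running totals of the frequency column. A minimal Python sketch, with the frequencies copied from the table and illustrative variable names:

```python
from itertools import accumulate

frequencies = [3, 6, 5, 4, 2]  # classes 10-20, 20-30, ..., 50-60
n = sum(frequencies)

# Running total of the frequencies, class by class.
cum_freq = list(accumulate(frequencies))
# Each cumulative frequency as a percentage of all observations.
cum_pct = [100 * c / n for c in cum_freq]

print(cum_freq)  # [3, 9, 14, 18, 20]
print(cum_pct)   # [15.0, 45.0, 70.0, 90.0, 100.0]
```

The last cumulative frequency always equals the number of observations, and the last cumulative percentage is always 100.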
The Ogive: Graphing Cumulative Frequencies
Scatter Diagrams
• Scatter Diagrams are used for paired
observations taken from two numerical
variables
• The Scatter Diagram:
– one variable is measured on the vertical axis
and the other variable is measured on the
horizontal axis
Scatter Diagram Example
Chapter Summary (1 of 2)
• Reviewed incomplete information in decision
making
• Introduced key definitions:
– Population vs. Sample
– Parameter vs. Statistic
– Descriptive vs. Inferential statistics
• Described random sampling
• Examined the decision making process
Chapter Summary (2 of 2)
• Reviewed types of data and measurement levels
• Data in raw form are usually not easy to use for decision
making; some type of organization is needed:
– Table
– Graph
• Techniques reviewed in this chapter:
– Frequency distribution
– Bar chart
– Pie chart
– Pareto diagram
– Line chart
– Histogram and ogive
– Scatter plot