Describing Data:
Frequency Tables, Frequency
Distributions, and Graphic Presentation
Chapter 2
McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved.
GOALS
1. Organize qualitative data into a frequency table.
2. Present a frequency table as a bar chart or a pie chart.
3. Organize quantitative data into a frequency distribution.
4. Present a frequency distribution for quantitative data using histograms, frequency polygons, and cumulative frequency polygons.
Learning Objectives
1. Describe data using graphs
© 2011 Pearson Education, Inc
2.1
Describing Qualitative Data
Data Presentation
• Qualitative Data: Summary Table, shown as a Bar Graph, Pie Chart, or Pareto Diagram
• Quantitative Data: Frequency Distribution, shown as a Histogram
Summary Table
1. Lists categories & number of elements in category
2. Obtained by tallying responses in category
3. May show frequencies (counts), % or both
Major        Count
Accounting     130
Economics       20
Management      50
Total          200
Each row is a category. Counts come from tally marks (e.g., |||| ||||).
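A minimal sketch of how such a summary table could be tallied. The raw response list is hypothetical and is built only to reproduce the counts in the table; the function name `summary_table` is illustrative.

```python
from collections import Counter

def summary_table(responses):
    """Tally responses and return (category, count, percent) rows plus a total row."""
    counts = Counter(responses)
    total = sum(counts.values())
    rows = [(category, n, 100 * n / total) for category, n in sorted(counts.items())]
    rows.append(("Total", total, 100.0))
    return rows

# Hypothetical raw responses, built only to reproduce the counts in the table.
responses = ["Accounting"] * 130 + ["Economics"] * 20 + ["Management"] * 50

table = summary_table(responses)
for category, count, percent in table:
    print(f"{category:<12}{count:>5}{percent:>7.1f}%")
```

The table shows both frequencies and percentages, matching item 3 of the list.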
Bar Graph
[Figure: bar graph of frequency by major (Acct., Econ., Mgmt.), vertical axis from 0 to 150. Callouts: vertical bars for qualitative variables; equal bar widths; bar height shows frequency or %; percent used also; zero point.]
Bar Charts
BAR CHART A graph in which the classes are reported on the
horizontal axis and the class frequencies on the vertical axis. The
class frequencies are proportional to the heights of the bars.
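A rough text-only sketch of the proportionality idea, reusing the major counts from the summary table earlier. The helper `bar_lengths` and the 40-character scale are assumptions for illustration, and the bars are printed horizontally only because plain text makes vertical bars awkward.

```python
def bar_lengths(frequencies, max_length=40):
    """Scale each class frequency to a bar length proportional to it."""
    largest = max(frequencies.values())
    return {cls: round(max_length * n / largest) for cls, n in frequencies.items()}

frequencies = {"Accounting": 130, "Economics": 20, "Management": 50}
lengths = bar_lengths(frequencies)

for cls, n in frequencies.items():
    # Every bar uses the same character width per unit, so lengths stay comparable.
    print(f"{cls:<12}|{'#' * lengths[cls]} {n}")
```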
Pie Charts
PIE CHART A chart that shows the proportion or percent
that each class represents of the total number of
observations.
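A small sketch of how slice sizes could be derived, assuming each slice's central angle is its share of the total times 360 degrees. It reuses the major counts from the summary table earlier; `slice_angles` is an illustrative name.

```python
def slice_angles(frequencies):
    """Return the central angle in degrees for each class's slice."""
    total = sum(frequencies.values())
    return {cls: 360 * n / total for cls, n in frequencies.items()}

frequencies = {"Accounting": 130, "Economics": 20, "Management": 50}
angles = slice_angles(frequencies)

for cls, angle in angles.items():
    print(f"{cls:<12}{angle:>6.1f} degrees")
```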
Visualizing Categorical Data:
The Bar Chart
In a bar chart, each category is shown as a bar whose length
represents the amount, frequency, or percentage of values falling into
that category. These values come from the summary table of the variable.
Banking Preference?                  %
ATM                                 16%
Automated or live telephone          2%
Drive-through service at branch     17%
In person at branch                 41%
Internet                            24%
[Figure: horizontal bar chart "Banking Preference" showing these percentages on an axis from 0% to 45%.]
Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall
Visualizing Categorical Data:
The Pie Chart
The pie chart is a circle broken up into slices that represent categories.
The size of each slice of the pie varies according to the percentage in
each category.
[Figure: pie chart "Banking Preference" with slices In person at branch 41%, Internet 24%, Drive-through service at branch 17%, ATM 16%, Automated or live telephone 2%.]
Visualizing Categorical Data:
The Pareto Chart
• Used to portray categorical data (nominal scale)
• A vertical bar chart, where categories are shown in
descending order of frequency
• A cumulative polygon is shown in the same graph
• Used to separate the “vital few” from the “trivial
many”
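The ordering and the cumulative polygon described above can be sketched as follows, using the banking preference percentages from the earlier summary table. The function name `pareto_table` is illustrative.

```python
def pareto_table(percentages):
    """Sort categories by descending share and attach cumulative percentages."""
    ordered = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, pct in ordered:
        running += pct
        rows.append((category, pct, running))
    return rows

banking = {
    "ATM": 16,
    "Automated or live telephone": 2,
    "Drive-through service at branch": 17,
    "In person at branch": 41,
    "Internet": 24,
}

rows = pareto_table(banking)
for category, pct, cumulative in rows:
    print(f"{category:<33}{pct:>4}%{cumulative:>6}%")
```

The first two categories already account for 65% of responses, which is the "vital few" reading of the chart.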
Visualizing Categorical Data:
The Pareto Chart (cont’d)
Pareto Chart For Banking Preference
[Figure: bar graph of % in each category (left axis, 0% to 100%) with bars in descending order: In person at branch, Internet, Drive-through service at branch, ATM, Automated or live telephone. A line graph of cumulative % (right axis, 0% to 100%) is overlaid.]
Pareto Diagram
Like a bar graph, but with the categories arranged by
height in descending order from left to right.
[Figure: Pareto diagram of frequency by major with bars ordered Acct., Mgmt., Econ., vertical axis from 0 to 150. Callouts: vertical bars for qualitative variables; equal bar widths; bar height shows frequency or %; percent used also; zero point.]
Summary
Bar graph: The categories (classes) of the qualitative variable are
represented by bars, where the height of each bar is either the class
frequency, class relative frequency, or class percentage.
Pie chart: The categories (classes) of the qualitative variable are
represented by slices of a pie (circle). The size of each slice is proportional
to the class relative frequency.
Pareto diagram: A bar graph with the categories (classes) of the
qualitative variable (i.e., the bars) arranged by height in descending order
from left to right.
Thinking Challenge
You’re an analyst for IRI. You want to show the market shares held by
Web browsers in 2006. Construct a bar graph, pie chart, & Pareto
diagram to describe the data.
Browser Mkt. Share (%)
Firefox 14
Internet Explorer 81
Safari 4
Others 1
Bar Graph Solution*
[Figure: bar graph of market share (%) by browser, bars in the order Firefox, Internet Explorer, Safari, Others; vertical axis from 0% to 100%.]
Pie Chart Solution*
[Figure: pie chart of market share with slices Internet Explorer 81%, Firefox 14%, Safari 4%, Others 1%.]
Pareto Diagram Solution*
[Figure: Pareto diagram of market share (%) by browser, bars in descending order Internet Explorer, Firefox, Safari, Others; vertical axis from 0% to 100%.]
A Contingency Table Helps Organize Two or
More Categorical Variables
• Used to study patterns that may exist between the responses
of two or more categorical variables
• Cross tabulates or tallies jointly the responses of the
categorical variables
• For two variables the tallies for one variable are located in the
rows and the tallies for the second variable are located in the
columns
Contingency Table - Example
• A random sample of 400 invoices is drawn.
• Each invoice is categorized as a small, medium, or large amount.
• Each invoice is also examined to identify if there are any errors.
• These data are then organized in the contingency table below.

Contingency Table Showing Frequency of Invoices Categorized By Size and The Presence Of Errors

                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400
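A minimal sketch of the joint tallying that produces such a table. The per-invoice records are hypothetical and are built only to reproduce the published counts; `cross_tabulate` is an illustrative name.

```python
from collections import Counter

def cross_tabulate(records, row_levels, col_levels):
    """Jointly tally (row, column) pairs into a table with row and column totals."""
    cells = Counter(records)
    table = {r: {c: cells[(r, c)] for c in col_levels} for r in row_levels}
    for r in row_levels:
        table[r]["Total"] = sum(table[r][c] for c in col_levels)
    table["Total"] = {
        c: sum(table[r][c] for r in row_levels) for c in col_levels + ["Total"]
    }
    return table

# Hypothetical invoice records (size, error status) matching the example counts.
records = (
    [("Small", "No Errors")] * 170 + [("Small", "Errors")] * 20
    + [("Medium", "No Errors")] * 100 + [("Medium", "Errors")] * 40
    + [("Large", "No Errors")] * 65 + [("Large", "Errors")] * 5
)

table = cross_tabulate(records, ["Small", "Medium", "Large"], ["No Errors", "Errors"])
for label, row in table.items():
    print(f"{label:<8}{row['No Errors']:>6}{row['Errors']:>6}{row['Total']:>6}")
```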
Contingency Table Based On Percentage of Overall
Total
Counts:
                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400

Percentages of the overall total:
                 No Errors   Errors   Total
Small Amount       42.50%     5.00%   47.50%
Medium Amount      25.00%    10.00%   35.00%
Large Amount       16.25%     1.25%   17.50%
Total              83.75%    16.25%  100.0%

• 42.50% = 170 / 400
• 25.00% = 100 / 400
• 16.25% = 65 / 400
• 83.75% of sampled invoices have no errors and 47.50% of sampled invoices are for small amounts.
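The overall-total percentages can be checked with a short sketch that divides every cell by the grand total of 400. Variable names are illustrative.

```python
counts = {
    "Small Amount": {"No Errors": 170, "Errors": 20},
    "Medium Amount": {"No Errors": 100, "Errors": 40},
    "Large Amount": {"No Errors": 65, "Errors": 5},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Each cell as a percentage of the overall total.
overall = {
    size: {col: 100 * n / grand_total for col, n in row.items()}
    for size, row in counts.items()
}

no_errors_share = sum(row["No Errors"] for row in overall.values())
small_share = sum(overall["Small Amount"].values())

for size, row in overall.items():
    print(f"{size:<14}{row['No Errors']:>7.2f}%{row['Errors']:>7.2f}%")
print(f"No errors: {no_errors_share:.2f}%, small amounts: {small_share:.2f}%")
```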
Contingency Table Based On Percentage of Row
Totals
Contingency Table Based On Percentage of Column
Total
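A minimal sketch of both views, computed from the invoice counts in the example above and assuming each cell is divided by its row total (row view) or by its column total (column view). Names are illustrative.

```python
counts = {
    "Small Amount": {"No Errors": 170, "Errors": 20},
    "Medium Amount": {"No Errors": 100, "Errors": 40},
    "Large Amount": {"No Errors": 65, "Errors": 5},
}
columns = ["No Errors", "Errors"]

# Row view: each cell as a percentage of its row total.
row_pct = {
    size: {col: 100 * row[col] / sum(row.values()) for col in columns}
    for size, row in counts.items()
}

# Column view: each cell as a percentage of its column total.
col_totals = {col: sum(row[col] for row in counts.values()) for col in columns}
col_pct = {
    size: {col: 100 * row[col] / col_totals[col] for col in columns}
    for size, row in counts.items()
}

for size in counts:
    print(size, {c: round(row_pct[size][c], 2) for c in columns},
          {c: round(col_pct[size][c], 2) for c in columns})
```

In the row view each row sums to 100%, so it answers "what share of small invoices have errors?". In the column view each column sums to 100%, so it answers "what share of invoices with errors are small?".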
2.2
Graphical Methods for Describing
Quantitative Data
Frequency Distribution Table Steps
1. Determine range
2. Select number of classes
• Usually between 5 & 15 inclusive
3. Compute class intervals (width)
4. Determine class boundaries (limits)
5. Compute class midpoints
6. Count observations & assign to classes
Frequency Distribution Table Example
Raw Data: 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Class (Boundaries)   Midpoint   Frequency
15.5 – 25.5            20.5         3
25.5 – 35.5            30.5         5
35.5 – 45.5            40.5         2
Width = Upper Boundary - Lower Boundary (here 10)
Midpoint = (Lower + Upper Boundaries) / 2
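The example can be reproduced with a short sketch. It takes the class boundaries as given rather than deriving them, and assumes an observation belongs to a class when it is at least the lower boundary and below the upper boundary. The function name is illustrative.

```python
def frequency_distribution(data, boundaries):
    """Return (lower, upper, midpoint, frequency) for each class."""
    rows = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        midpoint = (lower + upper) / 2
        frequency = sum(1 for x in data if lower <= x < upper)
        rows.append((lower, upper, midpoint, frequency))
    return rows

raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
boundaries = [15.5, 25.5, 35.5, 45.5]

print("Range:", max(raw) - min(raw))
dist = frequency_distribution(raw, boundaries)
for lower, upper, midpoint, frequency in dist:
    print(f"{lower} - {upper}   {midpoint}   {frequency}")
```

Boundaries ending in .5 mean no whole-number observation can fall exactly on a boundary, so each value lands in exactly one class.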
Histogram
HISTOGRAM A graph in which the classes are marked on the
horizontal axis and the class frequencies on the vertical axis. The
class frequencies are represented by the heights of the bars and the
bars are drawn adjacent to each other.
Frequency Polygon
A frequency polygon also shows the shape of a distribution and is
similar to a histogram. It consists of line segments connecting the
points formed by the intersections of the class midpoints and the
class frequencies.
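A sketch of the points such a polygon would connect, using the classes from the frequency distribution example earlier. Closing the polygon with a zero-frequency point one class width beyond each end is a common convention and is an assumption here, not something the slide states.

```python
def polygon_points(boundaries, frequencies, close_ends=True):
    """Return (midpoint, frequency) points for a frequency polygon."""
    width = boundaries[1] - boundaries[0]
    midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
    points = list(zip(midpoints, frequencies))
    if close_ends:
        # Assumed convention: anchor the polygon to the axis at both ends.
        points = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]
    return points

boundaries = [15.5, 25.5, 35.5, 45.5]
frequencies = [3, 5, 2]

points = polygon_points(boundaries, frequencies)
print(points)
```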
Histogram Versus Frequency Polygon
Both provide a quick picture of the main characteristics of the
data (highs, lows, points of concentration, etc.)
The histogram has the advantage of depicting each class as a
rectangle, with the height of the rectangular bar representing
the number in each class.
The frequency polygon has an advantage over the histogram. It
allows us to compare directly two or more frequency
distributions.
Cumulative Frequency Distribution
Visualizing Numerical Data:
The Polygon
A percentage polygon is formed by having the midpoint of
each class represent the data in that class and then connecting
the sequence of midpoints at their respective class percentages.
The cumulative percentage polygon, or ogive, displays the
variable of interest along the X axis, and the cumulative
percentages along the Y axis.
Useful when there are two or more groups to compare.
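A sketch of the points on a cumulative percentage polygon, using the classes from the frequency distribution example earlier. Plotting each cumulative percentage at the upper boundary of its class, starting from 0% at the first lower boundary, is an assumed convention.

```python
from itertools import accumulate

def ogive_points(boundaries, frequencies):
    """Return (boundary, cumulative percent) points for an ogive."""
    total = sum(frequencies)
    cumulative = accumulate(frequencies)
    points = [(boundaries[0], 0.0)]
    points += [(upper, 100 * c / total) for upper, c in zip(boundaries[1:], cumulative)]
    return points

boundaries = [15.5, 25.5, 35.5, 45.5]
frequencies = [3, 5, 2]

points = ogive_points(boundaries, frequencies)
for x, pct in points:
    print(f"{x:>5}  {pct:>6.1f}%")
```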
Time Series Plot
• Used to graphically display data produced over time
• Shows trends and changes in the data over time
• Time recorded on the horizontal axis
• Measurements recorded on the vertical axis
• Points connected by straight lines
Time Series Plot Example
• The following data shows the average retail price of regular gasoline in New York City for 8 weeks in 2006.
• Draw a time series plot for this data.

Date            Average Price
Oct 16, 2006    $2.219
Oct 23, 2006    $2.173
Oct 30, 2006    $2.177
Nov 6, 2006     $2.158
Nov 13, 2006    $2.185
Nov 20, 2006    $2.208
Nov 27, 2006    $2.236
Dec 4, 2006     $2.298
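Drawing the plot itself needs a graphics tool. As a rough sketch, the code below only prepares the points in time order and reports the week-to-week changes that the connecting line segments would show. Variable names are illustrative.

```python
from datetime import date

prices = [
    (date(2006, 10, 16), 2.219),
    (date(2006, 10, 23), 2.173),
    (date(2006, 10, 30), 2.177),
    (date(2006, 11, 6), 2.158),
    (date(2006, 11, 13), 2.185),
    (date(2006, 11, 20), 2.208),
    (date(2006, 11, 27), 2.236),
    (date(2006, 12, 4), 2.298),
]

# Time goes on the horizontal axis, so the points are kept in date order.
series = sorted(prices)

# Change between consecutive weeks (the rise or fall of each line segment).
changes = [
    (later[0], round(later[1] - earlier[1], 3))
    for earlier, later in zip(series, series[1:])
]

lowest = min(series, key=lambda point: point[1])
highest = max(series, key=lambda point: point[1])

for day, change in changes:
    print(f"{day:%b %d}  {change:+.3f}")
print("Lowest:", lowest, "Highest:", highest)
```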
Time Series Plot Example
[Figure: time series plot of Price (vertical axis, 2.05 to 2.35) against Date (horizontal axis, 10/16 to 12/4), with points connected by straight lines.]
SCATTER PLOTS AND LINES OF BEST FIT
[Figures: three scatter plots labeled positive correlation, negative correlation, and little or no correlation.]
•Example 1:
• The scatter plots of data relate characteristics of children from 0 to 18 years old.
• Match each scatter plot with the appropriate variables studied.
• 1. age and height
• 2. age and eye color
• 3. age and time needed to run a certain distance
•Answers (plots in the order shown):
• 2: no correlation between age and eye color
• 1: as your age increases your height also increases
• 3: as your age increases the time will decrease
•An effective way to see a relationship in data is to display the information as a scatter plot.
•It shows how two variables relate to each other by showing how closely the data points fit to a line.
•The following table presents information on tornado occurrences.
•Make a scatter plot for the table.
Year             1950  1955  1960  1965  1970  1975  1980  1985  1990  1995
# of Tornadoes    201   593   616   897   654   919   866   684  1133  1234
•Scatter plots provide a convenient way to determine whether a correlation exists between two variables.
•A positive correlation occurs when both variables increase.
•A negative correlation occurs when one variable increases and the other variable decreases.
•If the data points are randomly scattered there is little or no correlation.
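The slides judge correlation by eye. One way to put a number on it is the Pearson correlation coefficient, which is not named in the slides and is introduced here only as an illustration. The sketch applies it to the tornado table; the 0.3 cutoff for "little or no" correlation is an arbitrary assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r, threshold=0.3):
    """Label the direction of a correlation; the threshold is an assumed cutoff."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "little or no"

years = [1950, 1955, 1960, 1965, 1970, 1975, 1980, 1985, 1990, 1995]
tornadoes = [201, 593, 616, 897, 654, 919, 866, 684, 1133, 1234]

r = pearson_r(years, tornadoes)
print(f"r = {r:.2f}: {describe(r)} correlation")
```

A value near +1 or -1 means the points lie close to a line; a value near 0 matches the randomly scattered case.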