Stat 201 – Statistical Methods Lecture Notes No. 1
Second Semester, SY 2024-2025 CIRIACO T. RAGUAL
Professor In-charge; January 24, 2025
Lecture Notes No. 1
The Nature of Statistics
1. Statistics is the science of conducting studies to collect, organize, summarize, analyse, and draw
conclusions from data.
2. A variable is a characteristic or attribute that can assume different values.
3. Descriptive statistics consists of the collection, organization, summarization, and presentation
of data.
4. Inferential statistics consists of generalizing from samples to populations, performing
estimations and hypothesis tests, determining relationships among variables, and making
predictions.
5. A population consists of all subjects (human or otherwise) that are being studied.
6. A sample is a group of subjects selected from a population.
7. Qualitative variables are variables that can be placed into distinct categories, according to some
characteristic or attribute.
8. Quantitative variables are numerical and can be ordered or ranked.
9. Discrete variables assume values that can be counted.
10. Continuous variables can assume an infinite number of values between any two specific values.
They are obtained by measuring. They often include fractions or decimals.
11. The nominal level of measurement classifies data into mutually exclusive (non-overlapping)
categories in which no order or ranking can be imposed on the data.
12. The ordinal level of measurement classifies data into categories that can be ranked; however,
precise differences between the ranks do not exist.
13. The interval level of measurement ranks data, and precise differences between units of
measure do exist; however, there is no meaningful zero.
14. The ratio level of measurement possesses all the characteristics of interval measurement, and
there exists a true zero. In addition, true ratios exist when the same variable is measured on two
different members of the population.
Applying the Concepts
1. Attendance and Grades
Read the following on attendance and grades, and answer the questions.
A study conducted at San Agustin Community College revealed that students who attended class
95 to 100% of the time usually received an A in the class. Students who attended class 80 to 90% of the
time usually received a B or C in the class. Students who attended class less than 80% of the time usually
received a D or F or eventually withdrew from class.
Based on the information, attendance and grades are related. The more you attend class, the
more likely it is that you will receive a higher grade. If you improve your attendance, your grades will
probably improve. Many factors affect your grade in a course. One factor that you have considerable
control over is attendance. You can increase your opportunities for learning by attending class more
often.
1. What are the variables in the study?
2. What are the data in the study?
3. Are descriptive, inferential, or both types of statistics used?
4. What is the population under study?
5. Was a sample collected? If so, from where?
6. From the information given, comment on the relationship between the variables.
2. Safe Travel
Read the following information about the transportation industry and answer the questions.
Transportation Safety
The chart shows the number of job-related injuries for each of the transportation industries for
a certain year.
Industry         Number of Injuries
Railroad         4520
Intercity bus    5100
Subway           6850
Trucking         7144
Airline          9950
1. What are the variables under study?
2. Categorize each variable as qualitative or quantitative.
3. Categorize each quantitative variable as discrete or continuous.
4. Identify the level of measurement for each variable.
5. The railroad is shown as the safest transportation industry. Does that mean railroads have fewer
accidents than the other industries? Explain.
6. What factors other than safety influence a person’s choice of transportation?
7. From the information given, comment on the relationship between the variables.
Summary of Sampling Methods
15. Random – subjects are selected by random numbers.
16. Systematic – subjects are selected by using every kth subject after the first subject is randomly
selected from 1 to k.
17. Stratified – subjects are selected by dividing the population into groups (strata) and subjects are
randomly selected within groups.
18. Cluster – subjects are selected using an intact group that is representative of the population.
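The four sampling methods summarized above can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed procedure: the population of 100 subject IDs, the two strata, the ten clusters, the seed, and all function names are hypothetical and chosen only for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Random: select n subjects by random numbers."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic: random start among the first k subjects, then every kth subject."""
    start = rng.randrange(k)  # index 0..k-1 corresponds to subjects 1..k
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified: randomly select subjects within each group (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Cluster: randomly choose intact groups and keep every subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

rng = random.Random(201)          # fixed seed so the run is repeatable (assumption)
population = list(range(1, 101))  # hypothetical subject IDs 1..100
strata = {"first_half": population[:50], "second_half": population[50:]}
clusters = {f"section_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

random_ids = simple_random_sample(population, 10, rng)
systematic_ids = systematic_sample(population, 10, rng)
stratified_ids = stratified_sample(strata, 5, rng)
cluster_ids = cluster_sample(clusters, 2, rng)

print("Random:    ", sorted(random_ids))
print("Systematic:", systematic_ids)
print("Stratified:", stratified_ids)
print("Cluster:   ", cluster_ids)
```

Note that the systematic sample always has a constant gap of k between selected subjects, while the cluster sample keeps whole groups together.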
Applying the Concepts
3. American Culture and Drug Abuse
Assume you are a member of the Family Research Council and have become increasingly
concerned about the drug use by professional sports players. You set up a plan and conduct a survey on
how people believe the American Culture (television, movies, magazines, and popular music) influences
illegal drug use. Your survey consists of 2250 adults and adolescents from around the country. A
consumer group petitions you for more information about your survey. Answer the following questions.
1. What type of survey did you use (phone, mail, or interview)?
2. What are the advantages and disadvantages of the surveying methods you did not use?
3. What type of scores did you use? Why?
4. Did you use a random method for deciding who would be in your sample?
5. Which of the methods (stratified, systematic, cluster, or convenience) did you use?
6. Why was that method more appropriate for this type of data collection?
7. If a convenience sample were obtained consisting of only adolescents, how would the
results of the study be affected?
Uses and Misuses of Statistics
19. The applications of statistics are many and varied. People encounter them in everyday life, such
as in reading newspapers or magazines, listening to the radio, or watching television. Since statistics
is used in almost every field of endeavour, the educated individual should be knowledgeable
about the vocabulary, concepts, and procedures of statistics. Also, everyone should be aware
that statistics can be misused.
20. Today, computers and calculators are used extensively in statistics to facilitate the
computations.
Methods of Organization and Presentation of Data
1. A frequency distribution is the organization of raw data in table form, using classes and
frequencies.
2. The categorical frequency distribution is used for data that can be placed in specific categories,
such as nominal- or ordinal-level data.
3. When the range of data is large, the data must be grouped into classes that are more than one
unit in width, in what is called a grouped frequency distribution.
Procedure for constructing a frequency distribution for categorical data
1. Make a table as shown. The number of rows depends on the number of categories in the data.
(A) (B) (C) (D)
Classes Tally Frequency Percent
2. Tally the data and place the results in column B.
3. Count the tallies and place the results in column C.
4. Find the percentage of values in each class by using the formula $\% = \frac{f}{n} \cdot 100\%$,
where f is the frequency of the class and n is the total number of values.
5. Find the totals for columns C (frequency) and D (percent).
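The procedure above can be sketched in a few lines of Python. The data set (25 blood types) and the function name are hypothetical and used only for illustration; the tally column is represented by the counting step itself.

```python
from collections import Counter

def categorical_frequency_table(data):
    """Return (class, frequency, percent) rows plus the total n."""
    counts = Counter(data)  # tallying and counting (columns B and C)
    n = len(data)
    rows = [(cls, f, 100 * f / n) for cls, f in sorted(counts.items())]
    return rows, n

# Hypothetical raw data: blood types of 25 subjects
blood_types = ["A", "B", "B", "AB", "O", "O", "O", "B", "AB", "B",
               "B", "B", "O", "A", "O", "A", "O", "O", "O", "AB",
               "AB", "A", "O", "B", "A"]

rows, n = categorical_frequency_table(blood_types)

print(f"{'Class':<8}{'Frequency':>10}{'Percent':>10}")
for cls, f, pct in rows:
    print(f"{cls:<8}{f:>10}{pct:>10.1f}")
# Totals for columns C (frequency) and D (percent)
print(f"{'Total':<8}{sum(f for _, f, _ in rows):>10}"
      f"{sum(p for _, _, p in rows):>10.1f}")
```

The totals row is a quick check: the frequencies should add up to n and the percentages to 100%.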
Rules to construct a frequency distribution:
1. There should be between 5 and 20 classes.
2. It is preferable but not absolutely necessary that the class width be an odd number.
3. The classes must be mutually exclusive. Mutually exclusive classes have non-overlapping class
limits so that data cannot be placed into two classes.
4. The classes must be continuous.
5. The classes must be exhaustive.
6. The classes must be equal in width.
Procedure for constructing a grouped frequency distribution for numerical data:
1. Determine the classes.
a. Find the highest value, H, and the lowest value, L.
b. Find the range, R: $R = H - L$.
c. Determine the desired number of classes, k: $k = \sqrt{n}$ (Square Root method)
or $k = 1 + 3.322 \log_{10} n$ (Sturges' Rule).
d. Find the class size, c: $c = \frac{R}{k}$.
e. Select a starting point (usually the lowest value or any convenient number less than the
lowest value); add the width to get the lower limits.
f. Find the upper class limits.
g. Find the class boundaries.
2. Tally the data.
3. Find the numerical frequencies from the tallies.
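The steps above can be sketched in Python as follows. Several choices here are assumptions that the notes do not fix: the data are whole numbers (unit of measure 1), both k and the class size are rounded up to whole numbers, the starting point is the lowest value, and an extra class is added if the rounded classes do not reach the highest value. The data set and function names are hypothetical.

```python
import math

def number_of_classes(n, method="sturges"):
    """Desired number of classes k, rounded up (rounding is an assumption)."""
    if method == "sqrt":
        return math.ceil(math.sqrt(n))           # Square Root method
    return math.ceil(1 + 3.322 * math.log10(n))  # Sturges' Rule

def grouped_frequency_distribution(data, method="sturges"):
    """Return a list of (lower limit, upper limit, frequency) for integer data."""
    highest, lowest = max(data), min(data)
    data_range = highest - lowest                  # R = H - L
    k = number_of_classes(len(data), method)
    width = max(1, math.ceil(data_range / k))      # class size c = R / k, rounded up
    table = []
    lower = lowest                                 # starting point: lowest value
    while lower <= highest:                        # keep adding classes until H is covered
        upper = lower + width - 1                  # upper limit for unit of measure 1
        frequency = sum(1 for x in data if lower <= x <= upper)
        table.append((lower, upper, frequency))
        lower += width                             # add the width to get the next lower limit
    return table

# Hypothetical raw data: 30 whole-number scores
scores = [11, 14, 18, 19, 21, 22, 22, 25, 26, 27,
          28, 29, 30, 30, 31, 32, 33, 33, 34, 35,
          36, 37, 38, 40, 41, 42, 44, 45, 47, 50]

fdt = grouped_frequency_distribution(scores)
for lower, upper, frequency in fdt:
    print(f"{lower:>3} - {upper:<3} {frequency}")
```

Because the limits are built from a single width, the resulting classes are mutually exclusive, continuous, exhaustive, and equal in width, as the rules above require.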
Complete Frequency Distribution table for numerical data:
Class limits | Class boundaries | Midpoints | Frequency | Relative Frequency | Ogives: Less than | Ogives: Greater than | CF Percentage: Less than | CF Percentage: Greater than
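Notes (with an example for one class):
Class limits: lower limit (LL) – upper limit (UL). Example: 11 – 18
Class boundaries: lower true class boundary (LTCB) – upper true class boundary (UTCB). Example: 10.50 – 18.50
$LTCB = LL - \frac{1}{2}(\text{unit of measure})$
Midpoints: $\frac{LL + UL}{2}$ or $\frac{LTCB + UTCB}{2}$. Example: 14.50
Relative Frequency: $RF = \frac{f}{n}$; as a percentage, $RFP = \frac{f}{n} \cdot 100\%$
Less than Ogive: the number of observations less than or equal to the UTCB.
Greater than Ogive: the number of observations greater than or equal to the LTCB.
CFP: $\text{less than CFP} = \frac{\text{less than ogive}}{n} \cdot 100\%$; $\text{greater than CFP} = \frac{\text{greater than ogive}}{n} \cdot 100\%$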
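As a rough sketch, the columns of the complete table can be computed from the class limits and frequencies as shown below. The first class, 11 – 18, is the example from the notes; the remaining classes, all frequencies, and the function name are hypothetical. The upper boundary is taken as the upper limit plus half a unit of measure, which is an assumption inferred from the example boundary of 18.50.

```python
def complete_frequency_table(classes, unit=1):
    """classes: list of (lower limit, upper limit, frequency), in ascending order."""
    n = sum(f for _, _, f in classes)
    rows = []
    less_than = 0        # running count of observations <= UTCB
    greater_than = n     # running count of observations >= LTCB
    for ll, ul, f in classes:
        less_than += f
        rows.append({
            "limits": (ll, ul),
            "boundaries": (ll - unit / 2, ul + unit / 2),  # LTCB, UTCB
            "midpoint": (ll + ul) / 2,
            "frequency": f,
            "relative_frequency": f / n,
            "less_than_ogive": less_than,
            "greater_than_ogive": greater_than,
            "less_than_cfp": 100 * less_than / n,
            "greater_than_cfp": 100 * greater_than / n,
        })
        greater_than -= f  # the next class no longer counts this class
    return rows

# First class is the example from the notes; the rest is hypothetical
classes = [(11, 18, 3), (19, 26, 7), (27, 34, 10), (35, 42, 5)]
table = complete_frequency_table(classes)

for row in table:
    print(row["limits"], row["boundaries"], row["midpoint"], row["frequency"],
          round(row["relative_frequency"], 2),
          row["less_than_ogive"], row["greater_than_ogive"],
          row["less_than_cfp"], row["greater_than_cfp"])
```

In any such table the less than ogive ends at n, and the greater than ogive starts at n.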
Reasons for constructing a frequency distribution table:
1. To organize the data in a meaningful, intelligible way.
2. To enable the reader to determine the nature or shape of the distribution.
3. To facilitate computational procedures for measures of average and spread.
4. To enable the researcher to draw charts and graphs for the presentation of data.
5. To enable the reader to make comparisons among different data sets.
Graphs Associated with FDTs:
1. The histogram is a graph that displays the data by using contiguous vertical bars (unless the
frequency of a class is 0) of various heights to represent the frequencies of the classes.
2. The frequency polygon is a graph that displays the data by using lines that connect points
plotted for the frequencies at the midpoints of the classes. The frequencies are represented by
the heights of the points.
3. The ogive is a graph that represents the cumulative frequencies for the classes in a frequency
distribution.
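To make the three graphs concrete, the sketch below computes the coordinates each one would plot and prints a rough text histogram. It assumes a unit of measure of 1 and shows only the less-than ogive; the class table and function names are hypothetical.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(11, 18, 3), (19, 26, 7), (27, 34, 10), (35, 42, 5)]

def histogram_bars(classes, unit=1):
    """One bar per class: (lower boundary, upper boundary, height = frequency)."""
    return [(ll - unit / 2, ul + unit / 2, f) for ll, ul, f in classes]

def frequency_polygon_points(classes):
    """Points plotted at (class midpoint, frequency), to be joined by lines."""
    return [((ll + ul) / 2, f) for ll, ul, f in classes]

def ogive_points(classes, unit=1):
    """Less-than ogive: cumulative frequency plotted at each upper boundary."""
    points = [(classes[0][0] - unit / 2, 0)]  # starts at zero at the first lower boundary
    total = 0
    for _, ul, f in classes:
        total += f
        points.append((ul + unit / 2, total))
    return points

bars = histogram_bars(classes)
polygon = frequency_polygon_points(classes)
ogive = ogive_points(classes)

# Rough text histogram: one asterisk per observation
for lower, upper, height in bars:
    print(f"{lower:>5} - {upper:<5} {'*' * height}")
print("Frequency polygon points:", polygon)
print("Ogive points:", ogive)
```

The bars are contiguous because each bar's upper boundary equals the next bar's lower boundary.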
Other Types of Graphs
1. A bar graph represents the data by using vertical or horizontal bars whose heights or lengths
represent the frequencies of the data.
2. A Pareto chart is used to represent a frequency distribution for a categorical variable, and the
frequencies are displayed by the heights of vertical bars, which are arranged in order from
highest to lowest.
3. A time series graph represents data that occur over a specific period of time.
4. A pie graph is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution.
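As an illustration, the ordering used by a Pareto chart and the wedge sizes used by a pie graph can be computed as below. The counts are the transportation injury figures from the Safe Travel exercise in these notes; the function names are hypothetical, and each wedge angle is taken as that category's share of 360 degrees.

```python
# Job-related injuries by industry, from the Safe Travel exercise
injuries = {
    "Railroad": 4520,
    "Intercity bus": 5100,
    "Subway": 6850,
    "Trucking": 7144,
    "Airline": 9950,
}

def pareto_order(frequencies):
    """Categories arranged from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

def pie_sections(frequencies):
    """Percentage and wedge angle (degrees) for each category."""
    total = sum(frequencies.values())
    return {name: (100 * f / total, 360 * f / total)
            for name, f in frequencies.items()}

ordered = pareto_order(injuries)
sections = pie_sections(injuries)

print("Pareto order:")
for name, f in ordered:
    print(f"  {name:<14}{f:>6}")

print("Pie graph sections:")
for name, (percent, degrees) in sections.items():
    print(f"  {name:<14}{percent:>6.1f}%{degrees:>8.1f} degrees")
```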