Basic Statistical Concepts - Measures of Location

Statistics is a scientific discipline focused on collecting, organizing, summarizing, analyzing, and drawing conclusions from numerical data to aid decision-making under uncertainty. It encompasses two main areas: descriptive statistics, which describes data, and inferential statistics, which makes inferences about populations based on samples. Key concepts include types of variables, levels of measurement, sampling methods, and measures of central tendency such as mean, median, and mode.

Uploaded by

Prince Xavier Baritua

DATA MANAGEMENT

What is Statistics?

❖ Statistics is a scientific discipline consisting of theory and methods for processing
numerical information that one can use when making decisions in the face of
uncertainty.
❖ It is the science of conducting studies to collect, organize, summarize, analyze, and
draw conclusions from data.

Two Main Areas of Statistics

1. Descriptive Statistics

❖ Here, statisticians try to describe a situation.

❖ It consists of the collection, organization, summarization, and presentation of
data.

2. Inferential Statistics

❖ Here, statisticians try to make inferences from samples to populations.

❖ It uses probability, i.e., the chance of an event occurring.
❖ It consists of generalizing from samples to populations, performing estimations
and hypothesis tests, determining relationships among variables, and making
predictions.

Key Definitions

A universe is the collection of things or observational units under consideration.

A variable is a characteristic observed or measured on every unit of the universe. It is a
characteristic or attribute that can assume different values.

Data are the values that the variables can assume.

A data set is a collection of data values.

A population consists of all subjects (human or otherwise) that are being studied.

A sample is a group of subjects selected from a population.

Types of Variables

1. Qualitative variables

▪ These are non-numerical values.

▪ These are variables that can be placed into distinct categories according to
some characteristic or attribute.
Examples: type of school, educational qualification, ethnicity, economic status

2. Quantitative variables

▪ These are numerical values that can be ordered or ranked

Example: age, height, weight, body temperature

Classification of Quantitative Variables

1. Discrete Variable – assumes values that can be counted.

Example: no. of children in a family, no. of students in a classroom, etc.

2. Continuous Variable – can assume an infinite number of values between any two specific values.
These include fractions and decimals.

Example: height, weight, etc.

A length recorded as 15 cm represents any length from 14.5 cm to 15.5 cm.
A weight recorded as 1.6 g represents any weight from 1.55 g to 1.65 g.

* Since continuous data must be measured, recorded values are rounded to the limited precision of the measuring device.

Levels of Measurement

1. Nominal Level

▪ It classifies data into mutually exclusive (nonoverlapping) categories in which no
order or ranking can be imposed on the data.
▪ Numbers or symbols are used to classify.

Examples: classifying teachers according to subject taught, classifying subjects
according to educational attainment, etc.

2. Ordinal Level

▪ Classifies data into categories that can be ranked; however, precise differences
between the ranks do not exist.

Example:

Student evaluation results might rank the faculty as excellent, satisfactory, poor, etc.
Children in a family might be ranked as 1st child, 2nd child, etc.

3. Interval Level

- Ranks data, and precise differences between units of measure do exist; however, there is no
meaningful zero.

Example: Temperature; there is a meaningful difference of 1°C between 37°C and 38°C.

▪ No meaningful/absolute zero means that, say, a temperature of 0°C doesn’t
mean no heat at all.

4. Ratio Level

- Possesses all the characteristics of interval measurement, and there exists a true zero.

- True ratios exist when the same variable is measured on two different members of the
population.

Example: If one person can lift 50 kg and another can lift 100 kg, then the ratio between them is
50:100, or 1:2; the second person can lift twice as much as the first.

Methods of Presenting Data

1. Textual
2. Tabular
3. Graphical

Sampling Methods

1. Random Sampling

A random sample is a sample in which all members of the population have an equal
chance of being selected.

2. Systematic Sampling

A systematic sample is a sample obtained by selecting every kth member of the population.

3. Stratified Sampling

A stratified sample is a sample obtained by dividing the population into
subgroups (strata) according to some characteristic relevant to the study. (There can be
several subgroups.) Then subjects are selected from each subgroup.

4. Cluster Sampling

A cluster sample is a sample selected by dividing the population into sections or clusters
and then selecting one or more clusters and using all members in the cluster(s) as the
members of the sample.

▪ It is used when the population is large or when it involves subjects residing in a
large geographic area.
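
The four sampling methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered subjects, the two strata, the ten clusters, and the function names are all hypothetical and are not part of the notes.

```python
import random

def simple_random_sample(population, n, rng):
    """Random sampling: every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic sampling: random start among the first k, then every kth member."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sampling: select subjects from each subgroup (stratum)."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Cluster sampling: pick whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subjects numbered 1..100
strata = {"first_half": population[:50], "second_half": population[50:]}
clusters = {f"section_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
print(srs, sys_sample, strat, clus, sep="\n")
```

Note that the stratified sample always contains subjects from every stratum, while the cluster sample contains every member of only the chosen clusters.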

Frequency Distribution and Graphs

The most convenient method of organizing data is to construct a frequency distribution,
and the most useful method of presenting data is by the use of statistical tables and graphs.

A frequency distribution is the organization of raw data in table form using classes and
frequencies.

Types of Frequency Distribution

1. Categorical Frequency Distribution

- Used for data that can be placed in specific categories, such as nominal or ordinal
level data.

Examples: Data such as age, gender, civil status, educational attainment, income, etc.
2. Grouped Frequency Distribution

▪ It is used when the data set is large or covers a wide range of values.

Definition of Terms:

1. Range = Highest Value – Lowest Value

2. Class Limits (lower and upper)

▪ These are the smallest (lower limit) and largest (upper limit) data values that can
belong to a class.

▪ They should have the same decimal place value as the data.

3. Class Boundaries

▪ These are the values that separate adjacent classes so that no gaps remain between them.
▪ They should have one additional decimal place compared with the data and end in 5.

4. Class Width

▪ It is the difference between the lower (upper) class limit of one class and the
lower (upper) class limit of the next class.

5. Class Midpoint

▪ It is obtained by adding the lower and upper boundaries, or the
lower and upper limits, and dividing by 2.
▪ It is the numerical location of the center of the class.
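
As a small illustration of these terms, the sketch below computes the boundaries, width, and midpoint of a single class. The class 18-26 (followed by a class starting at 27), the whole-number precision, and the function name are assumptions chosen for the example.

```python
def class_summary(lower_limit, upper_limit, next_lower_limit, unit=1):
    """Return (lower boundary, upper boundary, width, midpoint) of one class.

    unit is the precision of the data: 1 for whole numbers, 0.1 for one decimal place.
    """
    half_unit = unit / 2
    lower_boundary = lower_limit - half_unit   # one extra decimal place, ends in 5
    upper_boundary = upper_limit + half_unit
    width = next_lower_limit - lower_limit     # difference of consecutive lower limits
    midpoint = (lower_limit + upper_limit) / 2
    return lower_boundary, upper_boundary, width, midpoint

summary = class_summary(18, 26, 27)
print(summary)  # (17.5, 26.5, 9, 22.0)
```

Averaging the two boundaries, (17.5 + 26.5) / 2, gives the same midpoint as averaging the two limits.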

Rules in Constructing Frequency Distribution

1. There should be 5 to 20 classes.

2. It is preferable but not absolutely necessary that the class width be an odd number. This
ensures that the midpoint of each class has the same place value as the data.

$\text{Class Midpoint} = \dfrac{\text{lower boundary} + \text{upper boundary}}{2}$

3. Classes must be mutually exclusive. Mutually exclusive classes have nonoverlapping class
limits so that a data value cannot be placed into two classes.
4. The classes must be continuous.
5. The classes must be exhaustive. There should be enough classes to accommodate all
the data.
6. The classes must be equal in width.
Example: The following data represent the scores of 40 students in a 100-item STAT 221
exam. Construct a frequency distribution table using 9 classes. Find the mean class and the
median.

67 67 45 56 56 56 43 77 67 78

39 67 39 29 45 39 27 78 23 45

89 67 92 59 60 79 58 23 96 19

93 79 67 78 89 45 67 18 45 20
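
One possible way to build the table for these 40 scores is sketched below. It assumes whole-number data, starts the first class at the lowest score, and picks the smallest whole-number width whose 9 classes cover the highest score; these choices and the function name are assumptions, and other textbook conventions may give slightly different classes. The grouped mean is computed as ∑fx / N using the class midpoints.

```python
scores = [
    67, 67, 45, 56, 56, 56, 43, 77, 67, 78,
    39, 67, 39, 29, 45, 39, 27, 78, 23, 45,
    89, 67, 92, 59, 60, 79, 58, 23, 96, 19,
    93, 79, 67, 78, 89, 45, 67, 18, 45, 20,
]

def grouped_frequency(data, num_classes):
    """Return the class width and a list of (lower, upper, midpoint, frequency)."""
    lowest, highest = min(data), max(data)
    # Smallest whole-number width so that num_classes classes reach the highest value.
    width = (highest - lowest) // num_classes + 1
    table = []
    for i in range(num_classes):
        lower = lowest + i * width
        upper = lower + width - 1  # whole-number data assumed
        frequency = sum(lower <= x <= upper for x in data)
        table.append((lower, upper, (lower + upper) / 2, frequency))
    return width, table

width, table = grouped_frequency(scores, 9)
for lower, upper, midpoint, frequency in table:
    print(f"{lower}-{upper}  midpoint = {midpoint}  f = {frequency}")

total = sum(f for _, _, _, f in table)
grouped_mean = sum(mid * f for _, _, mid, f in table) / total
print("N =", total, " grouped mean =", grouped_mean)
```

With these assumptions the width comes out to 9, an odd number, so every midpoint is a whole number like the data.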

MEASURES

1. Measure of Location

A measure of location summarizes a data set by giving a “typical value” within the range of
the data values that describes its location relative to the entire data set.

Some Common Measures:

☞ Central Tendency

☞ Percentiles, Deciles, Quartiles

a. Percentile
▪ It is a numerical measure that gives the position of a data
value relative to the entire data set.

▪ It divides an array (raw data arranged in increasing or decreasing
order of magnitude) into 100 equal parts.

Percentiles are also used to compare an individual’s test score with the national
norm.

A percentage score indicates the proportion of a test that someone has
completed correctly.

A percentile score tells us what percent of the other scores are less than the data
point we are investigating.

Example:
1. (Exam Result) John is in the 78th percentile on the CPA exam. This means that he
performed better than 78% of the takers.
2. (Height) Jamie is in the 98th percentile among the students in the class. This
means 98% of the class are shorter than she is.
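
The idea of a percentile score as "the percent of scores less than a given value" can be sketched as follows. The heights are hypothetical values invented for illustration, and some textbooks use a slightly different formula (for example, adding 0.5 to the count), so treat this as one simple reading of the definition above.

```python
def percentile_rank(scores, value):
    """Percent of scores in the data set that are less than value."""
    below = sum(score < value for score in scores)
    return 100 * below / len(scores)

# Hypothetical heights (in cm) of ten students
heights_cm = [150, 152, 155, 158, 160, 161, 163, 165, 170, 172]
rank_170 = percentile_rank(heights_cm, 170)
print(rank_170)  # 80.0, since 8 of the 10 heights are below 170
```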

b. Decile
▪ It divides an array into ten equal parts, each part having ten
percent of the distribution of the data values, denoted by Dj.

The 1st decile is the 10th percentile; the 2nd decile is the 20th percentile; and so on.

c. Quartile

▪ It divides an array into four equal parts, each part having 25% of the
distribution of the data values, denoted by Qj.
The 1st quartile is the 25th percentile; the 2nd quartile is the 50th percentile,
which is also the median; and the 3rd quartile is the 75th percentile.

Steps in Finding Quartiles

Step 1. Arrange the data in order from lowest to highest.

Step 2. Find the median of the data values. This is the value for Q2.
Step 3. Find the median of the data values that fall below Q2.
This is the value for Q1.
Step 4. Find the median of the data values that fall above Q2.
This is the value for Q3.

Example:
Find the Q1, Q2 and Q3 for the data set:

15, 13, 6, 5, 12, 50, 22, 18

Solution:

Step 1. Arrange the data in order:

5, 6, 12, 13, 15, 18, 22, 50

Step 2. Find the median (Q2). With eight values, the median is the average of the 4th and 5th values.

Q2 = (13 + 15) / 2 = 14

Step 3. Find the median of the data values less than 14.
5, 6, 12, 13

Q1 = (6 + 12) / 2 = 9

Step 4. Find the median of the data values greater than 14.
15, 18, 22, 50

Q3 = (18 + 22) / 2 = 20

The interquartile range (IQR) is defined as the difference between Q3 and Q1.

IQR = Q3 - Q1 = 20 - 9 = 11
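
The four steps can be sketched in Python as below. The function names are illustrative, and the halves are taken by position (excluding the middle value when n is odd), which is one common reading of "values that fall below/above Q2"; statistical software often uses other quartile conventions and may return slightly different values.

```python
def median_of_sorted(values):
    """Median of a list that is already sorted."""
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    values = sorted(data)               # Step 1
    n = len(values)
    q2 = median_of_sorted(values)       # Step 2
    lower_half = values[: n // 2]       # values below Q2
    upper_half = values[(n + 1) // 2:]  # values above Q2
    q1 = median_of_sorted(lower_half)   # Step 3
    q3 = median_of_sorted(upper_half)   # Step 4
    return q1, q2, q3

q1, q2, q3 = quartiles([15, 13, 6, 5, 12, 50, 22, 18])
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 9.0 14.0 20.0 11.0
```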

MEASURES OF CENTRAL TENDENCY

1. Mean

The arithmetic mean, often called simply the mean, is the most frequently used measure of
central tendency. The mean is the only common measure in which all values play an equal role,
meaning that to determine its value you need to consider all the values of the given data
set. The mean is appropriate for determining the central tendency of interval or ratio data. The
symbol x̄, called “x bar”, is used to represent the mean of a sample, and the symbol μ, called
“mu”, is used to denote the mean of a population.

A. Properties of Mean

1. The mean is found by using all the values of the data.
2. The mean varies less than the median or mode when samples are taken from the
same population and all three measures are computed for these samples.
3. The mean is used in computing other statistics, such as the variance.
4. The mean for the data set is unique and not necessarily one of the data values.
5. The mean cannot be computed for data in a frequency distribution that has an
open-ended class.
6. The mean is affected by extremely high or low values, called outliers, and may not
be the appropriate average to use in these situations.
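
Property 6 can be seen with a short sketch. It reuses the data set 5, 6, 12, 13, 15, 18, 22, 50 from the quartile example in these notes and treats 50 as the extreme value; that choice of outlier is an assumption made for illustration.

```python
from statistics import median

values = [5, 6, 12, 13, 15, 18, 22, 50]
without_outlier = [x for x in values if x != 50]  # drop the extreme value

# The mean is the sum of all values divided by the number of values.
mean_all = sum(values) / len(values)
mean_without = sum(without_outlier) / len(without_outlier)

print("mean with 50:", mean_all, " mean without 50:", mean_without)
print("median with 50:", median(values), " median without 50:", median(without_outlier))
```

Removing the single extreme value moves the mean from 17.625 to 13, while the median only moves from 14 to 13.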
2. Median

The median is the midpoint of the data array. When the data set is ordered, whether ascending
or descending, it is called a data array. The median is an appropriate measure of central tendency for
data that are ordinal or above, but it is most valuable for ordinal data.

A. Properties of the Median

1. The median is unique; there is only one median for a data set.

2. The median is used to find the center or middle value of a data set.

3. The median is used when it is necessary to find out whether the data values fall in the upper
or lower half of the distribution.

4. Median is not affected by the extreme values.

5. Median can be computed for an open-ended frequency distribution.

6. Median can be applied for ordinal, interval and ratio data.

B. Median for the Ungrouped Data

To determine the median of ungrouped data, we need to consider two rules:

1. If n is odd, the median is the middle-ranked value.

2. If n is even, the median is the average of the two middle-ranked values.

$\text{Median (Rank Value)} = \dfrac{n + 1}{2}$

Example 1: Find the median of the ages of the middle-management employees of a certain
company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.

Solution:

1. Arrange the data in ascending order.

45, 46, 48, 51, 53, 54, 55, 58, 59
2. Select the middle rank value using the formula:
$\text{Median (Rank Value)} = \dfrac{n + 1}{2} = \dfrac{9 + 1}{2} = \dfrac{10}{2} = 5$
3. Identify the median in the data set.
45, 46, 48, 51, 53, 54, 55, 58, 59
The 5th value is 53.
Hence, the median age is 53 years.

Example 2: The daily rates of eight employees of a certain Municipality of Davao del Sur are
Php 550, 420, 560, 500, 700, 670, 860, and 480. Find the median daily rate of the employees.

Solution:

1. Arrange the data (in Php) in ascending order.

420, 480, 500, 550, 560, 670, 700, 860

2. Select the middle rank value using the formula:

$\text{Median (Rank Value)} = \dfrac{n + 1}{2} = \dfrac{8 + 1}{2} = \dfrac{9}{2} = 4.5$

3. Identify the median in the data set.

420, 480, 500, 550, 560, 670, 700, 860

Rank 4.5 falls between the 4th value (550) and the 5th value (560).

Since the middle point falls between 550 and 560, we can determine the median of the
data set by taking the average of the two values.
$\text{Median} = \dfrac{550 + 560}{2} = \dfrac{1{,}110}{2} = 555$

Therefore, the median daily rate is Php 555.
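
Both rules can be sketched with the rank formula (n + 1) / 2. The function name is illustrative; the two data sets are the ones from Examples 1 and 2.

```python
def median_ungrouped(data):
    """Median of ungrouped data using the middle rank (n + 1) / 2."""
    values = sorted(data)
    n = len(values)
    rank = (n + 1) / 2
    if n % 2 == 1:
        # n is odd: the rank is a whole number; list positions start at 0
        return values[int(rank) - 1]
    # n is even: the rank ends in .5, so average the two neighboring values
    lower = values[int(rank) - 1]
    upper = values[int(rank)]
    return (lower + upper) / 2

ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]
median_age = median_ungrouped(ages)
median_rate = median_ungrouped(daily_rates)
print(median_age, median_rate)  # 53 555.0
```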

3. Mode

The mode is the value in the data set that appears most frequently. Like the median and unlike
the mean, the mode is not affected by extreme values in the data set. A data set may not contain any
mode if none of the values is “most typical”. A data set that has only one value that occurs with the
greatest frequency is said to be unimodal. If the data set has two values with the same greatest
frequency, both values are considered the mode and the data set is bimodal. If the data set
has more than two modes, it is said to be multimodal. If all the values in a data
set are different from each other, the data set is said to have no mode.

A. Properties of Mode

1. The mode is used when the most typical case is desired.
2. The mode is the easiest average to compute.
3. The mode can be used when the data are nominal or categorical, such as religious
affiliation, gender, or political affiliation.
4. The mode is not always unique. A data set can have more than one mode, or the mode may
not exist for a data set.

Example 1: The following data represent the total unit sales for PSP 2000 from a
sample of 10 gaming centers for the month of August: 15, 17, 10, 12, 13, 10, 14, 10, 8,
and 9. Find the mode.

Solution: The ordered array for these data is 8, 9, 10, 10, 10, 12, 13, 14, 15, 17.
Because 10 appears three times, more often than any other value, the mode is
10.

Example 2: An operations manager in charge of a company’s manufacturing keeps track
of the number of LCD televisions manufactured in a day. The following data
represent the number of LCD televisions manufactured each day for the past three weeks:

20, 18, 19, 25, 20, 21, 20, 25, 20, 29, 28, 29, 25, 27, 26, 22, 25, and 30.
Find the mode of the given data set.

Solution: The ordered array for these data is:

18, 19, 20, 20, 20, 20, 21, 22, 25, 25, 25, 25, 26, 27, 28, 29, 29, 30.

There are two modes, 20 and 25, since each of these values occurs four times in
the data set.
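
A minimal sketch of finding the mode(s) by counting is shown below. The function name is illustrative, the second data set is the ordered array given in the solution to Example 2, and returning an empty list when every value occurs only once is an assumption that follows the "no mode" rule stated above.

```python
from collections import Counter

def modes(data):
    """Return the list of modes; an empty list means the data set has no mode."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # all values are different from each other
    return sorted(value for value, count in counts.items() if count == highest)

psp_sales = [15, 17, 10, 12, 13, 10, 14, 10, 8, 9]
lcd_ordered = [18, 19, 20, 20, 20, 20, 21, 22, 25, 25, 25, 25, 26, 27, 28, 29, 29, 30]

print(modes(psp_sales))    # [10]      -> unimodal
print(modes(lcd_ordered))  # [20, 25]  -> bimodal
print(modes([1, 2, 3]))    # []        -> no mode
```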

Measures of Variation

A measure of variation is a single value that is used to describe the spread
of the distribution.
A measure of central tendency alone does not uniquely describe a
distribution.

❖ Range - The difference between the maximum and minimum values in a data
set, i.e.,

R = MAX – MIN
The larger the value of the range, the more dispersed the observations
are.
It is quick and easy to understand.
It is a rough measure of dispersion.

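❖ Variance
It is an important measure of variation.
It shows variation about the mean.
a. Population Variance

Formula: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$

b. Sample Variance

Formula: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$

❖ Standard Deviation

a. Population SD

Formula: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$

b. Sample SD

Formula: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$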

Properties of Standard Deviation

It is the most widely used measure of dispersion. (Chebychev’s Inequality)

It is based on all the items and is rigidly defined.
It is used to test the reliability of measures calculated from samples.
The standard deviation is sensitive to the presence of extreme values.
It is not easy to calculate by hand (unlike the range).
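
The range, variance, and standard deviation can be computed together as in the sketch below. It uses the standard definitions (sum of squared deviations from the mean, divided by N for a population or by n − 1 for a sample, with the standard deviation as the square root of the variance) and, purely for illustration, reuses the eight daily rates from the median example; the function name is an assumption.

```python
import math

def variation_measures(data):
    """Range, variance, and standard deviation (population and sample versions)."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)  # sum of squared deviations
    return {
        "range": max(data) - min(data),
        "population_variance": sum_sq / n,
        "sample_variance": sum_sq / (n - 1),
        "population_sd": math.sqrt(sum_sq / n),
        "sample_sd": math.sqrt(sum_sq / (n - 1)),
    }

daily_rates = [420, 480, 500, 550, 560, 670, 700, 860]
measures = variation_measures(daily_rates)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

The sample versions are slightly larger than the population versions because they divide by n − 1 instead of N.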

Coefficient of Variation (CV)

It is a measure of relative variation.
It is usually expressed in percent.
It shows variation relative to the mean.
It is used to compare two or more groups.

Formula: $CV = \dfrac{SD}{\text{Mean}} \times 100\%$
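
A short sketch of comparing two groups with the coefficient of variation, taken here as the standard deviation divided by the mean and expressed in percent. The two groups are hypothetical data invented for illustration, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV in percent: sample standard deviation relative to the mean."""
    return stdev(data) / mean(data) * 100

# Hypothetical measurements for two groups on very different scales
group_a = [10, 12, 11, 13, 14]
group_b = [100, 120, 110, 130, 140]

cv_a = coefficient_of_variation(group_a)
cv_b = coefficient_of_variation(group_b)
print(f"SD: {stdev(group_a):.2f} vs {stdev(group_b):.2f}")
print(f"CV: {cv_a:.2f}% vs {cv_b:.2f}%")
```

Group B has a standard deviation ten times larger than group A, yet both have the same CV, so their variation relative to the mean is the same.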

Measures of Skewness

It describes the degree of departure of the distribution of the data from
symmetry.
The degree of skewness is measured by the coefficient of skewness,
denoted as SK and computed as
$SK = \dfrac{3(\text{Mean} - \text{Median})}{SD}$
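
The coefficient SK = 3(Mean − Median) / SD can be sketched as below. Using the sample standard deviation is an assumption, since the notes do not say which version of SD is intended; the first data set is the one from the quartile example and the second is a hypothetical symmetric set.

```python
from statistics import mean, median, stdev

def coefficient_of_skewness(data):
    """SK = 3 * (mean - median) / SD, with the sample SD assumed."""
    return 3 * (mean(data) - median(data)) / stdev(data)

sk_skewed = coefficient_of_skewness([5, 6, 12, 13, 15, 18, 22, 50])
sk_symmetric = coefficient_of_skewness([1, 2, 3, 4, 5])
print(sk_skewed)     # positive: the large value 50 pulls the mean above the median
print(sk_symmetric)  # 0.0: the mean equals the median
```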
Types of Distributions

Frequency distributions can assume many shapes. The three most familiar shapes are symmetric,
positively skewed, and negatively skewed. In a symmetric distribution, the data values are
evenly distributed on both sides of the mean. Also, the distribution is unimodal, and the mean,
median, and mode are similar and lie at the center of the distribution.

What is Symmetry?

Exercise No. 3.

1. The hourly output of two groups of employees assembling plug-in units at Zenith were
selected at random. The sample outputs were:
Complete the table; all measurements should be rounded to two decimal places.

a. Which shift performed better? ____________________________________

b. Justify your answer. _____________________________________________

2. Complete the table and find the mean for the following grouped frequency distribution.

N = ________

∑ fx = _______

x̄ = _______

Correlation – a statistical method used to determine whether a relationship
between variables exists.

Regression – a statistical method used to describe the nature of the relationship
between variables, that is, positive or negative, linear or nonlinear.

One Way - ANOVA (Analysis of Variance) – a technique used to determine if there is a

significant difference amon
