Introduction to Statistics
Chamilanka Wanigasekara
MSc. (Data Science), BSc. (Hons.) in Statistics
Statistics
Collection of procedures and principles for gathering data and
analyzing information to help people make decisions when faced with
uncertainty
Statistical procedures, which are based on numerical data, can be divided into
two major categories:
• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
• Descriptive statistics are used to describe
the basic features of the data in a study.
• They provide simple summaries about the
sample and the measures
Basic Functions of Statistics
• Data collection – Collecting accurate data relevant to the study. This is the
most important function.
• Data organizing – Organizing the collected raw data (turning ungrouped data
into grouped data, i.e. a data distribution).
• Data analyzing – Calculating the mean, variance and many other measurements.
• Data interpreting – Interpreting those measurements.
• Data presenting – Presenting the information using graphs, tables, charts, etc.
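The organizing and analyzing functions can be illustrated with a minimal Python sketch. The marks below are made-up values used only for illustration, and the standard-library `statistics` module is just one of several ways to compute these measures.

```python
import statistics

# Hypothetical raw data: marks of ten students (made up for illustration).
marks = [56, 72, 64, 81, 49, 72, 90, 67, 58, 72]

# Data organizing: sort the raw data and build a simple frequency distribution.
ordered = sorted(marks)
frequency = {value: ordered.count(value) for value in sorted(set(ordered))}

# Data analyzing: compute summary measures.
mean = statistics.mean(marks)
variance = statistics.variance(marks)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(marks)      # square root of the sample variance

# Data presenting: print a small summary.
print("Ordered data:", ordered)
print("Frequency distribution:", frequency)
print(f"Mean = {mean:.2f}, variance = {variance:.2f}, std. dev. = {std_dev:.2f}")
```

Interpreting the output (for example, judging whether the spread of marks is large) remains a task for the analyst.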
Inferential Statistics
• Inferential statistics is concerned with making predictions or inferences about a
population from observations and analyses of a sample.
• Population – The entire group of individuals considered for the study is called the
population.
• Sample – A portion (part) of the population selected for analysis. It is a subset of
the population and should represent the population accurately.
• Ex: Suppose 10,000 bulbs are produced and we want to know the mean lifetime
of the produced bulbs. It is impractical to test all the bulbs, so 100 bulbs are
taken and checked. From these, the mean lifetime of the produced bulbs is
estimated to be 1,200 hours.
Population – 10,000 bulbs
Sample – 100 bulbs
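The bulb example can be sketched in Python. The lifetimes here are simulated: the normal distribution and its parameters (mean 1,200 hours, standard deviation 100 hours) are assumptions chosen only to mirror the example, not real measurements.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Assumption: simulated lifetimes (hours) for the 10,000 bulbs produced.
# The normal distribution and its parameters are illustrative only.
population = [random.gauss(1200, 100) for _ in range(10_000)]

# Draw a simple random sample of 100 bulbs without replacement.
sample = random.sample(population, 100)

sample_mean = statistics.mean(sample)          # estimate from the sample
population_mean = statistics.mean(population)  # normally unknown in practice

print(f"Sample mean lifetime:     {sample_mean:.1f} hours")
print(f"Population mean lifetime: {population_mean:.1f} hours")
```

In a real study only the sample mean would be available; the population mean is computed here solely to show how close the estimate can be.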
[Figure: individuals are selected from the population to form a sample and participate in the study; inferences about the population are then made from the sample.]
Statistics in Every Sphere
Agricultural sphere
Ex. Estimating the probability that newly introduced weeds will sprout
Engineering sphere
Ex. Calculating the strength of the cement used to construct a new building
Medical sphere
Ex. Preparing a blood report for patients
Business sphere
Ex. Checking the quality of goods on a production line
Definitions in Statistics
1. Characteristic – An attribute of a specific person, object or physical situation
Ex - Specific person – height of a student
Object – weight of a milk packet
Physical situation – temperature of Kandy town
2. Variable – A variable is a characteristic that may assume more than one set of values to
which a numerical measure can be assigned. The value of the variable can vary from one
entity to another.
Ex – age, height and weight of the students in a classroom
Variables may be classified into various categories.
Types of Variables
2.1 Qualitative Variable and Quantitative Variable
• Qualitative variables take on values that are names or labels.
Ex - color of a book
• Quantitative variables are numeric. They represent a measurable quantity.
Ex - population of a city
Quantitative variables can be further classified as discrete or continuous.
Types of Variables
2.2 Discrete Variable and Continuous Variable
Discrete variable – A variable that can take only a finite or countable number of values.
The data values are unconnected (separate) points, often counts.
Discrete variables are not measured on a scale (km, m, cm, l, ml).
Ex - number of telephone calls received per day, number of boys in a family
Continuous variable – A variable that can take infinitely many values, with connected
data points; often a measurement.
Continuous variables can take any value in a certain range.
In principle, these measurements can be made to any degree of accuracy.
These variables are measured on a scale (km, m, cm, l, ml).
Ex - time, which can be measured to the nearest minute, second or half-second;
weight of a milk powder packet
Types of Variables
Summary: a variable is either qualitative or quantitative; a quantitative variable is
either discrete or continuous.
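The classification can be illustrated with a small Python sketch. The function `classify_variable` is hypothetical, and its rule of thumb (text values are qualitative, whole numbers are discrete, decimal numbers are continuous) is a simplifying assumption based on how the values are stored, not a statistical rule.

```python
def classify_variable(values):
    """Classify a variable from its observed values (rule-of-thumb sketch)."""
    # Labels or names: qualitative.
    if all(isinstance(v, str) for v in values):
        return "qualitative"
    # Whole-number counts: treated here as discrete.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "quantitative (discrete)"
    # Other numeric measurements: treated here as continuous.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative (continuous)"
    return "unclear"


# Hypothetical observations, echoing the examples in the slides.
observations = {
    "color of a book": ["red", "blue", "green"],
    "telephone calls per day": [12, 7, 9],
    "milk packet weight (kg)": [0.98, 1.02, 1.00],
}

for name, values in observations.items():
    print(f"{name}: {classify_variable(values)}")
```

In practice the classification depends on what the variable represents, so a rule like this can only serve as a first check.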
Definitions in Statistics
3. Sample survey – A sample survey is a study that obtains data from a subset of a
population (a sample). Information is collected from all sample units.
4. Census – A census is a study that obtains data from every member (unit) of
a population.
5. Population frame – A population frame is a list of all the elements (units) in
the population.
6. Sample frame – A sample frame is a list of all the elements (units) in the sample.
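The difference between a census and a sample survey can be sketched as follows. The unit identifiers, the population size of 500 and the sample size of 50 are hypothetical values chosen for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population frame: identifiers of all 500 units in the population.
population_frame = [f"unit-{i:03d}" for i in range(1, 501)]

# Census: every unit in the population frame is studied.
census_units = list(population_frame)

# Sample survey: only a randomly selected subset is studied.
# The list of selected units plays the role of the sample frame defined above.
sample_frame = random.sample(population_frame, 50)

print("Units covered by a census:       ", len(census_units))
print("Units covered by a sample survey:", len(sample_frame))
print("First five sampled units:", sorted(sample_frame)[:5])
```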
Data
• The subject of Statistics is based on data.
• Collecting accurate and reliable data is essential; it enables decision makers to
reach optimal decisions.
• Statistical data is generated from calculations, measurements, etc.
• Data can be classified into many categories.
Data
Quantitative and qualitative data
Quantitative data – Data that consists of numerical values. This type of data can be
analyzed directly.
Ex - The profit of the firm for last year is 1.2 million
Qualitative data – Data that does not consist of numerical values. This type of data
cannot be analyzed directly.
Ex - Gender of a person
Data
[Diagram: classification of data into Internal Data and External Data, and into Primary Data and Secondary Data]
Internal data – Data that has been collected within the firm
Ex - Employee attendance records, accounts, employee salary records
External data – Data obtained from outside the firm
Ex - Exchange rate details
Primary Data
Data collected by the investigator himself/herself for a specific purpose
• This type of data is generally fresh and collected for the first time.
Ex - Data collected by a student for his or her research
• Major methods of collecting primary data:
  • Postal questionnaire
  • Interview
  • Telephone conversation
  • Direct observation
Secondary Data
Data collected by someone else for some other purpose (but being
utilized by the investigator for another purpose)
• Secondary data is data that is being reused, usually in a different context.
• Primary data collected for one purpose can become secondary data for another purpose.
• This data can be obtained more easily and quickly than primary data.
• Secondary data may be obtained from many sources, for example:
  • Central Bank reports
  • Department of Census and Statistics
  • Customs reports
Limitations of Statistics
• A single unit is not considered when carrying out a study. We cannot draw conclusions
by studying a single unit.
• Qualitative data cannot be analyzed directly. We can analyze qualitative data by
transforming it into percentages and rates.
• Statistical data is not 100% accurate, as we generate it by studying a sample.
Statistical data is relatively accurate but not absolutely accurate.
• Statistical data can be misused.
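The point about transforming qualitative data into percentages can be illustrated with a short Python sketch. The responses below are made-up values used only for illustration.

```python
from collections import Counter

# Hypothetical qualitative data: gender recorded for ten respondents.
responses = ["Female", "Male", "Female", "Female", "Male",
             "Male", "Female", "Female", "Male", "Female"]

counts = Counter(responses)  # number of responses in each category
total = len(responses)

# Convert each category count into a percentage of all responses.
percentages = {category: 100 * count / total for category, count in counts.items()}

for category, pct in percentages.items():
    print(f"{category}: {counts[category]} of {total} ({pct:.0f}%)")
```

Once expressed as counts and percentages, the categories can be compared and presented like any other numerical summary.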