Doc1 1

Statistics is a mathematical discipline focused on collecting, analyzing, interpreting, and presenting data to inform decision-making across various fields. Key concepts include population and sample, descriptive and inferential statistics, and probability, with applications ranging from healthcare to finance. The document also discusses the importance of statistical analysis, its advantages and disadvantages, and various sampling techniques and measures of central tendency and variation.

Uploaded by

gsid4600

Unit 2

Introduction to Statistical Fundamentals

What is Statistics?

Statistics is a branch of mathematics that involves collecting, analyzing, interpreting, presenting, and
organizing data. It's used to make sense of complex data sets and to inform decision-making in
various fields like business, science, healthcare, and more.

Key Concepts in Statistics

1. Population and Sample:

o Population: The entire group you want to study or make conclusions about.

o Sample: A subset of the population selected for analysis. Sampling is often used
because it's impractical to study the whole population.

2. Descriptive Statistics:

o Mean: The average value.

o Median: The middle value when data is sorted.

o Mode: The most frequently occurring value.

o Range: The difference between the highest and lowest values.

o Standard Deviation: A measure of how spread out the values are around the mean.

3. Inferential Statistics:

o Hypothesis Testing: Determining whether there is enough evidence to support a
specific claim about the population.

o Confidence Intervals: A range of values that's likely to contain the population
parameter.

o Regression Analysis: Understanding relationships between variables.

4. Probability:

o Probability: The likelihood of an event occurring.

o Normal Distribution: A bell-shaped curve that represents the distribution of many
types of data.

Basic Steps in Statistical Analysis

1. Collect Data: Gather information relevant to the study.

2. Organize Data: Sort and format the data for analysis.

3. Analyze Data: Use statistical methods to explore and summarize the data.

4. Interpret Data: Draw conclusions based on the analysis.

5. Present Data: Use visualizations and reports to communicate findings.

Example of Descriptive Statistics

Imagine you have the following test scores: 70, 75, 80, 85, 90.

• Mean: (70 + 75 + 80 + 85 + 90) / 5 = 80

• Median: 80 (the middle value)

• Mode: There is no mode since all values are unique.

• Range: 90 - 70 = 20

• Standard Deviation: Measures how much the scores deviate from the mean. Here it is about 7.07 (population formula) or 7.91 (sample formula).
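The same figures can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative assumptions, not part of the original notes.

```python
import statistics

scores = [70, 75, 80, 85, 90]

mean = statistics.mean(scores)            # arithmetic average
median = statistics.median(scores)        # middle value of the sorted scores
modes = statistics.multimode(scores)      # all values tie, so every score is returned
value_range = max(scores) - min(scores)   # highest minus lowest
pop_sd = statistics.pstdev(scores)        # population standard deviation (divide by n)
sample_sd = statistics.stdev(scores)      # sample standard deviation (divide by n - 1)

print(mean, median, modes, value_range, round(pop_sd, 2), round(sample_sd, 2))
```

When every value occurs once, `multimode` returns all of them, which is how the "no mode" case shows up in code.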

Introduction to Statistics

Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data. It helps
us make sense of the world through numbers and data.

Need for Statistics

Statistics is essential for:

1. Decision Making: Helps in making informed decisions based on data analysis.

2. Understanding Patterns: Identifies trends and patterns within data.

3. Prediction: Forecasts future outcomes based on historical data.

4. Scientific Research: Validates hypotheses and tests theories.

5. Quality Control: Monitors and improves the quality of products and services.

Advantages of Statistics

1. Data-Driven Insights: Provides objective insights based on data rather than intuition.

2. Improved Accuracy: Enhances the precision of results through quantitative analysis.

3. Trend Analysis: Identifies and analyzes trends over time.

4. Risk Management: Assesses and mitigates risks in various fields like finance and healthcare.

5. Effective Communication: Presents complex data in an understandable and visual manner.

Disadvantages of Statistics

1. Misinterpretation: Data can be misinterpreted or manipulated to support a biased view.

2. Complexity: Statistical methods can be complex and require specialized knowledge.

3. Data Quality: Results depend on the quality of the data; poor data leads to inaccurate
conclusions.

4. Overreliance: Overreliance on statistical analysis can overlook qualitative factors.

5. Time-Consuming: Collecting and analyzing data can be time-consuming and
resource-intensive.

Applications of Statistics

Statistics is used in a wide range of fields, including:

1. Healthcare: Analyzing patient data to improve treatments, understanding disease patterns,
and managing public health.

2. Finance: Risk assessment, stock market analysis, and financial forecasting.

3. Marketing: Consumer behavior analysis, market research, and product development.

4. Government: Policy making, census data analysis, and resource allocation.

5. Education: Evaluating educational programs, analyzing test scores, and improving teaching
methods.

6. Sports: Performance analysis, game strategy, and player statistics.

Case Study of a Statistical Application: Analyzing Customer Satisfaction in a Retail Store

In this case study, we’ll explore how a retail company applies statistical methods to measure and
analyze customer satisfaction. This involves both descriptive statistics and inferential statistics.

Background

A retail store has conducted a survey to measure the satisfaction level of its customers. The survey
asked customers to rate their experience on a scale of 1-10, with 1 being very dissatisfied and 10
being very satisfied. The company wants to analyze this data to make informed decisions about
improving the customer experience.

Data Collected

The company gathers feedback from 500 customers. The data collected is as follows:

• Ratings: Each customer provides a rating between 1 and 10.

• Additional Factors: Customers also answer questions about the store’s location, the quality
of service, product availability, etc.

1. Descriptive Statistics in the Case Study

Descriptive statistics are used to summarize and organize the data into meaningful patterns, making
it easier to interpret the large volume of information collected. This process includes the following
techniques:

a. Measures of Central Tendency

• Mean: The average satisfaction rating across all customers.

For example, if the satisfaction ratings of 10 customers are: [5, 6, 7, 8, 9, 6, 7, 6, 8, 7], the mean rating
is calculated as:

$$\text{Mean} = \frac{5+6+7+8+9+6+7+6+8+7}{10} = \frac{69}{10} = 6.9$$

This means that, on average, customers rate their experience about 6.9 out of 10.

• Median: The middle value when the ratings are ordered in ascending or descending order. If
there is an even number of ratings, the median is the average of the two middle numbers.

For the example above, the ordered ratings are: [5, 6, 6, 6, 7, 7, 7, 8, 8, 9]. The median is the average
of the fifth and sixth values, 7 and 7, which gives a median of 7.0.

• Mode: The most frequent rating given by the customers. In this case, 6 and 7 each occur three
times, more often than any other rating, so the data has two modes (it is bimodal): 6 and 7.

b. Measures of Spread/Dispersion

• Range: The difference between the highest and lowest ratings. If the highest rating is 9 and
the lowest is 5, the range is:

$$\text{Range} = 9 - 5 = 4$$

• Standard Deviation: A measure of how spread out the ratings are from the mean. A lower
standard deviation means the ratings are close to the mean, while a higher standard
deviation indicates greater variability in customer satisfaction.

For the same data, a standard deviation calculation would show how consistent or varied the ratings
are.
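As a rough illustration, the measures of central tendency and spread for the ten example ratings can be computed as follows. This is a sketch using Python's standard `statistics` module; the variable names are assumptions made for the example.

```python
import statistics

ratings = [5, 6, 7, 8, 9, 6, 7, 6, 8, 7]

mean = statistics.mean(ratings)             # sum of ratings divided by their count
median = statistics.median(ratings)         # average of the two middle sorted values
modes = statistics.multimode(ratings)       # every rating tied for most frequent
value_range = max(ratings) - min(ratings)   # highest minus lowest rating
pop_sd = statistics.pstdev(ratings)         # spread if the ten ratings are the whole population
sample_sd = statistics.stdev(ratings)       # spread if they are a sample of a larger group

print(f"mean={mean}, median={median}, modes={modes}")
print(f"range={value_range}, population sd={pop_sd:.2f}, sample sd={sample_sd:.2f}")
```

Both standard deviations come out a little above 1, so most ratings sit within about one point of the mean.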

c. Visualization

• Bar Graph: A bar graph could be created to visualize the number of customers giving each
rating from 1 to 10.

• Histogram: A histogram could show the distribution of customer ratings, helping to visualize
how ratings are spread.
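A plotting library would normally draw these charts. As a dependency-free sketch, the counts behind such a bar graph can be tallied and printed as text bars; the ten ratings are the example values from the central-tendency section, and the text layout is only an assumption for illustration.

```python
from collections import Counter

ratings = [5, 6, 7, 8, 9, 6, 7, 6, 8, 7]
counts = Counter(ratings)   # maps each rating to how many customers gave it

# One row per possible rating from 1 to 10; ratings nobody gave show an empty bar.
for rating in range(1, 11):
    print(f"{rating:>2} | {'#' * counts[rating]}")
```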

2. Inferential Statistics in the Case Study

Inferential statistics help us make predictions or inferences about a larger population based on a
sample. In this case, we want to understand whether the sample data (500 customers) can be used
to make conclusions about the satisfaction of all customers.

a. Hypothesis Testing

Suppose the company wants to test whether their customers, on average, rate their satisfaction
above 7 (i.e., they hypothesize that the mean satisfaction rating is greater than 7). The null
hypothesis (H₀) and alternative hypothesis (H₁) can be defined as:

• H₀: The mean satisfaction rating is less than or equal to 7.

• H₁: The mean satisfaction rating is greater than 7.

The company would use a one-sided, one-sample t-test (or a Z-test, which gives nearly the same
result for a sample as large as 500) to test this hypothesis. If the p-value obtained from the test is
smaller than a chosen significance level (typically 0.05), the company can reject the null hypothesis
and conclude that the mean satisfaction rating is significantly greater than 7.
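A minimal sketch of the one-sided test described above, using only the Python standard library. The ratings below are invented for illustration and are not the survey data, and the variable names are assumptions. Because the standard library has no t distribution, the sketch uses the Z-test (normal approximation), which is reasonable only for larger samples.

```python
import math
import statistics

# Hypothetical ratings, invented for illustration only (not the survey data).
sample = [8, 7, 9, 6, 8, 7, 8, 9, 7, 8, 6, 9, 8, 7, 8, 9, 7, 8, 8, 7,
          9, 8, 6, 8, 7, 9, 8, 7, 8, 9, 7, 8, 8, 6, 9, 8, 7, 8, 9, 7]

mu0 = 7        # value of the mean under the null hypothesis
alpha = 0.05   # chosen significance level

n = len(sample)
mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Test statistic: how many standard errors the sample mean lies above mu0.
z = (mean - mu0) / standard_error

# One-sided p-value: probability of a z at least this large if H0 were true.
p_value = 1 - statistics.NormalDist().cdf(z)

reject_h0 = p_value < alpha
print(f"mean={mean:.2f}, z={z:.2f}, p={p_value:.4g}, reject H0: {reject_h0}")
```

For small samples, a t-test from a statistics package would be the safer choice.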

b. Confidence Intervals

Based on the sample data, the company might want to construct a confidence interval for the mean
satisfaction rating. A 95% confidence interval might suggest that, based on the sample data, the true
mean satisfaction rating for all customers is between 6.8 and 7.2.

This allows the company to understand the potential range of customer satisfaction in the larger
population, accounting for uncertainty in the sample data.
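The following sketch shows one common way to build such an interval: sample mean plus or minus a critical value times the standard error. The ratings are hypothetical values invented for the example, and the normal critical value (about 1.96) is an assumption suited to larger samples; a t critical value would be used for small ones.

```python
import math
import statistics

# Hypothetical ratings, invented for illustration only (not the survey data).
sample = [8, 7, 9, 6, 8, 7, 8, 9, 7, 8, 6, 9, 8, 7, 8, 9, 7, 8, 8, 7,
          9, 8, 6, 8, 7, 9, 8, 7, 8, 9, 7, 8, 8, 6, 9, 8, 7, 8, 9, 7]

confidence = 0.95
mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Two-sided critical value: leaves 2.5% in each tail of the standard normal.
z_crit = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * standard_error
ci = (mean - margin, mean + margin)
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```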

c. Regression Analysis

If the company wants to predict satisfaction ratings based on other factors (e.g., service quality, store
cleanliness, etc.), they can use regression analysis. For instance, they could run a linear regression to
predict customer satisfaction based on the store’s cleanliness score. The result of this regression
would show how strongly store cleanliness affects satisfaction.
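As an illustration, a simple linear regression can be fitted by ordinary least squares in a few lines. The cleanliness and satisfaction scores below are made-up values, and the variable names are assumptions; the sketch only shows how the slope, intercept, and R² are obtained.

```python
# Hypothetical paired scores, invented for illustration only.
cleanliness = [3, 4, 5, 6, 7, 8, 9]
satisfaction = [4, 5, 5, 6, 7, 8, 8]

n = len(cleanliness)
mean_x = sum(cleanliness) / n
mean_y = sum(satisfaction) / n

# Sums of squares and cross-products around the means.
s_xx = sum((x - mean_x) ** 2 for x in cleanliness)
s_yy = sum((y - mean_y) ** 2 for y in satisfaction)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cleanliness, satisfaction))

slope = s_xy / s_xx                    # change in satisfaction per cleanliness point
intercept = mean_y - slope * mean_x    # predicted satisfaction at cleanliness 0
r_squared = s_xy ** 2 / (s_xx * s_yy)  # share of variance explained by the line

print(f"satisfaction = {intercept:.2f} + {slope:.2f} * cleanliness (R^2 = {r_squared:.2f})")
```

The slope estimates how much the satisfaction rating changes per one-point change in cleanliness, and R² indicates how much of the variation in satisfaction the line accounts for.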

d. Chi-Square Test for Independence

If the company also collects categorical data (e.g., gender, age group), a Chi-square test for
independence could be used to check whether there’s an association between customer satisfaction
(grouped into categories such as satisfied and not satisfied) and demographic factors like age or
gender. For example, the company could investigate whether the distribution of satisfaction levels
differs between younger and older customers.
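A small sketch of the test on a 2x2 table, using only the standard library. The counts are hypothetical, the age groups and satisfaction categories are assumptions made for the example, and no continuity correction is applied. The p-value shortcut via `math.erfc` is valid only for 1 degree of freedom, which is what a 2x2 table has.

```python
import math

# Hypothetical counts: rows are age groups, columns are (satisfied, not satisfied).
observed = [[90, 60],   # under 35
            [80, 70]]   # 35 and over

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count in each cell if age group and satisfaction were independent.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2={chi2:.3f}, p={p_value:.3f}")
```

With these made-up counts the p-value is well above 0.05, so the data would not show a significant association.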

Summary

In this case study:

• Descriptive statistics helped the company summarize customer satisfaction data using
measures like the mean, median, mode, and standard deviation.

• Inferential statistics allowed the company to make predictions about the larger customer
base, test hypotheses about satisfaction levels, and assess the relationship between
customer satisfaction and other factors.

Variables and Types of Data:

1. Variables: A variable is a characteristic or property that can take different values. Variables are
used in statistical analysis to understand how certain phenomena behave or vary. There are two main
types of variables:

• Qualitative (Categorical) Variables: These represent categories or groups, and the data
cannot be measured numerically.

o Example: Gender (Male/Female), Color (Red/Blue/Green)

• Quantitative (Numerical) Variables: These represent numerical measurements that can be
measured on a scale.

o Example: Age (years), Height (cm), Income (dollars)

Quantitative variables can be further divided into:

o Discrete Variables: These can only take specific values (countable).

▪ Example: Number of children in a family.

o Continuous Variables: These can take any value within a range and are measurable.

▪ Example: Weight (kg), Temperature (°C).

2. Types of Data: Data can be classified into different types based on its nature:

• Nominal Data: Categorical data with no inherent order or ranking.

o Example: Colors, Gender, Religion.

• Ordinal Data: Categorical data with a meaningful order but no consistent difference between
values.

o Example: Education level (High School, Bachelor's, Master's), Customer satisfaction
rating (Good, Average, Poor).

• Interval Data: Numeric data where the difference between values is meaningful, but there is
no true zero point.

o Example: Temperature (in Celsius or Fahrenheit).

• Ratio Data: Numeric data with a true zero point, where both differences and ratios are
meaningful.

o Example: Height, Weight, Age.

Sampling Techniques:

Sampling is the process of selecting a subset of data from a larger population for analysis. Common
sampling techniques include:

• Random Sampling: Every individual in the population has an equal chance of being selected.

• Systematic Sampling: Every nth individual is selected from the population.

• Stratified Sampling: The population is divided into subgroups (strata), and a random sample
is taken from each stratum.

• Cluster Sampling: The population is divided into clusters, and some clusters are randomly
selected, then all individuals from those clusters are included.

• Convenience Sampling: Individuals are chosen based on their availability or ease of access
(often biased).

• Judgmental or Purposive Sampling: Individuals are selected based on the researcher's
judgment or purpose of the study.
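The four probability-based techniques in the list above can be sketched with Python's `random` module. The population of 100 customer IDs, the urban/rural strata, and the clusters of ten are all hypothetical choices made for this example.

```python
import random

rng = random.Random(42)             # fixed seed so the run is repeatable
population = list(range(1, 101))    # hypothetical customer IDs 1..100

# Simple random sampling: every ID has the same chance of selection.
simple = rng.sample(population, 10)

# Systematic sampling: random starting point, then every 10th ID.
step = len(population) // 10
start = rng.randrange(step)
systematic = population[start::step]

# Stratified sampling: sample each stratum in proportion to its size.
strata = {"urban": population[:60], "rural": population[60:]}
stratified = []
for members in strata.values():
    share = round(10 * len(members) / len(population))
    stratified.extend(rng.sample(members, share))

# Cluster sampling: pick whole clusters at random and keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [pid for cluster in rng.sample(clusters, 2) for pid in cluster]

print(simple, systematic, stratified, cluster_sample, sep="\n")
```

Convenience and judgmental sampling are not shown because they depend on the researcher's choices rather than on a random mechanism.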

Descriptive Measures:

Descriptive statistics are used to summarize or describe a set of data. They include measures of
central tendency, variation, and position.

1. Measures of Central Tendency:

These measures describe the "center" or typical value of a dataset.

• Mean: The average of all data points.

• Median: The middle value when the data is arranged in ascending or descending order.

o If there is an odd number of data points, the median is the middle value.

o If there is an even number of data points, the median is the average of the two
middle values.

• Mode: The most frequently occurring value in a dataset.

o Example: In the dataset {2, 4, 4, 6, 8}, the mode is 4.

2. Measures of Variation:

These measures describe how spread out the data is.

• Range: The difference between the maximum and minimum values.

o Formula: Range = Maximum value - Minimum value.

• Variance: A measure of how far the data points are from the mean. It is the average of the
squared differences from the mean (dividing by n for a population, or by n - 1 for a sample).

• Standard Deviation: The square root of the variance, giving a measure of spread in the same
units as the original data.

3. Measures of Position:

These measures describe the relative standing of a particular data point within a distribution.

• Percentiles: Values that divide the data into 100 equal parts. The pth percentile is the value
below which p percent of the data falls.

o Example: The 50th percentile is the median.

• Quartiles: Specific percentiles that divide the data into four equal parts.

o First Quartile (Q1): 25th percentile, below which 25% of the data falls.

o Second Quartile (Q2): 50th percentile (median).

o Third Quartile (Q3): 75th percentile, below which 75% of the data falls.

• Interquartile Range (IQR): The range between the first and third quartiles (Q3 - Q1), used to
measure the spread of the middle 50% of the data.
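As a sketch, the quartiles and IQR of the five test scores from the earlier descriptive-statistics example (70, 75, 80, 85, 90) can be computed with `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the `method="inclusive"` setting used here is an assumption, and other methods can give slightly different values on the same data.

```python
import statistics

scores = [70, 75, 80, 85, 90]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1   # spread of the middle 50% of the data

print(f"Q1={q1}, Q2={q2}, Q3={q3}, IQR={iqr}")
```

Q2 matches the median of the scores, as the definition of the second quartile requires.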

These descriptive measures help summarize, understand, and communicate the patterns within data.
