Inferential Statistics
Introduction to Probability
Exploratory data analysis helped you understand how to discover patterns in data using various techniques and
approaches. As you learnt, EDA is one of the most important parts of the data analysis process. It is also the part on
which data analysts spend most of their time.
However, sometimes you may require a very large amount of data for your analysis, which may need too much time and too many resources to acquire. In such situations, you have to work with a smaller sample of the data instead of the entire data set.
Situations like these arise all the time at big companies like Amazon. For example, let's say the Amazon QC department
wants to know what proportion of the products in its warehouses are defective. Instead of going through all of its
products (which would be a lot!), the Amazon QC team can just check a small sample of 1,000 products and then find, for
this sample, the defect rate (i.e., the proportion of defective products). Then, based on this sample's defect rate, the
team can 'infer' what the defect rate is for all the products in the warehouses.
This process of 'inferring' insights from sample data is called 'inferential statistics'.
Note that even after using inferential statistics, you will arrive at only an estimate of the population data from the sample
data, not the exact values. This is because when you don't have the exact data, you can only make reasonable estimates
about it with a limited level of certainty. Therefore, when certainty is limited, we talk in terms of probability. Probability
is useful and important in inferential statistics.
In this session, you will learn the basic concepts of probability and the various rules associated with it. The broad agenda
of the session covers the following:
1. Permutation and combination
2. Definition of probability and its properties
3. Key terms related to probability
4. Probability rules (Addition and Multiplication)
Permutations
A permutation is a way of arranging a select group of objects in such a way that the order is of significance. As shown in
the example, when you arrange the top order batsmen of a cricket team, you use permutation to find all the possible
orders in which they can be arranged.
If r out of n 'objects' are to be arranged among r available 'spaces', then the number of ways in which this task can be completed is n!/(n−r)!. If there are n 'spaces' as well, then the number of ways is simply n!. Here, n! (pronounced as 'n factorial') is the product of all the numbers from n down to 1 and is given by the following formula:
n! = n∗(n−1)∗(n−2)∗...∗3∗2∗1
nPr or P(n, r) = n!/(n−r)!
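As a quick sanity check, the permutation formula can be computed in Python; the squad of 5 batsmen with 3 top-order spots is just an illustrative number:

```python
import math
from itertools import permutations

# Number of ways to arrange r objects out of n: nPr = n! / (n-r)!
def n_p_r(n, r):
    return math.factorial(n) // math.factorial(n - r)

# Arranging 3 top-order batsmen chosen from a squad of 5
print(n_p_r(5, 3))                           # 60
print(math.perm(5, 3))                       # built-in equivalent (Python 3.8+)

# Enumerating the arrangements explicitly gives the same count
print(len(list(permutations(range(5), 3))))  # 60
```

Enumerating with `itertools.permutations` and counting with the factorial formula agree, which is a useful way to convince yourself the formula is right.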
In the case of counting using the method of permutations, the 'order' is an important factor. Some examples of permutations are as follows.
1. Finding all possible four-letter words that can be formed using the letters R, E, A and D
2. Finding all possible ways in which the final league standings of the eight teams can be in an Indian Premier
League (IPL) tournament
3. Finding all possible ways that a group of 10 people can be seated in a row in a cinema hall, and so on
Combinations
Permutation vs combinations
Permutations (order matters): In how many ways can you arrange three numbers from 1, 2, 3? There are 6 ways: 123, 132, 213, 231, 312, 321.
Combinations (order does not matter): In how many ways can you choose three numbers from 1, 2, 3? There is only 1 way: {1, 2, 3}.
When you just have to choose some objects from a larger set and the order is of no significance, then the rule of
counting that you use is called combination.
Some other examples of combinations are as follows.
1. The number of ways in which you can pick three letters from the word 'UPGRAD'
2. The number of ways a team can win three matches in a league of five matches
3. The number of ways in which you can choose 13 cards from a deck of 52 cards, and so on
The formula for counting the number of ways to choose r objects out of a set of n objects is as follows:
nCr or C(n, r) = n!/(r!(n−r)!)
One way to decide between the two is to see if the order matters or not. If it does, then use the permutations formula, and if it does not, then use the one for combinations.
Note: A helpful hint here would be to look for a keyword in the given scenario to know which method is needed. If the
problem requires you to order/arrange a group of objects, then you would most probably use the method of
permutations. Else, if you are told to pick/choose a group of objects, then more often than not you would be using the
formula for combinations.
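A minimal sketch of the combinations formula in Python, using the 'UPGRAD' and card-deck examples above:

```python
import math

# nCr = n! / (r! * (n-r)!)
def n_c_r(n, r):
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(n_c_r(6, 3))        # 20 ways to pick 3 letters from 'UPGRAD'
print(math.comb(52, 13))  # ways to choose 13 cards from a deck of 52

# Order matters vs order does not matter, for the 1, 2, 3 example:
print(math.perm(3, 3), math.comb(3, 3))   # 6 permutations, 1 combination
```

`math.comb` and `math.perm` (Python 3.8+) are the built-in versions of the two counting rules.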
Types of Events
The two main categories of events that you need to know right now are independent events and disjoint or mutually
exclusive events. Let's learn their formal definitions.
Independent events: If you have two or more events and the occurrence of one event has no bearing
whatsoever on the occurrence/s of the other event/s, then all the events are said to be independent of each
other. For example, the chances of rain in Bengaluru on a particular day has no effect on the chances of rain in
Mumbai 10 days later. Hence, these two events are independent of each other.
Disjoint or mutually exclusive events: Now, two or more events are mutually exclusive when they do not occur
at the same time, i.e., when one event occurs, the rest of the events do not occur. For example, if a student has
been assigned grade C for a particular subject in an exam, he or she cannot be awarded grade B for the same
subject in the same exam. So, the events in which a student gets a grade of B or C for the same subject in the
same exam are mutually exclusive or disjoint.
For example
The events 'Customer A buys the product' and 'Customer B buys the product' are independent, whereas the
events 'Customer A buys the product' and 'Customer A does not buy the product' are disjoint.
The events 'You will win Lottery A' and 'You will win Lottery B' are independent, whereas 'You will win Lottery A'
and 'You will not win Lottery A' are disjoint events.
P(A∪B) = P(A) + P(B) − P(A∩B)
where,
P(A∪B) denotes the probability that either event A or B occurs,
P(A) denotes the probability that event A occurs,
P(B) denotes the probability that event B occurs, and
P(A∩B) denotes the probability that both events A and B occur simultaneously.
You can also read
P(A∪B) as P(either event A or B occurs)
P(A∩B) as P(both events A and B occur).
As mentioned in the video, for disjoint events A and B, P(A∩B) = 0 since both cannot occur simultaneously. Hence, the
formula can be rewritten as P(A∪B) = P(A) + P(B).
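A small sketch of the addition rule in Python, using the heart/face card counts from a standard deck; exact fractions avoid rounding. The grade probabilities in the last two lines are made-up numbers, used only to illustrate the disjoint case:

```python
from fractions import Fraction

# P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
p_heart = Fraction(13, 52)   # 13 hearts in a deck
p_face  = Fraction(12, 52)   # 12 face cards (J, Q, K of each suit)
p_both  = Fraction(3, 52)    # J, Q, K of hearts

p_heart_or_face = p_heart + p_face - p_both
print(p_heart_or_face)       # 11/26

# For disjoint events, P(A ∩ B) = 0, so the rule reduces to P(A) + P(B)
p_grade_b, p_grade_c = Fraction(1, 4), Fraction(1, 5)  # hypothetical values
print(p_grade_b + p_grade_c)                           # 9/20
```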
When event A is not dependent on event B and vice versa, the two are known as independent events. The multiplication rule allows us to compute the probability of both of them occurring simultaneously, which is given as:
P(A and B) = P(A)∗P(B)
More generally, for several independent events,
P(A and B and C and D) = P(A)∗P(B)∗P(C)∗P(D).
Comparison Between Addition Rule and Multiplication Rule
Both the addition rule and the multiplication rule allow you to compute the probabilities of the occurrence of
multiple events. However, there is a key difference between the two, which should help you to decide when to
use which rule.
1. The addition rule is generally used to find the probability of multiple events when either of the events can
occur at that particular instance. For example, when you want to compute the probability of picking a face
card or a heart card from a deck of 52 cards, a successful outcome occurs when either of the two events is
true. This includes either getting a face card, a heart card, or even both a face and a heart card. This rule
works for all types of events.
2. The multiplication rule is used to find the probability of multiple events when all the events need to occur
simultaneously. For example, in a coin toss experiment where you toss the coin three times and you need to
find the probability of getting three heads at the end of the experiment, a successful outcome occurs when
you get a head in the first, second and third toss as well. This rule is used for independent events only.
3. Also, in the addition rule, do you remember the P(A⋂B) that we used to compute the final value of P(A⋃B)?
This value is exactly the same as the P(A and B) that we compute in independent events using the
multiplication rule. You can go back and verify it for the same example shown in the video. There we had
P(Heart Card) = P(H) = 13/52, P(Face Card) = P(F) = 12/52 and P(Heart Card and Face Card) = P(H⋂F) = 3/52.
Now, as mentioned by the multiplication rule, you can see that P(H and F) = P(H)*P(F) = (13/52)*(12/52) =
3/52, which is the same as the value of P(H⋂F).
Note: A helpful hint here to decide when to use the addition rule and when to use the multiplication rule is to observe
the language of the question. If the question mentions an 'OR' to denote the relationship between the events, then you
need to apply the addition rule. That is, either of the given events can occur at that time, P(Event A or Event B). Else, if an
'AND' is used to denote the relationship between the events, then the multiplication rule should be used. Here, the
events need to happen simultaneously and must be independent, i.e., P(Event A and Event B).
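The heart/face card check described in point 3 above can be reproduced directly; exact fractions make the equality with P(H∩F) explicit:

```python
from fractions import Fraction

p_heart = Fraction(13, 52)
p_face  = Fraction(12, 52)

# Multiplication rule for independent events: P(A and B) = P(A) * P(B)
print(p_heart * p_face)      # 3/52, matching P(H ∩ F) counted directly

# Three heads in three independent coin tosses
p_head = Fraction(1, 2)
print(p_head ** 3)           # 1/8
```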
Basics of Probability
Random Variables
The random variable X converts the outcomes of experiments to measurable values.
For example, let’s say as a data analyst at a bank, you are trying to find out which of the customers will default
on their loan, i.e., stop paying their loans. Based on some data, you have been able to make the following
predictions:
Customer Number | Yearly Income (in ₹) | Amount of Loan Due (in ₹) | Number of Dependents | Default Prediction (Yes/No)
1 | 10 lakh | 75 lakh | 3 | Yes
2 | 15 lakh | 50 lakh | 2 | No
3 | 20 lakh | 40 lakh | 1 | No
Now, instead of processing the yes/no response, it will be much easier if you define a random variable X to
indicate whether the customer is predicted to default or not. The values will be assigned according to the
following rule:
X = 1, if the customer defaults;
X = 0, if the customer does not default.
Now, the data changes to the following:
Customer Number | Yearly Income (in ₹) | Amount of Loan Due (in ₹) | Number of Dependents | Default Prediction (X)
1 | 10 lakh | 75 lakh | 3 | 1
2 | 15 lakh | 50 lakh | 2 | 0
3 | 20 lakh | 40 lakh | 1 | 0
Now, in this form, the table is entirely quantified, i.e., converted to numbers. Now that the data is entirely in quantitative
terms, it becomes possible to perform a number of different kinds of statistical analyses on it.
E.g., in a casino: in the long run (i.e., if a game is played a lot of times), is the game profitable for the players or for the house? Or will everybody break even in the long run?
We established a three-step process for answering this question:
1. Find all the possible combinations.
2. Find the probability of each combination.
3. Use the probabilities to estimate the profit/loss per player.
Probability Distributions
A probability distribution is a form of representation that tells us the probability of all the possible values of X. It could be represented as a table, a chart or an equation.
Expected Value
The expected value of a variable X is the value of X that we would “expect” to get after performing the experiment an infinite number of times.
It is also called the expectation, average or mean value.
Mathematically speaking, for a random variable X that can take the values: x1, x2, x3, ..........., xn.
The expected value (EV) is given by:
EV(X)=x1∗P(X=x1) + x2∗P(X=x2) + x3∗P(X=x3) + ........... + xn∗P(X=xn)
The expected value should be interpreted as the average value you get after the experiment has been conducted an
infinite number of times.
For example, the expected value for the number of red balls is 2.385. This means that if we conduct the
experiment (play the game) infinite times, the average number of red balls per game would end up being 2.385.
In a bag containing red and blue balls, let’s say that the probability of getting 1 red ball in one trial is equal to p. Hence, the probability of getting 1 blue ball in one trial is equal to (1−p).
So, the probability distribution for X (i.e., the number of red balls drawn after 4 trials), if the probability of getting a red ball in 1 trial is p, is as follows.
Let’s say:
n = the number of trials
p = the probability of success
(1−p) = the probability of failure
r = the number of successes after n trials
Then, the probability of getting r red balls is p^r, the probability of getting (n−r) blue balls is (1−p)^(n−r), and the number of combinations with r successes is nCr.
Probability of getting one combination of r red balls and (n−r) blue balls: P(X=r) for one combination = p^r × (1−p)^(n−r)
Hence, over all nCr combinations: P(X=r) = nCr × p^r × (1−p)^(n−r)
However, there are some conditions that need to be met in order for us to be able to apply the formula.
1. If you toss a coin 20 times to see how many times you get tails, you are following all the conditions required for a
binomial distribution. The total number of trials is fixed (20), and you can only have two outcomes, i.e., tails or
heads. The probability of getting a tail is 0.5 each time you toss a coin.
2. In a way, this is similar to drawing 20 balls out of a bag, replacing each ball after drawing it, and seeing how many
of the balls are red. Here, the probability of getting a red ball in one trial is 0.5.
3. When you toss a coin until you get heads, the total number of trials is not fixed. This is similar to taking out balls
from the bag repeatedly until you draw a red ball. You can still find the probability of getting heads in 1 trial, 2
trials, 3 trials etc. and so on, but you cannot use binomial distribution to find that probability.
4. In the second example, where binomial distribution is not applicable, the experiment does not have only two
outcomes, but several. It is similar to taking out balls from a bag that contains red, blue, black, orange, and
other-colored balls. The probability distribution for this experiment cannot be made using binomial distribution.
5. In the final example, the probability of trials is not equal to each other. For example, the probability of drawing a
red ball in the first trial is 3/5. Now, in the second trial, the probability of drawing a red ball would be equal
to 2/4 not 3/5, as the red ball taken out in the first trial was not put back. Hence, the probability of getting the
combination red-red-red-blue, for example, would be 3/5*2/4*1/3*2/2, which is not the value we got while
deriving binomial distribution (3/5*3/5*3/5*2/5). Again, you cannot use binomial distribution to find the
probability in this case.
In other words, binomial distribution is applicable in situations where there are a fixed number of yes or no questions,
with the probability of a yes or a no remaining the same for all questions.
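A minimal sketch of the binomial formula for the 20-coin-toss example above, using only the standard library:

```python
import math

# Binomial probability: P(X = r) = nCr * p**r * (1-p)**(n-r)
def binom_pmf(n, r, p):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# Probability of exactly 10 tails in 20 fair coin tosses
print(round(binom_pmf(20, 10, 0.5), 4))               # ≈ 0.1762

# The probabilities over all possible r sum to 1
print(sum(binom_pmf(20, r, 0.5) for r in range(21)))  # 1.0 (up to float error)
```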
A random variable follows a Bernoulli distribution if it only has two possible outcomes: 0 or 1.
For example, suppose we flip a coin one time. Let the probability that it lands on heads be p. This means the
probability that it lands on tails is 1-p.
Now, if we flip a coin multiple times then the sum of the Bernoulli random variables will follow a Binomial
distribution.
There are some more probability distributions that are commonly seen among discrete random variables. They are
not covered in this course, but if you want to go through some of them, you can use the following links:
1. Poisson Distribution :
It gives the probability of an event happening a certain number of times (k) within a given interval of
time or space. The Poisson distribution has only one parameter, λ (lambda), which is the mean number
of events.
2. Geometric Distribution :
The number of trials required to achieve the first success: P(X = n) = (1 − p)^(n − 1) × p. Trials are performed until the first success.
3. Negative Binomial Distribution :
The number of failures before the nth success in a sequence of draws of Bernoulli random variables,
trials are performed till a certain number of successes.
4. Binomial Distribution:
Trials are fixed
Cumulative Probability
In the previous example, we only discussed the probability of getting an exact value. For example, we know the
probability of X = 4 (4 red balls). But what if the house wants to know the probability of getting < 3 red balls, as the house
knows that for < 3 red balls, the players will lose and the house will make money?
Sometimes, talking in terms of less than is more useful. For example — how many employees can get to work in less than
40 minutes? Let’s explore how you can find the probability for such cases.
The cumulative probability of X, denoted by F(x), is defined as the probability of the variable being less than or equal to x.
F(x) = P(X<=x)
For example:
F(3) = P(X<=3) = P(X=0) + P(X=1) + P(X=2) + P(X=3)
F(2) = P(X<=2) = P(X=0) + P(X=1) + P(X=2)
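The cumulative probability can be sketched by summing the binomial PMF. Here we assume the red-ball example with n = 4 draws and p = 3/5, which is the setting that produces the 0.8704 figure used in this session's discrete CDF:

```python
import math
from fractions import Fraction

def binom_pmf(n, r, p):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# F(x) = P(X <= x): add up the PMF for all values up to x
def binom_cdf(n, x, p):
    return sum(binom_pmf(n, r, p) for r in range(x + 1))

# 4 draws with replacement, P(red in one draw) = 3/5
p = Fraction(3, 5)
print(binom_cdf(4, 3, p))          # 544/625, i.e. 0.8704
print(float(binom_cdf(4, 2, p)))   # 0.5248, P(fewer than 3 red balls)
```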
1. The first step is defining the random variable. The random variable (X) is the outcome of a die throw. So, X = {1,
2, 3, 4, 5, 6}
2. The second step is to calculate the probabilities related to each outcome. The probability of each outcome
is 1/6 in a die throw.
Now, you have X and P(X). If you plug these values in the formula E[X]=∑(X×P(X)), you’ll get 3.5 as the expected value. So
how to interpret this number? This means if you were to throw the die a large number of times, the average of those
numbers will tend towards 3.5.
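The die-throw expectation can be verified in a few lines; exact fractions make the 3.5 explicit:

```python
from fractions import Fraction

# E[X] = sum over outcomes of x * P(X = x), for a fair six-sided die
outcomes = range(1, 7)
p = Fraction(1, 6)
expected = sum(x * p for x in outcomes)
print(expected)          # 7/2, i.e. 3.5
```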
So, why do we need the expected value at all? Well, the expected value lets you reason about real-world random phenomena more rationally.
Since the CDF and the PDF describe probabilities in terms of intervals rather than exact values, it is advisable to use them when talking about continuous random variables, rather than the bar chart distribution that we used for discrete variables.
CDF, or a cumulative distribution function, is a distribution that plots the cumulative probability of X against X.
A PDF, or a Probability Density Function, however, is a function in which the area under the curve gives you the
cumulative probability.
The main difference between the cumulative probability distribution of a continuous random variable and a discrete
one lies in the way you plot them. While a continuous variables’ cumulative distribution is a curve, a distribution for
discrete variables looks more like a bar chart.
CDF for Continuous Variables (Commute Time): For the continuous variable, i.e., the daily commute time, you have a different cumulative probability value for every value of X. For example, the value of cumulative probability at 21 will be different from its value at 21.1, which will again be different from the one at 21.2, and so on. Hence, you would show its cumulative probability as a continuous curve, not a bar chart.
CDF for Discrete Variables (Number of Red Balls): The reason for the difference is that for discrete variables, the cumulative probability does not change very frequently. In the discrete variable example, we only care about what the probability is for 0, 1, 2, 3 and 4. This is because the cumulative probability will not change between, say, 3 and 3.999999. For all values between these two, the cumulative probability is equal to 0.8704.
Uniform Distribution
A commonly observed type of distribution among continuous variables is a uniform distribution. For a
continuous random variable following a uniform distribution, the value of probability density is equal for all
possible values. Let’s explore this distribution a little more.
Since all possible values are between 0 and 10, the area under the curve between 0 and 10 is equal to 1.
The value of the PDF for all values between 0 and 10 is 0.1.
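A tiny numeric sketch of the uniform distribution on [0, 10]:

```python
# Uniform distribution on [0, 10]: density f(x) = 1/(b - a) = 0.1 everywhere
a, b = 0.0, 10.0
density = 1 / (b - a)
print(density)              # 0.1

# Area under the curve over [0, 10] (a rectangle): width * height = 1
print((b - a) * density)    # 1.0

# The CDF grows linearly: P(X <= 4) = (4 - a) / (b - a)
print((4 - a) / (b - a))    # 0.4
```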
Well, PDFs are more commonly used in real life. The reason is that it is much easier to see patterns in PDFs as
compared to CDFs. For example, here are the PDF and the CDF of a uniformly distributed continuous random
variable:
The PDF clearly shows uniformity, as the probability density’s value remains constant for all possible values.
However, the CDF does not show any trends that help you identify quickly that the variable is uniformly
distributed.
Again, it is clear that the symmetrical nature of the variable is much more apparent in the PDF than in the CDF.
Hence, generally, PDFs are used more commonly than CDFs.
Normal Distribution
Normally distributed data follows the 1-2-3 rule. This rule states that there is a:
1. 68% probability of the variable lying within 1 standard deviation of the mean,
2. 95% probability of the variable lying within 2 standard deviations of the mean, and
3. 99.7% probability of the variable lying within 3 standard deviations of the mean.
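The 1-2-3 rule can be checked with the standard library's NormalDist:

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)

# Probability of a normal variable lying within k standard deviations of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, prob in within.items():
    print(k, round(prob, 4))   # 1 0.6827, 2 0.9545, 3 0.9973
```

The exact values are 68.27%, 95.45% and 99.73%, which the rule rounds to 68-95-99.7.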
For the standard normal distribution, mean (µ) = 0 and standard deviation (σ) = 1.
Basically, the Z value tells us how many standard deviations away from the mean your random variable is: Z = (X − µ)/σ. We can find the cumulative probability corresponding to a given value of Z using the Z table:
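A short sketch of computing a Z value and its cumulative probability; the commute-time mean and standard deviation below are assumed numbers, used only for illustration:

```python
from statistics import NormalDist

# Z = (X - mu) / sigma: how many standard deviations X lies from the mean
mu, sigma = 36.6, 10      # hypothetical commute-time parameters (minutes)
x = 50
z = (x - mu) / sigma
print(round(z, 2))                     # 1.34

# Cumulative probability for this Z, as a Z table would give
print(round(NormalDist().cdf(z), 4))   # ≈ 0.9099
```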
The value of σ is an indicator of how wide the graph is. A low value of σ means that the graph is narrow, while
a high value implies that the graph is wider. This is because a wider graph has more values away from the
mean, resulting in a high standard deviation.
Again, some more probability distributions are commonly seen among continuous random variables. They are
not covered in this course, but if you want to go through some of them, you can use the links below:
1. Exponential Distribution
2. Gamma Distribution
3. Chi-Squared Distribution
Suppose for a business application, you want to find out the average number of times people in urban India visited malls
last year. That’s 400 million (40 crore) people! You cannot possibly go and ask every single person how many times they
visited the mall. That’s a costly and time-consuming process. How can you reduce the time and money spent on finding
this number?
Sampling Distributions
Sampling distributions have certain properties that will help us estimate the population mean from the sample mean.
The sampling distribution, specifically the sampling distribution of the sample means, is a probability density function for the sample means of a population.
This distribution has some very interesting properties, which will later help you estimate the sampling error. Let's take a
look at these properties.
The sampling distribution’s mean is denoted by μX̄, as it is the mean of the sample means. Let’s see what it is for this sampling distribution.
Note that you would divide by n and not n-1 as you have the data for
all 100 entries (i.e. mean of samples) of the distribution and you don't
need to sample the distribution.
Properties of Sampling Distributions
We’ve been saying that the sampling distribution has some interesting properties that will later help you estimate the
error in your samples. Let’s finally see what these properties are.
So, there are two important properties of a sampling distribution of the mean:
So, the central limit theorem says that for any kind of data, provided a large number of samples has been taken, the following properties hold true.
The Central Limit Theorem states that no matter how the population is distributed, the sampling distribution will be approximately normal, with mean μ and standard deviation σ/√n. For n > 30, the sampling distribution can be taken as exactly a normal distribution.
In other words, the sample means of any population follow a normal distribution, whether or not the population itself follows a normal distribution. If all we need is to estimate the population mean, the CLT makes it possible to do that.
CLT Demonstration
Link for python calculations for sample mean of population i.e. CLT
No matter the parent population distribution, when you take samples, compute their means and find the sampling
distribution, it will always be normal, or at least nearly normal. This is one of the most important implications of the
Central Limit Theorem.
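The demonstration can be reproduced as a simulation sketch, here under an assumed exponential parent population (λ = 0.5, so µ = σ = 2, clearly not normal):

```python
import random
import statistics

random.seed(0)

# Parent population: exponential with rate lambda; mean = sd = 1/lambda = 2
lam = 0.5
n, num_samples = 50, 2000

# Draw many samples of size n and record each sample's mean
sample_means = [
    statistics.mean(random.expovariate(lam) for _ in range(n))
    for _ in range(num_samples)
]

print(round(statistics.mean(sample_means), 2))   # close to mu = 2.0
print(round(statistics.stdev(sample_means), 2))  # close to sigma/sqrt(n) ≈ 0.28
```

Plotting a histogram of `sample_means` would show the near-normal, bell-shaped sampling distribution the CLT predicts.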
CLT Demonstration: II
The population mean, i.e., the mean daily commute time of all 30,000 employees, is μ = 36.6 = (sample mean) + some margin of error.
We can find this margin of error using the CLT (central limit theorem).
Consider the food industry, where you want to find the average lead content in a food product (let’s say the maximum permissible lead content is 2.33 ppm).
So, population mean (µ) = sample mean X̄ ± margin of error = 2.3 ppm ± margin of error
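A sketch of the margin-of-error calculation via the CLT; the sample standard deviation (0.6 ppm) and sample size (100) below are assumed values, only the 2.3 ppm sample mean comes from the example:

```python
import math
from statistics import NormalDist

# Margin of error at 95% confidence: z* x sigma / sqrt(n)
x_bar, sigma, n = 2.3, 0.6, 100               # hypothetical sample values
z_star = NormalDist().inv_cdf(0.975)          # ≈ 1.96 for 95% confidence
margin = z_star * sigma / math.sqrt(n)

print(round(margin, 3))                                    # ≈ 0.118
print(round(x_bar - margin, 2), round(x_bar + margin, 2))  # interval around 2.3
```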
1. Random Sampling:
In this method, people in the sample are selected randomly. This is similar to randomly pulling names out of a
hat.
Example: Suppose you want to find out the average internet usage per person in India. You just put the names of
all the Indians in a hat and pull out 100 names at random, and then calculate the average internet usage
of these 100 Indians.
2. Stratified Sampling:
Here, people are divided into subgroups and then selected randomly from those subgroups. But this is done in
such a way that the final sample has the same proportions of these subgroups as the population.
Example: Again, suppose you want to find out the average internet usage per person in India. Note that 70% of
Indians live in rural areas, and 30% live in urban areas. So, you would put the names of all the rural
Indians in hat A and the names of all the urban Indians in hat B. Then, you’d pull 70 names out of hat
A and 30 names out of hat B. Now, again, you’d have a sample of 100 Indians, but this time, your
sample would be more representative of the population as its rural and urban proportions would be
the same as that of the population.
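A sketch of stratified sampling on a toy population with the 70/30 rural/urban split described above (the population itself is synthetic):

```python
import random

random.seed(42)

# Hypothetical population: 70% rural, 30% urban
population = [("rural", i) for i in range(7000)] + \
             [("urban", i) for i in range(3000)]

def stratified_sample(pop, strata_props, size):
    """Sample so that each stratum keeps its population proportion."""
    sample = []
    for stratum, prop in strata_props.items():
        members = [p for p in pop if p[0] == stratum]
        sample += random.sample(members, round(size * prop))
    return sample

sample = stratified_sample(population, {"rural": 0.7, "urban": 0.3}, 100)
print(len(sample))                                 # 100
print(sum(1 for s in sample if s[0] == "rural"))   # 70 rural, 30 urban
```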
3. Volunteer Sampling:
Here, your sample is composed of people who want to volunteer for the survey.
Example: Suppose that once more, you want to find out the average internet usage per person in India. You
could ask people to take an online survey, which asks them how often/much they use the internet. You
could ask the same question through a telephonic survey.
The good thing about this type of sampling is that it looks unbiased and random because the survey participants
are selected at random through the medium (internet, telephone) itself. There is no human interference.
However, the medium will also bring in some bias. For example, an internet survey is more likely to include
people who have high internet usage, whereas a telephone survey is a little more likely to have a balanced
representation of heavy internet users and people who use the internet infrequently.
4. Opportunity Sampling:
In this method, the people around and close to the surveyor form their sample space.
Example: This time, when you want to find out the average internet usage per person in India, you just ask 100
people around you about their internet usage.
Clearly, this sampling method has the potential to become extremely biased. The only good thing here, probably, is that
this is a relatively convenient sampling method.
So, there are four typical cases in which sampling is generally used, though its use is not limited to these:
1. Market research: Suppose your company wants to launch a product whose usage depends on people having a
decent internet connection, such as Hotstar, Netflix, etc. Before launching such a product, you need to
understand the potential market size. For this, you need to conduct a survey with some people and based on
their data, infer parameters such as the average data usage, the willingness to adopt new technologies, etc. for
the entire population.
2. Marketing campaign efficacy: Suppose you work for a company such as Hotstar or Netflix. You want more and
more people to move from your competitors’ platforms to your platform. You plan to do this through a
marketing campaign. But how should you structure this marketing campaign? What should be its budget? Which
strategy should be used (free membership for a week, lower membership fees for a few weeks, etc.)? You can
use the data from your past marketing campaign and your knowledge of sampling techniques to make these
decisions.
3. Pilot testing: Again, let’s take the Hotstar and Netflix example. Suppose you’ve done all the market research
required and developed a product. Now, before putting your product out in the market, you want to give it a trial
run. For this, you can perform what is called a pilot test. It means that instead of giving your product a full-
fledged launch, you can just launch it partially to a few people, who can test your product and help you decide
whether it is good enough for a full launch.
4. Quality control: This is more of a manufacturing-centered application. Let’s say your company produces 10
million smartphones annually. This means that around 30,000 phones are manufactured every day. In such a
situation, quality assurance (QA) becomes a function of utmost importance. Since it is difficult to check all 30,000
phones every day, your company would just “sample” a few and then make decisions based on those samples.
So, now you know how stratified sampling can be used to improve your inferences. Let’s go through the case again:
1. You want to conduct a brand equity survey for e-commerce brands. In other words, you want to find out how
much of the e-commerce market is controlled by Flipkart, Amazon, and Snapdeal, respectively.
2. An important part of this process would be to conduct a survey, the results of which would tell you the
proportion of Indian e-commerce buyers that use each of these websites.
3. However, in order to do this, you would need to perform stratified sampling on the basis of gender
(male/female), age, and location (metro/tier 2/other urban/rural). Not doing this would mean that you run the
risk of erroneous selection, for example, selecting too many people from metro areas or too few women, etc.
Hence, by not using stratified sampling you might end up with an unrepresentative sample.
4. So, you give the questionnaire prepared by your team to the general public. Once you’ve acquired sufficient
sample data, you can make estimations for the general population and estimate the brand equity of major Indian
e-commerce brands.
5. However, you must not accept every entry you get. You can run some checks to screen out fraudulent entries.
For example, if a person takes only one minute to fill a survey that usually takes 10 minutes, he/she is probably
committing fraud.
However, as much as you would like to believe that you have used stratified random sampling, there actually is a big
chance that the sampling done here is closer to stratified volunteer sampling or to stratified opportunity sampling than
to stratified random sampling. Let’s understand why this is the case.
So, let’s say you used email as the medium for your survey. Once you decided on your quota guide, etc. and sent the
emails, you probably used the survey results to estimate population parameters. That’s the entire process. But where
exactly did you make it a volunteer/opportunity sampling exercise?
For many people, the email could have ended up in the spam folder. If this happened, you would probably not get a
response from them. Now, if all these people happened to fall in a specific general category (such as old people who
don’t understand how to filter spam), then your survey would have ended up being biased.
Another potential source of bias is non-response. Let’s say that out of 80 people, only 40 chose to respond to the email.
In that case, the 40 who did not respond would not be represented in your survey results. Hence, again, if these 40
people happened to disproportionately represent a particular segment (such as people who are digitally less savvy), the
survey results would be biased.
To be able to answer all these questions, you would need to perform A/B testing. Basically, you would divide your current
customer base into four groups, say, group A, group B, group C, and group D. Then, each of the groups would be
subjected to one of the above strategies. For example, group A would get a 20% discount coupon, group C would get an
app reminder, etc. Then, when you got the data for these sample groups, you could use the concepts of hypothesis
testing and sampling to answer the questions asked above.
Well, first of all, you’d need to break up your population into various small segments on the basis of factors such
as the acquisition channel, the frequency of shopping, the payment mode generally used, etc.
Once you break your population into microsegments this way, you can then get sample information for each
segment. Remember that the reason for making these divisions would be to ensure that the sample represents
the population as closely as possible.
Finally, once you make these probably unbiased divisions, you would have your sample. And, once you get the
sample, you can perform A/B testing.
So, if you are creating any product, the process you will follow is given below.
Before you even start with the product development process, you will need to test your concept. This can be
accomplished by asking a few people how they would feel about a web streaming service and if such a service existed,
how often would they use it. How much would they be willing to spend on it? For people who don’t want to use it, what
is the reason for not using it? Will they reconsider their decision if certain features are added to the product?
Once you have the results of this concept test, you can start developing the actual product accordingly. Now, when this
product nears completion, you should have a few people try it out and collect their feedback. Based on their feedback,
you can make some last-minute changes which will help you rectify any mistakes or help you add small features you may
have missed. This process of getting your product checked once before its final development is called pilot testing.
Once this product, i.e., the streaming service, is developed in accordance with the results of the concept test and the
pilot test, you can launch it. However, if you wish to be really careful, there is one last thing you could do — you could
have a few people try this developed product, take their feedback, and make one more round of changes before you
launch the product. This process, where you get your final product checked, is called beta testing.
Hence, once you have conducted a concept test, a pilot test, and preferably, a beta test too, you will be ready to launch
your product. Now, let’s listen to Ujjyaini as she further explains this framework for product development.
Now there is scope for using sampling at various stages of the product development process. For example:
1. In the initial stages, you want to talk to people and figure out if they are interested in a digital payment service.
However, you need to be careful about how you design this survey: the people you talk to should be a mix of
those who are already comfortable with cashless products such as credit cards, etc. and those who are only
comfortable with cash. While interpreting your findings, you need to make sure that each stratum of the society
is represented correctly in the survey and that there are enough people of each type in your survey. If you only
interviewed 20 people who are comfortable with digital payments, you may need to use booster methods.
2. In the final stages, during the pilot testing stage, you will need to use the sampling concepts again. Since this
stage also involves you surveying people and making decisions for the population based on the sample you
surveyed, you will need to stratify your sample accordingly, steer clear of biases, use boosters wherever needed,
etc.
Finally, let’s go through the fourth use case of sampling, i.e., its use in quality control. Quality control is a process
followed at manufacturing sites, where batches of the manufactured product are regularly checked to ensure that they
meet the standards the company would like them to meet. Since it will be very expensive and time-consuming to check
each and every product manufactured, companies typically just check a few randomly selected products and decide for
the entire batch based on that.
Let’s say you’re inspecting a batch of bolts to assess their quality. You decide to check every 1000th bolt to see whether
it meets the desired quality standard. Since all the bolts you inspect turn out to be good, you conclude that
there are no defects in the batch.
However, there is a problem with this approach. What if the 6th product made by the machine is defective, and then,
every 1000th product the machine makes after that is defective? In that case, the defective pieces will have ID numbers
as follows: 6, 1006, 2006, 3006, and so on. However, since you’re only checking every 1000th product, i.e., ID
numbers 1000, 2000, 3000, etc., you will never find the defective piece.
The point is, if the defects occur in a pattern, your best chance of catching them is to select batch numbers
randomly. If your selection follows any trend, you risk aligning with the pattern of the defects and missing them entirely.
Hence, it is always advisable to use a table of random numbers to decide which batches you’re going to inspect.
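In code, a "table of random numbers" is simply a pseudo-random selection of IDs. The sketch below (the function name and numbers are illustrative, not from the text) picks which products to inspect without any fixed interval that a periodic defect pattern could slip through:

```python
import random

def pick_inspection_ids(batch_size, num_checks, seed=None):
    """Randomly select product IDs to inspect, avoiding any fixed
    interval that could align with a periodic defect pattern."""
    rng = random.Random(seed)
    # sample() draws without replacement, so all IDs are distinct
    return sorted(rng.sample(range(1, batch_size + 1), num_checks))

# A systematic check of every 1000th ID (1000, 2000, ...) would never
# see defects at IDs 6, 1006, 2006, ...; a random selection might.
ids = pick_inspection_ids(batch_size=10_000, num_checks=10, seed=42)
print(ids)  # ten distinct IDs spread unpredictably across the batch
```

As the note below says, randomness does not guarantee you catch a patterned defect, but it removes the systematic blind spot.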
Note: Using this table will not guarantee that you detect the defective pieces, but it will make detection more likely.
Inferential Statistics - Additional Resources
Basics of Probability
Probability is a measure of the likelihood of the occurrence of an event.
Probability values range from 0 to 1, where 0 denotes an impossible event and 1 denotes a certain event.
Terminology
1. Trial or experiment: An action whose result is uncertain, e.g., rolling a die or tossing a coin.
2. Event: One or more outcomes of an experiment, e.g., getting an even number on a die roll.
3. Sample space: The set of all possible outcomes of an experiment.
4. Sample point: A single one of the possible outcomes.
Formula
Probability of an event = (Number of favourable outcomes) / (Total number of possible outcomes)
Throwing a die
A die throw can lead to six outcomes: any value between 1 and 6, both included.
The probability of the occurrence of any particular number on the die is 1/6.
Addition Rule
(Probability of event A or event B occurring):
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
2. A bag contains three red balls, five green balls and ten black balls. What is the probability of you getting either a
red ball or a green ball when you randomly draw a ball from the bag?
Solution: The probability of getting a red ball = 3/18
The probability of getting a green ball = 5/18
The probability of getting either a red ball or a green ball = 8/18 = 4/9
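The ball-drawing example above can be checked with exact rational arithmetic. The sketch below uses Python's `fractions` module; since drawing a red ball and drawing a green ball are mutually exclusive, P(red ∩ green) = 0 and the addition rule reduces to a plain sum:

```python
from fractions import Fraction

red, green, black = 3, 5, 10
total = red + green + black  # 18 balls in the bag

p_red = Fraction(red, total)      # 3/18
p_green = Fraction(green, total)  # 5/18

# Addition rule with mutually exclusive events: P(A or B) = P(A) + P(B)
p_red_or_green = p_red + p_green
print(p_red_or_green)  # 4/9
```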
Conditional Probability
Conditional probability is the probability of an event, given that some other event has already occurred.
Conditional probability is particularly relevant when the two events are dependent.
Notation P(B|A): This notation denotes the probability of event ‘B’, given that event ‘A’ has already occurred.
Formula:
P(B | A) = P(B ∩ A) / P(A)
3. A student has applied to a university and has a 50% chance of getting an admission. Also, as per the university
guidelines, 50% of the admitted students will get hostel accommodation. What is the probability of the student
getting hostel accommodation, given that he has been admitted?
Solution: P(Hostel | Admission) = P(Hostel ∩ Admission) / P(Admission)
P(Hostel | Admission) = (0.5 * 0.5) / 0.5
P(Hostel | Admission) = 0.5
4. A bag contains four red balls and five green balls. You draw a ball from it without replacing it. What is the
probability of you drawing a red ball in the first draw and a green ball in the second draw?
Solution: The probability of you drawing a red ball in the first draw is 4/9,
and the probability of you drawing a green ball in the second draw is 5/8.
So, the probability of you drawing a green ball after drawing a red ball is (4/9) * (5/8) = 5/18.
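The same without-replacement calculation, done with exact fractions; the key step is that after the first red ball is removed, only 8 balls remain:

```python
from fractions import Fraction

red, green = 4, 5
total = red + green  # 9 balls

p_red_first = Fraction(red, total)  # 4/9
# Without replacement: one red ball is gone, 8 balls remain, 5 are green.
p_green_given_red = Fraction(green, total - 1)  # 5/8

# Multiplication rule for dependent events: P(A and B) = P(A) * P(B|A)
p_red_then_green = p_red_first * p_green_given_red
print(p_red_then_green)  # 5/18
```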
Bayes' Theorem
Bayes’ theorem describes the probability of an event based on prior knowledge of conditions that might be related to
the event.
If you know the conditional probability P(B|A), you can use Bayes’ rule to find out the reverse probability P(A|B).
Formula:
P(A|B) = [P(B|A) * P(A)] / P(B)
Demonstration
We will now understand the application of Bayes’ theorem through a demonstration:
1. Two bags contain red and green balls. The first bag contains two red and three green balls; the second bag
contains five red and seven green balls. If a green ball is drawn from one of the bags, what is the probability that
it was drawn from the first bag?
Now let's compute the different probabilities needed for solving this using the Bayes' Theorem.
Let A be the event that the first bag is chosen and B be the event that a green ball is chosen.
Therefore,
P(A) = 1/2 [each bag is equally likely to be chosen]
P(B) = Probability of getting a green ball = P(bag 1) * P(green | bag 1) + P(bag 2) * P(green | bag 2)
= (1/2 * 3/5) + (1/2 * 7/12)
= 71/120
P(B|A) = 3/5 [the probability of drawing a green ball from bag 1]
Applying Bayes’ theorem to determine the probability of a green ball being drawn from bag 1, we get:
P(A|B) = [P(B|A)*P(A)]/P(B)
= (3/5 * 1/2 )/ (71/120)
= 36/71
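The whole Bayes calculation for the two-bag problem can be reproduced with exact fractions, which makes the 71/120 and 36/71 results easy to verify:

```python
from fractions import Fraction

p_bag1 = Fraction(1, 2)               # P(A): either bag is equally likely
p_bag2 = Fraction(1, 2)
p_green_given_bag1 = Fraction(3, 5)   # 3 green out of 5 balls in bag 1
p_green_given_bag2 = Fraction(7, 12)  # 7 green out of 12 balls in bag 2

# Total probability of drawing a green ball
p_green = p_bag1 * p_green_given_bag1 + p_bag2 * p_green_given_bag2
print(p_green)  # 71/120

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_bag1_given_green = p_green_given_bag1 * p_bag1 / p_green
print(p_bag1_given_green)  # 36/71
```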
You can learn more about conditional probability and Bayes’ theorem here.
A unique combination of mean (μ) and standard deviation (σ) represents or defines a unique normal distribution. So, to
analyze or compare different normal distributions, you make use of a standardized normal distribution. A standardized
normal distribution is a special type of normal distribution where μ = 0 and σ = 1.
A normal distribution is converted into a standardized normal distribution with the help of the Z score:
Z = (x − μ) / σ
Using this formula, for every value of x (the values on the X-axis), we calculate the corresponding Z
score and plot these Z scores against their respective probabilities on the Y-axis.
For example, for a normal distribution with μ= 35 and σ = 5, the normal distribution curve and the standard normal
distribution curve will look like this:
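Standardization is a one-line computation. The sketch below applies it to the distribution from the text (μ = 35, σ = 5); the function name is just for illustration:

```python
def z_score(x, mu, sigma):
    """Standardize a value from N(mu, sigma) to N(0, 1)."""
    return (x - mu) / sigma

# For the distribution in the text (mu = 35, sigma = 5):
print(z_score(35, 35, 5))  # 0.0 (the mean maps to z = 0)
print(z_score(45, 35, 5))  # 2.0 (two standard deviations above the mean)
```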
Sampling Methods
Population: This refers to the entire data.
Sample: This refers to the part of the population selected by a defined procedure to be representative of the data.
Types of Sampling
Random sampling:
In this kind of sampling, each element of the population has the same probability of getting selected in the
sample.
o Simple random sampling with replacement:
In simple random sampling with replacement, for the creation of a sample size n, you select an
element from the population and then return it to the population. This procedure is repeated n
times. Thus, each element of the population can be selected more than once in a sample. This is
used when the population size is small.
o Simple random sampling without replacement:
In simple random sampling without replacement, for the creation of a sample size n, you select
an element from the population and don’t return it to the population. The selection of elements
from the population is repeated n times. This is used when the population size is large.
o Stratified random sampling:
In stratified random sampling, the population is divided into strata on the basis of common
characteristics. The elements are then selected from these strata.
o Cluster sampling:
In cluster sampling, the population is divided into clusters, and then, a simple random sample of
these clusters is selected.
o Systematic sampling:
In systematic sampling, a starting point is selected in the population, and then, the elements are
selected at regular, fixed intervals.
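The random sampling variants above map directly onto Python's standard library. A minimal sketch, using a small illustrative population of 100 elements:

```python
import random

population = list(range(1, 101))  # an illustrative population of 100 elements
rng = random.Random(7)

# Simple random sampling WITH replacement: an element may be picked twice.
with_replacement = rng.choices(population, k=10)

# Simple random sampling WITHOUT replacement: all elements are distinct.
without_replacement = rng.sample(population, k=10)

# Systematic sampling: a random starting point, then every k-th element.
k = 10
start = rng.randrange(k)
systematic = population[start::k]
print(len(systematic))  # 10 elements at regular, fixed intervals
```

`random.choices` can repeat elements (sampling with replacement), while `random.sample` guarantees distinct picks (sampling without replacement).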
Non-random sampling:
In this kind of sampling, each element of the population does not have the same probability of getting
selected in the sample.
o Convenience sampling:
In convenience sampling, the researcher selects the elements from the population on the basis
of the convenient accessibility of these elements.
o Judgemental sampling:
In judgemental sampling, the researcher selects the elements on the basis of his judgement and
bias.
o Quota sampling:
The population is divided into groups, or quotas, on the basis of which you select the sample.
Quota sampling is, to a certain extent, similar to random sampling; the procedure is more or
less the same in both cases, except that in quota sampling the quota is fixed in advance. That is, you
don't consider the entire population, only a section of it, to fill each quota.
o Snowball sampling:
In the case of snowball sampling, a small sample is first selected, say a sample of five people.
Then, each of the five members can suggest five names, and those five can suggest five more
each. This creates a snowball effect.
What is the difference between stratified random sampling and cluster sampling?
In stratified random sampling, the whole population is divided into strata based on common characteristics, and
then, elements are selected from each stratum. In cluster sampling, on the other hand, the whole population is
divided into clusters, and then, some of the clusters are chosen randomly to create a sample.
Sampling Distribution
Sampling distribution is the probability distribution of a particular sample statistic (such as mean) obtained by
drawing all possible samples of a particular sample size ‘n’ from the population and calculating their statistics.
Important properties:
Mean of the sample means (μx̅) = Mean of the population (μ)
Standard error of the sample means: σx̅ = σ / √n, where n is the sample size of each sample.
So, the normal variate, or Z-score, for the sampling distribution of sample means is:
Z = (x̅ − μ) / (σ / √n)
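These two properties can be verified with a small simulation. The sketch below draws many samples of size n = 50 from an artificial uniform population (the population and sample counts are assumptions for illustration) and checks that the sample means cluster around μ with spread close to σ/√n:

```python
import random
import statistics

rng = random.Random(0)
# An artificial population: 100,000 values uniform on [0, 100]
population = [rng.uniform(0, 100) for _ in range(100_000)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

n = 50  # size of each sample
sample_means = [
    statistics.mean(rng.sample(population, n)) for _ in range(2_000)
]

# The mean of the sample means should be close to the population mean,
# and their spread close to sigma / sqrt(n).
print(round(statistics.mean(sample_means), 1))   # close to mu
print(round(statistics.stdev(sample_means), 2))  # close to sigma / n ** 0.5
```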
Estimation
The process of drawing inferences about a population using the information from its samples is known as estimation.
Types of estimation
1. Point estimate:
Here, a statistic obtained from a sample is used to estimate a population parameter. So, its accuracy
depends on how well the sample represents the population. The population parameters derived from
sample statistics of various samples may vary. This is why interval estimate is preferred to point estimate.
2. Interval estimate:
Here, the lower and upper limits of values (that is, the confidence interval) within which a population
parameter will lie are estimated along with a certain level of confidence.
So, you can say that the population mean μ will lie between:
X̅ − Z(σ/√n) < μ < X̅ + Z(σ/√n)
The formula above is used to calculate the upper and the lower limits of μ for a certain level of confidence (a
certain value of Z), where the value of σ is known.
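For example, the interval can be computed directly from the formula. The numbers below (sample mean, σ, n) are hypothetical, chosen only to illustrate the calculation at a 95% confidence level, for which Z ≈ 1.96:

```python
import math

x_bar = 52.0  # sample mean (hypothetical)
sigma = 6.0   # known population standard deviation (hypothetical)
n = 36        # sample size
z = 1.96      # Z value for a 95% confidence level

margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")  # (50.04, 53.96)
```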
What if the value of σ is not known? In that case, you use the t-distribution.
T-distribution
Properties of T-distribution:
1. It can only be applied when the samples are drawn from a normally distributed population.
2. It is flatter than a normal distribution.
3. It has only one unknown parameter, the population standard deviation, which is estimated from the
sample. Hence, the degrees of freedom for a t-distribution are given by ‘sample size (n) − 1’.
The standard normal variate, or test statistic, for the t-distribution is:
t = (X̅ − μ) / (s / √n)
where ‘s’ is the sample standard deviation.
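The t statistic above differs from the Z-score only in using the sample standard deviation s in place of the unknown σ. A minimal sketch, on a hypothetical sample with a hypothesized population mean of 50:

```python
import math
import statistics

# Hypothetical sample where sigma is unknown
sample = [48.2, 51.7, 50.1, 49.5, 52.3, 47.8, 50.9, 51.2]
mu_0 = 50.0  # hypothesized population mean

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
print(f"t = {t_stat:.3f} with {n - 1} degrees of freedom")
```

Note that `statistics.stdev` already uses the n − 1 divisor, matching the n − 1 degrees of freedom of the t-distribution.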