Chapter 3: Fundamentals of Data Analysis Statistics
CONTINUOUS DISTRIBUTIONS
Outline
• Continuous Distributions
• Uniform Distribution
Continuous Distributions
• The distributions we have considered so far have all been
discrete
• What if a random variable can take any value in a
continuous range?
• This type of RV is called a continuous RV
• How many mL of soda are put in a bottle?
Continuous Distributions
• To define a distribution of a discrete random variable we
need a list of all possible values it can take on and the
corresponding probabilities
• This is a little more tricky for continuous RVs
• First let’s define the values it can take
• For a continuous RV this is some continuous range
– [0,1] or (-3,15) or (−∞, ∞)
Continuous Distributions
• The next thing we need is some notion of probability
• For continuous distributions we need a function
– Like we had for binomial or Poisson distributions…
– Let’s call that function 𝑓(𝑥)
• A random variable, $X$, is described by the probability density function (PDF), $f(x)$, if
– $f(x) \ge 0 \;\; \forall x$
– $\int_{-\infty}^{\infty} f(x)\,dx = 1$
– $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
• If the range of possible values is not $(-\infty, \infty)$, simply define $f(x) = 0$ outside the range
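The conditions in this definition can be checked numerically for a concrete density. The sketch below is only an illustration: the density f(x) = 2x on [0, 1] is a hypothetical example that does not come from the slides, and the midpoint rule is one simple way to approximate an integral.

```python
def f(x):
    """Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, lo, hi, n=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * width) for i in range(n)) * width


# f is zero outside [0, 1], so integrating over [0, 1] covers the whole real line.
total = integrate(f, 0.0, 1.0)   # should be close to 1
prob = integrate(f, 0.5, 1.0)    # P(0.5 <= X <= 1), exact value 0.75

print(total, prob)
```

Because this density is zero outside [0, 1], the integral over the whole real line reduces to an integral over [0, 1].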
Continuous Distributions
• What is $P(X = 500)$?
• What is $P(499.5 \le X \le 500.5)$?
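Both questions can be approached through the integral definition of probability. As a worked sketch, valid for any PDF $f$:

```latex
P(X = 500) = \int_{500}^{500} f(x)\,dx = 0,
\qquad
P(499.5 \le X \le 500.5) = \int_{499.5}^{500.5} f(x)\,dx .
```

A single point always has probability zero, while an interval can have positive probability. This is why continuous RVs are described through probabilities of ranges rather than of individual values.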
Continuous Distributions
• We can also define the cumulative distribution function (CDF)
– $F(x) = P(X \le x) = \int_{-\infty}^{x} f(s)\,ds$
– $P(a \le X \le b) = F(b) - F(a)$
– $f(x) = \frac{d}{dx} F(x)$
• The CDF is also defined for discrete random variables but it
isn’t used as frequently because you have to be very careful
about ≤ versus <
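The relationships between the PDF and the CDF can be illustrated with a small sketch. The density f(x) = 2x on [0, 1] and its CDF F(x) = x^2 are hypothetical examples chosen for illustration, not taken from the slides.

```python
def f(x):
    """Hypothetical PDF for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def F(x):
    """CDF of the density above: the integral of f from -infinity to x."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x


a, b = 0.2, 0.7
prob = F(b) - F(a)                       # P(a <= X <= b)

# A central finite difference approximates dF/dx, which should match f(x).
x, h = 0.4, 1e-6
slope = (F(x + h) - F(x - h)) / (2 * h)

print(prob, slope, f(x))
```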
Continuous Distributions
• How do we get the expectation and variance of a continuous RV?
• For a discrete RV we multiplied probabilities by the values the RV can
take and added them up
– What is the continuous equivalent to adding?
• $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$
• $\operatorname{Var}(X) = \int_{-\infty}^{\infty} \left(x - E[X]\right)^2 f(x)\,dx$
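As a rough numerical sketch of these two integrals, the code below again assumes the hypothetical density f(x) = 2x on [0, 1] (an illustration only, not from the slides) and approximates both integrals with the midpoint rule.

```python
def f(x):
    """Hypothetical PDF for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, lo, hi, n=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * width) for i in range(n)) * width


# f vanishes outside [0, 1], so the infinite integrals reduce to [0, 1].
mean = integrate(lambda x: x * f(x), 0.0, 1.0)                # exact value: 2/3
var = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, 1.0)   # exact value: 1/18

print(mean, var)
```

The integral plays the role that the sum played for discrete RVs: each value x is weighted by how much probability density sits near it.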
Uniform Distribution
• The simplest continuous distribution is the uniform
distribution
• The uniform distribution is
– $f(x) = \begin{cases} 1, & \text{if } 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
• A uniform random variable only takes values in [0,1]
• All values in that range are equally likely
• What is $F(x)$?
Uniform Distribution
• What is the expectation of a uniform RV?
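– $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{1}{2}$
• What is the variance of a uniform RV?
– $\operatorname{Var}(X) = \int_{-\infty}^{\infty} \left(x - \frac{1}{2}\right)^2 f(x)\,dx = \frac{1}{12}$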
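One way to reach these two values is to use the fact that $f(x) = 1$ on $[0, 1]$ and $0$ elsewhere, so both integrals reduce to $[0, 1]$. A worked sketch:

```latex
E[X] = \int_0^1 x \cdot 1\,dx = \left[\tfrac{x^2}{2}\right]_0^1 = \tfrac{1}{2},
\qquad
\operatorname{Var}(X) = \int_0^1 \left(x - \tfrac{1}{2}\right)^2 dx
  = \left[\tfrac{(x - 1/2)^3}{3}\right]_0^1
  = \tfrac{1}{24} + \tfrac{1}{24} = \tfrac{1}{12}.
```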
Uniform Distribution
• We can extend the definition of a uniform RV to any interval,
U(a,b)
– $f(x) = \begin{cases} \frac{1}{b-a}, & \text{if } a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
• Suppose $X \sim U(0,1)$, then $Y = a + (b - a)X$ is a U(a,b) random variable!
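The rescaling from U(0,1) to U(a,b) can be tried out with a short simulation. This is a minimal sketch: the endpoints a = 2 and b = 5, the seed, and the helper name `uniform_ab` are arbitrary choices made for illustration.

```python
import random


def uniform_ab(a, b, rng):
    """Draw one U(a, b) value by rescaling a U(0, 1) draw: Y = a + (b - a) X."""
    x = rng.random()           # X ~ U(0, 1); the draw lies in [0, 1)
    return a + (b - a) * x


rng = random.Random(0)         # fixed seed so the run is repeatable
a, b = 2.0, 5.0                # arbitrary illustrative endpoints
draws = [uniform_ab(a, b, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
print(min(draws), max(draws), sample_mean)
```

Every draw should fall between a and b, and the sample mean should land close to the midpoint of the interval.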
Uniform Distribution
• I will leave it to you to derive the CDF, expectation and variance of a general uniform
RV
• $F(x) = \begin{cases} 0, & x < a \\ \frac{x-a}{b-a}, & a \le x \le b \\ 1, & x > b \end{cases}$
• $E[X] = \frac{a+b}{2}$
• $\operatorname{Var}(X) = \frac{1}{12}(b-a)^2$
Uniform Distribution
• Suppose the volume of soda in a bottle, in mL, is a U(495,505) RV.
What is the probability that a bottle contains between 499 and 501 mL?
• What is the standard deviation of the volume in a bottle?
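These two questions can be worked through with the U(a,b) formulas for the CDF and the variance. The sketch below is one way to do the arithmetic; the helper name `uniform_cdf` is introduced here for illustration only.

```python
import math

a, b = 495.0, 505.0            # U(495, 505) from the soda example


def uniform_cdf(x, a, b):
    """CDF of U(a, b): 0 below a, (x - a)/(b - a) on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


prob = uniform_cdf(501.0, a, b) - uniform_cdf(499.0, a, b)   # P(499 <= X <= 501)
sd = math.sqrt((b - a) ** 2 / 12.0)                          # square root of Var(X)

print(prob, sd)
```

The probability works out to 2/10 = 0.2, and the standard deviation to 10/sqrt(12), which is about 2.89 mL.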
NORMAL DISTRIBUTIONS
Outline
• Mean and Variance
• Normal Distribution
– Definition
– Properties
• Examples – in the next video
Mean and Variance
• Suppose 𝑋 is any random variable with
– $E[X] = \mu$ and $\operatorname{Var}(X) = \sigma^2$
• If $a$ and $b$ are 2 numbers then
• $E[aX + b] = a\mu + b$
• $\operatorname{Var}(aX + b) = a^2 \sigma^2$
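These two rules can be checked on a small data set by treating it as a discrete RV in which every value is equally likely. The numbers and the constants a and b below are made up for illustration.

```python
import statistics

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]   # made-up values, for illustration only
a, b = 3.0, -1.0                         # arbitrary constants

mu = statistics.fmean(data)              # mean of the equally weighted values
var = statistics.pvariance(data)         # population variance (divides by n)

transformed = [a * x + b for x in data]
mu_t = statistics.fmean(transformed)
var_t = statistics.pvariance(transformed)

print(mu_t, a * mu + b)       # the two means should agree
print(var_t, a ** 2 * var)    # the two variances should agree
```

The shift b moves the mean but drops out of the variance, and the scale a enters the variance squared.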
Normal Distribution
• The normal distribution is one of the most important
distributions in all of statistics!
• It is also called the Gaussian distribution
• Later when we combine data and probability the
normal distribution will play a huge role in analyzing
averages through the central limit theorem
Normal Distribution
• A random variable, X, that follows a
normal distribution has the
following PDF
– $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
– 𝜇 and 𝜎 are parameters (just like
Poisson)
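The density formula translates directly into code. The sketch below writes it out by hand and compares it with `statistics.NormalDist` from Python's standard library; the parameter values mu = 500 and sigma = 2 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu, sigma = 500.0, 2.0         # illustrative parameter values (assumed)
for x in (496.0, 500.0, 503.0):
    # Compare the hand-written formula with the standard library's version.
    print(x, normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))
```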
Normal Distribution
• The normal distribution follows a bell curve!
• Any data that has a bell curve can be approximated
pretty well using the normal distribution!!!
• The only thing you have to do is find the 2 parameters
– We’ll talk more about this later
Normal Distribution
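• What are the mean and variance of a normal random variable?
– $E[X] = \mu$
– $\operatorname{Var}(X) = \sigma^2$
– $\operatorname{Sd}(X) = \sigma$
• If we have data that looks like it has a bell curve we can
approximate it with a normal by setting
– $\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $\hat{\sigma} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(X_i - \hat{\mu}\right)^2}$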
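A minimal sketch of this estimation step, using Python's standard library. The measurements are made-up values for illustration; `statistics.stdev` uses the n - 1 divisor, so it returns the sample standard deviation.

```python
import statistics

# Made-up measurements (mL), for illustration only.
data = [498.2, 501.1, 499.7, 500.4, 502.0, 497.9, 500.8, 499.5]

mu_hat = statistics.mean(data)       # (1/n) * sum of X_i
sigma_hat = statistics.stdev(data)   # sqrt of (1/(n-1)) * sum of (X_i - mu_hat)^2

print(mu_hat, sigma_hat)
```

The fitted normal would then be N(mu_hat, sigma_hat), with the second parameter read as a standard deviation.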
Normal Distribution
• The mean parameter, 𝜇, just shifts the distribution
left and right
• The standard deviation parameter, 𝜎, makes the
distribution wider or narrower
• What is the CDF of a normal?
– Unfortunately there isn’t a closed-form formula for this
– We will approximate it with R in the next video
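The slides use R for this step. As a stand-in, here is a Python sketch (the choice of language is an assumption of this example) that evaluates the normal CDF through the error function `math.erf`, which is itself computed numerically.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma), written with the error function erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


print(normal_cdf(0.0))                      # half the mass lies below the mean
print(normal_cdf(1.0), normal_cdf(-1.0))    # symmetric: these two sum to 1
```

In R, the function `pnorm` plays the same role.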
Normal Distribution
• Suppose $X \sim N(\mu, \sigma)$
– What is $Z = \frac{X - \mu}{\sigma}$?
• Suppose there are 2 independent random variables
– $X \sim N(\mu_1, \sigma_1)$
– $Y \sim N(\mu_2, \sigma_2)$
– What is $W = X + Y$?
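Both questions can be approached with the mean and variance rules for $aX + b$. The sketch below also relies on two standard facts not proved in these slides: a linear function of a normal RV is normal, and a sum of independent normal RVs is normal. The second parameter of $N(\cdot, \cdot)$ is read as the standard deviation, as in the slides.

```latex
Z = \frac{X - \mu}{\sigma} = \frac{1}{\sigma} X - \frac{\mu}{\sigma}
\;\Rightarrow\;
E[Z] = \frac{\mu}{\sigma} - \frac{\mu}{\sigma} = 0,
\quad
\operatorname{Var}(Z) = \frac{1}{\sigma^2}\,\sigma^2 = 1,
\quad\text{so } Z \sim N(0, 1).

W = X + Y
\;\Rightarrow\;
E[W] = \mu_1 + \mu_2,
\quad
\operatorname{Var}(W) = \sigma_1^2 + \sigma_2^2,
\quad\text{so } W \sim N\!\left(\mu_1 + \mu_2,\ \sqrt{\sigma_1^2 + \sigma_2^2}\right).
```

Independence is what allows the variances to add. The standard deviations themselves do not add.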
Normal Distribution
• Suppose $X \sim N(\mu, \sigma)$
– $P(\mu - \sigma \le X \le \mu + \sigma) \approx 68\%$
– $P(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx 95\%$
– $P(\mu - 3\sigma \le X \le \mu + 3\sigma) \approx 99.7\%$
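These three percentages can be reproduced from the standard normal CDF. The sketch below uses `statistics.NormalDist` from Python's standard library; standardizing X shows the result does not depend on the particular mu and sigma.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal, mu = 0 and sigma = 1

# P(mu - k*sigma <= X <= mu + k*sigma) equals P(-k <= Z <= k) for any mu, sigma.
probs = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, prob in probs.items():
    print(k, round(prob, 4))
```

The values come out at roughly 0.6827, 0.9545 and 0.9973, so the percentages in the rule are rounded.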
Normal Distribution
• Suppose we can get the CDF, $F(x)$ (we will with R)
– How do we calculate $P(a \le X \le b)$?
– $F(b) - F(a)$
• Alternatively, suppose I told you there is a number, $x$,
such that $P(X \le x) = p$ where $p$ is a given number
– How would we find $x$?
– $x = F^{-1}(p)$
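Both calculations can be sketched in Python with `statistics.NormalDist`, used here as a stand-in for the R functions the slides refer to. The parameters mu = 500 and sigma = 2, the interval, and p = 0.95 are assumptions chosen for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=500.0, sigma=2.0)   # illustrative parameters (assumed)

a, b = 499.0, 501.0
prob = X.cdf(b) - X.cdf(a)            # P(a <= X <= b) = F(b) - F(a)

p = 0.95
x = X.inv_cdf(p)                      # x = F^{-1}(p), here the 95th percentile

print(prob, x, X.cdf(x))              # X.cdf(x) should recover p
```

In R, `pnorm` gives the CDF and `qnorm` gives its inverse.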