BITS F464 Machine Learning E-M Algorithm Practice Sheet
Q1. Verstappen’s race in dry conditions is modelled as a mixture of two Gaussians. Given the Gaussian
distribution f(x | µ, σ²) = (1/√(2πσ²)) exp(−½ ((x − µ)/σ)²), with components:
● Quali Mode: µ1 = 87, σ1² = 2, π1 = 0.7
● Fuel-Saving Mode: µ2 = 90, σ2² = 2, π2 = 0.3
Given lap times X = {88, 89, 91}, compute the log-likelihood.
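A minimal NumPy sketch for checking the Q1 computation by hand (illustrative, not part of the exercise; it treats the σ² values above as variances):
```python
import numpy as np

# Mixture parameters from Q1 (variances, not standard deviations)
mu = np.array([87.0, 90.0])       # component means
var = np.array([2.0, 2.0])        # component variances sigma^2
pi = np.array([0.7, 0.3])         # mixing weights

X = np.array([88.0, 89.0, 91.0])  # observed lap times

def gaussian_pdf(x, mu, var):
    """Gaussian density N(x | mu, sigma^2)."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Log-likelihood: sum over laps of log(sum_k pi_k * N(x | mu_k, var_k))
log_lik = sum(np.log(np.sum(pi * gaussian_pdf(x, mu, var))) for x in X)
print(log_lik)
```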
Q2. Drivers adjust their driving styles based on tyre wear. Given initial means µ1 = 88 & µ2 = 92,
computed responsibilities γ(z_i1) = {0.8, 0.5, 0.2} & γ(z_i2) = {0.2, 0.5, 0.8}, and lap times
X = {87, 90, 94}, update the mean values using the Maximisation step.
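A short sketch of the Q2 M-step (illustrative; each new mean is the responsibility-weighted average of the data):
```python
import numpy as np

X = np.array([87.0, 90.0, 94.0])
gamma = np.array([[0.8, 0.2],
                  [0.5, 0.5],
                  [0.2, 0.8]])   # responsibilities gamma(z_ik), one row per lap

# M-step mean update: mu_k = sum_i gamma_ik * x_i / sum_i gamma_ik
mu_new = gamma.T @ X / gamma.sum(axis=0)
print(mu_new)   # updated mu1, mu2
```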
Q3. In a race, Scuderia Ferrari’s lap times follow a Gaussian with µ1 = 91 & σ1² = 1, and Red Bull Racing is
modelled with µ2 = 88 & σ2² = 1. Given observed lap times X = {89, 92, 87} and π1 = π2 = 0.5,
compute the posterior probability that each lap was set by Scuderia Ferrari.
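A SciPy sketch for Q3 (illustrative; scipy.stats.norm takes the standard deviation as scale, which is 1 here since σ² = 1):
```python
import numpy as np
from scipy.stats import norm

X = np.array([89.0, 92.0, 87.0])
p_ferrari = 0.5 * norm.pdf(X, loc=91, scale=1)   # pi_1 * N(x | mu_1, 1)
p_redbull = 0.5 * norm.pdf(X, loc=88, scale=1)   # pi_2 * N(x | mu_2, 1)

# Bayes' rule: posterior responsibility of the Ferrari component per lap
posterior = p_ferrari / (p_ferrari + p_redbull)
print(posterior)
```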
Q4. Toro Rosso’s pit stop errors are modelled using a Poisson Mixture Model. Given the Poisson
distribution P(X = x | λ) = e^(−λ) λ^x / x!, there are two types of errors:
● Minor Errors: λ1 = 1 per pit stop
● Major Errors: λ2 = 4 per pit stop
Given pit stop error counts X = {1, 2, 0, 5, 3, 1} with initial π1 = 0.7 & π2 = 0.3, compute the expected
assignment probabilities for each pit stop.
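A sketch of the Q4 E-step using scipy.stats.poisson (illustrative only):
```python
import numpy as np
from scipy.stats import poisson

X = np.array([1, 2, 0, 5, 3, 1])
pi1, pi2 = 0.7, 0.3
lam1, lam2 = 1.0, 4.0

w1 = pi1 * poisson.pmf(X, lam1)   # weight of the minor-error component
w2 = pi2 * poisson.pmf(X, lam2)   # weight of the major-error component

gamma1 = w1 / (w1 + w2)           # P(minor | x_i)
gamma2 = 1 - gamma1               # P(major | x_i)
print(gamma1, gamma2)
```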
Q5. In a wet Monaco Grand Prix, we suspect two types of drivers: Rain Masters and Strugglers. Given
the observed lap times (in s) X = {75.2, 76.1, 80.5, 81.3, 82.9}, assume a Gaussian Mixture Model
(GMM) with two components initialised with µ1 = 76, µ2 = 81, σ1 = σ2 = 1 & π1 = π2 = 0.5.
Compute the initial responsibilities for each lap time.
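A sketch of the Q5 initial E-step (illustrative; here σ = 1 is a standard deviation, so scale=1):
```python
import numpy as np
from scipy.stats import norm

X = np.array([75.2, 76.1, 80.5, 81.3, 82.9])

# equal weights, unit standard deviation
w1 = 0.5 * norm.pdf(X, loc=76, scale=1)   # Rain Masters
w2 = 0.5 * norm.pdf(X, loc=81, scale=1)   # Strugglers

resp = np.column_stack([w1, w2])
resp /= resp.sum(axis=1, keepdims=True)   # normalise rows to get gamma(z_ik)
print(resp)
```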
Q6. Answer briefly: Explain the two main steps of the Expectation-Maximization (EM) algorithm. Why is
the E-step necessary, and what does it compute? What happens in the M-step, and how does it update
parameters?
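To make the two steps in Q6 concrete, here is a bare-bones EM loop for a two-component 1-D GMM (a sketch of the general pattern, not a prescribed implementation):
```python
import numpy as np

def em_gmm_1d(X, mu, var, pi, n_iter=50):
    """Plain EM for a two-component 1-D Gaussian mixture (illustrative)."""
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = np.exp(-0.5 * (X[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from responsibility-weighted statistics
        Nk = resp.sum(axis=0)
        mu = (resp * X[:, None]).sum(axis=0) / Nk
        var = (resp * (X[:, None] - mu) ** 2).sum(axis=0) / Nk
        pi = Nk / len(X)
    return mu, var, pi
```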
Q7. F1 teams receive funding from known sponsors and unknown investors. We model the funding
distribution using a Gamma Mixture Model. Given the Gamma distribution
f(x; α, β) = (β^α / Γ(α)) x^(α−1) e^(−βx), with components:
● Sponsor Money: Gamma(α1 = 3, β1 = 1)
● Hidden Investors: Gamma(α2 = 5, β2 = 2)
Given observed funding amounts (in million $) X = {3, 4, 6, 8, 2, 7, 5} with π1 = 0.6 & π2 = 0.4,
compute the posterior probability that the $6M observation came from hidden investors.
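A sketch for Q7 (illustrative; note that scipy.stats.gamma is parameterised by shape a and scale = 1/β):
```python
from scipy.stats import gamma

x = 6.0
pi1, pi2 = 0.6, 0.4

p_sponsor  = pi1 * gamma.pdf(x, a=3, scale=1 / 1)   # Gamma(alpha=3, beta=1)
p_investor = pi2 * gamma.pdf(x, a=5, scale=1 / 2)   # Gamma(alpha=5, beta=2)

# Posterior responsibility of the hidden-investor component at x = 6
posterior_investor = p_investor / (p_sponsor + p_investor)
print(posterior_investor)
```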
Q8. Explain in short:
1. Why does the EM algorithm always increase the likelihood in each iteration?
2. Can EM guarantee convergence? If so, what does it converge to?
Q9. A team models the risk of spinning out in wet conditions using Beta distributions. Given the Beta
distribution f(x; α, β) = (Γ(α+β) / (Γ(α)Γ(β))) x^(α−1) (1 − x)^(β−1), with components:
● Careful Drivers: α1 = 4 & β1 = 6
● Aggressive Drivers: α2 = 8 & β2 = 2
Observed spin counts (per 10 laps): X = {1, 2, 3, 0, 2, 1}. Compute the updated Beta parameters after one
E-M step.
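A sketch for Q9 under three loudly-labelled assumptions not fixed by the question: counts per 10 laps are converted to fractions in (0, 1) so the Beta density applies, the mixing weights start at 0.5 each, and the M-step uses a method-of-moments update (the Beta maximum-likelihood update has no closed form):
```python
import numpy as np
from scipy.stats import beta

# Assumption: convert counts per 10 laps to spin fractions in (0, 1),
# nudging 0 away from the boundary so the Beta density is defined.
counts = np.array([1, 2, 3, 0, 2, 1])
X = np.clip(counts / 10.0, 1e-3, 1 - 1e-3)

params = [(4, 6), (8, 2)]          # (alpha, beta) for careful / aggressive
pi = np.array([0.5, 0.5])          # assumed equal mixing weights

# E-step: responsibilities under the current Beta components
dens = np.column_stack([beta.pdf(X, a, b) for a, b in params])
resp = pi * dens
resp /= resp.sum(axis=1, keepdims=True)

# M-step (method-of-moments approximation): match the weighted mean
# and variance of each component.
for k in range(2):
    w = resp[:, k] / resp[:, k].sum()
    m = np.sum(w * X)
    v = np.sum(w * (X - m) ** 2)
    common = m * (1 - m) / v - 1
    print(f"component {k}: alpha={m * common:.3f}, beta={(1 - m) * common:.3f}")
```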
Q10. A company has two types of defective machines, Type A and Type B. When a machine fails, the
company records the failure but does not know which type of machine failed.
● Type A fails with probability pA and Type B fails with probability pB.
● A randomly chosen machine is of Type A with probability λ and of Type B with probability 1 − λ.
(a) Why is the Expectation-Maximization (E-M) algorithm suitable for estimating 𝑝𝐴, 𝑝𝐵 and λ given this
incomplete data?
(b) What challenges might arise when using the E-M algorithm for this problem?
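One concrete way to see why EM fits part (a): a sketch assuming each machine is observed for n trials and we record only its failure count, not its type (the question leaves the observation model unspecified, so the data below are purely hypothetical):
```python
import numpy as np
from scipy.stats import binom

# Hypothetical data: failures per machine over n trials, machine type unknown
n = 10
fails = np.array([1, 2, 7, 8, 1, 9, 2, 6])

lam, pA, pB = 0.5, 0.2, 0.8   # initial guesses

for _ in range(100):
    # E-step: posterior probability each machine is Type A
    wA = lam * binom.pmf(fails, n, pA)
    wB = (1 - lam) * binom.pmf(fails, n, pB)
    gA = wA / (wA + wB)

    # M-step: responsibility-weighted re-estimates
    lam = gA.mean()
    pA = np.sum(gA * fails) / (n * np.sum(gA))
    pB = np.sum((1 - gA) * fails) / (n * np.sum(1 - gA))

print(lam, pA, pB)
```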