SVKM's NMIMS Centre for Distance and
Online Education (NCDOE)
Course: Quantitative Methods - I
Internal Assignment Applicable for September 2025 Examination
Assignment Marks: 30
Question 1 (10 Marks)
INTRODUCTION
Probability distributions serve as fundamental tools in statistical analysis, enabling
researchers and analysts to model real-world phenomena and make informed
predictions about rare events in large populations. When dealing with rare
occurrences in substantial populations, the Poisson distribution emerges as a
powerful approximation to the binomial distribution, particularly when the
probability of success is small and the sample size is large. This mathematical
framework becomes invaluable for analyzing situations such as disease outbreaks,
equipment failures, or other infrequent events that occur independently across a
population.
In this particular scenario, we examine a rare event occurring with probability 0.0004
per individual annually within a city population of 20,000 residents. The analysis
requires applying the Poisson approximation to determine specific probability
ranges and evaluating district-level occurrences, demonstrating practical
applications of statistical modeling in public health monitoring and policy planning.
This approach illustrates how theoretical probability concepts translate into
meaningful insights for municipal administrators and health officials.
CONCEPTS AND APPLICATION
Justification for Poisson Approximation
The Poisson approximation to the binomial distribution is appropriate when:
- n is large (n ≥ 30)
- p is small (p ≤ 0.1)
- np is moderate (typically between 1 and 10)
In our case:
- n = 20,000 (large)
- p = 0.0004 (small)
- np = 20,000 × 0.0004 = 8 (moderate)
These conditions satisfy the requirements for Poisson approximation with λ = np = 8.
Part (a): Probability Calculation for 10-15 Occurrences
For a Poisson distribution with λ = 8, we need P(10 ≤ X ≤ 15).
Using the Poisson probability mass function:
P(X = k) = (e^(-λ) × λ^k) / k!
Calculating individual probabilities:
P(X = 10) = (e^(-8) × 8^10) / 10! = (0.000335 × 1,073,741,824) / 3,628,800 = 0.0993
P(X = 11) = (e^(-8) × 8^11) / 11! = (0.000335 × 8,589,934,592) / 39,916,800 = 0.0722
P(X = 12) = (e^(-8) × 8^12) / 12! = (0.000335 × 68,719,476,736) / 479,001,600 = 0.0481
P(X = 13) = (e^(-8) × 8^13) / 13! = (0.000335 × 549,755,813,888) / 6,227,020,800 = 0.0296
P(X = 14) = (e^(-8) × 8^14) / 14! = (0.000335 × 4,398,046,511,104) / 87,178,291,200 = 0.0169
P(X = 15) = (e^(-8) × 8^15) / 15! = (0.000335 × 35,184,372,088,832) / 1,307,674,368,000 = 0.0090
Therefore:
P(10 ≤ X ≤ 15) = 0.0993 + 0.0722 + 0.0481 + 0.0296 + 0.0169 + 0.0090 = 0.2751
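As a quick cross-check of the figures above, the short Python sketch below (assuming SciPy is available; variable names are illustrative) computes P(10 ≤ X ≤ 15) from the Poisson CDF and compares it with the exact binomial probability for n = 20,000 and p = 0.0004.

```python
# Cross-check of Part (a): Poisson approximation vs. exact binomial (illustrative sketch)
from scipy.stats import poisson, binom

lam = 20000 * 0.0004                                     # lambda = np = 8
p_poisson = poisson.cdf(15, lam) - poisson.cdf(9, lam)   # P(10 <= X <= 15), ~0.2751
p_binom = binom.cdf(15, 20000, 0.0004) - binom.cdf(9, 20000, 0.0004)  # exact value, nearly identical

print(f"Poisson approximation: {p_poisson:.4f}")
print(f"Exact binomial:        {p_binom:.4f}")
```

The near agreement between the two values illustrates why the Poisson approximation is justified here.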
Part (b): District-Level Analysis
Each district has 5,000 individuals (20,000 ÷ 4 = 5,000).
For each district, λ_district = 5,000 × 0.0004 = 2
We need the probability that at least one district records at least 5 occurrences.
First, calculate P(X ≥ 5) for a single district with λ = 2:
P(X ≥ 5) = 1 - P(X ≤ 4)
Calculating P(X ≤ 4):
P(X = 0) = e^(-2) = 0.1353
P(X = 1) = 2e^(-2) = 0.2707
P(X = 2) = 2²e^(-2)/2! = 0.2707
P(X = 3) = 2³e^(-2)/3! = 0.1804
P(X = 4) = 2⁴e^(-2)/4! = 0.0902
P(X ≤ 4) = 0.1353 + 0.2707 + 0.2707 + 0.1804 + 0.0902 = 0.9473
Therefore, P(X ≥ 5) = 1 - 0.9473 = 0.0527 for each district.
The probability that no district records 5 or more occurrences:
P(all districts < 5) = (0.9473)^4 = 0.8054
Therefore, the probability that at least one district records at least 5 occurrences:
P(at least one district ≥ 5) = 1 - 0.8054 = 0.1946
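The district-level figures can be reproduced with the same library (again an illustrative sketch assuming SciPy; independence of the four districts is assumed, exactly as in the hand calculation):

```python
# Part (b): probability that at least one of four independent districts records >= 5 cases
from scipy.stats import poisson

lam_district = 5000 * 0.0004               # lambda = 2 per district
p_below_5 = poisson.cdf(4, lam_district)   # P(X <= 4) for one district, ~0.9473
p_at_least_one = 1 - p_below_5 ** 4        # 1 - P(all four districts stay below 5), ~0.1946

print(f"P(X <= 4) per district:        {p_below_5:.4f}")
print(f"P(at least one district >= 5): {p_at_least_one:.4f}")
```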
CONCLUSION
The Poisson approximation proves highly effective for analyzing rare events in large
populations. Our calculations demonstrate that there is approximately a 27.51%
probability of observing between 10 and 15 cases citywide in any given year, while
there is roughly a 19.46% probability that at least one district will experience 5 or
more cases. These insights enable public health officials to establish appropriate
monitoring thresholds and allocate resources effectively across districts based on
statistically informed expectations.
Question 2 (10 Marks)
INTRODUCTION
Quality control in manufacturing represents a critical application of statistical
inference, where sample data guides strategic decisions about product
specifications and customer satisfaction guarantees. The relationship between
sample statistics and population parameters enables companies to make confident
claims about product performance while managing risk exposure. In this scenario,
we examine a light bulb manufacturer's challenge to ensure that at least 90% of
their products exceed a specified lifetime threshold, utilizing both confidence
interval analysis and hypothesis testing methodologies.
This analysis demonstrates how normal distribution properties, combined with
central limit theorem applications, provide robust frameworks for quality assurance
decisions. The integration of sample-based inference with business objectives
illustrates practical statistical decision-making in manufacturing environments.
CONCEPTS AND APPLICATION
Given Information Analysis
• Population standard deviation: σ = 50 hours
• Sample size: n = 35 bulbs
• Sample mean: x̄ = 1200 hours
• Requirement: At least 90% of bulbs must last more than 1100 hours
Minimum Mean Lifetime Calculation
For at least 90% of bulbs to last more than 1100 hours, we need P(X > 1100) ≥ 0.90.
The boundary case P(X > 1100) = 0.90 identifies the smallest mean that meets the requirement.
This means P(X ≤ 1100) = 0.10
Using the standard normal distribution:
P(Z ≤ (1100 - μ)/σ) = 0.10
From standard normal tables, the 10th percentile corresponds to Z = -1.282.
Therefore:
(1100 - μ)/50 = -1.282
1100 - μ = -64.1
μ = 1100 + 64.1 = 1164.1 hours
The minimum mean lifetime that satisfies the requirement is 1164.1 hours; any mean at or above this value keeps at least 90% of bulbs beyond 1100 hours.
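A minimal sketch of this inverse-normal step, assuming Python with SciPy (names such as mu_min are illustrative):

```python
# Minimum mean lifetime so that 90% of bulbs exceed 1,100 hours
from scipy.stats import norm

sigma, threshold = 50, 1100
z10 = norm.ppf(0.10)               # 10th percentile of the standard normal, ~ -1.2816
mu_min = threshold - z10 * sigma   # ~ 1164.1 hours
print(f"Minimum required mean: {mu_min:.1f} hours")
```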
Sample Support Analysis
To test if the sample supports this claim, we construct a 95% confidence interval for
the population mean:
Standard error: SE = σ/√n = 50/√35 = 8.452
Margin of error: ME = 1.96 × 8.452 = 16.566
Confidence interval: 1200 ± 16.566 = (1183.434, 1216.566)
Since the minimum required mean (1164.1 hours) lies below the lower bound of the
confidence interval (1183.434 hours), the sample evidence supports the claim that
the population mean lifetime exceeds the required threshold.
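The confidence-interval arithmetic can be reproduced as follows (a sketch assuming SciPy; the 1.96 multiplier comes from the 97.5th percentile of the standard normal):

```python
# 95% confidence interval for the mean lifetime (sigma known)
import math
from scipy.stats import norm

sigma, n, x_bar = 50, 35, 1200
se = sigma / math.sqrt(n)                        # ~ 8.452
z = norm.ppf(0.975)                              # ~ 1.96
lower, upper = x_bar - z * se, x_bar + z * se
print(f"95% CI: ({lower:.3f}, {upper:.3f})")     # ~ (1183.43, 1216.57)
```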
Verification Through Hypothesis Testing
H₀: μ ≤ 1164.1 (claim is not supported)
H₁: μ > 1164.1 (claim is supported)
Test statistic (a z-test, since σ is known):
Z = (x̄ - μ₀)/(σ/√n) = (1200 - 1164.1)/(50/√35) = 35.9/8.452 = 4.248
For α = 0.05 and this one-tailed test, the critical value is 1.645.
Since 4.248 > 1.645, we reject H₀ and conclude that the sample strongly supports the
company's ability to meet the 90% requirement.
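For completeness, the same test in code (an illustrative sketch assuming SciPy; the one-tailed p-value confirms the rejection of H₀):

```python
# One-tailed z-test of H0: mu <= 1164.1 against H1: mu > 1164.1
import math
from scipy.stats import norm

z_stat = (1200 - 1164.1) / (50 / math.sqrt(35))   # ~ 4.25
p_value = 1 - norm.cdf(z_stat)                    # ~ 0.00001, far below alpha = 0.05
print(f"z = {z_stat:.3f}, one-tailed p-value = {p_value:.6f}")
```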
CONCLUSION
The statistical analysis confirms that a mean lifetime of at least 1164.1 hours
ensures that at least 90% of light bulbs exceed the 1100-hour threshold. The sample
data, with a mean of 1200 hours, provides strong evidence supporting the
company's capability to meet this standard. The confidence interval analysis and
hypothesis testing both validate this conclusion, enabling the company to proceed
confidently with their quality guarantee while maintaining appropriate risk
management.
Question 3(A) (5 Marks)
INTRODUCTION
Hypothesis testing serves as a fundamental decision-making tool in business
research, enabling organizations to validate claims and make evidence-based
strategic decisions. In quality management and customer satisfaction assessment,
statistical testing provides objective frameworks for evaluating whether
organizational claims align with empirical evidence. This scenario examines a
bakery's assertion about customer satisfaction levels, demonstrating how sample
data can either support or refute management claims through rigorous statistical
analysis.
CONCEPTS AND APPLICATION
Hypothesis Formulation
H₀: p ≥ 0.60 (bakery's claim is true - at least 60% satisfied)
H₁: p < 0.60 (bakery's claim is false - less than 60% satisfied)
This is a left-tailed test at α = 0.05 significance level.
Sample Analysis
• Sample size: n = 50
• Satisfied customers: x = 27
• Sample proportion: p̂ = 27/50 = 0.54
Test Conditions Verification
np₀ = 50 × 0.60 = 30 ≥ 5 ✓
n(1-p₀) = 50 × 0.40 = 20 ≥ 5 ✓
Conditions for normal approximation are satisfied.
Test Statistic Calculation
Standard error: SE = √[p₀(1-p₀)/n] = √[(0.60 × 0.40)/50] = √[0.24/50] = 0.0693
Test statistic: Z = (p̂ - p₀)/SE = (0.54 - 0.60)/0.0693 = -0.06/0.0693 = -0.866
Critical Value and Decision
For α = 0.05 in a left-tailed test, the critical value is -z₀.₀₅ = -1.645.
Since Z = -0.866 > -1.645, we fail to reject H₀.
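The same one-proportion z-test, written as a short Python sketch (SciPy assumed; the left-tailed p-value of about 0.19 matches the decision above):

```python
# Left-tailed one-proportion z-test of H0: p >= 0.60 vs H1: p < 0.60
import math
from scipy.stats import norm

n, x, p0 = 50, 27, 0.60
p_hat = x / n                         # 0.54
se = math.sqrt(p0 * (1 - p0) / n)     # ~ 0.0693
z_stat = (p_hat - p0) / se            # ~ -0.866
p_value = norm.cdf(z_stat)            # ~ 0.193 > 0.05, so fail to reject H0
print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}")
```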
CONCLUSION
At the 5% significance level, there is insufficient evidence to conclude that the
bakery's claim is false. While the sample proportion (54%) is below the claimed 60%,
this difference is not statistically significant given the sample size. The bakery's
claim that at least 60% of customers are satisfied with their new bread recipe cannot
be rejected based on this sample evidence.
Question 3(B) (5 Marks)
INTRODUCTION
Regression analysis provides essential insights into the relationships between
variables, enabling businesses to understand the effectiveness of their strategies
and make predictions about future outcomes. The coefficient of determination,
confidence intervals, and prediction accuracy represent fundamental components
of regression-based decision-making. This analysis examines the relationship
between advertising expenditure and sales performance, demonstrating how
regression statistics inform marketing investment decisions.
CONCEPTS AND APPLICATION
Part (a): Coefficient of Determination Calculation
Given:
- SSE (Sum of Squared Errors) = 180
- SST (Total Sum of Squares) = 600
- SSR (Explained Sum of Squares) = 420
Verification: SSE + SSR = 180 + 420 = 600 = SST ✓
R² = SSR/SST = 420/600 = 0.70 or 70%
Interpretation: The coefficient of determination indicates that 70% of the variation
in monthly sales is explained by monthly advertising spend. This suggests a strong
positive relationship between advertising investment and sales performance, with
30% of sales variation attributed to other factors not included in the model.
Part (b): Standard Error Calculation
For regression analysis with n = 10 observations and k = 1 independent variable:
Degrees of freedom: df = n - k - 1 = 10 - 1 - 1 = 8
Standard error of regression:
s = √(SSE/df) = √(180/8) = √22.5 = 4.743
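Parts (a) and (b) reduce to two lines of arithmetic; a small Python sketch (illustrative names, no external data assumed) is given below:

```python
# Coefficient of determination and residual standard error from the ANOVA quantities
import math

sse, sst = 180, 600
ssr = sst - sse                      # 420, matching the given SSR
r_squared = ssr / sst                # 0.70
n, k = 10, 1                         # observations and predictors
s = math.sqrt(sse / (n - k - 1))     # sqrt(180/8) ~ 4.743
print(f"R^2 = {r_squared:.2f}, standard error = {s:.3f}")
```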
Part (c): 95% Confidence Interval for Prediction at X = 15
Given regression equation: Y = 2.5 + 1.8X
Predicted value at X = 15:
Ŷ = 2.5 + 1.8(15) = 2.5 + 27 = 29.5 (in $10,000s, i.e., $295,000)
For a 95% confidence interval with large sample approximation (using Z = 1.96):
Margin of error = Z × s = 1.96 × 4.743 = 9.296
Confidence interval: 29.5 ± 9.296 = (20.204, 38.796)
Interpretation: We are 95% confident that when advertising spend is $15,000, the
predicted monthly sales will be between $202,040 and $387,960.
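The prediction and its interval can be reproduced with the same large-sample shortcut used above (a sketch; an exact prediction interval would use the t distribution and a leverage term, which are not required here):

```python
# Predicted sales at X = 15 and the approximate 95% interval (Y in $10,000s, X in $1,000s)
from scipy.stats import norm

b0, b1, s = 2.5, 1.8, 4.743
y_hat = b0 + b1 * 15                          # 29.5, i.e. about $295,000
z = norm.ppf(0.975)                           # ~ 1.96 (large-sample approximation)
lower, upper = y_hat - z * s, y_hat + z * s
print(f"Prediction: {y_hat:.1f}; 95% interval: ({lower:.3f}, {upper:.3f})")  # ~ (20.20, 38.80)
```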
Summary of Key Statistics
Statistic | Value | Interpretation
R² | 0.70 | 70% of sales variation explained by advertising
Standard Error | 4.743 | Average prediction error of $47,430
Predicted Sales at X = 15 | $295,000 | Expected sales with $15,000 advertising
95% CI | $202,040 – $387,960 | Range of likely sales outcomes
CONCLUSION
The regression analysis demonstrates a strong relationship between advertising
expenditure and sales performance, with 70% of sales variation explained by
advertising investment. The model provides reliable predictions with a standard
error of $47,430, enabling the company to make informed decisions about marketing
budget allocation. At a $15,000 advertising spend level, the company can expect
sales between $202,040 and $387,960 with 95% confidence, supporting
evidence-based marketing investment strategies.
Assignment prepared following NMIMS academic standards and guidelines
Word count: Approximately 1000 words each for Q1&Q2, 500 words for Q3A&Q3B