**Summary of Key Statistical Concepts with Formulae and Examples**
1. **Probability Distributions**
- **Normal Distribution:**
- Probability density function:
f(x) = (1/sqrt(2*pi*sigma^2)) * exp(-(x-mu)^2 / (2*sigma^2))
- Mean: mu, Variance: sigma^2.
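- Example: If X ~ N(50, 16), find P(48 <= X <= 52). Here 16 is the variance, so sigma = 4.
- Solution: Standardize using Z = (X - mu)/sigma, which gives Z = -0.5 for 48 and Z = 0.5 for 52. From the standard normal table, P(-0.5 <= Z <= 0.5) ≈ 0.3829.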
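The normal example can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original notes.

```python
from statistics import NormalDist

# X ~ N(50, 16): the second parameter is the variance, so sigma = sqrt(16) = 4.
mu = 50.0
sigma = 16.0 ** 0.5

# Standardize both bounds with Z = (X - mu) / sigma.
z_low = (48 - mu) / sigma
z_high = (52 - mu) / sigma

# P(48 <= X <= 52) is the difference of the two cumulative probabilities.
dist = NormalDist(mu=mu, sigma=sigma)
prob = dist.cdf(52) - dist.cdf(48)

print(f"z_low = {z_low}, z_high = {z_high}, P = {prob:.4f}")
```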
- **Binomial Distribution:**
- Probability of k successes in n trials:
P(X = k) = nCk * p^k * (1-p)^(n-k)
- Mean: mu = np, Variance: sigma^2 = np(1-p).
- Example: A coin is flipped 10 times. What is the probability of getting exactly 6 heads, if p = 0.5?
- Solution: Use the formula with n = 10, k = 6, p = 0.5: P(X = 6) = 10C6 * 0.5^6 * 0.5^4 = 210 / 1024 ≈ 0.2051.
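As a quick check of the binomial formula, here is a small sketch. The helper name `binomial_pmf` is hypothetical; `math.comb` computes nCk.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = nCk * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Coin example: exactly 6 heads in 10 flips of a fair coin.
p_six_heads = binomial_pmf(k=6, n=10, p=0.5)
print(f"P(X = 6) = {p_six_heads:.4f}")
```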
- **Poisson Distribution:**
- Probability of k events:
P(X = k) = (lambda^k * exp(-lambda)) / k!
- Mean and Variance: lambda.
- Example: If the average number of cars passing a checkpoint is 4 per hour, find the probability of exactly 5 cars in an hour.
- Solution: Use lambda = 4 and k = 5 in the formula: P(X = 5) = (4^5 * exp(-4)) / 5! ≈ 0.1563.
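The Poisson calculation can be sketched the same way. The function name `poisson_pmf` is an illustrative choice, not something from the original notes.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = lam^k * exp(-lam) / k!"""
    return lam**k * exp(-lam) / factorial(k)


# Checkpoint example: average of 4 cars per hour, exactly 5 cars observed.
p_five_cars = poisson_pmf(k=5, lam=4.0)
print(f"P(X = 5) = {p_five_cars:.4f}")
```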
2. **Hypothesis Testing**
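- Test Statistic (mu is the mean under the null hypothesis):
z = (x_bar - mu) / (sigma / sqrt(n)), used when sigma is known
t = (x_bar - mu) / (s / sqrt(n)), used when sigma is unknown, with n - 1 degrees of freedom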
- Types of Errors:
- Type I Error (alpha): Rejecting a true null hypothesis.
- Type II Error (beta): Failing to reject a false null hypothesis.
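- Example: A company claims the mean weight of a product is 500g. A sample of 30 products has a mean weight of 495g with sigma = 10. Test at alpha = 0.05.
- Solution: Calculate the z-statistic and compare with the critical value. Here z = (495 - 500) / (10 / sqrt(30)) ≈ -2.74. For a two-sided test at alpha = 0.05 the critical values are ±1.96, so |z| > 1.96 and the null hypothesis (mu = 500) is rejected.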
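The test above can be sketched in code. This assumes a two-sided alternative (mu != 500) and treats sigma = 10 as the known population standard deviation; both are interpretations of the example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 495.0, 500.0, 10.0, 30, 0.05

# z = (x_bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / sqrt(n))

# Two-sided critical value and p-value from the standard normal distribution.
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z)))

reject_null = abs(z) > z_crit
print(f"z = {z:.3f}, critical = ±{z_crit:.3f}, p = {p_value:.4f}, reject = {reject_null}")
```

Rejecting when |z| exceeds the critical value is equivalent to rejecting when the p-value is below alpha.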
3. **Linear Regression**
- Regression Equation: Y = a + bX
- b: Slope of the line.
- a: Intercept.
- Slope formula:
b = (n * sum(XY) - sum(X) * sum(Y)) / (n * sum(X^2) - (sum(X))^2)
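- Intercept formula:
a = y_bar - b * x_bar
- Example: Given data points (1, 2), (2, 3), (3, 5), find the regression line.
- Solution: Calculate b and a using the formulas. With n = 3, sum(X) = 6, sum(Y) = 10, sum(XY) = 23, and sum(X^2) = 14: b = (69 - 60) / (42 - 36) = 1.5 and a = 10/3 - 1.5 * 2 ≈ 0.33, so Y ≈ 0.33 + 1.5X.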
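A minimal sketch of the regression calculation for the three data points follows. It applies the slope formula directly and takes the intercept as a = y_bar - b * x_bar, the standard least-squares intercept; variable names are illustrative.

```python
xs = [1, 2, 3]
ys = [2, 3, 5]
n = len(xs)

sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# b = (n * sum(XY) - sum(X) * sum(Y)) / (n * sum(X^2) - (sum(X))^2)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)

# a = y_bar - b * x_bar
a = sum_y / n - b * (sum_x / n)

print(f"Y = {a:.4f} + {b:.4f} X")
```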
4. **Central Limit Theorem (CLT)**
- Key Formula:
X_bar ~ N(mu, sigma^2 / n) for large n.
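- Example: A population has mu = 100 and sigma = 15. What is the distribution of the sample mean for n = 25?
- Solution: The variance of the sample mean is sigma^2 / n = 225 / 25 = 9, so the sample mean approximately follows N(100, 9), with a standard error of 3.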
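The CLT result can be illustrated with a small simulation. This sketch assumes a uniform population chosen to have mu = 100 and sigma = 15 (the notes do not specify the population's shape) and uses a fixed random seed, so the printed values are estimates rather than exact results.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

random.seed(0)

mu, sigma, n = 100.0, 15.0, 25

# Uniform(low, high) has standard deviation (high - low) / sqrt(12),
# so a half-width of sigma * sqrt(3) gives the required sigma.
half_width = sigma * sqrt(3)
low, high = mu - half_width, mu + half_width

# Draw many samples of size n and record each sample mean.
sample_means = [
    fmean(random.uniform(low, high) for _ in range(n))
    for _ in range(20_000)
]

theoretical_var = sigma**2 / n  # sigma^2 / n = 9
simulated_mean = fmean(sample_means)
simulated_var = pvariance(sample_means)

print(f"theoretical variance = {theoretical_var}")
print(f"simulated mean = {simulated_mean:.2f}, simulated variance = {simulated_var:.2f}")
```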
5. **Confidence Intervals**
- Formula for Population Mean (sigma known):
CI = x_bar ± Z* * (sigma / sqrt(n))
- Formula for Population Mean (sigma unknown):
CI = x_bar ± t* * (s / sqrt(n))
- Example: A sample mean is 50 with sigma = 5 and n = 36. Find the 95% CI.
- Solution: Use Z* = 1.96 in the formula: CI = 50 ± 1.96 * (5 / 6) ≈ 50 ± 1.63, i.e. approximately (48.37, 51.63).
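The interval can be computed as follows. This is a sketch for the sigma-known case only; the critical value is taken from `statistics.NormalDist` rather than hard-coded, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n, confidence = 50.0, 5.0, 36, 0.95

# Z* is the standard normal quantile that leaves (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# CI = x_bar ± Z* * (sigma / sqrt(n))
margin = z_star * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"Z* = {z_star:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```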
6. **ANOVA (Analysis of Variance)**
- Total Sum of Squares (SST):
SST = sum((y_i - y_bar)^2)
- Between-group Sum of Squares (SSB):
SSB = sum(n_j * (y_bar_j - y_bar)^2)
- Within-group Sum of Squares (SSW):
SSW = sum(sum((y_ij - y_bar_j)^2))
- F-statistic (k groups, N total observations):
F = Between-group variance / Within-group variance = (SSB / (k - 1)) / (SSW / (N - k))
- Example: Compare the means of three groups with given data and calculate the F-statistic.
- Solution: Compute SSB, SSW, and SST, then calculate F.
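Since the example does not supply data, the sketch below uses three small hypothetical groups purely for illustration. It computes SSB, SSW, and SST from the formulas above and divides each sum of squares by its degrees of freedom (k - 1 and N - k) to form F.

```python
from statistics import fmean

# Hypothetical data: three groups of three observations each (not from the notes).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

all_values = [y for group in groups for y in group]
grand_mean = fmean(all_values)
k = len(groups)      # number of groups
N = len(all_values)  # total number of observations

# SSB = sum(n_j * (y_bar_j - y_bar)^2)
ssb = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)

# SSW = sum(sum((y_ij - y_bar_j)^2))
ssw = sum((y - fmean(g)) ** 2 for g in groups for y in g)

# SST = sum((y_i - y_bar)^2); it should equal SSB + SSW.
sst = sum((y - grand_mean) ** 2 for y in all_values)

# F = (SSB / (k - 1)) / (SSW / (N - k))
f_stat = (ssb / (k - 1)) / (ssw / (N - k))

print(f"SSB = {ssb}, SSW = {ssw}, SST = {sst}, F = {f_stat}")
```

Checking that SST equals SSB + SSW is a useful sanity test before interpreting F.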