Important Statistics Formulas
This web page presents statistics formulas described in the Stat Trek tutorials. Each formula links to
a web page that explains how to use the formula.
Parameters
Population mean = $\mu = \left( \sum X_i \right) / N$
Population standard deviation = $\sigma = \sqrt{ \sum ( X_i - \mu )^2 / N }$
Population variance = $\sigma^2 = \sum ( X_i - \mu )^2 / N$
Variance of population proportion = $\sigma_P^2 = PQ / n$
Standardized score = $Z = (X - \mu) / \sigma$
Population correlation coefficient = $\rho = \frac{1}{N} \sum \left[ \frac{X_i - \mu_X}{\sigma_X} \cdot \frac{Y_i - \mu_Y}{\sigma_Y} \right]$
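As a quick check on the population formulas, the following minimal sketch computes the mean, standard deviation, and a standardized score. The data set and the function names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def population_mean(values):
    """mu = (sum of X_i) / N"""
    return sum(values) / len(values)

def population_variance(values):
    """sigma^2 = sum of (X_i - mu)^2 / N (divides by N, not N - 1)."""
    mu = population_mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def z_score(x, mu, sigma):
    """Z = (X - mu) / sigma"""
    return (x - mu) / sigma

# Hypothetical data, treated as a complete population.
population = [2, 4, 4, 4, 5, 5, 7, 9]
mu = population_mean(population)
sigma = math.sqrt(population_variance(population))
print(mu, sigma, z_score(9, mu, sigma))  # 5.0 2.0 2.0
```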
Statistics
Unless otherwise noted, these formulas assume simple random sampling.
Sample mean = $\bar{x} = \left( \sum x_i \right) / n$
Sample standard deviation = $s = \sqrt{ \sum ( x_i - \bar{x} )^2 / ( n - 1 ) }$
Sample variance = $s^2 = \sum ( x_i - \bar{x} )^2 / ( n - 1 )$
Variance of sample proportion = $s_p^2 = pq / (n - 1)$
Pooled sample proportion = $p = (p_1 n_1 + p_2 n_2) / (n_1 + n_2)$
Pooled sample standard deviation = $s_p = \sqrt{ [ (n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 ] / (n_1 + n_2 - 2) }$
Sample correlation coefficient = $r = \frac{1}{n - 1} \sum \left[ \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y} \right]$
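The sample formulas differ from the population ones mainly in the n - 1 divisor. The sketch below illustrates the sample variance, the pooled standard deviation, and the pooled proportion. The two groups and all function names are hypothetical, and the standard library's `statistics.variance` is used only as a cross-check.

```python
import math
import statistics

def sample_variance(xs):
    """s^2 = sum of (x_i - x_bar)^2 / (n - 1)"""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def pooled_std(xs1, xs2):
    """s_p = sqrt{ [(n1 - 1) s1^2 + (n2 - 1) s2^2] / (n1 + n2 - 2) }"""
    n1, n2 = len(xs1), len(xs2)
    weighted = (n1 - 1) * sample_variance(xs1) + (n2 - 1) * sample_variance(xs2)
    return math.sqrt(weighted / (n1 + n2 - 2))

def pooled_proportion(p1, n1, p2, n2):
    """p = (p1 * n1 + p2 * n2) / (n1 + n2)"""
    return (p1 * n1 + p2 * n2) / (n1 + n2)

# Hypothetical samples.
group_a = [2, 4, 4, 4, 5, 5, 7, 9]
group_b = [1, 2, 3, 4, 5]

print(sample_variance(group_b))               # 2.5
print(statistics.variance(group_a))           # same divisor (n - 1) as sample_variance
print(pooled_std(group_a, group_b))
print(pooled_proportion(0.5, 100, 0.25, 300)) # 0.3125
```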
Correlation
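Pearson product-moment correlation = $r = \sum (xy) / \sqrt{ \left( \sum x^2 \right) \left( \sum y^2 \right) }$, where x and y are deviation scores ($x = x_i - \bar{x}$, $y = y_i - \bar{y}$)
Linear correlation (sample data) = $r = \frac{1}{n - 1} \sum \left[ \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y} \right]$
Linear correlation (population data) = $\rho = \frac{1}{N} \sum \left[ \frac{X_i - \mu_X}{\sigma_X} \cdot \frac{Y_i - \mu_Y}{\sigma_Y} \right]$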
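The deviation-score form and the standardized-score form of the sample correlation should give the same value. A minimal sketch, using hypothetical paired data and illustrative function names, compares the two:

```python
import math

def pearson_r(xs, ys):
    """r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2)), using deviation scores."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def pearson_r_from_z(xs, ys):
    """r = [1 / (n - 1)] * sum of (z-score of x_i) * (z-score of y_i)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(pearson_r(xs, ys), pearson_r_from_z(xs, ys))  # both about 0.7746
```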
Simple Linear Regression
Simple linear regression line: $\hat{y} = b_0 + b_1 x$
Regression coefficient = $b_1 = \sum [ (x_i - \bar{x}) (y_i - \bar{y}) ] / \sum [ (x_i - \bar{x})^2 ]$
Regression intercept = $b_0 = \bar{y} - b_1 \bar{x}$
Regression coefficient = $b_1 = r \cdot (s_y / s_x)$
Standard error of regression slope = $s_{b_1} = \sqrt{ \sum (y_i - \hat{y}_i)^2 / (n - 2) } / \sqrt{ \sum (x_i - \bar{x})^2 }$
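The regression formulas can be sketched directly from their definitions. The data and function names below are hypothetical; this is an illustration of the arithmetic, not a replacement for a statistics library.

```python
import math

def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def slope_standard_error(xs, ys):
    """s_b1 = sqrt[ sum of squared residuals / (n - 2) ] / sqrt[ sum of (x_i - x_bar)^2 ]"""
    n = len(xs)
    b0, b1 = fit_line(xs, ys)
    x_bar = sum(xs) / n
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return math.sqrt(sse / (n - 2)) / math.sqrt(sxx)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)
print(b0, b1, slope_standard_error(xs, ys))  # about 2.2, 0.6, 0.2828
```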
Counting
n factorial: n! = n * (n-1) * (n - 2) * . . . * 3 * 2 * 1. By convention, 0! = 1.
Permutations of n things, taken r at a time: $_nP_r = n! / (n - r)!$
Combinations of n things, taken r at a time: $_nC_r = n! / [ r! \, (n - r)! ] = {}_nP_r / r!$
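Python's standard library covers all three counting formulas. The short sketch below assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the values of n and r are arbitrary.

```python
import math

n, r = 5, 2

n_factorial = math.factorial(n)   # 5! = 120
zero_factorial = math.factorial(0)  # 0! = 1 by convention
permutations = math.perm(n, r)    # 5! / 3! = 20
combinations = math.comb(n, r)    # 5! / (2! * 3!) = 10

print(n_factorial, zero_factorial, permutations, combinations)  # 120 1 20 10
```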
Probability
Rule of addition: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Rule of multiplication: $P(A \cap B) = P(A) \, P(B|A)$
Rule of subtraction: P(A') = 1 - P(A)
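The three rules can be verified by direct counting on a small sample space. This sketch assumes a single roll of a fair six-sided die, with two hypothetical events A (even number) and B (greater than 3).

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # fair die: each outcome equally likely

def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}   # even number
B = {4, 5, 6}   # greater than 3

# Conditional probability P(B|A), counted within the outcomes of A.
p_b_given_a = Fraction(len(A & B), len(A))

print(prob(A | B), prob(A) + prob(B) - prob(A & B))  # 2/3 2/3  (addition)
print(prob(A & B), prob(A) * p_b_given_a)            # 1/3 1/3  (multiplication)
print(prob(outcomes - A), 1 - prob(A))               # 1/2 1/2  (subtraction)
```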
Random Variables
In the following formulas, X and Y are random variables, and a and b are constants.
Expected value of X = $E(X) = \mu_x = \sum [ x_i \cdot P(x_i) ]$
Variance of X = $Var(X) = \sigma^2 = \sum [ x_i - E(x) ]^2 \cdot P(x_i) = \sum [ x_i - \mu_x ]^2 \cdot P(x_i)$
Normal random variable = z-score = $z = (X - \mu) / \sigma$
Chi-square statistic = $\chi^2 = [ ( n - 1 ) s^2 ] / \sigma^2$
f statistic = $f = [ s_1^2 / \sigma_1^2 ] / [ s_2^2 / \sigma_2^2 ]$
Expected value of sum of random variables = E(X + Y) = E(X) + E(Y)
Expected value of difference between random variables = E(X - Y) = E(X) - E(Y)
Variance of the sum of independent random variables = Var(X + Y) = Var(X) + Var(Y)
Variance of the difference between independent random variables = Var(X - Y) = Var(X) + Var(Y)
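Note that the variances add for both the sum and the difference of independent variables. A minimal sketch, assuming two independent fair dice and using hypothetical helper names, checks this by enumerating the joint distribution exactly:

```python
from fractions import Fraction
from itertools import product

def expected_value(dist):
    """E(X) = sum of x_i * P(x_i); dist maps each value to its probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = sum of (x_i - E(X))^2 * P(x_i)"""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y) for independent X and Y."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = op(x, y)
        out[value] = out.get(value, 0) + px * py
    return out

die = {face: Fraction(1, 6) for face in range(1, 7)}
total = combine(die, die, lambda x, y: x + y)
diff = combine(die, die, lambda x, y: x - y)

print(expected_value(die), variance(die))      # 7/2 35/12
print(expected_value(total), variance(total))  # 7 35/6
print(expected_value(diff), variance(diff))    # 0 35/6
```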
Sampling Distributions
Mean of sampling distribution of the mean = $\mu_{\bar{x}} = \mu$
Mean of sampling distribution of the proportion = $\mu_p = P$
Standard deviation of proportion = $\sigma_p = \sqrt{ P (1 - P) / n } = \sqrt{ PQ / n }$
Standard deviation of the mean = $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Standard deviation of difference of sample means = $\sigma_d = \sqrt{ (\sigma_1^2 / n_1) + (\sigma_2^2 / n_2) }$
Standard deviation of difference of sample proportions = $\sigma_d = \sqrt{ [ P_1 (1 - P_1) / n_1 ] + [ P_2 (1 - P_2) / n_2 ] }$
Standard Error
Standard error of proportion = $SE_p = s_p = \sqrt{ p (1 - p) / n } = \sqrt{ pq / n }$
Standard error of difference for proportions (p is the pooled sample proportion) = $SE_p = s_p = \sqrt{ p ( 1 - p ) [ (1 / n_1) + (1 / n_2) ] }$
Standard error of the mean = $SE_{\bar{x}} = s_{\bar{x}} = s / \sqrt{n}$
Standard error of difference of sample means = $SE_d = s_d = \sqrt{ (s_1^2 / n_1) + (s_2^2 / n_2) }$
Standard error of difference of paired sample means = $SE_d = s_d = \sqrt{ \sum (d_i - \bar{d})^2 / (n - 1) } / \sqrt{n}$
Pooled sample standard error = $s_{pooled} = \sqrt{ [ (n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 ] / (n_1 + n_2 - 2) }$
Standard error of difference of sample proportions = $s_d = \sqrt{ [ p_1 (1 - p_1) / n_1 ] + [ p_2 (1 - p_2) / n_2 ] }$
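Several of the standard error formulas translate into one-line functions. The sketch below is illustrative only; the function names and the input values are hypothetical.

```python
import math

def se_proportion(p, n):
    """SE_p = sqrt[ p * (1 - p) / n ]"""
    return math.sqrt(p * (1 - p) / n)

def se_mean(s, n):
    """SE of the mean = s / sqrt(n)"""
    return s / math.sqrt(n)

def se_diff_means(s1, n1, s2, n2):
    """SE_d = sqrt[ s1^2 / n1 + s2^2 / n2 ]"""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def se_paired(diffs):
    """SE_d for paired data: sample SD of the differences, divided by sqrt(n)."""
    n = len(diffs)
    d_bar = sum(diffs) / n
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
    return s_d / math.sqrt(n)

print(se_proportion(0.4, 100))     # about 0.049
print(se_mean(10, 25))             # 2.0
print(se_diff_means(3, 9, 4, 16))  # sqrt(2), about 1.414
print(se_paired([1, 2, 3, 4, 5]))  # sqrt(0.5), about 0.707
```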
Discrete Probability Distributions
Binomial formula: $P(X = x) = b(x; n, P) = {}_nC_x \cdot P^x \cdot (1 - P)^{n - x} = {}_nC_x \cdot P^x \cdot Q^{n - x}$
Mean of binomial distribution = $\mu_x = n P$
Variance of binomial distribution = $\sigma_x^2 = n P ( 1 - P )$
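Negative Binomial formula: $P(X = x) = b^*(x; r, P) = {}_{x-1}C_{r-1} \cdot P^r \cdot (1 - P)^{x - r}$
Mean of negative binomial distribution = $\mu_x = rQ / P$ (the expected number of failures before the r-th success; the expected number of trials x is $r / P$)
Variance of negative binomial distribution = $\sigma_x^2 = rQ / P^2$
Geometric formula: $P(X = x) = g(x; P) = P \cdot Q^{x - 1}$
Mean of geometric distribution = $\mu_x = Q / P$ (the expected number of failures before the first success; the expected number of trials x is $1 / P$)
Variance of geometric distribution = $\sigma_x^2 = Q / P^2$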
Hypergeometric formula: $P(X = x) = h(x; N, n, k) = [ {}_kC_x ] [ {}_{N-k}C_{n-x} ] / [ {}_NC_n ]$
Mean of hypergeometric distribution = $\mu_x = n k / N$
Variance of hypergeometric distribution = $\sigma_x^2 = n k ( N - k ) ( N - n ) / [ N^2 ( N - 1 ) ]$
Poisson formula: $P(x; \mu) = ( e^{-\mu} ) ( \mu^x ) / x!$
Mean of Poisson distribution = $\mu_x = \mu$
Variance of Poisson distribution = $\sigma_x^2 = \mu$
Multinomial formula: $P = [ n! / ( n_1! \, n_2! \cdots n_k! ) ] \cdot ( p_1^{n_1} \, p_2^{n_2} \cdots p_k^{n_k} )$
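The probability formulas in this section can be sketched with `math.comb` (Python 3.8 or later). The function names and example arguments are hypothetical; the last lines check the binomial mean n * P by summing over all outcomes.

```python
import math

def binomial_pmf(x, n, p):
    """b(x; n, P) = nCx * P^x * (1 - P)^(n - x)"""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def negative_binomial_pmf(x, r, p):
    """b*(x; r, P): probability that the r-th success occurs on trial x."""
    return math.comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

def geometric_pmf(x, p):
    """g(x; P) = P * Q^(x - 1): first success on trial x."""
    return p * (1 - p) ** (x - 1)

def hypergeometric_pmf(x, N, n, k):
    """h(x; N, n, k) = kCx * (N-k)C(n-x) / NCn"""
    return math.comb(k, x) * math.comb(N - k, n - x) / math.comb(N, n)

def poisson_pmf(x, mu):
    """P(x; mu) = e^(-mu) * mu^x / x!"""
    return math.exp(-mu) * mu ** x / math.factorial(x)

print(binomial_pmf(2, 4, 0.5))           # 0.375
print(negative_binomial_pmf(3, 2, 0.5))  # 0.25
print(geometric_pmf(3, 0.5))             # 0.125
print(hypergeometric_pmf(1, 10, 3, 4))   # 0.5
print(poisson_pmf(0, 2))                 # e^-2, about 0.1353

# Mean of the binomial distribution, summed from the definition of E(X).
binomial_mean = sum(x * binomial_pmf(x, 4, 0.5) for x in range(5))
print(binomial_mean)                     # 2.0, equal to n * P
```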
Linear Transformations
For the following formulas, assume that Y is a linear transformation of the random variable X,
defined by the equation: Y = aX + b.
Mean of a linear transformation = $E(Y) = \bar{Y} = a \bar{X} + b$.
Variance of a linear transformation = $Var(Y) = a^2 \cdot Var(X)$.
Standardized score = $z = (x - \mu_x) / \sigma_x$.
t statistic = $t = (\bar{x} - \mu_x) / [ s / \sqrt{n} ]$.
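The effect of a linear transformation on the mean and variance can be checked numerically. This sketch uses hypothetical values for X, a, and b, and treats the data as a full population (`statistics.pvariance`).

```python
import math
import statistics

xs = [1, 2, 3, 4]   # hypothetical values of X
a, b = 3, 2         # hypothetical constants in Y = aX + b
ys = [a * x + b for x in xs]

mean_x, var_x = statistics.mean(xs), statistics.pvariance(xs)
mean_y, var_y = statistics.mean(ys), statistics.pvariance(ys)

print(mean_y, a * mean_x + b)  # 9.5 9.5: the mean shifts and scales
print(var_y, a ** 2 * var_x)   # 11.25 11.25: b has no effect on the variance
```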
Estimation
Confidence interval: Sample statistic ± Critical value * Standard error of statistic
Margin of error = (Critical value) * (Standard deviation of statistic)
Margin of error = (Critical value) * (Standard error of statistic)
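As an illustration, the sketch below builds a confidence interval for a mean from the margin of error. It assumes a large sample, so the critical value is taken from the standard normal distribution via `statistics.NormalDist` (Python 3.8 or later); a small sample would call for a t critical value instead. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Return (lower, upper, margin) for: statistic +/- critical value * SE."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sample_sd / math.sqrt(n)
    margin = critical * standard_error
    return sample_mean - margin, sample_mean + margin, margin

lower, upper, margin = z_confidence_interval(50, 10, 100)
print(round(lower, 2), round(upper, 2), round(margin, 2))  # 48.04 51.96 1.96
```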
Hypothesis Testing
Standardized test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
One-sample z-test for proportions: z-score = $z = (p - P_0) / \sqrt{ p q / n }$
Two-sample z-test for proportions: z-score = $z = [ (p_1 - p_2) - d ] / SE$
One-sample t-test for means: t statistic = $t = (\bar{x} - \mu) / SE$
Two-sample t-test for means: t statistic = $t = [ (\bar{x}_1 - \bar{x}_2) - d ] / SE$
Matched-sample t-test for means: t statistic = $t = [ (\bar{x}_1 - \bar{x}_2) - D ] / SE = (\bar{d} - D) / SE$
Chi-square test statistic = $\chi^2 = \sum [ (\text{Observed} - \text{Expected})^2 / \text{Expected} ]$
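Two of the test statistics are sketched below: the one-sample t statistic and the chi-square statistic. The data, the hypothesized mean, and the function names are hypothetical, and the sketch stops at the statistic itself (no P-value lookup).

```python
import math

def one_sample_t(xs, mu0):
    """t = (x_bar - mu0) / SE, with SE = s / sqrt(n)."""
    n = len(xs)
    x_bar = sum(xs) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    return (x_bar - mu0) / (s / math.sqrt(n))

def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
t = one_sample_t(sample, mu0=4)
chi2 = chi_square_statistic([18, 22, 20], [20, 20, 20])
print(t, chi2)  # about 1.3229 and 0.4
```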
Degrees of Freedom
The correct formula for degrees of freedom (DF) depends on the situation (the nature of the test
statistic, the number of samples, underlying assumptions, etc.).
One-sample t-test: DF = n - 1
Two-sample t-test: $DF = ( s_1^2 / n_1 + s_2^2 / n_2 )^2 / \{ [ ( s_1^2 / n_1 )^2 / ( n_1 - 1 ) ] + [ ( s_2^2 / n_2 )^2 / ( n_2 - 1 ) ] \}$
Two-sample t-test, pooled standard error: DF = n1 + n2 - 2
Simple linear regression, test slope: DF = n - 2
Chi-square goodness of fit test: DF = k - 1
Chi-square test for homogeneity: DF = (r - 1) * (c - 1)
Chi-square test for independence: DF = (r - 1) * (c - 1)
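The two-sample formula for degrees of freedom is the easiest one to get wrong by hand. A minimal sketch, with hypothetical sample variances and sizes, computes it next to the pooled alternative:

```python
def two_sample_df(var1, n1, var2, n2):
    """DF for the two-sample t-test with unpooled standard error."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def pooled_df(n1, n2):
    """DF for the two-sample t-test with pooled standard error."""
    return n1 + n2 - 2

# Hypothetical inputs: s1^2 = 4 with n1 = 10, s2^2 = 9 with n2 = 15.
df_unpooled = two_sample_df(4, 10, 9, 15)
df_pooled = pooled_df(10, 15)
print(df_unpooled, df_pooled)  # about 22.99 and 23
```

The unpooled value is generally not an integer.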
Sample Size
Below, the first two formulas find the smallest sample size required to achieve a fixed margin of
error (ME), using simple random sampling. The third formula allocates the sample across strata, based on a
proportionate design. The fourth formula, Neyman allocation, uses stratified sampling to minimize
variance, given a fixed sample size. The last formula, optimum allocation, uses stratified
sampling to minimize variance, given a fixed budget.
Mean (simple random sampling): $n = \{ z^2 \sigma^2 [ N / (N - 1) ] \} / \{ ME^2 + [ z^2 \sigma^2 / (N - 1) ] \}$
Proportion (simple random sampling): $n = [ ( z^2 p q ) + ME^2 ] / [ ME^2 + z^2 p q / N ]$
Proportionate stratified sampling: $n_h = ( N_h / N ) \cdot n$
Neyman allocation (stratified sampling): $n_h = n \cdot ( N_h \sigma_h ) / \sum_i ( N_i \sigma_i )$
Optimum allocation (stratified sampling): $n_h = n \cdot [ ( N_h \sigma_h ) / \sqrt{c_h} ] / \sum_i [ ( N_i \sigma_i ) / \sqrt{c_i} ]$
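The sample size formula for a mean and the two variance-minimizing allocations can be sketched as follows. All inputs (population size, standard deviations, costs) and function names are hypothetical, and the required sample size is rounded up because a fractional unit cannot be sampled.

```python
import math

def sample_size_for_mean(z, sigma, N, margin):
    """n = { z^2 sigma^2 [N / (N - 1)] } / { ME^2 + z^2 sigma^2 / (N - 1) }"""
    zs = (z * sigma) ** 2
    return (zs * N / (N - 1)) / (margin ** 2 + zs / (N - 1))

def neyman_allocation(n, sizes, sigmas):
    """n_h = n * (N_h * sigma_h) / sum of (N_i * sigma_i)"""
    weights = [N_h * s_h for N_h, s_h in zip(sizes, sigmas)]
    total = sum(weights)
    return [n * w / total for w in weights]

def optimum_allocation(n, sizes, sigmas, costs):
    """n_h = n * [N_h sigma_h / sqrt(c_h)] / sum of [N_i sigma_i / sqrt(c_i)]"""
    weights = [N_h * s_h / math.sqrt(c_h)
               for N_h, s_h, c_h in zip(sizes, sigmas, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

n_required = math.ceil(sample_size_for_mean(z=1.96, sigma=15, N=10000, margin=2))
print(n_required)  # 212

# Two hypothetical strata: the smaller one is more variable and costlier to sample.
neyman = neyman_allocation(100, sizes=[1000, 2000], sigmas=[10, 5])
optimum = optimum_allocation(100, sizes=[1000, 2000], sigmas=[10, 5], costs=[4, 1])
print(neyman)   # [50.0, 50.0]
print(optimum)  # about [33.3, 66.7]
```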