Calculating sample size for a case-control study
Statistical Power
Statistical power is the probability of detecting an effect if the effect is real.
Factors Affecting Power
1. Size of the effect
2. Standard deviation of the characteristic
3. Bigger sample size
4. Significance level desired
Sample size calculations
Based on these elements, you can write a formal mathematical equation that relates power, sample size, effect size, standard deviation, and significance level.
Calculating sample size for a case-control study: binary exposure
Use the formula for a difference in proportions:

$$ n = \left(\frac{r+1}{r}\right) \frac{\bar{p}\,(1-\bar{p})\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(p_1 - p_2)^{2}} $$

where:
$n$ = sample size in the case group
$r$ = ratio of controls to cases
$\bar{p}(1-\bar{p})$ = a measure of variability (similar to standard deviation)
$Z_{\beta}$ = represents the desired power (typically .84 for 80% power)
$Z_{\alpha/2}$ = represents the desired level of statistical significance (typically 1.96)
$p_1 - p_2$ = effect size (the difference in proportions)
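The "typical" values .84 and 1.96 are quantiles of the standard normal distribution. A minimal sketch of how they can be looked up, assuming a two-sided significance level; the function names are illustrative and not part of the original slides:

```python
from statistics import NormalDist

def z_for_power(power: float) -> float:
    """Z value for the desired power (0.80 gives roughly 0.84)."""
    return NormalDist().inv_cdf(power)

def z_for_two_sided_alpha(alpha: float) -> float:
    """Z value for a two-sided significance level (0.05 gives roughly 1.96)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_for_power(0.80), 2))            # 0.84
print(round(z_for_two_sided_alpha(0.05), 2))  # 1.96
```

Asking for more power or a stricter significance level raises these Z values and, through the formula, the required sample size.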
Example
How many cases and controls do you need, assuming:
80% power
You want to detect an odds ratio of 2.0 or greater
An equal number of cases and controls (r=1)
The proportion exposed in the control group is 20%
Example, continued
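$$ n = \left(\frac{r+1}{r}\right) \frac{\bar{p}\,(1-\bar{p})\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(p_1 - p_2)^{2}} $$

For 80% power, $Z_{\beta}$ = .84
For 0.05 significance level, $Z_{\alpha/2}$ = 1.96
r = 1 (equal number of cases and controls)
The proportion exposed in the control group is 20%
To get the proportion of cases exposed:

$$ p_{\text{case exp}} = \frac{OR \cdot p_{\text{controls exp}}}{p_{\text{controls exp}}\,(OR - 1) + 1} $$

$$ p_{\text{case exp}} = \frac{2.0\,(.20)}{(.20)(2.0 - 1) + 1} = \frac{.40}{1.20} = .33 $$

Average proportion exposed = (.33 + .20)/2 = .265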
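The conversion from an odds ratio to the proportion of cases exposed can be sketched in a few lines. This is an illustration only: the function name `case_exposure_from_or` is an assumption, and the rounding to two decimals simply mirrors the slide arithmetic.

```python
def case_exposure_from_or(odds_ratio: float, p_control: float) -> float:
    """Proportion of cases exposed implied by an odds ratio and the control exposure."""
    return odds_ratio * p_control / (p_control * (odds_ratio - 1) + 1)

p_control = 0.20
p_case = case_exposure_from_or(2.0, p_control)  # 0.40 / 1.20
p_case_rounded = round(p_case, 2)               # .33, as in the example
p_bar = (p_case_rounded + p_control) / 2        # average proportion exposed

# Sanity check: the odds of exposure in cases over the odds in controls
# should give back the odds ratio we started from.
implied_or = (p_case / (1 - p_case)) / (p_control / (1 - p_control))

print(p_case_rounded, round(p_bar, 3), round(implied_or, 2))  # 0.33 0.265 2.0
```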
Example, continued
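$$ n = \left(\frac{r+1}{r}\right) \frac{\bar{p}\,(1-\bar{p})\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(p_1 - p_2)^{2}} = 2 \times \frac{(.265)(1 - .265)(.84 + 1.96)^{2}}{(.33 - .20)^{2}} \approx 181 $$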
Therefore, the total sample size is 362 (181 cases, 181 controls).
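The calculation above can be reproduced with a short function. This is a sketch under stated assumptions: the name `cases_needed_binary` is hypothetical, the average proportion is the simple average used in the example, and the result is rounded up to the next whole subject.

```python
import math

def cases_needed_binary(p_case: float, p_control: float, r: float = 1.0,
                        z_beta: float = 0.84, z_alpha_half: float = 1.96) -> int:
    """Cases needed to detect a difference in exposure proportions.

    r is the ratio of controls to cases; the defaults correspond to
    80% power and a 0.05 significance level, as in the example.
    """
    p_bar = (p_case + p_control) / 2   # simple average, as in the example
    variability = p_bar * (1 - p_bar)
    n = ((r + 1) / r) * variability * (z_beta + z_alpha_half) ** 2 \
        / (p_case - p_control) ** 2
    return math.ceil(n)                # round up to whole subjects

n_cases = cases_needed_binary(0.33, 0.20)
print(n_cases, 2 * n_cases)            # 181 362
```

The result is sensitive to rounding of the inputs: passing the unrounded proportion 1/3 instead of .33 gives 173 cases rather than 181.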
Calculating sample size for a case-control study: continuous exposure
Use the formula for a difference in means:

$$ n = \left(\frac{r+1}{r}\right) \frac{\sigma^{2}\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(\text{difference})^{2}} $$

where:
$n$ = sample size in the case group
$r$ = ratio of controls to cases
$\sigma$ = standard deviation of the outcome variable
$Z_{\beta}$ = represents the desired power (typically .84 for 80% power)
$Z_{\alpha/2}$ = represents the desired level of statistical significance (typically 1.96)
difference = effect size (the difference in means)
Example
How many cases and controls do you need, assuming:
80% power
The standard deviation of the characteristic you are comparing is 10.0
You want to detect a difference in your characteristic of 5.0 (one half standard deviation)
An equal number of cases and controls (r=1)
Example, continued
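$$ n = \left(\frac{r+1}{r}\right) \frac{\sigma^{2}\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(\text{difference})^{2}} $$

For 80% power, $Z_{\beta}$ = .84
For 0.05 significance level, $Z_{\alpha/2}$ = 1.96
r = 1 (equal number of cases and controls)
$\sigma$ = 10.0
Difference = 5.0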
Example, continued
$$ n = \left(\frac{r+1}{r}\right) \frac{\sigma^{2}\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(\text{difference})^{2}} = (2)\,\frac{10^{2}\,(7.84)}{(5)^{2}} = (2)(2)^{2}(7.84) \approx 63 $$
Therefore, the total sample size is 126 (63 cases, 63 controls).
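As a rough check, the difference-in-means calculation can be sketched in code and then inverted to see what power the rounded-up sample size gives. The function names are hypothetical, and the power check is a normal approximation obtained by solving the same formula for the power term, not a method taken from the slides.

```python
import math
from statistics import NormalDist

def cases_needed_continuous(sd: float, difference: float, r: float = 1.0,
                            z_beta: float = 0.84,
                            z_alpha_half: float = 1.96) -> int:
    """Cases needed to detect a difference in means (r = controls per case)."""
    n = ((r + 1) / r) * sd ** 2 * (z_beta + z_alpha_half) ** 2 / difference ** 2
    return math.ceil(n)  # round up to whole subjects

def approximate_power(n_cases: int, sd: float, difference: float,
                      r: float = 1.0, z_alpha_half: float = 1.96) -> float:
    """Power implied by n_cases, from rearranging the sample size formula."""
    z_beta = (difference / sd) * math.sqrt(n_cases * r / (r + 1)) - z_alpha_half
    return NormalDist().cdf(z_beta)

n_cases = cases_needed_continuous(sd=10.0, difference=5.0)
print(n_cases, 2 * n_cases)  # 63 126
print(round(approximate_power(n_cases, sd=10.0, difference=5.0), 2))  # 0.8
```

Because the sample size depends on the squared ratio of standard deviation to difference, halving the detectable difference roughly quadruples the number of cases required.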