Case Control Matching

1. Matching in case-control studies addresses confounding at the design stage rather than analysis stage by balancing distributions across strata. 2. There are two main types of matching: individual matching where each case is matched to a control on attributes, and frequency matching where cases and controls are matched on cell frequencies of attributes. 3. Matching aims to increase precision by reducing standard errors and narrowing confidence intervals when sample sizes are small or there are many confounding factors. It also ensures control of confounding when obtaining subject information is expensive.

Uploaded by

juli


MATCHING IN CASE CONTROL STUDIES

Matching addresses confounding at the DESIGN stage of a study, as opposed to the analysis phase.

It is a means of providing a more efficient stratified analysis, by increasing the precision of estimates (reduction in SE), rather than a direct means of preventing confounding.

Individual Matching

Controls are matched to cases on one or more attributes (e.g. age, gender, smoking status). Each case/control pair then has identical values on the matching factors. Requires a more complex analysis than unmatched data: extra analytical work is needed to stratify on variables that were not matched on. Each matched set defines its own stratum and can be viewed as a single “individual”.

Frequency Matching

Match on cell instead of individual. Example: frequency matching on age and sex. If 20% of cases are 50-54 year old females, then controls are selected in such a way that 20% are also 50-54 years old and female. Does not require a matched analysis, because you take a random sample of controls in that cell (50-54 year old females). But you have to wait until cases accumulate before controls are selected (unless you know the distribution of the matching factors in advance).
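The allocation step in frequency matching can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function name `control_quotas`, the cell labels and the case counts are all hypothetical, and the rounding rule (largest remainder) is just one reasonable choice.

```python
from collections import Counter

def control_quotas(case_cells, n_controls):
    """Split n_controls across cells in proportion to the case distribution.

    case_cells: one cell label per case, e.g. ("F", "50-54").
    Uses largest-remainder rounding so the quotas sum to n_controls.
    """
    counts = Counter(case_cells)
    n_cases = len(case_cells)
    exact = {cell: n_controls * k / n_cases for cell, k in counts.items()}
    quotas = {cell: int(x) for cell, x in exact.items()}
    leftover = n_controls - sum(quotas.values())
    # Hand the remaining controls to the cells with the largest fractional part.
    by_remainder = sorted(exact, key=lambda c: exact[c] - quotas[c], reverse=True)
    for cell in by_remainder[:leftover]:
        quotas[cell] += 1
    return quotas

# Hypothetical case series: 20% of cases are 50-54 year old females.
cases = [("F", "50-54")] * 20 + [("M", "50-54")] * 30 + [("F", "55-59")] * 50
quotas = control_quotas(cases, n_controls=200)
print(quotas)
```

Controls would then be drawn at random within each cell until its quota is filled.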
WHY MATCH?

1. To make control for confounding more efficient when sample size is small.

Without matching, control for confounding in the analysis will result in many strata with sparse data. By balancing the distribution across strata, the estimates of the OR will be more stable: smaller standard errors, and thus narrower confidence intervals.

2. Even if sample size is not small, if there are many confounders with many categories, data can be sparse in any given stratum. However, you may be able to use multivariate analysis instead.

3. If obtaining information from subjects is expensive, e.g. running expensive lab tests on blood samples. Matching will ensure control of confounding and will not lead to loss of information. If the cost of matching is small compared to the cost of expanding study size, matching is worthwhile.

4. Sometimes control of confounding is only possible by matching, e.g. controlling for sibship.

Alternatives to individual matching: frequency matching, or multivariate analyses to control confounding.
DISADVANTAGES OF MATCHING

Time consuming

Can be expensive

Can’t always find an exact match

Matching can decrease study efficiency because the effort expended in finding matched subjects could be spent on gathering information for a greater number of unmatched subjects.
Matching Criteria

Matching will increase efficiency only if the matching variables are associated with both the disease and the exposure.

The matching variable must also NOT be on the causal chain, e.g. if high fat diet is an exposure of interest, don’t match on high cholesterol, or vice versa.

If matching is used, a matched analysis must be used to take advantage of the matching. If matching was done appropriately and is not taken into account in the analysis, the OR will be biased towards the null.

Matching allows you to assess the relationship between exposure and disease having already taken the confounding variable(s) into account, so you don’t need to adjust for these variables in the analysis.
AVOID OVERMATCHING

Term originally referred to “loss of validity in a case control study stemming from a control group that was so closely matched to the case group that the exposure distributions differed very little.” (Rothman, Modern Epidemiology)

Once you match on a factor, you can NOT analyze this factor in the analysis. You have to be sure that you do NOT want to assess the relationship of this factor to the disease. If you match on a variable that is associated with another variable of interest, you will have essentially matched on both of these variables.

Example: If you match on neighborhood (e.g. census tract), you may also be matching on SES, if neighborhood is correlated with SES. So you would NOT be able to analyze SES as a potential “exposure” variable because you have made the cases and controls the same on this variable.

Example: If female controls are matched to female cases, and male controls to male cases, you can NOT assess the role of gender in disease because you’ve made cases and controls similar on this variable. If you match on smoking status, you cannot assess the role of this factor in disease. In effect, you match on a factor in order to control confounding by that factor, not because you want to assess its impact on the disease.
MORE RECENT INTERPRETATIONS OF OVERMATCHING

CONCERNS WITH EFFICIENCY, NOT VALIDITY

Matching can result in LESS information if the expense of matching reduces the total number of study subjects.

Study efficiency: total information content of data / total number of subjects

Cost efficiency: total information content of data / costs of study

CONTROLS SIMILAR TO CASES ON EXPOSURE WILL NOT CONTRIBUTE TO THE ANALYSIS (loss of efficiency)

Unnecessary matching:

IF MATCHING FACTOR IS ASSOCIATED WITH DISEASE BUT NOT EXPOSURE, MATCHED ANALYSIS WILL BE STATISTICALLY LESS EFFICIENT

IF MATCHING FACTOR IS ASSOCIATED WITH EXPOSURE BUT NOT DISEASE, MUST USE MATCHED ANALYSIS, OTHERWISE ODDS RATIO WILL BE BIASED TOWARDS NULL IN UNMATCHED ANALYSIS. BUT VARIANCE OF ODDS RATIO WILL BE INCREASED COMPARED TO UNMATCHED ANALYSIS OF SAME SAMPLE SIZE.

SAMPLE SIZE IN MATCHED STUDY IS NUMBER OF MATCHED PAIRS (OR TRIPLETS, ETC).

SAMPLE SIZE IN UNMATCHED STUDY IS NUMBER OF CASES AND CONTROLS

DO NOT MATCH UNLESS MATCHING VARIABLE IS ASSOCIATED WITH DISEASE AND EXPOSURE. MORE EFFICIENT TO CONDUCT UNMATCHED STUDY OTHERWISE.

ANALYSIS METHODS FOR INDIVIDUALLY MATCHED
CASE-CONTROL STUDIES

1. Rationale is to control at the design stage for potential confounders

2. Case-Control Pair; Dichotomous Exposure

• Unit of Analysis is the matched case-control pair.

• There are 4 possible outcomes with respect to the matched pair:

⇒ Case exposed; control exposed
⇒ Case exposed; control not exposed
⇒ Case not exposed; control exposed
⇒ Case not exposed; control not exposed

Usual Display of Matched Case-Control Data with Dichotomous Exposure

                              Controls
                      Exposure    Exposure
                      Present     Absent      Total
Cases
  Exposure Present       f           g         f+g
  Exposure Absent        h           j         h+j
  Total                 f+h         g+j         n

ODDS RATIO FOR MATCHED PAIR DICHOTOMOUS
EXPOSURE CASE-CONTROL STUDIES

OR = g / h
HEURISTIC DEMONSTRATION OF ODDS RATIO

• Suppose that for each of I strata, the layout for a fourfold table is given by:

               Cases    Controls    Total
Exposed         a_i       b_i       m_1i
Not Exposed     c_i       d_i       m_2i
Total           n_1i      n_2i      N_i

• From earlier discussion of stratified analysis, we recall that the Mantel-Haenszel odds ratio is given by:

$$\mathrm{OR}_{MH} = \frac{\sum_{i=1}^{I} a_i d_i / N_i}{\sum_{i=1}^{I} b_i c_i / N_i}$$

• Consider a single matched pair. The four possible outcomes are shown below:

1.
               Cases    Controls    Total
Exposed          1         1          2
Not exposed      0         0          0
Total            1         1          2

2.
               Cases    Controls    Total
Exposed          1         0          1
Not exposed      0         1          1
Total            1         1          2

3.
               Cases    Controls    Total
Exposed          0         1          1
Not exposed      1         0          1
Total            1         1          2

4.
               Cases    Controls    Total
Exposed          0         0          0
Not exposed      1         1          2
Total            1         1          2

Contribution to Odds Ratio from Tables of Type 1:

               Cases    Controls    Total
Exposed          1         1          2
Not exposed      0         0          0
Total            1         1          2

Contribution to Numerator:

$$\frac{a_i d_i}{N_i} = \frac{1 \times 0}{2} = 0$$

Contribution to Denominator:

$$\frac{b_i c_i}{N_i} = \frac{1 \times 0}{2} = 0$$

Contribution to Odds Ratio from Tables of Type 2:

               Cases    Controls    Total
Exposed          1         0          1
Not exposed      0         1          1
Total            1         1          2

Contribution to Numerator:

$$\frac{a_i d_i}{N_i} = \frac{1 \times 1}{2} = \frac{1}{2}$$

Contribution to Denominator:

$$\frac{b_i c_i}{N_i} = \frac{0 \times 0}{2} = 0$$

Contribution to Odds Ratio from Tables of Type 3:

               Cases    Controls    Total
Exposed          0         1          1
Not exposed      1         0          1
Total            1         1          2

Contribution to Numerator:

$$\frac{a_i d_i}{N_i} = \frac{0 \times 0}{2} = 0$$

Contribution to Denominator:

$$\frac{b_i c_i}{N_i} = \frac{1 \times 1}{2} = \frac{1}{2}$$

Contribution to Odds Ratio from Tables of Type 4:

               Cases    Controls    Total
Exposed          0         0          0
Not exposed      1         1          2
Total            1         1          2

Contribution to Numerator:

$$\frac{a_i d_i}{N_i} = \frac{0 \times 1}{2} = 0$$

Contribution to Denominator:

$$\frac{b_i c_i}{N_i} = \frac{0 \times 1}{2} = 0$$

COMPUTATION OF ODDS RATIO FROM STRATIFIED ANALYSIS OF I MATCHED PAIRS

Table Type   Number of Tables   Contribution to Numerator   Contribution to Denominator
1            f                  f × 0 = 0                   f × 0 = 0
2            g                  g × 1/2 = g/2               g × 0 = 0
3            h                  h × 0 = 0                   h × 1/2 = h/2
4            j                  j × 0 = 0                   j × 0 = 0
Total        I                  g/2                         h/2

Mantel-Haenszel odds ratio:

$$\mathrm{OR}_{MH} = \frac{g/2}{h/2} = \frac{g}{h}$$

The OR is the ratio of the numbers of pairs with discordant exposure:

(number of pairs with case exposed and control not exposed) divided by (number of pairs with case not exposed and control exposed)

Case-control pairs with the same exposure are NOT used

Matched pairs OR = g/h (also written b/c)
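The reduction of the Mantel-Haenszel estimate to g/h can be checked numerically. This is a minimal sketch that treats every pair as its own stratum; the pair counts f, g, h, j used here are hypothetical, and the function names are not from any package.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel odds ratio over strata.

    Each table is (a, b, c, d) = (exposed cases, exposed controls,
    unexposed cases, unexposed controls).
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# The four possible fourfold tables for a single matched pair.
PAIR_TABLE = {
    1: (1, 1, 0, 0),  # case exposed, control exposed
    2: (1, 0, 0, 1),  # case exposed, control not exposed
    3: (0, 1, 1, 0),  # case not exposed, control exposed
    4: (0, 0, 1, 1),  # case not exposed, control not exposed
}

def pair_strata(f, g, h, j):
    """Expand pair counts into one fourfold table per matched pair."""
    return ([PAIR_TABLE[1]] * f + [PAIR_TABLE[2]] * g
            + [PAIR_TABLE[3]] * h + [PAIR_TABLE[4]] * j)

# Hypothetical counts of each pair type.
f, g, h, j = 5, 12, 4, 9
or_mh = mantel_haenszel_or(pair_strata(f, g, h, j))
print(or_mh, g / h)
```

Changing f or j leaves the result untouched, which is the point of the derivation: concordant pairs contribute zero to both sums.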

EXAMPLE OF MATCHED PAIR CASE-CONTROL ANALYSIS USING PAIRS MODULE

• Matched Case-Control Study of Association Between Use of Oral Conjugated Estrogens and Cervical Cancer (PEPI Manual Page 137)

                               Controls
                        Estrogen Use   Estrogen Use
                        Present        Absent         Total
Cases
  Estrogen Use Present     12             43            55
  Estrogen Use Absent       7            121           128
  Total                    19            164           183

OR = 43/7 = 6.14
OUTPUT FROM PAIRS MODULE

PAIRS - Analysis of Paired Samples

Thursday, 3rd October 2002.
------------------------------------------------------------------------

DATA
Number of "case = Yes, control = No" pairs: 43
Number of "case = No, control = Yes" pairs: 7

-----------------------------------------------------------------------
**IF NO CHI SQUARE TEST IS REPORTED, USE THE TWO-TAILED P VALUE
(PAIRS doesn’t do the test unless there are enough pairs)

One-tailed P = 0.000 [ 1.05E-07 ] Two-tailed P = 0.000 [ 2.10E-07 ]

Odds ratio = 6.14 or [reciprocal]: 0.16

90% conf. interval = 2.99 to 13.19 or [reciprocals]: 0.08 to 0.33
95% conf. interval = 2.66 to 14.93 or [reciprocals]: 0.07 to 0.38
99% conf. interval = 2.13 to 18.86 or [reciprocals]: 0.05 to 0.47

Low-bias indicator of O.R. = 5.38 or 0.16

------------------------------------------------------------------------
SIGNIFICANCE TEST USED IS McNEMAR’S TEST
One Degree of Freedom Chi Square Test
p.442 Szklo

$$\chi^2_{1\,df} = \frac{(|g - h| - 1)^2}{g + h}$$

FOR THIS EXAMPLE

$$\chi^2_{McNemar} = \frac{(|43 - 7| - 1)^2}{43 + 7} = 24.5$$
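The McNemar calculation above can be reproduced with the standard library alone. This is a minimal sketch: the function names are illustrative, and the P value comes from the large-sample chi-square approximation, so it will not match the exact P values printed by PAIRS.

```python
import math

def mcnemar_chi2(g, h, continuity=True):
    """McNemar chi-square (1 df) from the two discordant-pair counts."""
    diff = abs(g - h) - (1 if continuity else 0)
    return diff ** 2 / (g + h)

def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 df.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))

chi2 = mcnemar_chi2(43, 7)   # estrogen example: g = 43, h = 7
p = chi2_1df_pvalue(chi2)
print(chi2, p)
```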
APPROXIMATE CONFIDENCE INTERVALS FOR MATCHED Odds Ratios (p.442 Szklo)

$$SE(\ln OR) = \sqrt{\frac{1}{b} + \frac{1}{c}}$$

$$95\%\ CI(\ln OR) = \ln OR \pm \left(1.96 \times SE[\ln OR]\right)$$

For 95% CI for OR take exponent

$$95\%\ CI(OR) = \exp\left[\ln(OR) \pm \left(1.96 \times \sqrt{\frac{1}{b} + \frac{1}{c}}\right)\right]$$

FOR THIS EXAMPLE:

$$\exp\left[\ln(6.14) \pm \left(1.96 \times \sqrt{\frac{1}{43} + \frac{1}{7}}\right)\right] = 2.76 \text{ to } 13.64$$

(PEPI gives a slightly different CI (2.66-14.93) because it uses exact methods)
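The approximate interval can be computed directly from the discordant-pair counts. This is a minimal sketch with an illustrative function name; it uses the unrounded odds ratio 43/7 rather than 6.14, so the last digit may differ slightly from the hand calculation.

```python
import math

def matched_or_ci(b, c, z=1.96):
    """Matched-pairs odds ratio b/c with an approximate (log-based) CI.

    b: pairs with case exposed and control not exposed
    c: pairs with case not exposed and control exposed
    """
    or_hat = b / c
    se_log = math.sqrt(1 / b + 1 / c)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper

or_hat, lower, upper = matched_or_ci(43, 7)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```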
NOTE: You cannot stratify on another variable that was not matched on; otherwise you will break the matching.

Example: If you matched only on age but stratified on sex, cases and controls within each sex would not necessarily be balanced on age. So if you want to control for both sex and age in a matched analysis, you must match on both factors in ADVANCE. Otherwise you can use conditional logistic regression, which retains the pairing but allows adjustment for other variables not matched on.
EXAMPLE FROM SCHLESSELMAN:

Table 7.19

Age and sex of three case control pairs, matched on age, but NOT matched on SEX

Pair    Case     Control
1       M 20     F 20
2       M 30     F 30
3       F 40     M 40

Ages according to SEX (no longer matched on age)

      Male                Female
Case     Control      Case     Control
20       40           40       20
30                             30
INDIVIDUALLY MATCHED CASE CONTROL PAIRS WITH STRATIFICATION on ANOTHER MATCHING VARIABLE: Matched on Source of controls in addition to other matching variables

• Estrogen-Cervical Cancer Example Shown Earlier with “Augmented” Data (controls matched according to source of controls and other matching variables)

Stratum 1: Controls Selected from Hospitalized Patients

Stratum 2: Controls Selected from Population

Hospital Controlled Study

                               Controls
                        Estrogen Use   Estrogen Use
                        Present        Absent         Total
Cases
  Estrogen Use Present     12             43            55
  Estrogen Use Absent       7            121           128
  Total                    19            164           183

Population Controlled Study

                               Controls
                        Estrogen Use   Estrogen Use
                        Present        Absent         Total
Cases
  Estrogen Use Present      9             37            46
  Estrogen Use Absent       8            104           112
  Total                    17            141           158

PAIRS - Analysis of Paired Samples
Thursday, 3rd October 2002.
------------------------------------------------------------------------

DATA
STRATUM 1
Number of "case = Yes, control = No" pairs: 43
Number of "case = No, control = Yes" pairs: 7
Number of "case = No, control = No" pairs: 121
Number of "case = Yes, control = Yes" pairs: 12
STRATUM 2
Number of "case = Yes, control = No" pairs: 37
Number of "case = No, control = Yes" pairs: 8
Number of "case = No, control = No" pairs: 104
Number of "case = Yes, control = Yes" pairs: 9

------------------------------------------------------------------------
Stratum   Odds ratio   Chi-square   DF

   1         6.14        28.82       1    P = 0.000 [ 7.95E-]

   2         4.63        20.26       1    P = 0.000 [ 6.75E-06 ]

Chi-square for heterogeneity = 0.25 DF = 1 P = 0.614

------------------------------------------------------------------------

POOLED DATA (Chi-square is continuity-corrected).

Odds ratio = 5.33 or [reciprocal]: 0.19

90% conf. interval = 3.26 to 8.84 or [reciprocals]: 0.11 to 0.31
95% conf. interval = 3.00 to 9.64 or [reciprocals]: 0.10 to 0.33
99% conf. interval = 2.54 to 11.40 or [reciprocals]: 0.09 to 0.39
Log-likelihood chi-sq. = 47.17 d.f. = 1 P = 0.000 [ 6.50E-12 ]
Pearson's chi-sq. = 43.12 d.f. = 1 P = 0.000 [ 5.16E-11 ]

IF THE STRATA ARE CLUSTERS OF RELATED OBSERVATIONS:

The above results require no modification (no positive
correlation within clusters: Eliasziw-Donner rho = -0.02).

------------------------------------------------------------------------
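The stratum-specific and pooled odds ratios in this output follow from the discordant-pair counts alone. This is a minimal sketch, assuming the pooled estimate is the Mantel-Haenszel ratio over all pairs (total g divided by total h); it does not attempt to reproduce the chi-square or heterogeneity statistics, whose formulas are not shown in the slides.

```python
# Discordant pairs per stratum: (case exposed / control not, case not / control exposed)
strata = {
    "hospital controls": (43, 7),
    "population controls": (37, 8),
}

# Stratum-specific matched odds ratios, g / h.
stratum_or = {name: g / h for name, (g, h) in strata.items()}

# Pooling over pair-level strata reduces to (sum of g) / (sum of h).
total_g = sum(g for g, _ in strata.values())
total_h = sum(h for _, h in strata.values())
pooled_or = total_g / total_h

for name, value in stratum_or.items():
    print(f"{name}: OR = {value:.3f}")
print(f"pooled: OR = {pooled_or:.3f}")
```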
CAN MATCH MORE THAN ONE CONTROL PER CASE TO INCREASE PRECISION OF THE ODDS RATIO (DECREASE STANDARD ERROR)

GIVEN A FIXED NUMBER OF CASES, THE GAIN IN PRECISION OF THE ODDS RATIO ESTIMATE DECLINES CONSIDERABLY FOR 5 OR MORE CONTROLS PER CASE

In other words, the increase in precision when matching 5 or more controls per case is minimal, and not worth the extra expense and resources required to conduct the matching.
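The diminishing return can be illustrated with a rule of thumb that is not stated in these slides and is brought in here as an assumption: under the null hypothesis, the efficiency of R controls per case relative to an unlimited number of controls is approximately R/(R+1). A minimal sketch:

```python
def relative_efficiency(r):
    """Approximate efficiency of r controls per case relative to an
    unlimited number of controls (rule of thumb, valid near OR = 1)."""
    return r / (r + 1)

previous = 0.0
for r in range(1, 8):
    eff = relative_efficiency(r)
    # The gain from each extra control shrinks quickly.
    print(f"R = {r}: efficiency = {eff:.3f}, gain = {eff - previous:.3f}")
    previous = eff
```

Under this approximation the fifth control adds only about 3 percentage points of efficiency, which is consistent with the advice above.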
INDIVIDUALLY MATCHED CASE-CONTROL STUDIES HAVING MORE THAN ONE CONTROL MATCHED TO EACH CASE

• Scenario: R controls are matched to each case

• Mantel-Haenszel Chi-Squared Statistic is shown below:

$$\chi^2 = \frac{\left(\sum_{m=1}^{R}\left[f_{1,m-1} - \frac{m}{R+1}\left(f_{1,m-1} + f_{0,m}\right)\right]\right)^2}{\sum_{m=1}^{R}\left(f_{1,m-1} + f_{0,m}\right) \times \frac{m(R+1-m)}{(R+1)^2}}$$

Where

$f_{1,x}$ = the number of sets with the case exposed and x controls in the exposed category.
$f_{0,x}$ = the number of sets with the case not exposed and x controls in the exposed category.

Mantel-Haenszel Odds Ratio:

$$\mathrm{OR}_{MH} = \frac{\sum_{m=1}^{R}(R+1-m)\,f_{1,m-1}}{\sum_{m=1}^{R} m\,f_{0,m}}$$

EXAMPLE FROM SAHAI AND KHURSHID (PAGE 131)

The data below represent a case-control study of the relationship

between history of induced abortions and tubal pregnancy. The 18
cases are women with tubal pregnancy; the controls are women not
having tubal pregnancy. Each case has 4 matched controls; and
history of induced abortions is designated with a ‘+’ indicating “yes”
and a ‘-’ indicating ‘no’.

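History of Induced Abortion

(Exposure type y,x: y = 1 if the case is exposed, 0 otherwise; x = number of exposed controls)

Exposure Type   Case   Controls (4 per case)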
0,0 - - - - -
1,1 + - + - -
1,0 + - - - -
0,0 - - - - -
0,1 - + - - -
1,0 + - - - -
1,0 + - - - -
0,0 - - - - -
1,2 + + - - +
1,1 + - + - -
1,2 + - + + -
0,0 - - - - -
1,4 + + + + +
1,1 + - - + -
1,1 + - - + -
1,1 + + - - -
0,0 - - - - -
1,2 + + - - +
SUPPOSE WE IGNORE THE MATCHING

            Exposed   Not Exposed   Total
Cases          12          6          18
Controls       16         56          72
Total          28         62          90

CASECONT - Analysis of 2 X 2 Tables for Case-Control Studies

------------------------------------------------------------------------

DATA
TABLE 1
Exposed Not exposed
Cases 12 6
Controls 16 56

------------------------------------------------------------------------
ANALYSIS OF TABLE 1: Total cases = 18 Total controls = 72

Proportion of cases exposed = 0.667 Proportion of controls exposed = 0.222

Chi-square (1 DF) = 13.272 P = 0.000 [ 2.69E-04 ]

Continuity corrected chi-sq. (Yates) = 11.279 P = 0.001 [ 7.84E-04 ]
Upton's adjusted chi-square = 13.124 P = 0.000 [ 2.91E-04 ]

Odds ratio = 7.00 [Low-bias indicator of O.R. in the population = 5.65]

90% confidence interval = 2.38 to 21.34
95% confidence interval = 2.01 to 25.34
99% confidence interval (approximate) = 1.46 to 34.94
Adjusted O.R. (0.5 added in each cell) = 6.59

Yule's Q = 0.75 Phi = 0.38

Lambda (prediction of exposure status from "caseness") = 0.21
(prediction of "caseness" from exposure status) = 0.00
------------------------------------------------------------------------
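The unmatched 2 × 2 table can be rebuilt from the 18 matched sets listed earlier. This is a minimal sketch: each set is coded as (y, x), with y = 1 for an exposed case and x the number of exposed controls out of 4; the variable names are illustrative.

```python
R = 4  # controls per case

# (case exposed?, number of exposed controls), in the order listed in the data table
sets = [
    (0, 0), (1, 1), (1, 0), (0, 0), (0, 1), (1, 0),
    (1, 0), (0, 0), (1, 2), (1, 1), (1, 2), (0, 0),
    (1, 4), (1, 1), (1, 1), (1, 1), (0, 0), (1, 2),
]

# Collapse over the matched sets, ignoring the matching.
cases_exposed = sum(y for y, _ in sets)
cases_unexposed = len(sets) - cases_exposed
controls_exposed = sum(x for _, x in sets)
controls_unexposed = R * len(sets) - controls_exposed

crude_or = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)
print(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed, crude_or)
```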
RATIONALE FOR MATCHED CHI-SQUARED STATISTIC

• As with paired data, let us consider each matched set of 1 case and R controls as a single “stratum”, which would yield the following fourfold table (NOTE: CASE-CONTROL/EXPOSURE order reversed to be consistent with PEPI modules):

            Exposed    Not Exposed        Total
Cases          y          1 - y             1
Controls       x          R - x             R
Total        x + y     R + 1 - (x + y)    R + 1

Note: y = 0 or 1 (Only one case, either exposed or not exposed)

• The tables having x = 0 and y = 0, and x = R and y = 1, are “non-informative”, analogous to what we saw for the 1-to-1 individually matched case-control design.

• The Mantel-Haenszel test and odds ratio can then be calculated in the usual stratified analysis way (e.g., using the CASECONT module).
OUTPUT FROM CASECONT FOR 12 INFORMATIVE MATCHED SETS (EXCLUDES 1 SET WHERE THE CASE AND ALL CONTROLS ARE EXPOSED, AND 5 SETS WHERE THE CASE AND ALL CONTROLS ARE NOT EXPOSED)

CASECONT - Analysis of 2 X 2 Tables for Case-Control Studies

Saturday, 21st February 1998.
------------------------------------------------------------------------

DATA
TABLE 1
Exposed Not exposed
Cases 1 0
Controls 1 3

------------------------------------------------------------------------
ANALYSIS OF TABLE 1: Total cases = 1 Total controls = 4

Proportion of cases exposed = 1.000 Proportion of controls exposed = 0.250

Chi-square (1 DF) = 1.875 P = 0.171

Continuity corrected chi-sq. (Yates) = 0.052 P = 0.819
Upton's adjusted chi-square = 1.500 P = 0.221
** WARNING: 4 cells have an expected frequency of <5.

Odds ratio = infinity. [Low-bias indicator of O.R. in the population = 1.50]

90% confidence interval (approximate) = 0.09 to infinity
95% confidence interval (approximate) = 0.06 to infinity
99% confidence interval (approximate) = 0.04 to infinity
Adjusted O.R. (0.5 added in each cell) = 7.00

Yule's Q = 1.00 Phi = 0.61

Lambda (prediction of exposure status from "caseness") = 0.50
(prediction of "caseness" from exposure status) = 0.00
------------------------------------------------------------------------
DATA
TABLE 2
Exposed Not exposed
Cases 1 0
Controls 0 4

------------------------------------------------------------------------
ANALYSIS OF TABLE 2: Total cases = 1 Total controls = 4

Proportion of cases exposed = 1.000 Proportion of controls exposed = 0.000

Chi-square (1 DF) = 5.000 P = 0.025

Continuity corrected chi-sq. (Yates) = 0.703 P = 0.402
Upton's adjusted chi-square = 4.000 P = 0.046
** WARNING: 4 cells have an expected frequency of <5.

Odds ratio = infinity. [Low-bias indicator of O.R. in the population = 4.00]

90% confidence interval (approximate) = 0.28 to infinity
95% confidence interval (approximate) = 0.20 to infinity
99% confidence interval (approximate) = 0.12 to infinity
Adjusted O.R. (0.5 added in each cell) = 27.00

Yule's Q = 1.00 Phi = 1.00

Lambda (prediction of exposure status from "caseness") = 1.00
(prediction of "caseness" from exposure status) = 1.00
------------------------------------------------------------------------
DATA
TABLE 3
Exposed Not exposed
Cases 0 1
Controls 1 3

------------------------------------------------------------------------
ANALYSIS OF TABLE 3: Total cases = 1 Total controls = 4

Proportion of cases exposed = 0.000 Proportion of controls exposed = 0.250

Chi-square (1 DF) = 0.313 P = 0.576

Continuity corrected chi-sq. (Yates) = 0.000 P = 1.000
Upton's adjusted chi-square = 0.250 P = 0.617
** WARNING: 4 cells have an expected frequency of <5.

Odds ratio = 0.00 [Low-bias indicator of O.R. in the population = 0.00]

90% confidence interval (approximate) = 0.00 to 249.77
95% confidence interval (approximate) = 0.00 to 418.72
99% confidence interval (approximate) = 0.00 to 1010.07
Adjusted O.R. (0.5 added in each cell) = 0.78

Yule's Q = -1.00 Phi = -0.25

Lambda (prediction of exposure status from "caseness") = 0.00
(prediction of "caseness" from exposure status) = 0.00
------------------------------------------------------------------------
SIMILARLY , THE INFORMATIVE TABLES ARE ENTERED
THROUGH TABLE 12

DATA
TABLE 12
Exposed Not exposed
Cases 1 0
Controls 2 2

------------------------------------------------------------------------
ANALYSIS OF TABLE 12: Total cases = 1 Total controls = 4

Proportion of cases exposed = 1.000 Proportion of controls exposed = 0.500

Chi-square (1 DF) = 0.833 P = 0.361

Continuity corrected chi-sq. (Yates) = 0.000 P = 1.000
Upton's adjusted chi-square = 0.667 P = 0.414
** WARNING: 4 cells have an expected frequency of <5.

Odds ratio = infinity. [Low-bias indicator of O.R. in the population = 0.67]

90% confidence interval (approximate) = 0.03 to infinity
95% confidence interval (approximate) = 0.02 to infinity
99% confidence interval (approximate) = 0.01 to infinity
Adjusted O.R. (0.5 added in each cell) = 3.00

Yule's Q = 1.00 Phi = 0.41

Lambda (prediction of exposure status from "caseness") = 0.00
(prediction of "caseness" from exposure status) = 0.00
------------------------------------------------------------------------
FINALLY, THE SUMMARY STRATIFIED ANALYSIS IS
PERFORMED

SUMMARY ANALYSIS OF TABLES 1 to 12

Mantel-Haenszel chi-square (DF = 1) = 16.000 P = 0.000 [ 6.33E-05 ]

continuity corrected (DF = 1) = 13.598 P = 0.000 [ 2.26E-04 ]
NOTE: Due to small numbers, M-H test is not recommended.

Mantel-Haenszel odds ratio = 33.00

90% confidence interval = 4.35 to 250.44
95% confidence interval = 2.95 to 369.18
99% confidence interval = 1.38 to 788.57
Maximum-likelihood estimate of uniform odds ratio = 78.80
90% confidence interval (Cornfield-Gart) = 6.61 to 5383.27
95% confidence interval (Cornfield-Gart) = 4.95 to 8529.62
99% confidence interval (Cornfield-Gart) = 2.93 to 19066.58

Heterogeneity of O.R.'s: chi-sq (DF: 11) = 9.11 P = 0.612

Standardized rate ratio (standard: exposed group) = 33.00
Standardized rate ratio (standard: unexposed group) not computed.
------------------------------------------------------------------------
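The summary Mantel-Haenszel results can be reproduced by treating each matched set as its own fourfold table. This is a minimal sketch using only the standard formulas for the stratified odds ratio and the uncorrected chi-square; it does not reproduce the confidence intervals or the maximum-likelihood estimates, and the names are illustrative.

```python
R = 4  # controls per case

# (case exposed?, number of exposed controls) for the 18 matched sets
sets = [
    (0, 0), (1, 1), (1, 0), (0, 0), (0, 1), (1, 0),
    (1, 0), (0, 0), (1, 2), (1, 1), (1, 2), (0, 0),
    (1, 4), (1, 1), (1, 1), (1, 1), (0, 0), (1, 2),
]

def set_to_table(y, x, r):
    """(a, b, c, d) = exposed case, exposed controls, unexposed case, unexposed controls."""
    return (y, x, 1 - y, r - x)

tables = [set_to_table(y, x, R) for y, x in sets]

# Sets where everyone or no one is exposed carry no information.
informative = [t for t in tables if 0 < t[0] + t[1] < R + 1]

num = sum(a * d / (a + b + c + d) for a, b, c, d in informative)
den = sum(b * c / (a + b + c + d) for a, b, c, d in informative)
or_mh = num / den

observed = expected = variance = 0.0
for a, b, c, d in informative:
    n = a + b + c + d
    observed += a
    expected += (a + c) * (a + b) / n
    variance += (a + c) * (b + d) * (a + b) * (c + d) / (n ** 2 * (n - 1))
chi2_mh = (observed - expected) ** 2 / variance

print(len(informative), or_mh, chi2_mh)
```

Including the non-informative sets would not change either statistic, since each contributes zero to every sum.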
THE FORMULAS STATED EARLIER ARE “SHORTCUTS” TO AVOID HAVING TO ENTER EVERY MATCHED SET AS A SEPARATE TABLE

INSTEAD, TABULATE FREQUENCY OF EXPOSURE OUTCOMES AND ENTER INTO PEPI MODULE “MATCHED”

For this example:

MATCHED - Multiple Matched Controls
Thursday, 3rd October 2002.
------------------------------------------------------------------------

DATA
Number of controls per case = 4
Case '+' and 0 control '+': 3 case-control sets
Case '+' and 1 control '+': 5 case-control sets
Case '+' and 2 controls '+': 3 case-control sets
Case '+' and 3 controls '+': 0 case-control set
Case '+' and 4 controls '+': 1 case-control set
Case '-' and 0 control '+': 5 case-control sets
Case '-' and 1 control '+': 1 case-control set
Case '-' and 2 controls '+': 0 case-control set
Case '-' and 3 controls '+': 0 case-control set
Case '-' and 4 controls '+': 0 case-control set

------------------------------------------------------------------------

Mantel-Haenszel chi-square (1 DF)

without continuity correction = 16.000 P = 0.000 [ 6.33E-05 ]

Walter's test
without continuity correction: z = 4.619 P = 0.000 [ 3.86E-06 ]
with continuity correction: z = 4.547 P = 0.000 [ 5.45E-06 ]

Mantel-Haenszel estimate of odds ratio = 32.97

Approximate 90% confidence interval = 4.34 to 250.19
Approximate 95% confidence interval = 2.95 to 368.83
Approximate 99% confidence interval = 1.38 to 787.81

Maximum-likelihood estimate of odds ratio = 22.57

Approximate 90% confidence interval = 3.94 to 129.37
Approximate 95% confidence interval = 2.82 to 180.74
Approximate 99% confidence interval = 1.47 to 347.57

Low-bias estimator of odds ratio = 16.50

------------------------------------------------------------------------
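The shortcut formulas for R controls per case can be applied directly to the tabulated frequencies. This is a minimal sketch with an illustrative function name; `f1[x]` and `f0[x]` hold the number of sets with x exposed controls when the case is exposed or not exposed. It covers only the uncorrected chi-square and the Mantel-Haenszel odds ratio, not Walter's test or the maximum-likelihood estimate.

```python
def matched_mh(f1, f0, r):
    """Mantel-Haenszel chi-square and odds ratio for 1:r matched sets.

    f1[x]: sets with the case exposed and x exposed controls (x = 0..r)
    f0[x]: sets with the case not exposed and x exposed controls
    """
    diff = var = or_num = or_den = 0.0
    for m in range(1, r + 1):
        total = f1[m - 1] + f0[m]          # sets with m exposed members in all
        diff += f1[m - 1] - m / (r + 1) * total
        var += total * m * (r + 1 - m) / (r + 1) ** 2
        or_num += (r + 1 - m) * f1[m - 1]
        or_den += m * f0[m]
    return diff ** 2 / var, or_num / or_den

# Frequencies from the tubal pregnancy example (4 controls per case).
f1 = [3, 5, 3, 0, 1]
f0 = [5, 1, 0, 0, 0]
chi2, or_mh = matched_mh(f1, f0, 4)
print(chi2, or_mh)
```

The formula gives an odds ratio of 33, in line with the CASECONT summary; the MATCHED output prints 32.97.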
FOR SITUATIONS WHERE THERE IS A VARIABLE NUMBER OF CONTROLS MATCHED TO EACH CASE

• This is an important situation since intentions to match each case with R controls are often not accomplished successfully.

• Mantel-Haenszel Chi-Squared Test (the outer sum runs over each number of controls per case R that occurs, with the frequencies f tabulated separately for each R):

$$\chi^2 = \frac{\left(\sum_{R}\sum_{m=1}^{R}\left[f_{1,m-1} - \frac{m}{R+1}\left(f_{1,m-1} + f_{0,m}\right)\right]\right)^2}{\sum_{R}\sum_{m=1}^{R}\left(f_{1,m-1} + f_{0,m}\right) \times \frac{m(R+1-m)}{(R+1)^2}}$$

EXAMPLE OF INDIVIDUALLY MATCHED CASE-CONTROL STUDY WITH VARYING NUMBER OF CONTROLS PER CASE (FROM PEPI MANUAL PAGE 116)

• Cases are persons with myocardial infarctions (MI’s); exposure is coffee consumption at level of 6+ cups per day. Summary tables are shown below.

Cases with 1 matched control:

                 Number of exposed controls
Cases               0       1     Total
Exposed             8       8       16
Not Exposed         8       3       11
Total              16      11       27

Cases with 2 Matched Controls:

                 Number of exposed controls
Cases               0       1       2     Total
Exposed            16      23       4       43
Not Exposed        20      22       3       45
Total              36      45       7       88

MATCHED - Multiple Matched Controls
Thursday, 3rd October 2002.
------------------------------------------------------------------------

DATA
Total controls per case = 1
Case '+' and 0 control '+': 8 case-control sets
Case '+' and 1 control '+': 8 case-control sets
Case '-' and 0 control '+': 8 case-control sets
Case '-' and 1 control '+': 3 case-control sets
Total controls per case = 2
Case '+' and 0 control '+': 16 case-control sets
Case '+' and 1 control '+': 23 case-control sets
Case '+' and 2 controls '+': 4 case-control sets
Case '-' and 0 control '+': 20 case-control sets
Case '-' and 1 control '+': 22 case-control sets
Case '-' and 2 controls '+': 3 case-control sets

------------------------------------------------------------------------

Mantel-Haenszel chi-square (1 DF)

without continuity correction = 7.792 P = 0.005 [ 5.25E-03 ]

Walter's test
without continuity correction: z = 2.714 P = 0.007 [ 6.64E-03 ]
with continuity correction: z = 2.672 P = 0.008 [ 7.54E-03 ]

Mantel-Haenszel estimate of odds ratio = 2.06

Approximate 90% confidence interval = 1.35 to 3.14
Approximate 95% confidence interval = 1.25 to 3.41
Approximate 99% confidence interval = 1.06 to 3.99

Maximum-likelihood estimate of odds ratio = 1.98

Approximate 90% confidence interval = 1.32 to 2.99
Approximate 95% confidence interval = 1.22 to 3.23
Approximate 99% confidence interval = 1.04 to 3.77

Low-bias estimator of odds ratio = 1.96

------------------------------------------------------------------------
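The chi-square and odds ratio in this output can be checked by extending the sums over each matching ratio R. This is a minimal sketch with illustrative names. The chi-square follows the formula given for a variable number of controls; the odds-ratio line is an assumption, since the slides do not state it for this case: it applies the general Mantel-Haenszel weights 1/(R+1) to each set.

```python
def matched_mh_variable(groups):
    """Mantel-Haenszel chi-square and odds ratio with varying controls per case.

    groups maps R (controls per case) to (f1, f0), where f1[x] and f0[x]
    count sets with x exposed controls and an exposed / unexposed case.
    """
    diff = var = or_num = or_den = 0.0
    for r, (f1, f0) in groups.items():
        for m in range(1, r + 1):
            total = f1[m - 1] + f0[m]
            diff += f1[m - 1] - m * total / (r + 1)
            var += total * m * (r + 1 - m) / (r + 1) ** 2
            # Each set is a stratum of size r + 1, hence the 1 / (r + 1) weight.
            or_num += (r + 1 - m) * f1[m - 1] / (r + 1)
            or_den += m * f0[m] / (r + 1)
    return diff ** 2 / var, or_num / or_den

# Coffee consumption and MI example.
groups = {
    1: ([8, 8], [8, 3]),
    2: ([16, 23, 4], [20, 22, 3]),
}
chi2, or_mh = matched_mh_variable(groups)
print(round(chi2, 3), round(or_mh, 2))
```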

Factors Affecting Career Choices Among Abm Senior High School Students in A Catholic College
94% (17)
Factors Affecting Career Choices Among Abm Senior High School Students in A Catholic College
52 pages
Case Control Study PPT Dr.e S Reddy
No ratings yet
Case Control Study PPT Dr.e S Reddy
41 pages
Case Control 2025
No ratings yet
Case Control 2025
40 pages
Practical Research 2 Participants of The Study
88% (8)
Practical Research 2 Participants of The Study
3 pages
Quality Control and Realiability - All in One
No ratings yet
Quality Control and Realiability - All in One
430 pages
2016 Pearce Case Control
No ratings yet
2016 Pearce Case Control
4 pages
Diploma in Teacher Education Research Skills 2024 Notes
No ratings yet
Diploma in Teacher Education Research Skills 2024 Notes
36 pages
Introduction To Statistics 1662031282
100% (1)
Introduction To Statistics 1662031282
936 pages
Alemayehu Delelegn MSC AHN Research Proposal
No ratings yet
Alemayehu Delelegn MSC AHN Research Proposal
33 pages
Psycology of Infant and Child Feeding
No ratings yet
Psycology of Infant and Child Feeding
6 pages
Sampling Methods for Managers
No ratings yet
Sampling Methods for Managers
26 pages
Psychosocial Care and Support in Disaster Management
100% (1)
Psychosocial Care and Support in Disaster Management
14 pages
Health Care Admin & Community Health
No ratings yet
Health Care Admin & Community Health
144 pages
Social Behavior Change Communication Notes
No ratings yet
Social Behavior Change Communication Notes
43 pages
UNIT-5 Natural Acceptance of Human Values
100% (1)
UNIT-5 Natural Acceptance of Human Values
10 pages
A Research Proposal On Factors Affecting Student Enrollment in The University of Zambia Extension Studies Program
No ratings yet
A Research Proposal On Factors Affecting Student Enrollment in The University of Zambia Extension Studies Program
44 pages
Principles and Practice of Clinical Research 4th Edition John I. Gallin - Ebook PDF Download
No ratings yet
Principles and Practice of Clinical Research 4th Edition John I. Gallin - Ebook PDF Download
75 pages
The Role of A Social Worker in Dealing With Disability
100% (1)
The Role of A Social Worker in Dealing With Disability
4 pages
Modul 7 - Behavioral Diagnosis
100% (2)
Modul 7 - Behavioral Diagnosis
19 pages
Applied Statistics Manual A Guide To Improving and Sustaining Quality With Minitab Matthew A Barsalou Joel Smith Instant Download
No ratings yet
Applied Statistics Manual A Guide To Improving and Sustaining Quality With Minitab Matthew A Barsalou Joel Smith Instant Download
85 pages
Actinobacillos - Research 2025
No ratings yet
Actinobacillos - Research 2025
28 pages
Estimation Theory and Problem
No ratings yet
Estimation Theory and Problem
5 pages
AP Statistics - Chapter 7 Notes: Sampling Distributions 7.1 - What Is A Sampling Distribution?
No ratings yet
AP Statistics - Chapter 7 Notes: Sampling Distributions 7.1 - What Is A Sampling Distribution?
1 page
Attitude Likert Scale Statements
No ratings yet
Attitude Likert Scale Statements
3 pages
Kruskal-Wallis Tests (Simulation)
No ratings yet
Kruskal-Wallis Tests (Simulation)
15 pages
Stats Test #3 Word Cheat Sheet
No ratings yet
Stats Test #3 Word Cheat Sheet
3 pages
Designing The Sampling Plan
No ratings yet
Designing The Sampling Plan
38 pages
Indicator of Health
100% (2)
Indicator of Health
45 pages
Work Sampling: Method & Benefits
No ratings yet
Work Sampling: Method & Benefits
16 pages
Sample Size Calculations Thabane
No ratings yet
Sample Size Calculations Thabane
42 pages
Adjusted Control Limits For P Charts
No ratings yet
Adjusted Control Limits For P Charts
9 pages
Health Statistics
100% (1)
Health Statistics
8 pages
Group Discussion
No ratings yet
Group Discussion
20 pages
Health Management Information System
No ratings yet
Health Management Information System
46 pages
Nurses of India Journal Overview
No ratings yet
Nurses of India Journal Overview
2 pages
2.0case Control PPT F
No ratings yet
2.0case Control PPT F
26 pages