Contingency Tables Using R
What is a Contingency Table?
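A contingency table (also called a cross-tabulation or crosstab) is a table of counts that displays the joint
frequency distribution of two or more categorical variables. Each cell holds the number of observations that
fall into one combination of categories, which makes it easy to examine the relationship between the variables.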
Creating Contingency Tables in R
R provides several ways to create and work with contingency tables. The most common functions include:
- table()
- xtabs()
- addmargins()
- prop.table()
- chisq.test()
Example Dataset
Gender <- c('Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female')
Major <- c('Math', 'Biology', 'Math', 'CS', 'CS', 'Math', 'Biology', 'CS')
1. Create a Contingency Table
table(Gender, Major)
Output:
        Major
Gender   Biology CS Math
  Female       1  2    1
  Male         1  1    2
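Individual counts can be read off the table by row and column label. The following is a minimal sketch using the same vectors as the example dataset; the names tab, female_cs and male_counts are illustrative choices, not part of the original example.

```r
Gender <- c('Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female')
Major <- c('Math', 'Biology', 'Math', 'CS', 'CS', 'Math', 'Biology', 'CS')

# Store the table so it can be reused (the name `tab` is illustrative)
tab <- table(Gender, Major)

# Look up a single cell by its row and column labels
female_cs <- tab["Female", "CS"]

# Extract one full row: the count of each major among male students
male_counts <- tab["Male", ]

# Number of Gender levels by number of Major levels
dim(tab)
```

Storing the table in a variable avoids recomputing table(Gender, Major) in every later step.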
2. Add Margins (Row & Column Totals)
addmargins(table(Gender, Major))
Output:
        Major
Gender   Biology CS Math Sum
  Female       1  2    1   4
  Male         1  1    2   4
  Sum          2  3    3   8
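When only one set of totals is needed, addmargins() accepts a margin argument, and margin.table() returns the totals on their own. A short sketch under the same example data; the variable names are illustrative.

```r
Gender <- c('Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female')
Major <- c('Math', 'Biology', 'Math', 'CS', 'CS', 'Math', 'Biology', 'CS')
tab <- table(Gender, Major)

# margin = 1 extends the row dimension: a "Sum" row of column totals is appended
plus_sum_row <- addmargins(tab, margin = 1)

# margin = 2 extends the column dimension: a "Sum" column of row totals is appended
plus_sum_col <- addmargins(tab, margin = 2)

# The totals alone, without the body of the table
gender_totals <- margin.table(tab, 1)  # one total per Gender level
major_totals <- margin.table(tab, 2)   # one total per Major level

print(plus_sum_row)
print(plus_sum_col)
```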
3. Proportional Table
- Overall Proportions:
prop.table(table(Gender, Major))
- Row-wise Proportions:
prop.table(table(Gender, Major), 1)
- Column-wise Proportions:
prop.table(table(Gender, Major), 2)
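The three variants differ only in the denominator: the grand total, the row total, or the column total. The sketch below, using the same example data with illustrative variable names, checks that each version sums to 1 over the expected dimension.

```r
Gender <- c('Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female')
Major <- c('Math', 'Biology', 'Math', 'CS', 'CS', 'Math', 'Biology', 'CS')
tab <- table(Gender, Major)

overall <- prop.table(tab)    # each cell divided by the grand total (8)
by_row <- prop.table(tab, 1)  # each cell divided by its row (Gender) total
by_col <- prop.table(tab, 2)  # each cell divided by its column (Major) total

# Round for display only; keep the unrounded values for further calculation
print(round(by_row, 2))

# Sanity checks: all cells, each row, and each column should sum to 1
sum(overall)
rowSums(by_row)
colSums(by_col)
```

In the row-wise table, for instance, the Female/CS cell is 2/4 = 0.5: half of the female students in the sample chose CS.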
4. Using xtabs() Function
If your data is in a data frame:
data <- data.frame(Gender, Major)
xtabs(~ Gender + Major, data = data)
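The formula interface is also useful when the data are already aggregated. In this sketch the long-format data frame counts is produced with as.data.frame() purely for illustration; putting its Freq column on the left-hand side of the formula tells xtabs() to sum those counts instead of counting rows.

```r
Gender <- c('Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female')
Major <- c('Math', 'Biology', 'Math', 'CS', 'CS', 'Math', 'Biology', 'CS')
data <- data.frame(Gender, Major)

# One-sided formula: count the rows for each Gender/Major combination
from_rows <- xtabs(~ Gender + Major, data = data)

# Long format: one row per combination, with the count in a Freq column
counts <- as.data.frame(from_rows)

# Two-sided formula: sum the Freq column within each combination
from_counts <- xtabs(Freq ~ Gender + Major, data = counts)

print(counts)
print(from_counts)
```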
5. Chi-Square Test for Independence
To check if there is a significant association between two categorical variables:
chisq.test(table(Gender, Major))
Output:
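	Pearson's Chi-squared test

data:  table(Gender, Major)
X-squared = 0.66667, df = 2, p-value = 0.7165

With a p-value of 0.7165, the test gives no evidence of an association between Gender and Major in this sample.
R also warns that the chi-squared approximation may be incorrect, because the expected counts are very small.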
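With only eight observations, it is worth inspecting the expected counts before relying on the chi-squared approximation. The sketch below pulls the pieces out of the test result and, as one possible alternative for small samples, runs Fisher's exact test from base R. The variable names and the use of suppressWarnings() are choices made for this illustration.

```r
Gender <- c('Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female')
Major <- c('Math', 'Biology', 'Math', 'CS', 'CS', 'Math', 'Biology', 'CS')
tab <- table(Gender, Major)

# chisq.test() warns when expected counts are small; the warning is
# silenced here only so the components can be inspected cleanly
res <- suppressWarnings(chisq.test(tab))

res$statistic  # the X-squared value
res$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
res$p.value

# Expected counts under independence: row total * column total / grand total
res$expected

# Fisher's exact test does not rely on the large-sample approximation
exact <- fisher.test(tab)
exact$p.value
```

A common rule of thumb is that the chi-squared approximation is unreliable when expected counts fall below 5, which is the case for every cell here.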
Save to PDF (Optional)
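# pdf() opens a graphics device, so only plots are written to the file;
# print() output would go to the console and leave the PDF without content.
pdf("contingency_table_output.pdf")
mosaicplot(table(Gender, Major), main = "Gender by Major")
dev.off()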