Apriori Algorithm Example (Grocery Items)
Dataset: 5 Transactions
TID   Items
T1    Milk, Bread, Butter
T2    Milk, Bread, Beer
T3    Bread, Butter, Eggs
T4    Milk, Bread, Eggs
T5    Bread, Butter
Step 1: Count 1-itemsets
Item     Count   Support (count/5)   Frequent?
Milk     3       0.60                ✅
Bread    5       1.00                ✅
Butter   3       0.60                ✅
Beer     1       0.20                ❌
Eggs     2       0.40                ✅
With min_support = 0.4 (an itemset must appear in at least 2 of the 5 transactions), Beer is discarded.
Frequent 1-itemsets (L1):
{Milk}, {Bread}, {Butter}, {Eggs}
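The counting in Step 1 can be sketched in a few lines of plain Python. This is only an illustration of the table above, not library code; the name MIN_COUNT is an assumption introduced here, expressing min_support = 0.4 as a transaction count (2 of 5).

```python
from collections import Counter

# Transactions T1..T5 from the dataset above.
transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Beer"},
    {"Bread", "Butter", "Eggs"},
    {"Milk", "Bread", "Eggs"},
    {"Bread", "Butter"},
]
MIN_COUNT = 2  # hypothetical name: min_support 0.4 * 5 transactions

# Count how many transactions contain each item.
item_counts = Counter(item for t in transactions for item in t)

# Keep only the items that reach the threshold (the frequent 1-itemsets).
frequent_1 = {item for item, n in item_counts.items() if n >= MIN_COUNT}

for item, n in sorted(item_counts.items()):
    print(f"{item}: count={n}, support={n / len(transactions):.2f}")
```

Working with counts instead of fractions avoids any floating-point comparison at the threshold.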
Step 2: Generate 2-itemsets from L1
All pairs formed from the frequent 1-itemsets:
2-itemset          Count   Support   Frequent?
{Milk, Bread}      3       0.60      ✅
{Milk, Butter}     1       0.20      ❌
{Milk, Eggs}       1       0.20      ❌
{Bread, Butter}    3       0.60      ✅
{Bread, Eggs}      2       0.40      ✅
{Butter, Eggs}     1       0.20      ❌
Frequent 2-itemsets (L2):
{Milk, Bread}, {Bread, Butter}, {Bread, Eggs}
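Step 2 can be reproduced with itertools.combinations. The sketch below is a plain-Python illustration; the helper count_support and the constant MIN_COUNT are hypothetical names chosen for this example.

```python
from itertools import combinations

transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Beer"},
    {"Bread", "Butter", "Eggs"},
    {"Milk", "Bread", "Eggs"},
    {"Bread", "Butter"},
]
MIN_COUNT = 2  # hypothetical name: min_support 0.4 * 5 transactions
frequent_1 = ["Bread", "Butter", "Eggs", "Milk"]  # L1 from Step 1


def count_support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Every pair of frequent items is a candidate 2-itemset.
candidates_2 = [frozenset(pair) for pair in combinations(frequent_1, 2)]
counts_2 = {c: count_support(c, transactions) for c in candidates_2}
frequent_2 = {c for c, n in counts_2.items() if n >= MIN_COUNT}

for c, n in counts_2.items():
    print(sorted(c), n, n / len(transactions))
```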
Step 3: Generate 3-itemsets from L2
Candidates formed by combining the frequent 2-itemsets:
3-itemset                Count   Support   Frequent?
{Milk, Bread, Butter}    1       0.20      ❌
{Milk, Bread, Eggs}      1       0.20      ❌
{Bread, Butter, Eggs}    1       0.20      ❌
All 3-itemsets have support < 0.4, so they are discarded. No frequent 3-itemsets exist, and the algorithm stops here.
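The table counts every 3-item candidate directly. In the usual formulation of Apriori, a candidate is also pruned before counting if any of its 2-item subsets is not frequent. The sketch below illustrates that join-and-prune step for this example; the helper name all_subsets_frequent is an assumption made for illustration.

```python
from itertools import combinations

# Frequent 2-itemsets (L2) from Step 2.
frequent_2 = {
    frozenset({"Milk", "Bread"}),
    frozenset({"Bread", "Butter"}),
    frozenset({"Bread", "Eggs"}),
}

# Join step: union pairs of frequent 2-itemsets that yield exactly 3 items.
joined = {a | b for a, b in combinations(frequent_2, 2) if len(a | b) == 3}


def all_subsets_frequent(candidate, frequent_prev, size):
    """True if every `size`-item subset of `candidate` is in `frequent_prev`."""
    return all(frozenset(s) in frequent_prev for s in combinations(candidate, size))


# Prune step: drop candidates that contain an infrequent 2-itemset.
pruned = {c for c in joined if all_subsets_frequent(c, frequent_2, 2)}

print(sorted(sorted(c) for c in joined))  # the three candidates from the table
print(pruned)                             # empty: nothing survives pruning
```

Here each candidate contains one infrequent pair ({Milk, Butter}, {Milk, Eggs}, or {Butter, Eggs}), so pruning reaches the same conclusion as the support counts without scanning the transactions.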
Final Frequent Itemsets:
Itemset            Support
{Milk}             0.60
{Bread}            1.00
{Butter}           0.60
{Eggs}             0.40
{Milk, Bread}      0.60
{Bread, Butter}    0.60
{Bread, Eggs}      0.40
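Putting the three steps together, the whole level-wise search fits in one short function. This is a minimal teaching sketch under the assumptions of this example (small in-memory data, threshold given as a count); apriori_sketch is a hypothetical name, not a library function.

```python
from itertools import combinations


def apriori_sketch(transactions, min_count):
    """Return {frozenset: count} for every itemset with count >= min_count."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the surviving candidates and keep the frequent ones.
        counted = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {c: n for c, n in counted.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Beer"},
    {"Bread", "Butter", "Eggs"},
    {"Milk", "Bread", "Eggs"},
    {"Bread", "Butter"},
]
result = apriori_sketch(transactions, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n / len(transactions))
```

On the grocery data this returns the same seven itemsets as the table above.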
Sample Association Rules
Using efficient-apriori
pip install efficient-apriori
from efficient_apriori import apriori
# Step 1: Define transactions
transactions = [
    ('Milk', 'Bread', 'Butter'),
    ('Milk', 'Bread'),  # T2; Beer is left out because it falls below min_support
    ('Bread', 'Butter', 'Eggs'),
    ('Milk', 'Bread', 'Eggs'),
    ('Bread', 'Butter'),
]
# Step 2: Run Apriori
itemsets, rules = apriori(transactions, min_support=0.4, min_confidence=0.6)
# Step 3: Print Frequent Itemsets
print("Frequent Itemsets:")
for k in itemsets:
    print(f"{k}-itemsets: {itemsets[k]}")
OUTPUT
1-itemsets: {('Milk',): 3, ('Bread',): 5, ('Butter',): 3, ('Eggs',): 2}
2-itemsets: {('Milk', 'Bread'): 3, ('Bread', 'Butter'): 3, ('Bread', 'Eggs'): 2}
# Step 4: Print Association Rules
print("\nAssociation Rules:")
for rule in rules:
    print(rule)
OUTPUT
{Milk} -> {Bread} (conf: 1.0, supp: 0.6, lift: 1.0)
{Butter} -> {Bread} (conf: 1.0, supp: 0.6, lift: 1.0)
{Bread} -> {Butter} (conf: 0.6, supp: 0.6, lift: 1.0)
{Bread} -> {Milk} (conf: 0.6, supp: 0.6, lift: 1.0)
{Eggs} -> {Bread} (conf: 1.0, supp: 0.4, lift: 1.0)
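The numbers in these rules can be checked by hand from the support counts. The sketch below uses the standard definitions: support(A -> C) = count(A ∪ C) / N, confidence = count(A ∪ C) / count(A), and lift = confidence / support(C). The function name rule_metrics is a hypothetical helper for this example.

```python
# Support counts taken from the frequent-itemset tables above.
n_transactions = 5
count = {
    frozenset({"Milk"}): 3,
    frozenset({"Bread"}): 5,
    frozenset({"Butter"}): 3,
    frozenset({"Milk", "Bread"}): 3,
    frozenset({"Bread", "Butter"}): 3,
}


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    both = count[a | c]
    support = both / n_transactions
    confidence = both / count[a]
    lift = confidence / (count[c] / n_transactions)
    return support, confidence, lift


print(rule_metrics({"Milk"}, {"Bread"}))
print(rule_metrics({"Bread"}, {"Butter"}))
```

Every rule in this dataset has lift 1.0 because Bread appears in all five transactions, so knowing that a basket contains Bread (or predicting that it does) carries no extra information.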
Using mlxtend
pip install mlxtend
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules
# Step 1: Sample Transactions
transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Milk', 'Bread'],  # T2; Beer is left out because it falls below min_support
    ['Bread', 'Butter', 'Eggs'],
    ['Milk', 'Bread', 'Eggs'],
    ['Bread', 'Butter'],
]
# Step 2: One-Hot Encode the transactions
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)
print("One-Hot Encoded Transaction DataFrame:")
print(df)
# Step 3: Generate Frequent Itemsets
frequent_itemsets = apriori(df, min_support=0.4, use_colnames=True)
print("\nFrequent Itemsets (min_support=0.4):")
print(frequent_itemsets)
# Step 4: Generate Association Rules
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.6)
print("\nAssociation Rules (min_confidence=0.6):")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])
OUTPUT
   Bread  Butter   Eggs   Milk
0   True    True  False   True
1   True   False  False   True
2   True    True   True  False
3   True   False   True   True
4   True    True  False  False
   support         itemsets
0     1.00          [Bread]
1     0.60           [Milk]
2     0.60         [Butter]
3     0.40           [Eggs]
4     0.60  [Bread, Butter]
5     0.60    [Milk, Bread]
6     0.40    [Bread, Eggs]
  antecedents consequents  support  confidence  lift
0      (Milk)     (Bread)      0.6        1.00  1.00
1    (Butter)     (Bread)      0.6        1.00  1.00
2     (Bread)    (Butter)      0.6        0.60  1.00
3     (Bread)      (Milk)      0.6        0.60  1.00
4      (Eggs)     (Bread)      0.4        1.00  1.00