CSE 548 / AMS 542: Analysis of Algorithms
Lecture 3
( Divide-and-Conquer Algorithms:
Matrix Multiplication )
Rezaul A. Chowdhury
Department of Computer Science
SUNY Stony Brook
Fall 2023
Iterative Matrix Multiplication
$z_{ij} = \sum_{k=1}^{n} x_{ik} \, y_{kj}$
Iter-MM ( Z, X, Y )    { X, Y, Z are n × n matrices, where n is a positive integer }
  for i ← 1 to n do
    for j ← 1 to n do
      Z[ i ][ j ] ← 0
      for k ← 1 to n do
        Z[ i ][ j ] ← Z[ i ][ j ] + X[ i ][ k ] ⋅ Y[ k ][ j ]
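The triple loop of Iter-MM can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: it assumes matrices are stored as lists of lists with 0-based indices, and the function name `iter_mm` is made up for this example. Unlike the pseudocode, it allocates and returns Z instead of taking it as a parameter.

```python
def iter_mm(X, Y):
    """Multiply two n x n matrices (lists of lists) with the classical triple loop."""
    n = len(X)
    # Z starts as the all-zero n x n matrix, as in the "Z[i][j] <- 0" step.
    Z = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Accumulate the k-th term of z_ij = sum_k x_ik * y_kj.
                Z[i][j] += X[i][k] * Y[k][j]
    return Z


print(iter_mm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The innermost statement runs n · n · n times, which is where the Θ(n^3) running time of the classical algorithm comes from.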
Recursive ( Divide & Conquer ) Matrix Multiplication
$Z = X \times Y$, where $X$, $Y$, and $Z$ are $n \times n$ matrices.
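Partition each $n \times n$ matrix into four $n/2 \times n/2$ blocks:

$\begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix} = \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \times \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix} = \begin{bmatrix} X_{11} Y_{11} + X_{12} Y_{21} & X_{11} Y_{12} + X_{12} Y_{22} \\ X_{21} Y_{11} + X_{22} Y_{21} & X_{21} Y_{12} + X_{22} Y_{22} \end{bmatrix}$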
Recursive ( Divide & Conquer ) Matrix Multiplication
Rec-MM ( X, Y )    { X and Y are n × n matrices, where n = 2^k for integer k ≥ 0 }
  Let Z be a new n × n matrix
  if n = 1 then
    Z ← X ⋅ Y
  else
    Z11 ← Rec-MM ( X11, Y11 ) + Rec-MM ( X12, Y21 )
    Z12 ← Rec-MM ( X11, Y12 ) + Rec-MM ( X12, Y22 )
    Z21 ← Rec-MM ( X21, Y11 ) + Rec-MM ( X22, Y21 )
    Z22 ← Rec-MM ( X21, Y12 ) + Rec-MM ( X22, Y22 )
  endif
  return Z
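A possible Python rendering of Rec-MM is sketched below. It assumes list-of-lists matrices whose size is a power of two; the helper names `split`, `add`, and `join` are hypothetical and introduced only for this example. The helpers copy blocks for clarity, so this shows the recursion structure and is not an efficient implementation.

```python
def split(M):
    """Return the four (n/2 x n/2) blocks M11, M12, M21, M22 of M."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])


def add(A, B):
    """Entry-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def join(Z11, Z12, Z21, Z22):
    """Assemble four blocks into one matrix."""
    top = [left + right for left, right in zip(Z11, Z12)]
    bottom = [left + right for left, right in zip(Z21, Z22)]
    return top + bottom


def rec_mm(X, Y):
    """Divide-and-conquer product of two n x n matrices, n a power of two."""
    n = len(X)
    if n == 1:
        return [[X[0][0] * Y[0][0]]]
    X11, X12, X21, X22 = split(X)
    Y11, Y12, Y21, Y22 = split(Y)
    # Eight recursive products and four block sums, as in the pseudocode.
    Z11 = add(rec_mm(X11, Y11), rec_mm(X12, Y21))
    Z12 = add(rec_mm(X11, Y12), rec_mm(X12, Y22))
    Z21 = add(rec_mm(X21, Y11), rec_mm(X22, Y21))
    Z22 = add(rec_mm(X21, Y12), rec_mm(X22, Y22))
    return join(Z11, Z12, Z21, Z22)


print(rec_mm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```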
Number of recursive matrix products: 8
Number of matrix sums: 4

$T(n) = \begin{cases} \Theta(1), & \text{if } n = 1, \\ 8\,T\!\left(\frac{n}{2}\right) + \Theta(n^2), & \text{otherwise} \end{cases} \;=\; \Theta(n^3)$
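One way to see why this recurrence solves to $\Theta(n^3)$ is to unroll it. The sketch below assumes $n$ is a power of two and writes the $\Theta(n^2)$ term as $c\,n^2$ for some constant $c > 0$; both are simplifying assumptions made for this illustration.

```latex
\begin{align*}
T(n) &= \sum_{i=0}^{\log_2 n - 1} 8^i \cdot c \left( \frac{n}{2^i} \right)^2 + 8^{\log_2 n}\, T(1) \\
     &= c\,n^2 \sum_{i=0}^{\log_2 n - 1} 2^i + n^3\, T(1) \\
     &= c\,n^2 (n - 1) + n^3\, T(1) \\
     &= \Theta(n^3).
\end{align*}
```

The work per level doubles as the recursion goes down, so the leaves dominate and the recursive algorithm is asymptotically no faster than the iterative one.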
Strassen’s Algorithm for Matrix Multiplication ( MM )
In 1968 Volker Strassen came up with a recursive MM algorithm that
runs asymptotically faster than the classical $\Theta(n^3)$ algorithm.
In each level of recursion the algorithm uses:
  7 recursive matrix multiplications ( instead of 8 ), and
  18 matrix additions/subtractions ( instead of 4 ).
Strassen’s MM: 10 Matrix Additions/Subtractions
row sums (4), column diffs (4), diagonal sums (2)
Row sums:       $X_{r1} = X_{11} + X_{12}$,  $X_{r2} = X_{21} + X_{22}$,  $Y_{r1} = Y_{11} + Y_{12}$,  $Y_{r2} = Y_{21} + Y_{22}$
Column diffs:   $X_{c1} = X_{11} - X_{21}$,  $X_{c2} = X_{12} - X_{22}$,  $Y_{c1} = Y_{11} - Y_{21}$,  $Y_{c2} = Y_{12} - Y_{22}$
Diagonal sums:  $X_{d1} = X_{11} + X_{22}$,  $Y_{d1} = Y_{11} + Y_{22}$
Strassen’s MM: 7 Matrix Products
$P_{11} = X_{11} \cdot Y_{c2} = X_{11} Y_{12} - X_{11} Y_{22}$
$P_{22} = X_{22} \cdot Y_{c1} = X_{22} Y_{11} - X_{22} Y_{21}$
$P_{r1} = X_{r1} \cdot Y_{22} = X_{11} Y_{22} + X_{12} Y_{22}$
$P_{r2} = X_{r2} \cdot Y_{11} = X_{21} Y_{11} + X_{22} Y_{11}$
$P_{c1} = X_{c1} \cdot Y_{r1} = X_{11} Y_{11} + X_{11} Y_{12} - X_{21} Y_{11} - X_{21} Y_{12}$
$P_{c2} = X_{c2} \cdot Y_{r2} = X_{12} Y_{21} + X_{12} Y_{22} - X_{22} Y_{21} - X_{22} Y_{22}$
$P_{d1} = X_{d1} \cdot Y_{d1} = X_{11} Y_{11} + X_{11} Y_{22} + X_{22} Y_{11} + X_{22} Y_{22}$
[Figure: 4 × 4 grids, one per product, marking with + and − which terms $X_{ij} Y_{kl}$ each product contributes.]
Strassen’s Matrix Multiplication
Each block of $Z$ is a signed sum of the seven products $P_{11}, P_{22}, P_{r1}, P_{r2}, P_{c1}, P_{c2}, P_{d1}$:
$Z_{11} = -P_{22} - P_{r1} + P_{c2} + P_{d1}$
$Z_{12} = P_{11} + P_{r1}$
$Z_{21} = -P_{22} + P_{r2}$
$Z_{22} = P_{11} - P_{r2} - P_{c1} + P_{d1}$
[Figure sequence: for each of $Z_{11}$, $Z_{12}$, $Z_{21}$, $Z_{22}$, a 4 × 4 grid of the terms $X_{ij} Y_{kl}$ showing how the products are added and subtracted one at a time until only the required terms remain.]
Sums:
$X_{r1} = X_{11} + X_{12}$        $Y_{r1} = Y_{11} + Y_{12}$
$X_{r2} = X_{21} + X_{22}$        $Y_{r2} = Y_{21} + Y_{22}$
$X_{c1} = X_{11} - X_{21}$        $Y_{c1} = Y_{11} - Y_{21}$
$X_{c2} = X_{12} - X_{22}$        $Y_{c2} = Y_{12} - Y_{22}$
$X_{d1} = X_{11} + X_{22}$        $Y_{d1} = Y_{11} + Y_{22}$
Products:
$P_{11} = X_{11} \cdot Y_{c2}$        $P_{c1} = X_{c1} \cdot Y_{r1}$
$P_{22} = X_{22} \cdot Y_{c1}$        $P_{c2} = X_{c2} \cdot Y_{r2}$
$P_{r1} = X_{r1} \cdot Y_{22}$        $P_{d1} = X_{d1} \cdot Y_{d1}$
$P_{r2} = X_{r2} \cdot Y_{11}$
Running Time:
$T(n) = \begin{cases} \Theta(1), & \text{if } n = 1, \\ 7\,T\!\left(\frac{n}{2}\right) + \Theta(n^2), & \text{otherwise} \end{cases} \;=\; \Theta\!\left(n^{\log_2 7}\right) = O\!\left(n^{2.81}\right)$
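The sums and products above can be put together into a runnable sketch. The Python code below is an illustration under stated assumptions, not the lecture's own implementation: matrices are lists of lists with power-of-two size, the helpers `split`, `add`, `sub`, and `join` are hypothetical names, and the four combination formulas for the blocks of Z are the ones obtained by expanding the seven products and cancelling terms.

```python
def split(M):
    """Return the four (n/2 x n/2) blocks M11, M12, M21, M22 of M."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def join(Z11, Z12, Z21, Z22):
    top = [left + right for left, right in zip(Z11, Z12)]
    bottom = [left + right for left, right in zip(Z21, Z22)]
    return top + bottom


def strassen_mm(X, Y):
    """Strassen-style product of two n x n matrices, n a power of two."""
    n = len(X)
    if n == 1:
        return [[X[0][0] * Y[0][0]]]
    X11, X12, X21, X22 = split(X)
    Y11, Y12, Y21, Y22 = split(Y)

    # 10 block additions/subtractions: row sums, column diffs, diagonal sums.
    Xr1, Xr2 = add(X11, X12), add(X21, X22)
    Xc1, Xc2 = sub(X11, X21), sub(X12, X22)
    Xd1 = add(X11, X22)
    Yr1, Yr2 = add(Y11, Y12), add(Y21, Y22)
    Yc1, Yc2 = sub(Y11, Y21), sub(Y12, Y22)
    Yd1 = add(Y11, Y22)

    # 7 recursive products.
    P11 = strassen_mm(X11, Yc2)
    P22 = strassen_mm(X22, Yc1)
    Pr1 = strassen_mm(Xr1, Y22)
    Pr2 = strassen_mm(Xr2, Y11)
    Pc1 = strassen_mm(Xc1, Yr1)
    Pc2 = strassen_mm(Xc2, Yr2)
    Pd1 = strassen_mm(Xd1, Yd1)

    # 8 more additions/subtractions combine the products into the blocks of Z.
    Z11 = add(sub(sub(Pd1, P22), Pr1), Pc2)   # -P22 - Pr1 + Pc2 + Pd1
    Z12 = add(P11, Pr1)                       #  P11 + Pr1
    Z21 = sub(Pr2, P22)                       # -P22 + Pr2
    Z22 = sub(add(sub(Pd1, Pr2), P11), Pc1)   #  P11 - Pr2 - Pc1 + Pd1
    return join(Z11, Z12, Z21, Z22)


print(strassen_mm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The 10 + 8 = 18 block additions per level match the count quoted for the algorithm. In practice one would likely switch to the classical loop below some cutoff size rather than recurse down to 1 × 1 blocks.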
Deriving Strassen’s Algorithm
Use the Feynman Algorithm:
Step 1: write down the problem
Step 2: think real hard
Step 3: write down the solution
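$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} p & q \\ r & s \end{bmatrix} \;\Rightarrow\; \underbrace{\begin{bmatrix} a & b & 0 & 0 \\ c & d & 0 & 0 \\ 0 & 0 & a & b \\ 0 & 0 & c & d \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} e \\ g \\ f \\ h \end{bmatrix}}_{Y} = \underbrace{\begin{bmatrix} p \\ r \\ q \\ s \end{bmatrix}}_{Z}$

We will try to minimize the number of multiplications needed to
evaluate $Z$ using special matrix products that are easy to compute.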
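Type    Product                                                                                                                                      #Mults
(·)     $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e \\ g \end{bmatrix} = \begin{bmatrix} ae + bg \\ ce + dg \end{bmatrix}$          4
(A)     $\begin{bmatrix} a & a \\ a & a \end{bmatrix} \begin{bmatrix} e \\ g \end{bmatrix} = \begin{bmatrix} a(e + g) \\ a(e + g) \end{bmatrix}$        1
(B)     $\begin{bmatrix} a & a \\ -a & -a \end{bmatrix} \begin{bmatrix} e \\ g \end{bmatrix} = \begin{bmatrix} a(e + g) \\ -a(e + g) \end{bmatrix}$     1
(C)     $\begin{bmatrix} a & 0 \\ a - b & b \end{bmatrix} \begin{bmatrix} e \\ g \end{bmatrix} = \begin{bmatrix} ae \\ ae + b(g - e) \end{bmatrix}$     2
(D)     $\begin{bmatrix} a & b - a \\ 0 & b \end{bmatrix} \begin{bmatrix} e \\ g \end{bmatrix} = \begin{bmatrix} a(e - g) + bg \\ bg \end{bmatrix}$     2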
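The multiplication counts in the table can be checked with a small Python sketch. The function names below (`type_a`, `type_b`, `type_c`, `type_d`, `matvec`) are hypothetical, chosen for this example; each special product is evaluated with exactly the number of scalar multiplications listed, and `matvec` is the generic 4-multiplication product used for comparison.

```python
def matvec(M, v):
    """Generic 2 x 2 matrix times 2-vector: 4 scalar multiplications."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


def type_a(a, e, g):
    """[[a, a], [a, a]] times (e, g): 1 multiplication."""
    m = a * (e + g)
    return [m, m]


def type_b(a, e, g):
    """[[a, a], [-a, -a]] times (e, g): 1 multiplication (negation is free)."""
    m = a * (e + g)
    return [m, -m]


def type_c(a, b, e, g):
    """[[a, 0], [a - b, b]] times (e, g): 2 multiplications."""
    ae = a * e           # multiplication 1
    t = b * (g - e)      # multiplication 2
    return [ae, ae + t]


def type_d(a, b, e, g):
    """[[a, b - a], [0, b]] times (e, g): 2 multiplications."""
    bg = b * g           # multiplication 1
    t = a * (e - g)      # multiplication 2
    return [t + bg, bg]


a, b, e, g = 2, 3, 5, 7
print(type_c(a, b, e, g), matvec([[a, 0], [a - b, b]], [e, g]))  # [10, 16] [10, 16]
print(type_d(a, b, e, g), matvec([[a, b - a], [0, b]], [e, g]))  # [17, 21] [17, 21]
```

In the block setting the scalars become n/2 × n/2 matrices, so each saved multiplication is a saved recursive call while the extra additions cost only Θ(n^2).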
Algorithms for Multiplying Two n×n Matrices
A recursive algorithm based on multiplying two $m \times m$ matrices
using $k$ multiplications will yield an $O\!\left(n^{\log_m k}\right)$ algorithm.
To beat Strassen’s algorithm: $\log_m k < \log_2 7 \Rightarrow k < m^{\log_2 7}$.
So, for a $3 \times 3$ matrix, we must have: $k < 3^{\log_2 7} < 22$, that is, $k \le 21$.
But the best known algorithm uses 23 multiplications!
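The arithmetic behind these bounds is easy to reproduce. The following Python sketch uses two hypothetical helper names, `exponent` and `max_mults_to_beat_strassen`, and relies on floating-point logarithms, so it is meant for non-boundary cases such as m = 3 (for m = 2 the threshold is exactly 7 and rounding could matter).

```python
import math


def exponent(m, k):
    """Exponent log_m(k) of the recursive algorithm built from an
    m x m base case that uses k multiplications."""
    return math.log(k) / math.log(m)


def max_mults_to_beat_strassen(m):
    """Largest integer k with k < m ** log2(7), i.e. log_m(k) < log2(7)."""
    threshold = m ** math.log2(7)
    return math.ceil(threshold) - 1


print(round(exponent(2, 7), 3))        # Strassen: 2.807
print(round(3 ** math.log2(7), 2))     # threshold for 3 x 3: 21.85
print(max_mults_to_beat_strassen(3))   # 21
print(round(exponent(3, 23), 3))       # 23 multiplications: 2.854, worse than Strassen
print(round(exponent(70, 143640), 3))  # 70 x 70 with 143,640 multiplications: 2.795
```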
Inventor                                                                               Year    Complexity
Classical                                                                              ―       $\Theta(n^3)$
Volker Strassen                                                                        1968    $\Theta(n^{2.807})$
Victor Pan ( multiply two 70 × 70 matrices using 143,640 multiplications )             1978    $\Theta(n^{2.795})$
Don Coppersmith & Shmuel Winograd ( arithmetic progressions )                          1990    $\Theta(n^{2.3737})$
Andrew Stothers                                                                        2010    $\Theta(n^{2.3736})$
Virginia Williams                                                                      2011    $\Theta(n^{2.3727})$
Lower bound: $\Omega(n^2)$ ( why? )