Constrained Least Squares
Stephen Boyd
EE103
Stanford University
November 9, 2017
Outline
Linearly constrained least squares
Least norm problem
Solving the constrained least squares problem
Least squares with equality constraints
▶ the (linearly) constrained least squares problem (CLS) is
  $$\begin{array}{ll} \text{minimize} & \|Ax - b\|^2 \\ \text{subject to} & Cx = d \end{array}$$
▶ the variable (to be chosen/found) is the $n$-vector $x$
▶ the $m \times n$ matrix $A$, $m$-vector $b$, $p \times n$ matrix $C$, and $p$-vector $d$ are problem data (i.e., they are given)
▶ $\|Ax - b\|^2$ is the objective function
▶ $Cx = d$ are the equality constraints
▶ $x$ is feasible if $Cx = d$
▶ $\hat x$ is a solution of CLS if $C\hat x = d$ and $\|A\hat x - b\|^2 \le \|Ax - b\|^2$ holds for any $n$-vector $x$ that satisfies $Cx = d$
Least squares with equality constraints
▶ CLS combines solving linear equations with the least squares problem
▶ like a bi-objective least squares problem, with infinite weight on the second objective $\|Cx - d\|^2$ (see the numerical sketch below)
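To make the infinite-weight interpretation concrete, here is a minimal sketch (with assumed random example data, not from the slides) showing that the weighted least squares solution approaches the equality-constrained one as the weight $\mu$ grows:

```python
# A minimal sketch, assuming random example data: solving the weighted
# problem  minimize ||Ax-b||^2 + mu*||Cx-d||^2  for growing mu drives
# the constraint residual ||Cx - d|| toward zero.
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 10, 5, 2
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
C, d = rng.standard_normal((p, n)), rng.standard_normal(p)

for mu in [1.0, 1e3, 1e6]:
    # stacked least squares formulation of the bi-objective problem
    A_stk = np.vstack([A, np.sqrt(mu) * C])
    b_stk = np.concatenate([b, np.sqrt(mu) * d])
    x = np.linalg.lstsq(A_stk, b_stk, rcond=None)[0]
    print(f"mu = {mu:g}: constraint residual = {np.linalg.norm(C @ x - d):.2e}")
```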
Piecewise-polynomial fitting
▶ the piecewise polynomial $\hat f$ has the form
  $$\hat f(x) = \begin{cases} p(x) = \theta_1 + \theta_2 x + \theta_3 x^2 + \theta_4 x^3 & x \le a \\ q(x) = \theta_5 + \theta_6 x + \theta_7 x^2 + \theta_8 x^3 & x > a \end{cases}$$
  ($a$ is given)
▶ we require $p(a) = q(a)$, $p'(a) = q'(a)$
▶ fit $\hat f$ to data $(x_i, y_i)$, $i = 1, \ldots, N$ by minimizing the sum square error
  $$\sum_{i=1}^N (\hat f(x_i) - y_i)^2$$
▶ can express as a constrained least squares problem
Example
[Figure: a piecewise-polynomial fit $\hat f(x)$, with the cubic $p(x)$ for $x \le a$ and the cubic $q(x)$ for $x > a$, joined at the knot point $x = a$.]
Piecewise-polynomial fitting
▶ the constraints are (linear equations in $\theta$)
  $$\theta_1 + \theta_2 a + \theta_3 a^2 + \theta_4 a^3 - \theta_5 - \theta_6 a - \theta_7 a^2 - \theta_8 a^3 = 0$$
  $$\theta_2 + 2\theta_3 a + 3\theta_4 a^2 - \theta_6 - 2\theta_7 a - 3\theta_8 a^2 = 0$$
▶ the prediction error on $(x_i, y_i)$ is $a_i^T \theta - y_i$, with
  $$a_i = \begin{cases} (1, x_i, x_i^2, x_i^3, 0, 0, 0, 0) & x_i \le a \\ (0, 0, 0, 0, 1, x_i, x_i^2, x_i^3) & x_i > a \end{cases}$$
▶ the sum square error is $\|A\theta - y\|^2$, where $a_i^T$ are the rows of $A$ (see the setup sketch below)
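As a concrete setup, here is a sketch building the matrices $A$ and $C$ for this fit; the function name pwp_matrices and the inputs x_data, a are illustrative assumptions, not from the slides:

```python
# A sketch of the CLS setup for piecewise-polynomial fitting.
import numpy as np

def pwp_matrices(x_data, a):
    """Return A (rows a_i^T) and the 2 x 8 constraint matrix C."""
    N = len(x_data)
    A = np.zeros((N, 8))
    for i, xi in enumerate(x_data):
        basis = [1.0, xi, xi**2, xi**3]
        if xi <= a:
            A[i, :4] = basis   # p(x) coefficients theta_1, ..., theta_4
        else:
            A[i, 4:] = basis   # q(x) coefficients theta_5, ..., theta_8
    # continuity p(a) = q(a) and p'(a) = q'(a), written as C theta = 0
    C = np.array([[1, a, a**2, a**3, -1, -a, -a**2, -a**3],
                  [0, 1, 2*a, 3*a**2, 0, -1, -2*a, -3*a**2]], dtype=float)
    return A, C
```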
Least norm problem
▶ a special case of the constrained least squares problem, with $A = I$, $b = 0$
▶ least norm problem:
  $$\begin{array}{ll} \text{minimize} & \|x\|^2 \\ \text{subject to} & Cx = d \end{array}$$
  i.e., find the smallest vector that satisfies a set of linear equations
Force sequence
▶ a unit mass on a frictionless surface, initially at rest
▶ the 10-vector $f$ gives the forces applied, each for one second
▶ the final velocity and position are
  $$v^{\text{fin}} = f_1 + f_2 + \cdots + f_{10}$$
  $$p^{\text{fin}} = (19/2) f_1 + (17/2) f_2 + \cdots + (1/2) f_{10}$$
▶ let's find $f$ for which $v^{\text{fin}} = 0$, $p^{\text{fin}} = 1$
▶ $f^{\text{bb}} = (1, -1, 0, \ldots, 0)$ works (called 'bang-bang'); a quick numerical check follows
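A quick check, as a sketch in NumPy, that the bang-bang sequence meets both targets:

```python
# Numerical check of the bang-bang force sequence.
import numpy as np

f_bb = np.zeros(10)
f_bb[0], f_bb[1] = 1.0, -1.0
coeffs = np.arange(19, 0, -2) / 2      # 19/2, 17/2, ..., 1/2
print(f_bb.sum(), coeffs @ f_bb)       # v_fin = 0.0, p_fin = 1.0
```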
Bang-bang force sequence
[Figure: the bang-bang force sequence (left: force vs. time) and the resulting position (right: position vs. time, reaching 1 at time 10).]
Least norm force sequence
▶ let's find the least norm $f$ that satisfies $p^{\text{fin}} = 1$, $v^{\text{fin}} = 0$
▶ least norm problem:
  $$\begin{array}{ll} \text{minimize} & \|f\|^2 \\ \text{subject to} & \begin{bmatrix} 1 & 1 & \cdots & 1 & 1 \\ 19/2 & 17/2 & \cdots & 3/2 & 1/2 \end{bmatrix} f = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \end{array}$$
  with variable $f$
▶ the solution $f^{\text{ln}}$ satisfies $\|f^{\text{ln}}\|^2 = 0.0121$ (compare to $\|f^{\text{bb}}\|^2 = 2$); a sketch of the computation follows
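A sketch of computing $f^{\text{ln}}$ numerically, using the pseudo-inverse formula derived later in these slides (NumPy assumed):

```python
# Sketch: least norm force sequence via x_hat = C^dagger d.
import numpy as np

C = np.vstack([np.ones(10), np.arange(19, 0, -2) / 2])  # 2 x 10 constraint matrix
d = np.array([0.0, 1.0])                                # v_fin = 0, p_fin = 1
f_ln = np.linalg.pinv(C) @ d
print(np.linalg.norm(f_ln) ** 2)       # approx 0.0121, versus 2 for f_bb
```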
Least norm force sequence
[Figure: the least norm force sequence (left: force vs. time, ranging over roughly $\pm 0.05$) and the resulting position (right: position vs. time).]
Solving the constrained least squares problem
Optimality conditions via calculus
to solve the constrained optimization problem
  $$\begin{array}{ll} \text{minimize} & f(x) = \|Ax - b\|^2 \\ \text{subject to} & c_i^T x = d_i, \quad i = 1, \ldots, p \end{array}$$
1. form the Lagrangian function, with Lagrange multipliers $z_1, \ldots, z_p$:
  $$L(x, z) = f(x) + z_1 (c_1^T x - d_1) + \cdots + z_p (c_p^T x - d_p)$$
2. the optimality conditions are
  $$\frac{\partial L}{\partial x_i}(\hat x, z) = 0, \quad i = 1, \ldots, n, \qquad \frac{\partial L}{\partial z_i}(\hat x, z) = 0, \quad i = 1, \ldots, p$$
Optimality conditions via calculus
▶ $\dfrac{\partial L}{\partial z_i}(\hat x, z) = c_i^T \hat x - d_i = 0$, which we already knew
▶ the first $n$ equations are more interesting:
  $$\frac{\partial L}{\partial x_i}(\hat x, z) = 2 \sum_{j=1}^n (A^T A)_{ij} \hat x_j - 2 (A^T b)_i + \sum_{j=1}^p z_j (c_j)_i = 0$$
▶ in matrix-vector form: $2 (A^T A) \hat x - 2 A^T b + C^T z = 0$
▶ put together with $C \hat x = d$ to get the KKT conditions
  $$\begin{bmatrix} 2 A^T A & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} \hat x \\ z \end{bmatrix} = \begin{bmatrix} 2 A^T b \\ d \end{bmatrix}$$
  a square set of $n + p$ linear equations in the variables $\hat x$, $z$
▶ the KKT equations are an extension of the normal equations to CLS (see the numerical sketch below)
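A sketch of forming and solving the KKT system directly, with assumed random problem data (NumPy):

```python
# Sketch: solve the (n+p) x (n+p) KKT system for assumed random data.
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 10, 5, 2
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
C, d = rng.standard_normal((p, n)), rng.standard_normal(p)

KKT = np.block([[2 * A.T @ A, C.T],
                [C, np.zeros((p, p))]])   # KKT matrix
rhs = np.concatenate([2 * A.T @ b, d])
sol = np.linalg.solve(KKT, rhs)
x_hat, z = sol[:n], sol[n:]
print(np.linalg.norm(C @ x_hat - d))      # feasibility: essentially zero
```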
Solution of constrained least squares problem
▶ assuming the KKT matrix is invertible, we have
  $$\begin{bmatrix} \hat x \\ z \end{bmatrix} = \begin{bmatrix} 2 A^T A & C^T \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} 2 A^T b \\ d \end{bmatrix}$$
▶ the KKT matrix is invertible if and only if $C$ has independent rows, and $\begin{bmatrix} A \\ C \end{bmatrix}$ has independent columns
▶ this implies $m + p \ge n$, $p \le n$
▶ can compute $\hat x$ in $2mn^2 + 2(n+p)^3$ flops; the order is $n^3$ flops
Direct verification of solution
▶ to show that $\hat x$ is the solution, suppose $x$ satisfies $Cx = d$
▶ then
  $$\begin{aligned} \|Ax - b\|^2 &= \|(Ax - A\hat x) + (A\hat x - b)\|^2 \\ &= \|A(x - \hat x)\|^2 + \|A\hat x - b\|^2 + 2 (Ax - A\hat x)^T (A\hat x - b) \end{aligned}$$
▶ expand the last term, using $2 A^T (A\hat x - b) = -C^T z$ and $Cx = C\hat x = d$:
  $$\begin{aligned} 2 (Ax - A\hat x)^T (A\hat x - b) &= 2 (x - \hat x)^T A^T (A\hat x - b) \\ &= -(x - \hat x)^T C^T z \\ &= -(C(x - \hat x))^T z \\ &= 0 \end{aligned}$$
▶ so $\|Ax - b\|^2 = \|A(x - \hat x)\|^2 + \|A\hat x - b\|^2 \ge \|A\hat x - b\|^2$
▶ and we conclude that $\hat x$ is a solution
Solution of least norm problem
▶ least norm problem: minimize $\|x\|^2$ subject to $Cx = d$
▶ the stacked matrix $\begin{bmatrix} I \\ C \end{bmatrix}$ always has independent columns
▶ we assume that $C$ has independent rows
▶ the optimality condition reduces to
  $$\begin{bmatrix} 2I & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} \hat x \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ d \end{bmatrix}$$
▶ so $\hat x = -(1/2) C^T z$; the second equation is then $-(1/2) C C^T z = d$
▶ plug $z = -2 (C C^T)^{-1} d$ into the first equation to get
  $$\hat x = C^T (C C^T)^{-1} d = C^\dagger d$$
  where $C^\dagger$ is (our old friend) the pseudo-inverse
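A minimal sketch (assumed random $C$ with independent rows, NumPy) confirming that the explicit formula agrees with the pseudo-inverse route:

```python
# Sketch: C^T (C C^T)^{-1} d matches the pseudo-inverse computation.
import numpy as np

rng = np.random.default_rng(2)
p, n = 3, 8
C, d = rng.standard_normal((p, n)), rng.standard_normal(p)

x_hat = C.T @ np.linalg.solve(C @ C.T, d)   # explicit least norm formula
x_alt = np.linalg.pinv(C) @ d               # same result via the pseudo-inverse
print(np.allclose(x_hat, x_alt), np.linalg.norm(C @ x_hat - d))
```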
so when $C$ has independent rows:
▶ $C^\dagger$ is a right inverse of $C$
▶ so for any $d$, $\hat x = C^\dagger d$ satisfies $C \hat x = d$
▶ and we now know: $\hat x$ is the smallest solution of $Cx = d$, as the check below illustrates
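A small numerical illustration (assumed random data): any other solution of $Cx = d$ differs from $\hat x$ by a nullspace component of $C$, which can only increase the norm.

```python
# Sketch: adding a nullspace component keeps Cx = d but increases the norm.
import numpy as np

rng = np.random.default_rng(3)
C, d = rng.standard_normal((3, 8)), rng.standard_normal(3)
x_hat = np.linalg.pinv(C) @ d
u = np.linalg.svd(C)[2][-1]          # a vector in the nullspace of C
x_other = x_hat + 0.5 * u            # still feasible: C @ x_other == d
print(np.linalg.norm(x_hat) < np.linalg.norm(x_other))  # True
```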