Abstract. A Lavrentiev type regularization technique for solving elliptic boundary control problems with
pointwise state constraints is considered. The main concept behind this regularization is to look for controls in the
range of the adjoint control-to-state mapping. After analyzing the method, a semismooth Newton
method based on the optimality conditions is presented. The theoretical results are confirmed by numerical tests.
Moreover, they are validated by comparing the regularization technique with standard numerical codes based on the
discretize-then-optimize concept.
Key words. Boundary control, state constraints, Lavrentiev type regularization, semismooth Newton method,
optimize-then-discretize, nested iteration.
(1.1)    Ay = 0   in Ω,
         ∂n y = u on Γ
however, restricts the theory to two-dimensional domains Ω. In this case, the mapping u 7→ y
defined by (1.1) is continuous from L2 (Γ) to C(Ω̄). Another way would consist of restricting the
constraints (1.2) to a compact subset Ω0 ⊂ Ω.
This obstacle is overcome after regularizing problem (P). Then necessary optimality conditions
can be stated for the regularized problem for any dimension N ≥ 2. To show convergence for
vanishing regularization parameter, however, the restriction to N = 2 is again more or less needed.
It is well known that the numerical treatment of state-constrained problems is a quite difficult
issue. On the one hand, the measure type form of Lagrange multipliers complicates the numerical
treatment of the problems. On the other hand, in the analysis one is faced with some ill-conditioned
equations when dealing with state-constrained problems. This is mostly due to the compactness of
the mapping u 7→ y. This is known for distributed optimal control problems and it turns out to be
even harder in the case of boundary control.
In recent years, two different regularization concepts have been proposed to overcome the difficulties
mentioned previously. First, Ito and Kunisch [10] suggested a Moreau-Yosida type regularization
approach, which removes the pointwise state inequality constraints by adding a penalty term to
the objective functional. Hereafter, the penalized problems are solved in an efficient way. We also
refer to [2], [3], and [11].
Later, Meyer et al. [14] came up with a Lavrentiev type regularization to the pointwise state
inequality constraints, see also the case of pure state constraints in [15]. In contrast to the first
method, it preserves, in some sense, the structure of the state-constrained problem. Let us briefly
introduce this regularization technique and compare it with the main issue of our paper. Suppose
that the state equation is given by
Ay = u in Ω
∂n y = 0 on Γ
with distributed control u. Then the Lavrentiev type regularization converts the state constraints
(1.2) into the mixed control-state constraints
(1.3) ya ≤ y + λu ≤ yb a.e. in Ω
with a small parameter λ > 0. This technique is not applicable in the case of boundary control,
since the control is defined only on Γ, while (1.3) must be considered on Ω. The domains of u and
y do not fit together.
Our main idea to overcome this difficulty is as follows: Let S : L²(Γ) → L²(Ω) denote the
control-to-state mapping defined by (1.1). We look for controls in the range of the adjoint operator
S*, i.e., we use the ansatz

u = S*v

with a new control v defined in Ω. Then, the pointwise state constraints (1.2) admit the form
ya ≤ SS*v ≤ yb and, substituting a new ”control function” w ∈ L²(Ω) by w = SS*v, we obtain
the ”control constraints”

ya ≤ w ≤ yb.
However, this is too formal, since SS* is compact in L²(Ω), and hence the equation SS*v = w is
ill-posed if w ∈ L²(Ω) is given. To obtain a well-posed equation, we apply the Lavrentiev type
regularization, cf. [12], i.e. we write
regularization, cf. [12], i.e. we write
λv + SS ? v = w.
ya ≤ λv + SS*v ≤ yb a.e. in Ω.
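The effect of this regularization can be illustrated in finite dimensions. In the following Python sketch (our own illustration, not from the paper; the smoothing kernel standing in for SS* is a hypothetical choice), the conditioning of λI + KKᵀ deteriorates as λ shrinks, reflecting the ill-posedness of the unregularized equation:

```python
import numpy as np

# Discretize a compact smoothing operator K (a stand-in for S, built from a
# hypothetical Green's-function-like kernel) on n grid points.  K @ K.T is
# symmetric positive semidefinite with rapidly decaying eigenvalues, so the
# equation (K K^T) v = w is ill-posed, while lam*v + (K K^T) v = w is not.
n = 200
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
K = h * np.minimum.outer(x, x) * (1.0 - np.maximum.outer(x, x))
KKt = K @ K.T

w = np.sin(np.pi * x)          # given right-hand side
conds = []
for lam in [1e-1, 1e-3, 1e-6]:
    A = lam * np.eye(n) + KKt  # Lavrentiev-regularized operator
    v = np.linalg.solve(A, w)
    res = np.linalg.norm(A @ v - w)
    conds.append(np.linalg.cond(A))
    print(f"lam={lam:8.1e}  cond={conds[-1]:10.3e}  residual={res:.2e}")

# the conditioning deteriorates as lam shrinks, which is why lam > 0 is kept
assert conds[-1] > conds[0]
```

The regularized systems remain solvable for every λ > 0, but the growing condition number for small λ foreshadows the numerical observations in Section 5.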
There are two reasons for the ansatz u = S*v. We obtain from the necessary optimality
condition (2.3) that the optimal control ū is in the range of the adjoint control-to-state mapping
G* : C(Ω̄)* → L²(Γ) (notice that p = G*(ȳ − yd + µb − µa)). By restricting the domain of G* to
L²(Ω), we avoid measures and arrive at S*. Moreover, we obtain the representation y = SS*v with
a positive semidefinite and self-adjoint operator SS*, which is useful for computations.
We analyze this idea and investigate its numerical performance. The method
slightly increases the number of unknowns and doubles the number of equations. This is certainly
some drawback. However, the numerical results are encouraging. In some cases, the results were
even better than those obtained by the discretize-then-optimize concept that was used for comparison
with our method. Our main aim was to find an extension of the Lavrentiev type regularization concept
to the case of boundary controls that guarantees the existence of regular Lagrange multipliers.
Moreover, it should generate a problem that is equivalent to a control-constrained one. Then, we
have access to known results on numerical approximations of control-constrained problems such as
error estimates or mesh-independence principles.
To give the reader a better orientation on the possible choices for the dimension N , we mention
already here that N = 2 is only needed to show the convergence of optimal solutions for vanishing
regularization parameter. All other results on the regularized problems hold true for arbitrary
dimension. If N = 2 is needed, we explicitly state this in the associated theorems.
The paper is organized as follows: After introducing the general assumptions as well as our
notation in Section 2, we analyze different aspects of our regularization in Section 3. We discuss the
optimality conditions for the regularized version of (P ) and based on them, we present a semismooth
Newton algorithm in Section 4. Finally, in Section 5, we provide a numerical report including a
validation of our theoretical results and a comparison of our technique with the application of the
discretize-then-optimize concept.
2. General assumptions and notation. Throughout this paper, we consider a bounded
domain Ω ⊂ RN , N ≥ 2, with a C 0,1 -boundary Γ. The lower and upper bounds ya , yb ∈ C(Ω̄)
satisfy ya (x) < yb (x) for all x ∈ Ω̄. Furthermore, the desired state yd is given in L2 (Ω). If V is a
linear normed function space, then we use the notation ‖·‖_V for the standard norm used in V. By
A, we denote the second-order elliptic partial differential operator defined by
Ay(x) = − Σ_{i,j=1}^{N} D_i (a_ij(x) D_j y(x)) + c0(x) y(x),
where the coefficient functions a_ij ∈ C^{0,1}(Ω̄) satisfy the ellipticity condition

Σ_{i,j=1}^{N} a_ij(x) ξ_i ξ_j ≥ θ|ξ|²   ∀ (ξ, x) ∈ ℝ^N × Ω̄
for some constant θ > 0. By A*, the associated formally adjoint operator is denoted. We assume
that c0 ∈ L∞(Ω) is non-negative with ‖c0‖_{L∞(Ω)} ≠ 0. By G, we denote the solution operator
G : L2 (Γ) → H 1 (Ω) that assigns to each u ∈ L2 (Γ) the weak solution y = y(u) ∈ H 1 (Ω) of the
elliptic equation
Ay = 0 in Ω
∂n y = u on Γ,
in which ∂n y denotes the co-normal derivative of y (often denoted by ∂nA ). For later use, we set
S = i0 G, where i0 is the compact embedding operator from H 1 (Ω) to L2 (Ω). Now, our problem
can be expressed as follows:
(P)    minimize f(u) := (1/2)‖Su − yd‖²_{L²(Ω)} + (α/2)‖u‖²_{L²(Γ)}
       over u ∈ L²(Γ)
       subject to ya ≤ Gu ≤ yb a.e. in Ω.
We still use G instead of S in the constraints, since we need higher regularity of y in Theorem 2.1.
Remark 2.1. Consider the following more general problem

(2.1)    minimize (1/2)‖y − yd‖²_{L²(Ω)} + (α/2)‖u − ud‖²_{L²(Γ)}
         over (u, y) ∈ L²(Γ) × H¹(Ω)
         subject to Ay = e in Ω, ∂n y = u on Γ,
                    ya ≤ y ≤ yb a.e. in Ω
with a fixed function e ∈ L²(Ω) and a fixed shift control ud ∈ L²(Γ). By the substitution
ũ = u − ud and splitting the state into two components y(e, u) = ye + Gu, where ye is the solution of
Ay = e in Ω
∂n y = 0 on Γ,
the problem (2.1) is equivalent to:
(2.2)    minimize (1/2)‖Su − yΩ‖²_{L²(Ω)} + (α/2)‖u‖²_{L²(Γ)}
         over u ∈ L²(Γ)
         subject to ya⁰ ≤ Gu ≤ yb⁰ a.e. in Ω,

with yΩ = yd − ye − Sud, ya⁰ = ya − Gud − ye and yb⁰ = yb − Gud − ye. Therefore, we are justified
to concentrate on the simpler problem (P).
2.1. Standard results. It is well known that the operator G : L2 (Γ) → H 1 (Ω) is continuous.
In the case of a two-dimensional domain, N = 2, the mapping G : u → y is even continuous from
L2 (Γ) into H 1 (Ω) ∩ C(Ω̄), see [5]. One can show that, independently of the dimension N ≥ 2, the
problem (P ) admits a unique solution ū ∈ L2 (Γ) provided that the admissible set {u ∈ L2 (Γ) | ya ≤
G(u) ≤ yb a.e. in Ω} is not empty. In the rest of the paper, we hence assume that the admissible
set for (P) is not empty and denote the optimal solution to (P) by ū with the associated state
ȳ = Gū. In the following, we present the optimality system for (P) in an appropriate sense defined
in [5].
Theorem 2.1 (First-order optimality conditions for (P )). Let N = 2 and assume that the
following Slater condition is satisfied: There exists a function u0 ∈ L2 (Γ) (a so-called Slater point)
such that
ya (x) < G(u0 )(x) < yb (x) ∀ x ∈ Ω̄.
Then, ū is optimal for (P) if and only if there exists an adjoint state p ∈ W^{1,s}(Ω) for all
1 ≤ s < N/(N−1) and Lagrange multipliers µa, µb ∈ C*(Ω̄) such that

(2.3)    p + αū = 0 on Γ,
Notice that ⟨·,·⟩_{C*,C} stands for the duality pairing between C*(Ω̄) and C(Ω̄).
3. Regularization. As pointed out in the introduction, we look for controls u in the range
of the adjoint operator S*, i.e., we substitute u = S*v in (P). This idea leads us to an associated
problem with a new control v defined on the domain Ω. In this section, we study this problem with
different regularization terms, to motivate the final form of our regularized problem.
We start by investigating the operator S*. By definition, S* is defined from L²(Ω) to L²(Γ).
It is represented by S*v = w|_Γ, where w = w(v) ∈ H¹(Ω) is defined as the solution of

(3.1)    A*w = v   in Ω,
         ∂n w = 0 on Γ.
Hence, for each v ∈ L²(Ω), w|_Γ possesses at least the regularity w|_Γ ∈ H^{1/2}(Γ). In convex domains
Ω, we even obtain w|_Γ ∈ H^{3/2}(Γ). This higher regularity raises some difficulties, since it implies that
S* : L²(Ω) → L²(Γ) is not surjective. However, we have the following result:
Lemma 3.1. S*(L²(Ω)) is dense in L²(Γ).
Proof. Assume the contrary: Then there exists a function d ∈ L²(Γ) with d ≠ 0 such that

(d, S*v)_{L²(Γ)} = 0   ∀ v ∈ L²(Ω).

This implies that (Sd, v)_Ω = 0 for all v ∈ L²(Ω) and hence y(d) = Sd = 0. Since y = y(d) satisfies

Ay = 0 in Ω,
∂n y = d on Γ,

y = 0 in Ω implies d = ∂n y = 0 on Γ, in contradiction to d ≠ 0.
Substituting u = S*v in (P), we arrive at the auxiliary problem

(P_aux)    minimize f̃(v) := f(S*v) = (1/2)‖SS*v − yd‖²_{L²(Ω)} + (α/2)‖S*v‖²_{L²(Γ)}
           over v ∈ L²(Ω)
           subject to ya ≤ GS*v ≤ yb a.e. in Ω.
As noticed previously, the operator S* is not surjective and consequently the auxiliary problem
(P_aux) is not necessarily solvable. However, under an approximability assumption, we are
able to show that (P) and (P_aux) have the same infimal value.
Definition 3.1 (Approximability and Slater condition). We say that ū satisfies the approximability
condition, if there exists a sequence {a_n}_{n=1}^∞ in L²(Ω) with S*a_n → ū strongly in L²(Γ) such that

(3.2)    ya ≤ GS*a_n ≤ yb a.e. in Ω

for all n.
If there exists a function v0 ∈ L∞(Ω) satisfying the condition

(3.3)    ya + δ ≤ GS*v0 ≤ yb − δ

with some constant δ > 0, then we say that the Slater condition is satisfied.
Lemma 3.2. For N = 2, the approximability condition is satisfied if the Slater condition (3.3)
is fulfilled.
Proof. Since S*(L²(Ω)) is dense in L²(Γ), there exists a sequence {ṽ_n}_{n=1}^∞ in L²(Ω) such that

(3.4)    S*ṽ_n → ū strongly in L²(Γ)

and

(3.5)    GS*ṽ_n = Gū + d_n,

where {d_n}_{n=1}^∞ is a sequence in C(Ω̄) converging to zero. Define now a_n = (1 − t_n)ṽ_n + t_n v0, where
{t_n}_{n=1}^∞ is a sequence of positive numbers tending to zero, given by t_n = ‖d_n‖_{C(Ω̄)}/δ. Obviously, due to
(3.4), the sequence {S*a_n}_{n=1}^∞ converges strongly to ū in L²(Γ). By (3.5) and the Slater condition
(3.3), we find further for all sufficiently large n:

GS*a_n = (1 − t_n)Gū + (1 − t_n)d_n + t_n GS*v0
       ≤ (1 − t_n)yb + (1 − t_n)d_n + t_n yb − t_n δ
       ≤ yb + (1 − t_n)‖d_n‖_{C(Ω̄)} − t_n δ
       = yb − t_n‖d_n‖_{C(Ω̄)} ≤ yb.

In a similar way, we infer ya ≤ GS*a_n for all sufficiently large n and hence {a_n}_{n=1}^∞ satisfies
property (3.2).
Theorem 3.1. Assume that the optimal solution ū satisfies the approximability condition.
Then (P) and (P_aux) have the same infimal value. Furthermore, there exists an infimal sequence
{v_n}_{n=1}^∞ for (P_aux) such that {S*v_n}_{n=1}^∞ converges strongly to ū in L²(Γ).
Proof. By the nonnegativity of f̃, the infimal values of (P_aux) and (P), denoted by j_aux and j,
respectively, exist in ℝ₀⁺. Furthermore, since the range of S* is dense in L²(Γ), we obviously have
j ≤ j_aux. On the other hand, by the approximability assumption, there exists a sequence {a_n}_{n=1}^∞
in L²(Ω) such that S*a_n is feasible for (P) for sufficiently large n and lim_{n→∞} S*a_n = ū. Thus, by
the continuity of S, we have

lim_{n→∞} [(1/2)‖SS*a_n − yd‖²_{L²(Ω)} + (α/2)‖S*a_n‖²_{L²(Γ)}] = (1/2)‖Sū − yd‖²_{L²(Ω)} + (α/2)‖ū‖²_{L²(Γ)} = j
ya ≤ Gũ ≤ yb a.e. in Ω.
Therefore, by the uniqueness of the optimal solution, we find ũ = ū. Finally, due to the weak
convergence, the strong convergence of S*v_n to ū can be directly derived from the convergence of
S*v_n in norm.
In the preceding proof, we did not show the convergence of {v_n}_{n=1}^∞. As yet, we are only
able to show the strong convergence of S*v_n to ū for the auxiliary problem (P_aux). Still, this is
not satisfactory due to the possible unsolvability of (P_aux). To overcome this difficulty, we next
study different kinds of regularization and their specific properties. Finally, we end up with our
Lavrentiev-regularized problem. We start with the following problem:
(P^ε)    minimize g̃(v) := (1/2)‖SS*v − yd‖²_{L²(Ω)} + (α/2)‖S*v‖²_{L²(Γ)} + (ε/2)‖v‖²_{L²(Ω)}
         over v ∈ L²(Ω)
         subject to ya ≤ GS*v ≤ yb a.e. in Ω.
Lemma 3.3. Assume that the admissible set Ṽ_ad = {v ∈ L²(Ω) | ya ≤ GS*v ≤ yb a.e. in Ω} is
not empty. Then, for every ε > 0, (P^ε) admits a unique solution.
Proof. The infimal value of (P^ε) exists in ℝ₀⁺, since the objective functional g̃ is nonnegative.
Let now {v_n} be an infimal sequence for (P^ε). Due to the regularization term (ε/2)‖v‖²_{L²(Ω)} in g̃,
{v_n} is uniformly bounded in L²(Ω). Consequently, there exists a subsequence of {v_n} converging weakly to
ṽ ∈ L²(Ω). Owing to the continuity of GS*, the weak limit ṽ obviously belongs to Ṽ_ad. Thus, by
the lower semicontinuity of g̃, ṽ is optimal for (P^ε).
3.2. Lavrentiev type regularization. As mentioned previously, our aim is to propose a
Lavrentiev type regularization applied to the boundary control problem. We now regularize (P) in
the following way:

(P_λ)    minimize g̃(v) := (1/2)‖SS*v − yd‖²_{L²(Ω)} + (α/2)‖S*v‖²_{L²(Γ)} + (ε/2)‖v‖²_{L²(Ω)}
         over v ∈ L²(Ω)
         subject to ya ≤ λv + GS*v ≤ yb a.e. in Ω,
with some regularization parameters λ > 0 and ε(λ) ≥ 0. We mention some reasons for
regularizing the problem in this way: Without any restriction on the dimension N and without
the approximability and Slater assumptions, we always have the solvability of (P_λ)
and, additionally, we are able to show that the associated Lagrange multipliers exist and belong to
L²(Ω). However, for the convergence of the regularized solutions to the solution of (P) in the case
of vanishing regularization parameters, one has to restrict the theory again to the two-dimensional
case, N = 2. We require this since for N = 2, the operator G is continuous from L²(Γ) to
C(Ω̄) ∩ H¹(Ω).
Theorem 3.2. Let λ > 0 and ε(λ) ≥ 0 be arbitrarily fixed. Then, the regularized problem (P_λ)
admits a solution, and the solution is unique if ε(λ) ≠ 0.
Proof. In the proof, we consider G again as a mapping with range in L²(Ω), i.e. we substitute S
for G. This does not change the admissible set. First of all, we have to show that the admissible
set V_ad := {v ∈ L²(Ω) | ya ≤ λv + SS*v ≤ yb a.e. in Ω} is not empty. To this purpose, we
consider the operator λI + SS*, where I : L²(Ω) → L²(Ω) denotes the identity operator in L²(Ω).
Since SS* : L²(Ω) → L²(Ω) is positive semidefinite and λ > 0 holds by assumption, the equation
λv + SS*v = 0 admits only the trivial solution v = 0. Hence, by the Fredholm alternative, the
compactness of SS* implies the existence of the inverse operator (λI + SS*)⁻¹. From this, we infer
that V_ad is not empty: For instance, we have (λI + SS*)⁻¹ yb ∈ V_ad.
Next, since the objective functional in (P_λ) is nonnegative, the infimum in (P_λ) exists in
ℝ₀⁺. Let now {v_n}_{n=1}^∞ be an infimal sequence for (P_λ). By the presence of the cost parameter
α > 0, S*v_n is uniformly bounded in L²(Γ) and hence we can find a subsequence {S*v_{n_j}}_{j=1}^∞ of
{S*v_n}_{n=1}^∞ such that S*v_{n_j} ⇀ u⁰. Subsequently, invoking again the compactness of S, we infer the
strong convergence of SS*v_{n_j} to Su⁰. Moreover, since v_{n_j} ∈ V_ad for all j, the strong convergence of
SS*v_{n_j} then ensures the uniform boundedness of v_{n_j}. Thus, we find again a subsequence
of v_{n_j} converging weakly to v̄_λ ∈ L²(Ω). This weak limit v̄_λ is clearly feasible and finally, by the
lower semicontinuity of the objective functional in (P_λ), v̄_λ is optimal. For ε(λ) > 0, g̃ is strictly
convex and consequently we obtain the uniqueness of v̄_λ.
Next, setting z = λv + SS*v and hence v = (λI + SS*)⁻¹z = Rz, (P_λ) is equivalent to the
following control problem:

(P_λ^z)    minimize g(z) := g̃(Rz)
           over z ∈ L²(Ω)
           subject to ya ≤ z ≤ yb a.e. in Ω.

In this way, we have just transformed (P_λ) into a control problem with a simple box constraint.
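This transformation can be illustrated in finite dimensions. The following Python sketch (our own illustration; the random matrix K standing in for a discretized S is hypothetical) builds R = (λI + SS*)⁻¹, which exists by the Fredholm argument above, and checks that every box-feasible z yields a control v = Rz satisfying the mixed constraint:

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam = 50, 1e-2
K = rng.standard_normal((n, n)) / n        # hypothetical stand-in for a discretized S
M = lam * np.eye(n) + K @ K.T              # lam*I + SS*, invertible since SS* is PSD
ya, yb = -np.ones(n), np.ones(n)

# Pick any z with ya <= z <= yb (the simple box constraint of (P_lambda^z))
z = np.clip(rng.standard_normal(n), ya, yb)
v = np.linalg.solve(M, z)                  # v = R z = (lam*I + SS*)^{-1} z

# By construction, v satisfies the mixed constraint ya <= lam*v + SS*v <= yb
mixed = lam * v + K @ K.T @ v
assert np.all(ya - 1e-10 <= mixed) and np.all(mixed <= yb + 1e-10)
```

The substitution thus converts the mixed control-state constraint into a pointwise box constraint on z, which is what makes the standard control-constrained machinery applicable.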
Subsequently, by standard arguments, cf. [15], the following optimality system can easily be shown.
Theorem 3.3 (First-order optimality conditions). Let λ > 0 be arbitrarily fixed and let v̄_λ be an optimal solution
to (P_λ). Moreover, we set ȳ_λ = GS*v̄_λ and S*v̄_λ = w|_Γ, with w = w(v̄_λ) ∈ H¹(Ω) the solution of (3.1). Then, there
exist Lagrange multipliers µ_λ^a, µ_λ^b ∈ L²(Ω) and adjoint states p, q ∈ H¹(Ω) such that the following optimality
system is satisfied:

(3.6)    Aȳ_λ = 0 in Ω,        A*w = v̄_λ in Ω,
         ∂n ȳ_λ = w on Γ,      ∂n w = 0 on Γ,

(3.7)    A*p = ȳ_λ − yd + µ_λ^b − µ_λ^a in Ω,    A*q = 0 in Ω,
         ∂n p = 0 on Γ,                           ∂n q = αw + p on Γ,

(3.8)    ε(λ)v̄_λ + q + λ(µ_λ^b − µ_λ^a) = 0,

(3.9)    ya ≤ λv̄_λ + ȳ_λ ≤ yb a.e. in Ω,

(3.10)   µ_λ^a ≥ 0,  µ_λ^b ≥ 0,
         (µ_λ^a, ya − λv̄_λ − ȳ_λ)_{L²(Ω)} = (µ_λ^b, λv̄_λ + ȳ_λ − yb)_{L²(Ω)} = 0.
3.3. Passing to the limit λ → 0. We now study the convergence of the solution to the regularized
problem in the case of vanishing Lavrentiev parameter λ.
Assumption 3.1. The regularization parameter ε = ε(λ) satisfies

(3.11)    ε = σ0 λ^{1+σ1}

with some constants σ0 > 0 and 0 ≤ σ1 < 1. Moreover, there exists a function v0 ∈ L∞(Ω) such
that

(3.12)    ya + δ ≤ G(ū + S*v0) ≤ yb − δ

is satisfied with some constant δ > 0.
Let {λ_n}_{n=1}^∞ be a sequence of positive real numbers converging to zero and denote by {v_n}_{n=1}^∞
the sequence of optimal solutions to (P_{λ_n}). The presence of the Tikhonov parameter α > 0
in (P_{λ_n}) ensures the boundedness of the sequence {S*v_n}_{n=1}^∞ in L²(Γ). For this reason, we can
find a subsequence of {S*v_n}_{n=1}^∞, denoted w.l.o.g. again by {S*v_n}_{n=1}^∞, converging weakly to some
ũ ∈ L²(Γ). Our goal now is to show that ũ minimizes the original unregularized problem. For this
purpose, we first have to show the feasibility of ũ for (P), i.e., ya ≤ Gũ ≤ yb a.e. in Ω.
Lemma 3.4. Let Assumption 3.1, (3.11), be satisfied. Then the weak limit ũ of the sequence
{S*v_n}_{n=1}^∞ defined above is feasible for (P).
Proof. We know that it holds

ya ≤ λ_n v_n + GS*v_n ≤ yb   ∀ n.

Therefore, it suffices to show that λ_n v_n converges to zero. Clearly, by the presence of the
regularization parameter ε(λ) > 0 in the objective functional g̃, one has

(ε(λ_n)/2) ‖v_n‖²_{L²(Ω)} ≤ c   ∀ n

with some constant c > 0 and hence

(ε(λ_n)/(2λ_n²)) ‖λ_n v_n‖²_{L²(Ω)} ≤ c   ∀ n.

From Assumption 3.1, we then infer

(3.13)    ‖λ_n v_n‖²_{L²(Ω)} ≤ 2cλ_n²/ε(λ_n) = (2c/σ0) λ_n^{1−σ1}.

This implies λ_n v_n → 0 in L²(Ω) as n → ∞ and hence the lemma is shown.
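The decay rate in (3.13) can be checked numerically. The following snippet (a hypothetical check with arbitrarily chosen constants, not part of the paper) evaluates the bound 2cλ²/ε(λ) for ε(λ) = σ0 λ^{1+σ1} and confirms that it behaves like λ^{1−σ1}:

```python
import math

sigma0, sigma1, c = 2.0, 0.5, 3.0   # arbitrary admissible constants (sigma1 < 1)
for lam in [1e-2, 1e-4, 1e-6]:
    eps = sigma0 * lam ** (1 + sigma1)
    bound = 2 * c * lam ** 2 / eps               # right-hand side of (3.13)
    predicted = (2 * c / sigma0) * lam ** (1 - sigma1)
    # the bound coincides with (2c/sigma0) * lam^(1 - sigma1) and tends to 0
    assert abs(bound - predicted) < 1e-9 * predicted
    print(lam, bound)
```

Since σ1 < 1, the exponent 1 − σ1 is positive, which is exactly why the bound, and hence λ_n v_n, vanishes as λ_n → 0.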
Theorem 3.4. Let N = 2. Then, under Assumption 3.1, the sequence {S*v_n} converges
strongly in L²(Γ) to the optimal solution of the unregularized problem (P).
Proof. Since C(Ω̄) is dense in L²(Ω) and due to Lemma 3.1, we can find a sequence {z_k}_{k=1}^∞ in
C(Ω̄) such that

(3.14)    ‖ū − S*z_k‖_{L²(Γ)} ≤ 1/k   ∀ k ∈ ℕ.

Moreover, continuity and linearity of G from L²(Γ) to H¹(Ω) ∩ C(Ω̄) ensure the existence of a real
positive number c0 such that

(3.15)    ‖G(ū − S*z_k)‖_{C(Ω̄)} ≤ c0‖ū − S*z_k‖_{L²(Γ)} ≤ c0/k   ∀ k ∈ ℕ.
We set

(3.16)    v_k⁰ = z_k + (2c0/(kδ)) v0 = z_k + (c1/k) v0,

where v0 satisfies (3.12) and c1 = 2c0/δ. In view of (3.14), the definition above implies the strong
convergence of S*v_k⁰ to ū in L²(Γ), i.e.

(3.17)    lim_{k→∞} ‖S*v_k⁰ − ū‖_{L²(Γ)} = 0.
First, we show that, for every k ∈ ℕ with k ≥ c1, one can find an index number n_k ∈ ℕ such that
v_k⁰ is feasible for (P_{λ_n}) for all n ≥ n_k. Let now k ∈ ℕ with k ≥ c1 be arbitrarily fixed. Then, by
our assumptions, it holds that

(3.18)    λ_n v_k⁰ + GS*v_k⁰ = λ_n v_k⁰ + G(S*z_k − ū) + (1 − c1/k)Gū + (c1/k)(GS*v0 + Gū)
                             ≤ λ_n‖v_k⁰‖_{L∞(Ω)} + c0‖S*z_k − ū‖_{L²(Γ)} + (1 − c1/k)yb + (c1/k)(yb − δ)
                             ≤ yb + (λ_n‖v_k⁰‖_{L∞(Ω)} + c0/k − 2c0/k)
                             = yb + (λ_n‖v_k⁰‖_{L∞(Ω)} − c0/k).

Since λ_n converges to zero, we can then find an index n_k ∈ ℕ such that λ_n‖v_k⁰‖_{L∞(Ω)} ≤ c0/k for all
n ≥ n_k. Inserting this in (3.18), we obtain

λ_n v_k⁰ + GS*v_k⁰ ≤ yb   ∀ n ≥ n_k.

In the same way, one derives ya ≤ λ_n v_k⁰ + GS*v_k⁰ for all n ≥ n_k. This implies the feasibility of v_k⁰
for all (P_{λ_n}) with n ≥ n_k.
Since v_n is optimal for (P_{λ_n}) for each n, we further infer that

f(S*v_n) ≤ f(S*v_n) + (ε_n/2)‖v_n‖²_{L²(Ω)} ≤ f(S*v_k⁰) + (ε_n/2)‖v_k⁰‖²_{L²(Ω)}   ∀ n ≥ n_k.

Hence, due to the lower semicontinuity of f and since ε_n → 0, one finds by passing to the limit n → ∞

f(ũ) ≤ lim inf_{n→∞} f(S*v_n) ≤ f(S*v_k⁰),

where ũ is the weak limit introduced in Lemma 3.4. Finally, letting k pass to infinity, we infer from
(3.17) that

f(ũ) ≤ f(ū).

Hence, we have shown the optimality of ũ for (P) and again, due to the uniqueness of ū, we have
ũ = ū. Notice that the latter equality lim_{n→∞} f(S*v_n) = f(ū) implies the convergence of {S*v_n}
in norm and hence, together with the weak convergence, the strong convergence of {S*v_n} to ũ is
verified.
4. Semismooth Newton algorithm. Based on the experience in [11, 8, 9], the semismooth
Newton method is quite efficient when dealing with state-constrained optimal control problems,
mainly due to its superlinear convergence and mesh-independence properties. Quite recently, this
was analyzed and verified numerically also for a similar Lavrentiev regularization technique applied
to distributed optimal control problems [9]. Our
goal in this section is to present a semismooth Newton algorithm based on the first-order optimality
conditions (3.6)-(3.10) for the regularized problem (Pλ ). The analysis for the mesh-independence
principle is not included here; it is a subject of our ongoing research. Following [6, 8], we now
define Newton (generalized) differentiability.
Definition 4.1. Let X, Y be Banach spaces and U be an open set in X. A mapping F : U → Y
is said to be semismooth (or Newton differentiable) in U if there exists a (possibly set-valued)
mapping ∂F : U ⇒ L(X, Y) such that

sup_{V ∈ ∂F(x+h)} ‖F(x + h) − F(x) − V h‖_Y = o(‖h‖_X) as h → 0

is satisfied for all x ∈ U. We call ∂F the Newton differential, and its elements V are referred to
as Newton maps.
In the following, we derive a semismooth Newton algorithm based on the concept above. Let
us start by reformulating the complementarity system (3.10) in the optimality conditions by the
following max-formulation.
Lemma 4.1. The complementarity conditions (3.10) are equivalent to the max-formulation
(4.2)–(4.3) with arbitrarily fixed ξ ∈ ℝ, provided that the max operator M is defined from L^{q2}(Ω)
to L^{q1}(Ω) with 1 ≤ q1 < q2 ≤ ∞.
Setting now the special choice γ := ε(λ)/λ² in (4.2)–(4.3), short computations based on equation
(3.8) of the optimality system yield

(4.5)    µ_λ^a = max(0, (1/λ) q + (ε(λ)/λ²)(ya − ȳ_λ)),

(4.6)    µ_λ^b = max(0, −(1/λ) q + (ε(λ)/λ²)(ȳ_λ − yb)).
The maximum operators above are defined from L^{q2}(Ω) to L^{q1}(Ω) with 1 ≤ q1 < q2 and thus the
application of the semismooth Newton method is justified. Our algorithm is based on the particular
choice ξ = 0 for the Newton map. Then, the semismooth Newton algorithm is equivalent to an
active-set strategy. The complete algorithm is defined by the following steps, cf. also [8, 9] for the
details.
Algorithm 4.1.
(i) Initialization: Choose initial data q⁰, y⁰ ∈ L²(Ω) and set l = 0.
(ii) Set the active and inactive sets:

A_a^l = {x ∈ Ω : (1/λ) q^l(x) + (ε(λ)/λ²)(ya(x) − y^l(x)) > 0},
A_b^l = {x ∈ Ω : −(1/λ) q^l(x) + (ε(λ)/λ²)(y^l(x) − yb(x)) > 0},
I^l = Ω \ (A_a^l ∪ A_b^l).

(iii) Find the solution (y^{l+1}, q^{l+1}, p^{l+1}, w^{l+1}, v^{l+1}, µ_a^{l+1}, µ_b^{l+1}) of

A*q^{l+1} = 0 in Ω,
∂n q^{l+1} = αw^{l+1} + p^{l+1} on Γ,
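The active-set loop can be sketched in finite dimensions. The following Python code (our own minimal analogue of Algorithm 4.1, not the full coupled PDE system; the quadratic model and all matrices are hypothetical stand-ins) applies the primal-dual active set strategy to a box-constrained quadratic program with an upper bound, mirroring the structure of steps (i)–(iii):

```python
import numpy as np

def primal_dual_active_set(Q, b, psi, c=1.0, max_iter=50):
    """Minimize 0.5*z'Qz - b'z subject to z <= psi via the primal-dual
    active set strategy (a finite-dimensional analogue of Algorithm 4.1)."""
    n = len(b)
    z, mu = np.zeros(n), np.zeros(n)       # step (i): initialization
    for _ in range(max_iter):
        # step (ii): predict the active set from the current iterate
        active = mu + c * (z - psi) > 0
        inactive = ~active
        # step (iii): solve the equality-constrained system on this guess
        z = np.empty(n)
        z[active] = psi[active]            # bound holds with equality
        z[inactive] = np.linalg.solve(
            Q[np.ix_(inactive, inactive)],
            b[inactive] - Q[np.ix_(inactive, active)] @ psi[active])
        mu = np.zeros(n)
        mu[active] = (b - Q @ z)[active]   # multiplier from stationarity
        if np.array_equal(mu + c * (z - psi) > 0, active):
            break                          # active set has settled: KKT holds
    return z, mu

# small well-conditioned example
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
Q = A @ A.T / 8 + 2 * np.eye(8)
b = 5 * rng.standard_normal(8)
psi = np.full(8, 0.2)
z, mu = primal_dual_active_set(Q, b, psi)
assert np.all(z <= psi + 1e-10)                    # feasibility
assert np.all(mu >= -1e-10)                        # multiplier sign
assert np.allclose(Q @ z - b + mu, 0, atol=1e-8)   # stationarity
```

At convergence the predicted active set reproduces itself, which is exactly the finite-dimensional counterpart of the stopping criterion used for Algorithm 4.1.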
5. Numerical Experiments. Our numerical report is split into two parts. First, we
confirm the numerical reliability of our regularization approach for solving some classes of state-constrained
optimal boundary control problems. In particular, we aim at investigating the influence
of the regularization parameter on the algorithm. To this purpose, a test example with analytically
known solution to the problem (2.1) will be considered. As pointed out in Remark 2.1, our theory
is applicable also to this more general problem (2.1). By means of this example, the numerical
approximation of our method as well as its convergence behavior will be analyzed. We also study
briefly a nested iteration technique, based on a multigrid concept, to gain a higher efficiency of the
algorithm.
Second, we compare our technique based on the ”optimize-then-discretize” concept with the
standard numerical optimization code QUADPROG of the MATLAB optimization toolbox applied
to the discretized problem (”discretize-then-optimize”). We mainly aim at showing that our method
exhibits a reasonable performance. We do not intend to compare the concepts ”optimize-then-
discretize” and ”discretize-then-optimize”, since this would require various test runs with common
available nonlinear optimization codes. A detailed study of the method ”discretize-then-optimize”
was carried out by Maurer and Mittelmann [13].
Our experience showed that the semismooth Newton method applied to the Lavrentiev type
regularization was as efficient as QUADPROG and partially even more advantageous. We point
out that all the numerical computations in this paper were carried out on a PC with a 2.50-GHz
AMD processor and 16 gigabytes of memory.
5.1. Discretization. As noticed earlier, the regularization technique that we propose here is
based on the optimize-then-discretize strategy. In the following, we explain the discretization of
Algorithm 4.1 and refer to the algorithm under this discretization as ”OTD”.
Throughout the experiment, we use for simplicity the unit square domain Ω = (0, 1)×(0, 1) and
set A = −∆ + I. We discretize Ω by a regular Friedrichs-Keller triangulation with mesh size h and
the mesh on Γ is induced by that on Ω. The partial differential equations (PDEs) are approximated
by the finite element method. The state space H¹(Ω) ∩ C(Ω̄) is discretized by the span of the
standard finite element basis {φ_h^1(x), …, φ_h^{N_h}(x)} consisting of the piecewise linear and continuous
hat functions on Ω̄. Analogously, we define the discrete control space with the standard finite
element basis {ψ_h^1(x), …, ψ_h^{M_h}(x)}, composed of the piecewise linear and continuous hat functions
defined on the boundary Γ. Hence, the state equation is approximated by the system of linear
equations
Ah yh = Bh uh + Mh eh ,
where the matrices A_h, M_h ∈ ℝ^{N_h×N_h} and B_h ∈ ℝ^{N_h×M_h} are given by

(A_h)_{ij} = (∇φ_h^i, ∇φ_h^j)_{L²(Ω)} + (φ_h^i, φ_h^j)_{L²(Ω)},
(B_h)_{ij} = (φ_h^i, ψ_h^j)_{L²(Γ)},
(M_h)_{ij} = (φ_h^i, φ_h^j)_{L²(Ω)}.
Here, the vectors y_h ∈ ℝ^{N_h}, u_h ∈ ℝ^{M_h} and e_h ∈ ℝ^{N_h} serve as the discrete approximations of the
state, the control and the fixed function e, respectively, with mesh size h. For instance, y_h^i is the
numerical approximation of the value y(x_i) at the node x_i that is associated with the ansatz
function φ_h^i. The remaining PDEs in the optimality system associated with (P_λ) are analogously
discretized. The active sets are discretized by the approximated values of corresponding functions
at the nodes. For instance, the discretization of A_a is given by

{i : (1/λ) q_h^i + (ε(λ)/λ²)(y_{a,h}^i − y_h^i) > 0},

where the vectors q_h, y_{a,h} ∈ ℝ^{N_h} serve as the discrete approximations of q and ya, respectively,
with mesh size h. Algorithm 4.1 is implemented in this way.
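As an illustration of the assembly described above (our own sketch, not the authors' code; the boundary matrix B_h is omitted for brevity), the following Python routine assembles A_h and M_h for A = −∆ + I with P1 elements on a Friedrichs-Keller triangulation of the unit square:

```python
import numpy as np

def assemble_fk(m):
    """Assemble A_h (stiffness + mass, for A = -Laplace + I) and M_h (mass)
    with P1 elements on a Friedrichs-Keller triangulation of the unit square
    with m+1 nodes per side (mesh size h = 1/m)."""
    h = 1.0 / m
    nodes = np.array([(i * h, j * h) for j in range(m + 1) for i in range(m + 1)])
    N = len(nodes)
    A = np.zeros((N, N)); M = np.zeros((N, N))
    def idx(i, j): return j * (m + 1) + i
    tris = []
    for j in range(m):
        for i in range(m):
            # each cell is split along the same diagonal (Friedrichs-Keller)
            tris.append((idx(i, j), idx(i + 1, j), idx(i + 1, j + 1)))
            tris.append((idx(i, j), idx(i + 1, j + 1), idx(i, j + 1)))
    for tri in tris:
        p = nodes[list(tri)]
        B = np.column_stack((p[1] - p[0], p[2] - p[0]))
        area = 0.5 * abs(np.linalg.det(B))
        # gradients of the three P1 hat functions on this triangle
        G = np.linalg.inv(B).T @ np.array([[-1.0, 1.0, 0.0], [-1.0, 0.0, 1.0]])
        Ke = area * (G.T @ G)                              # element stiffness
        Me = area / 12.0 * (np.ones((3, 3)) + np.eye(3))   # element mass (P1)
        for a in range(3):
            for b in range(3):
                A[tri[a], tri[b]] += Ke[a, b] + Me[a, b]
                M[tri[a], tri[b]] += Me[a, b]
    return A, M

A_h, M_h = assemble_fk(8)
# sanity checks: symmetry; mass entries sum to |Omega| = 1; the stiffness
# part annihilates constants, so A_h entries also sum to |Omega| = 1
assert np.allclose(A_h, A_h.T) and np.allclose(M_h, M_h.T)
assert abs(M_h.sum() - 1.0) < 1e-10 and abs(A_h.sum() - 1.0) < 1e-10
```

The same element loops extend to B_h by integrating products of interior and boundary hat functions along the edges on Γ.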
5.2. Test example 1. To construct the example, we first define the optimal state y_opt, the
adjoint state p_opt, the upper bound ψ and the fixed function e by

y_opt(x) = (2/π) sin(πx1) sin(πx2),
p_opt(x) = −0.5,
ψ(x) = max(1/π, y_opt(x)),
e(x) = −∆y_opt(x) + y_opt(x).
Notice that, in our test examples, we only consider an upper bound constraint, y ≤ ψ a.e. in Ω.
Clearly, our theory applies to this case as well. Short computations show that

µ_opt(x) := 1.7 if y(x) > 1/π,   µ_opt(x) := 0 if y(x) ≤ 1/π
fulfills the complementarity slackness condition for (P ). Setting for the desired state
µ_opt can serve as the Lagrange multiplier associated with (P). Next, by computing the normal
derivative of y_opt, one obtains the optimal boundary control u_opt, which is identical on all edges of
Ω. For example, on the lower boundary of Ω, u_opt = −2 sin(πx1). Finally, for the cost parameter α
and the desired control ud, we set:

α = 10⁻²,
ud = u_opt + (1/α) p_opt|_Γ.

For the choice of the parameter ε = ε(λ), we select throughout the numerical test ε(λ) = λ^{1+1/2}.
This clearly satisfies Assumption 3.1 (with σ0 = 1 and σ1 = 1/2).
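The constructed data can be checked numerically. The following Python snippet (a hypothetical verification script, not part of the paper) confirms by finite differences that ∂n y_opt = −2 sin(πx1) on the lower edge, and that ε(λ) = λ^{3/2} matches Assumption 3.1 with σ0 = 1, σ1 = 1/2:

```python
import math

def y_opt(x1, x2):
    return (2.0 / math.pi) * math.sin(math.pi * x1) * math.sin(math.pi * x2)

x1, d = 0.3, 1e-6
# outward normal on the lower edge x2 = 0 is (0, -1), so d_n y = -dy/dx2
dn_y = -(y_opt(x1, d) - y_opt(x1, 0.0)) / d
assert abs(dn_y - (-2.0 * math.sin(math.pi * x1))) < 1e-4

lam = 1e-3
sigma0, sigma1 = 1.0, 0.5   # the constants matching eps(lam) = lam**1.5
assert abs(lam ** 1.5 - sigma0 * lam ** (1 + sigma1)) < 1e-15
```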
Our aim now consists of investigating the numerical approximations based on Algorithm 4.1
in the case of vanishing Lavrentiev parameter λ. In Table 5.1, we report on the numerical results
when solving the problem utilizing Algorithm 4.1. We found in our test runs (h = 1/128) that
the problem becomes harder to solve for decreasing λ. Observing the second and third
columns of Table 5.1, we notice that the distance to the optimal solution is quite satisfactory and
decreases with decreasing Lavrentiev parameter λ. This confirms the result of
Table 5.1
Convergence behavior of Algorithm 4.1 with respect to decreasing Lavrentiev parameter.
λ            ‖uh − uopt‖_{L²}   ‖yh − yopt‖_{L²}   ‖µh − µopt‖_{L²}   ‖ph − popt‖_{L²}
10^{−2.0}    3.1880078e-01      3.9198985e-01      6.2201011e-01      3.5549336e-03
10^{−3.0}    3.2990915e-02      2.0899831e-02      2.4068944e-01      1.1464553e-03
10^{−4.0}    4.4104922e-03      4.5454556e-03      1.2453488e-01      7.7483476e-04
10^{−5.0}    2.5020193e-03      2.6584714e-03      6.3426567e-02      3.5686997e-04
10^{−6.0}    1.8128938e-03      1.0053254e-03      4.2474621e-02      2.6608656e-04
10^{−7.0}    1.1850217e-03      3.5760613e-04      4.0462811e-02      4.1388376e-04
10^{−8.0}    8.1408314e-04      1.3624889e-04      9.2715530e-02      1.0450310e-03
10^{−9.0}    7.3457286e-04      6.5919685e-05      2.6964435e-01      3.0716652e-03
10^{−10.0}   7.2968216e-04      4.5600222e-05      7.6555488e-01      8.5845442e-03
Theorem 3.4. At the same time, it indicates the applicability of our technique when dealing with
state-constrained optimal boundary control problems.
If the parameter λ is selected too small, the approximation of the Lagrange multiplier
turns out to be rather poor. In the fourth column of Table 5.1, we observe that the distance to
the Lagrange multiplier increases for decreasing λ ≤ 10⁻⁷, see Figure 5.2. This
effect is most likely due to the noticeably increasing ill-conditioning, which we monitored in
the experiment for decreasing λ < 10⁻⁸.
Therefore, we suggest choosing a moderate Lavrentiev parameter, λ ≈ 10⁻⁷. For this selection,
we obtained the best approximation of the desired Lagrange multiplier, see Figure 5.1.
5.3. Comparison with a commercial optimization code. As noted earlier, it is one of
our goals to compare our regularization technique based on ”optimize-then-discretize” (OTD) with
a commercial code applied to the discretized problem (”discretize-then-optimize”, DTO). To this
aim, we selected the code QUADPROG from the MATLAB optimization toolbox, since it is often
applied by users of MATLAB.
We start by briefly defining the discretized version of (P):

(Ph)    minimize    ½ (yh − ydh)ᵀ Mh (yh − ydh) + (α/2) (uh − udh)ᵀ M̃h (uh − udh)
        subject to  Ah yh = Mh eh + Bh uh,
                    yh ≤ ψh,
                    (uh, yh) ∈ RMh × RNh,
where the mass matrix M̃h is given by (M̃h)ij = (ψhi, ψhj)L2(Γ), and the vectors ψh, ydh ∈ RNh and udh ∈ RMh stand for the discretizations of the upper bound function ψ, the desired state yd, and the desired control ud, respectively. QUADPROG was used to solve this linear-quadratic problem (Ph).
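For illustration, the DTO step can be reproduced with a generic QP solver in place of QUADPROG. A minimal sketch on a made-up toy problem (a random S as stand-in for the discrete control-to-state map Ah⁻¹Bh, identity mass matrices, α and ψ chosen arbitrarily), using SciPy's SLSQP:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy stand-ins: S plays the role of the discrete control-to-state map
# A_h^{-1} B_h; the mass matrices are taken as identities for brevity.
Nh, Mh = 20, 8
S = rng.standard_normal((Nh, Mh)) / np.sqrt(Mh)
yd = rng.standard_normal(Nh)      # desired state
psi = 0.5 * np.ones(Nh)           # upper bound on the state
alpha = 1e-2                      # control cost parameter

# Reduced objective after eliminating the state via y_h = S u_h.
def J(u):
    r = S @ u - yd
    return 0.5 * r @ r + 0.5 * alpha * u @ u

# State constraint S u <= psi, written as psi - S u >= 0 for the solver.
cons = {"type": "ineq", "fun": lambda u: psi - S @ u}
res = minimize(J, np.zeros(Mh), constraints=cons, method="SLSQP")
u_h = res.x                       # discrete optimal control
```

As in (Ph), the state constraint enters the QP as a linear inequality on the (eliminated) state variable; QUADPROG treats the same structure with the state kept as an explicit unknown.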
Table 5.2 shows the results when solving the previous test example. Considering its first and second columns, the numerical approximations of DTO to the optimal state and optimal control with respect to decreasing mesh size are quite satisfactory, just as those of OTD. We point out that, in contrast to the previous numerical test for OTD, the computation based on DTO for the test example with mesh size h = 1/128 failed due to exceeding the memory of the PC.
Fig. 5.1. Computed solution based on Algorithm 4.1 for h = 1/128 and λ = 10−7 : Optimal state (upper left),
optimal control (upper right), adjoint state (lower left) and Lagrange multiplier (lower right).
Table 5.2
The L2 -error to the optimal values
Next, we aim at comparing the efficiency of our method OTD with that of QUADPROG based on the discretize-then-optimize concept. For this purpose, we test OTD again at λ = 10−7 as well as DTO with various mesh sizes and report the required CPU time for each grid. Observing Table 5.3, we detect that OTD was more efficient. On the finest grid of the numerical tests (h = 1/64), DTO required about 6.46e+03 seconds to converge, whereas OTD was at least 50 times faster. This demonstrates the computational efficiency of our regularization strategy.
5.4. Test example 2. We consider now an example without given analytical solution.
Fig. 5.2. Spurious oscillation of the Lagrange multipliers with respect to decreasing λ.
Table 5.3
CPU times for various mesh sizes.
h                 1/8        1/16       1/32       1/64
DTO               4.00e-02   1.18e+00   1.23e+02   6.46e+03
OTD (λ = 10−7)    2.00e-01   1.85e+00   1.33e+01   9.89e+01
Figure 5.3 displays the computed solution for λ = 10−9 . Again, the optimal control is identical on
all edges of Ω and hence is only plotted on the lower boundary.
Fig. 5.3. Computed solution based on Algorithm 4.1 for h = 1/128 and λ = 10−9
Table 5.4
Number of iterations for several Lavrentiev-parameter choices with fixed mesh size h = 1/128.
In Table 5.4, we provide the iteration numbers required when solving the problem with our algorithm and fixed initial data p0 = y0 = 0. Just as before, we detect that the problem is harder to solve if λ is chosen too small. Based on our numerical observations, this effect has mainly two reasons: First, the linear equations solved by the algorithm are ill-conditioned; this behavior is monitored particularly when the Lavrentiev parameter is too small. Second, the measure structure of the Lagrange multipliers associated with the upper bound complicates the numerical computation considerably. As λ → 0, the computed Lagrange multiplier approaches a Dirac measure concentrated at a single point, cf. Figure 5.4.
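The multiplier difficulties can be traced to the complementarity condition µ = max(0, µ + c(y − ψ)), which the semismooth Newton methods of [8, 11] handle through a primal-dual active set loop. A minimal sketch of that generic loop on a toy obstacle-type QP (a stand-in for, not a reproduction of, Algorithm 4.1; the matrix Q, load f, and bound ψ below are made up):

```python
import numpy as np

# Toy obstacle-type QP: minimize 1/2 y^T Q y - f^T y subject to y <= psi,
# with Q a 1-D Laplacian. The loop below is the primal-dual active set
# (semismooth Newton) iteration for mu = max(0, mu + c*(y - psi)).
n = 99
h = 1.0 / (n + 1)
Q = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
f = 50.0 * np.ones(n)          # constant load pushing y against the bound
psi = 0.05 * np.ones(n)        # upper bound y <= psi
c = 1.0                        # active set parameter

y, mu = np.zeros(n), np.zeros(n)
for k in range(50):
    active = mu + c * (y - psi) > 0
    A, I = np.where(active)[0], np.where(~active)[0]
    y_new, mu_new = np.empty(n), np.zeros(n)
    y_new[A] = psi[A]                                  # clamp on active set
    rhs = f[I] - Q[np.ix_(I, A)] @ psi[A]
    y_new[I] = np.linalg.solve(Q[np.ix_(I, I)], rhs)   # reduced solve
    mu_new[A] = (f - Q @ y_new)[A]                     # multiplier residual
    y, mu = y_new, mu_new
    if np.array_equal(active, mu + c * (y - psi) > 0): # active set settled
        break
```

At convergence, y is feasible, µ vanishes off the active set, and µ ≥ 0 on it; the discrete multiplier is supported only where the bound is attained, which is the discrete shadow of its measure character.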
5.5. Nested iteration. In this last part of the paper, we briefly introduce an interpolation technique with the objective of obtaining a faster method. Our experience in handling the regularized distributed control problem, cf. [9], indicates that a simple nested iteration scheme, based on a coarse-to-fine grid strategy, may improve the efficiency of the algorithm. In fact, this method is also reliable for boundary control problems and can significantly accelerate the convergence. Let us first briefly explain the nested iteration method: We set up a sequence of grids Ωk with mesh sizes hk = hk−1/2 and start by solving the problem on the coarsest grid Ω0. Subsequently, we interpolate
Fig. 5.4. Computed Lagrange multipliers associated with the upper bound: λ = 10−4 , λ = 10−6 , λ = 10−7 and
λ = 10−9 (from left to right).
the result to the next finer grid Ω1 by the nine-point prolongation, see [7], and use it as initial data for the algorithm with the finer mesh size h1. We repeat this process until the desired mesh size is reached. In Table 5.5, we present a comparison of the results for the first test example with λ = 10−5 based on Algorithm 4.1 (OTD) and those based on the nested iteration strategy.
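The coarse-to-fine sweep just described can be sketched in 1-D. In the sketch below, `solve` is a placeholder for Algorithm 4.1 (here just a direct solve of the model problem −u″ = 1, which ignores the warm start but keeps the interface), and linear interpolation stands in for the nine-point prolongation used on 2-D grids:

```python
import numpy as np

def solve(n, u0):
    """Placeholder for Algorithm 4.1 on a grid with n interior points:
    a direct solve of the 1-D model problem -u'' = 1, u(0) = u(1) = 0.
    An iterative solver would use u0 as warm start; the direct solve
    here ignores it and only keeps the calling convention."""
    h = 1.0 / (n + 1)
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    return np.linalg.solve(A, np.ones(n))

def prolongate(uc):
    """Linear interpolation from n interior points to 2n + 1: a 1-D
    stand-in for the nine-point prolongation of [7] on 2-D grids."""
    n = len(uc)
    padded = np.concatenate(([0.0], uc, [0.0]))    # homogeneous boundary
    fine = np.empty(2 * n + 1)
    fine[1::2] = uc                                # coincident nodes
    fine[0::2] = 0.5 * (padded[:-1] + padded[1:])  # new midpoints
    return fine

# Coarse-to-fine sweep: solve on the coarsest grid (h0 = 1/8), then
# repeatedly prolongate and re-solve until h = 1/128 is reached.
n = 7
u = solve(n, np.zeros(n))
for _ in range(4):
    u = prolongate(u)
    n = 2 * n + 1
    u = solve(n, u)
```

With an iterative inner solver, the prolongated coarse solution is already close to the fine-grid solution, so only few iterations remain on each finer grid, which is the speed-up reported in Table 5.5.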
In the second row of Table 5.5, we report the iteration numbers as well as the CPU time required by OTD with fixed initial data y0 = p0 = 0. The row entitled Interpolation displays the performance of the algorithm combined with the nested iteration scheme (NIS), including its required CPU time for each grid. Clearly, we find a significant speed-up of the algorithm under this coarse-to-fine grid sweep: Comparing the accumulated CPU time of NIS with that of OTD at h = 1/128 with fixed initial data, we infer that NIS is more than three times faster.
Certainly, a similar improvement of the performance might have been obtained for DTO based on QUADPROG under this nested iteration scheme. We did not investigate this alternative.
Our second noticeable observation is the mesh-independent behavior of OTD: Regardless of the mesh size of the discretization, the iteration numbers of OTD with fixed initial data remain constant at six. This might correspond to the mesh independence principle for a similar regularization technique applied to distributed optimal control problems [9]. Notice that after regularization, our problem is equivalent to a control-constrained one (cf. our remarks before Theorem 3.3).
Table 5.5
Mesh independence behavior of OTD and speed-up under a coarse-to-fine mesh sweep.
h                     1/32        1/64        1/128
Fixed mesh (OTD)      6           6           6
CPU time              6.420e+00   4.600e+01   4.208e+02
Interpolation (NIS)   6           1           1
CPU time              6.420e+00   1.221e+01   1.278e+02
REFERENCES
[1] J.-J. Alibert and J.-P. Raymond. Boundary control of semilinear elliptic equations with discontinuous leading
coefficients and unbounded controls. Numer. Funct. Anal. and Optimization, 3&4:235–250, 1997.
[2] M. Bergounioux, K. Ito, and K. Kunisch. Primal-dual strategy for constrained optimal control problems. SIAM
J. Control and Optimization, 37:1176–1194, 1999.
[3] M. Bergounioux and K. Kunisch. Primal-dual active set strategy for state-constrained optimal control problems.
Computational Optimization and Applications, 22:193–224, 2002.
[4] E. Casas. Control of an elliptic problem with pointwise state constraints. SIAM J. Control and Optimization,
4:1309–1322, 1986.
[5] E. Casas. Boundary control of semilinear elliptic equations with pointwise state constraints. SIAM J. Control
and Optimization, 31:993–1006, 1993.
[6] X. Chen, Z. Nashed, and L. Qi. Smoothing methods and semismooth methods for nondifferentiable operator
equations. SIAM J. Numer. Anal., 38(4):1200–1216 (electronic), 2000.
[7] W. Hackbusch. Multigrid methods and applications, volume 4 of Springer Series in Computational Mathemat-
ics. Springer-Verlag, Berlin, 1985.
[8] M. Hintermüller, K. Ito, and K. Kunisch. The primal-dual active set strategy as a semismooth Newton method.
SIAM J. Optim., 13:865–888, 2003.
[9] M. Hintermüller, F. Tröltzsch, and I. Yousept. Mesh-independence of semismooth Newton methods for
Lavrentiev-regularized state constrained nonlinear optimal control problems. 2006.
[10] K. Ito and K. Kunisch. Augmented Lagrangian methods for nonsmooth, convex optimization in Hilbert spaces.
Nonlinear Analysis TMA, 41:591–616, 2000.
[11] K. Ito and K. Kunisch. Semi-smooth Newton methods for state-constrained optimal control problems. Systems
and Control Letters, 50:221–228, 2003.
[12] M. M. Lavrentiev. Some Improperly Posed Problems of Mathematical Physics. Springer, New York, 1967.
[13] H. Maurer and H. D. Mittelmann. Optimization techniques for solving elliptic control problems with control
and state constraints. I: Boundary control. J. of Computational and Applied Mathematics, 16:29–55, 2000.
[14] C. Meyer, A. Rösch, and F. Tröltzsch. Optimal control of PDEs with regularized pointwise state constraints.
Computational Optimization and Applications, 33:209–228, 2006.
[15] C. Meyer and F. Tröltzsch. On an elliptic optimal control problem with pointwise mixed control-state constraints. In A. Seeger, editor, Recent Advances in Optimization. Proceedings of the 12th French-German-Spanish Conference on Optimization held in Avignon, September 20–24, 2004, Lecture Notes in Economics and Mathematical Systems, Vol. 563, pages 187–204. Springer-Verlag, 2006.