V. CONCLUSION

For the general LQG stochastic differential game with differing observations for the two players, the pair of Nash equilibrium solutions known to us is unimplementable. However, when one of the players is assumed to "spy" on the other, the resulting solutions are implementable by finite-dimensional systems. A similar situation holds when the cost criterion of one of the players is modeled by the exponential of a quadratic form. Also, using the "spy" situation as a worst case, a lower bound for the performance in the game of any one of the players may be obtained; finite-dimensionally implementable solutions guaranteeing this lower bound are also exhibited.

Manuscript received December 19, 1979; revised June 5, 1980. Paper recommended by A. J. Laub, Chairman of the Computational Methods and Discrete Systems Committee. The authors are with the Department of Control Engineering, Faculty of Engineering Science, Osaka University, Toyonaka, Osaka 560, Japan.

I. INTRODUCTION

When we solve optimal control problems numerically, we sometimes encounter convergence difficulties. That is, it is difficult to find a nominal solution such that an algorithm starting from it is stable. Since the differential dynamic programming (DDP) technique was proposed by Jacobson and Mayne [1], several computational methods for optimal control problems which ensure the convergence of the algorithm have been invented. Among others, Mayne and Polak [2] proposed a DDP-type algorithm and proved that every limit point generated by their algorithm satisfies the optimality condition. However, their procedure of successively constructing controls is complicated. Ohno [3] presented a new approach to discrete-time systems and proved local convergence of the algorithm. Järmark [4], [5] proposed a convergence control parameter technique for controlling the convergence. Although this technique seems to work well, its mathematical mechanism has not been clarified yet.

In this paper we present a simple algorithm for computing the optimal control, which is analogous to the first-order DDP but is based essentially upon the Pontryagin minimum (or maximum) principle rather than dynamic programming. We employ the convergence control technique in the algorithm, and we consider the global convergence conditions for the algorithm. An example is worked out by using our algorithm.

II. OPTIMAL CONTROL PROBLEM AND COMPUTATIONAL PROCEDURE

We consider a dynamical system defined on a fixed time interval T = [t_0, t_1] and described by

    dx(t)/dt = f(x(t), u(t), t),    x(t_0) = x_0,                                (1)

where x(t) is an n-dimensional state vector, u(t) is an r-dimensional control vector, and the vector-valued function f(x, u, t) should satisfy some differentiability and continuity conditions which will be stated later. The control vectors are required to satisfy the constraint

    u(t) ∈ U,    t ∈ T,                                                          (2)

where U is a compact and convex subset of the r-dimensional Euclidean space. The class Ω of admissible controls is defined as the set of all measurable functions u: T → U satisfying (2).

The problem is to find the optimal control u ∈ Ω that minimizes the cost functional

    J(u) = ∫_{t_0}^{t_1} L(x(t), u(t), t) dt.                                    (3)

Note that a cost functional of the form (4), which also contains a terminal cost term, can be represented as (3) by setting the integrand as in (5).
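One standard way to carry out such a reduction (a sketch; the symbols θ and L_0 are introduced only for illustration) is the following: if the cost consists of a running term L_0 plus a continuously differentiable terminal term θ(x(t_1)), then

\[
\theta(x(t_1)) + \int_{t_0}^{t_1} L_0(x(t),u(t),t)\,dt
 \;=\; \theta(x_0) + \int_{t_0}^{t_1} \Big[ L_0(x(t),u(t),t) + \theta_x(x(t))\,f(x(t),u(t),t) \Big]\,dt ,
\]

since dθ(x(t))/dt = θ_x(x(t)) f(x(t), u(t), t) along the trajectories of (1). The constant θ(x_0) does not depend on u, so minimizing this functional is equivalent to minimizing a functional of the form (3) with L = L_0 + θ_x f.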
… if (8) holds for any i > i_0, then we may conclude that the sequence {u^i(t)} of the controls has converged.

Step 2: Define the function

    K(x, u, λ, t; v, C) = H(x, u, λ, t) + (u − v)^T C (u − v),                   (9)

where C = diag(c_1, ..., c_r), c_1, ..., c_r ≥ 0. Select a nonnegative diagonal matrix C^i properly. Determine x^i(t) and u^i(t), t ∈ T, which satisfy both

    K(x^i(t), u^i(t), λ^{i−1}(t), t; u^{i−1}(t), C^i)
        = H(x^i(t), u^i(t), λ^{i−1}(t), t)
          + (u^i(t) − u^{i−1}(t))^T C^i (u^i(t) − u^{i−1}(t))
        = min_{u ∈ U} K(x^i(t), u, λ^{i−1}(t), t; u^{i−1}(t), C^i)              (10)

and the differential equation

    dx^i(t)/dt = f(x^i(t), u^i(t), t),    x^i(t_0) = x_0.                        (11)

This is possible by integrating (11) from t_0 to t_1 while seeking the u^i(t) that minimizes K.

Step 3: Calculate

    J(u^i) = ∫_{t_0}^{t_1} L(x^i(t), u^i(t), t) dt.                              (12)

If J(u^i) − J(u^{i−1}) > 0, make the elements of C^i larger and go to Step 2. Otherwise, set i := i + 1 and go to Step 1. Stop the computation if the sequence {C^i} is bounded for all i and the sequence {u^i(t)} of the controls converges.

In this algorithm we minimize K instead of H. Since the function K contains the quadratic penalty term (u − u^{i−1})^T C^i (u − u^{i−1}) penalizing a possibly large change of the control, instability of the algorithm at the first stage of the computation can be avoided by taking C^i large. This idea is due to Järmark [4], [5].

In Step 2 of the algorithm, we have to determine x^i(t) and u^i(t) satisfying both (10) and (11). For this purpose, we propose the following implementable algorithm. We replace the differential equation (11) by a difference equation with a uniform step length Δ > 0 and proceed as follows.

i) Set k = 0.
ii) Given x^i(t_0 + kΔ) ≡ x^i(k), determine u^i(t_0 + kΔ) = u^i(k) via (10).
iii) Using the approximation x^i(k + 1) = x^i(k) + Δ f(x^i(k), u^i(k), t_0 + kΔ), compute the next state; then set k := k + 1 and return to ii) until t_1 is reached.
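A sketch of one pass of this discretized procedure is given below. It is illustrative only: the Hamiltonian is taken to be H = L + λᵀf, the minimization of K in ii) is done by a crude search over a finite sample of U, and the cost is accumulated by the rectangle rule; none of these choices is prescribed by the text.

    import numpy as np

    def forward_pass(f, L, lam_prev, u_prev, x0, t0, dt, N, C, U_grid):
        """One execution of Steps 2 and 3 on a uniform grid (illustrative sketch).

        f(x, u, t), L(x, u, t): problem data;
        lam_prev[k], u_prev[k]: samples of lambda^{i-1} and u^{i-1} on the grid;
        C: nonnegative diagonal penalty matrix C^i;
        U_grid: finite sample of the admissible set U.
        """
        x = np.asarray(x0, dtype=float)
        xs, us, J = [x.copy()], [], 0.0
        for k in range(N):
            t = t0 + k * dt
            lam, v = lam_prev[k], u_prev[k]

            # Step 2, eq. (10): minimize K = H + (u - v)^T C (u - v) over U,
            # here by brute force over the finite sample U_grid.
            def K(u):
                H = L(x, u, t) + lam @ f(x, u, t)
                return H + (u - v) @ C @ (u - v)

            u = min(U_grid, key=K)

            # Step 3, eq. (12): accumulate the cost before stepping the state.
            J += dt * L(x, u, t)

            # Difference equation of step iii) (forward Euler for (11)).
            x = x + dt * f(x, u, t)
            us.append(u)
            xs.append(x.copy())
        return np.array(xs), np.array(us), J

An outer loop would alternate this pass with a backward integration of the adjoint equation (8) along the new trajectory and with the enlargement of C^i required in Step 3 whenever the cost increases.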
In Step 2 of the algorithm, the minimization of the function K with respect to u has to be performed for each grid point of T. It should be noted that in most cases the minimizing point u^i(t) ∈ U can be calculated analytically and expressed by simple equations.
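For instance (an illustrative special case, not the general setting of the paper), suppose u is scalar, U = [a, b], and H is quadratic in u, say H = h_0 + h_1 u + h_2 u^2 with h_2 ≥ 0. With a penalty weight c > 0, K is a strictly convex quadratic in u, so the constrained minimizer is the stationary point clipped to [a, b]:

\[
\frac{\partial K}{\partial u} = h_1 + 2h_2 u + 2c\,(u - v) = 0
\quad\Longrightarrow\quad
u^{*} = \min\Big\{\, b,\; \max\Big\{\, a,\; \frac{2cv - h_1}{2(h_2 + c)} \Big\}\Big\}.
\]

When H is separable in the components of u and C is diagonal, the same formula applies componentwise, which is why a box constraint set U keeps the per-grid-point minimization inexpensive.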
III. ASSUMPTIONS AND SOME PROPOSITIONS

In order to consider the convergence of the algorithm, we make the following assumptions throughout this paper.

Assumption 1: The functions f_i(x, u, t) (i = 1, ..., n) and L(x, u, t) and their partial derivatives f_{ix}, f_{iu}, f_{ixx}, f_{ixu}, f_{iuu}, L_x, L_u, L_{xx}, L_{xu}, L_{uu} are continuous on R^n × U × T.

Assumption 2: For any admissible control u(·) ∈ Ω, there exists a uniformly bounded solution x(t; u), t ∈ T, of (1). In other words, there is a solution x(t; u) of (1) that satisfies

    ||x(t; u)|| ≤ M_1                                                            (15)

for any t ∈ T and for any u ∈ Ω, where M_1 is a constant independent of t and u. We denote by X the convex set of R^n given by

    X = {x ∈ R^n : ||x|| ≤ M_1}.                                                 (16)

If there is a constant c such that the inequality

    ||f(x, u, t)|| ≤ c(||x|| + 1)                                                (17)

holds for any (x, u, t) ∈ R^n × U × T, then it is easily seen that Assumption 2 holds. In fact, integrating the inequality

    d||x(t; u)||/dt ≤ ||f(x(t; u), u(t), t)|| ≤ c(||x(t; u)|| + 1),

we have

    ||x(t; u)|| ≤ (||x(t_0)|| + 1) e^{c(t_1 − t_0)} − 1.
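In more detail, dividing the differential inequality by ||x(t; u)|| + 1 and integrating from t_0 to t gives

\[
\frac{d}{dt}\log\big(\|x(t;u)\|+1\big)\le c
\;\Longrightarrow\;
\|x(t;u)\|+1 \le \big(\|x(t_0)\|+1\big)e^{c(t-t_0)}
\le \big(\|x(t_0)\|+1\big)e^{c(t_1-t_0)} ,
\]

so Assumption 2 holds with M_1 = (||x_0|| + 1) e^{c(t_1 − t_0)} − 1.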
Proposition 1: The function λ^i(t) defined as the solution of (8) is uniformly bounded, i.e., there is a constant M_2 independent of t and i such that

    ||λ^i(t)|| ≤ M_2                                                             (18)

for any t ∈ T and for any i (i = 0, 1, ...).
Proof: From (8) we obtain

    −dλ^i(t)/dt = λ^i(t) f_x(x^i(t), u^i(t), t) + L_x(x^i(t), u^i(t), t),    λ^i(t_1) = 0.

Since f_x(x, u, t) and L_x(x, u, t) are continuous on the compact set X × U × T, there are constants a and b such that

    ||f_x(x^i(t), u^i(t), t)|| ≤ a,    ||L_x(x^i(t), u^i(t), t)|| ≤ b.

Let τ = t_1 − t and define …
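One way to complete the estimate from the bounds a and b is Gronwall's inequality; the function μ^i below is introduced only for this sketch (and a > 0 is assumed):

\[
\mu^i(\tau) = \lambda^i(t_1-\tau), \qquad
\Big\|\frac{d\mu^i(\tau)}{d\tau}\Big\| \le a\,\|\mu^i(\tau)\| + b, \qquad \mu^i(0)=0 ,
\]
\[
\|\lambda^i(t)\| = \|\mu^i(t_1-t)\| \le \frac{b}{a}\Big(e^{a(t_1-t_0)}-1\Big) =: M_2 ,
\]

which is independent of t and i, as claimed.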
Proposition 2: There is a constant M_3 independent of t and i such that

    c_i ||u^i(t) − u^{i−1}(t)|| ≤ M_3                                            (22)

for any t ∈ T and for any i, where c_i > 0 is the minimum element of the diagonal matrix C^i = diag(c_1, ..., c_r).

Proof: From (9) and (10) we obtain

    c_i ||u^i(t) − u^{i−1}(t)||^2 ≤ H_u(x^i(t), ū^i(t), λ^{i−1}(t), t)(u^{i−1}(t) − u^i(t)),   (24)

where ū^i(t) lies on the segment joining u^{i−1}(t) and u^i(t). Since H_u(x, u, λ, t) is continuous on the compact set X × U × Λ × T, where Λ = {λ ∈ R^n : ||λ|| ≤ M_2},

    ||H_u(x^i(t), ū^i(t), λ^{i−1}(t), t)|| ≤ M_3.                                (25)

From (24) and (25) we obtain (22).   Q.E.D.

By Proposition 2, if we select a large C^i, then the variation of the control, u^i(t) − u^{i−1}(t), is kept small and the stability of the algorithm is ensured.

The following assumption will usually hold in most optimal control problems.

Assumption 3: There is a nonnegative definite matrix R such that

    H_{uu}(x, u, λ, t) ≥ R ≥ 0                                                   (26)

for any x ∈ X, u ∈ U, λ ∈ Λ, and t ∈ T.

IV. REDUCTION OF THE COST

Proposition 3: There is a constant M > 0 independent of i such that the inequality (27) holds for any i, where r > 0 is the minimum eigenvalue of R and c_i is the minimum element of the nonnegative diagonal matrix C^i. If we choose C^i such that

    c_i > c_0    (i = 1, 2, ...),                                                (28)

where c_0 is a constant satisfying

    c_0 > (M − r)/4,                                                             (29)

then the sequence {J(u^i)} of the cost functionals decreases monotonically and converges.

Proof: First we prove (27). In view of (6) and (12), we see that

    J(u^i) − J(u^{i−1})
      = ∫_{t_0}^{t_1} [ H(x^i, u^i, λ^{i−1}, t) − H(x^i, u^{i−1}, λ^{i−1}, t)
          + H(x^i, u^{i−1}, λ^{i−1}, t) − H(x^{i−1}, u^{i−1}, λ^{i−1}, t) − λ^{i−1} δẋ^i(t) ] dt
      = ∫_{t_0}^{t_1} [ H_u(x^i, u^i, λ^{i−1}, t) δu^i − (1/2)(δu^i)^T H_{uu}(x^i, ū^i, λ^{i−1}, t) δu^i
          + H_x(x^{i−1}, u^{i−1}, λ^{i−1}, t) δx^i
          + (1/2)(δx^i)^T H_{xx}(x̄^i, u^{i−1}, λ^{i−1}, t) δx^i − λ^{i−1} δẋ^i(t) ] dt,        (31)

where δu^i = u^i − u^{i−1}, δx^i = x^i − x^{i−1}, and ū^i(t), x̄^i(t) denote intermediate points given by the mean value theorem. From (9) and (10) we obtain

    [ H_u(x^i(t), u^i(t), λ^{i−1}(t), t) + 2(u^i(t) − u^{i−1}(t))^T C^i ] (u − u^i(t)) ≥ 0      (32)

for any u ∈ U. Thus the first term of the right-hand side of (31) is nonpositive. In view of (8), we obtain

    ∫_{t_0}^{t_1} [ H_x(x^{i−1}, u^{i−1}, λ^{i−1}, t) δx^i − λ^{i−1} δẋ^i ] dt
      = −[ λ^{i−1}(t) δx^i(t) ]_{t=t_0}^{t=t_1} = 0.                             (33)

Furthermore, since H_{xx}(x, u, λ, t) is continuous on the compact set X × U × Λ × T, there is a constant M_4 independent of t and i such that

    ||H_{xx}(x̄^i(t), u^{i−1}(t), λ^{i−1}(t), t)|| ≤ M_4                          (34)

for any t ∈ T and for any i. Using relations (31)-(34) and Assumption 3, we obtain the estimate (27) on J(u^i) − J(u^{i−1}).
It is easily seen that there exist positive constants α_1 and α_2 such that

    ||dx^i(t)/dt − dx^{i−1}(t)/dt|| ≤ α_1 ||x^i(t) − x^{i−1}(t)|| + α_2 ||u^i(t) − u^{i−1}(t)||

for any t ∈ T and for any i, and the constant M in (27) is given by

    M = M_4 M_6 = M_4 M_5^2 (t_1 − t_0)^2 / 2.                                   (40)

Therefore, we obtain …
Fig. 2. Control functions for various iterations.
… from which we see that

    lim_{i→∞} J(u^i) = J(ū).                                                     (51)

Q.E.D.

VI. A NUMERICAL EXAMPLE

We consider the following control problem described by the Rayleigh equation:

    dx_1(t)/dt = x_2(t),
    dx_2(t)/dt = −x_1(t) + 1.4 x_2(t) − 0.14 x_2(t)^3 + 4u(t),                   (52)

with the initial condition

    x_1(0) = −5,    x_2(0) = −5.

The problem is to find the optimal control u(t), 0 ≤ t ≤ 2.5, that minimizes the cost functional

    …

under the constraint

    −1 ≤ u(t) ≤ 1.

This problem was solved in [1] by the second-order DDP. Note that H_{uu} = 2 in this case.
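For concreteness, the following sketch sets up this example numerically. The running cost x_1^2 + u^2 is an assumption made for this sketch (it is the usual choice for the Rayleigh benchmark and is consistent with H_{uu} = 2); the grid size and the forward-Euler rollout are likewise illustrative choices.

    import numpy as np

    # Rayleigh system (52): x1' = x2, x2' = -x1 + 1.4*x2 - 0.14*x2**3 + 4*u
    def f(x, u, t):
        return np.array([x[1], -x[0] + 1.4 * x[1] - 0.14 * x[1] ** 3 + 4.0 * u[0]])

    # Assumed running cost L = x1^2 + u^2 (so that H_uu = 2).
    def L(x, u, t):
        return x[0] ** 2 + u[0] ** 2

    t0, t1, N = 0.0, 2.5, 250            # illustrative grid
    dt = (t1 - t0) / N
    x0 = np.array([-5.0, -5.0])          # initial condition of (52)
    u_nominal = [np.array([-0.5])] * N   # nominal control u(t) = -0.5, as in [1]

    def rollout_cost(u_seq):
        """Integrate (52) by forward Euler and accumulate the cost."""
        x, J = x0.copy(), 0.0
        for k, u in enumerate(u_seq):
            t = t0 + k * dt
            J += dt * L(x, u, t)
            x = x + dt * f(x, u, t)
        return J

    print("cost of the nominal control:", rollout_cost(u_nominal))

The full procedure would combine this rollout with the backward adjoint pass and the minimization of K of Step 2, clipping the control to the interval [−1, 1] at every grid point.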
We solved this problem by our algorithm, starting from the same nominal control u(t) = −0.5 as in [1]. Fig. 1 shows the cost as a function of the iteration number. We set C^i = 0 for all i, and our algorithm found the optimal solution in several iterations, as in [1]. Compared with the result in [1], the rate of reduction of the cost by our algorithm is better than that by the second-order DDP. Fig. 2 shows the control function for various iterations. Note that each control function is continuous. Although the cost converges very fast, the convergence rate of the control functions appears to be slower. This indicates that even if the sequence of the cost functionals has already converged, the computation must be continued until the sequence of control functions converges.

VII. CONCLUDING REMARKS

Global convergence conditions have been investigated for our algorithm, which can be derived naturally from the Pontryagin minimum principle. For problems with constraints of the form g(x(t), u(t)) ≤ 0, we have not yet succeeded in proving the global convergence of the algorithm. Constraints on the terminal state can be taken into consideration by adding a penalty term θ(x(t_1)) for the terminal state constraints as in (4) and rewriting the cost functional as in (3).

In [2] it is proved that, if a successively constructed sequence {u^i} of controls has a limit point, then it satisfies the necessary conditions for optimality. In [10] it is further proved that, by extending the class of controls to the relaxed controls, at least one limit point exists that satisfies the necessary conditions for optimality. Although these results are stronger than our result, an implementable criterion for determining a limit point seems to be difficult to obtain.

The matrices C^i should be chosen adaptively, depending on the progress of the computation. In general, when the matrix C^i is smaller, the obtained variation of the control function is larger. Therefore, the C^i are desired to be as small as possible as long as the cost functionals decrease. According to our computational experience, the following way of choosing the matrices C^i is recommended. Choose the initial matrix C^1 properly. If J(u^i) > J(u^{i−1}), then enlarge C^i and repeat the iteration. If J(u^i) < J(u^{i−1}), then set C^{i+1} = aC^i, where a is a constant such that 0.5 ≤ a < 1; a = 0.8 to 0.9 seems to be a good choice.
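A minimal sketch of this adaptation rule, assuming for simplicity a scalar weight c with C^i = cI; the enlargement factor of 2 used when the cost increases is an assumption (the text only says to make C^i larger):

    def update_penalty(c, J_new, J_old, a=0.85, growth=2.0):
        """Adapt the convergence-control weight after an iteration.

        c:      current scalar penalty weight, C^i = c * I
        J_new:  J(u^i);  J_old: J(u^{i-1})
        a:      shrink factor, 0.5 <= a < 1 (0.8 to 0.9 recommended in the text)
        growth: assumed enlargement factor when the cost increases
        Returns (next weight, whether u^i is accepted).
        """
        if J_new > J_old:
            return growth * c, False   # cost increased: enlarge C^i and redo Step 2
        return a * c, True             # cost decreased: set C^{i+1} = a * C^i

    # e.g., c, accepted = update_penalty(c, J_i, J_prev)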
Computational results of applying our algorithm to much more complicated optimal control problems will be reported in a forthcoming paper.

REFERENCES

[1] D. H. Jacobson and D. Q. Mayne, Differential Dynamic Programming. New York: Elsevier, 1970.
[2] D. Q. Mayne and E. Polak, "First-order strong variation algorithms for optimal control," J. Optimiz. Theory Appl., vol. 16, pp. 277-301, 1975.
[3] K. Ohno, "A new approach to differential dynamic programming for discrete time systems," IEEE Trans. Automat. Contr., vol. AC-23, pp. 37-47, 1978.
[4] B. Järmark, "On convergence control in differential dynamic programming applied to realistic aircraft and differential game problems," in Proc. 1977 IEEE Conf. Decision Contr., pp. 471-479.
[5] B. Järmark, "A new convergence control technique in differential dynamic programming," Royal Inst. Technol., Stockholm, Sweden, Rep. TRITA-REG-7332, 1975.
[6] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes. New York: Interscience, 1962.
[7] W. Walter, Differential and Integral Inequalities. Berlin: Springer, 1970.
[8] M. R. Hestenes, Optimization Theory: The Finite Dimensional Case. New York: Wiley, 1975.
[9] E. Asplund and L. Bungart, A First Course in Integration. New York: Holt, Rinehart and Winston, 1966.
[10] L. J. Williamson and E. Polak, "Relaxed controls and the convergence of optimal control algorithms," SIAM J. Contr. Optimiz., vol. 14, pp. 737-756, 1976.