IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-25, NO. 6, DECEMBER 1980

On Global Convergence of an Algorithm for Optimal Control

Manuscript received December 19, 1979; revised June 5, 1980. Paper recommended by A. J. Laub, Chairman of the Computational Methods and Discrete Systems Committee. The authors are with the Department of Control Engineering, Faculty of Engineering Science, Osaka University, Toyonaka, Osaka 560, Japan.

I. INTRODUCTION

When we solve optimal control problems numerically, we sometimes encounter convergence difficulties. That is, it is difficult to find a nominal solution such that an algorithm starting from it is stable. Since the differential dynamic programming (DDP) technique was proposed by Jacobson and Mayne [1], several computational methods for optimal control problems which ensure the convergence of the algorithm have been invented. Among others, Mayne and Polak [2] proposed a DDP-type algorithm and proved that every limit point generated by their algorithm satisfies the optimality condition. However, their procedure of successively constructing controls is complicated. Ohno [3] presented a new approach to discrete-time systems and proved local convergence of the algorithm. Järmark [4], [5] proposed a convergence control parameter technique for controlling the convergence. Although this technique seems to work well, the mathematical mechanism of this technique has not been clarified yet.

In this paper we present a simple algorithm for computing the optimal control, which is analogous to the first-order DDP but essentially based upon the Pontryagin minimum (or maximum) principle rather than dynamic programming. We employ the convergence control technique in the algorithm, and we consider the global convergence conditions for the algorithm. An example is worked out by using our algorithm.

II. OPTIMAL CONTROL PROBLEM AND COMPUTATIONAL PROCEDURE

We consider a dynamical system defined on a fixed time interval T = [t_0, t_1] and described by

    dx(t)/dt = f(x(t), u(t), t),    x(t_0) = x_0,    (1)

where x(t) is an n-dimensional state vector, u(t) is an r-dimensional control vector, and the vector-valued function f(x, u, t) should satisfy some differentiability and continuity conditions which will be stated later. The control vectors are required to satisfy the constraint

    u(t) ∈ U    (2)

where U is a compact and convex subset of the r-dimensional Euclidean space. The class Ω of admissible controls is defined as the set of all measurable functions u: T → U satisfying (2).

The problem is to find the optimal control u ∈ Ω that minimizes the cost functional

    J(u) = ∫_{t_0}^{t_1} L(x(t), u(t), t) dt.    (3)

Note that a cost functional of the form

    J̃(u) = θ(x(t_1)) + ∫_{t_0}^{t_1} L̃(x(t), u(t), t) dt    (4)

can be represented as (3) by setting

    L(x, u, t) = L̃(x, u, t) + θ_x(x) f(x, u, t),

where θ_x denotes the gradient row vector defined by

    θ_x = (∂θ/∂x_1, ..., ∂θ/∂x_n).
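The reduction of (4) to (3) rests on differentiating θ along trajectories of (1). The short derivation below is added here for completeness; it is not displayed in the original note.

```latex
% Along any trajectory of (1),
\begin{align*}
\frac{d}{dt}\,\theta(x(t)) &= \theta_x(x(t))\,f(x(t),u(t),t),\\
\theta(x(t_1)) &= \theta(x_0) + \int_{t_0}^{t_1} \theta_x(x(t))\,f(x(t),u(t),t)\,dt,\\
\tilde J(u) &= \theta(x_0) + \int_{t_0}^{t_1}\bigl[\tilde L(x,u,t) + \theta_x(x)\,f(x,u,t)\bigr]dt .
\end{align*}
```

Since θ(x_0) is a constant independent of u, minimizing (4) is equivalent to minimizing (3) with L = L̃ + θ_x f.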


Let u(t), t ∈ T, be the optimal control and let x(t) be the corresponding optimal trajectory satisfying (1). Then it is necessary that there exist a nonzero continuous row-vector function λ(t) = (λ_1(t), ..., λ_n(t)) corresponding to the functions u(t) and x(t) such that [6]

i)

    dλ(t)/dt = -H_x(x(t), u(t), λ(t), t),    λ(t_1) = 0,    (5)

where

    H(x, u, λ, t) = L(x, u, t) + λ f(x, u, t);    (6)

ii) for almost all t ∈ T, the function H(x(t), u, λ(t), t) of the variable u ∈ U attains its minimum at the point u = u(t), namely,

    H(x(t), u(t), λ(t), t) = min_{u ∈ U} H(x(t), u, λ(t), t).    (7)

For seeking the optimal pair (x(t), u(t)) satisfying the above conditions, we consider the following algorithm.

Step 0: Select a nominal control u^0 ∈ Ω. Let x^0(t), t ∈ T, be the corresponding nominal trajectory. Set i = 1.

Step 1: Compute λ^{i-1}(t) by solving the differential equation

    dλ^{i-1}(t)/dt = -H_x(x^{i-1}(t), u^{i-1}(t), λ^{i-1}(t), t),    λ^{i-1}(t_1) = 0.    (8)

Step 2: Define the function

    K(x, u, λ, t; v, C) = H(x, u, λ, t) + (u - v)^T C (u - v)    (9)

where C = diag(c_1, ..., c_r), c_1, ..., c_r ≥ 0. Select a nonnegative diagonal matrix C^i properly. Determine x^i(t) and u^i(t), t ∈ T, which satisfy both

    K(x^i(t), u^i(t), λ^{i-1}(t), t; u^{i-1}(t), C^i)
      = H(x^i(t), u^i(t), λ^{i-1}(t), t) + (u^i(t) - u^{i-1}(t))^T C^i (u^i(t) - u^{i-1}(t))
      = min_{u ∈ U} K(x^i(t), u, λ^{i-1}(t), t; u^{i-1}(t), C^i)    (10)

and the differential equation

    dx^i(t)/dt = f(x^i(t), u^i(t), t),    x^i(t_0) = x_0.    (11)

This is possible by integrating (11) from t_0 to t_1 while seeking u^i(t) that minimizes K.

Step 3: Calculate

    J(u^i) = ∫_{t_0}^{t_1} L(x^i(t), u^i(t), t) dt.    (12)

If J(u^i) - J(u^{i-1}) > 0, make the elements of C^i larger and go to Step 2. Otherwise, set i := i + 1 and go to Step 1. Stop the computation if the sequence {C^i} is bounded for all i and the sequence {u^i(t)} of the controls converges.

In this algorithm we minimize K instead of H. Since the function K contains a quadratic penalty term (u - u^{i-1})^T C^i (u - u^{i-1}) on large changes of the control, instability of the algorithm at the first stage of the computation can be avoided by taking C^i large. This idea is due to Järmark [4], [5].

In Step 2 of the algorithm, we have to determine x^i(t) and u^i(t) satisfying both (10) and (11). For this purpose, we propose the following implementable algorithm. We replace the differential equation (11) by a difference equation with a uniform step length Δ > 0 and proceed as follows.

i) Set k = 0.
ii) Given x^i(t_0 + kΔ) ≡ x^i(k), determine u^i(t_0 + kΔ) = u^i(k) via (10).
iii) Using the approximation

    x̃^i(k+1) = x^i(k) + f(x^i(k), u^i(k), t_0 + kΔ) Δ    (13)

to x^i(k+1), calculate the approximation f(x̃^i(k+1), u^i(k), t_0 + (k+1)Δ) to dx^i(t_0 + (k+1)Δ)/dt.
iv) Calculate x^i(k+1) by

    x^i(k+1) = x^i(k) + (Δ/2) [ f(x^i(k), u^i(k), t_0 + kΔ) + f(x̃^i(k+1), u^i(k), t_0 + (k+1)Δ) ].    (14)

v) Set k := k + 1 and go to ii).

It is clear that the step functions u^i_Δ(t) constructed as above are measurable. Consequently, the conceptual functions u^i(t) defined by

    u^i(t) = lim_{Δ → 0} u^i_Δ(t)

are also measurable and u^i(·) ∈ Ω.

In connection with an implementable criterion for the convergence of the sequence {u^i(t)}, let us define

    δ^i = Σ_k Δ ‖u^i(k) - u^{i-1}(k)‖.

Let ε > 0 be a given small number. If there is a number i_0 such that the relation

    δ^i < ε

holds for any i > i_0, then we may conclude that the sequence {u^i(t)} has converged.

In Step 2 of the algorithm, the minimization of the function K with respect to u has to be performed at each grid point of T. It should be noted that in most cases the minimizing point u^i(t) ∈ U can be calculated analytically and expressed by simple equations.
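As an illustration only, the following Python sketch (not from the paper) implements Steps 0-3 together with the discretized Step 2 given by i)-v) and the stopping test based on δ^i. The problem data f, L, fx, Lx, the box constraint [u_lo, u_hi] for a scalar control, the nominal value u0, and the grid search used to minimize K at each grid point are all assumptions of this sketch; as noted above, the minimizer of K can often be obtained analytically instead, and the penalty weight C is kept as a scalar here because the control is scalar.

```python
import numpy as np

def heun_step(f, xk, uk, tk, dt):
    # predictor-corrector step (13)-(14) with the control frozen over the interval
    f1 = f(xk, uk, tk)
    f2 = f(xk + dt * f1, uk, tk + dt)
    return xk + 0.5 * dt * (f1 + f2)

def solve(f, L, fx, Lx, x0, t0, t1, u_lo, u_hi, u0=0.0,
          N=250, C=1.0, eps=1e-4, max_iter=100, n_cand=41):
    x0 = np.asarray(x0, dtype=float)
    dt = (t1 - t0) / N
    tg = t0 + dt * np.arange(N + 1)
    cand = np.linspace(u_lo, u_hi, n_cand)       # candidate values of the scalar control in U
    u = np.full(N + 1, float(u0))                # Step 0: nominal control u^0
    x = np.empty((N + 1, x0.size)); x[0] = x0    # nominal trajectory x^0
    for k in range(N):
        x[k + 1] = heun_step(f, x[k], u[k], tg[k], dt)
    J = dt * sum(L(x[k], u[k], tg[k]) for k in range(N + 1))   # crude quadrature for (12)

    for _ in range(max_iter):
        # Step 1: adjoint equation (8), d(lambda)/dt = -(L_x + lambda f_x), lambda(t1) = 0
        lam = np.zeros((N + 1, x0.size))
        for k in range(N, 0, -1):
            dlam = -(Lx(x[k], u[k], tg[k]) + lam[k] @ fx(x[k], u[k], tg[k]))
            lam[k - 1] = lam[k] - dt * dlam
        # Step 2: forward sweep i)-v), minimizing K of (9)-(10) at every grid point
        while True:
            xn = np.empty_like(x); xn[0] = x0
            un = np.empty_like(u)
            for k in range(N + 1):
                K = [L(xn[k], uc, tg[k]) + lam[k] @ f(xn[k], uc, tg[k])
                     + C * (uc - u[k]) ** 2 for uc in cand]
                un[k] = cand[int(np.argmin(K))]
                if k < N:
                    xn[k + 1] = heun_step(f, xn[k], un[k], tg[k], dt)   # (13)-(14)
            Jn = dt * sum(L(xn[k], un[k], tg[k]) for k in range(N + 1))
            if Jn <= J or C > 1e8:           # Step 3: accept (with a safeguard against stalling)
                break
            C = max(10.0 * C, 1e-3)          # cost went up: enlarge C^i and redo Step 2
        delta = dt * np.sum(np.abs(un - u))  # delta^i, the convergence measure
        x, u, J = xn, un, Jn
        if delta < eps:                      # stopping test: delta^i < eps
            break
    return tg, x, u, J
```

The backward pass uses a simple first-order scheme for (8) and the forward pass uses the Heun-type scheme (13)-(14); both are sketch-level choices, not prescriptions of the paper.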
III. ASSUMPTIONS AND SOME PROPOSITIONS

In order to consider the convergence of the algorithm, we make the following assumptions throughout this paper.

Assumption 1: The functions f_i(x, u, t) (i = 1, ..., n) and L(x, u, t) and their partial derivatives f_{ix}, f_{iu}, f_{ixx}, f_{ixu}, f_{iuu}, L_x, L_u, L_xx, L_xu, L_uu are continuous on R^n × U × T.

Assumption 2: For any admissible control u(·) ∈ Ω, there exists a uniformly bounded solution x(t; u), t ∈ T, of (1). In other words, there is a solution x(t; u) of (1) that satisfies

    ‖x(t; u)‖ ≤ M_1    (15)

for any t ∈ T and for any u ∈ Ω, where M_1 is a constant independent of t and u.

We denote by X a convex set of R^n given by

    X = {x ∈ R^n : ‖x‖ ≤ M_1}.    (16)

If there is a constant c such that the inequality

    ‖f(x, u, t)‖ ≤ c(‖x‖ + 1)    (17)

holds for any (x, u, t) ∈ R^n × U × T, then it is easily seen that Assumption 2 holds. In fact, integrating the inequality

    d‖x(t; u)‖/dt ≤ ‖f(x(t; u), u(t), t)‖ ≤ c(‖x(t; u)‖ + 1),

we have

    ‖x(t; u)‖ ≤ (‖x(t_0)‖ + 1) e^{c(t_1 - t_0)} - 1.
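The Gronwall step behind this remark can be written out as follows (a worked derivation added here; x_0 = x(t_0)).

```latex
\begin{align*}
\|x(t;u)\| + 1 &\le \|x_0\| + 1 + \int_{t_0}^{t}\|f(x(s;u),u(s),s)\|\,ds
               \le \|x_0\| + 1 + \int_{t_0}^{t} c\bigl(\|x(s;u)\|+1\bigr)\,ds,\\
\|x(t;u)\| + 1 &\le \bigl(\|x_0\|+1\bigr)\,e^{c(t-t_0)}
               \qquad\text{(Gronwall's inequality [7])},\\
\|x(t;u)\|     &\le \bigl(\|x_0\|+1\bigr)\,e^{c(t_1-t_0)} - 1 .
\end{align*}
```

Hence Assumption 2 holds with M_1 = (‖x_0‖ + 1) e^{c(t_1 - t_0)} - 1, which is the bound quoted above.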
Proposition 1: The function λ^i(t) defined as the solution of (8) is uniformly bounded, i.e., there is a constant M_2 independent of t and i such that

    ‖λ^i(t)‖ ≤ M_2    (18)

for any t ∈ T and for any i (i = 0, 1, ...).

Proof: From (8) we obtain

    -dλ^i(t)/dt = λ^i(t) f_x(x^i(t), u^i(t), t) + L_x(x^i(t), u^i(t), t),    λ^i(t_1) = 0.

Since f_x(x, u, t) and L_x(x, u, t) are continuous on the compact set X × U × T, there are constants a and b such that

    ‖f_x(x^i(t), u^i(t), t)‖ ≤ a,    ‖L_x(x^i(t), u^i(t), t)‖ ≤ b.

Let τ = t_1 - t and define

    y_i(τ) = ‖λ^i(t_1 - τ)‖ = ‖λ^i(t)‖.

Then we see that

    dy_i(τ)/dτ ≤ ‖dλ^i(t)/dt‖ ≤ a y_i(τ) + b,    y_i(0) = 0.    (19)

Integrating (19) and applying Gronwall's inequality [7] yields

    y_i(τ) ≤ (b/a)[e^{a(t_1 - t_0)} - 1] = M_2.    (20)

Q.E.D.

We denote by Λ a convex set defined by

    Λ = {λ ∈ R^n : ‖λ‖ ≤ M_2}.    (21)

Proposition 2: There is a constant M_3 independent of t and i such that

    c^i ‖u^i(t) - u^{i-1}(t)‖ ≤ M_3    (22)

for any t ∈ T and for any i, where c^i ≥ 0 is the minimum element of the diagonal matrix C^i = diag(c^i_1, ..., c^i_r).

Proof: In view of (9) and (10), we see that

    H(x^i, u^i, λ^{i-1}, t) + (u^i - u^{i-1})^T C^i (u^i - u^{i-1}) ≤ H(x^i, u^{i-1}, λ^{i-1}, t).    (23)

From (23) we obtain

    c^i ‖u^i(t) - u^{i-1}(t)‖² ≤ H_u(x^i(t), ū^i(t), λ^{i-1}(t), t)(u^{i-1}(t) - u^i(t))    (24)

where

    ū^i(t) = u^i(t) + θ^i(t)(u^{i-1}(t) - u^i(t))    (0 ≤ θ^i(t) ≤ 1).

Since the set U is assumed to be convex, ū^i(t) ∈ U. The function ‖H_u(x^i, ū^i, λ^{i-1}, t)‖ is continuous on the compact set X × U × Λ × T. Consequently, there is a constant M_3 independent of t and i such that

    ‖H_u(x^i(t), ū^i(t), λ^{i-1}(t), t)‖ ≤ M_3.    (25)

From (24) and (25) we obtain (22). Q.E.D.

By Proposition 2, if we select a large C^i, then the variation of control u^i(t) - u^{i-1}(t) is kept small and the stability of the algorithm is ensured.

The following assumption will usually hold in most optimal control problems.

Assumption 3: There is a nonnegative definite matrix R such that

    H_uu(x, u, λ, t) ≥ R ≥ 0    (26)

for any x ∈ X, u ∈ U, λ ∈ Λ, and t ∈ T.

IV. REDUCTION OF THE COST

Proposition 3: There is a constant M > 0 independent of i such that the inequality

    J(u^i) - J(u^{i-1}) ≤ -(1/2)(4c^i + r - M) ∫_{t_0}^{t_1} ‖u^i(t) - u^{i-1}(t)‖² dt    (27)

holds for any i, where r ≥ 0 is the minimum eigenvalue of R and c^i is the minimum element of the nonnegative diagonal matrix C^i. If we choose C^i such that

    c^i ≥ c_0    (i = 1, 2, ...)    (28)

where c_0 is a constant satisfying

    c_0 > (M - r)/4,    (29)

then the sequence {J(u^i)} of the cost functionals decreases monotonically and converges.

Proof: First we prove (27). In view of (6) and (12), we see that

    J(u^i) - J(u^{i-1}) = ∫_{t_0}^{t_1} [ H(x^i, u^i, λ^{i-1}, t) - H(x^{i-1}, u^{i-1}, λ^{i-1}, t)
        - λ^{i-1}( f(x^i, u^i, t) - f(x^{i-1}, u^{i-1}, t) ) ] dt.    (30)

Define

    δx^i(t) = x^i(t) - x^{i-1}(t),    δu^i(t) = u^i(t) - u^{i-1}(t).

Then (30) can be rewritten as

    J(u^i) - J(u^{i-1})
      = ∫_{t_0}^{t_1} [ H(x^i, u^i, λ^{i-1}, t) - H(x^i, u^i - δu^i, λ^{i-1}, t)
          + H(x^i, u^{i-1}, λ^{i-1}, t) - H(x^{i-1}, u^{i-1}, λ^{i-1}, t) - λ^{i-1} δẋ^i(t) ] dt
      = ∫_{t_0}^{t_1} [ H_u(x^i, u^i, λ^{i-1}, t) δu^i - (1/2)(δu^i)^T H_uu(x^i, ũ^i, λ^{i-1}, t) δu^i
          + H_x(x^{i-1}, u^{i-1}, λ^{i-1}, t) δx^i
          + (1/2)(δx^i)^T H_xx(x̃^i, u^{i-1}, λ^{i-1}, t) δx^i - λ^{i-1} δẋ^i ] dt

where

    ũ^i = u^i - θ_1 δu^i ∈ U    (0 ≤ θ_1 ≤ 1),
    x̃^i = x^{i-1} + θ_2 δx^i ∈ X    (0 ≤ θ_2 ≤ 1).

From (9) we obtain

    H_u(x^i, u^i, λ^{i-1}, t) δu^i = K_u(x^i, u^i, λ^{i-1}, t; u^{i-1}, C^i) δu^i - 2(δu^i)^T C^i δu^i.    (31)

Since u^i minimizes K on the convex set U, it follows that [8]

    K_u(x^i, u^i, λ^{i-1}, t; u^{i-1}, C^i)(u^i - u) ≤ 0    (32)

for any u ∈ U. Thus the first term of the right-hand side of (31) is nonpositive. In view of (8), we obtain

    ∫_{t_0}^{t_1} [ H_x(x^{i-1}, u^{i-1}, λ^{i-1}, t) δx^i - λ^{i-1} δẋ^i ] dt
      = ∫_{t_0}^{t_1} ( -λ̇^{i-1} δx^i - λ^{i-1} δẋ^i ) dt = -[ λ^{i-1}(t) δx^i(t) ]_{t=t_0}^{t=t_1} = 0.    (33)

Furthermore, since H_xx(x, u, λ, t) is continuous on the compact set X × U × Λ × T, there is a constant M_4 independent of t and i such that

    ‖H_xx(x̃^i(t), u^{i-1}(t), λ^{i-1}(t), t)‖ ≤ M_4    (34)

for any t ∈ T and for any i. Using relations (31)-(34) and Assumption 3, we obtain

    J(u^i) - J(u^{i-1}) ≤ ∫_{t_0}^{t_1} [ -(1/2)(4c^i + r)‖δu^i(t)‖² + (1/2) M_4 ‖δx^i(t)‖² ] dt.    (35)

It is easily seen that there exist positive constants a_i (i = 1, 2) such that
    ‖d(δx^i(t))/dt‖ = ‖f(x^i, u^i, t) - f(x^{i-1}, u^{i-1}, t)‖ ≤ a_1 ‖δx^i(t)‖ + a_2 ‖δu^i(t)‖.    (36)

Integrating the above inequality and applying Gronwall's inequality [7] gives

    ‖δx^i(t)‖ ≤ M_5 ∫_{t_0}^{t} ‖δu^i(τ)‖ dτ    (37)

where

    M_5 = a_2 e^{a_1(t_1 - t_0)}.

Using the Schwarz inequality, we obtain from (37)

    ‖δx^i(t)‖² ≤ M_5² (t - t_0) ∫_{t_0}^{t} ‖δu^i(τ)‖² dτ.    (38)

Integrating the above inequality gives

    ∫_{t_0}^{t_1} ‖δx^i(t)‖² dt ≤ (1/2) M_5² (t_1 - t_0)² ∫_{t_0}^{t_1} ‖δu^i(t)‖² dt = M_6 ∫_{t_0}^{t_1} ‖δu^i(t)‖² dt.    (39)

Substituting (38) into (35) yields (27), where

    M = M_4 M_6 = (1/2) M_4 M_5² (t_1 - t_0)².    (40)

If (28) holds, it is clear that the sequence {J(u^i)} is monotonically decreasing. Since

    J(u^i) ≥ J(u*),    i = 1, 2, ...,

where u* is the optimal control, the sequence {J(u^i)} is bounded from below. Therefore, the sequence {J(u^i)} converges. Q.E.D.

V. CONVERGENCE TO THE OPTIMAL CONTROL

We denote by L_1^r(T) the Banach space of r-dimensional functions u(t) with the norm defined by

    ‖u‖_1 = ∫_{t_0}^{t_1} ‖u(t)‖ dt.

It is clear that Ω ⊂ L_1^r(T).

As stated in Proposition 3, the sequence {J(u^i)} can always converge. However, convergence of the sequence {u^i(t)} of the controls is not clear. If the sequence {u^i(t)} converges, we obtain the following.

Proposition 4: Suppose that the nonnegative diagonal matrices C^i which satisfy (28) are bounded, i.e., there is a constant γ such that

    ‖C^i‖ ≤ γ    (41)

for any i. If the sequence {u^i(t)} converges in the sense that

    lim_{i→∞} u^i(t) = û(t)    almost everywhere on T,    (42)

then the control û(t) and the corresponding solution x̂(t) of (1) satisfy the necessary conditions (5) and (7) for optimality.

Proof: Since the u^i(t) are measurable functions, it is obvious that û(t) is also measurable and û(·) ∈ Ω. Since the set U is compact, there is a constant A such that ‖u^i(t) - û(t)‖ ≤ A for any t ∈ T. The functions ‖u^i(t) - û(t)‖ are clearly integrable, and (42) is equivalent to

    lim_{i→∞} ‖u^i(t) - û(t)‖ = 0    almost everywhere on T.    (43)

Therefore, by the dominated convergence theorem [9], we see that

    lim_{i→∞} ∫_{t_0}^{t_1} ‖u^i(t) - û(t)‖ dt = 0,    (44)

which implies the convergence of {u^i} in L_1^r(T).

Let x^i(t) be the solution of (11). Then, in the same way as in (36), we obtain

    d‖x̂(t) - x^i(t)‖/dt ≤ ‖f(x̂(t), û(t), t) - f(x^i(t), u^i(t), t)‖
        ≤ a_1 ‖x̂(t) - x^i(t)‖ + a_2 ‖û(t) - u^i(t)‖.    (45)

Integrating (45) and applying Gronwall's inequality [7] gives

    ‖x̂(t) - x^i(t)‖ ≤ M_5 ∫_{t_0}^{t} ‖û(τ) - u^i(τ)‖ dτ ≤ M_5 ‖û - u^i‖_1.    (46)

In view of (44) and (46), we see that

    lim_{i→∞} x^i(t) = x̂(t)

for any t ∈ T.

Let λ̂(t) be the solution of

    dλ̂(t)/dt = -H_x(x̂(t), û(t), λ̂(t), t),    λ̂(t_1) = 0.    (47)

Since λ^i(t) satisfies (8), by Assumption 1 there are constants ρ_i (i = 1, 2, 3) such that

    ‖dλ^i(t)/dt - dλ̂(t)/dt‖ ≤ ρ_1 ‖x̂(t) - x^i(t)‖ + ρ_2 ‖û(t) - u^i(t)‖ + ρ_3 ‖λ̂(t) - λ^i(t)‖.

In the same way as before, we see that

    lim_{i→∞} λ^i(t) = λ̂(t)    (48)

for any t ∈ T.

Since u = u^i(t) ∈ U minimizes K(x^i, u, λ^{i-1}, t; u^{i-1}, C^i) on the convex set U, it follows from (32) that for any u ∈ U

    H_u(x^i(t), u^i(t), λ^{i-1}(t), t)(u^i(t) - u) + 2(u^i(t) - u^{i-1}(t))^T C^i (u^i(t) - u) ≤ 0.    (49)

The matrices C^i are bounded, u^i(t) ∈ U, and by (42)

    lim_{i→∞} ‖u^i(t) - u^{i-1}(t)‖ = 0    a.e. on T.

Therefore, by letting i → ∞ in (49), we obtain

    H_u(x̂(t), û(t), λ̂(t), t)(û(t) - u) ≤ 0    (50)

for almost all t ∈ T and for all u ∈ U. Because of Assumption 3, the function H is a convex function of u on the convex set U. Consequently, u = û(t) minimizes H(x̂(t), u, λ̂(t), t) on U if and only if (50) holds for any u ∈ U [8]. Thus, we see that û(t) and x̂(t) satisfy the necessary conditions (5) and (7) for optimality. Q.E.D.

Proposition 5: A necessary condition for the sequence {u^i(t)} to converge almost everywhere on T is that the sequence {J(u^i)} also converges.

Proof: Suppose that (42) holds. Then (44) also holds. Let x̂(t) be the solution of (1) for the control û(t). By Assumption 1, there are constants γ_i (i = 1, 2) such that

    |L(x^i, u^i, t) - L(x̂, û, t)| ≤ γ_1 ‖x^i - x̂‖ + γ_2 ‖u^i - û‖.

Therefore, we obtain

    |J(u^i) - J(û)| ≤ γ_1 ∫_{t_0}^{t_1} ‖x^i(t) - x̂(t)‖ dt + γ_2 ∫_{t_0}^{t_1} ‖u^i(t) - û(t)‖ dt,
from which we see that

    lim_{i→∞} J(u^i) = J(û).    (51)

Q.E.D.

VI. A NUMERICAL EXAMPLE

We consider the following control problem described by the Rayleigh equation:

    dx_1/dt = x_2
    dx_2/dt = -x_1 + 1.4 x_2 - 0.14 x_2³ + 4u    (52)

with the initial condition

    x_1(0) = -5,    x_2(0) = -5.

The problem is to find the optimal control u(t), 0 ≤ t ≤ 2.5, that minimizes

    J(u) = ∫_0^{2.5} ( x_1²(t) + u²(t) ) dt

under the constraint

    -1 ≤ u(t) ≤ 1.

This problem was solved in [1] by the second-order DDP. Note that H_uu = 2 in this case.

We solved this problem by our algorithm, starting from the same nominal control u(t) = -0.5 as in [1]. Fig. 1 shows the cost as a function of the iteration number. We set C^i = 0 for all i, and our algorithm found the optimal solution in several iterations as in [1]. Compared with the result in [1], the rate of reduction of the cost by our algorithm is better than that by the second-order DDP. Fig. 2 shows the control function for various iterations. Note that each control function is continuous. Although the cost converges very fast, the convergence rate of the control functions appears to be slower. This indicates that even if the sequence of the cost functionals has already converged, the computation must be continued until the sequence of control functions converges.

Fig. 1. Cost versus iteration number.
Fig. 2. Control functions for various iterations.
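As a usage illustration, the Rayleigh problem above can be plugged into the sketch given after Section II. The names solve() and heun_step() are the hypothetical helpers introduced there, and the cost integrand x_1² + u² is the one assumed in this reconstruction of the example (it is consistent with H_uu = 2 and the constraint -1 ≤ u ≤ 1 quoted in the text).

```python
# Rayleigh example (52): a usage sketch built on solve() defined after Section II.
import numpy as np

def f(x, u, t):
    return np.array([x[1], -x[0] + 1.4 * x[1] - 0.14 * x[1] ** 3 + 4.0 * u])

def L(x, u, t):
    return x[0] ** 2 + u ** 2

def fx(x, u, t):                      # Jacobian of f with respect to x
    return np.array([[0.0, 1.0],
                     [-1.0, 1.4 - 0.42 * x[1] ** 2]])

def Lx(x, u, t):                      # gradient of L with respect to x
    return np.array([2.0 * x[0], 0.0])

# C^i = 0 as in the text; the sketch only enlarges it if the cost ever increases.
tg, x, u, J = solve(f, L, fx, Lx, x0=[-5.0, -5.0], t0=0.0, t1=2.5,
                    u_lo=-1.0, u_hi=1.0, u0=-0.5, C=0.0)
print("approximate cost J(u) =", J)
```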
VII. CONCLUDING REMARKS

Global convergence conditions have been investigated for our algorithm, which can be derived naturally from the Pontryagin minimum principle. We considered control constraints of the form (2). In the case where the control constraints depend on the state variables and have the form

    g(x(t), u(t)) ≤ 0,

we have not yet succeeded in proving the global convergence of the algorithm. Constraints on the terminal state can be taken into consideration by adding a penalty term θ(x(t_1)) for the terminal state constraints as in (4) and rewriting the cost functional as in (3).

In [2] it is proved that, if a successively constructed sequence {u^i} of controls has a limit point, then it satisfies the necessary conditions for optimality. In [10] it is further proved that, by extending the class of controls to the relaxed controls, at least one limit point exists that satisfies the necessary conditions for optimality. Although these results are stronger than ours, an implementable criterion for determining a limit point seems to be difficult.

The matrices C^i should be chosen adaptively depending on the progress of the computation. In general, when the matrix C^i is smaller, the resulting variation of the control function is larger. Therefore, C^i should be kept as small as possible as long as the cost functionals decrease. According to our computational experience, the following way of choosing the matrices C^i is recommended. Choose the initial matrix C^1 properly. If J(u^i) > J(u^{i-1}), then make C^i larger and repeat Step 2. If J(u^i) ≤ J(u^{i-1}), then set C^{i+1} = αC^i, where α is a constant such that 0.5 ≤ α < 1; α = 0.8 to 0.9 seems to be a good choice.

Computational results of applying our algorithm to much more complicated optimal control problems will be reported in a forthcoming paper.
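A literal transcription of this recommendation into code might look as follows; this is a sketch only, and the enlargement factor used when the cost increases is an assumption (the text only says to make C^i larger).

```python
def update_C(C, J_new, J_old, alpha=0.85, growth=2.0):
    # Adaptive choice of the convergence-control weight C^i (Section VII).
    # If the cost increased, enlarge C and repeat Step 2 with the same i;
    # otherwise shrink it by alpha (0.5 <= alpha < 1, e.g. 0.8-0.9) for the
    # next iteration.  The factor `growth` is an assumed choice.
    if J_new > J_old:
        return (growth * C if C > 0 else 1e-3), True   # True: redo Step 2
    return alpha * C, False                             # accept, proceed to Step 1
```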
REFERENCES

[1] D. H. Jacobson and D. Q. Mayne, Differential Dynamic Programming. New York: Elsevier, 1970.
[2] D. Q. Mayne and E. Polak, "First-order strong variation algorithms for optimal control," J. Optimiz. Theory Appl., vol. 16, pp. 277-301, 1975.
[3] K. Ohno, "A new approach to differential dynamic programming for discrete time systems," IEEE Trans. Automat. Contr., vol. AC-23, pp. 37-47, 1978.
[4] B. Järmark, "On convergence control in differential dynamic programming applied to realistic aircraft and differential game problems," in Proc. 1977 IEEE Conf. Decision Contr., pp. 471-479.
[5] B. Järmark, "A new convergence control technique in differential dynamic programming," Royal Inst. Technol., Stockholm, Sweden, Rep. TRITA-REG-7332, 1975.
[6] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes. New York: Interscience, 1962.
[7] W. Walter, Differential and Integral Inequalities. Berlin: Springer, 1970.
[8] M. R. Hestenes, Optimization Theory: The Finite Dimensional Case. New York: Wiley, 1975.
[9] E. Asplund and L. Bungart, A First Course in Integration. New York: Holt, Rinehart and Winston, 1966.
[10] L. J. Williamson and E. Polak, "Relaxed controls and the convergence of optimal control algorithms," SIAM J. Contr. Optimiz., vol. 14, pp. 737-756, 1976.