Robotics: Control Theory

This document discusses topics in control theory including optimal control, the Hamilton-Jacobi-Bellman (HJB) equation, linear-quadratic optimal control, and Riccati equations. It focuses on deriving the Bellman equation and Hamilton-Jacobi-Bellman equation for optimal control problems in discrete and continuous time, including for infinite horizon cases. The document provides examples of optimal control problems for balancing an inverted pendulum and reference trajectory following.

Robotics

Control Theory

Topics in control theory: optimal control, HJB equation, infinite horizon case, Linear-Quadratic optimal control, Riccati equations (differential, algebraic, discrete-time), controllability, stability, eigenvalue analysis, Lyapunov function

Marc Toussaint
U Stuttgart
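Cart pole example

state x = (x, ẋ, θ, θ̇)

$$\ddot\theta = \frac{g \sin\theta + \cos\theta \, \big[ -c_1 u - c_2 \dot\theta^2 \sin\theta \big]}{\tfrac{4}{3} l - c_2 \cos^2\theta}$$

$$\ddot x = c_1 u + c_2 \big[ \dot\theta^2 \sin\theta - \ddot\theta \cos\theta \big]$$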
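The two cart-pole equations can be evaluated directly. The following is a minimal sketch, not the lecture's implementation: the values of g, l, c1 and c2 are placeholders chosen only for illustration (the slide does not give them), and the explicit Euler step is an assumed integration scheme.

```python
import math

def cartpole_accelerations(theta, theta_dot, u, g=9.8, l=1.0, c1=1.0, c2=0.5):
    """Return (x_ddot, theta_ddot); g, l, c1, c2 are hypothetical values."""
    s, c = math.sin(theta), math.cos(theta)
    theta_ddot = (g * s + c * (-c1 * u - c2 * theta_dot ** 2 * s)) / (4.0 / 3.0 * l - c2 * c ** 2)
    x_ddot = c1 * u + c2 * (theta_dot ** 2 * s - theta_ddot * c)
    return x_ddot, theta_ddot

def euler_step(state, u, dt=0.01):
    """One explicit Euler step of the state (x, x_dot, theta, theta_dot)."""
    x, x_dot, theta, theta_dot = state
    x_ddot, theta_ddot = cartpole_accelerations(theta, theta_dot, u)
    return (x + dt * x_dot, x_dot + dt * x_ddot,
            theta + dt * theta_dot, theta_dot + dt * theta_ddot)

upright = (0.0, 0.0, 0.0, 0.0)
next_state = euler_step(upright, u=0.0)                     # stays at the equilibrium
_, tilt_acc = cartpole_accelerations(0.1, 0.0, u=0.0)       # small tilt, no control
```

With no control, the upright state is an equilibrium, while a small tilt produces an angular acceleration in the direction of the tilt, which is why the pole needs active balancing.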
Control Theory
• Concerns controlled systems of the form

ẋ = f (x, u) + noise(x, u)

and a controller of the form

π : (x, t) ↦ u

• We’ll neglect stochasticity here

• When analyzing a given controller π, one analyzes the closed-loop system as described by the differential equation

ẋ = f (x, π(x, t))

(E.g., analysis for convergence & stability)
Core topics in Control Theory
• Stability*
Analyze the stability of a closed-loop system
→ Eigenvalue analysis or Lyapunov function method
• Controllability*
Analyze which dimensions (DoFs) of the system can actually in principle be
controlled
• Transfer function
Analyze the closed-loop transfer function, i.e., “how frequencies are
transmitted through the system”. (→ Laplace transformation)
• Controller design
Find a controller with desired stability and/or transfer function properties
• Optimal control*
Define a cost function on the system behavior. Optimize a controller to
minimize costs

Control Theory references

• Robert F. Stengel: Optimal control and estimation

Online lectures: http://www.princeton.edu/~stengel/MAE546Lectures.html (esp. lectures 3,4 and 7-9)

• From robotics lectures:

Stefan Schaal’s lecture Introduction to Robotics: http://www-clmc.usc.edu/Teaching/TeachingIntroductionToRoboticsSyllabus
Drew Bagnell’s lecture on Adaptive Control and Reinforcement Learning: http://robotwhisperer.org/acrls11/
Outline
• We’ll first consider optimal control
Goal: understand Algebraic Riccati equation
significance for local neighborhood control

• Then controllability & stability

Optimal control (discrete time)
Given a controlled dynamic system

$$x_{t+1} = f(x_t, u_t)$$

we define a cost function

$$J^\pi = \sum_{t=0}^{T} c(x_t, u_t) + \phi(x_T)$$

where $x_0$ and the controller $\pi : (x, t) \mapsto u$ are given, which determines $x_{1:T}$ and $u_{0:T}$
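To make the cost definition concrete, the sketch below rolls out a given controller and accumulates the running and terminal costs. The scalar dynamics, the cost terms and the proportional controller are hypothetical examples, not taken from the lecture.

```python
def rollout_cost(f, c, phi, pi, x0, T):
    """Simulate x_{t+1} = f(x_t, u_t) under u_t = pi(x_t, t) and return J^pi."""
    x, J = x0, 0.0
    for t in range(T + 1):            # t = 0, ..., T as in the sum
        u = pi(x, t)
        J += c(x, u)
        if t < T:                     # x_T is the last state that is needed
            x = f(x, u)
    return J + phi(x)

# Hypothetical example: scalar integrator with quadratic costs
J = rollout_cost(
    f=lambda x, u: x + u,
    c=lambda x, u: x * x + u * u,
    phi=lambda x: x * x,
    pi=lambda x, t: -0.5 * x,         # simple proportional controller
    x0=1.0,
    T=2,
)
```

Different controllers π can be compared by calling `rollout_cost` with the same `x0` and `T`; optimal control asks for the π that minimizes this number.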
Dynamic Programming & Bellman principle
An optimal policy has the property that whatever the initial state and
initial decision are, the remaining decisions must constitute an optimal
policy with regard to the state resulting from the first decision.

[Figure: a directed graph from Start to Goal with a cost on each edge]

“V (state) = min_edge [c(edge) + V (next-state)]”
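The recursion V(state) = min over edges of [c(edge) + V(next-state)] can be sketched on a small graph. The graph below is a made-up example (its nodes and edge costs are not the ones from the figure), and memoized recursion is just one convenient way to evaluate it.

```python
from functools import lru_cache

# Hypothetical graph: node -> {successor: edge cost}
EDGES = {
    "Start": {"A": 1, "B": 5},
    "A": {"B": 3, "C": 7},
    "B": {"C": 1, "Goal": 10},
    "C": {"Goal": 3},
    "Goal": {},
}

@lru_cache(maxsize=None)
def V(node):
    """Optimal cost-to-go from node to Goal."""
    if node == "Goal":
        return 0
    return min(cost + V(nxt) for nxt, cost in EDGES[node].items())

def greedy_path(node):
    """Follow the argmin of the Bellman recursion from node to Goal."""
    path = [node]
    while node != "Goal":
        node = min(EDGES[node], key=lambda n: EDGES[node][n] + V(n))
        path.append(node)
    return path

best_cost = V("Start")
best_path = greedy_path("Start")
```

Each node's value is computed once and reused, which is the point of the Bellman principle: the tail of an optimal path is itself optimal.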
Bellman equation (discrete time)
• Define the value function or optimal cost-to-go function

$$V_t(x) = \min_\pi \Big[ \sum_{s=t}^{T} c(x_s, u_s) + \phi(x_T) \Big]_{x_t = x}$$

• Bellman equation

$$V_t(x) = \min_u \Big[ c(x, u) + V_{t+1}(f(x, u)) \Big]$$

The argmin gives the optimal control signal: $\pi^*_t(x) = \text{argmin}_u \big[ \cdots \big]$

Derivation:

$$\begin{aligned}
V_t(x) &= \min_\pi \Big[ \sum_{s=t}^{T} c(x_s, u_s) + \phi(x_T) \Big] \\
&= \min_{u_t} \Big[ c(x, u_t) + \min_\pi \Big[ \sum_{s=t+1}^{T} c(x_s, u_s) + \phi(x_T) \Big] \Big] \\
&= \min_{u_t} \Big[ c(x, u_t) + V_{t+1}(f(x, u_t)) \Big]
\end{aligned}$$
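For a finite set of states and controls, the Bellman equation can be solved backward in time by tabulating V_t. The sketch below assumes a hypothetical clipped integer integrator with quadratic costs, and uses the convention V_T(x) = φ(x) as the terminal condition; none of these specifics come from the lecture.

```python
STATES = range(-3, 4)
ACTIONS = (-1, 0, 1)

def f(x, u):
    return max(-3, min(3, x + u))     # hypothetical dynamics, clipped to the grid

def c(x, u):
    return x * x + u * u              # hypothetical running cost

def phi(x):
    return 10 * x * x                 # hypothetical terminal cost

def backward_dp(T):
    """Return V_0 and the time-indexed optimal policy [pi_0, ..., pi_{T-1}]."""
    V = {x: phi(x) for x in STATES}   # terminal condition V_T = phi
    policy = []
    for _ in range(T):                # t = T-1, ..., 0
        Q = {x: {u: c(x, u) + V[f(x, u)] for u in ACTIONS} for x in STATES}
        pi_t = {x: min(Q[x], key=Q[x].get) for x in STATES}
        V = {x: Q[x][pi_t[x]] for x in STATES}
        policy.insert(0, pi_t)
    return V, policy

V0, policy = backward_dp(T=5)
```

The table `V0` holds the optimal cost-to-go from each start state, and `policy[t][x]` is the argmin of the Bellman equation at time t.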
Optimal Control (continuous time)
Given a controlled dynamic system

ẋ = f (x, u)

we define a cost function with horizon T

$$J^\pi = \int_0^T c(x(t), u(t)) \, dt + \phi(x(T))$$

where the start state x(0) and the controller π : (x, t) ↦ u are given, which determine the closed-loop system trajectory x(t), u(t) via ẋ = f (x, π(x, t)) and u(t) = π(x(t), t)
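Hamilton-Jacobi-Bellman equation (continuous time)
• Define the value function or optimal cost-to-go function

$$V(x, t) = \min_\pi \Big[ \int_t^T c(x(s), u(s)) \, ds + \phi(x(T)) \Big]_{x(t) = x}$$

• Hamilton-Jacobi-Bellman equation

$$-\frac{\partial}{\partial t} V(x, t) = \min_u \Big[ c(x, u) + \frac{\partial V}{\partial x} f(x, u) \Big]$$

The argmin gives the optimal control signal: $\pi^*(x) = \text{argmin}_u \big[ \cdots \big]$

Derivation (along the optimal trajectory the cost-to-go decreases at the rate of the running cost, $\frac{dV}{dt} = -c(x, u^*)$):

$$\begin{aligned}
\frac{dV(x, t)}{dt} &= \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} \dot x \\
-c(x, u^*) &= \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(x, u^*) \\
-\frac{\partial V}{\partial t} &= c(x, u^*) + \frac{\partial V}{\partial x} f(x, u^*)
\end{aligned}$$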
Infinite horizon case

$$J^\pi = \int_0^\infty c(x(t), u(t)) \, dt$$

• This cost function is stationary (time-invariant)!

→ the optimal value function is stationary (V (x, t) = V (x))
→ the optimal control signal depends on x but not on t
→ the optimal controller π∗ is stationary

• The HJB and Bellman equations remain “the same” but with the same (stationary) value function independent of t:

$$0 = \min_u \Big[ c(x, u) + \frac{\partial V}{\partial x} f(x, u) \Big] \quad \text{(cont. time)}$$

$$V(x) = \min_u \Big[ c(x, u) + V(f(x, u)) \Big] \quad \text{(discrete time)}$$

The argmin gives the optimal control signal: $\pi^*(x) = \text{argmin}_u \big[ \cdots \big]$
Infinite horizon examples
• Cart-pole balancing:
– You always want the pole to be upright (θ ≈ 0)
– You always want the cart to be close to zero (x ≈ 0)
– You want to spare energy (apply low torques) (u ≈ 0)
You might define a cost

$$J^\pi = \int_0^\infty ||\theta||^2 + ||x||^2 + \rho ||u||^2 \, dt$$

• Reference following:
– You always want to stay close to a reference trajectory r(t)
Define $\tilde x(t) = x(t) - r(t)$ with dynamics $\dot{\tilde x}(t) = f(\tilde x(t) + r(t), u) - \dot r(t)$
Define a cost

$$J^\pi = \int_0^\infty ||\tilde x||^2 + \rho ||u||^2 \, dt$$

• Many, many problems in control can be framed this way
Comments
• The Bellman equation is fundamental in optimal control theory, but also in Reinforcement Learning
• The HJB eq. is a differential equation for V (x, t) which is in general hard to solve
• The (time-discretized) Bellman equation can be solved by Dynamic Programming starting backward:

$$V_T(x) = \phi(x), \quad V_{T-1}(x) = \min_u \Big[ c(x, u) + V_T(f(x, u)) \Big] \quad \text{etc.}$$

But it might still be hard or infeasible to represent the functions $V_t(x)$ over continuous x!

• Both become significantly simpler under linear dynamics and quadratic costs:
→ Riccati equation
Linear-Quadratic Optimal Control
linear dynamics

$$\dot x = f(x, u) = A x + B u$$

quadratic costs

$$c(x, u) = x^\top Q x + u^\top R u, \quad \phi(x_T) = x_T^\top F x_T$$

• Note: Dynamics neglects constant term; costs neglect linear and constant terms. This is because
– constant costs are irrelevant
– linear cost terms can be removed by redefining x or u
– constant dynamic term only introduces a constant drift
Linear-Quadratic Control as Local Approximation
• LQ control is important also to control non-LQ systems in the
neighborhood of a desired state!

Let x∗ be such a desired state (e.g., cart-pole: x∗ = (0, 0, 0, 0))

– linearize the dynamics around x∗
– use 2nd order approximation of the costs around x∗
– control the system locally as if it was LQ
– pray that the system will never leave this neighborhood!

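Riccati differential equation = HJB eq. in LQ case
• In the Linear-Quadratic (LQ) case, the value function always is a quadratic function of x!

Let $V(x, t) = x^\top P(t) x$, then the HJB equation becomes

$$\begin{aligned}
-\frac{\partial}{\partial t} V(x, t) &= \min_u \Big[ c(x, u) + \frac{\partial V}{\partial x} f(x, u) \Big] \\
-x^\top \dot P(t) x &= \min_u \Big[ x^\top Q x + u^\top R u + 2 x^\top P(t) (A x + B u) \Big] \\
0 &= \frac{\partial}{\partial u} \Big[ x^\top Q x + u^\top R u + 2 x^\top P(t) (A x + B u) \Big] \\
&= 2 u^\top R + 2 x^\top P(t) B \\
u^* &= -R^{-1} B^\top P x
\end{aligned}$$

⇒ Riccati differential equation

$$-\dot P = A^\top P + P A - P B R^{-1} B^\top P + Q$$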
Riccati differential equation

$$-\dot P = A^\top P + P A - P B R^{-1} B^\top P + Q$$

• This is a differential equation for the matrix P (t) describing the quadratic value function. If we solve it with the finite horizon constraint P (T ) = F we have solved the optimal control problem

• The optimal control $u^* = -R^{-1} B^\top P x$ is called Linear Quadratic Regulator

Note: If the state is dynamic (e.g., x = (q, q̇)) this control is linear in the positions and linear in the velocities and is an instance of PD control
The matrix $K = R^{-1} B^\top P$ is therefore also called gain matrix
For instance, if x(t) = (q(t) − r(t), q̇(t) − ṙ(t)) for a reference r(t) and $K = [K_p \;\; K_d]$ then

$$u^* = K_p (r(t) - q(t)) + K_d (\dot r(t) - \dot q(t))$$
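For a scalar system the Riccati differential equation reduces to an ordinary differential equation in one number p(t), which makes the backward integration from p(T) = F easy to sketch. The example below assumes scalar dynamics ẋ = a x + b u with cost q x² + r u², hypothetical parameter values, and explicit Euler steps; it is an illustration, not a production integrator.

```python
import math

def riccati_backward(a, b, q, r, F, T, dt=1e-3):
    """Integrate -dp/dt = 2 a p - p^2 b^2 / r + q backward from p(T) = F."""
    p = F
    for _ in range(int(round(T / dt))):
        p += dt * (2 * a * p - p * p * b * b / r + q)   # step from t to t - dt
    return p                                            # approximates p(0)

a, b, q, r = 1.0, 1.0, 1.0, 1.0          # hypothetical (open-loop unstable) system
p0 = riccati_backward(a, b, q, r, F=0.0, T=10.0)

# Positive root of the scalar algebraic equation 0 = 2 a p - p^2 b^2 / r + q
p_inf = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
gain = b * p0 / r                        # scalar version of K = R^-1 B^T P, u* = -gain * x
```

For a long horizon, p(0) approaches the stationary solution `p_inf`, which is the scalar analogue of the infinite-horizon case.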
Riccati equations
• Finite horizon continuous time
Riccati differential equation

$$-\dot P = A^\top P + P A - P B R^{-1} B^\top P + Q, \quad P(T) = F$$

• Infinite horizon continuous time
Algebraic Riccati equation (ARE)

$$0 = A^\top P + P A - P B R^{-1} B^\top P + Q$$

• Finite horizon discrete time ($J^\pi = \sum_{t=0}^{T} ||x_t||^2_Q + ||u_t||^2_R + ||x_T||^2_F$)

$$P_{t-1} = Q + A^\top \big[ P_t - P_t B (R + B^\top P_t B)^{-1} B^\top P_t \big] A, \quad P_T = F$$

• Infinite horizon discrete time ($J^\pi = \sum_{t=0}^{\infty} ||x_t||^2_Q + ||u_t||^2_R$)

$$P = Q + A^\top \big[ P - P B (R + B^\top P B)^{-1} B^\top P \big] A$$
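One simple way to approach the infinite-horizon discrete-time equation is to iterate the finite-horizon recursion until P stops changing. The sketch below does this for a single control input, so that R + BᵀPB is a scalar and no matrix inverse is needed. The system (an Euler-discretized unit point mass with step 0.1) and the weights are assumptions made for illustration.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def riccati_step(P, A, B, Q, R):
    """P_{t-1} = Q + A^T [P - P B (R + B^T P B)^-1 B^T P] A for scalar R, symmetric P."""
    n = len(P)
    PB = matmul(P, B)                                   # n x 1 column
    s = R + matmul(transpose(B), PB)[0][0]              # scalar R + B^T P B
    M = [[P[i][j] - PB[i][0] * PB[j][0] / s for j in range(n)] for i in range(n)]
    AtMA = matmul(transpose(A), matmul(M, A))
    return [[Q[i][j] + AtMA[i][j] for j in range(n)] for i in range(n)]

dt = 0.1                                                # hypothetical time step
A = [[1.0, dt], [0.0, 1.0]]
B = [[0.0], [dt]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = 1.0

P = Q                                                   # start from P_T = F = Q
for _ in range(10000):
    P_next = riccati_step(P, A, B, Q, R)
    change = max(abs(P_next[i][j] - P[i][j]) for i in range(2) for j in range(2))
    P = P_next
    if change < 1e-10:
        break

residual = max(abs(riccati_step(P, A, B, Q, R)[i][j] - P[i][j])
               for i in range(2) for j in range(2))
```

The remaining `residual` measures how well the returned P satisfies the stationary equation; dedicated solvers are preferable for larger or ill-conditioned problems.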
Example: 1D point mass
• Dynamics:

$$\ddot q(t) = u(t)/m$$

$$x = \begin{pmatrix} q \\ \dot q \end{pmatrix}, \quad \dot x = \begin{pmatrix} \dot q \\ \ddot q \end{pmatrix} = \begin{pmatrix} \dot q \\ u(t)/m \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x + \begin{pmatrix} 0 \\ 1/m \end{pmatrix} u$$

$$= A x + B u, \quad A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1/m \end{pmatrix}$$

• Costs:

$$c(x, u) = \epsilon ||x||^2 + \varrho ||u||^2, \quad Q = \epsilon I, \quad R = \varrho I$$

• Algebraic Riccati equation:

$$P = \begin{pmatrix} a & c \\ c & b \end{pmatrix}, \quad u^* = -R^{-1} B^\top P x = \frac{-1}{\varrho m} [c q + b \dot q]$$

$$\begin{aligned}
0 &= A^\top P + P A - P B R^{-1} B^\top P + Q \\
&= \begin{pmatrix} 0 & 0 \\ a & c \end{pmatrix} + \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix} - \frac{1}{\varrho m^2} \begin{pmatrix} c^2 & bc \\ bc & b^2 \end{pmatrix} + \epsilon \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{aligned}$$

First solve for $c = m \sqrt{\varrho \epsilon}$, then for $b = m \sqrt{\varrho} \sqrt{2c + \epsilon}$; the off-diagonal entry gives $a = bc / (\varrho m^2)$. The damping ratio of the closed loop, $\xi = \frac{b}{2 m \sqrt{\varrho c}}$, depends on the choices of $\varrho$ and $\epsilon$.

• The Algebraic Riccati equation is usually solved numerically. (E.g. are, care or dare in Octave)
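The point-mass solution can be checked numerically by writing the ARE out entry by entry. The sketch below assumes Q = εI and a scalar R = ϱ, uses the symmetric expansion AᵀP + PA = [[0, a], [a, 2c]] for this A, and picks hypothetical values for m, ε and ϱ.

```python
import math

def point_mass_are_solution(m, eps, rho):
    """Entries of P = [[a, c], [c, b]] solving the ARE for the 1D point mass."""
    c = m * math.sqrt(rho * eps)                 # (1,1): c^2 / (rho m^2) = eps
    b = m * math.sqrt(rho * (2 * c + eps))       # (2,2): b^2 / (rho m^2) = 2c + eps
    a = b * c / (rho * m * m)                    # (1,2): a = b c / (rho m^2)
    return a, b, c

def are_residual(a, b, c, m, eps, rho):
    """Distinct entries of A^T P + P A - P B R^-1 B^T P + Q."""
    k = 1.0 / (rho * m * m)
    return (
        -k * c * c + eps,            # (1,1)
        a - k * b * c,               # (1,2) = (2,1)
        2 * c - k * b * b + eps,     # (2,2)
    )

m, eps, rho = 2.0, 1.0, 0.5                      # hypothetical values
a, b, c = point_mass_are_solution(m, eps, rho)
residual = are_residual(a, b, c, m, eps, rho)

Kp, Kd = c / (rho * m), b / (rho * m)            # u* = -Kp q - Kd q_dot
xi = Kd / math.sqrt(4 * m * Kp)                  # damping of m q'' + Kd q' + Kp q = 0
```

Under these assumptions the damping ratio works out to sqrt(1/2 + ε/(4c)), so it never drops below 1/√2 regardless of the weights.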
Optimal control comments
• HJB or Bellman equation are very powerful concepts

• Even if we can solve the HJB eq. and have the optimal control, we still don’t know how the system really behaves:
– Will it actually reach a “desired state”?
– Will it be stable?
– Is it actually “controllable” at all?

• Last note on optimal control:

Formulate as a constrained optimization problem with objective function $J^\pi$ and constraint ẋ = f (x, u). λ(t) are the Lagrange multipliers. It turns out that $\frac{\partial}{\partial x} V(x, t) = \lambda(t)$. (See Stengel.)
Relation to other topics
• Optimal Control:

$$\min_\pi J^\pi = \int_0^T c(x(t), u(t)) \, dt + \phi(x(T))$$

• Inverse Kinematics:

$$\min_q f(q) = ||q - q_0||^2_W + ||\phi(q) - y^*||^2_C$$

• Optimal operational space control:

$$\min_u f(u) = ||u||^2_H + ||\ddot\phi(q) - \ddot y^*||^2_C$$

• Trajectory Optimization: (control hard constraints could be included)

$$\min_{q_{0:T}} f(q_{0:T}) = \sum_{t=0}^{T} ||\Psi_t(q_{t-k}, .., q_t)||^2 + \sum_{t=0}^{T} ||\Phi_t(q_t)||^2$$

• Reinforcement Learning:
– Markov Decision Processes ↔ discrete time stochastic controlled system $P(x_{t+1} \,|\, u_t, x_t)$
– Bellman equation → Basic RL methods (Q-learning, etc)
Controllability
• As a starting point, consider the claim:
“Intelligence means to gain maximal controllability over all degrees of
freedom over the environment.”

Note:
– controllability (ability to control) ≠ control
– What does controllability mean exactly?

• I think the general idea of controllability is really interesting

– Linear control theory provides one specific definition of controllability, which we introduce next.
Controllability
• Consider a linear controlled system

$$\dot x = A x + B u$$

How can we tell from the matrices A and B whether we can control x to eventually reach any desired state?

• Example: x is 2-dim, u is 1-dim:

$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u$$

Is x “controllable”?

$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u$$

Is x “controllable”?
Controllability
We consider a linear stationary (=time-invariant) controlled system

$$\dot x = A x + B u$$

• Complete controllability: All elements of the state can be brought from arbitrary initial conditions to zero in finite time
• A system is completely controllable iff the controllability matrix

$$C := \begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix}$$

has full rank dim(x) (that is, all rows are linearly independent)

• Meaning of C:
The ith row describes how the ith element $x_i$ can be influenced by u
“B”: $\dot x_i$ is directly influenced via B
“AB”: $\ddot x_i$ is “indirectly” influenced via AB (u directly influences some $\dot x_j$ via B; the dynamics A then influence $\ddot x_i$ depending on $\dot x_j$)
“A²B”: $\dddot x_i$ is “double-indirectly” influenced
etc...
Note: $\ddot x = A \dot x + B \dot u = A A x + A B u + B \dot u$
$\dddot x = A^3 x + A^2 B u + A B \dot u + B \ddot u$
Controllability
• When all rows of the controllability matrix are linearly independent ⇒
(u, u̇, ü, ...) can influence all elements of x independently
• If a row is zero → this element of x cannot be controlled at all
• If 2 rows are linearly dependent → these two elements of x will remain
coupled forever

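Controllability examples

$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} u, \quad C = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} \quad \text{rows linearly dependent}$$

$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u, \quad C = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{2nd row zero}$$

$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u, \quad C = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{good!}$$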
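The rank test on the controllability matrix is mechanical and can be sketched in a few lines. The helper names below are hypothetical, and the rank is computed with plain Gaussian elimination, which is adequate for small, well-scaled matrices like these three examples.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def controllability_matrix(A, B):
    """Stack the blocks [B, AB, ..., A^(n-1) B] side by side."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def rank(M, tol=1e-12):
    """Rank via Gaussian elimination with partial pivoting."""
    M = [list(map(float, row)) for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = max(range(r, len(M)), key=lambda i: abs(M[i][col]), default=None)
        if pivot is None or abs(M[pivot][col]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

ZERO = [[0, 0], [0, 0]]
examples = {
    "coupled": (ZERO, [[1], [1]]),
    "second state untouched": (ZERO, [[1], [0]]),
    "double integrator": ([[0, 1], [0, 0]], [[0], [1]]),
}
ranks = {name: rank(controllability_matrix(A, B)) for name, (A, B) in examples.items()}
```

Only the double integrator reaches rank 2 = dim(x); the other two systems have rank 1 and are therefore not completely controllable.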
Controllability
Why is it important/interesting to analyze controllability?

• The Algebraic Riccati Equation will always return an “optimal” controller

– but controllability tells us whether such a controller even has a
chance to control x

• “Intelligence means to gain maximal controllability over all degrees of freedom over the environment.”
– real environments are non-linear
– “to have the ability to gain controllability over the environment’s DoFs”
Stability
• One of the most central topics in control theory

• Instead of designing a controller by first designing a cost function and then applying Riccati, design a controller such that the desired state is provably a stable equilibrium point of the closed loop system
Stability
• Stability is an analysis of the closed loop system. That is: for this
analysis we don’t need to distinguish between system and controller
explicitly. Both together define the dynamics

ẋ = f (x, π(x, t)) =: f (x)

• The following will therefore discuss stability analysis of general

differential equations ẋ = f (x)

• What follows:
– 3 basic definitions of stability
– 2 basic methods for analysis by Lyapunov

Aleksandr Lyapunov (1857–1918)
Stability – 3 definitions
ẋ = f (x) with equilibrium point x = 0
• x0 is an equilibrium point ⇐⇒ f (x0 ) = 0

• Lyapunov stable or uniformly stable ⇐⇒

$$\forall \epsilon : \exists \delta \ \text{s.t.} \ ||x(0)|| \le \delta \Rightarrow \forall t : ||x(t)|| \le \epsilon$$

(when it starts off δ-near to x0 , it will remain ε-near forever)

• asymptotically stable ⇐⇒
Lyapunov stable and $\lim_{t \to \infty} x(t) = 0$

• exponentially stable ⇐⇒
asymptotically stable and ∃α, a s.t. $||x(t)|| \le a e^{-\alpha t} ||x(0)||$
(→ the “error” time integral $\int_0^\infty ||x(t)|| \, dt \le \frac{a}{\alpha} ||x(0)||$ is bounded!)
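Linear Stability Analysis
(“Linear” ↔ “local” for a system linearized at the equilibrium point.)
• Given a linear system

$$\dot x = A x$$

Let $\lambda_i$ be the eigenvalues of A

– The system is asymptotically stable ⇐⇒ ∀i : real($\lambda_i$) < 0
– The system is unstable ⇐⇒ ∃i : real($\lambda_i$) > 0
– The system is marginally stable ⇐⇒ ∀i : real($\lambda_i$) ≤ 0
(Note: for eigenvalues with real($\lambda_i$) = 0 these criteria are not sharp. A repeated eigenvalue on the imaginary axis, as for the uncontrolled point mass q̈ = 0, can still give polynomial growth.)

• Meaning: An eigenvalue describes how the system behaves along one state dimension (along the eigenvector):

$$\dot x_i = \lambda_i x_i$$

As for the 1D point mass the solution is $x_i(t) = a e^{\lambda_i t}$ and

– imaginary $\lambda_i$ → oscillation
– negative real($\lambda_i$) → exponential decay ∝ $e^{-|\text{real}(\lambda_i)| t}$
– positive real($\lambda_i$) → exponential explosion ∝ $e^{|\text{real}(\lambda_i)| t}$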
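Linear Stability Analysis: Example
• Let’s take the 1D point mass q̈ = u/m in closed loop with a PD $u = -K_p q - K_d \dot q$
• Dynamics:

$$\dot x = \begin{pmatrix} \dot q \\ \ddot q \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x + \frac{1}{m} \begin{pmatrix} 0 & 0 \\ -K_p & -K_d \end{pmatrix} x$$

$$A = \begin{pmatrix} 0 & 1 \\ -K_p/m & -K_d/m \end{pmatrix}$$

• Eigenvalues:
The equation $\lambda \begin{pmatrix} q \\ \dot q \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -K_p/m & -K_d/m \end{pmatrix} \begin{pmatrix} q \\ \dot q \end{pmatrix}$ leads to the equation $\lambda \dot q = \lambda^2 q = -\frac{K_p}{m} q - \frac{K_d}{m} \lambda q$ or $m \lambda^2 + K_d \lambda + K_p = 0$ with solution (compare slide 05:10)

$$\lambda = \frac{-K_d \pm \sqrt{K_d^2 - 4 m K_p}}{2m}$$

For $K_d^2 - 4 m K_p$ negative, real(λ) = $-K_d / 2m$
⇒ Positive derivative gain $K_d$ makes the system stable.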
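The eigenvalue formula for the PD-controlled point mass can be evaluated directly. This is a small sketch with made-up gains; the classification helper and its tolerance are illustrative choices rather than part of the lecture.

```python
import cmath

def pd_eigenvalues(m, Kp, Kd):
    """Roots of m*lam^2 + Kd*lam + Kp = 0, the eigenvalues of the closed-loop A."""
    disc = cmath.sqrt(Kd * Kd - 4 * m * Kp)
    return (-Kd + disc) / (2 * m), (-Kd - disc) / (2 * m)

def classify(eigs, tol=1e-12):
    """Label the closed loop by the real parts of its eigenvalues."""
    reals = [lam.real for lam in eigs]
    if all(re < -tol for re in reals):
        return "asymptotically stable"
    if any(re > tol for re in reals):
        return "unstable"
    return "marginal (eigenvalue on the imaginary axis)"

damped = pd_eigenvalues(m=1.0, Kp=4.0, Kd=2.0)       # complex pair, real part -Kd/2m
undamped = pd_eigenvalues(m=1.0, Kp=4.0, Kd=0.0)     # purely imaginary pair
negative_damping = pd_eigenvalues(m=1.0, Kp=4.0, Kd=-2.0)
labels = [classify(e) for e in (damped, undamped, negative_damping)]
```

Sweeping `Kd` while keeping `Kp` fixed shows the real part moving with −K_d/2m as long as the discriminant stays negative.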
Side note: Stability for discrete time systems
• Given a discrete time linear system

$$x_{t+1} = A x_t$$

Let $\lambda_i$ be the eigenvalues of A

– The system is asymptotically stable ⇐⇒ ∀i : $|\lambda_i|$ < 1
– The system is unstable ⇐⇒ ∃i : $|\lambda_i|$ > 1
– The system is marginally stable ⇐⇒ ∀i : $|\lambda_i|$ ≤ 1
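The discrete-time criterion can be checked against a direct simulation. The sketch below handles only 2×2 matrices (eigenvalues from trace and determinant) and uses two made-up matrices, a rotation scaled slightly below and slightly above magnitude 1.

```python
import cmath

def eig2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def spectral_radius(A):
    return max(abs(lam) for lam in eig2(A))

def simulate_norm(A, x0, steps):
    """Iterate x_{t+1} = A x_t and return the Euclidean norm of the final state."""
    x = list(x0)
    for _ in range(steps):
        x = [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]
    return (x[0] ** 2 + x[1] ** 2) ** 0.5

A_stable = [[0.9, 0.2], [-0.2, 0.9]]      # hypothetical, |lambda| about 0.92
A_unstable = [[1.0, 0.2], [-0.2, 1.0]]    # hypothetical, |lambda| about 1.02

rho_stable, rho_unstable = spectral_radius(A_stable), spectral_radius(A_unstable)
norm_stable = simulate_norm(A_stable, (1.0, 0.0), 100)
norm_unstable = simulate_norm(A_unstable, (1.0, 0.0), 100)
```

For these scaled rotations the state norm after t steps is exactly |λ|ᵗ times the initial norm, so the simulated norms shrink or grow as the eigenvalue magnitudes predict.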
Linear Stability Analysis comments
• The same type of analysis can be done locally for non-linear systems, as we will do for the cart-pole in the exercises

• We can design a controller that minimizes the (negative) eigenvalues of A:
↔ controller with fastest asymptotic convergence

This is a real alternative to optimal control!
Lyapunov function method
• A method to analyze/prove stability for general non-linear systems is
the famous “Lyapunov’s second method”

Let D be a region around the equilibrium point x0

• A Lyapunov function V (x) for a system dynamics ẋ = f (x) is
– positive, V (x) > 0, everywhere in D except...
at the equilibrium point where V (x0 ) = 0
– always decreases, $\dot V(x) = \frac{\partial V(x)}{\partial x} \dot x < 0$, in D except...
at the equilibrium point where f (x) = 0 and therefore V̇ (x) = 0

• If there exists a D and a Lyapunov function ⇒ the system is asymptotically stable

If D is the whole state space, the system is globally stable
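The two Lyapunov conditions can be checked numerically for a candidate V. The sketch below uses a made-up scalar system ẋ = −x³, whose linearization at 0 has eigenvalue 0 and is therefore inconclusive, with the candidate V(x) = x². A grid check like this is only a sanity test, not a proof.

```python
def f(x):
    return -x ** 3            # hypothetical dynamics with equilibrium x0 = 0

def V(x):
    return x * x              # candidate Lyapunov function

def V_dot(x):
    return 2 * x * f(x)       # dV/dx * x_dot = -2 x^4

grid = [i / 100 for i in range(-200, 201)]          # sample the region D = [-2, 2]
is_lyapunov = all(
    (V(x) > 0 and V_dot(x) < 0) if x != 0 else (V(x) == 0 and V_dot(x) == 0)
    for x in grid
)

# Explicit Euler simulation: V should shrink monotonically along the trajectory
x, dt, values = 1.5, 0.01, []
for _ in range(2000):
    values.append(V(x))
    x += dt * f(x)
monotone = all(v1 < v0 for v0, v1 in zip(values, values[1:]))
```

Here both conditions hold on the sampled region and V decreases along the simulated trajectory, consistent with asymptotic stability even though the linear eigenvalue test gives no answer.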
Lyapunov function method
• The Lyapunov function method is very general. V (x) could be
“anything” (energy, cost-to-go, whatever). Whenever one finds some
V (x) that decreases, this proves stability

• The problem though is to think of some V (x) given a dynamics!

(In that sense, the Lyapunov function method is rather a method of
proof than a concrete method for stability analysis.)

• In standard cases, a good guess for the Lyapunov function is either the
energy or the value function

Lyapunov function method – Energy Example
• Let’s take the 1D point mass q̈ = u/m in closed loop with a PD $u = -K_p q - K_d \dot q$, which has the solution (slide 05:14):

$$q(t) = b \, e^{-\xi/\lambda \, t} \, e^{i \omega_0 \sqrt{1 - \xi^2} \, t}$$

• Energy of the 1D point mass: $V(q, \dot q) := \frac{1}{2} m \dot q^2$

$$V(x(t)) = e^{-2\xi/\lambda \, t} \, V(x(0))$$

(using that the energy of an undamped oscillator is conserved)

• $\dot V(x) < 0$ ⇐⇒ ξ > 0 ⇐⇒ $K_d$ > 0
Same result as for the eigenvalue analysis
Lyapunov function method – value function Example
• Consider infinite horizon linear-quadratic optimal control. The solution of the Algebraic Riccati equation gives the optimal controller.
• The value function satisfies

$$\begin{aligned}
V(x) &= x^\top P x \\
\dot V(x) &= [A x + B u^*]^\top P x + x^\top P [A x + B u^*] \\
u^* &= -R^{-1} B^\top P x = K x \\
\dot V(x) &= x^\top [(A + BK)^\top P + P (A + BK)] x \\
&= x^\top [A^\top P + P A + (BK)^\top P + P (BK)] x \\
0 &= A^\top P + P A - P B R^{-1} B^\top P + Q \\
\dot V(x) &= x^\top [P B R^{-1} B^\top P - Q + (P B K)^\top + P B K] x \\
&= -x^\top [Q + K^\top R K] x
\end{aligned}$$

(We could have derived this easier! $x^\top Q x$ are the immediate state costs, and $x^\top K^\top R K x = u^\top R u$ are the immediate control costs, and $\dot V(x) = -c(x, u^*)$! See slide 11 bottom.)
• That is: V is a Lyapunov function if $Q + K^\top R K$ is positive definite!
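The identity V̇(x) = −xᵀ[Q + KᵀRK]x can be verified numerically in the scalar case, where the algebraic Riccati equation has a closed-form positive root. The system parameters below are hypothetical, and K follows the sign convention u* = Kx used in this derivation.

```python
import math

a, b, q, r = 0.5, 2.0, 1.0, 0.25          # hypothetical scalar system and weights

# Positive root of the scalar ARE 0 = 2 a p - p^2 b^2 / r + q
p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
K = -b * p / r                              # u* = K x

def V_dot(x):
    """d/dt (p x^2) along the closed loop x_dot = (a + b K) x."""
    return 2 * p * x * (a + b * K) * x

def minus_cost(x):
    """-c(x, u*) = -(q x^2 + r u*^2) with u* = K x."""
    return -(q + K * K * r) * x * x

samples = [-2.0, -0.5, 0.3, 1.0, 4.0]
max_gap = max(abs(V_dot(x) - minus_cost(x)) for x in samples)
closed_loop_pole = a + b * K
```

Because q + K²r is positive here, V̇ is negative for every nonzero x, which is the scalar version of the positive-definiteness condition.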
Observability & Adaptive Control
• When some state dimensions are not directly observable: analyzing higher order derivatives to infer them.
Very closely related to controllability: Just like the controllability matrix tells whether state dimensions can (indirectly) be controlled, an observability matrix tells whether state dimensions can (indirectly) be inferred.

• Adaptive Control: When system dynamics ẋ = f (x, u, β) has unknown parameters β.
– One approach is to estimate β from the data so far and use optimal control.
– Another is to design a controller that has an additional internal update equation for an estimate β̂ and is provably stable. (See Schaal’s lecture, for instance.)


Uploaded by

Robotics: Control Theory

Uploaded by

Robotics

Topics in control theory, optimal control, HJB

state x = (x, ẋ, θ, θ̇)

and a controller of the form

• We’ll neglect stochasticity here

• When analyzing a given controller π, one analyzes closed-loop

ẋ = f (x, π(x, t))

(E.g., analysis for convergence & stability)

• Robert F. Stengel: Optimal control and estimation

• From robotics lectures:

• Then controllability & stability

we define a cost function

where x_0 and the controller π : (x, t) ↦ u are given, which determines

“V(state) = min_edge [c(edge) + V(next-state)]”
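
The Bellman principle quoted above can be illustrated on a tiny shortest-path problem. The following is only a sketch: the graph, its edge costs, and the function name value_iteration are hypothetical and not part of the lecture.

```python
# Hypothetical toy graph: edges[state] = list of (edge cost, next state).
edges = {
    "A": [(1.0, "B"), (4.0, "C")],
    "B": [(2.0, "C"), (6.0, "D")],
    "C": [(1.0, "D")],
    "D": [],  # goal state, no outgoing edges
}


def value_iteration(edges, goal, sweeps=100):
    """Repeatedly apply V(s) = min over edges [c(edge) + V(next-state)]."""
    V = {s: float("inf") for s in edges}
    V[goal] = 0.0  # zero cost-to-go at the goal
    for _ in range(sweeps):
        for s, out in edges.items():
            if s != goal and out:
                V[s] = min(c + V[n] for c, n in out)
    return V


V = value_iteration(edges, "D")
print(V)
```

Each sweep applies the Bellman backup once to every state, so the cost-to-go propagates backward from the goal until it no longer changes.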

we define a cost function with horizon T

• This cost function is stationary (time-invariant)!

• Many many problems in control can be framed this way

But it might still be hard or infeasible to represent the functions V_t(x)

• Both become significantly simpler under linear dynamics and quadratic

c(x, u) = x^⊤ Q x + u^⊤ R u ,  φ(x_T) = x_T^⊤ F x_T

• Note: Dynamics neglects constant term; costs neglect linear and

Let x∗ be such a desired state (e.g., cart-pole: x∗ = (0, 0, 0, 0))

Let V(x, t) = x^⊤ P(t) x, then the HJB equation becomes

⇒ Riccati differential equation

−Ṗ = A^⊤ P + P A − P B R^{-1} B^⊤ P + Q

• This is a differential equation for the matrix P (t) describing the

• The optimal control u∗ = −R^{-1} B^⊤ P x is called Linear Quadratic

u∗ = K_p (r(t) − q(t)) + K_d (ṙ(t) − q̇(t))

−Ṗ = A^⊤ P + P A − P B R^{-1} B^⊤ P + Q ,  P(T) = F
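
For a scalar system the Riccati differential equation above can be integrated backward from P(T) = F with a simple Euler scheme. This is a minimal sketch under assumptions: the scalar dynamics ẋ = a x + b u, the numerical values, the step size, and the function name riccati_backward are illustrative choices, not taken from the lecture.

```python
import math


def riccati_backward(a, b, q, r, F, T, dt=1e-3):
    """Euler-integrate -dP/dt = 2 a P - (b^2 / r) P^2 + q backward from P(T) = F.

    This is the scalar case of the matrix equation with A = a, B = b, Q = q, R = r.
    """
    P = F
    steps = int(round(T / dt))
    for _ in range(steps):
        minus_P_dot = 2.0 * a * P - (b * b / r) * P * P + q
        P += dt * minus_P_dot  # one step from time t to time t - dt
    return P  # approximation of P(0)


# Assumed example values: an unstable scalar system with unit costs.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
P0 = riccati_backward(a, b, q, r, F=0.0, T=10.0)

# Positive root of the scalar algebraic equation 0 = 2 a P - (b^2 / r) P^2 + q.
P_inf = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

gain = b * P0 / r  # scalar form of u* = -R^{-1} B^T P x, i.e. u* = -gain * x
print(P0, P_inf, a - b * gain)
```

For a long horizon T, P(0) approaches the stationary solution P_inf, and the closed-loop coefficient a − b·gain becomes negative even though a > 0.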

• Infinite horizon continuous time

P_{t-1} = Q + A^⊤ [P_t − P_t B (R + B^⊤ P_t B)^{-1} B^⊤ P_t] A ,  P_T = F

P = Q + A^⊤ [P − P B (R + B^⊤ P B)^{-1} B^⊤ P] A
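
One simple way to obtain a solution of the discrete-time algebraic equation above is to run the finite-horizon recursion until it stops changing. The sketch below does this for a scalar system; the values a = 1.1, b = q = r = 1 and the names riccati_step and solve_dare_scalar are assumptions made for illustration only.

```python
def riccati_step(P, a, b, q, r):
    """One backward step P_{t-1} = Q + A [P - P B (R + B P B)^{-1} B P] A (scalar case)."""
    return q + a * (P - P * b * (1.0 / (r + b * P * b)) * b * P) * a


def solve_dare_scalar(a, b, q, r, F=0.0, tol=1e-12, max_iter=10_000):
    """Iterate the recursion from P_T = F until a fixed point is reached."""
    P = F
    for _ in range(max_iter):
        P_new = riccati_step(P, a, b, q, r)
        if abs(P_new - P) < tol:
            return P_new
        P = P_new
    raise RuntimeError("Riccati recursion did not converge")


# Assumed example: x_{t+1} = a x_t + b u_t with |a| > 1 (open-loop unstable).
a, b, q, r = 1.1, 1.0, 1.0, 1.0
P = solve_dare_scalar(a, b, q, r)
K = (1.0 / (r + b * P * b)) * b * P * a  # feedback gain, u_t = -K x_t
print(P, K, a - b * K)
```

The fixed point satisfies the algebraic equation, and the closed-loop factor a − bK has magnitude below one, so the controlled system is stable.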

c(x, u) = ||x||2 + %||u||2 , Q = I , R = %I

• Algebraic Riccati equation:

• The Algebraic Riccati equation is usually solved numerically. (E.g. are,

• Last note on optimal control:

• Optimal operational space control:

• Trajectory Optimization: (control hard constraints could be included)

• I think the general idea of controllability is really interesting

• Example: x is 2-dim, u is 1-dim:

• The Algebraic Riccati Equation will always return an “optimal” controller

• “Intelligence means to gain maximal controllability over all degrees of

• Instead of designing a controller by first designing a cost function and

ẋ = f (x, π(x, t)) =: f (x)

• The following will therefore discuss stability analysis of general

• Lyapunov stable or uniformly stable ⇐⇒

∀ε : ∃δ s.t. ||x(0)|| ≤ δ ⇒ ||x(t)|| ≤ ε

(when it starts off δ-near to x_0, it will remain ε-near forever)

Let λi be the eigenvalues of A

As for the 1D point mass the solution is x_i(t) = a e^{λ_i t} and

λ q̇ = λ^2 q = −(K_p/m) q − (K_d/m) λ q, or m λ^2 + K_d λ + K_p = 0, with solution λ = (−K_d ± sqrt(K_d^2 − 4 m K_p)) / (2m)
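
The characteristic equation m λ^2 + K_d λ + K_p = 0 above can be checked numerically with the quadratic formula. This is a small sketch; the function names and the gain values in the usage lines are hypothetical.

```python
import cmath


def pd_eigenvalues(m, Kp, Kd):
    """Roots of m*lam^2 + Kd*lam + Kp = 0 (complex if the system oscillates)."""
    disc = cmath.sqrt(Kd * Kd - 4.0 * m * Kp)
    return (-Kd + disc) / (2.0 * m), (-Kd - disc) / (2.0 * m)


def is_asymptotically_stable(m, Kp, Kd):
    """Solutions a*exp(lam*t) decay iff every eigenvalue has negative real part."""
    return all(lam.real < 0.0 for lam in pd_eigenvalues(m, Kp, Kd))


# Assumed example values for mass and PD gains.
roots = pd_eigenvalues(1.0, 4.0, 1.0)
print(roots)
print(is_asymptotically_stable(1.0, 4.0, 1.0))   # positive damping
print(is_asymptotically_stable(1.0, 4.0, -1.0))  # negative damping
```

With positive K_p and K_d both roots have negative real part; a nonzero imaginary part indicates an oscillating, underdamped response.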

• We can design a controller that minimizes the (negative) eigenvalues

This is a real alternative to optimal control!

Let D be a region around the equilibrium point x0

• If there exists a D and a Lyapunov function ⇒ the system is

If D is the whole state space, the system is globally stable

• The problem though is to think of some V (x) given a dynamics!

• Energy of the 1D point mass: V(q, q̇) := ½ m q̇^2

V̇(x) = e^{−(2ξ/λ) t} V(x(0))

(using that the energy of an undamped oscillator is conserved)
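
A candidate Lyapunov function can be sanity-checked numerically by evaluating its time derivative along the dynamics at sample states. The sketch below assumes a PD-controlled 1D point mass, m q̈ = −K_p q − K_d q̇, and uses the total energy ½ m q̇^2 + ½ K_p q^2 as the candidate; adding the potential term ½ K_p q^2, the gain values, and all function names are assumptions for this illustration, not the lecture's definition.

```python
def f(q, qd, m=1.0, Kp=4.0, Kd=0.5):
    """Closed-loop dynamics (q_dot, q_ddot) of the PD-controlled point mass."""
    return qd, (-Kp * q - Kd * qd) / m


def V(q, qd, m=1.0, Kp=4.0):
    """Assumed candidate: kinetic energy plus spring-like potential energy."""
    return 0.5 * m * qd * qd + 0.5 * Kp * q * q


def V_dot(q, qd, m=1.0, Kp=4.0, Kd=0.5):
    """Time derivative of V along the dynamics: gradient of V times f."""
    dq, dqd = f(q, qd, m, Kp, Kd)
    return Kp * q * dq + m * qd * dqd


# Grid of sample states in the region D = [-2, 2] x [-2, 2].
samples = [(i / 2.0, j / 2.0) for i in range(-4, 5) for j in range(-4, 5)]
decreasing = all(V_dot(q, qd) <= 1e-12 for q, qd in samples)
print(decreasing)
```

Analytically the derivative reduces to −K_d q̇^2, which is never positive, so the energy cannot increase along trajectories. A grid check like this can refute a bad candidate but does not prove stability.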

• Adaptive Control: When system dynamics ẋ = f (x, u, β) has unknown
