4.4.2 Bang-bang principle for linear systems


Consider now a system with general linear time-invariant dynamics

$\displaystyle \dot x=Ax+Bu$

(4.51)

where $x\in\mathbb{R}^n$ and $u\in U\subset \mathbb{R}^m$ . We want to investigate under what conditions time-optimal controls for this system have a bang-bang property along the lines of what we saw in the previous example. Of course, we need to specify what the control set $U$ is and exactly what we mean by a bang-bang property for controls taking values in this $U$ . As a natural generalization of the interval $[-1,1]\subset \mathbb{R}$ to higher dimensions, we take $U$ to be an $m$-dimensional hypercube:

$\displaystyle U=\{u\in\mathbb{R}^m: u_i\in[-1,1],\, i=1,\dots,m\}.$

(4.52)

This is a reasonable control set representing independent control actuators. We take the magnitude constraints on the different components of $u$ to be the same just for simplicity; the extension to different constraints is immediate.

Suppose that the control objective is to steer $x$ from a given initial state $x_0$ to a given final state $x_1$ in minimal time. To be sure that this problem is well posed, we assume that there exists some control $u$ that achieves the transfer from $x_0$ to $x_1$ (in some time). As we will see in Section 4.5 (Theorem 4.3), this guarantees that a time-optimal control $u^*:[t_0,t^*]\to U$ exists. We now use the maximum principle to characterize it. The Hamiltonian is $H(x,u,p,p_0)=\langle p,Ax+Bu\rangle +p_0$ . The Hamiltonian maximization condition implies that

$\displaystyle \langle p^*(t),Bu^*(t)\rangle=\max_{u\in U}\langle p^*(t),Bu\rangle$

(4.53)

for all $t\in[t_0,t^*]$ . We can rewrite this formula in terms of the input components as

$\displaystyle \sum_{i=1}^m\langle p^*(t),b_i\rangle u_i^*(t)=\max_{u\in U}\sum_{i=1}^m\langle p^*(t),b_i\rangle u_i$

where $b_1,\dots,b_m$ are the columns of $B$ . Since the components $u_i^*(t)$ of the optimal control can be chosen independently, it is clear that each term in the summation must be maximized:

$\displaystyle \langle p^*(t),b_i\rangle u_i^*(t)=\max_{\vert u_i\vert\le 1}\langle p^*(t),b_i\rangle u_i,\qquad i=1,\dots,m.$

It has now become obvious that we must have

$\displaystyle u_i^*(t)=\operatorname{sgn}(\langle p^*(t),b_i\rangle)=\begin{cases}1 \quad&\text{ if }\ \langle p^*(t),b_i\rangle>0\\ -1 \quad&\text{ if }\ \langle p^*(t),b_i\rangle<0\\ ?\quad&\text{ if }\ \langle p^*(t),b_i\rangle=0 \end{cases}$

(4.54)

for each $i$ . Observe that (4.54) resembles (4.50). Similarly to how we proceeded from that formula, we now need to investigate the possibility that $\langle p^*,b_i\rangle\equiv 0$ on some interval of time for some $i$ , as this would prevent us from determining $u_i^*$ on that interval.
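The componentwise rule (4.54) can be checked numerically at a single time instant. The following is only a minimal sketch, assuming NumPy is available: the matrix `B`, the vector `p` (standing in for $p^*(t)$), and the helper name `maximizer_on_hypercube` are hypothetical and chosen for illustration. Since a linear function on the hypercube attains its maximum at a vertex, comparing against all $2^m$ vertices suffices.

```python
import itertools
import numpy as np

def maximizer_on_hypercube(p, B):
    """Return a maximizer of <p, B u> over the hypercube |u_i| <= 1.

    Implements the sign rule (4.54). Where <p, b_i> = 0 the rule gives no
    information (the '?' case); we arbitrarily pick +1 there.
    """
    c = B.T @ p                  # c_i = <p, b_i>
    u = np.sign(c)
    u[c == 0] = 1.0              # arbitrary choice in the undetermined case
    return u

# Hypothetical data standing in for B and p^*(t) at one fixed time t.
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 2))
p = rng.standard_normal(3)

u_star = maximizer_on_hypercube(p, B)
value_star = float(p @ B @ u_star)

# Brute force over the 2^m vertices of the hypercube.
best_vertex_value = max(
    float(p @ B @ np.array(v))
    for v in itertools.product([-1.0, 1.0], repeat=B.shape[1])
)
print(u_star, value_star, best_vertex_value)
```

The maximal value equals $\sum_i \vert\langle p,b_i\rangle\vert$, which is why each term can be maximized separately.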

The adjoint equation is $\dot p^*=-A^Tp^*$ , which gives $p^*(t)=e^{A^T(t^*-t)}p^*(t^*)$ . From this we obtain $\langle p^*(t),b_i\rangle=\langle p^*(t^*),e^{A(t^*-t)}b_i\rangle$ . This is a real analytic function of $t$ ; hence, if it vanishes on some time interval, then it vanishes for all $t$ , together with all its derivatives. Calculating these derivatives at $t=t^*$ , we arrive at the equalities

$\displaystyle \langle p^*(t^*),b_i\rangle=\langle p^*(t^*),Ab_i\rangle=\dots= \langle p^*(t^*),A^{n-1}b_i\rangle=0.$

(4.55)

In other words, $p^*(t^*)$ is orthogonal to the vectors $b_i$ , $Ab_i$ , $\dots$ , $A^{n-1}b_i$ . As we know, $p^*(t^*)$ itself cannot be 0 in a time-optimal control problem. Therefore, to rule out (4.55) it is enough to assume that $(A,b_i)$ is a controllable pair. We need this to be true for each $i$ ; i.e., the system should be controllable with respect to each individual input channel. Such linear control systems are called normal.
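Normality can be tested with the usual Kalman rank condition applied to each column of $B$ separately. The sketch below assumes NumPy; the function names `controllability_matrix` and `is_normal` and the example matrices are hypothetical, and the rank is computed numerically (with NumPy's default tolerance), so borderline cases may need care.

```python
import numpy as np

def controllability_matrix(A, b):
    """Stack the columns b, Ab, ..., A^{n-1} b."""
    n = A.shape[0]
    cols = [b]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

def is_normal(A, B):
    """True if (A, b_i) is a controllable pair for every column b_i of B."""
    n = A.shape[0]
    return all(
        np.linalg.matrix_rank(controllability_matrix(A, B[:, i])) == n
        for i in range(B.shape[1])
    )

# Double integrator with a single input: normal.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B_single = np.array([[0.0], [1.0]])

# Same A with a second input b_2 = (1, 0): (A, B) is controllable,
# but (A, b_2) is not, so the system is not normal.
B_two = np.array([[0.0, 1.0], [1.0, 0.0]])

print(is_normal(A, B_single), is_normal(A, B_two))
```

The second example illustrates that normality is strictly stronger than controllability of the pair $(A,B)$.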

Let us now collect the properties of the optimal control that we are able to derive under the above normality assumption. None of the functions $\langle p^*(\cdot),b_i\rangle$ vanishes identically on any time interval; being real analytic functions, they have only finitely many zeros on the interval $[t_0,t^*]$ . Using the formula (4.54), we see that each function $u_i^*$ only takes the values $\pm1$ and switches between these values finitely many times. Away from these switching times, $u_i^*$ is uniquely determined by (4.54). We conclude that the overall optimal control $u^*$ takes values only in the set of vertices of the hypercube $U$ , has finitely many discontinuities (switches), and is unique everywhere else. Generalizing the earlier notion, we say that controls taking values in the set of vertices of $U$ are bang-bang (or have the bang-bang property); the result that we have just obtained is a version of the bang-bang principle for linear systems.
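To make the switching structure concrete, the sketch below evaluates the switching function $\langle p^*(t),b\rangle$ for the double integrator on a time grid and applies the sign rule (4.54). It assumes NumPy; the terminal costate `p_final`, the final time `t_final`, and the helper names are hypothetical, and `expm_taylor` is a crude truncated Taylor series used only to keep the example self-contained (it is exact here because $A$ is nilpotent, but it is not a general-purpose matrix exponential).

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Truncated Taylor series for exp(M); adequate only for small ||M||."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

def switching_functions(A, B, p_final, t_final, times):
    """Row k, column i holds <p(t_k), b_i> with p(t) = exp(A^T (t_final - t)) p_final."""
    return np.array(
        [B.T @ expm_taylor(A.T * (t_final - t)) @ p_final for t in times]
    )

# Double integrator; p_final is a hypothetical terminal costate p^*(t^*).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
p_final = np.array([3.0, -2.0])
t_final = 1.0
times = np.linspace(0.0, t_final, 201)

phi = switching_functions(A, B, p_final, t_final, times)  # here phi(t) = 1 - 3t
u = np.sign(phi)                                          # rule (4.54)

# Count sign changes between consecutive samples. A sample landing exactly
# on a zero of phi would give sign 0 (the undetermined '?' case); the grid
# above avoids the zero at t = 1/3.
num_switches = int(np.count_nonzero(np.diff(u[:, 0]) != 0))
print(num_switches)
```

With this choice the control equals $+1$ before $t=1/3$ and $-1$ afterwards, i.e., it has a single switch, in agreement with the double integrator example.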

Before closing this discussion, it is instructive to see how the above bang-bang property can be established in a self-contained way, without relying on the maximum principle. More precisely, the argument outlined next essentially rederives the maximum principle from scratch for the particular problem at hand (in the spirit of Section 4.3.1). Solutions of the system (4.51) take the form

$\displaystyle x(t)=e^{A(t-t_0)}x_0+\int_{t_0}^te^{A (t-s) }Bu( s )d s .$

For $t\ge t_0$ , let us introduce the set of points reachable from $x_0$ at time $t$ :

$\displaystyle R^t(x_0):=\left\{e^{A(t-t_0)}x_0+\int_{t_0}^te^{A (t-s)} Bu( s )d s:u( s )\in U,\, t_0\le s \le t\right\}.$

(4.56)

We know that

$\displaystyle x_1=e^{A(t^*-t_0)}x_0+\int_{t_0}^{t^*}e^{A (t^*-s) }Bu^*( s )d s\in R^{t^*}(x_0).$

(4.57)

In fact, the optimal time $t^*$ is the smallest time $t$ such that $x_1\in R^t(x_0)$ . It follows that $x_1$ must belong to the boundary of the reachable set $R^{t^*}(x_0)$ ; indeed, if it were an interior point of $R^{t^*}(x_0)$ then we could reach it sooner. We will see a little later (in Section 4.5) that the set $R^{t^*}(x_0)$ is compact and convex. Along the lines of Section 4.2.7, there exists a hyperplane that passes through $x_1$ and contains $R^{t^*}(x_0)$ on one side; such a hyperplane is said to support $R^{t^*}(x_0)$ at $x_1$ . Denoting a suitably chosen normal vector to this hyperplane by $p^*(t^*)$ , we have $\langle p^*(t^*),x_1\rangle \ge \langle p^*(t^*),x\rangle$ for all $x\in R^{t^*}(x_0)$ . Using (4.56) and (4.57), we easily obtain

$\displaystyle \int_{t_0}^{t^*} \langle p^*(t^*),e^{A(t^*- s) }Bu^*( s ) \rangle d s\ge\int_{t_0}^{t^*} \langle p^*(t^*),e^{A(t^*- s) }Bu( s ) \rangle d s \qquad \forall\,u:[t_0,t^*]\to U$

which, in view of the formula $e^{A^T(t^*-s)}p^*(t^*)=p^*(s)$ , is equivalent to

$\displaystyle \int_{t_0}^{t^*} \langle p^*(s),Bu^*( s ) \rangle d s\ge\int_{t_0}^{t^*} \langle p^*(s),Bu( s ) \rangle d s \qquad \forall\,u:[t_0,t^*]\to U.$

From this it is not difficult to recover the fact that (4.53) must hold for (almost) all $t\in[t_0,t^*]$ , and we can proceed from there as before.
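The integral inequality above can be illustrated numerically. The sketch below, assuming NumPy, uses the double integrator, for which $e^{A\tau}b=(\tau,1)^T$ in closed form, and compares the control given by the sign rule against randomly sampled admissible controls. The normal vector `p_final`, the grid, and the helper name `terminal_projection` are hypothetical. The term $\langle p^*(t^*),e^{A(t^*-t_0)}x_0\rangle$ is the same for every control and is therefore omitted.

```python
import numpy as np

def terminal_projection(p_final, u_samples, s_mid, t_final, h):
    """Midpoint-rule value of the integral of <p_final, exp(A (t_final - s)) b> u(s) ds
    for the double integrator, where exp(A tau) b = (tau, 1)."""
    phi = p_final[0] * (t_final - s_mid) + p_final[1]
    return float(np.sum(phi * u_samples) * h)

t0, t_final, N = 0.0, 1.0, 400
h = (t_final - t0) / N
s_mid = t0 + h * (np.arange(N) + 0.5)   # midpoints of the subintervals
p_final = np.array([3.0, -2.0])         # hypothetical normal vector p^*(t^*)

# Control given by the sign rule (4.54), sampled at the midpoints.
phi = p_final[0] * (t_final - s_mid) + p_final[1]
u_bang = np.where(phi >= 0, 1.0, -1.0)
value_bang = terminal_projection(p_final, u_bang, s_mid, t_final, h)

# Randomly sampled admissible controls with values in [-1, 1].
rng = np.random.default_rng(1)
values_random = [
    terminal_projection(p_final, rng.uniform(-1.0, 1.0, N), s_mid, t_final, h)
    for _ in range(1000)
]
print(value_bang, max(values_random))
```

No sampled control exceeds the value attained by the bang-bang control, which is the discretized form of the supporting hyperplane inequality.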

Exercise. Consider the same time-optimal control problem as above, but tak... ...o not assume normality). What can you say about optimal controls?

The assumption of normality, which was needed to prove the bang-bang property of time-optimal controls for a hypercube, is quite strong. A different, weaker version of the bang-bang principle could be formulated as follows. Rather than wishing for every time-optimal control to be bang-bang, we could ask whether every state reachable from $x_0$ by some control is also reachable from $x_0$ in the same time by a bang-bang control; in other words, whether reachable sets for bang-bang controls coincide with reachable sets for all controls. This would imply that, even though not all time-optimal controls are necessarily bang-bang, we can always select one that is bang-bang. It turns out that this modified bang-bang principle holds for every linear control system (no controllability assumption is necessary) and every control set $U$ that is a convex polyhedron. The proof requires a refinement of the above argument and some additional steps; see [Sus83, Section 8.1] for details.


Daniel 2010-12-20