
## 4.4.1 Example: double integrator

Consider the system

$$\ddot{x} = u, \qquad u \in [-1, 1], \tag{4.47}$$

which can represent a car with position $x$ and with bounded acceleration $u$ acting as the control (negative acceleration corresponds to braking). Let us study the problem of "parking" the car at the origin, i.e., bringing it to rest at $x = 0$, in minimal time. It is clear--and will follow from the analysis we give below--that the system can indeed be brought to rest at the origin from every initial condition. However, since the control is bounded, we cannot do this arbitrarily fast (we are ignoring the trivial case when the system is initialized at the origin). Thus we expect that there exists an optimal control $u^*$ which achieves the transfer in the smallest amount of time. (As we have already mentioned, the issue of existence of optimal controls is important and nontrivial in general; it will be treated in Section 4.5.)

We know that the dynamics of the double integrator (4.47) are equivalently described by the state-space equations

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = u \qquad (x_1 = x,\; x_2 = \dot{x}), \tag{4.48}$$

where we assume that the initial values of $x_1$ and $x_2$ are given. The running cost is $L \equiv 1$ (cf. Example 3.2 in Section 3.2, where we discussed another time-optimal control problem). Accordingly, the Hamiltonian is $H = p_1 x_2 + p_2 u + p_0$. Let $u^*$ be an optimal control. Our problem is a special case of the Basic Fixed-Endpoint Control Problem, and we now apply the maximum principle to characterize $u^*$. The costate $p^* = (p_1^*, p_2^*)^T$ must satisfy the adjoint equation

$$\begin{aligned} \dot{p}_1^* &= 0, \\ \dot{p}_2^* &= -p_1^* \end{aligned} \tag{4.49}$$

The first line of (4.49) implies that $p_1^*$ is equal to a constant, say $c_1$. The second line of (4.49) then says that $p_2^*$ is given by $p_2^*(t) = c_2 - c_1 t$, where $c_2$ is another constant.

Next, from the Hamiltonian maximization condition and the fact that $u$ takes values in $[-1, 1]$, we have

$$u^*(t) = \operatorname{sgn}(p_2^*(t)) = \begin{cases} 1 & \text{if } p_2^*(t) > 0, \\ -1 & \text{if } p_2^*(t) < 0, \\ ? & \text{if } p_2^*(t) = 0. \end{cases} \tag{4.50}$$

The third case in (4.50) is meant to indicate that when $p_2^*(t)$ equals 0, making the Hamiltonian independent of $u$, the value of $u^*(t)$ can in principle be arbitrary. (By our convention that $u^*$ is continuous either from the left or from the right everywhere, we know that $u^*$ actually cannot take any values other than 1 or $-1$.) Can $p_2^*$ be identically 0 over some time interval? If this happens, then by (4.49) $p_1^*$ must also be identically 0 on that interval, hence everywhere, since $p_1^*$ and $p_2^*$ are a constant and a linear function, respectively. However, we already saw that the costate cannot vanish in a time-optimal control problem, because the fact that $H \equiv 0$ along the optimal trajectory (statement 3 of the maximum principle) would then imply that $p_0^* = 0$, and the nontriviality condition would be violated. Therefore, $p_2^*$ may only equal 0 at isolated points in time, and the formula (4.50) defines $u^*$ uniquely everywhere away from these times. How many zero crossings can $p_2^*$ have? We derived a little while ago that $p_2^*$ is a linear function of time. Thus $p_2^*$ can cross the value 0 at most once.
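Since $p_2^*$ is linear in $t$, the at-most-one-switch property can be checked directly. The sketch below (hypothetical code, not from the text, with illustrative values of the constants $c_1, c_2$, which in the actual problem are determined by the boundary conditions) evaluates $u^*(t) = \operatorname{sgn}(c_2 - c_1 t)$ on a time grid and counts the sign changes:

```python
def p2(t, c1, c2):
    """Costate component p2*(t) = c2 - c1*t, linear in t by (4.49)."""
    return c2 - c1 * t

def u_star(t, c1, c2):
    """Bang-bang control u*(t) = sgn(p2*(t)) from (4.50)."""
    v = p2(t, c1, c2)
    return 1 if v > 0 else (-1 if v < 0 else 0)

# Illustrative constants (hypothetical, not tied to a particular initial state):
c1, c2 = 1.0, 2.0
ts = [k * 0.01 for k in range(501)]  # time grid on [0, 5]
us = [u_star(t, c1, c2) for t in ts if u_star(t, c1, c2) != 0]

# A linear function crosses zero at most once, so the control
# switches sign at most once along the trajectory.
switches = sum(1 for a, b in zip(us, us[1:]) if a != b)
print(switches)  # 1 switch, at t = c2/c1
```

For any choice of $c_1, c_2$ the count is 0 or 1, matching the conclusion above.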

We conclude that the optimal control $u^*$ takes only the values $\pm 1$ and switches between these values at most once. Interpreted in terms of bringing a car to rest at the origin, the optimal control strategy consists in switching between maximal acceleration and maximal braking. The initial sign and the switching time of course depend on the initial condition. The property that $u^*$ only switches between the extreme values is intuitively natural and important; such controls are called bang-bang.

It turns out that for the present problem, the pattern identified above uniquely determines the optimal control law for every initial condition. To see this, let us plot the solutions of the system (4.47) in the $(x_1, x_2)$-plane for $u = \pm 1$. For $u = 1$, repeated integration gives $x_2(t) = t + a$ and then $x_1(t) = \frac{1}{2}t^2 + at + b$ for some constants $a$ and $b$. The resulting relation $x_1 = \frac{1}{2}x_2^2 + c$ (where $c = b - \frac{1}{2}a^2$) defines a family of parabolas in the $(x_1, x_2)$-plane parameterized by $c$. Similarly, for $u = -1$ we obtain the family of parabolas $x_1 = -\frac{1}{2}x_2^2 + c$, $c \in \mathbb{R}$. These curves are shown in Figure 4.15(a,b), with the arrows indicating the direction in which they are traversed. It is easy to see that only two of these trajectories hit the origin (which is the prescribed final point). Their union is the thick curve in Figure 4.15(c), which we call the switching curve and denote by $\Gamma$; it is defined by the relation $x_1 = -\frac{1}{2}x_2|x_2|$. The optimal control strategy thus consists in applying $u = 1$ or $u = -1$ depending on whether the initial point is below or above $\Gamma$, then switching the control value exactly on $\Gamma$ and subsequently following $\Gamma$ to the origin; no switching is needed if the initial point is already on $\Gamma$. (Thinking slightly differently, we can generate all possible optimal trajectories--which cover the entire plane--by starting at the origin and flowing backward in time, first following $\Gamma$ and then switching at an arbitrary point on $\Gamma$.) Recalling the interpretation of our problem as that of parking a car using bounded acceleration/braking, the reader can easily relate the optimal trajectories in the $(x_1, x_2)$-plane with the corresponding motions of the car along the $x$-axis. Note that if the car is initially moving away from the origin, then it begins braking until it stops, turns around, and starts accelerating (this is a "false" switch because $u$ actually remains constant), and then $u$ switches sign and the car starts braking again.
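The switching-curve strategy lends itself to a simple numerical experiment. The sketch below (a hypothetical implementation, not taken from the text) encodes the rule--$u = -1$ above $\Gamma$, $u = 1$ below $\Gamma$, and motion along $\Gamma$ toward the origin otherwise--and integrates (4.48) with a crude forward-Euler scheme; near $\Gamma$ the discretized control chatters rather than sliding exactly, so the state reaches the origin only approximately:

```python
def feedback(x1, x2):
    """Time-optimal control in state feedback form:
    u = -1 above the switching curve x1 = -(1/2) x2 |x2|, u = +1 below it,
    and u = -sgn(x2) on the curve itself (follow it to the origin)."""
    s = x1 + 0.5 * x2 * abs(x2)   # s > 0 above the curve, s < 0 below
    if s > 0:
        return -1.0
    if s < 0:
        return 1.0
    return -1.0 if x2 > 0 else (1.0 if x2 < 0 else 0.0)

def simulate(x1, x2, dt=1e-3, t_max=5.0):
    """Forward-Euler integration of x1' = x2, x2' = u (a numerical sketch;
    the discretized control chatters around the switching curve)."""
    t = 0.0
    while t < t_max:
        u = feedback(x1, x2)
        x1, x2 = x1 + dt * x2, x2 + dt * u
        t += dt
    return x1, x2

# Park the "car" starting at position 1, at rest; the continuous-time
# optimal solution brakes (u = -1) until x2 = -1, then accelerates (u = +1).
x1f, x2f = simulate(1.0, 0.0)
print(abs(x1f) < 0.05 and abs(x2f) < 0.05)  # True: state driven (approximately) to the origin
```

The tolerance reflects the chattering of the Euler discretization; refining `dt` shrinks the residual error.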

The optimal control law that we just found has two important features. First, as we already said, it is bang-bang. Second, we see that it can be described in the form of a state feedback law. This is interesting because in general, the maximum principle only provides an open-loop description of an optimal control; indeed, $u^*$ depends, besides the state $x^*$, on the costate $p^*$, but we managed to eliminate this latter dependence here. It is natural to ask for what more general classes of systems time-optimal controls have these two properties, i.e., are bang-bang and take the state feedback form. The bang-bang property will be examined in detail in the next two subsections. The problem of representing optimal controls as state feedback laws is rather intricate and will not be treated in this book, except for the two exercises below.

The next exercise is along the same lines but the solution is less obvious.

Daniel 2010-12-20