
6.1.4 Global existence of solution for the RDE

We are finally in a position to prove that the solution $ P(\cdot)$ of the RDE (6.14), propagated backward in time from the terminal condition (6.11), exists for all $ t\le t_1$ . It follows from the standard theory of local existence and uniqueness of solutions to ordinary differential equations that $ P(t)$ exists for $ t$ sufficiently close to $ t_1$ . (We already came across such results in Section 3.3.1, with the difference that there solutions were propagated forward in time; we also know about local existence of $ P(t)$ from a different argument based on the formula (6.10), which we gave earlier.) As for global existence, the problem is that some entry of $ P$ may have a finite escape time. In other words, there may exist a time $ \bar t<t_1$ and some indices $ i,j\in\{1,\dots,n\}$ such that $ P_{ij}(t)$ approaches $ \pm\infty$ as $ t\searrow \bar t$ . Such behavior is actually quite typical for solutions of quadratic differential equations of Riccati type, as we discussed in detail earlier. In the context of the formula (6.10), this would mean that the matrix $ \Phi_{11}(t,t_1)-2\Phi_{12}(t,t_1)M$ becomes singular at $ t=\bar t$ . Similarly, in the context of the formula (6.15) the finite escape would mean that the matrix $ X(t)$ becomes singular at $ t=\bar t$ . In Section 2.6.2, we encountered a closely related situation and formalized it in terms of existence of conjugate points. Fortunately, in the present setting it is not very difficult to show by a direct argument that all entries of $ P(t)$ remain bounded for all $ t$ , relying on the fact, established in the previous subsection, that $ x^TP(t)x$ is the optimal LQR cost-to-go from $ (t,x)$ .
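
To see concretely what a finite escape looks like, here is a small side calculation; it is an illustration only and not part of the proof. It assumes that the RDE has the standard form $ \dot P=-PA-A^TP-Q+PBR^{-1}B^TP$ , and it uses hypothetical scalar data $ A=0$ , $ B=1$ , $ R=1$ , $ M=0$ together with a negative weight $ Q=-1$ , chosen on purpose so that the cost is no longer nonnegative. The RDE then reads

$\displaystyle \dot P=P^2+1,\qquad P(t_1)=0,\qquad\text{with solution}\qquad P(t)=-\tan(t_1-t),$

which tends to $ -\infty$ as $ t\searrow t_1-\pi/2$ , so $ \bar t=t_1-\pi/2$ is a finite escape time. With $ Q=1$ instead, the same computation gives $ \dot P=P^2-1$ and the solution $ P(t)=\tanh(t_1-t)$ , which stays in $ [0,1)$ for all $ t\le t_1$ . The difference between the two cases is exactly the sign information that the cost-based argument exploits.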

Seeking a contradiction, suppose that there is a $ \bar t<t_1$ such that $ P(t)$ exists on the interval $ (\bar t,t_1]$ but some entry of $ P(t)$ becomes unbounded as $ t\searrow \bar t$ . We know from Exercise 6.2 that for all $ t\in (\bar t,t_1]$ the matrix $ P(t)$ is symmetric and positive semidefinite, hence all its principal minors must be nonnegative. If an off-diagonal entry $ P_{ij}(t)$ becomes unbounded as $ t\searrow \bar t$ while all diagonal entries stay bounded, then a certain $ 2\times 2$ principal minor of $ P(t)$ must be negative for $ t$ sufficiently close to $ \bar t$ ; namely, this is the determinant of the matrix

\begin{align*}\begin{array}{cc} & \begin{array}{cc} i\quad & \quad j \end{array} \\ \begin{array}{c} i\\ j \end{array} & \left(\begin{array}{cc} * & P_{ij}(t) \\ P_{ij}(t) & * \end{array}\right) \end{array}\end{align*}

formed by the $ i$ -th and the $ j$ -th row and column of $ P(t)$ . Here $ *$ stands for the diagonal entries $ P_{ii}(t)$ and $ P_{jj}(t)$ , so the determinant equals $ P_{ii}(t)P_{jj}(t)-P_{ij}^2(t)$ ; the product $ P_{ii}(t)P_{jj}(t)$ stays bounded while $ P_{ij}^2(t)\to\infty$ , which makes the determinant negative. Thus this scenario is ruled out, and the only remaining possibility is that a diagonal entry $ P_{ii}(t)$ becomes unbounded as $ t\searrow \bar t$ . Consider the vector $ e_i:=(0,\dots,1,\dots,0)^T\in\mathbb{R}^n$ , with 1 in the $ i$ -th position and zeros everywhere else; then $ e_i^TP(t)e_i=P_{ii}(t)\to\infty$ as $ t\searrow \bar t$ . Suppose that the system is in state $ e_i$ at some time $ t>\bar t$ . We know that $ e_i^TP(t)e_i$ is the optimal cost-to-go from there, and so this cost must be unbounded as we take $ t$ to be closer to $ \bar t$ . On the other hand, this cost cannot exceed the cost of applying, e.g., the zero control on $ [t,t_1]$ . The state trajectory corresponding to this control is $ x( s)=\Phi_A( s,t)e_i$ for $ s\in[t,t_1]$ , where $ \Phi_A(\cdot,\cdot)$ is the transition matrix for $ A(\cdot)$ , and the associated cost is

$\displaystyle \int_t^{t_1}\left(e_i^T\Phi^T_A( s,t)Q( s)\Phi_A( s,t)e_i\right)ds +e_i^T\Phi_A^T(t_1,t)M\Phi_A(t_1,t)e_i.$

It is quite clear that this cost remains bounded as $ t$ approaches $ \bar t$ , since the quantity inside the integral is bounded for $ \bar t\le t\le s\le t_1$ . The resulting contradiction proves that a finite escape time $ \bar t$ cannot exist.
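
The comparison used in this proof can be checked numerically in the scalar case. The following is a minimal sketch, not part of the argument: it assumes constant scalar data `a`, `b`, `q`, `r`, `m` (hypothetical names standing for $ A$ , $ B$ , $ Q$ , $ R$ , $ M$ ), assumes the RDE has the standard form $ \dot P=-PA-A^TP-Q+PBR^{-1}B^TP$ , and uses a hand-written classical Runge-Kutta scheme chosen here only for convenience. It integrates the RDE backward from $ P(t_1)=M$ and compares the result with the zero-control cost from the state $ x=1$ .

```python
import math

def riccati_rhs(p, a, b, q, r):
    # Scalar form of the RDE: dP/dt = -2 a P - q + (b^2 / r) P^2
    return -2.0 * a * p - q + (b * b / r) * p * p

def solve_rde_backward(a, b, q, r, m, t0, t1, steps=2000):
    """Integrate from P(t1) = m down to t0 with classical RK4; return P(t0)."""
    h = (t0 - t1) / steps  # negative step size: we move backward in time
    p = m
    for _ in range(steps):
        k1 = riccati_rhs(p, a, b, q, r)
        k2 = riccati_rhs(p + 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(p + 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(p + h * k3, a, b, q, r)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p

def zero_control_cost(a, q, m, t0, t1):
    """Cost of u = 0 from x(t0) = 1, where the state is x(s) = exp(a (s - t0))."""
    horizon = t1 - t0
    if a == 0.0:
        running = q * horizon
    else:
        running = q * (math.exp(2.0 * a * horizon) - 1.0) / (2.0 * a)
    return running + m * math.exp(2.0 * a * horizon)

# Illustrative (made-up) data: an unstable scalar plant with unit weights.
for t0 in (2.5, 1.0, 0.0):
    p = solve_rde_backward(1.0, 1.0, 1.0, 1.0, 2.0, t0, 3.0)
    bound = zero_control_cost(1.0, 1.0, 2.0, t0, 3.0)
    print(t0, p, bound)  # P(t0) should lie between 0 and the zero-control cost
```

The zero-control cost grows as the horizon gets longer, but it stays finite on every bounded interval, and that is all the proof needs: the optimal cost-to-go is trapped between 0 and a finite number.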

The existence of the solution $ P(\cdot)$ to the RDE (6.14) on the interval $ [t_0,t_1]$ is now established. Thus we can be sure that the optimal control (6.12) is well defined, and the finite-horizon LQR problem has been completely solved. We must be able to explicitly solve the RDE, though, if we want to obtain a closed-form expression for the optimal control law.

To obtain the simplest possible finite-horizon LQR problem, cons...
...eedback law $ u(t)=-\tanh (t_1-t)x(t)$ is the optimal control.
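
The feedback law stated at the end of this example can be sanity-checked by simulation. The sketch below rests on an assumption, since the example's data are not reproduced here: it takes the scalar problem $ \dot x=u$ with cost $ \int_{t_0}^{t_1}(x^2+u^2)dt$ and no terminal cost, which is one problem consistent with the gain $ \tanh(t_1-t)$ . Under that assumption the optimal cost from $ (t_0,x_0)$ should equal $ \tanh(t_1-t_0)\,x_0^2$ . The function name `closed_loop_cost` and the Runge-Kutta integrator are choices made here for illustration.

```python
import math

def closed_loop_cost(x0, t0, t1, steps=2000):
    """Simulate x' = u with u = -tanh(t1 - t) x and accumulate the cost
    integral of x^2 + u^2 over [t0, t1] using classical RK4."""
    def deriv(t, x):
        u = -math.tanh(t1 - t) * x
        return u, x * x + u * u  # (dx/dt, d(cost)/dt)

    h = (t1 - t0) / steps
    t, x, cost = t0, x0, 0.0
    for _ in range(steps):
        k1x, k1c = deriv(t, x)
        k2x, k2c = deriv(t + 0.5 * h, x + 0.5 * h * k1x)
        k3x, k3c = deriv(t + 0.5 * h, x + 0.5 * h * k2x)
        k4x, k4c = deriv(t + h, x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        cost += h * (k1c + 2.0 * k2c + 2.0 * k3c + k4c) / 6.0
        t += h
    return cost

x0, t0, t1 = 1.5, 0.0, 2.0
simulated = closed_loop_cost(x0, t0, t1)
predicted = math.tanh(t1 - t0) * x0 ** 2   # x0^T P(t0) x0 with P(t) = tanh(t1 - t)
zero_control = (t1 - t0) * x0 ** 2         # cost of u = 0, for comparison
print(simulated, predicted, zero_control)
```

The simulated and predicted costs should agree up to discretization error, and both should be smaller than the zero-control cost, since $ \tanh(T)<T$ for every $ T>0$ .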

If analytically solving the RDE is not a completely trivial task even for such an elementary example, we expect closed-form solutions to be obtainable only in very special cases. Yet, the LQR problem is much more tractable than the general optimal control problem studied in Chapter 5. The main simplification is that instead of trying to solve the HJB equation, which is a partial differential equation, we now have to solve the RDE, which is an ordinary differential equation, and this can be done efficiently by standard numerical methods. In the next section, we define and study a variant of the LQR problem which lends itself to an even simpler solution; this development will be in line with what we already saw, in a more general context but in much less detail, towards the end of Section 5.1.3.
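
As a rough indication of what such a numerical solution might look like in the matrix case, here is a minimal sketch. Several things in it are assumptions made for illustration: the RDE is taken in the standard form $ \dot P=-PA-A^TP-Q+PBR^{-1}B^TP$ with constant matrices, the combination $ BR^{-1}B^T$ is passed in precomputed as `S`, the data describe a hypothetical double integrator with unit weights, and the integrator is a hand-written classical Runge-Kutta scheme on plain nested lists rather than a library routine.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def lin_comb(terms):
    """Sum of c * X over (c, X) pairs of equally sized matrices."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * X[i][j] for c, X in terms) for j in range(cols)]
            for i in range(rows)]

def rde_rhs(P, A, S, Q):
    # dP/dt = -P A - A^T P - Q + P S P, with S = B R^{-1} B^T
    return lin_comb([(-1.0, mat_mul(P, A)),
                     (-1.0, mat_mul(transpose(A), P)),
                     (-1.0, Q),
                     (1.0, mat_mul(mat_mul(P, S), P))])

def solve_rde_backward(A, S, Q, M, t0, t1, steps=4000):
    """Integrate the RDE from P(t1) = M back to t0 with RK4; return P(t0)."""
    h = (t0 - t1) / steps  # negative step size: backward in time
    P = [row[:] for row in M]
    for _ in range(steps):
        k1 = rde_rhs(P, A, S, Q)
        k2 = rde_rhs(lin_comb([(1.0, P), (0.5 * h, k1)]), A, S, Q)
        k3 = rde_rhs(lin_comb([(1.0, P), (0.5 * h, k2)]), A, S, Q)
        k4 = rde_rhs(lin_comb([(1.0, P), (h, k3)]), A, S, Q)
        P = lin_comb([(1.0, P), (h / 6.0, k1), (h / 3.0, k2),
                      (h / 3.0, k3), (h / 6.0, k4)])
    return P

# Hypothetical example data: double integrator, B = (0, 1)^T, R = 1, Q = I, M = 0.
A = [[0.0, 1.0], [0.0, 0.0]]
S = [[0.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
M = [[0.0, 0.0], [0.0, 0.0]]
P0 = solve_rde_backward(A, S, Q, M, t0=0.0, t1=5.0)
print(P0)
```

The computed $ P(t_0)$ should come out symmetric and positive semidefinite, and its diagonal entries should not exceed the corresponding zero-control costs, in agreement with the existence argument of this subsection. For time-varying $ A(\cdot)$ , $ B(\cdot)$ , $ Q(\cdot)$ , $ R(\cdot)$ the right-hand side would have to be evaluated at the appropriate time in each stage, which this sketch does not do.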

Daniel 2010-12-20