
6.1.2 Riccati differential equation

We will now derive a differential equation that the matrix $ P(\cdot)$ defined in (6.10) must satisfy. First, differentiate both sides of the equality (6.6) to obtain

$\displaystyle \dot p^*(t)=-2\dot P(t)x^*(t)-2P(t)\dot x^*(t).$

Next, expand $ \dot p^*$ and $ \dot x^*$ using the canonical equations (6.7) to arrive at

$\displaystyle 2Q(t)x^*(t)-A^T(t)p^*(t)=-2\dot P(t)x^*(t)-2P(t)A(t)x^*(t)-P(t)B(t)R^{-1}(t)B^T(t)p^*(t).$

Applying (6.6) to eliminate $ p^*$ and dividing by 2, we conclude that the equation

$\displaystyle Q(t)x^*(t)+A^T(t)P(t)x^*(t)=-\dot P(t)x^*(t)-P(t)A(t)x^*(t)+P(t)B(t)R^{-1}(t)B^T(t)P(t)x^*(t)$ (6.13)

must hold (for all $ t$ at which $ P(t)$ is defined). Since the initial state $ x_0$ is arbitrary and $ x^*$ is the state of the linear time-varying system given by (6.1) and (6.12), whose transition matrix is nonsingular, the value $ x^*(t)$ can be an arbitrary vector. It follows that $ P$ must be a solution of the matrix differential equation

$\displaystyle \fbox{$\dot P(t)=-P(t)A(t)-A^T(t)P(t)-Q(t)+P(t)B(t)R^{-1}(t)B^T(t)P(t)$}$ (6.14)

which is called the Riccati differential equation (RDE). We already know that the boundary condition for it is specified by (6.11). The solution $ P(t)$ is to be propagated backward in time from $ t=t_1$ ; its global existence remains to be addressed.
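As a concrete illustration (a minimal numerical sketch with hypothetical scalar data, not an example from the text), the backward propagation of $ P$ can be carried out with a standard ODE integrator. In the scalar case with constant coefficients $ a$ , $ b$ , $ q$ , $ r$ , the RDE (6.14) reads $ \dot P=-2aP-q+(b^2/r)P^2$ :

```python
# Backward RK4 integration of the scalar RDE
#   Pdot = -2*a*P - q + (b**2 / r) * P**2,   P(t1) = m.
# All numerical values below are illustrative choices, not data from the text.

def rde_rhs(P, a, b, q, r):
    """Right-hand side of the Riccati differential equation (6.14), scalar case."""
    return -2.0 * a * P - q + (b * b / r) * P * P

def solve_rde_backward(a, b, q, r, m, t0, t1, steps=2000):
    """Propagate P from t = t1 back to t = t0 with classical RK4 (step -h)."""
    h = (t1 - t0) / steps
    P = m                      # boundary condition (6.11): P(t1) = m
    for _ in range(steps):
        k1 = rde_rhs(P, a, b, q, r)
        k2 = rde_rhs(P - 0.5 * h * k1, a, b, q, r)
        k3 = rde_rhs(P - 0.5 * h * k2, a, b, q, r)
        k4 = rde_rhs(P - h * k3, a, b, q, r)
        P -= h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return P

# For a = 0, b = q = r = 1, m = 0 the scalar RDE is Pdot = P**2 - 1,
# whose backward solution is P(t) = tanh(t1 - t).
P0 = solve_rde_backward(a=0.0, b=1.0, q=1.0, r=1.0, m=0.0, t0=0.0, t1=2.0)
```

For this data the computed $ P(0)$ should agree with the closed-form value $ \tanh(t_1)$ , which gives a quick consistency check on the backward integration.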

It is interesting to compare the two descriptions that we now have for $ P(t)$ . The RDE (6.14) is a quadratic matrix differential equation. The formula (6.10), on the other hand, is in terms of the transition matrix $ \Phi$ , which satisfies a linear matrix differential equation $ \frac d{dt}\Phi(t,t_1)=\mathcal H(t)\Phi(t,t_1)$ but has size $ 2n\times 2n$ (while $ P$ is $ n\times n$ ). Ignoring the computational effort involved in computing the matrix inverse in (6.10), we can say that by passing from (6.10) to (6.14) we cut in half the size of the matrix to be solved for, but traded a linear differential equation for a quadratic one. Actually, if we prefer matrix differential equations that are linear rather than quadratic, it is possible to compute $ P(t)$ somewhat more efficiently by solving a linear system of size $ 2n\times n$ , as shown in the next exercise.

Exercise 6.1   Let $ X(t)$ , $ Y(t)$ be $ n\times n$ matrices satisfying the lin... and the boundary condition (6.11).
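One way to carry out the verification suggested by the exercise is the following sketch. The linear system below is written in a form consistent with the sign conventions of (6.14) and may differ in notation from the exercise statement; here $ M$ denotes the terminal matrix from the boundary condition (6.11), $ P(t_1)=M$ .

```latex
% Sketch (assumed form of the linear 2n-by-n system): let X(t), Y(t) solve
%   \dot X = A X - B R^{-1} B^T Y,   \dot Y = -Q X - A^T Y,
%   X(t_1) = I,   Y(t_1) = M,
% and set P := Y X^{-1} wherever X(t) is invertible. Then
\begin{aligned}
\dot P &= \dot Y\,X^{-1} - Y X^{-1}\dot X\,X^{-1}\\
       &= \bigl(-Q X - A^T Y\bigr)X^{-1}
          - P\bigl(A X - B R^{-1} B^T Y\bigr)X^{-1}\\
       &= -Q - A^T P - P A + P B R^{-1} B^T P ,
\end{aligned}
```

which is exactly (6.14), and $ P(t_1)=Y(t_1)X(t_1)^{-1}=M$ recovers the boundary condition (6.11).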

The idea of reducing a quadratic differential equation to a linear one of twice the size is in fact not new to us; we already saw it in Section 2.6.2 in the context of deriving second-order sufficient conditions for optimality in calculus of variations. In the single-degree-of-freedom case, we passed from the first-order quadratic differential equation (2.64) to the second-order linear differential equation (2.67) via the substitution (2.66). In the multiple-degrees-of-freedom setting, scalar variables need to be replaced by matrices but a similar transformation can be applied, as we stated (without including the derivations) at the end of Section 2.6.2. Associating the matrix $ W$ there with the matrix $ P$ here, the reader will readily see the correspondence between that earlier construction and the one given in Exercise 6.1.
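The flavor of that substitution can be conveyed by a generic scalar illustration (a hypothetical example; the exact equations (2.64)-(2.67) are not reproduced here): a quadratic first-order equation becomes linear under a logarithmic-derivative substitution.

```latex
% Generic scalar sketch: the quadratic equation \dot p = p^2 + q(t)
% becomes linear of second order under the substitution p = -\dot w / w:
\dot p \;=\; -\frac{\ddot w}{w} + \frac{\dot w^2}{w^2}
       \;=\; -\frac{\ddot w}{w} + p^2
\qquad\Longrightarrow\qquad
\ddot w + q(t)\,w = 0 .
```

The price of linearity is again a doubling of the order, mirroring the passage from the $ n\times n$ quadratic RDE to the linear system of size $ 2n\times n$ .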

The outcome of applying the necessary conditions of the maximum principle to the LQR problem can now be summarized as follows: a unique candidate for an optimal control is given by the linear feedback law (6.12), where the matrix $ P(t)$ satisfies the RDE (6.14) and the boundary condition (6.11). This is as far as the maximum principle can take us; we need to employ other tools for investigating whether $ P(t)$ exists for all $ t$ and whether the control (6.12) is indeed optimal.
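To see this summary in action, here is a minimal closed-loop sketch with hypothetical scalar data ($ a=0$ , $ b=q=r=1$ , $ m=0$ ; none of these values come from the text). For this data the scalar RDE (6.14) with boundary condition $ P(t_1)=0$ has the closed-form solution $ P(t)=\tanh(t_1-t)$ , which the sketch uses directly in the feedback law (6.12), $ u^*=-R^{-1}B^TPx^*$ .

```python
import math

# Illustrative scalar data (not from the text): a = 0, b = q = r = 1, m = 0.
# The scalar RDE solution is P(t) = tanh(t1 - t).

def P(t, t1):
    """Closed-form scalar RDE solution for this particular data."""
    return math.tanh(t1 - t)

def closed_loop_rhs(t, x, t1):
    # Feedback law (6.12) in scalar form: u* = -(b/r) P(t) x = -P(t) x,
    # so the closed-loop dynamics are xdot = a x + b u* = -P(t) x.
    return -P(t, t1) * x

def simulate(x0, t1, steps=4000):
    """RK4 forward integration of the closed-loop system on [0, t1]."""
    h = t1 / steps
    t, x = 0.0, x0
    for _ in range(steps):
        k1 = closed_loop_rhs(t, x, t1)
        k2 = closed_loop_rhs(t + 0.5 * h, x + 0.5 * h * k1, t1)
        k3 = closed_loop_rhs(t + 0.5 * h, x + 0.5 * h * k2, t1)
        k4 = closed_loop_rhs(t + h, x + h * k3, t1)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += h
    return x

# Exact closed-loop solution: x(t) = x0 * cosh(t1 - t) / cosh(t1).
x_final = simulate(x0=1.0, t1=3.0)
```

The simulated terminal state should match the exact value $ x_0\cosh(0)/\cosh(t_1)=1/\cosh(3)$ , illustrating how the time-varying gain steers the state toward zero over the finite horizon.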

Daniel 2010-12-20