Next: 6.2.3 Closed-loop stability Up: 6.2 Infinite-horizon LQR problem Previous: 6.2.1 Existence and properties   Contents   Index


6.2.2 Infinite-horizon problem and its solution

We have now prepared the ground for taking the limit as $ t_1\to\infty$ and considering the infinite-horizon LQR problem with the cost

$\displaystyle J(u)=\int_{t_0}^\infty \left(x^T(t)Qx(t)+u^T(t)Ru(t)\right)dt.$ (6.29)

Recalling the optimal cost (6.26) and the optimal control (6.24) for the finite-horizon case, and passing to the limit as $ t_1\to\infty$ , it is natural to guess that the infinite-horizon optimal cost and optimal control will be

$\displaystyle V (x_0)=x_0^TP x_0$ (6.30)

and

$\displaystyle u^* (t)=-R^{-1}B^TP x^* (t)$ (6.31)

where $ P$ is the matrix limit (6.27) which satisfies the ARE (6.28). Note that the quadratic cost (6.30) is independent of $ t_0$ and the linear feedback law (6.31) is time-invariant, which is consistent with the problem formulation and with our earlier findings in Section 5.1.3. Still, optimality of (6.31) is far from obvious, and it is not even clear whether a control yielding a bounded cost exists. Strictly speaking, the use of the asterisks in (6.31) is not yet justified; at this point, $ x^*$ is simply the trajectory of the system under the action of the feedback law (6.31) with $ x^*(t_0)=x_0$ .
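
To make the guess concrete, here is a minimal sketch for a scalar system $ \dot x=ax+bu$ with cost weights $ q$ and $ r$ , where the ARE reduces to the quadratic equation $ 2ap+q-b^2p^2/r=0$ . The numerical values of `a`, `b`, `q`, `r`, `x0` and the helper name `scalar_are_solution` are illustrative assumptions, not taken from the text.

```python
import math

def scalar_are_solution(a, b, q, r):
    """Nonnegative root p of the scalar ARE 2*a*p + q - (b**2/r)*p**2 = 0.

    Assumes b != 0, q >= 0 and r > 0 (hypothetical scalar data).
    """
    lam = math.sqrt(a * a + b * b * q / r)  # decay rate of the closed loop
    return r * (a + lam) / (b * b)

# Illustrative plant xdot = a*x + b*u and weights (assumed values).
a, b, q, r = 1.0, 1.0, 3.0, 1.0
x0 = 2.0

p = scalar_are_solution(a, b, q, r)
k = b * p / r                     # scalar version of R^{-1} B^T P, so u = -k*x
residual = 2 * a * p + q - (b * b / r) * p * p
closed_loop = a - b * k           # scalar version of A - B R^{-1} B^T P

print("p =", p)
print("ARE residual =", residual)
print("gain k =", k, " closed-loop coefficient =", closed_loop)
print("guessed optimal cost p*x0^2 =", p * x0 * x0)
```

For these values the open-loop system is unstable ($ a>0$ ), yet the guessed feedback gives a negative closed-loop coefficient, and the guessed cost depends only on $ x_0$ .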

We now show that the above guess is indeed correct. Consider the function $ \widehat V (x):=x^TP x$ . Its derivative along the trajectory $ x^*$ is

$\displaystyle \frac d{dt}\widehat V (x^* (t))$ $\displaystyle = (x^* )^T(t)P (A-BR^{-1}B^TP )x^* (t)+ (x^* )^T(t)(A^T-P BR^{-1}B^T)P x^* (t)$    
  $\displaystyle = (x^* )^T(t)(P A+A^TP -2P BR^{-1}B^TP ) x^* (t)$    
  $\displaystyle = -(x^* )^T(t)(Q+P BR^{-1}B^TP ) x^* (t)$    

where the last equality follows from the ARE (6.28). We can then calculate the portion of the corresponding cost over an arbitrary finite interval $ [t_0,T]$ to be

  $\displaystyle \int_{t_0}^T \left((x^* )^T(t)Qx^* (t)+ (u^* )^T(t)Ru^* (t)\right)dt= \int_{t_0}^T (x^* )^T(t)(Q+P BR^{-1}B^TP )x^* (t)dt$    
  $\displaystyle =-\int _{t_0}^T \frac d{dt}\widehat V (x^* (t))dt=\widehat V (x_0)- \widehat V (x^* (T))= x^T_0P x_0-(x^* )^T(T)P x^* (T)\le x^T_0P x_0$    

where the last inequality follows from the fact that $ P \ge 0$ . Taking the limit as $ T\to\infty$ , we obtain

$\displaystyle J(u^*)\le x^T_0P x_0.$ (6.32)
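
The finite-interval identity behind this bound can be checked numerically. The sketch below again uses an assumed scalar example with $ t_0=0$ : it integrates the running cost along the closed-loop trajectory by Simpson's rule and compares the result with $ \widehat V(x_0)-\widehat V(x^*(T))$ . All numerical values and the helper names are hypothetical.

```python
import math

# Illustrative scalar data (assumed): xdot = a*x + b*u, cost weights q, r.
a, b, q, r = 1.0, 1.0, 3.0, 1.0
x0 = 2.0

lam = math.sqrt(a * a + b * b * q / r)
p = r * (a + lam) / (b * b)       # nonnegative root of the scalar ARE
k = b * p / r                     # feedback gain, u = -k*x

def x_star(t):
    # Closed-loop trajectory of xdot = (a - b*k)*x with x(0) = x0.
    return x0 * math.exp((a - b * k) * t)

def running_cost(t):
    x = x_star(t)
    u = -k * x
    return q * x * x + r * u * u

def simpson(f, lo, hi, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

bound = p * x0 * x0
results = []
for T in (0.5, 1.0, 2.0, 5.0):
    cost_T = simpson(running_cost, 0.0, T)
    predicted = bound - p * x_star(T) ** 2   # V(x0) - V(x*(T))
    results.append((T, cost_T, predicted))
    print(f"T={T}: integral={cost_T:.6f}  V(x0)-V(x*(T))={predicted:.6f}  bound={bound:.6f}")
```

In this example the partial costs increase with $ T$ and stay below $ px_0^2$ , as the inequality requires.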

In particular, we can now be sure that the infinite-horizon problem is well posed, because $ u^*$ gives a bounded cost. On the other hand, consider another trajectory $ x$ with the same initial condition corresponding to an arbitrary control $ u$ . Since $ x_0^TP(t_0,t_1)x_0$ is the finite-horizon optimal cost, we have for every finite $ t_1$ that

$\displaystyle x_0^TP(t_0,t_1)x_0$ $\displaystyle \le \int_{t_0}^{t_1} \left(x^T(t)Qx(t)+u^T(t)Ru(t)\right)dt\le \int_{t_0}^\infty \left(x^T(t)Qx(t)+u^T(t)Ru(t)\right)dt=J(u)$    

where the second inequality relies on the positive (semi)definiteness of $ Q$ and $ R$ . Passing to the limit as $ t_1\to\infty$ yields

$\displaystyle x_0^TP x_0\le J(u).$
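
The limit passage can be illustrated on a scalar example by integrating the Riccati differential equation in the time-to-go $ s=t_1-t_0$ . The sketch assumes zero terminal cost, so that $ p(0)=0$ , and uses a fixed-step fourth-order Runge-Kutta scheme; the numerical values, the step size and the function names are hypothetical.

```python
import math

# Illustrative scalar data (assumed): xdot = a*x + b*u, cost weights q, r.
a, b, q, r = 1.0, 1.0, 3.0, 1.0

# Nonnegative root of the scalar ARE, i.e. the candidate limit.
p_inf = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def rde_rhs(p):
    # Scalar Riccati differential equation in time-to-go: dp/ds = 2ap + q - b^2 p^2 / r.
    return 2 * a * p + q - (b * b / r) * p * p

def finite_horizon_p(horizon, steps_per_unit=1000):
    """Approximate p(t0, t1) for t1 - t0 = horizon, starting from p = 0 (RK4)."""
    n = max(1, int(round(horizon * steps_per_unit)))
    h = horizon / n
    p = 0.0
    for _ in range(n):
        k1 = rde_rhs(p)
        k2 = rde_rhs(p + 0.5 * h * k1)
        k3 = rde_rhs(p + 0.5 * h * k2)
        k4 = rde_rhs(p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p

horizons = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
values = [finite_horizon_p(s) for s in horizons]
for s, p_s in zip(horizons, values):
    print(f"t1 - t0 = {s}: p = {p_s:.9f}  (limit {p_inf:.9f})")
```

The computed finite-horizon values grow with the horizon and approach the ARE solution from below, which is the scalar picture of $ x_0^TP(t_0,t_1)x_0$ increasing to $ x_0^TPx_0$ .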

Comparing the inequality $ x_0^TP x_0\le J(u)$ with (6.32) and remembering that $ u$ was arbitrary (and could in particular be equal to $ u^*$ ), we see that

$\displaystyle J(u^*)= x^T_0P x_0\le J(u)\qquad \forall\,u$

hence $ x_0^TP x_0$ is the infinite-horizon optimal cost and $ u^*$ is an optimal control, as claimed.
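
As a sanity check of this conclusion, the sketch below evaluates the infinite-horizon cost of several constant-gain feedbacks $ u=-kx$ on an assumed scalar example. It samples only linear time-invariant controls, so it illustrates the result rather than verifying it for arbitrary $ u$ ; the numerical values and the function name `cost_of_gain` are hypothetical.

```python
import math

# Illustrative scalar data (assumed): xdot = a*x + b*u, weights q > 0, r > 0.
a, b, q, r = 1.0, 1.0, 3.0, 1.0
x0 = 2.0

p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)  # scalar ARE solution
k_star = b * p / r                                        # gain of the candidate optimal feedback
optimal_cost = p * x0 * x0

def cost_of_gain(k):
    """Infinite-horizon cost of u = -k*x starting from x(0) = x0 != 0.

    The closed loop is x(t) = x0*exp((a - b*k)*t), so the integral of
    (q + r*k^2)*x(t)^2 is finite only when b*k - a > 0.
    """
    rate = b * k - a
    if rate <= 0:
        return math.inf
    return (q + r * k * k) * x0 * x0 / (2 * rate)

gains = [0.5, 1.5, 2.0, 2.5, k_star, 3.5, 5.0, 10.0]
costs = [cost_of_gain(k) for k in gains]
for k, c in zip(gains, costs):
    print(f"k = {k:5.2f}: cost = {c:.6f}  (p*x0^2 = {optimal_cost:.6f})")
```

Every sampled gain gives a cost of at least $ px_0^2$ , with equality at the gain $ bp/r$ obtained from the ARE solution; a gain too small to stabilize the system gives an infinite cost.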


Daniel 2010-12-20