6.2.2 Infinite-horizon problem and its solution


We have now prepared the ground for taking the limit as $t_1\to\infty$ and considering the infinite-horizon LQR problem with the cost

$\displaystyle J(u)=\int_{t_0}^\infty \left(x^T(t)Qx(t)+u^T(t)Ru(t)\right)dt.$

(6.29)

Recalling the optimal cost (6.26) and the optimal control (6.24) for the finite-horizon case, and passing to the limit as $t_1\to\infty$, it is natural to guess that the infinite-horizon optimal cost and optimal control will be

$\displaystyle V (x_0)=x_0^TP x_0$

(6.30)

and

$\displaystyle u^* (t)=-R^{-1}B^TP x^* (t)$

(6.31)

where $P$ is the matrix limit (6.27), which satisfies the ARE (6.28). Note that the quadratic cost (6.30) is independent of $t_0$ and the linear feedback law (6.31) is time-invariant, which is consistent with the problem formulation and with our earlier findings in Section 5.1.3. Still, optimality of (6.31) is far from obvious, and it is not even clear whether a control yielding a bounded cost exists. Strictly speaking, the use of the asterisks in (6.31) is not yet justified; at this point, $x^*$ is simply the trajectory of the system under the action of the feedback law (6.31) with initial condition $x^*(t_0)=x_0$.
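To make the guessed cost and feedback law concrete, here is a minimal Python sketch for the scalar case $\dot x=ax+bu$ with running cost $qx^2+ru^2$. It assumes the ARE has the standard form $PA+A^TP+Q-PBR^{-1}B^TP=0$, which for scalars reduces to $2ap+q-b^2p^2/r=0$. The function name `scalar_lqr` and the numerical data are hypothetical and chosen only for illustration.

```python
import math

def scalar_lqr(a, b, q, r):
    """Return (p, k) for x' = a*x + b*u with running cost q*x^2 + r*u^2.

    p is the nonnegative root of the scalar ARE 2*a*p + q - (b*p)**2 / r = 0,
    and k = b*p / r is the gain in the candidate feedback law u = -k*x.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("this sketch assumes b != 0, q > 0, r > 0")
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return p, b * p / r

# Hypothetical data; the open-loop system is unstable because a > 0.
a, b, q, r = 1.0, 2.0, 3.0, 0.5
p, k = scalar_lqr(a, b, q, r)

are_residual = 2.0 * a * p + q - (b * p) ** 2 / r  # should vanish
closed_loop_pole = a - b * k                       # x' = (a - b*k) x under u = -k*x
x0 = 1.5

print("p =", p, " k =", k)
print("ARE residual =", are_residual)
print("closed-loop pole =", closed_loop_pole)
print("candidate cost p*x0^2 =", p * x0 * x0)
```

For these numbers the sketch gives $p=0.75$ and $k=3$, so the candidate feedback moves the pole from $1$ to $-5$. This only illustrates the guess; it does not prove optimality.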

We now show that the above guess is indeed correct. Consider the function $\widehat V (x):=x^TP x$ . Its derivative along the trajectory is

$\displaystyle \begin{aligned}
\frac d{dt}\widehat V (x^* (t)) &= (x^* )^T(t)P (A-BR^{-1}B^TP )x^* (t)+ (x^* )^T(t)(A^T-P BR^{-1}B^T)P x^* (t)\\
&= (x^* )^T(t)(P A+A^TP -2P BR^{-1}B^TP ) x^* (t)\\
&= -(x^* )^T(t)(Q+P BR^{-1}B^TP ) x^* (t)
\end{aligned}$

where the last equality follows from the ARE (6.28). We can then calculate the portion of the corresponding cost over an arbitrary finite interval $[t_0,T]$ to be

$\displaystyle \begin{aligned}
\int_{t_0}^T \left((x^* )^T(t)Qx^* (t)+ (u^* )^T(t)Ru^* (t)\right)dt &= \int_{t_0}^T (x^* )^T(t)(Q+P BR^{-1}B^TP )x^* (t)\,dt\\
&= -\int_{t_0}^T \frac d{dt}\widehat V (x^* (t))\,dt=\widehat V (x_0)- \widehat V (x^* (T))\\
&= x^T_0P x_0-(x^* )^T(T)P x^* (T)\le x^T_0P x_0
\end{aligned}$
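The finite-interval identity above can be checked numerically. The following Python sketch does so for a double integrator $\dot x_1=x_2$, $\dot x_2=u$ with $Q=I$ and $R=1$; this example is our own choice and does not come from the text. For this data one can verify by hand that $P=\begin{pmatrix}\sqrt3&1\\1&\sqrt3\end{pmatrix}$ solves the ARE, so that $R^{-1}B^TP=(1\ \ \sqrt3)$. The helper names, the initial state, and the fixed-step Runge-Kutta integrator are all illustrative assumptions.

```python
import math

SQRT3 = math.sqrt(3.0)

def v_hat(x1, x2):
    """Candidate value function x^T P x with P = [[sqrt3, 1], [1, sqrt3]]."""
    return SQRT3 * x1 * x1 + 2.0 * x1 * x2 + SQRT3 * x2 * x2

def rhs(state):
    """Closed-loop dynamics, augmented with the running cost as a third state."""
    x1, x2, _ = state
    u = -(x1 + SQRT3 * x2)  # u = -R^{-1} B^T P x for this example
    return (x2, u, x1 * x1 + x2 * x2 + u * u)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(x0, horizon, steps=2000):
    """Integrate over [t0, t0 + horizon]; return (x1(T), x2(T), accumulated cost)."""
    state = (x0[0], x0[1], 0.0)
    h = horizon / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

x0 = (1.0, -0.5)  # hypothetical initial state
x1_T, x2_T, cost_T = simulate(x0, horizon=3.0)

# The identity says: cost on [t0, T] = V_hat(x0) - V_hat(x(T)) <= V_hat(x0).
gap = cost_T - (v_hat(*x0) - v_hat(x1_T, x2_T))
print(f"cost on [t0, T]          : {cost_T:.6f}")
print(f"V_hat(x0) - V_hat(x(T))  : {v_hat(*x0) - v_hat(x1_T, x2_T):.6f}")
print(f"absolute discrepancy     : {abs(gap):.2e}")
```

The discrepancy should be at the level of the integrator's truncation error, and the accumulated cost stays below $x_0^TPx_0$ for every horizon because $P$ is positive semidefinite.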