
## 6.1.1 Candidate optimal feedback law

We begin our analysis of the LQR problem by inspecting the necessary conditions for optimality provided by the maximum principle. After some further manipulation, these conditions will reveal that an optimal control must be a linear state feedback law. The Hamiltonian is given by

$$H(t,x,u,p) = p^T\big(A(t)x + B(t)u\big) - x^TQ(t)x - u^TR(t)u.$$

Note that, compared to the general formula (4.40) for the Hamiltonian, we took the abnormal multiplier $p_0^*$ to be equal to $-1$. This is no loss of generality because, for the present free-endpoint problem in the Bolza form, a combination of the results in Section 4.3.1 would give us the transversality condition $p^*(t_1) = 2p_0^*Mx^*(t_1)$ which, in light of the nontriviality condition, guarantees that $p_0^* \ne 0$. It is also useful to observe that the LQR problem can be adequately treated with the variational approach of Section 3.4, which yields essentially the same necessary conditions as the maximum principle but without the abnormal multiplier appearing. Indeed, the control is unconstrained, the final state is free, and the Hamiltonian is quadratic--hence twice differentiable--in $u$; therefore, the technical issues discussed in Section 3.4.5 do not arise here. In Section 3.4 we proved that along an optimal trajectory we must have $H_u = 0$ and $H_{uu} \le 0$, which is in general different from the Hamiltonian maximization condition, but in the present LQR setting this difference disappears as we will see in a moment. In fact, when solving part a) of Exercise 3.8, the reader should have already written down the necessary conditions for optimality from Section 3.4.3 for the LQR problem. We will now rederive these necessary conditions and examine their consequences in more detail.

The gradient of $H$ with respect to $u$ is $H_u = B^T(t)p - 2R(t)u$, and along an optimal trajectory it must vanish. Using our assumption that $R(t)$ is invertible for all $t$, we can solve the resulting equation for $u$ and conclude that an optimal control (if it exists) must satisfy

$$u^*(t) = \frac{1}{2}R^{-1}(t)B^T(t)p^*(t). \qquad (6.3)$$

Moreover, since $H_{uu} = -2R(t) < 0$, the above control indeed maximizes the Hamiltonian (globally). We see that (6.3) is the unique control satisfying the necessary conditions, although we have not yet verified that it is optimal.
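Since the Hamiltonian is a concave quadratic in $u$, the maximization claim is easy to check numerically. The following sketch uses made-up random problem data (all matrices here are illustrative, not from the text) and verifies that the candidate control from (6.3) beats arbitrary perturbations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

# Illustrative problem data; R must be symmetric positive definite.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
Q = np.eye(n)
S = rng.standard_normal((m, m))
R = S @ S.T + m * np.eye(m)          # symmetric positive definite

x = rng.standard_normal(n)           # state at some fixed time
p = rng.standard_normal(n)           # costate at the same time

def H(u):
    # Hamiltonian H = p^T (Ax + Bu) - x^T Q x - u^T R u
    return p @ (A @ x + B @ u) - x @ Q @ x - u @ R @ u

# Candidate maximizer from (6.3): u* = (1/2) R^{-1} B^T p
u_star = 0.5 * np.linalg.solve(R, B.T @ p)

# H is concave in u with H_uu = -2R < 0, so u* is the unique global maximizer.
for _ in range(100):
    u = u_star + rng.standard_normal(m)
    assert H(u) <= H(u_star) + 1e-9
```

The stationarity condition $H_u = B^Tp - 2Ru = 0$ is what `np.linalg.solve` inverts; concavity in $u$ then upgrades stationarity to a global maximum.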

Since the formula (6.3) expresses $u^*$ in terms of the costate $p^*$, let us look at $p^*$ more closely. It satisfies the adjoint equation

$$\dot p^*(t) = 2Q(t)x^*(t) - A^T(t)p^*(t) \qquad (6.4)$$

with the boundary condition

$$p^*(t_1) = -2Mx^*(t_1) \qquad (6.5)$$

(see Section 4.3.1), where $x^*$ is the optimal state trajectory. Our next goal is to show that a linear relation of the form

$$p^*(t) = -2P(t)x^*(t) \qquad (6.6)$$

holds for all $t$ and not just for $t = t_1$, where $P(\cdot)$ is some matrix-valued function to be determined. Putting together the dynamics (6.1) of the state, the control law (6.3), and the dynamics (6.4) of the costate, we can write the system of canonical equations in the following combined closed-loop form:

$$\begin{pmatrix}\dot x^* \\ \dot p^*\end{pmatrix} = \begin{pmatrix} A(t) & \frac{1}{2}B(t)R^{-1}(t)B^T(t) \\ 2Q(t) & -A^T(t)\end{pmatrix}\begin{pmatrix}x^* \\ p^*\end{pmatrix} \qquad (6.7)$$

The matrix on the right-hand side of (6.7) is sometimes called the Hamiltonian matrix. Let us denote the transition matrix for the linear time-varying system (6.7) by $\Phi(\cdot,\cdot)$. Then we have, in particular,

$$\begin{pmatrix}x^*(t) \\ p^*(t)\end{pmatrix} = \Phi(t,t_1)\begin{pmatrix}x^*(t_1) \\ p^*(t_1)\end{pmatrix};$$

here $\Phi(t,t_1)$ propagates the solutions backward in time from $t_1$ to $t$. Partitioning $\Phi(t,t_1)$ into blocks as

$$\Phi(t,t_1) = \begin{pmatrix}\Phi_{11}(t,t_1) & \Phi_{12}(t,t_1) \\ \Phi_{21}(t,t_1) & \Phi_{22}(t,t_1)\end{pmatrix}$$

gives the more detailed relations

$$x^*(t) = \Phi_{11}(t,t_1)x^*(t_1) + \Phi_{12}(t,t_1)p^*(t_1), \qquad p^*(t) = \Phi_{21}(t,t_1)x^*(t_1) + \Phi_{22}(t,t_1)p^*(t_1)$$

which, in view of the terminal condition (6.5), can be written as

$$x^*(t) = \big(\Phi_{11}(t,t_1) - 2\Phi_{12}(t,t_1)M\big)x^*(t_1) \qquad (6.8)$$

$$p^*(t) = \big(\Phi_{21}(t,t_1) - 2\Phi_{22}(t,t_1)M\big)x^*(t_1) \qquad (6.9)$$

Solving (6.8) for $x^*(t_1)$ and plugging the result into (6.9), we obtain

$$p^*(t) = \big(\Phi_{21}(t,t_1) - 2\Phi_{22}(t,t_1)M\big)\big(\Phi_{11}(t,t_1) - 2\Phi_{12}(t,t_1)M\big)^{-1}x^*(t)$$

We have thus established (6.6) with

$$P(t) = -\frac{1}{2}\big(\Phi_{21}(t,t_1) - 2\Phi_{22}(t,t_1)M\big)\big(\Phi_{11}(t,t_1) - 2\Phi_{12}(t,t_1)M\big)^{-1} \qquad (6.10)$$
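When the system is time-invariant, the transition matrix of (6.7) is simply a matrix exponential, so the objects in this derivation are easy to compute. A minimal sketch, assuming constant illustrative data $A$, $B$, $Q$, $R$ (not taken from the text):

```python
import numpy as np
from scipy.linalg import expm

n = 2
# Illustrative time-invariant data (the text allows time-varying matrices).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(n)
R = np.array([[1.0]])

# Hamiltonian matrix of the canonical system (6.7)
Hmat = np.block([
    [A,       0.5 * B @ np.linalg.inv(R) @ B.T],
    [2.0 * Q, -A.T],
])

t1 = 1.0

def Phi(t, s):
    # Transition matrix of the LTI canonical system: Phi(t, s) = expm(Hmat (t - s))
    return expm(Hmat * (t - s))

# Phi(t1, t1) is the identity, and Phi(t, t1) with t < t1 propagates
# solutions backward in time from t1 to t.
assert np.allclose(Phi(t1, t1), np.eye(2 * n))
# Composition (semigroup) property of the transition matrix:
assert np.allclose(Phi(0.3, t1) @ Phi(t1, 0.3), np.eye(2 * n))
```

In the time-varying case `expm` no longer applies and $\Phi$ must be obtained by integrating (6.7) numerically, which is one reason formula (6.10) is of limited computational use.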

A couple of remarks are in order. First, we have not yet justified the existence of the inverse in the definition (6.10) of $P(t)$. For now, we note that $\Phi(t_1,t_1) = I$, hence $\Phi_{11}(t_1,t_1) = I$, $\Phi_{12}(t_1,t_1) = 0$, and so $\Phi_{11}(t_1,t_1) - 2\Phi_{12}(t_1,t_1)M = I$. By continuity, $\Phi_{11}(t,t_1) - 2\Phi_{12}(t,t_1)M$ stays invertible for $t$ close enough to $t_1$, which means that $P(t)$ is well defined at least for $t$ near $t_1$. Second, the minus sign and the factor of $\frac{1}{2}$ in (6.10), which stem from the factor of $-2$ in (6.6), appear to be somewhat arbitrary at this point. We see from (6.5) and (6.6) that

$$P(t_1) = M \qquad (6.11)$$

The reason for the above conventions will become clear later, when we show that $P(t)$ is symmetric positive semidefinite for all $t$ (not just for $t = t_1$) and is directly related to the optimal cost.
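Formula (6.10) and the boundary condition (6.11) can be checked numerically in the time-invariant case. The sketch below (illustrative data, not from the text) builds $P(t)$ from the blocks of $\Phi(t,t_1)$ and confirms $P(t_1) = M$ together with the symmetry just announced:

```python
import numpy as np
from scipy.linalg import expm

n = 2
# Illustrative constant data; M is the symmetric PSD terminal-cost weight.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(n)
R = np.array([[1.0]])
M = np.diag([2.0, 0.5])

Hmat = np.block([[A,       0.5 * B @ np.linalg.inv(R) @ B.T],
                 [2.0 * Q, -A.T]])
t1 = 1.0

def P(t):
    # P(t) from (6.10), built from the blocks of Phi(t, t1) = expm(Hmat (t - t1))
    Phi = expm(Hmat * (t - t1))
    F11, F12 = Phi[:n, :n], Phi[:n, n:]
    F21, F22 = Phi[n:, :n], Phi[n:, n:]
    return -0.5 * (F21 - 2.0 * F22 @ M) @ np.linalg.inv(F11 - 2.0 * F12 @ M)

# Boundary condition (6.11): P(t1) = M.
assert np.allclose(P(t1), M)

# Symmetry and positive semidefiniteness (established later in the text)
# hold at sampled times for this example:
for t in np.linspace(0.6, 1.0, 5):
    Pt = P(t)
    assert np.allclose(Pt, Pt.T)
    assert np.all(np.linalg.eigvalsh(Pt) >= -1e-8)
```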

Combining (6.3) and (6.6), we deduce that the optimal control must take the form

$$u^*(t) = -R^{-1}(t)B^T(t)P(t)x^*(t) \qquad (6.12)$$

which, as we announced earlier, is a linear state feedback law. This is a remarkable conclusion, as it shows that the optimal closed-loop system must be linear. Note that the feedback gain $R^{-1}(t)B^T(t)P(t)$ in (6.12) is time-varying even if the system (6.1) is time-invariant, because $P$ from (6.10) is always time-varying. We remark that we could just as easily derive an open-loop formula for $u^*$ by writing (6.8) with $t_0$ in place of $t$, i.e., $x^*(t_0) = \big(\Phi_{11}(t_0,t_1) - 2\Phi_{12}(t_0,t_1)M\big)x^*(t_1)$, solving it for $x^*(t_1)$ and plugging the result into (6.9) to arrive at

$$p^*(t) = \big(\Phi_{21}(t,t_1) - 2\Phi_{22}(t,t_1)M\big)\big(\Phi_{11}(t_0,t_1) - 2\Phi_{12}(t_0,t_1)M\big)^{-1}x^*(t_0)$$

(provided that the inverse exists), and then using this expression in (6.3). However, the feedback form (6.12) of $u^*$ is theoretically revealing and leads to a more compact description of the closed-loop system.
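The consistency of the feedback and open-loop descriptions amounts to the identity $p^*(t) = -2P(t)x^*(t)$ along the trajectory generated by (6.8) and (6.9). A numerical sketch (time-invariant illustrative data, not from the text) confirming this along a sampled trajectory:

```python
import numpy as np
from scipy.linalg import expm

n = 2
# Illustrative constant data, as in the earlier sketches.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(n)
R = np.array([[1.0]])
M = np.diag([2.0, 0.5])
t0, t1 = 0.0, 1.0

Hmat = np.block([[A,       0.5 * B @ np.linalg.inv(R) @ B.T],
                 [2.0 * Q, -A.T]])

def blocks(t):
    # Block partition of the transition matrix Phi(t, t1)
    Phi = expm(Hmat * (t - t1))
    return Phi[:n, :n], Phi[:n, n:], Phi[n:, :n], Phi[n:, n:]

x0 = np.array([1.0, -1.0])

# Solve (6.8) written at t = t0 for the terminal state x*(t1).
F11, F12, F21, F22 = blocks(t0)
x_t1 = np.linalg.solve(F11 - 2.0 * F12 @ M, x0)

for t in np.linspace(t0, t1, 6):
    F11, F12, F21, F22 = blocks(t)
    x_t = (F11 - 2.0 * F12 @ M) @ x_t1          # (6.8)
    p_t = (F21 - 2.0 * F22 @ M) @ x_t1          # (6.9)
    P_t = -0.5 * (F21 - 2.0 * F22 @ M) @ np.linalg.inv(F11 - 2.0 * F12 @ M)
    # (6.6): p*(t) = -2 P(t) x*(t), so the open-loop expression for u*
    # and the feedback law (6.12) produce the same control.
    assert np.allclose(p_t, -2.0 * P_t @ x_t)
```

The identity holds wherever the inverse in (6.10) exists; here the loop also passes through $t = t_1$, where it reduces to the boundary conditions (6.5) and (6.11).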

As we said, there are two things that we still need to check: optimality of the control that we found, and global existence of the matrix $P(t)$. These issues will be tackled in Sections 6.1.3 and 6.1.4, respectively. But first, we want to obtain a nicer description of the matrix $P(t)$, as the formula (6.10) is rather clumsy and not very useful (since calculating the transition matrix $\Phi$ analytically is in general impossible).

Daniel 2010-12-20