5.3.3 HJB equation and the value function

Let us finally go back to our fixed-time optimal control problem from Section 5.1.2 and its HJB equation (5.10) with the boundary condition (5.3), with the goal of resolving the difficulty identified at the end of Section 5.2.1. Specifically, our hope is that the value function (5.2) is a solution of the HJB equation in the viscosity sense. In order to more closely match the PDE (5.34), we first rewrite the HJB equation as

$$
-V_t(t,x) \;-\; \inf_{u\in U}\bigl\{\, L(t,x,u) + \bigl\langle V_x(t,x),\, f(t,x,u) \bigr\rangle \,\bigr\} \;=\; 0. \tag{5.38}
$$

As we saw in the previous subsection, flipping the sign in a PDE affects its viscosity solutions; it will turn out that the above sign convention is the correct one. The PDE (5.38) is still not in the form (5.34) because it contains the additional independent variable $t$. However, the concepts and results of Sections 5.3.1 and 5.3.2 extend to this time-space setting without difficulties. Alternatively, we can absorb $t$ into $x$ by introducing the extra state variable $x_{n+1} := t$, a trick already familiar to us from Chapters 3 and 4. Then the PDE (5.38) takes the form (5.34) with the obvious definition of the function $F$, except that the domain of $V$ is the subset $[t_0, t_1] \times \mathbb{R}^n$ of $\mathbb{R}^{n+1}$.
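Concretely, if the generic PDE (5.34) reads $F\bigl(y, v(y), \nabla v(y)\bigr) = 0$, then with the combined state $y = (t, x)$ one natural choice of $F$ (our notation; the definition is left implicit above) is

$$
F\bigl((t,x),\, r,\, (q_0, q)\bigr) \;:=\; -q_0 \;-\; \inf_{u\in U}\bigl\{\, L(t,x,u) + \langle q,\, f(t,x,u)\rangle \,\bigr\},
$$

so that (5.38) is exactly $F\bigl((t,x),\, V(t,x),\, (V_t(t,x), V_x(t,x))\bigr) = 0$; note that this $F$ happens not to depend on its second argument $r$.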
Now the theory of viscosity solutions can be applied to the HJB equation. Under suitable technical assumptions on the functions $f$, $L$, and $K$, we have the following **main result**: *The value function $V$ is a unique viscosity solution of the HJB equation (5.38) with the boundary condition (5.3). It is also locally Lipschitz* (but, as we know from Section 5.2.1, not necessarily $\mathcal{C}^1$). Regarding the technical assumptions, we will not list them here (see the references in Section 5.4) but we mention that they are satisfied if, for example, $f$, $L$, and $K$ are uniformly continuous, $f_x$, $L_x$, and $K_x$ are bounded, and $U$ is a compact set.
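Since the main result is abstract, a minimal numerical sanity check may help. The following toy problem is our own illustration (not from the text): its value function happens to be $\mathcal{C}^1$, so the HJB equation (5.38) holds classically, hence also in the viscosity sense, and we can verify it directly by finite differences.

```python
# Toy fixed-time problem (our own illustrative example, not from the text):
#   dynamics x' = u with u in U = [-1, 1], running cost L = 0,
#   terminal cost K(x) = x^2, final time t1 = 1.
# Steering x toward 0 at maximal speed is optimal, so the value function is
#   V(t, x) = (max(|x| - (t1 - t), 0))^2,
# which here happens to be C^1, and the HJB equation (5.38) reduces to
#   -V_t - inf_{u in U} { L + V_x * f } = -V_t + |V_x| = 0.

t1 = 1.0

def V(t, x):
    return max(abs(x) - (t1 - t), 0.0) ** 2

def hjb_residual(t, x, h=1e-6):
    """Finite-difference residual of -V_t + |V_x| at (t, x)."""
    V_t = (V(t + h, x) - V(t - h, x)) / (2 * h)
    V_x = (V(t, x + h) - V(t, x - h)) / (2 * h)
    return -V_t + abs(V_x)

# The residual vanishes (up to round-off) away from the kink surface
# |x| = t1 - t, and the boundary condition V(t1, x) = K(x) holds exactly.
for (t, x) in [(0.2, 1.5), (0.5, -2.0), (0.3, 0.1), (0.9, -0.05)]:
    r = hjb_residual(t, x)
    print(f"t={t}, x={x}: HJB residual = {r:.2e}")
    assert abs(r) < 1e-5
assert V(t1, 2.0) == 2.0 ** 2
```

Of course, the interesting cases are precisely those where $V$ fails to be $\mathcal{C}^1$; there a pointwise finite-difference check breaks down at the kink, and the viscosity framework takes over.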

We will not attempt to establish the uniqueness and the Lipschitz property, but we do want to understand why $V$ is a viscosity solution of (5.38). To this end, let us prove that $V$ is a viscosity *sub*solution. The additional technical assumptions cited above are actually not needed for this claim. Fix an arbitrary pair $(\bar t, \bar x) \in [t_0, t_1) \times \mathbb{R}^n$. We need to show that for every $\mathcal{C}^1$ test function $\varphi = \varphi(t,x)$ such that $\varphi - V$ attains a local minimum at $(\bar t, \bar x)$, the inequality

$$
-\varphi_t(\bar t, \bar x) \;-\; \inf_{u\in U}\bigl\{\, L(\bar t, \bar x, u) + \bigl\langle \varphi_x(\bar t, \bar x),\, f(\bar t, \bar x, u) \bigr\rangle \,\bigr\} \;\le\; 0
$$

must be satisfied. Suppose that, on the contrary, there exist a $\mathcal{C}^1$ function $\varphi$ and a control value $\bar u \in U$ such that

$$
\varphi(t,x) - \varphi(\bar t, \bar x) \;\ge\; V(t,x) - V(\bar t, \bar x) \qquad \text{for all } (t,x) \text{ near } (\bar t, \bar x) \tag{5.39}
$$

and

$$
-\varphi_t(\bar t, \bar x) \;-\; L(\bar t, \bar x, \bar u) \;-\; \bigl\langle \varphi_x(\bar t, \bar x),\, f(\bar t, \bar x, \bar u) \bigr\rangle \;>\; 0. \tag{5.40}
$$

Taking $x(\bar t) = \bar x$ as the initial condition, let us consider the state trajectory $x(\cdot)$ that results from applying the constant control $u \equiv \bar u$ on a small time interval $[\bar t, \bar t + \Delta t]$. We will now demonstrate that the rate of change of the value function along this trajectory is inconsistent with the principle of optimality. As long as we pick $\Delta t$ to be sufficiently small, we have

$$
\begin{aligned}
V\bigl(\bar t + \Delta t,\, x(\bar t + \Delta t)\bigr) - V(\bar t, \bar x)
&\le \varphi\bigl(\bar t + \Delta t,\, x(\bar t + \Delta t)\bigr) - \varphi(\bar t, \bar x) \\
&= \int_{\bar t}^{\bar t + \Delta t} \Bigl( \varphi_t\bigl(t, x(t)\bigr) + \bigl\langle \varphi_x\bigl(t, x(t)\bigr),\, f\bigl(t, x(t), \bar u\bigr) \bigr\rangle \Bigr)\, dt \\
&< -\int_{\bar t}^{\bar t + \Delta t} L\bigl(t, x(t), \bar u\bigr)\, dt
\end{aligned}
$$

where the first inequality is a direct consequence of (5.39) and the last inequality follows from (5.40) by virtue of continuity of all the functions appearing there. We thus obtain

$$
V(\bar t, \bar x) \;>\; \int_{\bar t}^{\bar t + \Delta t} L\bigl(t, x(t), \bar u\bigr)\, dt \;+\; V\bigl(\bar t + \Delta t,\, x(\bar t + \Delta t)\bigr). \tag{5.41}
$$

On the other hand, the principle of optimality (5.4) tells us that

$$
V(\bar t, \bar x) \;\le\; \int_{\bar t}^{\bar t + \Delta t} L\bigl(t, x(t), \bar u\bigr)\, dt \;+\; V\bigl(\bar t + \Delta t,\, x(\bar t + \Delta t)\bigr),
$$

and we arrive at a contradiction. Provided that optimal controls exist, here is a slightly different way to see why (5.41) cannot be true: it would imply that the optimal cost-to-go from $(\bar t, \bar x)$ is higher than the cost of applying the constant control $\bar u$ on $[\bar t, \bar t + \Delta t]$ followed by an optimal control on the remaining interval $[\bar t + \Delta t, t_1]$, which is clearly impossible.
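This last argument, that no control can beat the optimal cost-to-go, can also be checked numerically on a toy problem of our own (not from the text): with zero running cost, the cost of applying any constant control for a short time and then acting optimally is never smaller than the value function.

```python
# Why an inequality like (5.41) is impossible, on a toy problem
# (our own example, not from the text): x' = u, u in [-1, 1], L = 0,
# K(x) = x^2, t1 = 1, so that V(t, x) = (max(|x| - (t1 - t), 0))^2.
# The principle of optimality gives, for ANY constant control u on
# [t, t + dt] followed by optimal play (L = 0, so the integral drops out):
#   V(t, x) <= V(t + dt, x + u * dt).

t1 = 1.0

def V(t, x):
    return max(abs(x) - (t1 - t), 0.0) ** 2

t, x, dt = 0.2, 1.5, 0.05
for u in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    lhs = V(t, x)                # optimal cost-to-go from (t, x)
    rhs = V(t + dt, x + u * dt)  # constant control u, then optimal play
    print(f"u={u:+.1f}: V(t,x) = {lhs:.4f} <= {rhs:.4f}")
    assert lhs <= rhs + 1e-12
```

Equality is attained by the pointwise-optimal control (here $u = -\operatorname{sign}(x)$), which is exactly the content of the principle of optimality.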

We now have at our disposal a more rigorous formulation of the necessary conditions for optimality from Section 5.1.3. The above reasoning is of course quite different from our original derivation of the HJB equation, but we see that the principle of optimality still plays a central role. The sufficient condition for optimality from Section 5.1.4 can also be generalized, a task that we leave as an exercise.