4.3 Discussion of the maximum principle

Our main objective in the remainder of this chapter is to gain a better understanding of the maximum principle by discussing and interpreting its statement and by applying it to specific classes of problems. We begin this task here by making a few technical remarks.

One should always remember that the maximum principle provides *necessary* conditions for optimality. Thus it only helps single out optimal control candidates, each of which needs to be further analyzed to determine whether it is indeed optimal. The reader should also keep in mind that an optimal control may not even exist (the existence issue will be addressed in detail in Section 4.5). For many problems of interest, however, the optimal solution does exist and the conditions provided by the maximum principle are strong enough to help identify it, either directly or after a routine additional elimination process. We already saw an example supporting this claim in Exercise 4.1 and will study other important examples in Section 4.4.

When stating the maximum principle, we ignored the distinction between different kinds of local minima by working with a globally optimal control $u^*$, i.e., by assuming that $J(u^*)\le J(u)$ for all other admissible controls $u$ that produce state trajectories satisfying the given endpoint constraint. However, it is clear from the proof that global optimality was not used. The control perturbations used in the proof produced controls $u$ which differ from $u^*$ on a small interval of order $\varepsilon$ in length, making the $\mathcal L_1$ norm of the difference, $\int_{t_0}^{t_f}|u(t)-u^*(t)|\,dt$, small for small $\varepsilon$. The resulting perturbed trajectory $x$, on the other hand, was close to the optimal trajectory $x^*$ in the sense of the 0-norm, i.e., $\max_{t_0\le t\le t_f}|x(t)-x^*(t)|$ was small for small $\varepsilon$ (as is clear from the calculations given in Sections 4.2.2-4.2.4). It can be shown that the conditions of the maximum principle are in fact necessary for local optimality when closeness in the $(x,u)$-space is measured by the 0-norm for $x$ and the $\mathcal L_1$ norm for $u$; we stress that the Hamiltonian maximization condition (statement 2 of the maximum principle) remains global. At this point it may be instructive to think of the system $\dot x=u$ as an example and to recall the discussion in Section 3.4.5 related to Figure 3.6. In that context, the notion of a local minimum with respect to the norm we just described is in between the notions of weak and strong minima; indeed, weak minima are defined with respect to the 0-norm for both $x$ and $u$, while strong minima are defined with respect to the 0-norm for $x$ with no constraints on $u$. For strong minima, the necessary conditions provided by the maximum principle are still valid. This is not the case for weak minima, because in a needle perturbation the control value $w$ is then no longer arbitrary: it must be close to $u^*(t)$.
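The three notions of closeness discussed above can be compared numerically. The following is a minimal sketch for the scalar system with dynamics x' = u on the interval [0, 1]. The reference control (identically zero), the needle value w = 5, the needle location b = 0.5, and the helper names `needle` and `distances` are all illustrative assumptions, not taken from the text, and the reference control is not claimed to be optimal for any particular cost.

```python
# Needle perturbation for the scalar system x' = u on [0, 1], x(0) = 0.
# All names and numerical values below are illustrative assumptions.

def u_ref(t):
    """Reference control (assumed): identically zero."""
    return 0.0

def needle(u_base, w, b, eps):
    """Return a control equal to w on (b - eps, b] and to u_base elsewhere."""
    def u(t):
        return w if b - eps < t <= b else u_base(t)
    return u

def distances(u_base, u, t0=0.0, tf=1.0, n=100_000):
    """Approximate three distances on a uniform grid with n steps.

    Returns (L1 distance of the controls, sup distance of the controls,
    sup distance of the trajectories), where both trajectories solve
    x' = u from the same initial state.
    """
    dt = (tf - t0) / n
    l1_u = 0.0   # running integral of |u - u_base|
    sup_u = 0.0  # running max of |u - u_base|
    dx = 0.0     # x(t) - x_base(t), obtained by integrating u - u_base
    sup_x = 0.0  # running max of |x - x_base|
    for k in range(n):
        t = t0 + (k + 0.5) * dt  # midpoint of the k-th step
        du = u(t) - u_base(t)
        l1_u += abs(du) * dt
        sup_u = max(sup_u, abs(du))
        dx += du * dt
        sup_x = max(sup_x, abs(dx))
    return l1_u, sup_u, sup_x

for eps in (0.1, 0.01, 0.001):
    u_eps = needle(u_ref, w=5.0, b=0.5, eps=eps)
    l1_u, sup_u, sup_x = distances(u_ref, u_eps)
    print(f"eps={eps:<6} L1(u)={l1_u:.4f}  sup(u)={sup_u:.1f}  sup(x)={sup_x:.4f}")
```

As eps shrinks, the L1 distance between the controls and the sup distance between the trajectories both scale like eps times |w|, while the sup distance between the controls stays equal to |w|. This is why a needle perturbation with an arbitrary value w is admissible when testing this intermediate notion of local minimum (and strong minima), but not when testing weak minima.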

The statement of the maximum principle contains the condition (justified in Section 4.2.8) that $(p_0^*,p^*(t))\ne(0,0)$ for all $t\in[t_0,t_f]$. In fact, since the origin in $\mathbb R^{n+1}$ is an equilibrium of the linear adjoint equation (4.31), if $p_0^*$, $p^*$ vanish for some $t$ then they must vanish for all $t$. Thus, the above condition could be equivalently stated as $(p_0^*,p^*(t))\ne(0,0)$ for *some* $t$. This condition is sometimes called the *nontriviality condition*, because with $(p_0^*,p^*)\equiv(0,0)$ all the statements of the maximum principle are trivially satisfied. In some cases, it is possible to show that the adjoint vector itself, $p^*(t)$, is nonzero for all $t$. For example, suppose that the running cost $L$ is everywhere nonzero (this is true, for instance, in time-optimal control problems, where $L\equiv 1$). The Hamiltonian satisfies $H(x^*,u^*,p^*,p_0^*)=\langle p^*,f(x^*,u^*)\rangle+p_0^*L(x^*,u^*)\equiv 0$ (by statement 3 of the maximum principle). If $p^*(t)=0$ for some $t$, then we have $p_0^*L(x^*(t),u^*(t))=0$, hence $p_0^*=0$ and we reach a contradiction with the nontriviality condition. We will give another example later involving a terminal cost; see Exercise 4.7 below. As for the abnormal multiplier $p_0^*$, since it is the vertical coordinate of the normal to the separating hyperplane, $p_0^*=0$ corresponds to the case when the separating hyperplane is vertical (and cannot be tilted). The projection of such a hyperplane onto the $x$-space is a hyperplane in $\mathbb R^n$, and all perturbed controls must bring the state $x$ to the same side of this projected hyperplane. In a majority of control problems this does not happen and we can set $p_0^*=-1$. We also know that the separating hyperplane cannot be vertical and $p_0^*$ cannot be 0 in the free-endpoint case (see the end of Section 4.2.10).
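As an illustration of the running-cost argument above, the time-optimal case can be written out explicitly. This is only a sketch: it assumes that the Hamiltonian has the form $H=\langle p,f\rangle+p_0L$ with time-independent $f$ and $L$, and that the optimal control is defined on $[t_0,t_f]$.

```latex
% Time-optimal case: L \equiv 1, so statement 3 of the maximum principle reads
\[
  \big\langle p^*(t),\, f\big(x^*(t),u^*(t)\big)\big\rangle + p_0^* = 0
  \qquad \text{for all } t\in[t_0,t_f].
\]
% Normal case, with the normalization p_0^* = -1:
\[
  \big\langle p^*(t),\, f\big(x^*(t),u^*(t)\big)\big\rangle = 1
  \qquad \text{for all } t\in[t_0,t_f],
\]
% so p^*(t) = 0 is impossible at any t.
% Abnormal case, p_0^* = 0:
\[
  \big\langle p^*(t),\, f\big(x^*(t),u^*(t)\big)\big\rangle = 0
  \qquad \text{for all } t\in[t_0,t_f],
\]
% and here p^*(t) \neq 0 follows directly from the nontriviality condition.
```

In both cases the adjoint vector is nonzero along the whole trajectory. In the normal case it is in addition never orthogonal to the velocity vector $f(x^*(t),u^*(t))$.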