Suppose that we are given a (time-invariant) control system
$$\dot x = f(x,u) \tag{7.2}$$
and the difference between states and tangent vectors becomes ``hidden'' once again.
Let us assume for simplicity that an optimal control problem is formulated in the Mayer form (i.e., with terminal cost only). We know that problems with running cost can always be converted to this form by appending an additional state $x^0$, which would yield a system on the augmented manifold $\mathbb{R} \times M$ (cf. Sections 3.3.2 and 4.2.1). The basic ingredients of the maximum principle are the costate $p$ and the Hamiltonian $H$. In the case when $M = \mathbb{R}^n$, the Hamiltonian for the Mayer problem took the form $H(x,u,p) = \langle p, f(x,u)\rangle$. For a general manifold $M$, we need to ask ourselves which space $p$ should belong to and how $H$ should be (re)defined. Our first natural guess might be that $p$, like $\dot x$, should be a tangent vector to $M$. However, in contrast with $\dot x$, there is no clear geometric reason why $p$ should be a tangent vector. Also, taking $p$ to be a tangent vector, we cannot assign a new meaning to our earlier definition of $H$ unless we equip the tangent space with an inner product. (Introducing an inner product on each tangent space $T_xM$, called a Riemannian metric on $M$, is possible but, as we will see, is neither necessary nor relevant for our present purposes.) Another option that might come to mind is that $p$ should live in $M$ itself; however, this choice offers even fewer clues towards any natural interpretation of the Hamiltonian.
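As a reminder of the conversion to Mayer form mentioned above, the following is a minimal sketch in coordinates. The symbols $L$ (running cost), $K$ (terminal cost), $t_0$, $t_f$, and $x^0$ (the appended state) are notational assumptions made for this illustration.

```latex
\begin{align*}
J(u) &= \int_{t_0}^{t_f} L\bigl(x(t),u(t)\bigr)\,dt + K\bigl(x(t_f)\bigr)
  && \text{(cost with running term)} \\
\dot x^0 &= L(x,u), \qquad x^0(t_0) = 0
  && \text{(appended state)} \\
J(u) &= x^0(t_f) + K\bigl(x(t_f)\bigr)
  && \text{(terminal cost only)}
\end{align*}
```

The augmented state $(x^0, x)$ evolves on $\mathbb{R} \times M$, and the cost now depends only on its terminal value.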
Can we perhaps take more direct guidance from the fact that in our old definition of the Hamiltonian, $p$ appears in an inner product with the velocity vector $f(x,u)$? In fact, we already remarked in Section 3.4.2 that $p$ never appears by itself but always inside inner products such as $\langle p, \dot x\rangle$; in other words, it acts on velocity vectors. This observation suggests that the intrinsic role of the costate $p$ is not that of a tangent vector, but that of a covector. To better understand the difference between these two types of objects and why the latter one correctly captures the notion of a costate, let us look at how they propagate along a flow induced by a dynamical system on $M$.
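Fix a number $\tau > 0$ and let $\Phi : M \to M$ be a $C^1$ map. While the construction that we are about to describe is valid for every such map, the map that we have in mind here is the one obtained by flowing forward for $\tau$ units of time along the trajectory of the system (7.2) corresponding to some fixed control $u$ (which, ultimately, is taken to be an optimal control for a given initial condition). Let us first discuss the transformation that $\Phi$ induces on tangent vectors. Pick a point $x \in M$ and a tangent vector $\xi \in T_xM$. We know that $\xi$ is tangent to some curve in $M$ passing through $x$, namely,
$$\xi = \frac{d}{d\alpha}\Big|_{\alpha=0} x(\alpha)$$
where $x(\alpha) \in M$ for real $\alpha$ (around 0) and $x(0) = x$. The image $\Phi(x(\alpha))$ of this curve under the map $\Phi$ is a curve in $M$ which passes through $\Phi(x)$, as illustrated in Figure 7.1. Denote the tangent vector at $\Phi(x)$ associated with this new curve by $d\Phi|_x(\xi)$; in other words, define
$$d\Phi|_x(\xi) := \frac{d}{d\alpha}\Big|_{\alpha=0} \Phi(x(\alpha)).$$
The above quantity depends only on the vector $\xi$ and not on the particular curve $x(\cdot)$ used to represent it, so this construction defines a linear map $d\Phi|_x : T_xM \to T_{\Phi(x)}M$ called the derivative (or differential) of $\Phi$ at $x$.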
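The claim that the tangent vector of the image curve depends only on the original tangent vector, and not on the curve chosen to represent it, can be checked numerically. The sketch below works in $\mathbb{R}^2$ rather than on a general manifold; the map `phi`, the base point, the tangent vector, and the two curves are all hypothetical choices made for illustration, not the flow map of any particular control system.

```python
import math


def phi(x):
    """Hypothetical smooth map R^2 -> R^2 standing in for the map Phi."""
    x1, x2 = x
    return (x1 + math.sin(x2), x1 * x2)


def jacobian_phi(x):
    """Exact Jacobian of phi at x (rows: outputs, columns: inputs)."""
    x1, x2 = x
    return ((1.0, math.cos(x2)),
            (x2, x1))


def pushforward(x, xi):
    """Derivative of phi at x applied to the tangent vector xi (Jacobian times xi)."""
    J = jacobian_phi(x)
    return tuple(sum(J[i][j] * xi[j] for j in range(2)) for i in range(2))


def tangent_of_image_curve(curve, h=1e-6):
    """Central-difference estimate of d/d(alpha) phi(curve(alpha)) at alpha = 0."""
    forward = phi(curve(h))
    backward = phi(curve(-h))
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))


x = (0.5, 1.2)    # base point (assumed)
xi = (2.0, -1.0)  # tangent vector at x (assumed)


def straight(alpha):
    """Straight line through x with velocity xi at alpha = 0."""
    return (x[0] + alpha * xi[0], x[1] + alpha * xi[1])


def bent(alpha):
    """A different curve through x with the same velocity xi at alpha = 0."""
    return (x[0] + alpha * xi[0] + 3.0 * alpha ** 2,
            x[1] + alpha * xi[1] - alpha ** 2)


v_exact = pushforward(x, xi)
v_straight = tangent_of_image_curve(straight)
v_bent = tangent_of_image_curve(bent)

print(v_exact, v_straight, v_bent)
```

Up to finite-difference error, the three printed vectors agree: both curves produce the same tangent vector at the image point, and it coincides with the Jacobian applied to `xi`.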
Now suppose that we are given a covector at $x$, i.e., a linear function on $T_xM$. Let us denote it by $p|_x$ so as to have $\langle p|_x, \xi\rangle \in \mathbb{R}$ for each $\xi \in T_xM$.
For the same map $\Phi$ as before, can we define in a natural way a linear function $p|_{\Phi(x)}$ on $T_{\Phi(x)}M$? We must decide what the value $\langle p|_{\Phi(x)}, \eta\rangle$ should be for every $\eta \in T_{\Phi(x)}M$.
While it is tempting to say that $\langle p|_{\Phi(x)}, \eta\rangle$ should equal the value of $p|_x$ on the preimage of $\eta$ under the map $d\Phi|_x$, this preimage is not well defined unless the map $d\Phi|_x$ is invertible. In fact, the reader will quickly realize that there is no apparent candidate map for propagating covectors along $\Phi$ similarly to how the derivative map $d\Phi$ acts on tangent vectors. The reason is that, instead of trying to push covectors forward, we should pull them back. This revised objective is readily accomplished as follows:
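given a covector $p|_{\Phi(x)}$ on $T_{\Phi(x)}M$, define a covector $p|_x$ on $T_xM$ by
$$\langle p|_x, \xi\rangle := \langle p|_{\Phi(x)}, d\Phi|_x(\xi)\rangle.$$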
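In coordinates on $\mathbb{R}^n$, where the derivative of the map is represented by its Jacobian matrix $J$, pulling a covector back amounts to multiplying by the transpose $J^T$. The sketch below illustrates this under that assumption; the matrix `J` and the vectors are hypothetical values, and `J` is deliberately singular to show that the pullback needs no invertibility, unlike the attempt to push covectors forward.

```python
def matvec(A, v):
    """Matrix-vector product for a matrix given as a tuple of rows."""
    return tuple(sum(row[j] * v[j] for j in range(len(v))) for row in A)


def transpose(A):
    """Transpose of a matrix given as a tuple of rows."""
    return tuple(zip(*A))


def pair(p, v):
    """Pairing of a covector p with a tangent vector v (coordinates in R^n)."""
    return sum(pi * vi for pi, vi in zip(p, v))


def pushforward(J, xi):
    """Derivative map on tangent vectors: xi at x -> J xi at the image point."""
    return matvec(J, xi)


def pullback(J, p):
    """Pullback on covectors: p at the image point -> J^T p at x."""
    return matvec(transpose(J), p)


# Hypothetical Jacobian at some point x; rank 1, hence not invertible.
J = ((1.0, 2.0),
     (2.0, 4.0))

xi = (3.0, -1.0)       # tangent vector at x (assumed)
p_target = (0.5, 1.5)  # covector at the image point (assumed)

lhs = pair(pullback(J, p_target), xi)     # pulled-back covector acting on xi
rhs = pair(p_target, pushforward(J, xi))  # original covector acting on pushed-forward xi

print(lhs, rhs)
```

The two pairings coincide, which is exactly the defining property of the pullback; no inverse of `J` is ever required.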
Now everything is beginning to fall into place. The Hamiltonian for our Mayer problem on a manifold $M$ should be defined as
$$H(x,u,p) := \langle p, f(x,u)\rangle$$