
## 2.6.1 Legendre's necessary condition for a weak minimum

Let us compute $\delta^2 J|_y$ for a given test curve $y$. Since third-order partial derivatives of $L$ will appear, we assume that $L \in \mathcal{C}^3$. We work with the single-degree-of-freedom case for now. The left-hand side of (2.56) is

$$J(y + \alpha\eta) = \int_a^b L\big(x,\, y(x) + \alpha\eta(x),\, y'(x) + \alpha\eta'(x)\big)\, dx.$$
We need to write down its second-order Taylor expansion with respect to $\alpha$. We do this by expanding the function inside the integral with respect to $\alpha$ (using the chain rule) and separating the terms of different orders in $\alpha$:

$$J(y+\alpha\eta) = \int_a^b L(x,y,y')\, dx + \alpha \int_a^b \big( L_y\, \eta + L_{y'}\, \eta' \big)\, dx + \frac{\alpha^2}{2} \int_a^b \big( L_{yy}\, \eta^2 + 2L_{yy'}\, \eta\eta' + L_{y'y'}\, (\eta')^2 \big)\, dx + o(\alpha^2) \tag{2.57}$$
Matching this expression with (2.56) term by term, we deduce that the second variation is given by

$$\delta^2 J|_y(\eta) = \frac{1}{2} \int_a^b \big( L_{yy}\, \eta^2 + 2L_{yy'}\, \eta\eta' + L_{y'y'}\, (\eta')^2 \big)\, dx$$
where the integrand is evaluated along $(x, y(x), y'(x))$. This is indeed a quadratic form as defined in Section 1.3.3. Note that it explicitly depends on $\eta'$ as well as on $\eta$. In contrast with the first variation (analyzed in detail in Section 2.3.1), the dependence of the second variation on $\eta'$ is essential and cannot be eliminated. We can, however, simplify the expression for $\delta^2 J|_y$ by eliminating the "mixed" term containing the product $\eta\eta'$. We do this by using--as we did in our earlier derivation of the Euler-Lagrange equation--the method of integration by parts:

$$\int_a^b L_{yy'}\, \eta\eta'\, dx = \frac{1}{2} \int_a^b L_{yy'}\, (\eta^2)'\, dx = \frac{1}{2}\, L_{yy'}\, \eta^2 \,\Big|_a^b - \frac{1}{2} \int_a^b \Big( \frac{d}{dx} L_{yy'} \Big)\, \eta^2\, dx.$$
The first, non-integral term on the right-hand side vanishes due to the boundary conditions (2.11). Therefore, the second variation can be written as

$$\delta^2 J|_y(\eta) = \int_a^b \big( P(x)\, (\eta'(x))^2 + Q(x)\, (\eta(x))^2 \big)\, dx \tag{2.58}$$

where

$$P(x) := \frac{1}{2}\, L_{y'y'}\big(x, y(x), y'(x)\big), \qquad Q(x) := \frac{1}{2} \Big( L_{yy}\big(x, y(x), y'(x)\big) - \frac{d}{dx}\, L_{yy'}\big(x, y(x), y'(x)\big) \Big). \tag{2.59}$$

Note that $P$ is continuous, and $Q$ is also continuous at least when $y \in \mathcal{C}^2$.
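As a quick sanity check of the formulas for $P$ and $Q$ above (my own illustration, not part of the original text), the following sympy snippet computes them for the sample Lagrangian $L = \tfrac12 (y')^2 - \tfrac12 y^2$ along its extremal $y(x) = \sin x$:

```python
# Illustration (not from the text): computing P = (1/2) L_{y'y'} and
# Q = (1/2)(L_{yy} - d/dx L_{yy'}) for a sample Lagrangian with sympy.
import sympy as sp

x = sp.symbols('x')
y, yp = sp.symbols('y yp')          # y and y' treated as independent arguments of L
L = yp**2 / 2 - y**2 / 2            # sample Lagrangian (harmonic-oscillator type)

ycurve = sp.sin(x)                  # an extremal: y'' = -y holds for sin(x)
subs = {y: ycurve, yp: sp.diff(ycurve, x)}

# P(x) = (1/2) L_{y'y'} evaluated along the curve
P = sp.Rational(1, 2) * sp.diff(L, yp, 2).subs(subs)
# Q(x) = (1/2)(L_{yy} - d/dx L_{yy'}); substitute the curve before taking d/dx
Q = sp.Rational(1, 2) * (sp.diff(L, y, 2).subs(subs)
                         - sp.diff(sp.diff(L, y, yp).subs(subs), x))

print(P)   # -> 1/2   (Legendre's condition L_{y'y'} = 1 >= 0 holds)
print(Q)   # -> -1/2  (Q may well be negative; Legendre does not restrict it)
```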

When we come to the issue of sufficiency in the next subsection, we will also need a more precise characterization of the higher-order term labeled as $o(\alpha^2)$ in the expansion (2.57). The next exercise invites the reader to go back to the derivation of (2.57) and analyze this term in more detail.

We know that if $y$ is a minimum, then for all perturbations $\eta$ vanishing at the endpoints the quantity (2.58) must be nonnegative:

$$\delta^2 J|_y(\eta) = \int_a^b \big( P(x)\, (\eta'(x))^2 + Q(x)\, (\eta(x))^2 \big)\, dx \ge 0. \tag{2.61}$$

We would like to restate this condition in terms of $P$ and $Q$ only--which are defined directly from $L$ and $y$ via (2.59)--so that we would not need to check it for all $\eta$. (Recall that we followed a similar route earlier when passing from the condition (2.16) to the Euler-Lagrange equation via Lemma 2.1.)

What, if anything, does the inequality (2.61) imply about $P$ and $Q$? Does it force at least one of these two functions, or perhaps both, to be nonnegative on $[a,b]$? The two terms inside the integral in (2.61) are of course not independent, because $\eta$ and $\eta'$ are related. So, we should try to see if maybe one of them dominates the other. More specifically, can it happen that $\int_a^b P\,(\eta')^2\, dx$ is large (in magnitude) while $\int_a^b Q\,\eta^2\, dx$ is small, or the other way around?

To answer these questions, consider a family of perturbations $\eta_\varepsilon$ parameterized by small $\varepsilon > 0$, depicted in Figure 2.12. The function $\eta_\varepsilon$ equals 0 everywhere outside some interval $[c,d] \subset [a,b]$, and inside this interval it equals 1 except near the endpoints, where it rapidly goes up to 1 and back down to 0. This rapid transfer is accomplished by the derivative $\eta_\varepsilon'$ having a short pulse of width approximately $\varepsilon$ and height approximately $1/\varepsilon$ right after $c$, and a similar negative pulse right before $d$. Here $\varepsilon$ is small compared to $d - c$. We base the subsequent argument on this graphical description, but it is not difficult to specify a formula for $\eta_\varepsilon$ and use it to verify the claims that follow; see, e.g., [GF63, p. 103] for a similar construction.

We can see that

$$\left| \int_a^b Q(x)\, \eta_\varepsilon^2(x)\, dx \right| \le \int_c^d |Q(x)|\, dx$$
and this bound is uniform over $\varepsilon$. On the other hand, for nonzero $P$ the integral $\int_a^b P(x)\,(\eta_\varepsilon'(x))^2\, dx$ does not stay bounded as $\varepsilon \to 0$, because it is of order $1/\varepsilon$. In particular, let us see this more clearly for the case when $P$ is negative on $[c,d]$, so that for some $\beta > 0$ we have $P(x) \le -\beta$ for all $x \in [c,d]$. Assume that there is an interval inside $[c, c+\varepsilon]$ of length at least $\varepsilon/2$ on which $\eta_\varepsilon'$ is no smaller than $1/(2\varepsilon)$; this property is completely consistent with our earlier description of $\eta_\varepsilon$ and $\eta_\varepsilon'$. Since the integrand $P\,(\eta_\varepsilon')^2$ is nonpositive on all of $[c,d]$ and vanishes outside it, we then have

$$\int_a^b P(x)\, \big(\eta_\varepsilon'(x)\big)^2\, dx \le -\beta \cdot \frac{\varepsilon}{2} \cdot \Big( \frac{1}{2\varepsilon} \Big)^2 = -\frac{\beta}{8\varepsilon}.$$
As $\varepsilon \to 0$, the above expression tends to $-\infty$, dominating the bounded $Q$-dependent term. It follows that the inequality (2.61) cannot hold for all $\eta$ if $P$ is negative on some subinterval $[c,d] \subset [a,b]$. But this means that for $y$ to be a minimum, $P$ must be nonnegative everywhere on $[a,b]$. Indeed, if $P(\bar{x}) < 0$ for some $\bar{x} \in [a,b]$, then by continuity of $P$ we can find a subinterval $[c,d]$ containing $\bar{x}$ on which $P$ is negative, and the above construction can be applied.2.3
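To make the $1/\varepsilon$ scaling concrete, here is a small numerical sketch (my own construction, not the book's, with the hypothetical choices $[a,b]=[0,1]$, $[c,d]=[0.25,0.75]$, $P \equiv -1$, $Q \equiv 1$). A piecewise-linear $\eta_\varepsilon$ exhibits a bounded $Q$-term while the $P$-term behaves like $-2/\varepsilon$:

```python
# Numerical illustration (hypothetical data): the Q-term of (2.61) stays
# bounded as eps -> 0, while the P-term is unbounded below, of order -1/eps.
import numpy as np

a, b, c, d = 0.0, 1.0, 0.25, 0.75   # assumed interval and support

def eta(xs, eps):
    # piecewise-linear eta_eps: 0 outside [c,d], linear ramps of width eps
    # just after c and just before d, and identically 1 in between
    return np.clip(np.minimum(xs - c, d - xs) / eps, 0.0, 1.0)

xs = np.linspace(a, b, 1_000_001)
dx = xs[1] - xs[0]
for eps in (0.1, 0.01, 0.001):
    e = eta(xs, eps)
    de = np.gradient(e, xs)               # approximates eta_eps'
    Q_term = np.sum(1.0 * e**2) * dx      # with Q = 1: bounded uniformly in eps
    P_term = np.sum(-1.0 * de**2) * dx    # with P = -1: roughly -2/eps
    print(f"eps={eps}: Q-term={Q_term:.4f}, P-term={P_term:.1f}")
```

The ramps have slope $1/\varepsilon$ over width $\varepsilon$, so $\int (\eta_\varepsilon')^2\, dx \approx 2/\varepsilon$, matching the $-\beta/(8\varepsilon)$ bound in spirit: the derivative term blows up while the $\eta$-term does not.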

Recalling the definition of $P$ in (2.59), we arrive at our second-order necessary condition for optimality: for all $x \in [a,b]$ we must have

$$L_{y'y'}\big(x, y(x), y'(x)\big) \ge 0. \tag{2.62}$$

This condition is known as Legendre's condition, as it was obtained by Legendre in 1786. Note that it places no restrictions on the sign of $Q$; intuitively speaking, the $Q$-dependent term in the second variation is dominated by the $P$-dependent term, hence $Q$ can in principle be negative along the optimal curve (this point will become clearer in the next subsection). For multiple degrees of freedom, the proof takes a bit more work but the statement of Legendre's condition is virtually unchanged: $L_{y'y'}$, which becomes a symmetric matrix, must be positive semidefinite for all $x$, i.e., (2.62) must hold in the matrix sense along the optimal curve.
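In the matrix case, positive semidefiniteness of $L_{y'y'}$ can be verified through its eigenvalues. A tiny illustration (my hypothetical example, not from the text): for a 2-DOF Lagrangian $L = \tfrac12 |y'|^2 + y_1' y_2'$, the Hessian in $(y_1', y_2')$ is constant, with eigenvalues $0$ and $2$, so Legendre's condition holds (degenerately):

```python
# Hypothetical 2-DOF check of Legendre's condition in the matrix sense:
# the Hessian L_{y'y'} of L = |y'|^2/2 + y1'*y2' is constant.
import numpy as np

hess = np.array([[1.0, 1.0],
                 [1.0, 1.0]])            # Hessian w.r.t. (y1', y2')
eigs = np.linalg.eigvalsh(hess)          # eigenvalues in ascending order
print(eigs)                              # ~[0, 2]: all >= 0, so PSD
```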

As a brief digression, let us recall our definition (2.29) of the Hamiltonian:

$$H(x, y, y', p) = p\, y' - L(x, y, y') \tag{2.63}$$

where $p = L_{y'}(x, y, y')$. We already noted in Section 2.4.1 that when $H$ is viewed as a function of $y'$ with the other arguments evaluated along an optimal curve $y$, it should have a stationary point at $y'(x)$, the velocity of $y$ at $x$. More precisely, the function $y' \mapsto H(x, y(x), y', p(x))$ defined in (2.32) has a stationary point at $y' = y'(x)$, which is a consequence of (2.33). Legendre's condition tells us that, in addition, $L_{y'y'}(x, y(x), y'(x)) \ge 0$ along an optimal curve, which we can rewrite in terms of $H$ as

$$H_{y'y'}\big(x, y(x), y'(x), p(x)\big) \le 0.$$
Thus, if the above stationary point is an extremum, then it is necessarily a maximum. This interpretation of necessary conditions for optimality moves us one step closer to the maximum principle. The basic idea behind our derivation of Legendre's condition will reappear in Section 3.4 in the context of optimal control, but eventually (in Chapter 4) we will obtain a stronger result using more advanced techniques.
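The sign flip underlying this reformulation is the identity $H_{y'y'} = -L_{y'y'}$ with $p$ held fixed, which can be checked symbolically for a generic Lagrangian; this sketch is mine, not the text's:

```python
# Symbolic check (generic L): with p treated as an independent variable,
# H = p*y' - L satisfies H_{y'y'} = -L_{y'y'}, so Legendre's condition
# L_{y'y'} >= 0 is exactly H_{y'y'} <= 0.
import sympy as sp

x, y, yp, p = sp.symbols('x y yp p')
L = sp.Function('L')(x, y, yp)   # generic Lagrangian L(x, y, y')
H = p * yp - L                   # Hamiltonian, p held fixed

diff_sum = sp.simplify(sp.diff(H, yp, 2) + sp.diff(L, yp, 2))
print(diff_sum)  # -> 0, confirming H_{y'y'} + L_{y'y'} = 0
```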

Daniel 2010-12-20