1.3.3 Second variation and second-order conditions


A real-valued functional $B$ on $V\times V$ is called bilinear if it is linear in each argument (when the other one is fixed). Setting $Q(y):=B(y,y)$ we then obtain a quadratic functional, or quadratic form, on $V$. This is a direct generalization of the corresponding familiar concepts in finite-dimensional vector spaces.

A quadratic form $\left.\delta^2 J\right\vert _{y}:V\to \mathbb{R}$ is called the second variation of $J$ at $y$ if for all $\eta\in V$ and all $\alpha$ we have

$\displaystyle J(y+\alpha\eta )=J(y)+\left.\delta J\right\vert _{y} (\eta)\alpha+\left.\delta^2 J\right\vert _{y}(\eta)\alpha^2+o(\alpha^2).$

(1.39)

This exactly corresponds to our previous second-order expansion (1.12) for the function $g$ given by (1.35). Repeating the same argument we used earlier to prove (1.14), we easily establish the following second-order necessary condition for optimality: If $y^*$ is a local minimum of $J$ over $A\subset V$, then for all admissible perturbations $\eta$ we have

$\displaystyle \fbox{$\left.\delta^2 J\right\vert _{y^*}(\eta)\ge0$}$

(1.40)

In other words, the second variation $\left.\delta^2 J\right\vert _{y^*}$ must be positive semidefinite on the space of admissible perturbations. For local maxima, the inequality in (1.40) is reversed. Of course, the usefulness of the condition will depend on our ability to compute the second variation of the functionals that we will want to study.
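To make the definition (1.39) concrete, the following minimal sketch estimates a second variation numerically. Everything in it is an assumption chosen for illustration and is not taken from the text: the functional $J(y)=\int_0^1 y(x)^3\,dx$, the point $y(x)=1+x$, the perturbation $\eta(x)=\sin(\pi x)$, and the helper names. The idea is that, for fixed $y$ and $\eta$, the coefficient of $\alpha^2$ in (1.39) equals half the second derivative at $\alpha=0$ of the scalar function $\alpha\mapsto J(y+\alpha\eta)$, which a central difference can approximate.

```python
import math

def integrate(f, a=0.0, b=1.0, n=2000):
    """Composite trapezoidal rule for f on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def J(y):
    """Hypothetical functional: J(y) = integral of y(x)^3 over [0, 1]."""
    return integrate(lambda x: y(x) ** 3)

def second_variation_fd(J, y, eta, alpha=1e-2):
    """Estimate the alpha^2 coefficient in (1.39) by a central difference.

    With g(alpha) = J(y + alpha*eta), that coefficient is g''(0) / 2.
    """
    g = lambda a: J(lambda x: y(x) + a * eta(x))
    return (g(alpha) - 2.0 * g(0.0) + g(-alpha)) / (2.0 * alpha ** 2)

y = lambda x: 1.0 + x                    # point at which J is expanded
eta = lambda x: math.sin(math.pi * x)    # perturbation direction

numeric = second_variation_fd(J, y, eta)

# Expanding (y + alpha*eta)^3 by hand gives the alpha^2 term 3*y*eta^2,
# so for this particular J the second variation is the integral below.
exact = integrate(lambda x: 3.0 * y(x) * eta(x) ** 2)

# A quadratic form must scale as Q(c*eta) = c^2 * Q(eta); check with c = 2.
scaled = second_variation_fd(J, y, lambda x: 2.0 * eta(x))

print(numeric, exact, scaled / numeric)
```

For this choice of $J$ the hand-computed expression $3\int_0^1 y\,\eta^2\,dx$ is visibly quadratic in $\eta$, and the scaling check in the last lines is a quick numerical test of the same property.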

Exercise. Consider the same functional $J$ ... variation of $J$ (make sure that it is indeed a quadratic form).

What about a second-order sufficient condition for optimality? By analogy with the second-order sufficient condition (1.16) which we derived for the finite-dimensional case, we may guess that we need to combine the first-order necessary condition (1.37) with the strict-inequality counterpart of the second-order necessary condition (1.40), i.e.,

$\displaystyle \left.\delta^2 J\right\vert _{y^*}(\eta)>0$

(1.41)

(this should again hold for all admissible perturbations $\eta$ with respect to a subset $A$ over which we want $y^*$ to be a minimum). We would then hope to show that for $y=y^*$ the second-order term in (1.39) dominates the higher-order term $o(\alpha^2)$, which would imply that $y^*$ is a strict local minimum (since the first-order term is 0). Our earlier proof of sufficiency of (1.16) followed the same idea. However, on examining that proof more closely, the reader will discover that in the present case the argument does not go through.

Fix an admissible perturbation $\eta$. We know that there exists an $\varepsilon >0$ such that for all nonzero $\alpha$ with $\vert\alpha\vert<\varepsilon$ we have $\vert o(\alpha^2)\vert<\left.\delta^2 J\right\vert _{y^*}(\eta)\alpha^2$. Using this inequality and (1.37), we obtain from (1.39) that $J(y^*+\alpha\eta)>J(y^*)$. Note that this does not yet prove that $y^*$ is a (strict) local minimum of $J$. According to the definition of a local minimum, we must show that $J(y^*)$ is the lowest value of $J$ in some ball around $y^*$ with respect to the selected norm $\Vert\cdot\Vert$ on $V$. The problem is that the term $o(\alpha^2)$, and hence the above $\varepsilon$, depend on the choice of the perturbation $\eta$. In the finite-dimensional case we took the minimum of $\varepsilon$ over all perturbations of unit length, but we cannot do that here because the unit sphere in the infinite-dimensional space is not compact and the Weierstrass Theorem does not apply to it (see Section 1.3.4 below).
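The dependence of $\varepsilon$ on $\eta$ can be seen in a small example, which is an illustration added here and not part of the original argument. Assume the hypothetical functional $J(x)=\sum_k \left(x_k^2/k - x_k^3\right)$ on square-summable sequences, expanded at $0$. Its first variation at $0$ vanishes and the $\alpha^2$ coefficient is $\sum_k \eta_k^2/k$, which is positive for every nonzero $\eta$. Along the $k$-th coordinate direction, however, $J$ stays positive only for $\alpha<1/k$, so the admissible $\varepsilon$ shrinks to $0$ as $k$ grows. The sketch below works with finitely supported sequences stored as Python lists; all function names are made up for this illustration.

```python
def J(x):
    """Hypothetical functional on finitely supported sequences x = [x_1, x_2, ...]:
    J(x) = sum over k of (x_k^2 / k - x_k^3)."""
    return sum(xk ** 2 / k - xk ** 3 for k, xk in enumerate(x, start=1))

def second_variation_at_zero(eta):
    """alpha^2 coefficient of J(alpha * eta): sum over k of eta_k^2 / k."""
    return sum(ek ** 2 / k for k, ek in enumerate(eta, start=1))

def unit_vector(k, scale=1.0):
    """Sequence with `scale` in position k (1-based) and zeros elsewhere."""
    return [0.0] * (k - 1) + [scale]

for k in (1, 10, 100, 1000):
    q = second_variation_at_zero(unit_vector(k))   # positive, but shrinks like 1/k
    x = unit_vector(k, scale=2.0 / k)              # a point at distance 2/k from 0
    # J(x) is negative although J(0) = 0 and the distance 2/k tends to 0.
    print(k, q, 2.0 / k, J(x))
```

Every ball around $0$ therefore contains points where $J$ is negative, so $0$ is not a local minimum of this $J$ even though the strict inequality (1.41) holds there for every nonzero perturbation.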

One way to resolve the above difficulty would be as follows. The first step is to strengthen the condition (1.41) to

$\displaystyle \left.\delta^2 J\right\vert _{y^*}(\eta)\ge \lambda \Vert\eta\Vert^2$

(1.42)

for some number $\lambda>0$ . The property (1.42) does not automatically follow from (1.41), again because we are in an infinite-dimensional space. (Quadratic forms satisfying (1.42) are sometimes called uniformly positive definite.) The second step is to modify the definitions of the first and second variations by explicitly requiring that the higher-order terms decay uniformly with respect to $\Vert\eta\Vert$ . We already mentioned such an alternative definition of the first variation via the expansion (1.38). Similarly, we could define $\left.\delta^2 J\right\vert _{y}$ via the following expansion in place of (1.39):

$\displaystyle J(y+\eta )=J(y)+\left.\delta J\right\vert _{y} (\eta)+\left.\delta^2 J\right\vert _{y}(\eta)+o(\Vert\eta\Vert^2).$

(1.43)

Adopting these alternative definitions and assuming that (1.37) and (1.42) hold, we could easily complete the sufficiency proof by noting that $\vert o(\Vert\eta\Vert^2)\vert<\lambda \Vert\eta\Vert^2$ when $\Vert\eta\Vert$ is small enough.
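As a rough numerical illustration of this sufficiency argument, consider the hypothetical functional $J(x)=\sum_k \left(x_k^2 - x_k^3\right)$ on square-summable sequences, expanded at $0$. This example is an assumption made for the sketch and does not come from the text. Here the quadratic term is exactly $\Vert\eta\Vert^2$, so (1.42) holds with $\lambda=1$, and the cubic remainder is bounded by $\Vert\eta\Vert^3$, which is $o(\Vert\eta\Vert^2)$ uniformly in the direction of $\eta$. The code samples random finitely supported points in a ball of radius $0.5$ and records the smallest observed value of $J(\eta)/\Vert\eta\Vert^2$.

```python
import math
import random

def J(x):
    """Hypothetical functional: J(x) = sum over k of (x_k^2 - x_k^3)."""
    return sum(xk ** 2 - xk ** 3 for xk in x)

def norm(x):
    """Euclidean (l2) norm of a finitely supported sequence."""
    return math.sqrt(sum(xk * xk for xk in x))

random.seed(0)  # fixed seed so that repeated runs sample the same points
worst_ratio = float("inf")
for _ in range(1000):
    dim = random.randint(1, 50)                     # number of nonzero entries
    direction = [random.gauss(0.0, 1.0) for _ in range(dim)]
    radius = random.uniform(1e-3, 0.5)              # distance from the origin
    scale = radius / norm(direction)
    eta = [scale * d for d in direction]
    # The bound |sum eta_k^3| <= ||eta||^3 gives J(eta) >= ||eta||^2 * (1 - ||eta||).
    worst_ratio = min(worst_ratio, J(eta) / norm(eta) ** 2)

print(worst_ratio)
```

Because the bound on the remainder depends only on $\Vert\eta\Vert$ and not on the direction, the observed ratio stays bounded away from $0$ throughout the ball. Sampling is of course only a sanity check; the inequality in the comment is what actually proves that $0$ is a strict local minimum of this $J$.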

With our current definitions of the first and second variations in terms of (1.33) and (1.39), we do not have a general second-order sufficient condition for optimality. However, in variational problems that we are going to study, the functional to be minimized will take a specific form. This additional structure will allow us to derive conditions under which second-order terms dominate higher-order terms, resulting in optimality. The above discussion was given mainly for illustrative purposes, and will not be directly used in the sequel.


Daniel 2010-12-20