

3.1.1 Weierstrass-Erdmann corner conditions

Recall from Section 2.2.1 that a piecewise $ \mathcal C^1$ curve $ y$ on $ [a,b]$ is $ \mathcal C^1$ everywhere except possibly at a finite number of points where it is continuous but its derivative $ y'$ is discontinuous. Such points of discontinuity of $ y'$ are known as corner points. A corner point $ c\in[a,b]$ is characterized by the property that the left-hand derivative $ y'(c^-):=\lim_{x\nearrow c}y'(x)$ and the right-hand derivative $ y'(c^+):=\lim_{x\searrow c}y'(x)$ both exist but have different values. For example, if a hanging chain (catenary) is suspended too close to the ground, then it will not look as in Figure 2.3 but will instead touch the ground and have two corner points; see Figure 3.1. Below is another example in which a corner point arises.

Figure 3.1: An extremal with two corners for the catenary problem
\includegraphics{figures/broken.eps}


\begin{Example}
Consider the problem of minimizing
$
J(y)=\int_{-1}^1y^2(x)(y'(x...
...ve is piecewise $\mathcal C^1$\ with a corner point at $x=0$.
\qed\end{Example}

Suppose that a piecewise $ \mathcal C^1$ curve $ y$ is a strong extremum for the Basic Calculus of Variations Problem, under the same assumptions as in Section 2.3. Clearly, $ y$ is then also a weak extremum, with respect to the generalized 1-norm

$\displaystyle \Vert y\Vert _1:=\max_{a\le x\le b}\vert y(x)\vert+\max_{a\le x\le b}\max\{\vert y'(x^-)\vert,\vert y'(x^+)\vert\}.$ (3.1)

As we stated in Section 2.3.3 (see in particular footnote 2 there), such an extremum must satisfy the integral form (2.23) of the Euler-Lagrange equation almost everywhere, i.e., at all noncorner points. Extending our previous terminology to the present setting, we will refer to piecewise $ \mathcal C^1$ solutions of (2.23) as extremals (sometimes extremals with corner points are also called broken extremals). We now want to investigate what additional conditions must hold at corner points in order for $ y$ to be a strong extremum. We give a direct analysis below; later we will mention an alternative way of deriving these conditions (see Exercise 3.3).

For simplicity, we assume that $ y$ has only one (unspecified) corner point $ c\in[a,b]$ . As a generalization of (2.10), we will let two separate perturbations $ \eta_1$ and $ \eta_2$ act on the two portions of $ y$ (before and after the corner point). To make this construction precise, denote these two portions by $ y_1: [a,c]\to\mathbb{R}$ and $ y_2: [c,b]\to\mathbb{R}$ ; their perturbed versions will then be $ y_1+\alpha\eta_1$ and $ y_2+\alpha\eta_2$ . Clearly, we must have $ \eta_1(a)=\eta_2(b)=0$ to preserve the endpoints. Furthermore, since the location of the corner point is not fixed, we should allow the corner point of the perturbed curve to deviate from $ c$ . Let this new corner point be $ c+\alpha{\scriptstyle\Delta}x$ for some $ {\scriptstyle\Delta}x\in\mathbb{R}$ , with the same $ \alpha$ as before for convenience. Our family of perturbed curves (parameterized by $ \alpha$ ) is thus determined by the two curves $ \eta_1$ , $ \eta_2$ and one real number $ {\scriptstyle\Delta}x$ . We label these new curves as $ y(\cdot,\alpha)$ , with $ y(\cdot,0)=y$ . There will be an additional condition on $ \eta_1$ and $ \eta_2$ to guarantee that $ y(\cdot,\alpha)$ is continuous for each $ \alpha$ ; see (3.4) below. We take both $ \eta_1$ and $ \eta_2$ to be $ \mathcal C^1$ , to ensure that $ y(\cdot,\alpha)$ is piecewise $ \mathcal C^1$ with a single corner point at $ c+\alpha{\scriptstyle\Delta}x$ . (Note that the difference $ y(\cdot,\alpha)-y$ is piecewise $ \mathcal C^1$ with two corner points, one at $ c$ and the other at $ c+\alpha{\scriptstyle\Delta}x$ .) Figure 3.2 should help visualize this situation and the argument that follows.

Figure 3.2: A perturbation of an extremal with a corner
\includegraphics{figures/W-E.eps}

The reader might have noticed a small problem which we need to fix before proceeding. The domain of $ y_1$ is $ [a,c]$ , whereas we want the domain of $ y_1+\alpha\eta_1$ to be $ [a,c+\alpha{\scriptstyle\Delta}x]$ . For $ \alpha{\scriptstyle\Delta}x>0$ , such a perturbed curve is ill defined. To deal with this issue, let us agree to extend $ y_1$ beyond $ c$ via linear continuation: define $ y_1$ for $ x>c$ by $ y_1(x):=y(c)+y'(c^-)(x-c)$ . The linearity is actually not crucial; all we need is that the extended function $ y_1$ be $ \mathcal C^1$ at $ x=c$ , with

$\displaystyle y_1(c)=y(c),\qquad y_1'(c)=y'(c^-).$ (3.2)

If the perturbation $ \eta_1$ is also defined on an interval extending to the right of $ c$ , then the earlier construction makes sense (at least for $ \alpha$ close enough to 0). Of course we need to make a similar modification to $ y_2$ , extending it linearly to the left of $ c$ .

Let us write the functional to be minimized as a sum of two components:

$\displaystyle J(y)=\int_a^b L(x,y(x),y'(x))dx=\int_a^c L(x,y_1(x),y_1'(x))dx+\int_c^b L(x,y_2(x),y_2'(x))dx=:J_1(y_1)+J_2(y_2).
$

After the perturbation, the first functional becomes

$\displaystyle J_1(y_1+\alpha\eta_1)=\int_a^{c+\alpha{\scriptstyle\Delta}
x}L(x,y_1+\alpha\eta_1,y_1'+\alpha\eta_1')dx
$

(note that $ {\scriptstyle\Delta}x$ should also be an argument on the left-hand side, but we omit it for simplicity). We can now compute the corresponding first variation:

$\displaystyle \left.\delta J_1\right\vert _{y_1}(\eta_1)$ $\displaystyle =\left.\frac d{d\alpha}\right\vert _{\alpha=0} J_1(y_1+\alpha\eta_1)$    
  $\displaystyle =\int_a^c\big({L}_{y}(x,y_1(x),y_1'(x))\eta_1(x) +{L}_{y'}(x,y_1(x),y_1'(x))\eta_1'(x)\big)dx+L(c,y_1(c),y_1'(c)){\scriptstyle\Delta}x.$    
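
The extra term $ L(c,y_1(c),y_1'(c)){\scriptstyle\Delta}x$ appears because the upper limit of integration depends on $ \alpha$ . As a sketch of this step, with $ F$ denoting the perturbed integrand (a shorthand introduced here only for illustration), differentiation under the integral sign with a variable upper limit gives:

```latex
\begin{align*}
F(x,\alpha) &:= L\big(x,\,y_1(x)+\alpha\eta_1(x),\,y_1'(x)+\alpha\eta_1'(x)\big),\\
\frac{d}{d\alpha}\int_a^{c+\alpha{\scriptstyle\Delta}x} F(x,\alpha)\,dx
  &= \int_a^{c+\alpha{\scriptstyle\Delta}x}\frac{\partial F}{\partial\alpha}(x,\alpha)\,dx
     + F(c+\alpha{\scriptstyle\Delta}x,\alpha)\,{\scriptstyle\Delta}x,\\
\frac{\partial F}{\partial\alpha}(x,0)
  &= L_y(x,y_1(x),y_1'(x))\,\eta_1(x)+L_{y'}(x,y_1(x),y_1'(x))\,\eta_1'(x),\\
F(c,0) &= L(c,y_1(c),y_1'(c)).
\end{align*}
```

Setting $ \alpha=0$ in the second line and substituting the last two lines reproduces the expression above.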

Applying integration by parts and recalling (3.2) and the constraint $ \eta_1(a)=0$ , we can bring the above expression to the form

$\displaystyle \left.\delta J_1\right\vert _{y_1}(\eta_1)$ $\displaystyle =\int_a^c\Big({L}_{y}(x,y_1(x),y_1'(x))- \frac d{dx} {L}_{y'}(x,y_1(x),y_1'(x))\Big)\eta_1(x)dx$    
  $\displaystyle +{L}_{y'}(c,y(c),y'(c^-)){\eta_1}(c)+L(c,y(c),y'(c^-)){\scriptstyle\Delta}x.$    
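
For completeness, here is a sketch of the integration by parts behind this formula. It assumes, as in the derivation of the Euler-Lagrange equation in Chapter 2, that $ {L}_{y'}$ evaluated along $ y_1$ is differentiable with respect to $ x$ on $ [a,c]$ :

```latex
\begin{align*}
\int_a^c L_{y'}(x,y_1(x),y_1'(x))\,\eta_1'(x)\,dx
  &= \Big[L_{y'}(x,y_1(x),y_1'(x))\,\eta_1(x)\Big]_a^c
     -\int_a^c \frac{d}{dx}L_{y'}(x,y_1(x),y_1'(x))\,\eta_1(x)\,dx,\\
\Big[L_{y'}(x,y_1(x),y_1'(x))\,\eta_1(x)\Big]_a^c
  &= L_{y'}(c,y_1(c),y_1'(c))\,\eta_1(c)-0
   = L_{y'}(c,y(c),y'(c^-))\,\eta_1(c).
\end{align*}
```

The lower boundary term drops out because $ \eta_1(a)=0$ , and the last equality uses (3.2).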

Similarly, for the second functional we have

$\displaystyle J_2(y_2+\alpha\eta_2)=\int_{c+\alpha{\scriptstyle\Delta}x}^b
L(x,y_2+\alpha\eta_2,y_2'+\alpha\eta _2')dx
$

and the first variation of $ J_2$ at $ y_2$ is

$\displaystyle \left.\delta J_2\right\vert _{y_2}(\eta_2)$ $\displaystyle =\int_c^b\Big({L}_{y}(x,y_2(x),y_2'(x))- \frac d{dx}{L}_{y'}(x,y_2(x),y_2'(x))\Big)\eta_2(x)dx$    
  $\displaystyle -{L}_{y'}(c,y(c),y'(c^+))\eta_2(c) -L(c,y(c),y'(c^+)){\scriptstyle\Delta}x.$    
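
The two minus signs in the boundary terms arise because $ c$ is now the lower limit of integration. A sketch of the two steps, with $ F_2$ denoting the perturbed integrand (a shorthand introduced here only for illustration):

```latex
\begin{align*}
F_2(x,\alpha) &:= L\big(x,\,y_2(x)+\alpha\eta_2(x),\,y_2'(x)+\alpha\eta_2'(x)\big),\\
\left.\frac{d}{d\alpha}\right|_{\alpha=0}\int_{c+\alpha{\scriptstyle\Delta}x}^b F_2(x,\alpha)\,dx
  &= \int_c^b\frac{\partial F_2}{\partial\alpha}(x,0)\,dx - F_2(c,0)\,{\scriptstyle\Delta}x,
  \qquad F_2(c,0)=L(c,y(c),y'(c^+)),\\
\Big[L_{y'}(x,y_2(x),y_2'(x))\,\eta_2(x)\Big]_c^b
  &= 0 - L_{y'}(c,y_2(c),y_2'(c))\,\eta_2(c)
   = -L_{y'}(c,y(c),y'(c^+))\,\eta_2(c).
\end{align*}
```

Here the upper boundary term vanishes because $ \eta_2(b)=0$ , and we used $ y_2(c)=y(c)$ and $ y_2'(c)=y'(c^+)$ .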

For $ \alpha$ close to 0, the perturbed curve $ y(\cdot,\alpha)$ is close to the original curve $ y$ in the sense of the 0-norm. Therefore, the function $ \alpha\mapsto J(y(\cdot,\alpha))$ must attain a minimum at $ \alpha=0$ , implying that

$\displaystyle 0=\left.\frac d{d\alpha}\right\vert _{\alpha=0} J(y(\cdot,\alpha))=\left.\delta J_1\right\vert _{y_1}(\eta_1)+\left.\delta J_2\right\vert _{y_2}(\eta_2).
$

Next, observe that each of the two portions $ y_i$ , $ i=1,2$ of the optimal curve $ y$ must be an extremal of the corresponding functional $ J_i$ . Indeed, this becomes clear if we consider the special case when the perturbation $ \eta_i$ vanishes at $ c$ and $ {\scriptstyle\Delta}x=0$ . Therefore, the integrals in the preceding expressions for $ \left.\delta J_1\right\vert _{y_1}(\eta_1)$ and $ \left.\delta J_2\right\vert _{y_2}(\eta_2)$ should both vanish, and we are left with the condition

$\displaystyle {L}_{y'}(c,y(c),y'(c^-)){\eta_1}(c) -{L}_{y'}(c,y(c),y'(c^+)){\eta_2}(c) +L(c,y(c),y'(c^-)){\scriptstyle\Delta}x -L(c,y(c),y'(c^+)){\scriptstyle\Delta}x=0.$ (3.3)

Now we need to take into account the fact that the two perturbations $ \eta_1$ and $ \eta_2$ are not independent: they have to be such that the perturbed curve remains continuous at $ x=c+\alpha {\scriptstyle\Delta}x$ . This provides the additional relation

$\displaystyle y_1(c+\alpha {\scriptstyle\Delta}x)+\alpha\eta_1(c+\alpha {\scriptstyle\Delta}x)=y_2(c+\alpha {\scriptstyle\Delta}x)+\alpha\eta_2(c+\alpha {\scriptstyle\Delta}x)=:y(c)+\alpha{\scriptstyle\Delta}y+o(\alpha).$ (3.4)

The quantity $ {\scriptstyle\Delta}y$ describes the first-order (in $ \alpha$ ) vertical displacement of the corner point, in much the same sense that $ {\scriptstyle\Delta}x$ describes the first-order horizontal displacement; $ {\scriptstyle\Delta}y$ and $ {\scriptstyle\Delta}x$ are independent of each other. Equating the first-order terms with respect to $ \alpha$ in (3.4) and using the second equality in (3.2) along with its counterpart $ y_2'(c)=y'(c^+)$ , we obtain

$\displaystyle y'(c^-){\scriptstyle\Delta}x+\eta_1(c)=y'(c^+){\scriptstyle\Delta}x+\eta_2(c)={\scriptstyle\Delta}y.$ (3.5)
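
The first-order matching that leads to (3.5) can be spelled out as follows. This is only a sketch of the Taylor expansions involved; it relies on $ y_1$ and $ y_2$ being $ \mathcal C^1$ at $ c$ after the extension discussed earlier and on $ \eta_1$ , $ \eta_2$ being continuous:

```latex
\begin{align*}
y_1(c+\alpha{\scriptstyle\Delta}x) &= y(c)+\alpha\,y'(c^-)\,{\scriptstyle\Delta}x+o(\alpha),
  &\alpha\,\eta_1(c+\alpha{\scriptstyle\Delta}x) &= \alpha\,\eta_1(c)+o(\alpha),\\
y_2(c+\alpha{\scriptstyle\Delta}x) &= y(c)+\alpha\,y'(c^+)\,{\scriptstyle\Delta}x+o(\alpha),
  &\alpha\,\eta_2(c+\alpha{\scriptstyle\Delta}x) &= \alpha\,\eta_2(c)+o(\alpha).
\end{align*}
```

Adding the two entries in each row and comparing the coefficient of $ \alpha$ with that of $ y(c)+\alpha{\scriptstyle\Delta}y+o(\alpha)$ gives the two expressions for $ {\scriptstyle\Delta}y$ .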

Using (3.5), we can eliminate $ \eta_1(c)$ and $ \eta_2(c)$ from (3.3) and rewrite that formula in terms of $ {\scriptstyle\Delta}y$ and $ {\scriptstyle\Delta}x$ as follows:

  $\displaystyle \big({L}_{y'}(c,y(c),y'(c^-)) -{L}_{y'}(c,y(c),y'(c^+))\big){\scriptstyle\Delta}y$    
$\displaystyle -$ $\displaystyle \Big(\big({L}_{y'}(c,y(c),y'(c^-))y'(c^-)-L(c,y(c),y'(c^-))\big)- \big({L}_{y'}(c,y(c),y'(c^+))y'(c^+) -L(c,y(c),y'(c^+))\big) \Big){\scriptstyle\Delta}x$    
$\displaystyle =$ $\displaystyle -\left.{L}_{y'}(x,y(x),y'(x))\right\vert _{c^-}^{c^+}{\scriptstyle\Delta}y +\left.\Big({L}_{y'}(x,y(x),y'(x))y'(x) -L(x,y(x),y'(x))\Big)\right\vert _{c^-}^{c^+}{\scriptstyle\Delta}x =0.$    

Since $ {\scriptstyle\Delta}x$ and $ {\scriptstyle\Delta}y$ are independent and arbitrary, we conclude that the terms multiplying them must be 0. This means that $ {L}_{ y'}$ and $ y'{L}_{y'}-L$ are in fact continuous at $ x=c$ .

The above reasoning can be extended to multiple corner points, yielding the necessary conditions for optimality known as the Weierstrass-Erdmann corner conditions: If a curve $ y$ is a strong extremum, then $ {L}_{ y'}$ and $ y'{L}_{y'}-L$ must be continuous at each corner point of $ y$ . More precisely, their discontinuities (due to the fact that $ y'$ does not exist at corner points) must be removable. The quantities $ {L}_{ y'}$ and $ y'{L}_{y'}-L$ are of course familiar to us from Chapter 2; they are, respectively, the momentum and the Hamiltonian. Weierstrass presented these conditions in 1865 during his lectures on calculus of variations, but never formally published them. They were independently derived and published by Erdmann in 1877.
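
As a quick illustration of how the corner conditions can be used, consider the Lagrangian $ L=(y')^2$ . This choice is ours, made only for illustration, and is not taken from the examples or exercises of this section. The two quantities that must be continuous at a corner point $ c$ are:

```latex
\begin{align*}
L_{y'} &= 2y', & y'L_{y'}-L &= 2(y')^2-(y')^2=(y')^2,\\
L_{y'}\big|_{c^-}^{c^+} &= 2\big(y'(c^+)-y'(c^-)\big)=0
  &&\Longrightarrow\quad y'(c^+)=y'(c^-).
\end{align*}
```

The first condition alone forces the one-sided derivatives to agree, so for this Lagrangian an extremum cannot have corner points. The same argument applies whenever $ {L}_{y'}$ is strictly monotone in $ y'$ , for instance when $ {L}_{y'y'}>0$ everywhere.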


\begin{Exercise}
Let $y$\ be a weak extremum (with~\eqref{e-1norm-gen} serving a...
...'}$) can still be
established by suitably specializing the proof.
\end{Exercise}

For the case of a single corner point, the two Weierstrass-Erdmann corner conditions together with the two boundary conditions provide four relations, which is the correct number to uniquely specify two portions of the extremal (each satisfying the second-order Euler-Lagrange differential equation). In general, to uniquely specify an extremal consisting of $ m$ portions (i.e., having $ m-1$ corner points) we need $ 2m$ conditions, and these are provided by $ 2(m-1)$ corner conditions plus the two boundary conditions.


\begin{Exercise}
Consider the problem of minimizing
$
J(y)=\int_{-1}^1(y'(x))^3d...
...or each one that does, check if it
is a minimum (weak or strong).
\end{Exercise}


Daniel 2010-12-20