1.2.1.1 First-order necessary condition for optimality

Suppose that $f$ is a $\mathcal C^1$ (continuously differentiable) function and $x^*$ is a local minimum of $f$. Pick an arbitrary vector $d\in\mathbb{R}^n$. Since we are in the unconstrained case, moving away from $x^*$ in the direction of $d$ or $-d$ cannot immediately take us outside $D$. In other words, we have $x^*+\alpha d\in D$ for all $\alpha\in\mathbb{R}$ close enough to 0.

For a fixed $ d$ , we can consider $ f(x^*+\alpha d)$ as a function of the real parameter $ \alpha$ , whose domain is some interval containing 0. Let us call this new function $ g$ :

$\displaystyle g(\alpha):=f(x^*+\alpha d).$ (1.4)
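For example, if $f(x)=x_1^2+2x_2^2$ and $x^*=(0,0)^T$, then for any direction $d=(d_1,d_2)^T$ we obtain

$\displaystyle g(\alpha)=f(x^*+\alpha d)=\alpha^2(d_1^2+2d_2^2),$

an ordinary scalar function of $\alpha$.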

Since $x^*$ is a local minimum of $f$, it is clear that $\alpha=0$ is a local minimum of $g$. Passing from $f$ to $g$ is useful because $g$ is a function of a scalar variable and so its minima can be studied using ordinary calculus. In particular, we can write down the first-order Taylor expansion for $g$ around $\alpha=0$:

$\displaystyle g(\alpha)=g(0)+g'(0)\alpha+o(\alpha)$ (1.5)

where $o(\alpha)$ represents ``higher-order terms'' which go to 0 faster than $\alpha$ as $\alpha$ approaches 0, i.e.,

$\displaystyle \lim_{\alpha\to 0}\frac{o(\alpha)}{\alpha}=0.$ (1.6)
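For example, any term bounded by a constant multiple of $\alpha^2$ qualifies as $o(\alpha)$, since then $\vert o(\alpha)/\alpha\vert\le C\vert\alpha\vert\to 0$ as $\alpha\to 0$.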

We claim that

$\displaystyle g'(0)= 0.$ (1.7)

To show this, suppose that $g'(0)\ne 0$. Then, in view of (1.6), there exists an $\varepsilon >0$ small enough so that for all nonzero $\alpha$ with $\vert\alpha\vert<\varepsilon$, the absolute value of the fraction in (1.6) is less than $\vert g'(0)\vert$. We can write this as

$\displaystyle 0<\vert\alpha\vert<\varepsilon \quad\Rightarrow\quad \vert o(\alpha)\vert< \vert g'(0)\alpha\vert.$

For these values of $ \alpha$ , (1.5) gives

$\displaystyle g(\alpha)-g(0)<g'(0)\alpha+\vert g'(0)\alpha\vert.$ (1.8)

If we further restrict $\alpha$ to have the sign opposite to that of $g'(0)$, then $g'(0)\alpha=-\vert g'(0)\alpha\vert$, so the right-hand side of (1.8) becomes 0 and we obtain $g(\alpha)-g(0)<0$. But this contradicts the fact that $g$ has a minimum at 0. We have thus shown that (1.7) is indeed true.
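To see the contradiction on a concrete function, take $g(\alpha)=\alpha+\alpha^2$, so that $g(0)=0$, $g'(0)=1\ne 0$, and $o(\alpha)=\alpha^2$. For every $\alpha\in(-1,0)$ we have

$\displaystyle g(\alpha)=\alpha(1+\alpha)<0=g(0),$

so 0 is indeed not a minimum of $g$, exactly as the argument above predicts.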

We now need to re-express this result in terms of the original function $ f$ . A simple application of the chain rule from vector calculus yields the formula

$\displaystyle g'(\alpha)=\nabla f(x^*+\alpha d)\cdot d$ (1.9)

where

$\displaystyle \nabla f:=\left({f}_{x_1},\dots,{f}_{x_n}\right)^T$

is the gradient of $f$ and $\cdot$ denotes the inner product. Whenever there is no danger of confusion, we use subscripts as a shorthand notation for partial derivatives: ${f}_{x_i}:={\partial f}/{\partial x_i}$. Setting $\alpha=0$ in (1.9), we have

$\displaystyle g'(0)=\nabla f(x^*)\cdot d$ (1.10)

and this equals 0 by (1.7). Since $ d$ was arbitrary, we conclude that

$\displaystyle \fbox{$\nabla f(x^*)=0$}$ (1.11)

This is the first-order necessary condition for optimality.
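Note that the condition is necessary but not sufficient: for example, $f(x)=x^3$ on $\mathbb{R}$ satisfies $f'(0)=0$, yet 0 is neither a local minimum nor a local maximum.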

A point $x^*$ satisfying this condition is called a stationary point. The condition is ``first-order'' because it is derived using the first-order expansion (1.5). We emphasize that the result is valid when $f\in\mathcal C^1$ and $x^*$ is an interior point of $D$.
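A quick numerical sanity check can make these formulas concrete. The short Python sketch below is an illustration only: the quadratic $f$ and the direction $d$ are arbitrary choices, not taken from the text. It compares the chain-rule formula (1.9) with a finite-difference approximation of $g'$, and confirms that the gradient vanishes at the minimizer, as required by (1.11).

    # A small numerical check of (1.9) and (1.11) on a hand-picked quadratic;
    # the function f and the direction d below are arbitrary illustrative choices.
    import numpy as np

    def f(x):
        # f(x) = (x1 - 1)^2 + 3 (x2 + 2)^2, whose unique minimum is x* = (1, -2)
        return (x[0] - 1.0)**2 + 3.0*(x[1] + 2.0)**2

    def grad_f(x):
        # gradient computed by hand: (2(x1 - 1), 6(x2 + 2))^T
        return np.array([2.0*(x[0] - 1.0), 6.0*(x[1] + 2.0)])

    x_star = np.array([1.0, -2.0])   # the known unconstrained minimizer
    d = np.array([0.7, -0.3])        # an arbitrary direction

    # Check (1.9): g'(alpha) = grad f(x* + alpha d) . d, via a central difference.
    g = lambda a: f(x_star + a*d)
    alpha, h = 0.5, 1e-6
    g_prime_numeric = (g(alpha + h) - g(alpha - h)) / (2.0*h)
    g_prime_formula = grad_f(x_star + alpha*d) @ d
    print(g_prime_numeric, g_prime_formula)   # both equal 0.76 up to rounding

    # Check (1.11): the gradient vanishes at the minimizer.
    print(grad_f(x_star))                     # [0. 0.]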

