
## 5.1.5 Historical remarks

The HJB partial differential equation has its origins in the work of Hamilton, with subsequent improvements by Jacobi, carried out in the context of calculus of variations in the late 1830s. At that time the equation served as a necessary condition for optimality. Its use as a sufficient condition, still in the calculus of variations setting, was proposed in the work of Carathéodory, begun in the 1920s and culminating in his book published in 1935. (He established local optimality by working in a neighborhood of a test curve.) Carathéodory's approach became known as the "royal road" of the calculus of variations.

The principle of optimality seems, in hindsight, an almost trivial observation; it actually dates all the way back to Jacob Bernoulli's 1697 solution of the brachistochrone problem. In the early 1950s, slightly before Bellman, the principle of optimality was formalized in the context of differential games by Isaacs, who called it the "tenet of transition." (The fundamental PDE of game theory bears Isaacs' initial alongside those of Hamilton, Jacobi, and Bellman.) The term "dynamic programming" was coined by Bellman, who published a series of papers and the book [Bel57] on this subject in the 1950s. Bellman's contribution was to recognize the power of the method for studying value functions globally, and to use it to solve a variety of calculus of variations and optimal control problems.

It is not clear if Bellman was aware of the close connection between his work and the Hamilton-Jacobi equation of calculus of variations. This connection was explicitly made in the early 1960s by Kalman, who was apparently the first to use the name "HJB equation." Kalman's derivation of sufficient conditions for optimal control, combining the ideas of Carathéodory and Bellman, provided the basis for the treatment given here and in other modern sources. (The work of Kalman will also be prominently featured in the next chapter, where we discuss linear systems and quadratic costs.)

Quite remarkably, the maximum principle was being developed in the Soviet Union independently around the same time as Bellman's and Kalman's work on dynamic programming was appearing in the United States. We thus find it natural at this point to compare the maximum principle with the HJB equation and discuss the relationship between the two approaches.

Daniel 2010-12-20