Introduction to the Calculus of Variations
Ebook, 795 pages, 6 hours


About this ebook

". . . eminently suitable as a text for an introductory course: the style is pleasant; the prerequisites are kept to a minimum . . . and the pace of the development is appropriate for most students at the senior or first year graduate level." — American Mathematical Monthly
The purpose of this text is to lay a broad foundation for an understanding of the problems of the calculus of variations and its many methods and techniques, and to prepare readers for the study of modern optimal control theory. The treatment is limited to a thorough discussion of single-integral problems in one or more unknown functions, where the integral is employed in the riemannian sense.
The first three chapters deal with variational problems without constraints. Chapter 4 is a self-contained treatment of the homogeneous problem in the two-dimensional plane. In Chapter 5, the minimum principle of Pontryagin as it applies to optimal control problems of nonpredetermined duration, where the state variables satisfy an autonomous system of first-order equations, is developed to the extent possible by classical means within the general framework of the Hamilton-Jacobi theory. Chapter 6 is devoted to a derivation of the multiplier rule for the problem of Mayer with fixed and variable endpoints and its application to the problem of Lagrange and the isoperimetric problem. In the last chapter, Legendre's necessary condition for a weak relative minimum and a sufficient condition for a weak relative minimum are derived within the framework of the theory of the second variation.
This book, which includes many strategically placed problems and over 400 exercises, is directed to advanced undergraduate and graduate students with a background in advanced calculus and intermediate differential equations, and is adaptable to either a one- or two-semester course on the subject.
Language: English
Release date: Apr 26, 2012
ISBN: 9780486138022

    Book preview

    Introduction to the Calculus of Variations - Hans Sagan


    CHAPTER 1

    EXTREME VALUES OF FUNCTIONALS

    1.1  Introduction

    1.2  Functionals

    1.3  Necessary Conditions for Relative Extreme Values of Real-valued Functions of One Real Variable

    1.4  Normed Linear Spaces

    1.5  The Gâteaux Variation of a Functional

    1.6  The Space of Admissible Variations

    1.7  First Necessary Condition for a Relative Minimum of a Functional

    1.8  The Second Gâteaux Variation and a Second Necessary Condition for a Relative Minimum of a Functional

    Brief Summary

    APPENDIX

    A1.9  Relative Extreme Values of Real-valued Functions of n Real Variables

    1.1  INTRODUCTION

    The calculus of variations is a mathematical discipline that may best be described as a general theory of extreme values. The name of this discipline does not derive from the type of problems it is concerned with but rather from a specific technique—the technique of variation, which will be discussed in Chaps. 1 and 2—that is employed to obtain certain necessary conditions for the existence of extreme values.

    To give the reader some indication of what variational problems are all about, we shall list and discuss a variety of such problems and then extract their essence to arrive at a fairly general formulation, which we shall then subject to mathematical analysis.

    A. THE PROBLEM OF THE BRACHISTOCHRONE

    There is hardly a book written on the subject of the calculus of variations that does not use this problem as a takeoff point, and we shall make no exception. The problem deserves merit not only because it was historically the first variational problem to be formulated mathematically but also because it serves well to illustrate the scope of applicability of this discipline to problems of an applied nature.

    Let Pa(a,0) and Pb(b,yb), with a < b and yb > 0, be two points in a vertical plane, the y axis pointing downward (see Fig. 1.1). The problem is to find among all curves with a continuous derivative that join the point Pa to the point Pb the one along which a mass point under the influence of gravity will slide from Pa to Pb without friction in the shortest possible time.

    Figure 1.1

    Before making an attempt to formulate this problem mathematically, let us emphasize a point that is of paramount importance: First, we consider all curves with a continuous derivative that join the given points Pa and Pb. Then we assign to each curve a number, namely, the amount of time it takes the mass point to slide from Pa to Pb. (All those curves along which the mass point will never get to Pb we either rule out from our consideration or else assign the number ∞.) Then, from among all these numbers (sliding times), we pick the smallest one—provided it exists—and call the curve that this smallest number is associated with the solution of our problem.

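    Now, let us move on to a mathematical formulation of the problem. Let y = y(x) be a function with a continuous derivative that joins the point Pa to the point Pb:

    $$y(a) = 0, \qquad y(b) = y_b.$$

    If s represents the distance on y = y(x) measured from Pa, we have

    $$\frac{ds}{dt} = \upsilon,$$

    where t represents the time and υ the velocity of the mass point. Since the motion is, by assumption, frictionless with initial velocity 0 and is influenced only by the gravitational force, we have from the principle of the conservation of energy

    $$\tfrac{1}{2} m \upsilon^2 = m g y,$$

    where m is the mass of the point, g the gravitational acceleration, and y the vertical distance from the initial level. This yields

    $$\frac{ds}{dt} = \sqrt{2 g y}.$$

    Hence

    $$dt = \frac{ds}{\sqrt{2 g y}} = \frac{\sqrt{1 + y'^2}}{\sqrt{2 g y}}\, dx.$$

    So it would appear that the sliding time along y = y(x) from Pa to Pb is given by

    $$T = \int_a^b \frac{\sqrt{1 + y'^2}}{\sqrt{2 g y}}\, dx.$$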

    This definite integral is to be rendered a minimum by a suitable choice of y = y(x) within the class of functions with a continuous derivative which pass through the prescribed points.

    As the solution of this problem, we obtain an arc of a cycloid (see Prob. 2.6.1 and Sec. 4.10). The problem was first proposed by Johannes Bernoulli (1667–1748) in 1696 and was solved by a method which, though ingenious, lacks mathematical rigor and applicability to more general problems of a similar type.†
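    As a rough numerical illustration of the statement that each curve is assigned a number, the following sketch evaluates the sliding-time integral for a straight line and compares it with the well-known closed-form sliding time along a cycloid. The value of g, the endpoint (π, 2), the function name sliding_time, and the substitution x = a + u² used to tame the singularity at the starting point are all assumptions made for this example, not part of the text.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def sliding_time(y, dy, a, b, n=2000):
    """Approximate T[y] = integral_a^b sqrt(1 + y'(x)^2) / sqrt(2 g y(x)) dx.

    y is measured downward from the starting level, so y(a) = 0 and the
    integrand behaves like 1/sqrt(x - a) when y'(a) is finite and nonzero.
    The substitution x = a + u^2 removes that singularity; the midpoint
    rule is then applied in the variable u.
    """
    u_max = math.sqrt(b - a)
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        x = a + u * u
        integrand = math.sqrt(1.0 + dy(x) ** 2) / math.sqrt(2.0 * G * y(x))
        total += integrand * 2.0 * u * h  # dx = 2u du
    return total


# Endpoint chosen so that half an arch of the cycloid with rolling radius 1
# (x = t - sin t, y = 1 - cos t, 0 <= t <= pi) joins (0, 0) to (pi, 2).
b, y_b = math.pi, 2.0
slope = y_b / b

t_line = sliding_time(lambda x: slope * x, lambda x: slope, 0.0, b)
t_cycloid = math.pi * math.sqrt(1.0 / G)  # closed form: t_end * sqrt(r / g)

print(f"straight line: {t_line:.4f} s")
print(f"cycloid:       {t_cycloid:.4f} s")
```

    For these endpoints the straight line takes roughly 1.19 s and the cycloid roughly 1.00 s, so the shortest path is not the fastest one.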

    B. THE PROBLEM OF MINIMAL SURFACES OF REVOLUTION

    Given two points Pa(a,ya) and Pb(b,yb), a < b, in the plane. These two points are to be joined by a curve y = y(x) with a continuous derivative in such a manner that the surface which is generated by rotation of this curve about the x axis has the smallest possible area (see Fig. 1.2).

    Again, we realize that the formulation of this problem establishes a certain relationship between curves (functions) of a certain class and a subset of the set of real numbers, the smallest of which is to be found, provided that it exists.

    The mathematical formulation of this problem poses no difficulties whatever. If S denotes the area of a surface generated by rotation of y = y(x) about the x axis, where y(x) ≥ 0 for a ≤ x ≤ b, we have

    $$S = 2\pi \int_a^b y \sqrt{1 + y'^2}\, dx,$$

    which is to be rendered a minimum by an appropriate choice of y = y(x).

    Figure 1.2

    The solution of this problem is, if the position of Pa relative to Pb satisfies certain additional conditions, a catenoid, that is, a surface of revolution that is generated by a catenary (see Sec. 2.6).†
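    To make the comparison of competing curves concrete, the sketch below evaluates the standard area integral S = 2π ∫ y √(1 + y′²) dx for two curves joining the same endpoints: the catenary y = cosh x on [−1, 1] and the horizontal line through the same end heights. The interval, the two curves, and the helper name surface_area are choices made for this example only; the sketch compares two candidates and does not prove minimality.

```python
import math


def surface_area(y, dy, a, b, n=20000):
    """Midpoint-rule approximation of S[y] = 2*pi * integral_a^b y sqrt(1 + y'^2) dx."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += y(x) * math.sqrt(1.0 + dy(x) ** 2)
    return 2.0 * math.pi * total * h


a, b = -1.0, 1.0
height = math.cosh(1.0)  # common endpoint height y(a) = y(b)

# Surface generated by the catenary y = cosh x (a catenoid).
area_catenoid = surface_area(math.cosh, math.sinh, a, b)
# Surface generated by the horizontal line y = cosh 1 (a cylinder).
area_cylinder = surface_area(lambda x: height, lambda x: 0.0, a, b)

print(f"catenoid: {area_catenoid:.4f}")
print(f"cylinder: {area_cylinder:.4f}")
```

    The catenoid comes out with the smaller area (about 17.68 against about 19.39 for the cylinder).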

    An obvious generalization of this problem is the following: Given a closed Jordan curve in space. To be found is that surface which passes through the given curve and has the smallest possible area. This problem is known as the problem of Plateau.

    The attentive reader cannot possibly fail to observe the striking similarity between problems A and B. In both cases, we look for a curve with a continuous derivative which satisfies the boundary conditions

    $$y(a) = y_a, \qquad y(b) = y_b$$

    and yields a minimum for an integral of the type

    $$I[y] = \int_a^b f(x, y(x), y'(x))\, dx,$$

    where f is some given function of the three variables x, y, y′. Such a problem is often known as the simplest problem of the calculus of variations.

    The problems which we shall discuss below will not fit into this narrow category, but they are nevertheless bona fide variational problems.

    C. THE SIMPLEST ISOPERIMETRIC PROBLEM

    This problem, too, is a classic. (Its formulation and solution—by pure intuition—are credited to Queen Dido of Carthage, about 850 B.C.) Among all curves with a continuous derivative that join two given points Pa and Pb and have the given length L, the one is to be found that encompasses the largest possible area, where the x axis and the vertical lines x = a and x = b serve as the supplementary boundary.

    The mathematical formulation of this problem is fairly obvious: Among all curves y = y(x) for which

    $$y(a) = y_a, \qquad y(b) = y_b, \qquad \int_a^b \sqrt{1 + y'^2}\, dx = L,$$

    find the one for which

    $$\int_a^b y\, dx$$

    yields the largest possible value. A circular arc turns out to be the solution of this problem. (See Secs. 6.5 and 6.6.)

    A more general formulation of this problem is the following: Among all possible simple closed curves of a given perimeter (hence the name isoperimetric problem), find the one that encompasses the largest possible area, or equivalently, among all simple closed curves that encompass an area of given magnitude, find the one of shortest perimeter. The circle is the solution in either case.
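    The closed-curve version of the statement can be checked on a small family of competitors. The sketch below, which is only an illustration and not a proof, computes the area of regular polygons of fixed perimeter and compares them with the circle of the same perimeter; the function names and the perimeter value 4 are assumptions of this example.

```python
import math


def regular_polygon_area(n, perimeter):
    """Area of a regular n-gon with the given perimeter."""
    return perimeter ** 2 / (4.0 * n * math.tan(math.pi / n))


def circle_area(perimeter):
    """Area of the circle whose circumference equals the given perimeter."""
    return perimeter ** 2 / (4.0 * math.pi)


L = 4.0
sides = (3, 4, 6, 12, 100)
areas = [regular_polygon_area(n, L) for n in sides]

for n, area in zip(sides, areas):
    print(f"{n:>3}-gon: {area:.5f}")
print(f"circle : {circle_area(L):.5f}")
```

    The polygon areas increase with the number of sides and stay below the area of the circle, in agreement with the claim that the circle encloses the largest area for a given perimeter.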

    Problem C differs from problems A and B in that an additional condition in the form of an integral constraint is imposed on the class of competing functions. The nature of this additional constraint makes the problem less easily accessible to mathematical analysis than problems A and B.

    D. A PROBLEM OF NAVIGATION

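    Given a river with parallel straight banks, b units apart. One of the banks coincides with the y axis, and the velocity of the stream at every point (x,y) is given by

    $$u = 0, \qquad \upsilon = \upsilon(x),$$

    where u and υ denote the components of the stream velocity in the x and y directions.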

    A boat with constant speed c(c² > υ²) in still water is to cross the river in the shortest possible time, using the point (0,0) as point of departure. (See Fig. 1.3.)

    Figure 1.3

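    If α denotes an angle that depends on the course of the boat, then the actual velocity of the boat in the river is given by

    $$\frac{dx}{dt} = c \cos\alpha, \qquad \frac{dy}{dt} = \upsilon + c \sin\alpha.$$

    For the path y = y(x) on which the boat moves, we have

    $$y' = \frac{\upsilon + c \sin\alpha}{c \cos\alpha}.$$

    The time to cross is given by

    $$T = \int_0^b \frac{dx}{c \cos\alpha}.$$

    We have from the equation for y′

    $$\left( y' - \frac{\upsilon}{c \cos\alpha} \right)^2 = \tan^2\alpha = \frac{1}{\cos^2\alpha} - 1.$$

    We solve this equation for 1/(c cos α) in terms of υ, c, y′ and obtain

    $$T = \int_0^b \frac{\sqrt{c^2 (1 + y'^2) - \upsilon^2} - \upsilon y'}{c^2 - \upsilon^2}\, dx,$$

    where υ = υ(x) is a known function of x.

    This integral, which is to be minimized, is of the same type as the integrals in the preceding problems; the class of competing functions, however, is less restrained than before. Only one boundary condition is imposed, namely, the one at the beginning point

    $$y(0) = 0.$$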

    The endpoint is allowed to move freely along the other bank x = b. This is, of course, quite reasonable because different terminal points will, in general, yield different minimal crossing times—and we are, after all, interested in the shortest crossing time, no matter where we might land!

    Because of the special structure of the given velocity field, the solution of the problem will be entirely independent of the choice of the beginning point. However, if we consider a more general velocity field, with u, υ as functions of y also, then different minimal crossing times can be obtained by varying the beginning point also. Further generalizations of this problem are easy to develop. Let the banks of the river be represented by some continuous curves (preferably with a continuous derivative) rather than by straight lines, assume a velocity field of the general type u = u(x,y), υ = υ(x,y), and leave the point of departure and point of arrival unspecified. Then you will have a fairly general problem that falls into the category of free-endpoint problems. (See Sec. 4.6.)

    E. A SIMPLE OPTIMAL CONTROL PROBLEM

    As M. R. Hestenes so aptly remarked,† had Johannes Bernoulli lived in our time, he would probably have formulated his problem as follows: To be found is the path of minimal travel time of a rocket under the influence of gravity and with a thrust force of constant magnitude and variable direction. The rocket is to be fired from a given point with a given initial direction and is to arrive at another given point with a given terminal direction.

    If T denotes the constant magnitude of the thrust force, u(t) its direction as represented by the angle with the positive x axis, and t the time, then we obtain for the equations of motion

    If (a,ya) and (b,yb) represent the coordinates of the given initial and terminal points and if y′a and y′b represent the given initial and terminal slopes, then the boundary conditions can be expressed in terms of the unknown duration t1

    In addition to these conditions, t1 has to be minimized.

    A formulation of this problem that is more easily accessible to analysis may be obtained in terms of the functions

    Then we obtain, instead of the previously listed equations of motion,

    and the new boundary conditions

    The problem consists in finding a suitable function u = u(t) the range of which may or may not be subjected to constraints, so that

    We may view the above system of first-order differential equations as an underdetermined system of four first-order differential equations for the five unknown functions y1, y2, y3, y4, u. A choice of u = u(t) will turn this into a determined system, the solutions of which will, in general, be determined by the initial conditions. If there are functions u = u(t) for which the terminal conditions can also be satisfied, then the problem consists of finding among these functions the one for which t1 assumes the smallest possible value. Since the direction of the thrust force at any time t controls the motion of the rocket, we call it a control function, or simply a control. The control that yields the minimal travel time is called the optimal control.

    Again, we have a problem that fits our general framework: There is a relation between functions (controls) and numbers (duration of process), and that function that yields the shortest duration is to be found. (Although the quantity to be minimized in our formulation of the problem is not expressed as an integral, as on previous occasions, this can easily be remedied by some mathematical trickery—see Prob. 1.1.8.)

    A problem such as this is variously called a Mayer problem, a Lagrange problem, or an optimal control problem. (See Chaps. 5 and 6.)

    PROBLEMS 1.1

    1.  Give examples of curves with a continuous derivative that have to be ruled out from the competition in problem A, even though they satisfy the given boundary conditions.

    2.  Find the parametric representation of a cycloid that is generated by a point on the circumference of a circle of radius a that rolls along the x axis.

    3.  Formulate mathematically: Find a curve of length L joining the points Pa and Pb in the x,y plane of such shape that its center of gravity is as low as possible. Also, impose suitable differentiability conditions on the curve.

    for all x, y, and where the banks are represented by curves with a continuous derivative.

    5.  Formulate mathematically: Given is a closed curve Γ in space with a simple closed curve as projection into the x,y plane. To be found is the surface u = u(x,y) of smallest possible area that possesses Γ as boundary.

    6.  Formulate mathematically: Two given points in the plane are to be joined by a curve of shortest possible length.

    7.  A surface in space is given by a parametric representation. Join two points on this surface by a curve of shortest possible length.

    8.  Consider problem E. Introduce y0 = t with y0(0) = 0, y0(t1) = t1 as a new unknown function, and formulate the problem as a problem of an underdetermined system of five differential equations in six unknown functions with five initial conditions, four terminal conditions, and a minimum condition that is imposed on a definite integral with an unknown upper limit.

    1.2  FUNCTIONALS

    The discussion in the preceding section was, by necessity, rather sketchy. All the problems that were discussed will be dealt with in much greater detail at the appropriate time. For the time being, this superficial survey will have to suffice for the purpose of illustrating the type of problem we shall deal with. The discussion, nevertheless, brought out one main point of immediate interest:

    No matter what the particular trimmings, in each one of these problems a certain class of functions defined by differentiability conditions, boundary conditions, constraints of various types, etc., is considered, and by the formulation of the problem, there is associated with each element of this class a real number. The solution of the problem will be that element of the class of functions which is associated with the smallest (largest) real number—provided such a number exists.

    Now, this certainly sounds familiar. In calculus, we consider real-valued functions of a real variable which are defined on a certain subset of the real line (e.g., closed interval, open interval). With each element of this subset of the reals, the function associates a real number. The theory of extreme values in calculus is concerned with finding that element in the domain in which the function is defined with which the smallest (largest) value of the function is associated. This is actually the problem we are concerned with now, except that the class on which the relationship (function) is defined is not a subset of the set of real numbers but is a specific subset of the set of all functions.

    In other words, instead of a mapping of a subset of the set R of real numbers (domain of the function) onto a subset of R (range of the function), we now consider a mapping of a subset of the set of all functions onto a subset of R. So to speak, we are dealing with functions of functions. Such things are called functionals.

    We give the following definition:

    Definition 1.2 Let S be a set of well-defined elements. If F denotes a mapping of S into R such that to every element f ∈ S there corresponds exactly one real number, then F is called a functional on S.

    Symbolically, we write

    Examples of functionals abound in mathematical analysis. Let C[0,1] denote the set of all real-valued functions that are defined and continuous on [0,1]. Then

    is a functional on C[0,1].

    Let C¹[0,1] denote the set of all functions that are defined and differentiable and have a continuous derivative on [0,1]. Then

    is a functional on C¹[0,1]. If f = f(x,y,y′) is continuous for all x, y, y′, then

    $$I[y] = \int_a^b f(x, y(x), y'(x))\, dx$$

    is a functional on C¹[a,b]. This latter case clearly embraces all the integrals that have been considered in the preceding section wherever the integrands are continuous functions of x, y, y′.

    Functionals do not necessarily have to be of the particular nature given in these preceding examples.

    is also a legitimate functional, and so is

    where k(x) is a given continuous function.
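    In programming terms, a functional is simply a procedure that takes a function as its argument and returns a real number. The sketch below shows three such procedures on C[0,1]; they are hypothetical illustrations chosen for this example and are not necessarily the functionals displayed in the text, and the integral and the maximum are only approximated numerically.

```python
def integral_functional(f, n=10000):
    """F[f] = integral_0^1 f(x) dx, approximated by the midpoint rule."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


def evaluation_functional(f, x0=0.5):
    """F[f] = f(x0): evaluation at a fixed point is also a functional."""
    return f(x0)


def max_functional(f, n=10000):
    """F[f] = max of f on [0,1], approximated on a uniform grid."""
    return max(f(i / n) for i in range(n + 1))


def square(x):
    """The element f(x) = x^2 of C[0,1] to which the functionals are applied."""
    return x * x


results = (
    integral_functional(square),
    evaluation_functional(square),
    max_functional(square),
)
print(results)
```

    Each call associates exactly one real number with the function square, which is all that Definition 1.2 requires.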

    Now that we have found such a simple and obvious generalization of the concept of a function, we shall try to pursue this line of inquiry further in our search for a solution of the general extreme-value problem. Again, we shall see that within reasonable limits, the classical arguments from the extreme-value theory of real-valued functions of a real variable will find their counterparts in the theory of extreme values of functionals.

    The next section will be devoted to a re-examination of the extreme-value problem in calculus, and in the sections after that, we shall generalize these concepts and procedures so that they apply meaningfully to the theory of functionals.

    PROBLEMS 1.2

    1.  What is the customary name for a functional that is defined as a mapping of the n-dimensional (cartesian) space into R?

    2.  An obvious generalization of the concept of a functional is a mapping that admits image sets other than subsets of R. Such a mapping is called an operator, or more precisely, a functional operator. Give examples of functional operators.

    3.  Try to define continuity of a functional by a straightforward generalization of the definition of continuity of a function. What generalized concept is missing and not immediately obvious?

    4.  A theorem of Weierstrass states: A function that is continuous on a closed interval [a,b] will assume on [a,b] its maximum value and its minimum value.† We call f lower [upper] semicontinuous at x0 if for any ε > 0 there exists a δ(ε) such that

    $$f(x) - f(x_0) > -\varepsilon \qquad [\, f(x) - f(x_0) < \varepsilon \,]$$

    for all |x − x0| < δ(ε). Prove that if f is lower (upper) semicontinuous in [a,b], then f will assume its minimum (maximum) value in [a,b], by adapting the proof of Weierstrass’ theorem in a suitable manner.

    1.3  NECESSARY CONDITIONS FOR RELATIVE EXTREME VALUES OF REAL-VALUED FUNCTIONS OF ONE REAL VARIABLE

    We shall devote this section to a review of well-known necessary conditions for relative extreme values of functions of a real variable. The argument we shall supply, however, will differ from the customary reasoning that is to be found in treatments of this subject. The advantage of our argument is that it is amenable to immediate generalization.

    We know that, by a celebrated theorem of Weierstrass (see Prob. 1.2.4), a continuous function assumes its maximum value and its minimum value in a closed interval, and on the other hand, we know that direct calculus methods fail to locate all relative extreme values of functions with even a continuous derivative in a closed interval. (The relative extrema at the endpoints are the ones that present difficulties.) For this reason, we shall restrict our investigation to functions on open intervals.

    To formulate the definition of a relative extreme value in practical terms, let us first introduce the concept of a neighborhood:

    Definition 1.3.1 A subset Nδ(x0) of R is called a δ neighborhood of x0 if it contains all points x for which x0 − δ < x < x0 + δ, in other words, if it contains all points x0 + h for which |h| < δ.

    Now we can proceed to a definition of a relative extreme value (see also Fig. 1.4):

    Figure 1.4

    Definition 1.3.2 Let y = f(x) represent a real-valued function of a real variable which is defined on the open interval (a,b). y = f(x) possesses a relative minimum (maximum) at x0 ∈ (a,b) if there exists a δ neighborhood Nδ(x0) ⊂ (a,b) of x0 such that f(x) ≥ f(x0) (f(x) ≤ f(x0)) for all x ∈ Nδ(x0) or, in other words, if f(x0 + h) − f(x0) ≥ 0 (≤ 0) for all |h| < δ.

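    With a view toward later generalizations, let us now define the derivative of a function in the following, somewhat unusual manner, assuming that f is defined in a neighborhood of x0:

    Definition 1.3.3 The number f′(x0) is the derivative of y = f(x) at x0 if and only if there exists a δ > 0 such that

    $$f(x_0 + h) - f(x_0) = f'(x_0)\, h + \varepsilon(h) \quad \text{for all } |h| < \delta, \tag{1.3.1}$$

    where $\lim_{h \to 0} \varepsilon(h)/h = 0$.

    We contrast this definition to the classical one, which is to be found in most treatments of the differential calculus:

    Definition 1.3.3a The number f′(x0) is the derivative of y = f(x) at x0 if and only if

    $$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0). \tag{1.3.2}$$

    These two definitions are tied together by the following theorem:

    Theorem 1.3.1 Definitions 1.3.3 and 1.3.3a are equivalent.

    Proof: (a) Definition 1.3.3 implies Definition 1.3.3a.

    We obtain from (1.3.1) after division by h:

    $$\frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0) + \frac{\varepsilon(h)}{h}.$$

    Consequently,

    $$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0).$$

    (b) Definition 1.3.3a implies Definition 1.3.3.

    Let

    $$\varepsilon_1(h) = \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0), \qquad h \neq 0.$$

    By (1.3.2),

    $$\lim_{h \to 0} \varepsilon_1(h) = 0,$$

    and (1.3.1) follows immediately with ε(h) = ε1(h)h.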
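    The role of the remainder can be seen numerically. The sketch below takes (1.3.1) in the form f(x0 + h) − f(x0) = f′(x0)h + ε(h), which is the form implied by the relation ε(h) = ε1(h)h used in the proof, and tabulates ε(h)/h for the sample function f(x) = x³ at x0 = 1. The sample function, the step sizes, and the helper name remainder are assumptions made for this illustration.

```python
def remainder(f, df_x0, x0, h):
    """epsilon(h) = f(x0 + h) - f(x0) - f'(x0) * h, the remainder in (1.3.1)."""
    return f(x0 + h) - f(x0) - df_x0 * h


def cube(x):
    """Sample function f(x) = x^3."""
    return x ** 3


x0, df_x0 = 1.0, 3.0  # f'(1) = 3 for f(x) = x^3

steps = [10.0 ** (-k) for k in range(1, 6)]
ratios = [abs(remainder(cube, df_x0, x0, h) / h) for h in steps]

for h, r in zip(steps, ratios):
    print(f"h = {h:.0e}   |epsilon(h)/h| = {r:.3e}")
```

    Here ε(h) = 3h² + h³ exactly, so ε(h)/h = 3h + h² shrinks in proportion to h, as the definition demands.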

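    Let C′(a,b) denote the class of functions that possess a derivative at all points of the interval (a,b), and suppose that f ∈ C′(a,b) possesses a relative minimum at x0 ∈ (a,b). By Definition 1.3.2, there exists a δ > 0 such that

    $$f(x_0 + h) - f(x_0) \ge 0 \quad \text{for all } |h| < \delta.$$

    (This δ is, in general, not the same as the one in Definition 1.3.3. We shall agree that whenever the two δ’s are different, we shall work with the smaller one.)

    Since f is differentiable at x0, we have from (1.3.1),

    $$f'(x_0)\, h + \varepsilon(h) \ge 0 \quad \text{for all } |h| < \delta.$$

    Then, for h > 0, division by h yields f′(x0) + ε(h)/h ≥ 0, and since ε(h)/h tends to zero as h tends to zero, it follows that

    $$f'(x_0) \ge 0.$$

    For h < 0, it follows that

    $$f'(x_0) + \frac{\varepsilon(h)}{h} \le 0,$$

    and by the same reasoning as before,

    $$f'(x_0) \le 0.$$

    Since h may be taken positive or negative, we have the following necessary condition for a relative minimum:

    $$f'(x_0)\, h = 0 \quad \text{for all } h,$$

    from which f′(x0) = 0 follows immediately after division by h.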

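    Definition 1.3.4 The number f″(x0) is the second derivative of y = f(x) at x0 if there exists a δ > 0 such that

    $$f'(x_0 + h) - f'(x_0) = f''(x_0)\, h + \varepsilon_1(h)$$

    for all |h| < δ, where $\lim_{h \to 0} \varepsilon_1(h)/h = 0$.

    A representation of f(x0 + h) − f(x0) in terms of the first and second derivatives of f(x) is given in the following theorem:

    Theorem 1.3.2 If f″(x) exists in a neighborhood of x0 and is continuous at x0, then

    $$f(x_0 + h) - f(x_0) = f'(x_0)\, h + \tfrac{1}{2} f''(x_0)\, h^2 + \varepsilon_2(h) \tag{1.3.5}$$

    for all |h| < δ, where $\lim_{h \to 0} \varepsilon_2(h)/h^2 = 0$.

    Proof: By Taylor’s formula,

    $$f(x_0 + h) = f(x_0) + f'(x_0)\, h + \tfrac{1}{2} f''(x_0 + \theta h)\, h^2, \qquad 0 < \theta < 1,$$

    and hence

    $$f(x_0 + h) - f(x_0) = f'(x_0)\, h + \tfrac{1}{2} f''(x_0)\, h^2 + \tfrac{1}{2} \left[ f''(x_0 + \theta h) - f''(x_0) \right] h^2,$$

    where the last term, divided by h², tends to zero as h tends to zero because f″ is continuous at x0.

    Let C²(a,b) denote the class of all functions with a continuous second-order derivative in (a,b), and suppose that f ∈ C²(a,b) possesses a relative minimum at x0 ∈ (a,b). Then, by the first necessary condition, f′(x0) = 0, and we have from (1.3.5) that

    $$\tfrac{1}{2} f''(x_0)\, h^2 + \varepsilon_2(h) \ge 0 \quad \text{for all } |h| < \delta.$$

    We pick an h ≠ 0 and we obtain, after division by h²,

    $$\tfrac{1}{2} f''(x_0) + \frac{\varepsilon_2(h)}{h^2} \ge 0.$$

    Since ε2(h)/h² tends to zero as h tends to zero, it follows that f″(x0) ≥ 0.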

    Collecting all the results that have been obtained thus far, we can state:

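    Theorem 1.3.3 If y = f(x) is differentiable in (a,b) and if f(x) possesses a relative minimum (maximum) at x0 ∈ (a,b), then, by necessity, f′(x0) = 0. If f ∈ C²(a,b), then the additional condition f″(x0) ≥ 0 (f″(x0) ≤ 0) has to hold.

    If f ∈ C²(a,b), f′(x0) = 0, and f″(x0) > 0, then

    $$f(x_0 + h) - f(x_0) = h^2 \left[ \tfrac{1}{2} f''(x_0) + \frac{\varepsilon_2(h)}{h^2} \right] > 0$$

    for all sufficiently small h ≠ 0, where ε2(h)/h² tends to zero as h tends to zero, and we see that f(x0) is a relative minimum. Hence:

    Theorem 1.3.4 If f ∈ C²(a,b) and if f′(x0) = 0, f″(x0) > 0 (f″(x0) < 0), then y = f(x) has a relative minimum (maximum) at x0.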
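    The sufficient condition f′(x0) = 0, f″(x0) > 0 (< 0) can be tried out numerically. The sketch below replaces the exact derivatives by central-difference estimates, so it is an approximation under assumed step sizes and tolerances and not a rigorous test; the function names are invented for this example.

```python
import math


def first_derivative(f, x0, h=1e-5):
    """Central-difference estimate of f'(x0)."""
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h)


def second_derivative(f, x0, h=1e-4):
    """Central-difference estimate of f''(x0)."""
    return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / (h * h)


def classify(f, x0, tol=1e-6):
    """Apply the condition f'(x0) = 0, f''(x0) > 0 (< 0) to the estimates."""
    if abs(first_derivative(f, x0)) > tol:
        return "not stationary"
    d2 = second_derivative(f, x0)
    if d2 > tol:
        return "relative minimum"
    if d2 < -tol:
        return "relative maximum"
    return "inconclusive"


print(classify(math.cos, math.pi))        # cos has a relative minimum at pi
print(classify(math.cos, 0.0))            # and a relative maximum at 0
print(classify(lambda x: x ** 4, 0.0))    # f''(0) = 0: the condition does not apply
```

    The last call shows why the condition is only sufficient: x⁴ does have a relative minimum at 0, but since f″(0) = 0 the test cannot detect it.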

    PROBLEMS 1.3

    1.  Assume that f,g are differentiable functions of x. Show that f′g + fg′ is the derivative of fg by Definition 1.3.3.

    2.  Show that f′(x0) is uniquely defined by Definition 1.3.3 and that f″(x0) is uniquely defined by Definition 1.3.4.

    3.  Suppose that (1.3.5) holds. Show that

    4.  Suppose that f′(x0) is defined by

    (a)  Show that if f′(x0) exists in this sense, then it also exists in the sense of Definition 1.3.3.

    (b)  Show that f′(x0) as defined here is unique, provided it exists.

    (c)  Prove Theorem 1.3.1 for the case where the concept of the derivative is based on the definition given here rather than the one given in the text.

    5.  Demonstrate with an example that ε(h) and ε1(h) in Definitions 1.3.3 and 1.3.4 also depend on x0.

    1.4  NORMED LINEAR SPACES

    We shall now start to lay the foundation for a generalization of the ideas developed in the preceding section.

    We note that if a function f is defined on an interval (a,b) and if x ∈ (a,b), then f(x + h) is also defined, provided that h is sufficiently small. Is this also true if f is a functional and x is a function rather than a real number?

    This section will be devoted to a discussion of this problem. Suppose that a functional F is defined on S, where S is some specified class of functions, but does not necessarily exist for functions that are not in S. Then, for x ∈ S and some function h, the sum x + h need not belong to S, and F[x + h] may not even exist. This line of thought leads us directly to the concept of a linear space. Roughly speaking, a class of functions forms a linear space if, with any two elements, it also contains all linear combinations of these two elements with coefficients from a given (number) field.

    We give the following precise definition:

    Definition 1.4.1 The collection S of elements x, y, z, . . . is called a linear space over the field ℱ with elements λ, μ, ν, . . . if the following conditions are satisfied:

    1.  If x ∈ S, y ∈ S, then the sum of x and y, written x + y, is defined and x + y ∈ S.

    2.  Addition is commutative: x + y = y + x.

    3.  Addition is associative: (x + y) + z = x + (y + z).

    4.  There exists an additive identity 0 ∈ S such that x + 0 = x for all x ∈ S.

    5.  For each x ∈ S, there exists an additive inverse (–x) such that x + (–x) = 0.

    6.  Scalar multiplication of elements of S with elements of ℱ is defined; that is, if x ∈ S, λ ∈ ℱ, then λx ∈ S.

    7.  The scalar multiplication is associative: If x ∈ S, λ,μ ∈ ℱ, then λ(μx) = (λμ)x.

    8.  Scalar multiplication is distributive: If x ∈ S, λ,μ ∈ ℱ, then (λ + μ)x = λx + μx, and if x,y ∈ S and λ ∈ ℱ, then λ(x + y) = λx + λy.

    9.  For the multiplicative identity 1 ∈ ℱ, 1x = x for all x ∈ S.

    The first five postulates express the fact that S is an Abelian group, with addition as group operation, while the remaining four postulates regulate the multiplication of elements from S in the customary manner. (See also Prob. 1.4.1.)

    Examples of linear spaces are easy to find:

    A. C[0,1], the space of all real-valued functions that are defined and continuous on the interval [0,1]. The following two results are established in elementary calculus:

    If f ∈ C[0,1] and g ∈ C[0,1], then f + g ∈ C[0,1], and if λ is a real number, then λf ∈ C[0,1]. The function f = 0, in particular, is a continuous function. All the other properties are so obviously satisfied that it is superfluous to dwell on them.

    The reader, of course, should realize that the space of all real-valued continuous functions on [0,1] (or on any other interval, for that matter) is a linear space over the field of reals, as well as over the field of rationals or any other field that is contained in R. Whenever we use the symbol C[0,1], however, we always mean the space of all real-valued continuous functions on [0,1] over the field of real numbers R.
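    The linear-space operations on C[0,1] are the familiar pointwise ones, and a few of the postulates of Definition 1.4.1 can be spot-checked in code. The sketch below is an illustration under stated assumptions: the helper names add, scale, and agree, the sample functions, and the finite grid of test points are all choices of this example, and agreement on a grid is of course not a proof.

```python
import math


def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def scale(lam, f):
    """Pointwise scalar multiple: (lam f)(x) = lam * f(x)."""
    return lambda x: lam * f(x)


def zero(x):
    """The additive identity: the function that vanishes identically."""
    return 0.0


f, g = math.sin, math.exp
samples = [i / 10 for i in range(11)]  # grid on [0, 1]


def agree(u, v, tol=1e-12):
    """True if u and v coincide (up to rounding) on the sample grid."""
    return all(abs(u(x) - v(x)) <= tol for x in samples)


commutative = agree(add(f, g), add(g, f))                # postulate 2
identity = agree(add(f, zero), f)                        # postulate 4
inverse = agree(add(f, scale(-1.0, f)), zero)            # postulate 5
distributive = agree(scale(2.0, add(f, g)),
                     add(scale(2.0, f), scale(2.0, g)))  # postulate 8

print(commutative, identity, inverse, distributive)
```

    The sum and the scalar multiple of continuous functions are again callable objects of the same kind, which mirrors the closure properties in postulates 1 and 6.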

    B. C¹[0,1], the space of all real-valued functions with a continuous derivative on the interval [0,1]. The two theorems quoted under example A, if applied to f′ rather than to f, yield
