Introduction to Stochastic Control Theory
Ebook · 602 pages · 6 hours


About this ebook

This text for upper-level undergraduates and graduate students explores stochastic control theory in terms of analysis, parametric optimization, and optimal stochastic control. Limited to linear systems with quadratic criteria, it covers discrete time as well as continuous time systems.
The first three chapters provide motivation and background material on stochastic processes, followed by an analysis of dynamical systems with inputs of stochastic processes. A simple version of the problem of optimal control of stochastic systems is discussed, along with an example of an industrial application of this theory. Subsequent discussions cover filtering and prediction theory as well as the general stochastic control problem for linear systems with quadratic criteria.
Each chapter begins with the discrete time version of a problem and progresses to a more challenging continuous time version of the same problem. Prerequisites include courses in analysis and probability theory in addition to a course in dynamical systems that covers frequency response and the state-space approach for continuous time and discrete time systems.
Language: English
Release date: May 11, 2012
ISBN: 9780486138275

    Book preview

    Introduction to Stochastic Control Theory - Karl J. Åström

    CHAPTER 1

    STOCHASTIC CONTROL

    1. INTRODUCTION

    This introductory chapter will try to put stochastic control theory into a proper context. The development of control theory is briefly discussed in Section 2. Particular emphasis is given to a discussion of deterministic control theory. The main limitation of this theory is that it does not provide a proper distinction between open loop systems and closed loop systems. This is mainly due to the fact that disturbances are largely neglected in the framework of deterministic control theory. The difficulties of characterizing disturbances are discussed in Section 3. An outline of the development of stochastic control theory and the most important results are given in Section 4. Section 5 is devoted to a presentation of the contents of the different chapters of the book.

    2. THEORY OF FEEDBACK CONTROL

    Control theory was originally developed in order to obtain tools for analysis and synthesis of control systems. The early development was concerned with centrifugal governors, simple regulating devices for industrial processes, electronic amplifiers, and fire control systems. As the theory developed, it turned out that the tools could be applied to a large variety of different systems, technical as well as nontechnical. Results from various branches of applied mathematics have been exploited throughout the development of control theory. The control problems have also given rise to new results in applied mathematics.

    In the early development there was a strong emphasis on stability theory, based on results like the Routh-Hurwitz theorem. This theorem is a good example of the interaction between theory and practice. The stability problem was actually suggested to Hurwitz by Stodola, who had encountered it in connection with the practical design of regulators for steam turbines.

    The analysis of feedback amplifiers used tools from the theory of analytical functions and resulted, among other things, in the celebrated Nyquist criterion.

    During the postwar development, control engineers were faced with several problems which imposed very stringent performance requirements. Many of the control processes which were investigated were also very complex. This led to a new formulation of the synthesis problem as an optimization problem, and made it possible to use the tools of the calculus of variations as well as to improve these tools. The result of this development has been the theory of optimal control of deterministic processes. This theory, in combination with digital computers, has proven to be a very successful design tool. When the theory of optimal control is used, the problem of stability is frequently of less interest because, under fairly general conditions, the optimal systems are stable.

    The theory of optimal control of deterministic processes has the following characteristic features:

    There is no difference between a control program (an open loop system) and a feedback control (a closed loop system).

    The optimal feedback is simply a function which maps the state space into the space of control variables. Hence there are no dynamics in the optimal feedback.

    The information available to compute the actual value of the control signal is never introduced explicitly when formulating and solving the problem.

    We can illustrate these properties by a simple example.

    EXAMPLE 2.1

    Consider the system

        dx/dt = u    (2.1)

    with initial conditions

        x(0) = 1    (2.2)

    Suppose that it is desirable to control the system in such a way that the performance of the system judged by the criterion

        J = ∫₀^∞ {x²(t) + u²(t)} dt    (2.3)

    is as small as possible. It is easy to see that the minimal value of the criterion (2.3) is J = 1 and that this value is assumed for the control program

        u(t) = -e^(-t)    (2.4)

    as well as for the control strategy

        u(t) = -x(t)    (2.5)

    Equation (2.4) represents an open loop control because the value of the control signal is determined from a priori data only, irrespective of how the process develops. Equation (2.5) represents a feedback law because the value of the control signal at time t depends on the state of the process at time t.

    The example thus illustrates that the open loop system (2.4) and the closed loop system (2.5) are equivalent in the sense that they give the same value of the loss function (2.3). Their stability properties are, however, widely different. The system (2.1) with the feedback control (2.5) is asymptotically stable, while the system (2.1) with the control program (2.4) is only stable. In practice, the feedback control (2.5) and the open loop control (2.4) will thus behave quite differently. This can be seen, e.g., by introducing disturbances or by assuming that the controls are computed from a model whose coefficients are slightly in error.
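    As a numerical illustration of this point, here is a minimal Python sketch, assuming the system, controls, and loss of Example 2.1 as reconstructed above, with a constant disturbance v added to the right-hand side (v = 0 recovers the undisturbed problem). All names are illustrative.

        import math

        def simulate(feedback, v=0.0, T=10.0, dt=1e-3):
            # Forward Euler integration of dx/dt = u + v with x(0) = 1,
            # where u is given by the feedback law (2.5) or the program (2.4).
            x, J = 1.0, 0.0
            for k in range(int(T / dt)):
                t = k * dt
                u = -x if feedback else -math.exp(-t)
                J += (x * x + u * u) * dt      # running value of the loss (2.3)
                x += (u + v) * dt
            return x, J

        for v in (0.0, 0.1):
            for fb in (False, True):
                x, J = simulate(fb, v)
                print(f"v={v}, feedback={fb}: x(T)={x:.2f}, J={J:.2f}")

    With v = 0 both controls give J ≈ 1, as claimed. With v = 0.1 the feedback law keeps the state near v, while under the control program the state drifts away and the loss keeps growing.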

    Several of the features of deterministic control theory mentioned above are highly undesirable in a theory which is intended to be applicable to feedback control. When the deterministic theory of optimal control was introduced, the old-timers of the field reacted particularly to the fact that the theory showed no difference between open loop and closed loop systems, and to the fact that there were no dynamics in the feedback loop. For example, it was not possible to obtain a strategy corresponding to the well-known PI-regulator which was widely used in industry. This is one reason for the widely publicized discussion about the gap between theory and practice in control. The limitations of deterministic control theory were clearly understood by many workers in the field from the very start, and this understanding is now widespread. The heart of the matter is that no realistic models for disturbances are used in deterministic control theory. If a so-called disturbance is introduced, it is always postulated that the disturbance is a function which is known a priori. When this is the case and the system is governed by a differential equation with unique solutions, knowledge of the initial conditions is equivalent to knowledge of the state of the system at an arbitrary instant of time. This explains why there is no difference in performance between an open loop system and a closed loop system, and why the assumption of a given initial condition implicitly means that the actual value of the state is known at all times. Also, when the state of the system is known, the optimal feedback will always be a function which maps the state space into the space of control variables. As will be seen later, the dynamics of the feedback arise when the state is not known but must be reconstructed from measurements of output signals.

    The importance of taking disturbances into account has been recognized by practitioners since the beginning of the development of control theory. Many of the classical methods for synthesis were also capable of dealing with disturbances in a heuristic manner. Compare the following quotation from A. C. Hall¹:

    I well remember an instance in which M. I. T. and Sperry were co-operating on a control for an air-borne radar, one of the first such systems to be developed. Two of us had worked all day in the Garden City Laboratories on Sunday, December 7, 1941, and consequently did not learn of the attack on Pearl Harbor until late in the evening. It had been a discouraging day for us because while we had designed a fine experimental system for test, we had missed completely the importance of noise with the result that the system’s performance was characterized by large amounts of jitter and was entirely unsatisfactory. In attempting to find an answer to the problem we were led to make use of frequency-response techniques. Within three months we had a modified control system that was stable, had a satisfactory transient response, and an order of magnitude less jitter. For me this experience was responsible for establishing a high level of confidence in the frequency-response techniques.

    Exercises

    1. Consider the problem of Example 2.1. Show that the control signal (2.4) and the control law (2.5) are optimal. Hint: First prove the identity

        ∫₀^∞ {x²(t) + u²(t)} dt = x²(0) + ∫₀^∞ {x(t) + u(t)}² dt,

    which holds whenever x(t) → 0 as t → ∞.

    2. Consider the problem of Example 2.1. Assume that the optimal control signal and the optimal control law are determined from the model

        dx/dt = au

    where a has a value close to 1, while the system is actually governed by Eq. (2.1). Determine the value of the criterion (2.3) for the systems obtained with open loop control and with closed loop control.

    3. Compare the performance of the open loop control (2.4) and the closed loop control (2.5) when the system of Example 2.1 is actually governed by the equation

        dx/dt = u + v

    where v is an unknown disturbance. In particular, let v be an unknown constant.

    3. HOW TO CHARACTERIZE DISTURBANCES

    Having realized the necessity of introducing more realistic models of disturbances, we are faced with the problem of finding suitable ways to characterize them. A characteristic feature of practical disturbances is the impossibility of predicting their future values precisely. A moment’s reflection indicates that it is not easy to devise mathematical models which have this property. It is not possible, for example, to model a disturbance by an analytical function because, if the values of an analytical function are known in an arbitrarily short interval, its values for all other arguments can be determined by analytic continuation.

    Since analytic functions do not work, we could try to use statistical concepts to model disturbances. As can be seen from the early literature on statistical time series, this is not easy. For example, if we try to model a disturbance as

        x(t) = ξ₁a₁(t) + ξ₂a₂(t) + ··· + ξₙaₙ(t)    (3.1)

    where a₁(t), a₂(t), ..., aₙ(t) are known functions and ξ₁, ξ₂, ..., ξₙ are random variables, we find that if the linear equations

        x(tₖ) = ξ₁a₁(tₖ) + ξ₂a₂(tₖ) + ··· + ξₙaₙ(tₖ),    k = 1, 2, ..., n    (3.2)

    have a unique solution, then the particular realizations of the stochastic variables ξ₁, ξ₂, ..., ξₙ can be determined exactly from observations of x(t₁), x(t₂), ..., x(tₙ), and all future values of x can then be determined exactly. The disturbance described by (3.1) is therefore called a completely deterministic stochastic process or a singular stochastic process.
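    The following Python sketch illustrates why such a process is called singular. It uses the illustrative choice n = 2 with basis functions a₁(t) = cos t and a₂(t) = sin t, which are not taken from the text; two noise-free observations then determine the realizations of ξ₁ and ξ₂ exactly, and with them every future value of x.

        import numpy as np

        def basis(t):
            # Illustrative basis functions a1(t) = cos t, a2(t) = sin t.
            return np.array([np.cos(t), np.sin(t)])

        rng = np.random.default_rng(0)
        xi = rng.normal(size=2)                  # one realization of (xi1, xi2)

        t_obs = np.array([0.3, 1.1])             # two observation times
        A = np.array([basis(t) for t in t_obs])  # coefficient matrix of Eqs. (3.2)
        x_obs = A @ xi                           # observed values x(t1), x(t2)
        xi_hat = np.linalg.solve(A, x_obs)       # recover the realizations exactly

        t_future = 7.5
        print(basis(t_future) @ xi_hat)          # predicted x(t_future)
        print(basis(t_future) @ xi)              # actual x(t_future), identical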

    A more successful attempt is to model a disturbance as a sequence of random variables. A simple example is the autoregressive process {x(t)} defined by

        x(t + 1) = ax(t) + e(t)    (3.3)

    where x(t0) = 1, | a | < 1 and {e(t), t = t0, t0 + 1, ...} is a sequence of independent normal (0, σ) stochastic variables. It is also assumed that e(t) is independent of x(t) for all t. Assume, for example, that we want to predict the value of x(t + 1) based on observations of x(t). It seems reasonable to predict x(t + 1) by ax(t). The prediction error is then equal to e(t), that is, a stochastic variable with zero mean and variance σ².
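    A short simulation (a sketch, assuming Eq. (3.3) as reconstructed above, with illustrative parameter values) confirms this numerically: the residual x(t + 1) - ax(t) has mean close to zero and variance close to σ².

        import numpy as np

        a, sigma, N = 0.8, 1.0, 100_000          # illustrative parameter values
        rng = np.random.default_rng(1)
        e = rng.normal(0.0, sigma, N)            # independent normal (0, sigma)

        x = np.empty(N + 1)
        x[0] = 1.0                               # x(t0) = 1
        for t in range(N):
            x[t + 1] = a * x[t] + e[t]           # the autoregression (3.3)

        err = x[1:] - a * x[:-1]                 # prediction error x(t+1) - a x(t)
        print(err.mean(), err.var())             # approximately 0 and sigma**2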

    It turns out that one answer to the problem of modeling disturbances is to describe them as stochastic processes. The theory of stochastic processes has actually partly grown out of attempts to model the fluctuations observed in physical systems. The theory matured very quickly due to contributions from such intellectual giants as Cramér, Khintchine, Kolmogorov, and Wiener.

    Problems of prediction are of central importance in the theory of stochastic processes. As will be seen in the following, they are also closely related to the problems of control.

    Exercises

    1. Consider a disturbance which is characterized by

    x(t) = a cos t

    where a is a stochastic variable. Give a procedure for predicting future values of x exactly.

    2. Consider the process (3.3). Show that the predictor x̂(t + 1) = ax(t) is optimal in the sense that it minimizes the least squares prediction error defined by E[x(t + 1) - x̂(t + 1)]².

    4. STOCHASTIC CONTROL THEORY

    This section will discuss the main problems and results of stochastic control theory. It will also give a brief account of the development of the theory.

    Stochastic control theory deals with dynamical systems, described by difference or differential equations, and subject to disturbances which are characterized as stochastic processes. The theory aims at answering problems of analysis and synthesis.

    Analysis—What are the statistical properties of the system variables?

    Parametric Optimization—Suppose that we are given a system and a regulator with a given structure but with unknown parameters. How are the parameters to be adjusted in order to optimize the system with respect to a given criterion?

    Stochastic Optimal Control—Given a system and a criterion, find the control law which minimizes the criterion.

    The tools required to solve all these problems are fairly recent developments. Stochastic control theory was used at M. I. T. during the Second World War to synthesize fire control systems. An interesting example, the design of a tracking radar using parametric optimization, is described in the book by James, Nichols, and Phillips.²

    The filtering and prediction theory developed by Wiener and Kolmogorov is one of the cornerstones of stochastic control theory. This theory makes it possible to extract a signal from observations of the signal corrupted by disturbances. It plays a very important role in the solution of the stochastic optimal control problem. The Wiener-Kolmogorov theory has, however, not been applied extensively. One reason for this is that it requires the solution of an integral equation (the Wiener-Hopf equation). In realistic problems the Wiener-Hopf equation seldom has an analytical solution, and it is not easy to solve numerically.

    The use of digital computers both for analysis and synthesis has profoundly influenced the development of the theory. A significant contribution to the filtering problem was made by Kalman and Bucy. Their results made it possible to solve prediction and filtering problems recursively, which is ideally suited for digital computers. The results of Kalman and Bucy also generalize to nonstationary processes. In the Kalman-Bucy theory, the predictor is given as the output of a linear dynamical system driven by the observations. To determine the coefficients of this dynamical system, it is necessary to solve an initial value problem for a Riccati equation. The Riccati equation is similar to the one encountered in the theory of optimal control of linear deterministic systems with quadratic criteria. The prediction problem and the linear quadratic control problem are in fact mathematical duals. This result is of great interest from both theoretical and practical points of view. If one of the problems is solved, we can easily obtain the solution of the other by invoking the duality. Also, the same computer programs can be used to solve both the filtering problem and the deterministic control problem.
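    As a rough sketch of what such a recursive solution looks like, one step of a discrete time Kalman filter can be written as below. The model notation x(t + 1) = Ax(t) + v(t), y(t) = Cx(t) + e(t), with noise covariances R1 and R2, is illustrative and not taken from this chapter.

        import numpy as np

        def kalman_step(xhat, P, y, A, C, R1, R2):
            # Measurement update: correct the estimate with the innovation.
            K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R2)   # filter gain
            xhat = xhat + K @ (y - C @ xhat)
            P = P - K @ C @ P
            # Time update: one step of the Riccati difference equation.
            return A @ xhat, A @ P @ A.T + R1

    Each new observation y costs only this fixed amount of work, which is what makes the recursive formulation so well suited to digital computers.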

    The solution of the stochastic optimal control problem relies heavily upon the concepts and techniques of dynamic programming. For linear systems with quadratic criteria, the solution is given by the so-called separation theorem. This result implies that the optimal strategy can be thought of as composed of two parts. See Fig. 1.1. One part is an optimal filter which computes an estimate of the state in terms of the conditional mean given the observed output signals. The other part is a linear feedback from the estimated state to the control signal.

    It turns out that the linear feedback is the same as would be obtained if there were no disturbances and if the state of the system could be measured exactly. The linear feedback can be determined by solving a deterministic control problem. The conditional mean of the state is obtained as the output of a Kalman filter which is essentially a mathematical model of the system driven by the observations. The filter depends on the disturbances and on the system dynamics, but it is independent of the criterion.

    Fig. 1.1. Block diagram which illustrates the separation theorem. The control variable is denoted by u, the output by y, and the state by x.
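    A companion sketch shows the other half of this structure: the feedback gain L in u = -Lx̂ depends only on the system matrices and the criterion weights (here called Q1 and Q2; all names are illustrative), and is found by iterating the same kind of Riccati equation, while the noise covariances never enter.

        import numpy as np

        def lq_gain(A, B, Q1, Q2, iters=500):
            # Iterate the Riccati difference equation of the deterministic
            # LQ problem to its stationary solution S, then read off the gain.
            S = Q1.copy()
            for _ in range(iters):
                L = np.linalg.solve(Q2 + B.T @ S @ B, B.T @ S @ A)
                S = Q1 + A.T @ S @ (A - B @ L)
            return L    # feedback from the estimated state: u = -L @ xhat

    Combining u = -Lx̂ with the filter of the previous sketch yields exactly the two-block structure of Fig. 1.1.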

    The separation theorem thus provides a connection between filtering theory and the theory of optimal stochastic control. A version of the separation theorem was first published by Joseph and Tou.³ A related result has been known in econometrics under the name of certainty equivalence principle.

    The optimal strategy obtained when solving the stochastic optimal control problem for linear systems with quadratic criteria thus consists of a linear dynamical system, possibly with time-varying parameters. This class of strategies includes those which have been used in practice for many years, usually arrived at by ad hoc methods. Since there are no difficulties in dealing with many inputs and many outputs, linear stochastic control theory is a very important design tool. The result of the theory is a closed form solution in the sense that the parameters of the optimal strategy are given in terms of solutions of initial value problems for Riccati equations. Efficient numerical algorithms for solving such equations are available. Occasionally the problems might be numerically ill-conditioned.

    The linear stochastic control theory is one of the simplest structures available that includes several of the features that are desirable in a theory of feedback control systems. For example:

    The theory shows directly that there are considerable differences between open loop and closed loop systems.

    The performance of the system depends critically on the information available at the time the value of the control signal should be determined. For example, a delay of the measured signal will lead directly to a deterioration of the performance.

    The optimal feedback consists of a linear dynamical system.

    5. OUTLINE OF THE CONTENTS OF THE BOOK

    Having discussed some features of stochastic control theory heuristically, it is now meaningful to outline the contents of the book. The purpose of the book is to present theory for analysis, parametric optimization, and optimal control of stochastic control processes. The results presented are limited to linear systems; however, both discrete time systems and continuous time systems are covered. The discrete time case is conceptually and analytically much simpler than the continuous time case.

    It should be emphasized that in practical applications where digital computers are used to implement the control strategies, the discrete time case is sufficient. The treatment of the discrete time case is complete in the sense that all results are proved in full detail.

    Chapter 2 gives a survey of some results and concepts from the theory of stochastic processes. Specific stochastic processes, such as stationary processes, Markov processes, processes of second order, and processes with independent increments, are discussed. Covariance functions and spectral densities are introduced. Special attention is given to the concept of white noise. This is the first place where the differences between continuous time and discrete time stochastic processes show up. For discrete time processes, white noise is simply a sequence of independent, equally distributed random variables. Continuous time white noise is a considerably more involved process. It has, for example, infinite variance.

    Machinery which will enable us to perform analysis of continuous time processes, e.g., differentiation and integration, is also given. The key to this is the definition of convergence; it turns out that the result will depend on the chosen topology. The practical significance of the different convergence concepts is discussed. For simplicity, the analysis is carried out for convergence in the mean square only. Chapter 2 is kept quite brief. The newcomer to the field of stochastic processes should be prepared to spend some effort in reading the references. Readers who are already familiar with the theory of stochastic processes can scan this chapter quickly.

    So-called stochastic state models are developed in Chapter 3. This provides another good illustration of the differences between discrete time and continuous time processes: discrete time processes are handled in 8 pages, while continuous time processes require 45 pages. The purpose of the chapter is to develop the concept of state for stochastic processes. For deterministic systems the state is, roughly speaking, the minimal amount of information about the past history of a system which is required in order to predict its future motion. For a stochastic system it turns out that it is not possible to predict the future motion exactly. Instead, the state will be the minimal amount of information which is required in order to predict the probability distribution of the future states. It turns out that state models for discrete time systems can be characterized as difference equations driven by discrete time white noise, i.e., a sequence of independent, equally distributed random variables. The analysis of the linear stochastic difference equation is discussed in detail. For continuous time systems, the situation is considerably more involved. To obtain a state model for continuous time processes, we are led to the concept of stochastic differential equations. These equations are explained intuitively, and the machinery required to handle them is reviewed. Several key theorems are, however, not proven. An alternative to this analysis would be a formal analysis based on manipulation of delta functions, but this was considered less satisfactory because the formal manipulations can give incorrect results.

    While Chapters 2 and 3 can be considered as introductory, the main theme is taken up in Chapter 4. This chapter presents the key theorems which are required in order to analyze dynamical systems whose inputs are stochastic processes. Discrete time systems characterized by input-output relations, such as weighting functions and transfer functions, with second order stochastic processes as inputs are discussed. The results show that it is possible to generate stochastic processes with covariance functions belonging to a large class simply by sending white noise through a linear system. These so-called representation theorems make it possible to simplify the results considerably, since a large class of problems can be reduced to the analysis of linear systems with white noise inputs. The analogous results for continuous time systems are also given. Chapter 4 thus provides us with the tools required to analyze control systems subject to stochastic disturbances.

    Evaluation of quadratic functions of the state variables of linear systems is the subject of Chapter 5. By using results from the theory of analytical functions, we derive recursive formulas for the evaluation of quadratic loss functions. The results also exhibit interesting relations between stability analysis and the evaluation of quadratic loss functions. As an illustration of parametric optimization of time dependent processes, we also discuss the problem of reconstructing the state variables of a dynamical system using a mathematical model. The optimal gain for a reconstructor with a given structure is discussed. In Chapter 7 it will be shown that the chosen structure is actually optimal; the result is in fact the Kalman filter for reconstructing the state.

    Chapter 6 is devoted to a particularly simple class of stochastic control problems, namely linear systems with one input and one output where the criterion is to minimize the mean square deviations of the output in steady state. This particular problem gives a good insight into the structure of the optimal solutions because the separation theorem can be proven with very little mathematical sophistication. The solution clearly shows the relationships between optimal filtering and optimal control. As a side result, we also get a new algorithm for solving the filtering problem for processes with rational spectral densities. An industrial
