Digital Signal and Image Processing using MATLAB, Volume 3: Advances and Applications, The Stochastic Case
Ebook · 552 pages · 3 hours

About this ebook

Volume 3 of the second edition of the fully revised and updated Digital Signal and Image Processing using MATLAB®, following the first two volumes on the “Fundamentals” and “Advances and Applications: The Deterministic Case”, focuses on the stochastic case. It will be of particular benefit to readers who already possess a good knowledge of MATLAB®, a command of the fundamental elements of digital signal processing, a familiarity with the fundamentals of continuous-spectrum spectral analysis, and some mathematical background on Hilbert spaces.

This volume is focused on applications, but it also provides a good presentation of the principles. A number of elements closer in nature to statistics than to signal processing itself are discussed at length; this choice reflects the current tendency of signal processing to draw on techniques from that field.

More than 200 programs and functions are provided in the MATLAB® language, with useful comments and guidance, to enable numerical experiments to be carried out, thus allowing readers to develop a deeper understanding of both the theoretical and practical aspects of this subject.

Language: English
Publisher: Wiley
Release date: Oct 2, 2015
ISBN: 9781119054108


    Book preview

    Digital Signal and Image Processing using MATLAB, Volume 3 - Gérard Blanchet

    Chapter 1

    Mathematical Concepts

    1.1 Basic concepts on probability

    Without describing in detail the formalism used by Probability Theory, we will simply remind the reader of some useful concepts. However, we advise the reader to consult one of the many authoritative books on the subject [1].

    Definition 1.1 (Discrete random variable) A random variable X is said to be discrete if the set of its possible values is at most countable. If {a0, …, an, …}, where n ∈ ℕ, is the set of its values, the probability distribution of X is characterized by the sequence:

    (1.1)   pX(n) = Pr(X = an)

    representing the probability that X is equal to the element an. These values are such that 0 ≤ pX(n) ≤ 1 and ∑_{n≥0} pX(n) = 1.

    This leads to the probability that the random variable X belongs to the interval ]a, b]. It is given by:

    Pr(a < X ≤ b) = ∑_{n: a < an ≤ b} pX(n)

    The function defined for x ∈ ℝ by:

    (1.2)   FX(x) = Pr(X ≤ x) = ∑_{n: an ≤ x} pX(n)

    is called the cumulative distribution function (cdf) of the random variable X. It is a monotonic increasing function, and verifies FX (−∞) = 0 and FX (+∞) = 1. Its graph resembles that of a staircase function, the jumps of which are located at x-coordinates an and have an amplitude of pX (n).

    Definition 1.2 (Two discrete random variables) Let X and Y be two discrete random variables, with possible values {a0, …, an, …} and {b0, …, bk, …} respectively. The joint probability distribution is characterized by the sequence of positive values:

    (1.3)   pXY(n, k) = Pr(X = an, Y = bk)

    Pr(X = an, Y = bk) represents the probability of having both X = an and Y = bk. This definition can easily be extended to the case of a finite number of random variables.

    Property 1.1 (Marginal probability distribution) Let X and Y be two discrete random variables, with possible values {a0, …, an, …} and {b0, …, bk, …} respectively, and with their joint probability distribution characterized by pXY (n, k). We have:

    (1.4)   pX(n) = ∑_{k} pXY(n, k)   and   pY(k) = ∑_{n} pXY(n, k)

    pX (n) and pY (k) denote the marginal probability distribution of X and Y respectively.
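
    As a quick numerical illustration of (1.4) (a MATLAB sketch, not one of the programs supplied with the book; the joint probability matrix is an arbitrary example), the joint probabilities pXY(n, k) can be stored in a matrix and the marginal distributions obtained by summation:

        % Joint probabilities pXY(n,k) = Pr(X = an, Y = bk) (arbitrary example)
        pXY = [0.10 0.05 0.05;
               0.20 0.10 0.10;
               0.15 0.15 0.10];
        pX = sum(pXY, 2);     % marginal of X: sum over k, formula (1.4)
        pY = sum(pXY, 1);     % marginal of Y: sum over n, formula (1.4)
        disp(sum(pX))         % displays 1: the pXY(n,k) sum to one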

    Definition 1.3 (Continuous random variable) A random variable is said to be continuous ¹ if its values belong to ℝ and if, for any real numbers a and b, the probability that X belongs to the interval ]a, b] is given by:

    (1.5)   Pr(a < X ≤ b) = ∫_{a}^{b} pX(x) dx

    where pX(x) is a function that must be positive or equal to zero and such that ∫_{−∞}^{+∞} pX(x) dx = 1. pX(x) is called the probability density function (pdf) of X.

    The function defined for any x by:

    (1.6)   FX(x) = Pr(X ≤ x) = ∫_{−∞}^{x} pX(u) du

    is called the cumulative distribution function (cdf) of the random variable X. It is a monotonic increasing function and it verifies FX (−∞) = 0 and FX (+∞) = 1. Notice that pX (x) also represents the derivative of FX (x) with respect to x.
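
    The following MATLAB fragment (an illustrative sketch, not taken from the book; a standard Gaussian variable is assumed as the example) compares the empirical cdf built from sorted samples with the theoretical cdf, written here with the base function erf:

        % Empirical versus theoretical cdf for a standard Gaussian variable
        N  = 10000;
        x  = sort(randn(N,1));            % sorted samples
        Fe = (1:N)'/N;                    % empirical cdf at the sample points
        Ft = 0.5*(1 + erf(x/sqrt(2)));    % theoretical cdf of a standard Gaussian
        plot(x, Fe, x, Ft, '--'); grid on
        xlabel('x'); ylabel('F_X(x)'); legend('empirical','theoretical')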

    Definition 1.4 (Two continuous random variables) Let X and Y be two random variables with possible values in ℝ × ℝ. They are said to be continuous if, for any domain Δ of ℝ², the probability that the pair (X, Y) belongs to Δ is given by:

    (1.7)   Pr((X, Y) ∈ Δ) = ∫∫_{Δ} pXY(x, y) dx dy

    where the function pXY(x, y) ≥ 0, and is such that:

    ∫∫_{ℝ²} pXY(x, y) dx dy = 1

    pXY (x, y) is called the joint probability density function of the pair (X, Y).

    Property 1.2 (Marginal probability distributions) Let X and Y be two continuous random variables with a joint probability distribution characterized by pXY (x,y). The probability distributions of X and Y have the following marginal probability density functions:

    (1.8)   pX(x) = ∫_{−∞}^{+∞} pXY(x, y) dy   and   pY(y) = ∫_{−∞}^{+∞} pXY(x, y) dx

    An example involving two real random variables (X, Y) is the case of a complex random variable Z = X + jY.

    It is also possible to have a mixed situation, where one of the two variables is discrete and the other is continuous. This leads to the following:

    Definition 1.5 (Mixed random variables) Let X be a discrete random variable with possible values {a0, …, an, …} and Y a continuous random variable with possible values in ℝ. For any value an, and for any real numbers a and b, the probability:

    (1.9)   Pr(X = an, a < Y ≤ b) = ∫_{a}^{b} pXY(n, y) dy

    where the function pXY(n, y), with n ∈ {0, 1, …} and y ∈ ℝ, is ≥ 0 and verifies ∑_{n} ∫_{−∞}^{+∞} pXY(n, y) dy = 1.

    Definition 1.6 (Two independent random variables) Two random variables X and Y are said to be independent if and only if their joint probability distribution is the product of the marginal probability distributions. This can be expressed:

    for two discrete random variables: pXY(n, k) = pX(n) pY(k);

    for two continuous random variables: pXY(x, y) = pX(x) pY(y);

    for two mixed random variables: pXY(n, y) = pX(n) pY(y);

    where the marginal probability distributions are obtained with formulae (1.4) and (1.8).

    It is worth noting that, knowing pXY (x, y), we can tell whether or not X and Y are independent. To do this, we need to calculate the marginal probability distributions and to check that pXY (x,y) = pX (x)pY (y). If that is the case, then X and Y are independent.
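
    In the discrete case this check is immediate in MATLAB (a sketch, not one of the book's programs; the example is independent by construction): the joint matrix is compared with the outer product of its marginals:

        % Independence check: is pXY equal to the product of its marginals?
        pX  = [0.2; 0.3; 0.5];
        pY  = [0.4 0.6];
        pXY = pX * pY;                         % independent by construction
        pXhat = sum(pXY, 2);  pYhat = sum(pXY, 1);
        max(max(abs(pXY - pXhat*pYhat)))       % 0 (up to rounding): independent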

    The following definition is more general.

    Definition 1.7 (Independent random variables) The random variables (X1,…,Xn) are jointly independent if and only if their joint probability distribution is the product of their marginal probability distributions. This can be expressed:

    (1.10)   pX1,…,Xn(x1, …, xn) = pX1(x1) pX2(x2) ⋯ pXn(xn)

    where the marginal probability distributions are obtained as integrals with respect to (n − 1) variables, calculated from pX1,…,Xn(x1, …, xn).

    For example, the marginal probability distribution of X1 has the expression:

    pX1(x1) = ∫_{ℝ^{n−1}} pX1,…,Xn(x1, x2, …, xn) dx2 ⋯ dxn

    In practice, the following result gives a simple method for determining whether or not random variables are independent: if pX1,…,Xn(x1, …, xn) is a product of n positive functions of the form f1(x1) f2(x2) ⋯ fn(xn), then the variables are independent.

    It should be noted that if n random variables are pairwise independent, this does not necessarily mean that they are jointly independent.

    Definition 1.8 (Mathematical expectation) Let X be a random variable and f(x) a function. The mathematical expectation of f(X) – respectively f(X, Y) – is the value, denoted by E{f(X)} – respectively E{f(X, Y)} – defined:

    for a discrete random variable, by: E{f(X)} = ∑_{n} f(an) pX(n);

    for a continuous random variable, by: E{f(X)} = ∫_{−∞}^{+∞} f(x) pX(x) dx;

    for two discrete random variables, by: E{f(X, Y)} = ∑_{n}∑_{k} f(an, bk) pXY(n, k);

    for two continuous random variables, by: E{f(X, Y)} = ∫∫_{ℝ²} f(x, y) pXY(x, y) dx dy;

    provided that all expressions exist.
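
    A minimal MATLAB sketch of these definitions (not taken from the book; the pmf, the density and the function f are arbitrary choices for the example) computes E{f(X)} by a weighted sum in the discrete case and by numerical integration in the continuous case:

        % Discrete case: E{f(X)} = sum_n f(an) pX(n)
        a  = [0 1 2 3];  pX = [0.1 0.2 0.3 0.4];
        f  = @(x) x.^2;
        Ed = sum(f(a).*pX)                          % here E{X^2} = 5
        % Continuous case: E{f(X)} = integral of f(x) pX(x) dx
        p  = @(x) exp(-x.^2/2)/sqrt(2*pi);          % standard Gaussian density
        Ec = integral(@(x) f(x).*p(x), -Inf, Inf)   % close to 1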

    Property 1.3 If {X1, X2, …, Xn} are jointly independent, then for any integrable functions f1, f2, …, fn:

    (1.11)   E{f1(X1) f2(X2) ⋯ fn(Xn)} = E{f1(X1)} E{f2(X2)} ⋯ E{fn(Xn)}

    Definition 1.9 (Characteristic function) The characteristic function of the probability distribution of the random variables X1, …, Xn is the function of (u1, …, un) ∈ ℝⁿ defined by:

    (1.12)   φX1,…,Xn(u1, …, un) = E{exp(j(u1X1 + ⋯ + unXn))}

    Because |exp(j(u1X1 + ⋯ + unXn))| = 1, the characteristic function exists and is continuous even if the moments do not exist. The Cauchy probability distribution, for example, whose probability density function is pX(x) = 1/(π(1 + x²)), has no moments but has the characteristic function e^{−|u|}. Let us notice that φX(0) = 1.

    Theorem 1.1 (Fundamental) (X1, …, Xn) are independent if and only if, for any point (u1, u2, …, un) of ℝⁿ:

    φX1,…,Xn(u1, …, un) = φX1(u1) φX2(u2) ⋯ φXn(un)

    Notice that the characteristic function of the marginal probability distribution of Xk can be directly calculated using (1.12). We have φXk(uk) = φX1,…,Xn(0, …, 0, uk, 0, …, 0).

    Definition 1.10 (Mean, variance) The mean of the random variable X is defined as the first order moment, that is to say E{X}. If the mean is equal to zero, the random variable is said to be centered. The variance of the random variable X is the quantity defined by:

    (1.13)   var(X) = E{(X − E{X})²} = E{X²} − (E{X})²

    The variance is always positive, and its square root is called the standard deviation.

    As an exercise, we are going to show that, for any constants a and b:

    (1.14)   E{aX + b} = a E{X} + b

    (1.15)   var(aX + b) = a² var(X)

    Expression (1.14) is a direct consequence of the linearity of the integral. For (1.15), let Y = aX + b; then var(Y) = E{(Y − E{Y})²}. Replacing E{Y} by a E{X} + b, we get var(Y) = E{a²(X − E{X})²} = a² var(X).
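
    These two identities are easy to check by simulation; the MATLAB lines below (a Monte Carlo sketch with arbitrary values of a and b, not one of the book's programs) estimate both sides from a large number of samples:

        % Monte Carlo check of E{aX+b} = aE{X}+b and var(aX+b) = a^2 var(X)
        N = 1e6;  a = -2.5;  b = 3;
        X = randn(N,1) + 1;                % samples with mean 1 and variance 1
        Y = a*X + b;
        [mean(Y)   a*mean(X) + b]          % both close to 0.5
        [var(Y)    a^2*var(X)]             % both close to 6.25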

    A generalization of these two results to random vectors (their components are random variables) will be given by property (1.6).

    Definition 1.11 (Covariance, correlation) Let (X,Y) be two random variables ². The covariance of X and Y is the quantity defined by:

    (1.16)   cov(X, Y) = E{(X − E{X})(Y − E{Y})*} = E{XY*} − E{X}E{Y}*

    In what follows, the variance of the random variable X will be noted var (X). cov (X) or cov (X, X) have exactly the same meaning.

    X and Y are said to be uncorrelated if cov(X, Y) = 0, that is to say if E{XY*} = E{X}E{Y}*. The correlation coefficient is the quantity defined by:

    (1.17)   ρ(X, Y) = cov(X, Y) / √(var(X) var(Y))

    Applying the Schwarz inequality gives us |ρ(X, Y)| ≤ 1.

    Definition 1.12 (Mean vector and covariance matrix) Let {X1, …, Xn} be n random variables with respective means E{Xk}. The mean vector is the n-dimensional vector with the means E{Xk} as its components. The (n × n) covariance matrix C is the matrix whose generating element is Cik = cov(Xi, Xk) = E{(Xi − E{Xi})(Xk − E{Xk})*}.

    Matrix notation: if we write X = [X1 … Xn]^T

    to refer to the random vector with the random variable Xk as its k-th component, the mean vector can be expressed:

    E{X} = [E{X1} … E{Xn}]^T,

    the covariance matrix:

    (1.18)   C = E{(X − E{X})(X − E{X})^H} = E{XX^H} − E{X}E{X}^H

    and the correlation matrix

    (1.19)   R = Γ^{−1} C Γ^{−1}

    with

    (1.20)   Γ = diag(√C11, …, √Cnn)

    R is obtained by dividing each element Cik of C by √(Cii Ckk), provided that the Cii are non-zero. Therefore Rii = 1.

    Notice that the diagonal elements of a covariance matrix represent the respective variances of the n random variables. They are therefore positive. If the n random variables are uncorrelated, their covariance matrix is diagonal and their correlation matrix is the identity matrix.
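
    The sample counterparts of these matrices are obtained directly in MATLAB with cov and corrcoef (a sketch with an arbitrarily correlated example, not one of the book's programs); the last line checks the element-wise relation between R and C:

        % Sample mean vector, covariance matrix C and correlation matrix R
        N = 1e5;
        Z = randn(N, 3);
        X = [Z(:,1), Z(:,1)+Z(:,2), Z(:,3)];   % correlated components
        m = mean(X)                            % estimated mean vector
        C = cov(X)                             % estimated covariance matrix
        R = corrcoef(X)                        % estimated correlation matrix
        d = sqrt(diag(C));
        max(max(abs(R - C./(d*d'))))           % Rik = Cik/sqrt(Cii*Ckk), ~0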

    Property 1.4 (Positivity of the covariance matrix) Any covariance matrix is positive, meaning that for any vector a ∈ ℂⁿ, we have a^H C a ≥ 0.

    Property 1.5 (Bilinearity of the covariance) Let X1, …, Xm, Y1, …, Yn be random variables, and v1, …, vm, w1, …, wn be constants. Hence:

    (1.21)   cov(∑_{i} vi* Xi, ∑_{j} wj* Yj) = ∑_{i}∑_{j} vi* wj cov(Xi, Yj)

    Let V and W be the vectors with components vi and wj respectively, and let A = V^H X and B = W^H Y. By definition, cov(A, B) = E{(A − E{A})(B − E{B})*}. Replacing A and B by their respective expressions and using E{A} = V^H E{X} and E{B} = W^H E{Y}, we obtain, successively:

    cov(A, B) = E{(V^H(X − E{X}))(W^H(Y − E{Y}))*} = ∑_{i}∑_{j} vi* wj E{(Xi − E{Xi})(Yj − E{Yj})*} = ∑_{i}∑_{j} vi* wj cov(Xi, Yj)

    thus demonstrating expression (1.21). Using matrix notation, this is written:

    (1.22)   cov(V^H X, W^H Y) = V^H C W

    where C designates the covariance matrix of X and Y .

    Property 1.6 (Linear transformation of a random vector) Let {X1, …, Xn} be n random variables with E{X} as their mean vector and CX as their covariance matrix, and let {Y1, …, Yq} be q random variables obtained by the linear transformation:

    Y = AX + b

    where A is a (q × n) matrix and b is a non-random vector of the appropriate size. We then have:

    E{Y} = A E{X} + b   and   CY = A CX A^H
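
    An empirical check of this property (a MATLAB sketch for the real case, with an arbitrary matrix A and vector b; not one of the book's programs):

        % Empirical check of E{Y} = A E{X} + b and CY = A CX A' (real case)
        N  = 1e5;
        A  = [1 2 0; 0 1 -1];  b = [1; -2];
        X  = randn(3, N);                 % CX close to the identity matrix
        Y  = A*X + b*ones(1, N);
        mean(Y, 2)                        % close to A*0 + b = b
        cov(Y')                           % close to A*A'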

    Definition 1.13 (White sequence) Let {X1, …, Xn} be a set of n random variables. They are said to form a white sequence if var(Xi) = σ² and if cov(Xi, Xj) = 0 for i ≠ j. Hence their covariance matrix can be expressed:

    C = σ² In

    where In is the (n × n) identity matrix.

    Property 1.7 (Independence ⇒ non-correlation) If the random variables {X1, …, Xn} are independent, then they are uncorrelated, and hence their covariance matrix is diagonal. In general, the converse statement is false.
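
    The classical counterexample X Gaussian and Y = X² illustrates this: the MATLAB sketch below (not one of the book's programs) shows a covariance close to zero although Y is a deterministic function of X:

        % Uncorrelated does not imply independent: X ~ N(0,1), Y = X^2
        N = 1e6;
        X = randn(N,1);
        Y = X.^2;
        C = cov(X, Y);  C(1,2)              % close to 0: uncorrelated
        [mean(Y)  mean(Y(abs(X) > 1))]      % about 1 versus 2.5: Y depends on X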

    1.2 Conditional expectation

    Definition 1.14 (Conditional expectation) We consider a random variable X taking its values in χ ⊂ ℝ and a random vector Y, with joint probability density pXY(x, y). The conditional expectation of X given Y is a (measurable) real-valued function g(Y) such that, for any other real-valued function h(Y), we have:

    (1.23)   E{(X − g(Y))²} ≤ E{(X − h(Y))²}

    g(Y) is commonly denoted by E{X|Y}.

    Property 1.8 (Conditional probability distribution) We consider a random variable X taking its values in χ and a random vector Y, with joint probability density pXY(x, y). Then E{X|Y} = g(Y), where:

    g(y) = ∫_{χ} x pX|Y(x|y) dx

    with

    (1.24)   pX|Y(x|y) = pXY(x, y) / pY(y),   pY(y) = ∫_{χ} pXY(x, y) dx

    pX|Y(x|y) is known as the conditional probability distribution of X given Y.

    Property 1.9 The conditional expectation verifies the following properties:

    1. linearity: E{aX1 + bX2|Y} = a E{X1|Y} + b E{X2|Y};

    2. orthogonality: E{(X − E{X|Y}) h(Y)} = 0 for any function h;

    3. E{h(Y) f(X)|Y} = h(Y) E{f(X)|Y}, for all functions f and h;

    4. E{E{f(X, Y)|Y}} = E{f(X, Y)} for any function f; specifically E{E{X|Y}} = E{X};

    5. refinement by conditioning: it can be shown (see page 13) that

    (1.25)   cov(E{X|Y}) ≤ cov(X)

    The variance is therefore reduced by conditioning (a numerical illustration is given after this list);

    6. if X and Y are independent, then E{X|Y} = E{X}; specifically, E{XY} = E{X}E{Y}. The converse is not true;

    7. E{X|Y} = X if and only if X is a function of Y.
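
    As announced in item 5, the refinement property can be observed numerically. The MATLAB sketch below (a crude illustration, not one of the book's programs; a jointly Gaussian pair and a simple binning estimator of E{X|Y} are assumed) shows var(E{X|Y}) well below var(X):

        % Crude estimate of E{X|Y} by averaging X inside bins of Y
        N = 1e5;
        Y = randn(N,1);
        X = 0.8*Y + 0.6*randn(N,1);          % jointly Gaussian pair
        edges = linspace(-4, 4, 41);
        [~, bin] = histc(Y, edges);          % bin index of each Y sample
        g = zeros(N,1);
        for k = 1:40
            idx = (bin == k);
            if any(idx), g(idx) = mean(X(idx)); end
        end
        [var(g)  var(X)]                     % about 0.64 versus 1: inequality (1.25)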

    1.3 Projection theorem

    Definition 1.15 (Scalar product) Let H be a vector space constructed over ℂ. The scalar product is a mapping from H × H to ℂ, noted (X, Y),

    which verifies the following properties:

    – (X, Y) = (Y, X)*;

    – (αX + βY, Z) = α (X, Z) + β (Y, Z);

    – (X, X) ≥ 0. The equality occurs if and only if X = 0.

    A vector space constructed over ℂ has a Hilbert space structure if it possesses a scalar product and if it is complete ³ . The norm of X is defined by ||X|| = √(X, X) and the distance between two elements by d(X1, X2) = ||X1 − X2||. Two elements X1 and X2 are said to be orthogonal, noted X1 ⊥ X2, if and only if (X1, X2) = 0. The demonstration of the following properties is trivial:

    – Schwarz inequality:

    (1.26)   |(X1, X2)| ≤ ||X1|| ||X2||

    the equality occurs if and only if λ exists such that X1 = λX2;

    – triangular inequality:

    (1.27)   ||X1 + X2|| ≤ ||X1|| + ||X2||

    – parallelogram identity:

    (1.28)   ||X1 + X2||² + ||X1 − X2||² = 2||X1||² + 2||X2||²

    In a Hilbert space, the projection theorem enables us to associate any given element from the space with its best quadratic approximation contained in a closed vector sub-space:

    Theorem 1.2 (Projection theorem) Let H be a Hilbert space defined over ℂ and C a closed vector sub-space of H. Each vector X of H may then be associated with a unique element X0 of C such that ∀Y ∈ C we have d(X, X0) ≤ d(X, Y).

    Vector X0 verifies, for any Y ∈ C, the relationship (X − X0) ⊥ Y.

    The relationship (X − X0) ⊥ Y constitutes the orthogonality principle.

    A geometric representation of the orthogonality principle is shown in Figure 1.1. The element of C closest in distance to X is given by the orthogonal projection of X onto C. In practice, this is the relationship which allows us to find the solution X0.

    This result is used alongside the expression of the norm of X − X 0, which is written:

    (1.29)   ||X − X0||² = ||X||² − ||X0||² − 2Re{(X0, X − X0)}

    The term (X 0 , X − X 0 ) is null due to the orthogonality principle.

    Figure 1.1 – Orthogonality principle: the point X0 which is the closest to X in C is such that X − X0 is orthogonal to C

    In what follows, the vector X0 will be noted (X|C), or (X|Y1:n) when the sub-space onto which projection occurs is spanned by the linear combinations of the vectors Y1, …, Yn.

    The simplest application of theorem 1.2 provides that, for any X and any non-zero ε ∈ H, the projection of X onto the sub-space spanned by ε is given by:

    (1.30)   (X|ε) = ((X, ε) / (ε, ε)) ε

    The projection theorem leads us to define a mapping associating element X with element (X|C). This mapping is known as the orthogonal projection of X onto C. The orthogonal projection verifies the following properties:

    1. linearity: (λX1 + µX2|C) = λ(X1|C) + µ(X2|C);

    2. contraction: ||(X|C)|| ≤ ||X||;

    3. if C′ ⊂ C, then ((X|C)|C′) = (X|C′);

    4. if C1 ⊥ C2, then (X|C1 ⊕ C2) = (X|C1) + (X|C2).

    The following result is fundamental:

    (1.31)   (X|Y1:n+1) = (X|Y1:n) + (X|ε)

    where ε = Yn+1 − (Yn+1|Y1:n). Because the sub-space spanned by Y1:n+1 coincides with the sub-space spanned by (Y1:n, ε), and because ε is orthogonal to the sub-space generated by (Y1:n), property (4) applies. To complete the proof we use (1.30).

    Formula (1.31) is the basic formula used in the determination of many recursive algorithms, such as the Kalman filter or the Levinson recursion.
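
    The orthogonality principle is easy to visualise numerically. In the MATLAB sketch below (an illustration using ordinary least squares on finite-length vectors, not one of the book's programs), x is projected onto the span of the columns of Y; the residual is then orthogonal to each column, and the squared norms satisfy Pythagoras's relation:

        % Projection of x onto span(Y) and check of the orthogonality principle
        N = 1000;  n = 3;
        Y  = randn(N, n);                       % vectors spanning the sub-space
        x  = Y*[1; -2; 0.5] + 0.3*randn(N,1);
        a  = (Y'*Y) \ (Y'*x);                   % normal equations
        x0 = Y*a;                               % projection of x onto span(Y)
        Y'*(x - x0)                             % close to zero: residual orthogonal
        norm(x)^2 - norm(x0)^2 - norm(x-x0)^2   % close to zero: Pythagoras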

    Theorem 1.3 (Square-integrable r.v.) Let H be the vector space of the square-integrable random variables defined on the underlying probability space. Using the scalar product (X, Y) = E{XY*}, H has a Hilbert space structure.

    Conditional expectation

    The conditional expectation E{X|Y} may be seen as the orthogonal projection of X onto the sub-space of all measurable functions of Y. Similarly, E{X} may be seen as the orthogonal projection of X onto the sub-space of the constant random variables. These vectors are shown in Figure 1.2. Because the sub-space of constant random variables is included in the sub-space of the measurable functions of Y, using Pythagoras's theorem, we deduce that:

    E{(X − E{X})²} = E{(X − E{X|Y})²} + E{(E{X|Y} − E{X})²} ≥ E{(E{X|Y} − E{X})²}

    demonstrating var(E{X|Y}) ≤ var(X). This can be extended to random vectors, giving the inequality (1.25), i.e. cov(E{X|Y}) ≤ cov(X).

    Figure 1.2 – The conditional expectation E{X|Y} is the orthogonal projection of X onto the set of measurable functions of Y. The expectation E{X} is the orthogonal projection of X onto the set of constant functions. Clearly, E{E{X|Y}} = E{X}.

    1.4 Gaussianity

    Real Gaussian random variable

    Definition 1.16 A random variable X is said to be Gaussian, or normal, if all its values belong to ℝ and if its characteristic function (see definition (1.9)) has the expression:

    (1.32)   φX(u) = exp(jum − σ²u²/2)

    where m is a real parameter and σ is a positive parameter. We check that its mean is equal to m and its variance to σ ².

    If σ ≠ 0, it can be shown that the probability distribution has a probability density function with the expression:

    (1.33)   pX(x) = (1/(σ√(2π))) exp(−(x − m)²/(2σ²))

    Complex Gaussian random variable

    In some problems, and particularly in the field of communications, the complex notation X = U + jV is used, where U and V refer to two real, Gaussian, centered, independent random variables with the same variance σ²/2. Because of independence (definition (1.7)), the joint probability distribution of the pair (U, V) has the following probability density:

    pUV(u, v) = (1/(πσ²)) exp(−(u² + v²)/σ²)

    If we notice that |x|² = u² + v², and if we introduce the notation pX(x) = pUV(u, v), we can also write:

    (1.34)   pX(x) = (1/(πσ²)) exp(−|x|²/σ²)

    Expression (1.34) is called the probability density of a complex Gaussian random variable. The word circular is sometimes added as a reminder that the isodensity contours are the circles u² + v² = constant.

    Note that E{X} = 0, E{|X|²} = σ² and E{X²} = 0.
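
    The construction above is easy to check by simulation; the MATLAB lines below (a sketch, not one of the book's programs, with σ = 2 chosen arbitrarily) generate circular complex Gaussian samples and estimate E{|X|²} and E{X²}:

        % Circular complex Gaussian samples with E{|X|^2} = sigma^2
        N     = 1e6;
        sigma = 2;
        X = sigma/sqrt(2)*(randn(N,1) + 1j*randn(N,1));
        mean(abs(X).^2)                  % close to sigma^2 = 4
        mean(X.^2)                       % close to 0 (circularity)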

    Gaussian random vectors

    Definition 1.17 (Gaussian vector) X1, …, Xn are said to be n jointly Gaussian variables, or the length n vector [X1 … Xn]^T is said to be Gaussian, if any linear combination of its components, that is to say Y = a^H X for any a = [a1 … an]^T ∈ ℂⁿ, is a Gaussian random variable.

    This definition is applicable for vectors with real or complex components.

    Theorem 1.4 (Distribution of a real Gaussian vector) It can be shown that the probability distribution of a length n real Gaussian vector, with a length n mean vector m and an (n × n) covariance matrix C, has the characteristic function:

    (1.35)   φX(u) = exp(j u^T m − (1/2) u^T C u)

    where u = (u1, …, un)^T ∈ ℝⁿ. Let x = (x1, …, xn)^T. If det{C} ≠ 0, the probability distribution’s density has the expression:

    (1.36)   pX(x) = (2π)^{−n/2} det{C}^{−1/2} exp(−(1/2)(x − m)^T C^{−1}(x − m))

    Theorem 1.5 (Distribution of a complex Gaussian vector) We consider a length n complex Gaussian vector, with a length n mean vector m and an (n × n) covariance matrix C. If det{C} ≠ 0, the probability distribution’s density has the expression:

    (1.37)   pX(x) = π^{−n} det{C}^{−1} exp(−(x − m)^H C^{−1}(x − m))

    We have

    (1.38)   E{(X − m)(X − m)^H} = C

    (1.39)   E{(X − m)(X − m)^T} = 0n

    where 0n is the (n × n) null-matrix.

    Below, the real and complex Gaussian distributions will be noted N(m, C) and Nc(m, C) respectively.

    Theorem 1.6 (Gaussian case: non-correlation ⇒ independence) If n jointly Gaussian variables are uncorrelated, that is if their covariance matrix is diagonal, then they are independent.

    Theorem 1.7 (Linear transformation of a Gaussian vector) Let [X1 … Xn]^T be a Gaussian vector with a mean vector mX and a covariance matrix CX. The random vector Y = AX + b, where A is a matrix and b a vector of appropriate sizes, is Gaussian and we have:

    (1.40)   E{Y} = A mX + b   and   CY = A CX A^H

    In other words, the Gaussian nature of a vector is preserved under linear transformations.

    Equations (1.40) are a direct consequence of definition (1.17) and of property (1.6).

    More specifically, if X is a Gaussian random vector N(M, C), then the random vector Z = C^{−1/2}(X − M) follows the Gaussian distribution N(0, I). Another way of expressing this is to say that if Z has the distribution N(0, I), then X = M + C^{1/2} Z has the distribution N(M, C).

    Note that, if C denotes a positive matrix, a square root of C is a matrix M which verifies:

    (1.41)   M M^H = C

    Hence, if M is a square root of C, then for any unitary matrix U, i.e. such that UU^H = I, the matrix MU is also a square root of C.
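
    The remark above gives a simple way to draw samples of N(M, C) in MATLAB (a sketch, not one of the book's programs, with an arbitrary positive matrix C; sqrtm is used here, the Cholesky factor chol(C)' being another valid square root):

        % Samples of a Gaussian vector with mean M and covariance C
        C = [2 1 0; 1 2 1; 0 1 2];           % positive definite example
        M = [1; 0; -1];
        Msq = sqrtm(C);                      % square root: Msq*Msq' = C
        N = 1e5;
        Z = randn(3, N);                     % samples with identity covariance
        X = M*ones(1, N) + Msq*Z;            % samples of N(M, C)
        mean(X, 2)                           % close to M
        cov(X')                              % close to C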
