Matrix Operations for Engineers and Scientists: An Essential Guide in Linear Algebra

Ebook · 524 pages · 5 hours
About this ebook

Engineers and scientists need to have an introduction to the basics of linear algebra in a context they understand. Computer algebra systems make the manipulation of matrices and the determination of their properties a simple matter, and in practical applications such software is often essential. However, using this tool when learning about matrices, without first gaining a proper understanding of the underlying theory, limits the ability to use matrices and to apply them to new problems.
This book explains matrices in the detail required by engineering and science students, and it discusses linear systems of ordinary differential equations. These students require a straightforward introduction to linear algebra illustrated by applications to which they can relate. The book caters to the needs of undergraduate engineers in all disciplines, and provides considerable detail where it is likely to be helpful.
According to the author, the best way to understand the theory of matrices is by working simple exercises designed to emphasize the theory while avoiding the distractions caused by unnecessary numerical calculation. The examples and exercises in this book have therefore been constructed so that wherever calculations are necessary they are straightforward. For example, when a characteristic equation occurs, its roots (the eigenvalues of a matrix) can be found by inspection.

The author of this book is Alan Jeffrey, Emeritus Professor of Mathematics at the University of Newcastle upon Tyne. He has given courses on engineering mathematics at UK and US universities.
Language: English
Publisher: Springer
Release date: Sep 5, 2010
ISBN: 9789048192748

    Book preview

    Matrix Operations for Engineers and Scientists - Alan Jeffrey

    Alan Jeffrey, Matrix Operations for Engineers and Scientists: An Essential Guide in Linear Algebra, DOI 10.1007/978-90-481-9274-8_1, © Springer Netherlands 2010

    1. Matrices and Linear Systems of Equations

    Alan Jeffrey¹ 

    (1)

    University of Newcastle, 16 Bruce Bldg., NE1 7RU Newcastle upon Tyne, United Kingdom

    1.1 Systems of Algebraic Equations

    The practical interest in matrices arose from the need to work with linear systems of algebraic equations of the form

    $$ \begin{array}{l} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1, \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2, \\ a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n = b_3, \\ \qquad \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m, \end{array} $$

    (1.1)

    involving the n unknowns x_1, x_2, …, x_n, m equations with constant coefficients a_ij, i = 1, 2, …, m, j = 1, 2, …, n, and m constants b_1, b_2, …, b_m called the nonhomogeneous terms, where the coefficients a_ij and the b_i may be real or complex numbers. A solution set for system (1.1) is a set of numbers {x_1, x_2, …, x_n}, real or complex, that, when substituted into (1.1), satisfies all m equations identically. When m < n, system (1.1) is said to be underdetermined; as there are then fewer linear equations than unknowns, a unique solution set cannot be expected.

    The reason for this can be seen by considering the simple underdetermined system

    $$ \begin{array}{l} x_1 + x_2 + x_3 = 1, \\ x_1 + 2x_2 + 3x_3 = 2. \end{array} $$

    Rewriting the system as

    $$ \begin{array}{l} x_1 + x_2 = 1 - x_3, \\ x_1 + 2x_2 = 2 - 3x_3, \end{array} $$

    and for the moment regarding the expressions on the right of the equality sign as known quantities, solving for x_1 and x_2 by elimination gives x_1 = x_3 and x_2 = 1 − 2x_3, where x_3 is unknown. Setting x_3 = k, where k is a parameter (an arbitrary number), the solution set {x_1, x_2, x_3} of this underdetermined system becomes {k, 1 − 2k, k}. As k is arbitrary, the solution set of the system is not unique. It is not difficult to see that this situation generalizes to larger underdetermined systems, though then the solution set may depend on more than one unknown variable, each of which may be regarded as a parameter.
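
    A minimal numerical sketch of this parametric family (assuming NumPy is available; the sample values of k are illustrative) checks that {k, 1 − 2k, k} satisfies both equations for every sampled k:

        import numpy as np

        # Coefficients and right-hand side of the underdetermined system
        #   x1 + x2 + x3 = 1,
        #   x1 + 2*x2 + 3*x3 = 2.
        A = np.array([[1.0, 1.0, 1.0],
                      [1.0, 2.0, 3.0]])
        b = np.array([1.0, 2.0])

        for k in (-2.0, 0.0, 0.5, 3.0):          # illustrative parameter values
            x = np.array([k, 1.0 - 2.0 * k, k])  # candidate solution {k, 1 - 2k, k}
            assert np.allclose(A @ x, b)
        print("every sampled k satisfies both equations")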

    When m > n, system (1.1) is said to be overdetermined; as the n unknowns must then satisfy m > n linear equations, in general no solution set will exist. That overdetermined systems may or may not have a solution set can be seen by considering the following three systems:

    $$ ({\hbox{a}})\ \begin{array}{l} x_1 + x_2 = 1 \\ x_1 + 2x_2 = 3 \\ x_1 + 3x_2 = 0, \end{array} \qquad ({\hbox{b}})\ \begin{array}{l} x_1 + x_2 + x_3 = 2 \\ x_1 + 2x_2 + 3x_3 = 0 \\ x_1 - 2x_2 + x_3 = -4 \\ 2x_1 + 3x_2 + 4x_3 = 2, \end{array} \qquad ({\hbox{c}})\ \begin{array}{l} x_1 + x_2 + x_3 = 1 \\ x_1 + 2x_2 + 3x_3 = 2 \\ x_2 + 2x_3 = 1 \\ 2x_1 + 3x_2 + 4x_3 = 3. \end{array} $$

    System (a) can have no solution, because the left side of the third equation is the sum of the left sides of the first two equations, but the same relationship does not hold for the right sides. The last equation therefore contradicts the first two, and the system is said to be inconsistent. In system (b) the last equation is the sum of the first two equations, so after discarding the last equation as redundant, solving the remaining three equations by elimination gives x_1 = 2, x_2 = 2 and x_3 = −2. Thus the overdetermined system (b) has the unique solution set {2, 2, −2}. The situation in system (c) is different again, because the third equation is simply the difference between the second and first equations, while the fourth equation is the sum of the first two, so after discarding the last two equations as redundant we are left with the first two equations, which were shown earlier in the text to have the nonunique solution set {x_1, x_2, x_3} of the form {k, 1 − 2k, k}, with k arbitrary (a parameter).
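
    As a computational aside (a hedged sketch assuming NumPy; the rank test used here is a standard check that the text has not yet introduced), a system Ax = b is consistent exactly when rank A equals the rank of the augmented matrix [A | b], and the solution is unique when that common rank also equals the number of unknowns:

        import numpy as np

        systems = {
            "a": (np.array([[1., 1.], [1., 2.], [1., 3.]]),
                  np.array([1., 3., 0.])),
            "b": (np.array([[1., 1., 1.], [1., 2., 3.], [1., -2., 1.], [2., 3., 4.]]),
                  np.array([2., 0., -4., 2.])),
            "c": (np.array([[1., 1., 1.], [1., 2., 3.], [0., 1., 2.], [2., 3., 4.]]),
                  np.array([1., 2., 1., 3.])),
        }

        for name, (A, b) in systems.items():
            rA = np.linalg.matrix_rank(A)
            rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
            if rA < rAb:
                verdict = "inconsistent (no solution)"
            elif rA == A.shape[1]:
                verdict = "unique solution"
            else:
                verdict = "infinitely many solutions"
            print(f"system ({name}): {verdict}")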

    Finally, when m = n, system (1.1) is said to be properly determined; as the n unknowns must satisfy n linear equations, a unique solution set can be expected unless one or more of the equations contradicts the others. This is the case with the system

    $$ \begin{array}{l} x_1 + x_2 - x_3 = 6, \\ x_1 - x_2 + x_3 = -4, \\ x_1 + 2x_2 - x_3 = 8, \end{array} $$

    which is easily seen to have the unique solution set {x_1, x_2, x_3} given by {1, 2, −3}. Notice that when, as above, the general solution set {x_1, x_2, x_3} is equated to {1, 2, −3}, corresponding entries are required to be equal, so writing {x_1, x_2, x_3} = {1, 2, −3} means that x_1 = 1, x_2 = 2 and x_3 = −3. This interpretation of equality between similar arrangements (arrays) of quantities, which in this case were numbers, will be seen to play an important role when matrices are introduced and their equality is defined.
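
    For a properly determined system such as this, a one-line numerical check is possible (a minimal sketch assuming NumPy; np.linalg.solve requires a square, nonsingular coefficient matrix):

        import numpy as np

        A = np.array([[1., 1., -1.],
                      [1., -1., 1.],
                      [1., 2., -1.]])
        b = np.array([6., -4., 8.])

        x = np.linalg.solve(A, b)   # unique solution of the 3 x 3 system
        print(x)                    # expected: [ 1.  2. -3.]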

    1.2 Suffix and Matrix Notation

    Later, the solution of the system of equations (1.1) will be considered in detail, and it will be shown how to determine whether a unique solution set exists, whether a solution set exists but is not unique (and, in that case, how many arbitrary parameters the solution set must contain), and whether no solution set exists.

    The suffix notation for the coefficients and unknowns in system (1.1) is standard, and its purpose is to show that a_ij is the numerical multiplier of the jth unknown x_j in the ith equation, and that b_i is the corresponding nonhomogeneous term in the ith equation. With this understanding, because each of the numbers a_ij and b_i carries its own sign, if the n unknowns x_1, x_2, …, x_n are arranged in the same order in each equation, the symbols x_1, x_2, …, x_n may be omitted and the system represented instead by the array of numbers

    $$ \begin{array}{ccccccc} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & \vdots & b_1 \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} & \vdots & b_2 \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} & \vdots & b_3 \\ & & \vdots & & & & \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} & \vdots & b_m \end{array}. $$

    (1.2)

    For reasons that will appear later, the nonhomogeneous terms b_i have been separated from the array of coefficients a_ij, and for the time being the symbol $$ \vdots $$ has been written in place of the equality sign. The double suffix ij serves as the grid reference for the position of the number a_ij in the array (1.2), showing that it occurs in the ith row and the jth column, while for the nonhomogeneous term b_i the suffix i shows the row in which b_i occurs. For example, if a_32 = −5, the numerical multiplier of x_2 in the third equation of (1.1) is −5, so the element in the second position of the third row of array (1.2) is −5. Similarly, if b_3 = 4, the nonhomogeneous term in the third equation of (1.1) is 4, so the entry b_3 in (1.2) is 4. Arrays of m rows of n numbers are called matrices, and a concise notation is needed if algebra performed on equations like (1.1) is to be replaced by algebra performed on matrices. The standard notation for a matrix A containing the entries a_ij, and for a matrix containing the entries b_i in (1.2), is to write

    $$ {\mathbf{A}} = \left[ \begin{array}{ccccc} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{array} \right], \quad {\mathbf{b}} = \left[ \begin{array}{c} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_m \end{array} \right], $$

    (1.3)

    or more concisely still,

    $$ {\mathbf{A}} = [a_{ij}], \quad i = 1, 2, \ldots, m, \; j = 1, 2, \ldots, n, \quad {\hbox{and}} \quad {\mathbf{b}} = [b_i], \quad i = 1, 2, \ldots, m. $$

    (1.4)

    A different but equivalent notation that is also in use replaces the square brackets [.] by (.), in which case (1.4) become A = (a ij ) and b = (b i ).

    Expression A in (1.3) is called an m × n matrix, the notation showing the number of rows m and the number of columns n it contains without specifying its individual entries. The notation m × n is often called the size or shape of a matrix, as it gives a qualitative understanding of the extent of the matrix without specifying the individual entries a_ij. A matrix in which the number of rows equals the number of columns is called a square matrix, so if it has n rows it is an n × n matrix. Matrix b in (1.3) is called an m element column vector, or, if the number of entries in b is unimportant, simply a column vector. A matrix with the n entries c_1, c_2, …, c_n of the form

    $$ {\mathbf{c}} = [c_1,\; c_2,\; c_3,\; \ldots,\; c_n] $$

    (1.5)

    is called an n element row vector, or, if the number of entries in c is unimportant, simply a row vector. In what follows we use the convention that row and column vectors are denoted by bold lower case Roman characters, while other matrices are denoted by bold upper case Roman characters. The entries in matrices and vectors are called elements, so an m × n matrix contains mn elements, while the row vector in (1.5) is an n element row vector. As a rule, the entries in a general matrix A are denoted by the corresponding lower case italic letter a with a suitable double suffix, while in a row or column vector d the elements are denoted by the corresponding lower case italic letter d with a single suffix. The elements in each row of A in (1.3) form an n element row vector, and the elements in each column form an m element column vector. This interpretation of matrices as collections of row or column vectors will be needed later when the operations of matrix transposition and multiplication are defined.
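
    This notation maps directly onto array software; the following minimal sketch (assuming NumPy; the numerical entries are illustrative) shows an m × n matrix, a column vector and a row vector together with their shapes:

        import numpy as np

        A = np.array([[1, 2, 3],
                      [4, 5, 6]])     # a 2 x 3 matrix A = [a_ij]
        b = np.array([[7], [8]])      # a 2 element column vector (shape 2 x 1)
        c = np.array([[9, 10, 11]])   # a 3 element row vector (shape 1 x 3)

        print(A.shape, b.shape, c.shape)  # (2, 3) (2, 1) (1, 3)
        print(A[1, 0])  # the entry a_21 (NumPy indexing is zero-based) -> 4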

    1.3 Equality, Addition and Scaling of Matrices

    Two matrices A and B are said to be equal, shown by writing A = B, if each matrix has the same number of rows and columns, and elements in corresponding positions in A and B are equal. For example, if

    $$ {\mathbf{A}} = \left[ \begin{array}{cc} 1 & p \\ 2 & -4 \end{array} \right] \quad {\hbox{and}} \quad {\mathbf{B}} = \left[ \begin{array}{cc} 1 & 3 \\ 2 & q \end{array} \right], $$

    equality is possible because each matrix has the same number of rows and columns, so they each have the same shape, but A = B only if, in addition, p = 3 and q = −4.

    If every element in a matrix is zero, the matrix is written 0 and called the null or zero matrix. It is not usual to indicate the number of rows and columns in a null matrix, because they are assumed to be appropriate for whatever algebraic operations are being performed. If, for example, all of the nonhomogeneous terms in the linear system (1.1) are zero, so that b_1 = b_2 = … = b_m = 0, the corresponding vector b in (1.3) becomes b = 0, where in this case 0 is an m element column vector with every element zero. A column or row vector in which every element is zero is called a null vector.

    Given two similar systems of equations

    $$ \begin{array}{l} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\ a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n = b_3 \\ \qquad \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m \end{array} \quad {\hbox{and}} \quad \begin{array}{l} \tilde{a}_{11}x_1 + \tilde{a}_{12}x_2 + \cdots + \tilde{a}_{1n}x_n = \tilde{b}_1 \\ \tilde{a}_{21}x_1 + \tilde{a}_{22}x_2 + \cdots + \tilde{a}_{2n}x_n = \tilde{b}_2 \\ \tilde{a}_{31}x_1 + \tilde{a}_{32}x_2 + \cdots + \tilde{a}_{3n}x_n = \tilde{b}_3 \\ \qquad \vdots \\ \tilde{a}_{m1}x_1 + \tilde{a}_{m2}x_2 + \cdots + \tilde{a}_{mn}x_n = \tilde{b}_m, \end{array} $$

    the result of adding corresponding equations, and writing the result in matrix form, leads to the following definitions of the sum of the respective coefficient matrices and of the vectors that contain the nonhomogeneous terms

    $$ {\mathbf{A}} + {{\tilde{\mathbf{A}}}} = \left[ {\begin{array}{llllllllll}{{a_{11}} + {{\tilde{a}}_{11}}} & {{a_{12}} + {{\tilde{a}}_{12}}} & {{a_{13}} + {{\tilde{a}}_{13}}} & \cdots & {{a_{1n}} + {{\tilde{a}}_{1n}}} \\{{a_{21}} + {{\tilde{a}}_{21}}} & {{a_{22}} + {{\tilde{a}}_{22}}} & {{a_{23}} + {{\tilde{a}}_{23}}} & \cdots & {{a_{2n}} + {{\tilde{a}}_{2n}}} \\{{a_{31}} + {{\tilde{a}}_{31}}} & {{a_{32}} + {{\tilde{a}}_{32}}} & {{a_{33}} + {{\tilde{a}}_{33}}} & \cdots & {{a_{3n}} + {{\tilde{a}}_{3n}}} \\\vdots & \vdots & \vdots & \vdots & \vdots \\{{a_{m1}} + {{\tilde{a}}_{m1}}} & {{a_{m2}} + {{\tilde{a}}_{m2}}} & {{a_{m3}} + {{\tilde{a}}_{m3}}} & \cdots & {{a_{mn}} + {{\tilde{a}}_{mn}}} \\\end{array} } \right] {\hbox{and}}\ {\mathbf{b}} + {{\tilde{\mathbf{b}}}} = \left[ {\begin{array}{lllll}{{b_1} + {{\tilde{b}}_1}} \\{{b_2} + {{\tilde{b}}_2}} \\{{b_3} + {{\tilde{b}}_3}} \\\vdots \\{{b_m} + {{\tilde{b}}_m}} \\\end{array} } \right]\,. $$

    This shows that if matrix algebra is to represent ordinary algebraic addition, it must be defined as follows. Matrices A and B will be said to be conformable for addition, or summation, if each matrix has the same number of rows and columns. Setting A = [a ij ], B = [b ij ], the sum A + B of matrices A and B is defined as the matrix

    $$ {\mathbf{A}} + {\mathbf{B}} = \left[ {{a_{ij}} + {b_{ij}}} \right]. $$

    (1.6)

    It follows directly from (1.6) that

    $$ {\mathbf{A}} + {\mathbf{B}} = {\mathbf{B}} + {\mathbf{A}}, $$

    (1.7)

    so matrix addition is commutative. This means the order in which conformable matrices are added (summed) is unimportant, as it does not affect the result. It follows from (1.6) that the difference between matrices A and B, written A − B, is defined as

    $$ {\mathbf{A}} - {\mathbf{B}} = [{a_{ij}} - {b_{ij}}]. $$

    (1.8)

    The sum and difference of matrices A and B with different shapes are not defined.

    If each equation in (1.1) is scaled (multiplied) by a constant k the matrices in (1.3) become

    $$ \left[ \begin{array}{ccccc} ka_{11} & ka_{12} & ka_{13} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & ka_{23} & \cdots & ka_{2n} \\ ka_{31} & ka_{32} & ka_{33} & \cdots & ka_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ ka_{m1} & ka_{m2} & ka_{m3} & \cdots & ka_{mn} \end{array} \right] \quad {\hbox{and}} \quad \left[ \begin{array}{c} kb_1 \\ kb_2 \\ kb_3 \\ \vdots \\ kb_m \end{array} \right]. $$

    This means that if matrix A = [a ij ] is scaled by a number k (real or complex), then the result, written k A, is defined as k A = [ka ij ]. So if A = [a ij ] and B = [b ij ] are conformable for addition and k and K are any two numbers (real or complex), then

    $$ k{\mathbf{A}} + K{\mathbf{B}} = \left[ {k{a_{ij}} + K{b_{ij}}} \right]. $$

    (1.9)

    Example 1.1

    Given $$ {\mathbf{A}} = \left[ \begin{array}{ccc} 4 & -1 & 3 \\ 7 & 0 & -2 \end{array} \right] \quad {\hbox{and}} \quad {\mathbf{B}} = \left[ \begin{array}{ccc} -4 & 2 & 2 \\ -1 & 5 & 6 \end{array} \right] $$, find A + B, A − B and 2A + 3B.

    Solution

    The matrices are conformable for addition because each has two rows and three columns (they have the same shape). Thus, from (1.6), (1.8) and (1.9),

    $$ {\mathbf{A}} + {\mathbf{B}} = \left[ {\begin{array}{lllll}0 & 1 & 5 \\6 & 5 & 4 \\\end{array} } \right],\ {\mathbf{A}} - {\mathbf{B}} = \left[ {\begin{array}{lllll}8 & { - 3} & 1 \\8 & { - 5} & { - 8} \\\end{array} } \right] \ {{{\rm and} \ \; 2}}{\mathbf{A}} + {3}{\mathbf{B}} = \left[ {\begin{array}{lllll}{ - 4} & 4 & {12} \\{11} & {15} & {14} \\\end{array} } \right]. $$
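
    These results are easy to spot check numerically (a minimal sketch assuming NumPy, whose +, − and scalar multiplication act elementwise, exactly as in (1.6), (1.8) and (1.9)):

        import numpy as np

        A = np.array([[4, -1, 3], [7, 0, -2]])
        B = np.array([[-4, 2, 2], [-1, 5, 6]])

        print(A + B)          # [[ 0  1  5] [ 6  5  4]]
        print(A - B)          # [[ 8 -3  1] [ 8 -5 -8]]
        print(2 * A + 3 * B)  # [[-4  4 12] [11 15 14]]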

    1.4 Some Special Matrices and the Transpose Operation

    Some square matrices exhibit certain types of symmetry in the pattern of their coefficients. Consider the n × n matrix

    $$ {\mathbf{A}} = \left[ \begin{array}{ccccc} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{array} \right], $$

    then the diagonal drawn from top left to bottom right containing the elements a 11, a 22, a 33,…, a nn is called the leading diagonal of the matrix.

    A square matrix A is said to be symmetric if its numerical entries appear symmetrically about the leading diagonal. That is, the elements of an n × n symmetric matrix A are such that

    $$ a_{ij} = a_{ji}, \quad i, j = 1, 2, \ldots, n \quad ({\hbox{condition for symmetry}}). $$

    (1.10)

    Another way of defining a symmetric matrix is to say that if a new matrix B is constructed such that row 1 of A is written as column 1 of B, row 2 of A as column 2 of B, …, and row n of A as column n of B, then A is symmetric if B = A. For example, if

    $$ {\mathbf{A}} = \left[ \begin{array}{ccc} 1 & 4 & 3 \\ 4 & 2 & 6 \\ 3 & 6 & 4 \end{array} \right] \quad {\hbox{and}} \quad {\mathbf{B}} = \left[ \begin{array}{ccc} 1 & 5 & 7 \\ 9 & 4 & 5 \\ 1 & 0 & 1 \end{array} \right], $$

    then A is seen to be a symmetric matrix, but B is not symmetric.

    Belonging to the class of symmetric matrices are the n × n diagonal matrices, all of whose elements away from the leading diagonal are zero. A diagonal matrix A with entries λ_1, λ_2, …, λ_n on its leading diagonal, some of which may be zero, is often written A = diag{λ_1, λ_2, …, λ_n}.

    An important special case of the diagonal matrices is the class of identity matrices, also called unit matrices, which are denoted collectively by the symbol I. These are diagonal matrices in which every element on the leading diagonal is 1 and all remaining entries are zero. Written out in full, if A = diag{2, −3, 1} and I is the 3 × 3 identity matrix, then

    $$ {\mathbf{A}} = {\hbox{diag}}\{ 2, - 3,1\} = \left[ {\begin{array}{lllll}2 & 0 & 0 \\0 & { - 3} & 0 \\0 & 0 & 1 \\\end{array} } \right]{ }\quad{\hbox{and }}\quad{\mathbf{I}} = \left[ {\begin{array}{lllll}1 & 0 & 0 \\0 & 1 & 0 \\0 & 0 & 1 \\\end{array} } \right]\,. $$

    As with the null matrix, it is not usual to specify the number of rows in an identity matrix, because the number is assumed to be appropriate for whatever algebraic operation is to be performed that involves I. If, for any reason, it is necessary to show the precise shape of an identity matrix, it is sufficient to write I n to show an n × n identity matrix is involved. In terms of this notation, the 3 × 3 identity matrix shown above becomes I 3.
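
    In array software these special matrices have direct constructors (a short sketch assuming NumPy):

        import numpy as np

        A = np.diag([2, -3, 1])   # the diagonal matrix A = diag{2, -3, 1}
        I3 = np.eye(3)            # the 3 x 3 identity matrix I_3

        print(A)
        print(I3)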

    A different form of symmetry occurs when the n × n matrix A = [a ij ] is skew symmetric, in which case its entries a ij are such that

    $$ a_{ij} = -a_{ji} \quad {\hbox{for}} \quad i, j = 1, 2, \ldots, n \quad ({\hbox{condition for skew symmetry}}). $$

    (1.11)

    Notice that elements on the leading diagonal of a skew symmetric matrix must all be zero, because by definition a ii = −a ii , and this is only possible if a ii = 0 for i = 1, 2,…, n.

    A typical example of a skew symmetric matrix is

    $$ {\mathbf{A}} = \left[ {\begin{array}{lllll}0 & 1 & 3 & { - 2} \\{ - 1} & 0 & 4 & 6 \\{ - 3} & { - 4} & 0 & { - 1} \\2 & { - 6} & 1 & 0 \\\end{array} } \right]\,. $$

    Other square matrices that are important are upper and lower triangular matrices, denoted respectively by U and L. In U all elements below the leading diagonal are zero, while in L all elements above the leading diagonal are zero. Typical examples of upper and lower triangular matrices are

    $$ \;{\mathbf{U}} = \left[ {\begin{array}{lllll}2 & 0 & 8 \\0 & 1 & 6 \\0 & 0 & { - 3} \\\end{array} } \right]\ {{\rm and}}\ {\mathbf{L}} = \left[ {\begin{array}{lllll}3 & 0 & 0 \\5 & 1 & 0 \\{ - 9} & 7 & 0 \\\end{array} } \right]\!. $$

    The need to construct matrices in which rows and columns have been interchanged (not necessarily square matrices) leads to the introduction of the transpose operation. The transpose of an m × n matrix A, denoted by A T, is the n × m matrix derived from A by writing row 1 of A as column 1 of A T, row 2 of A as column 2 of A T,…, and row m of A as column m of A T. Obviously, the transpose of a transposed matrix is the original matrix, so (A T)T = A. Typical examples of transposed matrices are

    $$ {\left[ {\begin{array}{lllll}1 & { - 4} & 7 \\\end{array} } \right]^{\rm{T}}} = \left[ {\begin{array}{lllll}1 \\{ - 4} \\7 \\\end{array} } \right], {\left[ {\begin{array}{lllll}1 \\{ - 4} \\7 \\\end{array} } \right]^{\rm{T}}}\!\!\! = \left[ {1, - 4,7} \right]\!\!\quad{{\rm and}}\quad \!\!{\left[ {\begin{array}{lllll}2 & 0 & 5 \\1 & { - 1} & 4 \\\end{array} } \right]^{\rm{T}}} = \left[ {\begin{array}{lllll}2 & 1 \\0 & { - 1} \\5 & 4 \\\end{array} } \right]\!. $$

    Clearly, a square matrix A is symmetric if A T = A, and it is skew symmetric if A T = −A. The matrix transpose operation has many uses, some of which will be encountered later.
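
    The transpose tests just stated are easy to express in code (a hedged sketch assuming NumPy; the helper function names are my own):

        import numpy as np

        def is_symmetric(M):
            # A is symmetric when A^T equals A
            return np.array_equal(M.T, M)

        def is_skew_symmetric(M):
            # A is skew symmetric when A^T equals -A
            return np.array_equal(M.T, -M)

        S = np.array([[1, 4, 3], [4, 2, 6], [3, 6, 4]])   # symmetric example from the text
        K = np.array([[0, 1, 3, -2], [-1, 0, 4, 6],
                      [-3, -4, 0, -1], [2, -6, 1, 0]])    # skew symmetric example from the text

        print(is_symmetric(S), is_skew_symmetric(K))  # True True
        print(np.array_equal(S.T.T, S))               # (A^T)^T = A -> True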

    A useful property of the transpose operation when applied to the sum of two m × n matrices A and B is that

    $$ {[{\mathbf{A}} + {\mathbf{B}}]^{\rm{T}}} = {{\mathbf{A}}^{\rm{T}}} + {{\mathbf{B}}^{\rm{T}}}. $$

    (1.12)

    The proof of this result is almost immediate. If A = [a_ij] and B = [b_ij], by definition

    $$ {\mathbf{A}} + {\mathbf{B}} = \left[ {\begin{array}{lllll}{{a_{11}} + {b_{11}}} & {{a_{12}} + {b_{12}}} & \cdots & {{a_{1n}} + {b_{1n}}} \\{{a_{21}} + {b_{21}}} & {{a_{22}} + {b_{22}}} & \cdots & {{a_{2n}} + {b_{2n}}} \\\vdots & \vdots & \vdots & \vdots \\{{a_{m1}} + {b_{m1}}} & {{a_{m2}} + {b_{m2}}} & \cdots & {{a_{mn}} + {b_{mn}}} \\\end{array} } \right]. $$

    Taking the transpose of this result, and then using the rule for matrix addition, we have

    $$ [{\mathbf{A}} + {\mathbf{B}}]^{\rm{T}} = \left[ \begin{array}{cccc} a_{11} + b_{11} & a_{21} + b_{21} & \cdots & a_{m1} + b_{m1} \\ a_{12} + b_{12} & a_{22} + b_{22} & \cdots & a_{m2} + b_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} + b_{1n} & a_{2n} + b_{2n} & \cdots & a_{mn} + b_{mn} \end{array} \right] = {{\mathbf{A}}^{\rm{T}}} + {{\mathbf{B}}^{\rm{T}}}, $$

    and the result is established.
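
    A numerical spot check of (1.12) on random matrices (a minimal sketch assuming NumPy; the shapes and the seed are arbitrary choices):

        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.integers(-5, 6, size=(3, 4))
        B = rng.integers(-5, 6, size=(3, 4))

        assert np.array_equal((A + B).T, A.T + B.T)   # the rule [A + B]^T = A^T + B^T
        print("check passed for this sample")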

    An important use of matrices occurs in the study of properly determined systems of n linear first order differential equations in the n unknown differentiable functions x 1(t), x 2(t),…, x n (t) of the independent variable t:

    $$ \begin{array}{lllll}{\frac{{d{x_1}(t)}}{{dt}} = {a_{11}}{x_1}(t) + {a_{12}}{x_2}(t) + \cdots + {a_{1n}}{x_n}(t)}, \\{\frac{{d{x_2}(t)}}{{dt}} = {a_{21}}{x_1}(t) + {a_{22}}{x_2}(t) + \cdots + {a_{2n}}{x_n}(t)}, \\\vdots \\{\frac{{d{x_n}(t)}}{{dt}} = {a_{n1}}{x_1}(t) + {a_{n2}}{x_2}(t) + \cdots + {a_{nn}}{x_n}(t)}. \\\end{array} $$

    (1.13)

    In the next chapter matrix multiplication will be defined, and in anticipation of this we define the coefficient matrix of system (1.13) as A = [a ij ], and the column vectors x(t), and d x(t)/dt as

    $$ \begin{array}{ll} {\mathbf{x}}\,(t)= {\left[ {{x_1}(t),{x_2}(t), \ldots,{x_n}(t)} \right]^{\rm{T}}}\,\,\,{\hbox{and }} \\\frac{{d{\mathbf{x}}(t)}}{{dt}}= {\left[ {\frac{{d{x_1}(t)}}{{dt}},\frac{{d{x_2}(t)}}{{dt}}, \ldots, \frac{{d{x_n}(t)}}{{dt}}} \right]^{\rm{T}}}, \\\end{array}$$

    (1.14)

    where the transpose operation has been used to write a column vector as the transpose of a row vector to save space on the printed page. System (1.13) can be written more concisely as

    $$ \frac{{d{\mathbf{x}}(t)\,}}{{dt}} = {\mathbf{Ax}}(t), $$

    (1.15)

    where Ax(t) denotes the product of matrix A and vector x(t), in this order, which will be defined in Chapter 2. Notice how the use of the transpose operation in Eq. (1.14) saves space on a printed page, because had it not been used, column vectors like x(t) and d x(t)/dt when written out in full would have become

    $$ {\mathbf{x}}(t) = \left[ {\begin{array}{lllll}{{x_1}(t)} \\{{x_2}(t)} \\\vdots \\{{x_n}(t)} \\\end{array} } \right]\ {{\rm and }}\ \ \frac{{d{\mathbf{x}}(t)}}{{dt}} = \left[ {\begin{array}{lllll}{\frac{{d{x_1}(t)}}{{dt}}} \\{\frac{{d{x_2}(t)}}{{dt}}} \\\vdots \\{\frac{{d{x_n}(t)}}{{dt}}} \\\end{array} } \right]. $$
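
    Although the product Ax(t) is only defined in Chapter 2, system (1.15) can already be integrated numerically; the sketch below (assuming NumPy and SciPy; the 2 × 2 matrix and the initial vector are illustrative choices, not from the text) uses scipy.integrate.solve_ivp:

        import numpy as np
        from scipy.integrate import solve_ivp

        A = np.array([[0.0, 1.0],
                      [-1.0, 0.0]])   # illustrative coefficient matrix
        x0 = np.array([1.0, 0.0])     # illustrative initial condition x(0)

        # dx/dt = A x, integrated over one full period of the resulting cos/sin solution
        sol = solve_ivp(lambda t, x: A @ x, t_span=(0.0, 2 * np.pi), y0=x0)
        print(sol.y[:, -1])           # approximately [1, 0] again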

    Exercises

    1.

    Write down the coefficient matrix A and nonhomogeneous term matrix b for the linear nonhomogeneous system of equations in the variables x 1, x 2, x 3 and x 4:

    $$ \begin{array}{l} 3x_1 + 2x_2 - 4x_3 + 5x_4 = 4, \\ 3x_1 + 2x_2 - x_4 + 4x_3 = 3, \\ 4x_2 - 2x_1 + x_3 + 5x_4 = 2, \\ 6x_3 + 3x_1 + 2x_2 = 1. \end{array} $$

    2.

    If $$ {\mathbf{A}} = \left[ {\begin{array}{lllll}2 & 0 & 5 \\1 & 3 & 1 \\\end{array} } \right]\!,\,\,\,{\mathbf{B}} = \left[ {\begin{array}{lllll}{ - 1} & 2 & 3 \\{ - 2} & 4 & 6 \\\end{array} } \right]\!, $$ find A + 2B and 3A − 4B.

    3.

    If

    $$ {\mathbf{A}} = \left[ \begin{array}{ccc} 1 & 3 & a \\ 2 & b & -1 \\ -2 & c & 3 \end{array} \right] \quad {\hbox{and}} \quad {\mathbf{B}} = \left[ \begin{array}{ccc} 1 & 2 & -2 \\ 3 & 6 & 4 \\ 0 & -1 & 3 \end{array} \right], $$

    find a, b and c if A = B T.

    4.

    If A = $$ \left[ {\begin{array}{lllll}2 & 4 \\6 & 1 \\0 & 3 \\\end{array} } \right]\!\!\!\!\quad {\hbox{and }}\ {\mathbf{B}} = \left[ {\begin{array}{lllll}4 & 1 & { - 3} \\2 & { - 3} & 1 \\\end{array} } \right] $$ , find 3A − B T and 2A T + 4B.

    5.

    If $$ {\mathbf{A}} = \left[ {\begin{array}{lllll}3 & 0 & 1 \\1 & 4 & 3 \\5 & 1 & 2 \\\end{array} } \right]\!\!\!\!\quad {\hbox{and}}\;\;\;{\mathbf{B}} = \left[ {\begin{array}{lllll}0 & 4 & 1 \\2 & 5 & 1 \\3 & { - 2} & 2 \\\end{array} } \right], $$ find A T + B and 2A + 3(B T)T.

    6.

    If matrices A and B are conformable for addition, prove that (A + B)T = A T + B T.

    7.

    Given

    $$ {\mathbf{A}} = \left[ {\begin{array}{lllll}{{a_{11}}} & 4 & { - 3} & {{a_{14}}} \\{{a_{21}}} & {{a_{22}}} & {{a_{23}}} & {{a_{24}}} \\{{a_{31}}} & 6 & {{a_{33}}} & 7 \\1 & {{a_{42}}} & {{a_{43}}} & {{a_{44}}} \\\end{array} } \right]\!, $$

    what conditions, if any, must be placed on the undefined coefficients a ij if (a) matrix A is to be symmetric, and (b) matrix A is to be skew symmetric?

    8.

    Prove that every n × n matrix A can be written as the sum of a symmetric matrix M and a skew symmetric matrix S. Write down an arbitrary 4 × 4 matrix and use your result to find the matrices M and S.

    9.

    Consider the underdetermined system

    $$ \begin{array}{l}{x_1} + {x_2} + {x_3} = 1, \hfill \\{x_1} + 2{x_2} + 3{x_3} = 2, \hfill \\\end{array} $$

    solved in the text. Rewrite it as the two equivalent systems

    $$ ({\hbox{a}})\ \begin{array}{l} x_1 + x_3 = 1 - x_2 \\ x_1 + 3x_3 = 2 - 2x_2 \end{array} \quad {\hbox{and}} \quad ({\hbox{b}})\ \begin{array}{l} x_2 + x_3 = 1 - x_1 \\ 2x_2 + 3x_3 = 2 - x_1. \end{array} $$

    Find the solution set of system (a) in terms of an arbitrary parameter p = x 2, and the solution set of system (b) in terms of an arbitrary parameter q = x 1. By comparing solution sets, what can you deduce about the solution set found in the text in terms of the arbitrary parameter k = x 3, and the solution sets for systems (a) and (b) found, respectively, in terms of the arbitrary parameters p and q?

    10.

    Consider the two overdetermined systems

    $$ ({\hbox{a}})\ \begin{array}{l} x_1 - 2x_2 + 2x_3 = 6 \\ x_1 + x_2 - x_3 = 0 \\ x_1 + 3x_2 - 3x_3 = -4 \\ x_1 + x_2 + x_3 = 3 \end{array} \quad {\hbox{and}} \quad ({\hbox{b}})\ \begin{array}{l} 2x_1 + 3x_2 - x_3 = 2 \\ x_1 - x_2 + 2x_3 = 1 \\ 4x_1 + x_2 + 3x_3 = 4 \\ x_1 + 4x_2 - 3x_3 = 1. \end{array} $$

    In each case try to find a solution set, and comment on the result.

    Alan Jeffrey, Matrix Operations for Engineers and Scientists: An Essential Guide in Linear Algebra, DOI 10.1007/978-90-481-9274-8_2, © Springer Netherlands 2010

    2. Determinants, and Linear Independence

    Alan Jeffrey¹ 

    (1)

    University of Newcastle, 16 Bruce Bldg., NE1 7RU Newcastle upon Tyne, United Kingdom

    2.1 Introduction to Determinants and Systems of Equations

    Determinants can be defined and studied independently of matrices, though when square matrices occur they play a fundamental role in the study of linear systems of algebraic equations, in the formal definition of an inverse matrix, and in the study of the eigenvalues of a matrix. So, in anticipation of what is to follow in later chapters, and before developing the properties of determinants in general, we will introduce and motivate their study by examining the solution of a very simple system of equations.

    The theory of determinants predates the theory of matrices, determinants having been introduced by Leibniz (1646–1716) independently of his work on the calculus; their theory was subsequently developed as part of algebra until Cayley (1821–1895) first introduced matrices and established the connection between determinants and matrices. Determinants are associated with square matrices, and they arise in many contexts, two of the most important being their connection with systems of linear algebraic equations and with systems of linear differential equations like those in (1.13).

    To see how determinants arise from the study of linear systems of equations we will consider the simplest linear nonhomogeneous system of algebraic equations

    $$ \begin{array}{lllll} {{a_{11}}{x_1} + {a_{12}}{x_2} = {b_1}}, \\{{a_{21}}{x_1} + {a_{22}}{x_2} = {b_2}.} \\\end{array} $$

    (2.1)

    These equations can be solved by elimination as follows. Multiply the first equation by a_22, the second by a_12, and subtract the results to obtain an equation for x_1 from which the variable x_2 has been eliminated. Next, multiply the first equation by a_21, the second by a_11, and subtract the results to obtain an equation for x_2, where this time the variable x_1 has been eliminated. The result is the solution set {x_1, x_2} with its elements given by

    $$ {x_1} = \frac{{{b_1}{a_{22}} - {b_2}{a_{12}}}}{{{a_{11}}{a_{22}} - {a_{12}}{a_{21}}}},\quad {x_2} = \frac{{{b_2}{a_{11}} - {b_1}{a_{21}}}}{{{a_{11}}{a_{22}} - {a_{12}}{a_{21}}}}. $$

    (2.2)

    For this solution set to exist it is necessary that the denominator a 11 a 22 − a 12 a 21 in the expressions for x 1 and x 2 does not vanish. So setting Δ = a 11 a 22 − a 12 a 21, the condition for the existence of the solution set {x 1, x 2} becomes Δ ≠ 0.

    In terms of a square matrix of coefficients whose elements are the coefficients associated with (2.1), namely

    $$ {\mathbf{A}} = \left[ {\begin{array}{lllll} {{a_{11}}} & {{a_{12}}} \\{{a_{21}}} & {{a_{22}}} \\\end{array} } \right], $$

    (2.3)

    the second-order determinant associated with A, written either as det A or as $$ \left| {\mathbf{A}} \right| $$ , is defined as the number

    $$ { \det }\,{\mathbf{A}} = |{\mathbf{A}}| = \left| {\begin{array}{lllll} {{a_{11}}} & {{a_{12}}} \\{{a_{21}}} & {{a_{22}}} \\\end{array} } \right| = {a_{{11}}}{a_{{22}}} - {a_{{12}}}{a_{{21}}}, $$

    (2.4)

    so the denominator in (2.2) is Δ = det A.

    Notice how the value of the determinant in (2.4) is obtained from the elements of A. The expression on the right of (2.4), called the expansion of the determinant, is the product of elements on the leading diagonal of A, from which is subtracted the product of the elements on the cross-diagonal that runs from the bottom left to the top right of the array A. The classification of the type of determinant involved is described by specifying its order, which is the number of rows (equivalently columns) in the square matrix A from which the determinant is derived. Thus the determinant in (2.4) is a second-order determinant. Specifying the order of a determinant gives some indication of the magnitude of the calculation involved when expanding it, while giving no indication of the value of the determinant. If the elements of A are numbers, det A is seen to be a number, but if the elements are functions of a variable, say t, then det A becomes a function of t. In general determinants whose elements are functions, often of several variables, are called functional determinants. Two important examples of these determinants called Jacobian determinants, or simply Jacobians, will be found in Exercises 14 and 15 at the end of this chapter.

    Notice that, in the conventions used in this book, when a matrix is written out in full its elements are enclosed within square brackets, thus [ ⋯ ], whereas the notation for its determinant, which is only associated with a square matrix, encloses its elements between vertical rules, thus | ⋯ |, and these two notations should not be confused.

    Example 2.1

    Given

    $$ \left( {\hbox{a}} \right)\ {\mathbf{A}} = \left[ {\begin{array}{lllll} 1 & 3 \\{ - 4} & 6 \\\end{array} } \right]\ {\hbox{and}}\ \left( {\hbox{b}} \right)\ {\mathbf{B}} = \left[ {\begin{array}{lllll} {{e^t}} & {{e^t}} \\{\cos t} & {\sin t} \\\end{array} } \right], $$

    find det A and det B.

    Solution

    By definition

    $$ ({\hbox{a}})\quad \det {\mathbf{A}} = \left| \begin{array}{cc} 1 & 3 \\ -4 & 6 \end{array} \right| = (1 \times 6) - (3) \times (-4) = 18; \qquad ({\hbox{b}})\quad \det {\mathbf{B}} = \left| \begin{array}{cc} e^t & e^t \\ \cos t & \sin t \end{array} \right| = (e^t) \times (\sin t) - (e^t) \times (\cos t) = e^t(\sin t - \cos t). $$
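
    Part (b) can be confirmed symbolically (a minimal sketch assuming SymPy; it also illustrates that the determinant of a matrix of functions is itself a function of t):

        import sympy as sp

        t = sp.symbols('t')
        B = sp.Matrix([[sp.exp(t), sp.exp(t)],
                       [sp.cos(t), sp.sin(t)]])

        print(B.det())   # e^t*(sin t - cos t), up to the ordering of terms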

    It is possible to express the solution set {x 1, x 2 } in (2.2) entirely in terms of determinants by defining the three second-order determinants

    $$ \Delta = \det {\mathbf{A}} = \left| {\begin{array}{lllll} {{a_{11}}} & {{a_{12}}} \\{{a_{21}}} & {{a_{22}}} \\\end{array} } \right|,\quad {\Delta_1} = \left| {\begin{array}{lllll} {{b_1}} & {{a_{12}}} \\{{b_2}} & {{a_{22}}} \\\end{array} } \right|,\quad {\Delta_2} = \left| {\begin{array}{lllll} {{a_{11}}} & {{b_1}} \\{{a_{21}}} & {{b_2}} \\\end{array} } \right|, $$

    (2.5)

    because then the solutions in (2.2) become

    $$ {x_1} = \frac{{{\Delta_1}}}{\Delta },\quad{x_2} = \frac{{{\Delta_2}}}{\Delta }\;. $$

    (2.6)

    Here Δ is the determinant of the coefficient matrix of system (2.1), while the determinant Δ_1 in the numerator of the expression for x_1 is obtained from Δ by replacing its first column by the nonhomogeneous terms b_1 and b_2, and the determinant Δ_2 in the numerator of the expression for x_2 is obtained from Δ by replacing its second column by the nonhomogeneous terms b_1 and b_2. This is the simplest form of a result known as Cramer's rule for solving the two simultaneous linear algebraic equations in (2.1) in terms of determinants; its generalization to n nonhomogeneous equations in n unknowns will be given later, along with its proof.
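
    Cramer's rule for the 2 × 2 case translates directly into code (a hedged sketch assuming NumPy; the right-hand side b is an illustrative choice and the function name is my own):

        import numpy as np

        def cramer_2x2(A, b):
            # Solve A x = b by (2.5)-(2.6), guarding against det A = 0
            delta = np.linalg.det(A)
            if np.isclose(delta, 0.0):
                raise ValueError("det A = 0: Cramer's rule does not apply")
            A1 = A.copy(); A1[:, 0] = b   # Delta_1: column 1 replaced by b
            A2 = A.copy(); A2[:, 1] = b   # Delta_2: column 2 replaced by b
            return np.array([np.linalg.det(A1) / delta,
                             np.linalg.det(A2) / delta])

        A = np.array([[1.0, 3.0], [-4.0, 6.0]])   # the matrix of Example 2.1(a)
        b = np.array([9.0, 6.0])                  # illustrative nonhomogeneous terms
        print(cramer_2x2(A, b))                   # agrees with np.linalg.solve(A, b)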

    2.2 A First Look at Linear Dependence and Independence

    Before developing the general properties of determinants, the simple system (2.1) will be used to introduce the important concepts of linear dependence and independence of equations. Suppose the second equation in (2.1) is proportional to the first; then for some constant of proportionality λ ≠ 0 it will follow that a_21 = λa_11, a_22 = λa_12 and b_2 = λb_1. If this happens the equations are said to be linearly dependent, while if they are not proportional they are said to be linearly independent. Linear dependence and independence between systems of linear algebraic equations is important irrespective of the number of equations and unknowns involved. Later, when the most important properties of determinants have been established, a determinant test for the linear independence of n homogeneous linear equations in n unknowns will be derived.

    When the equations in system (2.1) are linearly dependent, the system only contains one equation relating x_1 and x_2.
