

Linear Algebra and Its Applications, Fourth Edition
Gilbert Strang

Contents

Preface

1 Matrices and Gaussian Elimination
   1.1 Introduction
   1.2 The Geometry of Linear Equations
   1.3 An Example of Gaussian Elimination
   1.4 Matrix Notation and Matrix Multiplication
   1.5 Triangular Factors and Row Exchanges
   1.6 Inverses and Transposes
   1.7 Special Matrices and Applications
   Review Exercises

2 Vector Spaces
   2.1 Vector Spaces and Subspaces
   2.2 Solving Ax = 0 and Ax = b
   2.3 Linear Independence, Basis, and Dimension
   2.4 The Four Fundamental Subspaces
   2.5 Graphs and Networks
   2.6 Linear Transformations
   Review Exercises

3 Orthogonality
   3.1 Orthogonal Vectors and Subspaces
   3.2 Cosines and Projections onto Lines
   3.3 Projections and Least Squares
   3.4 Orthogonal Bases and Gram-Schmidt
   3.5 The Fast Fourier Transform
   Review Exercises

4 Determinants
   4.1 Introduction
   4.2 Properties of the Determinant
   4.3 Formulas for the Determinant
   4.4 Applications of Determinants
   Review Exercises

5 Eigenvalues and Eigenvectors
   5.1 Introduction
   5.2 Diagonalization of a Matrix
   5.3 Difference Equations and Powers A^k
   5.4 Differential Equations and e^(At)
   5.5 Complex Matrices
   5.6 Similarity Transformations
   Review Exercises

6 Positive Definite Matrices
   6.1 Minima, Maxima, and Saddle Points
   6.2 Tests for Positive Definiteness
   6.3 Singular Value Decomposition
   6.4 Minimum Principles
   6.5 The Finite Element Method

7 Computations with Matrices
   7.1 Introduction
   7.2 Matrix Norm and Condition Number
   7.3 Computation of Eigenvalues
   7.4 Iterative Methods for Ax = b

8 Linear Programming and Game Theory
   8.1 Linear Inequalities
   8.2 The Simplex Method
   8.3 The Dual Problem
   8.4 Network Models
   8.5 Game Theory

A Intersection, Sum, and Product of Spaces
   A.1 The Intersection of Two Vector Spaces
   A.2 The Sum of Two Vector Spaces
   A.3 The Cartesian Product of Two Vector Spaces
   A.4 The Tensor Product of Two Vector Spaces
   A.5 The Kronecker Product A ⊗ B of Two Matrices

B The Jordan Form

C Matrix Factorizations

D Glossary: A Dictionary for Linear Algebra

E MATLAB Teaching Codes

F Linear Algebra in a Nutshell

[Figure: the four fundamental subspaces of an m by n matrix A. In R^n, the row space C(A^T) of dimension r and the nullspace N(A) of dimension n − r; in R^m, the column space C(A) of dimension r and the left nullspace N(A^T) of dimension m − r; with Ax = b, A^T y = c, Ax = 0, and A^T y = 0.]

Preface

Revising this textbook has been a special challenge, for a very nice reason. So many people have read this book, and taught from it, and even loved it. The spirit of the book could never change. This text was written to help our teaching of linear algebra keep up with the enormous importance of this subject—which just continues to grow.

One step was certainly possible and desirable—to add new problems. Teaching for all these years required hundreds of new exam questions (especially with quizzes going onto the web). I think you will approve of the extended choice of problems. The questions are still a mixture of explain and compute—the two complementary approaches to learning this beautiful subject.

I personally believe that many more people need linear algebra than calculus. Isaac Newton might not agree! But he isn't teaching mathematics in the 21st century (and maybe he wasn't a great teacher, but we will give him the benefit of the doubt). Certainly the laws of physics are well expressed by differential equations. Newton needed calculus—quite right. But the scope of science and engineering and management (and life) is now so much wider, and linear algebra has moved into a central place.

May I say a little more, because many universities have not yet adjusted the balance toward linear algebra. Working with curved lines and curved surfaces, the first step is always to linearize. Replace the curve by its tangent line, fit the surface by a plane, and the problem becomes linear. The power of this subject comes when you have ten variables, or 1000 variables, instead of two.

You might think I am exaggerating to use the word "beautiful" for a basic course in mathematics. Not at all. This subject begins with two vectors v and w, pointing in different directions. The key step is to take their linear combinations. We multiply to get 3v and 4w, and we add to get the particular combination 3v + 4w. That new vector is in the same plane as v and w. When we take all combinations, we are filling in the whole plane.
If I draw v and w on this page, their combinations cv + dw fill the page (and beyond), but they don't go up from the page. In the language of linear equations, I can solve cv + dw = b exactly when the vector b lies in the same plane as v and w.

Matrices

I will keep going a little more to convert combinations of three-dimensional vectors into linear algebra. If the vectors are v = (1, 2, 3) and w = (1, 3, 4), put them into the columns of a matrix:

             [1  1]
    matrix = [2  3] .
             [3  4]

To find combinations of those columns, "multiply" the matrix by a vector (c, d):

    Linear combinations cv + dw      [1  1] [c]     [1]     [1]
                                     [2  3] [d] = c [2] + d [3] .
                                     [3  4]         [3]     [4]

Those combinations fill a vector space. We call it the column space of the matrix. (For these two columns, that space is a plane.) To decide if b = (2, 5, 7) is on that plane, we have three components to get right. So we have three equations to solve:

    [1  1] [c]   [2]              c +  d = 2
    [2  3] [d] = [5]    means    2c + 3d = 5
    [3  4]       [7]             3c + 4d = 7.

I leave the solution to you. The vector b = (2, 5, 7) does lie in the plane of v and w. If the 7 changes to any other number, then b won't lie in the plane—it will not be a combination of v and w, and the three equations will have no solution.

Now I can describe the first part of the book, about linear equations Ax = b. The matrix A has n columns and m rows. Linear algebra moves steadily to n vectors in m-dimensional space. We still want combinations of the columns (in the column space). We still get m equations to produce b (one for each row). Those equations may or may not have a solution. They always have a least-squares solution.

The interplay of columns and rows is the heart of linear algebra. It's not totally easy, but it's not too hard. Here are four of the central ideas:

1. The column space (all combinations of the columns).

2. The row space (all combinations of the rows).

3. The rank (the number of independent columns) (or rows).

4. Elimination (the good way to find the rank of a matrix).

I will stop here, so you can start the course.

Web Pages

It may be helpful to mention the web pages connected to this book. So many messages come back with suggestions and encouragement, and I hope you will make free use of everything. You can directly access http://web.mit.edu/18.06, which is continually updated for the course that is taught every semester. Linear algebra is also on MIT's OpenCourseWare site http://ocw.mit.edu, where 18.06 became exceptional by including videos of the lectures (which you definitely don't have to watch...). Here is a part of what is available on the web:

1. Lecture schedule and current homeworks and exams with solutions.

2. The goals of the course, and conceptual questions.

3. Interactive Java demos (audio is now included for eigenvalues).

4. Linear Algebra Teaching Codes and MATLAB problems.

5. Videos of the complete course (taught in a real classroom).

The course page has become a valuable link to the class, and a resource for the students. I am very optimistic about the potential for graphics with sound. The bandwidth for voiceover is low, and FlashPlayer is freely available. This offers a quick review (with active experiment), and the full lectures can be downloaded. I hope professors and students worldwide will find these web pages helpful. My goal is to make this book as useful as possible with all the course material I can provide.
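For readers who like to experiment, here is a minimal MATLAB sketch checking the small system above (c + d = 2, 2c + 3d = 5, 3c + 4d = 7). It is an editorial addition, not part of the book; it only confirms numerically that b = (2, 5, 7) lies in the column space.

    % Check that b = (2, 5, 7) is a combination of the columns
    % v = (1, 2, 3) and w = (1, 3, 4).
    A = [1 1; 2 3; 3 4];     % columns v and w
    b = [2; 5; 7];
    cd = A \ b;              % solves the three equations for c and d
    disp(cd')                % both multipliers come out equal to 1
    disp(norm(A*cd - b))     % residual is (numerically) zero: b is in the plane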
Other Supporting Materials

Student Solutions Manual 0-495-01325-0. The Student Solutions Manual provides solutions to the odd-numbered problems in the text.

Instructor's Solutions Manual 0-030-10588-4. The Instructor's Solutions Manual has teaching notes for each chapter and solutions to all of the problems in the text.

Structure of the Course

The two fundamental problems are Ax = b and Ax = λx for square matrices A. The first problem Ax = b has a solution when A has independent columns. The second problem Ax = λx looks for independent eigenvectors. A crucial part of this course is to learn what "independence" means.

I believe that most of us learn first from examples. You can see that

        [1  1  2]
    A = [1  2  3]    does not have independent columns.
        [1  3  4]

Column 1 plus column 2 equals column 3. A wonderful theorem of linear algebra says that the three rows are not independent either. The third row must lie in the same plane as the first two rows. Some combination of rows 1 and 2 will produce row 3. You might find that combination quickly (I didn't). In the end I had to use elimination to discover that the right combination uses 2 times row 2, minus row 1.

Elimination is the simple and natural way to understand a matrix by producing a lot of zero entries. So the course starts there. But don't stay there too long! You have to get from combinations of the rows, to independence of the rows, to "dimension of the row space." That is a key goal, to see whole spaces of vectors: the row space and the column space and the nullspace.

A further goal is to understand how the matrix acts. When A multiplies x it produces the new vector Ax. The whole space of vectors moves—it is "transformed" by A. Special transformations come from particular matrices, and those are the foundation stones of linear algebra: diagonal matrices, orthogonal matrices, triangular matrices, symmetric matrices.

The eigenvalues of those matrices are special too. I think 2 by 2 matrices provide terrific examples of the information that eigenvalues λ can give. Sections 5.1 and 5.2 are worth careful reading, to see how Ax = λx is useful. Here is a case in which small matrices allow tremendous insight.

Overall, the beauty of linear algebra is seen in so many different ways:

1. Visualization. Combinations of vectors. Spaces of vectors. Rotation and reflection and projection of vectors. Perpendicular vectors. Four fundamental subspaces.

2. Abstraction. Independence of vectors. Basis and dimension of a vector space. Linear transformations. Singular value decomposition and the best basis.

3. Computation. Elimination to produce zero entries. Gram-Schmidt to produce orthogonal vectors. Eigenvalues to solve differential and difference equations.

4. Applications. Least-squares solution when Ax = b has too many equations. Difference equations approximating differential equations. Markov probability matrices (the basis for Google!). Orthogonal eigenvectors as principal axes (and more...).

To go further with those applications, may I mention the books published by Wellesley-Cambridge Press. They are all linear algebra in disguise, applied to signal processing and partial differential equations and scientific computing (and even GPS). If you look at http://www.wellesleycambridge.com, you will see part of the reason that linear algebra is so widely used.

After this preface, the book will speak for itself. You will see the spirit right away. The emphasis is on understanding—I try to explain rather than to deduce.
This is a book about real mathematics, not endless drill. In class, I am constantly working with examples to teach what students need.

Acknowledgments

I enjoyed writing this book, and I certainly hope you enjoy reading it. A big part of the pleasure comes from working with friends. I had wonderful help from Brett Coonley and Cordula Robinson and Erin Maneri. They created the LaTeX files and drew all the figures. Without Brett's steady support I would never have completed this new edition.

Earlier help with the Teaching Codes came from Steven Lee and Cleve Moler. Those follow the steps described in the book; MATLAB and Maple and Mathematica are faster for large matrices. All can be used (optionally) in this course. I could have added "Factorization" to that list above, as a fifth avenue to the understanding of matrices:

    [L, U, P] = lu(A)     for linear equations
    [Q, R] = qr(A)        to make the columns orthogonal
    [S, E] = eig(A)       to find eigenvectors and eigenvalues.

In giving thanks, I never forget the first dedication of this textbook, years ago. That was a special chance to thank my parents for so many unselfish gifts. Their example is an inspiration for my life.

And I thank the reader too, hoping you like this book.

Gilbert Strang

Chapter 1
Matrices and Gaussian Elimination

1.1 Introduction

This book begins with the central problem of linear algebra: solving linear equations. The most important case, and the simplest, is when the number of unknowns equals the number of equations. We have n equations in n unknowns, starting with n = 2:

    Two equations      1x + 2y = 3
    Two unknowns       4x + 5y = 6.       (1)

The unknowns are x and y. I want to describe two ways, elimination and determinants, to solve these equations. Certainly x and y are determined by the numbers 1, 2, 3, 4, 5, 6. The question is how to use those six numbers to solve the system.

1. Elimination. Subtract 4 times the first equation from the second equation. This eliminates x from the second equation, and it leaves one equation for y:

    (equation 2) − 4(equation 1)      −3y = −6.       (2)

Immediately we know y = 2. Then x comes from the first equation 1x + 2y = 3:

    Back-substitution      1x + 2(2) = 3   gives   x = −1.       (3)

Proceeding carefully, we check that x and y also solve the second equation. This should work and it does: 4 times (x = −1) plus 5 times (y = 2) equals 6.

2. Determinants. The solution y = 2 depends completely on those six numbers in the equations. There must be a formula for y (and also x). It is a "ratio of determinants" and I hope you will allow me to write it down directly:

        | 1  3 |
        | 4  6 |      1·6 − 3·4     −6
    y = -------   =  -----------  = --- = 2.       (4)
        | 1  2 |      1·5 − 2·4     −3
        | 4  5 |

That could seem a little mysterious, unless you already know about 2 by 2 determinants. They gave the same answer y = 2, coming from the same ratio of −6 to −3. If we stay with determinants (which we don't plan to do), there will be a similar formula to compute the other unknown, x:

        | 3  2 |
        | 6  5 |      3·5 − 2·6      3
    x = -------   =  -----------  = --- = −1.       (5)
        | 1  2 |      1·5 − 2·4     −3
        | 4  5 |

Let me compare those two approaches, looking ahead to real problems when n is much larger (n = 1000 is a very moderate size in scientific computing). The truth is that direct use of the determinant formula for 1000 equations would be a total disaster. It would use the million numbers on the left sides correctly, but not efficiently. We will find that formula (Cramer's Rule) in Chapter 4, but we want a good method to solve 1000 equations in Chapter 1.
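As a small numerical companion to equations (4) and (5), here is a MATLAB sketch, added editorially rather than taken from the text, that evaluates the two determinant ratios and compares them with a direct solve. For large n one would of course never form these determinants.

    % Cramer's rule for 1x + 2y = 3, 4x + 5y = 6, checked against elimination.
    A = [1 2; 4 5];
    b = [3; 6];
    y = det([1 3; 4 6]) / det(A);   % equation (4): ratio of determinants
    x = det([3 2; 6 5]) / det(A);   % equation (5)
    disp([x y])                     % x = -1, y = 2
    disp((A \ b)')                  % elimination (backslash) gives the same answer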
That good method is Gaussian Elimination. This is the algorithm that is constantly used to solve large systems of equations. From the examples in a textbook (n = 3 is close to the upper limit on the patience of the author and reader) you might not see much difference. Equations (2) and (4) used essentially the same steps to find y = 2. Certainly x came faster by the back-substitution in equation (3) than the ratio in (5). For larger n there is absolutely no question. Elimination wins (and this is even the best way to compute determinants).

The idea of elimination is deceptively simple—you will master it after a few examples. It will become the basis for half of this book, simplifying a matrix so that we can understand it. Together with the mechanics of the algorithm, we want to explain four deeper aspects in this chapter. They are:

1. Linear equations lead to geometry of planes. It is not easy to visualize a nine-dimensional plane in ten-dimensional space. It is harder to see ten of those planes, intersecting at the solution to ten equations—but somehow this is almost possible. Our example has two lines in Figure 1.1, meeting at the point (x, y) = (−1, 2). Linear algebra moves that picture into ten dimensions, where the intuition has to imagine the geometry (and gets it right).

2. We move to matrix notation, writing the n unknowns as a vector x and the n equations as Ax = b. We multiply A by "elimination matrices" to reach an upper triangular matrix U. Those steps factor A into L times U, where L is lower triangular. I will write down A and its factors for our example, and explain them at the right time (a short numerical check of this factorization appears after this list):

    Factorization      A = [1  2] = [1  0] [1   2] = L times U.       (6)
                           [4  5]   [4  1] [0  −3]

First we have to introduce matrices and vectors and the rules for multiplication. Every matrix has a transpose A^T. This matrix has an inverse A^(−1).

[Figure 1.1: The example has one solution. Singular cases have none or too many. Panels: x + 2y = 3 and 4x + 5y = 6, one solution (x, y) = (−1, 2); x + 2y = 3 and 4x + 8y = 6, parallel lines, no solution; x + 2y = 3 and 4x + 8y = 12, whole line of solutions.]

3. In most cases elimination goes forward without difficulties. The matrix has an inverse and the system Ax = b has one solution. In exceptional cases the method will break down—either the equations were written in the wrong order, which is easily fixed by exchanging them, or the equations don't have a unique solution. That singular case will appear if 8 replaces 5 in our example:

    Singular case              1x + 2y = 3
    Two parallel lines         4x + 8y = 6.       (7)

Elimination still innocently subtracts 4 times the first equation from the second. But look at the result!

    (equation 2) − 4(equation 1)      0 = −6.

This singular case has no solution. Other singular cases have infinitely many solutions. (Change 6 to 12 in the example, and elimination will lead to 0 = 0. Now y can have any value.) When elimination breaks down, we want to find every possible solution.

4. We need a rough count of the number of elimination steps required to solve a system of size n. The computing cost often determines the accuracy in the model. A hundred equations require a third of a million steps (multiplications and subtractions). The computer can do those quickly, but not many trillions. And already after a million steps, roundoff error could be significant. (Some problems are sensitive; others are not.) Without trying for full detail, we want to see large systems that arise in practice, and how they are actually solved.
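Here is the promised check of the factorization in equation (6), a short MATLAB sketch added editorially (not part of the text). MATLAB's own lu(A) may reorder the rows for numerical stability, so the factors it returns need not match the ones displayed above.

    % Verify the factorization A = L*U from equation (6).
    A = [1 2; 4 5];
    L = [1 0; 4 1];      % multiplier 4 recorded below the diagonal
    U = [1 2; 0 -3];     % pivots 1 and -3 on the diagonal
    disp(L*U - A)        % all zeros: L times U rebuilds A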
The final result of this chapter will be an elimination algorithm that is about as efficient as possible. It is essentially the algorithm that is in constant use in a tremendous variety of applications. And at the same time, understanding it in terms of matrices—the coefficient matrix A, the matrices E for elimination and P for row exchanges, and the final factors L and U—is an essential foundation for the theory. I hope you will enjoy this book and this course.

1.2 The Geometry of Linear Equations

The way to understand this subject is by example. We begin with two extremely humble equations, recognizing that you could solve them without a course in linear algebra. Nevertheless I hope you will give Gauss a chance:

    2x − y = 1
     x + y = 5.

We can look at that system by rows or by columns. We want to see them both.

The first approach concentrates on the separate equations (the rows). That is the most familiar, and in two dimensions we can do it quickly. The equation 2x − y = 1 is represented by a straight line in the x-y plane. The line goes through the points x = 1, y = 1 and x = 1/2, y = 0 (and also through (2, 3) and all intermediate points). The second equation x + y = 5 produces a second line (Figure 1.2a). Its slope is dy/dx = −1 and it crosses the first line at the solution.

The point of intersection lies on both lines. It is the only solution to both equations. That point x = 2 and y = 3 will soon be found by "elimination."

[Figure 1.2: Row picture (two lines) and column picture (combine columns). (a) The lines 2x − y = 1 and x + y = 5 meet at x = 2, y = 3. (b) The columns (2, 1) and (−1, 1) combine with coefficients 2 and 3 to give (1, 5).]

The second approach looks at the columns of the linear system. The two separate equations are really one vector equation:

    Column form      x [2] + y [−1] = [1] .
                       [1]     [ 1]   [5]

The problem is to find the combination of the column vectors on the left side that produces the vector on the right side. Those vectors (2, 1) and (−1, 1) are represented by the bold lines in Figure 1.2b. The unknowns are the numbers x and y that multiply the column vectors. The whole idea can be seen in that figure, where 2 times column 1 is added to 3 times column 2. Geometrically this produces a famous parallelogram. Algebraically it produces the correct vector (1, 5), on the right side of our equations. The column picture confirms that x = 2 and y = 3.

More time could be spent on that example, but I would rather move forward to n = 3. Three equations are still manageable, and they have much more variety:

    Three planes       2u +  v +  w =  5
                       4u − 6v      = −2
                      −2u + 7v + 2w =  9.       (1)

Again we can study the rows or the columns, and we start with the rows. Each equation describes a plane in three dimensions. The first plane is 2u + v + w = 5, and it is sketched in Figure 1.3. It contains the points (5/2, 0, 0) and (0, 5, 0) and (0, 0, 5). It is determined by any three of its points—provided they do not lie on a line.

[Figure 1.3: The row picture: three intersecting planes from three linear equations. The sloping plane 2u + v + w = 5 meets the vertical plane 4u − 6v = −2 in a line; the third plane crosses that line at the solution (1, 1, 2).]

Changing 5 to 10, the plane 2u + v + w = 10 would be parallel to this one.
It contains (5, 0, 0) and (0, 10, 0) and (0, 0, 10), twice as far from the origin—which is the center point u = 0, v = 0, w = 0. Changing the right side moves the plane parallel to itself, and the plane 2u + v + w = 0 goes through the origin.

The second plane is 4u − 6v = −2. It is drawn vertically, because w can take any value. The coefficient of w is zero, but this remains a plane in 3-space. (The equation 4u = 3, or even the extreme case u = 0, would still describe a plane.) The figure shows the intersection of the second plane with the first. That intersection is a line. In three dimensions a line requires two equations; in n dimensions it will require n − 1.

Finally the third plane intersects this line in a point. The plane (not drawn) represents the third equation −2u + 7v + 2w = 9, and it crosses the line at u = 1, v = 1, w = 2. That triple intersection point (1, 1, 2) solves the linear system.

How does this row picture extend into n dimensions? The n equations will contain n unknowns. The first equation still determines a "plane." It is no longer a two-dimensional plane in 3-space; somehow it has "dimension" n − 1. It must be flat and extremely thin within n-dimensional space, although it would look solid to us.

If time is the fourth dimension, then the plane t = 0 cuts through four-dimensional space and produces the three-dimensional universe we live in (or rather, the universe as it was at t = 0). Another plane is z = 0, which is also three-dimensional; it is the ordinary x-y plane taken over all time. Those three-dimensional planes will intersect! They share the ordinary x-y plane at t = 0. We are down to two dimensions, and the next plane leaves a line. Finally a fourth plane leaves a single point. It is the intersection point of 4 planes in 4 dimensions, and it solves the 4 underlying equations.

I will be in trouble if that example from relativity goes any further. The point is that linear algebra can operate with any number of equations. The first equation produces an (n − 1)-dimensional plane in n dimensions. The second plane intersects it (we hope) in a smaller set of "dimension n − 2." Assuming all goes well, every new plane (every new equation) reduces the dimension by one. At the end, when all n planes are accounted for, the intersection has dimension zero. It is a point, it lies on all the planes, and its coordinates satisfy all n equations. It is the solution!

Column Vectors and Linear Combinations

We turn to the columns. This time the vector equation (the same equation as (1)) is

    Column form      u [ 2] + v [ 1] + w [1] = [ 5] = b.       (2)
                       [ 4]     [−6]     [0]   [−2]
                       [−2]     [ 7]     [2]   [ 9]

Those are three-dimensional column vectors. The vector b is identified with the point whose coordinates are 5, −2, 9. Every point in three-dimensional space is matched to a vector, and vice versa. That was the idea of Descartes, who turned geometry into algebra by working with the coordinates of the point. We can write the vector in a column, or we can list its components as b = (5, −2, 9), or we can represent it geometrically by an arrow from the origin. You can choose the arrow, or the point, or the three numbers. In six dimensions it is probably easiest to choose the six numbers.

We use parentheses and commas when the components are listed horizontally, and square brackets (with no commas) when a column vector is printed vertically.
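A quick MATLAB check of the column form (2), added editorially and not part of the text: the combination with u = 1, v = 1, w = 2 reproduces b, and a direct solve recovers the same multipliers.

    % The three columns of equation (2) and the right side b.
    col1 = [2; 4; -2];
    col2 = [1; -6; 7];
    col3 = [1; 0; 2];
    b = [5; -2; 9];
    disp((1*col1 + 1*col2 + 2*col3)')   % prints 5 -2 9, the vector b
    A = [col1 col2 col3];
    disp((A \ b)')                      % prints 1 1 2, the multipliers u, v, w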
What really matters is addition of vectors and multiplication by a scalar (a number). In Figure 1.4a you see a vector addition, component by component:

    Vector addition      [5] + [ 0] + [0] = [ 5] .
                         [0]   [−2]   [0]   [−2]
                         [0]   [ 0]   [9]   [ 9]

In the right-hand figure there is a multiplication by 2 (and if it had been −2 the vector would have gone in the reverse direction):

    Multiplication by scalars      2 [1] = [2] ,      −2 [1] = [−2] .
                                     [0]   [0]           [0]   [ 0]
                                     [2]   [4]           [2]   [−4]

[Figure 1.4: The column picture: linear combination of columns equals b. (a) Add vectors along axes. (b) Add columns 1 + 2 + (3 + 3).]

Also in the right-hand figure is one of the central ideas of linear algebra. It uses both of the basic operations; vectors are multiplied by numbers and then added. The result is called a linear combination, and this combination solves our equation:

    Linear combination      1 [ 2] + 1 [ 1] + 2 [1] = [ 5] .
                              [ 4]     [−6]     [0]   [−2]
                              [−2]     [ 7]     [2]   [ 9]

Equation (2) asked for multipliers u, v, w that produce the right side b. Those numbers are u = 1, v = 1, w = 2. They give the correct combination of the columns. They also gave the point (1, 1, 2) in the row picture (where the three planes intersect).

Our true goal is to look beyond two or three dimensions into n dimensions. With n equations in n unknowns, there are n planes in the row picture. There are n vectors in the column picture, plus a vector b on the right side. The equations ask for a linear combination of the n columns that equals b. For certain equations that will be impossible. Paradoxically, the way to understand the good case is to study the bad one. Therefore we look at the geometry exactly when it breaks down, in the singular case.

    Row picture: Intersection of planes
    Column picture: Combination of columns

The Singular Case

Suppose we are again in three dimensions, and the three planes in the row picture do not intersect. What can go wrong? One possibility is that two planes may be parallel. The equations 2u + v + w = 5 and 4u + 2v + 2w = 11 are inconsistent—and parallel planes give no solution (Figure 1.5a shows an end view). In two dimensions, parallel lines are the only possibility for breakdown. But three planes in three dimensions can be in trouble without being parallel.

[Figure 1.5: Singular cases: no solution for (a), (b), or (d), an infinity of solutions for (c). End views: (a) two parallel planes, (b) no intersection, (c) line of intersection, (d) all planes parallel.]

The most common difficulty is shown in Figure 1.5b. From the end view the planes form a triangle. Every pair of planes intersects in a line, and those lines are parallel. The third plane is not parallel to the other planes, but it is parallel to their line of intersection. This corresponds to a singular system with b = (2, 5, 6):

    No solution, as in Figure 1.5b        u +  v +  w = 2
                                         2u      + 3w = 5
                                         3u +  v + 4w = 6.       (3)

The first two left sides add up to the third. On the right side that fails: 2 + 5 ≠ 6. Equation 1 plus equation 2 minus equation 3 is the impossible statement 0 = 1. Thus the equations are inconsistent, as Gaussian elimination will systematically discover.

Another singular system, close to this one, has an infinity of solutions.
When the 6 in the last equation becomes 7, the three equations combine to give 0 = 0. Now the third equation is the sum of the first two. In that case the three planes have a whole line in common (Figure 1.5c). Changing the right sides will move the planes in Figure 1.5b parallel to themselves, and for b = (2, 5, 7) the figure is suddenly different. The lowest plane moved up to meet the others, and there is a line of solutions. The system is still singular, but now it suffers from too many solutions instead of too few.

The extreme case is three parallel planes. For most right sides there is no solution (Figure 1.5d). For special right sides (like b = (0, 0, 0)!) there is a whole plane of solutions—because the three parallel planes move over to become the same.

What happens to the column picture when the system is singular? It has to go wrong; the question is how. There are still three columns on the left side of the equations, and we try to combine them to produce b. Stay with equation (3):

    Singular case: Column picture               u [1] + v [1] + w [1] = b.       (4)
    Three columns in the same plane               [2]     [0]     [3]
    Solvable only for b in that plane             [3]     [1]     [4]

For b = (2, 5, 7) this was possible; for b = (2, 5, 6) it was not. The reason is that those three columns lie in a plane. Then every combination is also in the plane (which goes through the origin). If the vector b is not in that plane, no solution is possible (Figure 1.6). That is by far the most likely event; a singular system generally has no solution. But there is a chance that b does lie in the plane of the columns. In that case there are too many solutions; the three columns can be combined in infinitely many ways to produce b. That column picture in Figure 1.6b corresponds to the row picture in Figure 1.5c.

[Figure 1.6: Singular cases: b outside or inside the plane with all three columns. (a) b not in the plane of the three columns: no solution. (b) b in that plane: infinity of solutions.]

How do we know that the three columns lie in the same plane? One answer is to find a combination of the columns that adds to zero. After some calculation, it is u = 3, v = −1, w = −2. Three times column 1 equals column 2 plus twice column 3. Column 1 is in the plane of columns 2 and 3. Only two columns are independent.

The vector b = (2, 5, 7) is in that plane of the columns—it is column 1 plus column 3—so (1, 0, 1) is a solution. We can add any multiple of the combination (3, −1, −2) that gives b = 0. So there is a whole line of solutions—as we know from the row picture.

The truth is that we knew the columns would combine to give zero, because the rows did. That is a fact of mathematics, not of computation—and it remains true in dimension n. If the n planes have no point in common, or infinitely many points, then the n columns lie in the same plane.

If the row picture breaks down, so does the column picture. That brings out the difference between Chapter 1 and Chapter 2. This chapter studies the most important problem—the nonsingular case—where there is one solution and it has to be found. Chapter 2 studies the general case, where there may be many solutions or none. In both cases we cannot continue without a decent notation (matrix notation) and a decent algorithm (elimination). After the exercises, we start with elimination.

Problem Set 1.2
1. For the equations x + y = 4, 2x − 2y = 4, draw the row picture (two intersecting lines) and the column picture (combination of two columns equal to the column vector (4, 4) on the right side).

2. Solve to find a combination of the columns that equals b:

    Triangular system      u − v − w = b1
                               v + w = b2
                                   w = b3.

3. (Recommended) Describe the intersection of the three planes u + v + w + z = 6 and u + w + z = 4 and u + w = 2 (all in four-dimensional space). Is it a line or a point or an empty set? What is the intersection if the fourth plane u = −1 is included? Find a fourth equation that leaves us with no solution.

4. Sketch these three lines and decide if the equations are solvable:

    3 by 2 system      x + 2y = 2
                       x −  y = 2
                            y = 1.

What happens if all right-hand sides are zero? Is there any nonzero choice of right-hand sides that allows the three lines to intersect at the same point?

5. Find two points on the line of intersection of the three planes t = 0 and z = 0 and x + y + z + t = 1 in four-dimensional space.

6. When b = (2, 5, 7), find a solution (u, v, w) to equation (4) different from the solution (1, 0, 1) mentioned in the text.

7. Give two more right-hand sides in addition to b = (2, 5, 7) for which equation (4) can be solved. Give two more right-hand sides in addition to b = (2, 5, 6) for which it cannot be solved.

8. Explain why the system

    u +  v +  w = 2
    u + 2v + 3w = 1
         v + 2w = 0

is singular by finding a combination of the three equations that adds up to 0 = 1. What value should replace the last zero on the right side to allow the equations to have solutions—and what is one of the solutions?

9. The column picture for the previous exercise (singular system) is

    u [1] + v [1] + w [1] = b.
      [1]     [2]     [3]
      [0]     [1]     [2]

Show that the three columns on the left lie in the same plane by expressing the third column as a combination of the first two. What are all the solutions (u, v, w) if b is the zero vector (0, 0, 0)?

10. (Recommended) Under what condition on y1, y2, y3 do the points (0, y1), (1, y2), (2, y3) lie on a straight line?

11. These equations are certain to have the solution x = y = 0. For which values of a is there a whole line of solutions?

    ax + 2y = 0
    2x + ay = 0

12. Starting with x + 4y = 7, find the equation for the parallel line through x = 0, y = 0. Find the equation of another line that meets the first at x = 3, y = 1.

Problems 13–15 are a review of the row and column pictures.

13. Draw the two pictures in two planes for the equations x − 2y = 0, x + y = 6.

14. For two linear equations in three unknowns x, y, z, the row picture will show (2 or 3) (lines or planes) in (two or three)-dimensional space. The column picture is in (two or three)-dimensional space. The solutions normally lie on a ______.

15. For four linear equations in two unknowns x and y, the row picture shows four ______. The column picture is in ______-dimensional space. The equations have no solution unless the vector on the right-hand side is a combination of ______.

16. Find a point with z = 2 on the intersection line of the planes x + y + 3z = 6 and x − y + z = 4. Find the point with z = 0 and a third point halfway between.

17. The first of these equations plus the second equals the third:

     x +  y +  z = 2
     x + 2y +  z = 3
    2x + 3y + 2z = 5.

The first two planes meet along a line. The third plane contains that line, because if x, y, z satisfy the first two equations then they also ______. The equations have infinitely many solutions (the whole line L).
Find three solutions.

18. Move the third plane in Problem 17 to a parallel plane 2x + 3y + 2z = 9. Now the three equations have no solution—why not? The first two planes meet along the line L, but the third plane doesn't ______ that line.

19. In Problem 17 the columns are (1, 1, 2) and (1, 2, 3) and (1, 1, 2). This is a "singular case" because the third column is ______. Find two combinations of the columns that give b = (2, 3, 5). This is only possible for b = (4, 6, c) if c = ______.

20. Normally 4 "planes" in four-dimensional space meet at a ______. Normally 4 column vectors in four-dimensional space can combine to produce b. What combination of (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1) produces b = (3, 3, 3, 2)? What 4 equations for x, y, z, t are you solving?

21. When equation 1 is added to equation 2, which of these are changed: the planes in the row picture, the column picture, the coefficient matrix, the solution?

22. If (a, b) is a multiple of (c, d) with abcd ≠ 0, show that (a, c) is a multiple of (b, d). This is surprisingly important: call it a challenge question. You could use numbers first to see how a, b, c, and d are related. The question will lead to:

    If A = [a  b] has dependent rows then it has dependent columns.
           [c  d]

23. In these equations, the third column (multiplying w) is the same as the right side b. The column form of the equations immediately gives what solution for (u, v, w)?

    6u + 7v + 8w = 8
    4u + 5v + 9w = 9
    2u − 2v + 7w = 7.

1.3 An Example of Gaussian Elimination

The way to understand elimination is by example. We begin in three dimensions:

    Original system       2u +  v +  w =  5
                          4u − 6v      = −2
                         −2u + 7v + 2w =  9.       (1)

The problem is to find the unknown values of u, v, and w, and we shall apply Gaussian elimination. (Gauss is recognized as the greatest of all mathematicians, but certainly not because of this invention, which probably took him ten minutes. Ironically, it is the most frequently used of all the ideas that bear his name.) The method starts by subtracting multiples of the first equation from the other equations. The goal is to eliminate u from the last two equations. This requires that we

(a) subtract 2 times the first equation from the second

(b) subtract −1 times the first equation from the third.

    Equivalent system      2u +  v +  w =   5
                             − 8v − 2w = −12
                               8v + 3w =  14.       (2)

The coefficient 2 is the first pivot. Elimination is constantly dividing the pivot into the numbers underneath it, to find out the right multipliers.

The pivot for the second stage of elimination is −8. We now ignore the first equation. A multiple of the second equation will be subtracted from the remaining equations (in this case there is only the third one) so as to eliminate v. We add the second equation to the third or, in other words, we

(c) subtract −1 times the second equation from the third.

The elimination process is now complete, at least in the "forward" direction:

    Triangular system      2u +  v +  w =   5
                             − 8v − 2w = −12
                                    1w =   2.       (3)

This system is solved backward, bottom to top. The last equation gives w = 2. Substituting into the second equation, we find v = 1. Then the first equation gives u = 1. This process is called back-substitution.

To repeat: Forward elimination produced the pivots 2, −8, 1. It subtracted multiples of each row from the rows beneath. It reached the "triangular" system (3), which is solved in reverse order: Substitute each newly computed value into the equations that are waiting.

Remark.
One good way to write down the forward elimination steps is to include the right-hand side as an extra column. There is no need to copy u and v and w and = at every step, so we are left with the bare minimum:

    [ 2   1   1    5]      [2   1   1    5]      [2   1   1    5]
    [ 4  −6   0   −2]  →   [0  −8  −2  −12]  →   [0  −8  −2  −12] .
    [−2   7   2    9]      [0   8   3   14]      [0   0   1    2]

At the end is the triangular system, ready for back-substitution. You may prefer this arrangement, which guarantees that operations on the left-hand side of the equations are also done on the right-hand side—because both sides are there together.

In a larger problem, forward elimination takes most of the effort. We use multiples of the first equation to produce zeros below the first pivot. Then the second column is cleared out below the second pivot. The forward step is finished when the system is triangular; equation n contains only the last unknown multiplied by the last pivot. Back-substitution yields the complete solution in the opposite order—beginning with the last unknown, then solving for the next to last, and eventually for the first.

By definition, pivots cannot be zero. We need to divide by them.

The Breakdown of Elimination

Under what circumstances could the process break down? Something must go wrong in the singular case, and something might go wrong in the nonsingular case. This may seem a little premature—after all, we have barely got the algorithm working. But the possibility of breakdown sheds light on the method itself.

The answer is: With a full set of n pivots, there is only one solution. The system is nonsingular, and it is solved by forward elimination and back-substitution. But if a zero appears in a pivot position, elimination has to stop—either temporarily or permanently. The system might or might not be singular.

If the first coefficient is zero, in the upper left corner, the elimination of u from the other equations will be impossible. The same is true at every intermediate stage. Notice that a zero can appear in a pivot position, even if the original coefficient in that place was not zero. Roughly speaking, we do not know whether a zero will appear until we try, by actually going through the elimination process.

In many cases this problem can be cured, and elimination can proceed. Such a system still counts as nonsingular; it is only the algorithm that needs repair. In other cases a breakdown is unavoidable. Those incurable systems are singular, they have no solution or else infinitely many, and a full set of pivots cannot be found.

Example 1. Nonsingular (cured by exchanging equations 2 and 3)

     u +  v +  w = …          u + v +  w = …          u + v +  w = …
    2u + 2v + 5w = …    →           3w = …      →       2v + 4w = …
    4u + 6v + 8w = …           2v + 4w = …                   3w = …

The system is now triangular, and back-substitution will solve it.

Example 2. Singular (incurable)

     u +  v +  w = …          u + v + w = …
    2u + 2v + 5w = …    →          3w = …
    4u + 4v + 8w = …               4w = …

There is no exchange of equations that can avoid zero in the second pivot position. The equations themselves may be solvable or unsolvable. If the last two equations are 3w = 6 and 4w = 7, there is no solution. If those two equations happen to be consistent—as in 3w = 6 and 4w = 8—then this singular case has an infinity of solutions. We know that w = 2, but the first equation cannot decide both u and v.

Section 1.5 will discuss row exchanges when the system is not singular. Then the exchanges produce a full set of pivots. Chapter 2 admits the singular case, and limps forward with elimination.
The 3w can still eliminate the 4w, and we will call 3 the second pivot. (There won't be a third pivot.) For the present we trust all n pivot entries to be nonzero, without changing the order of the equations. That is the best case, with which we continue.

The Cost of Elimination

Our other question is very practical. How many separate arithmetical operations does elimination require, for n equations in n unknowns? If n is large, a computer is going to take our place in carrying out the elimination. Since all the steps are known, we should be able to predict the number of operations.

For the moment, ignore the right-hand sides of the equations, and count only the operations on the left. These operations are of two kinds. We divide by the pivot to find out what multiple (say ℓ) of the pivot equation is to be subtracted. When we do this subtraction, we continually meet a "multiply-subtract" combination; the terms in the pivot equation are multiplied by ℓ, and then subtracted from another equation.

Suppose we call each division, and each multiplication-subtraction, one operation. In column 1, it takes n operations for every zero we achieve—one to find the multiple ℓ, and the other to find the new entries along the row. There are n − 1 rows underneath the first one, so the first stage of elimination needs n(n − 1) = n^2 − n operations. (Another approach to n^2 − n is this: All n^2 entries need to be changed, except the n in the first row.) Later stages are faster because the equations are shorter.

When the elimination is down to k equations, only k^2 − k operations are needed to clear out the column below the pivot—by the same reasoning that applied to the first stage, when k equaled n. Altogether, the total number of operations is the sum of k^2 − k over all values of k from 1 to n:

    Left side      (1^2 + · · · + n^2) − (1 + · · · + n) = n(n + 1)(2n + 1)/6 − n(n + 1)/2 = (n^3 − n)/3.

Those are standard formulas for the sums of the first n numbers and the first n squares. Substituting n = 1 and n = 2 and n = 100 into the formula (n^3 − n)/3, forward elimination can take no steps or two steps or about a third of a million steps:

    If n is at all large, a good estimate for the number of operations is n^3/3.

If the size is doubled, and few of the coefficients are zero, the cost is multiplied by 8.

Back-substitution is considerably faster. The last unknown is found in only one operation (a division by the last pivot). The second to last unknown requires two operations, and so on. Then the total for back-substitution is 1 + 2 + · · · + n.

Forward elimination also acts on the right-hand side (subtracting the same multiples as on the left to maintain correct equations). This starts with n − 1 subtractions of the first equation. Altogether the right-hand side is responsible for n^2 operations—much less than the n^3/3 on the left. The total for forward and back is

    Right side      [(n − 1) + (n − 2) + · · · + 1] + [1 + 2 + · · · + n] = n^2.

Thirty years ago, almost every mathematician would have guessed that a general system of order n could not be solved with much fewer than n^3/3 multiplications. (There were even theorems to demonstrate it, but they did not allow for all possible methods.) Astonishingly, that guess has been proved wrong. There now exists a method that requires only C n^(log2 7) multiplications! It depends on a simple fact: Two combinations of two vectors in two-dimensional space would seem to take 8 multiplications, but they can be done in 7.
That lowered the exponent from log2 8, which is 3, to log2 7 ≈ 2.8. This discovery produced tremendous activity to find the smallest possible power of n. The exponent finally fell (at IBM) below 2.376. Fortunately for elimination, the constant C is so large and the coding is so awkward that the new method is largely (or entirely) of theoretical interest. The newest problem is the cost with many processors in parallel.

Problem Set 1.3

Problems 1–9 are about elimination on 2 by 2 systems.

1. What multiple ℓ of equation 1 should be subtracted from equation 2?

     2x + 3y = 1
    10x + 9y = 11.

After this elimination step, write down the upper triangular system and circle the two pivots. The numbers 1 and 11 have no influence on those pivots.

2. Solve the triangular system of Problem 1 by back-substitution, y before x. Verify that x times (2, 10) plus y times (3, 9) equals (1, 11). If the right-hand side changes to (4, 44), what is the new solution?

3. What multiple of equation 1 should be subtracted from equation 2?

    2x − 4y = 6
    −x + 5y = 0.

After this elimination step, solve the triangular system. If the right-hand side changes to (−6, 0), what is the new solution?

4. What multiple ℓ of equation 1 should be subtracted from equation 2?

    ax + by = f
    cx + dy = g.

The first pivot is a (assumed nonzero). Elimination produces what formula for the second pivot? What is y? The second pivot is missing when ad = bc.

5. Choose a right-hand side which gives no solution and another right-hand side which gives infinitely many solutions. What are two of those solutions?

    3x + 2y = 10
    6x + 4y = ______.

6. Choose a coefficient b that makes this system singular. Then choose a right-hand side g that makes it solvable. Find two solutions in that singular case.

    2x + by = 16
    4x + 8y = g.

7. For which numbers a does elimination break down (a) permanently, and (b) temporarily?

    ax + 3y = −3
    4x + 6y = 6.

Solve for x and y after fixing the second breakdown by a row exchange.

8. For which three numbers k does elimination break down? Which is fixed by a row exchange? In each case, is the number of solutions 0 or 1 or ∞?

    kx + 3y = 6
    3x + ky = −6.

9. What test on b1 and b2 decides whether these two equations allow a solution? How many solutions will they have? Draw the column picture.

    3x − 2y = b1
    6x − 4y = b2.

Problems 10–19 study elimination on 3 by 3 systems (and possible failure).

10. Reduce this system to upper triangular form by two row operations:

    2x + 3y +  z = 8
    4x + 7y + 5z = 20
       − 2y + 2z = 0.

Circle the pivots. Solve by back-substitution for z, y, x.

11. Apply elimination (circle the pivots) and back-substitution to solve

    2x − 3y      = 3
    4x − 5y +  z = 7
    2x −  y − 3z = 5.

List the three row operations: Subtract ______ times row ______ from row ______.

12. Which number d forces a row exchange, and what is the triangular system (not singular) for that d? Which d makes this system singular (no third pivot)?

    2x + 5y + z = 0
    4x + dy + z = 2
         y − z = 3.

13. Which number b leads later to a row exchange? Which b leads to a missing pivot? In that singular case find a nonzero solution x, y, z.

    x + by     = 0
    x − 2y − z = 0
         y + z = 0.

14. (a) Construct a 3 by 3 system that needs two row exchanges to reach a triangular form and a solution.
(b) Construct a 3 by 3 system that needs a row exchange to keep going, but breaks down later.

15. If rows 1 and 2 are the same, how far can you get with elimination (allowing row exchange)?
If columns 1 and 2 are the same, which pivot is missing?

    2x − y + z = 0        2x + 2y + z = 0
    2x − y + z = 0        4x + 4y + z = 0
    4x + y + z = 2        6x + 6y + z = 2.

16. Construct a 3 by 3 example that has 9 different coefficients on the left-hand side, but rows 2 and 3 become zero in elimination. How many solutions to your system with b = (1, 10, 100) and how many with b = (0, 0, 0)?

17. Which number q makes this system singular and which right-hand side t gives it infinitely many solutions? Find the solution that has z = 1.

    x + 4y − 2z = 1
    x + 7y − 6z = 6
        3y + qz = t.

18. (Recommended) It is impossible for a system of linear equations to have exactly two solutions. Explain why.
(a) If (x, y, z) and (X, Y, Z) are two solutions, what is another one?
(b) If 25 planes meet at two points, where else do they meet?

19. Three planes can fail to have an intersection point, when no two planes are parallel. The system is singular if row 3 of A is a ______ of the first two rows. Find a third equation that can't be solved if x + y + z = 0 and x − 2y − z = 1.

Problems 20–22 move up to 4 by 4 and n by n.

20. Find the pivots and the solution for these four equations:

    2x +  y           = 0
     x + 2y +  z      = 0
          y + 2z +  t = 0
               z + 2t = 5.

21. If you extend Problem 20 following the 1, 2, 1 pattern or the −1, 2, −1 pattern, what is the fifth pivot? What is the nth pivot?

22. Apply elimination and back-substitution to solve

    2u + 3v      = 0
    4u + 5v +  w = 3
    2u −  v − 3w = 5.

What are the pivots? List the three operations in which a multiple of one row is subtracted from another.

23. For the system

    u +  v +  w = 2
    u + 3v + 3w = 0
    u + 3v + 5w = 2,

what is the triangular system after forward elimination, and what is the solution?

24. Solve the system and find the pivots when

    2u −  v            = 0
    −u + 2v −  w       = 0
        −  v + 2w −  z = 0
             −  w + 2z = 5.

You may carry the right-hand side as a fifth column (and omit writing u, v, w, z until the solution at the end).

25. Apply elimination to the system

     u +  v + w = −2
    3u + 3v − w = 6
     u −  v + w = −1.

When a zero arises in the pivot position, exchange that equation for the one below it and proceed. What coefficient of v in the third equation, in place of the present −1, would make it impossible to proceed—and force elimination to break down?

26. Solve by elimination the system of two equations x − y = 0, 3x + 6y = 18. Draw a graph representing each equation as a straight line in the x-y plane; the lines intersect at the solution. Also, add one more line—the graph of the new second equation which arises after elimination.

27. Find three values of a for which elimination breaks down, temporarily or permanently, in

    au +  v = 1
    4u + av = 2.

Breakdown at the first step can be fixed by exchanging rows—but not breakdown at the last step.

28. True or false:
(a) If the third equation starts with a zero coefficient (it begins with 0u) then no multiple of equation 1 will be subtracted from equation 3.
(b) If the third equation has zero as its second coefficient (it contains 0v) then no multiple of equation 2 will be subtracted from equation 3.
(c) If the third equation contains 0u and 0v, then no multiple of equation 1 or equation 2 will be subtracted from equation 3.

29. (Very optional) Normally the multiplication of two complex numbers

    (a + ib)(c + id) = (ac − bd) + i(bc + ad)

involves the four separate multiplications ac, bd, bc, ad. Ignoring i, can you compute ac − bd and bc + ad with only three multiplications?
(You may do additions, such as forming a + b before multiplying, without any penalty.)

30. Use elimination to solve

     u +  v +  w = 6             u +  v +  w = 7
     u + 2v + 2w = 11    and     u + 2v + 2w = 10
    2u + 3v − 4w = 3            2u + 3v − 4w = 3.

31. For which three numbers a will elimination fail to give three pivots?

    ax + 2y + 3z = b1
    ax + ay + 4z = b2
    ax + ay + az = b3.

32. Find experimentally the average size (absolute value) of the first and second and third pivots for MATLAB's lu(rand(3, 3)). The average of the first pivot from abs(A(1, 1)) should be 0.5.

1.4 Matrix Notation and Matrix Multiplication

With our 3 by 3 example, we are able to write out all the equations in full. We can list the elimination steps, which subtract a multiple of one equation from another and reach a triangular matrix. For a large system, this way of keeping track of elimination would be hopeless; a much more concise record is needed.

We now introduce matrix notation to describe the original system, and matrix multiplication to describe the operations that make it simpler. Notice that three different types of quantities appear in our example:

    Nine coefficients            2u +  v +  w =  5
    Three unknowns               4u − 6v      = −2       (1)
    Three right-hand sides      −2u + 7v + 2w =  9
In other words, the product of a 1 by n matrix (a row vector) and an n by 1 matrix (a column vector) is a 1 by 1 matrix:   h i 1 h i h i   Inner product 2 1 1 1 = 2 · 1 + 1 · 1 + 1 · 2 = 5 . 2 This confirms that the proposed solution x = (1, 1, 2) does satisfy the first equation. There are two ways to multiply a matrix A and a vector x. One way is a row at a time, Each row of A combines with x to give a component of Ax. There are three inner products when A has three rows:        7 1·2+1·5+6·0 1 1 6 2        (4) Ax by rows 3 0 1 5 = 3 · 2 + 0 · 5 + 3 · 0 = 6 . 7 1·2+1·5+4·0 1 1 4 0 That is how Ax is usually explained, but the second way is equally important. In fact it is more important! It does the multiplication a column at a time. The product Ax is found all at once, as a combination of the three columns of A:         1 1 6 7         Ax by columns 2 3 + 5 0 + 0 3 = 6 . (5) 1 1 4 7 The answer is twice column 1 plus 5 times column 2. It corresponds to the “column picture” of linear equations. If the right-hand side b has components 7, 6, 7, then the solution has components 2, 5, 0. Of course the row picture agrees with that (and we eventually have to do the same multiplications). The column rule will be used over and over, and we repeat it for emphasis: 1A Every product Ax can be found using whole columns as in equation (5). Therefore Ax is a combination of the columns of A. The coefficients are the components of x. To multiply A times x in n dimensions, we need a notation for the individual entries in A. The entry in the ith row and jth column is always denoted by ai j . The first subscript gives the row number, and the second subscript indicates the column. (In equation (4), a21 is 3 and a13 is 6.) If A is an m by n matrix, then the index i goes from 1 to m—there are m rows—and the index j goes from 1 to n. Altogether the matrix has mn entries, and amn is in the lower right corner. 24 Chapter 1 Matrices and Gaussian Elimination One subscript is enough for a vector. The jth component of x is denoted by x j . (The multiplication above had x1 = 2, x2 = 5, x3 = 0.) Normally x is written as a column vector—like an n by 1 matrix. But sometimes it is printed on a line, as in x = (2, 5, 0). The parentheses and commas emphasize that it is not a 1 by 3 matrix. It is a column vector, and it is just temporarily lying down. To describe the product Ax, we use the “sigma” symbol Σ for summation: n Sigma notation The ith component of Ax is ∑ ai j x j . j=1 This sum takes us along the ith row of A. The column index j takes each value from 1 to n and we add up the results—the sum is ai1 x1 + ai2 x2 + · · · + ain xn . We see again that the length of the rows (the number of columns in A) must match the length of x. An m by n matrix multiplies an n-dimensional vector (and produces an m-dimensional vector). Summations are simpler than writing everything out in full, but matrix notation is better. (Einstein used “tensor notation,” in which a repeated index automatically means summation. He wrote ai j x j or even aij x j , without the Σ. Not being Einstein, we keep the Σ.) The Matrix Form of One Elimination Step So far we have a convenient shorthand Ax = b for the original system of equations. What about the operations that are carried out during elimination? In our example, the first step subtracted 2 times the first equation from the second. On the right-hand side, 2 times the first component of b was subtracted from the second component. 
The same result is achieved if we multiply b by this elementary matrix (or elimination matrix):   1 0 0   Elementary matrix E = −2 1 0 . 0 0 1 This is verified just by obeying the rule for multiplying a matrix and a vector:      5 5 1 0 0      Eb = −2 1 0 −2 = −12 . 9 9 0 0 1 The components 5 and 9 stay the same (because of the 1, 0, 0 and 0, 0, 1 in the rows of E). The new second component −12 appeared after the first elimination step. It is easy to describe the matrices like E, which carry out the separate elimination steps. We also notice the “identity matrix,” which does nothing at all. 1B The identity matrix I, with 1s on the diagonal and 0s everywhere else, leaves every vector unchanged. The elementary matrix Ei j subtracts ` times 1.4 Matrix Notation and Matrix Multiplication 25 row j from row i. This Ei j includes −` in row i, column j.       1 0 0 b1 1 0 0       E31 =  0 1 0 has E31 b =  b2  . I = 0 1 0 has Ib = b b3 − `b1 −` 0 1 0 0 1 Ib = b is the matrix analogue of multiplying by 1. A typical elimination step multiplies by E31 . The important question is: What happens to A on the lefthand side? To maintain equality, we must apply the same operation to both sides of Ax = b. In other words, we must also multiply the vector Ax by the matrix E. Our original matrix E subtracts 2 times the first component from the second, After this step the new and simpler system (equivalent to the old) is just E(Ax) = Eb. It is simpler because of the zero that was created below the first pivot. It is equivalent because we can recover the original system (by adding 2 times the first equation back to the second). So the two systems have exactly the same solution x. Matrix Multiplication Now we come to the most important question: How do we multiply two matrices? There is a partial clue from Gaussian elimination: We know the original coefficient matrix A, we know the elimination matrix E, and we know the result EA after the elimination step. We hope and expect that       2 1 1 2 1 1 1 0 0       E = −2 1 0 times A =  4 −6 0 gives EA =  0 −8 −2 . −2 7 2 −2 7 2 0 0 1 Twice the first row of A has been subtracted from the second row. Matrix multiplication is consistent with the row operations of elimination. We can write the result either as E(Ax) = Eb, applying E to both sides of our equation, or as (EA)x = Eb. The matrix EA is constructed exactly so that these equations agree, and we don’t need parentheses: Matrix multiplication (EA times x) equals (E times Ax). We just write EAx. This is the whole point of an “associative law” like 2 × (3 × 4) = (2 × 3) × 4. The law seems so obvious that it is hard to imagine it could be false. But the same could be said of the “commutative law” 2 × 3 = 3 × 2—and for matrices EA is not AE. There is another requirement on matrix multiplication. We know how to multiply Ax, a matrix and a vector. The new definition should be consistent with that one. When a matrix B contains only a single column x, the matrix-matrix product AB should be identical with the matrix-vector product Ax. More than that: When B contains several 26 Chapter 1 Matrices and Gaussian Elimination columns b1 , b2 , b3 , the columns of AB should be Ab1 , Ab2 , Ab3 !     b1 Ab1     Multiplication by columns AB = A b2  = Ab2  . b3 Ab3 Our first requirement had to do with rows, and this one is concerned with columns. A third approach is to describe each individual entry in AB and hope for the best. 
In fact, there is only one possible rule, and I am not sure who discovered it. It makes everything work. It does not allow us to multiply every pair of matrices. If they are square, they must have the same size. If they are rectangular, they must not have the same shape; the number of columns in A has to equal the number of rows in B. Then A can be multiplied into each column of B. If A is m by n, and B is n by p, then multiplication is possible. The product AB will be m by p. We now find the entry in row i and column j of AB. 1C The i, j entry of AB is the inner product of the ith row of A and the jth column of B. In Figure 1.7, the 3, 2 entry of AB comes from row 3 and column 2: (AB)32 = a31 b12 + a32 b22 + a33 b32 + a34 b42 . (6) Figure 1.7: A 3 by 4 matrix A times a 4 by 2 matrix B is a 3 by 2 matrix AB. Note. We write AB when the matrices have nothing special to do with elimination. Our earlier example was EA, because of the elementary matrix E. Later we have PA, or LU, or even LDU. The rule for matrix multiplication stays the same. Example 1. # # " #" " 17 1 0 2 3 1 2 0 . = AB = 4 8 0 4 0 5 −1 0 The entry 17 is (2)(1) + (3)(5), the inner product of the first row of A and first column of B. The entry 8 is (4)(2) + (0)(−1), from the second row and second column. The third column is zero in B, so it is zero in AB. B consists of three columns side by side, and A multiplies each column separately. Every column of AB is a combination of the columns of A. Just as in a matrix-vector multiplication, the columns of A are multiplied by the entries in B. 1.4 Matrix Notation and Matrix Multiplication 27 Example 2. " Row exchange matrix 0 1 1 0 #" # " # 2 3 7 8 = . 7 8 2 3 Example 3. The 1s in the identity matrix I leave every matrix unchanged: Identity matrix IA = A and BI = B. Important: The multiplication AB can also be done a row at a time. In Example 1, the first row of AB uses the numbers 2 and 3 from the first row of A. Those numbers give 2[row 1] + 3[row 2] = [17 1 0]. Exactly as in elimination, where all this started, each row of AB is a combination of the rows of B. We summarize these three different ways to look at matrix multiplication. 1D (i) Each entry of AB is the product of a row and a column: (AB)i j = (row i of A) times (column j of B) (ii) Each column of AB is the product of a matrix and a column: column j of AB = A times (column j of B) (iii) Each row of AB is the product of a row and a matrix: row i of AB = (row i of A) times B. This leads hack to a key property of matrix multiplication. Suppose the shapes of three matrices A, B, C (possibly rectangular) permit them to be multiplied. The rows in A and B multiply the columns in B and C. Then the key property is this: 1E Matrix multiplication is associative: (AB)C = A(BC). Just write ABC. AB times C equals A times BC. If C happens to be just a vector (a matrix with only one column) this is the requirement (EA)x = E(Ax) mentioned earlier. It is the whole basis for the laws of matrix multiplication. And if C has several columns, we have only to think of them placed side by side, and apply the same rule several times. Parentheses are not needed when we multiply several matrices. There are two more properties to mention—one property that matrix multiplication has, and another which it does not have. The property that it does possess is: 1F Matrix operations are distributive: A(B +C) = AB + AC and (B +C)D = BD +CD. 
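These laws are easy to test numerically. Here is a small MATLAB check (only a sketch; the matrices C and B2 are chosen purely for illustration and are not from the text) using A and B from Example 1:

A = [2 3; 4 0];
B = [1 2 0; 5 -1 0];            % Example 1: A*B = [17 1 0; 4 8 0]
C = [1 0; 0 2; 1 1];            % a 3 by 2 matrix, so (AB)C and A(BC) are defined
B2 = [0 1 1; 2 0 3];            % same shape as B, for the distributive law

AB = A*B                        % the product from Example 1
isequal(AB(:,2), A*B(:,2))      % rule 1D(ii): column 2 of AB is A times column 2 of B
isequal((A*B)*C, A*(B*C))       % associative law 1E
isequal(A*(B+B2), A*B + A*B2)   % distributive law 1F

With these integer matrices every test returns 1 (true), since no roundoff is involved.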
28 Chapter 1 Matrices and Gaussian Elimination Of course the shapes of these matrices must match properly—B and C have the same shape, so they can be added, and A and D are the right size for premultiplication and postmultiplication. The proof of this law is too boring for words. The property that fails to hold is a little more interesting: 1G Matrix multiplication is not commutative: Usually FE 6= EF. Example 4. Suppose E subtracts twice the first equation from the second. Suppose F is the matrix for the next step, to add row 1 to row 3:     1 0 0 1 0 0     E = −2 1 0 and F = 0 1 0 . 0 0 1 1 0 1 These two matrices do commute and the product does both steps at once:   1 0 0   EF = −2 1 0 = FE. 1 0 1 In either order, EF or FE, this changes rows 2 and 3 using row 1. Example 5. Suppose E is the same but G adds row 2 to row 3. Now the order makes a difference. When we apply E and then G, the second row is altered before it affects the third. If E comes after G, then the third equation feels no effect from the first. You will see a zero in the (3, 1) entry of EG, where there is a −2 in GE:        1 0 0 1 0 0 1 0 0 1 0 0        GE = 0 1 0 −2 1 0 = −2 1 0 but EG = −2 1 0 . 0 1 1 0 0 1 −2 1 1 0 1 1 Thus EG 6= GE. A random example would show the same thing—most matrices don’t commute. Here the matrices have meaning. There was a reason for EF = FE, and a reason for EG 6= GE. It is worth taking one more step, to see what happens with all three elimination matrices at once:     1 0 0 1 0 0     GFE = −2 1 0 and EFG = −2 1 0 . −1 1 1 −1 1 1 The product GFE is the true order of elimination. It is the matrix that takes the original A to the upper triangular U. We will see it again in the next section. The other matrix EFG is nicer. In that order, the numbers −2 from E and 1 from F and G were not disturbed. They went straight into the product. It is the wrong order for elimination. But fortunately it is the right order for reversing the elimination steps— which also comes in the next section. Notice that the product of lower triangular matrices is again lower triangular. 1.4 Matrix Notation and Matrix Multiplication 29 Problem Set 1.4 1. Compute the products       " #" # 5 1 0 0 4 0 1 3 2 0 1       . 0 1 0 4 and 0 1 0 −2 and 1 3 1 0 0 1 3 4 0 1 5 For the third one, draw the column vectors (2, 1) and (0, 3). Multiplying by (1, 1) just adds the vectors (do it graphically). 2. Working a column at a time, compute the products        1 2 3 0 4 3 "1# 4 1 " #  1       and 4 5 6 1 and 6 6 21 . 5 1 3 7 8 9 0 8 9 3 6 1 3. Find two inner products and a matrix product:       1 3 1 h h i h i i       1 −2 7 −2 and 1 −2 7 5 and −2 3 5 1 . 7 1 7 The first gives the length of the vector (squared). 4. If an m by n matrix A multiplies an n-dimensional vector x, how many separate multiplications are involved? What if A multiplies an n by p matrix B? 5. Multiply Ax to find a solution vector x to the system Ax = zero vector. Can you find more solutions to Ax = 0?    3 −6 0 2    Ax = 0 2 −2 1 . 1 −1 −1 1 6. Write down the 2 by 2 matrices A and B that have entries ai j = i+ j and bi j = (−1)i+ j . Multiply them to find AB and BA. 7. Give 3 by 3 examples (not just the zero matrix) of (a) a diagonal matrix: ai j = 0 if i 6= j. (b) a symmetric matrix: ai j = a ji for all i and j. (c) an upper triangular matrix: ai j = 0 if i > j. (d) a skew-symmetric matrix: ai j = −a ji for all i and j. 8. 
Do these subroutines multiply Ax by rows or columns? Start with B(I) = 0: 30 Chapter 1 Matrices and Gaussian Elimination DO 10 I = 1, N DO 10 J = 1, N 10 B(I) = B(I) + A(I,J) * X(J) DO 10 J = 1, N DO 10 I = 1, N 10 B(I) = B(I) + A(I,J) * X(J) The outputs Bx = Ax are the same. The second code is slightly more efficient in FORTRAN and much more efficient on a vector machine (the first changes single entries B(I), the second can update whole vectors). 9. If the entries of A are ai j , use subscript notation to write (a) the first pivot. (b) the multiplier `i1 of row 1 to be subtracted from row i. (c) the new entry that replaces ai j after that subtraction. (d) the second pivot. 10. True or false? Give a specific counterexample when false. (a) If columns 1 and 3 of B are the same, so are columns 1 and 3 of AB. (b) If rows 1 and 3 of B are the same, so are rows 1 and 3 of AB. (c) If rows 1 and 3 of A are the same, so are rows 1 and 3 of AB. (d) (AB)2 = A2 B2 . 11. The first row of AB is a linear combination of all the rows of B. What are the coefficients in this combination, and what is the first row of AB, if   " # 1 1 2 1 4   A= and B = 0 1? 0 −1 1 1 0 12. The product of two lower triangular matrices is again lower triangular (all its entries above the main diagonal are zero). Confirm this with a 3 by 3 example, and then explain how it follows from the laws of matrix multiplication. 13. By trial and error find examples of 2 by 2 matrices such that (a) A2 = −I, A having only real entries. (b) B2 = 0, although B 6= 0. (c) CD = −DC, not allowing the case CD = 0. (d) EF = 0, although no entries of E or F are zero. 14. Describe the rows of EA and the columns of AE if # " 1 7 . E= 0 1 1.4 Matrix Notation and Matrix Multiplication 31 15. Suppose A commutes with every 2 by 2 matrix (AB = BA), and in particular # " # " # " a b 1 0 0 1 commutes with B1 = and B2 = . A= c d 0 0 0 0 Show that a = d and b = c = 0. If AB = BA for all matrices B, then A is a multiple of the identity. 16. Let x be the column vector (1, 0, . . . , 0). Show that the rule (AB)x = A(Bx) forces the first column of AB to equal A times the first column of B. 17. Which of the following matrices are guaranteed to equal (A + B)2 ? A2 + 2AB + B2 , A(A + B) + B(A + B), (A + B)(B + A), A2 + AB + BA + B2 . 18. If A and B are n by n matrices with all entries equal to 1, find (AB)i j . Summation notation turns the product AB, and the law (AB)C = A(BC), into ! ! à à (AB)i j = ∑ aik bk j k ∑ ∑ aik bk j j k c jl = ∑ aik k ∑ bk j c jl . j Compute both sides if C is also n by n, with every c jl = 2. 19. A fourth way to multiply matrices is columns of A times rows of B: AB = (column 1)(row 1) + · · · + (column n)(row n) = sum of simple matrices. Give a 2 by 2 example of this important rule for matrix multiplication. 20. The matrix that rotates the x-y plane by an angle θ is " # cos θ − sin θ A(θ ) = . sin θ cos θ Verify that A(θ1 )A(θ2 ) = A(θ1 + θ2 ) from the identities for cos(θ1 + θ2 ) and sin(θ1 + θ2 ). What is A(θ ) times A(−θ )? 21. Find the powers A2 , A3 (A2 times A), and B2 , B3 , C2 , C3 . What are Ak , Bk , and Ck ? " # " # " # 1 1 1 1 1 0 −2 A = 21 21 and B = and C = AB = 21 1 0 −1 2 2 2 −2 Problems 22–31 are about elimination matrices. 22. Write down the 3 by 3 matrices that produce these elimination steps: (a) E21 subtracts 5 times row 1 from row 2. (b) E32 subtracts −7 times row 2 from row 3. 32 Chapter 1 Matrices and Gaussian Elimination (c) P exchanges rows 1 and 2, then rows 2 and 3. 23. 
In Problem 22, applying E21 and then E32 to the column b = (1, 0, 0) gives E32 E21 b = . Applying E32 before E21 gives E21 E32 b = . When E32 comes first, row feels no effect from row . 24. Which three matrices E21 , E31 , E32 put A into triangular form U?   1 1 0   A =  4 6 1 and E32 E31 E21 A = U. −2 2 0 Multiply those E’s to get one matrix M that does elimination: MA = U. 25. Suppose a33 = 7 and the third pivot is 5. If you change a33 to 11, the third pivot is . If you change a33 to , there is zero in the pivot position. 26. If every column of A is a multiple of (1, 1, 1), then Ax is always a multiple of (1, 1, 1). Do a 3 by 3 example. How many pivots are produced by elimination? 27. What matrix E31 subtracts 7 times row 1 from row 3? To reverse that step, R31 should 7 times row to row . Multiply E31 by R31 . 28. (a) E21 subtracts row 1 from row 2 and then P23 exchanges rows 2 and 3. What matrix M = P23 E21 does both steps at once? (b) P23 exchanges rows 2 and 3 and then E31 subtracts row I from row 3. What matrix M = E31 P23 does both steps at once? Explain why the M’s are the same but the E’s are different. 29. (a) What 3 by 3 matrix E13 will add row 3 to row 1? (b) What matrix adds row 1 to row 3 and at the same time adds row 3 to row 1? (c) What matrix adds row 1 to row 3 and then adds row 3 to row 1? 30. Multiply these matrices:     0 0 1 1 2 3 0 0 1     0 1 0 4 5 6 0 1 0 1 0 0 7 8 9 1 0 0   1 0 0 1 2 3    −1 1 0 1 3 1 . −1 0 1 1 4 0  and 31. This 4 by 4 matrix needs which elimination matrices E21 and E32 and E43 ?   2 −1 0 0 −1 2 −1 0    A= .  0 −1 2 −1 0 0 −1 2 Problems 32–44 are about creating and multiplying matrices 1.4 Matrix Notation and Matrix Multiplication 33 32. Write these ancient problems in a 2 by 2 matrix form Ax = b and solve them: (a) X is twice as old as Y and their ages add to 39, (b) (x, y) = (2, 5) and (3, 7) lie on the line y = mx + c. Find m and c. 33. The parabola y = a + bx + cx2 goes through the points (x, y) = (1, 4) and (2, 8) and (3, 14). Find and solve a matrix equation for the unknowns (a, b, c). 34. Multiply these matrices in the orders EF and FE and E 2 :     1 0 0 1 0 0     E = a 1 0 F = 0 1 0 . b 0 1 0 c 1 35. (a) Suppose all columns of B are the same. Then all columns of EB are the same, because each one is E times . (b) Suppose all rows of B are [1 2 4]. Show by example that all rows of EB are not [1 2 4]. It is true that those rows are . 36. If E adds row 1 to row 2 and F adds row 2 to row 1, does EF equal FE? 37. The first component of Ax is ∑ a1 j x j = a11 x1 + · · · + a1n xn . Write formulas for the third component of Ax and the (1, 1) entry of A2 . 38. If AB = I and BC = I, use the associative law to prove A = C. 39. A is 3 by 5, B is 5 by 3, C is 5 by 1, and D is 3 by 1. All entries are 1. Which of these matrix operations are allowed, and what are the results? BA AB ABD DBA A(B +C). 40. What rows or columns or matrices do you multiply to find (a) the third column of AB? (b) the first row of AB? (c) the entry in row 3, column 4 of AB? (d) the entry in row 1, column 1 of CDE? 41. (3 by 3 matrices) Choose the only B so that for every matrix A, (a) BA = 4A. (b) BA = 4B. (c) BA has rows 1 and 3 of A reversed and row 2 unchanged. (d) All rows of BA are the same as row 1 of A. 42. True or false? 34 Chapter 1 Matrices and Gaussian Elimination (a) If A2 is defined then A is necessarily square. (b) If AB and BA are defined then A and B are square. 
(c) If AB and BA are defined then AB and BA are square. (d) If AB = B then A = I. 43. If A is m by n, how many separate multiplications are involved when (a) A multiplies a vector x with n components? (b) A multiplies an n by p matrix B? Then AB is m by p. (c) A multiplies itself to produce A2 ? Here m = n. 44. To prove that (AB)C = A(BC), use the column vectors b1 , . . . , bn of B. First suppose that C has only one column c with entries c1 , . . . , cn : AB has columns Ab1 , . . . , Abn , and Bc has one column c1 b1 + · · · + cn bn . Then (AB)c = c1 Ab1 + · · · + cn Abn , equals A(c1 b1 + · · · + cn bn ) = A(Bc). Linearity gives equality of those two sums, and (AB)c = A(Bc). The same is true for of C. Therefore (AB)C = A(BC). all other Problems 45–49 use column-row multiplication and block multiplication. 45. Multiply AB using columns times rows:     # 1 0 " 1 h i   3 3 0   AB = 2 4 = 2 3 3 0 + 1 2 1 2 1 2 = . 46. Block multiplication separates matrices into blocks (submatrices). If their shapes make block multiplication possible, then it is allowed. Replace these x’s by numbers and confirm that block multiplication succeeds.    " # x x x x x x h h i C i    = AC + BD and  x x x   x x x  . A B D x x x x x x 47. Draw the cuts in A and B and AB to show how each of the four multiplication rules is really a block multiplication to find AB: (a) Matrix A times columns of B. (b) Rows of A times matrix B. (c) Rows of A times columns of B. (d) Columns of A times rows of B. 1.4 Matrix Notation and Matrix Multiplication 35 48. Block multiplication says that elimination on column 1 produces " #" # " # 1 0 a b a b EA = . = −c/a I c D 0 49. Elimination for a 2 by 2 block matrix: When A−1 A = I, multiply the first block row by CA−1 and subtract from the second row, to find the “Schur complement” S: " #" # " # I 0 A B A B = . −CA−1 I C D 0 S 50. With i2 = −1, the product (A + iB)(x + iy) is Ax + iBx + iAy − By. Use blocks to separate the real part from the imaginary part that multiplies i: " #" # " # A −B x Ax − By real part = ? ? y ? imaginary part 51. Suppose you solve Ax = b for three special right-hand sides b:       1 0 0       Ax1 = 0 and Ax2 = 1 and Ax3 = 0 . 0 0 1 If the solutions x1 , x2 , x3 are the columns of a matrix X, what is AX? 52. If the three solutions in Question 51 are x1 = (1, 1, 1) and x2 = (0, 1, 1) and x3 = (0, 0, 1), solve Ax = b when b = (3, 5, 8). Challenge problem: What is A? 53. Find all matrices " A= a b c d # " that satisfy # " # 1 1 1 1 A = A. 1 1 1 1 54. If you multiply a northwest matrix A and a southeast matrix B, what type of matrices are AB and BA? “Northwest” and “southeast” mean zeros below and above the antidiagonal going from (1, n) to (n, 1). 55. Write 2x + 3y + z + 5t = 8 as a matrix A (how many rows?) multiplying the column vector (x, y, z,t) to produce b. The solutions fill a plane in four-dimensional space. The plane is three-dimensional with no 4D volume. 56. What 2 by 2 matrix P1 projects the vector (x, y) onto the x axis to produce (x, 0)? What matrix P2 projects onto the y axis to produce (0, y)? If you multiply (5, 7) by P1 and then multiply by P2 , you get ( ) and ( ). 57. Write the inner product of (1, 4, 5) and (x, y, z) as a matrix multiplication Ax. A has one row. The solutions to Ax = 0 lie on a perpendicular to the vector . The columns of A are only in -dimensional space. 36 Chapter 1 Matrices and Gaussian Elimination 58. 
In MATLAB notation, write the commands that define the matrix A and the column vectors x and b. What command would test whether or not Ax = b? " # " # " # 1 2 5 1 A= x= b= 3 4 −2 7 59. The MATLAB commands A = eye(3) and v = [3:5]’ produce the 3 by 3 identity matrix and the column vector (3, 4, 5). What are the outputs from A ∗ v and v’ ∗ v? (Computer not needed!) If you ask for v ∗ A, what happens? 60. If you multiply the 4 by 4 all-ones matrix A = ones(4,4) and the column v = ones(4,1), what is A ∗ v? (Computer not needed.) If you multiply B = eye(4) + ones(4,4) times w = zeros(4,1) + 2 ∗ ones(4,1), what is B ∗ w? 61. Invent a 3 by 3 magic matrix M with entries 1, 2, . . . , 9. All rows and columns and diagonals add to 15. h The first i row could be 8, 3, 4. What is M times (1, 1, 1)? What is the row vector 1 1 1 times M? 1.5 Triangular Factors and Row Exchanges We want to look again at elimination, to see what it means in terms of matrices. The starting point was the model system Ax = b:      2 1 1 u 5      Ax =  4 −6 0  v  = −2 = b. (1) −2 7 2 w 9 Then there were three elimination steps, with multipliers 2, −1, −1: Step 1. Subtract 2 times the first equation from the second; Step 2. Subtract −1 times the first equation from the third; Step 3. Subtract −1 times the second equation from the third. The result was an equivalent system Ux = c, with a new coefficient matrix U:      5 u 2 1 1      Upper triangular Ux = 0 −8 −2  v  = −12 = c. 2 w 0 0 1 (2) This matrix U is upper triangular—all entries below the diagonal are zero. The new right side c was derived from the original vector b by the same steps that took A into U. Forward elimination amounted to three row operations: 1.5 Triangular Factors and Row Exchanges 37 Start with A and b; Apply steps 1, 2, 3 in that order; End with U and c. Ux = c is solved by back-substitution. Here we concentrate on connecting A to U. The matrices E for step 1, F for step 2, and G for step 3 were introduced in the previous section. They are called elementary matrices, and it is easy to see how they work. To subtract a multiple ` of equation j from equation i, put the number −` into the (i, j) position. Otherwise keep the identity matrix, with 1s on the diagonal and 0s elsewhere. Then matrix multiplication executes the row operation. The result of all three steps is GFEA = U. Note that E is the first to multiply A, then F, then G. We could multiply GFE together to find the single matrix that takes A to U (and also takes b to c). It is lower triangular (zeros are omitted):       1 1 1 1       From A to U GFE =  1   1  −2 1  = −2 1  . (3) 1 1 1 1 1 −1 1 1 This is good, but the most important question is exactly the opposite: How would we get from U back to A? How can we undo the steps of Gaussian elimination? To undo step 1 is not hard. Instead of subtracting, we add twice the first row to the second. (Not twice the second row to the first!) The result of doing both the subtraction and the addition is to bring back the identity matrix:      1 0 0 1 0 0 1 0 0 Inverse of      (4) subtraction 2 1 0 −2 1 0 = 0 1 0 . 0 0 1 0 0 1 0 0 1 is addition One operation cancels the other. In matrix terms, one matrix is the inverse of the other. If the elementary matrix E has the number −` in the (i, j) position, then its inverse E −1 has +` in that position. Thus E −1 E = I, which is equation (4). We can invert each step of elimination, by using E −1 and F −1 and G−1 . 
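In MATLAB these matrices can be written down and checked directly (a small sketch; E, F, G are the elementary matrices for steps 1, 2, 3 above):

A = [ 2  1  1;  4 -6  0; -2  7  2 ];
E = [ 1 0 0; -2 1 0;  0 0 1 ];    % step 1: subtract 2 times row 1 from row 2
F = [ 1 0 0;  0 1 0;  1 0 1 ];    % step 2: subtract -1 times row 1 from row 3
G = [ 1 0 0;  0 1 0;  0 1 1 ];    % step 3: subtract -1 times row 2 from row 3

U = G*F*E*A        % the upper triangular matrix of equation (2)
inv(E)             % +2 replaces -2: addition undoes the subtraction
inv(E)*E           % the identity, as in equation (4)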
I think it’s not bad to see these inverses now, before the next section. The final problem is to undo the whole process at once, and see what matrix takes U back to A. Since step 3 was last in going from A to U, its matrix G must be the first to be inverted in the reverse direction. Inverses come in the opposite order! The second reverse step is F −1 and the last is E −1 : From U back to A E −1 F −1 G−1U = A is LU = A. (5) You can substitute GFEA for U, to see how the inverses knock out the original steps. Now we recognize the matrix L that takes U back to A. It is called L, because it is lower triangular. And it has a special property that can be seen only by multiplying the 38 Chapter 1 Matrices and Gaussian Elimination three inverse matrices in the right order:   1 1   −1 −1 −1 E F G = 2 1   1 1 −1     1 1     1 1  = 2  = L. 1 −1 1 −1 −1 1 (6) The special thing is that the entries below the diagonal are the multipliers ` = 2, −1, and −1. When matrices are multiplied, there is usually no direct way to read off the answer. Here the matrices come in just the right order so that their product can be written down immediately. If the computer stores each multiplier `i j —the number that multiplies the pivot row j when it is subtracted from row i, and produces a zero in the i, j position—then these multipliers give a complete record of elimination. The numbers `i j fit right into the matrix L that takes U back to A. 1H Triangular factorization A = LU with no exchanges of rows. L is lower triangular, with 1s on the diagonal. The multipliers `i j (taken from elimination) are below the diagonal. U is the upper triangular matrix which appears after forward elimination, The diagonal entries of U are the pivots. Example 1. " # " # " # 1 2 1 2 1 0 A= goes to U = with L = . Then LU = A. 3 8 0 2 3 1 Example 2. (which needs a row exchange) " # 0 2 A= cannot be factored into A = LU. 3 4 Example 3. (with all pivots and multipliers equal to 1)      1 1 1 1 0 0 1 1 1      A = 1 2 2 = 1 1 0 0 1 1 = LU. 1 2 3 1 1 1 0 0 1 From A to U there are subtractions of rows. From U to A there are additions of rows. Example 4. (when U is the identity and L is the same as A)   1 0 0   Lower triangular case A = `21 1 0 . `31 `32 1 The elimination steps on this A are easy: (i) E subtracts `21 times row 1 from row 2, (ii) F subtracts `31 times row 1 from row 3, and (iii) G subtracts `32 times row 2 from row 3. The result is the identity matrix U = I. The inverses of E, F, and G will bring back A: 1.5 Triangular Factors and Row Exchanges 39 E −1 applied to F −1 applied to G−1 applied to I produces A.  1  `21 1   1   1  times  `31 1 1  1    times   1 0 0     equals `21 1 0 . `31 `32 1 1   1 `32  The order is right for the `’s to fall into position. This always happens! Note that parentheses in E −1 F −1 G−1 were not necessary because of the associative law. A = LU: The n by n case The factorization A = LU is so important that we must say more. It used to be missing in linear algebra courses when they concentrated on the abstract side. Or maybe it was thought to be too hard—but you have got it. If the last Example 4 allows any U instead of the particular U = I, we can see how the rule works in general. The matrix L, applied to U, brings back A:    1 0 0 row 1 of U    A = LU (7) `21 1 0 row 2 of U  = original A. `31 `32 1 row 3 of U The proof is to apply the steps of elimination. On the right-hand side they take A to U. 
On the left-hand side they reduce L to I, as in Example 4. (The first step subtracts `21 times (1, 0, 0) from the second row, which removes `21 .) Both sides of (7) end up equal to the same matrix U, and the steps to get there are all reversible. Therefore (7) is correct and A = LU. A = LU is so crucial, and so beautiful, that Problem 8 at the end of this section suggests a second approach. We are writing down 3 by 3 matrices, but you can see how the arguments apply to larger matrices. Here we give one more example, and then put A = LU to use. Example 5. (A = LU, with zeros in the empty spaces)      1 −1 1 1 −1    −1 1 −1 2 −1 1 −1      A= .  =   1 −1 −1 1 −1 2 −1  1 −1 1 −1 2 That shows how a matrix A with three diagonals has factors L and U with two diagonals. This example comes from an important problem in differential equations (Section 1.7). The second difference in A is a backward difference L times a forward difference U. 40 Chapter 1 Matrices and Gaussian Elimination One Linear System = Two Triangular Systems There is a serious practical point about A = LU. It is more than just a record of elimination steps; L and U are the right matrices to solve Ax = b. In fact A could be thrown away! We go from b to c by forward elimination (this uses L) and we go from c to x by back-substitution (that uses U). We can and should do it without A: Splitting of Ax = b First Lc = b and then Ux = c. (8) Multiply the second equation by L to give LUx = Lc, which is Ax = b. Each triangular system is quickly solved. That is exactly what a good elimination code will do: 1. Factor (from A find its factors L and U). 2. Solve (from L and U and b find the solution x). The separation into Factor and Solve means that a series of b’s can be processed. The Solve subroutine obeys equation (8): two triangular systems in n2 /2 steps each. The solution for any new right-hand side b can be found in only n2 operations. That is far below the n3 /3 steps needed to factor A on the left-hand side. Example 6. This is the previous matrix A with a right-hand side b = (1, 1, 1, 1). Ax = b x1 − x2 −x1 + 2x2 − x3 − x2 + 2x3 − x4 − x3 + 2x4 = = = = Lc = b c1 −c1 + c2 − c2 + c3 − c3 + c4 1 1 1 1 Ux = c x1 − x2 x2 − x3 x3 − x4 x4 = = = = = = = = 1 2 3 4 1 1 1 1 splits into Lc = b and Ux = c.   1 2   gives c =   . 3 4   10 9   gives x =   . 7 4 For these special “tridiagonal matrices,” the operation count drops from n2 to 2n. You see how Lc = b is solved forward (c1 comes before c2 ). This is precisely what happens during forward elimination. Then Ux = c is solved backward (x4 before x3 ). Remark 1. The LU form is “unsymmetric” on the diagonal: L has 1s where U has the 1.5 Triangular Factors and Row Exchanges pivots. This is easy to correct. Divide out of U a diagonal pivot matrix D:   . d1 1 u12 /d1 u13 /d1 .. ..     d2 1 u /d   23 2 .   Factor out D U =  ... ..  . ...    . dn 1 41 (9) In the last example all pivots were di = 1. In that case D = I. But that was very exceptional, and normally LU is different from LDU (also written LDV ). The triangular factorization can be written A = LDU, where L and U have 1s on the diagonal and D is the diagonal matrix of pivots. Whenever you see LDU or LDV , it is understood that U or V has is on the diagonal— each row was divided by the pivot in D. Then L and U are treated evenly. An example of LU splitting into LDU is " # " #" # " #" #" # 1 2 1 1 2 1 1 1 2 A= = = = LDU. 
3 4 3 1 −2 3 1 −2 1 That has the 1s on the diagonals of L and U, and the pivots 1 and −2 in D. Remark 2. We may have given the impression in describing each elimination step, that the calculations must be done in that order. This is wrong. There is some freedom, and there is a “Crout algorithm” that arranges the calculations in a slightly different way. There is no freedom in the final L, D, and U. That is our main point: 1I If A = L1 D1U1 and also A = L2 D2U2 , where the L’s are lower triangular with unit diagonal, the U’s are upper triangular with unit diagonal, and the D’s are diagonal matrices with no zeros on the diagonal, then L1 = L2 , D1 = D2 , U1 = U2 . The LDU factorization and the LU factorization are uniquely determined by A. The proof is a good exercise with inverse matrices in the next section. Row Exchanges and Permutation Matrices We now have to face a problem that has so far been avoided: The number we expect to use as a pivot might be zero. This could occur in the middle of a calculation. It will happen at the very beginning if a11 = 0. A simple example is #" # " # " b1 0 2 u = . Zero in the pivot position b2 3 4 v The difficulty is clear; no multiple of the first equation will remove the coefficient 3. 42 Chapter 1 Matrices and Gaussian Elimination The remedy is equally clear. Exchange the two equations, moving the entry 3 up into the pivot. In this example the matrix would become upper triangular: Exchange rows 3u + 4v = b2 2v = b1 To express this in matrix terms, we need the permutation matrix P that produces the row exchange. It comes from exchanging the rows of I: " " # #" # " # 0 1 0 1 0 2 3 4 Permutation P= and PA = = . 1 0 1 0 3 4 0 2 P has the same effect on b, exchanging b1 and b2 . The new system is PAx = Pb. The unknowns u and v are not reversed in a row exchange. A permutation matrix P has the same rows as the identity (in some order). There is a single “1” in every row and column. The most common permutation matrix is P = I (it exchanges nothing). The product of two permutation matrices is another permutation— the rows of I get reordered twice. After P = I, the simplest permutations exchange two rows. Other permutations exchange more rows. There are n! = (n)(n − 1) · · · (1) permutations of size n. Row 1 has n choices, then row 2 has n − 1 choices, and finally the last row has only one choice. We can display all 3 by 3 permutations (there are 3! = (3)(2)(1) = 6 matrices):       1 1 1       I= 1  P21 = 1 P32 P21 =  1  1 1 1       1 1 1       P31 =  1  P32 =  P21 P32 = 1 1 . 1 1 1 There will be 24 permutation matrices of order n = 4. There are only two permutation matrices of order 2, namely " # " # 1 0 0 1 and . 0 1 1 0 When we know about inverses and transposes (the next section defines A−1 and AT ), we discover an important fact: P−1 is always the same as PT . A zero in the pivot location raises two possibilities: The trouble may be easy to fix, or it may be serious. This is decided by looking below the zero. If there is a nonzero entry lower down in the same column, then a row exchange is carried out. The nonzero entry becomes the needed pivot, and elimination can get going again:   0 a b d = 0 =⇒ no first pivot   A = 0 0 c  a = 0 =⇒ no second pivot d e f c = 0 =⇒ no third pivot. 1.5 Triangular Factors and Row Exchanges 43 If d = 0, the problem is incurable and this matrix is singular. There is no hope for a unique solution to Ax = b. If d is not zero, an exchange P13 of rows 1 and 3 will move d into the pivot. 
However the next pivot position also contains a zero. The number a is now below it (the e above it is useless). If a is not zero then another row exchange P23 is called for:       1 0 0 d e f 0 0 1       P13 = 0 1 0 and P23 = 0 0 1 and P23 P13 A =  0 a b  1 0 0 0 1 0 0 0 c One more point: The permutation P23 P13 will do both row exchanges at once:      1 0 0 0 0 1 0 0 1      P13 acts first P23 P13 = 0 0 1 0 1 0 = 1 0 0 = P. 0 1 0 1 0 0 0 1 0 If we had known, we could have multiplied A by P in the first place. With the rows in the right order PA, any nonsingular matrix is ready for elimination. Elimination in a Nutshell: PA = LU The main point is this: If elimination can be completed with the help of row exchanges, then we can imagine that those exchanges are done first (by P). The matrix PA will not need row exchanges. In other words, PA allows the standard factorization into L times U. The theory of Gaussian elimination can be summarized in a few lines: 1J In the nonsingular case, there is a permutation matrix P that reorders the rows of A to avoid zeros in the pivot positions. Then Ax = b has a unique solution: With the rows reordered in advance, PA can be factored into LU. In the singular case, no P can produce a full set of pivots: elimination fails. In practice, we also consider a row exchange when the original pivot is near zero— even if it is not exactly zero. Choosing a larger pivot reduces the roundoff error. You have to be careful with L. Suppose elimination subtracts row 1 from row 2, creating `21 = 1. Then suppose it exchanges rows 2 and 3. If that exchange is done in advance, the multiplier will change to `31 = 1 in PA = LU. Example 7.       1 1 1 1 1 1 1 1 1       A = 1 1 3 → 0 0 2 → 0 3 6 = U. 0 0 2 0 3 6 2 5 8 (10) 44 Chapter 1 Matrices and Gaussian Elimination That row exchange recovers LU—but now `31 = 1 and `21 = 2:     1 0 0 1 0 0     P = 0 0 1 and L = 2 1 0 and PA = LU. 0 1 0 1 0 1 (11) In MATLAB, A([r k] :) exchanges row k with row r below it (where the kth pivot has been found). We update the matrices L and P the same way. At the start, P = I and sign = +1: A([r k], :) = A([k r], :); L([r k], 1:k-1) = L([k r], 1:k-1); P([r k], :) = P([k r], :); sign = -sign The “sign” of P tells whether the number of row exchanges is even (sign = +1) or odd (sign = −1). A row exchange reverses sign. The final value of sign is the determinant of P and it does not depend on the order of the row exchanges. To summarize: A good elimination code saves L and U and P. Those matrices carry the information that originally came in A—and they carry it in a more usable form. Ax = b reduces to two triangular systems. This is the practical equivalent of the calculation we do next—to find the inverse matrix A−1 and the solution x = A−1 b. Problem Set 1.5 1. When is an upper triangular matrix nonsingular (a full set of pivots)? 2. What multiple `32 of row 2 of A will elimination subtract from row 3 of A? Use the factored form    1 0 0 5 7 8    A = 2 1 0 0 2 3 . 1 4 1 0 0 6 What will be the pivots? Will a row exchange be required? 3. Multiply the matrix L = E −1 F −1 G−1 in equation (6) by GFE in equation (3):     1 0 0 1 0 0     1 0 times −2 1 0 . 2 −1 1 1 −1 −1 1 Multiply also in the opposite order. Why are the answers what they are? 1.5 Triangular Factors and Row Exchanges 4. 
Apply elimination to produce the factors L and U for   # " 3 1 1 2 1   and A = 1 3 1 and A= 8 7 1 1 3 45   1 1 1   A = 1 4 4 . 1 4 8 5. Factor A into LU, and write down the upper triangular system Ux = c which appears after elimination, for      2 3 3 u 2      Ax = 0 5 7  v  = 2 . 6 9 8 w 5 6. Find E 2 and E 8 and E −1 if " # 1 0 E= . 6 1 7. Find the products FGH and HGF if (with upper triangular zeros omitted)       1 1 1 2 1  0 1  0 1        F = G= H =   . 0 0 1  0 2 1  0 0 1  0 0 0 1 0 0 0 1 0 0 2 1 8. (Second proof of A = LU) The third row of U comes from the third row of A by subtracting multiples of rows 1 and 2 (of U!): row 3 of U = row 3 of A − `31 (row 1 of U) − `32 (row 2 of U). (a) Why are rows of U subtracted off and not rows of A? Answer: Because by the time a pivot row is used, . (b) The equation above is the same as row 3 of A = `31 (row 1 of U) + `32 (row 2 of U) + 1(row 3 of U). Which rule for matrix multiplication makes this row 3 of L times U? The other rows of LU agree similarly with the rows of A. 9. (a) Under what conditions is the following product nonsingular?     1 −1 0 1 0 0 d1     A = −1 1 0  d2  0 1 −1 . 1 d3 0 0 0 −1 1 (b) Solve the system Ax = b starting with Lc = b:      0 1 0 0 c1      −1 1 0 c2  = 0 = b. 1 0 −1 1 c3 46 Chapter 1 Matrices and Gaussian Elimination 10. (a) Why does it take approximately n2 /2 multiplication-subtraction steps to solve each of Lc = b and Ux = c? (b) How many steps does elimination use in solving 10 systems with the same 60 by 60 coefficient matrix A? 11. Solve as two triangular systems, without multiplying LU to find A:       u 2 1 0 0 2 4 4       LUx = 1 1 0 0 1 2  v  = 0 . 2 1 0 1 0 0 1 w 12. How could you factor A into a product UL, upper triangular times lower triangular? Would they be the same factors as in A = LU? 13. Solve by elimination, exchanging rows when necessary: u + 4v + 2w = −2 −2u − 8v + 3w = 32 v + w = 1 and v + w = 0 u + v = 0 u + v + w = 1. Which permutation matrices are required? 14. Write down all six of the 3 by 3 permutation matrices, including P = I. Identify their inverses, which are also permutation matrices. The inverses satisfy PP−1 = I and are on the same list. 15. Find the PA = LDU factorizations (and check them) for     1 2 1 0 1 1     A = 1 0 1 and A = 2 4 2 . 1 1 1 2 3 4 16. Find a 4 by 4 permutation matrix that requires three row exchanges to reach the end of elimination (which is U = I). 17. The less familiar form A = LPU exchanges rows only at the end:        1 1 1 1 0 0 1 1 1 1 1 1        A = 1 1 3 → L−1 A = 0 0 2 = PU = 0 0 1 0 3 6 . 0 1 0 0 0 2 0 3 6 2 5 8 What is L is this case? Comparing with PA = LU in Box 1J, the multipliers now stay in place (`21 is 1 and `31 is 2 when A = LPU). 18. Decide whether the following systems are singular or nonsingular, and whether they have no solution, one solution, or infinitely many solutions: v − w = 2 u − v = 2 u − w = 2 and v − w = 0 u − v = 0 u − w = 0 and v + w = 1 u + v = 1 u + w = 1. 1.5 Triangular Factors and Row Exchanges 47 19. Which numbers a, b, c lead to row exchanges? Which make the matrix singular?   " # 1 2 0 c 2   . A = a 8 3 and A = 6 4 0 b 5 Problems 20–31 compute the factorization A = LU (and also A = LDU). £ ¤ £ ¤ 20. Forward elimination changes 11 12 x = b to a triangular 10 11 x = c: " # " # x + y = 5 x + y = 5 1 1 5 1 1 5 → → . 
x + 2y = 7 y = 2 1 2 7 0 1 2 That step subtracted `21 = times row 1 from row 2. The reverse step adds `21 times row 1 to row 2. The matrix . Multiply this L £ 1 1 ¤ for£that ¤ reverse step is L = 5 = . In letters, L multiplies times the triangular system 0 1 x = 2 to get Ux = c to give . 21. (Move to 3 by 3) Forward elimination changes Ax = b to a triangular Ux = c: x+y+z = 5 x+y+z = 5 x+y+z = 5 x + 2y + 3z = 7 y + 2z = 2 y + 2z = 2 x + 3y + 6z = 11 2y + 5z = 6 z = 2. The equation z = 2 in Ux = c comes from the original x + 3y + 6z = 11 in Ax = b by subtracting `31 = times equation 1 and `32 = times the final equation 2. Reverse that to recover [1 3 6 11] in [A b] from the final [1 1 1 5] and [0 1 2 2] and [0 0 1 2] in [U c]: h i h i Row 3 of A b = (`31 Row 1 + `32 Row 2 + 1 Row 3) of U c . In matrix notation this is multiplication by L. So A = LU and b = Lc. 22. What are the 3 by 3 triangular systems Lc = b and Ux = c from Problem 21? Check that c = (5, 2, 2) solves the first one. Which x solves the second one? 23. What two elimination matrices E21 and E32 put A into upper triangular form E32 E21 A = −1 −1 −1 −1 U? Multiply by E31 and E21 to factor A into LU = E21 E32 U:   1 1 1   A = 2 4 5 . 0 4 0 24. What three elimination matrices E21 , E31 , E32 put A into upper triangular form −1 −1 −1 E32 E31 E21 A = U? Multiply by E32 , E31 and E21 to factor A into LU where L = −1 −1 −1 E21 E31 E32 . Find L and U:   1 0 1   A = 2 2 2 . 3 4 5 48 Chapter 1 Matrices and Gaussian Elimination 25. When zero appears in a pivot position, A = LU is not possible! (We need nonzero pivots d, f , i in U.) Show directly why these are both impossible:      " # " #" # 1 1 0 1 d e g 0 1 1 0 d e      = 1 1 2 =  ` 1   f h . 2 3 ` 1 0 f 1 2 1 m n 1 i 26. Which number c leads to zero in the second pivot position? A row exchange is needed and A = LU is not possible. Which c produces zero in the third pivot position? Then a row exchange can’t help and elimination fails:   1 c 0   A = 2 4 1 . 3 5 1 27. What are L and D for this matrix A? What is U in A = LU and what is the new U in A = LDU?   2 4 8   A = 0 3 9 . 0 0 7 28. A and B are symmetric across the diagonal (because 4 = 4). Find their triple factorizations LDU and say how U is related to L for these symmetric matrices:   " # 1 4 0 2 4   A= and B = 4 12 4 . 4 11 0 4 0 29. (Recommended) Compute L and U for the symmetric matrix   a a a a a b b b   A= . a b c c  a b c d Find four conditions on a, b, c, d to get A = LU with four pivots. 30. Find L and U for the nonsymmetric matrix  a r a b  A= a b a b r s c c  r s  . t d Find the four conditions on a, b, c, d, r, s, t to get A = LU with four pivots. 1.5 Triangular Factors and Row Exchanges 49 31. Tridiagonal matrices have zero entries except on the main diagonal and the two adjacent diagonals. Factor these into A = LU and A = LDV :     1 1 0 a a 0     A = 1 2 1 and A = a a + b b . 0 1 2 0 b b+c 32. Solve the triangular system Lc = b to find c. Then solve Ux = c to find x: " # " # " # 1 0 2 4 2 L= and U = and b = . 4 1 0 1 11 For safety find A = LU and solve Ax = b as usual. Circle c when you see it. 33. Solve Lc = b to find c. Then solve Ux = c to find x. What was A?       1 0 0 1 1 1 4       L = 1 1 0 and U = 0 1 1 and b = 5 . 1 1 1 0 0 1 6 34. If A and B have nonzeros in the positions marked by x, which zeros are still zero in their factors L and U?     x x x x x x x 0  x x x 0 x x 0 x     A=  and B =  . 
0 x x x x 0 x x 0 0 x x 0 x x x 35. (Important) If A has pivots 2, 7, 6 with no row exchanges, what are the pivots for the upper left 2 by 2 submatrix B (without row 3 and column 3)? Explain why. 36. Starting from a 3 by 3 matrix A with pivots 2, 7, 6, add a fourth row and column to produce M. What are the first three pivots for M, and why? What fourth row and column are sure to produce 9 as the fourth pivot? 37. Use chol(pascal(5)) to find the triangular factors of MATLAB’s pascal(5). Row exchanges in [L, U] = lu(pascal(5)) spoil Pascal’s pattern! 38. (Review) For which numbers c is A = LU  1  A = 3 0 impossible—with three pivots?  2 0  c 1 . 1 1 39. Estimate the time difference for each new right-hand side b when n = 800. Create A = rand(800) and b = rand(800,1) and B = rand(800,9). Compare the times from tic; A\b; toc and tic; A\B; toc (which solves for 9 right sides). Problems 40–48 are about permutation matrices. 50 Chapter 1 Matrices and Gaussian Elimination 40. There are 12 “even” permutations of (1, 2, 3, 4), with an even number of exchanges. Two of them are (1, 2, 3, 4) with no exchanges and (4, 3, 2, 1) with two exchanges. List the other ten. Instead of writing each 4 by 4 matrix, use the numbers 4, 3, 2, 1 to give the position of the 1 in each row. 41. How many exchanges will permute (5, 4, 3, 2, 1) back to (1, 2, 3, 4, 5)? How many exchanges to change (6, 5, 4, 3, 2, 1) to (1, 2, 3, 4, 5, 6)? One is even and the other is odd. For (n, . . . , 1) to (1, . . . , n), show that n = 100 and 101 are even, n = 102 and 103 are odd. 42. If P1 and P2 are permutation matrices, so is P1 P2 . This still has the rows of I in some order. Give examples with P1 P2 6= P2 P1 and P3 P4 = P4 P3 . 43. (Try this question.) Which permutation makes PA upper triangular? Which permutations make P1 AP2 lower triangular? Multiplying A on the right by P2 exchanges the of A.   0 0 6   A = 1 2 3 0 4 5 44. Find a 3 by 3 permutation matrix with P3 = I (but not P = I). Find a 4 by 4 permutation Pb with Pb4 6= I. 45. If you take powers of a permutation, why is some Pk eventually equal to I? Find a 5 by 5 permutation P so that the smallest power to equal I is P6 . (This is a challenge question. Combine a 2 by 2 block with a 3 by 3 block.) 46. The matrix P that multiplies (x, y, z) to give (z, x, y) is also a rotation matrix. Find P and P3 . The rotation axis a = (1, 1, 1) doesn’t move, it equals Pa. What is the angle of rotation from v = (2, 3, −5) to Pv = (−5, 2, 3)? 47. If P is any permutation matrix, find a nonzero vector x so that (I − P)x = 0. (This will mean that I − P has no inverse, and has determinant zero.) 48. If P has 1s on the antidiagonal from (1, n) to (n, 1), describe PAP. 1.6 Inverses and Transposes The inverse of an n by n matrix is another n by n matrix. The inverse of A is written A−1 (and pronounced “A inverse”). The fundamental property is simple: If you multiply by A and then multiply by A−1 , you are back where you started: Inverse matrix If b = Ax then A−1 b = x. 1.6 Inverses and Transposes 51 Thus A−1 Ax = x. The matrix A−1 times A is the identity matrix. Not all matrices have inverses. An inverse is impossible when Ax is zero and x is nonzero. Then A−1 would have to get back from Ax = 0 to x. No matrix can multiply that zero vector Ax and produce a nonzero vector x. Our goals are to define the inverse matrix and compute it and use it, when A−1 exists—and then to understand which matrices don’t have inverses. 1K The inverse of A is a matrix B such that BA = I and AB = I. 
There is at most one such B, and it is denoted by A−1 : A−1 A = I and AA−1 = I. (1) Note 1. The inverse exists if and only if elimination produces n pivots (row exchanges allowed). Elimination solves Ax = b without explicitly finding A−1 . Note 2. The matrix A cannot have two different inverses, Suppose BA = I and also AC = I. Then B = C, according to this “proof by parentheses”: B(AC) = (BA)C gives BI = IC which is B = C. (2) This shows that a left-inverse B (multiplying from the left) and a right-inverse C (multiplying A from the right to give AC = I) must be the same matrix. Note 3. If A is invertible, the one and only solution to Ax = b is x = A−1 b: Multiply Ax = b by A−1 . Then x = A−1 Ax = A−1 b. Note 4. (Important) Suppose there is a nonzero vector x such that Ax = 0. Then A cannot have an inverse. To repeat: No matrix can bring 0 back to x. If A is invertible, then Ax = 0 can only have the zero solution x = 0. Note 5. A 2 by 2 matrix is invertible if and only if ad − bc is not zero: " #−1 " # 1 a b d −b 2 by 2 inverse = . ad − bc −c a c d (3) This number ad − bc is the determinant of A. A matrix is invertible if its determinant is not zero (Chapter 4). In MATLAB, the invertibility test is to find n nonzero pivots. Elimination produces those pivots before the determinant appears. Note 6. A diagonal matrix has an inverse provided no diagonal entries are zero:     1/d1 d1     −1 −1 ... ... If A =   and AA = I.  then A =  dn 1/dn When two matrices are involved, not much can be done about the inverse of A + B. The sum might or might not be invertible. Instead, it is the inverse of their product 52 Chapter 1 Matrices and Gaussian Elimination AB which is the key formula in matrix computations. Ordinary numbers are the same: (a + b)−1 is hard to simplify, while 1/ab splits into 1/a times 1/b. But for matrices the order of multiplication must be correct—if ABx = y then Bx = A−1 y and x = B−1 A−1 y. The inverses come in reverse order. 1L A product AB of invertible matrices is inverted by B−1 A−1 : Inverse of AB (AB)−1 = B−1 A−1 . (4) Proof. To show that B−1 A−1 is the inverse of AB, we multiply them and use the associative law to remove parentheses. Notice how B sits next to B−1 : (AB)(B−1 A−1 ) = ABB−1 A−1 = AIA−1 = AA−1 = I (B−1 A−1 )(AB) = B−1 A−1 AB = B−1 IB = B−1 B = I. A similar rule holds with three or more matrices: Inverse of ABC (ABC)−1 = C−1 B−1 A−1 . We saw this change of order when the elimination matrices E, F, G were inverted to come back from U to A. In the forward direction, GFEA was U. In the backward direction, L = E −1 F −1 G−1 was the product of the inverses. Since G came last, G−1 comes first. Please check that A−1 would be U −1 GFE. The Calculation of A−1 : The Gauss-Jordan Method Consider the equation AA−1 = I. If it is taken a column at a time, that equation determines each column of A−1 . The first column of A−1 is multiplied by A, to yield the first column of the identity: Ax1 = e1 . Similarly Ax2 = e2 and Ax3 = e3 the e’s are the columns of I. In a 3 by 3 example, A times A−1 is I:     2 1 1 h 1 0 0 i h i     (5) Axi = ei = 4 −6 0 x x x e e e   1 2 3 1 2 3 = 0 1 0 . −2 7 2 0 0 1 Thus we have three systems of equations (or n systems). They all have the same coefficient matrix A. The right-hand sides e1 , e2 , e3 are different, but elimination is possible on all systems simultaneously. This is the Gauss-Jordan method. 
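In MATLAB, one way to carry out this simultaneous elimination (a side sketch using the built-in rref, rather than the step-by-step elimination described here) is to row-reduce the block matrix [A I]:

A = [ 2  1  1;  4 -6  0; -2  7  2 ];
GJ = rref([A eye(3)])        % reduce [A I]; the right half becomes the inverse
Ainv = GJ(:, 4:6);
A*Ainv                       % the 3 by 3 identity (up to roundoff)
inv(A)                       % MATLAB's own inverse, for comparison

The right half of GJ agrees with the A−1 found by hand in Example 1 below.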
Instead of stopping at U and switching to back-substitution, it continues by subtracting multiples of a row from the rows above. This produces zeros above the diagonal as well as below. When it reaches the identity matrix we have found A−1 . The example keeps all three columns e1 , e2 , e3 , and operates on rows of length six: 1.6 Inverses and Transposes 53 Example 1. Using the Gauss-Jordan Method to Find A−1   2 1 1 1 0 0 h i   A e1 e2 e3 =  4 −6 0 0 1 0 −2 7 2 0 0 1   2 1 1 1 0 0   pivot = 2 → 0 −8 −2 −2 1 0 0 8 3 1 0 1   2 1 1 1 0 0 h i   −1 . pivot = −8 → 0 −8 −2 −2 1 0 = U L 0 0 1 −1 1 1 This completes the first half—forward elimination. The upper triangular U appears in the first three columns. The other three columns are the same as L−1 . (This is the effect of applying the elementary operations GFE to the identity matrix.) Now the second half will go from U to I (multiplying by U −1 ). That takes L−1 to U −1 L−1 which is A−1 . Creating zeros above the pivots, we reach A−1 :   2 1 0 2 −1 −1 h i   Second half 2 U L−1 → 0 −8 0 −4 3 0 0 1 −1 1 1   12 5 6 2 0 0 8 −8 −8   zeros above pivots → 0 −8 0 −4 3 2  0 0 1 −1 1 1   5 6 − − 1 0 0 12 h i 16 16 16   4 3 2 −1 . divide by pivots → 0 1 0 8 − 8 − 8  = I A 0 0 1 −1 1 1 At the last step, we divided the rows by their pivots 2 and −8 and 1. The coefficient matrix in the left-hand half became the identity. Since A went to I, the same operations on the right-hand half must have carried I into A−1 . Therefore we have computed the inverse. A note for the future: You can see the determinant −16 appearing in the denominators of A−1 . The determinant is the product of the pivots (2)(−8)(1). It enters at the end when the rows are divided by the pivots. Remark 1. In spite of this brilliant success in computing A−1 , I don’t recommend it, I admit that A−1 solves Ax = b in one step. Two triangular steps are better: x = A−1 b separates into Lc = b and Ux = c. We could write c = L−1 b and then x = U −1 c = U −1 L−1 b. But note that we did not explicitly form, and in actual computation should not form, these matrices L−1 and U −1 . 54 Chapter 1 Matrices and Gaussian Elimination It would be a waste of time, since we only need back-substitution for x (and forward substitution produced c). A similar remark applies to A−1 ; the multiplication A−1 b would still take n2 steps. It is the solution that we want, and not all the entries in the inverse. Remark 2. Purely out of curiosity, we might count the number of operations required to find A−1 . The normal count for each new right-hand side is n2 , half in the forward direction and half in back-substitution. With n right-hand sides e1 , . . . , en this makes n3 . After including the n3 /3 operations on A itself, the total seems to be 4n3 /3. This result is a little too high because of the zeros in the e j . Forward elimination changes only the zeros below the 1. This part has only n − j components, so the count for e j is effectively changed to (n − j)2 /2. Summing over all j, the total for forward elimination is n3 /6. This is to be combined with the usual n3 /3 operations that are applied to A, and the n(n2 /2) back-substitution steps that finally produce the columns x j of A−1 . The final count of multiplications for computing A−1 is n3 : µ 2¶ n3 n3 n Operation count + +n = n3 . 6 3 2 This count is remarkably low. Since matrix multiplication already takes n3 steps, it requires as many operations to compute A2 as it does to compute A−1 ! 
Remark 3. In the Gauss-Jordan calculation we went all the way forward to U, before starting backward to produce zeros above the pivots. That is like Gaussian elimination, but other orders are possible. We could have used the second pivot when we were there earlier, to create a zero above it as well as below it. This is not smart. At that time the second row is virtually full, whereas near the end it has zeros from the upward row operations that have already taken place.

Invertible = Nonsingular (n pivots)

Ultimately we want to know which matrices are invertible and which are not. This question is so important that it has many answers. See the last page of the book! Each of the first five chapters will give a different (but equivalent) test for invertibility. Sometimes the tests extend to rectangular matrices and one-sided inverses: Chapter 2 looks for independent rows and independent columns, Chapter 3 inverts AAT or AT A. The other chapters look for nonzero determinants or nonzero eigenvalues or nonzero pivots. This last test is the one we meet through Gaussian elimination. We want to show (in a few theoretical paragraphs) that the pivot test succeeds.

Suppose A has a full set of n pivots. AA−1 = I gives n separate systems Axi = ei for the columns of A−1. They can be solved by elimination or by Gauss-Jordan. Row exchanges may be needed, but the columns of A−1 are determined.

Strictly speaking, we have to show that the matrix A−1 with those columns is also a left-inverse. Solving AA−1 = I has at the same time solved A−1 A = I, but why? A 1-sided inverse of a square matrix is automatically a 2-sided inverse. To see why, notice that every Gauss-Jordan step is a multiplication on the left by an elementary matrix. We are allowing three types of elementary matrices:

1. Eij to subtract a multiple ℓ of row j from row i
2. Pij to exchange rows i and j
3. D (or D−1) to divide all rows by their pivots.

The Gauss-Jordan process is really a giant sequence of matrix multiplications:

    (D−1 · · · E · · · P · · · E) A = I.     (6)

That matrix in parentheses, to the left of A, is evidently a left-inverse! It exists, it equals the right-inverse by Note 2, so every nonsingular matrix is invertible.

The converse is also true: If A is invertible, it has n pivots. In an extreme case that is clear: A cannot have a whole column of zeros. The inverse could never multiply a column of zeros to produce a column of I. In a less extreme case, suppose elimination starts on an invertible matrix A but breaks down at column 3:

    Breakdown                     [ d1  x  x  x ]
    No pivot in column 3   A' =   [  0 d2  x  x ]
                                  [  0  0  0  x ]
                                  [  0  0  0  x ]

This matrix cannot have an inverse, no matter what the x's are. One proof is to use column operations (for the first time?) to make the whole third column zero. By subtracting multiples of column 2 and then of column 1, we reach a matrix that is certainly not invertible. Therefore the original A was not invertible. Elimination gives a complete test: An n by n matrix is invertible if and only if it has n pivots.
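The pivot test is easy to try numerically. This MATLAB (or Octave) sketch counts pivots from an LU factorization with row exchanges (PA = LU); the matrices are the one from Example 1 and an obviously singular 2 by 2 example. Because lu chooses its own row exchanges, its pivots can differ from the hand computation, but the count of nonzero pivots is the invertibility test:

    A = [2 1 1; 4 -6 0; -2 7 2];
    [L, U, P] = lu(A);             % PA = LU, row exchanges allowed
    disp(diag(U)')                 % three nonzero pivots, so A is invertible
    S = [1 2; 3 6];                % second row is 3 times the first row
    [Ls, Us] = lu(S);
    disp(diag(Us)')                % a zero pivot appears, so S has no inverse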
The Transpose Matrix

We need one more matrix, and fortunately it is much simpler than the inverse. The transpose of A is denoted by AT. Its columns are taken directly from the rows of A: the ith row of A becomes the ith column of AT:

    Transpose    If A = [2 1 4; 0 0 3]   then   AT = [2 0; 1 0; 4 3].

At the same time the columns of A become the rows of AT. If A is an m by n matrix, then AT is n by m. The final effect is to flip the matrix across its main diagonal, and the entry in row i, column j of AT comes from row j, column i of A:

    Entries of AT    (AT)ij = Aji.     (7)

The transpose of a lower triangular matrix is upper triangular. The transpose of AT brings us back to A. If we add two matrices and then transpose, the result is the same as first transposing and then adding: (A + B)T is the same as AT + BT. But what is the transpose of a product AB or an inverse A−1? Those are the essential formulas of this section:

1M  (i) The transpose of AB is (AB)T = BT AT.
    (ii) The transpose of A−1 is (A−1)T = (AT)−1.

Notice how the formula for (AB)T resembles the one for (AB)−1. In both cases we reverse the order, giving BT AT and B−1 A−1. The proof for the inverse was easy, but this one requires an unnatural patience with matrix multiplication. The first row of (AB)T is the first column of AB. So the columns of A are weighted by the first column of B. This amounts to the rows of AT weighted by the first row of BT. That is exactly the first row of BT AT. The other rows of (AB)T and BT AT also agree.

    Start from         AB = [1 0; 1 1] [3 3 3; 2 2 2] = [3 3 3; 5 5 5]
    Transpose to    BT AT = [3 2; 3 2; 3 2] [1 1; 0 1] = [3 5; 3 5; 3 5].

To establish the formula for (A−1)T, start from AA−1 = I and A−1 A = I and take transposes. On one side, IT = I. On the other side, we know from part (i) the transpose of a product. You see how (A−1)T is the inverse of AT, proving (ii):

    Inverse of AT = Transpose of A−1    (A−1)T AT = I.     (8)

Symmetric Matrices

With these rules established, we can introduce a special class of matrices, probably the most important class of all. A symmetric matrix is a matrix that equals its own transpose: AT = A. The matrix is necessarily square. Each entry on one side of the diagonal equals its "mirror image" on the other side: aij = aji. Two simple examples are A and D (and also A−1):

    Symmetric matrices    A = [1 2; 2 8]   and   D = [1 0; 0 4]   and   A−1 = (1/4)[8 −2; −2 1].

A symmetric matrix need not be invertible; it could even be a matrix of zeros. But if A−1 exists it is also symmetric. From formula (ii) above, the transpose of A−1 always equals (AT)−1; for a symmetric matrix this is just A−1. A−1 equals its own transpose; it is symmetric whenever A is. Now we show that multiplying any matrix R by RT gives a symmetric matrix.

Symmetric Products RT R, RRT, and LDLT

Choose any matrix R, probably rectangular. Multiply RT times R. Then the product RT R is automatically a square symmetric matrix:

    The transpose of RT R is RT (RT)T, which is RT R.     (9)

That is a quick proof of symmetry for RT R. Its i, j entry is the inner product of row i of RT (column i of R) with column j of R. The j, i entry is the same inner product, column j with column i. So RT R is symmetric.

RRT is also symmetric, but it is different from RT R. In my experience, most scientific problems that start with a rectangular matrix R end up with RT R or RRT or both.

Example 2. R = [1 2] and RT = [1; 2] produce RT R = [1 2; 2 4] and RRT = [5].

The product RT R is n by n. In the opposite order, RRT is m by m. Even if m = n, it is not very likely that RT R = RRT. Equality can happen, but it's not normal.
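Both transpose rules are easy to confirm numerically. This MATLAB (or Octave) sketch uses the small matrices from the displays above plus one arbitrary rectangular R; in MATLAB the transpose of a real matrix A is written A':

    A = [1 0; 1 1];  B = [3 3 3; 2 2 2];
    disp((A*B)' - B'*A')              % zero matrix: (AB)^T = B^T A^T
    C = [1 2; 2 8];                   % symmetric, with inverse (1/4)[8 -2; -2 1]
    disp(inv(C)' - inv(C))            % zero: the inverse of a symmetric matrix is symmetric
    R = [1 2];
    disp(R'*R)                        % [1 2; 2 4], symmetric
    disp(R*R')                        % 5, a 1 by 1 matrix
    R2 = rand(4, 2);                  % any rectangular R gives a symmetric R'R
    disp(norm(R2'*R2 - (R2'*R2)'))    % zero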
Symmetric matrices appear in every subject whose laws are fair. "Each action has an equal and opposite reaction." The entry aij that gives the action of i onto j is matched by aji. We will see this symmetry in the next section, for differential equations. Here, LU misses the symmetry but LDLT captures it perfectly.

1N  Suppose A = AT can be factored into A = LDU without row exchanges. Then U is the transpose of L. The symmetric factorization becomes A = LDLT.

The transpose of A = LDU gives AT = UT DT LT. Since A = AT, we now have two factorizations of A into lower triangular times diagonal times upper triangular. (LT is upper triangular with ones on the diagonal, exactly like U.) Since the factorization is unique (see Problem 17), LT must be identical to U.

    LT = U   and   A = [1 2; 2 8] = [1 0; 2 1] [1 0; 0 4] [1 2; 0 1] = LDLT.

When elimination is applied to a symmetric matrix, AT = A is an advantage. The smaller matrices stay symmetric as elimination proceeds, and we can work with half the matrix! The lower right-hand corner remains symmetric:

    [ a  b  c ]      [ a      b          c      ]
    [ b  d  e ]  →   [ 0   d − b²/a   e − bc/a  ]
    [ c  e  f ]      [ 0   e − bc/a   f − c²/a  ]

The work of elimination is reduced from n3/3 to n3/6. There is no need to store entries from both sides of the diagonal, or to store both L and U.

Problem Set 1.6

1. Find the inverses (no special system required) of

    A1 = [0 2; 3 0],   A2 = [2 0; 4 2],   A3 = [cos θ  −sin θ; sin θ  cos θ].

2. (a) Find the inverses of the permutation matrices

       P = [0 0 1; 0 1 0; 1 0 0]   and   P = [0 0 1; 1 0 0; 0 1 0].

   (b) Explain for permutations why P−1 is always the same as PT. Show that the 1s are in the right places to give PPT = I.

3. From AB = C find a formula for A−1. Also find A−1 from PA = LU.

4. (a) If A is invertible and AB = AC, prove quickly that B = C.
   (b) If A = [1 0; 0 0], find an example with AB = AC but B ≠ C.

5. If the inverse of A2 is B, show that the inverse of A is AB. (Thus A is invertible whenever A2 is invertible.)

6. Use the Gauss-Jordan method to invert

    A1 = [1 0 0; 1 1 1; 0 0 1],   A2 = [2 −1 0; −1 2 −1; 0 −1 2],   A3 = [0 0 1; 0 1 1; 1 1 1].

7. Find three 2 by 2 matrices, other than A = I and A = −I, that are their own inverses: A2 = I.

8. Show that A = [1 1; 3 3] has no inverse by solving Ax = 0, and by failing to solve

    [1 1; 3 3] [a b; c d] = [1 0; 0 1].

9. Suppose elimination fails because there is no pivot in column 3:

    Missing pivot    A = [2 1 4 6; 0 3 8 5; 0 0 0 7; 0 0 0 9].

   Show that A cannot be invertible. The third row of A−1, multiplying A, should give the third row [0 0 1 0] of A−1 A = I. Why is this impossible?

10. Find the inverses (in any legal way) of

    A1 = [0 0 0 1; 0 0 2 0; 0 3 0 0; 4 0 0 0],   A2 = [1 0 0 0; −1/2 1 0 0; 0 −2/3 1 0; 0 0 −3/4 1],   A3 = [a b 0 0; c d 0 0; 0 0 a b; 0 0 c d].

11. Give examples of A and B such that
    (a) A + B is not invertible although A and B are invertible.
    (b) A + B is invertible although A and B are not invertible.
    (c) all of A, B, and A + B are invertible.
    (d) In the last case use A−1(A + B)B−1 = B−1 + A−1 to show that C = B−1 + A−1 is also invertible, and find a formula for C−1.

12. If A is invertible, which properties of A remain true for A−1?
    (a) A is triangular. (b) A is symmetric. (c) A is tridiagonal. (d) All entries are whole numbers. (e) All entries are fractions (including numbers like 1/3).

13. If A = [3; 1] and B = [2; 2], compute AT B, BT A, ABT, and BAT.

14.
If B is square, show that A = B + BT is always symmetric and K = B − BT is always skew-symmetric—which means that K T = −K. Find these matrices A and K when B = [ 11 31 ], and write B as the sum of a symmetric matrix and a skew-symmetric matrix. 15. (a) How many entries can be chosen independently in a symmetric matrix of order n? (b) How many entries can be chosen independently in a skew-symmetric matrix (K T = −K) of order n? The diagonal of K is zero! 16. (a) If A = LDU, with 1s on the diagonals of L and U, what is the corresponding factorization of AT ? Note that A and AT (square matrices with no row exchanges) share the same pivots. (b) What triangular systems will give the solution to AT y = b? 17. If A = L1 D1U1 and A = L2 D2U2 , prove that L1 = L2 , D1 = D2 , and U1 = U2 . If A is invertible, the factorization is unique. (a) Derive the equation L1−1 L2 D2 = D1U1U2−1 , and explain why one side is lower triangular and the other side is upper triangular. (b) Compare the main diagonals and then compare the off-diagonals. 60 Chapter 1 Matrices and Gaussian Elimination 18. Under what conditions on their entries are A and B invertible?     a b 0 a b c     B =  c d 0 . A =  d e 0 f 0 0 0 0 e 19. Compute the symmetric LDLT factorization of   " # 1 3 5 a b   . A = 3 12 18 and A = b d 5 18 30 20. Find the inverse of  1 0 0 0  1 1 0 0   A =  41 1 .  3 3 1 0 1 1 1 2 2 2 1  21. (Remarkable) If A and B are square matrices, show that I − BA is invertible if I − AB is invertible. Start from B(I − AB) = (1 − BA)B. 22. Find the inverses (directly or from the 2 by 2 formula) of A, B, C: " # " # " # 0 3 a b 3 4 A= and B = and C = . 4 6 b 0 5 7 " # x t 23. Solve for the columns of A−1 = : y z " #" # " # 10 20 x 1 = and 20 50 y 0 " 10 20 20 50 #" # " # t 0 = . z 1 24. Show that [ 13 26 ] has no inverse by trying to solve for the column (x, y): " #" # " # " #" # " # 1 2 x t 1 0 1 2 x 1 = must include = . 3 6 y z 0 1 3 6 y 0 25. (Important) If A has row 1 + row 2 = row 3, show that A is not invertible: (a) Explain why Ax = (1, 0, 0) cannot have a solution. (b) Which right-hand sides (b1 , b2 , b3 ) might allow a solution to Ax = b? (c) What happens to row 3 in elimination? 26. If A has column 1 + column 2 = column 3, show that A is not invertible: 1.6 Inverses and Transposes 61 (a) Find a nonzero solution x to Ax = 0. The matrix is 3 by 3. (b) Elimination keeps column 1 + column 2 = column 3. Explain why there is no third pivot. 27. Suppose A is invertible and you exchange its first two rows to reach B. Is the new matrix B invertible? How would you find B−1 from A−1 ? 28. If the product M = ABC of three square matrices is invertible, then A, B, C are invertible. Find a formula for B−1 that involves M −1 and A and C. 29. Prove that a matrix with a column of zeros cannot have an inverse. d −b ]. What is the inverse of each matrix if ad 6= bc? 30. Multiply [ ac db ] times [ −c a 31. (a) What matrix E has the same effect as these three steps? Subtract row 1 from row 2, subtract row 1 from row 3, then subtract row 2 from row 3. (b) What single matrix L has the same effect as these three reverse steps? Add row 2 to row 3, add row 1 to row 3, then add row 1 to row 2. 32. Find the numbers a and b that give the inverse of 5 ∗ eye(4) − ones(4,4):  −1   4 −1 −1 −1 a b b b −1 4 −1 −1 b a b b       = . −1 −1 4 −1 b b a b −1 −1 −1 4 b b b a What are a and b in the inverse of 6 ∗ eye(5) − ones(5,5)? 33. 
Show that A = 4 ∗ eye(4) − ones(4,4) is not invertible: Multiply A ∗ ones(4,1). 34. There are sixteen 2 by 2 matrices whose entries are 1s and 0s. How many of them are invertible? Problems 35–39 are about the Gauss-Jordan method for calculating A−1 . 35. Change I into A−1 as you reduce A to I (by row operations): # # " " h i h i 1 4 1 0 1 3 1 0 . and A I = A I = 3 9 0 1 2 7 0 1 36. Follow the 3 by 3 text example but with plus signs in A. Eliminate above and below the pivots to reduce [A I] to [I A−1 ]:   2 1 0 1 0 0 h i   A I = 1 2 1 0 1 0 . 0 1 2 0 0 1 62 Chapter 1 Matrices and Gaussian Elimination 37. Use Gauss-Jordan elimination on [A I] to solve AA−1 = I:     1 a b h 1 0 0 i     0 1 c x1 x2 x3 = 0 1 0 . 0 0 1 0 0 1 38. Invert these matrices A by the Gauss-Jordan method starting with [A I]:     1 0 0 1 1 1     A = 2 1 3 and A = 1 2 2 . 0 0 1 1 2 3 39. Exchange rows and continue with Gauss-Jordan to find A−1 : " # h i 0 2 1 0 . A I = 2 2 0 1 40. True or false (with a counterexample if false and a reason if true): (a) A 4 by 4 matrix with a row of zeros is not invertible. (b) A matrix with Is down the main diagonal is invertible. (c) If A is invertible then A−1 is invertible. (d) If AT is invertible then A is invertible. 41. For which three numbers c is this matrix not invertible, and why not?   2 c c   A =  c c c . 8 7 c 42. Prove that A is invertible if a 6= 0 and a 6= b (find the pivots and A−1 ):   a b b   A = a a b . a a a 43. This matrix has a remarkable inverse. Find A−1 by elimination on [A I]. Extend to a 5 by 5 “alternating matrix” and guess its inverse:   1 −1 1 −1 0 1 −1 1    A= . 0 0 1 −1 0 0 0 1 1.6 Inverses and Transposes 63 44. If B has the columns of A in reverse order, solve (A − B)x = 0 to show that A − B is not invertible. An example will lead you to x. 45. Find and check the inverses (assuming they exist) of these block matrices: " # " # " # I 0 A 0 0 I . C I C D I D 46. Use inv(S) to invert MATLAB’s 4 by 4 symmetric matrix S = pascal(4). Create Pascal’s lower triangular A = abs(pascal(4,1)) and test inv(S) = inv(A’) ∗ inv(A). 47. If A = ones(4,4) and b = rand(4,1), how does MATLAB tell you that Ax = b has no solution? If b = ones(4,1), which solution to Ax = b is found by A\b? 48. M −1 shows the change in A−1 (useful to know) when a matrix is subtracted from A. Check part 3 by carefully multiplying MM −1 to get I: 1. 2. 3. 4. M = I − uvT M = A − uvT M = I −UV M = A −UW −1V and and and and M −1 = I + uvT /(1 − vT u). M −1 = A−1 + A−1 uvT A−1 /(1 − vT A−1 u). M −1 = In +U(Im −VU)−1V . M −1 = A−1 + A−1U(W −VA−1U)−1VA−1 . The four identities come from the 1, 1 block when inverting these matrices: " # " # " # " # I u A u In U A U . T T v 1 v 1 V Im V W Problems 49–55 are about the rules for transpose matrices. 49. Find AT and A−1 and (A−1 )T and (AT )−1 for " # 1 0 A= and also 9 3 " # 1 c A= . c 0 50. Verify that (AB)T equals BT AT but those are different from AT BT : " # " # " # 1 0 1 3 1 3 A= B= AB = . 2 1 0 1 2 7 In case AB = BA (not generally true!), how do you prove that BT AT = AT BT ? ¢T ¡ 51. (a) The matrix (AB)−1 comes from (A−1 )T and (B−1 )T . In what order? (b) If U is upper triangular then (U −1 )T is triangular. 52. Show that A2 = 0 is possible but AT A = 0 is not possible (unless A = zero matrix). 64 Chapter 1 Matrices and Gaussian Elimination 53. (a) The row vector xT times A times the column y produces what number?   " # 0 h i 1 2 3   xT Ay = 0 1 . 
1 = 4 5 6 0 (b) This is the row xT A = times the column y = (0, 1, 0). (c) This is the row xT = [0 1] times the column Ay = . 54. When you transpose a block matrix M = [ CA DB ] the result is M T = Under what conditions on A, B, C, D is the block matrix symmetric? . Test it. 55. Explain why the inner product of x and y equals the inner product of Px and Py. Then (Px)T (Py) = xT y says that PT P = I for any permutation. With x = (1, 2, 3) and y = (1, 4, 2), choose P to show that (Px)T y is not always equal to xT (PT y). Problems 56–60 are about symmetric matrices and their factorizations. 56. If A = AT and B = BT , which of these matrices are certainly symmetric? (a) A2 − B2 (b) (A + B)(A − B) (c) ABA (d) ABAB. 57. If A = AT needs a row exchange, then it also needs a column exchange to stay symmetric. In matrix language, PA loses the symmetry of A but recovers the symmetry. 58. (a) How many entries of A can be chosen independently, if A = AT is 5 by 5? (b) How do L and D (5 by 5) give the same number of choices in LDLT ? 59. Suppose R is rectangular (m by n) and A is symmetric (m by m). (a) Transpose RT AR to show its symmetry. What shape is this matrix? (b) Show why RT R has no negative numbers on its diagonal. 60. Factor these symmetric matrices into A = LDLT . The matrix D is diagonal:   " # " # 2 −1 0 1 3 1 b   A= and A = and A = −1 2 −1 . 3 2 b c 0 −1 2 The next three problems are about applications of (Ax)T y = xT (AT y). 61. Wires go between Boston, Chicago, and Seattle. Those cities are at voltages xB , xC , xS . With unit resistances between cities, the three currents are in y:      xB 1 −1 0 yBC      y = Ax is  yCS  = 0 1 −1 xC  . 1 0 −1 xS yBS 1.6 Inverses and Transposes 65 (a) Find the total currents AT y out of the three cities. (b) Verify that (Ax)T y agrees with xT (AT y)—six terms in both. 62. Producing x1 trucks and x2 planes requires x1 + 50x2 tons of steel, 40x1 + 1000x2 pounds of rubber, and 2x1 + 50x2 months of labor. If the unit costs y1 , y2 , y3 are $700 per ton, $3 per pound, and $3000 per month, what are the values of one truck and one plane? Those are the components of AT y. 63. Ax gives the amounts of steel, rubber, and labor to produce x in Problem 62. Find A. Then (Ax)T y is the of inputs while xT (AT y) is the value of . 64. Here is a new factorization of A into triangular times symmetric: Start from A = LDU. Then A equals L(U T )−1 times U T DU. Why is L(U T )−1 triangular? Its diagonal is all 1s. Why is U T DU symmetric? 65. A group of matrices includes AB and A−1 if it includes A and B. “Products and inverses stay in the group.” Which of these sets are groups? Lower triangularmatrices L with is on the diagonal, symmetric matrices S, positive matrices M, diagonal invertible matrices D, permutation matrices P. Invent two more matrix groups. 66. If every row of a 4 by 4 matrix contains the numbers 0, 1, 2, 3 in some order, can the matrix be symmetric? Can it be invertible? 67. Prove that no reordering of rows and reordering of columns can transpose a typical matrix. 68. A square northwest matrix B is zero in the southeast corner, below the antidiagonal that connects (1, n) to (n, 1). Will BT and B2 be northwest matrices? Will B−1 be northwest or southeast? What is the shape of BC = northwest times southeast? You are allowed to combine permutations with the usual L and U (southwest and northeast). 69. Compare tic; inv(A); toc for A = rand(500) and A = rand(1000). 
The n3 count says that computing time (measured by tic; toc) should multiply by 8 when n is doubled. Do you expect these random A to be invertible?

70. I = eye(1000); A = rand(1000); B = triu(A); produces a random triangular matrix B. Compare the times for inv(B) and B\I. Backslash is engineered to use the zeros in B, while inv uses the zeros in I when reducing [B I] by Gauss-Jordan. (Compare also with inv(A) and A\I for the full matrix A.)

71. Show that L−1 has entries j/i for j ≤ i (the −1, 2, −1 matrix has this L):

    L = [1 0 0 0; −1/2 1 0 0; 0 −2/3 1 0; 0 0 −3/4 1]   and   L−1 = [1 0 0 0; 1/2 1 0 0; 1/3 2/3 1 0; 1/4 2/4 3/4 1].

    Test this pattern for L = eye(5) − diag(1:5)\diag(1:4,−1) and inv(L).

1.7 Special Matrices and Applications

This section has two goals. The first is to explain one way in which large linear systems Ax = b can arise in practice. The truth is that a large and completely realistic problem in engineering or economics would lead us far afield. But there is one natural and important application that does not require a lot of preparation. The other goal is to illustrate, by this same application, the special properties that coefficient matrices freq