
Transcript of Fraleigh = Linear Algebra.pdf

  • 1.1 Vectors in Euclidean Space 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.1. Vectors in Euclidean Spaces

    Definition. The space Rn, or Euclidean n-space, is either (1) the

    collection of all n-tuples of the form (x1, x2, . . . , xn) where the xis are

    real numbers (the n-tuples are called points), or (2) the collection of all

    n-tuples of the form [x1, x2, . . . , xn] where the xis are real numbers (the

    n-tuples are called vectors).

    Note. There is as yet no difference between points and vectors.

    Note. R1 is just the collection of real numbers (which we know to have an

    algebraic structure: addition and subtraction, say). R2 is the collection

    of all points in the Cartesian plane.

    Notation. The book denotes vectors with bold faced letters. We use

    letters (usually lower case) with little arrows over them:

    x = [x1, x2, . . . , xn].

  • 1.1 Vectors in Euclidean Space 2

    Definition. For x ∈ Rn, say x = [x1, x2, . . . , xn], the ith component of x is xi.

    Definition. Two vectors in Rn, v = [v1, v2, . . . , vn] and w = [w1, w2, . . . , wn],

    are equal if their corresponding components are equal. The zero vector, 0, in Rn

    is the vector of all zero components.

    Note. We have the following geometric interpretation of vectors: A

    vector v ∈ R2 can be drawn in standard position in the Cartesian plane by drawing an arrow from the point (0, 0) to the point (v1, v2) where

    v = [v1, v2]:

  • 1.1 Vectors in Euclidean Space 3

    We can draw v translated to point P as follows:

    Notice that both of these are representations of the same vector v.

    Note. In physics, forces are represented by arrows (or vectors) and if

    two forces F1 and F2 are applied to an object, the resulting force F1 + F2

    satisfies a parallelogram property:

    Figure 1.1.5, page 5

  • 1.1 Vectors in Euclidean Space 4

    You can also talk about scaling a force by a constant c (we call these

    constants scalars as opposed to vectors and points):

    This inspires us to make the following definitions.

    Definition 1.1. Let v = [v1, v2, . . . , vn] and w = [w1, w2, . . . , wn] be

    vectors in Rn and let r ∈ R be a scalar. Define:
    1. Vector addition: v + w = [v1 + w1, v2 + w2, . . . , vn + wn],
    2. Vector subtraction: v − w = [v1 − w1, v2 − w2, . . . , vn − wn], and
    3. Scalar multiplication: rv = [rv1, rv2, . . . , rvn].
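    These componentwise definitions translate directly into code; here is a small illustrative sketch in plain Python (not part of the original notes):

    # Componentwise vector operations in R^n (Definition 1.1), illustrative sketch
    def vec_add(v, w):
        return [vi + wi for vi, wi in zip(v, w)]

    def vec_sub(v, w):
        return [vi - wi for vi, wi in zip(v, w)]

    def scalar_mult(r, v):
        return [r * vi for vi in v]

    v, w = [1, 2, 3], [4, 5, 6]
    print(vec_add(v, w))      # [5, 7, 9]
    print(vec_sub(v, w))      # [-3, -3, -3]
    print(scalar_mult(2, v))  # [2, 4, 6]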

    Example. Page 16 numbers 10 and 14.

  • 1.1 Vectors in Euclidean Space 5

    Theorem 1.1. Properties of Vector Algebra in Rn.

    Let u, v, w ∈ Rn and let r, s be scalars in R. Then
    A1. Associativity of Vector Addition. (u + v) + w = u + (v + w)
    A2. Commutativity of Vector Addition. v + w = w + v
    A3. Additive Identity. 0 + v = v
    A4. Additive Inverses. v + (−v) = 0
    S1. Distribution of Scalar Multiplication over Vector Addition.

    r(v + w) = rv + r w

    S2. Distribution of Scalar Addition over Scalar Multiplication.

    (r + s)v = rv + sv

    S3. Associativity. r(sv) = (rs)v

    S4. Preservation of Scale. 1v = v

    Example. Page 17 number 40a (prove A1).

    Definition 1.2. Two nonzero vectors v, w ∈ Rn are parallel, denoted v ∥ w, if one is a scalar multiple of the other. If v = rw with r > 0, then v and w have the same direction, and if v = rw with r < 0 then v and

    w have opposite directions.

  • 1.1 Vectors in Euclidean Space 6

    Example. Page 16 number 22.

    Definition 1.3. Given vectors v1, v2, . . . , vk ∈ Rn and scalars r1, r2, . . . , rk ∈ R, the vector

    r1v1 + r2v2 + · · · + rkvk = Σ_{l=1}^{k} rl vl

    is a linear combination of the given vectors with the given scalars as

    scalar coefficients.

    Note. Sometimes there are special vectors for which it is easy to express

    a vector in terms of a linear combination of these special vectors.

    Definition. The standard basis vectors in R2 are i = [1, 0] and j =

    [0, 1]. The standard basis vectors in R3 are

    i = [1, 0, 0], j = [0, 1, 0], and k = [0, 0, 1].

    Note. It's easy to write a vector in terms of the standard basis vectors:

    b = [b1, b2] = b1[1, 0] + b2[0, 1] = b1i + b2j and

    b = [b1, b2, b3] = b1[1, 0, 0] + b2[0, 1, 0] + b3[0, 0, 1] = b1i + b2j + b3k.

  • 1.1 Vectors in Euclidean Space 7

    Definition. In Rn, the rth standard basis vector, denoted er, is

    er = [0, 0, . . . , 0, 1, 0, . . . , 0],

    where the rth component is 1 and all other components are 0.

    Notice. A vector b ∈ Rn can be uniquely expressed in terms of the standard basis vectors:

    b = [b1, b2, . . . , bn] = b1e1 + b2e2 + · · · + bnen = Σ_{l=1}^{n} bl el.

    Definition. If v ∈ Rn is a nonzero vector, then the line along v is the collection of all vectors of the form rv for some scalar r ∈ R (notice 0 is on all such lines). For two nonzero nonparallel vectors v, w ∈ Rn, the collection of all possible linear combinations of these vectors, rv + sw

    where r, s ∈ R, is the plane spanned by v and w.

    Definition. A column vector in Rn is a representation of a vector as

    x =
    [ x1 ]
    [ x2 ]
    [ ⋮  ]
    [ xn ].

  • 1.1 Vectors in Euclidean Space 8

    A row vector in Rn is a representation of a vector as

    x = [x1, x2, . . . , xn].

    The transpose of a row vector, denoted xT, is a column vector, and conversely:

    [ x1 ]
    [ x2 ]  T
    [ ⋮  ]      =  [x1, x2, . . . , xn],

    and

    [x1, x2, . . . , xn]T  =
    [ x1 ]
    [ x2 ]
    [ ⋮  ]
    [ xn ].

    Note. A linear combination of column vectors can easily be translated

    into a system of linear equations:

    r [ 1 ]  +  s [ −2 ]  =  [  1 ]
      [ 3 ]       [  5 ]     [ 19 ]

    r − 2s = 1
    3r + 5s = 19.
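    As an illustration (not from the notes), the vector equation above can be checked numerically by solving the corresponding linear system; numpy is assumed to be available:

    import numpy as np

    # Columns of A are the two column vectors; b is the right-hand side.
    A = np.array([[1.0, -2.0],
                  [3.0,  5.0]])
    b = np.array([1.0, 19.0])

    # Solve A [r, s]^T = b, i.e. r - 2s = 1 and 3r + 5s = 19.
    r, s = np.linalg.solve(A, b)
    print(r, s)
    # The solution satisfies r*[1,3] + s*[-2,5] = [1,19].
    print(np.allclose(r * A[:, 0] + s * A[:, 1], b))  # True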

    Definition 1.4. Let v1, v2, . . . , vk ∈ Rn. The span of these vectors is the set of all linear combinations of them, denoted sp(v1, v2, . . . , vk):

    sp(v1, v2, . . . , vk) = {r1v1 + r2v2 + · · · + rkvk | r1, r2, . . . , rk ∈ R}
                           = { Σ_{l=1}^{k} rl vl | r1, r2, . . . , rk ∈ R }.

    Example. Page 16 number 28.

  • 1.2 The Norm and Dot Product 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.2. The Norm and Dot Product

    Definition 1.5. Let v = [v1, v2, . . . , vn] ∈ Rn. The norm or magnitude of v is

    ‖v‖ = √(v1² + v2² + · · · + vn²) = √( Σ_{l=1}^{n} vl² ).
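    A quick numerical illustration of the norm and of the scaling property ‖rv‖ = |r|‖v‖ stated in Theorem 1.2 below (a sketch, assuming numpy is available):

    import numpy as np

    v = np.array([3.0, 4.0])
    norm_v = np.sqrt(np.sum(v**2))                 # sqrt(v1^2 + ... + vn^2)
    print(norm_v)                                  # 5.0
    print(np.isclose(norm_v, np.linalg.norm(v)))   # True, matches the built-in norm

    r = -2.0
    print(np.isclose(np.linalg.norm(r * v), abs(r) * norm_v))  # True: ||rv|| = |r| ||v||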

    Theorem 1.2. Properties of the Norm in Rn.

    For all v, w ∈ Rn and for all scalars r ∈ R, we have:
    1. ‖v‖ ≥ 0, and ‖v‖ = 0 if and only if v = 0.
    2. ‖rv‖ = |r| ‖v‖.
    3. ‖v + w‖ ≤ ‖v‖ + ‖w‖ (the Triangle Inequality).

    Note. 1 and 2 are easy to see and we will prove 3 later in this section.

  • 1.2 The Norm and Dot Product 2

    Note. A picture for the Triangle Inequality is:

    1.2.22, page 22

    Definition. A vector with norm 1 is called a unit vector. When writing,

    unit vectors are frequently denoted with a hat: î.

    Example. Page 31 number 8.

    Definition 1.6. The dot product for v = [v1, v2, . . . , vn] and w =

    [w1, w2, . . . , wn] is

    v · w = v1w1 + v2w2 + · · · + vnwn = Σ_{l=1}^{n} vl wl.

  • 1.2 The Norm and Dot Product 3

    Notice. If we let θ be the angle between nonzero vectors v and w, then
    we get by the Law of Cosines:

    Figure 1.2.24, page 23

    ‖v‖² + ‖w‖² = ‖v − w‖² + 2‖v‖‖w‖ cos θ

    or

    (v1² + v2² + · · · + vn²) + (w1² + w2² + · · · + wn²)
        = (v1 − w1)² + (v2 − w2)² + · · · + (vn − wn)² + 2‖v‖‖w‖ cos θ

    or

    2v1w1 + 2v2w2 + · · · + 2vnwn = 2‖v‖‖w‖ cos θ

    or

    2 v · w = 2‖v‖‖w‖ cos θ

  • 1.2 The Norm and Dot Product 4

    or

    cos θ = (v · w) / (‖v‖‖w‖).   (∗)

    Definition. The angle θ between nonzero vectors v and w is

    θ = arccos( (v · w) / (‖v‖‖w‖) ).
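    Formula (∗) translates directly into a computation of the angle; an illustrative sketch (assuming numpy):

    import numpy as np

    v = np.array([1.0, 0.0])
    w = np.array([1.0, 1.0])

    cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    theta = np.arccos(cos_theta)
    print(np.degrees(theta))  # 45.0, the angle between [1, 0] and [1, 1]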

    Theorem 1.4. Schwarz's Inequality.

    Let v, w ∈ Rn. Then |v · w| ≤ ‖v‖‖w‖.

    Proof. This follows from (∗) and the fact that −1 ≤ cos θ ≤ 1. The book gives an algebraic proof. QED

    Example. Page 31 number 12.

  • 1.2 The Norm and Dot Product 5

    Theorem 1.3. Properties of Dot Products.

    Let u, v, w ∈ Rn and let r ∈ R be a scalar. Then
    D1. Commutativity of ·: v · w = w · v.
    D2. Distribution of · over Vector Addition: u · (v + w) = u · v + u · w.
    D3. r(v · w) = (rv) · w = v · (rw).
    D4. v · v ≥ 0, and v · v = 0 if and only if v = 0.

    Example. Page 33 number 42b (Prove D2).

    Note. ‖v‖² = v · v.

    Definition. Two vectors v, w ∈ Rn are perpendicular or orthogonal, denoted v ⊥ w, if v · w = 0.

    Example. Page 31 numbers 14 and 16.

  • 1.2 The Norm and Dot Product 6

    Theorem 1.5. The Triangle Inequality.

    Let v, w ∈ Rn. Then ‖v + w‖ ≤ ‖v‖ + ‖w‖.

    Proof.

    ‖v + w‖² = (v + w) · (v + w)
             = v · v + 2 v · w + w · w
             ≤ ‖v‖² + 2‖v‖‖w‖ + ‖w‖²   by the Schwarz Inequality
             = (‖v‖ + ‖w‖)².
    Taking square roots of both sides gives the result.

    QED

    Note. It is common in physics to represent velocities and forces with

    vectors.

    Example. Page 31 number 36.

  • 1.3 Matrices and Their Algebra 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.3. Matrices and Their Algebra

    Definition. A matrix is a rectangular array of numbers. An m × n matrix is a matrix with m rows and n columns:

    A = [aij] =
    [ a11 a12 · · · a1n ]
    [ a21 a22 · · · a2n ]
    [  ⋮   ⋮   ⋱    ⋮  ]
    [ am1 am2 · · · amn ].

    Definition 1.8. Let A = [aik] be an m × n matrix and let B = [bkj] be an n × s matrix. The matrix product AB is the m × s matrix C = [cij] where cij is the dot product of the ith row vector of A and the jth column vector of B:

    cij = Σ_{k=1}^{n} aik bkj.
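    The entry-by-entry description of the matrix product can be spelled out directly; a sketch (not from the notes, assuming numpy):

    import numpy as np

    def matrix_product(A, B):
        """c_ij = dot product of the ith row of A with the jth column of B."""
        m, n = A.shape
        n2, s = B.shape
        assert n == n2, "inner dimensions must agree"
        C = np.zeros((m, s))
        for i in range(m):
            for j in range(s):
                C[i, j] = np.dot(A[i, :], B[:, j])
        return C

    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3 x 2
    B = np.array([[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]])  # 2 x 3
    print(np.allclose(matrix_product(A, B), A @ B))      # True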

  • 1.3 Matrices and Their Algebra 2

    Note. We can draw a picture of this process as:

    Example. Page 46 number 16.

    Definition. The main diagonal of an n × n matrix is the set {a11, a22, . . . , ann}. A square matrix which has zeros off the main diagonal is a diagonal matrix. We denote the n × n diagonal matrix with all diagonal entries 1 as I:

    I =
    [ 1 0 0 · · · 0 ]
    [ 0 1 0 · · · 0 ]
    [ 0 0 1 · · · 0 ]
    [ ⋮ ⋮ ⋮  ⋱   ⋮ ]
    [ 0 0 0 · · · 1 ].

  • 1.3 Matrices and Their Algebra 3

    Definition 1.9/1.10. Let A = [aij] and B = [bij] be m × n matrices. The sum A + B is the m × n matrix C = [cij] where cij = aij + bij. Let r be a scalar. Then rA is the matrix D = [dij] where dij = raij.

    Example. Page 46 number 6.

    Definition 1.11. Matrix B is the transpose of A, denoted B = AT , if

    bij = aji. If A is a matrix such that A = AT then A is symmetric.

    Example. Page 47 number 39. If A is square, then A+AT is symmetric.

    Proof. Let A = [aij]; then AT = [aji]. Let C = [cij] = A + AT = [aij] + [aji] = [aij + aji]. Notice cij = aij + aji and cji = aji + aij, so cij = cji and therefore C = A + AT is symmetric. QED

    Note. Properties of Matrix Algebra.

    Let A, B, C be m × n matrices and r, s scalars. Then
    Commutative Law of Addition: A + B = B + A
    Associative Law of Addition: (A + B) + C = A + (B + C)
    Additive Identity: A + 0 = 0 + A = A (here 0 represents the m × n matrix of all zeros)

  • 1.3 Matrices and Their Algebra 4

    Left Distribution Law: r(A + B) = rA + rB
    Right Distribution Law: (r + s)A = rA + sA
    Associative Law of Scalar Multiplication: (rs)A = r(sA)
    Scalars Pull Through: (rA)B = A(rB) = r(AB)
    Associativity of Matrix Multiplication: A(BC) = (AB)C
    Matrix Multiplicative Identity: IA = A = AI
    Distributive Laws of Matrix Multiplication: A(B + C) = AB + AC and
    (A + B)C = AC + BC.

    Example. Show that IA = AI = A for

    A =
    [ 1 2 3 ]
    [ 4 5 6 ]
    [ 7 8 9 ]

    and I the 3 × 3 identity matrix.

    Note. Properties of the Transpose Operator.

    (AT)T = A,   (A + B)T = AT + BT,   (AB)T = BTAT.

  • 1.3 Matrices and Their Algebra 5

    Example. Page 47 number 32. Prove (AB)T = BTAT .

    Proof. Let C = [cij] = (AB)T. The (i, j)-entry of AB is Σ_{k=1}^{n} aik bkj, so cij = Σ_{k=1}^{n} ajk bki. Let BT = [bij]T = [(BT)ij] = [bji] and AT = [aij]T = [(AT)ij] = [aji]. Then the (i, j)-entry of BTAT is

    Σ_{k=1}^{n} (BT)ik (AT)kj = Σ_{k=1}^{n} bki ajk = Σ_{k=1}^{n} ajk bki = cij

    and therefore C = (AB)T = BTAT. QED

  • 1.4 Solving Systems of Linear Equations 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.4. Solving Systems of Linear Equations

    Definition. A system of m linear equations in the n unknowns x1, x2, . . . , xn is a system of the form:

    a11x1 + a12x2 + · · · + a1nxn = b1
    a21x1 + a22x2 + · · · + a2nxn = b2
        ⋮
    am1x1 + am2x2 + · · · + amnxn = bm.

    Note. The above system can be written as Ax = b where A is the

    coefficient matrix and x is the vector of variables. A solution to the

    system is a vector s such that As = b.

    Definition. The augmented matrix for the above system is

    [A | b] =
    [ a11 a12 · · · a1n | b1 ]
    [ a21 a22 · · · a2n | b2 ]
    [  ⋮   ⋮        ⋮   |  ⋮ ]
    [ am1 am2 · · · amn | bm ].

  • 1.4 Solving Systems of Linear Equations 2

    Note. We will perform certain operations on the augmented matrix which

    correspond to the following manipulations of the system of equations:

    1. interchange two equations,

    2. multiply an equation by a nonzero constant,

    3. replace an equation by the sum of itself and a multiple of another

    equation.

    Definition. The following are elementary row operations:

    1. interchanging row i and row j (denoted Ri ↔ Rj),
    2. multiplying the ith row by a nonzero scalar s (denoted Ri → sRi), and
    3. adding s times the jth row to the ith row (denoted Ri → Ri + sRj).
    If matrix A can be obtained from matrix B by a series of elementary row

    operations, then A is row equivalent to B, denoted A ∼ B.

    Notice. These operations correspond to the above manipulations of the

    equations and so:

  • 1.4 Solving Systems of Linear Equations 3

    Theorem 1.6. Invariance of Solution Sets Under Row Equiv-

    alence.

    If [A | b] ∼ [H | c], then the linear systems Ax = b and Hx = c have the same solution sets.

    Definition 1.12. A matrix is in row-echelon form if

    (1) all rows containing only zeros appear below rows with nonzero entries,

    and

    (2) the first nonzero entry in any row appears in a column to the right of

    the first nonzero entry in any preceding row.

    For such a matrix, the first nonzero entry in a row is the pivot for that

    row.

    Example. Which of the following is in row-echelon form?

    [ 1 2 3 ]     [ 1 2 3 ]     [ 2 4 0 ]
    [ 0 4 5 ]     [ 0 4 5 ]     [ 1 3 2 ]
    [ 0 0 6 ]     [ 6 0 0 ]     [ 0 0 0 ]

    Note. If an augmented matrix is in row-echelon form, we can use the

    method of back substitution to find solutions.

  • 1.4 Solving Systems of Linear Equations 4

    Example. Consider the system

    x1 + 3x2 − x3 = 4
         x2 − x3 = 1
              x3 = 3.

    Definition 1.13. A linear system having no solution is inconsistent. If

    it has one or more solutions, it is consistent.

    Example. Is this system consistent or inconsistent:

    2x1 + x2 − x3 = 1
    x1 − x2 + 3x3 = 1
    3x1 + 2x3 = 3?

    Example. Is this system consistent or inconsistent:

    2x1 + x2 − x3 = 1
    x1 − x2 + 3x3 = 1
    3x1 + 2x3 = 2?

    (HINT: This system has multiple solutions. Express the solutions in terms

    of an unknown parameter r).

  • 1.4 Solving Systems of Linear Equations 5

    Note. In the above example, r is a free variable and the general

    solution is in terms of this free variable.

    Note. Reducing a Matrix to Row-Echelon Form.

    (1) If the first column is all zeros, mentally cross it off. Repeat this

    process as necessary.

    (2a) Use row interchange if necessary to get a nonzero entry (pivot) p in

    the top row of the remaining matrix.

    (2b) For each row R below the row containing this entry p, add −r/p times the row containing p to R, where r is the entry of row R in the

    column which contains pivot p. (This gives all zero entries below pivot p.)

    (3) Mentally cross off the first row and first column to create a smaller

    matrix. Repeat the process (1) - (3) until either no rows or no columns

    remain.
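    A compact sketch of steps (1)-(3) above (illustrative only, assuming numpy; the example matrix is made up):

    import numpy as np

    def row_echelon(M):
        """Reduce a copy of M to row-echelon form using the steps above."""
        A = M.astype(float)
        rows, cols = A.shape
        pivot_row = 0
        for col in range(cols):
            if pivot_row >= rows:
                break
            # (2a) find a nonzero entry in this column at or below pivot_row
            nonzero = np.nonzero(A[pivot_row:, col])[0]
            if nonzero.size == 0:
                continue                                  # (1) column of zeros: move on
            swap = pivot_row + nonzero[0]
            A[[pivot_row, swap]] = A[[swap, pivot_row]]   # row interchange
            p = A[pivot_row, col]
            # (2b) add -r/p times the pivot row to each lower row
            for r_idx in range(pivot_row + 1, rows):
                A[r_idx] -= (A[r_idx, col] / p) * A[pivot_row]
            pivot_row += 1                                # (3) move to the smaller matrix
        return A

    A = np.array([[0, 2, 4], [1, 3, 5], [2, 8, 14]])
    print(row_echelon(A))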

    Example. Page 68 number 2.

    Example. Page 69 number 16. (Put the associated augmented matrix

    in row-echelon form and then use substitution.)

  • 1.4 Solving Systems of Linear Equations 6

    Note. The above method is called Gauss reduction with back substitu-

    tion.

    Note. The system Ax = b is equivalent to the system

    x1a1 + x2a2 + · · · + xnan = b

    where ai is the ith column vector of A. Therefore, Ax = b is consistent if

    and only if b is in the span of a1, a2, . . . , an (the columns of A).

    Definition. A matrix is in reduced row-echelon form if all the pivots

    are 1 and all entries above or below pivots are 0.

    Example. Page 69 number 16 (again).

    Note. The above method is the Gauss-Jordan method.

    Theorem 1.7. Solutions of Ax = b.

    Let Ax = b be a linear system and let [A | b] ∼ [H | c] where H is in row-echelon form.

    (1) The system Ax = b is inconsistent if and only if [H | c] has a row

  • 1.4 Solving Systems of Linear Equations 7

    with all entries equal to 0 to the left of the partition and a nonzero entry

    to the right of the partition.

    (2) If Ax = b is consistent and every column of H contains a pivot, the

    system has a unique solution.

    (3) If Ax = b is consistent and some column of H has no pivot, the

    system has infinitely many solutions, with as many free variables as there

    are pivot-free columns of H.

    Definition 1.14. A matrix that can be obtained from an identity matrix

    by means of one elementary row operation is an elementary matrix.

    Theorem 1.8. Let A be an m × n matrix and let E be an m × m elementary matrix. Multiplication of A on the left by E effects the

    same elementary row operation on A that was performed on the identity

    matrix to obtain E.

    Proof for Row-Interchange. (This is page 71 number 52.) Suppose

    E results from interchanging rows i and j:

    I is transformed into E by the row interchange Ri ↔ Rj.

    Then the kth row of E is [0, 0, . . . , 0, 1, 0, . . . , 0] where

  • 1.4 Solving Systems of Linear Equations 8

    (1) for k ∉ {i, j} the nonzero entry is the kth entry,
    (2) for k = i the nonzero entry is the jth entry, and

    (3) for k = j the nonzero entry is the ith entry.

    Let A = [aij], E = [eij], and B = [bij] = EA. The kth row of B is
    [bk1, bk2, . . . , bkn] and

    bkl = Σ_{p=1}^{n} ekp apl.

    Now if k ∉ {i, j} then all ekp are 0 except for p = k and

    bkl = Σ_{p=1}^{n} ekp apl = ekk akl = (1)akl = akl.

    Therefore for k ∉ {i, j}, the kth row of B is the same as the kth row of A. If k = i then all ekp are 0 except for p = j and

    bkl = bil = Σ_{p=1}^{n} ekp apl = ekj ajl = (1)ajl = ajl

    and the ith row of B is the same as the jth row of A. Similarly, if k = j then all ekp are 0 except for p = i and

    bkl = bjl = Σ_{p=1}^{n} ekp apl = eki ail = (1)ail = ail

    and the jth row of B is the same as the ith row of A. Therefore B = EA is the matrix obtained from A by the row interchange Ri ↔ Rj.

    QED

  • 1.4 Solving Systems of Linear Equations 9

    Example. Multiply some 3 × 3 matrix A on the left by

    E =
    [ 0 1 0 ]
    [ 1 0 0 ]
    [ 0 0 1 ]

    to swap Row 1 and Row 2.

    Note. If A is row equivalent to B, then we can find C such that CA = B

    and C is a product of elementary matrices.

    Example. Page 70 number 44.

  • 1.5 Inverses of Matrices, and Linear Systems 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.5. Inverses of Square Matrices

    Definition 1.15. An n × n matrix A is invertible if there exists an n × n matrix C such that AC = CA = I. If A is not invertible, it is singular.

    Theorem 1.9. Uniqueness of an Inverse Matrix.

    An invertible matrix has a unique inverse (which we denote A⁻¹).

    Proof. Suppose C and D are both inverses of A. Then (DA)C = IC = C and D(AC) = DI = D. But (DA)C = D(AC) (associativity), so C = D. QED

    Example. It is easy to invert an elementary matrix. For example, sup-

    pose E1 interchanges the first and third row and suppose E2 multiplies

    row 2 by 7. Find the inverses of E1 and E2.

  • 1.5 Inverses of Matrices, and Linear Systems 2

    Theorem 1.10. Inverses of Products.

    Let A and B be invertible n × n matrices. Then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹.

    Proof. By associativity and the assumption that A⁻¹ and B⁻¹ exist, we have:

    (AB)(B⁻¹A⁻¹) = [A(BB⁻¹)]A⁻¹ = (AI)A⁻¹ = AA⁻¹ = I.

    We can similarly show that (B⁻¹A⁻¹)(AB) = I. Therefore AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹. QED

    Lemma 1.1. Condition for Ax = b to be Solvable for b.

    Let A be an n × n matrix. The linear system Ax = b has a solution for every choice of column vector b ∈ Rn if and only if A is row equivalent to the n × n identity matrix I.

  • 1.5 Inverses of Matrices, and Linear Systems 3

    Theorem 1.11. Commutativity Property.

    Let A and C be n × n matrices. Then CA = I if and only if AC = I.

    Proof. Suppose that AC = I. Then the equation Ax = b has a solution for every column vector b ∈ Rn. Notice that x = Cb is a solution because

    A(Cb) = (AC)b = Ib = b.

    By Lemma 1.1, we know that A is row equivalent to the n × n identity matrix I, and so there exists a sequence of elementary matrices E1, E2, . . . , Et such that (Et · · · E2E1)A = I. By Theorem 1.9, the two equations

    (Et · · · E2E1)A = I and AC = I

    imply that Et · · · E2E1 = C, and so we have CA = I. The other half of the proof follows by interchanging the roles of A and C. QED

    Note. Computation of Inverses.

    If A = [aij], then finding A⁻¹ = [xij] amounts to solving for the xij in:

    [ a11 a12 · · · a1n ] [ x11 x12 · · · x1n ]
    [ a21 a22 · · · a2n ] [ x21 x22 · · · x2n ]
    [  ⋮   ⋮   ⋱    ⋮  ] [  ⋮   ⋮   ⋱    ⋮  ]
    [ an1 an2 · · · ann ] [ xn1 xn2 · · · xnn ]  =  I.

  • 1.5 Inverses of Matrices, and Linear Systems 4

    If we treat this as n systems of n equations in n unknowns, then the

    augmented matrix for these n systems is [A | I]. So to compute A⁻¹:
    (1) Form [A | I].
    (2) Apply the Gauss-Jordan method to produce the row equivalent [I | C].
    If A⁻¹ exists, then A⁻¹ = C.

    Note. In the above computations, C is just the product of the elementary

    matrices that make up A⁻¹.
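    An illustrative sketch of the [A | I] procedure using sympy's exact row reduction (the library choice is an assumption; the matrix is the one from the page 84 example below):

    from sympy import Matrix, eye

    A = Matrix([[3, 6],
                [3, 8]])
    augmented = A.row_join(eye(2))       # form [A | I]
    rref_matrix, _ = augmented.rref()    # Gauss-Jordan reduction to [I | C]
    C = rref_matrix[:, 2:]               # C = A^(-1) if it exists
    print(C)                             # Matrix([[4/3, -1], [-1/2, 1/2]])
    print(A * C == eye(2))               # True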

    Example. Page 84 number 6 (also apply this example to a system of

    equations).

    Theorem 1.12. Conditions for A1 to Exist.

    The following conditions for an n × n matrix A are equivalent:
    (i) A is invertible.
    (ii) A is row equivalent to I.
    (iii) Ax = b has a solution for each b (namely, x = A⁻¹b).

    (iv) A can be expressed as a product of elementary matrices.

    (v) The span of the column vectors of A is Rn.

  • 1.5 Inverses of Matrices, and Linear Systems 5

    Note. In (iv), A is the left-to-right product of the inverses of the elementary matrices corresponding to successive row operations that reduce A to I.

    Example. Page 84 number 2. Express the inverse of

    A =
    [ 3 6 ]
    [ 3 8 ]

    as a product of elementary matrices.

    Solution. We perform the following elementary row operations:

    [ 3 6 | 1 0 ]
    [ 3 8 | 0 1 ]

    R2 → R2 − R1:
    [ 3 6 |  1 0 ]
    [ 0 2 | −1 1 ]

    R1 → R1 − 3R2:
    [ 3 0 |  4 −3 ]
    [ 0 2 | −1  1 ]

    R2 → R2/2:
    [ 3 0 |    4   −3 ]
    [ 0 1 | −1/2  1/2 ]

    R1 → R1/3:
    [ 1 0 |  4/3   −1 ]
    [ 0 1 | −1/2  1/2 ]

  • 1.5 Inverses of Matrices, and Linear Systems 6

    The elementary matrices which accomplish this are:

    E1 = [  1 0 ]        E1⁻¹ = [ 1 0 ]
         [ −1 1 ]               [ 1 1 ]

    E2 = [ 1 −3 ]        E2⁻¹ = [ 1 3 ]
         [ 0  1 ]               [ 0 1 ]

    E3 = [ 1   0 ]       E3⁻¹ = [ 1 0 ]
         [ 0 1/2 ]              [ 0 2 ]

    E4 = [ 1/3 0 ]       E4⁻¹ = [ 3 0 ]
         [  0  1 ]              [ 0 1 ]

    As in Section 1.3,

    E4E3E2E1A = I

    and so

    A = E1⁻¹E2⁻¹E3⁻¹E4⁻¹I = E1⁻¹E2⁻¹E3⁻¹E4⁻¹.

    Also A⁻¹ = E4E3E2E1. QED

    Example. Page 85 number 24.

  • 1.6 Homogeneous Systems, Subspaces and Bases 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.6. Homogeneous Systems, Subspaces and Bases

    Definition. A linear system Ax = b is homogeneous if b = 0. The zero

    vector x = 0 is a trivial solution to the homogeneous system Ax = 0.

    Nonzero solutions to Ax = 0 are called nontrivial solutions.

    Theorem 1.13. Structure of the Solution Set of Ax = 0.

    Let Ax = 0 be a homogeneous linear system. If h1, h2, . . . , hn are solu-

    tions, then any linear combination

    r1h1 + r2h2 + · · · + rnhn

    is also a solution.

    Proof. Since h1, h2, . . . , hn are solutions,

    Ah1 = Ah2 = · · · = Ahn = 0

    and so

    A(r1h1 + r2h2 + · · · + rnhn) = r1Ah1 + r2Ah2 + · · · + rnAhn = 0 + 0 + · · · + 0 = 0.

    Therefore the linear combination is also a solution. QED

  • 1.6 Homogeneous Systems, Subspaces and Bases 2

    Definition 1.16. A subset W of Rn is closed under vector addition

    if for all u, v ∈ W, we have u + v ∈ W. If rv ∈ W for all v ∈ W and for all r ∈ R, then W is closed under scalar multiplication. A nonempty subset W of Rn is a subspace of Rn if it is both closed under

    vector addition and scalar multiplication.

    Example. Page 99 number 8.

    Theorem 1.14. Subspace Property of a Span.

    Let W = sp( w1, w2, . . . , wk) be the span of k > 0 vectors in Rn. Then

    W is a subspace of Rn. (The vectors w1, w2, . . . , wk are said to span or

    generate the subspace.)

    Example. Page 100 number 18.

    Definition. Given an m × n matrix A, the span of the row vectors of A is the row space of A, the span of the column vectors of A is the column

    space of A and the solution set to the system Ax = 0 is the nullspace of

    A.

  • 1.6 Homogeneous Systems, Subspaces and Bases 3

    Definition 1.17. Let W be a subspace of Rn. A subset {w1, w2, . . . , wk} of W is a basis for W if every vector in W can be expressed uniquely as a

    linear combination of w1, w2, . . . , wk.

    Theorem 1.15. Unique Linear Combinations.

    The set {w1, w2, . . . , wk} is a basis for W = sp(w1, w2, . . . , wk) if and only if

    r1w1 + r2w2 + · · · + rkwk = 0

    implies

    r1 = r2 = · · · = rk = 0.

    Proof. First, if {w1, w2, . . . , wk} is a basis for W, then each vector of W can be uniquely written as a linear combination of these wi's. Since

    0 = 0w1 + 0w2 + · · · + 0wk and this is the unique way to write 0 in terms of the wi's, then for any r1w1 + r2w2 + · · · + rkwk = 0 we must have r1 = r2 = · · · = rk = 0.

    Second, suppose that the only linear combination of wi's that gives 0 is
    0w1 + 0w2 + · · · + 0wk. We want to show that any vector of W is a unique linear combination of the wi's. Suppose for w ∈ W we have

    w = c1w1 + c2w2 + · · · + ckwk and

  • 1.6 Homogeneous Systems, Subspaces and Bases 4

    w = d1w1 + d2w2 + · · · + dkwk.

    Then

    0 = w − w = c1w1 + c2w2 + · · · + ckwk − (d1w1 + d2w2 + · · · + dkwk)

    = (c1 − d1)w1 + (c2 − d2)w2 + · · · + (ck − dk)wk.

    So each coefficient must be 0 and we have ci = di for i = 1, 2, . . . , k and

    w can be written as a linear combination of the wi's in only one way.

    QED

    Example. Page 100 number 22.

    Theorem 1.16. Let A be an n × n matrix. The following are equivalent:
    (1) Ax = b has a unique solution,
    (2) A is row equivalent to I,
    (3) A is invertible, and

    (4) the column vectors of A form a basis for Rn.

    Example. Page 100 number 22 (again).

  • 1.6 Homogeneous Systems, Subspaces and Bases 5

    Theorem 1.17. Let A be an m × n matrix. The following are equivalent:
    (1) each consistent system Ax = b has a unique solution,

    (2) the reduced row-echelon form of A consists of the n × n identity matrix followed by m − n rows of zeros, and
    (3) the column vectors of A form a basis for the column space of A.

    Corollary 1. Fewer Equations than Unknowns.

    If a linear system Ax = b is consistent and has fewer equations than

    unknowns, then it has an infinite number of solutions.

    Corollary 2. The Homogeneous Case

    (1) A homogeneous linear system Ax = 0 having fewer equations than

    unknowns has a nontrivial solution (i.e. a solution other than x = 0),

    (2) A square homogeneous system Ax = 0 has a nontrivial solution if and

    only if A is not row equivalent to the identity matrix.

    Example. Page 97 Example 6. A basis of Rn cannot contain more than

    n vectors.

    Proof. Suppose {v1, v2, . . . , vk} is a basis for Rn where n < k. Consider

  • 1.6 Homogeneous Systems, Subspaces and Bases 6

    the system Ax = 0 where the column vectors of A are v1, v2, . . . , vk.

    Then A has n rows and k columns (corresponding to n equations in k

    unknowns). With n < k, Corollary 2 implies there is a nontrivial solution

    to Ax = 0. But this corresponds to a linear combination of the columns

    of A which equals 0 while not all the coefficients are 0. This contradicts

    the definition of basis. Therefore, k ≤ n. QED

    Theorem 1.18. Structure of the Solution Set of Ax = b.

    Let Ax = b be a linear system. If p is any particular solution of Ax = b

    and h is a solution to Ax = 0, then p + h is a solution of Ax = b. In

    fact, every solution of Ax = b has the form p+h and the general solution

    is x = p + h where Ah = 0 (that is, h is an arbitrary element of the

    nullspace of A).

    Example. Page 101 numbers 36 and 43.

  • 2.1 Independence and Dimension 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.1. Independence and Dimension

    Definition 2.1. Let {w1, w2, . . . , wk} be a set of vectors in Rn. A dependence relation in this set is an equation of the form

    r1w1 + r2w2 + · · · + rkwk = 0

    with at least one rj ≠ 0. If such a dependence relation exists, then {w1, w2, . . . , wk} is a linearly dependent set. A set of vectors which is not linearly dependent is linearly independent.

    Theorem 2.1. Alternative Characterization of Basis

    Let W be a subspace of Rn. A subset {w1, w2, . . . , wk} of W is a basis for W if and only if

    (1) W = sp(w1, w2, . . . , wk) and

    (2) the vectors w1, w2, . . . , wk are linearly independent.

  • 2.1 Independence and Dimension 2

    Note. The proof of Theorem 2.1 follows directly from the definitions of

    basis and linear independence.

    Theorem. Finding a Basis for W = sp( w1, w2, . . . , wk).

    Form the matrix A whose jth column vector is wj. If we row-reduce A to

    row-echelon form H, then the set of all wj such that the jth column of H

    contains a pivot, is a basis for W .
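    This pivot-column procedure can be sketched with sympy's exact row reduction (illustrative only; the vectors are made up):

    from sympy import Matrix

    # Vectors w1, w2, w3 (w3 = w1 + w2, so the span is 2-dimensional).
    w1, w2, w3 = [1, 0, 1], [0, 1, 1], [1, 1, 2]
    A = Matrix([w1, w2, w3]).T          # jth column of A is wj
    _, pivot_cols = A.rref()            # row-reduce; rref also reports pivot columns
    basis = [A[:, j] for j in pivot_cols]
    print(pivot_cols)                   # (0, 1): w1 and w2 form a basis for the span
    print(basis)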

    Example. Page 134 number 8 or 10.

    Example. Page 138 number 22.

    Theorem 2.2. Relative Sizes of Spanning and Independent

    Sets.

    Let W be a subspace of Rn. Let w1, w2, . . . , wk be vectors in W that

    span W and let v1, v2, . . . , vm be vectors in W that are independent.

    Then k ≥ m.

  • 2.1 Independence and Dimension 3

    Corollary. Invariance of Dimension.

    Any two bases of a subspace of Rn contain the same number of vectors.

    Definition 2.2. Let W be a subspace of Rn. The number of elements

    in a basis for W is the dimension of W , denoted dim(W ).

    Note. The standard basis {e1, e2, . . . , en} of Rn has n vectors, so dim(Rn) = n.

    Theorem 2.3. Existence and Determination of Bases.

    (1) Every subspace W of Rn has a basis and dim(W) ≤ n.
    (2) Every independent set of vectors in Rn can be enlarged to become a

    basis of Rn.

    (3) If W is a subspace of Rn and dim(W ) = k then

    (a) every independent set of k vectors in W is a basis for W , and

    (b) every set of k vectors in W that spans W is a basis of W .

    Example. Page 136 numbers 34 and 38.

  • 2.2 The Rank of a Matrix 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.2. The Rank of a Matrix

    Note. In this section, we consider the relationship between the dimen-

    sions of the column space, row space and nullspace of a matrix A.

    Theorem 2.4. Row Rank Equals Column Rank.

    Let A be an m × n matrix. The dimension of the row space of A equals the dimension of the column space of A. The common dimension is the

    rank of A.

    Note. The dimension of the column space is the number of pivots of A

    when in row-echelon form, so by page 129, the rank of A is the number of

    pivots of A when in row-echelon form.

  • 2.2 The Rank of a Matrix 2

    Note. Finding Bases for Spaces Associated with a Matrix.

    Let A be an m × n matrix with row-echelon form H.
    (1) for a basis of the row space of A, use the nonzero rows of H (or A),

    (2) for a basis of the column space of A, use the columns of A correspond-

    ing to the columns of H which contain pivots, and

    (3) for a basis of the nullspace of A use H to solve Hx = 0 as before.

    Example. Page 140 number 4.

    Theorem 2.5. Rank Equation.

    Let A be m × n with row-echelon form H.
    (1) The dimension of the nullspace of A is

    nullity(A) = (# free variables in solution of Ax = 0)

    = (# pivot-free columns of H).

    (2) rank(A) = (# of pivots in H).

    (3) Rank Equation:

    rank(A) + nullity(A) = # of columns of A.
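    A quick numerical check of the Rank Equation (a sketch, assuming numpy; the matrix is made up):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])      # second row is a multiple of the first
    rank = np.linalg.matrix_rank(A)
    nullity = A.shape[1] - rank          # rank(A) + nullity(A) = number of columns
    print(rank, nullity)                 # 2 1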

  • 2.2 The Rank of a Matrix 3

    Theorem 2.6. An Invertibility Criterion.

    An n × n matrix A is invertible if and only if rank(A) = n.

    Example. Page 141 number 12. If A is square, then nullity(A) =

    nullity(AT ).

    Proof. The column space of A is the same as the row space of AT , so

    rank(A) = rank(AT ) and since the number of columns of A equals the

    number of columns of AT , then by the Rank Equation:

    rank(A) + nullity(A) = rank(AT ) + nullity(AT )

    and the result follows. QED

  • 2.3 Linear Transformations of Euclidean Spaces 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.3 Linear Transformations of Euclidean Spaces

    Definition. A linear transformation T : Rn → Rm is a function whose domain is Rn and whose codomain is Rm, where

    (1) T(u + v) = T(u) + T(v) for all u, v ∈ Rn, and
    (2) T(ru) = rT(u) for all u ∈ Rn and for all r ∈ R.

    Note. Combining (1) and (2) gives

    T (ru + sv) = rT (u) + sT (v)

    for all u, v ∈ Rn and r, s ∈ R. As the book says, linear transformations preserve linear combinations.

    Note. T(0) = T(0·0) = 0·T(0) = 0.

    Example. Page 152 number 4.

  • 2.3 Linear Transformations of Euclidean Spaces 2

    Example. Page 145 Example 4. Notice that every linear transformation

    of R → R is of the form T(x) = ax.

    The graphs of such functions are lines through the origin.

    Theorem 2.7. Bases and Linear Transformations.

    Let T : Rn → Rm be a linear transformation and let B = {b1, b2, . . . , bn} be a basis for Rn. For any vector v ∈ Rn, the vector T(v) is uniquely determined by T(b1), T(b2), . . . , T(bn).

    Proof. Let v ∈ Rn. Then since B is a basis, there exist unique scalars r1, r2, . . . , rn such that

    v = r1b1 + r2b2 + · · · + rnbn.

    Since T is linear, we have

    T(v) = r1T(b1) + r2T(b2) + · · · + rnT(bn).

    Since the coefficients ri are uniquely determined by v, it follows that the

    value of T (v) is completely determined by the vectors T (bi). QED

  • 2.3 Linear Transformations of Euclidean Spaces 3

    Corollary. Standard Matrix Representation of Linear Trans-

    formations.

    Let T : Rn → Rm be linear, and let A be the m × n matrix whose jth column is T(ej). Then T(x) = Ax for each x ∈ Rn. A is the standard matrix representation of T.

    Proof. For any matrix A, Aej is the jth column of A. So if A is the matrix described, then Aej = T(ej), and so T and the linear transformation TA given by TA(x) = Ax agree on the standard basis {e1, e2, . . . , en} of Rn. Therefore by Theorem 2.7, T(x) = Ax for all x ∈ Rn. QED
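    The corollary gives a recipe: apply T to each standard basis vector and use the results as columns. An illustrative sketch (assuming numpy; the particular T is made up for the example):

    import numpy as np

    # A hypothetical linear transformation T : R^2 -> R^3 for illustration.
    def T(x):
        x1, x2 = x
        return np.array([x1 + x2, 2.0 * x1, 3.0 * x2])

    e = np.eye(2)                                         # standard basis e1, e2 of R^2
    A = np.column_stack([T(e[:, j]) for j in range(2)])   # jth column is T(ej)
    print(A)

    x = np.array([5.0, -1.0])
    print(np.allclose(T(x), A @ x))                       # True: T(x) = Ax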

    Example. Page 152 number 10.

    Theorem/Definition. Let T : Rn → Rm be a linear transformation with standard matrix representation A.

    (1) The range T [Rn] of T is the column space of A.

    (2) The kernel of T is the nullspace of A, denoted ker(T ).

    (3) If W is a subspace of Rn, then T [W ] is a subspace of Rm (i.e. T

    preserves subspaces).

  • 2.3 Linear Transformations of Euclidean Spaces 4

    Notice. If A is the standard matrix representation for T , then from the

    rank equation we get:

    dim(range T ) + dim(ker T ) = dim(domain T ).

    Definition. For a linear transformation T , we define rank and nullity

    in terms of the standard matrix representation A of T :

    rank(T ) = dim(range T ), nullity(T ) = dim(ker T ).

    Definition. If T : Rn → Rm and T′ : Rm → Rk, then the composition of T′ and T is (T′ ∘ T) : Rn → Rk where (T′ ∘ T)(x) = T′(T(x)).

    Theorem. Matrix Multiplication and Composite Transfor-

    mations.

    A composition of two linear transformations T and T′ with standard matrix representations A and A′ yields a linear transformation T′ ∘ T with standard matrix representation A′A.

    Example. Page 153 number 20.

  • 2.3 Linear Transformations of Euclidean Spaces 5

    Definition. If T : Rn → Rn and there exists T′ : Rn → Rn such that (T′ ∘ T)(x) = x for all x ∈ Rn, then T′ is the inverse of T, denoted T′ = T⁻¹. (Notice that if T : Rm → Rn where m ≠ n, then T⁻¹ is not defined; there are domain/range size problems.)

    Theorem. Invertible Matrices and Inverse Transformations.

    Let T : Rn → Rn have standard matrix representation A: T(x) = Ax. Then T is invertible if and only if A is invertible, and T⁻¹(x) = A⁻¹x.

    Example. Page 153 number 22.

  • 2.4 Linear Transformations of the Plane 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.4 Linear Transformations of the Plane (in brief)

    Note. If A is a 2 × 2 matrix with rank 0 then it is the matrix

    A =
    [ 0 0 ]
    [ 0 0 ]

    and all vectors in R2 are mapped to 0 under the transformation with associated matrix A (we can view {0} as a 0-dimensional space). If rank(A) = 1, then the column space of A, which is the range of TA, is a one-dimensional subspace of R2. In this case, TA projects a vector onto the column space. See page 155 for details.

    Note. We can rotate a vector in R2 about the origin through an angle θ by applying TA where

    A =
    [ cos θ  −sin θ ]
    [ sin θ   cos θ ].

    This is an example of a rigid transformation of the plane since lengths are

    not changed under this transformation.
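    A numerical illustration of the rotation matrix (a sketch, assuming numpy):

    import numpy as np

    theta = np.pi / 2                       # rotate by 90 degrees
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    v = np.array([1.0, 0.0])
    print(A @ v)                            # approximately [0, 1]
    # Rigid: lengths are preserved.
    print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v)))  # True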

  • 2.4 Linear Transformations of the Plane 2

    Note. We can reflect a vector in R2 about the x-axis by applying TX where

    X =
    [ 1  0 ]
    [ 0 −1 ].

    We can reflect a vector in R2 about the y-axis by applying TY where

    Y =
    [ −1 0 ]
    [  0 1 ].

    We can reflect a vector in R2 about the line y = x by applying TZ where

    Z =
    [ 0 1 ]
    [ 1 0 ].

    Notice that X, Y, and Z are elementary matrices since they differ from I by an operation of row scaling (for X and Y), or by an operation of row

    interchange (for Z).

    Note. Transformation TA where

    A =
    [ r 0 ]
    [ 0 1 ]

    is a horizontal expansion if r > 1, and is a horizontal contraction if 0 < r < 1.

  • 2.4 Linear Transformations of the Plane 3

    Transformation TB where

    B =
    [ 1 0 ]
    [ 0 r ]

    is a vertical expansion if r > 1, and is a vertical contraction if 0 < r < 1.
    Notice that A and B are elementary matrices since they differ from I by an operation of row scaling.

    Note. Transformation TA where

    A =
    [ 1 0 ]
    [ r 1 ]

    is a vertical shear (see Figure 2.2.16 on page 163). Transformation TB

    where

    B =
    [ 1 r ]
    [ 0 1 ]

    is a horizontal shear. Notice that A and B are elementary matrices since

    they differ from I by an operation of row addition.

  • 2.4 Linear Transformations of the Plane 4

    Theorem. Geometric Description of Invertible Transforma-

    tions of Rn.

    A linear transformation T of the plane R2 into itself is invertible if and

    only if T consists of a finite sequence of:

    Reflections in the x-axis, the y-axis, or the line y = x;
    Vertical or horizontal expansions or contractions; and
    Vertical or horizontal shears.

    Proof. Each elementary operation corresponds to one of these types of

    transformations (and conversely). Each of these transformations corre-

    spond to elementary matrices as listed above (and conversely). Also, we

    know that a matrix is invertible if and only if it is a product of elementary

    matrices by Theorem 1.12(iv). Therefore T is invertible if and only if its

    associated matrix is a product of elementary matrices, and so the result

    follows. QED

  • 2.5 Lines, Planes, and Other Flats 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.5 Lines, Planes, and Other Flats

    Definitions 2.4, 2.5. Let S be a subset of Rn and let a ∈ Rn. The set {x + a | x ∈ S} is the translate of S by a, and is denoted by S + a. The vector a is the translation vector. A line in Rn is a translate of a

    one-dimensional subspace of Rn.

    Figure 2.19, page 168.

  • 2.5 Lines, Planes, and Other Flats 2

    Definition. If a line L in Rn contains point (a1, a2, . . . , an) and if vector

    d is parallel to L, then d is a direction vector for L and a = [a1, a2, . . . , an]

    is a translation vector of L.

    Note. With d as a direction vector and a as a translation vector of a line,

    we have L = {td + a | t ∈ R}. In this case, t is called a parameter and

    x = td + a

    or as a collection of component equations:

    x1 = td1 + a1

    x2 = td2 + a2

    ...

    xn = tdn + an.

    Example. Page 176 number 8.

    Definition 2.6. A k-flat in Rn is a translate of a k-dimensional subspace

    of Rn. In particular, a 1-flat is a line, a 2-flat is a plane, and an (n − 1)-flat is a hyperplane. We consider each point of Rn to be a zero-flat.

  • 2.5 Lines, Planes, and Other Flats 3

    Note. We can also talk about a translate of a k-dimensional subspace

    W of Rn. If a basis for W is {d1, d2, . . . , dk}, then the k-flat through the point (a1, a2, . . . , an) and parallel to W is

    x = t1d1 + t2d2 + · · · + tkdk + a

    where a = [a1, a2, . . . , an] and t1, t2, . . . , tk ∈ R are parameters. We can also express this k-flat parametrically in terms of components.

    Example. Page 177 number 22.

    Note. We can now clearly explain the geometric interpretation of solu-

    tions of linear systems in terms of k-flats. Consider Ax = b, a system

    of m equations in n unknowns that has at least one solution x = p. By

    Theorem 1.18 on page 97, the solution set of the system consists of all

    vectors of the form x = p + h where h is a solution of the homogeneous

    system Ax = 0. Now the solution set of Ax = 0 is a subspace of Rn, and

    so the solution of Ax = b is a k-flat (where k is the nullity of A) passing

    through point (p1, p2, . . . , pn) where p = [p1, p2, . . . , pn].

    Example. Page 177, number 36.

  • 3.1 Vector Spaces 1

    Chapter 3. Vector Spaces

    3.1 Vector Spaces

    Definition 3.1. A vector space is a set V of vectors along with an

    operation of addition + of vectors and multiplication of a vector by a

    scalar (real number), which satisfies the following. For all u, v, w ∈ V and for all r, s ∈ R:
    (A1) (u + v) + w = u + (v + w)

    (A2) v + w = w + v

    (A3) There exists 0 ∈ V such that 0 + v = v
    (A4) v + (−v) = 0
    (S1) r(v + w) = rv + rw

    (S2) (r + s)v = rv + sv

    (S3) r(sv) = (rs)v

    (S4) 1v = v

    Definition. 0 is the additive identity. −v is the additive inverse of v.

  • 3.1 Vector Spaces 2

    Example. Some examples of vector spaces are:

    (1) The set of all polynomials of degree n or less, denoted Pn.
    (2) All m × n matrices.
    (3) The set of all integrable functions f with domain [0, 1] such that ∫₀¹ |f(x)|² dx < ∞. This vector space is denoted L2[0, 1]:

    L2[0, 1] = { f | ∫₀¹ |f(x)|² dx < ∞ }.

    Theorem 3.1. Elementary Properties of Vector Spaces.

    Every vector space V satisfies:

    (1) the vector 0 is the unique additive identity in a vector space,

    (2) for each v ∈ V, −v is the unique additive inverse of v,
    (3) if u + v = u + w then v = w,

    (4) 0v = 0 for all v ∈ V,
    (5) r0 = 0 for all scalars r ∈ R,
    (6) (−r)v = r(−v) = −(rv) for all r ∈ R and for all v ∈ V.

  • 3.1 Vector Spaces 3

    Proof of (1) and (3). Suppose that there are two additive identities, 0 and 0′. Then consider:

    0 = 0 + 0′  (since 0′ is an additive identity)
      = 0′      (since 0 is an additive identity).

    Therefore, 0 = 0′ and the additive identity is unique.

    Suppose u + v = u + w. Then we add −u to both sides of the equation and we get:

    u + v + (−u) = u + w + (−u)
    v + (u − u) = w + (u − u)
    v + 0 = w + 0
    v = w.

    The conclusion holds. QED

    Example. Page 189 number 14 and page 190 number 24.

  • 3.2 Basic Concepts of Vector Spaces 1

    Chapter 3. Vector Spaces

    3.2 Basic Concepts of Vector Spaces

    Definition 3.2. Given vectors v1, v2, . . . , vk ∈ V and scalars r1, r2, . . . , rk ∈ R,

    Σ_{l=1}^{k} rl vl = r1v1 + r2v2 + · · · + rkvk

    is a linear combination of v1, v2, . . . , vk with scalar coefficients r1, r2, . . . , rk.

    Definition 3.3. Let X be a subset of vector space V . The span of X is

    the set of all linear combinations of elements in X and is denoted sp(X).

    If V = sp(X) for some finite set X , then V is finitely generated.

    Definition 3.4. A subset W of a vector space V is a subspace of V if

    W is itself a vector space.

    Theorem 3.2. Test for Subspace.

    A subset W of vector space V is a subspace if and only if

    (1) v, w ∈ W ⟹ v + w ∈ W, and
    (2) for all r ∈ R and for all v ∈ W we have rv ∈ W.

  • 3.2 Basic Concepts of Vector Spaces 2

    Example. Page 202 number 4.

    Definition 3.5. Let X be a set of vectors from a vector space V. A dependence relation in X is an equation of the form

    Σ_{l=1}^{k} rl vl = r1v1 + r2v2 + · · · + rkvk = 0

    with some rj ≠ 0 and vi ∈ X. If such a relation exists, then X is a linearly dependent set. Otherwise X is a linearly independent set.

    Example. Page 202 number 16.

    Definition 3.6. Let V be a vector space. A set of vectors in V is a basis

    for V if

    (1) the set of vectors span V , and

    (2) the set of vectors is linearly independent.

    Example. Page 202 number 20.

  • 3.2 Basic Concepts of Vector Spaces 3

    Theorem 3.3. Unique Combination Criterion for a Basis.

    Let B be a set of nonzero vectors in vector space V . Then B is a basis

    for V if and only if each vector in V can be uniquely expressed as a linear

    combination of the vectors in set B.

    Proof. Suppose that B is a basis for vector space V . Then by the first

    part of Definition 3.6 we see that any vector v ∈ V can be written as a linear combination of the elements of B, say

    v = r1b1 + r2b2 + · · · + rkbk.

    Now suppose that there is some other linear combination of the vectors in B which represents v (we look for a contradiction):

    v = s1b1 + s2b2 + · · · + skbk.

    If we subtract these two representations of v then we get that

    0 = (r1 − s1)b1 + (r2 − s2)b2 + · · · + (rk − sk)bk.

    By the second part of Definition 3.6, we know that r1 − s1 = r2 − s2 = · · · = rk − sk = 0. Therefore there is only one linear combination of elements of B which represents v.

    Now suppose that each vector in V can be uniquely represented as a

    linear combination of the elements of B. We wish to show that B is a

  • 3.2 Basic Concepts of Vector Spaces 4

    basis. Clearly B is a spanning set of V . Now we can write 0 as a linear

    combination of elements of B by taking all coefficients as 0. Since we

    hypothesize that each vector can be uniquely represented, then

    0 = r1b1 + r2b2 + · · · + rkbk

    only for r1 = r2 = · · · = rk = 0. Hence the elements of B are linearly independent and so B is a basis. QED

    Definition. A vector space is finitely generated if it is the span of some

    finite set.

    Theorem 3.4. Relative Size of Spanning and Independent

    Sets.

    Let V be a vector space. Let w1, w2, . . . , wk be vectors in V that span V

    and let v1, v2, . . . , vm be vectors in V that are independent. Then k ≥ m.

    Corollary. Invariance of Dimension for Finitely Generated

    Spaces.

    Let V be a finitely generated vector space. Then any two bases of V have

    the same number of elements.

  • 3.2 Basic Concepts of Vector Spaces 5

    Definition 3.7. Let V be a finitely generated vector space. The number

    of elements in a basis for V is the dimension of V , denoted dim(V ).

    Example. Page 203 number 32. Let {v1, v2, v3} be a basis for V. If w ∉ sp(v1, v2) then {v1, v2, w} is a basis for V.

    Proof. We need to show that {v1, v2, w} is a linearly independent spanning set of V. Since w ∈ V, then w = r1v1 + r2v2 + r3v3 and r3 ≠ 0 since w ∉ sp(v1, v2). Then v3 = (1/r3)(w − r1v1 − r2v2). Therefore v3 ∈ sp(v1, v2, w). So

    sp(v1, v2, v3) ⊆ sp(v1, v2, w)

    and so {v1, v2, w} generates V.

    Next suppose s1v1 + s2v2 + s3w = 0. Then s3 = 0, or else w ∈ sp(v1, v2). So s1v1 + s2v2 = 0 and s1 = s2 = 0. Therefore s1 = s2 = s3 = 0 and so

    {v1, v2, w} is a basis for V. QED

  • 3.3 Coordinatization of Vectors 1

    Chapter 3. Vector Spaces

    3.3 Coordinatization of Vectors

    Definition. An ordered basis (b1, b2, . . . , bn) is an ordered set of vec-

    tors which is a basis for some vector space.

    Definition 3.8. If B = (b1, b2, . . . , bn) is an ordered basis for V and

    v = r1b1 + r2b2 + · · · + rnbn, then the vector [r1, r2, . . . , rn] ∈ Rn is the coordinate vector of v relative to B, denoted vB.

    Example. Page 211 number 6.

    Note. To find vB:

    (1) write the basis vectors as column vectors to form [b1, b2, . . . , bn | v],
    (2) use Gauss-Jordan elimination to get [I | vB].
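    Steps (1)-(2) amount to solving a linear system; an illustrative sketch (assuming numpy, with a made-up ordered basis of R2):

    import numpy as np

    b1, b2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])   # ordered basis B of R^2
    v = np.array([3.0, 1.0])

    B = np.column_stack([b1, b2])        # columns are the basis vectors: [b1 b2 | v]
    v_B = np.linalg.solve(B, v)          # Gauss-Jordan on [b1 b2 | v] gives [I | v_B]
    print(v_B)                           # [2, 1], since v = 2*b1 + 1*b2
    print(np.allclose(v_B[0] * b1 + v_B[1] * b2, v))  # True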

  • 3.3 Coordinatization of Vectors 2

    Definition. An isomorphism between two vector spaces V and W is a one-to-one and onto function ω from V to W such that:
    (1) if v1, v2 ∈ V then ω(v1 + v2) = ω(v1) + ω(v2), and
    (2) if v ∈ V and r ∈ R then ω(rv) = rω(v).
    If there is such an ω, then V and W are isomorphic, denoted V ≅ W.

    Note. An isomorphism is a one-to-one and onto linear transformation.

    Theorem. The Fundamental Theorem of Finite Dimensional

    Vectors Spaces.

    If V is a finite dimensional vector space (say dim(V ) = n) then V is

    isomorphic to Rn.

    Proof. Let B = (b1, b2, . . . , bn) be an ordered basis for V and for v ∈ V with vB = [r1, r2, . . . , rn] define ω : V → Rn as

    ω(v) = [r1, r2, . . . , rn].

  • 3.3 Coordinatization of Vectors 3

    Then clearly ω is one-to-one and onto. Also for v, w ∈ V suppose

    vB = [r1, r2, . . . , rn] and wB = [s1, s2, . . . , sn]

    and so

    ω(v + w) = [r1 + s1, r2 + s2, . . . , rn + sn]
             = [r1, r2, . . . , rn] + [s1, s2, . . . , sn]
             = ω(v) + ω(w).

    For a scalar t ∈ R,

    ω(tv) = [tr1, tr2, . . . , trn] = t[r1, r2, . . . , rn] = tω(v).

    So ω is an isomorphism and V ≅ Rn. QED

    Example. Page 212 number 12.

    Example. Page 212 number 20. Prove the set {(x − a)ⁿ, (x − a)ⁿ⁻¹, . . . , (x − a), 1} is a basis for Pn.

    Proof. Let v0, v1, . . . , vn be the coordinate vectors of 1, (x − a), . . . , (x − a)ⁿ in terms of the ordered basis {1, x, x², . . . , xⁿ}. Form a matrix A with the vl's as the columns:

    A = [v0 v1 · · · vn].

  • 3.3 Coordinatization of Vectors 4

    Notice that A is upper triangular:

    A =
    [ 1  −a   a²  · · ·  (−a)ⁿ ]
    [ 0   1  −2a  · · ·        ]
    [ 0   0   1   · · ·        ]
    [ ⋮             ⋱      ⋮   ]
    [ 0   0   0   · · ·    1   ]

    and so the vi are linearly independent. Since dim(Pn) = n + 1 and the set

    {(x − a)ⁿ, (x − a)ⁿ⁻¹, . . . , (x − a), 1}

    is a set of n + 1 linearly independent vectors, then this set is a basis for
    Pn. QED

  • 3.4 Linear Transformations 1

    Chapter 3. Vector Spaces

    3.4 Linear Transformations

    Note. We have already studied linear transformations from Rn into Rm.

    Now we look at linear transformations from one general vector space to

    another.

    Definition 3.9. A function T that maps a vector space V into a vector

    space V′ is a linear transformation if it satisfies:

    (1) T(u + v) = T(u) + T(v), and (2) T(ru) = rT(u),

    for all vectors u, v ∈ V and for all scalars r ∈ R.

    Definition. For a linear transformation T : V → V′, the set V is the domain of T and the set V′ is the codomain of T. If W is a subset of V, then T[W] = {T(w) | w ∈ W} is the image of W under T. T[V] is the range of T. For W′ ⊆ V′, T⁻¹[W′] = {v ∈ V | T(v) ∈ W′} is the inverse image of W′ under T. T⁻¹[{0}] is the kernel of T, denoted ker(T).

  • 3.4 Linear Transformations 2

    Definition. Let V, V′ and V′′ be vector spaces and let T : V → V′ and T′ : V′ → V′′ be linear transformations. The composite transformation T′ ∘ T : V → V′′ is defined by (T′ ∘ T)(v) = T′(T(v)) for v ∈ V.

    Example. Page 214 Example 1. Let F be the vector space of all functions

    f : R → R, and let D be its subspace of all differentiable functions. Show that differentiation is a linear transformation of D into F.

    Proof. Let T : D → F be defined as T(f) = f′. Then from Calculus 1 we know

    T(f + g) = (f + g)′ = f′ + g′ = T(f) + T(g)

    and

    T(rf) = (rf)′ = rf′ = rT(f)

    for all f, g ∈ D and for all r ∈ R. Therefore T is linear. QED

  • 3.4 Linear Transformations 3

    Theorem 3.5. Preservation of Zero and Subtraction

    Let V and V′ be vector spaces, and let T : V → V′ be a linear transformation. Then

    (1) T (0) = 0, and

    (2) T(v1 − v2) = T(v1) − T(v2),
    for any vectors v1 and v2 in V.

    Proof of (1). Consider

    T(0) = T(0·0) = 0·T(0) = 0.

    QED

    Theorem 3.6. Bases and Linear Transformations.

    Let T : V → V′ be a linear transformation, and let B be a basis for V. For any vector v in V, the vector T(v) is uniquely determined by the vectors

    T(b) for all b ∈ B. In other words, if two linear transformations have the same value at each basis vector b ∈ B, then the two transformations have the same value at each vector in V.

  • 3.4 Linear Transformations 4

    Proof. Let T and T̄ be two linear transformations such that T(bi) = T̄(bi) for each vector bi ∈ B. Let v ∈ V. Then for some scalars r1, r2, . . . , rk we have

    v = r1b1 + r2b2 + · · · + rkbk.

    Then

    T(v) = T(r1b1 + r2b2 + · · · + rkbk)
         = r1T(b1) + r2T(b2) + · · · + rkT(bk)
         = r1T̄(b1) + r2T̄(b2) + · · · + rkT̄(bk)
         = T̄(r1b1 + r2b2 + · · · + rkbk)
         = T̄(v).

    Therefore T and T̄ are the same transformation. QED

    Theorem 3.7. Preservation of Subspaces.

    Let V and V′ be vector spaces, and let T : V → V′ be a linear transformation.

    (1) If W is a subspace of V, then T[W] is a subspace of V′.

    (2) If W′ is a subspace of V′, then T⁻¹[W′] is a subspace of V.

  • 3.4 Linear Transformations 5

    Theorem. Let T : V → V′ be a linear transformation and let T(p) = b for a particular vector p in V. The solution set of T(x) = b is the set

    {p + h | h ∈ ker(T)}.

    Proof. (Page 229 number 46) Let p be a solution of T (v) = b. Then

    T (p) = b. Let h be a solution of T (x) = 0. Then T (h) = 0. Therefore

    T (p + h) = T (p) + T (h) = b + 0 = b,

    and so p + h is indeed a solution. Also, if q is any solution of T (x) = b

    then

    T(q − p) = T(q) − T(p) = b − b = 0,

    and so q − p is in the kernel of T. Therefore for some h ∈ ker(T), we have q − p = h, and q = p + h. QED

    Definition. A transformation T : V → V′ is one-to-one if T(v1) = T(v2) implies that v1 = v2 (or by the contrapositive, v1 ≠ v2 implies T(v1) ≠ T(v2)). Transformation T is onto if for all v′ ∈ V′ there is a v ∈ V such that T(v) = v′.

  • 3.4 Linear Transformations 6

    Corollary. A linear transformation T is one-to-one if and only if ker(T ) =

    {0}.

    Proof. By the previous theorem, if ker(T ) = {0}, then for all relevant b,the equation T (x) = b has a unique solution. Therefore T is one-to-one.

    Next, if T is one-to-one then for any nonzero vector x, T (x) is nonzero.

    Therefore by Theorem 3.5 Part (1), ker(T ) = {0}. QED

    Definition 3.10. Let V and V′ be vector spaces. A linear transformation T : V → V′ is invertible if there exists a linear transformation T⁻¹ : V′ → V such that T⁻¹ ∘ T is the identity transformation on V and T ∘ T⁻¹ is the identity transformation on V′. Such a T⁻¹ is called an inverse transformation of T.

    Theorem 3.8. A linear transformation T : V → V′ is invertible if and only if it is one-to-one and onto V′.

    Proof. Suppose T is invertible and is not one-to-one. Then for some v1 ≠ v2, both in V, we have T(v1) = T(v2) = v′. But then T⁻¹(T(v1)) = v1 and T⁻¹(T(v2)) = v2, so v1 = T⁻¹(v′) = v2, a contradiction. Therefore if T is invertible then T is one-to-one.

  • 3.4 Linear Transformations 7

    From Definition 3.10, if T is invertible then for any v′ ∈ V′ we must have T⁻¹(v′) = v for some v ∈ V. Therefore the image of v is v′ ∈ V′ and T is onto.

    Finally, we need to show that if T is one-to-one and onto then it is invertible. Suppose that T is one-to-one and onto V′. Since T is onto V′, then for each v′ ∈ V′ we can find v ∈ V such that T(v) = v′. Because T is one-to-one, this vector v ∈ V is unique. Let T⁻¹ : V′ → V be defined by T⁻¹(v′) = v. Then

    (T ∘ T⁻¹)(v′) = T(T⁻¹(v′)) = T(v) = v′

    and

    (T⁻¹ ∘ T)(v) = T⁻¹(T(v)) = T⁻¹(v′) = v,

    and so T ∘ T⁻¹ is the identity map on V′ and T⁻¹ ∘ T is the identity map on V.

Now we need only show that T⁻¹ is linear. Suppose T(v1) = v1′ and T(v2) = v2′. Then

T⁻¹(v1′ + v2′) = T⁻¹(T(v1) + T(v2)) = T⁻¹(T(v1 + v2))
             = (T⁻¹ ∘ T)(v1 + v2) = v1 + v2 = T⁻¹(v1′) + T⁻¹(v2′).

Also

T⁻¹(rv1′) = T⁻¹(rT(v1)) = T⁻¹(T(rv1)) = rv1 = rT⁻¹(v1′).


Therefore T⁻¹ is linear. QED

    Theorem 3.9. Coordinatization of Finite-Dimensional Spaces.

Let V be a finite-dimensional vector space with ordered basis B = (b1, b2, . . . , bn). The map T : V → Rn defined by T(v) = vB, the coordinate vector of v relative to B, is an isomorphism.


Theorem 3.10. Matrix Representations of Linear Transformations.

Let V and V′ be finite-dimensional vector spaces and let B = (b1, b2, . . . , bn) and B′ = (b1′, b2′, . . . , bm′) be ordered bases for V and V′, respectively. Let T : V → V′ be a linear transformation, and let T̄ : Rn → Rm be the linear transformation such that for each v ∈ V, we have T̄(vB) = (T(v))B′. Then the standard matrix representation of T̄ is the matrix A whose jth column vector is (T(bj))B′, and (T(v))B′ = A vB for all vectors v ∈ V.

Definition 3.11. The matrix A of Theorem 3.10 is the matrix representation of T relative to B, B′.

Theorem. The matrix representation of T⁻¹ relative to B′, B is the inverse of the matrix representation of T relative to B, B′.

    Examples. Page 227 numbers 18 and 24.
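A small computational sketch of Theorem 3.10 and Definition 3.11 (the transformation and bases below are made up, not taken from the exercises): with V = V′ = R², T given in standard coordinates by a matrix M, and the bases stored as columns, the jth column of A is the coordinate vector of T(bj) relative to B′.

```python
import numpy as np

# Hypothetical T : R^2 -> R^2 with standard matrix M, plus ordered bases B and B'.
M  = np.array([[1.0, 2.0],
               [0.0, 1.0]])
B  = np.column_stack(([1.0, 0.0], [1.0, 1.0]))   # columns b1, b2
Bp = np.column_stack(([2.0, 1.0], [0.0, 1.0]))   # columns b1', b2'

# jth column of A is the coordinate vector of T(b_j) relative to B'
A = np.linalg.solve(Bp, M @ B)

# Check (T(v))_B' = A v_B for a sample v
v = np.array([3.0, -1.0])
v_B   = np.linalg.solve(B, v)            # coordinates of v relative to B
Tv_Bp = np.linalg.solve(Bp, M @ v)       # coordinates of T(v) relative to B'
print(np.allclose(Tv_Bp, A @ v_B))       # True
```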


    Chapter 3. Vector Spaces

    3.5 Inner-Product Spaces

    Note. In this section, we generalize the idea of dot product to general

    vector spaces. We use this more general idea to define length and angle in

    arbitrary vector spaces.

    Note. Motivated by the properties of dot product on Rn, we define the

    following:

    Definition 3.12. An inner product on a vector space V is a function

that associates with each ordered pair of vectors v, w ∈ V a real number, written ⟨v, w⟩, satisfying the following properties for all u, v, w ∈ V and for all scalars r:

P1. Symmetry: ⟨v, w⟩ = ⟨w, v⟩,
P2. Additivity: ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩,
P3. Homogeneity: ⟨rv, w⟩ = r⟨v, w⟩ = ⟨v, rw⟩,
P4. Positivity: ⟨v, v⟩ ≥ 0, and ⟨v, v⟩ = 0 if and only if v = 0.

An inner-product space is a vector space V together with an inner product on V.


    Example. Dot product on Rn is an example of an inner product:

⟨v, w⟩ = v · w for v, w ∈ Rn.

Example. Page 231 Example 3. Show that the space P0,1 of all polynomial functions with real coefficients and domain 0 ≤ x ≤ 1 is an inner-product space if for p and q in P0,1 we define

⟨p, q⟩ = ∫₀¹ p(x)q(x) dx.
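A hedged computational sketch of this inner product (using NumPy's polynomial class; the sample polynomials are made up): multiply the polynomials, take an antiderivative, and evaluate it at the endpoints.

```python
import numpy as np
from numpy.polynomial import Polynomial

def poly_inner(p: Polynomial, q: Polynomial) -> float:
    """<p, q> = integral from 0 to 1 of p(x) q(x) dx."""
    F = (p * q).integ()          # an antiderivative of p*q
    return F(1.0) - F(0.0)

p = Polynomial([0, 1])           # p(x) = x
q = Polynomial([1, 0, 3])        # q(x) = 1 + 3x^2
print(poly_inner(p, q))          # integral of x + 3x^3 on [0,1] = 1/2 + 3/4 = 1.25
print(poly_inner(p, p))          # ||p||^2 = integral of x^2 on [0,1] = 1/3
```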

    Definition 3.13. Let V be an inner-product space. The magnitude or

norm of a vector v ∈ V is ‖v‖ = √⟨v, v⟩. The distance between v and w in an inner-product space V is d(v, w) = ‖v − w‖.

    Note. In Rn with dot product as inner product, we find that the distance

    induced by this inner-product is (as expected):

d(v, w) = ‖v − w‖ = √⟨v − w, v − w⟩

= √((v1 − w1, v2 − w2, . . . , vn − wn) · (v1 − w1, v2 − w2, . . . , vn − wn))

= √((v1 − w1)² + (v2 − w2)² + · · · + (vn − wn)²).


    Theorem 3.11. Schwarz Inequality.

Let V be an inner-product space, and let v, w ∈ V. Then

|⟨v, w⟩| ≤ ‖v‖‖w‖.

Proof. Let r, s ∈ R. Then by Definition 3.12

‖rv + sw‖² = ⟨rv + sw, rv + sw⟩ = r²⟨v, v⟩ + 2rs⟨v, w⟩ + s²⟨w, w⟩ ≥ 0.

Since this inequality holds for all r, s ∈ R, we are free to choose particular values of r and s. We choose r = ⟨w, w⟩ and s = −⟨v, w⟩. Then we have

⟨w, w⟩²⟨v, v⟩ − 2⟨w, w⟩⟨v, w⟩² + ⟨v, w⟩²⟨w, w⟩

= ⟨w, w⟩²⟨v, v⟩ − ⟨w, w⟩⟨v, w⟩²

= ⟨w, w⟩[⟨w, w⟩⟨v, v⟩ − ⟨v, w⟩²] ≥ 0.   (13)

If ⟨w, w⟩ = 0 then w = 0 by Definition 3.12 Part (P4), and the Schwarz Inequality is proven (since it reduces to 0 ≤ 0). If ‖w‖² = ⟨w, w⟩ ≠ 0, then by the above inequality the other factor in inequality (13) must also be nonnegative:

⟨w, w⟩⟨v, v⟩ − ⟨v, w⟩² ≥ 0.


    Therefore

⟨v, w⟩² ≤ ⟨v, v⟩⟨w, w⟩ = ‖v‖²‖w‖².

    Taking square roots, we get the Schwarz Inequality. QED

    Theorem. The Triangle Inequality.

Let v, w ∈ V (where V is an inner-product space). Then

‖v + w‖ ≤ ‖v‖ + ‖w‖.

Proof. We have

‖v + w‖² = ⟨v + w, v + w⟩
= ⟨v, v⟩ + 2⟨v, w⟩ + ⟨w, w⟩   (by Definition 3.12)
= ‖v‖² + 2⟨v, w⟩ + ‖w‖²   (by Definition 3.13)
≤ ‖v‖² + 2‖v‖‖w‖ + ‖w‖²   (by the Schwarz Inequality)
= (‖v‖ + ‖w‖)².

    Taking square roots, we have the Triangle Inequality. QED


Definition. Let v, w ∈ V where V is an inner-product space. Define the angle between vectors v and w as

θ = arccos(⟨v, w⟩/(‖v‖‖w‖)).

In particular, v and w are orthogonal (or perpendicular) if ⟨v, w⟩ = 0.

    Examples. Page 236 number 12, and page 237 number 26.
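A quick check of the Schwarz and Triangle Inequalities and of the angle formula for the dot product on Rn (the vectors below are made-up examples, not from the exercises):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])
w = np.array([3.0, 0.0, 4.0])

inner = np.dot(v, w)                                     # <v, w> = 11
norm_v, norm_w = np.linalg.norm(v), np.linalg.norm(w)    # 3 and 5

print(abs(inner) <= norm_v * norm_w)                     # Schwarz inequality holds
print(np.linalg.norm(v + w) <= norm_v + norm_w)          # Triangle inequality holds

theta = np.arccos(inner / (norm_v * norm_w))
print(np.degrees(theta))                                 # angle, about 42.8 degrees
```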


    Chapter 4. Determinants

    4.1 Areas, Volumes, and Cross Products

    Note. Area of a Parallelogram.

    Consider the parallelogram determined by two vectors a and b:

    Figure 4.1, Page 239.

    Its area is

A = Area = (base)(height) = ‖a‖‖b‖ sin θ = ‖a‖‖b‖√(1 − cos²θ).

Squaring both sides:

A² = ‖a‖²‖b‖²(1 − cos²θ) = ‖a‖²‖b‖² − ‖a‖²‖b‖² cos²θ


= ‖a‖²‖b‖² − (a · b)².

    Converting to components a = [a1, a2] and b = [b1, b2] gives

A² = (a1b2 − a2b1)², or A = |a1b2 − a2b1|.

Definition. For a 2 × 2 matrix

A = [ a1  a2 ]
    [ b1  b2 ],

define the determinant of A as

det(A) = a1b2 − a2b1 = | a1  a2 |
                       | b1  b2 |.

    Example. Page 249 number 26.

Definition. For two vectors b = [b1, b2, b3] and c = [c1, c2, c3] define the cross product of b and c as

b × c = | b2  b3 | i  −  | b1  b3 | j  +  | b1  b2 | k.
        | c2  c3 |       | c1  c3 |       | c1  c2 |


Note. We can take dot products and find that b × c is perpendicular to both b and c.
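A short sketch of the definition in code (example vectors made up): compute b × c from the 2 × 2 determinants and confirm it is perpendicular to b and c.

```python
import numpy as np

def cross(b, c):
    """Cross product via the 2x2 determinant formula in the definition."""
    b1, b2, b3 = b
    c1, c2, c3 = c
    return np.array([b2*c3 - b3*c2,        # i component
                     -(b1*c3 - b3*c1),     # j component (note the minus sign)
                     b1*c2 - b2*c1])       # k component

b = np.array([1.0, 2.0, 3.0])
c = np.array([4.0, 5.0, 6.0])
n = cross(b, c)
print(n)                                   # [-3.  6. -3.], matches np.cross(b, c)
print(np.dot(n, b), np.dot(n, c))          # both 0: b x c is perpendicular to b and c
```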

Note. If b, c ∈ R3 are not parallel, then there are two directions perpendicular to both of these vectors. We can determine the direction of b × c by using a right-hand rule. If you curl the fingers of your right hand from vector b to vector c, then your thumb will point in the direction of b × c:

    Figure 4.3, Page 242.

    Example. Page 248 number 16.


Definition. For a 3 × 3 matrix

A = [ a1  a2  a3 ]
    [ b1  b2  b3 ]
    [ c1  c2  c3 ],

define the determinant as

det(A) = | a1  a2  a3 |
         | b1  b2  b3 |
         | c1  c2  c3 |

       = a1 | b2  b3 |  −  a2 | b1  b3 |  +  a3 | b1  b2 |.
            | c2  c3 |        | c1  c3 |        | c1  c2 |

Note. We can now see that cross products can be computed using determinants:

b × c = | i   j   k  |
        | b1  b2  b3 |
        | c1  c2  c3 |.


Theorem. The area of the parallelogram determined by b and c is ‖b × c‖.

    Proof. We know from the first note of this section that the area squared

is A² = ‖c‖²‖b‖² − (c · b)². In terms of components we have

A² = (c1² + c2² + c3²)(b1² + b2² + b3²) − (c1b1 + c2b2 + c3b3)².

Multiplying out and regrouping we find that

A² = | b2  b3 |²  +  | b1  b3 |²  +  | b1  b2 |².
     | c2  c3 |      | c1  c3 |      | c1  c2 |

    Taking square roots we see that the claim is verified. QED

Theorem. The volume of the box determined by vectors a, b, c ∈ R3 is

V = |a1(b2c3 − b3c2) − a2(b1c3 − b3c1) + a3(b1c2 − b2c1)| = |a · (b × c)|.

Proof. Consider the box determined by a, b, c ∈ R3:

    Figure 4.5, Page 244.


    The volume of the box is the height times the area of the base. The area

of the base is ‖b × c‖ by the previous theorem. Now the height is

h = ‖a‖ |cos θ| = (‖b × c‖ ‖a‖ |cos θ|)/‖b × c‖ = |(b × c) · a| / ‖b × c‖.

(Notice that if b × c is in the opposite direction as given in the illustration above, then θ would be greater than π/2 and cos θ would be negative. Therefore the absolute value is necessary.) Therefore

V = (Area of base)(height) = ‖b × c‖ · |(b × c) · a| / ‖b × c‖ = |(b × c) · a|.

    QED

Note. The volume of the box determined by a, b, c ∈ R3 can be computed in a manner similar to cross products:

V = |det(A)|, where

A = [ a1  a2  a3 ]
    [ b1  b2  b3 ]
    [ c1  c2  c3 ].

    Example. Page 249 number 37.
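A small numerical check (the vectors are made-up examples): the box volume via the scalar triple product a · (b × c) agrees with |det(A)| for the matrix whose rows are a, b, c.

```python
import numpy as np

a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 3.0, 0.0])
c = np.array([1.0, 1.0, 1.0])

triple = np.dot(a, np.cross(b, c))           # scalar triple product a . (b x c)
A = np.vstack([a, b, c])                     # rows a, b, c
print(abs(triple), abs(np.linalg.det(A)))    # both give the box volume, 3.0
```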


    Theorem 4.1. Properties of Cross Product.

Let a, b, c ∈ R3.

(1) Anticommutativity: b × c = −(c × b).

(2) Nonassociativity of ×: a × (b × c) ≠ (a × b) × c (that is, equality does not in general hold).

(3) Distributive Properties: a × (b + c) = (a × b) + (a × c) and (a + b) × c = (a × c) + (b × c).

(4) Perpendicular Property: b · (b × c) = (b × c) · c = 0.

(5) Area Property: ‖b × c‖ = Area of the parallelogram determined by b and c.

(6) Volume Property: a · (b × c) = (a × b) · c, and |a · (b × c)| = Volume of the box determined by a, b, and c.

(7) a × (b × c) = (a · c)b − (a · b)c.

    Proof of (1). We have

b × c = | b2  b3 | i  −  | b1  b3 | j  +  | b1  b2 | k
        | c2  c3 |       | c1  c3 |       | c1  c2 |

      = (b2c3 − b3c2) i − (b1c3 − b3c1) j + (b1c2 − b2c1) k

      = −[ (b3c2 − b2c3) i − (b3c1 − b1c3) j + (b2c1 − b1c2) k ]

      = −[ | c2  c3 | i  −  | c1  c3 | j  +  | c1  c2 | k ]
           | b2  b3 |       | b1  b3 |       | b1  b2 |

      = −(c × b).

    QED

    Example. Page 249 number 56.


    Chapter 4. Determinants

    4.2 The Determinant of a Square Matrix

Definition. The minor matrix Aij of an n × n matrix A is the (n − 1) × (n − 1) matrix obtained from it by eliminating the ith row and the jth column.

    Example. Find A11, A12, and A13 for

A = [ a11  a12  a13 ]
    [ a21  a22  a23 ]
    [ a31  a32  a33 ].

Definition. The determinant of Aij times (−1)^(i+j) is the cofactor of entry aij in A, denoted a′ij.


Example. We can write determinants of 3 × 3 matrices in terms of cofactors:

det(A) = | a11  a12  a13 |
         | a21  a22  a23 |
         | a31  a32  a33 |

       = a11|A11| − a12|A12| + a13|A13|

       = a11a′11 + a12a′12 + a13a′13.

    Note. The following definition is recursive. For example, in order to

    process the definition for n = 4 you must process the definition for n = 3,

    n = 2, and n = 1.

Definition 4.1. The determinant of a 1 × 1 matrix is its single entry. Let n > 1 and assume the determinants of order less than n have been defined. Let A = [aij] be an n × n matrix. The cofactor of aij in A is a′ij = (−1)^(i+j) det(Aij). The determinant of A is

det(A) = a11a′11 + a12a′12 + · · · + a1na′1n = Σ_{i=1}^{n} a1i a′1i.

    Example. Page 252 Example 2.
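A minimal sketch of Definition 4.1 in code, expanding along the first row; it is tested on the 4 × 4 matrix used later in this section. (The recursion is fine for small matrices but far slower than row reduction.)

```python
import numpy as np

def det_recursive(A):
    """Determinant by cofactor expansion along the first row (Definition 4.1)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Minor matrix A_{1j}: delete row 1 and column j+1 (0-based indices 0 and j)
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        cofactor = (-1) ** j * det_recursive(minor)   # sign (-1)^(1+(j+1)) = (-1)^j
        total += A[0, j] * cofactor
    return total

A = [[0, 0, 0, 1],
     [0, 1, 2, 0],
     [0, 4, 5, 9],
     [1, 15, 6, 57]]
print(det_recursive(A), np.linalg.det(A))   # both 3 (up to rounding)
```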


    Theorem 4.2. General Expansion by Minors.

    The determinant of A can be calculated by expanding about any row or

    column:

det(A) = ar1a′r1 + ar2a′r2 + · · · + arna′rn

       = a1sa′1s + a2sa′2s + · · · + ansa′ns

for any 1 ≤ r ≤ n or 1 ≤ s ≤ n.

    Proof. See Appendix B for a proof which uses mathematical induction.

    Example. Find the determinant of

A = [ 0   0  0   1 ]
    [ 0   1  2   0 ]
    [ 0   4  5   9 ]
    [ 1  15  6  57 ].


    Theorem. Properties of the Determinant.

    Let A be a square matrix:

1. det(A) = det(Aᵀ).

2. If H is obtained from A by interchanging two rows, then det(H) = −det(A).

    3. If two rows of A are equal, then det(A) = 0.

    4. If H is obtained from A by multiplying a row of A by a scalar r, then

    det(H) = r det(A).

    5. If H is obtained from A by adding a scalar times one row to another

    row, then det(H) = det(A).

    Proof of 2. We will prove this by induction. The proof is trivial for

    n = 2. Assume that n > 2 and that this row interchange property holds

for square matrices of size smaller than n × n. Let A be an n × n matrix and let B be the matrix obtained from A by interchanging the ith row and the rth row. Since n > 2, we can choose a kth row for expansion by minors, where k ∉ {r, i}. Consider the cofactors

(−1)^(k+j)|Akj| and (−1)^(k+j)|Bkj|.


    These numbers must have opposite signs, by our induction hypothesis,

since the minor matrices Akj and Bkj have size (n − 1) × (n − 1), and Bkj can be obtained from Akj by interchanging two rows. That is, |Bkj| = −|Akj|. Expanding by minors on the kth row to find det(A) and det(B), we see that det(A) = −det(B). QED

    Note. Property 1 above implies that each property of determinants stated

    for rows also holds for columns.

    Example. Page 261 number 8.

    Theorem 4.3. Determinant Criterion for Invertibility.

A square matrix A is invertible if and only if det(A) ≠ 0.

Theorem 4.4. The Multiplicative Property.

If A and B are n × n matrices, then det(AB) = det(A) det(B).

    Examples. Page 262 numbers 28 and 32.
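These properties are easy to spot-check numerically (the random integer matrices below are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 4)).astype(float)
B = rng.integers(-3, 4, size=(4, 4)).astype(float)

print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))          # property 1
H = A[[1, 0, 2, 3], :]                                            # interchange rows 1 and 2
print(np.isclose(np.linalg.det(H), -np.linalg.det(A)))            # property 2
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))            # multiplicative property
```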


    Chapter 4. Determinants

4.3 Computation of Determinants and Cramer's Rule

Note. Computation of a Determinant.

The determinant of an n × n matrix A can be computed as follows:

    1. Reduce A to an echelon form using only row (column) addition and

    row (column) interchanges.

    2. If any matrices appearing in the reduction contain a row (column) of

    zeros, then det(A) = 0.

    3. Otherwise,

det(A) = (−1)^r · (product of pivots),

where r is the number of row (column) interchanges.

    Example. Page 271 number 6 (work as in Example 1 of page 264).
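A minimal sketch of this procedure in code (with partial pivoting added for numerical stability, which is not required by the note itself): reduce to echelon form using row additions and interchanges, count the interchanges, and multiply the pivots.

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via row reduction: row additions leave det unchanged,
    each row interchange flips its sign, and det = (-1)^r * (product of pivots)."""
    U = np.asarray(A, dtype=float).copy()
    n = U.shape[0]
    swaps = 0
    for k in range(n):
        pivot_row = k + np.argmax(np.abs(U[k:, k]))    # partial pivoting
        if np.isclose(U[pivot_row, k], 0.0):
            return 0.0                                  # column of zeros below the diagonal
        if pivot_row != k:
            U[[k, pivot_row]] = U[[pivot_row, k]]       # row interchange
            swaps += 1
        for i in range(k + 1, n):
            U[i] -= (U[i, k] / U[k, k]) * U[k]          # row addition only
    return (-1) ** swaps * np.prod(np.diag(U))

A = [[0, 0, 0, 1], [0, 1, 2, 0], [0, 4, 5, 9], [1, 15, 6, 57]]
print(det_by_elimination(A), np.linalg.det(A))          # both 3 (up to rounding)
```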


Theorem 4.5. Cramer's Rule.

Consider the linear system Ax = b, where A = [aij] is an n × n invertible matrix,

x = [x1, x2, . . . , xn]ᵀ and b = [b1, b2, . . . , bn]ᵀ.

The system has a unique solution given by

xk = det(Bk)/det(A) for k = 1, 2, . . . , n,

    where Bk is the matrix obtained from A by replacing the kth column

    vector of A by the column vector b.

    Proof. Since A is invertible, we know that the linear system Ax = b has

    a unique solution by Theorem 1.16. Let x be this unique solution. Let

Xk be the matrix obtained from the n × n identity matrix by replacing


    its kth column vector by the column vector x, so

Xk = [ 1  0  · · ·  x1  · · ·  0 ]
     [ 0  1  · · ·  x2  · · ·  0 ]
     [ ⋮  ⋮        ⋮          ⋮ ]
     [ 0  0  · · ·  xk  · · ·  0 ]
     [ ⋮  ⋮        ⋮          ⋮ ]
     [ 0  0  · · ·  xn  · · ·  1 ],

where the column containing x1, x2, . . . , xn is the kth column (so xk sits on the diagonal).

We now compute the product AXk. If j ≠ k, then the jth column of AXk is the product of A and the jth column of the identity matrix, which is

    just the jth column of A. If j = k, then the jth column of AXk is Ax = b.

    Thus AXk is the matrix obtained from A by replacing the kth column

    of A by the column vector b. That is, AXk is the matrix Bk described

    in the statement of the theorem. From the equation AXk = Bk and the

    multiplicative property of determinants, we have

    det(A) det(Xk) = det(Bk).

    Computing det(Xk) by expanding by minors across the kth row, we see

that det(Xk) = xk and thus det(A)xk = det(Bk). Because A is invertible, we know that det(A) ≠ 0 by Theorem 4.3, and so xk = det(Bk)/det(A) as claimed. QED


    Example. Page 272 number 26.
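A direct transcription of Cramer's Rule into code (the small system is a made-up example; for large systems this is much more expensive than elimination):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b via x_k = det(B_k)/det(A), where B_k is A with its
    kth column replaced by b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("A is not invertible, so Cramer's Rule does not apply")
    x = np.empty(len(b))
    for k in range(len(b)):
        Bk = A.copy()
        Bk[:, k] = b                      # replace the kth column by b
        x[k] = np.linalg.det(Bk) / d
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
print(cramer_solve(A, b))                 # [0.8 1.4]
print(np.linalg.solve(A, b))              # same solution
```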

Note. Recall that a′ij = (−1)^(i+j) det(Aij) is the cofactor of entry aij (the signed determinant of the minor matrix associated with aij).

Definition. For an n × n matrix A = [aij], define the adjoint of A as

adj(A) = (A′)ᵀ

where A′ = [a′ij].

    Example. Page 272 number 18 (find the adjoint of A).

    Theorem 4.6. Property of the Adjoint.

Let A be n × n. Then

    (adj(A))A = A adj(A) = (det(A))I.

Corollary. A Formula for A⁻¹.

Let A be n × n and suppose det(A) ≠ 0. Then

A⁻¹ = (1/det(A)) adj(A).


    Example. Page 272 number 18 (use the corollary to find A1).
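A sketch of the adjoint and of the corollary A⁻¹ = (1/det(A)) adj(A) in code (the 2 × 2 example is made up):

```python
import numpy as np

def adjoint(A):
    """adj(A) = transpose of the matrix of cofactors a'_ij = (-1)^(i+j) det(A_ij)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(adjoint(A))                              # [[ 4. -2.] [-3.  1.]]
print(adjoint(A) @ A)                          # (det A) I = -2 I
print(adjoint(A) / np.linalg.det(A))           # equals np.linalg.inv(A)
```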

Note. If

A = [ a  b ]
    [ c  d ],

then

adj(A) = [  d  −b ]
         [ −c   a ]

and det(A) = ad − bc, so

A⁻¹ = (1/(ad − bc)) [  d  −b ]
                    [ −c   a ].


    Chapter 5. Eigenvalues and Eigenvectors

    5.1 Eigenvalues and Eigenvectors

Definition 5.1. Let A be an n × n matrix. A scalar λ is an eigenvalue of A if there is a nonzero column vector v ∈ Rn such that Av = λv. The vector v is then an eigenvector of A corresponding to λ.

Note. If Av = λv then Av − λv = 0 and so (A − λI)v = 0. This equation has a nontrivial solution only when det(A − λI) = 0.

Definition. det(A − λI) is a polynomial of degree n in λ (where A is n × n) called the characteristic polynomial of A, denoted p(λ), and the equation p(λ) = 0 is called the characteristic equation.

    Example. Page 300 number 8.
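A quick numerical illustration of Definition 5.1 and the characteristic polynomial (the matrix is a made-up example):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# Coefficients of the characteristic polynomial: lambda^2 - 7 lambda + 10
print(np.poly(A))                        # [ 1. -7. 10.]

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                       # 5 and 2 (order may vary)
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True: A v = lambda v for a nonzero v
```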


    Theorem 5.1. Properties of Eigenvalues and Eigenvectors.

Let A be an n × n matrix.

1. If λ is an eigenvalue of A with v as a corresponding eigenvector, then λᵏ is an eigenvalue of Aᵏ, again with v as a corresponding eigenvector, for any positive integer k.

2. If λ is an eigenvalue of an invertible matrix A with v as a corresponding eigenvector, then λ ≠ 0 and 1/λ is an eigenvalue of A⁻¹, again with v as a corresponding eigenvector.

3. If λ is an eigenvalue of A, then the set Eλ consisting of the zero vector together with all eigenvectors of A for this eigenvalue is a subspace of n-space, the eigenspace of λ.

Proof of (2). (Page 301 number 28.) By definition, v ≠ 0. If λ is an eigenvalue of A with eigenvector v, then Av = λv; since A is invertible and v ≠ 0, Av ≠ 0, and so λ ≠ 0. Therefore A⁻¹Av = A⁻¹(λv), or v = λA⁻¹v. So A⁻¹v = (1/λ)v and 1/λ is an eigenvalue of A⁻¹, with v as a corresponding eigenvector. QED


    Note. We define eigenvalue and eigenvector for a linear transformation

    in the most obvious way (that is, in terms of the matrix which represents

    it).

    Definition 5.2. Eigenvalues and Eigenvectors.

    Let T be a linear transformation of a vector space V into itself. A scalar

λ is an eigenvalue of T if there is a nonzero vector v ∈ V such that T(v) = λv. The vector v is then an eigenvector of T corresponding to λ.

    Examples. Page 300 number 18, page 301 number 32.


    Chapter 5. Eigenvalues and Eigenvectors

    5.2 Diagonalization

    Recall. A matrix is diagonal if all entries off the main diagonal are 0.

    Note. In this section, the theorems stated are valid for matrices and

    vectors with complex entries and complex scalars, unless stated otherwise.

    Theorem 5.2. Matrix Summary of Eigenvalues of A.

Let A be an n × n matrix and let λ1, λ2, . . . , λn be (possibly complex) scalars and v1, v2, . . . , vn be nonzero vectors in n-space. Let C be the n × n matrix having vj as jth column vector and let

D = [ λ1   0   0  · · ·   0 ]
    [  0  λ2   0  · · ·   0 ]
    [  0   0  λ3  · · ·   0 ]
    [  ⋮   ⋮   ⋮   ⋱     ⋮ ]
    [  0   0   0  · · ·  λn ].

Then AC = CD if and only if λ1, λ2, . . . , λn are eigenvalues of A and vj is an eigenvector of A corresponding to λj for j = 1, 2, . . . , n.


Proof. We have

CD = [ v1  v2  · · ·  vn ] D = [ λ1v1  λ2v2  · · ·  λnvn ],

where the vj (and hence the λjvj) are the column vectors. Also,

AC = A [ v1  v2  · · ·  vn ] = [ Av1  Av2  · · ·  Avn ].

Therefore, AC = CD if and only if Avj = λjvj for j = 1, 2, . . . , n. QED

Note. The n × n matrix C is invertible if and only if rank(C) = n, that is, if and only if the column vectors of C form a basis of n-space. In this case, the criterion AC = CD in Theorem 5.2 can be written as D = C⁻¹AC. The equation D = C⁻¹AC transforms a matrix A into a

    diagonal matrix D that is much easier to work with.
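A short check of Theorem 5.2 and the note above (the matrix is a made-up example): the eigenvector matrix C and the diagonal matrix D of eigenvalues satisfy AC = CD and D = C⁻¹AC.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

eigenvalues, C = np.linalg.eig(A)     # columns of C are eigenvectors of A
D = np.diag(eigenvalues)

print(np.allclose(A @ C, C @ D))                      # AC = CD
print(np.allclose(np.linalg.inv(C) @ A @ C, D))       # D = C^(-1) A C
```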


    Definition 5.3. Diagonalizable Matrix.

An n × n matrix A is diagonalizable if there exists an invertible matrix C such that C⁻¹AC = D is a diagonal matrix. The matrix C is said to

    diagonalize the matrix A.

    Corollary 1. A Criterion for Diagonalization.

An n × n matrix A is diagonalizable if and only if n-space has a basis consisting of eigenvectors of A.

Corollary 2. Computation of Aᵏ.

Let an n × n matrix A have n eigenvectors and eigenvalues, giving rise to the matrices C and D so that AC = CD, as described in Theorem 5.2.

    If the eigenvectors are independent, then C is an invertible matrix and

C⁻¹AC = D. Under these conditions, we have Aᵏ = CDᵏC⁻¹.

    Proof. By Corollary 1, if the eigenvectors of A are independent, then A

    is diagonalizable and so C is invertible. Now consider

Aᵏ = (CDC⁻¹)(CDC⁻¹) · · · (CDC⁻¹)   (k factors)

   = CD(C⁻¹C)D(C⁻¹C)D(C⁻¹C) · · · (C⁻¹C)DC⁻¹

   = CDIDID · · · IDC⁻¹

   = C(DD · · · D)C⁻¹   (k factors of D)

   = CDᵏC⁻¹.

    QED
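A numerical check of Corollary 2 on a made-up matrix: Aᵏ computed as CDᵏC⁻¹ matches repeated multiplication.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
k = 5

eigenvalues, C = np.linalg.eig(A)
Dk = np.diag(eigenvalues ** k)                 # D^k: just raise the diagonal entries to k
Ak = C @ Dk @ np.linalg.inv(C)

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```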

    Theorem 5.3. Independence of Eigenvectors.

Let A be an n × n matrix. If v1, v2, . . . , vn are eigenvectors of A corresponding to distinct eigenvalues λ1, λ2, . . . , λn, respectively, then the set {v1, v2, . . . , vn} is linearly independent and A is diagonalizable.

    Proof. We prove this by contradiction. Suppose that the conclusion

is false and the hypotheses are true. That is, suppose the eigenvectors v1, v2, . . . , vn are linearly dependent. Then one of them is a linear combination of its predecessors (see page 203 number 37). Let vk be the first such vector, so that

vk = d1v1 + d2v2 + · · · + dk−1vk−1   (2)

and {v1, v2, . . . , vk−1} is independent. Multiplying (2) by λk, we obtain

λkvk = d1λkv1 + d2λkv2 + · · · + dk−1λkvk−1.   (3)

Also, multiplying (2) on the left by the matrix A yields

λkvk = d1λ1v1 + d2λ2v2 + · · · + dk−1λk−1vk−1,   (4)

since Avi = λivi. Subtracting (4) from (3), we see that

0 = d1(λk − λ1)v1 + d2(λk − λ2)v2 + · · · + dk−1(λk − λk−1)vk−1.

But this equation is a dependence relation, since not all di's are 0 and the λ's are hypothesized to be distinct. This contradicts the linear independence of the set {v1, v2, . . . , vk−1}. This contradiction shows that {v1, v2, . . . , vn} is independent. From Corollary 1 of Theorem 5.2 we see that A is diagonalizable. QED

    Example. Page 315 number 4.

Definition 5.4. An n × n matrix P is similar to an n × n matrix Q if there exists an invertible n × n matrix C such that C⁻¹PC = Q.

    Example. Page 315 number 18.

Definition. The algebraic multiplicity of an eigenvalue λi of A is its multiplicity as a root of the characteristic equation of A. Its geometric multiplicity is the dimension of the eigenspace Eλi.


    Theorem. The geometric multiplicity of an eigenvalue of a matrix A is

    less than or equal to its algebraic multiplicity.

    Note. The proof of this result is a problem (number 33) in section 9.4.

    Theorem 5.4. A Criterion for Diagonalization.

An n × n matrix A is diagonalizable if and only if the algebraic multiplicity of each (possibly complex) eigenvalue is equal to its geometric multiplicity.

    Example. Page 315 number 10.

    Theorem 5.5. Diagonalization of Real Symmetric Matrices.

Every real symmetric matrix is real diagonalizable. That is, if A is an n × n symmetric matrix with real-number entries, then each eigenvalue of

    A is a real number, and its algebraic multiplicity equals its geometric

    multiplicity.

    Note. The proof of Theorem 5.5 is in Chapter 9 and uses the Jordan

canonical form of matrix A.

    Example. Page 316 number 26.


    Chapter 6. Orthogonality

    6.1 Projections

    Note. We want to find the projection p of vector F on sp(a):

    Figure 6.1, Page 327.

We see that p is a multiple of a. Now (1/‖a‖)a is a unit vector having the same direction as a, so p is a scalar multiple of this unit vector. We need only find the appropriate scalar. From the above figure, we see that the appropriate scalar is ‖F‖ cos θ, because it is the length of the leg labeled p of the right triangle. If p is in the opposite direction of a and θ ∈ (π/2, π]:


    Figure 6.2, Page 327.

then the appropriate scalar is again given by ‖F‖ cos θ. Thus

p = (‖F‖ cos θ)(1/‖a‖)a = ((‖F‖‖a‖ cos θ)/(‖a‖‖a‖)) a = ((F · a)/(a · a)) a.

    We use this to motivate the following definition.

Definition. Let a, b ∈ Rn. The projection p of b on sp(a) is

p = ((b · a)/(a · a)) a.

    Example. Page 336 number 4.
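The definition translated directly into code (the vectors are made-up examples); the difference b − p is orthogonal to a, as the figures suggest.

```python
import numpy as np

def proj_on_span(b, a):
    """Projection of b on sp(a): p = ((b . a)/(a . a)) a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (np.dot(b, a) / np.dot(a, a)) * a

b = np.array([3.0, 4.0])
a = np.array([1.0, 0.0])
p = proj_on_span(b, a)
print(p)                       # [3. 0.]
print(np.dot(b - p, a))        # 0: b - p is orthogonal to a
```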


    Definition 6.1. Let W be a subspace of Rn. The set of all vectors in Rn

    that are orthogonal to every vector in W is the orthogonal complement

of W and is denoted by W⊥.

    Note. To find the orthogonal complement of a subspace of Rn:

    1. Find a matrix A having as row vectors a generating set for W .

2. Find the nullspace of A, that is, the solution space of Ax = 0. This nullspace is W⊥.

    Example. Page 336 number 10.
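A sketch of the two-step procedure above, using the SVD to get the nullspace (the generating set for W below is a made-up example):

```python
import numpy as np

def orthogonal_complement(W_rows):
    """Basis for W-perp: the nullspace of the matrix A whose rows generate W."""
    A = np.asarray(W_rows, dtype=float)
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12))
    return Vt[rank:]                      # rows spanning the nullspace of A

# W = sp([1, 1, 0], [0, 1, 1]) in R^3
W_perp = orthogonal_complement([[1, 1, 0], [0, 1, 1]])
print(W_perp)                             # one basis vector, a multiple of [1, -1, 1]
print(np.array([[1, 1, 0], [0, 1, 1]]) @ W_perp.T)   # zeros: each basis vector solves Ax = 0
```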

Theorem 6.1. Properties of W⊥.

The orthogonal complement W⊥ of a subspace W of Rn has the following properties:

1. W⊥ is a subspace of Rn.

2. dim(W⊥) = n − dim(W).

3. (W⊥)⊥ = W.


4. Each vector b ∈ Rn can be expressed uniquely in the form b = bW + bW⊥ for bW ∈ W and bW⊥ ∈ W⊥.

Proof of (1) and (2). Let dim(W) = k, and let {v1, v2, . . . , vk} be a basis for W. Let A be the k × n matrix having vi as its ith row vector for i = 1, 2, . . . , k.

Property (1) follows from the fact that W⊥ is the nullspace of matrix

    A and therefore is a subspace of Rn.

    For Property (2), consider the rank equation of A:

    rank(A) + nullity(A) = n.

Since dim(W) = rank(A) and since W⊥ is the nullspace of A, then

dim(W⊥) = n − dim(W). QED

Definition 6.2. Let b ∈ Rn, and let W be a subspace of Rn. Let b = bW + bW⊥ be as described in Theorem 6.1. Then bW is the projection of b on W.


    Note. To find the projection of b on W , follow these steps:

    1. Select a basis {v1, v2, . . . , vk} for the subspace W .

2. Find a basis {vk+1, vk+2, . . . , vn} for W⊥.

3. Find the coordinate vector r = [r1, r2, . . . , rn] of b relative to the basis (v1, v2, . . . , vn), so that

b = r1v1 + r2v2 + · · · + rnvn.

4. Then bW = r1v1 + r2v2 + · · · + rkvk.

    Example. Page 336 number 20b.

    Note. We can perform projections in inner product spaces by replacing

    the dot products in the formulas above with inner products.

Example. Page 335 Example 6. Consider the inner product space P[0,1] of all polynomial functions defined on the interval [0, 1] with inner product

⟨p(x), q(x)⟩ = ∫₀¹ p(x)q(x) dx.

Find the projection of f(x) = x on sp(1) and then find the projection of x on sp(1)⊥.

    Example. Page 337 number 28.


    Chapter 6. Orthogonality

6.2 The Gram-Schmidt Process

Definition. A set {v1, v2, . . . , vk} of nonzero vectors in Rn is orthogonal if the vectors vj are mutually perpendicular, that is, if vi · vj = 0 for i ≠ j.

    Theorem 6.2. Orthogonal Bases.

Let {v1, v2, . . . , vk} be an orthogonal set of nonzero vectors in Rn. Then this set is independent and consequently is a basis for the subspace sp(v1, v2, . . . , vk).

    Proof. Let j be an integer between 2 and k. Consider

vj = s1v1 + s2v2 + · · · + sj−1vj−1.

If we take the dot product of each side of this equation with vj then, since the set of vectors is orthogonal, we get vj · vj = 0, which contradicts the hypothesis that vj ≠ 0. Therefore no vj is a linear combination of its predecessors, and by Exercise 37 page 203 the set is independent. Therefore the set is a basis for its span. QED


    Theorem 6.3. Projection Using an Orthogonal Basis.

Let {v1, v2, . . . , vk} be an orthogonal basis for a subspace W of Rn, and let b ∈ Rn. The projection of b on W is

bW = ((b · v1)/(v1 · v1)) v1 + ((b · v2)/(v2 · v2)) v2 + · · · + ((b · vk)/(vk · vk)) vk.

Proof. We know from Theorem 6.1 that b = bW + bW⊥ where bW is the projection of b on W and bW⊥ is the projection of b on W⊥. Since bW ∈ W and {v1, v2, . . . , vk} is a basis of W, then

bW = r1v1 + r2v2 + · · · + rkvk

for some scalars r1, r2, . . . , rk. We now find these ri's. Taking the dot product of b with vi we have

b · vi = (bW · vi) + (bW⊥ · vi) = (r1v1 · vi + r2v2 · vi + · · · + rkvk · vi) + 0 = ri(vi · vi).

Therefore ri = (b · vi)/(vi · vi) and so

rivi = ((b · vi)/(vi · vi)) vi.

    Substituting these values of the ris into the expression for bW yields the

    theorem. QED


    Example. Page 347 number 4.

Definition 6.3. Let W be a subspace of Rn. A basis {q1, q2, . . . , qk} for W is orthonormal if

1. qi · qj = 0 for i ≠ j, and

2. qi · qi = 1.

    That is, each vector of the basis is a unit vector and the vectors are pairwise

    orthogonal.

    Note. If {q1, q2, . . . , qk} is an orthonormal basis for W , then

bW = (b · q1)q1 + (b · q2)q2 + · · · + (b · qk)qk.

    Theorem 6.4. Orthonormal Basis (Gram-Schmidt) Theorem.

Let W be a subspace of Rn, let {a1, a2, . . . , ak} be any basis for W, and let

    Wj = sp(a1, a2, . . . , aj) for j = 1, 2, . . . , k.

Then there is an orthonormal basis {q1, q2, . . . , qk} for W such that Wj = sp(q1, q2, . . . , qj).


    Note. The proof of Theorem 6.4 is computational. We summarize the

    proof in the following procedure:

    Gram-Schmidt Process.

    To find an orthonormal basis for a subspace W of Rn:

    1. Find a basis {a1, a2, . . . , ak} for W .

    2. Let v1 = a1. For j = 1, 2, . . . , k, compute in succession the vector vj

    given by subtracting from aj its projection on the subspace generated

    by its predecessors.

    3. The vj so obtained form an orthogonal basis for W , and they may be

    normalized to yield an orthonormal basis.

    Note. We can recursively describe the way to find vj as:

vj = aj − (((aj · v1)/(v1 · v1)) v1 + ((aj · v2)/(v2 · v2)) v2 + · · · + ((aj · vj−1)/(vj−1 · vj−1)) vj−1).

If we normalize the vj as we go by letting qj = (1/‖vj‖)vj, then we have

vj = aj − ((aj · q1)q1 + (aj · q2)q2 + · · · + (aj · qj−1)qj−1).

    Example. Page 348 number 10.
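A minimal sketch of the Gram-Schmidt Process as summarized above (the input basis is a made-up example): subtract from each aj its projection on the span of the previous q's, then normalize.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormal basis for sp(vectors) via the Gram-Schmidt Process."""
    q = []
    for a in np.asarray(vectors, dtype=float):
        v = a - sum(np.dot(a, qi) * qi for qi in q)   # a_j minus its projection on sp(q1,...)
        norm = np.linalg.norm(v)
        if norm > 1e-12:                              # skip vectors dependent on predecessors
            q.append(v / norm)
    return np.array(q)

Q = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(np.round(Q @ Q.T, 10))    # identity matrix: the rows are orthonormal
```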


    Corollary 2. Expansion of an Orthogonal Set to an Orthog-

    onal Basis.

    Every orthogonal set of vectors in a subspace W of Rn can be expanded

    if necessary to an orthogonal basis of W .

    Examples. Page 348 number 20 and page 349 number 34.


    Chapter 6. Orthogonality

    6.3 Orthogonal Matrices

Definition 6.4. An n × n matrix A is orthogonal if AᵀA = I.

    Note. We will see that the columns of an orthogonal matrix must be

    unit vectors and that the columns of an orthogonal matrix are mutually

orthogonal.