
Transcript of Fraleigh = Linear Algebra.pdf

  • 1.1 Vectors in Euclidean Space 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.1. Vectors in Euclidean Spaces

    Definition. The space Rn, or Euclidean n-space, is either (1) the

    collection of all n-tuples of the form (x1, x2, . . . , xn) where the xis are

    real numbers (the n-tuples are called points), or (2) the collection of all

    n-tuples of the form [x1, x2, . . . , xn] where the xis are real numbers (the

    n-tuples are called vectors).

    Note. There is as yet no difference between points and vectors.

    Note. R1 is just the collection of real numbers (which we know to have an

    algebraic structure: addition and subtraction, say). R2 is the collection

    of all points in the Cartesian plane.

    Notation. The book denotes vectors with bold faced letters. We use

    letters (usually lower case) with little arrows over them:

    x = [x1, x2, . . . , xn].

  • 1.1 Vectors in Euclidean Space 2

    Definition. For x ∈ Rn, say x = [x1, x2, . . . , xn], the ith component of x is xi.

    Definition. Two vectors in Rn, v = [v1, v2, . . . , vn] and w = [w1, w2, . . . , wn],

    are equal if their corresponding components are equal. The zero vector, 0, in Rn

    is the vector of all zero components.

    Note. We have the following geometric interpretation of vectors: A

    vector v ∈ R2 can be drawn in standard position in the Cartesian plane by drawing an arrow from the point (0, 0) to the point (v1, v2) where

    v = [v1, v2]:

  • 1.1 Vectors in Euclidean Space 3

    We can draw v translated to point P as follows:

    Notice that both of these are representations of the same vector v.

    Note. In physics, forces are represented by arrows (or vectors) and if

    two forces F1 and F2 are applied to an object, the resulting force F1 + F2

    satisfies a parallelogram property:

    Figure 1.1.5, page 5

  • 1.1 Vectors in Euclidean Space 4

    You can also talk about scaling a force by a constant c (we call these

    constants scalars as opposed to vectors and points):

    This inspires us to make the following definitions.

    Definition 1.1. Let v = [v1, v2, . . . , vn] and w = [w1, w2, . . . , wn] be

    vectors in Rn and let r ∈ R be a scalar. Define:
    1. Vector addition: v + w = [v1 + w1, v2 + w2, . . . , vn + wn],
    2. Vector subtraction: v − w = [v1 − w1, v2 − w2, . . . , vn − wn], and
    3. Scalar multiplication: rv = [rv1, rv2, . . . , rvn].
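    These componentwise definitions translate directly into code; here is a small illustrative sketch in plain Python (not part of the original notes):

    # Componentwise vector operations in R^n (Definition 1.1), illustrative sketch
    def vec_add(v, w):
        return [vi + wi for vi, wi in zip(v, w)]

    def vec_sub(v, w):
        return [vi - wi for vi, wi in zip(v, w)]

    def scalar_mult(r, v):
        return [r * vi for vi in v]

    v, w = [1, 2, 3], [4, 5, 6]
    print(vec_add(v, w))      # [5, 7, 9]
    print(vec_sub(v, w))      # [-3, -3, -3]
    print(scalar_mult(2, v))  # [2, 4, 6]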

    Example. Page 16 numbers 10 and 14.

  • 1.1 Vectors in Euclidean Space 5

    Theorem 1.1. Properties of Vector Algebra in Rn.

    Let u, v, w ∈ Rn and let r, s be scalars in R. Then
    A1. Associativity of Vector Addition. (u + v) + w = u + (v + w)
    A2. Commutativity of Vector Addition. v + w = w + v
    A3. Additive Identity. 0 + v = v
    A4. Additive Inverses. v + (−v) = 0
    S1. Distribution of Scalar Multiplication over Vector Addition.

    r(v + w) = rv + r w

    S2. Distribution of Scalar Addition over Scalar Multiplication.

    (r + s)v = rv + sv

    S3. Associativity. r(sv) = (rs)v

    S4. Preservation of Scale. 1v = v

    Example. Page 17 number 40a (prove A1).

    Definition 1.2. Two nonzero vectors v, w ∈ Rn are parallel, denoted v ∥ w, if one is a scalar multiple of the other. If v = rw with r > 0, then v and w have the same direction, and if v = rw with r < 0 then v and

    w have opposite directions.

  • 1.1 Vectors in Euclidean Space 6

    Example. Page 16 number 22.

    Definition 1.3. Given vectors v1, v2, . . . , vk ∈ Rn and scalars r1, r2, . . . , rk ∈ R, the vector

    r1v1 + r2v2 + · · · + rkvk = Σ_{l=1}^{k} rl vl

    is a linear combination of the given vectors with the given scalars as

    scalar coefficients.

    Note. Sometimes there are special vectors for which it is easy to express

    a vector in terms of a linear combination of these special vectors.

    Definition. The standard basis vectors in R2 are i = [1, 0] and j =

    [0, 1]. The standard basis vectors in R3 are

    i = [1, 0, 0], j = [0, 1, 0], and k = [0, 0, 1].

    Note. It's easy to write a vector in terms of the standard basis vectors:

    b = [b1, b2] = b1[1, 0] + b2[0, 1] = b1i + b2j and

    b = [b1, b2, b3] = b1[1, 0, 0] + b2[0, 1, 0] + b3[0, 0, 1] = b1i + b2j + b3k.

  • 1.1 Vectors in Euclidean Space 7

    Definition. In Rn, the rth standard basis vector, denoted er, is

    er = [0, 0, . . . , 0, 1, 0, . . . , 0],

    where the rth component is 1 and all other components are 0.

    Notice. A vector b ∈ Rn can be uniquely expressed in terms of the standard basis vectors:

    b = [b1, b2, . . . , bn] = b1e1 + b2e2 + · · · + bnen = Σ_{l=1}^{n} bl el.

    Definition. If v ∈ Rn is a nonzero vector, then the line along v is the collection of all vectors of the form rv for some scalar r ∈ R (notice 0 is on all such lines). For two nonzero nonparallel vectors v, w ∈ Rn, the collection of all possible linear combinations of these vectors, rv + sw

    where r, s ∈ R, is the plane spanned by v and w.

    Definition. A column vector in Rn is a representation of a vector as

    x =
    [ x1 ]
    [ x2 ]
    [ ⋮  ]
    [ xn ].

  • 1.1 Vectors in Euclidean Space 8

    A row vector in Rn is a representation of a vector as

    x = [x1, x2, . . . , xn].

    The transpose of a row vector, denoted xT, is a column vector, and conversely:

    [ x1 ]
    [ x2 ]  T
    [ ⋮  ]      =  [x1, x2, . . . , xn],

    and

    [x1, x2, . . . , xn]T  =
    [ x1 ]
    [ x2 ]
    [ ⋮  ]
    [ xn ].

    Note. A linear combination of column vectors can easily be translated

    into a system of linear equations:

    r [ 1 ]  +  s [ −2 ]  =  [  1 ]
      [ 3 ]       [  5 ]     [ 19 ]

    r − 2s = 1
    3r + 5s = 19.
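    As an illustration (not from the notes), the vector equation above can be checked numerically by solving the corresponding linear system; numpy is assumed to be available:

    import numpy as np

    # Columns of A are the two column vectors; b is the right-hand side.
    A = np.array([[1.0, -2.0],
                  [3.0,  5.0]])
    b = np.array([1.0, 19.0])

    # Solve A [r, s]^T = b, i.e. r - 2s = 1 and 3r + 5s = 19.
    r, s = np.linalg.solve(A, b)
    print(r, s)
    # The solution satisfies r*[1,3] + s*[-2,5] = [1,19].
    print(np.allclose(r * A[:, 0] + s * A[:, 1], b))  # True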

    Definition 1.4. Let v1, v2, . . . , vk ∈ Rn. The span of these vectors is the set of all linear combinations of them, denoted sp(v1, v2, . . . , vk):

    sp(v1, v2, . . . , vk) = {r1v1 + r2v2 + · · · + rkvk | r1, r2, . . . , rk ∈ R}
                           = { Σ_{l=1}^{k} rl vl | r1, r2, . . . , rk ∈ R }.

    Example. Page 16 number 28.

  • 1.2 The Norm and Dot Product 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.2. The Norm and Dot Product

    Definition 1.5. Let v = [v1, v2, . . . , vn] ∈ Rn. The norm or magnitude of v is

    ‖v‖ = √(v1² + v2² + · · · + vn²) = √( Σ_{l=1}^{n} vl² ).
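    A quick numerical illustration of the norm and of the scaling property ‖rv‖ = |r|‖v‖ stated in Theorem 1.2 below (a sketch, assuming numpy is available):

    import numpy as np

    v = np.array([3.0, 4.0])
    norm_v = np.sqrt(np.sum(v**2))                 # sqrt(v1^2 + ... + vn^2)
    print(norm_v)                                  # 5.0
    print(np.isclose(norm_v, np.linalg.norm(v)))   # True, matches the built-in norm

    r = -2.0
    print(np.isclose(np.linalg.norm(r * v), abs(r) * norm_v))  # True: ||rv|| = |r| ||v||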

    Theorem 1.2. Properties of the Norm in Rn.

    For all v, w ∈ Rn and for all scalars r ∈ R, we have:
    1. ‖v‖ ≥ 0, and ‖v‖ = 0 if and only if v = 0.
    2. ‖rv‖ = |r| ‖v‖.
    3. ‖v + w‖ ≤ ‖v‖ + ‖w‖ (the Triangle Inequality).

    Note. 1 and 2 are easy to see and we will prove 3 later in this section.

  • 1.2 The Norm and Dot Product 2

    Note. A picture for the Triangle Inequality is:

    1.2.22, page 22

    Definition. A vector with norm 1 is called a unit vector. When writing,

    unit vectors are frequently denoted with a hat: î.

    Example. Page 31 number 8.

    Definition 1.6. The dot product for v = [v1, v2, . . . , vn] and w =

    [w1, w2, . . . , wn] is

    v · w = v1w1 + v2w2 + · · · + vnwn = Σ_{l=1}^{n} vl wl.

  • 1.2 The Norm and Dot Product 3

    Notice. If we let θ be the angle between nonzero vectors v and w, then
    we get by the Law of Cosines:

    Figure 1.2.24, page 23

    ‖v‖² + ‖w‖² = ‖v − w‖² + 2‖v‖‖w‖ cos θ

    or

    (v1² + v2² + · · · + vn²) + (w1² + w2² + · · · + wn²)
        = (v1 − w1)² + (v2 − w2)² + · · · + (vn − wn)² + 2‖v‖‖w‖ cos θ

    or

    2v1w1 + 2v2w2 + · · · + 2vnwn = 2‖v‖‖w‖ cos θ

    or

    2 v · w = 2‖v‖‖w‖ cos θ

  • 1.2 The Norm and Dot Product 4

    or

    cos θ = (v · w) / (‖v‖‖w‖).   (∗)

    Definition. The angle θ between nonzero vectors v and w is

    θ = arccos( (v · w) / (‖v‖‖w‖) ).
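    Formula (∗) translates directly into a computation of the angle; an illustrative sketch (assuming numpy):

    import numpy as np

    v = np.array([1.0, 0.0])
    w = np.array([1.0, 1.0])

    cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    theta = np.arccos(cos_theta)
    print(np.degrees(theta))  # 45.0, the angle between [1, 0] and [1, 1]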

    Theorem 1.4. Schwarz's Inequality.

    Let v, w ∈ Rn. Then |v · w| ≤ ‖v‖‖w‖.

    Proof. This follows from (∗) and the fact that −1 ≤ cos θ ≤ 1. The book gives an algebraic proof. QED

    Example. Page 31 number 12.

  • 1.2 The Norm and Dot Product 5

    Theorem 1.3. Properties of Dot Products.

    Let u, v, w ∈ Rn and let r ∈ R be a scalar. Then
    D1. Commutativity of ·: v · w = w · v.
    D2. Distribution of · over Vector Addition: u · (v + w) = u · v + u · w.
    D3. r(v · w) = (rv) · w = v · (rw).
    D4. v · v ≥ 0, and v · v = 0 if and only if v = 0.

    Example. Page 33 number 42b (Prove D2).

    Note. ‖v‖² = v · v.

    Definition. Two vectors v, w ∈ Rn are perpendicular or orthogonal, denoted v ⊥ w, if v · w = 0.

    Example. Page 31 numbers 14 and 16.

  • 1.2 The Norm and Dot Product 6

    Theorem 1.5. The Triangle Inequality.

    Let v, w ∈ Rn. Then ‖v + w‖ ≤ ‖v‖ + ‖w‖.

    Proof.

    ‖v + w‖² = (v + w) · (v + w)
             = v · v + 2 v · w + w · w
             ≤ ‖v‖² + 2‖v‖‖w‖ + ‖w‖²   by the Schwarz Inequality
             = (‖v‖ + ‖w‖)².
    Taking square roots of both sides gives the result.

    QED

    Note. It is common in physics to represent velocities and forces with

    vectors.

    Example. Page 31 number 36.

  • 1.3 Matrices and Their Algebra 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.3. Matrices and Their Algebra

    Definition. A matrix is a rectangular array of numbers. An m × n matrix is a matrix with m rows and n columns:

    A = [aij] =
    [ a11 a12 · · · a1n ]
    [ a21 a22 · · · a2n ]
    [  ⋮   ⋮   ⋱    ⋮  ]
    [ am1 am2 · · · amn ].

    Definition 1.8. Let A = [aik] be an m × n matrix and let B = [bkj] be an n × s matrix. The matrix product AB is the m × s matrix C = [cij] where cij is the dot product of the ith row vector of A and the jth column vector of B:

    cij = Σ_{k=1}^{n} aik bkj.
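    The entry-by-entry description of the matrix product can be spelled out directly; a sketch (not from the notes, assuming numpy):

    import numpy as np

    def matrix_product(A, B):
        """c_ij = dot product of the ith row of A with the jth column of B."""
        m, n = A.shape
        n2, s = B.shape
        assert n == n2, "inner dimensions must agree"
        C = np.zeros((m, s))
        for i in range(m):
            for j in range(s):
                C[i, j] = np.dot(A[i, :], B[:, j])
        return C

    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3 x 2
    B = np.array([[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]])  # 2 x 3
    print(np.allclose(matrix_product(A, B), A @ B))      # True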

  • 1.3 Matrices and Their Algebra 2

    Note. We can draw a picture of this process as:

    Example. Page 46 number 16.

    Definition. The main diagonal of an n × n matrix is the set {a11, a22, . . . , ann}. A square matrix which has zeros off the main diagonal is a diagonal matrix. We denote the n × n diagonal matrix with all diagonal entries 1 as I:

    I =
    [ 1 0 0 · · · 0 ]
    [ 0 1 0 · · · 0 ]
    [ 0 0 1 · · · 0 ]
    [ ⋮ ⋮ ⋮  ⋱   ⋮ ]
    [ 0 0 0 · · · 1 ].

  • 1.3 Matrices and Their Algebra 3

    Definition 1.9/1.10. Let A = [aij] and B = [bij] be m × n matrices. The sum A + B is the m × n matrix C = [cij] where cij = aij + bij. Let r be a scalar. Then rA is the matrix D = [dij] where dij = raij.

    Example. Page 46 number 6.

    Definition 1.11. Matrix B is the transpose of A, denoted B = AT , if

    bij = aji. If A is a matrix such that A = AT then A is symmetric.

    Example. Page 47 number 39. If A is square, then A+AT is symmetric.

    Proof. Let A = [aij]; then AT = [aji]. Let C = [cij] = A + AT = [aij] + [aji] = [aij + aji]. Notice cij = aij + aji and cji = aji + aij, so cij = cji and therefore C = A + AT is symmetric. QED

    Note. Properties of Matrix Algebra.

    Let A, B, C be m × n matrices and r, s scalars. Then
    Commutative Law of Addition: A + B = B + A
    Associative Law of Addition: (A + B) + C = A + (B + C)
    Additive Identity: A + 0 = 0 + A = A (here 0 represents the m × n matrix of all zeros)

  • 1.3 Matrices and Their Algebra 4

    Left Distribution Law: r(A + B) = rA + rB
    Right Distribution Law: (r + s)A = rA + sA
    Associative Law of Scalar Multiplication: (rs)A = r(sA)
    Scalars Pull Through: (rA)B = A(rB) = r(AB)
    Associativity of Matrix Multiplication: A(BC) = (AB)C
    Matrix Multiplicative Identity: IA = A = AI
    Distributive Laws of Matrix Multiplication: A(B + C) = AB + AC and
    (A + B)C = AC + BC.

    Example. Show that IA = AI = A for

    A =
    [ 1 2 3 ]
    [ 4 5 6 ]
    [ 7 8 9 ]

    and I the 3 × 3 identity matrix.

    Note. Properties of the Transpose Operator.

    (AT)T = A,   (A + B)T = AT + BT,   (AB)T = BTAT.

  • 1.3 Matrices and Their Algebra 5

    Example. Page 47 number 32. Prove (AB)T = BTAT .

    Proof. Let C = [cij] = (AB)T. The (i, j)-entry of AB is Σ_{k=1}^{n} aik bkj, so cij = Σ_{k=1}^{n} ajk bki. Let BT = [bij]T = [(BT)ij] = [bji] and AT = [aij]T = [(AT)ij] = [aji]. Then the (i, j)-entry of BTAT is

    Σ_{k=1}^{n} (BT)ik (AT)kj = Σ_{k=1}^{n} bki ajk = Σ_{k=1}^{n} ajk bki = cij

    and therefore C = (AB)T = BTAT. QED

  • 1.4 Solving Systems of Linear Equations 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.4. Solving Systems of Linear Equations

    Definition. A system of m linear equations in the n unknowns x1, x2, . . . , xn is a system of the form:

    a11x1 + a12x2 + · · · + a1nxn = b1
    a21x1 + a22x2 + · · · + a2nxn = b2
        ⋮
    am1x1 + am2x2 + · · · + amnxn = bm.

    Note. The above system can be written as Ax = b where A is the

    coefficient matrix and x is the vector of variables. A solution to the

    system is a vector s such that As = b.

    Definition. The augmented matrix for the above system is

    [A | b] =
    [ a11 a12 · · · a1n | b1 ]
    [ a21 a22 · · · a2n | b2 ]
    [  ⋮   ⋮        ⋮   |  ⋮ ]
    [ am1 am2 · · · amn | bm ].

  • 1.4 Solving Systems of Linear Equations 2

    Note. We will perform certain operations on the augmented matrix which

    correspond to the following manipulations of the system of equations:

    1. interchange two equations,

    2. multiply an equation by a nonzero constant,

    3. replace an equation by the sum of itself and a multiple of another

    equation.

    Definition. The following are elementary row operations:

    1. interchanging row i and row j (denoted Ri ↔ Rj),
    2. multiplying the ith row by a nonzero scalar s (denoted Ri → sRi), and
    3. adding s times the jth row to the ith row (denoted Ri → Ri + sRj).
    If matrix A can be obtained from matrix B by a series of elementary row

    operations, then A is row equivalent to B, denoted A ∼ B.

    Notice. These operations correspond to the above manipulations of the

    equations and so:

  • 1.4 Solving Systems of Linear Equations 3

    Theorem 1.6. Invariance of Solution Sets Under Row Equiv-

    alence.

    If [A | b] ∼ [H | c], then the linear systems Ax = b and Hx = c have the same solution sets.

    Definition 1.12. A matrix is in row-echelon form if

    (1) all rows containing only zeros appear below rows with nonzero entries,

    and

    (2) the first nonzero entry in any row appears in a column to the right of

    the first nonzero entry in any preceding row.

    For such a matrix, the first nonzero entry in a row is the pivot for that

    row.

    Example. Which of the following is in row-echelon form?

    [ 1 2 3 ]     [ 1 2 3 ]     [ 2 4 0 ]
    [ 0 4 5 ]     [ 0 4 5 ]     [ 1 3 2 ]
    [ 0 0 6 ]     [ 6 0 0 ]     [ 0 0 0 ]

    Note. If an augmented matrix is in row-echelon form, we can use the

    method of back substitution to find solutions.

  • 1.4 Solving Systems of Linear Equations 4

    Example. Consider the system

    x1 + 3x2 − x3 = 4
         x2 − x3 = 1
              x3 = 3.

    Definition 1.13. A linear system having no solution is inconsistent. If

    it has one or more solutions, it is consistent.

    Example. Is this system consistent or inconsistent:

    2x1 + x2 − x3 = 1
    x1 − x2 + 3x3 = 1
    3x1 + 2x3 = 3?

    Example. Is this system consistent or inconsistent:

    2x1 + x2 − x3 = 1
    x1 − x2 + 3x3 = 1
    3x1 + 2x3 = 2?

    (HINT: This system has multiple solutions. Express the solutions in terms

    of an unknown parameter r).

  • 1.4 Solving Systems of Linear Equations 5

    Note. In the above example, r is a free variable and the general

    solution is in terms of this free variable.

    Note. Reducing a Matrix to Row-Echelon Form.

    (1) If the first column is all zeros, mentally cross it off. Repeat this

    process as necessary.

    (2a) Use row interchange if necessary to get a nonzero entry (pivot) p in

    the top row of the remaining matrix.

    (2b) For each row R below the row containing this entry p, add −r/p times the row containing p to R, where r is the entry of row R in the

    column which contains pivot p. (This gives all zero entries below pivot p.)

    (3) Mentally cross off the first row and first column to create a smaller

    matrix. Repeat the process (1) - (3) until either no rows or no columns

    remain.
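    A compact sketch of steps (1)-(3) above (illustrative only, assuming numpy; the example matrix is made up):

    import numpy as np

    def row_echelon(M):
        """Reduce a copy of M to row-echelon form using the steps above."""
        A = M.astype(float)
        rows, cols = A.shape
        pivot_row = 0
        for col in range(cols):
            if pivot_row >= rows:
                break
            # (2a) find a nonzero entry in this column at or below pivot_row
            nonzero = np.nonzero(A[pivot_row:, col])[0]
            if nonzero.size == 0:
                continue                                  # (1) column of zeros: move on
            swap = pivot_row + nonzero[0]
            A[[pivot_row, swap]] = A[[swap, pivot_row]]   # row interchange
            p = A[pivot_row, col]
            # (2b) add -r/p times the pivot row to each lower row
            for r_idx in range(pivot_row + 1, rows):
                A[r_idx] -= (A[r_idx, col] / p) * A[pivot_row]
            pivot_row += 1                                # (3) move to the smaller matrix
        return A

    A = np.array([[0, 2, 4], [1, 3, 5], [2, 8, 14]])
    print(row_echelon(A))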

    Example. Page 68 number 2.

    Example. Page 69 number 16. (Put the associated augmented matrix

    in row-echelon form and then use substitution.)

  • 1.4 Solving Systems of Linear Equations 6

    Note. The above method is called Gauss reduction with back substitu-

    tion.

    Note. The system Ax = b is equivalent to the system

    x1a1 + x2a2 + · · · + xnan = b

    where ai is the ith column vector of A. Therefore, Ax = b is consistent if

    and only if b is in the span of a1, a2, . . . , an (the columns of A).

    Definition. A matrix is in reduced row-echelon form if all the pivots

    are 1 and all entries above or below pivots are 0.

    Example. Page 69 number 16 (again).

    Note. The above method is the Gauss-Jordan method.

    Theorem 1.7. Solutions of Ax = b.

    Let Ax = b be a linear system and let [A | b] ∼ [H | c] where H is in row-echelon form.

    (1) The system Ax = b is inconsistent if and only if [H | c] has a row

  • 1.4 Solving Systems of Linear Equations 7

    with all entries equal to 0 to the left of the partition and a nonzero entry

    to the right of the partition.

    (2) If Ax = b is consistent and every column of H contains a pivot, the

    system has a unique solution.

    (3) If Ax = b is consistent and some column of H has no pivot, the

    system has infinitely many solutions, with as many free variables as there

    are pivot-free columns of H.

    Definition 1.14. A matrix that can be obtained from an identity matrix

    by means of one elementary row operation is an elementary matrix.

    Theorem 1.8. Let A be an m × n matrix and let E be an m × m elementary matrix. Multiplication of A on the left by E effects the

    same elementary row operation on A that was performed on the identity

    matrix to obtain E.

    Proof for Row-Interchange. (This is page 71 number 52.) Suppose

    E results from interchanging rows i and j:

    I is transformed into E by the row interchange Ri ↔ Rj.

    Then the kth row of E is [0, 0, . . . , 0, 1, 0, . . . , 0] where

  • 1.4 Solving Systems of Linear Equations 8

    (1) for k ∉ {i, j} the nonzero entry is the kth entry,
    (2) for k = i the nonzero entry is the jth entry, and

    (3) for k = j the nonzero entry is the ith entry.

    Let A = [aij], E = [eij], and B = [bij] = EA. The kth row of B is
    [bk1, bk2, . . . , bkn] and

    bkl = Σ_{p=1}^{n} ekp apl.

    Now if k ∉ {i, j} then all ekp are 0 except for p = k and

    bkl = Σ_{p=1}^{n} ekp apl = ekk akl = (1)akl = akl.

    Therefore for k ∉ {i, j}, the kth row of B is the same as the kth row of A. If k = i then all ekp are 0 except for p = j and

    bkl = bil = Σ_{p=1}^{n} ekp apl = ekj ajl = (1)ajl = ajl

    and the ith row of B is the same as the jth row of A. Similarly, if k = j then all ekp are 0 except for p = i and

    bkl = bjl = Σ_{p=1}^{n} ekp apl = eki ail = (1)ail = ail

    and the jth row of B is the same as the ith row of A. Therefore B = EA is the matrix obtained from A by the row interchange Ri ↔ Rj.

    QED

  • 1.4 Solving Systems of Linear Equations 9

    Example. Multiply some 3 × 3 matrix A on the left by

    E =
    [ 0 1 0 ]
    [ 1 0 0 ]
    [ 0 0 1 ]

    to swap Row 1 and Row 2.

    Note. If A is row equivalent to B, then we can find C such that CA = B

    and C is a product of elementary matrices.

    Example. Page 70 number 44.

  • 1.5 Inverses of Matrices, and Linear Systems 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.5. Inverses of Square Matrices

    Definition 1.15. An n × n matrix A is invertible if there exists an n × n matrix C such that AC = CA = I. If A is not invertible, it is singular.

    Theorem 1.9. Uniqueness of an Inverse Matrix.

    An invertible matrix has a unique inverse (which we denote A⁻¹).

    Proof. Suppose C and D are both inverses of A. Then (DA)C = IC = C and D(AC) = DI = D. But (DA)C = D(AC) (associativity), so C = D. QED

    Example. It is easy to invert an elementary matrix. For example, sup-

    pose E1 interchanges the first and third row and suppose E2 multiplies

    row 2 by 7. Find the inverses of E1 and E2.

  • 1.5 Inverses of Matrices, and Linear Systems 2

    Theorem 1.10. Inverses of Products.

    Let A and B be invertible n × n matrices. Then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹.

    Proof. By associativity and the assumption that A⁻¹ and B⁻¹ exist, we have:

    (AB)(B⁻¹A⁻¹) = [A(BB⁻¹)]A⁻¹ = (AI)A⁻¹ = AA⁻¹ = I.

    We can similarly show that (B⁻¹A⁻¹)(AB) = I. Therefore AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹. QED

    Lemma 1.1. Condition for Ax = b to be Solvable for b.

    Let A be an n × n matrix. The linear system Ax = b has a solution for every choice of column vector b ∈ Rn if and only if A is row equivalent to the n × n identity matrix I.

  • 1.5 Inverses of Matrices, and Linear Systems 3

    Theorem 1.11. Commutativity Property.

    Let A and C be n × n matrices. Then CA = I if and only if AC = I.

    Proof. Suppose that AC = I. Then the equation Ax = b has a solution for every column vector b ∈ Rn. Notice that x = Cb is a solution because

    A(Cb) = (AC)b = Ib = b.

    By Lemma 1.1, we know that A is row equivalent to the n × n identity matrix I, and so there exists a sequence of elementary matrices E1, E2, . . . , Et such that (Et · · · E2E1)A = I. By Theorem 1.9, the two equations

    (Et · · · E2E1)A = I and AC = I

    imply that Et · · · E2E1 = C, and so we have CA = I. The other half of the proof follows by interchanging the roles of A and C. QED

    Note. Computation of Inverses.

    If A = [aij], then finding A⁻¹ = [xij] amounts to solving for the xij in:

    [ a11 a12 · · · a1n ] [ x11 x12 · · · x1n ]
    [ a21 a22 · · · a2n ] [ x21 x22 · · · x2n ]
    [  ⋮   ⋮   ⋱    ⋮  ] [  ⋮   ⋮   ⋱    ⋮  ]
    [ an1 an2 · · · ann ] [ xn1 xn2 · · · xnn ]  =  I.

  • 1.5 Inverses of Matrices, and Linear Systems 4

    If we treat this as n systems of n equations in n unknowns, then the

    augmented matrix for these n systems is [A | I]. So to compute A⁻¹:
    (1) Form [A | I].
    (2) Apply the Gauss-Jordan method to produce the row equivalent [I | C].
    If A⁻¹ exists, then A⁻¹ = C.

    Note. In the above computations, C is just the product of the elementary

    matrices that make up A⁻¹.
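    An illustrative sketch of the [A | I] procedure using sympy's exact row reduction (the library choice is an assumption; the matrix is the one from the page 84 example below):

    from sympy import Matrix, eye

    A = Matrix([[3, 6],
                [3, 8]])
    augmented = A.row_join(eye(2))       # form [A | I]
    rref_matrix, _ = augmented.rref()    # Gauss-Jordan reduction to [I | C]
    C = rref_matrix[:, 2:]               # C = A^(-1) if it exists
    print(C)                             # Matrix([[4/3, -1], [-1/2, 1/2]])
    print(A * C == eye(2))               # True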

    Example. Page 84 number 6 (also apply this example to a system of

    equations).

    Theorem 1.12. Conditions for A1 to Exist.

    The following conditions for an n × n matrix A are equivalent:
    (i) A is invertible.
    (ii) A is row equivalent to I.
    (iii) Ax = b has a solution for each b (namely, x = A⁻¹b).

    (iv) A can be expressed as a product of elementary matrices.

    (v) The span of the column vectors of A is Rn.

  • 1.5 Inverses of Matrices, and Linear Systems 5

    Note. In (iv), A is the left-to-right product of the inverses of the elementary matrices corresponding to successive row operations that reduce A to I.

    Example. Page 84 number 2. Express the inverse of

    A =
    [ 3 6 ]
    [ 3 8 ]

    as a product of elementary matrices.

    Solution. We perform the following elementary row operations:

    [ 3 6 | 1 0 ]
    [ 3 8 | 0 1 ]

    R2 → R2 − R1:
    [ 3 6 |  1 0 ]
    [ 0 2 | −1 1 ]

    R1 → R1 − 3R2:
    [ 3 0 |  4 −3 ]
    [ 0 2 | −1  1 ]

    R2 → R2/2:
    [ 3 0 |    4   −3 ]
    [ 0 1 | −1/2  1/2 ]

    R1 → R1/3:
    [ 1 0 |  4/3   −1 ]
    [ 0 1 | −1/2  1/2 ]

  • 1.5 Inverses of Matrices, and Linear Systems 6

    The elementary matrices which accomplish this are:

    E1 = [  1 0 ]        E1⁻¹ = [ 1 0 ]
         [ −1 1 ]               [ 1 1 ]

    E2 = [ 1 −3 ]        E2⁻¹ = [ 1 3 ]
         [ 0  1 ]               [ 0 1 ]

    E3 = [ 1   0 ]       E3⁻¹ = [ 1 0 ]
         [ 0 1/2 ]              [ 0 2 ]

    E4 = [ 1/3 0 ]       E4⁻¹ = [ 3 0 ]
         [  0  1 ]              [ 0 1 ]

    As in Section 1.3,

    E4E3E2E1A = I

    and so

    A = E1⁻¹E2⁻¹E3⁻¹E4⁻¹I = E1⁻¹E2⁻¹E3⁻¹E4⁻¹.

    Also A⁻¹ = E4E3E2E1. QED

    Example. Page 85 number 24.

  • 1.6 Homogeneous Systems, Subspaces and Bases 1

    Chapter 1. Vectors, Matrices, and Linear Spaces

    1.6. Homogeneous Systems, Subspaces and Bases

    Definition. A linear system Ax = b is homogeneous if b = 0. The zero

    vector x = 0 is a trivial solution to the homogeneous system Ax = 0.

    Nonzero solutions to Ax = 0 are called nontrivial solutions.

    Theorem 1.13. Structure of the Solution Set of Ax = 0.

    Let Ax = 0 be a homogeneous linear system. If h1, h2, . . . , hn are solu-

    tions, then any linear combination

    r1h1 + r2h2 + · · · + rnhn

    is also a solution.

    Proof. Since h1, h2, . . . , hn are solutions,

    Ah1 = Ah2 = · · · = Ahn = 0

    and so

    A(r1h1 + r2h2 + · · · + rnhn) = r1Ah1 + r2Ah2 + · · · + rnAhn = 0 + 0 + · · · + 0 = 0.

    Therefore the linear combination is also a solution. QED

  • 1.6 Homogeneous Systems, Subspaces and Bases 2

    Definition 1.16. A subset W of Rn is closed under vector addition

    if for all u, v ∈ W, we have u + v ∈ W. If rv ∈ W for all v ∈ W and for all r ∈ R, then W is closed under scalar multiplication. A nonempty subset W of Rn is a subspace of Rn if it is both closed under

    vector addition and scalar multiplication.

    Example. Page 99 number 8.

    Theorem 1.14. Subspace Property of a Span.

    Let W = sp( w1, w2, . . . , wk) be the span of k > 0 vectors in Rn. Then

    W is a subspace of Rn. (The vectors w1, w2, . . . , wk are said to span or

    generate the subspace.)

    Example. Page 100 number 18.

    Definition. Given an m × n matrix A, the span of the row vectors of A is the row space of A, the span of the column vectors of A is the column

    space of A and the solution set to the system Ax = 0 is the nullspace of

    A.

  • 1.6 Homogeneous Systems, Subspaces and Bases 3

    Definition 1.17. Let W be a subspace of Rn. A subset {w1, w2, . . . , wk} of W is a basis for W if every vector in W can be expressed uniquely as a

    linear combination of w1, w2, . . . , wk.

    Theorem 1.15. Unique Linear Combinations.

    The set {w1, w2, . . . , wk} is a basis for W = sp(w1, w2, . . . , wk) if and only if

    r1w1 + r2w2 + · · · + rkwk = 0

    implies

    r1 = r2 = · · · = rk = 0.

    Proof. First, if {w1, w2, . . . , wk} is a basis for W, then each vector of W can be uniquely written as a linear combination of these wi's. Since

    0 = 0w1 + 0w2 + · · · + 0wk and this is the unique way to write 0 in terms of the wi's, then for any r1w1 + r2w2 + · · · + rkwk = 0 we must have r1 = r2 = · · · = rk = 0.

    Second, suppose that the only linear combination of wi's that gives 0 is
    0w1 + 0w2 + · · · + 0wk. We want to show that any vector of W is a unique linear combination of the wi's. Suppose for w ∈ W we have

    w = c1w1 + c2w2 + · · · + ckwk and

  • 1.6 Homogeneous Systems, Subspaces and Bases 4

    w = d1w1 + d2w2 + · · · + dkwk.

    Then

    0 = w − w = c1w1 + c2w2 + · · · + ckwk − (d1w1 + d2w2 + · · · + dkwk)

    = (c1 − d1)w1 + (c2 − d2)w2 + · · · + (ck − dk)wk.

    So each coefficient must be 0 and we have ci = di for i = 1, 2, . . . , k and

    w can be written as a linear combination of the wi's in only one way.

    QED

    Example. Page 100 number 22.

    Theorem 1.16. Let A be an n × n matrix. The following are equivalent:
    (1) Ax = b has a unique solution,
    (2) A is row equivalent to I,
    (3) A is invertible, and

    (4) the column vectors of A form a basis for Rn.

    Example. Page 100 number 22 (again).

  • 1.6 Homogeneous Systems, Subspaces and Bases 5

    Theorem 1.17. Let A be an m × n matrix. The following are equivalent:
    (1) each consistent system Ax = b has a unique solution,

    (2) the reduced row-echelon form of A consists of the n × n identity matrix followed by m − n rows of zeros, and
    (3) the column vectors of A form a basis for the column space of A.

    Corollary 1. Fewer Equations than Unknowns.

    If a linear system Ax = b is consistent and has fewer equations than

    unknowns, then it has an infinite number of solutions.

    Corollary 2. The Homogeneous Case

    (1) A homogeneous linear system Ax = 0 having fewer equations than

    unknowns has a nontrivial solution (i.e. a solution other than x = 0),

    (2) A square homogeneous system Ax = 0 has a nontrivial solution if and

    only if A is not row equivalent to the identity matrix.

    Example. Page 97 Example 6. A basis of Rn cannot contain more than

    n vectors.

    Proof. Suppose {v1, v2, . . . , vk} is a basis for Rn where n < k. Consider

  • 1.6 Homogeneous Systems, Subspaces and Bases 6

    the system Ax = 0 where the column vectors of A are v1, v2, . . . , vk.

    Then A has n rows and k columns (corresponding to n equations in k

    unknowns). With n < k, Corollary 2 implies there is a nontrivial solution

    to Ax = 0. But this corresponds to a linear combination of the columns

    of A which equals 0 while not all the coefficients are 0. This contradicts

    the definition of basis. Therefore, k ≤ n. QED

    Theorem 1.18. Structure of the Solution Set of Ax = b.

    Let Ax = b be a linear system. If p is any particular solution of Ax = b

    and h is a solution to Ax = 0, then p + h is a solution of Ax = b. In

    fact, every solution of Ax = b has the form p+h and the general solution

    is x = p + h where Ah = 0 (that is, h is an arbitrary element of the

    nullspace of A).

    Example. Page 101 numbers 36 and 43.

  • 2.1 Independence and Dimension 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.1. Independence and Dimension

    Definition 2.1. Let {w1, w2, . . . , wk} be a set of vectors in Rn. A dependence relation in this set is an equation of the form

    r1w1 + r2w2 + · · · + rkwk = 0

    with at least one rj ≠ 0. If such a dependence relation exists, then {w1, w2, . . . , wk} is a linearly dependent set. A set of vectors which is not linearly dependent is linearly independent.

    Theorem 2.1. Alternative Characterization of Basis

    Let W be a subspace of Rn. A subset {w1, w2, . . . , wk} of W is a basis for W if and only if

    (1) W = sp(w1, w2, . . . , wk) and

    (2) the vectors w1, w2, . . . , wk are linearly independent.

  • 2.1 Independence and Dimension 2

    Note. The proof of Theorem 2.1 follows directly from the definitions of

    basis and linear independence.

    Theorem. Finding a Basis for W = sp( w1, w2, . . . , wk).

    Form the matrix A whose jth column vector is wj. If we row-reduce A to

    row-echelon form H, then the set of all wj such that the jth column of H

    contains a pivot, is a basis for W .
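    This pivot-column procedure can be sketched with sympy's exact row reduction (illustrative only; the vectors are made up):

    from sympy import Matrix

    # Vectors w1, w2, w3 (w3 = w1 + w2, so the span is 2-dimensional).
    w1, w2, w3 = [1, 0, 1], [0, 1, 1], [1, 1, 2]
    A = Matrix([w1, w2, w3]).T          # jth column of A is wj
    _, pivot_cols = A.rref()            # row-reduce; rref also reports pivot columns
    basis = [A[:, j] for j in pivot_cols]
    print(pivot_cols)                   # (0, 1): w1 and w2 form a basis for the span
    print(basis)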

    Example. Page 134 number 8 or 10.

    Example. Page 138 number 22.

    Theorem 2.2. Relative Sizes of Spanning and Independent

    Sets.

    Let W be a subspace of Rn. Let w1, w2, . . . , wk be vectors in W that

    span W and let v1, v2, . . . , vm be vectors in W that are independent.

    Then k ≥ m.

  • 2.1 Independence and Dimension 3

    Corollary. Invariance of Dimension.

    Any two bases of a subspace of Rn contain the same number of vectors.

    Definition 2.2. Let W be a subspace of Rn. The number of elements

    in a basis for W is the dimension of W , denoted dim(W ).

    Note. The standard basis {e1, e2, . . . , en} of Rn has n vectors, so dim(Rn) = n.

    Theorem 2.3. Existence and Determination of Bases.

    (1) Every subspace W of Rn has a basis and dim(W) ≤ n.
    (2) Every independent set of vectors in Rn can be enlarged to become a

    basis of Rn.

    (3) If W is a subspace of Rn and dim(W ) = k then

    (a) every independent set of k vectors in W is a basis for W , and

    (b) every set of k vectors in W that spans W is a basis of W .

    Example. Page 136 numbers 34 and 38.

  • 2.2 The Rank of a Matrix 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.2. The Rank of a Matrix

    Note. In this section, we consider the relationship between the dimen-

    sions of the column space, row space and nullspace of a matrix A.

    Theorem 2.4. Row Rank Equals Column Rank.

    Let A be an m × n matrix. The dimension of the row space of A equals the dimension of the column space of A. The common dimension is the

    rank of A.

    Note. The dimension of the column space is the number of pivots of A

    when in row-echelon form, so by page 129, the rank of A is the number of

    pivots of A when in row-echelon form.

  • 2.2 The Rank of a Matrix 2

    Note. Finding Bases for Spaces Associated with a Matrix.

    Let A be an m × n matrix with row-echelon form H.
    (1) for a basis of the row space of A, use the nonzero rows of H (or A),

    (2) for a basis of the column space of A, use the columns of A correspond-

    ing to the columns of H which contain pivots, and

    (3) for a basis of the nullspace of A use H to solve Hx = 0 as before.

    Example. Page 140 number 4.

    Theorem 2.5. Rank Equation.

    Let A be m × n with row-echelon form H.
    (1) The dimension of the nullspace of A is

    nullity(A) = (# free variables in solution of Ax = 0)

    = (# pivot-free columns of H).

    (2) rank(A) = (# of pivots in H).

    (3) Rank Equation:

    rank(A) + nullity(A) = # of columns of A.
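    A quick numerical check of the Rank Equation (a sketch, assuming numpy; the matrix is made up):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])      # second row is a multiple of the first
    rank = np.linalg.matrix_rank(A)
    nullity = A.shape[1] - rank          # rank(A) + nullity(A) = number of columns
    print(rank, nullity)                 # 2 1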

  • 2.2 The Rank of a Matrix 3

    Theorem 2.6. An Invertibility Criterion.

    An n × n matrix A is invertible if and only if rank(A) = n.

    Example. Page 141 number 12. If A is square, then nullity(A) =

    nullity(AT ).

    Proof. The column space of A is the same as the row space of AT , so

    rank(A) = rank(AT ) and since the number of columns of A equals the

    number of columns of AT , then by the Rank Equation:

    rank(A) + nullity(A) = rank(AT ) + nullity(AT )

    and the result follows. QED

  • 2.3 Linear Transformations of Euclidean Spaces 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.3 Linear Transformations of Euclidean Spaces

    Definition. A linear transformation T : Rn → Rm is a function whose domain is Rn and whose codomain is Rm, where

    (1) T(u + v) = T(u) + T(v) for all u, v ∈ Rn, and
    (2) T(ru) = rT(u) for all u ∈ Rn and for all r ∈ R.

    Note. Combining (1) and (2) gives

    T (ru + sv) = rT (u) + sT (v)

    for all u, v ∈ Rn and r, s ∈ R. As the book says, linear transformations preserve linear combinations.

    Note. T(0) = T(0·0) = 0·T(0) = 0.

    Example. Page 152 number 4.

  • 2.3 Linear Transformations of Euclidean Spaces 2

    Example. Page 145 Example 4. Notice that every linear transformation

    of R → R is of the form T(x) = ax.

    The graphs of such functions are lines through the origin.

    Theorem 2.7. Bases and Linear Transformations.

    Let T : Rn → Rm be a linear transformation and let B = {b1, b2, . . . , bn} be a basis for Rn. For any vector v ∈ Rn, the vector T(v) is uniquely determined by T(b1), T(b2), . . . , T(bn).

    Proof. Let v ∈ Rn. Then since B is a basis, there exist unique scalars r1, r2, . . . , rn such that

    v = r1b1 + r2b2 + · · · + rnbn.

    Since T is linear, we have

    T(v) = r1T(b1) + r2T(b2) + · · · + rnT(bn).

    Since the coefficients ri are uniquely determined by v, it follows that the

    value of T (v) is completely determined by the vectors T (bi). QED

  • 2.3 Linear Transformations of Euclidean Spaces 3

    Corollary. Standard Matrix Representation of Linear Trans-

    formations.

    Let T : Rn → Rm be linear, and let A be the m × n matrix whose jth column is T(ej). Then T(x) = Ax for each x ∈ Rn. A is the standard matrix representation of T.

    Proof. For any matrix A, Aej is the jth column of A. So if A is the matrix described, then Aej = T(ej), and so T and the linear transformation TA given by TA(x) = Ax agree on the standard basis {e1, e2, . . . , en} of Rn. Therefore by Theorem 2.7, T(x) = Ax for all x ∈ Rn. QED
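    The corollary gives a recipe: apply T to each standard basis vector and use the results as columns. An illustrative sketch (assuming numpy; the particular T is made up for the example):

    import numpy as np

    # A hypothetical linear transformation T : R^2 -> R^3 for illustration.
    def T(x):
        x1, x2 = x
        return np.array([x1 + x2, 2.0 * x1, 3.0 * x2])

    e = np.eye(2)                                         # standard basis e1, e2 of R^2
    A = np.column_stack([T(e[:, j]) for j in range(2)])   # jth column is T(ej)
    print(A)

    x = np.array([5.0, -1.0])
    print(np.allclose(T(x), A @ x))                       # True: T(x) = Ax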

    Example. Page 152 number 10.

    Theorem/Definition. Let T : Rn → Rm be a linear transformation with standard matrix representation A.

    (1) The range T [Rn] of T is the column space of A.

    (2) The kernel of T is the nullspace of A, denoted ker(T ).

    (3) If W is a subspace of Rn, then T [W ] is a subspace of Rm (i.e. T

    preserves subspaces).

  • 2.3 Linear Transformations of Euclidean Spaces 4

    Notice. If A is the standard matrix representation for T , then from the

    rank equation we get:

    dim(range T ) + dim(ker T ) = dim(domain T ).

    Definition. For a linear transformation T , we define rank and nullity

    in terms of the standard matrix representation A of T :

    rank(T ) = dim(range T ), nullity(T ) = dim(ker T ).

    Definition. If T : Rn → Rm and T′ : Rm → Rk, then the composition of T′ and T is (T′ ∘ T) : Rn → Rk where (T′ ∘ T)(x) = T′(T(x)).

    Theorem. Matrix Multiplication and Composite Transfor-

    mations.

    A composition of two linear transformations T and T′ with standard matrix representations A and A′ yields a linear transformation T′ ∘ T with standard matrix representation A′A.

    Example. Page 153 number 20.

  • 2.3 Linear Transformations of Euclidean Spaces 5

    Definition. If T : Rn → Rn and there exists T′ : Rn → Rn such that (T′ ∘ T)(x) = x for all x ∈ Rn, then T′ is the inverse of T, denoted T′ = T⁻¹. (Notice that if T : Rm → Rn where m ≠ n, then T⁻¹ is not defined; there are domain/range size problems.)

    Theorem. Invertible Matrices and Inverse Transformations.

    Let T : Rn → Rn have standard matrix representation A: T(x) = Ax. Then T is invertible if and only if A is invertible, and T⁻¹(x) = A⁻¹x.

    Example. Page 153 number 22.

  • 2.4 Linear Transformations of the Plane 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.4 Linear Transformations of the Plane (in brief)

    Note. If A is a 2 × 2 matrix with rank 0 then it is the matrix

    A =
    [ 0 0 ]
    [ 0 0 ]

    and all vectors in R2 are mapped to 0 under the transformation with associated matrix A (we can view {0} as a 0-dimensional space). If rank(A) = 1, then the column space of A, which is the range of TA, is a one-dimensional subspace of R2. In this case, TA projects a vector onto the column space. See page 155 for details.

    Note. We can rotate a vector in R2 about the origin through an angle θ by applying TA where

    A =
    [ cos θ  −sin θ ]
    [ sin θ   cos θ ].

    This is an example of a rigid transformation of the plane since lengths are

    not changed under this transformation.
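    A numerical illustration of the rotation matrix (a sketch, assuming numpy):

    import numpy as np

    theta = np.pi / 2                       # rotate by 90 degrees
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    v = np.array([1.0, 0.0])
    print(A @ v)                            # approximately [0, 1]
    # Rigid: lengths are preserved.
    print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v)))  # True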

  • 2.4 Linear Transformations of the Plane 2

    Note. We can reflect a vector in R2 about the x-axis by applying TX where

    X =
    [ 1  0 ]
    [ 0 −1 ].

    We can reflect a vector in R2 about the y-axis by applying TY where

    Y =
    [ −1 0 ]
    [  0 1 ].

    We can reflect a vector in R2 about the line y = x by applying TZ where

    Z =
    [ 0 1 ]
    [ 1 0 ].

    Notice that X, Y, and Z are elementary matrices since they differ from I by an operation of row scaling (for X and Y), or by an operation of row

    interchange (for Z).

    Note. Transformation TA where

    A =
    [ r 0 ]
    [ 0 1 ]

    is a horizontal expansion if r > 1, and is a horizontal contraction if 0 < r < 1.

  • 2.4 Linear Transformations of the Plane 3

    Transformation TB where

    B =
    [ 1 0 ]
    [ 0 r ]

    is a vertical expansion if r > 1, and is a vertical contraction if 0 < r < 1.
    Notice that A and B are elementary matrices since they differ from I by an operation of row scaling.

    Note. Transformation TA where

    A =
    [ 1 0 ]
    [ r 1 ]

    is a vertical shear (see Figure 2.2.16 on page 163). Transformation TB

    where

    B =
    [ 1 r ]
    [ 0 1 ]

    is a horizontal shear. Notice that A and B are elementary matrices since

    they differ from I by an operation of row addition.

  • 2.4 Linear Transformations of the Plane 4

    Theorem. Geometric Description of Invertible Transforma-

    tions of Rn.

    A linear transformation T of the plane R2 into itself is invertible if and

    only if T consists of a finite sequence of:

    Reflections in the x-axis, the y-axis, or the line y = x;
    Vertical or horizontal expansions or contractions; and
    Vertical or horizontal shears.

    Proof. Each elementary operation corresponds to one of these types of

    transformations (and conversely). Each of these transformations corre-

    spond to elementary matrices as listed above (and conversely). Also, we

    know that a matrix is invertible if and only if it is a product of elementary

    matrices by Theorem 1.12(iv). Therefore T is invertible if and only if its

    associated matrix is a product of elementary matrices, and so the result

    follows. QED

  • 2.5 Lines, Planes, and Other Flats 1

    Chapter 2. Dimension, Rank, and Linear

    Transformations

    2.5 Lines, Planes, and Other Flats

    Definitions 2.4, 2.5. Let S be a subset of Rn and let a ∈ Rn. The set {x + a | x ∈ S} is the translate of S by a, and is denoted by S + a. The vector a is the translation vector. A line in Rn is a translate of a

    one-dimensional subspace of Rn.

    Figure 2.19, page 168.

  • 2.5 Lines, Planes, and Other Flats 2

    Definition. If a line L in Rn contains point (a1, a2, . . . , an) and if vector

    d is parallel to L, then d is a direction vector for L and a = [a1, a2, . . . , an]

    is a translation vector of L.

    Note. With d as a direction vector and a as a translation vector of a line,

    we have L = {td + a | t ∈ R}. In this case, t is called a parameter and

    x = td + a

    or as a collection of component equations:

    x1 = td1 + a1

    x2 = td2 + a2

    ...

    xn = tdn + an.

    Example. Page 176 number 8.

    Definition 2.6. A k-flat in Rn is a translate of a k-dimensional subspace

    of Rn. In particular, a 1-flat is a line, a 2-flat is a plane, and an (n − 1)-flat is a hyperplane. We consider each point of Rn to be a zero-flat.

  • 2.5 Lines, Planes, and Other Flats 3

    Note. We can also talk about a translate of a k-dimensional subspace

    W of Rn. If a basis for W is {d1, d2, . . . , dk}, then the k-flat through the point (a1, a2, . . . , an) and parallel to W is

    x = t1d1 + t2d2 + · · · + tkdk + a

    where a = [a1, a2, . . . , an] and t1, t2, . . . , tk ∈ R are parameters. We can also express this k-flat parametrically in terms of components.

    Example. Page 177 number 22.

    Note. We can now clearly explain the geometric interpretation of solu-

    tions of linear systems in terms of k-flats. Consider Ax = b, a system

    of m equations in n unknowns that has at least one solution x = p. By

    Theorem 1.18 on page 97, the solution set of the system consists of all

    vectors of the form x = p + h where h is a solution of the homogeneous

    system Ax = 0. Now the solution set of Ax = 0 is a subspace of Rn, and

    so the solution of Ax = b is a k-flat (where k is the nullity of A) passing

    through point (p1, p2, . . . , pn) where p = [p1, p2, . . . , pn].

    Example. Page 177, number 36.

  • 3.1 Vector Spaces 1

    Chapter 3. Vector Spaces

    3.1 Vector Spaces

    Definition 3.1. A vector space is a set V of vectors along with an

    operation of addition + of vectors and multiplication of a vector by a

    scalar (real number), which satisfies the following. For all u, v, w ∈ V and for all r, s ∈ R:
    (A1) (u + v) + w = u + (v + w)

    (A2) v + w = w + v

    (A3) There exists 0 ∈ V such that 0 + v = v
    (A4) v + (−v) = 0
    (S1) r(v + w) = rv + rw

    (S2) (r + s)v = rv + sv

    (S3) r(sv) = (rs)v

    (S4) 1v = v

    Definition. 0 is the additive identity. −v is the additive inverse of v.

  • 3.1 Vector Spaces 2

    Example. Some examples of vector spaces are:

    (1) The set of all polynomials of degree n or less, denoted Pn.
    (2) All m × n matrices.
    (3) The set of all integrable functions f with domain [0, 1] such that ∫₀¹ |f(x)|² dx < ∞. This vector space is denoted L2[0, 1]:

    L2[0, 1] = { f | ∫₀¹ |f(x)|² dx < ∞ }.

    Theorem 3.1. Elementary Properties of Vector Spaces.

    Every vector space V satisfies:

    (1) the vector 0 is the unique additive identity in a vector space,

    (2) for each v ∈ V, −v is the unique additive inverse of v,
    (3) if u + v = u + w then v = w,

    (4) 0v = 0 for all v ∈ V,
    (5) r0 = 0 for all scalars r ∈ R,
    (6) (−r)v = r(−v) = −(rv) for all r ∈ R and for all v ∈ V.

  • 3.1 Vector Spaces 3

    Proof of (1) and (3). Suppose that there are two additive identities, 0 and 0′. Then consider:

    0 = 0 + 0′  (since 0′ is an additive identity)
      = 0′      (since 0 is an additive identity).

    Therefore, 0 = 0′ and the additive identity is unique.

    Suppose u + v = u + w. Then we add −u to both sides of the equation and we get:

    u + v + (−u) = u + w + (−u)
    v + (u − u) = w + (u − u)
    v + 0 = w + 0
    v = w.

    The conclusion holds. QED

    Example. Page 189 number 14 and page 190 number 24.

  • 3.2 Basic Concepts of Vector Spaces 1

    Chapter 3. Vector Spaces

    3.2 Basic Concepts of Vector Spaces

    Definition 3.2. Given vectors v1, v2, . . . , vk ∈ V and scalars r1, r2, . . . , rk ∈ R,

    Σ_{l=1}^{k} rl vl = r1v1 + r2v2 + · · · + rkvk

    is a linear combination of v1, v2, . . . , vk with scalar coefficients r1, r2, . . . , rk.

    Definition 3.3. Let X be a subset of vector space V . The span of X is

    the set of all linear combinations of elements in X and is denoted sp(X).

    If V = sp(X) for some finite set X , then V is finitely generated.

    Definition 3.4. A subset W of a vector space V is a subspace of V if

    W is itself a vector space.

    Theorem 3.2. Test for Subspace.

    A subset W of vector space V is a subspace if and only if

    (1) v, w ∈ W ⟹ v + w ∈ W, and
    (2) for all r ∈ R and for all v ∈ W we have rv ∈ W.

  • 3.2 Basic Concepts of Vector Spaces 2

    Example. Page 202 number 4.

    Definition 3.5. Let X be a set of vectors from a vector space V. A dependence relation in X is an equation of the form

    Σ_{l=1}^{k} rl vl = r1v1 + r2v2 + · · · + rkvk = 0

    with some rj ≠ 0 and vi ∈ X. If such a relation exists, then X is a linearly dependent set. Otherwise X is a linearly independent set.

    Example. Page 202 number 16.

    Definition 3.6. Let V be a vector space. A set of vectors in V is a basis

    for V if

    (1) the set of vectors span V , and

    (2) the set of vectors is linearly independent.

    Example. Page 202 number 20.

  • 3.2 Basic Concepts of Vector Spaces 3

    Theorem 3.3. Unique Combination Criterion for a Basis.

    Let B be a set of nonzero vectors in vector space V . Then B is a basis

    for V if and only if each vector in V can be uniquely expressed as a linear

    combination of the vectors in set B.

    Proof. Suppose that B is a basis for vector space V . Then by the first

    part of Definition 3.6 we see that any vector v ∈ V can be written as a linear combination of the elements of B, say

    v = r1b1 + r2b2 + · · · + rkbk.

    Now suppose that there is some other linear combination of the vectors in B which represents v (we look for a contradiction):

    v = s1b1 + s2b2 + · · · + skbk.

    If we subtract these two representations of v then we get that

    0 = (r1 − s1)b1 + (r2 − s2)b2 + · · · + (rk − sk)bk.

    By the second part of Definition 3.6, we know that r1 − s1 = r2 − s2 = · · · = rk − sk = 0. Therefore there is only one linear combination of elements of B which represents v.

    Now suppose that each vector in V can be uniquely represented as a

    linear combination of the elements of B. We wish to show that B is a

  • 3.2 Basic Concepts of Vector Spaces 4

    basis. Clearly B is a spanning set of V . Now we can write 0 as a linear

    combination of elements of B by taking all coefficients as 0. Since we

    hypothesize that each vector can be uniquely represented, then

    0 = r1b1 + r2b2 + · · · + rkbk

    only for r1 = r2 = · · · = rk = 0. Hence the elements of B are linearly independent and so B is a basis. QED

    Definition. A vector space is finitely generated if it is the span of some

    finite set.

    Theorem 3.4. Relative Size of Spanning and Independent

    Sets.

    Let V be a vector space. Let w1, w2, . . . , wk be vectors in V that span V

    and let v1, v2, . . . , vm be vectors in V that are independent. Then k ≥ m.

    Corollary. Invariance of Dimension for Finitely Generated

    Spaces.

    Let V be a finitely generated vector space. Then any two bases of V have

    the same number of elements.

  • 3.2 Basic Concepts of Vector Spaces 5

    Definition 3.7. Let V be a finitely generated vector space. The number

    of elements in a basis for V is the dimension of V , denoted dim(V ).

    Example. Page 203 number 32. Let {v1, v2, v3} be a basis for V. If w ∉ sp(v1, v2) then {v1, v2, w} is a basis for V.

    Proof. We need to show that {v1, v2, w} is a linearly independent spanning set of V. Since w ∈ V, then w = r1v1 + r2v2 + r3v3 and r3 ≠ 0 since w ∉ sp(v1, v2). Then v3 = (1/r3)(w − r1v1 − r2v2). Therefore v3 ∈ sp(v1, v2, w). So

    sp(v1, v2, v3) ⊆ sp(v1, v2, w)

    and so {v1, v2, w} generates V.

    Next suppose s1v1 + s2v2 + s3w = 0. Then s3 = 0, or else w ∈ sp(v1, v2). So s1v1 + s2v2 = 0 and s1 = s2 = 0. Therefore s1 = s2 = s3 = 0 and so

    {v1, v2, w} is a basis for V. QED

  • 3.3 Coordinatization of Vectors 1

    Chapter 3. Vector Spaces

    3.3 Coordinatization of Vectors

    Definition. An ordered basis (b1, b2, . . . , bn) is an ordered set of vec-

    tors which is a basis for some vector space.

    Definition 3.8. If B = (b1, b2, . . . , bn) is an ordered basis for V and

    v = r1b1 + r2b2 + · · · + rnbn, then the vector [r1, r2, . . . , rn] ∈ Rn is the coordinate vector of v relative to B, denoted vB.

    Example. Page 211 number 6.

    Note. To find vB:

    (1) write the basis vectors as column vectors to form [b1, b2, . . . , bn | v],
    (2) use Gauss-Jordan elimination to get [I | vB].
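    Steps (1)-(2) amount to solving a linear system; an illustrative sketch (assuming numpy, with a made-up ordered basis of R2):

    import numpy as np

    b1, b2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])   # ordered basis B of R^2
    v = np.array([3.0, 1.0])

    B = np.column_stack([b1, b2])        # columns are the basis vectors: [b1 b2 | v]
    v_B = np.linalg.solve(B, v)          # Gauss-Jordan on [b1 b2 | v] gives [I | v_B]
    print(v_B)                           # [2, 1], since v = 2*b1 + 1*b2
    print(np.allclose(v_B[0] * b1 + v_B[1] * b2, v))  # True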

  • 3.3 Coordinatization of Vectors 2

    Definition. An isomorphism between two vector spaces V and W is a one-to-one and onto function ω from V to W such that:
    (1) if v1, v2 ∈ V then ω(v1 + v2) = ω(v1) + ω(v2), and
    (2) if v ∈ V and r ∈ R then ω(rv) = rω(v).
    If there is such an ω, then V and W are isomorphic, denoted V ≅ W.

    Note. An isomorphism is a one-to-one and onto linear transformation.

    Theorem. The Fundamental Theorem of Finite Dimensional

    Vectors Spaces.

    If V is a finite dimensional vector space (say dim(V ) = n) then V is

    isomorphic to Rn.

    Proof. Let B = (b1, b2, . . . , bn) be an ordered basis for V and for v ∈ V with vB = [r1, r2, . . . , rn] define ω : V → Rn as

    ω(v) = [r1, r2, . . . , rn].

  • 3.3 Coordinatization of Vectors 3

    Then clearly ω is one-to-one and onto. Also for v, w ∈ V suppose

    vB = [r1, r2, . . . , rn] and wB = [s1, s2, . . . , sn]

    and so

    ω(v + w) = [r1 + s1, r2 + s2, . . . , rn + sn]
             = [r1, r2, . . . , rn] + [s1, s2, . . . , sn]
             = ω(v) + ω(w).

    For a scalar t ∈ R,

    ω(tv) = [tr1, tr2, . . . , trn] = t[r1, r2, . . . , rn] = tω(v).

    So ω is an isomorphism and V ≅ Rn. QED

    Example. Page 212 number 12.

    Example. Page 212 number 20. Prove the set {(x − a)ⁿ, (x − a)ⁿ⁻¹, . . . , (x − a), 1} is a basis for Pn.

    Proof. Let v0, v1, . . . , vn be the coordinate vectors of 1, (x − a), . . . , (x − a)ⁿ in terms of the ordered basis {1, x, x², . . . , xⁿ}. Form a matrix A with the vl's as the columns:

    A = [v0 v1 · · · vn].

  • 3.3 Coordinatization of Vectors 4

    Notice that A is upper triangular:

    A =
    [ 1  −a   a²  · · ·  (−a)ⁿ ]
    [ 0   1  −2a  · · ·        ]
    [ 0   0   1   · · ·        ]
    [ ⋮             ⋱      ⋮   ]
    [ 0   0   0   · · ·    1   ]

    and so the vi are linearly independent. Since dim(Pn) = n + 1 and the set

    {(x − a)ⁿ, (x − a)ⁿ⁻¹, . . . , (x − a), 1}

    is a set of n + 1 linearly independent vectors, then this set is a basis for
    Pn. QED

  • 3.4 Linear Transformations 1

    Chapter 3. Vector Spaces

    3.4 Linear Transformations

    Note. We have already studied linear transformations from Rn into Rm.

    Now we look at linear transformations from one general vector space to

    another.

    Definition 3.9. A function T that maps a vector space V into a vector

    space V′ is a linear transformation if it satisfies:

    (1) T(u + v) = T(u) + T(v), and (2) T(ru) = rT(u),

    for all vectors u, v ∈ V and for all scalars r ∈ R.

    Definition. For a linear transformation T : V → V′, the set V is the domain of T and the set V′ is the codomain of T. If W is a subset of V, then T[W] = {T(w) | w ∈ W} is the image of W under T. T[V] is the range of T. For W′ ⊆ V′, T⁻¹[W′] = {v ∈ V | T(v) ∈ W′} is the inverse image of W′ under T. T⁻¹[{0}] is the kernel of T, denoted ker(T).

  • 3.4 Linear Transformations 2

    Definition. Let V, V′ and V′′ be vector spaces and let T : V → V′ and T′ : V′ → V′′ be linear transformations. The composite transformation T′ ∘ T : V → V′′ is defined by (T′ ∘ T)(v) = T′(T(v)) for v ∈ V.

    Example. Page 214 Example 1. Let F be the vector space of all functions

    f : R → R, and let D be its subspace of all differentiable functions. Show that differentiation is a linear transformation of D into F.

    Proof. Let T : D → F be defined as T(f) = f′. Then from Calculus 1 we know

    T(f + g) = (f + g)′ = f′ + g′ = T(f) + T(g)

    and

    T(rf) = (rf)′ = rf′ = rT(f)

    for all f, g ∈ D and for all r ∈ R. Therefore T is linear. QED

  • 3.4 Linear Transformations 3

    Theorem 3.5. Preservation of Zero and Subtraction

    Let V and V′ be vector spaces, and let T : V → V′ be a linear transformation. Then

    (1) T (0) = 0, and

    (2) T(v1 − v2) = T(v1) − T(v2),
    for any vectors v1 and v2 in V.

    Proof of (1). Consider

    T(0) = T(0·0) = 0·T(0) = 0.

    QED

    Theorem 3.6. Bases and Linear Transformations.

    Let T : V → V′ be a linear transformation, and let B be a basis for V. For any vector v in V, the vector T(v) is uniquely determined by the vectors

    T(b) for all b ∈ B. In other words, if two linear transformations have the same value at each basis vector b ∈ B, then the two transformations have the same value at each vector in V.

  • 3.4 Linear Transformations 4

    Proof. Let T and T̄ be two linear transformations such that T(bi) = T̄(bi) for each vector bi ∈ B. Let v ∈ V. Then for some scalars r1, r2, . . . , rk we have

    v = r1b1 + r2b2 + · · · + rkbk.

    Then

    T(v) = T(r1b1 + r2b2 + · · · + rkbk)
         = r1T(b1) + r2T(b2) + · · · + rkT(bk)
         = r1T̄(b1) + r2T̄(b2) + · · · + rkT̄(bk)
         = T̄(r1b1 + r2b2 + · · · + rkbk)
         = T̄(v).

    Therefore T and T̄ are the same transformation. QED

    Theorem 3.7. Preservation of Subspaces.

    Let V and V′ be vector spaces, and let T : V → V′ be a linear transformation.

    (1) If W is a subspace of V, then T[W] is a subspace of V′.

    (2) If W′ is a subspace of V′, then T⁻¹[W′] is a subspace of V.

  • 3.4 Linear Transformations 5

    Theorem. Let T : V → V′ be a linear transformation and let T(p) = b for a particular vector p in V. The solution set of T(x) = b is the set

    {p + h | h ∈ ker(T)}.

    Proof. (Page 229 number 46) Let p be a solution of T (v) = b. Then

    T (p) = b. Let h be a solution of T (x) = 0. Then T (h) = 0. Therefore

    T (p + h) = T (p) + T (h) = b + 0 = b,

    and so p + h is indeed a solution. Also, if q is any solution of T (x) = b

    then

    T(q − p) = T(q) − T(p) = b − b = 0,

    and so q − p is in the kernel of T. Therefore for some h ∈ ker(T), we have q − p = h, and q = p + h. QED

    Definition. A transformation T : V → V′ is one-to-one if T(v1) = T(v2) implies that v1 = v2 (or by the contrapositive, v1 ≠ v2 implies T(v1) ≠ T(v2)). Transformation T is onto if for all v′ ∈ V′ there is a v ∈ V such that T(v) = v′.

  • 3.4 Linear Transformations 6

    Corollary. A linear transformation T is one-to-one if and only if ker(T ) =

    {0}.

    Proof. By the previous theorem, if ker(T ) = {0}, then for all relevant b,the equation T (x) = b has a unique solution. Therefore T is one-to-one.

    Next, if T is one-to-one then for any nonzero vector x, T (x) is nonzero.

    Therefore by Theorem 3.5 Part (1), ker(T ) = {0}. QED

    Definition 3.10. Let V and V′ be vector spaces. A linear transformation T : V → V′ is invertible if there exists a linear transformation T⁻¹ : V′ → V such that T⁻¹ ∘ T is the identity transformation on V and T ∘ T⁻¹ is the identity transformation on V′. Such a T⁻¹ is called an inverse transformation of T.

    Theorem 3.8. A linear transformation T : V → V′ is invertible if and only if it is one-to-one and onto V′.

    Proof. Suppose T is invertible and is not one-to-one. Then for some v1 ≠ v2, both in V, we have T(v1) = T(v2) = v′. But then T⁻¹(T(v1)) = v1 and T⁻¹(T(v2)) = v2, so v1 = T⁻¹(v′) = v2, a contradiction. Therefore if T is invertible then T is one-to-one.

  • 3.4 Linear Transformations 7

    From Definition 3.10, if T is invertible then for any v′ ∈ V′ we must have T⁻¹(v′) = v for some v ∈ V. Therefore the image of v is v′ ∈ V′ and T is onto.

    Finally, we need to show that if T is one-to-one and onto then it is invertible. Suppose that T is one-to-one and onto V′. Since T is onto V′, then for each v′ ∈ V′ we can find v ∈ V such that T(v) = v′. Because T is one-to-one, this vector v ∈ V is unique. Let T⁻¹ : V′ → V be defined by T⁻¹(v′) = v. Then

    (T ∘ T⁻¹)(v′) = T(T⁻¹(v′)) = T(v) = v′

    and

    (T⁻¹ ∘ T)(v) = T⁻¹(T(v)) = T⁻¹(v′) = v,

    and so T ∘ T⁻¹ is the identity map on V′ and T⁻¹ ∘ T is the identity map on V.

Now we need only show that T⁻¹ is linear. Suppose T(v1) = v1′ and T(v2) = v2′. Then

T⁻¹(v1′ + v2′) = T⁻¹(T(v1) + T(v2)) = T⁻¹(T(v1 + v2))
             = (T⁻¹ ∘ T)(v1 + v2) = v1 + v2 = T⁻¹(v1′) + T⁻¹(v2′).

Also

T⁻¹(rv1′) = T⁻¹(rT(v1)) = T⁻¹(T(rv1)) = rv1 = rT⁻¹(v1′).


Therefore T⁻¹ is linear. QED

    Theorem 3.9. Coordinatization of Finite-Dimensional Spaces.

Let V be a finite-dimensional vector space with ordered basis B = (b1, b2, . . . , bn). The map T : V → Rn defined by T(v) = vB, the coordinate vector of v relative to B, is an isomorphism.


Theorem 3.10. Matrix Representations of Linear Transformations.

Let V and V′ be finite-dimensional vector spaces and let B = (b1, b2, . . . , bn) and B′ = (b1′, b2′, . . . , bm′) be ordered bases for V and V′, respectively. Let T : V → V′ be a linear transformation, and let T̄ : Rn → Rm be the linear transformation such that for each v ∈ V, we have T̄(vB) = (T(v))B′. Then the standard matrix representation of T̄ is the matrix A whose jth column vector is (T(bj))B′, and (T(v))B′ = A vB for all vectors v ∈ V.

Definition 3.11. The matrix A of Theorem 3.10 is the matrix representation of T relative to B, B′.

Theorem. The matrix representation of T⁻¹ relative to B′, B is the inverse of the matrix representation of T relative to B, B′.

    Examples. Page 227 numbers 18 and 24.
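A small computational sketch of Theorem 3.10 and Definition 3.11 (the transformation and bases below are made up, not taken from the exercises): with V = V′ = R², T given in standard coordinates by a matrix M, and the bases stored as columns, the jth column of A is the coordinate vector of T(bj) relative to B′.

```python
import numpy as np

# Hypothetical T : R^2 -> R^2 with standard matrix M, plus ordered bases B and B'.
M  = np.array([[1.0, 2.0],
               [0.0, 1.0]])
B  = np.column_stack(([1.0, 0.0], [1.0, 1.0]))   # columns b1, b2
Bp = np.column_stack(([2.0, 1.0], [0.0, 1.0]))   # columns b1', b2'

# jth column of A is the coordinate vector of T(b_j) relative to B'
A = np.linalg.solve(Bp, M @ B)

# Check (T(v))_B' = A v_B for a sample v
v = np.array([3.0, -1.0])
v_B   = np.linalg.solve(B, v)            # coordinates of v relative to B
Tv_Bp = np.linalg.solve(Bp, M @ v)       # coordinates of T(v) relative to B'
print(np.allclose(Tv_Bp, A @ v_B))       # True
```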


    Chapter 3. Vector Spaces

    3.5 Inner-Product Spaces

    Note. In this section, we generalize the idea of dot product to general

    vector spaces. We use this more general idea to define length and angle in

    arbitrary vector spaces.

    Note. Motivated by the properties of dot product on Rn, we define the

    following:

    Definition 3.12. An inner product on a vector space V is a function

that associates with each ordered pair of vectors v, w ∈ V a real number, written ⟨v, w⟩, satisfying the following properties for all u, v, w ∈ V and for all scalars r:

P1. Symmetry: ⟨v, w⟩ = ⟨w, v⟩,
P2. Additivity: ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩,
P3. Homogeneity: ⟨rv, w⟩ = r⟨v, w⟩ = ⟨v, rw⟩,
P4. Positivity: ⟨v, v⟩ ≥ 0, and ⟨v, v⟩ = 0 if and only if v = 0.

An inner-product space is a vector space V together with an inner product on V.


    Example. Dot product on Rn is an example of an inner product:

⟨v, w⟩ = v · w for v, w ∈ Rn.

Example. Page 231 Example 3. Show that the space P0,1 of all polynomial functions with real coefficients and domain 0 ≤ x ≤ 1 is an inner-product space if for p and q in P0,1 we define

⟨p, q⟩ = ∫₀¹ p(x)q(x) dx.
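A hedged computational sketch of this inner product (using NumPy's polynomial class; the sample polynomials are made up): multiply the polynomials, take an antiderivative, and evaluate it at the endpoints.

```python
import numpy as np
from numpy.polynomial import Polynomial

def poly_inner(p: Polynomial, q: Polynomial) -> float:
    """<p, q> = integral from 0 to 1 of p(x) q(x) dx."""
    F = (p * q).integ()          # an antiderivative of p*q
    return F(1.0) - F(0.0)

p = Polynomial([0, 1])           # p(x) = x
q = Polynomial([1, 0, 3])        # q(x) = 1 + 3x^2
print(poly_inner(p, q))          # integral of x + 3x^3 on [0,1] = 1/2 + 3/4 = 1.25
print(poly_inner(p, p))          # ||p||^2 = integral of x^2 on [0,1] = 1/3
```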

    Definition 3.13. Let V be an inner-product space. The magnitude or

norm of a vector v ∈ V is ‖v‖ = √⟨v, v⟩. The distance between v and w in an inner-product space V is d(v, w) = ‖v − w‖.

    Note. In Rn with dot product as inner product, we find that the distance

    induced by this inner-product is (as expected):

d(v, w) = ‖v − w‖ = √⟨v − w, v − w⟩

= √((v1 − w1, v2 − w2, . . . , vn − wn) · (v1 − w1, v2 − w2, . . . , vn − wn))

= √((v1 − w1)² + (v2 − w2)² + · · · + (vn − wn)²).


    Theorem 3.11. Schwarz Inequality.

Let V be an inner-product space, and let v, w ∈ V. Then

|⟨v, w⟩| ≤ ‖v‖‖w‖.

Proof. Let r, s ∈ R. Then by Definition 3.12

‖rv + sw‖² = ⟨rv + sw, rv + sw⟩ = r²⟨v, v⟩ + 2rs⟨v, w⟩ + s²⟨w, w⟩ ≥ 0.

Since this inequality holds for all r, s ∈ R, we are free to choose particular values of r and s. We choose r = ⟨w, w⟩ and s = −⟨v, w⟩. Then we have

⟨w, w⟩²⟨v, v⟩ − 2⟨w, w⟩⟨v, w⟩² + ⟨v, w⟩²⟨w, w⟩

= ⟨w, w⟩²⟨v, v⟩ − ⟨w, w⟩⟨v, w⟩²

= ⟨w, w⟩[⟨w, w⟩⟨v, v⟩ − ⟨v, w⟩²] ≥ 0.   (13)

If ⟨w, w⟩ = 0 then w = 0 by Definition 3.12 Part (P4), and the Schwarz Inequality is proven (since it reduces to 0 ≤ 0). If ‖w‖² = ⟨w, w⟩ ≠ 0, then by the above inequality the other factor in inequality (13) must also be nonnegative:

⟨w, w⟩⟨v, v⟩ − ⟨v, w⟩² ≥ 0.


    Therefore

⟨v, w⟩² ≤ ⟨v, v⟩⟨w, w⟩ = ‖v‖²‖w‖².

    Taking square roots, we get the Schwarz Inequality. QED

    Theorem. The Triangle Inequality.

Let v, w ∈ V (where V is an inner-product space). Then

‖v + w‖ ≤ ‖v‖ + ‖w‖.

Proof. We have

‖v + w‖² = ⟨v + w, v + w⟩
= ⟨v, v⟩ + 2⟨v, w⟩ + ⟨w, w⟩   (by Definition 3.12)
= ‖v‖² + 2⟨v, w⟩ + ‖w‖²   (by Definition 3.13)
≤ ‖v‖² + 2‖v‖‖w‖ + ‖w‖²   (by the Schwarz Inequality)
= (‖v‖ + ‖w‖)².

    Taking square roots, we have the Triangle Inequality. QED


Definition. Let v, w ∈ V where V is an inner-product space. Define the angle between vectors v and w as

θ = arccos(⟨v, w⟩/(‖v‖‖w‖)).

In particular, v and w are orthogonal (or perpendicular) if ⟨v, w⟩ = 0.

    Examples. Page 236 number 12, and page 237 number 26.
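A quick check of the Schwarz and Triangle Inequalities and of the angle formula for the dot product on Rn (the vectors below are made-up examples, not from the exercises):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])
w = np.array([3.0, 0.0, 4.0])

inner = np.dot(v, w)                                     # <v, w> = 11
norm_v, norm_w = np.linalg.norm(v), np.linalg.norm(w)    # 3 and 5

print(abs(inner) <= norm_v * norm_w)                     # Schwarz inequality holds
print(np.linalg.norm(v + w) <= norm_v + norm_w)          # Triangle inequality holds

theta = np.arccos(inner / (norm_v * norm_w))
print(np.degrees(theta))                                 # angle, about 42.8 degrees
```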


    Chapter 4. Determinants

    4.1 Areas, Volumes, and Cross Products

    Note. Area of a Parallelogram.

    Consider the parallelogram determined by two vectors a and b:

    Figure 4.1, Page 239.

    Its area is

A = Area = (base)(height) = ‖a‖‖b‖ sin θ = ‖a‖‖b‖√(1 − cos²θ).

Squaring both sides:

A² = ‖a‖²‖b‖²(1 − cos²θ) = ‖a‖²‖b‖² − ‖a‖²‖b‖² cos²θ


= ‖a‖²‖b‖² − (a · b)².

    Converting to components a = [a1, a2] and b = [b1, b2] gives

A² = (a1b2 − a2b1)², or A = |a1b2 − a2b1|.

Definition. For a 2 × 2 matrix

A = [ a1  a2 ]
    [ b1  b2 ],

define the determinant of A as

det(A) = a1b2 − a2b1 = | a1  a2 |
                       | b1  b2 |.

    Example. Page 249 number 26.

Definition. For two vectors b = [b1, b2, b3] and c = [c1, c2, c3] define the cross product of b and c as

b × c = | b2  b3 | i  −  | b1  b3 | j  +  | b1  b2 | k.
        | c2  c3 |       | c1  c3 |       | c1  c2 |


Note. We can take dot products and find that b × c is perpendicular to both b and c.
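A short sketch of the definition in code (example vectors made up): compute b × c from the 2 × 2 determinants and confirm it is perpendicular to b and c.

```python
import numpy as np

def cross(b, c):
    """Cross product via the 2x2 determinant formula in the definition."""
    b1, b2, b3 = b
    c1, c2, c3 = c
    return np.array([b2*c3 - b3*c2,        # i component
                     -(b1*c3 - b3*c1),     # j component (note the minus sign)
                     b1*c2 - b2*c1])       # k component

b = np.array([1.0, 2.0, 3.0])
c = np.array([4.0, 5.0, 6.0])
n = cross(b, c)
print(n)                                   # [-3.  6. -3.], matches np.cross(b, c)
print(np.dot(n, b), np.dot(n, c))          # both 0: b x c is perpendicular to b and c
```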

Note. If b, c ∈ R3 are not parallel, then there are two directions perpendicular to both of these vectors. We can determine the direction of b × c by using a right-hand rule. If you curl the fingers of your right hand from vector b to vector c, then your thumb will point in the direction of b × c:

    Figure 4.3, Page 242.

    Example. Page 248 number 16.


Definition. For a 3 × 3 matrix

A = [ a1  a2  a3 ]
    [ b1  b2  b3 ]
    [ c1  c2  c3 ],

define the determinant as

det(A) = | a1  a2  a3 |
         | b1  b2  b3 |
         | c1  c2  c3 |

       = a1 | b2  b3 |  −  a2 | b1  b3 |  +  a3 | b1  b2 |.
            | c2  c3 |        | c1  c3 |        | c1  c2 |

Note. We can now see that cross products can be computed using determinants:

b × c = | i   j   k  |
        | b1  b2  b3 |
        | c1  c2  c3 |.


Theorem. The area of the parallelogram determined by b and c is ‖b × c‖.

    Proof. We know from the first note of this section that the area squared

is A² = ‖c‖²‖b‖² − (c · b)². In terms of components we have

A² = (c1² + c2² + c3²)(b1² + b2² + b3²) − (c1b1 + c2b2 + c3b3)².

Multiplying out and regrouping we find that

A² = | b2  b3 |²  +  | b1  b3 |²  +  | b1  b2 |².
     | c2  c3 |      | c1  c3 |      | c1  c2 |

    Taking square roots we see that the claim is verified. QED

Theorem. The volume of the box determined by vectors a, b, c ∈ R3 is

V = |a1(b2c3 − b3c2) − a2(b1c3 − b3c1) + a3(b1c2 − b2c1)| = |a · (b × c)|.

Proof. Consider the box determined by a, b, c ∈ R3:

    Figure 4.5, Page 244.


    The volume of the box is the height times the area of the base. The area

of the base is ‖b × c‖ by the previous theorem. Now the height is

h = ‖a‖ |cos θ| = (‖b × c‖ ‖a‖ |cos θ|)/‖b × c‖ = |(b × c) · a| / ‖b × c‖.

(Notice that if b × c is in the opposite direction as given in the illustration above, then θ would be greater than π/2 and cos θ would be negative. Therefore the absolute value is necessary.) Therefore

V = (Area of base)(height) = ‖b × c‖ · |(b × c) · a| / ‖b × c‖ = |(b × c) · a|.

    QED

Note. The volume of the box determined by a, b, c ∈ R3 can be computed in a manner similar to cross products:

V = |det(A)|, where

A = [ a1  a2  a3 ]
    [ b1  b2  b3 ]
    [ c1  c2  c3 ].

    Example. Page 249 number 37.
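A small numerical check (the vectors are made-up examples): the box volume via the scalar triple product a · (b × c) agrees with |det(A)| for the matrix whose rows are a, b, c.

```python
import numpy as np

a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 3.0, 0.0])
c = np.array([1.0, 1.0, 1.0])

triple = np.dot(a, np.cross(b, c))           # scalar triple product a . (b x c)
A = np.vstack([a, b, c])                     # rows a, b, c
print(abs(triple), abs(np.linalg.det(A)))    # both give the box volume, 3.0
```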


    Theorem 4.1. Properties of Cross Product.

Let a, b, c ∈ R3.

(1) Anticommutativity: b × c = −(c × b).

(2) Nonassociativity of ×: a × (b × c) ≠ (a × b) × c (that is, equality does not in general hold).

(3) Distributive Properties: a × (b + c) = (a × b) + (a × c) and (a + b) × c = (a × c) + (b × c).

(4) Perpendicular Property: b · (b × c) = (b × c) · c = 0.

(5) Area Property: ‖b × c‖ = Area of the parallelogram determined by b and c.

(6) Volume Property: a · (b × c) = (a × b) · c, and |a · (b × c)| = Volume of the box determined by a, b, and c.

(7) a × (b × c) = (a · c)b − (a · b)c.

    Proof of (1). We have

b × c = | b2  b3 | i  −  | b1  b3 | j  +  | b1  b2 | k
        | c2  c3 |       | c1  c3 |       | c1  c2 |

      = (b2c3 − b3c2) i − (b1c3 − b3c1) j + (b1c2 − b2c1) k

      = −[ (b3c2 − b2c3) i − (b3c1 − b1c3) j + (b2c1 − b1c2) k ]

      = −[ | c2  c3 | i  −  | c1  c3 | j  +  | c1  c2 | k ]
           | b2  b3 |       | b1  b3 |       | b1  b2 |

      = −(c × b).

    QED

    Example. Page 249 number 56.


    Chapter 4. Determinants

    4.2 The Determinant of a Square Matrix

Definition. The minor matrix Aij of an n × n matrix A is the (n − 1) × (n − 1) matrix obtained from it by eliminating the ith row and the jth column.

    Example. Find A11, A12, and A13 for

A = [ a11  a12  a13 ]
    [ a21  a22  a23 ]
    [ a31  a32  a33 ].

Definition. The determinant of Aij times (−1)^(i+j) is the cofactor of entry aij in A, denoted a′ij.


Example. We can write determinants of 3 × 3 matrices in terms of cofactors:

det(A) = | a11  a12  a13 |
         | a21  a22  a23 |
         | a31  a32  a33 |

       = a11|A11| − a12|A12| + a13|A13|

       = a11a′11 + a12a′12 + a13a′13.

    Note. The following definition is recursive. For example, in order to

    process the definition for n = 4 you must process the definition for n = 3,

    n = 2, and n = 1.

Definition 4.1. The determinant of a 1 × 1 matrix is its single entry. Let n > 1 and assume the determinants of order less than n have been defined. Let A = [aij] be an n × n matrix. The cofactor of aij in A is a′ij = (−1)^(i+j) det(Aij). The determinant of A is

det(A) = a11a′11 + a12a′12 + · · · + a1na′1n = Σ_{i=1}^{n} a1i a′1i.

    Example. Page 252 Example 2.
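A minimal sketch of Definition 4.1 in code, expanding along the first row; it is tested on the 4 × 4 matrix used later in this section. (The recursion is fine for small matrices but far slower than row reduction.)

```python
import numpy as np

def det_recursive(A):
    """Determinant by cofactor expansion along the first row (Definition 4.1)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Minor matrix A_{1j}: delete row 1 and column j+1 (0-based indices 0 and j)
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        cofactor = (-1) ** j * det_recursive(minor)   # sign (-1)^(1+(j+1)) = (-1)^j
        total += A[0, j] * cofactor
    return total

A = [[0, 0, 0, 1],
     [0, 1, 2, 0],
     [0, 4, 5, 9],
     [1, 15, 6, 57]]
print(det_recursive(A), np.linalg.det(A))   # both 3 (up to rounding)
```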


    Theorem 4.2. General Expansion by Minors.

    The determinant of A can be calculated by expanding about any row or

    column:

det(A) = ar1a′r1 + ar2a′r2 + · · · + arna′rn

       = a1sa′1s + a2sa′2s + · · · + ansa′ns

for any 1 ≤ r ≤ n or 1 ≤ s ≤ n.

    Proof. See Appendix B for a proof which uses mathematical induction.

    Example. Find the determinant of

A = [ 0   0  0   1 ]
    [ 0   1  2   0 ]
    [ 0   4  5   9 ]
    [ 1  15  6  57 ].


    Theorem. Properties of the Determinant.

    Let A be a square matrix:

1. det(A) = det(Aᵀ).

2. If H is obtained from A by interchanging two rows, then det(H) = −det(A).

    3. If two rows of A are equal, then det(A) = 0.

    4. If H is obtained from A by multiplying a row of A by a scalar r, then

    det(H) = r det(A).

    5. If H is obtained from A by adding a scalar times one row to another

    row, then det(H) = det(A).

    Proof of 2. We will prove this by induction. The proof is trivial for

    n = 2. Assume that n > 2 and that this row interchange property holds

for square matrices of size smaller than n × n. Let A be an n × n matrix and let B be the matrix obtained from A by interchanging the ith row and the rth row. Since n > 2, we can choose a kth row for expansion by minors, where k ∉ {r, i}. Consider the cofactors

(−1)^(k+j)|Akj| and (−1)^(k+j)|Bkj|.


    These numbers must have opposite signs, by our induction hypothesis,

since the minor matrices Akj and Bkj have size (n − 1) × (n − 1), and Bkj can be obtained from Akj by interchanging two rows. That is, |Bkj| = −|Akj|. Expanding by minors on the kth row to find det(A) and det(B), we see that det(A) = −det(B). QED

    Note. Property 1 above implies that each property of determinants stated

    for rows also holds for columns.

    Example. Page 261 number 8.

    Theorem 4.3. Determinant Criterion for Invertibility.

A square matrix A is invertible if and only if det(A) ≠ 0.

Theorem 4.4. The Multiplicative Property.

If A and B are n × n matrices, then det(AB) = det(A) det(B).

    Examples. Page 262 numbers 28 and 32.
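These properties are easy to spot-check numerically (the random integer matrices below are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 4)).astype(float)
B = rng.integers(-3, 4, size=(4, 4)).astype(float)

print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))          # property 1
H = A[[1, 0, 2, 3], :]                                            # interchange rows 1 and 2
print(np.isclose(np.linalg.det(H), -np.linalg.det(A)))            # property 2
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))            # multiplicative property
```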


    Chapter 4. Determinants

4.3 Computation of Determinants and Cramer's Rule

Note. Computation of a Determinant.

The determinant of an n × n matrix A can be computed as follows:

    1. Reduce A to an echelon form using only row (column) addition and

    row (column) interchanges.

    2. If any matrices appearing in the reduction contain a row (column) of

    zeros, then det(A) = 0.

    3. Otherwise,

det(A) = (−1)^r · (product of pivots),

where r is the number of row (column) interchanges.

    Example. Page 271 number 6 (work as in Example 1 of page 264).
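A minimal sketch of this procedure in code (with partial pivoting added for numerical stability, which is not required by the note itself): reduce to echelon form using row additions and interchanges, count the interchanges, and multiply the pivots.

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via row reduction: row additions leave det unchanged,
    each row interchange flips its sign, and det = (-1)^r * (product of pivots)."""
    U = np.asarray(A, dtype=float).copy()
    n = U.shape[0]
    swaps = 0
    for k in range(n):
        pivot_row = k + np.argmax(np.abs(U[k:, k]))    # partial pivoting
        if np.isclose(U[pivot_row, k], 0.0):
            return 0.0                                  # column of zeros below the diagonal
        if pivot_row != k:
            U[[k, pivot_row]] = U[[pivot_row, k]]       # row interchange
            swaps += 1
        for i in range(k + 1, n):
            U[i] -= (U[i, k] / U[k, k]) * U[k]          # row addition only
    return (-1) ** swaps * np.prod(np.diag(U))

A = [[0, 0, 0, 1], [0, 1, 2, 0], [0, 4, 5, 9], [1, 15, 6, 57]]
print(det_by_elimination(A), np.linalg.det(A))          # both 3 (up to rounding)
```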


Theorem 4.5. Cramer's Rule.

Consider the linear system Ax = b, where A = [aij] is an n × n invertible matrix,

x = [x1, x2, . . . , xn]ᵀ and b = [b1, b2, . . . , bn]ᵀ.

The system has a unique solution given by

xk = det(Bk)/det(A) for k = 1, 2, . . . , n,

    where Bk is the matrix obtained from A by replacing the kth column

    vector of A by the column vector b.

    Proof. Since A is invertible, we know that the linear system Ax = b has

    a unique solution by Theorem 1.16. Let x be this unique solution. Let

Xk be the matrix obtained from the n × n identity matrix by replacing


    its kth column vector by the column vector x, so

Xk = [ 1  0  · · ·  x1  · · ·  0 ]
     [ 0  1  · · ·  x2  · · ·  0 ]
     [ ⋮  ⋮        ⋮          ⋮ ]
     [ 0  0  · · ·  xk  · · ·  0 ]
     [ ⋮  ⋮        ⋮          ⋮ ]
     [ 0  0  · · ·  xn  · · ·  1 ],

where the column containing x1, x2, . . . , xn is the kth column (so xk sits on the diagonal).

We now compute the product AXk. If j ≠ k, then the jth column of AXk is the product of A and the jth column of the identity matrix, which is

    just the jth column of A. If j = k, then the jth column of AXk is Ax = b.

    Thus AXk is the matrix obtained from A by replacing the kth column

    of A by the column vector b. That is, AXk is the matrix Bk described

    in the statement of the theorem. From the equation AXk = Bk and the

    multiplicative property of determinants, we have

    det(A) det(Xk) = det(Bk).

    Computing det(Xk) by expanding by minors across the kth row, we see

that det(Xk) = xk and thus det(A)xk = det(Bk). Because A is invertible, we know that det(A) ≠ 0 by Theorem 4.3, and so xk = det(Bk)/det(A) as claimed. QED


    Example. Page 272 number 26.
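A direct transcription of Cramer's Rule into code (the small system is a made-up example; for large systems this is much more expensive than elimination):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b via x_k = det(B_k)/det(A), where B_k is A with its
    kth column replaced by b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("A is not invertible, so Cramer's Rule does not apply")
    x = np.empty(len(b))
    for k in range(len(b)):
        Bk = A.copy()
        Bk[:, k] = b                      # replace the kth column by b
        x[k] = np.linalg.det(Bk) / d
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
print(cramer_solve(A, b))                 # [0.8 1.4]
print(np.linalg.solve(A, b))              # same solution
```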

Note. Recall that a′ij = (−1)^(i+j) det(Aij) is the cofactor of entry aij (the signed determinant of the minor matrix associated with aij).

Definition. For an n × n matrix A = [aij], define the adjoint of A as

adj(A) = (A′)ᵀ

where A′ = [a′ij].

    Example. Page 272 number 18 (find the adjoint of A).

    Theorem 4.6. Property of the Adjoint.

Let A be n × n. Then

    (adj(A))A = A adj(A) = (det(A))I.

Corollary. A Formula for A⁻¹.

Let A be n × n and suppose det(A) ≠ 0. Then

A⁻¹ = (1/det(A)) adj(A).


    Example. Page 272 number 18 (use the corollary to find A1).
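A sketch of the adjoint and of the corollary A⁻¹ = (1/det(A)) adj(A) in code (the 2 × 2 example is made up):

```python
import numpy as np

def adjoint(A):
    """adj(A) = transpose of the matrix of cofactors a'_ij = (-1)^(i+j) det(A_ij)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(adjoint(A))                              # [[ 4. -2.] [-3.  1.]]
print(adjoint(A) @ A)                          # (det A) I = -2 I
print(adjoint(A) / np.linalg.det(A))           # equals np.linalg.inv(A)
```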

Note. If

A = [ a  b ]
    [ c  d ],

then

adj(A) = [  d  −b ]
         [ −c   a ]

and det(A) = ad − bc, so

A⁻¹ = (1/(ad − bc)) [  d  −b ]
                    [ −c   a ].


    Chapter 5. Eigenvalues and Eigenvectors

    5.1 Eigenvalues and Eigenvectors

Definition 5.1. Let A be an n × n matrix. A scalar λ is an eigenvalue of A if there is a nonzero column vector v ∈ Rn such that Av = λv. The vector v is then an eigenvector of A corresponding to λ.

Note. If Av = λv then Av − λv = 0 and so (A − λI)v = 0. This equation has a nontrivial solution only when det(A − λI) = 0.

Definition. det(A − λI) is a polynomial of degree n in λ (where A is n × n) called the characteristic polynomial of A, denoted p(λ), and the equation p(λ) = 0 is called the characteristic equation.

    Example. Page 300 number 8.
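A quick numerical illustration of Definition 5.1 and the characteristic polynomial (the matrix is a made-up example):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# Coefficients of the characteristic polynomial: lambda^2 - 7 lambda + 10
print(np.poly(A))                        # [ 1. -7. 10.]

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                       # 5 and 2 (order may vary)
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True: A v = lambda v for a nonzero v
```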


    Theorem 5.1. Properties of Eigenvalues and Eigenvectors.

Let A be an n × n matrix.

1. If λ is an eigenvalue of A with v as a corresponding eigenvector, then λᵏ is an eigenvalue of Aᵏ, again with v as a corresponding eigenvector, for any positive integer k.

2. If λ is an eigenvalue of an invertible matrix A with v as a corresponding eigenvector, then λ ≠ 0 and 1/λ is an eigenvalue of A⁻¹, again with v as a corresponding eigenvector.

3. If λ is an eigenvalue of A, then the set Eλ consisting of the zero vector together with all eigenvectors of A for this eigenvalue is a subspace of n-space, the eigenspace of λ.

Proof of (2). (Page 301 number 28.) By definition, v ≠ 0. If λ is an eigenvalue of A with eigenvector v, then Av = λv; since A is invertible and v ≠ 0, Av ≠ 0, and so λ ≠ 0. Therefore A⁻¹Av = A⁻¹(λv), or v = λA⁻¹v. So A⁻¹v = (1/λ)v and 1/λ is an eigenvalue of A⁻¹, with v as a corresponding eigenvector. QED


    Note. We define eigenvalue and eigenvector for a linear transformation

    in the most obvious way (that is, in terms of the matrix which represents

    it).

    Definition 5.2. Eigenvalues and Eigenvectors.

    Let T be a linear transformation of a vector space V into itself. A scalar

λ is an eigenvalue of T if there is a nonzero vector v ∈ V such that T(v) = λv. The vector v is then an eigenvector of T corresponding to λ.

    Examples. Page 300 number 18, page 301 number 32.


    Chapter 5. Eigenvalues and Eigenvectors

    5.2 Diagonalization

    Recall. A matrix is diagonal if all entries off the main diagonal are 0.

    Note. In this section, the theorems stated are valid for matrices and

    vectors with complex entries and complex scalars, unless stated otherwise.

    Theorem 5.2. Matrix Summary of Eigenvalues of A.

Let A be an n × n matrix and let λ1, λ2, . . . , λn be (possibly complex) scalars and v1, v2, . . . , vn be nonzero vectors in n-space. Let C be the n × n matrix having vj as jth column vector and let

D = [ λ1   0   0  · · ·   0 ]
    [  0  λ2   0  · · ·   0 ]
    [  0   0  λ3  · · ·   0 ]
    [  ⋮   ⋮   ⋮   ⋱     ⋮ ]
    [  0   0   0  · · ·  λn ].

Then AC = CD if and only if λ1, λ2, . . . , λn are eigenvalues of A and vj is an eigenvector of A corresponding to λj for j = 1, 2, . . . , n.


Proof. We have

CD = [ v1  v2  · · ·  vn ] D = [ λ1v1  λ2v2  · · ·  λnvn ],

where the vj (and hence the λjvj) are the column vectors. Also,

AC = A [ v1  v2  · · ·  vn ] = [ Av1  Av2  · · ·  Avn ].

Therefore, AC = CD if and only if Avj = λjvj for j = 1, 2, . . . , n. QED

Note. The n × n matrix C is invertible if and only if rank(C) = n, that is, if and only if the column vectors of C form a basis of n-space. In this case, the criterion AC = CD in Theorem 5.2 can be written as D = C⁻¹AC. The equation D = C⁻¹AC transforms a matrix A into a

    diagonal matrix D that is much easier to work with.
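A short check of Theorem 5.2 and the note above (the matrix is a made-up example): the eigenvector matrix C and the diagonal matrix D of eigenvalues satisfy AC = CD and D = C⁻¹AC.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

eigenvalues, C = np.linalg.eig(A)     # columns of C are eigenvectors of A
D = np.diag(eigenvalues)

print(np.allclose(A @ C, C @ D))                      # AC = CD
print(np.allclose(np.linalg.inv(C) @ A @ C, D))       # D = C^(-1) A C
```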


    Definition 5.3. Diagonalizable Matrix.

An n × n matrix A is diagonalizable if there exists an invertible matrix C such that C⁻¹AC = D is a diagonal matrix. The matrix C is said to

    diagonalize the matrix A.

    Corollary 1. A Criterion for Diagonalization.

An n × n matrix A is diagonalizable if and only if n-space has a basis consisting of eigenvectors of A.

Corollary 2. Computation of Aᵏ.

Let an n × n matrix A have n eigenvectors and eigenvalues, giving rise to the matrices C and D so that AC = CD, as described in Theorem 5.2.

    If the eigenvectors are independent, then C is an invertible matrix and

C⁻¹AC = D. Under these conditions, we have Aᵏ = CDᵏC⁻¹.

    Proof. By Corollary 1, if the eigenvectors of A are independent, then A

    is diagonalizable and so C is invertible. Now consider

Aᵏ = (CDC⁻¹)(CDC⁻¹) · · · (CDC⁻¹)   (k factors)

   = CD(C⁻¹C)D(C⁻¹C)D(C⁻¹C) · · · (C⁻¹C)DC⁻¹

   = CDIDID · · · IDC⁻¹

   = C(DD · · · D)C⁻¹   (k factors of D)

   = CDᵏC⁻¹.

    QED
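A numerical check of Corollary 2 on a made-up matrix: Aᵏ computed as CDᵏC⁻¹ matches repeated multiplication.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
k = 5

eigenvalues, C = np.linalg.eig(A)
Dk = np.diag(eigenvalues ** k)                 # D^k: just raise the diagonal entries to k
Ak = C @ Dk @ np.linalg.inv(C)

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```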

    Theorem 5.3. Independence of Eigenvectors.

Let A be an n × n matrix. If v1, v2, . . . , vn are eigenvectors of A corresponding to distinct eigenvalues λ1, λ2, . . . , λn, respectively, then the set {v1, v2, . . . , vn} is linearly independent and A is diagonalizable.

    Proof. We prove this by contradiction. Suppose that the conclusion

is false and the hypotheses are true. That is, suppose the eigenvectors v1, v2, . . . , vn are linearly dependent. Then one of them is a linear combination of its predecessors (see page 203 number 37). Let vk be the first such vector, so that

vk = d1v1 + d2v2 + · · · + dk−1vk−1   (2)

and {v1, v2, . . . , vk−1} is independent. Multiplying (2) by λk, we obtain

λkvk = d1λkv1 + d2λkv2 + · · · + dk−1λkvk−1.   (3)

Also, multiplying (2) on the left by the matrix A yields

λkvk = d1λ1v1 + d2λ2v2 + · · · + dk−1λk−1vk−1,   (4)

since Avi = λivi. Subtracting (4) from (3), we see that

0 = d1(λk − λ1)v1 + d2(λk − λ2)v2 + · · · + dk−1(λk − λk−1)vk−1.

But this equation is a dependence relation, since not all di's are 0 and the λ's are hypothesized to be distinct. This contradicts the linear independence of the set {v1, v2, . . . , vk−1}. This contradiction shows that {v1, v2, . . . , vn} is independent. From Corollary 1 of Theorem 5.2 we see that A is diagonalizable. QED

    Example. Page 315 number 4.

Definition 5.4. An n × n matrix P is similar to an n × n matrix Q if there exists an invertible n × n matrix C such that C⁻¹PC = Q.

    Example. Page 315 number 18.

Definition. The algebraic multiplicity of an eigenvalue λi of A is its multiplicity as a root of the characteristic equation of A. Its geometric multiplicity is the dimension of the eigenspace Eλi.


    Theorem. The geometric multiplicity of an eigenvalue of a matrix A is

    less than or equal to its algebraic multiplicity.

    Note. The proof of this result is a problem (number 33) in section 9.4.

    Theorem 5.4. A Criterion for Diagonalization.

An n × n matrix A is diagonalizable if and only if the algebraic multiplicity of each (possibly complex) eigenvalue is equal to its geometric multiplicity.

    Example. Page 315 number 10.

    Theorem 5.5. Diagonalization of Real Symmetric Matrices.

Every real symmetric matrix is real diagonalizable. That is, if A is an n × n symmetric matrix with real-number entries, then each eigenvalue of

    A is a real number, and its algebraic multiplicity equals its geometric

    multiplicity.

    Note. The proof of Theorem 5.5 is in Chapter 9 and uses the Jordan

canonical form of matrix A.

    Example. Page 316 number 26.


    Chapter 6. Orthogonality

    6.1 Projections

    Note. We want to find the projection p of vector F on sp(a):

    Figure 6.1, Page 327.

We see that p is a multiple of a. Now (1/‖a‖)a is a unit vector having the same direction as a, so p is a scalar multiple of this unit vector. We need only find the appropriate scalar. From the above figure, we see that the appropriate scalar is ‖F‖ cos θ, because it is the length of the leg labeled p of the right triangle. If p is in the opposite direction of a and θ ∈ (π/2, π]:


    Figure 6.2, Page 327.

then the appropriate scalar is again given by ‖F‖ cos θ. Thus

p = (‖F‖ cos θ)(1/‖a‖)a = ((‖F‖‖a‖ cos θ)/(‖a‖‖a‖)) a = ((F · a)/(a · a)) a.

    We use this to motivate the following definition.

Definition. Let a, b ∈ Rn. The projection p of b on sp(a) is

p = ((b · a)/(a · a)) a.

    Example. Page 336 number 4.
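The definition translated directly into code (the vectors are made-up examples); the difference b − p is orthogonal to a, as the figures suggest.

```python
import numpy as np

def proj_on_span(b, a):
    """Projection of b on sp(a): p = ((b . a)/(a . a)) a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (np.dot(b, a) / np.dot(a, a)) * a

b = np.array([3.0, 4.0])
a = np.array([1.0, 0.0])
p = proj_on_span(b, a)
print(p)                       # [3. 0.]
print(np.dot(b - p, a))        # 0: b - p is orthogonal to a
```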


    Definition 6.1. Let W be a subspace of Rn. The set of all vectors in Rn

    that are orthogonal to every vector in W is the orthogonal complement

of W and is denoted by W⊥.

    Note. To find the orthogonal complement of a subspace of Rn:

    1. Find a matrix A having as row vectors a generating set for W .

2. Find the nullspace of A, that is, the solution space of Ax = 0. This nullspace is W⊥.

    Example. Page 336 number 10.
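A sketch of the two-step procedure above, using the SVD to get the nullspace (the generating set for W below is a made-up example):

```python
import numpy as np

def orthogonal_complement(W_rows):
    """Basis for W-perp: the nullspace of the matrix A whose rows generate W."""
    A = np.asarray(W_rows, dtype=float)
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12))
    return Vt[rank:]                      # rows spanning the nullspace of A

# W = sp([1, 1, 0], [0, 1, 1]) in R^3
W_perp = orthogonal_complement([[1, 1, 0], [0, 1, 1]])
print(W_perp)                             # one basis vector, a multiple of [1, -1, 1]
print(np.array([[1, 1, 0], [0, 1, 1]]) @ W_perp.T)   # zeros: each basis vector solves Ax = 0
```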

Theorem 6.1. Properties of W⊥.

The orthogonal complement W⊥ of a subspace W of Rn has the following properties:

1. W⊥ is a subspace of Rn.

2. dim(W⊥) = n − dim(W).

3. (W⊥)⊥ = W.


4. Each vector b ∈ Rn can be expressed uniquely in the form b = bW + bW⊥ for bW ∈ W and bW⊥ ∈ W⊥.

Proof of (1) and (2). Let dim(W) = k, and let {v1, v2, . . . , vk} be a basis for W. Let A be the k × n matrix having vi as its ith row vector for i = 1, 2, . . . , k.

Property (1) follows from the fact that W⊥ is the nullspace of matrix

    A and therefore is a subspace of Rn.

    For Property (2), consider the rank equation of A:

    rank(A) + nullity(A) = n.

Since dim(W) = rank(A) and since W⊥ is the nullspace of A, then

dim(W⊥) = n − dim(W). QED

Definition 6.2. Let b ∈ Rn, and let W be a subspace of Rn. Let b = bW + bW⊥ be as described in Theorem 6.1. Then bW is the projection of b on W.


    Note. To find the projection of b on W , follow these steps:

    1. Select a basis {v1, v2, . . . , vk} for the subspace W .

2. Find a basis {vk+1, vk+2, . . . , vn} for W⊥.

3. Find the coordinate vector r = [r1, r2, . . . , rn] of b relative to the basis (v1, v2, . . . , vn), so that

b = r1v1 + r2v2 + · · · + rnvn.

4. Then bW = r1v1 + r2v2 + · · · + rkvk.

    Example. Page 336 number 20b.

    Note. We can perform projections in inner product spaces by replacing

    the dot products in the formulas above with inner products.

Example. Page 335 Example 6. Consider the inner product space P[0,1] of all polynomial functions defined on the interval [0, 1] with inner product

⟨p(x), q(x)⟩ = ∫₀¹ p(x)q(x) dx.

Find the projection of f(x) = x on sp(1) and then find the projection of x on sp(1)⊥.

    Example. Page 337 number 28.


    Chapter 6. Orthogonality

6.2 The Gram-Schmidt Process

Definition. A set {v1, v2, . . . , vk} of nonzero vectors in Rn is orthogonal if the vectors vj are mutually perpendicular, that is, if vi · vj = 0 for i ≠ j.

    Theorem 6.2. Orthogonal Bases.

Let {v1, v2, . . . , vk} be an orthogonal set of nonzero vectors in Rn. Then this set is independent and consequently is a basis for the subspace sp(v1, v2, . . . , vk).

    Proof. Let j be an integer between 2 and k. Consider

vj = s1v1 + s2v2 + · · · + sj−1vj−1.

If we take the dot product of each side of this equation with vj then, since the set of vectors is orthogonal, we get vj · vj = 0, which contradicts the hypothesis that vj ≠ 0. Therefore no vj is a linear combination of its predecessors, and by Exercise 37 page 203 the set is independent. Therefore the set is a basis for its span. QED


    Theorem 6.3. Projection Using an Orthogonal Basis.

Let {v1, v2, . . . , vk} be an orthogonal basis for a subspace W of Rn, and let b ∈ Rn. The projection of b on W is

bW = ((b · v1)/(v1 · v1)) v1 + ((b · v2)/(v2 · v2)) v2 + · · · + ((b · vk)/(vk · vk)) vk.

Proof. We know from Theorem 6.1 that b = bW + bW⊥ where bW is the projection of b on W and bW⊥ is the projection of b on W⊥. Since bW ∈ W and {v1, v2, . . . , vk} is a basis of W, then

bW = r1v1 + r2v2 + · · · + rkvk

for some scalars r1, r2, . . . , rk. We now find these ri's. Taking the dot product of b with vi we have

b · vi = (bW · vi) + (bW⊥ · vi) = (r1v1 · vi + r2v2 · vi + · · · + rkvk · vi) + 0 = ri(vi · vi).

Therefore ri = (b · vi)/(vi · vi) and so

rivi = ((b · vi)/(vi · vi)) vi.

    Substituting these values of the ris into the expression for bW yields the

    theorem. QED


    Example. Page 347 number 4.

Definition 6.3. Let W be a subspace of Rn. A basis {q1, q2, . . . , qk} for W is orthonormal if

1. qi · qj = 0 for i ≠ j, and

2. qi · qi = 1.

    That is, each vector of the basis is a unit vector and the vectors are pairwise

    orthogonal.

    Note. If {q1, q2, . . . , qk} is an orthonormal basis for W , then

bW = (b · q1)q1 + (b · q2)q2 + · · · + (b · qk)qk.

    Theorem 6.4. Orthonormal Basis (Gram-Schmidt) Theorem.

Let W be a subspace of Rn, let {a1, a2, . . . , ak} be any basis for W, and let

    Wj = sp(a1, a2, . . . , aj) for j = 1, 2, . . . , k.

Then there is an orthonormal basis {q1, q2, . . . , qk} for W such that Wj = sp(q1, q2, . . . , qj).


    Note. The proof of Theorem 6.4 is computational. We summarize the

    proof in the following procedure:

    Gram-Schmidt Process.

    To find an orthonormal basis for a subspace W of Rn:

    1. Find a basis {a1, a2, . . . , ak} for W .

    2. Let v1 = a1. For j = 1, 2, . . . , k, compute in succession the vector vj

    given by subtracting from aj its projection on the subspace generated

    by its predecessors.

    3. The vj so obtained form an orthogonal basis for W , and they may be

    normalized to yield an orthonormal basis.

    Note. We can recursively describe the way to find vj as:

vj = aj − (((aj · v1)/(v1 · v1)) v1 + ((aj · v2)/(v2 · v2)) v2 + · · · + ((aj · vj−1)/(vj−1 · vj−1)) vj−1).

If we normalize the vj as we go by letting qj = (1/‖vj‖)vj, then we have

vj = aj − ((aj · q1)q1 + (aj · q2)q2 + · · · + (aj · qj−1)qj−1).

    Example. Page 348 number 10.
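A minimal sketch of the Gram-Schmidt Process as summarized above (the input basis is a made-up example): subtract from each aj its projection on the span of the previous q's, then normalize.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormal basis for sp(vectors) via the Gram-Schmidt Process."""
    q = []
    for a in np.asarray(vectors, dtype=float):
        v = a - sum(np.dot(a, qi) * qi for qi in q)   # a_j minus its projection on sp(q1,...)
        norm = np.linalg.norm(v)
        if norm > 1e-12:                              # skip vectors dependent on predecessors
            q.append(v / norm)
    return np.array(q)

Q = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(np.round(Q @ Q.T, 10))    # identity matrix: the rows are orthonormal
```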


    Corollary 2. Expansion of an Orthogonal Set to an Orthog-

    onal Basis.

    Every orthogonal set of vectors in a subspace W of Rn can be expanded

    if necessary to an orthogonal basis of W .

    Examples. Page 348 number 20 and page 349 number 34.


    Chapter 6. Orthogonality

    6.3 Orthogonal Matrices

Definition 6.4. An n × n matrix A is orthogonal if AᵀA = I.

    Note. We will see that the columns of an orthogonal matrix must be

    unit vectors and that the columns of an orthogonal matrix are mutually

orthogonal.