
    Algebra (Advanced)

    Math 2968

    Lecture notes by

    Andrew Mathas

    with revisions and additions by

    Alexander Molev

Sidere mens eadem mutato

    School of Mathematics and Statistics

    University of Sydney

    2006


    Algebra (Advanced)

This course is concerned with inner product spaces and with group theory.

An inner product space is a special type of vector space which comes equipped with an inner product; this leads to notions of length, distance and angle.

A group is a set equipped with a single binary operation which satisfies some simple axioms; you should think of this operation as multiplication (or addition). The group axioms are an abstraction of the essential properties of multiplication (or addition) in a general context. Groups are ubiquitous in mathematics and have a wide range of applications throughout the sciences and beyond. Whenever symmetry is present there is a group in the background, and group theory can be used to help solve the problem.

    There are bound to be typos in these notes; I would be very pleased to hear of any that you

    spot.

    Contents

1. Inner product spaces
2. Length and the Cauchy–Schwarz inequality
3. Orthogonality and projection
4. Application to linear regression
5. Complex inner product spaces
6. Isometries and inner product spaces
7. Normal, Hermitian and unitary matrices
8. The definition of a group
9. Subgroups
10. Generators
11. Cosets
12. Equivalence relations
13. Homomorphisms
14. Normal subgroups
15. Quotient groups
16. The isomorphism theorems
17. The structure of groups I: the existence of p-elements
18. Group actions on sets
19. Conjugacy classes and groups of p-power order
20. Direct and semidirect products
21. The structure of groups II: Sylow's first theorem
22. Sylow's second theorem
23. Groups of order pq
24. Simple groups of small order
25. Free groups
26. Generators and relations


    PART I: INNER PRODUCT SPACES

    1. Inner product spaces

Let $V$ be a real vector space. Recall that this means that $V$ is a set equipped with two operations, vector addition ($(x, y) \mapsto x + y$ for $x, y \in V$) and scalar multiplication ($(\alpha, x) \mapsto \alpha x$ for $x \in V$ and $\alpha \in \mathbb{R}$), such that for all $x, y, z \in V$ and $\alpha, \beta \in \mathbb{R}$ the following hold:
a) $x + y = y + x$;
b) $(x + y) + z = x + (y + z)$;
c) there is an element $0 \in V$ with $0 + x = x = x + 0$;
d) there exists an element $x' \in V$ with $x + x' = 0 = x' + x$ (usually we write $x' = -x$);
e) $1x = x$ (where $1 \in \mathbb{R}$ is the multiplicative identity of $\mathbb{R}$);
f) $\alpha(\beta x) = (\alpha\beta)x$;
g) $\alpha(x + y) = \alpha x + \alpha y$;
h) $(\alpha + \beta)x = \alpha x + \beta x$.

1.1 Definition Let $V$ be a real vector space. An inner product on $V$ is a map $\langle\,{,}\,\rangle : V \times V \to \mathbb{R}$ such that for all $x, y, z \in V$ and $\alpha, \beta \in \mathbb{R}$ the following hold:
a) (symmetric) $\langle x, y\rangle = \langle y, x\rangle$;
b) (linear) $\langle \alpha x + \beta y, z\rangle = \alpha\langle x, z\rangle + \beta\langle y, z\rangle$; and,
c) (positive definite) $\langle x, x\rangle > 0$ whenever $x \neq 0$.

A real vector space $V$ is an inner product space if it has an inner product.

Notice that by combining (a) and (b) we also have
$$\langle x, \alpha y + \beta z\rangle = \langle \alpha y + \beta z, x\rangle = \alpha\langle y, x\rangle + \beta\langle z, x\rangle = \alpha\langle x, y\rangle + \beta\langle x, z\rangle.$$
Hence, $\langle\,{,}\,\rangle$ is linear in both variables; we say that $\langle\,{,}\,\rangle$ is bilinear.

1.2 Example We should start with some examples of inner product spaces.

a) Let $V = \mathbb{R}^n$. An element $x \in \mathbb{R}^n$ can be thought of as a row vector $x = (x_1, \ldots, x_n)$. (We could also use column vectors; however, this is slightly harder typographically.) Given $y = (y_1, \ldots, y_n) \in V$ define
$$\langle x, y\rangle = x \cdot y = (x_1, \ldots, x_n) \cdot (y_1, \ldots, y_n) = x_1y_1 + x_2y_2 + \cdots + x_ny_n,$$
where $x \cdot y$ is the standard dot product on $\mathbb{R}^n$. It is straightforward to check that this is an inner product.

b) Suppose that $a < b$ are two real numbers and let
$$V = C[a, b] = \{\, f : [a, b] \to \mathbb{R} \mid f \text{ is continuous} \,\}.$$
It is easy to check that $V$ is a real vector space and that
$$\langle f, g\rangle = \int_a^b f(x)g(x)\,dx, \qquad f, g \in V,$$
defines an inner product on $V$. Notice that this inner product is just the continuous analogue of (a).
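As a quick illustration (ours, not from the notes), both examples are easy to compute with numpy: the dot product directly, and the integral inner product on $C[a, b]$ approximated with the trapezoidal rule.

```python
import numpy as np

# Example 1.2(a): the dot product inner product on R^n.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 2.0])
print(x @ y)  # <x, y> = 1*4 + 2*(-1) + 3*2 = 8

# Example 1.2(b): <f, g> = integral of f(t)g(t) over [a, b],
# approximated on a fine grid.
def ip(f, g, a, b, n=10001):
    t = np.linspace(a, b, n)
    return np.trapz(f(t) * g(t), t)

print(ip(np.sin, np.cos, 0.0, np.pi))  # ~0: sin and cos are orthogonal in C[0, pi]
```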


1.3 Lemma Suppose that $W$ is a vector subspace of an inner product space $V$. Then $W$ is an inner product space.

Proof See tutorials.

1.4 Lemma Suppose that $x \in V$. Then $\langle x, 0\rangle = 0$. Consequently, $\langle x, x\rangle \geq 0$ for all $x \in V$.

Proof By linearity, $\langle x, 0\rangle = \langle x, 0 + 0\rangle = \langle x, 0\rangle + \langle x, 0\rangle$; hence, $\langle x, 0\rangle = 0$ as claimed. In particular, $\langle 0, 0\rangle = 0$, so $\langle x, x\rangle \geq 0$ for all $x \in V$ (since, by assumption, $\langle x, x\rangle > 0$ if $x \neq 0$).

2. Length and the Cauchy–Schwarz inequality

Suppose that $V$ is an inner product space. Then the length (or norm) of a vector $x \in V$ is defined to be $\|x\| = \sqrt{\langle x, x\rangle}$. Note that $\|x\| \geq 0$ for all $x \in V$, with equality if and only if $x = 0$ by Lemma 1.4. It is also easy to check that $\|\alpha x\| = |\alpha|\,\|x\|$ for all $\alpha \in \mathbb{R}$ and $x \in V$ (see tutorials).

    The key property relating lengths and inner products is the following.

2.1 Theorem (Cauchy–Schwarz inequality) Suppose that $x, y \in V$. Then
$$|\langle x, y\rangle| \leq \|x\|\,\|y\|.$$

Proof If $x = 0$ then $\langle x, y\rangle = 0$ and $\|x\| = 0$, so $|\langle x, y\rangle| = \|x\|\,\|y\|$ and we're done. Suppose now that $x \neq 0$. Then $\|x\|^2 = \langle x, x\rangle > 0$. For any real $\lambda$ we have, by linearity,
$$0 \leq \langle y - \lambda x, y - \lambda x\rangle = \langle y, y\rangle - \lambda\langle x, y\rangle - \lambda\langle y, x\rangle + \lambda^2\langle x, x\rangle = \|y\|^2 - 2\lambda\langle x, y\rangle + \lambda^2\|x\|^2.$$
Hence, the discriminant of this quadratic polynomial in $\lambda$ must be non-positive. Therefore, we have $\langle x, y\rangle^2 - \|x\|^2\|y\|^2 \leq 0$ and hence $\langle x, y\rangle^2 \leq \|x\|^2\|y\|^2$. Taking square roots gives the result.

2.2 Corollary Suppose that $x, y \in V$. Then $\|x + y\| \leq \|x\| + \|y\|$.

Proof Using the Cauchy–Schwarz inequality on the second line we have
$$\|x + y\|^2 = \langle x + y, x + y\rangle = \|x\|^2 + 2\langle x, y\rangle + \|y\|^2 \leq \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$
Hence, $\|x + y\| \leq \|x\| + \|y\|$ as claimed.
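A quick numerical sanity check (our own illustration): for random vectors both inequalities hold for the dot product.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)

lhs = abs(x @ y)                               # |<x, y>|
rhs = np.linalg.norm(x) * np.linalg.norm(y)    # ||x|| ||y||
assert lhs <= rhs                              # Cauchy-Schwarz (Theorem 2.1)
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)  # Corollary 2.2
```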


    3. Orthogonality and projection

Rearranging the Cauchy–Schwarz inequality (Theorem 2.1) shows that
$$-1 \leq \frac{\langle x, y\rangle}{\|x\|\,\|y\|} \leq 1,$$
for all nonzero $x, y \in V$. Therefore, there is a unique angle $\theta \in [0, \pi]$ such that
$$\cos\theta = \frac{\langle x, y\rangle}{\|x\|\,\|y\|}.$$
By definition $\theta$ is the angle between $x$ and $y$.

3.1 Definition Two vectors $x, y \in V$ are orthogonal if $\langle x, y\rangle = 0$. An orthogonal basis of $V$ is a basis $\{f_1, \ldots, f_n\}$ such that $f_i$ and $f_j$ are orthogonal whenever $i \neq j$.

    Orthogonality gives an easy way to test for linear independence.

3.2 Lemma Suppose that $\{f_1, f_2, \ldots\}$ is a set of pairwise orthogonal nonzero vectors in $V$ (that is, $\langle f_i, f_j\rangle = 0$ if $i \neq j$). Then $\{f_1, f_2, \ldots\}$ is linearly independent.

Proof See tutorials.

    Note that the following results also apply in the case W = V.

3.3 Proposition Suppose that $W$ is a subspace of an inner product space $V$, that $f_1, \ldots, f_m$ is an orthogonal basis of $W$, and let $w \in W$. Then
$$w = \sum_{i=1}^{m} \frac{\langle w, f_i\rangle}{\langle f_i, f_i\rangle}\, f_i.$$

Proof As $\{f_1, \ldots, f_m\}$ is a basis of $W$ we can write $w$ uniquely in the form $w = \sum_{i=1}^{m} \alpha_i f_i$ for some $\alpha_i \in \mathbb{R}$. Therefore, if $1 \leq j \leq m$ then by linearity
$$\langle w, f_j\rangle = \Big\langle \sum_{i=1}^{m} \alpha_i f_i, f_j\Big\rangle = \sum_{i=1}^{m} \alpha_i \langle f_i, f_j\rangle = \alpha_j \langle f_j, f_j\rangle,$$
since $\langle f_i, f_j\rangle = 0$ for $i \neq j$ by orthogonality. Hence, $\alpha_j = \langle w, f_j\rangle/\langle f_j, f_j\rangle$ as claimed. (Note that $\langle f_j, f_j\rangle > 0$ because $f_j \neq 0$.)

A vector $v \in V$ is normal if $\|v\| = 1$. An orthonormal basis of $V$ is an orthogonal basis $\{f_1, \ldots, f_n\}$ such that each $f_i$ is normal; equivalently,
$$\langle f_i, f_j\rangle = \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{otherwise.} \end{cases}$$
If $v$ is any nonzero vector in $V$ then we can normalize it by setting $\hat v = \frac{1}{\|v\|}\, v$. Consequently, we can always replace a basis of orthogonal vectors with a basis of orthonormal vectors.

    Using an orthonormal basis gives a slightly nicer reformulation of the last result.


3.4 Corollary Suppose that $W$ is a subspace of $V$, that $f_1, \ldots, f_m$ is an orthonormal basis of $W$, and let $w \in W$. Then
$$w = \sum_{i=1}^{m} \langle w, f_i\rangle f_i.$$

    These last two results are important because they have the following consequence.

3.5 Theorem Assume that $W$ has an orthonormal basis $\{f_1, \ldots, f_m\}$ and let $v \in V$. Then there exists a unique vector $w \in W$ such that
$$\langle v, x\rangle = \langle w, x\rangle, \quad \text{for all } x \in W;$$
indeed, $w = \sum_{i=1}^{m} \langle v, f_i\rangle f_i$. Moreover, $\|v - w\| < \|v - x\|$ whenever $x \in W$ and $x \neq w$.

Proof Let $w$ be an arbitrary element of $W$ and suppose that $\langle v, x\rangle = \langle w, x\rangle$ for all $x \in W$. Then, in particular, $\langle v, f_i\rangle = \langle w, f_i\rangle$ for $1 \leq i \leq m$; therefore, $w = \sum_{i=1}^{m} \langle v, f_i\rangle f_i$ by Corollary 3.4. This shows that if such a $w$ exists then it is unique (every element is uniquely determined by its expansion with respect to a basis). Conversely, if $x$ is an arbitrary element of $W$ then we need to show that $\langle w, x\rangle = \langle v, x\rangle$. Write $x = \sum_{i=1}^{m} \alpha_i f_i$, for $\alpha_i \in \mathbb{R}$. Then
$$\langle w, x\rangle = \Big\langle w, \sum_{i=1}^{m} \alpha_i f_i\Big\rangle = \sum_{i=1}^{m} \alpha_i \langle w, f_i\rangle = \sum_{i=1}^{m} \alpha_i \langle v, f_i\rangle = \Big\langle v, \sum_{i=1}^{m} \alpha_i f_i\Big\rangle = \langle v, x\rangle.$$
It remains to show that $\|v - w\| < \|v - x\|$ whenever $x \in W$ and $x \neq w$. By the first part, $\langle v - w, y\rangle = 0$ for all $y \in W$; in particular, $v - w$ is orthogonal to $w - x \in W$. Therefore,
$$\|v - x\|^2 = \|(v - w) + (w - x)\|^2 = \|v - w\|^2 + \|w - x\|^2 > \|v - w\|^2$$
if $x \neq w$, as claimed.

    In Theorem 3.5 we have assumed that W has an orthonormal basis; it turns out that this isalways true so this assumption can be dropped. Before we prove this we make a definition

    which will be useful in the proof.

3.6 Definition Suppose that $\{f_1, \ldots, f_m\}$ is an orthonormal basis of $W$, a subspace of $V$, and that $v \in V$. The orthogonal projection of $v$ onto $W$ is the vector
$$\pi_W(v) = \sum_{i=1}^{m} \langle v, f_i\rangle f_i.$$


The point of Theorem 3.5 is that $\|v - \pi_W(v)\| \leq \|v - x\|$ for all $x \in W$, with equality if and only if $x = \pi_W(v)$. In other words, $\pi_W(v)$ is the element of $W$ which is closest to $v$, and there is a unique such element (shortly we will see how this can be applied to linear regression problems). As a consequence, if $w \in W$ then $\pi_W(w) = w$, since $w$ is certainly the element of $W$ which is closest to itself! (This also follows by comparing the formulae of Corollary 3.4 and Theorem 3.5.) It follows that $\pi_W$ is a surjective map from $V$ onto $W$.

Note that we could equally well define $\pi_W(v)$ in terms of an orthogonal basis; more precisely, it is easy to check that if $\{g_1, \ldots, g_m\}$ is an orthogonal basis of $W$ then
$$\pi_W(v) = \sum_{i=1}^{m} \frac{\langle v, g_i\rangle}{\langle g_i, g_i\rangle}\, g_i.$$
(To see this set $f_i = \frac{1}{\|g_i\|}\, g_i$ for each $i$; then $\{f_1, \ldots, f_m\}$ is an orthonormal basis of $W$ and, moreover,
$$\langle v, f_i\rangle f_i = \Big\langle v, \tfrac{1}{\|g_i\|} g_i\Big\rangle \tfrac{1}{\|g_i\|} g_i = \tfrac{1}{\|g_i\|^2}\, \langle v, g_i\rangle g_i = \frac{\langle v, g_i\rangle}{\langle g_i, g_i\rangle}\, g_i.)$$

3.7 Example In Tutorial 1, question 6, it is shown that the functions
$$f_n(x) = \begin{cases} \sin(nx), & \text{if } n > 0, \\ \cos(nx), & \text{if } n \leq 0, \end{cases}$$
for $n \in \mathbb{Z}$, are pairwise orthogonal elements of $C[-\pi, \pi]$. Furthermore, $\langle f_0, f_0\rangle = 2\pi$ and $\langle f_n, f_n\rangle = \pi$ when $n \neq 0$. Let $C_N$ be the subspace of $C[-\pi, \pi]$ spanned by the functions $\{\, f_n \mid -N \leq n \leq N \,\}$ and let $f$ be a function in $C[-\pi, \pi]$. Then the projection of $f$ onto $C_N$ is
$$\pi_{C_N}(f) = \sum_{n=-N}^{N} \frac{\langle f, f_n\rangle}{\langle f_n, f_n\rangle}\, f_n = \frac{1}{2\pi}\, a_0 + \frac{1}{\pi} \sum_{n=1}^{N} \big( a_n \cos(nx) + b_n \sin(nx) \big),$$
where
$$a_n = \langle f, f_{-n}\rangle = \int_{-\pi}^{\pi} f(x)\cos(nx)\,dx \quad\text{and}\quad b_n = \langle f, f_n\rangle = \int_{-\pi}^{\pi} f(x)\sin(nx)\,dx.$$
If $f$ is $2\pi$-periodic then it is possible to show that $\pi_{C_N}(f)$ is a very good approximation to $f$ (when $N \gg 0$). This example is the start of Fourier analysis.
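Numerically (our illustration), the coefficients $a_n$, $b_n$ can be approximated by quadrature; for instance, projecting $f(x) = x$ onto $C_N$:

```python
import numpy as np

def fourier_projection(f, N, x, m=20001):
    # Evaluate pi_{C_N}(f) at the points x, using the a_n, b_n of Example 3.7.
    t = np.linspace(-np.pi, np.pi, m)
    out = np.trapz(f(t), t) / (2 * np.pi)                  # a_0 / (2 pi)
    for n in range(1, N + 1):
        a_n = np.trapz(f(t) * np.cos(n * t), t)
        b_n = np.trapz(f(t) * np.sin(n * t), t)
        out = out + (a_n * np.cos(n * x) + b_n * np.sin(n * x)) / np.pi
    return out

x = np.linspace(-2.0, 2.0, 5)
print(fourier_projection(lambda t: t, N=25, x=x))  # close to x itself
```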

We now prove that every finite dimensional inner product space has an orthonormal basis. The proof is constructive.

3.8 Theorem (Gram–Schmidt orthogonalization) Suppose that $V$ is a finite dimensional inner product space. Then $V$ has an orthonormal basis.

Proof Choose a basis $\{v_1, \ldots, v_n\}$ of $V$ and for $m = 1, \ldots, n$ let $V_m$ be the subspace of $V$ with basis $\{v_1, \ldots, v_m\}$. We show by induction on $m$ that $V_m$ has an orthonormal basis. If $m = 1$ then $\{f_1\}$ is an orthonormal basis of $V_1$, where we set $f_1 = \frac{1}{\|v_1\|}\, v_1$. Assume, by way of induction, that $\{f_1, \ldots, f_m\}$ is an orthonormal basis of $V_m$. Then Theorem 3.5 can be applied to $V_m$, so we can set $g_{m+1} = v_{m+1} - \pi_{V_m}(v_{m+1})$. Note that $g_{m+1}$ is nonzero because $\{v_1, \ldots, v_{m+1}\}$ is linearly independent. Moreover, if $1 \leq k \leq m$ then
$$\langle g_{m+1}, f_k\rangle = \langle v_{m+1} - \pi_{V_m}(v_{m+1}), f_k\rangle = \langle v_{m+1}, f_k\rangle - \langle \pi_{V_m}(v_{m+1}), f_k\rangle = 0$$


by Theorem 3.5. Set $f_{m+1} = \frac{1}{\|g_{m+1}\|}\, g_{m+1}$. Then $\{f_1, \ldots, f_{m+1}\}$ is an orthonormal basis of $V_{m+1}$, completing the proof of the inductive step. As $V = V_n$ the theorem follows.

The proof of Theorem 3.8 gives us a way of constructing an orthonormal basis of $V$ given a basis; namely, if $\{v_1, \ldots, v_n\}$ is a basis of $V$ then $\{g_1, \ldots, g_n\}$ is an orthogonal basis of $V$ where
$$\begin{aligned}
g_1 &= v_1 \\
g_2 &= v_2 - \pi_{V_1}(v_2) = v_2 - \frac{\langle v_2, g_1\rangle}{\langle g_1, g_1\rangle}\, g_1 \\
g_3 &= v_3 - \pi_{V_2}(v_3) = v_3 - \frac{\langle v_3, g_1\rangle}{\langle g_1, g_1\rangle}\, g_1 - \frac{\langle v_3, g_2\rangle}{\langle g_2, g_2\rangle}\, g_2 \\
&\;\;\vdots \\
g_n &= v_n - \pi_{V_{n-1}}(v_n) = v_n - \frac{\langle v_n, g_1\rangle}{\langle g_1, g_1\rangle}\, g_1 - \cdots - \frac{\langle v_n, g_{n-1}\rangle}{\langle g_{n-1}, g_{n-1}\rangle}\, g_{n-1}
\end{aligned}$$
By normalizing $\{g_1, \ldots, g_n\}$ we obtain an orthonormal basis of $V$. Incidentally, in carrying out the Gram–Schmidt algorithm we do not need to assume that $\{v_1, \ldots, v_n\}$ is a basis of $V$; a spanning set is enough. To see this assume that $\{v_1, \ldots, v_m\}$ is linearly independent and let $V_m$ be the space spanned by these vectors. By Theorem 3.5, $v_{m+1} \in V_m$ if and only if $v_{m+1} = \pi_{V_m}(v_{m+1})$; in the notation above, this is equivalent to saying that $g_{m+1} = 0$. Consequently, the Gram–Schmidt algorithm will refine a spanning set to an orthogonal basis.
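A direct transcription of this recursion into numpy might look as follows (our sketch); it also discards the zero vectors $g_{m+1} = 0$ that arise when the input is merely a spanning set.

```python
import numpy as np

def gram_schmidt(vs, tol=1e-12):
    # Refine a spanning list vs of vectors into an orthogonal basis (Theorem 3.8).
    gs = []
    for v in vs:
        g = v.astype(float)
        for h in gs:                        # subtract the projection onto <gs>
            g = g - (v @ h) / (h @ h) * h
        if np.linalg.norm(g) > tol:         # g == 0 means v was dependent: skip it
            gs.append(g)
    return gs

basis = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                      np.array([1.0, 0.0, 1.0]),
                      np.array([2.0, 1.0, 1.0])])   # dependent; it is dropped
print(len(basis), round(basis[0] @ basis[1], 12))   # 2 0.0
```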

Finally, we remark that the orthonormal basis of $V$ produced by Gram–Schmidt orthogonalization is not unique (that is, $V$ has many different orthonormal bases). Rather, it depends on the choice of initial basis $\{v_1, \ldots, v_n\}$ and the order in which these basis elements are listed. For example, $\frac{1}{\|v_1\|}\, v_1$ always belongs to the resulting orthonormal basis.

    4. Application to linear regression

As an example we now address the following problem: given a collection of random variables $X$ and $Y$, how can we determine their line $Y = aX + b$ of best fit? At first sight this does not relate to inner product spaces; however, we will see that it is a direct application of Theorem 3.5. In principle, the technique that we describe can be used to fit any type of curve to a collection of data; cf. Example 3.7.

Consider a collection of data points $(x_1, y_1), \ldots, (x_n, y_n)$; for example, coming from an experiment. We want to determine the line $Y = aX + b$ which best describes this data. Let $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ and $z = (1, \ldots, 1)$. We want to find $a, b \in \mathbb{R}$ such that $y \approx ax + bz$. In other words, we want to find the projection $\pi_W(y)$ of $y$ onto the subspace $W$ of $\mathbb{R}^n$ spanned by $x$ and $z$. Applying the Gram–Schmidt algorithm to the basis $\{z, x\}$ of $W$ yields the orthogonal basis $\{g_1, g_x\}$, where $g_1 = z$, $g_x = x - \bar{x}z$ and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the mean of the $x_i$. Hence, the projection of $y$ onto $W$ is
$$\pi_W(y) = \frac{\langle y, g_1\rangle}{\langle z, z\rangle}\, g_1 + \frac{\langle y, g_x\rangle}{\langle g_x, g_x\rangle}\, g_x = \bar{y}\, g_1 + \frac{\langle y, g_x\rangle}{S_x^2}\, g_x,$$


where $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $S_x^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$. Noting that $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ we have
$$\langle y, g_x\rangle = \langle y, x - \bar{x}z\rangle = \sum_{i=1}^{n} (y_ix_i - \bar{x}y_i) = \sum_{i=1}^{n} \big( y_ix_i - \bar{x}y_i - \bar{y}(x_i - \bar{x}) \big) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y});$$
call this last quantity $S_{xy}$ (it is the covariance of $X$ and $Y$). Then
$$\pi_W(y) = \bar{y}\, g_1 + \frac{S_{xy}}{S_x^2}\, g_x = \bar{y}z + \frac{S_{xy}}{S_x^2}(x - \bar{x}z) = \frac{S_{xy}}{S_x^2}\, x + \Big( \bar{y} - \frac{S_{xy}}{S_x^2}\bar{x} \Big) z.$$

This gives us the coefficients $a = \frac{S_{xy}}{S_x^2}$ and $b = \bar{y} - \frac{S_{xy}}{S_x^2}\bar{x}$, and hence the line of best fit. With a little more work it is possible to show that the correlation coefficient $r = S_{xy}/(S_xS_y)$ measures how well the data is described by a linear model. It is always true that $|r| \leq 1$, and the closer $|r|$ is to $1$ the better the fit.

    5. Complex inner product spaces

    Now we extend the definition of inner product spaces to complex vector spaces. We need to

    relax the assumption that the inner product is symmetric.

    Throughout this section we will assume that V is a complex inner product space.

5.1 Definition Let $V$ be a complex vector space. An inner product on $V$ is a map $\langle\,{,}\,\rangle : V \times V \to \mathbb{C}$ such that for all $x, y, z \in V$ and $\alpha, \beta \in \mathbb{C}$ the following hold:
a) $\langle x, y\rangle = \overline{\langle y, x\rangle}$;
b) $\langle \alpha x + \beta y, z\rangle = \alpha\langle x, z\rangle + \beta\langle y, z\rangle$; and,
c) $\langle x, x\rangle > 0$ whenever $x \neq 0$.

A complex vector space $V$ is a complex inner product space if it has an inner product.

Here we write $\bar\alpha$ for the complex conjugate of the complex number $\alpha \in \mathbb{C}$. A complex inner product is not bilinear, but it is close to being so. This time, by combining (a) and (b), we obtain
$$\langle x, \alpha y + \beta z\rangle = \overline{\langle \alpha y + \beta z, x\rangle} = \bar\alpha\,\overline{\langle y, x\rangle} + \bar\beta\,\overline{\langle z, x\rangle} = \bar\alpha\langle x, y\rangle + \bar\beta\langle x, z\rangle.$$
More generally, we have
$$\langle \alpha w + \beta x, \gamma y + \delta z\rangle = \alpha\bar\gamma\langle w, y\rangle + \alpha\bar\delta\langle w, z\rangle + \beta\bar\gamma\langle x, y\rangle + \beta\bar\delta\langle x, z\rangle, \tag{5.2}$$
for vectors $w, x, y, z \in V$ and $\alpha, \beta, \gamma, \delta \in \mathbb{C}$. Notice that if $\gamma$ and $\delta$ are both real numbers then $\bar\gamma = \gamma$ and $\bar\delta = \delta$ and $\langle\,{,}\,\rangle$ is honestly bilinear; thus, the complex inner product is a true generalization of the real inner product. In general, complex inner products are linear in the first variable and conjugate linear in the second.

    5.3 Example As with the real case, we have two natural examples of complex inner product

    spaces.


a) Let $V = \mathbb{C}^n$. An element $x \in \mathbb{C}^n$ can be thought of as a row vector $x = (x_1, \ldots, x_n)$. Given $y = (y_1, \ldots, y_n) \in V$ define
$$\langle x, y\rangle = x \cdot \bar{y} = x_1\bar{y}_1 + x_2\bar{y}_2 + \cdots + x_n\bar{y}_n,$$
where $\bar y$ is the componentwise complex conjugate of $y$ and $x \cdot y$ is the standard dot product on $\mathbb{C}^n$. It is straightforward to check that this is an inner product.

b) Suppose that $D \subseteq \mathbb{C}$ and let $V = C(D) = \{\, f : D \to \mathbb{C} \mid f \text{ is continuous} \,\}$. If you knew some complex analysis then it would be easy to check that $V$ is a complex vector space and that
$$\langle f, g\rangle = \int_D f(z)\overline{g(z)}\,dz, \qquad f, g \in V,$$
defines an inner product on $V$.
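In code (our illustration), the conjugate in the second slot is the easy thing to get wrong, so here is the inner product of Example 5.3(a) written out explicitly:

```python
import numpy as np

def cip(x, y):
    # <x, y> = sum_i x_i * conj(y_i): linear in x, conjugate linear in y.
    return np.sum(x * np.conj(y))

x = np.array([1 + 2j, 3j])
y = np.array([2 - 1j, 1 + 1j])
print(cip(x, y))             # (3+8j)
print(np.conj(cip(y, x)))    # (3+8j) as well: axiom (a)
print(cip(x, x))             # (14+0j): real and positive, as it must be
```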

    Exactly as before (see Lemma 1.4) we obtain:

5.4 Lemma Suppose that $x \in V$. Then $\langle x, 0\rangle = 0$. Consequently, $\langle x, x\rangle \geq 0$ for all $x \in V$.

Note that if $x \in V$ then $\langle x, x\rangle = \overline{\langle x, x\rangle}$, so $\langle x, x\rangle$ is real. Therefore, we define the length of $x$ to be the real number $\|x\| = \sqrt{\langle x, x\rangle}$. We could prove the Cauchy–Schwarz inequality by modifying the proof of Theorem 2.1; you might like to try to do this. Rather, at the end of this section we will give a more conceptual proof using orthogonal projections. This will also give us another proof in the real case.

Recall that if $\alpha = a + ib \in \mathbb{C}$ then $|\alpha| = \sqrt{a^2 + b^2}$ is the modulus of $\alpha$. Note that $|\alpha|^2 = \alpha\bar\alpha$.

5.5 Lemma Suppose that $x \in V$ and $\alpha \in \mathbb{C}$. Then $\|\alpha x\| = |\alpha|\,\|x\|$.

Proof By direct calculation, using (5.2), we see that
$$\|\alpha x\|^2 = \langle \alpha x, \alpha x\rangle = \alpha\bar\alpha\langle x, x\rangle = |\alpha|^2\|x\|^2.$$
Now take square roots of both sides.

As before, we say that two vectors $x, y \in V$ are orthogonal if $\langle x, y\rangle = 0$. As before, an orthogonal basis of $V$ is a basis consisting of pairwise orthogonal vectors, and an orthonormal basis is a basis $\{f_1, \ldots, f_n\}$ such that $\langle f_i, f_j\rangle = \delta_{ij}$ for $1 \leq i, j \leq n$.

Repeating the argument of Proposition 3.3 word for word (for variation we'll prove Corollary 3.4 directly, the corresponding result using an orthonormal basis) we obtain:

5.6 Proposition Suppose that $W$ is a subspace of $V$, that $f_1, \ldots, f_m$ is an orthonormal basis of $W$, and let $w \in W$. Then
$$w = \sum_{i=1}^{m} \langle w, f_i\rangle f_i.$$

Proof As $\{f_1, \ldots, f_m\}$ is a basis of $W$ we can write $w$ uniquely in the form $w = \sum_{i=1}^{m} \alpha_i f_i$ for some $\alpha_i \in \mathbb{C}$. Therefore, if $1 \leq j \leq m$ then by linearity
$$\langle w, f_j\rangle = \Big\langle \sum_{i=1}^{m} \alpha_i f_i, f_j\Big\rangle = \sum_{i=1}^{m} \alpha_i \langle f_i, f_j\rangle = \alpha_j,$$


since $\langle f_i, f_j\rangle = \delta_{ij}$. Hence, $\alpha_j = \langle w, f_j\rangle$ as claimed.

    With this result in hand it is reasonably straightforward to generalize Theorem 3.5; you will

    find the proof when you do the first assignment.

5.7 Theorem Assume that $W$ has an orthonormal basis $\{f_1, \ldots, f_m\}$ and let $v \in V$. Then there exists a unique vector $w \in W$ such that
$$\langle v, x\rangle = \langle w, x\rangle, \quad \text{for all } x \in W;$$
indeed, $w = \sum_{i=1}^{m} \langle v, f_i\rangle f_i$. Moreover, $\|v - w\| < \|v - x\|$ whenever $x \in W$ and $x \neq w$.

    Proof See assignment 1.

Again, we define the orthogonal projection of $v \in V$ onto $W$ to be the vector
$$\pi_W(v) = \sum_{i=1}^{m} \langle v, f_i\rangle f_i.$$

    Repeating the proof of Theorem 3.8 we find that every finite dimensional complex inner product

    space also has an orthogonal basis.

    5.8 Corollary Every finite dimensional complex inner product space has an orthogonal basis.

Now we do something different and use orthogonal projections to prove the Cauchy–Schwarz inequality for complex inner product spaces. The key fact that we need is obvious from a geometrical viewpoint (at least in the real case). For the proof note that if $\alpha \in \mathbb{C}$ then $\alpha\bar\alpha = |\alpha|^2$.

5.9 Proposition Suppose that $W$ is a subspace of $V$. Then $\|v\| \geq \|\pi_W(v)\|$ for all $v \in V$.

Proof By the results above we can choose an orthonormal basis $\{f_1, \ldots, f_m\}$ for $W$. Extend $\{f_1, \ldots, f_m\}$ to a basis $\{f_1, \ldots, f_m, v_{m+1}, \ldots, v_n\}$ of $V$. By applying the Gram–Schmidt algorithm to this basis we obtain an orthonormal basis of $V$ of the form $\{f_1, \ldots, f_m, f_{m+1}, \ldots, f_n\}$; in particular, the first $m$ elements of this basis are our original orthonormal basis of $W$.

Let $\alpha_i = \langle v, f_i\rangle$ for $1 \leq i \leq n$. Then $v = \sum_{i=1}^{n} \alpha_i f_i$ and $\pi_W(v) = \sum_{i=1}^{m} \alpha_i f_i$ by Proposition 5.6. Therefore,
$$\|\pi_W(v)\|^2 = \langle \pi_W(v), \pi_W(v)\rangle = \Big\langle \sum_{i=1}^{m} \alpha_i f_i, \sum_{j=1}^{m} \alpha_j f_j\Big\rangle = \sum_{i=1}^{m} \alpha_i\bar\alpha_i = \sum_{i=1}^{m} |\alpha_i|^2.$$

Similarly, $\|v\|^2 = \sum_{i=1}^{n} |\alpha_i|^2 \geq \sum_{i=1}^{m} |\alpha_i|^2 = \|\pi_W(v)\|^2$, completing the proof.

As promised, this gives us a different proof of the Cauchy–Schwarz inequality for complex inner product spaces.

5.10 Corollary Suppose that $x, y \in V$. Then $|\langle x, y\rangle| \leq \|x\|\,\|y\|$.


Proof If $x = 0$ then there is nothing to prove, so suppose that $x \neq 0$. Let $W = \mathbb{C}x$ be the one dimensional subspace of $V$ spanned by $x$. Then, by Proposition 5.9 and Lemma 5.5,
$$\|y\| \geq \|\pi_W(y)\| = \Big\| \frac{\langle y, x\rangle}{\langle x, x\rangle}\, x \Big\| = \frac{|\langle y, x\rangle|}{\|x\|^2}\, \|x\| = \frac{|\langle x, y\rangle|}{\|x\|}.$$
Rearranging this inequality gives the result.

Of course, this gives another proof of Theorem 2.1 when $V$ is a real inner product space. Notice that, unlike the case of real inner product spaces, Corollary 5.10 does not lead to a natural definition of the angle between two vectors $x, y \in V$ because, in general, $\frac{\langle x, y\rangle}{\|x\|\,\|y\|}$ will be a complex number.

    6. Isometries and inner product spaces

Linear transformations (or, equivalently, matrices) are the maps between vector spaces which preserve the vector space structure. We now consider those homomorphisms of inner product spaces which preserve the inner product. Throughout we will consider inner product spaces which are either real or complex vector spaces.

6.1 Definition Suppose that $V$ and $W$ are (real or complex) inner product spaces. A linear transformation $T : V \to W$ is an isometry if it preserves lengths; that is, $\|T(x)\| = \|x\|$ for all $x \in V$.

Strictly speaking an isometry should be any map which preserves lengths; however, as far as we're concerned, at least when it comes to vector spaces, all maps are linear.

    In fact, isometries are precisely those linear transformations which preserve inner products.

6.2 Proposition Suppose that $V$ and $W$ are inner product spaces and that $T : V \to W$ is a linear transformation. Then $T$ is an isometry if and only if $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.

Proof One direction is trivial. Suppose that $\langle T(x), T(y)\rangle = \langle x, y\rangle$, for all $x, y \in V$. Then, in particular, $\|T(x)\|^2 = \langle T(x), T(x)\rangle = \langle x, x\rangle = \|x\|^2$ for all $x \in V$. Hence, $T$ is an isometry.

Conversely, suppose that $\|T(x)\| = \|x\|$ for all $x \in V$. Then for any $x, y \in V$ we have
$$\langle T(x), T(y)\rangle + \langle T(y), T(x)\rangle = \|T(x + y)\|^2 - \|T(x)\|^2 - \|T(y)\|^2 = \|x + y\|^2 - \|x\|^2 - \|y\|^2 = \langle x, y\rangle + \langle y, x\rangle.$$
If $V$ and $W$ are real inner product spaces then this shows that $\langle T(x), T(y)\rangle = \langle x, y\rangle$, so we're done. If $V$ and $W$ are complex inner product spaces then this argument only shows that the real parts of $\langle T(x), T(y)\rangle$ and $\langle x, y\rangle$ are equal. The same argument applied to $\langle T(ix), T(y)\rangle = i\langle T(x), T(y)\rangle$ shows that the imaginary parts of $\langle T(x), T(y)\rangle$ and $\langle x, y\rangle$ are equal (here, $i = \sqrt{-1}$). Hence, $\langle T(x), T(y)\rangle = \langle x, y\rangle$ as claimed.

It is often convenient to use the fact that isometries preserve inner products (rather than just lengths); we will use this fact freely from now on. (By "freely" what I really mean is that I will apply Proposition 6.2 automatically from now on without explicitly mentioning that I am doing so!)

Recall from the Linear Algebra course that if $T : V \to W$ is a linear transformation then the kernel of $T$ is $\ker T = \{\, v \in V \mid T(v) = 0 \,\}$, a subspace of $V$, and the image of $T$ is $\operatorname{im} T = \{\, T(v) \mid v \in V \,\}$, a subspace of $W$.


6.3 Lemma Suppose that $T : V \to W$ is an isometry. Then $\ker T = 0$ and so $V \cong \operatorname{im} T$.

Proof Suppose that $x \in \ker T$; that is, $T(x) = 0$. Then $0 = \|T(x)\| = \|x\|$; so $x = 0$ by Lemma 5.4. Hence, $\ker T = 0$.

For the second claim, the Rank–Nullity theorem says that $\dim V = \dim \ker T + \dim \operatorname{im} T$. (A more familiar statement for you is probably that $\dim V = \operatorname{rank} T + \operatorname{nullity} T$; by definition, $\operatorname{rank} T$ is the dimension of the image of $T$ and $\operatorname{nullity} T$ is the dimension of the kernel.) Therefore, $\dim V = \dim \operatorname{im} T$ since $\ker T = 0$. However, two vector spaces are isomorphic if and only if they have the same dimension, so $V \cong \operatorname{im} T$ as claimed. (Later we will see that this is a special case of the first isomorphism theorem.)

In light of this result we may as well restrict our attention to the isometries $T : V \to V$ which map $V$ to $V$; by the Lemma such isometries are isomorphisms.

6.4 Theorem Suppose that $V$ is an inner product space and that $T : V \to V$ is a linear transformation. Then $T$ is an isometry if and only if $T$ maps every orthonormal basis of $V$ to an orthonormal basis of $V$.

Proof For the proof, fix an orthonormal basis $\{f_1, \ldots, f_n\}$ of $V$; then $\langle f_i, f_j\rangle = \delta_{ij}$ for all $1 \leq i, j \leq n$.

Suppose first that $T$ is an isometry. Then we have $\langle T(f_i), T(f_j)\rangle = \langle f_i, f_j\rangle = \delta_{ij}$, so that $\{T(f_1), \ldots, T(f_n)\}$ is also an orthonormal basis of $V$.

The harder part is the converse. Let $x, y \in V$ and write $x = \sum_{i=1}^{n} \alpha_i f_i$ and $y = \sum_{j=1}^{n} \beta_j f_j$. Then, by bilinearity and (5.2),
$$\langle x, y\rangle = \Big\langle \sum_{i=1}^{n} \alpha_i f_i, \sum_{j=1}^{n} \beta_j f_j\Big\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\bar\beta_j \langle f_i, f_j\rangle = \sum_{i=1}^{n} \alpha_i\bar\beta_i.$$
Moreover, since $T$ is linear, $T(x) = \sum_{i=1}^{n} \alpha_i T(f_i)$ and $T(y) = \sum_{j=1}^{n} \beta_j T(f_j)$. Therefore,
$$\langle T(x), T(y)\rangle = \Big\langle \sum_{i=1}^{n} \alpha_i T(f_i), \sum_{j=1}^{n} \beta_j T(f_j)\Big\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\bar\beta_j \langle T(f_i), T(f_j)\rangle = \sum_{i=1}^{n} \alpha_i\bar\beta_i,$$
the last equality following because $\{T(f_1), \ldots, T(f_n)\}$ is also an orthonormal basis of $V$. Hence, comparing these two equations, $\langle T(x), T(y)\rangle = \langle x, y\rangle$ as required.

Fix an orthonormal basis of $V$; in fact, we may as well assume that our basis is the standard basis of column vectors $\{e_1, \ldots, e_n\}$ for $V$. Then a linear transformation $T$ corresponds to left multiplication by the matrix $A_T = (a_{ij})$ where
$$T(e_j) = \sum_{i=1}^{n} a_{ij} e_i;$$
so the $j$th column of $A_T$ is the column vector describing the vector $T(e_j)$ in terms of the standard basis $\{e_1, \ldots, e_n\}$ of $V$. (In the Linear Algebra course the matrix $A_T$ was usually denoted by $[T]_B^B$, where $B$ is the basis $\{e_1, \ldots, e_n\}$.) This gives a correspondence, $T \leftrightarrow A_T$, between the set of linear transformations on $V$ and the set of $n \times n$ matrices. Since we have fixed a basis we can (and do) identify vectors in $V$ with column vectors (relative to our fixed orthonormal basis). We now characterize isometries in terms of matrices.


6.5 Corollary Suppose that $T : V \to V$ is a linear transformation and define the matrix $A_T$ as above. Then $T$ is an isometry if and only if the columns of $A_T$ give an orthonormal basis of $V$.

Proof With respect to the basis $\{e_1, \ldots, e_n\}$, the vector $T(e_j)$ corresponds to the column vector $(a_{1j}, \ldots, a_{nj})^t$, which is column $j$ of the matrix $A_T$; now apply Theorem 6.4.

In other words, the different isometries of $V$ correspond to (permutations of) the orthonormal bases of $V$.

A similar argument establishes the same result for the rows of $A_T$.

6.6 Corollary Suppose that $T$ is an isometry. Then the rows of $A_T$ correspond to an orthonormal basis of $V$.

6.7 Definition Suppose that $A = (a_{ij})$ is a matrix (with real or complex entries). Then the conjugate transpose of $A$ is the matrix $A^* = (\bar{a}_{ji})$.

In particular, if all of the entries of $A$ are real then $A^* = A^t$ is just the transpose of $A$. In general, $A^* = (\bar{A})^t = \overline{A^t}$, where $\bar{A} = (\bar{a}_{ij})$ is the matrix whose entries are the complex conjugates of the entries of $A$. Using standard facts about transposes of matrices it follows that $(A^*)^* = A$, $(AB)^* = B^*A^*$, $(A + B)^* = A^* + B^*$ and so on; see tutorials.

6.8 Corollary Suppose that $T : V \to V$ is a linear transformation. Then $T$ is an isometry if and only if $A_T^*A_T = 1$.

Proof As in the proof of Corollary 6.5, the $j$th column of $A_T$ is the vector $T(e_j)$. Also, by Theorem 6.4 the basis $\{T(e_1), \ldots, T(e_n)\}$ is orthonormal. Therefore, $\langle T(e_i), T(e_j)\rangle = \delta_{ij}$. Rewriting this equation in terms of the matrix $A_T = (a_{ij})$, it becomes
$$\delta_{ij} = \langle T(e_i), T(e_j)\rangle = \Big\langle \sum_{k=1}^{n} a_{ki}e_k, \sum_{l=1}^{n} a_{lj}e_l\Big\rangle = \sum_{k=1}^{n} a_{ki}\bar{a}_{kj} = \sum_{k=1}^{n} (A_T^*)_{jk}\, a_{ki}.$$
In other words, the $(j, i)$th entry of $A_T^*A_T$ is $\delta_{ij}$. Hence, $A_T^*A_T = 1$.

We can rephrase this argument so as to give a clearer explanation of what is really happening. If $x = (x_1, \ldots, x_n)^t$ and $y = (y_1, \ldots, y_n)^t$ are two column vectors then $\langle x, y\rangle = y^*x$; remember that we are now identifying elements of $V$ with column vectors (with respect to the basis $\{e_1, \ldots, e_n\}$). In the proof above, the column vector $A_Te_i$ is the $i$th column of $A_T$ and the row vector $(A_Te_j)^* = e_j^*A_T^*$ is the $j$th row of $A_T^*$; so,
$$\delta_{ij} = \langle e_i, e_j\rangle = \langle T(e_i), T(e_j)\rangle = (A_Te_j)^*(A_Te_i) = e_j^*A_T^*A_Te_i$$
is the $(j, i)$th entry of the matrix $A_T^*A_T$. This is precisely the claim that $A_T^*A_T = 1$.
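Corollary 6.8 is easy to test numerically (our illustration): a rotation matrix is an isometry of $\mathbb{R}^2$, and indeed satisfies $A^*A = 1$ and preserves lengths.

```python
import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # rotation through theta

print(np.allclose(A.conj().T @ A, np.eye(2)))      # A* A = 1  (Corollary 6.8)
x = np.array([3.0, 4.0])
print(np.linalg.norm(A @ x), np.linalg.norm(x))    # both 5.0: lengths preserved
```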

    7. Normal, Hermitian and unitary matrices

Now that we have a matrix theoretic characterization of isometries we concentrate upon understanding the corresponding matrices. It turns out to be easier to study a larger class of matrices, and the additional matrices which arise are also important from the point of view of harmonic analysis and quadratic forms (not that we'll talk about these subjects in this course).


7.1 Definition Let $A$ be a matrix with complex entries. Then
a) $A$ is normal if $AA^* = A^*A$;
b) $A$ is Hermitian if $A = A^*$; and,
c) $A$ is unitary if $AA^* = 1$.

Let $A$ be a matrix with real entries. Then
a) $A$ is normal if $AA^t = A^tA$;
b) $A$ is symmetric if $A = A^t$; and,
c) $A$ is orthogonal if $AA^t = 1$.

Now, $T : V \to V$ is an isometry if and only if $A_T$ is a unitary matrix by Corollary 6.8; so these are the matrices which we care about the most. Notice that every unitary matrix is normal (if $A$ is unitary then $A^{-1} = A^*$, so $AA^* = 1 = A^*A$). Similarly, every Hermitian matrix is also normal. Finally, observe that if $A$ is normal then $A^*A = AA^*$ is Hermitian.

The main result that we want to prove is that if $N$ is a normal matrix then there exists a unitary matrix $T$ such that $T^{-1}NT = T^*NT$ is a diagonal matrix; in particular, this result will apply to unitary matrices and hence to isometries. Now we know that every matrix is conjugate to its Jordan canonical form (since we are working over the complex field $\mathbb{C}$, which is algebraically closed), so what we really need to show is that $V$ has an orthogonal basis which consists of eigenvectors for $N$.

    Before we start the proof we backtrack and explain the real significance of the conjugate

    transpose of a matrix.

7.2 Lemma Suppose that $A$ is any $n \times n$ matrix. Then $\langle Ax, y\rangle = \langle x, A^*y\rangle$ for all $x, y \in V$.

Proof Recalling our identification of elements of $V$ with column vectors, if $x, y \in V$ then $\langle Ax, y\rangle = y^*(Ax) = (A^*y)^*x = \langle x, A^*y\rangle$.

7.3 Proposition Suppose that $N$ is a normal matrix and that $v \in V$ is an eigenvector of $N$ with eigenvalue $\lambda$. Then $N^*v = \bar\lambda v$; that is, $v$ is an eigenvector of $N^*$ with eigenvalue $\bar\lambda$.

Proof First consider the case when $\lambda = 0$; that is, $Nv = 0$. Then $N^*Nv = 0$, so $v$ is also an eigenvector of $N^*N = NN^*$ (since $N$ is normal). Therefore, by Lemma 7.2,
$$0 = \langle NN^*v, v\rangle = \langle N^*v, N^*v\rangle = \|N^*v\|^2;$$
whence $N^*v = 0$ by Lemma 5.4.

Now consider the general case where $Nv = \lambda v$ (and $\lambda$ is not necessarily $0$). Let $N_\lambda = N - \lambda I_n$, where $I_n$ is the $n \times n$ identity matrix (and $n = \dim V$). Then $N_\lambda^* = N^* - \bar\lambda I_n$ and
$$N_\lambda N_\lambda^* = (N - \lambda I_n)(N^* - \bar\lambda I_n) = NN^* - \bar\lambda N - \lambda N^* + \lambda\bar\lambda I_n = N^*N - \bar\lambda N - \lambda N^* + \lambda\bar\lambda I_n = (N^* - \bar\lambda I_n)(N - \lambda I_n) = N_\lambda^*N_\lambda.$$
So, $N_\lambda$ is also normal. Now, $N_\lambda v = Nv - \lambda v = 0$ since $v$ is an eigenvector of $N$. Therefore, $N_\lambda^*v = 0$ by the first paragraph of the proof. Expanding this equation we find that $N^*v = \bar\lambda v$, as we wanted to show.

7.4 Corollary a) Suppose that $A$ is a Hermitian matrix and that $\lambda$ is an eigenvalue of $A$. Then $\lambda \in \mathbb{R}$.


b) Suppose that $A$ is a unitary matrix and that $\lambda$ is an eigenvalue of $A$. Then $|\lambda| = 1$.

    Proof See tutorials.

In particular, notice that this says that all of the eigenvalues of a symmetric matrix are real. Notice that part (b) also follows from the fact that left multiplication by a unitary matrix is an isometry.

7.5 Lemma Suppose that $N$ is a normal matrix and that $v$ and $w$ are two eigenvectors of $N$ with eigenvalues $\lambda$ and $\mu$ respectively. Then $\langle v, w\rangle = 0$ unless $\lambda = \mu$.

Proof Suppose that $\lambda \neq \mu$. Then using Lemma 7.2, once again, and Proposition 7.3 we have
$$\lambda\langle v, w\rangle = \langle Nv, w\rangle = \langle v, N^*w\rangle = \langle v, \bar\mu w\rangle = \mu\langle v, w\rangle.$$
As $\lambda \neq \mu$ this gives the result.

    Finally, we need a more technical Lemma.

7.6 Lemma Suppose that $N$ is a normal matrix and that $(N - \lambda I_n)^kv = 0$ for some nonzero vector $v \in V$, some $\lambda \in \mathbb{C}$ (or $\mathbb{R}$) and an integer $k \geq 1$. Then $(N - \lambda I_n)v = 0$; that is, $v$ is an eigenvector of $N$.

Proof We first make a series of reductions. As in the proof of Proposition 7.3 we may assume that $\lambda = 0$, by replacing $N$ with $N_\lambda = N - \lambda I_n$ if necessary; thus we have that $N^kv = 0$. Next, since $N^kv = 0$ we certainly have $(N^*)^kN^kv = 0$. However, $N^*N = NN^*$, so $(N^*)^kN^k = (N^*N)^k$ and we have $(N^*N)^kv = 0$. Let $A = N^*N$; then $A$ is Hermitian. Assume, for the moment, that we know that $Av = 0$. Then
$$0 = \langle Av, v\rangle = \langle N^*Nv, v\rangle = \langle Nv, Nv\rangle = \|Nv\|^2$$
by Lemma 7.2; so $Nv = 0$ by Lemma 5.4.

Hence, it is enough to prove that if $A$ is a Hermitian matrix such that $A^kv = 0$ then $Av = 0$. Choose $m$ large enough so that $2^m \geq k$. Then $A^{2^m}v = 0$. Now, $A$ is Hermitian so $A = A^*$ and, consequently, $A^{2^m} = (A^{2^{m-1}})^2 = (A^*)^{2^{m-1}}A^{2^{m-1}} = (A^{2^{m-1}})^*A^{2^{m-1}}$. Therefore, by Lemma 7.2 once again,
$$0 = \langle A^{2^m}v, v\rangle = \langle (A^{2^{m-1}})^*A^{2^{m-1}}v, v\rangle = \langle A^{2^{m-1}}v, A^{2^{m-1}}v\rangle.$$
So, $A^{2^{m-1}}v = 0$ by Lemma 5.4. If $m > 1$ then we can repeat this argument, and eventually we will find that $Av = 0$, as we wished to show.

    We are now ready to prove the main result of this section.

7.7 Theorem Suppose that $N$ is a normal matrix. Then there exists a unitary matrix $T$ such that $T^*NT = T^{-1}NT$ is a diagonal matrix.


Proof Let $J = P^{-1}NP$ be the Jordan canonical form of $N$ and let $\{v_1, \ldots, v_n\}$ be the corresponding Jordan basis. (Thus, $v_i = Pe_i$, for all $i$, and $J$ is the matrix which describes the linear transformation determined by $N$ relative to the basis $\{v_1, \ldots, v_n\}$.) Now suppose that $J$ contains a Jordan block of the form
$$\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & \cdots & 0 & \lambda
\end{pmatrix}$$
and let $\{v_{i_1}, \ldots, v_{i_m}\}$ be the corresponding basis elements (so $Nv_{i_a} = \lambda v_{i_a} + v_{i_{a-1}}$, with the understanding that $v_{i_0} = 0$). Then $(N - \lambda I_n)^mv = 0$ for all $v \in \langle v_{i_1}, \ldots, v_{i_m}\rangle$. Therefore, $(N - \lambda I_n)v = 0$ by Lemma 7.6; that is, $Nv = \lambda v$ and $v$ is an eigenvector of $N$. However, this means that every vector in $\langle v_{i_1}, \ldots, v_{i_m}\rangle$ is an eigenvector of $N$; so $m = 1$ and this is a Jordan block of size $1$. Consequently, all of the Jordan blocks in $J$ have size one; in other words, $J$ is a diagonal matrix and so $V$ has a basis which consists of eigenvectors for $N$.

For each $\lambda$ let $V_\lambda$ be the subspace of $V$ consisting of the $\lambda$-eigenvectors; that is,
$$V_\lambda = \{\, v \in V \mid Nv = \lambda v \,\}.$$
Then $V = \bigoplus_\lambda V_\lambda$, where $\lambda$ runs over the eigenvalues of $N$. By Lemma 7.5, if $v \in V_\lambda$ and $w \in V_\mu$, for $\lambda \neq \mu$, then $\langle v, w\rangle = 0$; so the eigenspaces $V_\lambda$ and $V_\mu$ are automatically orthogonal to each other. Moreover, using Gram–Schmidt we can find an orthonormal basis for each $V_\lambda$. Hence, we can find an orthonormal basis $\{f_1, \ldots, f_n\}$ for $V$ which is built up from the orthonormal bases of the $V_\lambda$. By construction, $Nf_i = \lambda_i f_i$ for some $\lambda_i \in \mathbb{C}$. Let $T$ be the matrix such that $f_i = Te_i$; in other words, the $i$th column of $T$ is the column vector for $f_i$ (with respect to the standard basis $\{e_1, \ldots, e_n\}$). Then $T^{-1}NT$ is the diagonal matrix $\operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.

Finally, it remains to observe that $T$ is a unitary matrix by Theorem 6.4, since it maps the orthonormal basis $\{e_1, \ldots, e_n\}$ to the orthonormal basis $\{f_1, \ldots, f_n\}$. Consequently, we also know that $T^{-1} = T^*$ by Corollary 6.8. Hence, $T^{-1}NT = T^*NT$ is diagonal, as we wished to show.

As the proof shows, the diagonal entries of $T^*NT$ are just the eigenvalues of $N$. If we now specialize to the case of isometries, we see that a linear transformation $T : V \to V$ is an isometry if and only if there exists an orthonormal basis $\{f_1, \ldots, f_n\}$ of $V$ together with some (complex) numbers $\lambda_1, \ldots, \lambda_n$ of norm $1$ such that $T(f_i) = \lambda_i f_i$, for $i = 1, \ldots, n$.
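Theorem 7.7 can be seen in action with numpy (our sketch). Here the normal matrix is a rotation, whose eigenvalues $\pm i$ are distinct, so the normalized eigenvectors returned by np.linalg.eig are automatically orthonormal; for repeated eigenvalues one would run Gram–Schmidt within each eigenspace, exactly as in the proof.

```python
import numpy as np

N = np.array([[0.0, -1.0],
              [1.0,  0.0]])                        # rotation by 90 degrees
assert np.allclose(N @ N.conj().T, N.conj().T @ N) # N is normal (in fact orthogonal)

lam, T = np.linalg.eig(N)                          # eigenvalues i and -i
print(np.allclose(T.conj().T @ T, np.eye(2)))      # T is unitary
print(np.round(T.conj().T @ N @ T, 10))            # T* N T = diag(i, -i)
```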

Application: quadratic surfaces in $\mathbb{R}^n$.

A general quadratic surface in $\mathbb{R}^n$ is given by the equation
$$\sum_{i,j=1}^{n} a_{ij}x_ix_j + \sum_{i=1}^{n} b_ix_i + c = 0,$$
where the $a_{ij}$, $b_i$, $c$ are real constants. We can certainly assume that $a_{ij} = a_{ji}$ for all $i$ and $j$. This means that the corresponding matrix $A = [a_{ij}]$ is symmetric, $A = A^t$. In particular, if $n = 2$ then taking $x = x_1$ and $y = x_2$ we get the equation of a quadratic curve in $\mathbb{R}^2$,
$$a_{11}x^2 + 2a_{12}xy + a_{22}y^2 + b_1x + b_2y + c = 0.$$


By Corollary 7.4(a), all the eigenvalues $\lambda_1, \ldots, \lambda_n$ of the matrix $A$ are real. Furthermore, by the real version of Theorem 7.7, there exists an orthogonal matrix $T$ with real entries such that the matrix $D := T^{-1}AT$ is diagonal with diagonal entries $\lambda_1, \ldots, \lambda_n$.

Denote by $x$ the column vector with entries $x_1, \ldots, x_n$. Then the quadratic part of the equation of the surface can be written as
$$\sum_{i,j=1}^{n} a_{ij}x_ix_j = x^tAx.$$
Introduce new coordinates $y_1, \ldots, y_n$ in $\mathbb{R}^n$ by setting $y = T^{-1}x$, where $y$ denotes the column vector with entries $y_1, \ldots, y_n$. Then $x = Ty$, and in the new coordinates the quadratic part of the surface takes the form
$$x^tAx = y^tT^tATy = y^tDy = \sum_{i=1}^{n} \lambda_iy_i^2.$$

This brings the equation of the surface to the simpler form
$$\sum_{i=1}^{n} \lambda_iy_i^2 + \sum_{i=1}^{n} \mu_iy_i + c = 0,$$
where the $\mu_i$ are real constants. The surface is said to be nondegenerate if $\lambda_i \neq 0$ for all $i$. In this case, using the shifts $y_i := y_i + \mu_i/(2\lambda_i)$ (which complete the squares) and a possible renumbering of the coordinates, we can bring the equation of the surface to the canonical form

$$\sum_{i=1}^{p} \frac{y_i^2}{a_i^2} - \sum_{i=p+1}^{n} \frac{y_i^2}{a_i^2} = 1 \qquad\text{or}\qquad \sum_{i=1}^{p} \frac{y_i^2}{a_i^2} - \sum_{i=p+1}^{n} \frac{y_i^2}{a_i^2} = 0,$$

where the $a_i$ are positive numbers and $p \in \{0, 1, \ldots, n\}$. For the first equation, if $p = 0$ then the surface is empty. If $p = n$ the surface is an ellipsoid, and for the remaining values of $p$ the surface is a hyperboloid. For the second equation, the corresponding surface is a cone, which degenerates into a point if $p = 0$ or $p = n$.

The degenerate surfaces (i.e. where $\lambda_i = 0$ for certain $i$) include paraboloids and cylinders of various kinds, as well as planes or pairs of planes.

In particular, for $n = 2$ the quadratic equations determine ellipses, hyperbolas, parabolas, lines, pairs of lines, points or the empty set. The detailed analysis of the possible canonical forms in the cases $n = 2$ and $n = 3$ is left to the reader.
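As an illustration (ours), numpy's eigh diagonalizes a real symmetric matrix by an orthogonal matrix, which identifies the curve: for $x^2 + xy + y^2 = 1$ both eigenvalues are positive, so the curve is an ellipse.

```python
import numpy as np

A = np.array([[1.0, 0.5],
              [0.5, 1.0]])               # quadratic part of x^2 + xy + y^2 = 1

lam, T = np.linalg.eigh(A)               # real eigenvalues, orthogonal eigenvectors
print(lam)                               # [0.5 1.5]: both positive, an ellipse
print(np.allclose(T.T @ T, np.eye(2)))   # T is orthogonal
print(np.round(T.T @ A @ T, 10))         # T^t A T = diag(0.5, 1.5)
```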


    PART II: GROUP THEORY

    8. The definition of a group

A group is nothing more than a set which is equipped with one operation, such as addition or multiplication. Groups are ubiquitous in mathematics and also in the real world. They were first studied in detail by the French mathematician Évariste Galois, who used them to show that there is no general formula for the solution of a polynomial equation of degree 5 or higher which uses only surds.

The definition of a group is very abstract but, as we shall see, we already know many examples. Many new examples also arise when we look at the symmetry properties of objects from geometry.

8.1 Definition A binary operation on a set $X$ is a map $\cdot : X \times X \to X$; $(x, y) \mapsto x \cdot y$.

For example, addition, multiplication and division are usually binary operations. Note, however, that we do have to take some care here: if $X = \mathbb{R}$ then addition and multiplication are both binary operations on $X$ but division is not, because we cannot divide by zero!

    A group is a set which comes with a special type of binary operation.

8.2 Definition A group is a set $G$ together with a binary operation
$$\cdot : G \times G \to G; \quad (g, h) \mapsto g \cdot h,$$
such that:
a) (associative) if $g, h, k \in G$ then $(g \cdot h) \cdot k = g \cdot (h \cdot k)$;
b) (identity) there exists an element $e \in G$ such that $e \cdot g = g = g \cdot e$, for all $g \in G$; and,
c) (inverses) for each $g \in G$ there is an element $g' \in G$ such that $g \cdot g' = e = g' \cdot g$.

Note that implicit in the definition is the assumption that $g \cdot h \in G$, for all $g, h \in G$. An element $e \in G$ which satisfies property (b) is called an identity element of $G$. If $g \in G$ then an element $g' \in G$ such that $g \cdot g' = e = g' \cdot g$ is called an inverse of $g$. Note that we are assuming only that $G$ has at least one identity and that each element of $G$ has at least one inverse; we will shortly see that these elements are unique.

    The following examples (and others) were discussed in more detail in lectures. You should

    check that all of these examples are groups; in particular, you need to ask yourself what the

    identity element is in each group and what the inverses are.

8.3 Definition A group $G$ is abelian (or commutative) if $g \cdot h = h \cdot g$ for all $g, h \in G$.

Abelian groups are the simplest sorts of groups around; however, even here there are still some nontrivial questions to be answered.

8.4 Examples a) Let $G = \mathbb{Z}$ with $\cdot$ being multiplication. Then $G$ is not a group because, for example, $0$ does not have an inverse. Note, however, that $1$ is an identity element.

b) Let $G = \mathbb{Z}$ with the operation $\cdot$ being addition. This time $0$ is an identity element (since $0 + n = n = n + 0$ for all $n \in \mathbb{Z}$) and the inverse of $n \in \mathbb{Z}$ is $-n$ (since $n + (-n) = 0 = (-n) + n$). As addition is associative this means that $\mathbb{Z}$ is a group (when we take the operation to be addition). Notice that $\mathbb{Z}$ is abelian.

c) Let $G = \mathbb{Q}^\times$ be the set of nonzero rational numbers with the operation of multiplication. This time $G$ is an abelian group.


d) Let $G$ be the set of all $n \times n$ matrices with entries in a field $F$, under addition. Again $G$ is an abelian group.

e) Let $G$ be the set of all $n \times n$ matrices with entries in a field $F$, under multiplication. This time $G$ is not a group, for the same reason as in example (a): the zero matrix does not have an inverse.

f) Let $G = \mathrm{GL}_n(F) = \{\, A \in M_n(F) \mid \det A \neq 0 \,\}$ be the set of invertible $n \times n$ matrices under multiplication (of matrices). Then $G$ is a group. It is also our first example of a nonabelian group.

g) Let $G = O_n(\mathbb{R})$ be the set of all $n \times n$ orthogonal matrices, with the operation of multiplication. Then $G$ is a nonabelian group.

h) Let $V$ be a vector space, with the operation of addition (of vectors), so we are forgetting about scalar multiplication. Then $V$ is a group.

i) Let $V$ be a vector space and let $G = \mathrm{GL}(V)$ be the set of all isomorphisms from $V$ to $V$. Then $G$ becomes a group under composition of maps: $f \cdot g = f \circ g$. Actually, as an isomorphism from $V$ to $V$ corresponds to an invertible $n \times n$ matrix, where $n$ is the dimension of $V$, this is the same group as in (f) above.

j) For a positive integer $n$, let $\mathbb{Z}_n = \{0, 1, \ldots, n - 1\}$ with the operation being addition modulo $n$. (Recall that for any integer $a$ there is a unique integer $r$ such that $a = kn + r$ and $0 \leq r < n$. Define $\bar{a} = r$; it is also common to write $a \equiv r \pmod{n}$. Then the operation on $\mathbb{Z}_n$ is $a \cdot b = \overline{a + b}$.) Then $\mathbb{Z}_n$ is the cyclic group of order $n$.

k) For a positive integer $n$, let $\varepsilon = \exp(2\pi i/n)$ be a primitive $n$th root of unity in $\mathbb{C}$; that is, $\varepsilon^n = 1$. Let $C_n = \{\, \varepsilon^a \mid a \in \mathbb{Z} \,\} = \{1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1}\}$, with the operation being multiplication (of complex numbers). Then $C_n$ is (also?) the cyclic group of order $n$. If you compare the multiplication tables for $\mathbb{Z}_n$ and $C_n$ you will see that they are the same. Later, we will see that these two groups are isomorphic (via the map $\bar{a} \mapsto \varepsilon^a$).

    same. Later, we will see that these two groups are isomorphic (via the map a a).l) Fix a positive integer n and let Sym(n) be the set of all permutations of the set n ={1, 2, . . . , n}. A permutation ofn is nothing more than a bijection from n to n: to each

    integer i nwe assign another integerj n. In particular, Sym(n) contains n! elements.The group operation on Sym(n) is just composition of maps; however, in order to makethe multiplication of permutations read more naturally I want to define composition as

    follows: if, Sym(n) then is the permutation ofn given by()(i) = ((i)), for all i n.

    (Usually, composition works from the left: (f g)(x) = f(g(x)). For permutations manyauthors define composition from the right. In fact, we should really write the maps on the

    right as well: (i)() = ((i)), but we wont do this.) As before, it is easy to check thatthis operation makes Sym(n) into a group.

The most obvious way of specifying the elements of $\mathrm{Sym}(n)$ is to use two line notation; for example, let $\sigma$ be the element of $\mathrm{Sym}(5)$ given by $\sigma(1) = 2$, $\sigma(2) = 3$, $\sigma(3) = 1$, $\sigma(4) = 5$ and $\sigma(5) = 4$. Then $\sigma$ can be described more compactly as
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 5 & 4 \end{pmatrix}.$$

The advantage of the convention that we are using for the operation in $\mathrm{Sym}(n)$ is that in order to calculate the product of two permutations we just read the equations from left to right. For example,
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 5 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 3 & 5 & 1 & 2 \end{pmatrix}.$$

There is also a more compact notation for permutations known as the cycle notation; using this notation we would write $\sigma$ as $\sigma = (1, 2, 3)(4, 5)$. You read this, cyclically,


from left to right: the first cycle $(1, 2, 3)$ says that $\sigma$ sends $1$ to $2$, $2$ to $3$, and $3$ back to $1$; the second cycle $(4, 5)$ says that $\sigma$ interchanges $4$ and $5$. Note that in cycle notation $(1, 2, 3) = (2, 3, 1) = (3, 1, 2)$; all that matters is the order of the numbers up to a cyclic permutation. Using cycle notation the product of the two permutations above becomes
$$(1, 2, 3)(4, 5) \cdot (1, 5)(2, 4) = (1, 4)(3, 5, 2);$$
again, to work this out we read from left to right.

Finally, notice that in cycle notation
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \end{pmatrix} = (1, 5)(2, 4).$$
Strictly speaking we should write $(1, 5)(2, 4)(3)$, where the $(3)$ indicates that $3$ is sent to $3$ (i.e. it is fixed); however, we normally just omit the fixed points of the permutations. (A short computational illustration of this convention appears after this list of examples.)

Warning: Armstrong reads his permutations differently.

m) Generalizing the last example, let $X$ be any set and let $\mathrm{Sym}(X)$ be the set of bijections from $X$ to $X$, the permutations of $X$. The argument of the last example shows that $\mathrm{Sym}(X)$ is a group.

n) Let $\Gamma$ be a graph; in other words, $\Gamma$ consists of a set of vertices $V_\Gamma$ and a set of edges $E_\Gamma \subseteq \{\, (x, y) \mid x, y \in V_\Gamma \,\}$. A graph automorphism of $\Gamma$ is a bijection $f : V_\Gamma \to V_\Gamma$ which preserves the edges of $\Gamma$; that is, if $(x, y) \in E_\Gamma$ then $(f(x), f(y)) \in E_\Gamma$. The symmetry group of $\Gamma$ is the set $G$ of all graph automorphisms of $\Gamma$, where the group operation is composition of maps. Notice that if $\Gamma$ has no edges then $G = \mathrm{Sym}(V_\Gamma)$.

o) Suppose that $n \geq 3$ and let $P_n$ be a regular $n$-gon; that is, $P_n$ is the graph with vertices $1, 2, \ldots, n$ and edges joining $n$ and $1$, and $i$ and $i + 1$ for $1 \leq i < n$. (For $P_n$ to be a regular $n$-gon we also require that all of the edges in $P_n$ have the same length.) For example, $P_3$ is an equilateral triangle, $P_4$ is a square, $P_5$ is a pentagon, $P_6$ is a hexagon, and so on. See Example 10.7 for $P_8$ and a detailed analysis of $D_8$.

The symmetry group of $P_n$ is known as the dihedral group $D_n$ (of order $2n$). An element of $D_n$ is completely determined by where it sends the vertices labelled $1$ and $2$. If $1$ is mapped to the vertex $i$ then $2$ must be sent to either $i + 1$ or $i - 1$ (interpret $i \pm 1$ modulo $n$); hence, the group $D_n$ contains exactly $2n$ elements. Notice that the subset of $D_n$ consisting of the rotations is also a cyclic group of order $n$.

p) Consider the following four matrices with complex entries:
$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$
One easily verifies the relations
$$I^2 = J^2 = K^2 = -\mathbf{1}, \qquad IJK = -\mathbf{1},$$
which imply that the set of 8 matrices $Q_8 = \{\pm\mathbf{1}, \pm I, \pm J, \pm K\}$ is closed under matrix multiplication and forms a group, called the quaternion group or Hamilton group. (The lower case letters $i, j, k$ are often used instead of $I, J, K$.)
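As promised in example (l), here is a small Python sketch (ours, not part of the notes) of the left-to-right composition convention, together with a converter to cycle notation.

```python
def compose(s, t):
    # (s t)(i) = t(s(i)): apply s first, then t. Permutations are dicts on {1,...,n}.
    return {i: t[s[i]] for i in s}

def cycles(s):
    # Cycle notation, omitting fixed points.
    seen, out = set(), []
    for i in sorted(s):
        if i not in seen and s[i] != i:
            c = [i]
            while s[c[-1]] != i:
                c.append(s[c[-1]])
            seen.update(c)
            out.append(tuple(c))
    return out

s = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}   # (1,2,3)(4,5)
t = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}   # (1,5)(2,4)
print(cycles(compose(s, t)))          # [(1, 4), (2, 3, 5)], i.e. (1,4)(3,5,2)
```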

    As the course unfolds it is a good idea to ask what the theorems we prove say about the

    various examples above.

The examples show that we need to be a little careful with our notation: the operation $\cdot$ is reminiscent of multiplication; however, in many examples the operation is nothing like multiplication. Shortly we will drop this dot notation and simply write $gh$ rather than $g \cdot h$. This is a matter of convenience only: when we are talking about an abstract group (or, if you prefer, an arbitrary group) we need a notation for our operation. When we are talking about specific examples then the operation could be addition, multiplication, or possibly something quite different (as in the example of braid groups).


8.5 Proposition Suppose that $G$ is a group. Then:
a) The identity element of $G$ is unique; that is, if $e$ and $e'$ are elements of $G$ such that $g \cdot e = g = e \cdot g$ and $g \cdot e' = g = e' \cdot g$, for all $g \in G$, then $e = e'$.
b) Each element of $G$ has a unique inverse; that is, if $g \in G$ and there exist elements $g', g'' \in G$ such that $g' \cdot g = e = g \cdot g'$ and $g'' \cdot g = e = g \cdot g''$, then $g' = g''$.

Proof
a) Suppose that there exist elements $e, e' \in G$ as in the statement of the Proposition. Then $e = e \cdot e' = e'$, the first equality following because $e'$ is a (right) identity element and the second one because $e$ is a (left) identity.
b) Suppose that we have elements $g'$ and $g''$ as above. Then
$$g' = g' \cdot e = g' \cdot (g \cdot g'') = (g' \cdot g) \cdot g'' = e \cdot g'' = g'',$$
where we have used the facts that the binary operation is associative, $e$ is an identity element, $g''$ is a (right) inverse of $g$, and $g'$ is a (left) inverse of $g$.

    Now that we know that identity elements and inverses are unique we adopt the following

    convention.

8.6 Notation If $G$ is a group we let $1_G$ (or simply $1$ when $G$ is understood) denote the identity element of $G$. If $g \in G$ then we write $g^{-1}$ for the inverse of $g$.

With this notation we can now rewrite the group axioms for identity elements and inverses as the familiar looking equations $1 \cdot g = g = g \cdot 1$ and $g \cdot g^{-1} = 1 = g^{-1} \cdot g$, respectively. We are really using a multiplicative (and exponential) notation for our group operation: we could equally well use an additive one. The only reason for preferring a multiplicative notation over an additive one is that addition tends to be commutative whereas multiplication is often not commutative (for example, consider matrix multiplication).

A word of warning here: we use this shorthand notation because it is very convenient; however, you should not forget that this notation is shorthand. In particular, $1_G = 1$ is the identity element of $G$ and not the number one; indeed, $1_G$ could well be the number zero (for example, this would be the case if $G = \mathbb{Z}$ and the operation were addition). Similarly, $g^{-1}$ is the inverse of $g$ (and not $\frac{1}{g}$, even when this does make sense). Exactly how $g^{-1}$ is described will depend on the group in question (for example, if $G = \mathbb{Z}$ then $g^{-1}$ is the negative $-g$ of $g \in G$). From this point on we will also (mostly) drop the $\cdot$ notation; so rather than $g \cdot h$ we will simply write $gh$.

8.7 Lemma Suppose that $g, h \in G$. Then $(g^{-1})^{-1} = g$ and $(gh)^{-1} = h^{-1}g^{-1}$.

Proof As $gg^{-1} = 1 = g^{-1}g$, the first equality is obvious (as $(g^{-1})^{-1}$ is the unique element of $G$ which satisfies this equation). For the second claim note that
$$(gh)(h^{-1}g^{-1}) = g(hh^{-1})g^{-1} = g \cdot 1 \cdot g^{-1} = gg^{-1} = 1.$$
Similarly, $(h^{-1}g^{-1})(gh) = 1$. As inverses are uniquely determined, $(gh)^{-1} = h^{-1}g^{-1}$ as claimed.

    We end this section with some more notation.


8.8 Definition Suppose that $G$ is a group.
a) If $g \in G$ then the order $|g|$ of $g$ is the smallest positive integer $n$ such that $g^n = e$, if such an $n$ exists. Otherwise, $g$ is of infinite order.
b) The order $|G|$ of $G$ is the number of elements in $G$.
c) We say that $G$ is a finite group if it has finite order.

Given the nomenclature, it is natural to ask whether there is a relationship between the order of a group and the possible orders of its elements. As a challenge, try and work out what the connection is (see what happens in the examples above with $|G|$ finite); a small computational hint is given below.
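For instance (our illustration), in $\mathbb{Z}_{12}$ under addition modulo $12$ the order of $g$ is the least $k$ with $kg \equiv 0 \pmod{12}$, and a brute-force computation shows that every element order divides $|G| = 12$, which is the connection being hinted at.

```python
def order_mod(g, n):
    # Order of g in Z_n: the least k >= 1 with k*g = 0 (mod n).
    k, x = 1, g % n
    while x != 0:
        x = (x + g) % n
        k += 1
    return k

print({g: order_mod(g, 12) for g in range(1, 12)})
# {1: 12, 2: 6, 3: 4, 4: 3, 5: 12, 6: 2, 7: 12, 8: 3, 9: 4, 10: 6, 11: 12}
```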

9. Subgroups

Now that we know that the inverse of an element is uniquely determined, for any integer $n$ we also define
$$g^n = \begin{cases} g \cdot g \cdots g \ (n \text{ times}), & \text{if } n > 0, \\ 1, & \text{if } n = 0, \\ g^{-1} \cdot g^{-1} \cdots g^{-1} \ (-n \text{ times}), & \text{if } n < 0. \end{cases}$$
At the risk of boring you, once again this is just a convenient shorthand and the meaning of $g^n$ will depend upon the particular example we have in mind (for example, if $G = \mathbb{Z}$ then $g^n$ is actually $ng = g + g + \cdots + g$). We now have the easy Lemma.

9.1 Lemma Suppose that $G$ is a group, $m$ and $n$ are integers and that $g \in G$. Then $g^{m+n} = g^mg^n$ and $(g^m)^n = g^{mn}$.

    Proof See tutorials.

If $g \in G$ let $\langle g\rangle = \{\, g^n \mid n \in \mathbb{Z} \,\}$; so $\langle g\rangle$ is a subset of $G$. In fact, $\langle g\rangle$ is also a group in its own right. First note that the operation on $G$ gives an operation on $\langle g\rangle$ by restriction, since $g^mg^n = g^{m+n}$ by the Lemma. Also, $1 = g^0 \in \langle g\rangle$, so $\langle g\rangle$ has an identity element. Finally, if $g^n \in \langle g\rangle$ then $g^{-n} \in \langle g\rangle$, so every element of $\langle g\rangle$ has an inverse in $\langle g\rangle$ (since $g^ng^{-n} = 1 = g^{-n}g^n$, by the Lemma once again). We say that $\langle g\rangle$ is the subgroup of $G$ generated by $g$.

9.2 Definition Suppose that $G$ is a group. A subgroup of $G$ is any nonempty subset $H$ of $G$ which is itself a group, where the operation on $H$ is the restriction of the operation on $G$ to $H$. If $H$ is a subgroup of $G$ then we write $H \leq G$.

Every group always has at least two subgroups; namely, $\{1_G\}$ and $G$ itself. A subgroup $H$ of $G$ is nontrivial if $H \neq \{1_G\}$ and it is proper if $H \neq G$. The interesting subgroups are the nontrivial proper subgroups.

We saw above that $\langle g\rangle$ is a subgroup of $G$ whenever $g \in G$. Further, if $g \neq 1_G$ then $\langle g\rangle$ is a nontrivial subgroup of $G$. If $G = \langle g\rangle$ then we say that $G$ is a cyclic group; cyclic groups are the simplest types of all possible groups. For example, $\mathbb{Z}$ and $\mathbb{Z}_n$ are both cyclic groups; in fact, later we will see that every cyclic group is isomorphic to one of these.

The first problem that arises when considering subgroups is that a priori if $a, b \in H$ then there is no reason to expect that $a \cdot b \in H$. The next problem is that the identity element of $H$ may not be the same element as the identity element of $G$. Certainly, if $1_G \in H$ then it must be true that $1_G = 1_H$ (since identity elements are unique); however, we have not assumed that


1_G ∈ H. Finally, the same problem arises when we consider inverses: the inverse of a ∈ H inside H could be different to the inverse of a in G.

In the statement of the next result, for an element h ∈ H we write h^{-1_H} and h^{-1_G} for the inverse of h when considered as an element of H and of G respectively.

9.3 Proposition Suppose that H is a subgroup of G. Then 1_H = 1_G and h^{-1_H} = h^{-1_G} for all h ∈ H.

Proof First consider 1_H. We know that 1_H 1_H = 1_H (in either G or H, since both have the same operation). Multiplying this equation on the left by the inverse of 1_H in G gives 1_G = 1_H^{-1_G} 1_H = 1_H^{-1_G} (1_H 1_H) = 1_G 1_H = 1_H.

The result for the inverses is now clear: inside H we have h h^{-1_H} = 1_H = h^{-1_H} h. However, by the first paragraph 1_H = 1_G, so we can rewrite this equation as h h^{-1_H} = 1_G = h^{-1_H} h. Therefore, h^{-1_H} = h^{-1_G} because inverses are unique by Proposition 8.5.

Given this result we can now drop the subscripts on the identity and inverse elements of subgroups and unambiguously write 1 and h^{-1} for the identity element and inverse elements. Our immediate goal is to understand when a subset of a group is actually a subgroup.

9.4 Lemma Suppose that H is a nonempty subset of G. Then H is a subgroup of G if and only if the following two conditions hold:
a) if a, b ∈ H then ab ∈ H; and,
b) if a ∈ H then a^{-1} ∈ H.

Proof If H is a subgroup of G then both of these conditions are satisfied by definition.

Conversely, suppose that (a) and (b) are true. First, by (a) the binary operation on G does restrict to give an operation on H. Further, as the operation on G is associative, it is still associative when considered as an operation on H. Next, because H ≠ ∅ we can find an element a ∈ H. By condition (b), we know that a^{-1} ∈ H; in particular, every element of H has an inverse in H. Finally, since H is closed under multiplication, 1 = a a^{-1} is also an element of H; so H has an identity element. Hence, H is a subgroup of G.

When condition (a) holds we say that H is closed under multiplication. Similarly, H is closed under the taking of inverses if it satisfies (b).

9.5 Proposition Suppose that G is a finite group. Then a nonempty subset H of G is a subgroup of G if and only if H is closed under multiplication.

Proof Again, if H is a subgroup of G then it is closed under multiplication, so there is nothing to prove here.

Conversely, suppose that H is closed under multiplication. By the Lemma, in order to show that H is a subgroup of G it is enough to show that a^{-1} ∈ H whenever a ∈ H. By assumption, if a ∈ H then a^n = a · a · … · a ∈ H whenever n ≥ 1. However, G is a finite group so it must be true that a^m = a^n for some m > n (otherwise {a, a^2, a^3, …} would be an infinite subset of G). Therefore, multiplying by a^{-n-1} (inside G), we see that a^{m-n-1} = a^{-1}. However, m > n so that means that a^{-1} = a^{m-n-1} ∈ H. (Note that if m − n − 1 = 0 then a^{-1} = 1 so a = 1; so if a ≠ 1 then m − n − 1 ≥ 1.) This is what we needed to show, so H is a subgroup as claimed.

    We now come to the main result which characterizes when a subset of a group is a subgroup.


9.6 Theorem (The Subgroup Criterion) Suppose that G is a group and that H is a nonempty subset of G. Then H is a subgroup of G if and only if ab^{-1} ∈ H for all a, b ∈ H.

Proof If H is a subgroup of G then, in particular, it is a group so ab^{-1} ∈ H whenever a, b ∈ H.

Conversely, suppose that ab^{-1} ∈ H for all a, b ∈ H. Then 1 = bb^{-1} ∈ H, taking a = b. Consequently, if b ∈ H then b^{-1} = 1 · b^{-1} is also in H; hence H is closed under inverses. Finally, since b^{-1} ∈ H we see that ab = a(b^{-1})^{-1} ∈ H; so H is closed under multiplication. Thus, H is a subgroup by Lemma 9.4.
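The criterion is easy to test mechanically. Here is a small Python sketch (our illustration, with is_subgroup our own name) for subsets of the additive group Z_n, where ab^{-1} becomes a − b (mod n).

```python
def is_subgroup(H, n):
    # Subgroup criterion in Z_n under addition: H is nonempty and
    # a - b (mod n) lies in H for all a, b in H.
    return bool(H) and all((a - b) % n in H for a in H for b in H)

print(is_subgroup({0, 3, 6, 9}, 12))  # True: this is the subgroup generated by 3
print(is_subgroup({0, 3, 6}, 12))     # False: 3 - 6 = 9 (mod 12) is missing
```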

    10. Generators

    Our current goal, which will occupy us for most of the rest of this course, is to understand

    the structure of groups; that is, how groups are built up out of smaller groups.

10.1 Proposition Suppose that G is a group. Then the intersection of an arbitrary collection of subgroups of G is again a subgroup of G.

Proof Let { H_i | i ∈ I } be a family of subgroups of G, for some indexing set I. We need to show that H_I = ⋂_{i∈I} H_i is also a subgroup of G. Note that H_I is nonempty because 1 ∈ H_i for all i ∈ I. Suppose that a, b ∈ H_I. Then a, b ∈ H_i for all i ∈ I. Therefore, by the subgroup criterion (Theorem 9.6), ab^{-1} ∈ H_i for all i ∈ I. Hence, ab^{-1} ∈ H_I, so H_I is a subgroup by Theorem 9.6.

    The proof of the last result is straightforward because we have the subgroup criterion to work

    with. Notice, however, that the result is very general because we are not assuming anything

    about the set I which indexes the subgroups. This is crucial for the next definition, which

    otherwise would not make sense.

10.2 Definition Suppose that X ⊆ G. Then the subgroup ⟨X⟩ of G generated by X is the intersection of all of the subgroups of G which contain X; that is,

    ⟨X⟩ = ⋂_{X ⊆ H ≤ G} H.

Recall that we write H ≤ G to indicate that H is a subgroup of G.

10.3 Corollary Suppose that X ⊆ G. Then ⟨X⟩ is the smallest subgroup of G which contains X.

Proof By definition, ⟨X⟩ is contained in every subgroup of G which contains X.

This Corollary is really just a restatement of the definition. The content of the result is that there is a (unique) smallest subgroup of G which contains X. A priori this is not clear; however, it follows from Proposition 10.1 because if H_1 and H_2 are two subgroups containing X then H_1 ∩ H_2 is another such subgroup.

In the last section we defined ⟨g⟩ = { g^n | n ∈ Z }, for g ∈ G. It is not immediately apparent that this notation agrees with Definition 10.2; however, it does.


10.4 Proposition Let G be a group.
a) If g ∈ G then the subgroup of G generated by g consists of the elements { g^n | n ∈ Z }.
b) If X ⊆ G then ⟨X⟩ = { x_1^{ε_1} x_2^{ε_2} … x_k^{ε_k} | x_i ∈ X and ε_i = ±1 for 1 ≤ i ≤ k, k ≥ 1 }.

    Proof We prove only part (a) and leave (b) to the tutorials.

Let H be the subgroup of G generated by g. We have already seen that ⟨g⟩ is a subgroup, and certainly g ∈ ⟨g⟩, so H ⊆ ⟨g⟩ by definition. Conversely, g ∈ H so g^n ∈ H, for all n ≥ 1, since H is closed under multiplication. Also, g^{-1} ∈ H because H is closed under the taking of inverses: so g^{-n} ∈ H, for all n ≥ 1. Finally, 1 = g^0 ∈ H. Hence, ⟨g⟩ ⊆ H and so H = ⟨g⟩ as required.

10.5 Definition A group G is cyclic if G = ⟨g⟩ for some g ∈ G.

For example, Z = ⟨1⟩ and Z_n = ⟨1⟩ are both cyclic groups.

10.6 Proposition Suppose that G = ⟨g⟩ is a cyclic group and that g has finite order m = |g|. Then |G| = |g| and G = {1, g, …, g^{m-1}}.

Proof By Proposition 10.4, G = ⟨g⟩ = { g^n | n ∈ Z }. Further, if n ∈ Z then n = km + r for unique integers k and r with 0 ≤ r < m; therefore,

    g^n = g^{km+r} = g^{km} g^r = (g^m)^k g^r = g^r

by Lemma 9.1. Hence, ⟨g⟩ = {1, g, …, g^{m-1}}.

It remains to show that |G| = m; that is, that the elements 1, g, …, g^{m-1} are all distinct. Suppose that g^r = g^s for some integers r and s with 0 ≤ r < s < m. Then g^{s-r} = 1; however, this is nonsense because 0 < s − r < m and |g| = m. Hence, the result.
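Proposition 10.6 can be watched in a concrete case. The sketch below is our own illustration: it lists the powers of 2 in the multiplicative group of nonzero residues modulo 11 (the choice of 2 and 11 is ours, made so that the powers sweep out the whole group).

```python
def powers(g, p):
    # List 1, g, g^2, ... modulo the prime p until the powers start repeating.
    result, x = [1], g % p
    while x != 1:
        result.append(x)
        x = (x * g) % p
    return result

H = powers(2, 11)
print(H)       # [1, 2, 4, 8, 5, 10, 9, 7, 3, 6]
print(len(H))  # 10 -- the order of 2 mod 11, and the 10 powers are all distinct
```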

10.7 Example We close this section by looking at the example of the dihedral group of order 16 in detail; this is the symmetry group G of the octagon:

    (Figure: a regular octagon with its vertices labelled 1 to 8.)

Notice that G has exactly 16 elements because the vertex labelled 1 can be sent to any of the 8 vertices; after this the vertex 2 must stay joined to 1, so it can only be sent to one of the two vertices adjacent to the image of 1. Once we have specified where 1 and 2 go the permutation of the vertices is completely determined; so |G| = 2 × 8 = 16 as claimed.

The last paragraph gave an explicit description of the elements of G; we now describe them in terms of two generators of G. Let r be a clockwise rotation through 2π/8 radians; so r = (1, 2, 3, 4, 5, 6, 7, 8) as a permutation of the vertices. Then r has order 8 (this is clear whether we think of r as a permutation or geometrically); therefore,

    ⟨r⟩ = {1, r, r^2, r^3, r^4, r^5, r^6, r^7}


is a subgroup of G of order 8. Next let s be the reflection in the line that bisects the edges joining 1 and 2, and 5 and 6; as a permutation of the vertices s = (1, 2)(3, 8)(4, 7)(5, 6). Then s^2 = 1, so s is an element of order 2. Therefore, ⟨s⟩ = {1, s} is a subgroup of G of order 2.

I claim that G = ⟨r, s⟩. (Strictly speaking I should write G = ⟨{r, s}⟩; as a general rule I'll omit such extraneous brackets.) Certainly,

    ⟨r, s⟩ ⊇ {1, r, r^2, r^3, r^4, r^5, r^6, r^7, s, sr, sr^2, sr^3, sr^4, sr^5, sr^6, sr^7}.

If we can show that these elements are all distinct then we'll be done, because |G| = 16 and |⟨r, s⟩| ≤ 16 with equality if and only if G = ⟨r, s⟩. Geometrically it is clear that all of these elements are different because the powers of r are precisely the rotations of the octagon, whereas the elements of the form sr^b are the rotations followed by a twist. (It is perhaps not immediately obvious that every symmetry of the octagon is of one of these forms; however, this must be the case because |G| = 16.)

    this must be the case because |G| = 16).We can also see algebraically that these elements are distinct. First, the powers of r are

    distinct since r has order 8. Next, ifra = srb for some a and b with 0 a, b < 8 then s = rba;so 1 = s2 = r2(ba) so b a = 4 and s = r4, which is nonsense! Finally, if sra = srb, for aand b as before, then ra = rb so a = b.

We have now shown that G = ⟨r, s⟩; however, this is a little curious because ⟨r, s⟩ also contains elements like r^5 s^3 r^6 s^7 r which do not seem to appear in the list above. The reason for this is the following: I claim that srs = r^{-1}. Again, we can see this geometrically or by multiplying out the permutations; I'll leave the geometrical argument to you. The permutation calculation is the following:

    srs = (1, 2)(3, 8)(4, 7)(5, 6) · (1, 2, 3, 4, 5, 6, 7, 8) · (1, 2)(3, 8)(4, 7)(5, 6) = (1, 8, 7, 6, 5, 4, 3, 2) = r^{-1}.

(Recall that we read permutations from left to right.) Hence, srs = r^{-1}; or, equivalently (by multiplying on the left by s and using the fact that s^2 = 1), rs = sr^{-1}. In other words, whenever we have an s to the right of an r we can move it to the left by changing r into r^{-1}. It is now easy to see by induction on b that r^b s = sr^{-b}, for all b with 0 ≤ b < 8. Therefore, the expression above becomes

    r^5 s^3 r^6 s^7 r = (r^5 s)(r^6 s)r = (sr^{-5})(sr^{-6})r = s(r^{-5}s)r^{-5} = s(sr^5)r^{-5} = r^5 r^{-5} = 1,

noting that s^2 = 1 and r^8 = 1, so r^a = r^{8k+a} and s = s^{2k+1}, for any a, k ∈ Z.

In fact, the multiplication in G is completely determined by the relations r^8 = 1, s^2 = 1 and srs = r^{-1}. (In general, a relation says that some word in the generators is equal to 1.) Because of this we write

    G = ⟨ r, s | r^8 = 1, s^2 = 1 and srs = r^{-1} ⟩.

This is called defining a group by generators and relations, which we will come to towards the end of the course. To see that the relations determine the multiplication in G first suppose that b > 0. Then we have

    the end of the course. To see that the relations determine the multiplication in G first supposethat b > 0. Then we have

    srbs = (srs)(srb1s) = r1(srb1s) = r1r(b1) = rb,

where the second last equality follows by induction on b. Therefore, just as before, the relations in G (i.e. r^8 = 1, s^2 = 1 and srs = r^{-1}) imply that

    G = { s^a r^b | a = 0 or 1, and 0 ≤ b < 8 }.


(With a small amount of effort it is possible to show that the relations also force all of these elements to be distinct.) Using the last formula we can show how to multiply two arbitrary elements of G: suppose that 0 ≤ a, c < 2 and 0 ≤ b, d < 8; then

    (s^a r^b)(s^c r^d) = s^a (r^b s^c) r^d = { s^a r^{b+d},       if c = 0,
                                             { s^{a+1} r^{d-b},   if c = 1.

So, as claimed, the multiplication in G is completely determined by the relations.
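The case analysis above is exactly an algorithm, so we can hand it to a computer. In the Python sketch below (our own illustration) an element s^a r^b of D_8 is encoded as the pair (a, b); the encoding and the name d8_mult are our choices, not the notes'.

```python
def d8_mult(x, y):
    # (s^a r^b)(s^c r^d): if c = 0 the result is s^a r^{b+d};
    # if c = 1 it is s^{a+1} r^{d-b}, with exponents taken mod 2 and mod 8.
    a, b = x
    c, d = y
    if c == 0:
        return (a, (b + d) % 8)
    return ((a + 1) % 2, (d - b) % 8)

r, s = (0, 1), (1, 0)
print(d8_mult(d8_mult(s, r), s))  # (0, 7): srs = r^7 = r^{-1}, as claimed
```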

Finally, we remark that it is not very hard to generalize this example to an arbitrary dihedral group D_n, the symmetry group of the regular n-gon. The crucial point in all of the calculations above was that srs = r^{-1}; this is still true in D_n if we take r to be a rotation through 2π/n and take s to be any reflection. Repeating these arguments for D_n we find that D_n = { s^a r^b | a = 0 or 1, and 0 ≤ b < n } and that D_n = ⟨ r, s | r^n = 1, s^2 = 1 and srs = r^{-1} ⟩.

    11. Cosets

    The key to understanding how a subgroup sits within a group is the following definition.

11.1 Definition Suppose that H is a subgroup of G and that g ∈ G. Then
a) gH = { gh | h ∈ H } is the left coset of H in G which contains g;
b) Hg = { hg | h ∈ H } is the right coset of H in G which contains g.

Most of the time we will work with left cosets; however, any result about left cosets can be translated into a result about right cosets because

    (aH)^{-1} = { (ah)^{-1} | h ∈ H } = { h^{-1}a^{-1} | h ∈ H } = { ha^{-1} | h ∈ H } = Ha^{-1},

where the first equality is the definition of (aH)^{-1} and the third holds because h ↦ h^{-1} is a bijection of H.

    The following properties of (left) cosets are both elementary and fundamental.

11.2 Lemma Let H be a subgroup of G.
a) Suppose that h ∈ H. Then hH = H.
b) Suppose that a ∈ G. Then |H| = |aH|.

Proof (a) Certainly hH ⊆ H since H is closed under multiplication. Conversely, if h′ ∈ H then h′ = h(h^{-1}h′) ∈ hH, so H ⊆ hH.

(b) Let φ : H → aH be the map φ(h) = ah. By definition φ is surjective; it is also injective because φ(h_1) = φ(h_2) if and only if ah_1 = ah_2, so h_1 = h_2 (multiply on the left by a^{-1}). Hence, φ is a bijection and |H| = |aH|.

11.3 Proposition Suppose that H is a subgroup of G and a, b ∈ G. Then the following are equivalent:
a) aH = bH;
b) aH ∩ bH ≠ ∅;
c) a ∈ bH;
d) b ∈ aH;
e) a^{-1}b ∈ H; and,
f) b^{-1}a ∈ H.

Proof If aH = bH then aH ∩ bH = aH ≠ ∅; so (a) implies (b).

Next, suppose that aH ∩ bH ≠ ∅. Then we can find an element x ∈ aH ∩ bH; so, x = ah_a and x = bh_b, for some h_a, h_b ∈ H. Then ah_a = bh_b, so a = bh_b h_a^{-1} ∈ bH. Therefore, (b) implies (c).


Now suppose that (c) holds; that is, a = bh for some h ∈ H. But then b = ah^{-1} ∈ aH so (c) implies (d).

Next, if (d) is true then b = ah for some h ∈ H. Therefore, a^{-1}b = h ∈ H, showing that (d) implies (e). In turn, (e) implies (f) because b^{-1}a = (a^{-1}b)^{-1} and H is closed under the taking of inverses.

Finally, suppose that (f) is true. Then h = b^{-1}a ∈ H, so a = bh and

    aH = { ah′ | h′ ∈ H } = { bhh′ | h′ ∈ H } = { bh″ | h″ ∈ H } = bH;

the second last equality following because hH = H by Lemma 11.2(a). Hence, (f) implies (a).

An important consequence of the Proposition is that if aH and bH are two cosets then either aH = bH or aH ∩ bH = ∅. The Proposition also gives a way of deciding when two cosets are equal; in particular, aH = H if and only if a ∈ H.

If A and B are subsets of a set X we write X = A ⊔ B if X = A ∪ B and A ∩ B = ∅.

11.4 Corollary Suppose that H is a subgroup of G. Then there exist elements { g_i | i ∈ I } in G such that

    G = ⨆_{i∈I} g_iH.

Proof As x = x · 1 ∈ xH we certainly have G = ⋃_{x∈G} xH. For each coset xH pick a representative g_i ∈ xH; then xH = g_iH and g_iH ∩ g_jH = ∅ if i ≠ j. Hence, G = ⨆_{i∈I} g_iH.

11.5 Definition If G = ⨆_{i∈I} g_iH we say that { g_i | i ∈ I } is a (complete) set of left coset representatives of H in G. Similarly, { g_i | i ∈ I } is a (complete) set of right coset representatives for H in G if G = ⨆_{i∈I} Hg_i.

These sets are often called left and right transversals, respectively.

Corollary 11.4 says that if H is a subgroup of G then we can always find a set of coset representatives. More than this, if we fix a set of left coset representatives { g_i | i ∈ I } for H in G then every element g ∈ G can be written uniquely in the form g = g_i h for some i ∈ I and h ∈ H.

Notice that if { g_i | i ∈ I } is a set of left coset representatives for H in G then { g_i^{-1} | i ∈ I } is a set of right coset representatives for H in G.

    Combining the last few results we obtain our first important structural theorem.

11.6 Theorem (Lagrange's Theorem) Suppose that G is a finite group and that H is a subgroup of G. Then |H| divides |G|, and |G|/|H| is equal to the number of cosets of H in G.

Proof As G is a finite group, we can find a finite set {g_1, …, g_k} of coset representatives for H in G; then G = ⨆_{i=1}^{k} g_iH. Therefore, by Lemma 11.2(b),

    |G| = Σ_{i=1}^{k} |g_iH| = Σ_{i=1}^{k} |H| = k|H|.

Hence, k = |G|/|H|.
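Lagrange's theorem is easy to watch in action. The Python lines below are our own illustration: they list the left cosets of the subgroup H = {0, 4, 8} of Z_12, written additively, and confirm that the cosets partition the group into |G|/|H| pieces.

```python
G = set(range(12))   # Z_12 under addition mod 12
H = {0, 4, 8}        # the subgroup generated by 4

cosets = {frozenset((a + h) % 12 for h in H) for a in G}
print(sorted(sorted(c) for c in cosets))
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
print(len(cosets) == len(G) // len(H))  # True: there are 4 = |G|/|H| cosets
```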


11.7 Definition The index [G : H] of H in G is the number of cosets of H in G. If G is finite then [G : H] = |G|/|H|.

11.8 Corollary Suppose that G is a finite group and that g ∈ G. Then the order of g divides the order of G.

Proof Let H = ⟨g⟩ = { g^n | n ∈ Z }. Now G is a finite group, so g must have finite order; say m = |g| < ∞. Then ⟨g⟩ = {1, g, …, g^{m-1}} and |⟨g⟩| = |g| = m by Proposition 10.6. Hence, the result follows by applying Lagrange's theorem to the subgroup ⟨g⟩.

11.9 Corollary Suppose that G is a group of prime order. Then G is cyclic.

Proof Suppose that |G| = p, where p is prime. Pick any element g ∈ G such that g ≠ 1. Then ⟨g⟩ is a nontrivial subgroup of G (that is, ⟨g⟩ ≠ {1}), so |⟨g⟩| divides p = |G|. Therefore, either |⟨g⟩| = 1 or |⟨g⟩| = p; however, |⟨g⟩| ≥ 2, so we must have |⟨g⟩| = p. Hence, G = ⟨g⟩ and G is cyclic.

    As every cyclic group is abelian, we have also shown that the groups of prime order are

    abelian.

    12. Equivalence relations

The key point about cosets is that even though a coset is a set, all of its properties are determined by any one of its coset representatives: if aH is a coset then aH = xH, for any x ∈ aH, and it does not really matter which representative x we choose.

    It will be useful to formalise (or abstract) what is going on here.

If X is a set then, formally, a relation ∼ on X is a subset R of X × X; we write x ∼ y if (x, y) ∈ R. Informally, you should think of relationships in the usual sense of the word (for example, as a deep and meaningful connection between various elements of X).

12.1 Definition Let X be a set. An equivalence relation on X is a relation ∼ on X with the following properties:
a) (reflexive) if x ∈ X then x ∼ x;
b) (symmetric) if x, y ∈ X and x ∼ y then y ∼ x; and
c) (transitive) if x, y, z ∈ X, x ∼ y and y ∼ z then x ∼ z.

If ∼ is an equivalence relation on X and x ∈ X then x̄ = { y ∈ X | x ∼ y } ⊆ X is the equivalence class of x in X.

12.2 Examples a) If H is a subgroup of G define a relation ∼ on G by a ∼ b if aH = bH. Then ∼ is an equivalence relation and the equivalence class of a ∈ G is the coset aH.
b) Let X be the set (class) of all finite dimensional vector spaces, and let ∼ be the relation on X given by V ∼ W if V ≅ W. Then ≅ is an equivalence relation on X and the equivalence class of a vector space V consists of all those vector spaces isomorphic to V. Notice that the isomorphism class of V is completely determined by its dimension (this is by no means obvious; but this was proved in Math2902).
c) Suppose that G is a group. Define a relation ∼ on G by a ∼ b if a = g^{-1}bg for some g ∈ G; we say that a and b are conjugate in G. Then ∼ is an equivalence relation on G and the equivalence classes a^G = { g^{-1}ag | g ∈ G } are the conjugacy classes of G. Of these three examples, conjugacy is the one we currently understand the least about; for example, if g ∈ G how big is the conjugacy class of g?


12.3 Proposition Suppose that ∼ is an equivalence relation on X. Then X is a disjoint union of its equivalence classes.

Proof The proof is much the same as that of Corollary 11.4. As ∼ is reflexive, x ∼ x so X = ⋃_{x∈X} x̄. We just need to check that different equivalence classes do not overlap. Suppose that x̄ ∩ z̄ ≠ ∅. Then there exists some y ∈ x̄ ∩ z̄; so x ∼ y and z ∼ y. By symmetry, y ∼ z; so x ∼ z by transitivity. Now if w ∈ z̄ then z ∼ w, so x ∼ w by transitivity; hence z̄ ⊆ x̄. Applying symmetry once again, z ∼ x, so the same argument gives x̄ ⊆ z̄; hence x̄ = z̄. It follows that x̄ ∩ z̄ ≠ ∅ if and only if x̄ = z̄, which is what we needed to show.

Note that if ∼ is an equivalence relation on a set X then, in general, the equivalence classes in X will have different sizes (i.e. there can exist x, y ∈ X with |x̄| ≠ |ȳ|). In this sense the equivalence classes corresponding to the cosets of a subgroup H in a group G are special because |aH| = |H|, for all a ∈ G.
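Both the partition property and the possibility of unequal class sizes are visible in a toy computation. The Python sketch below is our own illustration; it partitions {0, …, 7} under the relation x ∼ y iff x ≡ ±y (mod 8), an equivalence relation chosen by us precisely because its classes have different sizes.

```python
def equivalence_classes(elements, related):
    # Sort elements into classes; this greedy grouping is valid because
    # 'related' is assumed to be an equivalence relation.
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

same = lambda x, y: (x - y) % 8 == 0 or (x + y) % 8 == 0
print(equivalence_classes(range(8), same))
# [[0], [1, 7], [2, 6], [3, 5], [4]] -- a disjoint union, with class sizes 1 and 2
```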

    13. Homomorphisms

When studying vector spaces it is natural to consider maps f : V → W which preserve the vector space structure; that is, linear maps. If V and W are inner product spaces, then we also want f to preserve inner products. When studying groups we consider those maps which preserve the group structure.

13.1 Definition Suppose that G and H are groups.
a) A group homomorphism from G to H is a function φ : G → H such that φ(ab) = φ(a)φ(b), for all a, b ∈ G.
b) A group isomorphism is a group homomorphism which is also a bijection.
c) Two groups G and H are isomorphic if there exists a group isomorphism from G to H; in this case we write G ≅ H.

There is a very important subtlety in this definition: on the left hand side, in φ(ab), we compute the product of a and b inside G; on the right hand side, with φ(a)φ(b) we compute the product of φ(a) and φ(b) inside H. Thus, a group homomorphism (or, more simply, a homomorphism) is a map φ : G → H which is compatible with the different operations in G and H.

    13.2 Examples The following maps are examples of group homomorphisms.

a) Let G be any group and let φ : G → G be the identity map (so φ(g) = g for all g ∈ G). Then φ is an isomorphism.
b) Let G = GL_n(F), for any field F, and define φ : GL_n(F) → F^× by φ(A) = det A, where F^× is the group of nonzero elements of F under multiplication. Then φ is a group homomorphism because φ(AB) = det(AB) = det(A)det(B) = φ(A)φ(B), for all A, B ∈ GL_n(F).
c) Let G = GL_n(C) and let φ : G → G be the map given by φ(A) = (A^{-1})^t, the transpose of the inverse of A. Then φ is a homomorphism since φ(AB) = ((AB)^{-1})^t = (B^{-1}A^{-1})^t = (A^{-1})^t(B^{-1})^t = φ(A)φ(B), for all A, B ∈ GL_n(C). Since φ is bijective, it is an isomorphism.
d) Let φ be the map from the braid group to the symmetric group which forgets over crossings and under crossings.
e) Let V and W be vector spaces, which we think of as additive (abelian) groups. Then any linear transformation T : V → W is a group homomorphism because T(x + y) = T(x) + T(y), for all x, y ∈ V.


f) Let n be a positive integer and let Z_n and C_n be the cyclic groups of order n; see (j) and (k) in Example 8.4. Write ω = e^{2πi/n}, so that C_n = ⟨ω⟩. Define φ : Z_n → C_n by φ(a mod n) = ω^a. Then φ is an isomorphism. To see this, first note that φ is well defined: if a ≡ a′ (mod n) then a′ = a + qn for some q ∈ Z, so ω^{a′} = ω^a ω^{qn} = ω^a since ω^n = 1. Next,

    φ((a + b) mod n) = ω^{a+b} = ω^a ω^b = φ(a mod n) φ(b mod n),

so φ is a group homomorphism; and φ is a bijection because 1, ω, …, ω^{n-1} are distinct. What is really happening here is that ω^{kn} = 1 for all k ∈ Z, so φ(a mod n) = ω^a unambiguously, for all a ∈ Z.

g) Let n be a positive integer and define φ : Z → C_n by φ(a) = ω^a. Then φ is a group homomorphism since φ(a + b) = ω^{a+b} = ω^a ω^b = φ(a)φ(b). Notice that φ is also surjective.

h) Suppose that m and n are positive integers with m dividing n. Let ω = e^{2πi/n} and ζ = e^{2πi/m}; then C_n = ⟨ω⟩ and C_m = ⟨ζ⟩. Let φ : C_n → C_m be the map φ(ω^a) = ζ^a. To see that φ is a (well defined) group homomorphism first observe that, for any k ∈ Z, φ(ω^{kn}) = ζ^{kn} = 1 since m divides n; so the value ζ^a does not depend on the choice of exponent a. Therefore, for any a, b ∈ Z,

    φ(ω^a ω^b) = φ(ω^{a+b}) = ζ^{a+b} = ζ^a ζ^b = φ(ω^a)φ(ω^b).

Hence, φ is a surjective group homomorphism.

13.3 Proposition Suppose that G and H are groups and that φ : G → H is a group homomorphism. Then φ(1_G) = 1_H and φ(g^{-1}) = φ(g)^{-1}, for all g ∈ G.

Proof The proof of this is basically the same as Proposition 9.3. We have

    φ(1_G) = φ(1_G · 1_G) = φ(1_G)φ(1_G).

Multiplying this equation on the left by φ(1_G)^{-1} shows that

    1_H = φ(1_G)^{-1}φ(1_G) = φ(1_G)^{-1}φ(1_G)φ(1_G) = 1_H φ(1_G) = φ(1_G).

For the second claim, by what we have just shown, φ(g)φ(g^{-1}) = φ(gg^{-1}) = φ(1_G) = 1_H and, similarly, φ(g^{-1})φ(g) = 1_H. Hence, φ(g^{-1}) = φ(g)^{-1} by Proposition 8.5(b).

13.4 Definition Suppose that φ : G → H is a group homomorphism.
a) The image of φ is im φ = { φ(g) | g ∈ G }, a subset of H.
b) The kernel of φ is ker φ = { g ∈ G | φ(g) = 1_H }, a subset of G.

If H is an additive group then the kernel of φ is the set of elements which are sent to zero by φ; this is how you should think of the kernel in general: it is the set of elements which are killed by φ.

13.5 Corollary Suppose that φ : G → H is a group homomorphism. Then im φ is a subgroup of H and ker φ is a subgroup of G.

Proof Both sets are nonempty because φ(1_G) = 1_H, so 1_H ∈ im φ and 1_G ∈ ker φ. If x, y ∈ im φ then x = φ(a) and y = φ(b) for some a, b ∈ G. Therefore, xy^{-1} = φ(a)φ(b)^{-1} = φ(a)φ(b^{-1}) = φ(ab^{-1}) ∈ im φ by the Proposition. Hence, im φ is a subgroup of H by the Subgroup Criterion (Theorem 9.6).

Similarly, if a, b ∈ ker φ then φ(ab^{-1}) = φ(a)φ(b^{-1}) = 1_H 1_H^{-1} = 1_H. Hence, ab^{-1} also belongs to the kernel; so ker φ is also a subgroup of G by the Subgroup Criterion.


    14. Normal subgroups

Corollary 13.5 is slightly misleading because it seems to say that the image and kernel of a (group) homomorphism are pretty much the same; in fact, the kernel of a homomorphism is a very special type of subgroup.

For any subgroup H of G we can always form the setwise product

    (aH)(bH) = { xy | x ∈ aH and y ∈ bH }

of two cosets of H. Since (aH)(bH) = ⋃_{h∈H} ahbH we can write (aH)(bH) as a (disjoint) union of H-cosets (by Corollary 11.4). For most subgroups H the setwise product will be a union of two or more cosets rather than just a single coset; however, for some subgroups, for example the kernel of a homomorphism, we obtain just a single coset.

Consider a group homomorphism φ : G → H and let K = ker φ = { g ∈ G | φ(g) = 1_H }.

Claim: If a ∈ G then aK = { g ∈ G | φ(g) = φ(a) } = Ka.

One direction is easy: if g ∈ aK then g = ak, for some k ∈ K, so φ(g) = φ(a)φ(k) = φ(a) since φ(k) = 1_H. Hence, aK ⊆ { g ∈ G | φ(g) = φ(a) }. Conversely, if φ(g) = φ(a) then

    1_H = φ(g)^{-1}φ(a) = φ(g^{-1})φ(a) = φ(g^{-1}a),

so g^{-1}a ∈ K = { x ∈ G | φ(x) = 1_H }. Therefore, aK = gK by Proposition 11.3.

To prove the claim for the right cosets of K we proceed in much the same way. If φ(g) = φ(a) then 1_H = φ(g)φ(a)^{-1} = φ(ga^{-1}), so ga^{-1} ∈ K; this implies that Kg = Ka by the right-handed version of Proposition 11.3(e). Hence, { g ∈ G | φ(g) = φ(a) } ⊆ Ka. The other inclusion is virtually identical to the previous case.

Now suppose that a, b ∈ G and that x ∈ aK = Ka and y ∈ bK = Kb. Then

    φ(xy) = φ(x)φ(y) = φ(a)φ(b) = φ(ab).

Therefore, (aK)(bK) = { xy | φ(x) = φ(a) and φ(y) = φ(b) } ⊆ { g ∈ G | φ(g) = φ(ab) } = abK. In fact, (aK)(bK) = abK because (aK)(bK) is a union of left cosets whereas the right hand side is just a single coset. A quicker way to see this is to note that

    (aK)(bK) = a(Kb)K = a(bK)K = abK^2 = abK,

the second equality following because bK = Kb by the claim.

As you are not used to working with cosets we should perhaps be a little more careful here.

    We have

    (aK)(bK) = { axby | x, y ∈ K } = { a(bx′)y | x′, y ∈ K } = abK^2 = abK,

where the second equality follows because xb = bx′ for some x′ ∈ K (since Kb = bK), and the last equality follows because K^2 = K. Notice that (aK)(bK) = ⋃_{x∈K} axbK is a union of left cosets of K; what is surprising is that we get a single coset.
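A toy example makes this concrete. The Python lines below are our own illustration: they take the reduction homomorphism from Z_12 to Z_4, written additively, compute its kernel K, and check setwise that the sum of two cosets of K is always a single coset.

```python
K = {g for g in range(12) if g % 4 == 0}   # kernel of Z_12 -> Z_4, g |-> g mod 4
print(sorted(K))                           # [0, 4, 8]

def coset(a):
    return {(a + k) % 12 for k in K}

print(all({(x + y) % 12 for x in coset(a) for y in coset(b)} == coset(a + b)
          for a in range(12) for b in range(12)))
# True: additively, (aK)(bK) is the single coset (a+b)K
```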

14.1 Definition Suppose that G is a group. A normal subgroup of G is any subgroup N such that gN = Ng for all g ∈ G. If N is a normal subgroup of G we write N ⊴ G.


    From the discussion above, the kernel of a group homomorphism is always normal.

It is important to note that N is normal if and only if gN = Ng as sets, for all g ∈ G. That is, we require that gN = { gn | n ∈ N } = { n′g | n′ ∈ N } = Ng. So if N is normal, g ∈ G and n ∈ N then gn = n′g for some n′ ∈ N; in general, n′ will be genuinely different from n and gn ≠ ng.

14.2 Proposition Suppose that N is a subgroup of G. Then the following are equivalent:
a) g^{-1}Ng = N for all g ∈ G;
b) N is a normal subgroup of G;
c) every left coset of N is also a right coset (i.e. if g ∈ G then gN = Ng′ for some g′ ∈ G);
d) the product of any two left cosets of N is a single left coset (i.e. if a, b ∈ G then (aN)(bN) = cN, for some c ∈ G);
e) (aN)(bN) = abN for all a, b ∈ G.

Proof If g^{-1}Ng = N for all g ∈ G then gN = Ng by multiplying on the left by g. Hence, (a) implies (b).

Next suppose that (b) is true. Then gN = Ng for all g ∈ G so (c) holds.

If (c) is true then, given b ∈ G, there exists a b′ ∈ G such that bN = Nb′. Since b lies in both bN = Nb′ and Nb, the right cosets Nb′ and Nb intersect, so in fact bN = Nb. Therefore, for any a ∈ G we have (aN)(bN) = a(Nb)N = a(bN)N = ab(NN) = abN. Hence, the product of two cosets is again a coset; so (c) implies (d).

Now assume that (d) is true: that is, (aN)(bN) = cN for some c ∈ G. Now, ab = (a · 1)(b · 1) ∈ (aN)(bN), so ab ∈ cN. Therefore, cN = abN by Proposition 11.3. Hence, (d) implies (e).

Finally, suppose that (e) is true. Then, in particular, (g^{-1}N)(gN) = g^{-1}gN = N. Therefore, if n ∈ N then g^{-1}ng = (g^{-1}n)(g · 1) ∈ (g^{-1}N)(gN) = N; so g^{-1}Ng ⊆ N. By the same argument (using g^{-1} in place of g), gNg^{-1} ⊆ N; therefore, N ⊆ g^{-1}Ng (multiply on the left by g^{-1} and on the right by g). Hence, g^{-1}Ng = N, for all g ∈ G; so (a) holds.

    We will meet some more characterizations of normal subgroups in the next few sections.

14.3 Examples a) Suppose that G is an abelian group. Then every subgroup of G is normal.
b) Consider GL_n(F), for some field F, and let SL_n(F) = { a ∈ GL_n(F) | det(a) = 1 }. It is straightforward to check that SL_n(F) is a subgroup of GL_n(F); this group is called the special linear group. Then SL_n(F) is a normal subgroup of GL_n(F). Looking at cosets this is by no means obvious; however, if g ∈ GL_n(F) and a ∈ SL_n(F) then

    det(gag^{-1}) = det(g)det(a)det(g^{-1}) = det(g) · 1 · det(g)^{-1} = 1;

so gag^{-1} ∈ SL_n(F).

c) Once again take D_8 = ⟨ r, s | r^8 = 1, s^2 = 1 and srs = r^{-1} ⟩ to be the dihedral group of order 16; see Example 10.7. Then the following subgroups of D_8 are normal:

    H_1 = ⟨1⟩      = {1}
    H_2 = ⟨r^4⟩    = {1, r^4}
    H_3 = ⟨r^2⟩    = {1, r^2, r^4, r^6}
    H_4 = ⟨r⟩      = {1, r, r^2, r^3, r^4, r^5, r^6, r^7}
    H_5 = ⟨s, r^4⟩ = {1, r^4, s, sr^4}
    H_6 = ⟨s, r^2⟩ = {1, r^2, r^4, r^6, s, sr^2, sr^4, sr^6}
    H_7 = ⟨s, r⟩   = G.


To check this, because r and s generate D_8, it is enough to note that r^{-1}H_ir ⊆ H_i and that sH_is ⊆ H_i, for each i. This follows easily once you remember that r^a s = sr^{-a} for all a.

Finally, a subgroup of D_8 which is not normal is H_8 = ⟨s⟩ = {1, s}. To see this notice that rH_8 = {r, rs} = {r, sr^7}; whereas, H_8r = {r, sr}. Similarly, if 0 < b < 8 then ⟨sr^b⟩ = {1, sr^b} is not normal: conjugating by s sends sr^b to s(sr^b)s = sr^{-b}, which equals sr^b only when b = 4, and even then conjugating by r sends sr^4 to r^{-1}(sr^4)r = sr^6 ∉ ⟨sr^4⟩.

d) Let G = Sym(3) = {1, (1, 2), (2, 3), (1, 3), (1, 2, 3), (1, 3, 2)}. Then H = ⟨(1, 2)⟩ is not normal in G, whereas K = ⟨(1, 2, 3)⟩ is a normal subgroup.

In general, only a small number of the subgroups of a (non-abelian) group will be normal.
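Both claims in (d) can be verified by brute force. The Python sketch below is our own illustration: it represents an element of Sym(3) as the tuple of images of 0, 1, 2 (our encoding, shifted down from the notes' labels 1, 2, 3), composes from left to right as in the notes, and tests whether g^{-1}Hg = H for every g.

```python
from itertools import permutations

def compose(p, q):
    # p then q, reading left to right; p[i] is the image of i.
    return tuple(q[p[i]] for i in range(3))

def inverse(p):
    inv = [0, 0, 0]
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = set(permutations(range(3)))

def is_normal(H):
    return all({compose(compose(inverse(g), h), g) for h in H} == H for g in G)

e = (0, 1, 2)
H = {e, (1, 0, 2)}              # <(1,2)>, generated by a transposition
K = {e, (1, 2, 0), (2, 0, 1)}   # <(1,2,3)>, the identity and the two 3-cycles
print(is_normal(H), is_normal(K))  # False True
```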

We close this section with another characterization of normal subgroups. First, recall from Example 12.2(c) that if a ∈ G then the conjugacy class of a is a^G = { g^{-1}ag | g ∈ G }.

14.4 Proposition Suppose that N is a subgroup of G. Then N is normal if and only if N is a union