Wednesday, January 22, 2020

English 1

If S is a geometrical shape, then a rigid motion of S
is a way of moving S in such a way that the distances
between the points of S are not changed—squeezing and
stretching are not allowed. A rigid motion is a symmetry
of S if, after it is completed, S looks the same as it
did before it moved. For example, if S is an equilateral
triangle, then rotating S through 120° about its center
is a symmetry; so is reflecting S about a line that passes
through one of the vertices of S and the midpoint of the
opposite side.
More formally, a symmetry of S is a function f from S
to itself such that the distance between any two points
x and y of S is the same as the distance between the
transformed points f(x) and f(y).
This idea can be hugely generalized: if S is any mathematical
structure, then a symmetry of S is a function
from S to itself that preserves its structure. If S is a
geometrical shape, then the mathematical structure that

should be preserved is the distance between any two of
its points. But there are many other mathematical structures
that a function may be asked to preserve, most
notably algebraic structures of the kind that will soon be
discussed. It is fruitful to draw an analogy with the geometrical
situation and regard any structure-preserving
function as a sort of symmetry.
Because of its extreme generality, symmetry is an all-pervasive
concept within mathematics; and wherever
symmetries appear, structures known as groups follow
close behind. To explain what these are and why
they appear, let us return to the example of an equilateral
triangle, which has, as it turns out, six possible
symmetries.
Why is this? Well, let f be a symmetry of an equilateral
triangle with vertices A, B, and C and suppose for convenience
that this triangle has sides of length 1. Then
f(A), f(B), and f(C) must be three points of the triangle
and the distances between these points must all
be 1. It follows that f(A), f(B), and f(C) are distinct
vertices of the triangle, since the furthest apart any two
points can be is 1 and this happens only when the two
points are distinct vertices. So f(A), f(B), and f(C) are
the vertices A, B, and C in some order. But the number of
possible orders of A, B, and C is 6. It is not hard to show
that, once we have chosen f(A), f(B), and f(C), the rest
of what f does is completely determined. (For example,
if X is the midpoint of A and C, then f(X) must be the
midpoint of f(A) and f(C) since there is no other point
at distance 1/2 from f(A) and f(C).)
Let us refer to these symmetries by writing down in
order what happens to the vertices A, B, and C. So, for
instance, the symmetry ACB is the one that leaves the
vertex A fixed and exchanges B and C, which is achieved
by reflecting the triangle in the line that joins A to the
midpoint of B and C. There are three reflections like this:
ACB, CBA, and BAC. There are also two rotations: BCA
and CAB. Finally, there is the “trivial” symmetry, ABC,
which leaves all points where they were originally. (The
“trivial” symmetry is useful in much the same way as
zero is useful for the algebra of integer addition.)
What makes these and other sets of symmetries into
groups is that any two symmetries can be composed,
meaning that one symmetry followed by another produces
a third (since if two operations both preserve a
structure then their combination clearly does too). For
example, if we follow the reflection BAC by the reflection
ACB, then we obtain the rotation CAB. To work this out,
one can either draw a picture or use the following kind
of reasoning: the first symmetry takes A to B and the second
takes B to C, so the combination takes A to C, and
similarly B goes to A, and C to B. Notice that the order
in which we perform the symmetries matters: if we had
started with the reflection ACB and then done the reflection
BAC, then we would have obtained the rotation BCA.
(If you try to see this by drawing a picture, it is important
to think of A, B, and C as labels that stay where they
are rather than moving with the triangle—they mark
positions that the vertices can occupy.)
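This bookkeeping is easy to mechanize. Here is a minimal Python sketch (an
illustration only), in which a symmetry is stored as the string of images of
A, B, and C, and compose, a name chosen for the sketch, applies the first
symmetry and then the second:

    # Each symmetry is written as the images of the vertices A, B, C in order,
    # e.g. "ACB" fixes A and swaps B with C.
    def compose(first, second):
        """Apply `first`, then `second`; both are strings over "ABC"."""
        labels = "ABC"
        return "".join(second[labels.index(first[i])] for i in range(3))

    print(compose("BAC", "ACB"))  # CAB: the rotation described above
    print(compose("ACB", "BAC"))  # BCA: reversing the order gives a different rotation
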
We can think of symmetries as “objects” in their own
right, and of composition as an algebraic operation, a bit
like addition or multiplication for numbers. The operation
has the following useful properties: it is associative,
the trivial symmetry is an identity element, and
every symmetry has an inverse. (See binary operations
[I.2 §2.4]. For example, the inverse of a reflection is itself,
since doing the same reflection twice leaves the triangle
where it started.) More generally, any set with a binary
operation that has these properties is called a group. It
is not part of the definition of a group that the binary
operation should be commutative, since, as we have just
seen, if one is composing two symmetries then it often
makes a difference which one goes first. However, if it is
commutative then the group is called Abelian, after the
Norwegian mathematician Niels Henrik Abel [VI.32]. The
number systems Z, Q, R, and C all form Abelian groups
with the operation of addition, or under addition, as one
usually says. If you remove zero from Q, R, and C, then
they form Abelian groups under multiplication, but Z
does not because of a lack of inverses: the reciprocal of
an integer is not usually an integer. Further examples of
groups will be given later in this section.
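The group properties can be checked by brute force for the six symmetries of
the triangle. The following Python sketch does so, reusing the same
composition rule as above (redefined so the snippet stands alone); the checks
simply run over all cases:

    from itertools import permutations

    def compose(first, second):
        labels = "ABC"
        return "".join(second[labels.index(first[i])] for i in range(3))

    symmetries = ["".join(p) for p in permutations("ABC")]  # all six orderings
    identity = "ABC"  # the trivial symmetry

    # Closure: composing any two of the six symmetries gives one of the six.
    assert all(compose(f, g) in symmetries for f in symmetries for g in symmetries)
    # Identity: composing with the trivial symmetry changes nothing.
    assert all(compose(s, identity) == s == compose(identity, s) for s in symmetries)
    # Inverses: every symmetry can be undone by some symmetry.
    assert all(any(compose(s, t) == identity for t in symmetries) for s in symmetries)
    # Associativity: bracketing does not matter.
    assert all(compose(compose(f, g), h) == compose(f, compose(g, h))
               for f in symmetries for g in symmetries for h in symmetries)
    print("the six symmetries form a group")
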
2.2 Fields
Although several number systems form groups, to
regard them merely as groups is to ignore a great deal of
their algebraic structure. In particular, whereas a group
has just one binary operation, the standard number
systems have two, namely addition and multiplication
(from which further ones, such as subtraction and division,
can be derived). The formal definition of a field is
quite long: it is a set with two binary operations and
there are several axioms that these operations must
satisfy. Fortunately, there is an easy way to remember
these axioms. You just write down all the basic properties
you can think of that are satisfied by addition and
multiplication in the number systems Q, R, and C.
These properties are as follows. Both addition and
multiplication are commutative and associative, and
both have identity elements (0 for addition and 1 for
multiplication). Every element x has an additive inverse
−x and a multiplicative inverse 1/x (except that 0 does
not have a multiplicative inverse). It is the existence of
these inverses that allows us to define subtraction and
division: x−y means x+(−y) and x/y means x·(1/y).
That covers all the properties that addition and multiplication
satisfy individually. However, a very general
rule when defining mathematical structures is that if a
definition splits into parts, then the definition as a whole
will not be interesting unless those parts interact. Here
our two parts are addition and multiplication, and the
properties mentioned so far do not relate them in any
way. But one final property, known as the distributive
law, does this, and thereby gives fields their special character.
This is the rule that tells us how to multiply out
brackets: x(y + z) = xy + xz for any three numbers x, y, and z.
Having listed these properties, one may then view the
whole situation abstractly by regarding the properties as
axioms and saying that a field is any set with two binary
operations that satisfy all those axioms. However, when
one works in a field, one usually thinks of the axioms not
as a list of statements but rather as a general license to
do all the algebraic manipulations that one can do when
talking about rational, real, and complex numbers.
Clearly, the more axioms one has, the harder it is to
find a mathematical structure that satisfies them, and
it is indeed the case that fields are harder to come by
than groups. For this reason, the best way to understand
fields is probably to concentrate on examples. In addition
to Q, R, and C, one other field stands out as fundamental,
namely Fₚ, which is the set of integers modulo
a prime p, with addition and multiplication also defined
modulo p (see modular arithmetic [III.60]).
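As a small illustrative sketch in Python (with p = 7 an arbitrary choice),
addition and multiplication modulo p take one line each, and multiplicative
inverses can be obtained from Fermat's little theorem, which says that
a^(p−1) = 1 modulo a prime p for any a not divisible by p:

    p = 7  # any prime will do; 7 is an arbitrary choice for the illustration

    def add(a, b):
        return (a + b) % p

    def mul(a, b):
        return (a * b) % p

    def inv(a):
        # Fermat's little theorem: a**(p-1) = 1 (mod p), so a**(p-2) acts as 1/a.
        return pow(a, p - 2, p)

    print(add(5, 4), mul(3, 5), inv(3))  # 2 1 5
    # Every nonzero element has a multiplicative inverse, so F_p is a field.
    assert all(mul(a, inv(a)) == 1 for a in range(1, p))
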
What makes fields interesting, however, is not so
much the existence of these basic examples as the fact
that there is an important process of extension that
allows one to build new fields out of old ones. The idea
is to start with a field F, find a polynomial P that has
no roots in F, and “adjoin” a new element to F with
the stipulation that it is a root of P. This produces an
extended field F′, which consists of everything that one
can produce from this root and from elements of F using
addition and multiplication.
We have already seen an important example of this
process: in the field R, the polynomial P(x) = x² + 1 has
no root, so we adjoined the element i and let C be the
field of all combinations of the form a + bi.
We can apply exactly the same process to the field F₃,
in which again the equation x² + 1 = 0 has no solution.
If we do so, then we obtain a new field, which, like
C, consists of all combinations of the form a + bi, but
now a and b belong to F₃. Since F₃ has three elements,
this new field has nine elements. Another example is the
field Q(√2), which consists of all numbers of the form
a + b√2, where now a and b are rational numbers. A
slightly more complicated example is Q(γ), where γ is
a root of the polynomial x³ − x − 1. A typical element
of this field has the form a + bγ + cγ², with a, b, and c
rational. If one is doing arithmetic in Q(γ), then whenever
γ³ appears, it can be replaced by γ + 1 (because
γ³ − γ − 1 = 0), just as i² can be replaced by −1 in
the complex numbers. For more on why field extensions
are interesting, see the discussion of automorphisms
in section 4.1.
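The nine-element field mentioned above can be sketched directly: in the
following illustrative Python fragment an element a + bi is stored as the
pair (a, b) with a and b taken modulo 3, and a brute-force search confirms
that every nonzero element has a multiplicative inverse:

    p = 3
    elements = [(a, b) for a in range(p) for b in range(p)]  # pairs (a, b) meaning a + bi

    def mul(x, y):
        a, b = x
        c, d = y
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i*i = -1
        return ((a * c - b * d) % p, (a * d + b * c) % p)

    one, zero = (1, 0), (0, 0)
    # Brute force: every nonzero element has a multiplicative inverse.
    assert all(any(mul(x, y) == one for y in elements)
               for x in elements if x != zero)
    print(len(elements), "elements")  # 9
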
A second very significant justification for introducing
fields is that they can be used to form vector spaces, and
it is to these that we now turn.
2.3 Vector Spaces
One of the most convenient ways to represent points in
a plane that stretches out to infinity in all directions is
to use Cartesian coordinates. One chooses an origin and
two directions X and Y, usually at right angles to each
other. Then the pair of numbers (a, b) stands for the
point you reach in the plane if you go a distance a in
direction X and a distance b in direction Y (where if a
is a negative number such as −2, this is interpreted as
going a distance +2 in the opposite direction to X, and
similarly for b).
Another way of saying the same thing is this. Let x
and y stand for the unit vectors in directions X and
Y, respectively, so their Cartesian coordinates are (1, 0)
and (0, 1). Then every point in the plane is a so-called
linear combination ax + by of the basis vectors x and
y. To interpret the expression ax + by, first rewrite it
as a(1, 0) + b(0, 1). Then a times the unit vector (1, 0)
is (a, 0) and b times the unit vector (0, 1) is (0, b) and
when you add (a, 0) and (0, b) coordinate by coordinate
you get the vector (a, b).
Here is another situation where linear combinations
appear. Suppose you are presented with the differential
equation (d²y/dx²) + y = 0, and happen to know (or
notice) that y = sin x and y = cos x are two possible
solutions. Then you can easily check that y = a sin x +
b cos x is a solution for any pair of numbers a and b.
That is, any linear combination of the existing solutions
sin x and cos x is another solution. It turns out that all
solutions are of this form, so we can regard sin x and
cos x as “basis vectors” for the “space” of solutions of
the differential equation.
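One can also see this numerically. The following Python sketch approximates
the second derivative by a central difference; the coefficients a and b, the
step size h, and the sample points are arbitrary choices made for the
illustration:

    import math

    a, b, h = 2.0, -3.0, 1e-3  # arbitrary coefficients and step size

    def y(x):
        return a * math.sin(x) + b * math.cos(x)

    def second_derivative(f, x):
        # central finite-difference approximation of f''(x)
        return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

    for x in (0.0, 0.7, 1.9, 3.2):
        residual = second_derivative(y, x) + y(x)
        print(f"x = {x}: y'' + y is approximately {residual:.2e}")  # all tiny
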
Linear combinations occur in many, many contexts
throughout mathematics. To give one more example,
an arbitrary polynomial of degree 3 has the form
ax³ + bx² + cx + d, which is a linear combination of the
four basic polynomials 1, x, x², and x³.
A vector space is a mathematical structure in which the
notion of linear combination makes sense. The objects
that belong to the vector space are usually called vectors,
unless we are talking about a specific example and
are thinking of them as concrete objects such as polynomials
or solutions of a differential equation. Slightly
more formally, a vector space is a set V such that, given
any two vectors v and w (that is, elements of V) and
any two real numbers a and b, we can form the linear
combination av + bw.
Notice that this linear combination involves objects of
two different kinds, the vectors v and w and the numbers
a and b. The latter are known as scalars. The operation
of forming linear combinations can be broken up
into two constituent parts: addition and scalar multiplication.
To form the combination av +bw, first multiply
the vectors v and w by the scalars a and b, obtaining the
vectors av and bw, and then add these resulting vectors
to obtain the full combination av + bw.
The definition of linear combination must obey certain
natural rules. Addition of vectors must be commutative
and associative, with an identity, the zero vector, and
inverses for each v (written −v). Scalar multiplication
must obey a sort of associative law, namely that a(bv)
and (ab)v are always equal. We also need two distributive
laws: (a + b)v = av + bv and a(v + w) = av + aw
for any scalars a and b and any vectors v and w.
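These rules are easy to verify on concrete examples. The following Python
sketch checks them for vectors in the plane, stored as pairs of numbers; the
helper names add and scale are chosen only for this illustration:

    def add(v, w):
        return (v[0] + w[0], v[1] + w[1])

    def scale(a, v):
        return (a * v[0], a * v[1])

    a, b = 2, -3
    v, w = (1, 4), (5, -2)

    assert scale(a, scale(b, v)) == scale(a * b, v)               # a(bv) = (ab)v
    assert add(scale(a, v), scale(b, v)) == scale(a + b, v)       # (a + b)v = av + bv
    assert add(scale(a, v), scale(a, w)) == scale(a, add(v, w))   # a(v + w) = av + aw
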
Another context in which linear combinations arise,
one that lies at the heart of the usefulness of vector
spaces, is the solution of simultaneous equations. Suppose
one is presented with the two equations 3x + 2y = 6
and x − y = 7. The usual way to solve such a pair of
equations is to try to eliminate either x or y by adding
an appropriate multiple of one of the equations to the
other: that is, by taking a certain linear combination
of the equations. In this case, we can eliminate y by
adding twice the second equation to the first, obtaining
the equation 5x = 20, which tells us that x = 4 and
hence that y = −3. Why were we allowed to combine
equations like this? Well, let us write L₁ and R₁ for the
left- and right-hand sides of the first equation, and similarly
L₂ and R₂ for the second. If, for some particular
choice of x and y, it is true that L₁ = R₁ and L₂ = R₂,
then clearly L₁ + 2L₂ = R₁ + 2R₂, as the two sides of this
equation are merely giving different names to the same
numbers.
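The same elimination can be carried out mechanically. In the following Python
sketch each equation ax + by = c is stored as the triple (a, b, c), and the
linear combination is formed coordinate by coordinate:

    # Each equation ax + by = c is stored as the triple (a, b, c).
    eq1 = (3, 2, 6)    # 3x + 2y = 6
    eq2 = (1, -1, 7)   # x - y = 7

    # Add twice the second equation to the first; the y terms cancel.
    combined = tuple(u + 2 * v for u, v in zip(eq1, eq2))  # (5, 0, 20), i.e. 5x = 20
    x = combined[2] / combined[0]   # x = 4
    y = x - eq2[2]                  # back-substitute into x - y = 7, so y = x - 7 = -3
    print(x, y)                     # 4.0 -3.0
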
Given a vector space V, a basis is a collection of vectors
v₁, v₂, . . . , vₙ with the following property: every vector
in V can be written in exactly one way as a linear combination
a₁v₁ + a₂v₂ + · · · + aₙvₙ. There are two ways in
which this can fail: there may be a vector that cannot be
written as a linear combination of v₁, v₂, . . . , vₙ or there
may be a vector that can be so expressed, but in more
than one way. If every vector is a linear combination then
we say that the vectors v₁, v₂, . . . , vₙ span V, and if no
vector is a linear combination in more than one way then
we say that they are independent. An equivalent definition
is that v₁, v₂, . . . , vₙ are independent if the only way
of writing the zero vector as a₁v₁ + a₂v₂ + · · · + aₙvₙ
is by taking a₁ = a₂ = · · · = aₙ = 0.
The number of elements in a basis is called the dimension
of V. It is not immediately obvious that there could
not be two bases of different sizes, but it turns out that
there cannot, so the concept of dimension makes sense.
For the plane, the vectors x and y defined earlier formed
a basis, so the plane, as one would hope, has dimension
2. If we were to take more than two vectors, then
they would no longer be independent: for example, if
we take the vectors (1, 2), (1, 3), and (3, 1), then we can
write (0, 0) as the linear combination 8(1, 2) − 5(1, 3) −
(3, 1). (To work this out one must solve some simultaneous
equations—this is typical of calculations in vector
spaces.)
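To find such coefficients one looks for c₁ and c₂ with c₁(1, 2) + c₂(1, 3) =
(3, 1), which is a pair of simultaneous equations. A minimal Python sketch of
the elimination (the names c1 and c2 are merely labels for the illustration):

    # We want c1*(1, 2) + c2*(1, 3) = (3, 1), i.e.
    #   c1 + c2 = 3   and   2*c1 + 3*c2 = 1.
    # Subtract twice the first equation from the second to eliminate c1.
    c2 = 1 - 2 * 3   # -5
    c1 = 3 - c2      # 8

    # Check that 8(1, 2) - 5(1, 3) - (3, 1) really is (0, 0).
    combo = (c1 * 1 + c2 * 1 - 3, c1 * 2 + c2 * 3 - 1)
    print(c1, c2, combo)  # 8 -5 (0, 0)
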
The most obvious n-dimensional vector space is the
space of all sequences (x₁, . . . , xₙ) of n real numbers.
To add this to a sequence (y₁, . . . , yₙ) one simply forms
the sequence (x₁ + y₁, . . . , xₙ + yₙ) and to multiply it
by a scalar c one forms the sequence (cx₁, . . . , cxₙ).
This vector space is denoted Rⁿ. Thus, the plane with
its usual coordinate system is R² and three-dimensional
space is R³.
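These coordinate-by-coordinate operations can be sketched in a few lines of
Python; the names add, scale, and linear_combination are chosen only for the
illustration:

    def add(x, y):
        return tuple(xi + yi for xi, yi in zip(x, y))

    def scale(c, x):
        return tuple(c * xi for xi in x)

    def linear_combination(a, v, b, w):
        return add(scale(a, v), scale(b, w))

    v = (1.0, 0.0, 2.0)   # a vector in R^3
    w = (0.0, 3.0, -1.0)
    print(linear_combination(2.0, v, -1.0, w))  # (2.0, -3.0, 5.0)
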
It is not in fact necessary for the number of vectors
in a basis to be finite. A vector space that does not have
a finite basis is called infinite dimensional. This is not
an exotic property: many of the most important vector
spaces, particularly spaces where the “vectors” are
functions, are infinite dimensional.
There is one final remark to make about scalars. They
were defined earlier as real numbers that one uses to
make linear combinations of vectors. But it turns out
that the calculations one does with scalars, in particular
solving simultaneous equations, can all be done in a
more general context. What matters is that they should
belong to a field, so Q, R, and C can all be used as systems
of scalars, as indeed can more general fields. If the
scalars for a vector space V come from a field F, then one
says that V is a vector space over F. This generalization
is important and useful: see, for example, algebraic
numbers [IV.3 §17].
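For instance, taking the scalars from the two-element field F₂, the integers
modulo the prime 2, gives vector spaces whose vectors are strings of bits. A
minimal sketch in Python:

    # Vectors over the two-element field F_2 = {0, 1}: coordinates are bits.
    def add(v, w):
        return tuple((vi + wi) % 2 for vi, wi in zip(v, w))

    def scale(c, v):   # the scalar c is 0 or 1
        return tuple((c * vi) % 2 for vi in v)

    v, w = (1, 0, 1, 1), (0, 1, 1, 0)
    print(add(v, w))   # (1, 1, 0, 1)
    print(add(v, v))   # (0, 0, 0, 0): over F_2 every vector is its own additive inverse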
