
Section 8.5 A Review of Vector Spaces

To further analyze extension fields, we must shift our perspective. Instead of considering an extension field simply as a field, we will consider it as a vector space. (This is not crazy: we did a similar thing when trying to understand the integers. We could consider them as a group, but when we consider them as a ring, we learn more.)

Here is the plan: we will briefly review the main ideas about vector spaces in general, in case it has been a while since you thought about the topic. Then we will leverage the machinery of linear algebra to discover new results about extension fields.

In a linear algebra class, you probably thought of vectors primarily as column vectors: lists of numbers such as \(\begin{bmatrix}3\\0\\1\end{bmatrix}\) or \(\begin{bmatrix}4\\-2\end{bmatrix}\text{.}\) These are examples of vectors in \(\R^3\) and \(\R^2\text{,}\) respectively, and \(\R^3\) and \(\R^2\) are indeed examples of vector spaces.

The more abstract approach is to consider a set together with some operations that must satisfy some axioms. Then, just like we saw with groups and rings, we can generalize what works for those generic vectors to other mathematical objects. For example, the set of all polynomials of degree no more than 2 looks very much like \(\R^3\) as a vector space.
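Concretely, the polynomial \(a + bx + cx^2\) corresponds to the column vector \(\begin{bmatrix}a\\b\\c\end{bmatrix}\) in \(\R^3\text{,}\) and under this correspondence adding two polynomials, or multiplying a polynomial by a real number, matches adding or scaling the corresponding vectors.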

Our goal is to view an extension field as a vector space, so let's say carefully what a vector space is using the abstract, axiomatic approach.

Subsection 8.5.1 Definitions and Examples

A vector space \(V\) over a field \(F\) is an abelian group with a scalar product \(\alpha \cdot v\) or \(\alpha v\) defined for all \(\alpha \in F\) and all \(v \in V\) satisfying the following axioms.

  • \(\alpha(\beta v) =(\alpha \beta)v\text{;}\)

  • \((\alpha + \beta)v =\alpha v + \beta v\text{;}\)

  • \(\alpha(u + v) = \alpha u + \alpha v\text{;}\)

  • \(1v=v\text{;}\)

where \(\alpha, \beta \in F\) and \(u, v \in V\text{.}\)

The elements of \(V\) are called vectors; the elements of \(F\) are called scalars. It is important to notice that in most cases two vectors cannot be multiplied. In general, it is only possible to multiply a vector with a scalar. To differentiate between the scalar zero and the vector zero, we will write them as 0 and \({\mathbf 0}\text{,}\) respectively.

Let us examine several examples of vector spaces. Some of them will be quite familiar; others will seem less so.

Example 8.20.

The \(n\)-tuples of real numbers, denoted by \({\mathbb R}^n\text{,}\) form a vector space over \({\mathbb R}\text{.}\) Given vectors \(u = (u_1, \ldots, u_n)\) and \(v = (v_1, \ldots, v_n)\) in \({\mathbb R}^n\) and \(\alpha\) in \({\mathbb R}\text{,}\) we can define vector addition by

\begin{equation*} u + v = (u_1, \ldots, u_n) + (v_1, \ldots, v_n) = (u_1 + v_1, \ldots, u_n + v_n) \end{equation*}

and scalar multiplication by

\begin{equation*} \alpha u = \alpha(u_1, \ldots, u_n)= (\alpha u_1, \ldots, \alpha u_n)\text{.} \end{equation*}
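For instance, taking \(u = (1, 2, 3)\text{,}\) \(v = (4, -1, 0)\text{,}\) and \(\alpha = 2\) in \({\mathbb R}^3\text{,}\) these operations give

\begin{equation*} u + v = (5, 1, 3) \qquad \text{and} \qquad \alpha u = (2, 4, 6)\text{.} \end{equation*}
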
Example 8.21.

If \(F\) is a field, then \(F[x]\) is a vector space over \(F\text{.}\) The vectors in \(F[x]\) are simply polynomials, and vector addition is just polynomial addition. If \(\alpha \in F\) and \(p(x) \in F[x]\text{,}\) then scalar multiplication is defined by \(\alpha p(x)\text{.}\)
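For example, with \(F = {\mathbb Q}\text{,}\) \(\alpha = 3\text{,}\) and \(p(x) = x^2 - 2x + 1\text{,}\) we have

\begin{equation*} \alpha p(x) = 3 x^2 - 6 x + 3\text{.} \end{equation*}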

Example 8.22.

The set of all continuous real-valued functions on a closed interval \([a,b]\) is a vector space over \({\mathbb R}\text{.}\) If \(f(x)\) and \(g(x)\) are continuous on \([a, b]\text{,}\) then \((f+g)(x)\) is defined to be \(f(x) + g(x)\text{.}\) Scalar multiplication is defined by \((\alpha f)(x) = \alpha f(x)\) for \(\alpha \in {\mathbb R}\text{.}\) For example, if \(f(x) = \sin x\) and \(g(x)= x^2\text{,}\) then \((2f + 5g)(x) =2 \sin x + 5 x^2\text{.}\)

Example 8.23.

Let \(V = {\mathbb Q}(\sqrt{2}\, ) = \{ a + b \sqrt{2} : a, b \in {\mathbb Q } \}\text{.}\) Then \(V\) is a vector space over \({\mathbb Q}\text{.}\) If \(u = a + b \sqrt{2}\) and \(v = c + d \sqrt{2}\text{,}\) then \(u + v = (a + c) + (b + d ) \sqrt{2}\) is again in \(V\text{.}\) Also, for \(\alpha \in {\mathbb Q}\text{,}\) \(\alpha v\) is in \(V\text{.}\) We will leave it as an exercise to verify that all of the vector space axioms hold for \(V\text{.}\)
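For instance, if \(u = 1 + 2 \sqrt{2}\) and \(v = 3 - \sqrt{2}\text{,}\) then

\begin{equation*} u + v = 4 + \sqrt{2} \qquad \text{and} \qquad \tfrac{1}{2} v = \tfrac{3}{2} - \tfrac{1}{2} \sqrt{2}\text{,} \end{equation*}

both of which are again elements of \(V\text{.}\)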

Just as we were able to prove some basic, fundamental facts about groups from the axioms, we can easily verify some facts for vector spaces.

Proposition 8.24.

Let \(V\) be a vector space over \(F\text{.}\) Then each of the following statements is true.

  1. \(0v = {\mathbf 0}\) for all \(v \in V\text{;}\)

  2. \(\alpha {\mathbf 0} = {\mathbf 0}\) for all \(\alpha \in F\text{;}\)

  3. if \(\alpha v = {\mathbf 0}\text{,}\) then either \(\alpha = 0\) or \(v = {\mathbf 0}\text{;}\)

  4. \((-1)v = -v\) for all \(v \in V\text{;}\)

  5. \(-(\alpha v) = (-\alpha) v = \alpha(-v)\) for all \(\alpha \in F\) and all \(v \in V\text{.}\)

Proof.

To prove (1), observe that

\begin{equation*} 0 v = (0 + 0)v = 0v + 0v; \end{equation*}

consequently, \({\mathbf 0} + 0 v = 0v + 0v\text{.}\) Since \(V\) is an abelian group, \({\mathbf 0} = 0v\text{.}\)

The proof of (2) is almost identical to the proof of (1). For (3), we are done if \(\alpha = 0\text{.}\) Suppose that \(\alpha \neq 0\text{.}\) Multiplying both sides of \(\alpha v = {\mathbf 0}\) by \(1/ \alpha\text{,}\) we have \(v = {\mathbf 0}\text{.}\)

To show (4), observe that

\begin{equation*} v + (-1)v = 1v + (-1)v = (1-1)v = 0v = {\mathbf 0}\text{,} \end{equation*}

and so \(-v = (-1)v\text{.}\) We will leave the proof of (5) as an exercise.

Subsection 8.5.2 Subspaces

Just as groups have subgroups and rings have subrings, vector spaces also have substructures. Let \(V\) be a vector space over a field \(F\text{,}\) and \(W\) a subset of \(V\text{.}\) Then \(W\) is a subspace of \(V\) if it is closed under vector addition and scalar multiplication; that is, if \(u, v \in W\) and \(\alpha \in F\text{,}\) it will always be the case that \(u + v\) and \(\alpha v\) are also in \(W\text{.}\)

Example 8.25.

Let \(W\) be the subspace of \({\mathbb R}^3\) defined by \(W = \{ (x_1, 2 x_1 + x_2, x_1 - x_2) : x_1, x_2 \in {\mathbb R} \}\text{.}\) We claim that \(W\) is a subspace of \({\mathbb R}^3\text{.}\) Since

\begin{align*} \alpha (x_1, 2 x_1 + x_2, x_1 - x_2) & = (\alpha x_1, \alpha(2 x_1 + x_2), \alpha( x_1 - x_2))\\ & = (\alpha x_1, 2(\alpha x_1) + \alpha x_2, \alpha x_1 -\alpha x_2)\text{,} \end{align*}

\(W\) is closed under scalar multiplication. To show that \(W\) is closed under vector addition, let \(u = (x_1, 2 x_1 + x_2, x_1 - x_2)\) and \(v = (y_1, 2 y_1 + y_2, y_1 - y_2)\) be vectors in \(W\text{.}\) Then

\begin{equation*} u + v = (x_1 + y_1, 2( x_1 + y_1) +( x_2 + y_2), (x_1 + y_1) - (x_2+ y_2))\text{,} \end{equation*}

which again has the required form. Hence \(W\) is a subspace of \({\mathbb R}^3\text{.}\)

Example 8.26.

Let \(W\) be the subset of polynomials of \(F[x]\) with no odd-power terms. If \(p(x)\) and \(q(x)\) have no odd-power terms, then neither will \(p(x) + q(x)\text{.}\) Also, \(\alpha p(x) \in W\) for \(\alpha \in F\) and \(p(x) \in W\text{.}\)
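For instance, with \(p(x) = x^4 + 2 x^2\) and \(q(x) = 3 x^2 + 1\text{,}\)

\begin{equation*} p(x) + q(x) = x^4 + 5 x^2 + 1 \qquad \text{and} \qquad \alpha p(x) = \alpha x^4 + 2 \alpha x^2 \end{equation*}

again have no odd-power terms, so \(W\) is a subspace of \(F[x]\text{.}\)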

For groups and rings, a natural way to get a subgroup or subring was to look at the set of all elements generated by one or more elements. For vector spaces, the set of elements (vectors) that are generated by a few particular elements is called a spanning set.

More generally, let \(V\) be any vector space over a field \(F\) and suppose that \(v_1, v_2, \ldots, v_n\) are vectors in \(V\) and \(\alpha_1, \alpha_2, \ldots, \alpha_n\) are scalars in \(F\text{.}\) Any vector \(w\) in \(V\) of the form

\begin{equation*} w = \sum_{i=1}^n \alpha_i v_i = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n \end{equation*}

is called a linear combination of the vectors \(v_1, v_2, \ldots, v_n\text{.}\) The spanning set of vectors \(v_1, v_2, \ldots, v_n\) is the set of vectors obtained from all possible linear combinations of \(v_1, v_2, \ldots, v_n\text{.}\) If \(W\) is the spanning set of \(v_1, v_2, \ldots, v_n\text{,}\) then we say that \(W\) is spanned by \(v_1, v_2, \ldots, v_n\text{.}\)
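For example, the vectors \(v_1 = (1, 0, 0)\) and \(v_2 = (0, 1, 0)\) in \({\mathbb R}^3\) span

\begin{equation*} \{ \alpha_1 (1, 0, 0) + \alpha_2 (0, 1, 0) : \alpha_1, \alpha_2 \in {\mathbb R} \} = \{ (\alpha_1, \alpha_2, 0) : \alpha_1, \alpha_2 \in {\mathbb R} \}\text{,} \end{equation*}

the \(xy\)-plane in \({\mathbb R}^3\text{.}\)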

Proposition 8.27.

Let \(v_1, v_2, \ldots, v_n\) be vectors in a vector space \(V\text{,}\) and let \(S\) be their spanning set. Then \(S\) is a subspace of \(V\text{.}\)

Proof.

Let \(u\) and \(v\) be in \(S\text{.}\) We can write both of these vectors as linear combinations of the \(v_i\)'s:

\begin{align*} u & = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n\\ v & = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n\text{.} \end{align*}

Then

\begin{equation*} u + v =( \alpha_1 + \beta_1) v_1 + (\alpha_2+ \beta_2) v_2 + \cdots + (\alpha_n + \beta_n) v_n \end{equation*}

is a linear combination of the \(v_i\)'s. For \(\alpha \in F\text{,}\)

\begin{equation*} \alpha u = (\alpha \alpha_1) v_1 + ( \alpha \alpha_2) v_2 + \cdots + (\alpha \alpha_n ) v_n \end{equation*}

is also in \(S\text{.}\) Hence \(S\) is closed under vector addition and scalar multiplication, so it is a subspace of \(V\text{.}\)

Subsection 8.5.3 Linear Independence

Let \(S = \{v_1, v_2, \ldots, v_n\}\) be a set of vectors in a vector space \(V\text{.}\) If there exist scalars \(\alpha_1, \alpha_2, \ldots, \alpha_n \in F\) such that not all of the \(\alpha_i\)'s are zero and

\begin{equation*} \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = {\mathbf 0 }\text{,} \end{equation*}

then \(S\) is said to be linearly dependent. If the set \(S\) is not linearly dependent, then it is said to be linearly independent. More specifically, \(S\) is a linearly independent set if

\begin{equation*} \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = {\mathbf 0 } \end{equation*}

implies that

\begin{equation*} \alpha_1 = \alpha_2 = \cdots = \alpha_n = 0 \end{equation*}

for any set of scalars \(\{ \alpha_1, \alpha_2, \ldots, \alpha_n \}\text{.}\)
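
For instance, in \({\mathbb R}^2\) the vectors \((1, 2)\) and \((2, 4)\) are linearly dependent, since

\begin{equation*} 2 (1, 2) + (-1)(2, 4) = {\mathbf 0}\text{,} \end{equation*}

while \((1, 2)\) and \((0, 1)\) are linearly independent: if \(\alpha_1 (1, 2) + \alpha_2 (0, 1) = {\mathbf 0}\text{,}\) then the first coordinate forces \(\alpha_1 = 0\) and the second then forces \(\alpha_2 = 0\text{.}\)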

Proposition 8.28.

Let \(\{ v_1, v_2, \ldots, v_n \}\) be a set of linearly independent vectors in a vector space. Suppose that \(v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n\text{.}\) Then \(\alpha_1 = \beta_1, \alpha_2 = \beta_2, \ldots, \alpha_n = \beta_n\text{.}\)

Proof.

If

\begin{equation*} v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n\text{,} \end{equation*}

then

\begin{equation*} (\alpha_1 - \beta_1) v_1 + (\alpha_2 - \beta_2) v_2 + \cdots + (\alpha_n - \beta_n) v_n = {\mathbf 0}\text{.} \end{equation*}

Since \(v_1, \ldots, v_n\) are linearly independent, \(\alpha_i - \beta_i = 0\text{,}\) that is, \(\alpha_i = \beta_i\text{,}\) for \(i = 1, \ldots, n\text{.}\)

The definition of linear dependence makes more sense if we consider the following proposition.

Proposition 8.29.

A set \(\{ v_1, v_2, \ldots, v_n \}\) of vectors in a vector space \(V\) is linearly dependent if and only if one of the \(v_i\)'s is a linear combination of the rest.

Proof.

Suppose that \(\{ v_1, v_2, \dots, v_n \}\) is a set of linearly dependent vectors. Then there exist scalars \(\alpha_1, \ldots, \alpha_n\) such that

\begin{equation*} \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = {\mathbf 0 }\text{,} \end{equation*}

with at least one of the \(\alpha_i\)'s not equal to zero. Suppose that \(\alpha_k \neq 0\text{.}\) Then

\begin{equation*} v_k = - \frac{\alpha_1}{\alpha_k} v_1 - \cdots - \frac{\alpha_{k - 1}}{\alpha_k} v_{k-1} - \frac{\alpha_{k + 1}}{\alpha_k} v_{k + 1} - \cdots - \frac{\alpha_n}{\alpha_k} v_n\text{.} \end{equation*}

Conversely, suppose that

\begin{equation*} v_k = \beta_1 v_1 + \cdots + \beta_{k - 1} v_{k - 1} + \beta_{k + 1} v_{k + 1} + \cdots + \beta_n v_n\text{.} \end{equation*}

Then

\begin{equation*} \beta_1 v_1 + \cdots + \beta_{k - 1} v_{k - 1} - v_k + \beta_{k + 1} v_{k + 1} + \cdots + \beta_n v_n = {\mathbf 0}\text{,} \end{equation*}

and the coefficient of \(v_k\) is \(-1 \neq 0\text{,}\) so the vectors are linearly dependent.

The following proposition is a consequence of the fact that any system of homogeneous linear equations with more unknowns than equations will have a nontrivial solution. We leave the details of the proof for the end-of-chapter exercises.

Proposition 8.30.

Suppose that a vector space \(V\) is spanned by \(n\) vectors. If \(m > n\text{,}\) then any set of \(m\) vectors in \(V\) must be linearly dependent.

A set \(\{ e_1, e_2, \ldots, e_n \}\) of vectors in a vector space \(V\) is called a basis for \(V\) if \(\{ e_1, e_2, \ldots, e_n \}\) is a linearly independent set that spans \(V\text{.}\)

Example 8.31.

The vectors \(e_1 = (1, 0, 0)\text{,}\) \(e_2 = (0, 1, 0)\text{,}\) and \(e_3 =(0, 0, 1)\) form a basis for \({\mathbb R}^3\text{.}\) The set certainly spans \({\mathbb R}^3\text{,}\) since any arbitrary vector \((x_1, x_2, x_3)\) in \({\mathbb R}^3\) can be written as \(x_1 e_1 + x_2 e_2 + x_3 e_3\text{.}\) Also, none of the vectors \(e_1, e_2, e_3\) can be written as a linear combination of the other two; hence, they are linearly independent. The vectors \(e_1, e_2, e_3\) are not the only basis of \({\mathbb R}^3\text{:}\) the set \(\{ (3, 2, 1), (3, 2, 0), (1, 1, 1) \}\) is also a basis for \({\mathbb R}^3\text{.}\)
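For example, to write \((1, 0, 0)\) in terms of the second basis we solve a small linear system and find

\begin{equation*} (1, 0, 0) = 2 (3, 2, 1) - (3, 2, 0) - 2 (1, 1, 1)\text{.} \end{equation*}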

Example 8.32.

Let \({\mathbb Q}( \sqrt{2}\, ) = \{ a + b \sqrt{2} : a, b \in {\mathbb Q} \}\text{.}\) The sets \(\{1, \sqrt{2}\, \}\) and \(\{1 + \sqrt{2}, 1 - \sqrt{2}\, \}\) are both bases of \({\mathbb Q}( \sqrt{2}\, )\text{.}\)
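For example, in terms of the second basis,

\begin{equation*} 1 = \tfrac{1}{2} (1 + \sqrt{2}\,) + \tfrac{1}{2} (1 - \sqrt{2}\,) \qquad \text{and} \qquad \sqrt{2} = \tfrac{1}{2} (1 + \sqrt{2}\,) - \tfrac{1}{2} (1 - \sqrt{2}\,)\text{.} \end{equation*}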

From the last two examples it should be clear that a given vector space has several bases. In fact, there are an infinite number of bases for both of these examples. In general, there is no unique basis for a vector space. However, every basis of \({\mathbb R}^3\) consists of exactly three vectors, and every basis of \({\mathbb Q}(\sqrt{2}\, )\) consists of exactly two vectors. This is a consequence of the next proposition.

Proposition 8.33.

Let \(\{ e_1, e_2, \ldots, e_m \}\) and \(\{ f_1, f_2, \ldots, f_n \}\) be two bases for a vector space \(V\text{.}\) Then \(m = n\text{.}\)

Proof.

Since \(\{ e_1, e_2, \ldots, e_m \}\) is a basis, it is a linearly independent set, and \(V\) is spanned by the \(n\) vectors \(f_1, f_2, \ldots, f_n\text{;}\) by Proposition 8.30, \(m \leq n\text{.}\) Similarly, \(\{ f_1, f_2, \ldots, f_n \}\) is a linearly independent set and \(V\) is spanned by the \(m\) vectors \(e_1, e_2, \ldots, e_m\text{,}\) so \(n \leq m\text{.}\) Consequently, \(m = n\text{.}\)

If \(\{ e_1, e_2, \ldots, e_n \}\) is a basis for a vector space \(V\text{,}\) then we say that the dimension of \(V\) is \(n\) and we write \(\dim V =n\text{.}\) We will leave the proof of the following theorem as an exercise.