
Section 7.1 Metric spaces

Note: 1.5 lectures

As mentioned in the introduction, the main idea in analysis is to take limits. In Chapter 2 we learned to take limits of sequences of real numbers. And in Chapter 3 we learned to take limits of functions as a real number approached some other real number.

We want to take limits in more complicated contexts. For example, we want to have sequences of points in 3-dimensional space. We wish to define continuous functions of several variables. We even want to define functions on spaces that are a little harder to describe, such as the surface of the earth. We still want to talk about limits there.

Finally, we have seen the limit of a sequence of functions in Chapter 6. We wish to unify all these notions so that we do not have to reprove theorems over and over again in each context. The concept of a metric space is an elementary yet powerful tool in analysis. And while it is not sufficient to describe every type of limit we find in modern analysis, it gets us very far indeed.

Definition 7.1.1.

Let \(X\) be a set, and let \(d \colon X \times X \to \R\) be a function such that for all \(x,y,z \in X\)

(i) \(d(x,y) \geq 0\text{.}\) (nonnegativity)
(ii) \(d(x,y) = 0\) if and only if \(x = y\text{.}\) (identity of indiscernibles)
(iii) \(d(x,y) = d(y,x)\text{.}\) (symmetry)
(iv) \(d(x,z) \leq d(x,y)+ d(y,z)\text{.}\) (triangle inequality)

The pair \((X,d)\) is called a metric space. The function \(d\) is called the metric or the distance function. Sometimes we write just \(X\) as the metric space instead of \((X,d)\) if the metric is clear from context.

The geometric idea is that \(d\) is the distance between two points. Items i–iii have an obvious geometric interpretation: Distance is always nonnegative, the only point that is distance 0 away from \(x\) is \(x\) itself, and the distance from \(x\) to \(y\) is the same as the distance from \(y\) to \(x\text{.}\) The triangle inequality iv has the interpretation given in Figure 7.1.

Figure 7.1. Diagram of the triangle inequality in metric spaces.

For the purposes of drawing, it is convenient to draw figures and diagrams in the plane with the metric being the euclidean distance. However, that is only one particular metric space. Just because a certain fact seems to be clear from drawing a picture does not mean it is true in every metric space. You might be getting sidetracked by intuition from euclidean geometry, whereas the concept of a metric space is a lot more general.

Let us give some examples of metric spaces.

Example 7.1.2.

The set of real numbers \(\R\) is a metric space with the metric

\begin{equation*} d(x,y) \coloneqq \abs{x-y} . \end{equation*}

Items i–iii of the definition are easy to verify. The triangle inequality iv follows immediately from the standard triangle inequality for real numbers:

\begin{equation*} d(x,z) = \abs{x-z} = \abs{x-y+y-z} \leq \abs{x-y}+\abs{y-z} = d(x,y)+ d(y,z) . \end{equation*}

This metric is the standard metric on \(\R\). If we talk about \(\R\) as a metric space without mentioning a specific metric, we mean this particular metric.
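The four properties of Definition 7.1.1 can be spot-checked by brute force on a finite sample of points. The following is a minimal sketch in Python; the helper name `is_metric_on` and the sample points are our own illustrative choices, not part of the text. Passing such a check on a sample is evidence, not a proof, while a failure is a genuine counterexample.

```python
from itertools import product

def is_metric_on(points, d, tol=1e-12):
    """Check the four metric properties for d on a finite list of points."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0:                        # (i) nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):         # (ii) distance 0 exactly when x = y
            return False
        if d(x, y) != d(y, x):                 # (iii) symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:  # (iv) triangle inequality
            return False
    return True

# Hypothetical sample of real numbers (all exactly representable as floats).
sample = [-2.0, -0.5, 0.0, 1.0, 3.25]

print(is_metric_on(sample, lambda x, y: abs(x - y)))    # True
print(is_metric_on(sample, lambda x, y: (x - y) ** 2))  # False
```

The second function, \((x-y)^2\text{,}\) satisfies i–iii but fails the triangle inequality, for instance with \(x=-2\text{,}\) \(y=0\text{,}\) \(z=3.25\text{.}\)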

Example 7.1.3.

We can also put a different metric on the set of real numbers. For example, take the set of real numbers \(\R\) together with the metric

\begin{equation*} d(x,y) \coloneqq \frac{\abs{x-y}}{\abs{x-y}+1} . \end{equation*}

Items i–iii are again easy to verify. The triangle inequality iv is a little bit more difficult. Note that \(d(x,y) = \varphi(\abs{x-y})\) where \(\varphi(t) = \frac{t}{t+1}\) and \(\varphi\) is an increasing function (positive derivative, see Figure 7.2). Hence

\begin{equation*} \begin{split} d(x,z) = \varphi(\abs{x-z}) & = \varphi(\abs{x-y+y-z}) \\ & \leq \varphi(\abs{x-y}+\abs{y-z}) \\ & = \frac{\abs{x-y}+\abs{y-z}}{\abs{x-y}+\abs{y-z}+1} \\ & = \frac{\abs{x-y}}{\abs{x-y}+\abs{y-z}+1} + \frac{\abs{y-z}}{\abs{x-y}+\abs{y-z}+1} \\ & \leq \frac{\abs{x-y}}{\abs{x-y}+1} + \frac{\abs{y-z}}{\abs{y-z}+1} = d(x,y)+ d(y,z) . \end{split} \end{equation*}

The function \(d\) is thus a metric, and gives an example of a nonstandard metric on \(\R\text{.}\) With this metric, \(d(x,y) < 1\) for all \(x,y \in \R\text{.}\) That is, every two points are less than 1 unit apart.
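As a sanity check of the computation above, we can test the triangle inequality and the bound \(d(x,y) < 1\) on randomly chosen triples. This is only a numerical sketch: the sampling range, the seed, and the small tolerance for floating-point roundoff are assumptions made for the illustration.

```python
import random

def phi(t):
    """The increasing function phi(t) = t / (t + 1) for t >= 0."""
    return t / (t + 1)

def d(x, y):
    """The nonstandard metric d(x, y) = |x - y| / (|x - y| + 1)."""
    return phi(abs(x - y))

random.seed(0)  # fixed seed so the run is reproducible
triples = [tuple(random.uniform(-1000, 1000) for _ in range(3))
           for _ in range(10000)]

# Triangle inequality, with a tiny tolerance for floating-point roundoff.
triangle_ok = all(d(x, z) <= d(x, y) + d(y, z) + 1e-12 for x, y, z in triples)
# Every sampled pair of points is less than 1 unit apart.
below_one = all(d(x, y) < 1 for x, y, _ in triples)

print(triangle_ok, below_one)  # True True
```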

Figure 7.2. Graph of \(\frac{t}{t+1}\) for positive \(t\) with an asymptote at 1.

An important metric space is the \(n\)-dimensional euclidean space \(\R^n = \R \times \R \times \cdots \times \R\text{.}\) We use the following notation for points: \(x =(x_1,x_2,\ldots,x_n) \in \R^n\text{.}\) We will not write \(\vec{x}\) nor \(\mathbf{x}\) for a point in \(\R^n\) as is common in multivariable calculus. We simply give it a name such as \(x\) and remember that \(x\) is an element of \(\R^n\text{.}\) We also write simply \(0 \in \R^n\) to mean the point \((0,0,\ldots,0)\text{.}\) Before making \(\R^n\) a metric space, we prove an important inequality, the so-called Cauchy–Schwarz inequality.

Lemma 7.1.4. Cauchy–Schwarz inequality.

Note: This inequality is sometimes called the Cauchy–Bunyakovsky–Schwarz inequality. Karl Hermann Amandus Schwarz (1843–1921) was a German mathematician and Viktor Yakovlevich Bunyakovsky (1804–1889) was a Ukrainian mathematician. The version stated here should really be called the Cauchy inequality, as Bunyakovsky and Schwarz provided proofs for infinite-dimensional versions.

Suppose \(x =(x_1,x_2,\ldots,x_n) \in \R^n\text{,}\) \(y =(y_1,y_2,\ldots,y_n) \in \R^n\text{.}\) Then

\begin{equation*} {\biggl( \sum_{k=1}^n x_k y_k \biggr)}^2 \leq \biggl(\sum_{k=1}^n x_k^2 \biggr) \biggl(\sum_{k=1}^n y_k^2 \biggr) . \end{equation*}

Proof.

A square of a real number is nonnegative. Hence a sum of squares is nonnegative:

\begin{equation*} \begin{split} 0 & \leq \sum_{k=1}^n \sum_{\ell=1}^n {(x_k y_\ell - x_\ell y_k)}^2 \\ & = \sum_{k=1}^n \sum_{\ell=1}^n \bigl( x_k^2 y_\ell^2 + x_\ell^2 y_k^2 - 2 x_k x_\ell y_k y_\ell \bigr) \\ & = \biggl( \sum_{k=1}^n x_k^2 \biggr) \biggl( \sum_{\ell=1}^n y_\ell^2 \biggr) + \biggl( \sum_{k=1}^n y_k^2 \biggr) \biggl( \sum_{\ell=1}^n x_\ell^2 \biggr) - 2 \biggl( \sum_{k=1}^n x_k y_k \biggr) \biggl( \sum_{\ell=1}^n x_\ell y_\ell \biggr) . \end{split} \end{equation*}

We relabel and divide by 2 to obtain precisely what we wanted,

\begin{equation*} 0 \leq \biggl( \sum_{k=1}^n x_k^2 \biggr) \biggl( \sum_{k=1}^n y_k^2 \biggr) - {\biggl( \sum_{k=1}^n x_k y_k \biggr)}^2 . \qedhere \end{equation*}
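The identity behind the proof can be checked exactly on a small example with integer entries, where no roundoff occurs. The vectors below are arbitrary illustrative choices; the sketch verifies that the double sum of squares equals twice the difference of the two sides of the inequality.

```python
# Arbitrary integer vectors in R^4 (illustrative choices, not from the text).
x = [1, -2, 3, 4]
y = [2, 0, -1, 5]
n = len(x)

# The nonnegative double sum the proof starts from.
double_sum = sum((x[k] * y[l] - x[l] * y[k]) ** 2
                 for k in range(n) for l in range(n))

sum_x2 = sum(a * a for a in x)           # sum of x_k^2
sum_y2 = sum(b * b for b in y)           # sum of y_k^2
dot = sum(a * b for a, b in zip(x, y))   # sum of x_k y_k

# Expanding the double sum gives 2 * (sum_x2 * sum_y2 - dot^2).
print(double_sum, 2 * (sum_x2 * sum_y2 - dot ** 2))  # 1078 1078
# Hence the Cauchy-Schwarz inequality holds for this pair.
print(dot ** 2, "<=", sum_x2 * sum_y2)               # 361 <= 900
```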

Example 7.1.5.

Let us construct the standard metric for \(\R^n\text{.}\) Define

\begin{equation*} d(x,y) \coloneqq \sqrt{ {(x_1-y_1)}^2 + {(x_2-y_2)}^2 + \cdots + {(x_n-y_n)}^2 } = \sqrt{ \sum_{k=1}^n {(x_k-y_k)}^2 } . \end{equation*}

For \(n=1\text{,}\) the real line, this metric agrees with what we defined above. For \(n > 1\text{,}\) the only tricky part of the definition to check, as before, is the triangle inequality. It is less messy to work with the square of the metric. In the following estimate, note the use of the Cauchy–Schwarz inequality.

\begin{equation*} \begin{split} {\bigl(d(x,z)\bigr)}^2 & = \sum_{k=1}^n {(x_k-z_k)}^2 \\ & = \sum_{k=1}^n {(x_k-y_k+y_k-z_k)}^2 \\ & = \sum_{k=1}^n \Bigl( {(x_k-y_k)}^2+{(y_k-z_k)}^2 + 2(x_k-y_k)(y_k-z_k) \Bigr) \\ & = \sum_{k=1}^n {(x_k-y_k)}^2 + \sum_{k=1}^n {(y_k-z_k)}^2 + 2 \sum_{k=1}^n (x_k-y_k)(y_k-z_k) \\ & \leq \sum_{k=1}^n {(x_k-y_k)}^2 + \sum_{k=1}^n {(y_k-z_k)}^2 + 2 \sqrt{ \sum_{k=1}^n {(x_k-y_k)}^2 \sum_{k=1}^n {(y_k-z_k)}^2 } \\ & = {\left( \sqrt{ \sum_{k=1}^n {(x_k-y_k)}^2 } + \sqrt{ \sum_{k=1}^n {(y_k-z_k)}^2 } \right)}^2 = {\bigl( d(x,y) + d(y,z) \bigr)}^2 . \end{split} \end{equation*}

Because the square root is an increasing function, the inequality is preserved when we take the square root of both sides, and we obtain the triangle inequality.
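For a concrete feel, here is a short sketch computing the standard metric on \(\R^3\) for three sample points. The function name `euclid` and the points are our own choices for illustration. (Python 3.8 and later also provide `math.dist`, which computes the same quantity.)

```python
import math

def euclid(x, y):
    """Standard metric on R^n: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Three sample points in R^3 (illustrative choices).
x, y, z = (0, 0, 0), (1, 2, 2), (4, 6, 2)

print(euclid(x, y))  # 3.0
print(euclid(y, z))  # 5.0
print(euclid(x, z))  # sqrt(56), about 7.48
# Triangle inequality for this triple: d(x,z) <= d(x,y) + d(y,z).
print(euclid(x, z) <= euclid(x, y) + euclid(y, z))  # True
```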

Example 7.1.6.

The set of complex numbers \(\C\) is the set of numbers \(z = x+iy\text{,}\) where \(x\) and \(y\) are in \(\R\text{.}\) By imposing \(i^2 = -1\text{,}\) we make \(\C\) into a field. For the purposes of taking limits, the set \(\C\) is regarded as the metric space \(\R^2\text{,}\) where \(z=x+iy \in \C\) corresponds to \((x,y) \in \R^2\text{.}\) For \(z=x+iy\) define the complex modulus by \(\sabs{z} \coloneqq \sqrt{x^2+y^2}\text{.}\) Then for two complex numbers \(z_1 = x_1 + iy_1\) and \(z_2 = x_2 + iy_2\text{,}\) the distance is

\begin{equation*} d(z_1,z_2) = \sqrt{{(x_1-x_2)}^2+ {(y_1-y_2)}^2} = \sabs{z_1-z_2}. \end{equation*}

Furthermore, when working with complex numbers it is often convenient to write the metric in terms of the so-called complex conjugate: The conjugate of \(z=x+iy\) is \(\bar{z} \coloneqq x-iy\text{.}\) Then \({\sabs{z}}^2 = x^2 +y^2 = z\bar{z}\text{,}\) and so \({\sabs{z_1-z_2}}^2 = (z_1-z_2)\overline{(z_1-z_2)}\text{.}\)
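Python's built-in complex type lets us try these formulas directly: `abs` is the complex modulus and `conjugate()` is the complex conjugate. The two numbers below are illustrative choices.

```python
import math

# Two sample complex numbers (illustrative choices).
z1 = 4 + 6j
z2 = 1 + 2j
w = z1 - z2  # 3 + 4j

# The distance d(z1, z2) is the modulus of the difference.
print(abs(w))  # 5.0

# The same number, computed as the euclidean distance in R^2.
print(math.hypot(z1.real - z2.real, z1.imag - z2.imag))  # 5.0

# |w|^2 equals w times its conjugate (a complex number with zero imaginary part).
print(w * w.conjugate())  # (25+0j)
```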

Example 7.1.7.

An example to keep in mind is the so-called discrete metric. For any set \(X\text{,}\) define

\begin{equation*} d(x,y) \coloneqq \begin{cases} 1 & \text{if } x \not= y, \\ 0 & \text{if } x = y. \end{cases} \end{equation*}

That is, all points are equally distant from each other. When \(X\) is a finite set, we can draw a diagram, see for example Figure 7.3. Of course, in the diagram the distances are not the normal euclidean distances in the plane. Things become subtle when \(X\) is an infinite set such as the real numbers.

Figure 7.3. Sample discrete metric space \(\{ a,b,c,d,e \}\text{,}\) the distance between any two distinct points is \(1\text{.}\)

While this particular example may seldom come up in practice, it gives a useful “smell test.” If you make a statement about metric spaces, try it with the discrete metric. To show that \((X,d)\) is indeed a metric space is left as an exercise.
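Here is a small sketch of such a smell test on the five-point space of Figure 7.3. The claim being tested is our own illustrative example, not one from the text: "for any two distinct points \(x\) and \(y\text{,}\) some point \(z\) is strictly closer to both than they are to each other." It holds in \(\R\) with the standard metric (take the midpoint), but the discrete metric shows that it is not a fact about metric spaces in general.

```python
from itertools import product

def discrete(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

X = ["a", "b", "c", "d", "e"]

# All distances between distinct points are equal to 1.
print({discrete(x, y) for x, y in product(X, repeat=2) if x != y})  # {1}

def has_point_between(points, d, x, y):
    """Is there a z strictly closer to both x and y than d(x, y)?"""
    return any(d(x, z) < d(x, y) and d(z, y) < d(x, y) for z in points)

# The claim fails for every pair of distinct points in the discrete metric.
print(any(has_point_between(X, discrete, x, y)
          for x, y in product(X, repeat=2) if x != y))  # False
```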

Example 7.1.8.

Let \(C\bigl([a,b],\R\bigr)\) be the set of continuous real-valued functions on the interval \([a,b]\text{.}\) Define the metric on \(C\bigl([a,b],\R\bigr)\) as

\begin{equation*} d(f,g) \coloneqq \sup_{x \in [a,b]} \abs{f(x)-g(x)} . \end{equation*}

Let us check the properties. First, \(d(f,g)\) is finite as \(\abs{f(x)-g(x)}\) is a continuous function on a closed bounded interval \([a,b]\text{,}\) and so is bounded. It is clear that \(d(f,g) \geq 0\text{,}\) as it is the supremum of nonnegative numbers. If \(f = g\text{,}\) then \(\abs{f(x)-g(x)} = 0\) for all \(x\text{,}\) and hence \(d(f,g) = 0\text{.}\) Conversely, if \(d(f,g) = 0\text{,}\) then for every \(x\text{,}\) we have \(\abs{f(x)-g(x)} \leq d(f,g) = 0\text{,}\) and hence \(f(x) = g(x)\) for all \(x\text{,}\) and so \(f=g\text{.}\) That \(d(f,g) = d(g,f)\) is equally trivial. To show the triangle inequality, we take a third function \(h \in C\bigl([a,b],\R\bigr)\) and use the standard triangle inequality:

\begin{equation*} \begin{split} d(f,g) & = \sup_{x \in [a,b]} \abs{f(x)-g(x)} = \sup_{x \in [a,b]} \abs{f(x)-h(x)+h(x)-g(x)} \\ & \leq \sup_{x \in [a,b]} \bigl( \abs{f(x)-h(x)}+\abs{h(x)-g(x)} \bigr) \\ & \leq \sup_{x \in [a,b]} \abs{f(x)-h(x)}+ \sup_{x \in [a,b]} \abs{h(x)-g(x)} = d(f,h) + d(h,g) . \end{split} \end{equation*}

When treating \(C\bigl([a,b],\R\bigr)\) as a metric space without mentioning a metric, we mean this particular metric. Notice that \(d(f,g) = \norm{f-g}_{[a,b]}\text{,}\) the uniform norm of Definition 6.1.9.
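Numerically, the supremum can be approximated by a maximum over a finite grid. The sketch below does this for \(f(x) = x\) and \(g(x) = x^2\) on \([0,1]\text{;}\) the helper name `sup_distance` and the grid size are assumptions made for the illustration. A grid maximum is in general only a lower bound for the true supremum. Here it is exact, since the maximum of \(\abs{x-x^2}\) occurs at the grid point \(x = \nicefrac{1}{2}\text{.}\)

```python
def sup_distance(f, g, a, b, samples=1001):
    """Approximate sup |f(x) - g(x)| over [a, b] by a maximum over a grid."""
    grid = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in grid)

f = lambda x: x
g = lambda x: x * x
h = lambda x: 0.0  # the zero function, used as the intermediate point

d_fg = sup_distance(f, g, 0.0, 1.0)
d_fh = sup_distance(f, h, 0.0, 1.0)
d_hg = sup_distance(h, g, 0.0, 1.0)

print(d_fg)                 # 0.25
print(d_fh, d_hg)           # 1.0 1.0
print(d_fg <= d_fh + d_hg)  # True, as the triangle inequality predicts
```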

This example may seem esoteric at first, but it turns out that working with spaces such as \(C\bigl([a,b],\R\bigr)\) is really the meat of a large part of modern analysis. Treating sets of functions as metric spaces allows us to abstract away a lot of the grubby detail and prove powerful results such as Picard’s theorem with less work.

Example 7.1.9.

Another useful example of a metric space is the sphere with a metric usually called the great circle distance. Let \(S^2\) be the unit sphere in \(\R^3\text{,}\) that is \(S^2 \coloneqq \{ x \in \R^3 : x_1^2+x_2^2+x_3^2 = 1 \}\text{.}\) Take \(x\) and \(y\) in \(S^2\text{,}\) draw a ray from the origin through \(x\text{,}\) and another ray from the origin through \(y\text{,}\) and let \(\theta \in [0,\pi]\) be the angle between the two rays. Then define \(d(x,y) \coloneqq \theta\text{.}\) See Figure 7.4. The law of cosines from vector calculus says \(d(x,y) = \arccos(x_1y_1+x_2y_2+x_3y_3)\text{.}\) It is relatively easy to see that this function satisfies the first three properties of a metric. The triangle inequality is harder to prove, and requires a bit more trigonometry and linear algebra than we wish to indulge in right now, so let us leave it without proof.

Figure 7.4. The great circle distance on the unit sphere.

This distance is the shortest distance between points on a sphere if we are allowed to travel on the sphere only. It is easy to generalize to arbitrary diameters. If we take a sphere of radius \(r\text{,}\) we let the distance be \(d(x,y) \coloneqq r \theta\text{.}\) As an example, this is the standard distance you would use if you compute a distance on the surface of the earth, such as computing the distance a plane travels from London to Los Angeles.
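The formula \(d(x,y) = r \arccos(x_1y_1+x_2y_2+x_3y_3)\) is easy to compute. In the minimal sketch below, the function name `great_circle` and the sample points are our own choices, and \(x\) and \(y\) are assumed to be unit vectors giving the directions of the two points. The dot product is clamped to \([-1,1]\) because floating-point roundoff can push it slightly outside the domain of `math.acos`.

```python
import math

def great_circle(x, y, r=1.0):
    """Great circle distance r * theta between unit vectors x and y in R^3."""
    dot = sum(a * b for a, b in zip(x, y))
    dot = max(-1.0, min(1.0, dot))  # guard against roundoff outside [-1, 1]
    return r * math.acos(dot)

north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)
equator = (1.0, 0.0, 0.0)

print(great_circle(north, equator))         # a quarter circle, about pi/2
print(great_circle(north, south))           # half a circle, about pi
print(great_circle(north, equator, r=2.0))  # same angle on a sphere of radius 2
```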

Oftentimes it is useful to consider a subset of a larger metric space as a metric space itself. We obtain the following proposition, which has a trivial proof.

Proposition 7.1.10.

Let \((X,d)\) be a metric space and \(Y \subset X\text{.}\) Then the restriction \(d|_{Y \times Y}\) is a metric on \(Y\text{.}\)

Definition 7.1.11.

If \((X,d)\) is a metric space, \(Y \subset X\text{,}\) and \(d' \coloneqq d|_{Y \times Y}\text{,}\) then \((Y,d')\) is said to be a subspace of \((X,d)\text{.}\)

It is common to simply write \(d\) for the metric on \(Y\text{,}\) as it is the restriction of the metric on \(X\text{.}\) Sometimes we say \(d'\) is the subspace metric and \(Y\) has the subspace topology.

A subset of the real numbers is bounded whenever all its elements are at most some fixed distance from 0. When dealing with an arbitrary metric space there may not be some natural fixed point 0, but for the purposes of boundedness it does not matter.

Definition 7.1.12.

Let \((X,d)\) be a metric space. A subset \(S \subset X\) is said to be bounded if there exists a \(p \in X\) and a \(B \in \R\) such that

\begin{equation*} d(p,x) \leq B \quad \text{for all } x \in S. \end{equation*}

We say \((X,d)\) is bounded if \(X\) itself is a bounded subset.

For example, the set of real numbers with the standard metric is not a bounded metric space. It is not hard to see that a subset of the real numbers is bounded in the sense of Chapter 1 if and only if it is bounded as a subset of the metric space of real numbers with the standard metric.

On the other hand, if we take the real numbers with the discrete metric, then we obtain a bounded metric space. In fact, any set with the discrete metric is bounded.

There are other equivalent ways to generalize boundedness, and their proofs are left as exercises. Suppose \(X\) is nonempty to avoid a technicality. Then \(S \subset X\) being bounded is equivalent to either of the following:

For every \(p \in X\text{,}\) there exists a \(B > 0\) such that \(d(p,x) \leq B\) for all \(x \in S\text{.}\)
\(\operatorname{diam}(S) \coloneqq \sup \bigl\{ d(x,y) : x,y \in S \bigr\} < \infty\text{.}\)

The quantity \(\operatorname{diam}(S)\) is called the diameter of a set and is usually only defined for a nonempty set.
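To see how the bound \(B\) depends on the choice of \(p\) while the diameter does not, consider a small finite subset of \(\R\) with the standard metric. Every finite set is bounded, so the sketch below only illustrates the quantities involved; the set \(S\text{,}\) the points \(p\text{,}\) and the helper name `diam` are illustrative assumptions.

```python
from itertools import product

def diam(S, d):
    """Diameter of a nonempty finite set S: the largest distance between points."""
    return max(d(x, y) for x, y in product(S, repeat=2))

std = lambda x, y: abs(x - y)  # standard metric on the real numbers
S = [-1.0, 0.5, 3.0]

print(diam(S, std))  # 4.0

# For each base point p, the smallest B with d(p, x) <= B for all x in S.
for p in (0.0, 10.0):
    B = max(std(p, x) for x in S)
    print(p, B, diam(S, std) <= 2 * B)
# 0.0 3.0 True
# 10.0 11.0 True
```

The last column reflects the triangle inequality: \(d(x,y) \leq d(x,p) + d(p,y) \leq 2B\) for all \(x,y \in S\text{.}\)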

Exercises

7.1.1.

Show that for every set \(X\text{,}\) the discrete metric (\(d(x,y) = 1\) if \(x\not=y\) and \(d(x,x) = 0\)) does give a metric space \((X,d)\text{.}\)

7.1.2.

Let \(X \coloneqq \{ 0 \}\) be a set. Can you make it into a metric space?

7.1.3.

Let \(X \coloneqq \{ a, b \}\) be a set. Can you make it into two distinct metric spaces? (Define two distinct metrics on it.)

7.1.4.

Let the set \(X \coloneqq \{ A, B, C \}\) represent 3 buildings on campus. Suppose we wish our distance to be the time it takes to walk from one building to the other. It takes 5 minutes either way between buildings \(A\) and \(B\text{.}\) However, building \(C\) is on a hill and it takes 10 minutes from \(A\) and 15 minutes from \(B\) to get to \(C\text{.}\) On the other hand it takes 5 minutes to go from \(C\) to \(A\) and 7 minutes to go from \(C\) to \(B\text{,}\) as we are going downhill. Do these distances define a metric? If so, prove it, if not, say why not.

7.1.5.

Suppose \((X,d)\) is a metric space and \(\varphi \colon [0,\infty) \to \R\) is an increasing function such that \(\varphi(t) \geq 0\) for all \(t\) and \(\varphi(t) = 0\) if and only if \(t=0\text{.}\) Also suppose \(\varphi\) is subadditive, that is, \(\varphi(s+t) \leq \varphi(s)+\varphi(t)\text{.}\) Show that with \(d'(x,y) \coloneqq \varphi\bigl(d(x,y)\bigr)\text{,}\) we obtain a new metric space \((X,d')\text{.}\)

7.1.6.

Let \((X,d_X)\) and \((Y,d_Y)\) be metric spaces.

Show that \((X \times Y,d)\) with \(d\bigl( (x_1,y_1), (x_2,y_2) \bigr) \coloneqq d_X(x_1,x_2) + d_Y(y_1,y_2)\) is a metric space.
Show that \((X \times Y,d)\) with \(d\bigl( (x_1,y_1), (x_2,y_2) \bigr) \coloneqq \max \bigl\{ d_X(x_1,x_2) , d_Y(y_1,y_2) \bigr\}\) is a metric space.

7.1.7.

Let \(X\) be the set of continuous functions on \([0,1]\text{.}\) Let \(\varphi \colon [0,1] \to (0,\infty)\) be continuous. Define

\begin{equation*} d(f,g) \coloneqq \int_0^1 \abs{f(x)-g(x)}\varphi(x)\,dx . \end{equation*}

Show that \((X,d)\) is a metric space.

7.1.8.

Let \((X,d)\) be a metric space. For nonempty bounded subsets \(A\) and \(B\) let

\begin{equation*} d(x,B) \coloneqq \inf \bigl\{ d(x,b) : b \in B \bigr\} \qquad \text{and} \qquad d(A,B) \coloneqq \sup \bigl\{ d(a,B) : a \in A \bigr\} . \end{equation*}

Now define the Hausdorff metric as

\begin{equation*} d_H(A,B) \coloneqq \max \bigl\{ d(A,B) , d(B,A) \bigr\} . \end{equation*}

Note: \(d_H\) can be defined for arbitrary nonempty subsets if we allow the extended reals.

Let \(Y \subset \sP(X)\) be the set of bounded nonempty subsets. Prove that \((Y,d_H)\) is a so-called pseudometric space: \(d_H\) satisfies the metric properties i, iii, iv, and further \(d_H(A,A) = 0\) for all \(A \in Y\text{.}\)
Show by example that \(d\) itself is not symmetric, that is, find \(A\) and \(B\) such that \(d(A,B) \not= d(B,A)\text{.}\)
Find a metric space \(X\) and two different nonempty bounded subsets \(A\) and \(B\) such that \(d_H(A,B) = 0\text{.}\)

7.1.9.

Let \((X,d)\) be a nonempty metric space and \(S \subset X\) a subset. Prove:

\(S\) is bounded if and only if for every \(p \in X\text{,}\) there exists a \(B > 0\) such that \(d(p,x) \leq B\) for all \(x \in S\text{.}\)
A nonempty \(S\) is bounded if and only if \(\operatorname{diam}(S) \coloneqq \sup \{ d(x,y) : x,y \in S \} < \infty\text{.}\)

7.1.10.

Working in \(\R\text{,}\) compute \(\operatorname{diam}\bigl([a,b]\bigr)\text{.}\)
Working in \(\R^n\text{,}\) for every \(r > 0\text{,}\) let \(B_r \coloneqq \{ x \in \R^n : x_1^2+x_2^2+\cdots+x_n^2 < r^2 \}\text{.}\) Compute \(\operatorname{diam}(B_r)\text{.}\)
Suppose \((X,d)\) is a metric space with at least two points, \(d\) is the discrete metric, and \(p \in X\text{.}\) Compute \(\operatorname{diam}(\{ p \})\) and \(\operatorname{diam}(X)\text{,}\) then conclude that \((X,d)\) is bounded.

7.1.11.

Find a metric \(d\) on \(\N\) such that \(\N\) is an unbounded set in \((\N,d)\text{.}\)
Find a metric \(d\) on \(\N\) such that \(\N\) is a bounded set in \((\N,d)\text{.}\)
Find a metric \(d\) on \(\N\) such that for every \(n \in \N\) and every \(\epsilon > 0\text{,}\) there exists an \(m \in \N\) such that \(d(n,m) < \epsilon\text{.}\)

7.1.12.

Let \(C^1\bigl([a,b],\R\bigr)\) be the set of once continuously differentiable functions on \([a,b]\text{.}\) Define

\begin{equation*} d(f,g) \coloneqq \snorm{f-g}_{[a,b]} + \snorm{f'-g'}_{[a,b]}, \end{equation*}

where \(\snorm{\cdot}_{[a,b]}\) is the uniform norm. Prove that \(d\) is a metric.

7.1.13.

Consider \(\ell^2\text{,}\) the set of sequences \(\{ x_n \}_{n=1}^\infty\) of real numbers such that \(\sum_{n=1}^\infty x_n^2 < \infty\text{.}\)

Prove the Cauchy–Schwarz inequality for two sequences \(\{x_n \}_{n=1}^\infty\) and \(\{ y_n \}_{n=1}^\infty\) in \(\ell^2\text{:}\) Prove that \(\sum_{n=1}^\infty x_n y_n\) converges (absolutely) and

\begin{equation*} {\biggl( \sum_{n=1}^\infty x_n y_n \biggr)}^2 \leq \biggl(\sum_{n=1}^\infty x_n^2 \biggr) \biggl(\sum_{n=1}^\infty y_n^2 \biggr) . \end{equation*}
Prove that \(\ell^2\) is a metric space with the metric \(d(x,y) \coloneqq \sqrt{\sum_{n=1}^\infty {(x_n-y_n)}^2}\text{.}\) Hint: Don’t forget to show that the series for \(d(x,y)\) always converges to some finite number.
