Chapter 1 – Exercises

Exercise 1: Key Concepts of Vector Spaces

a. The Scalar Product

(i) Compute the scalar product of v and w when…

    1. v = (2, 3)' and w = (-2, 9)'
    2. v = (3, 2, \ln(8), -6)' and w = (\ln(2), 1, -1, 1/4)' (Hint: a\ln(b) = \ln(b^a))
    3. v = (4, 19)' and w = (2, 3, 5)'

1.


The scalar product of v = (v_1,v_2)'\in\mathbb R^2 and w = (w_1,w_2)'\in\mathbb R^2 is defined as v\cdot w = v_1w_1 + v_2w_2. Here, we have

    \[ v\cdot w = 2\cdot (-2) + 3 \cdot 9 = -4 + 27 = 23. \]

 

2.


The scalar product of v and w is

    \[ v\cdot w = 3 \cdot \ln(2) + 2 \cdot 1 + \ln(8) \cdot (-1) + (-6)\cdot 1/4 = \ln(2^3) - \ln(8) + 2 - 3/2 = 1/2 \]
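Both computations can be cross-checked numerically. A minimal sketch in plain Python (the helper `dot` and its length guard are my own illustration; the guard also mirrors why part 3 below fails):

```python
import math

def dot(v, w):
    # Scalar product of two equal-length real vectors.
    if len(v) != len(w):
        raise ValueError("vectors must come from the same R^n")
    return sum(vi * wi for vi, wi in zip(v, w))

print(dot((2, 3), (-2, 9)))                                      # 23
print(dot((3, 2, math.log(8), -6), (math.log(2), 1, -1, 0.25)))  # approximately 0.5
```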

 

3.


The scalar product is defined only for vectors of the same length, i.e. on the Cartesian product of a set X\subseteq\mathbb R^n with itself. Here, v\in\mathbb R^2 while w\in\mathbb R^3, so the scalar product of v and w does not exist.

 

(ii) An important property of real number multiplication is distributivity over addition: for any a,b,c\in\mathbb R, it holds that a(b+c) = ab + ac. Does this hold for the scalar product as well? Justify your answer with a formal argument.


The property is true for the scalar product. Consider arbitrary vectors v,w,u\in X, where X\subseteq\mathbb R^n is some set of real vectors. Then,

    \[\begin{split} v\cdot [w+u] &= (v_1,v_2,\ldots,v_n)' \cdot [(w_1,w_2,\ldots,w_n)' + (u_1,u_2,\ldots,u_n)'] \\&= (v_1,v_2,\ldots,v_n)'\cdot (w_1 + u_1, w_2 + u_2, \ldots, w_n + u_n)' \\&= \sum_{i=1}^n v_i(w_i + u_i) = \sum_{i=1}^n v_iw_i + \sum_{i=1}^n v_iu_i \\&= v\cdot w + v\cdot u \end{split}\]
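The distributivity derivation can be spot-checked numerically; a small sketch with random vectors (helper names are my own):

```python
import random

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

random.seed(0)  # reproducible draws
for _ in range(100):
    n = random.randint(1, 6)
    v, w, u = ([random.uniform(-5, 5) for _ in range(n)] for _ in range(3))
    # v . (w + u) should equal v . w + v . u up to rounding error
    lhs = dot(v, [wi + ui for wi, ui in zip(w, u)])
    assert abs(lhs - (dot(v, w) + dot(v, u))) < 1e-9
```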

 

b. Metrics

(i) Verify that the so-called French Railway Metric on \mathbb R^2,

    \[d_{FR}(x,y)= \begin{cases} \|x-y\|_2 &\text{ if }x = \lambda y \text{ for a }\lambda\in\mathbb R,\\ \|x\|_2 + \|y\|_2&\text{else,} \end{cases}\]

is indeed a metric, i.e. show that it satisfies the three properties that define a metric function. (Hint: use case distinctions for whether or not \exists \lambda\in\mathbb R: x = \lambda y is true; the regular and inverse triangle inequality of the norm may be helpful at several instances.)


The three properties that we need to establish are: (a) non-negativity, (b) symmetry, and (c) the triangle inequality.
(a) Non-negativity. This property has two parts: first, the function must always be weakly positive, and second, it must be equal to zero if and only if the arguments coincide – formally, (a.i) \forall x,y\in \mathbb R^2: d_{FR}(x,y)\geq 0 and (a.ii) d_{FR}(x,y) = 0 \Leftrightarrow x=y. (a.i) is straightforward from non-negativity of the Euclidean norm. For (a.ii), the direction x=y \Rightarrow d_{FR}(x,y) = 0 is immediate: with \lambda = 1, x = \lambda y, so d_{FR}(x,y) = \|x-y\|_2 = 0. Conversely, if d_{FR}(x,y) = 0, then either \|x-y\|_2 = 0 and \exists \lambda\in\mathbb R: x = \lambda y, or \|x\|_2 + \|y\|_2 = 0 and \nexists \lambda\in\mathbb R: x = \lambda y. The latter case can be disregarded: if \|x\|_2 + \|y\|_2 = 0, then by non-negativity of the norm, x = y = \mathbf 0, and \forall \lambda\in\mathbb R: x = \lambda y, so that this case never occurs. In the former case, \|x-y\|_2 = 0, and by non-negativity of the norm, x-y = \mathbf 0, i.e. x=y. Hence, d_{FR} satisfies non-negativity.

(b) Symmetry. If \exists \lambda\in\mathbb R: x = \lambda y, then d_{FR}(y,x) = \|y-x\|_2 as well (for \lambda\neq 0, y = \lambda^{-1}x; for \lambda = 0, x = \mathbf 0, and both cases of the definition give \|y\|_2), and by absolute homogeneity of the norm,

    \[d_{FR}(x,y) = \|x-y\|_2 = \|(-1)(y-x)\|_2 = |-1|\cdot\|y-x\|_2 = \|y-x\|_2 = d_{FR}(y,x).\]

Otherwise,

    \[ d_{FR}(x,y) = \|x\|_2 + \|y\|_2 = \|y\|_2 + \|x\|_2 = d_{FR}(y,x) \]

Hence, \forall x,y\in\mathbb R^2: d_{FR}(x,y) = d_{FR}(y,x), i.e. d_{FR} is symmetric.

(c) Triangle inequality. Recall that intuitively, if \exists \lambda\in\mathbb R: x=\lambda y, the path between x and y is “direct” and never longer than in the other case. This also holds formally:

    \[ \|x-y\|_2 = \|x + (-1)y\|_2 \leq \|x\|_2 + \|(-1)y\|_2 = \|x\|_2 + |-1|\cdot \|y\|_2 = \|x\|_2 + \|y\|_2. \]

where the inequality follows from the triangle inequality and the subsequent equality from absolute homogeneity of the Euclidean norm. Hence, it holds that

(1)   \begin{equation*} \|x-y\|_2 \leq d_{FR}(x,y) \end{equation*}

Let x,y,z\in\mathbb R^2. If \exists\lambda\in\mathbb R: x = \lambda z, then, using the triangle inequality and absolute homogeneity of the Euclidean norm,

    \[ d_{FR}(x,z) = \|x-z\|_2 = \|x-y + (-1) (z-y)\|_2 \leq \|x-y\|_2 + |-1|\cdot\|y-z\|_2 = \|x-y\|_2 + \|y-z\|_2 \]

Hence, by equation (1),

    \[ d_{FR}(x,z) \leq \|x-y\|_2 + \|y-z\|_2 \leq d_{FR}(x,y) + d_{FR}(y,z). \]

If instead, \nexists \lambda\in\mathbb R: x=\lambda z, note first that x\neq\mathbf 0 (otherwise, \lambda = 0 would satisfy x = \lambda z). Consider first the degenerate case z = \mathbf 0: here, d_{FR}(x,z) = \|x\|_2 and d_{FR}(y,z) = \|y\|_2 (both cases of the definition give this value when z = \mathbf 0), so that by the triangle inequality of the norm and equation (1), d_{FR}(x,z) = \|x\|_2 \leq \|x-y\|_2 + \|y\|_2 \leq d_{FR}(x,y) + d_{FR}(y,z). Now let z\neq\mathbf 0. Then, either \nexists \mu^x\in\mathbb R: x = \mu^x y or \nexists \mu^z\in\mathbb R: z = \mu^z y, because otherwise, \mu^z\neq 0 (as z\neq\mathbf 0), x = \frac{\mu^x}{\mu^z} z, and there exists \lambda = \frac{\mu^x}{\mu^z} for which x = \lambda z. Without loss of generality, assume that there exists no \mu^x\in\mathbb R: x = \mu^x y (otherwise, re-label x and z). Then,

    \[ d_{FR}(x,z) = \|x\|_2 + \|z\|_2 = \|x\|_2 + \|y\|_2 - \|y\|_2 + \|z\|_2 = d_{FR}(x,y) + \|z\|_2 - \|y\|_2. \]

With the inverse triangle inequality and, in the final step, equation (1),

    \[ d_{FR}(x,z) \leq d_{FR}(x,y) + |\|z\|_2 - \|y\|_2| \leq d_{FR}(x,y) + \|z-y\|_2 \leq d_{FR}(x,y) + d_{FR}(y,z). \]

Hence, \forall x,y,z\in\mathbb R^2: d_{FR}(x,z) \leq d_{FR}(x,y) + d_{FR}(y,z), i.e. d_{FR} satisfies the triangle inequality.
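The three properties can also be probed numerically. The sketch below implements d_{FR} on \mathbb R^2 (the collinearity test via the cross product and the tolerance handling are my own illustrative choices) and checks non-negativity, symmetry, and the triangle inequality on random points:

```python
import math
import random

def norm2(x):
    return math.hypot(x[0], x[1])

def d_fr(x, y, tol=1e-12):
    # French Railway Metric on R^2 (illustrative sketch).
    # x = lam * y holds iff x and y are collinear (zero cross product);
    # in the degenerate case y = 0, it requires x = 0 as well.
    if norm2(y) <= tol:
        collinear = norm2(x) <= tol
    else:
        collinear = abs(x[0] * y[1] - x[1] * y[0]) <= tol
    if collinear:
        return norm2((x[0] - y[0], x[1] - y[1]))
    return norm2(x) + norm2(y)

random.seed(1)
for _ in range(1000):
    x, y, z = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(3)]
    assert d_fr(x, y) >= 0                                   # non-negativity
    assert abs(d_fr(x, y) - d_fr(y, x)) < 1e-9               # symmetry
    assert d_fr(x, z) <= d_fr(x, y) + d_fr(y, z) + 1e-9      # triangle inequality
```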

 

(ii) Consider a strictly monotonically increasing function t:[0,\infty)\mapsto\mathbb R, i.e. t satisfies \forall x,y\in [0,\infty):(x>y\Rightarrow t(x)>t(y)), for which t(0) = 0. Further, consider a metric d:X\times X\mapsto\mathbb R that measures the distance of vectors in X\subseteq\mathbb R^n. Can you find a sufficient condition that imposes one additional property on t such that the monotonic transformation of d induced by t, i.e. \tilde d = t\circ d for which \tilde d(x,y) = t(d(x,y)), is guaranteed to be a metric?


When using a monotonic transformation, symmetry is guaranteed to be preserved: for any x,y\in X,

    \[ \tilde d(x,y) = t(d(x,y)) = t(d(y,x)) = \tilde d(y,x) \]

by symmetry of d.
Non-negativity is not problematic either: for any x,y\in X with x\neq y, d(x,y)>0 and thus, by strict monotonicity, \tilde d(x,y) = t(d(x,y)) > t(0) = 0, and

    \[ \tilde d(x,y) = 0 \Leftrightarrow t(d(x,y)) = 0 \Leftrightarrow d(x,y) = 0 \Leftrightarrow x=y. \]

Lastly, for the triangle inequality, it is always true that

    \[ \tilde d(x,z) = t(d(x,z)) \leq t(d(x,y) + d(y,z)) \]

by monotonicity of t and the triangle inequality of d. However, to guarantee that this expression is weakly smaller than \tilde d(x,y) + \tilde d(y,z) = t(d(x,y)) + t(d(y,z)), an additional assumption must be imposed: if, for the range of d, R_d = d[X\times X], it holds that \forall a,b\in R_d: t(a + b) \leq t(a) + t(b), then the triangle inequality is guaranteed. Intuitively, this subadditivity condition means that t satisfies a triangle inequality itself, which is the case when t does not increase “too fast” with increasing arguments: for instance, we must have t(2) = t(1+1)\leq t(1) + t(1) = 2t(1), i.e. doubling the argument at most doubles the value.

In conclusion, the sufficient condition we were looking for is

    \[ \forall a,b\in d[X\times X]: t(a + b) \leq t(a) + t(b) \]

 

(iii) Using the result of (ii), is f:\mathbb R^n\times \mathbb R^n\mapsto\mathbb R, (x,y)\mapsto \ln(\|x - y\|_2 + 1) a metric on (\mathbb R^n,d_E), where d_E is the metric induced by the Euclidean norm? (Hint: \ln(ab) = \ln(a)+\ln(b) for all a,b > 0.)


The setup corresponds precisely to the scenario we investigated in (ii): f(x,y) = \ln(d_E(x,y) + 1) for the metric d_E induced by the Euclidean norm. Hence, the function t we consider is t:[0,\infty)\mapsto\mathbb R, x\mapsto t(x) = \ln(x+1). The natural logarithm function is strictly monotonically increasing on [1,\infty), and hence t is strictly monotonically increasing on [0,\infty). Further, t(0) = \ln(0+1) = 0. The range of d_E is [0,\infty) (the distance of two vectors in \mathbb R^n can get infinitely large; \|x-\mathbf 0\|_2 \to \infty as x extends towards \infty along any of its dimensions). By the result of (ii), we know that f is a metric on (\mathbb R^n,d_E) if

    \[\forall a,b\in [0,\infty): t(a + b) \leq t(a) + t(b).\]

For arbitrary a,b\in[0,\infty),

    \[\begin{split} t(a) + t(b) &= \ln(a+1) + \ln(b+1) \\&= \ln((a+1)\cdot(b+1)) \\& = \ln (ab + a + b + 1) \\&\geq \ln(a + b + 1) \\& = t(a + b) \end{split}\]

where the second equality follows from the logarithm rule given in the hint, and the fourth follows because ab\geq 0. Hence, t satisfies the sufficient condition found in (ii), and f is indeed a metric on (\mathbb R^n,d_E).
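The subadditivity of t(x) = \ln(x+1) can also be spot-checked numerically; a minimal sketch (the sampling range is an arbitrary choice of mine):

```python
import math
import random

def t(x):
    # the transformation from the exercise: t(x) = ln(x + 1)
    return math.log(x + 1.0)

random.seed(2)
for _ in range(1000):
    a, b = random.uniform(0.0, 50.0), random.uniform(0.0, 50.0)
    # subadditivity: t(a + b) <= t(a) + t(b)
    assert t(a + b) <= t(a) + t(b) + 1e-12
```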

c. Norms

(i) The course discusses the most common examples of norm functions as used in practice. However, there are of course a variety of other functions that are norms. One such example is the following function on \mathbb R^3:

    \[ n:\mathbb R^3\mapsto\mathbb R, x=(x_1,x_2,x_3)'\mapsto |x_1| + \max\{2|x_2|, 3|x_3|\}. \]

Verify that this function constitutes a norm according to our definition by establishing the three norm properties for n.


The three properties that we need to establish are: (a) non-negativity, (b) the triangle inequality, and (c) absolute homogeneity.
(a) Non-negativity. This property has two parts: weak positivity and equality to zero only at the origin. Weak positivity is trivial given that the absolute value is weakly positive. For the latter,

    \[ n(x) = 0 \Leftrightarrow |x_1| + \max\{2|x_2|, 3|x_3|\} = 0 \Leftrightarrow (|x_1| = 0 \land \max\{2|x_2|, 3|x_3|\} = 0) \]

which is in turn equivalent to x_1 = x_2 = x_3 = 0, i.e. x=\mathbf 0. Hence, n satisfies non-negativity.
(b) Triangle inequality. Note first that for any x_2,x_3,y_2,y_3, by the triangle inequality of the absolute value,

    \[\begin{split} \max\{2|x_2 + y_2|, 3|x_3 + y_3|\} &\leq \max\{2(|x_2| + |y_2|), 3(|x_3| + |y_3|)\} \\&\leq \max\left \{2|x_2| + \max\{2|y_2|, 3|y_3|\}, 3|x_3| + \max\{2|y_2|, 3|y_3|\}\right \} \\& = \max\{2|x_2|, 3|x_3|\}+ \max\{2|y_2|, 3|y_3|\} \end{split}\]

Therefore, for any x,y\in\mathbb R^3, using again the triangle inequality of the absolute value,

    \[\begin{split} n(x+y) &= |x_1 + y_1| + \max\{2|x_2+y_2|, 3|x_3+y_3|\} \\&\leq |x_1| + |y_1| + \max\{2|x_2+y_2|, 3|x_3+y_3|\} \\&\leq |x_1| + |y_1| + \max\{2|x_2|, 3|x_3|\} + \max\{2|y_2|, 3|y_3|\} \\& = n(x) + n(y). \end{split}\]

Hence, n satisfies the triangle inequality.
(c) Absolute homogeneity. For any x\in\mathbb R^3 and any \lambda\in\mathbb R,

    \[\begin{split} n(\lambda x) &= |\lambda x_1| + \max\{2|\lambda x_2|, 3|\lambda x_3|\} \\&= |\lambda|\cdot |x_1| + \max\{|\lambda|\cdot 2|x_2|, |\lambda|\cdot 3|x_3|\} \\&=|\lambda|\cdot \left (|x_1| + \max\{2|x_2|, 3|x_3|\}\right ) \\&= |\lambda|\cdot n(x). \end{split}\]

Hence, n satisfies absolute homogeneity.
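All three norm properties of n can be probed on random vectors; a small sketch (helper names my own):

```python
import random

def n(x):
    x1, x2, x3 = x
    return abs(x1) + max(2 * abs(x2), 3 * abs(x3))

random.seed(3)
for _ in range(1000):
    x = [random.uniform(-4, 4) for _ in range(3)]
    y = [random.uniform(-4, 4) for _ in range(3)]
    lam = random.uniform(-3, 3)
    assert n(x) >= 0                                                    # non-negativity
    assert n([xi + yi for xi, yi in zip(x, y)]) <= n(x) + n(y) + 1e-9   # triangle inequality
    assert abs(n([lam * xi for xi in x]) - abs(lam) * n(x)) < 1e-9      # absolute homogeneity
assert n([0, 0, 0]) == 0
```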

 

(ii) A key concept related to norms that the course does not discuss is norm equivalence. You may recall that we said that a function may be continuous in one metric space but not in another, depending on the metric chosen. Similarly, convergence of sequences may depend on the metric chosen. A further appealing aspect of norm-induced metrics over general metrics is that, at least in metric spaces over \mathbb R^n with finite n\in\mathbb N, all of our usual p-norms are equivalent, which means that if a function f is continuous or a sequence \{x_n\}_{n\in\mathbb N} converges with respect to one norm, it is continuous or convergent with respect to any other as well.

It turns out to be sufficient for norm equivalence that we can reduce the differences in two norms to some constants that matter little in the \varepsilon/\delta-arguments we use to investigate convergence, continuity and related concepts. Formally, two norms \|\cdot\|_a and \|\cdot\|_b on X are said to be equivalent if there exist constants 0<c\leq C<\infty such that

    \[\forall x\in X: c\|x\|_a \leq \|x\|_b \leq C \|x\|_a.\]

Show that all p-norms are equivalent on any \mathbb R^n, i.e. that \forall p,q\in\mathbb N\cup\{\infty\}, \|\cdot\|_p is equivalent to \|\cdot\|_q on \mathbb R^n for any n\in\mathbb N. (Hint: this is easiest done in two steps: (1) show that any p-norm is equivalent to the maximum norm, and (2) show that if a p– and q-norm are both equivalent to the maximum norm, the p– and q-norm are also equivalent to each other.)


Let n\in\mathbb N and p\in\mathbb N.
Following the hint, we first show that \|\cdot\|_p is equivalent to \|\cdot\|_\infty:
Let x = (x_1,\ldots,x_n)'\in\mathbb R^n, and m\in\arg\max_{j\in\{1,\ldots,n\}}|x_j|, i.e. m\in\{1,\ldots,n\} so that |x_m| = \max_{j\in\{1,\ldots,n\}}|x_j|. Then,

    \[\|x\|_\infty = \left (\|x\|_\infty^p\right )^{1/p} = \left (|x_m|^p\right )^{1/p} \leq \left (\sum_{k=1}^n|x_k|^p\right )^{1/p} = \|x\|_p,\]

and

    \[\begin{split} \|x\|_p &= \left (\sum_{k=1}^n|x_k|^p\right )^{1/p} \leq \left (\sum_{k=1}^n\max\{|x_j|^p: j\in\{1,\ldots,n\}\}\right )^{1/p} \\& = \left (n\cdot \max\{|x_j|: j\in\{1,\ldots,n\}\}^p\right )^{1/p} = n^{1/p} \left (\|x\|_\infty^p\right )^{1/p} = n^{1/p}\|x\|_\infty. \end{split}\]

Hence, we obtain

    \[\|x\|_\infty\leq \|x\|_p \leq n^{1/p}\cdot\|x\|_\infty\]

which meets the definition of norm equivalence with c=1 and C = n^{1/p}.
Now for the second step: take q\in\mathbb N. By the first step, we know that

    \[\|x\|_\infty\leq \|x\|_q \leq n^{1/q}\cdot\|x\|_\infty.\]

Therefore,

    \[ \|x\|_q \geq \|x\|_\infty \geq \frac{1}{n^{1/p}}\|x\|_p \]

and

    \[\|x\|_q \leq n^{1/q}\cdot\|x\|_\infty \leq n^{1/q}\cdot\|x\|_p,\]

so taken together

    \[ \frac{1}{n^{1/p}}\|x\|_p \leq \|x\|_q \leq n^{1/q}\cdot\|x\|_p \]

which meets the definition of norm equivalence with c=n^{-1/p} and C = n^{1/q}.
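The sandwich bounds between the p-norms and the maximum norm can be verified numerically; a quick sketch (the tested values of p and the sampling ranges are my own choices):

```python
import random

def p_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def max_norm(x):
    return max(abs(xi) for xi in x)

random.seed(4)
for _ in range(500):
    n = random.randint(1, 8)
    x = [random.uniform(-10.0, 10.0) for _ in range(n)]
    for p in (1, 2, 3, 7):
        # sandwich bound: ||x||_inf <= ||x||_p <= n^(1/p) * ||x||_inf
        assert max_norm(x) <= p_norm(x, p) + 1e-9
        assert p_norm(x, p) <= n ** (1.0 / p) * max_norm(x) + 1e-9
```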

 

Exercise 2: Testing for Set Properties

For the following sets, investigate whether they are open, closed, compact and/or convex. Throughout the exercise, consider the natural metric of \mathbb R or the Euclidean norm on \mathbb R^2. Note that there are usually several ways to establish the set properties. It may well be that the solution sketches do not always cover your approach, even if it is correct – you should, however, always arrive at the same set properties as the solutions. The solution of 2.a is relatively verbose and also gives some review here and there, so it may be a good idea to start with this exercise.

a. A subset of the real line.

Let S_1:= \{x\in\mathbb R: x^2\leq 4\}. Is this set open, closed, compact and/or convex?

Closedness and Openness


First, it is instructive to simplify the expression for S_1. Noting that x^2\leq 4 \Leftrightarrow |x|\leq 2, we can re-write

    \[S_1 = [-2, 2].\]

You may know that closed intervals on the real line are, in fact, closed and not open, which makes this case trivial. However, this exercise provides a good opportunity to get some practice with the formal concepts using a manageable example, hence let us not rely on this fact in the following. This solution outlines the “direct” approach via the definitions and then gives the “sequence approach to closedness” as an alternative solution.
The closure of S_1 is every point “inside or touching” S_1, i.e.

    \[\overline{S_1} = \{x\in \mathbb R: (\forall \varepsilon>0\exists a\in B_\varepsilon(x): a\in S_1)\}.\]

Clearly, this applies to any x\in S_1. Further, for x_0 > 2, we can choose \varepsilon > 0 small enough so that the \varepsilon-open ball around x_0 does not overlap with S_1: set e.g. \varepsilon = \frac{x_0 - 2}{2}. Then,

    \[B_\varepsilon(x_0) = \left (x_0 - \frac{x_0 - 2}{2}, x_0 + \frac{x_0 - 2}{2}\right ) = (1/2 x_0 + 1, 3/2 x_0 - 1).\]

Because x_0 > 2, 1/2 x_0 + 1 > 2, and \nexists a\in B_\varepsilon(x_0): a\in S_1. Hence, it does not hold that \forall \varepsilon>0\exists a\in B_\varepsilon(x): a\in S_1. Thus, x_0 is not a closure point of S_1.
Similarly, for any fixed x_0 < -2, we can set \varepsilon = \frac{-2 - x_0}{2}. Then,

    \[B_\varepsilon(x_0) = \left (x_0 - \frac{-2 - x_0}{2}, x_0 + \frac{-2 - x_0}{2}\right ) = (3/2 x_0 + 1, 1/2 x_0 - 1),\]

and because x_0 < -2, 1/2 x_0 - 1 < -2, and \nexists a\in B_\varepsilon(x_0): a\in S_1. As above, x_0 is not a closure point of S_1.
The reasoning suggests that any point x\in\mathbb R for which either x<-2 or x>2 is not a closure point of S_1. Hence:

    \[ \overline{S_1} = [-2, 2] = S_1, \]

that is, S_1 is equal to its closure, and by definition of the closed set, S_1 is closed.

Recall that a set is closed when it contains all of its boundary points, whereas it is open when it contains none of them. Hence, once we know that the set is closed, to disprove openness, it suffices to show that the set indeed has boundary points. Recall that boundary points were those elements of the set that are not in the interior, i.e. points x_0 for which there exists no \varepsilon-open ball around x_0 that is contained in S_1 (B_\varepsilon(x_0)\subseteq S_1). Intuitively, it should be pretty clear that -2 and 2 are such points, because any \varepsilon-ball around them will contain elements outside [-2,2]. Formally:
Consider x_0 = 2\in S_1. Let \varepsilon > 0. Then, y = 2 + \varepsilon/2\in B_\varepsilon(x_0) but y \notin S_1. Hence, B_\varepsilon(x_0) is no subset of S_1. Because \varepsilon > 0 was chosen arbitrarily, there exists no \varepsilon > 0 such that B_\varepsilon(x_0)\subseteq S_1, and x_0 = 2 is not an interior point of S_1. Hence, it is a boundary point.
Because S_1 is closed and has at least one boundary point, S_1 is not open.

Alternatively, you can use the sequence approach to establish closedness with relative ease:
To apply the sequence method, we (1) define an arbitrary sequence over the set in question and assume only that it converges to some limit point (potentially outside the set), (2) show that the limit point is indeed contained in the set, and (3) argue that because the convergent sequence was chosen arbitrarily, any convergent sequence over the set will have its limit in the set as well — which is what we need to show for the proposition we intend to use. So, let’s do this:
(1) Let \{x_n\}_{n\in\mathbb N} be a sequence over S_1, i.e. \forall n\in\mathbb N: x_n\in S_1, and assume that the sequence converges to a limit x\in\mathbb R.
(2) By the definition of convergence,

    \[\forall \varepsilon > 0 \exists N\in\mathbb N: (\forall n\geq N: |x_n - x|<\varepsilon).\]

If x were not an element of S_1 = [-2,2], then |x| > 2, and for \varepsilon = \frac{|x|-2}{2} > 0, every x_n\in S_1 would satisfy |x_n - x| \geq |x| - |x_n| \geq |x| - 2 > \varepsilon, which violates the definition of convergence. Hence, x must be an element of S_1.
(3) Because we only imposed convergence for \{x_n\}_{n\in\mathbb N} and nothing else, any convergent sequence \{x_n\}_{n\in\mathbb N} over S_1 will have \lim_{n\to\infty} x_n \in S_1. Hence, S_1 is closed.

 

Compactness and Convexity

Compactness
By the Heine-Borel theorem, because we already know that S_1 is closed, it suffices to investigate boundedness for compactness. From the chapter, we know that it is sufficient to show that the norm that induces our metric is bounded on S_1, i.e. \exists C>0: (\forall x\in S_1: \|x\|<C). The norm that induces the natural metric of \mathbb R is the absolute value, so that we must find C>0 for which \forall x\in S_1: |x|<C. Clearly, for any C>2, this is satisfied, such that the C we are looking for exists! Hence, S_1 is also bounded, and thus compact.

Convexity
For convexity, the course has argued that intervals on the real line are convex. If you want to show convexity here explicitly, you can either refer to the line-intuition or the formal definition.
For the line, if you pick any two points x_1,x_2\in S_1, x_1<x_2, the collection of points on the line connecting them is [x_1,x_2], and as -2\leq x_1 < x_2\leq 2, [x_1,x_2]\subseteq S_1, i.e. the line is fully contained in S_1. By the arbitrary choice of x_1 and x_2, S_1 is convex.
For the definition, if you pick any two points x_1,x_2\in S_1 and \lambda \in [0,1], you have

    \[\lambda x_1 + (1-\lambda) x_2 \leq \lambda \cdot 2 + (1-\lambda)\cdot 2 = 2\]

and

    \[\lambda x_1 + (1-\lambda) x_2 \geq \lambda \cdot (-2) + (1-\lambda)\cdot (-2) = -2\]

and thus \lambda x_1 + (1-\lambda) x_2 \in S_1. By the arbitrary choice of x_1 and x_2 and \lambda\in [0,1], S_1 is convex.
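The convexity argument can be spot-checked by sampling convex combinations; a minimal sketch (helper name my own):

```python
import random

def in_S1(x):
    # membership test for S_1 = {x in R : x^2 <= 4} = [-2, 2]
    return x ** 2 <= 4.0

random.seed(5)
for _ in range(1000):
    x1, x2 = random.uniform(-2.0, 2.0), random.uniform(-2.0, 2.0)
    lam = random.uniform(0.0, 1.0)
    # every convex combination of two points of S_1 stays in S_1
    assert in_S1(lam * x1 + (1 - lam) * x2)
```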

 

b. A complement.

Let S_2:= S_1^c. Is this set open, closed, compact and/or convex?


The complement of S_1 is S_1^c = (-\infty,-2)\cup(2,\infty).

Closedness and Openness
As the complement of a closed set (see the solution of Ex. 2.a), S_2 is open.
S_2 is not closed: to establish this, we show that there exist boundary points that are not contained in S_2. Again, a natural candidate is x_0 = 2. This is indeed a boundary point: for any \varepsilon > 0, B_\varepsilon(x_0) contains points in (2,\infty) and thus points in S_2 (e.g. x = 2 + \varepsilon/2). Hence, S_2 is not closed.

Compactness
Because S_2 is not closed, by the Heine-Borel theorem, it cannot be compact. Moreover, S_2 extends indefinitely towards -\infty and +\infty, so one could also show formally that it is not bounded. As non-closedness is already sufficient to rule out compactness, however, the unboundedness argument is not given here formally.

Convexity
S_2 is not convex: consider x_1 = -3 and x_2 = 3, \lambda = 0.5. Then, y = \lambda x_1 + (1-\lambda) x_2 = 0 \notin S_2.

 

c. Two dimensions.

Let S_3:= \{x\in\mathbb R^2: x_1\leq 3\}. Is this set open, closed, compact and/or convex?

Closedness and Openness
The set is closed and not open.
Closedness can be shown from openness of the complement: S_3^c = \{x\in\mathbb R^2: x_1 > 3\}. For any x = (x_1,x_2)\in S_3^c, x_1 - 3 > 0, and for \varepsilon = \frac{x_1 - 3}{2} > 0, for any y\in B_\varepsilon(x),

    \[|x_1 - y_1| = \sqrt{(y_1 - x_1)^2} \leq \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2} = \|x-y\| < \frac{x_1 - 3}{2}.\]

Therefore, it especially holds that x_1 - y_1< \frac{x_1 - 3}{2}, and

    \[ y_1 > \frac{x_1 + 3}{2} \overset{x_1 > 3}{>} \frac{6}{2} = 3.\]

Hence, y\in S_3^c. So, there exists \varepsilon > 0 such that B_\varepsilon(x) \subseteq S_3^c, and x is an interior point of S_3^c. Because x\in S_3^c was chosen arbitrarily, it results that int(S_3^c) = S_3^c, i.e. S_3^c is open. Thus, S_3 is closed.

To disprove openness, we need to find a boundary point. Intuitively, any x\in\mathbb R^2 with x_1 = 3 will have points with x_1 > 3 in its neighborhood that are not contained in S_3, and therefore constitute a non-interior, i.e. boundary point. Formally:
Let x = (3,0). Then, x\in S_3. For any \varepsilon > 0, for y^\varepsilon = (3 + \varepsilon/2, 0), it holds that

    \[ \|x-y^\varepsilon\| = \sqrt{ (3 + \varepsilon/2 - 3)^2 + (0 - 0)^2 } = \varepsilon/2 < \varepsilon \]

so that y^\varepsilon \in B_\varepsilon(x)\backslash S_3. Accordingly, x\notin int(S_3). Because x\in S_3, x is a boundary point. Hence, S_3 is not open.

Compactness
S_3 is not bounded. Intuitively, the x_2-dimension is entirely unrestricted, and the x_1-dimension is also restricted only above, such that the set can expand indefinitely into three directions. Formally:
For two points x,y\in S_3,

    \[ \|x - y\| = \sqrt{ |x_1 - y_1|^2 + |x_2 - y_2|^2 } \geq \sqrt{|x_2 - y_2|^2} = |x_2 - y_2| \]

Hence, for any C>0, we can pick x,y\in S_3 with x_2 = 2C and y_2 = 0 (and the first dimension chosen in any way that ensures x_1,y_1\leq 3) so that

    \[ \|x - y\|\geq |x_2 - y_2| = 2C > C \]

and there exists no C>0 so that \|x - y\| \leq C for all x,y\in S_3. Thus, S_3 is not bounded.

Convexity
S_3 is convex: Let x,y\in S_3 and \lambda\in[0,1]. Then,

    \[ \lambda x_1 + (1-\lambda)y_1 \leq \lambda \cdot 3 + (1-\lambda)\cdot 3 = 3 \]

and \lambda x + (1-\lambda)y\in S_3.

 

d. A set defined by functional equality.

Let S_4:= \{x\in\mathbb R^2: x_1x_2 = 5\}. Is this set open, closed, compact and/or convex? You can use that f:\mathbb R^2\mapsto\mathbb R, x = (x_1,x_2)'\mapsto x_1x_2 is a continuous function.

Closedness and Openness
S_4 is closed and not open. Perhaps, the easiest way to establish this is the sequence method (see the solution of Ex. 2.a for the step-by-step recipe):
Let \{x^n\}_{n\in\mathbb N} be a sequence over S_4, i.e. \forall n\in\mathbb N: x^n\in S_4, and assume that the sequence converges to a limit x\in\mathbb R^2. Then, with f as defined in the setup,

    \[ x_1x_2 = f(x) = f(\lim_{n\to\infty} x^n) = \lim_{n\to\infty} f(x^n) = \lim_{n\to\infty} x_1^nx_2^n = \lim_{n\to\infty} 5 = 5 \]

and thus, x\in S_4. Hence, S_4 is closed.

To disprove openness, we need to find a boundary point of S_4. Indeed, every point is a boundary point: for any x\in S_4, for any \varepsilon>0, y = (x_1 + \varepsilon/2,x_2)\in B_\varepsilon(x) and y_1y_2 = x_1x_2 + \varepsilon/2\cdot x_2 \neq 5 (this works as x_2\neq 0 because x_1x_2 \neq 0). Hence, S_4 is not open.

Compactness
S_4 is not bounded and hence not compact: consider x(z) = (z, 5/z) and x_0 = x(1) = (1,5). Note that for any z>0, x(z)\in S_4, and moreover

    \[ \| x(z) - x_0\| = \sqrt{|z-1|^2 + |5/z - 5|^2} \geq |z-1|. \]

Hence, for z\to\infty, \| x(z) - x_0\|\to \infty, and in particular, for any C>0, there exists z= C+2 such that x(z)\in S_4 and

    \[\| x(z) - x_0\| \geq |C + 2 - 1| = C + 1 > C.\]

Thus, there exists no C such that for all x,y\in S_4, \|x-y\| \leq C, and S_4 is not bounded.

Convexity
Only very particular combinations of x,y\in S_4 and \lambda \in (0,1) satisfy \lambda x + (1-\lambda)y \in S_4. To disprove convexity, it suffices to find a single counterexample, so there is no need for a full formal investigation. Consider e.g. x = (1,5) and y = (2, 5/2), and \lambda = 0.5. Then, z := \lambda x + (1-\lambda)y = (3/2, 15/4) for which z_1z_2 = 45/8\neq 5. Hence, S_4 is not convex.
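The counterexample can be verified by direct computation; a quick sketch:

```python
x = (1.0, 5.0)
y = (2.0, 2.5)
lam = 0.5
z = (lam * x[0] + (1 - lam) * y[0], lam * x[1] + (1 - lam) * y[1])
assert x[0] * x[1] == 5.0 and y[0] * y[1] == 5.0   # both points lie in S_4
assert z == (1.5, 3.75)
assert z[0] * z[1] != 5.0                          # the midpoint does not: 5.625
```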

 

e. The graph of a univariate real-valued function.

Let S_5:= G(f), i.e. the graph of a function f. Is this set open, closed, compact and/or convex?
You may assume that f:\mathbb R\mapsto\mathbb R. Your answers may depend on whether f is (a) continuous and (b) linear. Hint: As we will see in Exercise 3, the limit of vectors can be applied element-wise, so that if \{(x_n,y_n)\}_{n\in\mathbb N} is a convergent sequence, we have \lim_{n\to\infty} (x_n,y_n) = (\lim_{n\to\infty} x_n,\lim_{n\to\infty} y_n).


Note that G(f) is a subset of \mathbb R^2; hence, we use the Euclidean norm.

Closedness and Openness
Perhaps, the easiest way to investigate closedness is the sequence method (see the solution of Ex. 2.a for the step-by-step recipe):
Let \{(x_n,y_n)\}_{n\in\mathbb N} be a sequence over S_5, i.e. \forall n\in\mathbb N: (x_n,y_n)\in G(f), and assume that the sequence converges to a limit (x,y)\in\mathbb R^2. For simplicity and because (x_n,y_n)\in G(f), we write elements in the sequence as (x_n,f(x_n)) in the following.
We have

    \[ \lim_{n\to\infty} (x_n,f(x_n)) = (\lim_{n\to\infty} x_n,\lim_{n\to\infty}f(x_n)) = (x, \lim_{n\to\infty}f(x_n)). \]

Note that (x,y)\in G(f) if and only if y = f(x), i.e. if and only if f(x) = \lim_{n\to\infty}f(x_n). If f is continuous, this holds for every convergent sequence over the graph, and G(f) is closed. If f is discontinuous, G(f) need not be closed: for instance, at a jump discontinuity at some x_0, one can pick a sequence x_n\to x_0 with f(x_n)\to L\neq f(x_0), so that (x_0, L) is the limit of a sequence over G(f) that lies outside G(f).

On the other hand, G(f) is never open: for any x\in\mathbb R, for any \varepsilon>0, (x, f(x) + \varepsilon/2)\in B_\varepsilon((x,f(x))) since

    \[ \|(x, f(x)+ \varepsilon/2) - (x, f(x))\| = \sqrt{|x-x|^2 + |f(x) + \varepsilon/2 - f(x)|^2} = \varepsilon/2 < \varepsilon, \]

but (x, f(x) + \varepsilon/2)\notin G(f); i.e. any (x,f(x))\in G(f) is a boundary point and not an interior point. Hence, G(f) is not open.

Compactness
G(f) is not bounded because the set of arguments is unbounded; \|(y,f(y)) - (x,f(x))\| \geq |y-x|, which is allowed to diverge to +\infty. More formally, one would fix either x or y and argue that |y-x|, and thus \|(y,f(y)) - (x,f(x))\|, diverges as the other variable tends to \pm\infty, so that the bound C required by the definition of boundedness cannot exist (cf. the previous exercises).

If the domain were bounded, the graph might still be unbounded, namely if the range of f is unbounded; an example is f:(0,1)\mapsto \mathbb R, x\mapsto 1/x, where f[(0,1)] = (1,\infty). In this case, apply the logic from above using \|(y,f(y)) - (x,f(x))\| \geq |f(y)-f(x)|. However, if both the domain and range of f are bounded, then the graph is bounded as well. Formally, if C_1, C_2>0 are bounds so that |x-y|\leq C_1 and |f(x) - f(y)|\leq C_2 for any x,y\in X where X is the domain of f (which exist if and only if domain and range of f are bounded), then for any x,y\in X,

    \[ \|(y,f(y)) - (x,f(x))\| = \sqrt{|y-x|^2 + |f(y) - f(x)|^2} \leq \sqrt{C_1^2 + C_2^2} =: C \]

such that C bounds the distances of objects in the graph.

Convexity
G(f) is convex if and only if for any x,y\in\mathbb R and \lambda\in [0,1], (\lambda x + (1-\lambda)y, \lambda f(x) + (1-\lambda) f(y))\in G(f), which is the case if and only if f(\lambda x + (1-\lambda)y) = \lambda f(x) + (1-\lambda) f(y). This is precisely the characterization of a linear (more precisely: affine) function, i.e. a function for which there exist a,b\in\mathbb R such that for all x\in\mathbb R, f(x) = ax + b. Hence, G(f) is convex if and only if f is a linear function.

 

Exercise 3: Element-wise Applicability of the Multivariate Limit

The course discussed the multivariate limit concept for general metric spaces. However, just from the definition, it is not too obvious how we should use it. Here, we turn to a rather powerful result, namely that the limit can be applied element-wise in metric spaces where the metric is p-norm induced.

Fact 1. In any normed space (\mathbb R^k, \|\cdot\|_p), where \|\cdot\|_p is a p-norm, p\in\mathbb N, x^L\in\mathbb R^k is the limit of the sequence \{x^n\}_{n\in\mathbb N} over \mathbb R^k with x^n = (x_1^n,\ldots,x_k^n)'\in\mathbb R^k for all n\in\mathbb N if and only if all component sequences \{x_j^n\}_{n\in\mathbb N}, j\in\{1,\ldots,k\}, are convergent, and x^L = (\lim_{n\to\infty}x_1^n,\ldots,\lim_{n\to\infty}x_k^n).

Prove Fact 1.
Hint 1: This is an equivalence proof, which typically means that you must show two directions separately.
Hint 2: Recall Exercise 1: all p-norms are equivalent! Hence, it is sufficient to pick any one p-norm and show that Fact 1 holds for this specific case; convenient choices may be the \infty-norm or the 1-norm; the solution given here uses the \infty-norm.


As per hint 2, let us choose the \infty-norm as the p-norm we consider.
First, the perhaps more intuitive direction: we show that if all component sequences converge and x^L = (\lim_{n\to\infty}x_1^n,\ldots,\lim_{n\to\infty}x_k^n), then x^L is the limit of \{x^n\}_{n\in\mathbb N}. It needs to be established that

(2)   \begin{equation*} \forall \varepsilon>0 \exists N\in\mathbb N:(\forall n \geq N: \|x^n - x^L\|_\infty <\varepsilon). \end{equation*}

Now, fix an arbitrary \varepsilon > 0. By convergence of the component sequences, for any j\in\{1,\ldots,k\}, there exists N_j\in\mathbb N:(\forall n \geq N_j: |x_j^n - \lim_{n\to\infty} x_j^n| <\varepsilon). Hence, there exists N = \max_{j\in\{1,\ldots,k\}} N_j such that

    \[ \forall j\in\{1,\ldots,k\}: (\forall n \geq N: |x_j^n - \lim_{n\to\infty} x_j^n| <\varepsilon) \]

which is equivalent to

    \[ \forall n \geq N: \max_{j\in\{1,\ldots,k\}}|x_j^n - \lim_{n\to\infty} x_j^n| <\varepsilon. \]

Since \max_{j\in\{1,\ldots,k\}}|x_j^n - \lim_{n\to\infty} x_j^n| = \|x^n - x^L\|_\infty, and by the arbitrary choice of \varepsilon>0, this establishes (2).
Now, the other direction: we need to show that if x^L\in\mathbb R^k is the limit of \{x^n\}_{n\in\mathbb N}, then all component sequences converge and x^L = (\lim_{n\to\infty}x_1^n,\ldots,\lim_{n\to\infty}x_k^n). While this sounds a bit more intimidating, the logic is very similar to what we did so far.
Because x^L is the limit of \{x^n\}_{n\in\mathbb N}, it holds that

    \[ \forall \varepsilon>0 \exists N\in\mathbb N:(\forall n \geq N: \|x^n - x^L\|_\infty <\varepsilon). \]

Plugging in \max_{j\in\{1,\ldots,k\}}|x_j^n - x_j^L| = \|x^n - x^L\|_\infty, one obtains

    \[ \forall \varepsilon>0 \exists N\in\mathbb N:(\forall n \geq N: \max_{j\in\{1,\ldots,k\}}|x_j^n - x_j^L| <\varepsilon). \]

Now, fix an arbitrary \varepsilon > 0. Let N\in\mathbb N so that (\forall n \geq N: \max_{j\in\{1,\ldots,k\}}|x_j^n - x_j^L| <\varepsilon). Because for any j^*\in\{1,\ldots,k\}, |x_{j^*}^n - x_{j^*}^L| \leq \max_{j\in\{1,\ldots,k\}}|x_j^n - x_j^L|, it also holds that

    \[ \forall j\in\{1,\ldots,k\}:(\forall n \geq N: |x_j^n - x_j^L| <\varepsilon) \]

Hence, by the arbitrary choice of \varepsilon > 0, for any j\in\{1,\ldots,k\}, it holds that

    \[ \forall \varepsilon>0 \exists N\in\mathbb N:(\forall n \geq N: |x_j^n - x_j^L| <\varepsilon). \]

This is precisely the definition of the univariate limit, which allows to conclude that for any j\in\{1,\ldots,k\}, the component sequence \{x_j^n\}_{n\in\mathbb N} converges with limit \lim_{n\to\infty} x_j^n = x^L_j. Hence, x^L = (\lim_{n\to\infty}x_1^n,\ldots,\lim_{n\to\infty}x_k^n).
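The element-wise limit can be illustrated numerically; a minimal sketch with the example sequence x^n = (1/n, 1 + 1/n^2) (my own choice), whose \infty-norm distance to the entry-wise limit (0,1) vanishes:

```python
# Sequence x^n = (1/n, 1 + 1/n^2) in R^2: the component limits are 0 and 1,
# and the infinity-norm distance to x^L = (0, 1) shrinks as n grows.
def x_seq(n):
    return (1.0 / n, 1.0 + 1.0 / n ** 2)

def max_norm_dist(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))

x_lim = (0.0, 1.0)
for n in (10, 100, 1000):
    print(n, max_norm_dist(x_seq(n), x_lim))
```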

 

As a take-away of this exercise, you can memorize:

    1. When you want to compute a multivariate limit, you can pull the limit operator into the individual entries.
    2. If one of the entry-wise limits does not exist, the multivariate limit does not exist either.