An overview of this chapter’s contents and take-aways can be found here.
Vector spaces are one of the most important background concepts, if not the most important one, for economic mathematics and for a large part of applied mathematics in general. However, the concepts you need to be familiar with in order to understand vector spaces have little direct value for the mathematical applications you will be concerned with during your Master’s studies. Therefore, the discussion to follow is confined to the intuition and gives the most fundamental definitions and results without digging too deep into where they come from. A thorough discussion of vector spaces can be found in the companion script and, depending on the time available, will also feature in the associated class taught in the week before your first semester.
Let us first develop an intuition for why we need vector spaces. Consider two arbitrary real numbers, for instance, and . Then, if you are asked to add them, multiply them, or tell the distance between them, you will not have much of a problem. However, if you are asked the same questions for the functions given by and , things get tricky. What is the sum of two functions? How, conceptually, would it make sense to think of a distance between functions? Questions like these and many more can be easily addressed once one is familiar with vector spaces.
More generally, math is tractable when we have one or two dimensions of real numbers. When considering a second dimension, we find ourselves in the comfortable situation that we can illustrate a vast amount of problems graphically, which is oftentimes very helpful in solving them. A classical example from undergraduate economics which you may remember is the way one typically looks for the utility-maximizing consumption bundle of two goods when given the budget restriction and the indifference curve. However, when considering more complex problems (e.g. more than two goods/inputs or uncertainty through stochastic components) as we tend to do in more advanced studies, a more general concept is needed. The main objective of the theory of vector spaces is sometimes described as follows: Geometrical insights at hand with 2- or 3-dimensional real vectors are really helpful. Can we, in some way, generalize these insights to other mathematical objects, for which a geometric picture is not available?
To convince you even more of the usefulness of transferring the graphical intuition, consider the example of minimizing the (Euclidean) distance of a point to a line defined by , i.e. when choosing a point on so that the distance is smaller than for any point on , as illustrated graphically here:
Then, in the 2-D context, clearly, the line through and must be orthogonal to . Without this intuitive geometrical representation, however, the result would probably have been very hard to find out. Indeed, the result’s beauty lies within the insight that even when considering higher-dimensional spaces (e.g. of vectors , , that is, the ), the least-squares solution that minimizes the Euclidean distance continues to satisfy this orthogonality property!
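The orthogonality property can be checked numerically. The following is a minimal sketch, assuming for simplicity a line through the origin; the point `p` and the direction `d` are hypothetical values chosen for illustration, and the helper names are not taken from the text.

```python
def dot(u, v):
    """Sum of element-wise products of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def project_onto_line(p, d):
    """Closest point to p on the line {t * d : t real} through the origin."""
    t = dot(p, d) / dot(d, d)
    return tuple(t * di for di in d)

p = (3.0, 1.0)   # hypothetical point
d = (1.0, 2.0)   # hypothetical direction of the line

q = project_onto_line(p, d)
residual = tuple(pi - qi for pi, qi in zip(p, q))

print(q)                  # (1.0, 2.0)
print(dot(residual, d))   # 0.0, i.e. the connecting line is orthogonal to d

# The same code works unchanged for longer vectors, e.g. in three dimensions.
p3, d3 = (1.0, 2.0, 3.0), (1.0, 1.0, 1.0)
q3 = project_onto_line(p3, d3)
residual3 = tuple(pi - qi for pi, qi in zip(p3, q3))
print(q3, dot(residual3, d3))   # (2.0, 2.0, 2.0) 0.0
```

Nothing in the code refers to a picture: the orthogonality check is the same computation in two, three or more dimensions.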
First things first – to begin, let us define what we mean precisely by a vector.
Definition: Vector.
A row vector $x$ of length $n \in \mathbb{N}$ is an ordered tuple of elements $x_1, \ldots, x_n$. We write $x = (x_1, \ldots, x_n)$. A column vector stacks the elements in a column, i.e. $x = (x_1, \ldots, x_n)'$, where $'$ indicates vector transposition. A row vector $x$ is such that $x'$ is a column vector. A “vector” typically refers to a column vector.
In distinction to the set, the order of elements in a vector matters, such that and are distinct! Also, the vector can contain an element multiple times, consider e.g. the origin . As with sets, however, the definition does not restrict elements to be real numbers, and may also refer to collections of functions, matrices, sets, vectors, etc. Thus, be aware that even though we predominantly deal with vectors of real numbers, the concept is much broader!
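The standard set of vectors that economists typically consider is the set of $n$-dimensional real (column) vectors:
$$\mathbb{R}^n = \{(x_1, \ldots, x_n)' : x_i \in \mathbb{R} \text{ for all } i \in \{1, \ldots, n\}\}.$$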
It turns out that to generalize a great variety of concepts and insights to arbitrary vector spaces, we only need two things: (1) a well-defined way of adding vectors to each other and of multiplying real numbers to them, and (2) that the set of vectors we consider is closed under these two operations, i.e. that any sum of vectors and any product of a vector and a real number is still an element of the set. Furthermore, when we need more mathematical structure, it is oftentimes helpful to consult the basis of the vector space – an object allowing for compact representations of all elements in the space, and providing some insight into its dimension. But let us go over these things more slowly, starting with the intuition for why these two seemingly basic properties are so mathematically powerful.
As you may already have done in school, it is useful to think of real vectors as an entity with direction and magnitude. To do so, one writes a vector as the product of a direction vector and an augmenting magnitude coefficient: one may write as , and as . Then, and have magnitude coefficients and , and directionality and , respectively. Indeed, this concept is the first fundamental building block of the structure we assign to sets of vectors to do algebra with them: scalar multiplication (for this course, you can think of scalars and real numbers as being the same thing). This concept gives us a feeling of how much individual vectors extend into some direction, and allows for a spatial comparison of objects within the vector set.
However, for more precision, it is desirable to decompose the extension along the fundamental directions. For our example of the , there are two fundamental directions: the horizontal and vertical axes. They can be expressed by the vectors and , respectively. For our examples and above, only extends alongside the second fundamental direction, so that
For , we can decompose
Notice that we have used the standard properties of multiplication and addition of real numbers here, despite the fact that we are dealing with vectors. This is precisely why we need an appropriate definition of vector addition as the second building block in our vector space definition: we must ensure that addition of vectors is “similar enough” to addition of real numbers.
The collection of (fundamental) directions that completely characterize the dimensions along which vectors can extend within the considered set is typically called a basis. In the exemplary context here, we would call the set a basis of the . Because the set actually describes the fundamental directions, we would even call this basis the canonical basis of the . A basis helps with representation and provides the ground for more complex mathematical analysis. In contrast to the basis operations of vector addition and scalar multiplication, however, its existence and/or structure is not essential for most economic applications.
To conclude the previous section: the operations of vector addition and scalar multiplication (i.e. multiplication of elements of the vector space with real numbers) are always defined so that they behave “similarly to” addition and multiplication of real numbers, that is, in the way they would if the “vectors” we consider were indeed real numbers. Just to give you an idea, to achieve this, we always require that vector addition is associative and commutative, that scalar multiplication is associative and distributive over vector and scalar addition, and that there are “neutral” and “identity” elements of these operations. In the real number context, the additive neutral is the zero, as $x + 0 = x$ for all $x \in \mathbb{R}$, and we have $1 \cdot x = x$ and $0 \cdot x = 0$ for all $x \in \mathbb{R}$. Similarly, for arbitrary vector spaces, we require that $1 \cdot x = x$ for any vector $x$, and that we have a zero element, that is, some vector $\mathbf{0}$ which gives $x + \mathbf{0} = x$ for all vectors $x$, and for which $0 \cdot x = \mathbf{0}$ for all vectors $x$. Should you be interested in more details, please consult the companion script.
If some set of vectors is combined with operations that do not satisfy some of these properties, we are not able to call this combination a “vector space”. On the other hand, this range of similarity required here is enough to make sure that all concepts that we will discuss here (and many more) can be defined for the general vector space and work in a way very similar or identical to the way we are used to from the lower-dimensional case.
Let us for now conclude with some more crucial details on the vector space concept. Firstly, a set alone (think of the $\mathbb{R}^n$ with arbitrary $n \in \mathbb{N}$, or a concrete example such as $\mathbb{R}^3$) cannot yet be a vector space. Before reading on, ask yourself: given what is written above, why is this?
Well, a vector space is the combination of a set of vectors, e.g. $\mathbb{R}^3$, with the two “basis operations” of vector addition and scalar multiplication. In consequence, we call the collection $(\mathbb{R}^3, +, \cdot)$ a vector space, where $+$ represents addition of real vectors of length 3, and $\cdot$ represents multiplication of real numbers with these vectors. However, people (especially applied mathematicians such as economists) tend to be a bit sloppy with this distinction, and you will oftentimes read that sets, such as the $\mathbb{R}^n$, are themselves called vector spaces. In such cases, be aware that this is bad notation in a narrow sense, and only justified if it is sufficiently clear which basis operations should be considered. For the case of the $\mathbb{R}^n$, we usually consider addition element-wise, so that for $x = (x_1, \ldots, x_n)'$ and $y = (y_1, \ldots, y_n)'$, we have
$$x + y = (x_1 + y_1, \ldots, x_n + y_n)'.$$
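Further, scalar multiplication is also defined element-wise, so that for $\lambda \in \mathbb{R}$ and $x = (x_1, \ldots, x_n)'$, we have
$$\lambda \cdot x = (\lambda x_1, \ldots, \lambda x_n)'.$$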
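The two element-wise operations can be sketched in a few lines of Python. This is only an illustration: vectors are represented as tuples of floats, and the function names `vec_add` and `scalar_mul` as well as the example vectors are hypothetical.

```python
def vec_add(x, y):
    """Element-wise sum of two real vectors of the same length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return tuple(xi + yi for xi, yi in zip(x, y))

def scalar_mul(lam, x):
    """Element-wise product of the real number lam with the vector x."""
    return tuple(lam * xi for xi in x)

x = (1.0, 2.0, 3.0)
y = (0.5, -2.0, 4.0)

print(vec_add(x, y))        # (1.5, 0.0, 7.0)
print(scalar_mul(2.0, x))   # (2.0, 4.0, 6.0)
```

Both results are again tuples of three real numbers, which is the closedness requirement in this particular example.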
As stated initially, the elements in a vector need not be real numbers. As such, there are many more vector spaces than the $\mathbb{R}^n$ (with addition and scalar multiplication as defined above). The most important ones for economic applications are the following:
(1) The set of real-valued sequences, endowed with vector addition and scalar multiplication in a way defined analogously to the corresponding operations for vectors of finite length.
(2) The set $F_X$ of real-valued functions with domain $X$, where $X$ may be an arbitrary set but is the common domain for all functions in $F_X$, endowed with addition defined by $(f + g)(x) = f(x) + g(x)$ for all $x \in X$ and scalar multiplication defined by $(\lambda f)(x) = \lambda f(x)$ for all $x \in X$, $\lambda \in \mathbb{R}$.
(3) The set of $n \times m$ matrices, $\mathbb{R}^{n \times m}$, endowed with “element-wise” addition and scalar multiplication (see Chapter 2 for more detail).
Take a minute to make sure that you understand the definitions of addition and scalar multiplication for functions. First, for vector addition, by the concept’s nature, we need to consider two “vectors”, in this case functions, which we do by picking out (arbitrary) functions $f$ and $g$ when defining the operation “$+$”. The resulting object $f + g$ must again be a function, namely the one that maps any $x$ in the common domain to the sum of the values $f(x)$ and $g(x)$. Similarly, for scalar multiplication, we need to take a real number $\lambda$ and a “vector”, here a function $f$, to define which function $\lambda f$ corresponds to the product of the scalar $\lambda$ with $f$.
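Treating functions as “vectors” may feel less abstract in code. The sketch below builds the sum of two functions and the scalar multiple of a function as new functions; the names `add_functions` and `scale_function` and the two example functions are hypothetical choices for illustration.

```python
def add_functions(f, g):
    """Return the function that maps x to f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale_function(lam, f):
    """Return the function that maps x to lam * f(x)."""
    return lambda x: lam * f(x)

f = lambda x: 2 * x + 1   # hypothetical example function
g = lambda x: x ** 2      # hypothetical example function

h = add_functions(f, g)    # h(x) = 2x + 1 + x^2
k = scale_function(3, g)   # k(x) = 3x^2

print(h(2))   # 9
print(k(2))   # 12
```

Note that `h` and `k` are themselves functions on the same domain, so the set of functions is closed under both operations in this example.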
As they hold little direct value to the remainder of this course, we skip the discussions related to the basis, spans and subspaces here. Should you be interested and/or need these concepts at a later point in your studies, please refer to the companion script for an introduction.
To conclude our discussion of the basics of vector spaces, let us consider a few concepts that you will come across frequently when dealing with vectors. The first one, we have already used in Chapter 0 without formally defining it:
Definition: Cartesian Product.
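Let $X$ and $Y$ be two real vector spaces. Then, the Cartesian product of $X$ and $Y$, denoted $X \times Y$, is the collection of ordered pairs $(x, y)$ with elements $x \in X$ and $y \in Y$, together with addition and scalar multiplication, respectively defined as $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$ and $\lambda (x, y) = (\lambda x, \lambda y)$.
Indeed, the Cartesian product of two real vector spaces is itself a real vector space. As a very simple example, note that we can write $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$.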
To conclude the section on definitions, let us define a very special vector operation that is extensively used in all economic disciplines: the scalar product (alternative names are dot product, inner product or vector product):
Definition: Scalar product.
Let $x, y \in \mathbb{R}^n$. Then, the scalar product is defined as
$$x \cdot y = \sum_{i=1}^n x_i y_i.$$
It tells us how to multiply elements of the $\mathbb{R}^n$ with each other. Note that the scalar product as stated here is defined only for the $\mathbb{R}^n$ and not for more general vector spaces where vectors potentially contain elements other than real numbers (this is because multiplying elements within vector spaces is too context-specific to be generalized into a broad concept, as we have done with addition and scalar multiplication). But fear not, you will also know how to multiply, among others, functions and matrices with each other by the end of this course.
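As a small illustration, the scalar product of two real vectors of equal length is the sum of their element-wise products. The sketch below is hypothetical in its naming and data; the price and quantity vectors are made-up numbers that give the computation an economic reading (total expenditure).

```python
def scalar_product(x, y):
    """Sum of element-wise products of two real vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))

prices = (2.0, 0.5, 3.0)        # hypothetical prices of three goods
quantities = (4.0, 10.0, 1.0)   # hypothetical quantities bought

print(scalar_product(prices, quantities))   # 16.0, the total expenditure
print(scalar_product((1, 0), (0, 1)))       # 0
```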
The remaining important concepts refer to combinations of the basis operations of vector addition and scalar multiplication:
Definition: Linear Combination.
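Let $\mathbb{X} = (X, +, \cdot)$ be a real vector space, and $x, y \in X$, $\lambda, \mu \in \mathbb{R}$. Then, the linear combination of $x$ and $y$ with coefficients $\lambda$ and $\mu$ is $\lambda x + \mu y$. More generally, for $x_1, \ldots, x_k \in X$, the linear combination of $x_1, \ldots, x_k$ with coefficients $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$ is $\sum_{i=1}^k \lambda_i x_i$.
We say that $X$ is closed under linear combination if $\lambda x + \mu y \in X$ for all $x, y \in X$ and all $\lambda, \mu \in \mathbb{R}$.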
Definition: Convex Combination and Convex Set.
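Let $\mathbb{X} = (X, +, \cdot)$ be a real vector space based on the set $X$. A convex combination of the vectors $x_1, \ldots, x_k \in X$ is a linear combination $\sum_{i=1}^k \lambda_i x_i$, for which $\lambda_i \geq 0$ for all $i \in \{1, \ldots, k\}$ and $\sum_{i=1}^k \lambda_i = 1$.
A set $S \subseteq X$ is convex if it contains all convex combinations of any two of its elements, i.e.
$$\forall x, y \in S \ \forall \lambda \in [0, 1]: \ \lambda x + (1 - \lambda) y \in S.$$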
Actually, we can show that being closed under linear combination is equivalent to being closed under both vector addition and scalar multiplication (recall: we required this for the vector space property).
Definition: Linear Dependence, Linear Independence.
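Let $\mathbb{X} = (X, +, \cdot)$ be a real vector space, and let $y \in X$, $S = \{x_1, \ldots, x_k\} \subseteq X$. $y$ is said to be linearly dependent upon the set $S$ if it can be expressed as a linear combination of its elements, i.e.
$$\exists \lambda_1, \ldots, \lambda_k \in \mathbb{R}: \ y = \sum_{i=1}^k \lambda_i x_i.$$
Otherwise, the vector $y$ is said to be linearly independent of $S$. A set $S = \{x_1, \ldots, x_k\}$ is said to be linearly independent if each vector in the set is linearly independent of the remainder of the set, i.e. if
$$\forall j \in \{1, \ldots, k\}: \ x_j \text{ is linearly independent of } S \setminus \{x_j\}.$$
In words, $y$ is linearly dependent on $S$ if the elements in $S$ can be (linearly) combined to obtain $y$. Then, $y$ does not add a new, independent direction, which is why we call it dependent on (the directions in) $S$.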
Theorem: Testing Linear Independence.
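An equivalent condition for linear independence of the set of vectors $\{x_1, \ldots, x_k\}$ is that
$$\sum_{i=1}^k \lambda_i x_i = \mathbf{0} \ \Rightarrow \ \lambda_1 = \ldots = \lambda_k = 0. \qquad (1)$$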
This result is really important and in fact a key take-away of this chapter! Hence, let us consider two simple examples of how we can use it to prove or disprove linear independence.
First, take the set and investigate whether is a linearly independent set using the theorem above.
Second, take the set and investigate whether is a linearly independent set using the theorem above.
If you are interested in why the Theorem indeed provides an equivalent condition for linear independence of a set, you can consult the proof in the companion script, which formally establishes the equivalence, that is, it shows you why the theorem implies linear independence and vice versa.
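For real vectors of finite length, the condition in the theorem can be checked mechanically: a weighted sum of the vectors equals the zero vector only for all-zero weights exactly when Gaussian elimination finds as many pivots as there are vectors. The sketch below is one possible implementation using exact fractions; the function names and the three example sets are hypothetical and are not the examples of the text.

```python
from fractions import Fraction

def rank(vectors):
    """Number of pivots found by Gaussian elimination on the rows `vectors`."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0
    for col in range(n_cols):
        # Look for a row at or below the current pivot position with a non-zero entry.
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Eliminate the entry in this column from every other row.
        for r in range(len(rows)):
            if r != pivots and rows[r][col] != 0:
                factor = rows[r][col] / rows[pivots][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots

def is_linearly_independent(vectors):
    """True if only all-zero coefficients combine the vectors to the zero vector."""
    return rank(vectors) == len(vectors)

print(is_linearly_independent([(1, 0), (1, 1)]))                    # True
print(is_linearly_independent([(1, 2), (2, 4)]))                    # False
print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))   # False
```

In the second set, the second vector is twice the first; in the third set, the last vector is the sum of the first two. In both cases a non-trivial combination yields the zero vector.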
As a final note, recall that for the purpose of this course, scalars and real numbers are the same thing. Essentially, this remains true as long as we don’t deal with complex numbers. To indicate that we confine ourselves to studying real numbers, we call the corresponding vector spaces “real vector spaces”. To be formally precise, the following definitions, propositions and theorems will refer only to such spaces.
The previous section aimed at conveying how, for very general and abstract sets of vectors, we can define a space in which we find a helpful spatial representation, as with the simple two-dimensional plane, by characterizing the “position” of elements in a set of vectors using addition and multiplication in a fashion similar to real numbers. Building on these insights, this section addresses how we properly define the “length” (or: magnitude) of a vector and, even more importantly, how we assess the distance between two points in general vector spaces. Furthermore, we will learn how to use this distance concept to transfer the standard definition of continuity from simple functions mapping real numbers to real numbers to much more general functions.
Before going into the formal details, let us consider an easy, intuitive example to get a grasp of what will be going on more formally and abstractly below, and to (hopefully) see that distances, as viewed by mathematicians, are indeed very intuitive and straightforward concepts. As you may know, Mannheim, similar to Manhattan, is organized in squares. Roughly, if you move north, the street names are increasing in letters (e.g. L1, M1, N1, etc.) whereas when moving east, they increase in numbers (L1, L2, L3, …). Note that a map of Mannheim can be thought of as the $\mathbb{R}^2$ with fundamental directions “north” and “east” (south is “negative north” and west “negative east”; if you’re confused, draw it on a piece of paper). So, suppose you are in the econ building in L7 and tired of studying, so you wish to go see a movie in the Cineplex in P4. Then, regardless of how you walk precisely (but abstracting from wrong turns), you will have to go four blocks north and three blocks west, so a total of seven blocks. This simple calculation (going only “zig-zag”) is called the “Manhattan metric”, a commonly used mathematical distance measure! Conversely, if you were a bird and could fly there, you would probably go the direct way (so the minimum distance necessary). Recalling the Pythagorean theorem, this distance is $\sqrt{4^2 + 3^2} = 5$ blocks. This is what we call the Euclidean distance, one of the most natural mathematical definitions of a distance, if not the most natural one!
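The two distances in the Mannheim example can be reproduced with a few lines of Python. The sketch assumes the idealized grid described above (letters increase to the north, numbers to the east, all blocks equally long); the function names are hypothetical.

```python
import math

def displacement(start, end):
    """(east, north) displacement in blocks between two squares such as 'L7' and 'P4'."""
    east = int(end[1:]) - int(start[1:])
    north = ord(end[0]) - ord(start[0])
    return (east, north)

def manhattan_distance(v):
    """Sum of absolute coordinate changes ("zig-zag" walking distance)."""
    return sum(abs(c) for c in v)

def euclidean_distance(v):
    """Straight-line distance by the Pythagorean theorem."""
    return math.sqrt(sum(c * c for c in v))

move = displacement("L7", "P4")
print(move)                       # (-3, 4): three blocks west, four blocks north
print(manhattan_distance(move))   # 7
print(euclidean_distance(move))   # 5.0
```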
The following discusses how we can generalize these intuitive concepts and introduce them to the more abstract framework of vector spaces, to be able to apply them to arbitrary vectors that we might wish to consider – which will allow us to, for instance, assess the distance of two functions.
This concludes our introductory discussion of vector spaces. If you feel like testing your understanding of the concepts discussed thus far, you can take a short quiz found here.
Many basic mathematical concepts are very intuitive; this is especially true for the concept of a metric or distance function. Consider two objects that stand near you, and ask yourself what properties you would like the “distance” between these two objects to have. First, the distance should clearly not be below zero, i.e. it should be non-negative, and it should be zero if and only if the objects are in fact in the exact same location (e.g. same building but different level in the maps example). Second, it seems natural that the distance should be the same from object 1 to object 2 as the other way around, i.e. that the distance measure is symmetric. Finally, a third natural requirement is the following: when asked to measure the distance traveled from object 1 to object 2 (i) directly and (ii) while passing by some arbitrarily located object 3, one should hope the outcome from (i) to be, in some sense, “smaller” than (or at least not larger than) the outcome from (ii). As the following formal definition will show, these three properties are exactly what defines, in the eyes of mathematicians, a distance function.
In the following, we will assume that we consider vectors in some vector space . Thus, you can assume that vector addition and scalar multiplication are well-defined even when they are not explicitly introduced in definitions.
Definition: Metric and Metric Space.
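Let $\mathbb{X} = (X, +, \cdot)$ be a real vector space. Then, a function $d: X \times X \to \mathbb{R}$ defines a metric on $X$ if it satisfies the following three properties:
(i) Non-negativity and definiteness: $d(x, y) \geq 0$ for all $x, y \in X$, and $d(x, y) = 0$ if and only if $x = y$.
(ii) Symmetry: $d(x, y) = d(y, x)$ for all $x, y \in X$.
(iii) Triangle inequality: $d(x, z) \leq d(x, y) + d(y, z)$ for all $x, y, z \in X$.
If $d$ defines a metric on $X$, we call $(X, d)$ a metric space.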
Note that the metric is defined on the Cartesian product of $X$ with itself, because the metric takes two elements of $X$ and assesses their distance (i.e. the first element must be in $X$ and also the second, and thus the complete input must lie in the Cartesian product $X \times X$)!
The metric is the crudest distance concept that we usually consider. It is crude in the sense that it is based only on a set of minimum requirements. These already rule out a number of erratically behaving functions as distance measures but, as you will also see below, still leave a high degree of freedom in how a metric may be defined, and in consequence also some room for properties that may frequently be viewed as inconvenient in applications.
A simplistic example of a metric is the so-called binary metric, defined as $d(x, y) = \mathbb{1}[x \neq y]$. It is simply an indicator equal to one if the points considered are not the same and, verbally, indicates whether there is a positive distance between the two points or not. Should you desire to increase your familiarity with the metric concept, a good exercise may be to verify the metric properties for this function.
From the example of the binary metric, it becomes apparent that the metric concept may indeed be too crude to give us what we intuitively want when thinking of a distance: while technically satisfying all requirements of a metric, the binary metric is not helpful in answering “how far” two objects are apart – the answer we get is always only “yes” or “no”.
This means that we need to extend our list of aspects that we desire in a distance measure that makes intuitive sense. First, a further characteristic of the intuitive distance concept is that, when starting from two objects, call their positions $x$ and $y$, then moving them in the exact same fashion, e.g. to $x + z$ and $y + z$, should leave the measured distance unaffected. In terms of a measure $d$, this means that $d(x + z, y + z) = d(x, y)$. This property is called translation invariance. Note that it is not part of our definition of a metric above, and indeed, it is not ensured to hold for any function that we may call a metric according to this definition.
A further (related but distinct) issue of the metric concept is the one of scaling or “distance from the origin”. Suppose that we are considering some “origin point”; for the $\mathbb{R}^n$, this would usually be the zero vector $\mathbf{0}$. The origin point is special because it has no magnitude, that is, it does not extend into any of the vector space’s directions. Typically, we find it practical to think of the length of a vector $x$ as its distance from the origin, i.e. $d(x, \mathbf{0})$. Then, intuitively, when doubling the magnitude of $x$ (e.g. “zooming in”, if we imagine the $\mathbb{R}^2$ as a map), we should double its length, so that $d(2x, \mathbf{0}) = 2 \cdot d(x, \mathbf{0})$. Like translation invariance, this is neither part of the definition of a metric nor ensured by it.
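To make the scaling issue concrete, the following sketch implements the binary metric (the indicator that two points differ), checks the three metric properties on a small sample of points, and then shows that doubling a vector does not double its distance from the origin. A check on a finite sample is of course no proof; the sample points and helper names are hypothetical.

```python
from itertools import product

def binary_metric(x, y):
    """Indicator metric: 1 if the two points differ, 0 if they coincide."""
    return 0 if x == y else 1

def scale(lam, x):
    """Element-wise product of the number lam with the vector x."""
    return tuple(lam * xi for xi in x)

points = [(0, 0), (1, 0), (0, 1), (2, 3)]   # hypothetical sample points

# Check the three metric properties on every triple of sample points.
metric_axioms_hold = all(
    binary_metric(x, y) >= 0
    and (binary_metric(x, y) == 0) == (x == y)
    and binary_metric(x, y) == binary_metric(y, x)
    and binary_metric(x, z) <= binary_metric(x, y) + binary_metric(y, z)
    for x, y, z in product(points, repeat=3)
)

origin, x = (0, 0), (1, 0)
length = binary_metric(x, origin)                     # 1
doubled_length = binary_metric(scale(2, x), origin)   # still 1, not 2

print(metric_axioms_hold, length, doubled_length)     # True 1 1
```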
These two points motivate the following concept:
Definition: Norm and Normed Vector Space.
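Let $\mathbb{X} = (X, +, \cdot)$ be a real vector space. Then, a function $\|\cdot\|: X \to \mathbb{R}$ defines a norm on $X$ if it satisfies the following three properties:
(i) Non-negativity and definiteness: $\|x\| \geq 0$ for all $x \in X$, and $\|x\| = 0$ if and only if $x = \mathbf{0}$.
(ii) Absolute homogeneity: $\|\lambda x\| = |\lambda| \cdot \|x\|$ for all $x \in X$ and $\lambda \in \mathbb{R}$.
(iii) Triangle inequality: $\|x + y\| \leq \|x\| + \|y\|$ for all $x, y \in X$.
If $\|\cdot\|$ defines a norm on $X$, we call $(X, \|\cdot\|)$ a normed vector space.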
We can use norms to define distances as follows:
Definition: Norm-induced Metric.
Let $(X, \|\cdot\|)$ be a normed vector space. Then, the metric induced by $\|\cdot\|$ is $d(x, y) = \|x - y\|$ for $x, y \in X$.
To deepen your understanding of norm and metric, it may be a useful exercise for you to verify that the norm-induced metric is, indeed, a metric.
Make sure that you understand the following distinction conceptually: The norm by itself is not a distance function. Rather, it can be used to define a norm-induced metric which is a more specific sub-concept of the more general metric definition.
The reason we define the norm is that it fixes the two issues of general, unrestricted metrics when it comes to a natural distance interpretation that we discussed above, as summarized by the following result:
Theorem: Norm vs. Metric.
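Let $(X, \|\cdot\|)$ be a normed vector space and $d$ the metric induced by $\|\cdot\|$. Then, $d$ defines a metric on $X$. Further, $d$ exhibits the following extra properties for all $x, y, z \in X$ and $\lambda \in \mathbb{R}$:
(i) Absolute homogeneity: $d(\lambda x, \lambda y) = |\lambda| \cdot d(x, y)$.
(ii) Translation invariance: $d(x + z, y + z) = d(x, y)$.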
In case you want to understand why this is true, note that the theorem directly follows by plugging in the definition of the norm-induced metric, and using absolute homogeneity of the norm for (i).
As we have seen, norm-induced metrics are indeed metrics, and therefore satisfy the intuitive “basis characteristics” of a distance measure as discussed in the introductory paragraph above. Moreover, because of this result here, norms are extremely helpful in defining distance functions with a broader range of appealing properties. Indeed, in almost all applications relevant to economists, norm-induced metrics are our go-to way of defining a distance in the mathematical sense.
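The two extra properties of norm-induced metrics are easy to observe numerically. The sketch below uses the familiar “square root of the sum of squares” length as the norm and builds the induced metric as the norm of the difference; the vectors, the scalar and all function names are hypothetical choices for illustration.

```python
import math

def euclidean_norm(x):
    """Square root of the sum of squared elements."""
    return math.sqrt(sum(xi * xi for xi in x))

def induced_metric(norm):
    """Metric induced by a norm: distance is the norm of the difference."""
    return lambda x, y: norm(tuple(xi - yi for xi, yi in zip(x, y)))

def add(x, y):
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(lam, x):
    return tuple(lam * xi for xi in x)

d = induced_metric(euclidean_norm)
x, y, z = (1.0, 2.0), (4.0, 6.0), (10.0, -3.0)
lam = -2.0

print(d(x, y))                          # 5.0
print(d(add(x, z), add(y, z)))          # 5.0, unchanged after moving both points by z
print(d(scale(lam, x), scale(lam, y)))  # 10.0, i.e. |lam| times the original distance
```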
A further appealing feature of norms is that we can use them (or the metrics induced by them) to define the length of a vector in the desired way: usually, when talking about the length of some vector $x$, we simply refer to the norm $\|x\|$. To see why intuitively, recall our decomposition of vectors into magnitude and directionality, $x = \lambda v$, where $\lambda \in \mathbb{R}$ is the magnitude and $v$ is the direction vector of the same shape as $x$. Suppose that for direction vectors $v$, we have $\|v\| = 1$ (this is true for most common norms when we consider the fundamental directions of any $\mathbb{R}^n$ and their convex combinations to arbitrary directions). Then, the length is simply the absolute magnitude:
$$\|x\| = \|\lambda v\| = |\lambda| \cdot \|v\| = |\lambda|.$$
The remainder of this subsection is concerned with (i) which norms are natural candidates to consider when defining specific norm-induced metrics, and (ii) which central results help us when handling norms.
We know that norm-induced metrics are a very promising concept for measuring distances in a mathematical way. Still, in practical applications, the general concept does not yet give us sufficient guidance on how we can measure distances. For this, we need specific functions that are norms and can be used to define concrete norm-induced metrics. In the context of the $\mathbb{R}^n$, the most commonly used class of functions is the following:
Definition: p-Norm and Euclidean space.
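Consider the real vector space $\mathbb{R}^n$. Then, the p-norm over $\mathbb{R}^n$ with $p \geq 1$ is the norm $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$. Moreover, we define $\|x\|_\infty = \max_{i \in \{1, \ldots, n\}} |x_i|$ as the maximum norm. When $d_2$ is the metric induced by the 2-norm (“Euclidean norm”), we call $(\mathbb{R}^n, d_2)$ the Euclidean space of dimension $n$.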
Note that when $n = 1$, i.e. when considering $\mathbb{R}$ rather than an actual vector space, all p-norms are simply equal to the absolute value. Indeed, the resulting metric, $d(x, y) = |x - y|$ for $x, y \in \mathbb{R}$, is the so-called natural metric of the $\mathbb{R}$, and is the common metric used to measure distances between points in $\mathbb{R}$. The interested reader may want to verify that the p-norm indeed constitutes a norm. You can use that the mapping is concave for , and that thus, , then this should be a simple exercise. The classical spaces considered in economics are metric spaces endowed with norm-induced metrics to have all the intuitive properties we are interested in. For instance, the “zig-zag” Manhattan-metric discussed earlier corresponds to the metric induced by the 1-norm, and the “direct way” Euclidean metric to the one induced by the 2-norm. Mostly, we are interested in the “direct” or “shortest” distance, so that we consider the Euclidean space.
As a take-away of the discussions thus far, if you define a distance measure (a “metric”) from a norm, you are guaranteed a broad set of appealing, intuitive properties. For the $\mathbb{R}^n$, norms are rather easy to come by, and can e.g. be constructed as p-norms. We usually deal with the Euclidean norm, a special p-norm with $p = 2$, because it has an intuitive “direct distance” interpretation in the $\mathbb{R}^n$.
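A short sketch of the p-norm family may help connect it to the Mannheim example. It assumes the usual formula (sum the absolute values raised to the power p, then take the p-th root) and the largest absolute element for the maximum norm; the function name `p_norm` and the use of `float("inf")` to select the maximum norm are choices made here for illustration.

```python
def p_norm(x, p):
    """p-norm of a real vector for p >= 1; p = float('inf') gives the maximum norm."""
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

v = (-3.0, 4.0)   # three blocks west, four blocks north

print(p_norm(v, 1))              # 7.0, the "zig-zag" length
print(p_norm(v, 2))              # 5.0, the "direct way" length
print(p_norm(v, float("inf")))   # 4.0, the largest single coordinate change
```

For this vector the values decrease as p grows, from 7.0 for the 1-norm to 4.0 for the maximum norm.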
Let’s get to the helpful facts related to norms that were promised above. First, we can find a “reversed” version of the triangle inequality:
Proposition: Inverse Triangle Inequality.
Let $(X, \|\cdot\|)$ be a normed vector space. Then, for all $x, y \in X$,
$$\big| \|x\| - \|y\| \big| \leq \|x - y\|.$$
Since this is our first proposition, recall that, as already mentioned, everything labeled a “proposition” is proven in the companion script, and if you are ever interested in digging deeper into some of the presented facts, you can have a look there. Especially students struggling with formality and/or finding the variety of new concepts challenging may benefit from looking into the proofs – most of them are relatively accessible once you have understood the concepts previously discussed, and having studied the proof of a fact is frequently helpful for memorizing and correctly applying the fact itself.
Further, a nice relationship of p-norms that you may want to be aware of is the following:
Proposition: p-Norm and Maximum-Norm.
Consider the vector space , , and let . Then, for any ,
To conclude the introductory discussion of the mathematical distance, to gain some more familiarity for the concepts that we just considered, let us investigate a scenario illustrative of the fact that when not making use of norm-induced metrics, the broad range of appealing properties they promise is not guaranteed. For this, let us consider the so-called French Railway metric:
$$d(x, y) = \begin{cases} \|x - y\|_2 & \text{if } x \text{ and } y \text{ are linearly dependent,} \\ \|x\|_2 + \|y\|_2 & \text{otherwise,} \end{cases}$$
for $x, y \in \mathbb{R}^2$.
To understand the name, recall that norms of vectors can be thought of as distances from the origin, and consider the following figure:
The metric defined above is called the French Railway Metric because it used to be almost true that, in France, if you were to travel between two cities that did not lie on a single ray from Paris, then you had to travel through Paris. For instance, in the figure you see that for going from Bordeaux to Madrid you can proceed without going through Paris, while to go from Bordeaux to Lyon, you need to go through Paris. Accordingly, the metric forces us to pass by the origin (Paris) when measuring the distance between two points that are not contained in a single ray from the origin.
Then, if one translates the origin to, say, Mannheim, i.e. one imposes that all travels must go through Mannheim unless they are contained on the same line to Mannheim (rather than Paris), the distance between Bordeaux and Madrid, or between Toulouse and Bordeaux, as measured by our railway metric, will change! This intuition is easily verified mathematically: Let and . Then, , so that
However, and are linearly independent (recall our theorem for testing linear independence – this can be verified by solving for ), so that
Therefore, it cannot be the case that the French Railway metric $d$ satisfies $d(x + z, y + z) = d(x, y)$ for all $x, y \in \mathbb{R}^2$ and all $z \in \mathbb{R}^2$, because we have found a specific counterexample!
If you wonder how precisely to come up with such a counterexample, think again about the intuition. If two points were on the same line before moving the origin, the travel distance will be longer if we move the origin such that they no longer are. Thus, start from two linearly dependent vectors and move them in a way that they are no longer linearly dependent. However, note also that for points not on the same line, the distance will change because the necessary detour will get longer or shorter, depending on how the new origin is chosen. Indeed, you would be very unlucky to pick an example where the distance stays the same.
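The recipe for finding a counterexample can be tried out in code. The sketch below assumes the standard form of the French Railway metric on the plane: the Euclidean distance if the two points are linearly dependent (they lie on one line through the origin), and the sum of their Euclidean lengths otherwise. The vectors `x`, `y` and the shift `z` are hypothetical and are not the example of the text.

```python
import math

def norm2(x):
    """Euclidean length of a vector in the plane."""
    return math.hypot(x[0], x[1])

def linearly_dependent_2d(x, y):
    """Two vectors in the plane are linearly dependent iff their determinant is zero."""
    return x[0] * y[1] - x[1] * y[0] == 0

def french_railway(x, y):
    """Direct distance on a common line through the origin, detour via the origin otherwise."""
    if linearly_dependent_2d(x, y):
        return norm2((x[0] - y[0], x[1] - y[1]))
    return norm2(x) + norm2(y)

x, y = (3.0, 0.0), (6.0, 0.0)   # linearly dependent: same line through the origin
z = (0.0, 4.0)                  # hypothetical shift applied to both points

before = french_railway(x, y)
after = french_railway((x[0] + z[0], x[1] + z[1]), (y[0] + z[0], y[1] + z[1]))

print(before)   # 3.0
print(after)    # roughly 12.21, the detour via the origin
```

After the shift, the two points are no longer on a common line through the origin, so the measured distance jumps from 3 to the sum of the two lengths.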
Now that we know quite a bit about distances in vector spaces, let’s move on to some definitions and characterizations of sets in vector spaces that can be obtained from a distance measure: balls and neighborhoods, set openness, closedness and compactness, and continuity of functions. These concepts are quite important for economists and are fundamentals of mathematical analysis, so even though they might not be 100% intuitive right now, it is worthwhile developing a firm understanding and good intuition for their meaning. Regarding how easy it will be to understand the contents in later courses of your master studies (and also later topics in this one), there is a lot to gain from doing so now.
A very important concept is the following:
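Definition: $\varepsilon$-Ball, Neighborhood.
Let $\mathbb{X} = (X, +, \cdot)$ be a real vector space and $(X, d)$ be a metric space. Further, let $x_0 \in X$ and $\varepsilon > 0$. The $\varepsilon$-open ball or neighborhood centered at $x_0$ is the set of points whose distance from $x_0$ is strictly smaller than $\varepsilon$, that is:
$$B_\varepsilon(x_0) = \{x \in X : d(x, x_0) < \varepsilon\}.$$
Conversely, the $\varepsilon$-closed ball centered at $x_0$ is the set of points whose distance from $x_0$ is not larger than $\varepsilon$:
$$\bar{B}_\varepsilon(x_0) = \{x \in X : d(x, x_0) \leq \varepsilon\}.$$
Recall also that we said that in $\mathbb{R}$, all p-norms reduce to the absolute value. Thus, in $\mathbb{R}$ with its natural metric, that is, the metric induced by the absolute value, $d(x, y) = |x - y|$, the balls are just intervals around their middle point: $B_\varepsilon(x_0) = (x_0 - \varepsilon, x_0 + \varepsilon)$ and $\bar{B}_\varepsilon(x_0) = [x_0 - \varepsilon, x_0 + \varepsilon]$.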
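The difference between the open and the closed ball only shows up at points whose distance from the center is exactly the radius. The sketch below checks membership on the real line with the natural metric; the center 1.0, the radius 0.5 and the function names are hypothetical.

```python
def natural_metric(a, b):
    """Distance between two real numbers: absolute value of their difference."""
    return abs(a - b)

def in_open_ball(x, center, eps, d):
    """True if x lies strictly closer than eps to the center."""
    return d(x, center) < eps

def in_closed_ball(x, center, eps, d):
    """True if x lies no farther than eps from the center."""
    return d(x, center) <= eps

center, eps = 1.0, 0.5   # the balls are the intervals (0.5, 1.5) and [0.5, 1.5]

print(in_open_ball(1.25, center, eps, natural_metric))    # True
print(in_open_ball(1.5, center, eps, natural_metric))     # False, endpoint excluded
print(in_closed_ball(1.5, center, eps, natural_metric))   # True, endpoint included
print(in_closed_ball(2.0, center, eps, natural_metric))   # False
```

Because the metric is passed in as an argument, the same two functions can be reused with any other distance function.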
Below, we consider the concepts of interior, boundary and closure points of a set, and set closedness and openness. To develop an intuition for them, which will be the focus of the elaborations below, is rather straightforward using the figure you have just seen. To give you some chance to practice mathematical and formal correctness, the formal definitions of these concepts are also given. But remember, it is all quite intuitive, so don’t freak out about the notation!
In the figure above, you can immediately imagine what we mean by the “interior” and the “boundary” of a ball, right? As we did with addition and scalar multiplication previously, we now extend these concepts to general metric spaces, which allows us to generalize this graphical intuition to more abstract scenarios that we cannot sketch. But the intuition that we will use is precisely the one from the $\mathbb{R}^2$: an interior point should have only other interior points “very close” to it, and no matter how “close” we move to a boundary point, it will always be surrounded both by points that belong to the set and by points that do not.
Definition: Interior Point, Interior.
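Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $x \in A$ is said to be an interior point of $A$ if there exists $\varepsilon > 0$ such that the $\varepsilon$-open ball centered at $x$ lies entirely inside of $A$, i.e. $B_\varepsilon(x) \subseteq A$. The set of interior points of $A$ is called the interior of $A$, denoted int($A$) or $\mathring{A}$, i.e. $\text{int}(A) = \{x \in A : \exists \varepsilon > 0 : B_\varepsilon(x) \subseteq A\}$.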
As you have seen, the open ball includes only interior points, i.e. the set itself is in fact equal to its interior! Accordingly, we use the interior concept to define what we mean by openness of a set more generally:
Definition: Open Set.
Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $A$ is said to be an open set if $A = \text{int}(A)$.
Note that trivially, $\text{int}(A) \subseteq A$. Hence, any set contains its interior, but the converse is true if and only if $A$ is open. Indeed, this is the key take-away also for proofs that seek to establish openness of a set $A$: it suffices to check that any point $x \in A$ is also contained in $\text{int}(A)$!
The boundary, on the other hand, corresponds to the circle separating the open ball from the points that are not included and lie outside of it in our figure. Accordingly, for a more general set $A$, we can view it as the line between the interior of $A$ and the points that lie outside $A$.
Definition: Boundary Point, Boundary.
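Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $x \in X$ is said to be a boundary point of $A$ if, for every $\varepsilon > 0$, the $\varepsilon$-open ball centered on $x$ contains both points that belong to $A$ and ones that do not, i.e. $B_\varepsilon(x) \cap A \neq \emptyset$ and $B_\varepsilon(x) \cap (X \setminus A) \neq \emptyset$. The set of boundary points of $A$ is called the boundary of $A$ and denoted $\partial A$, i.e. $\partial A = \{x \in X : \forall \varepsilon > 0 : B_\varepsilon(x) \cap A \neq \emptyset \text{ and } B_\varepsilon(x) \cap (X \setminus A) \neq \emptyset\}$.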
The closed ball as shown in the figure, beyond its interior points, contains also all points on its boundary. Thus, for a point to be included in the closed ball, it must be either an interior point or a boundary point. We will summarize these two types of points as “closure points” to more easily talk about closed sets:
Definition: Closure Point, Closure.
Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $x \in X$ is said to be a closure point of $A$ if, for every $\varepsilon > 0$, the $\varepsilon$-open ball centered at $x$ contains at least one point that belongs to $A$, i.e. $B_\varepsilon(x) \cap A \neq \emptyset$. The set of closure points of $A$ is called the closure of $A$, denoted $\bar{A}$, i.e. $\bar{A} = \{x \in X : \forall \varepsilon > 0 : B_\varepsilon(x) \cap A \neq \emptyset\}$.
As stated above, verbally, a closure point is either an interior point or a boundary point of the set. To see that the somewhat unwieldy definition of closure points as given here is indeed equivalent to this, you can read it in the following way: closure points are either elements of $A$, or they lie outside of $A$ but “touch” $A$ in the sense that no matter how small a ball we choose around them, it still contains elements of $A$. Graphically, the former type comprises the interior points of $A$ (together with any boundary points that $A$ happens to contain), while the latter type consists of boundary points that do not belong to $A$.
According to our discussion above, we will call a set closed if it contains all closure points:
Definition: Closed Set.
Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $A$ is said to be a closed set if $A = \bar{A}$.
For establishing closedness, note that any set is included in its closure, but the converse is true if and only if $A$ is closed. Transferring the intuition of the illustrating figure above more directly, we now can characterize the boundary as a set of elements such that, if they all belong to $A$, then $A$ is closed, and, if none of them belong to $A$, then $A$ is open.
For our usual metric space and and , we may now rephrase our concepts of open and closed sets as follows:
Note that a set may be neither open nor closed, namely, if only a fraction of boundary points lie in the set! Imagine, for instance, the set that includes the boundary only in the left half of the ball, i.e. for . Conversely, if there is no boundary of a set (think, for instance, of the whole space ), then all closure points are interior points (recall: closure = interior + boundary), so a set may be both open and closed! Therefore, make sure to remember that when it comes to closed and open, a set need not always be one and not the other!
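The simplest instance of a set that is neither open nor closed is a half-open interval such as $[0, 1)$ in $\mathbb{R}$ with the natural metric. The sketch below is an illustration under that assumption only: it hard-codes the fact that any interval with endpoints $a < b$ has interior $(a, b)$ and boundary $\{a, b\}$, regardless of which endpoints it includes. The function names and the label "exterior" are hypothetical helpers, not notation from the script.

```python
def classify_point(x, a, b):
    """Classify x relative to an interval with endpoints a < b in R.

    Interior and boundary are the same for (a, b), [a, b), (a, b] and [a, b].
    """
    if a < x < b:
        return "interior"
    if x == a or x == b:
        return "boundary"
    return "exterior"


def open_and_closed(includes_a, includes_b):
    """Return (is_open, is_closed) for an interval with endpoints a < b."""
    is_open = not (includes_a or includes_b)  # no boundary point belongs to the set
    is_closed = includes_a and includes_b     # every boundary point belongs to the set
    return is_open, is_closed


print(classify_point(0.0, 0.0, 1.0))  # boundary
print(classify_point(0.5, 0.0, 1.0))  # interior
print(open_and_closed(True, False))   # (False, False): [0, 1) is neither
print(open_and_closed(False, False))  # (True, False): (0, 1) is open
print(open_and_closed(True, True))    # (False, True): [0, 1] is closed
```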
Now we have an idea of what closed and open sets and balls are, and it will soon become evident that they are very useful when studying functions and characterizing their behavior and properties. However, when given a specific set (e.g. think about the budget set with income $y$ and prices $p$), it is typically not immediately clear whether it is open, closed, or neither, and directly applying the definitions above may be cumbersome. To overcome this issue, there is a wide range of results providing equivalent and sufficient conditions for openness and closedness of sets. Below, you can find the ones that you should be familiar with, as those are the ones used most frequently, at least in the economics context.
Theorem: Properties of Open Sets.
In a metric space (, ),
(i) and are open in .
(ii) A set is open if and only if its complement is closed.
(iii) The union of an arbitrary (possibly infinite) collection of open sets is open.
(iv) The intersection of a finite collection of open sets is open.
Theorem: Properties of Closed Sets.
In a metric space (, ),
(i) and are closed in .
(ii) A set is closed if and only if its complement is open.
(iii) The union of a finite collection of closed sets is closed.
(iv) The intersection of an arbitrary (possibly infinite) collection of closed sets is closed.
Two more theorems might be helpful at times to establish closedness:
Theorem: Weak Inequalities and the Limit: Functions.
Suppose that is a real vector space, and so that : (in function notation, we would write ). Let , and suppose that so that , . Then, it holds that .
The theorem exists for sequences in an analogous way:
Theorem: Weak Inequalities and the Limit: Sequences.
Suppose that is a real vector space. Let and be convergent sequences over , i.e. , with limits and , respectively. If , it holds that , then, we also have .
This is extremely useful for the context of set closedness and openness because:
Theorem: Closedness and Sequences.
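Suppose that $X$ is a real vector space, and let $A \subseteq X$. Then, $A$ is closed if and only if, for any convergent sequence $\{x_n\}_{n \in \mathbb{N}}$ over $A$, i.e. $x_n \in A$ $\forall n \in \mathbb{N}$, it holds that $\lim_{n \to \infty} x_n \in A$.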
To see the intuition, consider again the figure illustrating balls in the $\mathbb{R}^2$. As $n \to \infty$, convergent sequences are restricted to an ever smaller ball around the limit point $x$. Therefore, if the sequence is over the set $A$, it will be confined to an ever smaller ball “in proximity to” points of $A$, if not in $A$ itself, which is the precise definition of the closure. This means that the limit point lies either in the interior or on the boundary. However, for the limit point to be certainly included in the set, next to all interior points, any boundary point must be contained in the set, which precisely describes closedness!
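To see how the last theorem can be used to simplify investigations of closedness, consider the budget set with income $y \geq 0$ and prices $p \in \mathbb{R}^2_+$:
$$B(p, y) = \{x \in \mathbb{R}^2_+ : p \cdot x \leq y\}.$$
We are yet to discuss convergence and continuity for higher-dimensional functions formally, so let us just discuss the intuition here: consider an arbitrary sequence $\{x_n\}_{n \in \mathbb{N}}$ over $B(p, y)$ that converges to some point $x^* \in \mathbb{R}^2$. We will see that the dot product is a continuous function (intuitively, it is a generalization of multiplication, which is a continuous operation). Hence, we can pull the limit in, so that
$$\lim_{n \to \infty} (p \cdot x_n) = p \cdot \lim_{n \to \infty} x_n = p \cdot x^*.$$
Because the sequence is over $B(p, y)$, by its definition, it holds that
$$p \cdot x_n \leq y \quad \forall n \in \mathbb{N}.$$
Applying the theorem on weak inequalities and the limit:
$$p \cdot x^* = \lim_{n \to \infty} (p \cdot x_n) \leq y.$$
The non-negativity constraints carry over to $x^*$ by the same argument applied to each component. Therefore, $x^*$ is also a point in the budget set! Because we have started from an arbitrary, convergent sequence over the budget set, the last theorem allows us to conclude that the budget set is closed!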
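The sequence argument can be spot-checked numerically. The sketch below uses hypothetical values (prices $(2, 4)$, income $10$) and one particular sequence, $x_n = (5(1 - 1/n), 0)$, which converges to $(5, 0)$; exact rational arithmetic via `fractions.Fraction` avoids floating-point noise at the budget line. It checks a single sequence and therefore illustrates the theorem rather than proving closedness.

```python
from fractions import Fraction


def spending(x, prices):
    """Dot product p . x."""
    return sum(p_i * x_i for p_i, x_i in zip(prices, x))


def in_budget_set(x, prices, income):
    """Membership in {x >= 0 : p . x <= y} (weak inequality)."""
    return all(x_i >= 0 for x_i in x) and spending(x, prices) <= income


def in_strict_budget_set(x, prices, income):
    """Membership in {x >= 0 : p . x < y} (strict inequality)."""
    return all(x_i >= 0 for x_i in x) and spending(x, prices) < income


prices = (Fraction(2), Fraction(4))  # assumed example prices
income = Fraction(10)                # assumed example income


def x_n(n):
    """n-th element of a sequence that approaches the budget line from inside."""
    return (Fraction(5) * (1 - Fraction(1, n)), Fraction(0))


limit = (Fraction(5), Fraction(0))

# Weak inequality: every element and the limit belong to the set.
print(all(in_budget_set(x_n(n), prices, income) for n in range(1, 1001)))  # True
print(in_budget_set(limit, prices, income))                                # True

# Strict inequality: every element belongs to the set, but the limit does not,
# so the set defined by the strict inequality cannot be closed.
print(all(in_strict_budget_set(x_n(n), prices, income) for n in range(1, 1001)))  # True
print(in_strict_budget_set(limit, prices, income))                                # False
```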
Indeed, we can take this way of thinking about sets characterized by inequalities one step further: whenever we have a weak inequality and a continuous function $f$ of $x$, the set will be closed. For strict inequalities, we can also say something: note that the complement set will be characterized by a weak inequality (e.g. the complement of $\{x : f(x) < c\}$ is $\{x : f(x) \geq c\}$) and is therefore closed. Hence, the set characterized by the strict inequality will be open! Finally, for a set characterized by an equality, the complement can be split into the union of two sets characterized by a strict inequality: e.g.
$$\{x : f(x) \neq c\} = \{x : f(x) < c\} \cup \{x : f(x) > c\}.$$
Because the RHS sets are characterized by strict inequalities, they are open, and the complement of the set characterized by the equality, as a union of two open sets, is open! As such, the set itself is a closed set. To summarize:
Theorem: Closedness, Openness and Inequalities.
Suppose that is a real vector space, and let . Further, let be a continuous function, and consider a threshold value . Then,
This yields that the budget set is closed also when we impose that all money is spent, i.e. . Be cautious that this theorem can only be applied if the function is continuous, which has to be verified in a first step.
A further important property of sets in metric spaces is the one of boundedness. It is defined as follows:
Definition: Bounded Set.
Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $A$ is said to be a bounded set if it is contained in an open ball of finite radius $r > 0$, i.e.
$$\exists x_0 \in X, r > 0 : A \subseteq B_r(x_0).$$
Verbally, the set is bounded if the distance between any two points in the set cannot get arbitrarily large, but rather, it is bounded by some finite threshold value. Indeed, the definition above is equivalent to
$$\exists b \geq 0 : \forall x, y \in A : d(x, y) \leq b,$$
which more explicitly highlights this interpretation. The intuition is that within a ball of finite radius $r$, no two points in the ball can lie further apart than the diameter $2r$. As the bounded set $A$ is contained in the ball, i.e. it is a subset of it, points in $A$ also lie in the ball, and their distance is bounded by $2r$. As usual, should you be interested in the formalities behind the equivalence, you can find them in the companion script.
You will shortly see why boundedness is a very useful property. But first, let us turn to how we can establish it. You can either verify one of the two equivalent definitions above directly, or, should you be working with a (p-)norm-induced metric, as we almost always will, refer to the following more easily checked result:
Proposition: Checking Boundedness with a Norm-induced Metric.
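Let $X$ be a real vector space and $(X, d)$ be a metric space such that $d$ is norm-induced, i.e. for $x, y \in X$, $d(x, y) = \|x - y\|$. Let $A \subseteq X$. Then, $A$ is bounded if the norm is bounded on $A$, i.e. $\exists b \geq 0 : \forall x \in A : \|x\| \leq b$.
This proposition is very important and in fact, it is easily established using the triangle inequality of the metric. For any $x, y \in A$, the triangle inequality of the metric gives
$$d(x, y) \leq d(x, 0) + d(0, y) = \|x\| + \|y\|.$$
If the norm is bounded by $b$ on $A$, then $d(x, y) \leq 2b$. Thus, for an arbitrary $x_0 \in A$, there exists $r > 2b$ so that $\forall x \in A$: $d(x, x_0) \leq 2b < r$, that is, $A \subseteq B_r(x_0)$, which is precisely what we require in the definition of boundedness that we have seen above. Thus, the set $A$ is bounded.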
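As a numerical illustration of this argument, the following sketch samples points from the Euclidean disc of radius $b = 3$ in $\mathbb{R}^2$ (an assumed example set) and checks that no pairwise distance exceeds $2b$. It is a spot check on a finite random sample, not a proof; the small tolerance only guards against floating-point rounding.

```python
import math
import random


def norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(x_i * x_i for x_i in x))


def dist(x, y):
    """Euclidean (norm-induced) metric."""
    return norm([x_i - y_i for x_i, y_i in zip(x, y)])


def sample_disc(b, n_points, rng):
    """Rejection-sample points with ||x|| <= b from the square [-b, b]^2."""
    points = []
    while len(points) < n_points:
        x = (rng.uniform(-b, b), rng.uniform(-b, b))
        if norm(x) <= b:
            points.append(x)
    return points


rng = random.Random(0)  # fixed seed for reproducibility
b = 3.0
points = sample_disc(b, 200, rng)

# Triangle inequality: d(x, y) <= ||x|| + ||y|| <= 2b for all sampled pairs.
max_dist = max(dist(x, y) for x in points for y in points)
print(max_dist <= 2 * b + 1e-9)
```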
The last key concept discussed in this section is compactness. Don’t worry about the definition, which is rather abstract and only stated here for completeness. The intuition and how we investigate compactness, as discussed below, are far more important.
Definition: Compact Set.
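Let $X$ be a real vector space and $(X, d)$ be a metric space. Let $A \subseteq X$. Then, $A$ is said to be compact if every open covering $\{O_i\}_{i \in I}$ of $A$ with index set $I$, i.e. such that $O_i$ is open $\forall i \in I$ and $A \subseteq \bigcup_{i \in I} O_i$, has a finite subcovering, i.e. $\exists I' \subseteq I$ such that $I'$ contains finitely many elements, and $A \subseteq \bigcup_{i \in I'} O_i$.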
For the $\mathbb{R}^n$, the following equivalence holds:
Theorem: Heine-Borel.
Consider the metric space $(\mathbb{R}^n, d)$, where $d$ is induced by a p-norm, and let $A \subseteq \mathbb{R}^n$. Then, $A$ is compact if and only if $A$ is closed and bounded.
Indeed, almost all compactness proofs in economics use Heine-Borel’s theorem, so that when asked to show compactness, it is the starting point for you. To apply it, one separately shows closedness and boundedness.
To see the value of compact sets, consider the $\mathbb{R}$, where intervals $[a, b]$ are a special form of closed and bounded and thus compact sets. Clearly, any continuous function defined on the whole interval will assume a maximum and minimum on such a set, either in the interior $(a, b)$, or otherwise at $a$ or $b$ (this is precisely the Weierstrass Extreme Value Theorem that you may remember from Chapter 0)! As we will see, similar reasoning applies to more general spaces, and compact sets are a powerful concept for functional analysis and optimization.
To get some feeling for boundedness and compactness, let us re-consider the budget set. If we can show that it is compact, we know that when we look at a continuous, but otherwise unrestricted utility function, there will always be a utility-maximizing consumption bundle given the budget constraint because of the Weierstrass Extreme Value Theorem! This would be quite a general and powerful result, and because we have already shown closedness of the budget set, by the Heine-Borel Theorem, all we need to worry about for this is boundedness of the budget set.
Let us first think about this issue intuitively. When can the distance between two possible consumption bundles $x$ and $\tilde{x}$ in the budget set
get arbitrarily large? For this, we would have to move the consumption of either the first or the second good (or both) in the two consumption bundles infinitely far apart from each other. Because consumption cannot be negative for either good, this effectively means that in one consumption bundle, we would have to consume infinitely much of one of the goods. However, this is only possible if the price of the associated good is zero (assuming prices cannot be negative); otherwise, infinite consumption is not possible with a finite budget $y$.
Indeed, we can formally show that for strictly positive prices $p_1, p_2 > 0$, the budget set is bounded! To make life simpler, let’s assume that we’re dealing with a p-norm-induced metric (for instance our usual Euclidean metric). Then, using our boundedness proposition above, what we need to show is that the norm we use is bounded on the budget set, i.e. that there exists some threshold $b \geq 0$ so that $\|x\| \leq b$ for any $x$ in the budget set.
Without loss of generality, assume $p_1 \geq p_2$, that is, assume that the goods are labeled in a way that the least expensive one is the second good. If this is not the case, we would re-label the goods and then proceed the same as we do now (that is what “without loss of generality” means: we make an assumption for analytical simplicity that in no way restricts the generality of the obtained result). Using our proposition on the p-norm and the maximum norm at the first inequality, note that for any $x$ in the budget set,
This yields the conclusion that the budget set of the $\mathbb{R}^2$ with strictly positive prices is bounded and also closed, as we have shown earlier. Therefore, by Heine-Borel’s theorem, it is compact, a result which is very important for optimization. Indeed, also for budget sets in the $\mathbb{R}^n$, we can proceed in an analogous way as we have done here to establish:
Proposition: Compactness of the Budget Set.
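Consider the budget set
$$B(p, y) = \{x \in \mathbb{R}^n_+ : p \cdot x \leq y\}.$$
Then, $B(p, y)$ is closed. If $p_i > 0$ $\forall i \in \{1, \ldots, n\}$, then $B(p, y)$ is also bounded and therefore compact.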
Note that closedness follows directly from the result on sets characterized by inequalities. For boundedness, we just would have to modify the line of reasoning above slightly, assuming without loss of generality that is the smallest price. Should you be motivated, go ahead and try to come up with this more general argument.
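A numerical spot check can make the boundedness claim plausible before attempting the general argument. The sketch below assumes the standard comparison $\|x\|_p \leq n^{1/p} \max_i |x_i|$ between the p-norm and the maximum norm, together with $x_i \leq y / p_{\min}$ for every bundle in the budget set, which gives the candidate bound $n^{1/p} \, y / p_{\min}$. The prices, income and sampling scheme are hypothetical, and a random sample cannot replace the proof.

```python
import random


def p_norm(x, p):
    """p-norm of a vector for finite p >= 1."""
    return sum(abs(x_i) ** p for x_i in x) ** (1.0 / p)


def sample_budget_point(prices, income, rng):
    """Draw a bundle with x >= 0 and p . x <= y.

    For each good in turn, a random share of the remaining income is spent.
    """
    remaining = income
    bundle = []
    for price in prices:
        spend = rng.uniform(0.0, remaining)
        bundle.append(spend / price)
        remaining -= spend
    return bundle


prices = (3.0, 2.0, 5.0)  # assumed strictly positive prices
income = 12.0             # assumed income
p = 2.0                   # Euclidean norm

# Candidate bound: n^(1/p) * y / p_min
bound = len(prices) ** (1.0 / p) * income / min(prices)

rng = random.Random(1)  # fixed seed for reproducibility
bundles = [sample_budget_point(prices, income, rng) for _ in range(1000)]

print(max(p_norm(x, p) for x in bundles) <= bound)
```

If one of the prices were zero, `spend / price` would no longer be defined and consumption of that good would be unrestricted, which is exactly the case in which boundedness fails.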
As a final note of caution, be aware that “compact = closed + bounded” is in fact a result and not the definition of compactness. Even in advanced (applied) math texts, people sometimes confuse this and thereby show an incomplete understanding of mathematics, which is just a bad look if you’re trying to make a professional impression.
If you remember the preliminary chapter, using the limit concept for real-valued functions, we associated continuity with the requirement that as two points become ever closer, their images should not be too far apart from one another. Now that we know how to mathematically handle distances more generally, it is time to formalize and generalize the continuity concept, which is the purpose of this section.
Start again from the and a function with , where we call the limit of at , if
The continuity requirement for at , can be written as
Now, recall that the common metrics that we use for the are p-norm-induced, and that for the , any p-norm is equal to the absolute value, and therefore, we commonly use the so-called natural metric of the real line, for . Then, the definition of continuity at is equivalent to
This step is indeed all that is necessary to generalize the continuity concept to arbitrary metric spaces:
Definition: Continuous Function.
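Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces based on the sets $X$ and $Y$, respectively. Then, a function $f: X \to Y$ is continuous at $x_0 \in X$ if for every $\varepsilon > 0$, there exists a $\delta > 0$ such that the image of the $\delta$-open ball around $x_0$ is contained in the $\varepsilon$-open ball around $f(x_0)$, i.e.
$$\forall \varepsilon > 0 \ \exists \delta > 0 : \forall x \in X : (d_X(x, x_0) < \delta \Rightarrow d_Y(f(x), f(x_0)) < \varepsilon).$$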
A function that is continuous at every point of its domain is said to be continuous.
Be sure to understand how the statement in quantifiers relates to the verbal statement referring to the open balls. Note also that a function can not be continuous at if in any -open ball around , there is a point at which is not defined!
The definition uses two metrics, $d_X$ and $d_Y$. This is because the distance of $x$ and $x_0$, two points in the domain of $f$, needs to be assessed by a metric defined on $X$, while $f(x)$ and $f(x_0)$ are points in the codomain of $f$, and assessing their distance requires a metric defined on $Y$. Since we intend to be as general as possible and do not want to restrict domain and codomain to be identical here, we have to make sure to use one metric for the domain, $d_X$, and one for the codomain, $d_Y$.
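To see the quantifiers at work, consider the assumed example $f(x) = x^2$ on $\mathbb{R}$ with the natural metric on both domain and codomain. For $|x - x_0| < \delta \leq 1$ we have $|f(x) - f(x_0)| = |x - x_0| \, |x + x_0| < \delta (2|x_0| + 1)$, so $\delta = \min\{1, \varepsilon / (2|x_0| + 1)\}$ is one valid choice. The sketch below checks this on a finite grid of points inside the $\delta$-ball, which is an illustration and not a proof.

```python
def f(x):
    """Example function; any other continuous function needs its own delta rule."""
    return x * x


def delta_for(eps, x0):
    """One delta that works for f(x) = x^2 at x0 (see the bound in the lead-in)."""
    return min(1.0, eps / (2.0 * abs(x0) + 1.0))


def image_stays_in_eps_ball(eps, x0, n_grid=1000):
    """Check |f(x) - f(x0)| < eps on grid points strictly inside the delta-ball."""
    delta = delta_for(eps, x0)
    xs = [x0 + delta * (2.0 * k / n_grid - 1.0) for k in range(1, n_grid)]
    return all(abs(f(x) - f(x0)) < eps for x in xs)


x0 = 1.5
for eps in (1.0, 0.1, 0.001):
    print(eps, delta_for(eps, x0), image_stays_in_eps_ball(eps, x0))
```

Note that $\delta$ depends on both $\varepsilon$ and $x_0$: the further $x_0$ lies from the origin, the smaller $\delta$ has to be for the same $\varepsilon$.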
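Similarly to continuity, we can also straightforwardly generalize convergence of sequences in metric spaces: recall that in $\mathbb{R}$, $x$ is the limit of a sequence $\{x_n\}_{n \in \mathbb{N}}$ if
$$\forall \varepsilon > 0 \ \exists N \in \mathbb{N} : (n \geq N \Rightarrow |x_n - x| < \varepsilon).$$
Using again the natural metric of $\mathbb{R}$, the condition in brackets can be equivalently written as $n \geq N \Rightarrow d(x_n, x) < \varepsilon$. Exploiting this intuition, we can define convergence of sequences more generally: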
Definition: Convergent Sequence.
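Let $(X, d)$ be a metric space based on the set $X$, and let $\{x_n\}_{n \in \mathbb{N}}$ be a sequence over $X$, i.e. $x_n \in X$ $\forall n \in \mathbb{N}$. Then, $\{x_n\}_{n \in \mathbb{N}}$ is said to be convergent if
$$\exists x \in X : \forall \varepsilon > 0 \ \exists N \in \mathbb{N} : (n \geq N \Rightarrow d(x_n, x) < \varepsilon).$$
If $\{x_n\}_{n \in \mathbb{N}}$ is convergent, the point $x \in X$ satisfying this condition is called the limit of $\{x_n\}_{n \in \mathbb{N}}$, denoted $x = \lim_{n \to \infty} x_n$.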
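As an illustration of the definition, take the assumed sequence $x_n = (1/n, 1 - 1/n)$ in $\mathbb{R}^2$ with the Euclidean metric. Its distance to $(0, 1)$ is $\sqrt{2}/n$, so for a given $\varepsilon$ any index $N > \sqrt{2}/\varepsilon$ works. The sketch computes such an $N$ and checks the condition for the next thousand indices, which is a finite spot check only.

```python
import math


def x_n(n):
    """n-th element of the example sequence in R^2."""
    return (1.0 / n, 1.0 - 1.0 / n)


limit = (0.0, 1.0)


def dist(x, y):
    """Euclidean metric."""
    return math.sqrt(sum((x_i - y_i) ** 2 for x_i, y_i in zip(x, y)))


def threshold(eps):
    """An index N with sqrt(2)/n < eps for all n >= N."""
    return math.floor(math.sqrt(2.0) / eps) + 1


for eps in (0.5, 0.01):
    N = threshold(eps)
    ok = all(dist(x_n(n), limit) < eps for n in range(N, N + 1000))
    print(eps, N, ok)
```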
To conclude this section, let us consider one last theorem that is very helpful for investigations into continuity that try to avoid dealing with the direct, formal definition. It combines the concepts of limits and continuity, and is perhaps the most important tool to disprove continuity, so you can gain a lot from being familiar with it.
Theorem: Sequence Characterization of Continuity.
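Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces based on the sets $X$ and $Y$, respectively. Then, the function $f: X \to Y$ is continuous at $x_0 \in X$ if and only if for every sequence $\{x_n\}_{n \in \mathbb{N}}$ over $X$, $\lim_{n \to \infty} x_n = x_0 \Rightarrow \lim_{n \to \infty} f(x_n) = f(x_0)$.
Thus, to establish that $f$ is not continuous at $x_0$, it suffices to find a sequence $\{x_n\}_{n \in \mathbb{N}}$ so that $\lim_{n \to \infty} x_n = x_0$ and either $\lim_{n \to \infty} f(x_n)$ does not exist, or it does but $\lim_{n \to \infty} f(x_n) \neq f(x_0)$.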
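A minimal sketch of how such a counterexample looks in practice, using an assumed step function on $\mathbb{R}$ that jumps at $x_0 = 0$ and the sequence $x_n = 1/n$: the sequence converges to $0$, its images are constantly $1$, but the function value at the limit is $0$.

```python
def step(x):
    """Step function: 1 for x > 0 and 0 otherwise (jump at 0)."""
    return 1.0 if x > 0 else 0.0


x0 = 0.0
xs = [1.0 / n for n in range(1, 10001)]  # x_n = 1/n converges to x0 = 0
images = [step(x) for x in xs]

print(abs(xs[-1] - x0))  # 0.0001: the sequence gets arbitrarily close to x0
print(set(images))       # {1.0}: the images are constant, so their limit is 1
print(step(x0))          # 0.0: differs from the limit of the images
```

Since the limit of the images is $1 \neq 0 = f(x_0)$, the sequence characterization rules out continuity at $0$ without any $\varepsilon$-$\delta$ argument.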
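Further, the sequence characterization gives us that the very useful tool of “pulling limits into continuous functions” works also for general functions as we have considered them here: simply plugging $x_0 = \lim_{n \to \infty} x_n$ into the theorem above, we get for any continuous function $f: X \to Y$ and any sequence $\{x_n\}_{n \in \mathbb{N}}$ over $X$ with limit $x_0 \in X$:
$$\lim_{n \to \infty} f(x_n) = f\left(\lim_{n \to \infty} x_n\right).$$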
This concludes our investigations into vector spaces here. The companion script further discusses set convexity as a weakening of the subspace concept, which is not featured in this course. Therefore, it is not too useful to investigate convexity of sets here, and you need not worry about it now. Still, there is definitely some value to be gained from familiarizing oneself with this concept, since much of convex optimization relies on the convexity of sets. If enough time remains, the subject will therefore be introduced in the lectures in the week before your first semester.
To test how well you understood the second half of the discussions on vector spaces, you can do a short quiz found here.