Measure Theory – Introduction

Welcome to the first post of ‘Measure Theory’. As you can read on the website (on the ‘About’ page), I am currently taking a measure theory class (more precisely, the class deals mostly with the Lebesgue measure), among other topics. Measure theory is more advanced math that not everyone in the academic world is familiar with. Most students who major in the exact sciences study calculus in their first year. Those who aren’t math majors usually stop there, and use the definitions and theorems of calculus in their more advanced classes. Math majors, however, do take some sort of ‘measure theory’ class, in order to fill in the holes and get better definitions of integration, size, and more.

It turns out that there are lots of problems with the definition of the integral taught in basic calculus classes. Recall that the Riemann integral is defined as a limit of Riemann sums.

In case you don’t remember what a Riemann sum is, the basic idea is to approximate the area under a graph with rectangles. As we add more and more rectangles, whose widths get smaller and smaller, we get a better approximation of the area.
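As a minimal sketch of that idea (not the code behind the animation mentioned in this post), here is a midpoint Riemann sum in Python. The function name `riemann_sum` and the choice of f(x) = x^2 on [0, 1] are assumptions made purely for illustration.

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] using n rectangles of equal width."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * width  # sample point in the i-th subinterval
        total += f(midpoint) * width      # area of the i-th rectangle
    return total


if __name__ == "__main__":
    # Illustrative choice: f(x) = x^2 on [0, 1], whose exact integral is 1/3.
    for n in (10, 100, 1000):
        print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

As n grows, the printed values should get closer and closer to 1/3, which is exactly the "more and thinner rectangles" picture.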

But that’s basic calculus. For most people this definition is more than enough: it is intuitive and really easy to demonstrate (as I did here, I made this animation with less than 100 lines of code). But it turns out that there are lots of problems with this definition.

What’s wrong with it?

Let’s start with an example. Consider the function 1_{\mathbb{Q}}:\mathbb{R}\to \mathbb{R} defined by

1_{\mathbb{Q}}(x)=\begin{cases}
1 & x\in\mathbb{Q}\\
0 & x\notin\mathbb{Q}
\end{cases}

This function is called the Dirichlet function. If you want to see what it looks like, here it is:

It looks like that for a simple reason that you might already know: the rationals are dense in the real numbers (and so are the irrationals). This function causes a problem for the Riemann integral: its integral is not defined. In every subinterval of every partition you can pick either a rational or an irrational sample point, so the Riemann sums over an interval can be made equal to the full length of the interval or to 0, and they never settle on a single limit. The function is also nowhere continuous. However, this function should have an integral, and I claim that the integral needs to be 0. Why? For a simple reason: even though the set of rationals is dense, there are a lot ‘more’ irrationals than rationals (if you are familiar with cardinals, you know that |\mathbb{Q}|=\aleph_{0}<\aleph=|\mathbb{R}|). We want to be able to integrate functions like this one, and the Riemann integral just can’t handle this kind of function. That is a real problem.
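To make the failure concrete, here is a short worked computation on [0,1]. It uses upper and lower (Darboux) sums, which is an assumption on my part: the post talks about Riemann sums, and the Darboux formulation is an equivalent way to state the same integral. The symbols P, M_i, m_i, U and L are notation introduced only for this example.

```latex
P=\{0=x_0<x_1<\dots<x_n=1\},\qquad
M_i=\sup_{[x_{i-1},x_i]}1_{\mathbb{Q}}=1,\qquad
m_i=\inf_{[x_{i-1},x_i]}1_{\mathbb{Q}}=0

U(1_{\mathbb{Q}},P)=\sum_{i=1}^{n}M_i\,(x_i-x_{i-1})=1,\qquad
L(1_{\mathbb{Q}},P)=\sum_{i=1}^{n}m_i\,(x_i-x_{i-1})=0
```

Every subinterval contains both a rational and an irrational number, so the upper sum is 1 and the lower sum is 0 for every partition P, no matter how fine. The two never meet, and that is why the Riemann integral is undefined.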

Another thing the Riemann integral struggles with is discontinuous functions. The theorems that deal with the Riemann integral usually involve (almost) continuous functions. That’s also a problem. Most functions are not continuous, and we still want to calculate the area under their graph (or something else). We want a definition that is able to deal with these kinds of functions.

There are many other problems with the Riemann integral. The question now is: how do we solve them? For starters, we want to find an ideal way to ‘measure’ sets.

What properties should a measure function satisfy?

People thought about properties they want a measure function to fulfill. Eventually, they came up with 4 properties for a measure m on \mathbb{R}.

  1. For every E\subset \mathbb{R}, m(E) is defined and satisfies 0\leq m(E)\leq \infty
  2. For every interval E\subset \mathbb{R}, m(E)=|E|, where |E| is the length of the interval.
  3. If we ‘shift’ / ‘slide’ a set, we don’t want the output of the measure to change: m(x_0+E)=m(E) for every x_0\in\mathbb{R}.
  4. If E=\bigcup_{n=1}^{\infty}E_{n} is a union of pairwise disjoint sets, we want that: m(E)=\sum_{n=1}^{\infty}m(E_{n}).
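As a quick sanity check that properties 2 and 4 fit together, here is a small worked example. The specific decomposition of [0,1) into the half-open intervals E_n is my own illustrative choice, not something taken from the class.

```latex
[0,1)=\bigcup_{n=1}^{\infty}E_n,\qquad
E_n=\left[1-\frac{1}{2^{n-1}},\,1-\frac{1}{2^{n}}\right),\qquad
m(E_n)=\frac{1}{2^{n}}

\sum_{n=1}^{\infty}m(E_n)=\sum_{n=1}^{\infty}\frac{1}{2^{n}}=1=m([0,1))
```

The pieces E_1=[0,1/2), E_2=[1/2,3/4), and so on are pairwise disjoint intervals, so property 2 gives each length and property 4 adds them up to the length of [0,1), as it should.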

All of these properties actually do make sense. Let’s look at a candidate:

The Outer measure

Let E\subset \mathbb{R} be some set. The Outer measure of E is:

m^{*}(E)=\inf\{\sum_{n=1}^{\infty}|I_{n}|:E\subset\bigcup_{n=1}^{\infty}I_{n}\}

where the I_n are open intervals, and |I_n| is the length of I_n.

This definition might look a little scary at first, but it is actually a really intuitive one. Think of open intervals as covers with which you can cover any desired set. Some covers are larger and some are smaller, and the ‘ideal’ cover of a set is obviously the smallest, right? Why would you use a large cover when you can use something shorter? Such a collection of open intervals whose union contains the set is usually called an open cover.

Notice that the outer measure is defined as an infimum, which always exists for a non-empty set of real numbers that is bounded below. Here the sums of the interval lengths are always non-negative (possibly infinite), so they are bounded below by 0, and we get 0\leq m^*(E) \leq \infty.

Let’s find some properties of the outer measure. First, let’s try to calculate the measure of a point, i.e. the measure of the set \{x_0\}. Notice that in any cover of that set there is at least one interval, let’s call it I_n, such that x_0\in I_n. The cover made of I_n alone has length smaller than (or equal to) the sum of lengths of the original cover. Because I_n is an open set, there is some \varepsilon > 0 such that (x_0-\frac{\varepsilon}{2},x_0+\frac{\varepsilon}{2})\subset I_n, and this smaller interval is itself a cover of \{x_0\} of length \varepsilon. We can make \varepsilon as small as we want and get covers of arbitrarily small length. We conclude that m^*(\{x_0\})=0. In other words, the outer measure of a point is 0, which actually makes sense.

Let’s see more: if A\subseteq B then every open cover of B is also an open cover of A. Thus m^*(A) is an infimum over a larger set of numbers than m^*(B), so we get m^*(A)\leq m^*(B).

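It turns out that m^* also satisfies property 3: if E is covered by \bigcup_n I_n then x_0+E is covered by \bigcup_n (x_0 + I_n), and since \sum_n |I_n| = \sum_n |x_0 + I_n| we get m^*(x_0+E)\leq m^*(E). Applying the same argument to the set x_0+E and the shift -x_0 gives the reverse inequality, so m^*(E)=m^*(x_0+E).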

It also satisfies property 2. I’ll prove it only for a closed interval [a,b]; for the other cases the process is similar.

We want to show that m^*([a,b])=|[a,b]|=b-a. First notice that for every \varepsilon > 0, the open interval (a-\varepsilon, b+\varepsilon) is an open cover of [a,b] of length (b-a)+2\cdot\varepsilon. So m^*([a,b])\leq (b-a)+2\cdot\varepsilon, and since this is true for any \varepsilon > 0, we get m^*([a,b])\leq b-a, which proves one of the two inequalities. On the other hand, let \{I_n\}_{n=1}^{\infty} be an open cover of [a,b]. Because [a,b] is compact (Heine-Borel), there exists a finite sub-cover, and dropping intervals can only decrease the sum of lengths, so it is enough to bound the sum over finitely many intervals from below. Write each interval as I_n=(a_n,b_n), and suppose WLOG that a\in I_1, thus a_1 < a. If b\notin I_1, then I_1 does not cover the whole interval, so b_1\in[a,b] and there is another interval I_2=(a_2,b_2) such that b_1\in I_2. Again, if b \notin I_2, then there is another interval I_3=(a_3,b_3) such that b_2\in I_3, and so on, until we find an interval I_k=(a_k,b_k) such that b\in I_k.

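This process has to end because we are dealing with a finite cover and the right endpoints b_1<b_2<\dots strictly increase. Since b_j\in I_{j+1} means a_{j+1}<b_j, every bracket in the regrouped sum below is positive, and we get: \sum_{n=1}^k |I_n|=(b_k-a_k)+(b_{k-1}-a_{k-1})+\dots+(b_1-a_1)=b_k+(b_{k-1}-a_k)+\dots+(b_1-a_2)-a_1>b_k-a_1>b-a. The sum of lengths of the whole cover is at least the sum over this chain, and this is true for every cover, thus m^*([a,b])\geq b-a, which completes the proof.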
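The chain construction in this proof is really a small algorithm, so here is a hedged Python sketch of it. The function name `chain_length` and the example cover are hypothetical, chosen only to illustrate the argument; the input is assumed to be a finite list of open intervals that really does cover [a, b].

```python
def chain_length(cover, a, b):
    """Follow the chain from the proof and return the total length of its intervals.

    `cover` is a finite list of open intervals (left, right) assumed to cover [a, b].
    Start with an interval containing a, then repeatedly pick an interval that
    contains the previous right endpoint, until some interval contains b.
    """
    # I_1: any interval of the cover that contains a.
    current = next((l, r) for (l, r) in cover if l < a < r)
    total = current[1] - current[0]
    while not (current[0] < b < current[1]):
        # b is not reached yet, so the right endpoint lies in [a, b]
        # and must be covered by another interval of the cover.
        right = current[1]
        current = next((l, r) for (l, r) in cover if l < right < r)
        total += current[1] - current[0]
    return total


if __name__ == "__main__":
    # Illustrative cover of [0, 1] by three overlapping open intervals.
    cover = [(-0.1, 0.4), (0.3, 0.7), (0.6, 1.2)]
    print(chain_length(cover, 0.0, 1.0))  # strictly larger than b - a = 1
```

Each step moves to an interval with a strictly larger right endpoint, which is why the loop terminates for a finite cover. If the list is not actually a cover, `next` raises `StopIteration`, which the sketch does not try to handle.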

So far we have proved that the outer measure satisfies properties 1-3. What about property 4? Right now, I will only prove that if (E_n)_{n=1}^\infty is a sequence of sets in \mathbb{R} and if E=\bigcup_n E_n then: m^*(E) \leq \sum_{n=1}^\infty m^*(E_n).

The idea of the proof is to take an open cover for each E_n, picking covers whose total length is as close to m^*(E_n) as we wish.

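Let’s write it formally. We may assume that every m^*(E_n) is finite, since otherwise the right-hand side is infinite and there is nothing to prove. Let \varepsilon > 0 be a positive real number. For each set E_n, we choose a cover (I_{k,n})_{k=1}^\infty such that: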

\sum_{k=1}^\infty |I_{k,n}|\leq m^*(E_n) + \frac{\varepsilon}{2^n}

It is possible to choose such a cover since m^*(E_n) is defined as an infimum. Now, observe that the union of all those covers is a cover of E=\bigcup_n E_n, so the sum \sum_{k,n}|I_{k,n}| is greater than or equal to the outer measure of E. So:

m^*(E)\leq \sum_{k,n}|I_{k,n}|=\sum_{n=1}^\infty\sum_{k=1}^\infty|I_{k,n}| \leq \sum_{n=1}^\infty \left(m^*(E_n) + \frac{\varepsilon}{2^n}\right)

Recall that \sum_{n=1}^\infty \frac{1}{2^n}=1, so we get

m^*(E)\leq  \sum_{n=1}^\infty m^*(E_n) + \varepsilon

This is true for all \varepsilon > 0, thus m^*(E) \leq \sum_{n=1}^{\infty}m^*(E_n), and we have proved the statement.
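The \varepsilon/2^n trick from this proof can be tried out on a concrete case. The sketch below is an illustration under my own assumptions: it takes each E_n to be a single point and uses exact rational arithmetic, and the names `shrinking_cover` and `total_length` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction


def shrinking_cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2**n."""
    cover = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 1)  # half of eps / 2**n
        cover.append((x - half, x + half))
    return cover


def total_length(cover):
    """Sum of the lengths of the intervals in the cover."""
    return sum(right - left for left, right in cover)


if __name__ == "__main__":
    # Illustrative input: some rationals in [0, 1] (repeats are harmless).
    points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
    eps = Fraction(1, 100)
    print(total_length(shrinking_cover(points, eps)) < eps)
```

However many points are listed, the total length stays below \varepsilon, because the lengths are bounded by the geometric series used in the proof. Run on an enumeration of a countable set, the same idea shows that the set has outer measure 0, and this applies in particular to the rationals from the Dirichlet example.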

Conclusions

We have seen that the outer measure fulfills almost all 4 requirements: properties 1-3 hold, but for property 4 we only proved an inequality. It turns out that there is no function that satisfies all 4 properties. In the next post I am going to prove why such a function cannot exist, and I’ll discuss solutions. It turns out that we can get past this problem, and not everything is lost after all. That’s it for now, see you in the next one!
