In the last post I’ve introduced the term of **trees** and proved a theorem that allows us to fully understand trees. They are **connected**, they have **edges **(where is the number of vertices) and they have **no cycles**.

I’ve also presented the term of a **spanning sub-graph** – which is a sub-graph of a graph such that .

Today I would like to discuss about **spanning trees** – which are exactly what you thinks they are: A spanning subgraph which is also a tree.

As it turns out, spanning trees are very important, therefore, people started wondering how many spanning trees are there in a graph.

**Cayley’s** formula gives us a (partial) solution, it tells us how many spanning trees are there in the **complete graph** . For example, consider all the spanning trees in :

There are only 3 here, but that makes sense cause we don’t have too many options. What about ?

Well… that’s escalated quickly! There are 16 spanning trees in contrast to only 3 in – I don’t even want to start drawing the spanning trees of … But if you’re wondering, there are of them.

## The formula

I am going to denote the number of spanning trees of as . I want you to notice some stuff before I am presenting the formula:

You may notice that some spanning trees are **isomorphic** to each other, for example, **all** of the spanning trees of are isomorphic. However, I still consider them as different trees. Why? Let’s name the vertices of :

I think it is clearer now, for example, the left spanning tree in the example is the graph:

T_1=(\{1,2,3\},\{\{1,2\},\{1,3\}\})

On the other hand, the graph on the middle is:

T_2=(\{1,2,3\},\{\{1,2\},\{2,3\}\})

Clearly, they are not the same, but The only thing that makes them really different is the **lable** of the vertices. Thus, we are counting **labeled** **trees**, and that’s an important thing to understand – If we considered isomorphic trees as the same, we would have get that the number of spanning trees in is one – and that is not what I am aiming for.

I also want you to understand that we can treat **any** graph with vertices as a subgraph of . Therefore, our question is equivalent to the question: “How many trees with vertices are there?”

OK, we are now ready to present Cayley’s formula: The number of spanning trees of is exactly .

This formula explains why the number of spanning trees ‘grew up’ so fast when we moved from to . Notice how simple the formula is – it is very short and even a high-schooler can understand it.

## Goals for this post

After we’ve seen Cayley’s formula, we also need to prove it and understand why it works. The proof I will bring today is a proof from 1981 by **Joyal**. The beautiful thing about this proof, is that it gives us an **algorithm** to randomally generate trees! Therefore the rest of the post will be splitted into 3 parts:

- Some preparations for the proof.
- The proof.
- An actual algorithm that generates trees.

In order to understand the algorithm, you need to understand the proof as well, however, if you understand the proof and you are not a fan of algorithms, you can skip this part without a problem, since I will not use it in the upcoming posts. However, I do suggest you to check it out, in my opinion, it is really cool to see how we can make the computer do the hard work for us in such a simple way.

## preparations for the proof

I only want to prove two small lemmas, suppuse that is a tree, then:

- There is a unique path between any two vertices.
- If you add
**one**edge to , you will create only**one**cycle.

The proofs are really easy:

- Aiming for contradiction, suppose that there are two paths between and . Therefore, we can find a cycle in the graph (convince yourself!) and that’s a contradiction to being a tree (since a tree has no cycles).
- Suppose that adding one edge to creates
**two**cycles: and . We now remove the added edge and we get that and are two different paths between and , and that’s a contradiction to the first statement!

Ok, I think we are ready to prove Cayley’s formula.

## The proof

The proof is going to be done step by step, since it is a little bit complicated and since brinining the proof organized and completemakes it much easier to understand.

#### Part 1 – Describing our plan

My goal is to define a **bijection** between the set of functions from to itself:

[n]^{[n]}=\{f:[n]\to[n]\}

To the sets of triples:

\{(T,L,R)\}

Where is a tree, and are some vertices in the tree (they can be the same).

If we will manage to find a bijection, this would imply that the sets have the same number of elements:

|\{(T,L,R)\}|=|[n]^{[n]}|

Recall that the number of function from to itself is exactly (we have options for the output of each element from the set, and there are elements in the set). Thus:

|\{(T,L,R)\}|=n^n

On the other hand:

|\{(T,L,R)\}|=t_n\cdot n\cdot n=t_n \cdot n^2

Recall that is the number of trees, and are just arbitrary vertices, and there are vertices, therefore, there are different options for the value of and .

We can now compare both sizes of to get:

t_n \cdot n^2=n^n\Rightarrow t_n=n^{n-2}

And that’s exactly what we wanted!

So In the proof I will describe such a bijection, and I’ll start by creating a map from to .

#### Part 2 – From function to directed graph

Suppose that is some arbitrary function. We can represent like that:

\left(\begin{array}{cccc} 1 & 2 & \cdots & n\\ f(1) & f(2) & \cdots & f(n) \end{array}\right)

There’s nothing special here, this is just a comfortable way to represent to function. I am going to define now the **functional graph of ** – which is a graph that describes how works – It is a **directed** graph that an edge starts at some and ends at . Formally:

G_f=([n],\{(i,f(i)):1\leq i\leq n\})

For example, consider the function:

f=\left(\begin{array}{cccccccc} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\ 3 & 4 & 2 & 3 & 5 & 7 & 6 & 1 \end{array}\right)

It’s functional graph is:

It is a visual object that tells us how the function operates. However, we can learn much more from it.

#### Part 3 – What can we learn from the functional graph?

The first thing we can see, is that the graph doesn’t have to be connected, so we will explore each components separately. Now, observe that from every vertex, there is **exactly** one edge coming out of it – which is the edge . Therefore, the number of edges in each component equals to the number of vertices in the component (you can think of a vertex and an edge coming out of it as a pair – every directed edge has a starting vertex, and every vertex has only one directed edge coming out of it).

Suppose that in some components there are vertices, thus, there are edges as well. Since the component is a connected graph by definition, we conclude that it is actually **a tree with an extra edge**! By the lemma I’ve proved before, we know that the component has **exactly one** cycle.

I want to mention one more thing though, notice that if you pick some arbitrary vertex , there is only only way to ‘get out’ it – since there is exactly one edge coming out of it (which is ).Therefore, if you start a walk in an arbitrary vertex, sooner or later you will find yoursef inside a circle! If you want a formal proof for it (as I wanted) here you go:

##### Quick explanation

Start an arbitrary walk on some component with vertices, every time you reach a new vertex you label it as ‘visited’ (The first vertex will be marked as ‘visited’ as well). Statring the walk in a vertex , from there going to the vertex and so on. We repeat it times.

If during the process we’ve reached a ‘visited’ vertex, this means we’ve completed a cycle, therefore, we started from an edge pointing towards the cycle.

Notice that this will **always** be the case. After the first steps on the walk, in the ‘worst’ case, we visited each vertex exactly once, then in the next step, we must visit a ‘visited’ vertex.

Let me illustrate it:

In our example. Let’s start our walk from 8, mark 8 as ‘visited’ and then proceed to which is 1. Mark 1 as visited and proceed to . There are 5 vertices in the components of 8. After 5 steps, the walk we get it:

8\to1\to3\to2\to4

And all now all the vertices in the component are marked as ‘visited’. The next vertex in the walk is going to be which is already visited, thus:

3\to2\to4\to3

Is a cycle in the component, and we found out that the edge from actually brings us closer to the cycle.

#### Part 4 – Creating the tree

We are know finally ready to construct the tree. First define to be the set of vertices which are in a cycle. In our example, . Now there are 2 observations:

- If , then . By the construction of , is a part of a cycle, since only one edge comes out of , then this edge must end in vertex which is a vertex of the same cycle as . Thus, . This allows us to treat as a function from to
**itself**: . - is
**onto**. Suppose that , then it is part of a cycle, thus, there must be some in the cycle such that there is an edge from to . is a part of a cycle, thus , and we know that .

Now, since is onto and has the same domain and rangle (which have the same number of elements) we conclude that is a **bijection**!

This fact allows us to write as a **permutation** of the element of :

f|_M=\left(\begin{array}{cccc} v_{1} & v_{2} & \cdots & v_{k}\\ f(v_{1}) & f(v_{2}) & \cdots & f(v_{k}) \end{array}\right)

Where .

We now define: . and create a path:

P=(f(v_1),f(v_2),\dots,f(v_k))

Since is a bijection, it is indeed a path made of the vertices (convince yourself!).

We now add all the vertices outside of , and the edges coming out from them to out path to get a tree.

In our example:

f|_M=\left(\begin{array}{cccccc} 2 & 3 & 4 & 5 & 6 & 7\\ 4 & 2 & 3 & 5 & 7 & 6 \end{array}\right)

, and the path will be:

The rest of the vertices are 1 and 8, and the edges are and . Adding it to the path to get the tree :

And we now have the triple .

### Part 5 – proving that we got a tree

Our construction is good and we like it very much, however, we didn’t even proved that it is indeed a tree! So let’s prove it:

**It has edges**: the set contains vertices and created a path, thus it ‘donates’ to the tree edges. Moreover, there are vertices outside of , where each one ‘donates’ **one** edge (which is ). Therefore, there are edges coming from vertices outside of . Thus we have a total of edges, as we wanted.

**It is connected**: Pick two vertices . If they are from the same component in the functional graph, then there is a path between them.

Otherwise, there is a path from to where is a vertex from the same component of , and it is part of a cycle (This one follows from this fact)

Similarly, there is a path from to where is a vertex from the same component of , and it is part of a cycle.

There is also a path between and , since both of them are part of cycles, therefore the path is a a path between and .

#### Part 6 – proving the map is a bijection

Perfect, we now found a way to create a tree out of a function , (this way can easily be translated to an **algorithm**, which I’ll discuss about later). However, we still need to prove that the function is a bijection. But that’s easy, we can do everything we did here from end to start to get from the triple, the initial function:

Given a triple find a path between to and denote it by .

The path is made of integeres so we can sort it. Now consider the permutation:

\left(\begin{array}{ccccc} u_{1} & u_{2} & \cdots & u_{m-1} & u_{m}\\ L & v_{2} & \cdots & v_{m-1} & R \end{array}\right)

Where are just the elements of the path sorted.

We can now define a function as:

f^\prime(u_1)=L,f^\prime(u_2)=v_2,\dots,f^\prime(u_m)=R

For example, in this tree:

We have , and . The path is . Sotring the element to get the premutation:

f^\prime=\left(\begin{array}{cccccc} 2 & 3 & 4 & 5 & 6 & 7\\ 4 & 2 & 3 & 5 & 7 & 6 \end{array}\right)\begin{array}{c} \text{:sorted}\\ :P \end{array}

Then we exapnd to all the vertices that are incident to vertices in , and define for each such vertex which is incident to , the directed edge or . If all vertices are now in the domain of we are done and we’ve created a function . If not, we add the vertices that are adjacent to one of those we added in the previous step, and since the number of vertieces it finite, the process must come to an end.

In our example, first we add 1 to the domain and define . Then we add 8 to the domain and define . This yields the function:

f=\left(\begin{array}{cccccccc} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\ 3 & 4 & 2 & 3 & 5 & 7 & 6 & 1 \end{array}\right)

And that’s exactly the function we started with!

In cocnlustion, we just proved that this map is invertiable, thus, it is an injection and the prove is done!

## The algorithm

I am now going to show you how to implement this prove in code, I am going to construct a function that get’s an array that will represent a function, and it will print what are and the edges of the tree.

I am going to write to code in **Java**, however, it is really not that complicated, and will be very similar in other languages.

I will implement the method:

```
public void treeFromFunction(int[] f)
```

The input is an array of integers that will represent a function . How exactly? that’s simple – the value of is just . However, recall that arrays start with index 0, thus out function will be slightly different, it will be a function from to itself.

##### The implementation

I am going to do it step by step:

- Denote by the length of the array.
- Find the vertices that are part of cycles in the functional graph (which can be easily represented using the array)
- Creating the tree – Setting up the path, and adding the rest.
- Defining the values of
- printing and the edges of the tree.

This is how the method looks like:

```
public void treeFromFunction(int[] f) {
//part 1
int n = f.length;
//part 2
int[] verticesOfCycles = getVerticesOfCycles(n, f);
Arrays.sort(verticesOfCycles);
int lastIdx = verticesOfCycles.length - 1;
//part 3
int[][] tree = createTree(f, verticesOfCycles, n);
//part 4
int L = f[verticesOfCycles[0]];
int R = f[verticesOfCycles[lastIdx]];
//part 5
System.out.println("L = " + L + ", R = " + R);
printTree(tree, n);
}
```

Part 1 is pretty clear, nothing special there. Part 2 is where things get interesting, I am using a method that finds all the vertices that are part of a cycle.

Let’s see how this method works:

#### Finding vertices of cycles

The method is based on the idea that I have desribed here. I am going to iterate over each vertex and do the following

- Create an array called ‘visited’ that will help me to indicate if a vertex is visited or not
- Start a walk from the specific vertex , and mark visited vertices
- If I reach a visited vertex, it means that a cycle was found. Now there are two options:
- The visited vertex is – Then is part of a cycle, so I’ll add it to the list
- The visited vertex is
**not**– Then even though we found a cycle, is not a part of it, so I’ll proceed to the next vertex.

Notice that I will **always** complete a cycle, no matter what vertex I am starting from. It follows from this fact.

In this process I end up with a list of vertices that are part of a cycle (which is guaranteed not to be empty by the proof) and the only thing left to do is just translate the list into an array and return the array.

This is the code for this method:

```
int[] getVerticesOfCycles(int n, int[] f) {
//iterating over all the vertices.
List<Integer> listOfVerticesOfCycles = new LinkedList<>();
for (int i = 0; i < n; i++) {
boolean[] visited = new boolean[n];
int current = i;
while (true) {
// case where we got back to where we started
if (current == i && visited[current]) {
listOfVerticesOfCycles.add(i);
break;
}
visited[current] = true;
int next = f[current];
if (visited[next] && next != i) break;
else current = next;
}
}
int[] verticesOfCycles = new int [listOfVerticesOfCycles.size()];
for(int i = 0; i < verticesOfCycles.length; i ++) {
verticesOfCycles[i] = listOfVerticesOfCycles.get(i).intValue();
}
return verticesOfCycles;
}
```

(For some reason it when I type “&” the output is “&” so be aware of that…)

The only part left to explain is how I represent the tree. As you can see, the tree is a **2D** **array** in my code. Why? since I am going to represent it using the **adjacency matrix**. I have discussed about this matrix in this post, so if you are not familiar with it, you are more than welcome to check out this post.

##### Setting the tree

There are two types of vertices:

- Those who part of cycles
- The rest

We will go over all the vertices and do the following, assuming that the set of vertices of cycles is :

- If the vertex is an element of , so it equals to for some . Then we will add the edge . However, if than doesn’t exists, so we will do nothing.
- Otherwise , then we will just add the edge .

Here is the code for this method:

```
int[][] createTree(int[] f, int[] verticesOfCycles, int n) {
int[][] tree = new int[n][n];
for (int i = 0; i < n; i++) {
if (isInArray(verticesOfCycles, i)) {
int idx = indexOfInArray(verticesOfCycles, i);
if (idx != verticesOfCycles.length - 1) {
int nextIdx = verticesOfCycles[idx + 1];
tree[f[i]][f[nextIdx]] = 1;
}
} else{
tree[i][f[i]] = 1;
}
}
return tree;
}
//helpers
boolean isInArray(int[] arr, int v) {
for(int i = 0; i < arr.length; i ++) {
if (arr[i] == v) return true;
}
return false;
}
int indexOfInArray(int[] arr, int v) {
for(int i = 0; i < arr.length; i ++) {
if (arr[i] == v) return i;
}
return -1;
}
```

#### Printing the tree

There is nothing complicated here:

```
void printTree(int[][] tree, int n) {
System.out.println("The edges of the tree are:");
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
if (tree[i][j] == 1) System.out.println("{" + i + "," + j + "}");
}
}
}
```

##### Quick test for the code

We can easlly see that the code is fine:

```
public static void main(String[] args) {
CayleyAlgorithm cayleyAlgorithm = new CayleyAlgorithm();
int[] f = {4,3,1,0,2,1,7,3};
cayleyAlgorithm.treeFromFunction(f);
}
```

The output is:

```
L = 4, R = 2
The edges of the tree are:
{0,2}
{1,0}
{3,1}
{4,3}
{5,1}
{6,7}
{7,3}
```

You can verify it yourself and see if you got the same tree!

If you wish, you are welcome to download the file and test it a little, modify it or whatever you wanna do with. You can get it in this link:

https://github.com/TairGalili/Graph-Theory-Posts.git

## Summary

We’ve seen in this post one of the most famous theorems in graph theory and saw how we can implement a proof into a code. Now if you are bored or something like that, you can generate a random function from to itself to get a tree! In the next post I want to generalize Cayley’s theorem, and find out how many spanning trees are there in **any** graph, and I warn in advance – prepare yourself for some **linear algebra**!

## One thought on “Cayley’s formula”