In the last post I’ve introduced the term of trees and proved a theorem that allows us to fully understand trees. They are connected, they have edges (where
is the number of vertices) and they have no cycles.
I’ve also presented the term of a spanning sub-graph – which is a sub-graph of a graph
such that
.
Today I would like to discuss about spanning trees – which are exactly what you thinks they are: A spanning subgraph which is also a tree.

As it turns out, spanning trees are very important, therefore, people started wondering how many spanning trees are there in a graph.
Cayley’s formula gives us a (partial) solution, it tells us how many spanning trees are there in the complete graph . For example, consider all the spanning trees in
:

There are only 3 here, but that makes sense cause we don’t have too many options. What about ?

Well… that’s escalated quickly! There are 16 spanning trees in contrast to only 3 in – I don’t even want to start drawing the spanning trees of
… But if you’re wondering, there are
of them.
The formula
I am going to denote the number of spanning trees of as
. I want you to notice some stuff before I am presenting the formula:
You may notice that some spanning trees are isomorphic to each other, for example, all of the spanning trees of are isomorphic. However, I still consider them as different trees. Why? Let’s name the vertices of
:

I think it is clearer now, for example, the left spanning tree in the example is the graph:
T_1=(\{1,2,3\},\{\{1,2\},\{1,3\}\})
On the other hand, the graph on the middle is:
T_2=(\{1,2,3\},\{\{1,2\},\{2,3\}\})
Clearly, they are not the same, but The only thing that makes them really different is the lable of the vertices. Thus, we are counting labeled trees, and that’s an important thing to understand – If we considered isomorphic trees as the same, we would have get that the number of spanning trees in is one – and that is not what I am aiming for.
I also want you to understand that we can treat any graph with vertices as a subgraph of
. Therefore, our question is equivalent to the question: “How many trees with
vertices are there?”
OK, we are now ready to present Cayley’s formula: The number of spanning trees of is exactly
.
This formula explains why the number of spanning trees ‘grew up’ so fast when we moved from to
. Notice how simple the formula is – it is very short and even a high-schooler can understand it.
Goals for this post
After we’ve seen Cayley’s formula, we also need to prove it and understand why it works. The proof I will bring today is a proof from 1981 by Joyal. The beautiful thing about this proof, is that it gives us an algorithm to randomally generate trees! Therefore the rest of the post will be splitted into 3 parts:
- Some preparations for the proof.
- The proof.
- An actual algorithm that generates trees.
In order to understand the algorithm, you need to understand the proof as well, however, if you understand the proof and you are not a fan of algorithms, you can skip this part without a problem, since I will not use it in the upcoming posts. However, I do suggest you to check it out, in my opinion, it is really cool to see how we can make the computer do the hard work for us in such a simple way.
preparations for the proof
I only want to prove two small lemmas, suppuse that is a tree, then:
- There is a unique path between any two vertices.
- If you add one edge to
, you will create only one cycle.
The proofs are really easy:
- Aiming for contradiction, suppose that there are two paths between
and
. Therefore, we can find a cycle in the graph (convince yourself!) and that’s a contradiction to
being a tree (since a tree has no cycles).
- Suppose that adding one edge
to
creates two cycles:
and
. We now remove the added edge and we get that
and
are two different paths between
and
, and that’s a contradiction to the first statement!
Ok, I think we are ready to prove Cayley’s formula.
The proof
The proof is going to be done step by step, since it is a little bit complicated and since brinining the proof organized and completemakes it much easier to understand.
Part 1 – Describing our plan
My goal is to define a bijection between the set of functions from to itself:
[n]^{[n]}=\{f:[n]\to[n]\}
To the sets of triples:
\{(T,L,R)\}
Where is a tree, and
are some vertices in the tree (they can be the same).
If we will manage to find a bijection, this would imply that the sets have the same number of elements:
|\{(T,L,R)\}|=|[n]^{[n]}|
Recall that the number of function from to itself is exactly
(we have
options for the output of each element from the set, and there are
elements in the set). Thus:
|\{(T,L,R)\}|=n^n
On the other hand:
|\{(T,L,R)\}|=t_n\cdot n\cdot n=t_n \cdot n^2
Recall that is the number of trees, and
are just arbitrary vertices, and there are
vertices, therefore, there are
different options for the value of
and
.
We can now compare both sizes of to get:
t_n \cdot n^2=n^n\Rightarrow t_n=n^{n-2}
And that’s exactly what we wanted!
So In the proof I will describe such a bijection, and I’ll start by creating a map from to
.
Part 2 – From function to directed graph
Suppose that is some arbitrary function. We can represent
like that:
\left(\begin{array}{cccc} 1 & 2 & \cdots & n\\ f(1) & f(2) & \cdots & f(n) \end{array}\right)
There’s nothing special here, this is just a comfortable way to represent to function. I am going to define now the functional graph of –
which is a graph that describes how
works – It is a directed graph that an edge starts at some
and ends at
. Formally:
G_f=([n],\{(i,f(i)):1\leq i\leq n\})
For example, consider the function:
f=\left(\begin{array}{cccccccc} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\ 3 & 4 & 2 & 3 & 5 & 7 & 6 & 1 \end{array}\right)
It’s functional graph is:

It is a visual object that tells us how the function operates. However, we can learn much more from it.
Part 3 – What can we learn from the functional graph?
The first thing we can see, is that the graph doesn’t have to be connected, so we will explore each components separately. Now, observe that from every vertex, there is exactly one edge coming out of it – which is the edge . Therefore, the number of edges in each component equals to the number of vertices in the component (you can think of a vertex and an edge coming out of it as a pair – every directed edge has a starting vertex, and every vertex has only one directed edge coming out of it).
Suppose that in some components there are vertices, thus, there are
edges as well. Since the component is a connected graph by definition, we conclude that it is actually a tree with an extra edge! By the lemma I’ve proved before, we know that the component has exactly one cycle.
I want to mention one more thing though, notice that if you pick some arbitrary vertex , there is only only way to ‘get out’ it – since there is exactly one edge coming out of it (which is
).Therefore, if you start a walk in an arbitrary vertex, sooner or later you will find yoursef inside a circle! If you want a formal proof for it (as I wanted) here you go:
Quick explanation
Start an arbitrary walk on some component with vertices, every time you reach a new vertex you label it as ‘visited’ (The first vertex will be marked as ‘visited’ as well). Statring the walk in a vertex
, from there going to the vertex
and so on. We repeat it
times.
If during the process we’ve reached a ‘visited’ vertex, this means we’ve completed a cycle, therefore, we started from an edge pointing towards the cycle.
Notice that this will always be the case. After the first steps on the walk, in the ‘worst’ case, we visited each vertex exactly once, then in the next step, we must visit a ‘visited’ vertex.
Let me illustrate it:
In our example. Let’s start our walk from 8, mark 8 as ‘visited’ and then proceed to which is 1. Mark 1 as visited and proceed to
. There are 5 vertices in the components of 8. After 5 steps, the walk we get it:
8\to1\to3\to2\to4
And all now all the vertices in the component are marked as ‘visited’. The next vertex in the walk is going to be which is already visited, thus:
3\to2\to4\to3
Is a cycle in the component, and we found out that the edge from actually brings us closer to the cycle.
Part 4 – Creating the tree
We are know finally ready to construct the tree. First define to be the set of vertices which are in a cycle. In our example,
. Now there are 2 observations:
- If
, then
. By the construction of
,
is a part of a cycle, since only one edge comes out of
, then this edge must end in vertex
which is a vertex of the same cycle as
. Thus,
. This allows us to treat
as a function from
to itself:
.
is onto. Suppose that
, then it is part of a cycle, thus, there must be some
in the cycle such that there is an edge from
to
.
is a part of a cycle, thus
, and we know that
.
Now, since is onto and has the same domain and rangle (which have the same number of elements) we conclude that
is a bijection!
This fact allows us to write as a permutation of the element of
:
f|_M=\left(\begin{array}{cccc} v_{1} & v_{2} & \cdots & v_{k}\\ f(v_{1}) & f(v_{2}) & \cdots & f(v_{k}) \end{array}\right)
Where .
We now define: . and create a path:
P=(f(v_1),f(v_2),\dots,f(v_k))
Since is a bijection, it is indeed a path made of the vertices
(convince yourself!).
We now add all the vertices outside of , and the edges coming out from them to out path to get a tree.
In our example:
f|_M=\left(\begin{array}{cccccc} 2 & 3 & 4 & 5 & 6 & 7\\ 4 & 2 & 3 & 5 & 7 & 6 \end{array}\right)
, and the path
will be:

The rest of the vertices are 1 and 8, and the edges are and
. Adding it to the path to get the tree
:

And we now have the triple .
Part 5 – proving that we got a tree
Our construction is good and we like it very much, however, we didn’t even proved that it is indeed a tree! So let’s prove it:
It has edges: the set
contains
vertices and created a path, thus it ‘donates’ to the tree
edges. Moreover, there are
vertices outside of
, where each one ‘donates’ one edge (which is
). Therefore, there are
edges coming from vertices outside of
. Thus we have a total of
edges, as we wanted.
It is connected: Pick two vertices . If they are from the same component in the functional graph, then there is a path between them.
Otherwise, there is a path from
to
where
is a vertex from the same component of
, and it is part of a cycle (This one follows from this fact)
Similarly, there is a path from
to
where
is a vertex from the same component of
, and it is part of a cycle.
There is also a path between
and
, since both of them are part of cycles, therefore the path
is a a path between
and
.
Part 6 – proving the map is a bijection
Perfect, we now found a way to create a tree out of a function , (this way can easily be translated to an algorithm, which I’ll discuss about later). However, we still need to prove that the function is a bijection. But that’s easy, we can do everything we did here from end to start to get from the triple, the initial function:
Given a triple find a path between
to
and denote it by
.
The path is made of integeres so we can sort it. Now consider the permutation:
\left(\begin{array}{ccccc} u_{1} & u_{2} & \cdots & u_{m-1} & u_{m}\\ L & v_{2} & \cdots & v_{m-1} & R \end{array}\right)
Where are just the elements of the path sorted.
We can now define a function as:
f^\prime(u_1)=L,f^\prime(u_2)=v_2,\dots,f^\prime(u_m)=R
For example, in this tree:

We have , and
. The path is
. Sotring the element to get the premutation:
f^\prime=\left(\begin{array}{cccccc} 2 & 3 & 4 & 5 & 6 & 7\\ 4 & 2 & 3 & 5 & 7 & 6 \end{array}\right)\begin{array}{c} \text{:sorted}\\ :P \end{array}
Then we exapnd to all the vertices that are incident to vertices in
, and define for each such vertex
which is incident to
, the directed edge
or
. If all vertices are now in the domain of
we are done and we’ve created a function
. If not, we add the vertices that are adjacent to one of those we added in the previous step, and since the number of vertieces it finite, the process must come to an end.
In our example, first we add 1 to the domain and define . Then we add 8 to the domain and define
. This yields the function:
f=\left(\begin{array}{cccccccc} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\ 3 & 4 & 2 & 3 & 5 & 7 & 6 & 1 \end{array}\right)
And that’s exactly the function we started with!
In cocnlustion, we just proved that this map is invertiable, thus, it is an injection and the prove is done!
The algorithm
I am now going to show you how to implement this prove in code, I am going to construct a function that get’s an array that will represent a function, and it will print what are and the edges of the tree.
I am going to write to code in Java, however, it is really not that complicated, and will be very similar in other languages.
I will implement the method:
public void treeFromFunction(int[] f)
The input is an array of integers that will represent a function . How exactly? that’s simple – the value of
is just
. However, recall that arrays start with index 0, thus out function will be slightly different, it will be a function from
to itself.
The implementation
I am going to do it step by step:
- Denote by
the length of the array.
- Find the vertices that are part of cycles in the functional graph (which can be easily represented using the array)
- Creating the tree – Setting up the path, and adding the rest.
- Defining the values of
- printing
and the edges of the tree.
This is how the method looks like:
public void treeFromFunction(int[] f) {
//part 1
int n = f.length;
//part 2
int[] verticesOfCycles = getVerticesOfCycles(n, f);
Arrays.sort(verticesOfCycles);
int lastIdx = verticesOfCycles.length - 1;
//part 3
int[][] tree = createTree(f, verticesOfCycles, n);
//part 4
int L = f[verticesOfCycles[0]];
int R = f[verticesOfCycles[lastIdx]];
//part 5
System.out.println("L = " + L + ", R = " + R);
printTree(tree, n);
}
Part 1 is pretty clear, nothing special there. Part 2 is where things get interesting, I am using a method that finds all the vertices that are part of a cycle.
Let’s see how this method works:
Finding vertices of cycles
The method is based on the idea that I have desribed here. I am going to iterate over each vertex and do the following
- Create an array called ‘visited’ that will help me to indicate if a vertex is visited or not
- Start a walk from the specific vertex
, and mark visited vertices
- If I reach a visited vertex, it means that a cycle was found. Now there are two options:
- The visited vertex is
– Then
is part of a cycle, so I’ll add it to the list
- The visited vertex is not
– Then even though we found a cycle,
is not a part of it, so I’ll proceed to the next vertex.
- The visited vertex is
Notice that I will always complete a cycle, no matter what vertex I am starting from. It follows from this fact.
In this process I end up with a list of vertices that are part of a cycle (which is guaranteed not to be empty by the proof) and the only thing left to do is just translate the list into an array and return the array.
This is the code for this method:
int[] getVerticesOfCycles(int n, int[] f) {
//iterating over all the vertices.
List<Integer> listOfVerticesOfCycles = new LinkedList<>();
for (int i = 0; i < n; i++) {
boolean[] visited = new boolean[n];
int current = i;
while (true) {
// case where we got back to where we started
if (current == i && visited[current]) {
listOfVerticesOfCycles.add(i);
break;
}
visited[current] = true;
int next = f[current];
if (visited[next] && next != i) break;
else current = next;
}
}
int[] verticesOfCycles = new int [listOfVerticesOfCycles.size()];
for(int i = 0; i < verticesOfCycles.length; i ++) {
verticesOfCycles[i] = listOfVerticesOfCycles.get(i).intValue();
}
return verticesOfCycles;
}
(For some reason it when I type “&” the output is “&” so be aware of that…)
The only part left to explain is how I represent the tree. As you can see, the tree is a 2D array in my code. Why? since I am going to represent it using the adjacency matrix. I have discussed about this matrix in this post, so if you are not familiar with it, you are more than welcome to check out this post.
Setting the tree
There are two types of vertices:
- Those who part of cycles
- The rest
We will go over all the vertices and do the following, assuming that the set of vertices of cycles is :
- If the vertex is an element of
, so it equals to
for some
. Then we will add the edge
. However, if
than
doesn’t exists, so we will do nothing.
- Otherwise
, then we will just add the edge
.
Here is the code for this method:
int[][] createTree(int[] f, int[] verticesOfCycles, int n) {
int[][] tree = new int[n][n];
for (int i = 0; i < n; i++) {
if (isInArray(verticesOfCycles, i)) {
int idx = indexOfInArray(verticesOfCycles, i);
if (idx != verticesOfCycles.length - 1) {
int nextIdx = verticesOfCycles[idx + 1];
tree[f[i]][f[nextIdx]] = 1;
}
} else{
tree[i][f[i]] = 1;
}
}
return tree;
}
//helpers
boolean isInArray(int[] arr, int v) {
for(int i = 0; i < arr.length; i ++) {
if (arr[i] == v) return true;
}
return false;
}
int indexOfInArray(int[] arr, int v) {
for(int i = 0; i < arr.length; i ++) {
if (arr[i] == v) return i;
}
return -1;
}
Printing the tree
There is nothing complicated here:
void printTree(int[][] tree, int n) {
System.out.println("The edges of the tree are:");
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
if (tree[i][j] == 1) System.out.println("{" + i + "," + j + "}");
}
}
}
Quick test for the code
We can easlly see that the code is fine:
public static void main(String[] args) {
CayleyAlgorithm cayleyAlgorithm = new CayleyAlgorithm();
int[] f = {4,3,1,0,2,1,7,3};
cayleyAlgorithm.treeFromFunction(f);
}
The output is:
L = 4, R = 2
The edges of the tree are:
{0,2}
{1,0}
{3,1}
{4,3}
{5,1}
{6,7}
{7,3}
You can verify it yourself and see if you got the same tree!
If you wish, you are welcome to download the file and test it a little, modify it or whatever you wanna do with. You can get it in this link:
https://github.com/TairGalili/Graph-Theory-Posts.git
Summary
We’ve seen in this post one of the most famous theorems in graph theory and saw how we can implement a proof into a code. Now if you are bored or something like that, you can generate a random function from to itself to get a tree! In the next post I want to generalize Cayley’s theorem, and find out how many spanning trees are there in any graph, and I warn in advance – prepare yourself for some linear algebra!
One thought on “Cayley’s formula”