Problem
How many bits are needed to represent a binary tree with \(n\) nodes?
A lower bound is given by the fact that \(m\) bits can represent at most \(2^{m}\) entities. So the problem reduces to one of:
How many binary trees exist with \(n\) internal nodes? The answer is then the log of it.
Solution
Hereafter, when I say node, I mean internal node.
Let’s represent this quantity by \(T_{n}\).
Consider a binary tree with \(k>0\) nodes on the left subtree. The number of trees that exist in such a configuration is given by \(T_{k}T_{n-k-1}\) (the first term is for the left subtree, the second is for the right). Thus, we have:
Note that the Kronecker delta function takes care of the degenerate case where \(n=0\). This term is critical in our analysis, and it’s easy to forget.
Now comes the key trick. Define \(T(z)=\sum_{n=0}^{\infty}T_{n}z^{n}\). This is the generating function of \(T_{n}\).
Multiply both sides of our equation by \(z^{n}\) and sum it to get:
But also, let’s square our original definition of \(T(z)\):
Consider a given \(u=m+n\). Note that the way to form \(z^{u}\) is:
But this is practically \(T_{u+1}\).
So we get:
Solving for \(zT(z)\) gives:
Note: the other solution diverges at \(z=0\).
Make use of the following Binomial expansion:
Matching coefficients yields
Expanding it out and doing several manipulations, we get
These are the Catalan Numbers.
Using Stirling’s Approximation:
for large \(n\), we get: