\(n\)">

Number of Binary Trees of Size \(n\)

Posted by Beetle B. on Tue 09 September 2014

Problem

How many bits are needed to represent a binary tree with \(n\) nodes?

A lower bound is given by the fact that \(m\) bits can represent at most \(2^{m}\) entities. So the problem reduces to one of:

How many binary trees exist with \(n\) internal nodes? The answer is then the log of it.

Solution

Hereafter, when I say node, I mean internal node.

Let’s represent this quantity by \(T_{n}\).

Consider a binary tree with \(k>0\) nodes on the left subtree. The number of trees that exist in such a configuration is given by \(T_{k}T_{n-k-1}\) (the first term is for the left subtree, the second is for the right). Thus, we have:

\begin{equation*} T_{n}=\delta_{n,0}+\sum_{k=0}^{n-1}T_{k}T_{n-k-1} \end{equation*}

Note that the Kronecker delta function takes care of the degenerate case where \(n=0\). This term is critical in our analysis, and it’s easy to forget.

Now comes the key trick. Define \(T(z)=\sum_{n=0}^{\infty}T_{n}z^{n}\). This is the generating function of \(T_{n}\).

Multiply both sides of our equation by \(z^{n}\) and sum it to get:

\begin{equation*} T(z)=1+\sum_{n=0}\sum_{k=0}^{n-1}T_{k}T_{n-k-1}z^{n} \end{equation*}

But also, let’s square our original definition of \(T(z)\):

\begin{equation*} T^{2}(z)=\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T_{n}T_{m}z^{n+m} \end{equation*}

Consider a given \(u=m+n\). Note that the way to form \(z^{u}\) is:

\begin{equation*} T_{0}T_{u}+T_{1}T_{u-1}+\cdots +T_{u-1}T_{1}+T_{u}T_{0}=\sum_{k=0}^{u}T_{k}T_{u-k} \end{equation*}

But this is practically \(T_{u+1}\).

So we get:

\begin{equation*} zT^{2}(z)+1=\sum_{k=0}^{\infty}T_{k}z^{k}=T(z) \end{equation*}

Solving for \(zT(z)\) gives:

\begin{equation*} \frac{1}{2}\left(1-\left(1-4z\right)^{1/2}\right) \end{equation*}

Note: the other solution diverges at \(z=0\).

Make use of the following Binomial expansion:

\begin{equation*} \left(1+x\right)^{1/2}=\sum_{k=0}^{\infty}\dbinom{\frac{1}{2}}{k}x^{k} \end{equation*}
\begin{equation*} zT(z)=-\frac{1}{2}\sum_{n=0}^{\infty}\dbinom{\frac{1}{2}}{n+1}\left(-4\right)^{n+1}z^{n+1} \end{equation*}

Matching coefficients yields

\begin{equation*} T_{n}=-\frac{1}{2}\dbinom{\frac{1}{2}}{n+1}\left(-4\right)^{n+1} \end{equation*}

Expanding it out and doing several manipulations, we get

\begin{equation*} T_{n}=\frac{1}{n+1}\dbinom{2n}{n} \end{equation*}

These are the Catalan Numbers.

Using Stirling’s Approximation:

\begin{equation*} N!\simeq\sqrt{2\pi N}N^{N}e^{-N} \end{equation*}

for large \(n\), we get:

\begin{equation*} T_{n}\simeq\frac{4^{n}}{\sqrt{\pi n^{3}}} \end{equation*}