Sorting Complexity Analysis

Posted by Beetle B. on Sun 10 August 2014

Any comparison-based sorting algorithm must use at least \(\lg(N!)\sim N\lg N\) comparisons in the worst case. This assumes there is no further information to exploit (e.g. whether the values being sorted come from a known set). Thus, for a generic comparison sort, one cannot beat this bound.
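To see that \(\lg(N!)\) really does approach \(N\lg N\), one can check numerically. Here is a minimal Python sketch (not part of the original argument; it uses `math.lgamma` to compute \(\lg(N!)\) without forming huge integers):

```python
import math

# Stirling gives lg(N!) = N lg N - N lg e + O(lg N),
# so the ratio lg(N!) / (N lg N) should creep toward 1.
ratios = []
for n in (10, 100, 1000, 10**6):
    lg_fact = math.lgamma(n + 1) / math.log(2)  # lg(N!) via the log-gamma function
    ratios.append(lg_fact / (n * math.log2(n)))
    print(n, round(ratios[-1], 4))
```

The ratio rises steadily with \(N\), consistent with the two quantities being asymptotically equivalent.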

Suppose you’re sorting \(N\) distinct numbers. Any general algorithm must compare the first element with another one and make a decision, then compare the “result” with another element, and so on. Form a tree where each node is the current comparison and its children are the possible next comparisons. Each leaf node is the ordering the algorithm reports once it is done.

The tree for the algorithm must have at least \(N!\) leaf nodes, one for each possible ordering of the input. If it had fewer, the algorithm could not reach a certain ordering of the list (put another way, there would exist some list it cannot sort).
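A quick way to make this concrete: for \(N=3\) there are \(3!=6\) orderings, but a tree of height 2 has at most \(2^{2}=4\) leaves, so at least 3 comparisons are needed in the worst case. A small Python sketch tabulates this minimum height for a few values of \(N\):

```python
import math

# Minimum decision-tree height: smallest h with 2^h >= N! leaves,
# i.e. h = ceil(lg(N!)).
for n in range(2, 8):
    leaves = math.factorial(n)           # N! orderings the tree must distinguish
    min_height = math.ceil(math.log2(leaves))
    print(n, leaves, min_height)
```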

The height \(h\) of the tree represents the worst case in the number of compares. A binary tree of that height can have at most \(2^{h}\) leaves.

We must have \(2^{h}\ge N!\), so \(h\ge\lg(N!)\). By Stirling’s approximation, \(\lg(N!)=N\lg N-N\lg e+O(\lg N)\sim N\lg N\), and the bound follows.
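The bound can even be tested empirically against a real sort. The sketch below (my own instrumentation, not from the original post) counts the comparisons Python’s built-in Timsort makes on every permutation of 5 elements; the worst case must be at least \(\lceil\lg(5!)\rceil=7\):

```python
import itertools
import math

class CountingKey:
    """Wraps a value and increments a shared counter on every comparison."""
    count = 0

    def __init__(self, v):
        self.v = v

    def __lt__(self, other):  # sorted() only needs __lt__
        CountingKey.count += 1
        return self.v < other.v

N = 5
worst = 0
for perm in itertools.permutations(range(N)):
    CountingKey.count = 0
    sorted(perm, key=CountingKey)
    worst = max(worst, CountingKey.count)

lower = math.ceil(math.log2(math.factorial(N)))  # information-theoretic floor
print(worst, lower)
```

Note the lower bound is a worst-case statement: on an already-sorted input Timsort uses only \(N-1\) comparisons, which is why the maximum is taken over all permutations.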