Why we are happy with Θ(n log(n)) sorting.

Sorting is a very important problem in computer science. A well-known result is that good sorting algorithms run with a time complexity of $\Theta(n \log(n))$, provided the sorting is based on pair-wise comparison of objects. The neat thing is that we can prove that any sorting algorithm based on pair-wise comparisons needs $\Omega(n \log(n))$ time in the worst case.

When we say “based only on pair-wise comparison of objects” we mean that we do not use the values of the objects directly. We only use a comparator, a function of the form $f_{<}(a, b) = \begin{cases} +1 & \text{if } a \geq b \\ -1 & \text{else} \end{cases}$. The same conclusions hold if the comparator gives a special value if $a = b$.
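To make the comparator idea concrete, here is a minimal Python sketch. The names `comparator`, `CountingComparator` and `sort_with_comparator` are made up for this illustration; the sort itself is delegated to Python's built-in `sorted` through `functools.cmp_to_key`, so the sorting code never looks at the values except through the comparator.

```python
from functools import cmp_to_key


def comparator(a, b):
    """Two-valued comparator from the text: +1 if a >= b, otherwise -1."""
    return 1 if a >= b else -1


class CountingComparator:
    """Hypothetical helper: wraps a comparator and counts its calls."""

    def __init__(self, cmp):
        self.cmp = cmp
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.cmp(a, b)


def sort_with_comparator(items, cmp):
    """Sort items using only the outcomes of cmp(a, b)."""
    return sorted(items, key=cmp_to_key(cmp))


if __name__ == "__main__":
    counter = CountingComparator(comparator)
    print(sort_with_comparator([3, 1, 2, 5, 4], counter))
    print("comparisons used:", counter.calls)
```

The counter is what the rest of the argument reasons about: the number of times the comparator is called, not the total running time.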

Proof

Consider an arbitrary list $a = [a_{1}, a_{2}, \dots, a_{n}]$ in which every $a_{i}$ is comparable to every other $a_{j}$. We know that there are $n!$ possible permutations of the list.

A program that sorts arbitrary lists must be able to apply any of the $n!$ permutations of the list. If there were a permutation it could not apply, one could easily construct a list that the algorithm can’t sort: namely, the result of applying the inverse of that permutation to $[1, 2, \dots, n]$.
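The inverse-permutation construction can be checked on small cases. The sketch below assumes a permutation is represented as a list of indices and that "applying" it means `result[i] = items[perm[i]]`; all function names are illustrative and not taken from any library.

```python
from itertools import permutations


def apply_permutation(perm, items):
    """Rearrange items so that position i receives items[perm[i]]."""
    return [items[i] for i in perm]


def inverse_permutation(perm):
    """Return inv such that inv[perm[i]] == i for every position i."""
    inv = [0] * len(perm)
    for position, index in enumerate(perm):
        inv[index] = position
    return inv


def input_requiring(perm):
    """Build the list that is sorted by exactly this permutation."""
    n = len(perm)
    return apply_permutation(inverse_permutation(perm), list(range(1, n + 1)))


def sorting_permutation(items):
    """Indices that sort items (unique when all items are distinct)."""
    return sorted(range(len(items)), key=lambda i: items[i])


if __name__ == "__main__":
    for perm in permutations(range(3)):
        perm = list(perm)
        unsorted = input_requiring(perm)
        print(perm, unsorted, apply_permutation(perm, unsorted))
```

Every one of the $n!$ permutations is the unique sorting permutation of some input, so an algorithm that can never produce one of them fails on that input.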

Note that the permutation the algorithm applies can depend only on the values in the list, since this is the only input. Our program can only gather information from the list by doing pair-wise comparisons. Let’s call the number of comparisons that the algorithm does $n_{c}$. Each comparison has two possible results, so $n_{c}$ comparisons can have at most $2^{n_{c}}$ different outcomes. There are $n!$ possible permutations that the algorithm must be able to apply, and each one needs its own outcome. Thus we need:

$$2^{n_{c}} \geq n!$$

or, by taking the $\log_{2}$ of both sides (note that both sides are positive integers and $\log_{2}$ is increasing):

$$n_{c} \geq \log_{2}(n!)$$
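As a sanity check of this bound, the following sketch counts the comparisons made by a plain top-down merge sort on every permutation of a small list and compares the worst case with $\lceil \log_{2}(n!) \rceil$. Merge sort is only one example of a comparison sort, chosen here for illustration; the helper names are invented for this sketch.

```python
import math
from itertools import permutations


def merge_sort_count(items):
    """Return (sorted_list, number_of_comparisons) for a top-down merge sort."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_count = merge_sort_count(items[:mid])
    right, right_count = merge_sort_count(items[mid:])
    merged = []
    count = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        count += 1  # exactly one pair-wise comparison per loop iteration
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count


def worst_case_comparisons(n):
    """Maximum comparison count over all n! orderings of n distinct values."""
    return max(merge_sort_count(list(p))[1] for p in permutations(range(n)))


def lower_bound(n):
    """Smallest integer n_c that satisfies 2**n_c >= n!."""
    return math.ceil(math.log2(math.factorial(n)))


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, worst_case_comparisons(n), lower_bound(n))
```

For every $n$ tried, the worst-case count is at least the bound, as the counting argument requires.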

Let’s look at $\log_{2}(n!)$ a little closer:

$$n_{c} \geq \log_{2}(n!) = \log_{2}(n) + \log_{2}(n - 1) + \dots + \log_{2} \lceil n/2 \rceil + \dots + \log_{2}(1)$$

We obtain a lower bound by cutting off the terms after $\log_{2} \lceil n/2 \rceil$, all of which are non-negative:

$$n_{c} \geq \log_{2}(n!) \geq \log_{2}(n) + \log_{2}(n - 1) + \dots + \log_{2} \lceil n/2 \rceil$$

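The remaining sum has at least $\lceil n/2 \rceil$ terms, each of which is at least $\log_{2} \lceil n/2 \rceil$. Thus we have: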

$$n_{c} \geq \lceil n/2 \rceil \log_{2} \lceil n/2 \rceil$$

Or $n_{c} = \Omega(n \log_{2}(n))$.
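The chain of inequalities can also be checked numerically. The sketch below computes $\log_{2}(n!)$ as a sum of logarithms and compares it with the lower bound $\lceil n/2 \rceil \log_{2} \lceil n/2 \rceil$ from the derivation. It also compares against $n \log_{2}(n)$, an upper bound that follows from $n! \leq n^{n}$ and is not part of the proof above. Function names are illustrative.

```python
import math


def log2_factorial(n):
    """log2(n!) written as log2(1) + log2(2) + ... + log2(n)."""
    return sum(math.log2(k) for k in range(1, n + 1))


def half_bound(n):
    """Lower bound from the derivation: ceil(n/2) * log2(ceil(n/2))."""
    half = (n + 1) // 2  # integer ceiling of n / 2
    return half * math.log2(half)


def upper_bound(n):
    """Upper bound n * log2(n), which follows from n! <= n**n."""
    return n * math.log2(n)


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, half_bound(n), log2_factorial(n), upper_bound(n))
```

Since $\log_{2}(n!)$ is sandwiched between two quantities that both grow like $n \log(n)$, the bound on $n_{c}$ cannot be improved by more than a constant factor.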