CS

Analysis of Heapsort (Thesis)

Report ID: TR-370-92

Date: 1992-06-00

Pages: 90

Download Formats: |PDF|

Abstract:

Heapsort is a classical sorting algorithm due to Williams. Given an array to sort, Heapsort first transforms the keys of the array into a heap. The heap is then sorted by repeatedly swapping the root of the heap with the last key in the bottom row, and then sifting this new root down to an appropriate position to restore heap order. In Williams's original Heapsort, the new root is sifted down by repeatedly comparing its two children, and swapping it with its larger child if a comparison shows the child to be larger than the key being sifted. Floyd proposed an important variant of Williams's Heapsort which unconditionally performs the swap with the larger child. This thesis analyzes the asymptotic number of executions of each instruction for both versions of Heapsort in the average, best, and worst cases. In the average case, when sorting a uniformly generated random heap on $N$ distinct keys, Williams's Heapsort performs $sim 2Nlg N$ key comparisons while Floyd's variant performs $sim Nlg N$ key comparisons (lg is the logarithm base two). Another quantity of interest is the number of times keys are swapped with their right children. Both Williams's and Floyd's versions of Heapsort expect to perform $sim {1over 2}Nlg N$ such swaps. Sedgewick, and independently Fleischer and Wegener, have presented arguments to demonstrate that the number of key comparisons required by Williams's Heapsort in the best case and Floyd's Heapsort in the worst case are $sim Nlg N$ and $sim {3over 2}Nlg N$ respectively. These arguments are extended and applied in a different form to demonstrate that in the worst case, Williams's and Floyd's Heapsorts perform $sim {3over 4}Nlg N$ swaps of keys with their right children, while in the best case at most $sim {1over 4}Nlg N$ such swaps are performed. For both versions of Heapsort, it is shown that these best and worst case numbers can be found in heaps that also require the best and worst case numbers of key comparisons.