Sunday, March 15, 2009

Quick Sort

Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare that, on average, makes Θ(nlogn) (big O notation) comparisons to sort n items. However, in the worst case, it makes Θ(n2) comparisons. Typically, quicksort is significantly faster in practice than other Θ(nlogn) algorithms, because its inner loop can be efficiently implemented on most architectures, and in most real-world data, it is possible to make design choices which minimize the probability of requiring quadratic time.Quicksort is a comparison sort and, in efficient implementations, is not a stable sort.

Analysis:

From the initial description it's not obvious that quicksort takes Θ(nlogn) time on average. It's not hard to see that the partition operation, which simply loops over the elements of the array once, uses Θ(n) time. In versions that perform concatenation, this operation is also Θ(n).

In the best case, each time we perform a partition we divide the list into two nearly equal pieces. This means each recursive call processes a list of half the size. Consequently, we can make only logn nested calls before we reach a list of size 1. This means that the depth of the call tree is Θ(logn). But no two calls at the same level of the call tree process the same part of the original list; thus, each level of calls needs only Θ(n) time all together (each call has some constant overhead, but since there are only Θ(n) calls at each level, this is subsumed in the Θ(n) factor). The result is that the algorithm uses only Θ(nlogn) time.

An alternate approach is to set up a recurrence relation for the T(n) factor, the time needed to sort a list of size n. Because a single quicksort call involves Θ(n) factor work plus two recursive calls on lists of size n / 2 in the best case, the relation would be:


The master theorem tells us that T(n) = Θ(nlogn).

In fact, it's not necessary to divide the list this precisely; even if each pivot splits the elements with 99% on one side and 1% on the other (or any other fixed fraction), the call depth is still limited to 100logn, so the total running time is still Θ(nlogn).

In the worst case, however, the two sublists have size 1 and n − 1 (for example, if the array consists of the same element by value), and the call tree becomes a linear chain of n nested calls. The ith call does Θ(n − i) work, and . The recurrence relation is:
T(n) = Θ(n) + T(0) + T(n − 1) = O(n) + T(n − 1)

This is the same relation as for insertion sort and selection sort, and it solves to T(n) = Θ(n2). Given knowledge of which comparisons are performed by the sort, there are adaptive algorithms that are effective at generating worst-case input for quicksort on-the-fly, regardless of the pivot selection strategy.[4]

Sample code:

No comments:

Post a Comment