Quicksort: Difference between revisions

Revision as of 16:11, 2 October 2014

Quick Sort

whatever

Openlearnware

General information

Algorithmic problem: Sorting based on pairwise comparison

Type of algorithm: recursion

Abstract view

Invariant: After a recursive call, the input sequence of this recursive call is sorted.

Variant: In each recursive call, the sequence of the callee is strictly shorter than that of the caller.

Break condition: The sequence is empty or a singleton.

Induction basis

Abstract view: Nothing to do on an empty sequence or a singleton.

Implementation: Ditto.

Proof: Empty sequences and singletons are trivially sorted.

Induction step

Abstract view:

Choose a pivot value [math]\displaystyle{ p \in [min\{x|x \in S\},\dots,max\{x|x \in S\}] }[/math] (note that [math]\displaystyle{ p }[/math] is not required to be an element of [math]\displaystyle{ S }[/math].
Partition [math]\displaystyle{ S }[/math] into sequences, [math]\displaystyle{ S_1 }[/math], [math]\displaystyle{ S_2 }[/math] and [math]\displaystyle{ S_3 }[/math], such that [math]\displaystyle{ x \lt p }[/math] for [math]\displaystyle{ x \in S_1 }[/math], [math]\displaystyle{ x = p }[/math] for [math]\displaystyle{ x \in S_2 }[/math], and [math]\displaystyle{ x \gt p }[/math] for [math]\displaystyle{ x \in S_3 }[/math].
Sort [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_3 }[/math] recursively.
The concatenation of all three lists, [math]\displaystyle{ S_1 \| S_2 \| S_3 }[/math], is the result of the algorithm.

Implementation:

Chose [math]\displaystyle{ p \in [min\{x|x \in S\},\dots,max\{x|x \in S\}] }[/math] according to some pivoting rule.
[math]\displaystyle{ S_1 := S_2 := S_3 := \emptyset }[/math].
For [math]\displaystyle{ x \in S }[/math], append [math]\displaystyle{ x }[/math] to
1. [math]\displaystyle{ S_1 }[/math] if [math]\displaystyle{ x \lt p }[/math],
2. [math]\displaystyle{ S_2 }[/math] if [math]\displaystyle{ x = p }[/math],
3. [math]\displaystyle{ S_3 }[/math] if [math]\displaystyle{ x \gt p }[/math].
Call Quicksort on [math]\displaystyle{ S_1 }[/math] giving [math]\displaystyle{ S_1' }[/math]
Call Quicksort on [math]\displaystyle{ S_3 }[/math] giving [math]\displaystyle{ S_3' }[/math]
Return [math]\displaystyle{ S_1' \| S_2' \| S_3' }[/math].

Correctness:

By incuction hypothesis, [math]\displaystyle{ S_1' }[/math] and [math]\displaystyle{ S_3' }[/math] are sorted permutations of [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_3 }[/math], respectively. In particular [math]\displaystyle{ S_1' \| S_2 \| S_3' }[/math] is a permutation of [math]\displaystyle{ S }[/math]. To see that this permutation is sorted, let [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] be two memebers of [math]\displaystyle{ S }[/math] such that [math]\displaystyle{ y }[/math] immediately succeeds [math]\displaystyle{ x }[/math] in the resulting sequence [math]\displaystyle{ S_1' \| S_2 \| S_3' }[/math]. We have to show [math]\displaystyle{ x \leq y }[/math].

If [math]\displaystyle{ x,y \in S_1' }[/math] or [math]\displaystyle{ x,y \in S_3' }[/math], [math]\displaystyle{ x \leq y }[/math] resultes from the induction hypothesis.
On the other hand, if [math]\displaystyle{ x,y \in S_2 }[/math]. It is [math]\displaystyle{ x = y = p }[/math], which trivially implies [math]\displaystyle{ x \leq y }[/math]
Finally, for following cases, [math]\displaystyle{ x \leq y }[/math] is implied by the specific way of partitioning [math]\displaystyle{ S }[/math] into [math]\displaystyle{ S_1' }[/math], [math]\displaystyle{ S_2 }[/math] and [math]\displaystyle{ S_3' }[/math]:
1. [math]\displaystyle{ x \in S_1' }[/math] and [math]\displaystyle{ y \in S_2 }[/math]
2. [math]\displaystyle{ x \in S_2 }[/math] and [math]\displaystyle{ y \in S_3' }[/math]
3. [math]\displaystyle{ x \in S_1' }[/math] and [math]\displaystyle{ y \in S_3' }[/math]

Obviously, this case distinction covers all potential cases, so the claim is proved.

Complexity

Statement:

In the worst case, the complexity is [math]\displaystyle{ \Theta(n^2) }[/math].

If the pivot rule ensures for some [math]\displaystyle{ \alpha \lt 1 }[/math] that the lengths of [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_3 }[/math] are at most [math]\displaystyle{ \alpha }[/math] times the size of [math]\displaystyle{ S }[/math], then it is even [math]\displaystyle{ O(n \log n) }[/math] in the worst case.

If each pivot value is chosen uniformly randomly from members of the respective sequence and if all selections of pivot values are stochastically independent, the average-case complexity is [math]\displaystyle{ O(n \log n) }[/math].

Proof:

First note that the complexity for a single recursive call on [math]\displaystyle{ S }[/math] (disregarding the complexity for the recursive descents on [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_3\gt ) is in \lt mathO(|S|) }[/math]. On each recursive level, all calls are on distinct subsets of [math]\displaystyle{ S }[/math]. Therefore, the number of recursive calls with non-empty sequences on one recursive level is in [math]\displaystyle{ O(n) }[/math]. The number of calls with empty sequences on one level is at most twice the total number of calls with non-empty sequences on the previous level. Hence, the number of calls with empty sequences on one recursive level is in [math]\displaystyle{ O(n) }[/math] as well. In summary, the total complexity on a recursive level is [math]\displaystyle{ O(n) }[/math]. So, for the total complexity, it remains to estimate the number of recursive levels.

Now consider the first statement. The recursion variant implies that the deepest recursive level is [math]\displaystyle{ O(n) }[/math]. This gives the claimed [math]\displaystyle{ O(n^2) }[/math] in the worst case.

Next assume there is a fixed [math]\displaystyle{ \alpha \lt 1 }[/math] such that [math]\displaystyle{ |S_1| \leq \alpha \cdot|S| }[/math] and [math]\displaystyle{ |S_3| \leq \alpha \cdot|S| }[/math] is guaranteed in each recursive call. Then the length of any sequence on recursive level [math]\displaystyle{ \#i }[/math] is at most [math]\displaystyle{ \alpha ^ i \cdot |S| }[/math]. Therefore, the maximal recursive depth is [math]\displaystyle{ \lceil \log_{a-1}(n)\rceil }[/math]. Since [math]\displaystyle{ \alpha^{-1} \gt 1 }[/math], the total complexity is in [math]\displaystyle{ O(n \log n) }[/math] in the worst case.

For the last statement, the average-case analysis, first note that the number of comparisons alone has the same asymptotic complexity as the algorithm as a whole. Next note that any [math]\displaystyle{ x,y \in S }[/math] are compared at most once througout the entire algorithm: if, and only if,[math]\displaystyle{ x }[/math] or [math]\displaystyle{ y }[/math] is chosen as the pivot value for a subsequence to which both elements belong. For [math]\displaystyle{ x,y \in S }[/math], let [math]\displaystyle{ Pr(x,y) }[/math] denote the probability that and are indeed compared. Since comparison events are distinct and [math]\displaystyle{ Pr(x,y \in \{0,1\} }[/math] for all [math]\displaystyle{ x,y \in S }[/math], the expected number of comparisons is

[math]\displaystyle{ \sum_{x,y \in S, x \neq y} Pr(x,y) }[/math]

Let [math]\displaystyle{ n := |S| }[/math]

Further information

@@ Line 63: / Line 63: @@
 == Complexity ==
+=== Statement: ===
+In the worst case, the complexity is <math>\Theta(n^2)</math>.
+If the pivot rule ensures for some <math>\alpha < 1</math> that the lengths of <math>S_1</math> and <math>S_3</math> are at most <math>\alpha</math> times the size of <math>S</math>, then it is even <math>O(n \log n)</math> in the worst case.
+If each pivot value is chosen uniformly randomly from members of the respective sequence and if all selections of pivot values are stochastically independent, the average-case complexity is <math>O(n \log n)</math>.
+=== Proof: ===
+First note that the complexity for a single recursive call on <math>S</math> (disregarding the complexity for the recursive descents on <math>S_1</math> and <math>S_3>) is in <mathO(|S|)</math>. On each recursive level, all calls are on distinct subsets of <math>S</math>. Therefore, the number of recursive calls with non-empty sequences on one recursive level is in <math>O(n)</math>. The number of calls with empty sequences on one level is at most twice the total number of calls with non-empty sequences on the previous level. Hence, the number of calls with empty sequences on one recursive level is in <math>O(n)</math> as well. In summary, the total complexity on a recursive level is <math>O(n)</math>. So, for the total complexity, it remains to estimate the number of recursive levels.
+Now consider the first statement. The recursion variant implies that the deepest recursive level is <math>O(n)</math>. This gives the claimed <math>O(n^2)</math> in the worst case.
+Next assume there is a fixed <math>\alpha < 1</math> such that <math>|S_1| \leq \alpha \cdot|S|</math> and <math>|S_3| \leq \alpha \cdot|S|</math> is guaranteed in each recursive call. Then the length of any sequence on recursive level <math>\#i</math> is at most <math>\alpha ^ i \cdot |S|</math>. Therefore, the maximal recursive depth is <math>\lceil \log_{a-1}(n)\rceil</math>. Since <math>\alpha^{-1} > 1</math>, the total complexity is in <math>O(n \log n)</math> in the worst case.
+For the last statement, the average-case analysis, first note that the number of comparisons alone has the same asymptotic complexity as the algorithm as a whole. Next note that any <math>x,y \in S</math> are compared at most once througout the entire algorithm: if, and only if,<math>x</math> or <math>y</math> is chosen as the pivot value for a subsequence to which both elements belong. For <math>x,y \in S</math>, let <math>Pr(x,y)</math> denote the probability that and are indeed compared. Since comparison events are distinct and <math>Pr(x,y \in \{0,1\}</math> for all <math>x,y \in S</math>, the [[Expected value|expected number]] of comparisons is
+<math>\sum_{x,y \in S, x \neq y} Pr(x,y)</math>
+Let <math>n := |S|</math>
 == Further information ==

Quicksort: Difference between revisions

Revision as of 16:11, 2 October 2014

Contents

General information

Abstract view

Induction basis

Induction step

Abstract view:

Implementation:

Correctness:

Complexity

Statement:

Proof:

Further information

Navigation menu

Quicksort: Difference between revisions

Revision as of 16:11, 2 October 2014

General information

Abstract view

Induction basis

Induction step

Abstract view:

Implementation:

Correctness:

Complexity

Statement:

Proof:

Further information

Navigation menu

Search