Bucketsort: Difference between revisions
No edit summary |
No edit summary |
||
Line 1: | Line 1: | ||
[[Category:Sorting Algorithms]] | [[Category:Sorting Algorithms]] | ||
{{#ev:youtube|https://www.youtube.com/watch?v= | {{#ev:youtube|https://www.youtube.com/watch?v=-POIDU_ew98|500|right|Bucketsort|frame}} | ||
== General information == | == General information == | ||
Revision as of 17:17, 17 June 2015
General information
Algorithmic problem: Sorting Sequences of Strings
Type of algorithm: loop
Auxiliary data:
- An ordered sequence
of strings, which will eventually hold the overall result of the algorithm. - The buckets, that is, an array
whose index range includes the ID range of (e.g. 0...127 suffices for all alphabets covered by ASCII) and whose components are ordered sequences of strings. - Let
denote the maximum length of an input string. - An array
with index range and multisets of strings as components.
Abstract view
Invariant: After
- For
, contains all input strings of length . contains all other input strings, that is, the ones with length at least . The sequence is sorted according to the following definition of comparison: (resp. ) means that the substring of starting at position is lexicographically smaller (resp., smaller or equal) than the substring of starting at position .- All buckets are empty.
Variant:
Break condition:
Induction basis
Abstract view: The auxiliary data must be initialized.
Implementation:
.- For each
, set . - For each
, set . - Move each string
in to where denotes the length of .
Proof: Obvious.
Induction step
Abstract view:
- Move each string
in to . - Afterwards: Append each string
in at the tail of . It is essential that the strings are considered in the order in which they are stored in . (Now is empty.) - For each
in ascending order: append at the tail of and make empty.
Implementation: Obvious.
Correctness: We have to show that the relative order of each pair of strings,
- If
, and are placed in different buckets in Step 3 of the -th iteration, so they are correctly ordered according to the character at position . - If
and both strings have length , their relative order is irrelevant at this stage, so nothing is to show. - Next, consider the case that
, one string (say, ) has length and the other one ( ) has a length strictly greater than . Since the strings of length are placed in their buckets prior to the longer strings (cf. Steps 1 and 2), appears after , which is correct. - Finally, consider the case that
, and the lengths of both strings are larger than . Then the induction hypothesis guarantees the correct relative order due to the specific order in which the strings are considered in Step 2.
Complexity
Statement: Let
Proof: Obviously, the preprocessing takes