Simple string matching algorithm
Algorithmic problem: One-dimensional string matching
Prerequisites:
Type of algorithm: loop
Auxiliary data A FIFO queue [math]\displaystyle{ I }[/math] of natural numbers, which always contains the start indexes of all occurrences of [math]\displaystyle{ T }[/math] in [math]\displaystyle{ S }[/math] that have been seen only partially thus far.
Abstract view
Invariant: After [math]\displaystyle{ i\ge 0 }[/math] iterations:
- In ascending order, [math]\displaystyle{ R }[/math] contains exactly the start indexes of all occurrences of [math]\displaystyle{ T }[/math] in [math]\displaystyle{ S }[/math] that lie completely in the substring [math]\displaystyle{ (S[1],...,S[i]) }[/math] of [math]\displaystyle{ S }[/math]. In other words, [math]\displaystyle{ R }[/math] contains the start indexes in the range [math]\displaystyle{ S[1],...,S[i-m+1] }[/math], where [math]\displaystyle{ m }[/math] is the length of [math]\displaystyle{ T }[/math].
- In ascending order, [math]\displaystyle{ I }[/math] contains exactly the start indexes of all occurrences of [math]\displaystyle{ T }[/math] in [math]\displaystyle{ S }[/math] that lie partially in the substring [math]\displaystyle{ (S[1],...,S[i]) }[/math] of [math]\displaystyle{ S }[/math]. In other words, [math]\displaystyle{ I }[/math] contains the start indexes in the range [math]\displaystyle{ S[i-m+2],...,S[i] }[/math].
Variant: [math]\displaystyle{ i }[/math] increases by [math]\displaystyle{ 1 }[/math].
Break condition: [math]\displaystyle{ i=n }[/math].
Induction basis
Abstract view: [math]\displaystyle{ R }[/math] and [math]\displaystyle{ I }[/math] have to be empty.
Implementation:
- [math]\displaystyle{ R:=\emptyset }[/math]
- [math]\displaystyle{ I:=\emptyset }[/math]
Proof: Nothing to show.