Simple string matching algorithm: Difference between revisions

From Algowiki
Jump to navigation Jump to search
No edit summary
Line 11: Line 11:


==Abstract view==
==Abstract view==
'''Invariant:''' After <math>i\ge 0</math> iterations:
# In ascending order, <math>R</math> contains exactly the start indexes of all occurrences of <math>T</math> in <math>S</math> that lie completely in the substring <math>(S[1],...,S[i])</math> of <math>S</math>. In other words, <math>R</math> contains the start indexes in the range <math>S[1],...,S[i-m+1]</math>, where <math>m</math> is the length of <math>T</math>.
# In ascending order, <math>I</math> contains exactly the start indexes of all occurrences of <math>T</math> in <math>S</math> that lie partially in the substring <math>(S[1],...,S[i])</math> of <math>S</math>. In other words, <math>I</math> contains the start indexes in the range <math>S[i-m+2],...,S[i]</math>.
'''Variant:''' <math>i</math> increases by <math>1</math>.
'''Break condition:''' <math>i=n</math>.


==Induction basis==
==Induction basis==

Revision as of 09:59, 1 October 2014

Algorithmic problem: One-dimensional string matching

Prerequisites:

Type of algorithm: loop

Auxiliary data A FIFO queue [math]\displaystyle{ I }[/math] of natural numbers, which always contains the start indexes of all occurrences of [math]\displaystyle{ T }[/math] in [math]\displaystyle{ S }[/math] that have been seen only partially thus far.

Abstract view

Invariant: After [math]\displaystyle{ i\ge 0 }[/math] iterations:

  1. In ascending order, [math]\displaystyle{ R }[/math] contains exactly the start indexes of all occurrences of [math]\displaystyle{ T }[/math] in [math]\displaystyle{ S }[/math] that lie completely in the substring [math]\displaystyle{ (S[1],...,S[i]) }[/math] of [math]\displaystyle{ S }[/math]. In other words, [math]\displaystyle{ R }[/math] contains the start indexes in the range [math]\displaystyle{ S[1],...,S[i-m+1] }[/math], where [math]\displaystyle{ m }[/math] is the length of [math]\displaystyle{ T }[/math].
  2. In ascending order, [math]\displaystyle{ I }[/math] contains exactly the start indexes of all occurrences of [math]\displaystyle{ T }[/math] in [math]\displaystyle{ S }[/math] that lie partially in the substring [math]\displaystyle{ (S[1],...,S[i]) }[/math] of [math]\displaystyle{ S }[/math]. In other words, [math]\displaystyle{ I }[/math] contains the start indexes in the range [math]\displaystyle{ S[i-m+2],...,S[i] }[/math].

Variant: [math]\displaystyle{ i }[/math] increases by [math]\displaystyle{ 1 }[/math].

Break condition: [math]\displaystyle{ i=n }[/math].

Induction basis

Induction step

Complexity

Further information