Breadth-first search: Difference between revisions

From Algowiki
Jump to navigation Jump to search
m (Luedecke moved page Breadth-First Search to Breadth-first search)
No edit summary
Line 1: Line 1:
[[Category:Algorithms]]
[[Category:Search Algorithms]]
[[Category:Tree Algorithms]]
[[Category:Graph Traversal]]
== General information ==
'''Algorithmic problem:''' [[Graph traversal]]
'''Type of algorithm:''' loop
== Abstract view ==
'''Definition:‘‘‘
On this page, the '''distance''' of a node <math>v\in V</math> is the minimal number of arcs on a path from the start node <math>s</math> to <math>v</math>.
'''Specific characteristic:'''
The nodes are finished in the order of increasing distance.
'''Auxiliary data:'''
A queue <math>Q</math> whose elements are nodes in <math>V</math>.
'''Invariant:'''
Before and after each iteration:
# There is a '''current distance''' <math>k\in\mathbb{N}</math>.
# Let <math>n</math> denote the current size of <math>Q</math>. There is <math>\ell\in\{1,\ldots,n\}</math> such that the first <math>\ell</math> elements of <math>Q</math> have distance <math>k</math> and the last <math>n-\ell</math> elements have distance <math>k+1</math>.
'''Variant:''' One node is removed from <math>Q</math>.
'''Break condition:''' <math>S=\emptyset</math>.
== Induction basis ==
'''Abstract view:''' No node is finished. The start node is seen, no other node is seen. The start node is the only element of <math>S</math>. The current arc of the start node is its first outgoing arc. The arborescence <math>T</math> is initialized so as to contain all nodes and no arcs.
'''Implementation:''' Obvious.
'''Proof:''' Obvious.
== Induction step ==
'''Abstract view:'''
# Let <math>v</math> be the last node of <math>p</math> (=the top element of <math>S</math>).
# While the current arc of <math>v</math> is not void and the head of the current arc is labeled as seen, move the current arc one step forward.
# If the current arc of <math>v</math> is not void, add the current arc to <math>T</math>, push the head of the current arc on <math>S</math> and label this node as seen.
# Otherwise, remove <math>v</math> from <math>S</math> and label <math>v</math> as finished.
'''Implementation:''' Obvious.
'''Proof:'''
The loop variant is obviously fulfilled.
The first point of the invariant is obviously fulfilled. The second point follows from the fact that the current arc of a node is initialized accordingly and only changed after the node is labele as seen. Point 2.1 follows from the fact that the current arc never skips an arc to an unseen node.
For a contradiction to 2.2, note that any lexicographically smaller path <math>p'</math> has at least one node that is lexicographically smaller than <math>p</math>. By induction hypothesis (point 4), these nodes are finished. Let <math>w</math> denote the last finished node on <math>p'</math>. Since <math>v</math> is not finished immediately before the iteration, <math>w</math> has a successor on <math>p'</math>, which is unfinished. However, obviously, a nodes is marked finished not before all of its immediate successors are finished.
When a node is pushed on <math>S</math>, it is neither seen nor finished immediately before that iteration and labaled as seen in that iteration. This proves the first sentence. The other statements of point 4 follow from the observation that <math>p</math> increases lexicographically in each iteration.
== Correctness ==
It is easy to see that each operation of the algorithm is well defined. Due to the variant, the loop terminates after a finite number of steps. Immediately before the last iteration, <math>p</math> consists of the start node <math>s</math> only, and the current arc of <math>s</math> is void. Therefore, ''all'' nodes reachable from <math>s</math> except for <math>s</math> itself are lexicographically smaller than <math>p</math>. Due to point 4 of the invariant, all of these nodes are finished. In the last iteration, <math>s</math> is finished as well.
Suppose for a contradictin that the specific characteristic is not fulfilled by <math>v,w\in V</math>. That is, <math>v</math> is seen and finished before <math>w</math>. Consider the situation immediately after the iteration in which <math>v</math> is finished. So, in this moment, <math>w</math> is not yet finished. By assumption, there is a path from <math>v</math> to <math>w</math> in <math>G</math>. There are two successive nodes on this path, <math>x</math> and <math>y</math>, such that <math>x</math> is finished and <math>y</math> is not at that moment. However, obviously, <math>x</math> cannot be finished prior to <math>y</math>.
== Complexity ==
'''Statement:''' The asymptotic complexity is in <math>\Theta(|V|+|A|)</math> in the best and worst case.
'''Proof:''' From every node, the algorithm goes forward at most once for each of its outgoing arcs. And from each node, th algorithm goes backwards only once. Obviously, each of these steps requires a constant number of operations.
== Pseudocode ==  
== Pseudocode ==  



Revision as of 18:24, 8 October 2014


General information

Algorithmic problem: Graph traversal

Type of algorithm: loop

Abstract view

Definition:‘‘‘ On this page, the distance of a node [math]\displaystyle{ v\in V }[/math] is the minimal number of arcs on a path from the start node [math]\displaystyle{ s }[/math] to [math]\displaystyle{ v }[/math].

Specific characteristic: The nodes are finished in the order of increasing distance.

Auxiliary data: A queue [math]\displaystyle{ Q }[/math] whose elements are nodes in [math]\displaystyle{ V }[/math].

Invariant: Before and after each iteration:

  1. There is a current distance [math]\displaystyle{ k\in\mathbb{N} }[/math].
  2. Let [math]\displaystyle{ n }[/math] denote the current size of [math]\displaystyle{ Q }[/math]. There is [math]\displaystyle{ \ell\in\{1,\ldots,n\} }[/math] such that the first [math]\displaystyle{ \ell }[/math] elements of [math]\displaystyle{ Q }[/math] have distance [math]\displaystyle{ k }[/math] and the last [math]\displaystyle{ n-\ell }[/math] elements have distance [math]\displaystyle{ k+1 }[/math].

Variant: One node is removed from [math]\displaystyle{ Q }[/math].

Break condition: [math]\displaystyle{ S=\emptyset }[/math].

Induction basis

Abstract view: No node is finished. The start node is seen, no other node is seen. The start node is the only element of [math]\displaystyle{ S }[/math]. The current arc of the start node is its first outgoing arc. The arborescence [math]\displaystyle{ T }[/math] is initialized so as to contain all nodes and no arcs.

Implementation: Obvious.

Proof: Obvious.

Induction step

Abstract view:

  1. Let [math]\displaystyle{ v }[/math] be the last node of [math]\displaystyle{ p }[/math] (=the top element of [math]\displaystyle{ S }[/math]).
  2. While the current arc of [math]\displaystyle{ v }[/math] is not void and the head of the current arc is labeled as seen, move the current arc one step forward.
  3. If the current arc of [math]\displaystyle{ v }[/math] is not void, add the current arc to [math]\displaystyle{ T }[/math], push the head of the current arc on [math]\displaystyle{ S }[/math] and label this node as seen.
  4. Otherwise, remove [math]\displaystyle{ v }[/math] from [math]\displaystyle{ S }[/math] and label [math]\displaystyle{ v }[/math] as finished.

Implementation: Obvious.

Proof: The loop variant is obviously fulfilled.

The first point of the invariant is obviously fulfilled. The second point follows from the fact that the current arc of a node is initialized accordingly and only changed after the node is labele as seen. Point 2.1 follows from the fact that the current arc never skips an arc to an unseen node.

For a contradiction to 2.2, note that any lexicographically smaller path [math]\displaystyle{ p' }[/math] has at least one node that is lexicographically smaller than [math]\displaystyle{ p }[/math]. By induction hypothesis (point 4), these nodes are finished. Let [math]\displaystyle{ w }[/math] denote the last finished node on [math]\displaystyle{ p' }[/math]. Since [math]\displaystyle{ v }[/math] is not finished immediately before the iteration, [math]\displaystyle{ w }[/math] has a successor on [math]\displaystyle{ p' }[/math], which is unfinished. However, obviously, a nodes is marked finished not before all of its immediate successors are finished.

When a node is pushed on [math]\displaystyle{ S }[/math], it is neither seen nor finished immediately before that iteration and labaled as seen in that iteration. This proves the first sentence. The other statements of point 4 follow from the observation that [math]\displaystyle{ p }[/math] increases lexicographically in each iteration.

Correctness

It is easy to see that each operation of the algorithm is well defined. Due to the variant, the loop terminates after a finite number of steps. Immediately before the last iteration, [math]\displaystyle{ p }[/math] consists of the start node [math]\displaystyle{ s }[/math] only, and the current arc of [math]\displaystyle{ s }[/math] is void. Therefore, all nodes reachable from [math]\displaystyle{ s }[/math] except for [math]\displaystyle{ s }[/math] itself are lexicographically smaller than [math]\displaystyle{ p }[/math]. Due to point 4 of the invariant, all of these nodes are finished. In the last iteration, [math]\displaystyle{ s }[/math] is finished as well.

Suppose for a contradictin that the specific characteristic is not fulfilled by [math]\displaystyle{ v,w\in V }[/math]. That is, [math]\displaystyle{ v }[/math] is seen and finished before [math]\displaystyle{ w }[/math]. Consider the situation immediately after the iteration in which [math]\displaystyle{ v }[/math] is finished. So, in this moment, [math]\displaystyle{ w }[/math] is not yet finished. By assumption, there is a path from [math]\displaystyle{ v }[/math] to [math]\displaystyle{ w }[/math] in [math]\displaystyle{ G }[/math]. There are two successive nodes on this path, [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], such that [math]\displaystyle{ x }[/math] is finished and [math]\displaystyle{ y }[/math] is not at that moment. However, obviously, [math]\displaystyle{ x }[/math] cannot be finished prior to [math]\displaystyle{ y }[/math].

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \Theta(|V|+|A|) }[/math] in the best and worst case.

Proof: From every node, the algorithm goes forward at most once for each of its outgoing arcs. And from each node, th algorithm goes backwards only once. Obviously, each of these steps requires a constant number of operations.


Pseudocode

BFS(G,s)
 1  for each vertex uG.V - {s}
 2       u.color = WHITE
 3       u.d = ∞
 4       u.π = NIL
 5  s.color = GRAY
 6  s.d = 0
 7  s.π = NIL
 8  Q = Ø
 9  ENQUE(Q, s)
10  while Q ≠ Ø
11      u = DEQUEUE(Q)
12      for each vG.Adj[u]
13              if v.color == WHITE
14                       v.color = GRAY
15                       v.d = u.d + 1
16                       v.π = u
17                       ENQUEUE(Q, v)
18      u.color = BLACK