Depth-first search: Difference between revisions

From Algowiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 3: Line 3:
[[Category:Tree Algorithms]]
[[Category:Tree Algorithms]]
[[Category:Graph Traversal]]
[[Category:Graph Traversal]]
== Definition ==


Depth-first search
== General information ==
Algorithmic problem: Graph Traversal
Prerequisites:
Type of algorithm: loop
Auxiliary data:
A stack  whose elements are nodes in .
Each node  has a current arc , which is either  or an outgoing arc  of .
Each node has a Boolean label with semantics "is marked".
Abstract view


Invariant: After  iterations:
'''Algorithmic problem:''' [[Graph traversal]]
The nodes marked so far are the  lexicographically first nodes from .
The arcs traversed in forward direction are the arcs of the tree that is the union of the lexicographically first paths from  to all marked nodes.
Stack  contains the nodes on the lexicographically first path of the node marked last, in the order from  to that node.
For each node ,  is the lexicographically first outgoing arc of  that has not yet been considered for going forward from  (or , if no such arc exists).
Variant:  increases by .
Break condition:  .


Induction basis
'''Type of algorithm:''' loop


Abstract view:
== Abstract view ==
has one element: root .
Root  is marked, all other nodes are unmarked.
Implementation: Obvious.
Proof: Nothing to show.


Induction step
'''Definitions:'''
# For each node, an arbitrary but fixed ordering of the outgoing arcs its assumed. An arc <math>(v,w)</math> preceding an arc <math>(v,w')</math> in this ordering is called '''lexicographically smaller''' than <math>(v,w)</math>.
# Let <math>p</math> and <math>p'</math> be two paths starting from <math>s\in V</math>, and let <math>v</math> be the last common node such that the subpaths of <math>p</math> and <math>p'</math> from <math>s</math> up to <math>v</math> (possibly <math>s=v</math>) are identical. If the next arc of <math>p</math> is lexicographically smaller than the next arc of <math>p'</math>, <math>p</math> is said to be '''lexicograpically smaller than <math>p'</math>.
# Note that the lexicographically smallest path from <math>v\in V</math> to <math>w\in V</math> is well defined and unique. With respect to a starting node <math>s\in V</math>, a node <math>v\in V</math> is '''lexicographically smaller''' than <math>w\in V</math> if the lexicographically smallest path from <math>s</math> to <math>v</math> is smaller than any path from <math>s</math> to <math>w</math>.
# In all cases, the reverse relation is called '''lexicographically larger'''.


Abstract view: If the current node has at least one outgoing arc to an unmarked node, traverse the lexicographically first of them to its target node and mark that node. Otherwise, go back along the path in  until the current node does have such an outgoing arc, and traverse the first such arc in the list.
'''Auxiliary data:'''
Implementation:
# A stack <math>S</math> whose elements are nodes in <math>V</math>.
Let  denote the top element of .
# Each node has a '''current arc''' <math>a_v\in V</math>, which is either void or an outgoing arc <math>a_v=(v,w)</math> of <math>v</math>.
While :
# Each node has two Boolean label with semantics, "is seen" and "is finished".
Pop the top element from .
If , terminate the algorithm.
Let  denote the new top element of .
While  and the target node of  is marked, let the new  be the next arc after  in the list of outgoing arcs of  (or , if no such arc exists).
Push the target node of  onto  and mark it.
Correctness: Since  terminates the main loop, the first step and the first execution of Step 2.1 are well defined: at that moment, there indeed is a top element ready to be read and removed from . Due to Step 2.2, all further executions of Step 2.1 and all executions of Steps 2.3 and 2.4 are well defined, too.
Obviously, the loop in Step 2 goes backwards along the path encoded in  until there is an opportunity  to go forward. Due to the prefix property, always contains the lexicographically first path to the current top element during the backtracking. Since  is not marked,  is the first arc from some lexicographically preceding node to . Therefore, when  is pushed onto , the current contents of  comprises the lexicographically first path to .


Complexity
'''Invariant:'''
Before and after each iteration:
# <math>S</math> forms a path <math>p</math> from the start node to some other node, that is, the order of the nodes on <math>p</math> is the order in which they appear in <math>S</math> (start node at the bottom of <math>S</math>).
# For each node <math>v</math> on <math>p</math>:
## The subpath of <math>p</math> from the start node to <math>v</math> is the [[lexicographically]] first path from the start node to <math>v</math>.
## If there are arcs <math>(v,w)\in A</math> such that <math>w</math> is not seen, the current arc is the first such arc; otherwise, the current arc is void.
# The nodes on <math>p</math> are seen but not finished. Let <math>p'</math> denote the concatenation of <math>p</math> with the current arc <math>a</math> of the last node of <math>p</math>' The nodes that lexicographically precede <math>p'</math> are seen and finished, and the nodes that lexicographically succeed <math>p</math> are neither seen nor finished. (Note that nothing is said about the head of <math>a</math>).


Statement: The asymptotic complexity is in  in the worst case, where  and . If all nodes of  are reachable from , it is also  in the best and worst case.
'''Variant:''' Either one node is finished or the current arc of one node is moved forward.
Proof: Note that every node is pushed at most once onto  because a marked (=already pushed) node is skipped in Step 2.4. Therefore, the number of forward steps is in  in the worst case. Clearly, the number of removals from  is in  in the worst case as well. Also note that each arc is considered at most once for going forward because an outgoing arc of  becomes the current arc at most once. No other operation is executed more often than the stack operations and these arc operations. In summary, this gives  in the worst case.
Each node of that is reachable from  will be seen by the search. Therefore, if all nodes are reachable, the number of push (and pop) operations is in  in any case. Since a node is only removed from  after all of its outgoing arcs were seen, all arcs will eventually be seen in that case, so the total number of arc inspections is then in .


Further information
'''Break condition:''' <math>S=\emptyset</math>.
 
== Induction basis ==
 
'''Abstract view:''' No node is finished. The start node is seen, no other node is seen. The start node is the only element of <math>S</math>. The current arc of the start node is its first outgoing arc.
 
'''Implementation:''' Obvious.
 
'''Proof:''' Obvious.
 
== Induction step ==
 
'''Abstract view:'''
# If the current arc of the last node <math>v</math> of <math>p</math> (=the top element of <math>S</math>) is void, that node is removed from <math>S</math> and labeled as finished.
# Otherwise, let <math>(v,w)</math> be the current arc of <math>v</math>. Then <math>w</math> is labeled as seen and put on <math>S</math>, and the current arc of <math>v</math> is moved one arc forward (and becomes void if there is no more arc).
 
'''Implementation:'''
 
'''Correctness:'''
The loop variant is obviously fulfilled.
 
If the top node is removed from <math>S</math>, all points of the invariant are obviously maintained (note that the top node succeeds <math>p</math> after removal from <math>S</math>). On the other hand, if some node <math>p</math> is put on <math>S</math>,the first point is obviously fulfilled as well.
 
For a contradiction to 2.1, note that any lexicographically smaller path <math>p'</math> has at least one node that is lexicographically smaller than <math>p</math>. By induction hypothesis (point 4), these nodes are finished. Let <math>w</math> denote the last finished node on <math>p'</math>. Since <math>v</math> is not finished immediately before the iteration, <math>w</math> has a successor on <math>p'</math>, which is unfinished. However, obviously, a nodes is marked finished not before all of its immediate successors are finished.
 
== Complexity ==
 
'''Statement:''' The asymptotic complexity is in <math>\Theta(|V|+|A|)</math> in the best and worst case.
 
'''Proof:''' From every node, the algorithm goes forward once for each of its outgoing arcs. And from each node, th algorithm goes backwards only once. Obviously, each of these steps requires a constant number of operations.


Depth-first search could alternatively be implemented as a more fine-grained loop in which, in each iteration, exactly one forward or backward step over an arc is performed.


== Pseudocode ==  
== Pseudocode ==  

Revision as of 10:57, 8 October 2014


General information

Algorithmic problem: Graph traversal

Type of algorithm: loop

Abstract view

Definitions:

  1. For each node, an arbitrary but fixed ordering of the outgoing arcs its assumed. An arc [math]\displaystyle{ (v,w) }[/math] preceding an arc [math]\displaystyle{ (v,w') }[/math] in this ordering is called lexicographically smaller than [math]\displaystyle{ (v,w) }[/math].
  2. Let [math]\displaystyle{ p }[/math] and [math]\displaystyle{ p' }[/math] be two paths starting from [math]\displaystyle{ s\in V }[/math], and let [math]\displaystyle{ v }[/math] be the last common node such that the subpaths of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ p' }[/math] from [math]\displaystyle{ s }[/math] up to [math]\displaystyle{ v }[/math] (possibly [math]\displaystyle{ s=v }[/math]) are identical. If the next arc of [math]\displaystyle{ p }[/math] is lexicographically smaller than the next arc of [math]\displaystyle{ p' }[/math], [math]\displaystyle{ p }[/math] is said to be lexicograpically smaller than [math]\displaystyle{ p' }[/math].
  3. Note that the lexicographically smallest path from [math]\displaystyle{ v\in V }[/math] to [math]\displaystyle{ w\in V }[/math] is well defined and unique. With respect to a starting node [math]\displaystyle{ s\in V }[/math], a node [math]\displaystyle{ v\in V }[/math] is lexicographically smaller than [math]\displaystyle{ w\in V }[/math] if the lexicographically smallest path from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ v }[/math] is smaller than any path from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ w }[/math].
  4. In all cases, the reverse relation is called lexicographically larger.

Auxiliary data:

  1. A stack [math]\displaystyle{ S }[/math] whose elements are nodes in [math]\displaystyle{ V }[/math].
  2. Each node has a current arc [math]\displaystyle{ a_v\in V }[/math], which is either void or an outgoing arc [math]\displaystyle{ a_v=(v,w) }[/math] of [math]\displaystyle{ v }[/math].
  3. Each node has two Boolean label with semantics, "is seen" and "is finished".

Invariant: Before and after each iteration:

  1. [math]\displaystyle{ S }[/math] forms a path [math]\displaystyle{ p }[/math] from the start node to some other node, that is, the order of the nodes on [math]\displaystyle{ p }[/math] is the order in which they appear in [math]\displaystyle{ S }[/math] (start node at the bottom of [math]\displaystyle{ S }[/math]).
  2. For each node [math]\displaystyle{ v }[/math] on [math]\displaystyle{ p }[/math]:
    1. The subpath of [math]\displaystyle{ p }[/math] from the start node to [math]\displaystyle{ v }[/math] is the lexicographically first path from the start node to [math]\displaystyle{ v }[/math].
    2. If there are arcs [math]\displaystyle{ (v,w)\in A }[/math] such that [math]\displaystyle{ w }[/math] is not seen, the current arc is the first such arc; otherwise, the current arc is void.
  3. The nodes on [math]\displaystyle{ p }[/math] are seen but not finished. Let [math]\displaystyle{ p' }[/math] denote the concatenation of [math]\displaystyle{ p }[/math] with the current arc [math]\displaystyle{ a }[/math] of the last node of [math]\displaystyle{ p }[/math]' The nodes that lexicographically precede [math]\displaystyle{ p' }[/math] are seen and finished, and the nodes that lexicographically succeed [math]\displaystyle{ p }[/math] are neither seen nor finished. (Note that nothing is said about the head of [math]\displaystyle{ a }[/math]).

Variant: Either one node is finished or the current arc of one node is moved forward.

Break condition: [math]\displaystyle{ S=\emptyset }[/math].

Induction basis

Abstract view: No node is finished. The start node is seen, no other node is seen. The start node is the only element of [math]\displaystyle{ S }[/math]. The current arc of the start node is its first outgoing arc.

Implementation: Obvious.

Proof: Obvious.

Induction step

Abstract view:

  1. If the current arc of the last node [math]\displaystyle{ v }[/math] of [math]\displaystyle{ p }[/math] (=the top element of [math]\displaystyle{ S }[/math]) is void, that node is removed from [math]\displaystyle{ S }[/math] and labeled as finished.
  2. Otherwise, let [math]\displaystyle{ (v,w) }[/math] be the current arc of [math]\displaystyle{ v }[/math]. Then [math]\displaystyle{ w }[/math] is labeled as seen and put on [math]\displaystyle{ S }[/math], and the current arc of [math]\displaystyle{ v }[/math] is moved one arc forward (and becomes void if there is no more arc).

Implementation:

Correctness: The loop variant is obviously fulfilled.

If the top node is removed from [math]\displaystyle{ S }[/math], all points of the invariant are obviously maintained (note that the top node succeeds [math]\displaystyle{ p }[/math] after removal from [math]\displaystyle{ S }[/math]). On the other hand, if some node [math]\displaystyle{ p }[/math] is put on [math]\displaystyle{ S }[/math],the first point is obviously fulfilled as well.

For a contradiction to 2.1, note that any lexicographically smaller path [math]\displaystyle{ p' }[/math] has at least one node that is lexicographically smaller than [math]\displaystyle{ p }[/math]. By induction hypothesis (point 4), these nodes are finished. Let [math]\displaystyle{ w }[/math] denote the last finished node on [math]\displaystyle{ p' }[/math]. Since [math]\displaystyle{ v }[/math] is not finished immediately before the iteration, [math]\displaystyle{ w }[/math] has a successor on [math]\displaystyle{ p' }[/math], which is unfinished. However, obviously, a nodes is marked finished not before all of its immediate successors are finished.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \Theta(|V|+|A|) }[/math] in the best and worst case.

Proof: From every node, the algorithm goes forward once for each of its outgoing arcs. And from each node, th algorithm goes backwards only once. Obviously, each of these steps requires a constant number of operations.


Pseudocode

DFS(G)

for each vertex uV [G]
do color[u] ← WHITE
π[u] ← NIL
time ← 0
do if color[u] == WHITE
then DFS-VISiT(u)


DFS-VISIT(u)

color[u] ← GRAY
timetime + 1
d[u] ← time
for each vAdj[u]
do if color[v] = WHITE
then π [v] ← u
DFS-VISIT(v)
color[u] ← BLACK
f[u] ← timetime + 1