Depth-first search

From Algowiki
Jump to navigation Jump to search


General information

Algorithmic problem: Graph traversal

Type of algorithm: loop

Abstract view

Definitions:

  1. For each node, an arbitrary but fixed ordering of the outgoing arcs is assumed. An arc [math]\displaystyle{ (v,w) }[/math] preceding an arc [math]\displaystyle{ (v,w') }[/math] in this ordering is called lexicographically smaller than [math]\displaystyle{ (v,w) }[/math].
  2. Let [math]\displaystyle{ p }[/math] and [math]\displaystyle{ p' }[/math] be two paths starting from [math]\displaystyle{ s\in V }[/math], and let [math]\displaystyle{ v }[/math] be the last common node such that the subpaths of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ p' }[/math] from [math]\displaystyle{ s }[/math] up to [math]\displaystyle{ v }[/math] (possibly [math]\displaystyle{ s=v }[/math]) are identical. If the next arc of [math]\displaystyle{ p }[/math] is lexicographically smaller than the next arc of [math]\displaystyle{ p' }[/math], [math]\displaystyle{ p }[/math] is said to be lexicograpically smaller than [math]\displaystyle{ p' }[/math].
  3. Note that the lexicographically smallest path from [math]\displaystyle{ v\in V }[/math] to [math]\displaystyle{ w\in V }[/math] is well defined and unique. With respect to a starting node [math]\displaystyle{ s\in V }[/math], a node [math]\displaystyle{ v\in V }[/math] is lexicographically smaller than [math]\displaystyle{ w\in V }[/math] if the lexicographically smallest path from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ v }[/math] is lexicographically smaller than the lexicographically smallest path from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ w }[/math].
  4. In all of the above cases, the reverse relation is called lexicographically larger.
  5. A node [math]\displaystyle{ v\in V }[/math] is lexicographically smaller (resp., lexicograpically larger) than a path [math]\displaystyle{ p }[/math] if the lexicographically smallest path from the start node of [math]\displaystyle{ p }[/math] to [math]\displaystyle{ v }[/math] is lexicographically smaller (resp., larger) than [math]\displaystyle{ p }[/math]. (Note the asymmetry: In both cases, the lexicographically smallest path to [math]\displaystyle{ v }[/math] is used.)

Auxiliary data:

  1. A stack [math]\displaystyle{ S }[/math] whose elements are nodes in [math]\displaystyle{ V }[/math].
  2. Each node has a current arc [math]\displaystyle{ a_v\in V }[/math], which is either void or an outgoing arc [math]\displaystyle{ a_v=(v,w) }[/math] of [math]\displaystyle{ v }[/math].
  3. Each node has two Boolean label with semantics, "is seen" and "is finished".

Invariant: Before and after each iteration:

  1. [math]\displaystyle{ S }[/math] forms a path [math]\displaystyle{ p }[/math] from the start node to some other node, that is, the order of the nodes on [math]\displaystyle{ p }[/math] is the order in which they appear in [math]\displaystyle{ S }[/math] (start node at the bottom of [math]\displaystyle{ S }[/math]).
  2. For each node not yet seen, the current arc is the first arc (or void if the node has no outgoing arcs).
  3. For each node [math]\displaystyle{ v }[/math] on [math]\displaystyle{ p }[/math]:
    1. The subpath of [math]\displaystyle{ p }[/math] from the start node to [math]\displaystyle{ v }[/math] is the lexicographically first path from the start node to [math]\displaystyle{ v }[/math].
    2. If there are arcs [math]\displaystyle{ (v,w)\in A }[/math] such that [math]\displaystyle{ w }[/math] is not seen, the current arc is the first such arc; otherwise, the current arc is void.
  4. The nodes on [math]\displaystyle{ p }[/math] are seen but not finished. Let [math]\displaystyle{ p' }[/math] denote the concatenation of [math]\displaystyle{ p }[/math] with the current arc [math]\displaystyle{ a }[/math] of the last node of [math]\displaystyle{ p }[/math]' The nodes that lexicographically precede [math]\displaystyle{ p' }[/math] are seen and finished, and the nodes that lexicographically succeed [math]\displaystyle{ p' }[/math] are neither seen nor finished. (Note that nothing is said about the head of [math]\displaystyle{ a }[/math]).

Variant: Either one node is finished or the current arc of one node is moved forward.

Break condition: [math]\displaystyle{ S=\emptyset }[/math].

Induction basis

Abstract view: No node is finished. The start node is seen, no other node is seen. The start node is the only element of [math]\displaystyle{ S }[/math]. The current arc of the start node is its first outgoing arc.

Implementation: Obvious.

Proof: Obvious.

Induction step

Abstract view:

  1. If the current arc of the last node [math]\displaystyle{ v }[/math] of [math]\displaystyle{ p }[/math] (=the top element of [math]\displaystyle{ S }[/math]) is void, that node is removed from [math]\displaystyle{ S }[/math] and labeled as finished.
  2. Otherwise, let [math]\displaystyle{ (v,w) }[/math] be the current arc of [math]\displaystyle{ v }[/math]. Then [math]\displaystyle{ w }[/math] is labeled as seen and put on [math]\displaystyle{ S }[/math], and the current arc of [math]\displaystyle{ v }[/math] is moved one arc forward (and becomes void if there is no more arc).

Implementation:

Correctness: The loop variant is obviously fulfilled.

If the top node is removed from [math]\displaystyle{ S }[/math], all points of the invariant are obviously maintained (note that the top node succeeds [math]\displaystyle{ p }[/math] after removal from [math]\displaystyle{ S }[/math]). On the other hand, if some node [math]\displaystyle{ p }[/math] is put on [math]\displaystyle{ S }[/math],the first point is obviously fulfilled as well.

For a contradiction to 2.1, note that any lexicographically smaller path [math]\displaystyle{ p' }[/math] has at least one node that is lexicographically smaller than [math]\displaystyle{ p }[/math]. By induction hypothesis (point 4), these nodes are finished. Let [math]\displaystyle{ w }[/math] denote the last finished node on [math]\displaystyle{ p' }[/math]. Since [math]\displaystyle{ v }[/math] is not finished immediately before the iteration, [math]\displaystyle{ w }[/math] has a successor on [math]\displaystyle{ p' }[/math], which is unfinished. However, obviously, a nodes is marked finished not before all of its immediate successors are finished.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \Theta(|V|+|A|) }[/math] in the best and worst case.

Proof: From every node, the algorithm goes forward once for each of its outgoing arcs. And from each node, th algorithm goes backwards only once. Obviously, each of these steps requires a constant number of operations.


Pseudocode

DFS(G)

for each vertex uV [G]
do color[u] ← WHITE
π[u] ← NIL
time ← 0
do if color[u] == WHITE
then DFS-VISiT(u)


DFS-VISIT(u)

color[u] ← GRAY
timetime + 1
d[u] ← time
for each vAdj[u]
do if color[v] = WHITE
then π [v] ← u
DFS-VISIT(v)
color[u] ← BLACK
f[u] ← timetime + 1