Dijkstra: Difference between revisions

From Algowiki
Jump to navigation Jump to search
Line 121: Line 121:


== Further infromation ==
== Further infromation ==
# We do not need to insert all nodes except for <math>s</math> in <math>Q</math>. In fact, it suffices to initially insert the endnodes of the outgoing arcs of <math>s</math> and, during an iteration, to insert the nodes of the outgoing arcs of the node that is being extracted from <math>Q</math>. In a sense, the nodes of <math>Q</math> then constitute the "frontierline" of the algorithm. We chose another variant on Dijkstra's algorithm mainly for simplicity of presentation.
# Clearly, Dijkstra's algorithm also solves the single-source single-target case. In this case, Dijkstra may safely terminate once has been extracted from <math>Q</math>. In fact, the loop invariant implies <math>\delta(t) = \Delta(t)</math> from then on. This variant on Dijkstra's algorithm is sometimes called the '''early termination variant'''.
# In a scenario where we perform a large number of searches on the same graph and arc lengths, we do not need to touch the whole graph in each search:
## We change the algorithm as suggested by the first point: only keep the "frontierline" in <math>Q</math>.
## We also change the algorithm as suggested by the second point: terminate once <math>t</math> is processed.
## To avoid touching all nodes for initializing the <math>\delta</math>-values, we maintain sort of a '''version counter''', or '''time stamp''', for each node, which holds the ID of the last search that met this node. Whenever a node is met in a search and the node's version counter does not equal the ID of the current search, the distance value of this node is regarded as <math>+ \infty</math> (no matter what its current value is).
# The early termination variant may be modified as follows: for and [[Paths|admissible distance function]] <math>h:V \to \R</math>, replace length <math>l</math> by <math>l_h</math> where <math>l_h(a) := l(a) + h(w) - h(v)</math> for each <math>a = (v,w) \in A</math>. This variant is usually called the '''A* algorithm'''. The heristic idea is that we (hopefully) reach <math> t</math> earlier because the arcs that roughly point towards <math>t</math> are shortened, and the arcs the roughly point in a direction away from <math>t</math> are lengthened.

Revision as of 11:33, 15 October 2014


Dijstra's algorithm is a graph algortihm solving the single-source shortest-paths problem.

Requirements

  • directed Graph [math]\displaystyle{ G=(V,E) }[/math]
  • weight function [math]\displaystyle{ w\colon E\to\mathbb R }[/math]
  • [math]\displaystyle{ \forall(u,v)\in E\colon w(u,v)\geq 0 }[/math]
  • source [math]\displaystyle{ s\in V }[/math]
  • Datastruct [math]\displaystyle{ S }[/math]
  • Datastruct [math]\displaystyle{ Q }[/math]

General information

Algorithmic problem: Single source shortest paths

Prerequisities: For all [math]\displaystyle{ \alpha \in A }[/math], it is [math]\displaystyle{ l(a) \geq 0 }[/math].

Type of algortihm: loop

Auxiliary data:

  1. A temporary distance value [math]\displaystyle{ \delta(v) \in \R }[/math] for each node [math]\displaystyle{ v \in V }[/math]. At termination, it is [math]\displaystyle{ \delta(v) = \Delta(v) }[/math] for all [math]\displaystyle{ v \in V }[/math].
  2. A bounded priority queue [math]\displaystyle{ Q }[/math] of size [math]\displaystyle{ |V|-1 }[/math], which contains nodes from [math]\displaystyle{ V }[/math] and takes their [math]\displaystyle{ \delta }[/math]-values as keys. The node with minimal key is returned.

Abstract view

Invariant: After [math]\displaystyle{ i \geq 0 }[/math] iterations:

  1. [math]\displaystyle{ Q }[/math] contains all nodes of [math]\displaystyle{ V }[/math] except for [math]\displaystyle{ s }[/math] and [math]\displaystyle{ i }[/math] further nodes.
  2. For the nodes [math]\displaystyle{ v \in V }[/math] not in [math]\displaystyle{ Q }[/math], it is [math]\displaystyle{ \delta(v) = \Delta(v) }[/math].
  3. For the nodes [math]\displaystyle{ v \in V }[/math] in [math]\displaystyle{ Q }[/math], [math]\displaystyle{ \delta(v) }[/math] is the length of a shortes [math]\displaystyle{ (s,v) }[/math]-path that solely contains nodes not in [math]\displaystyle{ Q }[/math] (except for [math]\displaystyle{ v }[/math] itself, of course). As usual, this means [math]\displaystyle{ \delta(v) = +\infty }[/math] if there is no such path.

In particular, it is [math]\displaystyle{ \delta(v) \geq \Delta(v) }[/math] for each node [math]\displaystyle{ v \in V }[/math].

Variant: [math]\displaystyle{ i }[/math] increases by [math]\displaystyle{ 1 }[/math].

Break condition:

  • [math]\displaystyle{ Q = \empty }[/math], which means that all nodes are reachable from [math]\displaystyle{ s }[/math] and have been processed;
  • otherwise, if [math]\displaystyle{ \delta(v) = +\infty }[/math] for the noext node [math]\displaystyle{ v }[/math] of [math]\displaystyle{ Q }[/math], which means that [math]\displaystyle{ \delta(v) = + \infty }[/math] for every node in [math]\displaystyle{ Q }[/math] and hence none of the nodes in [math]\displaystyle{ Q }[/math] is reachable from [math]\displaystyle{ s }[/math].

Induction basis

Abstract view:

  1. All nodes of [math]\displaystyle{ V }[/math] except for [math]\displaystyle{ s }[/math] have to be members of [math]\displaystyle{ Q }[/math].
  2. Root [math]\displaystyle{ s }[/math] must have correct distance [math]\displaystyle{ \Delta(s) = 0 }[/math].
  3. The other nodes must meet Invariant #3.

Implementation:

  1. [math]\displaystyle{ \delta(s) := 0 }[/math];
  2. For all [math]\displaystyle{ a = (s,v) \in A }[/math], set [math]\displaystyle{ \delta(v) := l(a) }[/math].
  3. For all [math]\displaystyle{ v \in V\setminus\{s\} }[/math] with [math]\displaystyle{ (s,v) \notin A }[/math] set [math]\displaystyle{ \delta(v) := +\infty }[/math].
  4. Insert all nodes in [math]\displaystyle{ V\setminus\{s\} }[/math] into [math]\displaystyle{ Q }[/math].

Proof:

Obvious.

Induction step

Abstract view:

One node is extracted from [math]\displaystyle{ Q }[/math], and the distance values of the endnodes of its outgoing arcs are updated to meet Invariant #3.

Implementation:

  1. Extract the next node [math]\displaystyle{ v }[/math] from [math]\displaystyle{ Q }[/math].
  2. For each outgoing arc [math]\displaystyle{ a = (v,w) \in A }[/math] such that [math]\displaystyle{ w \in Q }[/math], set [math]\displaystyle{ \delta(w) := min\{\delta(w), \delta(v) + l(a)\} }[/math].

Correctness:

The first invariant is trivially maintained. So we will focus on the second and third invariants.
Consider the moment immediately before the [math]\displaystyle{ i }[/math]-iteration. The core of the proof is to show that [math]\displaystyle{ \delta(v) = \Delta(v) }[/math] held at that moment (and hence forever). For a contradiction, suppose there is an [math]\displaystyle{ (s,v) }[/math]-path [math]\displaystyle{ p }[/math] whose length is strictly less than [math]\displaystyle{ \delta(v) }[/math]. The third invariant implies that [math]\displaystyle{ p }[/math] has nodes in [math]\displaystyle{ Q }[/math] besides [math]\displaystyle{ v }[/math]. Let [math]\displaystyle{ u }[/math] denote the first such node on [math]\displaystyle{ p }[/math]. Then none of the nodes on the subpath of [math]\displaystyle{ p }[/math] from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ u }[/math] belonged to [math]\displaystyle{ Q }[/math] at that moment (except for [math]\displaystyle{ u }[/math] itself, of course). Due to the prefix property, this subpath is a shortest [math]\displaystyle{ (s,u) }[/math]-path, so Invariant #3 implies [math]\displaystyle{ \delta(u) = \Delta(u) }[/math]. On the other hand, the specific choice of [math]\displaystyle{ v }[/math] implies [math]\displaystyle{ \delta(u) \geq \delta(v) }[/math]. In summary, the subpath of [math]\displaystyle{ p }[/math] from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ w }[/math] is not shorter than [math]\displaystyle{ \delta(v) }[/math]. To obtain [math]\displaystyle{ l(p) \lt \delta(v) }[/math], the subpath of [math]\displaystyle{ p }[/math] from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ v }[/math] must have negative total length, which contradicts the prerequisite [math]\displaystyle{ l \geq 0 }[/math].
Now we know [math]\displaystyle{ \delta(v) = \Delta(v) }[/math] Therefore, [math]\displaystyle{ v }[/math] fulfills the statement of the second invariant after the [math]\displaystyle{ i }[/math]-th iteration. Of course, the second invariant is not violated by any other node either, and is thus maintained.
For a node [math]\displaystyle{ w \in V }[/math] let [math]\displaystyle{ P_i (w) }[/math] denote the set of all [math]\displaystyle{ (s,w) }[/math]-paths such that no node (except for [math]\displaystyle{ w }[/math] itself) is in [math]\displaystyle{ Q }[/math] immediately after the [math]\displaystyle{ i }[/math]-th iteration. In particular, the statement of the third invariant means that [math]\displaystyle{ \delta(w) }[/math] is the length of a shortest path in [math]\displaystyle{ P_i (w) }[/math]. Tthe induction hypothesis says that [math]\displaystyle{ \delta(w) }[/math] is the minimum length of all [math]\displaystyle{ (s,w) }[/math]-paths in [math]\displaystyle{ P_{i-1} (w) }[/math]. The difference [math]\displaystyle{ P_i (w) \setminus P_{i-1} (w) }[/math] consists of all [math]\displaystyle{ (s,w) }[/math]-paths such that [math]\displaystyle{ (v,w) }[/math] is the last arc and all nodes except for [math]\displaystyle{ v }[/math] and [math]\displaystyle{ w }[/math] belong to [math]\displaystyle{ V_{i-1} }[/math]. Therefore, the second step of the iteration incorporates [math]\displaystyle{ P_i (w) \setminus P_{i-1} (w) }[/math] in the computation of [math]\displaystyle{ \delta(w) }[/math] appropriately to maintain Invaraint #3.

Complexity

Statement: The worst-case complexity is in [math]\displaystyle{ O(m+T(n)\cdot n) }[/math], where [math]\displaystyle{ n = |V| }[/math] and [math]\displaystyle{ m = |A| }[/math], and [math]\displaystyle{ T(n) }[/math] is the worst-case compexity for extraction/inserting nodes in [math]\displaystyle{ Q }[/math].

Proof: Each node is inserted in and extracted from [math]\displaystyle{ Q }[/math] at most once, respectively, which gives [math]\displaystyle{ O(T(n) \cdot n) }[/math] in total. Also, each arc [math]\displaystyle{ (v,w) \in A }[/math] is touched at most once, namely

  • in the initialization if [math]\displaystyle{ v = s }[/math],
  • when [math]\displaystyle{ v }[/math] is extracted from [math]\displaystyle{ Q }[/math] otherwise.

Pseudocode

DIJKSTRA(G,w,s)

DIJKSTRA(G,w,s)
1 INITIALIZE-SINGLE-SOURCE(G,s) 
2      S = ∅
3      Q = G.V
4      while Q ≠ ∅
5            u = EXTRACT-MIN(Q)
6            S = S ∪ {u} 
7            for each vertex v ∈ G.Adj[u]
8                 RELAX(u,v,w)

RELAX(u,v,w)

RELAX(u,v,w)
1 if v.d > u.d + w(u,v)
2       v.d = u.d + w(u,v)
3       v.π = u

INITIALIZE-SINGLE-SOURCE(G,s)

INITIALIZE-SINGLE-SOURCE(G,s) 
1 for each vertex vG.V
2         v.d = ∞
3         v.π = NIL
4 s.d = 0

Further infromation

  1. We do not need to insert all nodes except for [math]\displaystyle{ s }[/math] in [math]\displaystyle{ Q }[/math]. In fact, it suffices to initially insert the endnodes of the outgoing arcs of [math]\displaystyle{ s }[/math] and, during an iteration, to insert the nodes of the outgoing arcs of the node that is being extracted from [math]\displaystyle{ Q }[/math]. In a sense, the nodes of [math]\displaystyle{ Q }[/math] then constitute the "frontierline" of the algorithm. We chose another variant on Dijkstra's algorithm mainly for simplicity of presentation.
  2. Clearly, Dijkstra's algorithm also solves the single-source single-target case. In this case, Dijkstra may safely terminate once has been extracted from [math]\displaystyle{ Q }[/math]. In fact, the loop invariant implies [math]\displaystyle{ \delta(t) = \Delta(t) }[/math] from then on. This variant on Dijkstra's algorithm is sometimes called the early termination variant.
  3. In a scenario where we perform a large number of searches on the same graph and arc lengths, we do not need to touch the whole graph in each search:
    1. We change the algorithm as suggested by the first point: only keep the "frontierline" in [math]\displaystyle{ Q }[/math].
    2. We also change the algorithm as suggested by the second point: terminate once [math]\displaystyle{ t }[/math] is processed.
    3. To avoid touching all nodes for initializing the [math]\displaystyle{ \delta }[/math]-values, we maintain sort of a version counter, or time stamp, for each node, which holds the ID of the last search that met this node. Whenever a node is met in a search and the node's version counter does not equal the ID of the current search, the distance value of this node is regarded as [math]\displaystyle{ + \infty }[/math] (no matter what its current value is).
  4. The early termination variant may be modified as follows: for and admissible distance function [math]\displaystyle{ h:V \to \R }[/math], replace length [math]\displaystyle{ l }[/math] by [math]\displaystyle{ l_h }[/math] where [math]\displaystyle{ l_h(a) := l(a) + h(w) - h(v) }[/math] for each [math]\displaystyle{ a = (v,w) \in A }[/math]. This variant is usually called the A* algorithm. The heristic idea is that we (hopefully) reach [math]\displaystyle{ t }[/math] earlier because the arcs that roughly point towards [math]\displaystyle{ t }[/math] are shortened, and the arcs the roughly point in a direction away from [math]\displaystyle{ t }[/math] are lengthened.