Preflow-push

Abstract view

Algorithmic problem: max-flow problem (standard version)

Type of algorithm: loop.

Auxiliary data:

  1. A nonnegative integral value [math]\displaystyle{ d(v) }[/math] for each node [math]\displaystyle{ v\in V }[/math].
  2. Each node [math]\displaystyle{ v\in V\setminus\{t\} }[/math] has a current arc, which may be implemented as an iterator on the list of outgoing arcs of [math]\displaystyle{ v }[/math].
  3. The excess [math]\displaystyle{ e_f(v) }[/math] of a node [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math] with respect to the current preflow.
  4. A (dynamically changing) set [math]\displaystyle{ S }[/math] of nodes.

Invariant: Before and after each iteration:

  1. For each arc [math]\displaystyle{ a\in A }[/math], we have [math]\displaystyle{ 0\leq f(a)\leq u(a) }[/math]. If all upper bounds are integral, all [math]\displaystyle{ f }[/math]-values are integral, too.
  2. For each node [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math], we have [math]\displaystyle{ e_f(v)\geq 0 }[/math]. In other words, [math]\displaystyle{ f }[/math] is a preflow.
  3. The node labels [math]\displaystyle{ d }[/math] form a valid distance labeling, and [math]\displaystyle{ d(s)=|V| }[/math].
  4. The currently active nodes are stored in [math]\displaystyle{ S }[/math].
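
Illustration: On small instances, the invariant can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the representation (arcs as a list of (tail, head) pairs with parallel lists u and f) and all function names are hypothetical. It also assumes the usual definition of a valid distance labeling, namely d(t)=0 and d(v) <= d(w)+1 for every arc (v,w) of the residual network.

```python
def excess(arcs, f, v):
    """Inflow minus outflow of node v with respect to f."""
    inflow = sum(f[i] for i, (_, head) in enumerate(arcs) if head == v)
    outflow = sum(f[i] for i, (tail, _) in enumerate(arcs) if tail == v)
    return inflow - outflow


def residual_arcs(arcs, u, f):
    """Yield the arcs (v, w) of the residual network of f."""
    for i, (v, w) in enumerate(arcs):
        if f[i] < u[i]:
            yield v, w  # forward arc with spare capacity
        if f[i] > 0:
            yield w, v  # backward arc that can cancel flow


def invariant_holds(n, arcs, u, f, d, s, t, S):
    """Check points 1-4 of the invariant (nodes are 0, ..., n-1)."""
    inner = [v for v in range(n) if v not in (s, t)]
    capacity_ok = all(0 <= f[i] <= u[i] for i in range(len(arcs)))
    preflow_ok = all(excess(arcs, f, v) >= 0 for v in inner)
    # Assumed definition of validity: d(t) = 0, d(v) <= d(w) + 1 on residual arcs.
    labeling_ok = (
        d[t] == 0
        and d[s] == n
        and all(d[v] <= d[w] + 1 for v, w in residual_arcs(arcs, u, f))
    )
    active = {v for v in inner if excess(arcs, f, v) > 0}
    return capacity_ok and preflow_ok and labeling_ok and active == set(S)
```

Such a check is far too slow for production use; it is meant for testing an implementation after each iteration.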

Variant: In each iteration, at least one of the following three actions will take place:

  1. The label [math]\displaystyle{ d(v) }[/math] of at least one node [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math] is increased.
  2. A saturating push is performed.
  3. The value of [math]\displaystyle{ D:=\sum_{v\in V\setminus\{s,t\}\atop e_f(v)>0}d(v) }[/math] decreases.

No label [math]\displaystyle{ d(\cdot) }[/math] is ever decreased.

Break condition: [math]\displaystyle{ S=\emptyset }[/math].

Induction basis

Abstract view:

  1. For all arcs [math]\displaystyle{ a\in A }[/math], set [math]\displaystyle{ f(a):=0 }[/math].
  2. For each arc [math]\displaystyle{ a=(s,v)\in A }[/math], overwrite this value by [math]\displaystyle{ f(a):=u(a) }[/math].
  3. Compute a valid distance labeling [math]\displaystyle{ d }[/math], for example, the true distances from all nodes to [math]\displaystyle{ t }[/math] in the residual network.
  4. Set [math]\displaystyle{ d(s):=n }[/math], where [math]\displaystyle{ n=|V| }[/math].
  5. For all [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math], reset the current arc of [math]\displaystyle{ v }[/math] so as to point to the first arc in the list of outgoing arcs of [math]\displaystyle{ v }[/math].

Proof: For the subgraph induced by [math]\displaystyle{ V\setminus\{s\} }[/math], the arguments in the correctness proof for the Ahuja-Orlin algorithm prove that the [math]\displaystyle{ d }[/math] form a valid distance labeling here as well. For [math]\displaystyle{ s }[/math], there is nothing to show because all outgoing arcs are saturated.
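
Illustration: The induction basis might be sketched in Python as follows. This is not a definitive implementation. The function name, the arc representation (a list of (tail, head) pairs with a parallel capacity list u), and the returned tuple are assumptions. Two details go beyond the five steps above and are also assumptions: the sketch initializes the excesses and the set S so that point 4 of the invariant holds, and it gives the label n to every node from which t is unreachable.

```python
from collections import deque


def initialize(n, arcs, u, s, t):
    """Induction basis: saturate source arcs, label nodes by distance to t."""
    # Steps 1 and 2: zero flow, then saturate every arc leaving s.
    f = [0] * len(arcs)
    for i, (tail, _) in enumerate(arcs):
        if tail == s:
            f[i] = u[i]

    # pred[y] lists all x such that (x, y) is an arc of the residual network.
    pred = [[] for _ in range(n)]
    for i, (v, w) in enumerate(arcs):
        if f[i] < u[i]:
            pred[w].append(v)
        if f[i] > 0:
            pred[v].append(w)

    # Step 3: backward breadth-first search from t. The value n doubles as
    # the "not reached" marker (assumption), since true distances are < n.
    d = [n] * n
    d[t] = 0
    queue = deque([t])
    while queue:
        y = queue.popleft()
        for x in pred[y]:
            if d[x] == n:
                d[x] = d[y] + 1
                queue.append(x)

    d[s] = n          # step 4
    cur = [0] * n     # step 5: current arc = first outgoing arc

    # Assumed extra bookkeeping: excesses and the set S of active nodes.
    e = [0] * n
    for i, (tail, head) in enumerate(arcs):
        if tail == s:
            e[head] += f[i]
    S = {v for v in range(n) if v not in (s, t) and e[v] > 0}
    return f, d, e, cur, S
```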

Induction step

Abstract view:

  1. Choose an active node [math]\displaystyle{ v }[/math] from [math]\displaystyle{ S }[/math].
  2. While the current arc of [math]\displaystyle{ v }[/math] is not void and not admissible either, move the current arc one step forward.
  3. If the current arc of [math]\displaystyle{ v }[/math] is not void now but an (admissible) outgoing arc [math]\displaystyle{ (v,w) }[/math], say:
    1. If [math]\displaystyle{ w\notin\{s,t\} }[/math] and [math]\displaystyle{ e_f(w)=0 }[/math], insert [math]\displaystyle{ w }[/math] in [math]\displaystyle{ S }[/math].
    2. Increase the flow over [math]\displaystyle{ (v,w) }[/math] by the minimum of [math]\displaystyle{ e_f(v) }[/math] and the residual capacity of [math]\displaystyle{ (v,w) }[/math].
    3. If [math]\displaystyle{ e_f(v)=0 }[/math] now, extract [math]\displaystyle{ v }[/math] from [math]\displaystyle{ S }[/math].
  4. Otherwise:
    1. Let [math]\displaystyle{ d_{\min} }[/math] denote the minimal label [math]\displaystyle{ d(w) }[/math] over all arcs [math]\displaystyle{ (v,w) }[/math] in the residual network.
    2. Set [math]\displaystyle{ d(v):=d_{\min}+1 }[/math].
    3. Reset the current arc of [math]\displaystyle{ v }[/math] so as to point to the beginning of the list of outgoing arcs of [math]\displaystyle{ v }[/math].

Remark: The preflow-push algorithm is also known as the push-relabel algorithm. The push operation is step 3.2; the relabel operation is step 4.
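
Illustration: The main loop, together with a simplified induction basis, might be sketched in Python as follows. This is a sketch under assumptions, not a reference implementation. The function name preflow_push, the input format (triples (v, w, capacity) with integral capacities), and the paired-arc representation (arc i and arc i ^ 1 are mutually reverse, so the outgoing arcs of a node include reverse arcs) are all hypothetical choices. For brevity, the sketch starts with the labeling d(v)=0 for all nodes other than s, which is also valid, instead of the true distances to t. S is kept as a stack: a node is popped, processed once, and reinserted if it is still active.

```python
def preflow_push(n, arc_list, s, t):
    """Return the value of a maximum (s, t)-flow; nodes are 0, ..., n-1."""
    head, res = [], []            # res[i]: residual capacity of arc i
    out = [[] for _ in range(n)]  # out[v]: indices of arcs leaving v
    for v, w, cap in arc_list:
        out[v].append(len(head)); head.append(w); res.append(cap)
        out[w].append(len(head)); head.append(v); res.append(0)

    d = [0] * n                   # simplified valid labeling (assumption)
    d[s] = n
    e = [0] * n                   # excesses
    cur = [0] * n                 # current arc: position in out[v]
    S = []                        # active nodes

    def push(v, i, amount):
        w = head[i]
        if w not in (s, t) and e[w] == 0 and amount > 0:
            S.append(w)           # step 3.1: w becomes active
        res[i] -= amount          # step 3.2
        res[i ^ 1] += amount
        e[v] -= amount
        e[w] += amount

    for i in out[s]:              # induction basis: saturate source arcs
        push(s, i, res[i])

    while S:                      # break condition: S is empty
        v = S.pop()               # step 1
        # Step 2: advance the current arc until it is admissible or void.
        while cur[v] < len(out[v]):
            i = out[v][cur[v]]
            if res[i] > 0 and d[v] == d[head[i]] + 1:
                break
            cur[v] += 1
        if cur[v] < len(out[v]):  # step 3: push
            i = out[v][cur[v]]
            push(v, i, min(e[v], res[i]))
        else:                     # step 4: relabel
            d[v] = 1 + min(d[head[i]] for i in out[v] if res[i] > 0)
            cur[v] = 0
        if e[v] > 0:              # step 3.3, inverted: keep v if still active
            S.append(v)
    return e[t]
```

For example, preflow_push(4, [(0, 1, 3), (0, 2, 2), (1, 3, 2), (2, 3, 3), (1, 2, 1)], 0, 3) returns 5 on this small network with source 0 and sink 3.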

Proof: Points 1, 2, and 4 of the invariant are obviously fulfilled. Point 3 of the invariant is affected by step 4 only, and the outgoing arcs of [math]\displaystyle{ v }[/math] are the only arcs where the distance labeling may become invalid. However, the extremely conservative increase of [math]\displaystyle{ d(v) }[/math] ensures point 3 of the invariant.

To prove the variant, consider a step in which no [math]\displaystyle{ d }[/math]-value is increased and no saturating push is performed. This means step 3.2 is applied, but the arc [math]\displaystyle{ (v,w) }[/math] is not saturated by that. Potentially, [math]\displaystyle{ w }[/math] becomes active. However, [math]\displaystyle{ v }[/math] definitely becomes inactive since a non-saturating push removes the entire excess of [math]\displaystyle{ v }[/math]. Now the variant follows from the fact that [math]\displaystyle{ d(w)=d(v)-1 }[/math] for an admissible arc [math]\displaystyle{ (v,w) }[/math]: [math]\displaystyle{ D }[/math] loses [math]\displaystyle{ d(v) }[/math] and gains at most [math]\displaystyle{ d(w) }[/math].

It remains to show termination; this is proved by the following complexity considerations.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \mathcal{O}(n^2m) }[/math], where [math]\displaystyle{ n=|V| }[/math] and [math]\displaystyle{ m=|A| }[/math].

Proof: First we show that the total number of relabel operations (step 4 of the main loop) is in [math]\displaystyle{ \mathcal{O}(n^2) }[/math]. To see that, let [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math] be an active node before or after an iteration of the main loop. A straightforward induction over the number of push operations shows that there is at least one simple [math]\displaystyle{ (s,v) }[/math]-path [math]\displaystyle{ p }[/math] with positive flow on all arcs. The transpose of [math]\displaystyle{ p }[/math] is therefore a [math]\displaystyle{ (v,s) }[/math]-path in the residual network. Due to the validity of [math]\displaystyle{ d }[/math] (induction hypothesis), [math]\displaystyle{ d(v)-d(s)=d(v)-n }[/math] cannot be larger than the number of arcs on [math]\displaystyle{ p }[/math], which is not larger than [math]\displaystyle{ n-1 }[/math]. Therefore, no node label can be larger than [math]\displaystyle{ 2n-1 }[/math]. Since node labels are nonnegative and increase by at least one in each relabel operation, the claimed upper bound on the relabel operations follows.

From this bound, we may immediately conclude that the total number of forward steps of the current arcs of all nodes is in [math]\displaystyle{ \mathcal{O}(n\!\cdot\!m) }[/math] because the list of outgoing arcs of a node is passed [math]\displaystyle{ \mathcal{O}(n) }[/math] times.

The argument used in the complexity analysis of the Ahuja-Orlin algorithm to prove that the total number of saturating push operations is in [math]\displaystyle{ \mathcal{O}(nm) }[/math] applies here as well.

Finally, we consider the non-saturating push operations. First note that [math]\displaystyle{ D\geq 0 }[/math] before and after each iteration. The value of [math]\displaystyle{ D }[/math] is increased in each relabel operation exactly by the amount by which the label of the current node is increased. Since node labels are never decreased and bounded from above by [math]\displaystyle{ 2n-1 }[/math], [math]\displaystyle{ D }[/math] increases by less than [math]\displaystyle{ 2n^2 }[/math] in total over all relabel operations. On the other hand, a saturating push operation may increase [math]\displaystyle{ D }[/math] by at most [math]\displaystyle{ 2n-1 }[/math] (namely, in the case that [math]\displaystyle{ w }[/math] was not active immediately before the push). In summary, the total sum of all values by which [math]\displaystyle{ D }[/math] is increased throughout the algorithm is in [math]\displaystyle{ \mathcal{O}(n^2m) }[/math]. Due to the variant, the value of [math]\displaystyle{ D }[/math] is decreased by at least one in each non-saturating push operation. This proves the claim.
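
Illustration: The accounting in the last paragraph can be written out as follows. The symbols D_0 (value of D right after the induction basis), Delta_relabel and Delta_sat (total increase of D caused by relabel operations and by saturating pushes, respectively), and the constant c are introduced only for this illustration. The bound on D_0 assumes that the label bound 2n-1 already holds for the initially active nodes.

```latex
\begin{align*}
\#\,\text{non-saturating pushes}
  &\le D_0 + \Delta_{\text{relabel}} + \Delta_{\text{sat}} \\
D_0 &\le (n-2)(2n-1) < 2n^2 \\
\Delta_{\text{relabel}} &\le (n-2)(2n-1) < 2n^2 \\
\Delta_{\text{sat}} &\le (2n-1)\cdot c\,nm
  \quad\text{for some constant } c \\
\Longrightarrow\quad \#\,\text{non-saturating pushes}
  &\in \mathcal{O}(n^2 + n^2 + n^2 m) = \mathcal{O}(n^2 m)
\end{align*}
```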

Heuristic speedup techniques

  1. After [math]\displaystyle{ \Omega(n) }[/math] iterations of the main loop, the [math]\displaystyle{ d }[/math]-values are recomputed analogously to the induction basis: as the current distance of each node to [math]\displaystyle{ t }[/math] in the residual network. This recomputation happens seldom enough not to increase the asymptotic complexity, but it may save many unnecessary relabel steps.
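
Illustration: The recomputation of the labels in technique 1 might be sketched in Python as follows. The function name and the paired-arc representation (head[i] is the head of arc i, res[i] its residual capacity, arc i ^ 1 its reverse, out[v] the indices of the arcs leaving v) are assumptions. Two further choices are assumptions as well: a node from which t is unreachable receives the maximum of its old label and n, and the current arc is reset for every node whose label changes.

```python
from collections import deque


def global_relabel(n, head, res, out, d, cur, t):
    """Recompute d as residual distances to t; modifies d and cur in place."""
    dist = [None] * n
    dist[t] = 0
    queue = deque([t])
    while queue:
        y = queue.popleft()
        for j in out[y]:
            x = head[j]  # arc j runs y -> x, hence arc j ^ 1 runs x -> y
            if dist[x] is None and res[j ^ 1] > 0:
                dist[x] = dist[y] + 1
                queue.append(x)
    for v in range(n):
        if dist[v] is not None:
            new_label = dist[v]
        else:
            # t is unreachable from v (this includes s): never go below n.
            new_label = max(d[v], n)
        if new_label != d[v]:
            d[v] = new_label
            cur[v] = 0   # skipped arcs may have become admissible
```

Since a valid distance labeling never exceeds the true residual distance to t, this update does not decrease any label.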
  2. The main loop may be decomposed into two phases: First, as much flow as possible is sent into [math]\displaystyle{ t }[/math]; second, all surplus flow that cannot reach [math]\displaystyle{ t }[/math] is sent back to [math]\displaystyle{ s }[/math]. The first phase may be finished once there is no longer any path in the residual network from an active node to [math]\displaystyle{ t }[/math]. All nodes from which [math]\displaystyle{ t }[/math] is reachable may be safely disregarded in the second phase. For any other node, to save unnecessary relabel operations, the distance label may be safely increased to the minimum number of arcs in the residual network from this node back to [math]\displaystyle{ s }[/math].