Preflow-push: Difference between revisions

From Algowiki
Jump to navigation Jump to search
 
(95 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Abstract view ==
== Abstract view ==
'''Also known as:''' ''push-relabel'' algorithm or ''Goldberg-Tarjan'' algorithm


'''Algorithmic problem:'''
'''Algorithmic problem:'''
[[Max-Flow Problems|max-flow problem (standard version)]]
[[Max-Flow Problems#Standard version|max-flow problem (standard version)]]


'''Type of algorithm:'''
'''Type of algorithm:'''
loop.
loop.


'''Definition:'''
'''Auxiliary data:'''
# For <math>a\in A</math>, let <math>f(a)</math> be a real number (not necessarily foring a flow). For <math>v\in V\setminus\{s,t\}</math>, the '''excess''' of <math>v</math> with respect to <math>f</math> is defined by <math>e_f(v):=\sum_{w:(w,v)\in A}f(w,v)-\sum_{w:(v,w)\in A}f(v,w)</math>.
# A nonnegative integral value <math>d(v)</math> for each node <math>v\in V</math>.
# A node <math>v\in V\setminus\{s,t\}</math> is called '''active''' it its excess is positive.
# Each node <math>v\in V\setminus\{t\}</math> has a '''current arc''', which may be implemented as an iterator on the list of outgoing residual arcs of <math>v</math>.
# Let <math>\Delta</math> denote the sum of the distance labels <math>d(v)</math> of all active nodes <math>v\in V\setminus\{s,t\}</math>.
# The [[Basic flow definitions#Preflow|excess]] <math>e_f(v)</math> of a node <math>v\in V\setminus\{s,t\}</math> with respect to the current [[Basic flow definitions#Preflow|preflow]].
# A (dynamically changing) [[Sets and sequences|set]] <math>S</math> of nodes.


'''Invariant:'''
'''Invariant:'''
Before and after each iteration:
Before and after each iteration:
# For each arc <math>a\in A</math>, it is <math>0\leq f(a)\leq u(a)</math> .
# For each arc <math>a\in A</math>, it is <math>0\leq f(a)\leq u(a)</math> . If all upper bounds are integral, all <math>f</math>-values are integral, too.
# For each node <math>v\in V</math>, it is <math>e_f(v)\geq 0</math>.
# For each node <math>v\in V\setminus\{s,t\} </math>, it is <math>e_f(v)\geq 0</math>. In other words, <math>f</math> is a [[Basic flow definitions#Preflow|preflow]].
# The node labels <math>d</math> form a [[Basic flow definitions#Valid distance labeling|valid distance labeling]], and it is <math>d(s)=|V|</math>.
# The node labels <math>d</math> form a [[Basic flow definitions#Valid distance labeling|valid distance labeling]], and it is <math>d(s)=n:=|V|</math>.
# The currently [[Basic flow definitions#Preflow|active nodes]] are stored in <math>S</math>.
# The current arc of a node is an outgoing arc of the node's in the residual graph. In the list of all of these arcs, no admissible arc precedes the current arc.


'''Variant:'''
'''Variant:'''
At least one
No label <math>d(\cdot)</math> is ever decreased. In each iteration, one of the following three actions will take place:
# The label <math>d(v)</math> of at least one node <math>v\in V\setminus\{s,t\}</math> is increased.
# The label <math>d(v)</math> of at least one node <math>v\in V\setminus\{s,t\}</math> is increased.
# A saturating push is performed.
# A saturating push is performed.
# The value of <math>\Delta</math> decreases (in case of a non-saturating push).
# The value of <math>D:=\sum_{v\in V\setminus\{s,t\}\atop e_f(v)>0}d(v)</math> decreases.
 
 
'''Break condition:'''
<math>S=\emptyset</math>.


== Induction basis ==
== Induction basis ==
Line 28: Line 37:
'''Abstract view:'''
'''Abstract view:'''
# For all arcs <math>a\in A</math>, set <math>f(a):=0</math>.
# For all arcs <math>a\in A</math>, set <math>f(a):=0</math>.
# For each arc <math>(s,v)\in A</math>, overwrite this value by <math>f(a):=u(a)</math>.
# For each arc <math>(s,v)\in A</math>, overwrite this value by <math>f(a):=u(a)</math> and put <math>v</math> into <math>S</math>.
# Compute a valid distance labeling <math>d</math>, for example, the true distances from all nodes to <math>t</math> in the residual network.
# Compute a [[Basic flow definitions#Valid distance labeling|valid distance labeling]] <math>d</math> for <math>V\setminus\{s\}</math> with respect to <math>f</math>, for example, the true distances from all nodes to <math>t</math> in the residual network of <math>f</math>.
# Set <math>d(s):=n</math>.
# Set <math>d(s):=n</math>.
# For all <math>v\in V\setminus\{s,t\}</math>, reset the current arc of <math>v</math> so as to point to the first arc in the list of outgoing arcs of <math>v</math>.


'''Proof:'''
'''Proof:'''
Obvious.
For the [[Basic graph definitions#Subgraphs|subgraph induced]] by <math>V\setminus\{s\}</math>, the arguments in the [[Ahuja-Orlin#Correctness|correctness proof]] for the [[Ahuja-Orlin]] algorithm prove that the <math>d</math>-labels form a valid distance labeling here as well. For <math>s</math>, nothing is to show because all outgoing arcs are saturated.


== Induction step ==
== Induction step ==


# Extract an active node <math>v</math> from <math>Q</math>.
'''Abstract view:'''
# Choose an [[Basic flow definitions#Preflow|active node]] <math>v</math> from <math>S</math>.
# While the current arc of <math>v</math> is not void and not [[Basic flow definitions#Valid distance labeling|admissible]] either, move the current arc one step forward.
# If the current arc of <math>v</math> is ''not'' void now but an (admissible) outgoing arc <math>(v,w)</math>, say:
## If <math>w\neq s</math> and <math>w\neq t</math> and <math>e_f(w)=0</math>, insert <math>w</math> in <math>S</math>.
## Increase the flow over <math>(v,w)</math> by the minimum of <math>e_f(v)</math> and the residual capacity of <math>(v,w)</math>.
## Increase <math>e_f(w)</math> by that value and decrease <math>e_f(v)</math> by the same value.
## If <math>e_f(v)=0</math> now, extract <math>v</math> from <math>S</math>.
# Otherwise:
## Let <math>d_\min</math> denote the minimal label <math>d(w)</math> of any arc <math>(v,w)</math> in the residual network.
## Set <math>d(v):=d_\min+1</math>.
## Reset the current arc of <math>v</math> so as to point to the beginning of the list of outgoing arcs of <math>v</math>.
 
'''Remark:'''
The preflow-push algorithm is also known as the '''push-relabel''' algorithm. The ''push'' operation is step 3; the ''relabel'' operation is step 4.
 
'''Proof:'''
Points 1, 2, and 4 of the invariant and <math>d(s)=n</math> are obviously fulfilled. The rest of point 3 of the invariant is affected by step 4 only, and the outgoing arcs of <math>v</math> are the only arcs where the distance labeling may become invalid. However, the extremely conservative increase of <math>d(v)</math> ensures point 3 of the invariant.
 
To prove the variant, consider a step in which neither any <math>d</math>-value is increased nor a saturating push is performed. This means step 3.2 is applied, but the arc <math>(v,w)</math> is not saturated by that. Potentially, <math>w</math> becomes active. However, <math>v</math> definitely becomes inactive since the push step is non-saturating. Now the variant follows from the fact that <math>d(w)=d(v)-1</math> for an admissible arc <math>(v,w)</math>.
 
It remains to show termination; this is proved by the following complexity considerations.
 
== Complexity ==
 
'''Statement:'''
The asymptotic complexity is in <math>\mathcal{O}(n^2m)</math>, where <math>n=|V|</math> and <math>m=|A|</math>.
 
'''Proof:'''
First we show that the total number of relabel operations (step 4 of the main loop) is in <math>\mathcal{O}(n^2)</math>. To see that, let <math>v\in V\setminus\{s,t\}</math> be an active node between two iterations of the main loop. A straightforward induction over the number of push operations shows that there is at least one simple <math>(s,v)</math>-path <math>p</math> with positive flow on all arcs. The [[Basic graph definitions#Transpose of a graph|transpose]] of <math>p</math> is [[Basic flow definitions#Flow-augmenting paths and saturated arcs|augmenting]]. Due to the validity of <math>d</math> (induction hypothesis), <math>d(v)-d(s)=d(v)-n</math> cannot be larger than the number of arcs on <math>p</math>, which is not larger than <math>n-1</math>. Therefore, no node label can be larger than <math>2n-1</math>. Since node labels are nonnegative and increase at least by one in each relabel operation, the claimed upper bound on the relabel operations follows.
 
From this bound, we may immediately conclude that the current arc of a node is reset <math>\mathcal{O}(n)</math> times, so the total number of forward steps of the current arcs of all nodes is in <math>\mathcal{O}(n^3)\subseteq\mathcal{O}(n^2m)</math>.
 
The argument in the [[Ahuja-Orlin#Complexity|complexity analysis]] of the [[Ahuja-Orlin]] algorithm to prove that the total number of ''saturating'' push operations is in <math>\mathcal{O}(nm)</math>, applies here as well.
 
Finally, we consider the ''non-saturating'' push operations. First note that <math>D\geq 0</math> before and after each iteration. The value of <math>D</math> is increased in each relabel operation exactly by the amount by which the label of the current node is increased. Since node labels are never decreased and bounded from above by <math>2n-1</math>, <math>D</math> increases by less than <math>2n^2</math> in total over all relabel operations. On the other hand, a saturating push operation may increase <math>D</math> by at most <math>2n-1</math> (namely, in the case that <math>w</math> was not active immediately before the push). In summary, the total sum of all values by which <math>D</math> is increased throughout the algorithm is in <math>\mathcal{O}(n^2m)</math>. Due to the variant, the value of <math>D</math> is decreased by at least one in each non-saturating push operation. This proves the claim.
 
== Heuristic speedup techniques ==
 
# After <math>\Omega(n)</math> iterations of the main loop, the <math>d</math>-values are recomputed analogously to the induction basis: as the current distance of each node to <math>t</math> in the residual network. This modification is seldom enough, so the asymptotic complexity is not increased. In practice, this technique may save many unnecessary relabel steps.
# The main loop may be decomposed into two phases: First, as much flow as possible is sent into <math>t</math>; second, all surplus flow that cannot reach <math>t</math> is sent back to <math>s</math>. The first phase may be finished once there is no more path in the residual network from any active node to <math>t</math>. A sufficient and easy-to-check condition for that is <math>d(v)\geq n</math> for all active nodes <math>v</math>. All nodes from which <math>t</math> is reachable may be safely disregarded in the second phase. For any other node, to save unnecessary relabel operations, the distance label may be safely increased to the minimum number of arcs in the residual network from this node back to <math>s</math>.

Latest revision as of 03:53, 20 June 2017

Abstract view

Also known as: push-relabel algorithm or Goldberg-Tarjan algorithm

Algorithmic problem: max-flow problem (standard version)

Type of algorithm: loop.

Auxiliary data:

  1. A nonnegative integral value [math]\displaystyle{ d(v) }[/math] for each node [math]\displaystyle{ v\in V }[/math].
  2. Each node [math]\displaystyle{ v\in V\setminus\{t\} }[/math] has a current arc, which may be implemented as an iterator on the list of outgoing residual arcs of [math]\displaystyle{ v }[/math].
  3. The excess [math]\displaystyle{ e_f(v) }[/math] of a node [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math] with respect to the current preflow.
  4. A (dynamically changing) set [math]\displaystyle{ S }[/math] of nodes.

Invariant: Before and after each iteration:

  1. For each arc [math]\displaystyle{ a\in A }[/math], it is [math]\displaystyle{ 0\leq f(a)\leq u(a) }[/math] . If all upper bounds are integral, all [math]\displaystyle{ f }[/math]-values are integral, too.
  2. For each node [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math], it is [math]\displaystyle{ e_f(v)\geq 0 }[/math]. In other words, [math]\displaystyle{ f }[/math] is a preflow.
  3. The node labels [math]\displaystyle{ d }[/math] form a valid distance labeling, and it is [math]\displaystyle{ d(s)=n:=|V| }[/math].
  4. The currently active nodes are stored in [math]\displaystyle{ S }[/math].
  5. The current arc of a node is an outgoing arc of the node's in the residual graph. In the list of all of these arcs, no admissible arc precedes the current arc.

Variant: No label [math]\displaystyle{ d(\cdot) }[/math] is ever decreased. In each iteration, one of the following three actions will take place:

  1. The label [math]\displaystyle{ d(v) }[/math] of at least one node [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math] is increased.
  2. A saturating push is performed.
  3. The value of [math]\displaystyle{ D:=\sum_{v\in V\setminus\{s,t\}\atop e_f(v)\gt 0}d(v) }[/math] decreases.


Break condition: [math]\displaystyle{ S=\emptyset }[/math].

Induction basis

Abstract view:

  1. For all arcs [math]\displaystyle{ a\in A }[/math], set [math]\displaystyle{ f(a):=0 }[/math].
  2. For each arc [math]\displaystyle{ (s,v)\in A }[/math], overwrite this value by [math]\displaystyle{ f(a):=u(a) }[/math] and put [math]\displaystyle{ v }[/math] into [math]\displaystyle{ S }[/math].
  3. Compute a valid distance labeling [math]\displaystyle{ d }[/math] for [math]\displaystyle{ V\setminus\{s\} }[/math] with respect to [math]\displaystyle{ f }[/math], for example, the true distances from all nodes to [math]\displaystyle{ t }[/math] in the residual network of [math]\displaystyle{ f }[/math].
  4. Set [math]\displaystyle{ d(s):=n }[/math].
  5. For all [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math], reset the current arc of [math]\displaystyle{ v }[/math] so as to point to the first arc in the list of outgoing arcs of [math]\displaystyle{ v }[/math].

Proof: For the subgraph induced by [math]\displaystyle{ V\setminus\{s\} }[/math], the arguments in the correctness proof for the Ahuja-Orlin algorithm prove that the [math]\displaystyle{ d }[/math]-labels form a valid distance labeling here as well. For [math]\displaystyle{ s }[/math], nothing is to show because all outgoing arcs are saturated.

Induction step

Abstract view:

  1. Choose an active node [math]\displaystyle{ v }[/math] from [math]\displaystyle{ S }[/math].
  2. While the current arc of [math]\displaystyle{ v }[/math] is not void and not admissible either, move the current arc one step forward.
  3. If the current arc of [math]\displaystyle{ v }[/math] is not void now but an (admissible) outgoing arc [math]\displaystyle{ (v,w) }[/math], say:
    1. If [math]\displaystyle{ w\neq s }[/math] and [math]\displaystyle{ w\neq t }[/math] and [math]\displaystyle{ e_f(w)=0 }[/math], insert [math]\displaystyle{ w }[/math] in [math]\displaystyle{ S }[/math].
    2. Increase the flow over [math]\displaystyle{ (v,w) }[/math] by the minimum of [math]\displaystyle{ e_f(v) }[/math] and the residual capacity of [math]\displaystyle{ (v,w) }[/math].
    3. Increase [math]\displaystyle{ e_f(w) }[/math] by that value and decrease [math]\displaystyle{ e_f(v) }[/math] by the same value.
    4. If [math]\displaystyle{ e_f(v)=0 }[/math] now, extract [math]\displaystyle{ v }[/math] from [math]\displaystyle{ S }[/math].
  4. Otherwise:
    1. Let [math]\displaystyle{ d_\min }[/math] denote the minimal label [math]\displaystyle{ d(w) }[/math] of any arc [math]\displaystyle{ (v,w) }[/math] in the residual network.
    2. Set [math]\displaystyle{ d(v):=d_\min+1 }[/math].
    3. Reset the current arc of [math]\displaystyle{ v }[/math] so as to point to the beginning of the list of outgoing arcs of [math]\displaystyle{ v }[/math].

Remark: The preflow-push algorithm is also known as the push-relabel algorithm. The push operation is step 3; the relabel operation is step 4.

Proof: Points 1, 2, and 4 of the invariant and [math]\displaystyle{ d(s)=n }[/math] are obviously fulfilled. The rest of point 3 of the invariant is affected by step 4 only, and the outgoing arcs of [math]\displaystyle{ v }[/math] are the only arcs where the distance labeling may become invalid. However, the extremely conservative increase of [math]\displaystyle{ d(v) }[/math] ensures point 3 of the invariant.

To prove the variant, consider a step in which neither any [math]\displaystyle{ d }[/math]-value is increased nor a saturating push is performed. This means step 3.2 is applied, but the arc [math]\displaystyle{ (v,w) }[/math] is not saturated by that. Potentially, [math]\displaystyle{ w }[/math] becomes active. However, [math]\displaystyle{ v }[/math] definitely becomes inactive since the push step is non-saturating. Now the variant follows from the fact that [math]\displaystyle{ d(w)=d(v)-1 }[/math] for an admissible arc [math]\displaystyle{ (v,w) }[/math].

It remains to show termination; this is proved by the following complexity considerations.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \mathcal{O}(n^2m) }[/math], where [math]\displaystyle{ n=|V| }[/math] and [math]\displaystyle{ m=|A| }[/math].

Proof: First we show that the total number of relabel operations (step 4 of the main loop) is in [math]\displaystyle{ \mathcal{O}(n^2) }[/math]. To see that, let [math]\displaystyle{ v\in V\setminus\{s,t\} }[/math] be an active node between two iterations of the main loop. A straightforward induction over the number of push operations shows that there is at least one simple [math]\displaystyle{ (s,v) }[/math]-path [math]\displaystyle{ p }[/math] with positive flow on all arcs. The transpose of [math]\displaystyle{ p }[/math] is augmenting. Due to the validity of [math]\displaystyle{ d }[/math] (induction hypothesis), [math]\displaystyle{ d(v)-d(s)=d(v)-n }[/math] cannot be larger than the number of arcs on [math]\displaystyle{ p }[/math], which is not larger than [math]\displaystyle{ n-1 }[/math]. Therefore, no node label can be larger than [math]\displaystyle{ 2n-1 }[/math]. Since node labels are nonnegative and increase at least by one in each relabel operation, the claimed upper bound on the relabel operations follows.

From this bound, we may immediately conclude that the current arc of a node is reset [math]\displaystyle{ \mathcal{O}(n) }[/math] times, so the total number of forward steps of the current arcs of all nodes is in [math]\displaystyle{ \mathcal{O}(n^3)\subseteq\mathcal{O}(n^2m) }[/math].

The argument in the complexity analysis of the Ahuja-Orlin algorithm to prove that the total number of saturating push operations is in [math]\displaystyle{ \mathcal{O}(nm) }[/math], applies here as well.

Finally, we consider the non-saturating push operations. First note that [math]\displaystyle{ D\geq 0 }[/math] before and after each iteration. The value of [math]\displaystyle{ D }[/math] is increased in each relabel operation exactly by the amount by which the label of the current node is increased. Since node labels are never decreased and bounded from above by [math]\displaystyle{ 2n-1 }[/math], [math]\displaystyle{ D }[/math] increases by less than [math]\displaystyle{ 2n^2 }[/math] in total over all relabel operations. On the other hand, a saturating push operation may increase [math]\displaystyle{ D }[/math] by at most [math]\displaystyle{ 2n-1 }[/math] (namely, in the case that [math]\displaystyle{ w }[/math] was not active immediately before the push). In summary, the total sum of all values by which [math]\displaystyle{ D }[/math] is increased throughout the algorithm is in [math]\displaystyle{ \mathcal{O}(n^2m) }[/math]. Due to the variant, the value of [math]\displaystyle{ D }[/math] is decreased by at least one in each non-saturating push operation. This proves the claim.

Heuristic speedup techniques

  1. After [math]\displaystyle{ \Omega(n) }[/math] iterations of the main loop, the [math]\displaystyle{ d }[/math]-values are recomputed analogously to the induction basis: as the current distance of each node to [math]\displaystyle{ t }[/math] in the residual network. This modification is seldom enough, so the asymptotic complexity is not increased. In practice, this technique may save many unnecessary relabel steps.
  2. The main loop may be decomposed into two phases: First, as much flow as possible is sent into [math]\displaystyle{ t }[/math]; second, all surplus flow that cannot reach [math]\displaystyle{ t }[/math] is sent back to [math]\displaystyle{ s }[/math]. The first phase may be finished once there is no more path in the residual network from any active node to [math]\displaystyle{ t }[/math]. A sufficient and easy-to-check condition for that is [math]\displaystyle{ d(v)\geq n }[/math] for all active nodes [math]\displaystyle{ v }[/math]. All nodes from which [math]\displaystyle{ t }[/math] is reachable may be safely disregarded in the second phase. For any other node, to save unnecessary relabel operations, the distance label may be safely increased to the minimum number of arcs in the residual network from this node back to [math]\displaystyle{ s }[/math].