Hopcroft-Karp: Difference between revisions

From Algowiki
Jump to navigation Jump to search
Line 59: Line 59:


'''Proof:'''
'''Proof:'''
Each edge is touched at most twice in each iteration
Each edge is touched at most twice in each iteration, once from either incident node. Hence, it suffices to prove that the number of iterations is in <math>\mathcal{O}(\sqrt{n})</math>.
After <math>\lfloor\sqrt{n}\rfloor</math> iterations, any augmenting path has a length of at least <math>\lfloor\sqrt{n}\rfloor</math>.
 
After <math>\lceil\sqrt{n}\rceil</math> iterations, any augmenting path has a length of at least <math>\lfloor\sqrt{n}\rfloor</math>. Consequently, there cannot be more than <math>\lfloor\sqrt{n}\rfloor</math> node-disjoint augmenting paths. The remark on the proof of [[Cardinality-maximal matching#Berge's theorem|Berge's theorem]] implies

Revision as of 17:57, 22 November 2014

Abstract view

Algorithmic problem: Cardinality-maximal matching in bipartite graphs.

Type of algorithm: loop.

Auxiliary data structures: a set [math]\displaystyle{ S }[/math] of BFS runs as iterators (cf. the corresponding remark). The modification to find augmenting paths is applied.

Invariant: After [math]\displaystyle{ i\geq 0 }[/math] iterations:

  1. [math]\displaystyle{ M }[/math] is a matching in [math]\displaystyle{ G }[/math].
  2. All exposed nodes that are end nodes of augmenting paths, are in [math]\displaystyle{ S }[/math].
  3. Each node in [math]\displaystyle{ S }[/math] is the root of an arborescence:
    1. These arborescences are pairwise node-disjoint.
    2. Each leaf of each arborescence has distance [math]\displaystyle{ i }[/math] form the root of this arborescence (both in [math]\displaystyle{ G }[/math] and in the arborescence).

Variant: The shortest length of an augmenting path in terms of the number of edges is increased.

Break condition: [math]\displaystyle{ S=\emptyset }[/math].

Induction basis

Abstract view:

  1. Start with an arbitrary matching, for example, the empty matching.
  2. Create and initialize one BFS run for each exposed node.

Induction step

Abstract view: In the [math]\displaystyle{ i }[/math]-th iteration:

  1. Process all nodes of distance [math]\displaystyle{ i-1 }[/math] in the queues of all BFS runs.
  2. For each BFS run: If it had no nodes to process in step 1, remove it from [math]\displaystyle{ S }[/math].
  3. For each edge [math]\displaystyle{ \{v,w\} }[/math] that is seen by more than one:
    1. Choose two such BFS runs and such an edge [math]\displaystyle{ \{v,w\} }[/math], say.
    2. Augment [math]\displaystyle{ M }[/math] along the concatenation of the paths in the arborescences of these two BFS runs from the start nodes to [math]\displaystyle{ v }[/math] and [math]\displaystyle{ w }[/math], respectively, and of [math]\displaystyle{ \{v,w\} }[/math].
    3. Remove these two BFS runs from the set of all BFS runs.

Proof: The invariant is obviously maintained. To prove the variant, consider an execution of step 3.2. Let [math]\displaystyle{ p }[/math] denote the path along which the augmentation takes place. Let [math]\displaystyle{ p_1 }[/math] and [math]\displaystyle{ p'_2 }[/math] be the paths in the arborescences from the start nodes to [math]\displaystyle{ v }[/math] and [math]\displaystyle{ w }[/math], respectively. So, it is [math]\displaystyle{ p=p_1+\{v,w\}+p_2 }[/math]. Suppose augmenting along [math]\displaystyle{ p }[/math] has created a new augmenting path [math]\displaystyle{ q }[/math]. We have to show that [math]\displaystyle{ q }[/math] is longer than [math]\displaystyle{ p }[/math].

In either orientation of [math]\displaystyle{ p }[/math], let [math]\displaystyle{ v' }[/math] be the first node of [math]\displaystyle{ p }[/math] that also belongs to [math]\displaystyle{ q }[/math], and let [math]\displaystyle{ w' }[/math] be the last node of [math]\displaystyle{ p }[/math] that also belongs to [math]\displaystyle{ q }[/math]. Since [math]\displaystyle{ \{v,w\} }[/math] belongs to [math]\displaystyle{ q }[/math], we may assume, without loss of generality, that [math]\displaystyle{ v }[/math] is on [math]\displaystyle{ p_1 }[/math] and [math]\displaystyle{ w }[/math] is on [math]\displaystyle{ p_2 }[/math]. Let [math]\displaystyle{ q_1 }[/math] and [math]\displaystyle{ q_2 }[/math] denote the loose ends of [math]\displaystyle{ q }[/math] up to [math]\displaystyle{ v }[/math] and [math]\displaystyle{ w }[/math], respectively.

Note that [math]\displaystyle{ v'=v }[/math] must hold because, otherwise, the concatenation [math]\displaystyle{ p_1+q_1 }[/math] would yield an augmenting path that is shorter than [math]\displaystyle{ p }[/math]. This would contradict the induction hypothesis. Analogously, we may conclude [math]\displaystyle{ w'=w }[/math]. Note that each edge of [math]\displaystyle{ q }[/math] that is incident to exactly one node on [math]\displaystyle{ p }[/math] must be unmatched. Therefore, the parities of the lengths of [math]\displaystyle{ p_1 }[/math] and [math]\displaystyle{ q_1 }[/math] are different. In particular, the lengths themselves are different. Again, if [math]\displaystyle{ q_1 }[/math] were shorter than [math]\displaystyle{ p_1 }[/math], [math]\displaystyle{ p_1+q_1 }[/math] were an augmenting path shorter than [math]\displaystyle{ p }[/math]. For the same reason, [math]\displaystyle{ q_2 }[/math] is longer than [math]\displaystyle{ p_2 }[/math]. In summary, [math]\displaystyle{ q }[/math] is longer than [math]\displaystyle{ p }[/math].

Remark: The union of all arborescences at termination is called a Hungarian forest in the literature. The algorithm proves, constructively, that a matching in a bipartite graph is cardinality-maximal if, and only if, a Hungarian forest exists.

Correctness

Due to the invariant, the matching is cardinality-maximal if the main loop terminates. Termination of the main loop results from the following complexity considerations.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \mathcal{O}(m\!\cdot\!\sqrt{n}) }[/math], where [math]\displaystyle{ n=|V| }[/math] and [math]\displaystyle{ m=E| }[/math].

Proof: Each edge is touched at most twice in each iteration, once from either incident node. Hence, it suffices to prove that the number of iterations is in [math]\displaystyle{ \mathcal{O}(\sqrt{n}) }[/math].

After [math]\displaystyle{ \lceil\sqrt{n}\rceil }[/math] iterations, any augmenting path has a length of at least [math]\displaystyle{ \lfloor\sqrt{n}\rfloor }[/math]. Consequently, there cannot be more than [math]\displaystyle{ \lfloor\sqrt{n}\rfloor }[/math] node-disjoint augmenting paths. The remark on the proof of Berge's theorem implies