Classical bipartite cardinality matching

From Algowiki
Jump to navigation Jump to search

Abstract view

Algorithmic problem: Cardinality-maximal matching in bipartite graphs.

Type of algorithm: loop.

Invariant: [math]\displaystyle{ M }[/math] is a matching in [math]\displaystyle{ G }[/math].

Variant: [math]\displaystyle{ |M| }[/math] is increased by one.

Break condition: There is no more augmenting path.

Induction basis

Abstract view: Initialize [math]\displaystyle{ M }[/math] to be a feasible matching, for example, the empty matching.

Induction step

Abstract view:

  1. Search for an augmenting path.
  2. If there is none, the loop terminates.
  3. Let [math]\displaystyle{ p }[/math] denote the augmenting path found in step 1.
  4. Augment [math]\displaystyle{ M }[/math] along [math]\displaystyle{ p }[/math].

Implementation of step 1: A graph traversal strategy is to be chosen that generates an arborescence rooted at the start node, for example, a DFS or a BFS. The modification to find augmenting paths must be applied. In a loop over all exposed nodes, a graph search is started at each of them. This loop may terminate early, namely when the first augmenting path is found.

Proof: Basically, we have to show that there is no more augmenting path if this repeated graph search does not find one. So suppose for a contradiction that a search from an exposed node [math]\displaystyle{ u }[/math] fails although there is an augmenting path [math]\displaystyle{ p }[/math] from [math]\displaystyle{ u }[/math] to some other exposed node [math]\displaystyle{ v }[/math]. Then [math]\displaystyle{ v }[/math] is not seen by this graph search. Let [math]\displaystyle{ x }[/math] be the last node on [math]\displaystyle{ p }[/math] seen by this graph search, let [math]\displaystyle{ e }[/math] denote the edge over which [math]\displaystyle{ x }[/math] was seen, and let [math]\displaystyle{ e' }[/math] be the next edge on [math]\displaystyle{ p }[/math] (as seen in the direction from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ v }[/math]). Since [math]\displaystyle{ e }[/math] was considered by the graph search and [math]\displaystyle{ e' }[/math] was not, we may conclude [math]\displaystyle{ e,e'\not\in M }[/math]. Let [math]\displaystyle{ p' }[/math] be the subpath of [math]\displaystyle{ p }[/math] from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ x }[/math] and [math]\displaystyle{ p'' }[/math] the path from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ x }[/math] in the arborescence. As [math]\displaystyle{ e'\not\in M }[/math], the last edge of [math]\displaystyle{ p' }[/math] is in [math]\displaystyle{ M }[/math], so [math]\displaystyle{ p' }[/math] has even length. Since [math]\displaystyle{ e\not\in M }[/math], [math]\displaystyle{ p'' }[/math] has odd length. In summary, the concatenation [math]\displaystyle{ p'+p'' }[/math] is a (usually non-simple) cycle of odd length. In particular, [math]\displaystyle{ G }[/math] is not bipartite.

Remark on the proof: Go from [math]\displaystyle{ x }[/math] backwards along [math]\displaystyle{ p'' }[/math] until the first node [math]\displaystyle{ y }[/math], say, is seen that also belongs to [math]\displaystyle{ p }[/math]. In case [math]\displaystyle{ y\neq u }[/math], the subpaths of [math]\displaystyle{ p' }[/math] and [math]\displaystyle{ p'' }[/math] from [math]\displaystyle{ x }[/math] to [math]\displaystyle{ y }[/math] form a blossom, and the graph search entered this blossom via its stem.

Correctness

Berge's theorem immediately implies that the resulting matching is cardinality-maximal.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \mathcal{O}(n\!\cdot\!m) }[/math], where [math]\displaystyle{ n=|V| }[/math] and [math]\displaystyle{ m=|E| }[/math].

Proof: Since a matching cannot have more than [math]\displaystyle{ |V|/2 }[/math] edges, the number of iterations is linear in the number of nodes. The complexity of an iteration is dominated by the graph search. The statement and remark on isolated nodes apply.

Connection to max-flow

This algorithm may be interpreted as an application of Ford-Fulkerson:

  1. Let [math]\displaystyle{ \{V_1,V_2\} }[/math] denote a bipartition of [math]\displaystyle{ V }[/math] such that each edge is incident to one node of [math]\displaystyle{ V_1 }[/math] and one node of [math]\displaystyle{ V_2 }[/math].
  2. Replace each undirected edge [math]\displaystyle{ \{v,w\} }[/math] by the directed arc [math]\displaystyle{ (v,w) }[/math] with infinite capacity, where [math]\displaystyle{ v\in V_1 }[/math] and [math]\displaystyle{ w\in V_2 }[/math] (so, all arcs go from [math]\displaystyle{ V_1 }[/math] to [math]\displaystyle{ V_2 }[/math]).
  3. Insert a new super-source [math]\displaystyle{ s }[/math] and a new super-target [math]\displaystyle{ t }[/math] in [math]\displaystyle{ V }[/math].
  4. For each node [math]\displaystyle{ v\in V_1 }[/math], insert an arc [math]\displaystyle{ (s,v) }[/math] with capacity one.
  5. Mirror-symmetrically, for each node [math]\displaystyle{ w\in V_2 }[/math], insert an arc [math]\displaystyle{ (w,t) }[/math] with capacity one.

There is an obvious one-to-one correspondence between integral flows in that network and matchings in the original graph: An edge belongs to the matching if, and only if, the corresponding arc has positive flow value. The maximum flows correspond to the cardinality-maximal matchings. There is also a one-to-one correspondence between augmenting paths in the original graph and flow-augmenting paths in the induced flow network. Augmenting the matching along an augmenting path is tantamount to augmenting the flow along the corresponding flow-augmenting path. Finally, the max-flow min-cut theorem reduces to the famous min-max theorem for bipartite matchings:

Definition:

  1. A node cover in an undirected graph is a set of nodes such that each edge is incident to at least one node in this set.
  2. A node in a node cover is said to cover the edges incident to this node.

Min-max theorem: In a bipartite graph, the maximum cardinality of a matching equals the minimum cardinality of a node cover.

Reduction to the max-flow min-cut theorem: Let [math]\displaystyle{ M }[/math] be a cardinality-maximal matching. No node can cover more than one edge of [math]\displaystyle{ M }[/math]. Therefore, no node set of less than [math]\displaystyle{ |M| }[/math] nodes can be a node cover. Thus, it suffices to construct a node cover of cardinality [math]\displaystyle{ |M| }[/math].

The max-flow min-cut theorem implies that the minimum [math]\displaystyle{ (s,t) }[/math]-cut [math]\displaystyle{ (S,T) }[/math] has capacity [math]\displaystyle{ |M| }[/math]. In particular, [math]\displaystyle{ (S,T) }[/math] has finite capacity, so it is only crossed by arcs from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ T }[/math] and by arcs from [math]\displaystyle{ S }[/math] to [math]\displaystyle{ t }[/math]. Now, let [math]\displaystyle{ S':=S\cap V_2 }[/math] and [math]\displaystyle{ T':=T\cap V_1 }[/math]. By construction, the node sets [math]\displaystyle{ S' }[/math] and [math]\displaystyle{ T' }[/math] are disjoint. The nodes in [math]\displaystyle{ S' }[/math] are in one-to-one correspondence to the arcs from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ T }[/math], and the nodes in [math]\displaystyle{ T' }[/math] are in one-to-one correspondence to the arcs from [math]\displaystyle{ S }[/math] into [math]\displaystyle{ t }[/math]. In summary, [math]\displaystyle{ |S'\cup T'| }[/math] is the capacity of [math]\displaystyle{ (S,T) }[/math]. Hence, it suffices to show that [math]\displaystyle{ S'\cup T' }[/math] is a node cover for the original graph.

For a contradiction suppose some edge [math]\displaystyle{ \{v,w\} }[/math] is not covered, where [math]\displaystyle{ v\in V_1 }[/math] and [math]\displaystyle{ w\in V_2 }[/math], say. So, it is [math]\displaystyle{ v\in S }[/math] and [math]\displaystyle{ w\in T }[/math], However, then [math]\displaystyle{ (v,w) }[/math] would go from [math]\displaystyle{ S }[/math] to [math]\displaystyle{ T }[/math], which contradicts the fact that the capacity of [math]\displaystyle{ (S,T) }[/math] is finite.

Comment: Tutte's theorem extends the min-max theorem to non-bipartite graphs.