Classical bipartite cardinality matching: Difference between revisions

From Algowiki
Jump to navigation Jump to search
m (→‎Induction step: Korrektur Bezeichnung p'')
 
(50 intermediate revisions by one other user not shown)
Line 2: Line 2:


'''Algorithmic problem:'''
'''Algorithmic problem:'''
[[Matchings in graphs#Cardinality-maximal matching|Cardinality-maximal matching]] in [[Basic graph definitions#Bipartite and k-partite graphs|bipartite graphs]].
[[Cardinality-maximal matching|Cardinality-maximal matching]] in [[Basic graph definitions#Bipartite and k-partite graphs|bipartite graphs]].


'''Type of algorithm:''' loop.
'''Type of algorithm:''' loop.
Line 13: Line 13:


'''Break condition:'''
'''Break condition:'''
There is no more augmenting path.
There is no more [[Matchings in graphs#Alternating and augmenting paths|augmenting path]].


== Induction basis ==
== Induction basis ==
Line 20: Line 20:
Initialize <math>M</math> to be a feasible matching, for example, the empty matching.
Initialize <math>M</math> to be a feasible matching, for example, the empty matching.


== Induction step: ==
== Induction step ==


'''Abstract view:'''
'''Abstract view:'''
# Search for an [[Matchings in graphs#Definitions|augmenting path]].
# Search for an [[Matchings in graphs#Alternating and augmenting paths|augmenting path]].
# If there is none, the loop terminates.
# If there is none, the loop terminates.
# Let <math>p</math> denote the augmenting path found in step 1.
# Let <math>p</math> denote the augmenting path found in step 1.
# [[Matchings in graphs#Definitions|Augment]] <math>M</math> along <math>p</math>.
# [[Matchings in graphs#Alternating and augmenting paths|Augment]] <math>M</math> along <math>p</math>.


'''Implementation of step 1:'''
'''Implementation of step 1:'''
In a loop over all exposed nodes <math>v</math>, a [[graph traversal]] algorithm is started at <math>v</math>. An algorithm must be chosen that generates an [[Basic graph definitions#Forests, trees, branchings, arborescences|arborescence]] rooted at the start node <math>v</math>, for example, a [[Depth-first search|DFS]] or a [[Breadth-first search|BFS]]. This terminates once an augmenting path has been found. To find an augmenting path, the graph search algorithm is modified as follows:
A [[Graph traversal|graph traversal strategy]] is to be chosen that generates an [[Basic graph definitions#Forests, trees, branchings, arborescences|arborescence]] rooted at the start node, for example, a [[Depth-first search|DFS]] or a [[Breadth-first search|BFS]]. The [[Matchings in graphs#Finding augmenting paths|modification to find augmenting paths]] must be applied.
# Whenever the current node <math>v</math> has been reached via an edge in <math>M</math>, only incident edges in <math>E\setminus M</math> are considered for seeing new nodes.
In a loop over all exposed nodes, a graph search is started at each of them. This loop may terminate early, namely when the first augmenting path is found.
# Mirror-symmetrically, whenever the current node <math>v</math> has been reached via an edge in <math>E\setminus M</math>, only the (unique) incident edge in <math>M</math>, if existing, is considered for seeing a new node.


'''Proof:'''
'''Proof:'''
Basically, we have to show that there is no more augmenting path if this repeated graph search does not find one. So suppose for a contradiction that a search from an exposed node <math>u</math> fails although there is an augmenting path <math>p</math> from <math>u</math> to some other exposed node <math>v</math>. Then <math>v</math> is not seen by this graph search. Let <math>x</math> be the last node on <math>p</math> seen by this graph search, let <math>e</math> denote the edge over which <math>x</math> was seen, and let <math>e'</math> be the next edge on <math>p</math> (as seen in the direction from <math>u</math> to <math>v</math>). Since <math>e</math> was considered by the graph search and <math>e'</math> was not, we may conclude <math>e,e'\not\in M</math>. Let <math>p'</math> be the subpath of <math>p</math> from <math>u</math> to <math>x</math> from <math>u</math> to <math>x</math> and <math>p''</math> the path from <math>u</math> to <math> <math>x</math> in the arborescence. Note that the last edge of <math>p'</math> is in <math>, so <math>p'</math> has even length. Since <math>e\not\in M</math>, <math>p''</math> has odd legngth. In summary, the concatenation <math>p+p'</math> is a (usually non-simple) cycle of odd length. In particular, <math>G</math> is not bipartite.
Basically, we have to show that there is no more augmenting path if this repeated graph search does not find one. So suppose for a contradiction that a search from an exposed node <math>u</math> fails although there is an augmenting path <math>p</math> from <math>u</math> to some other exposed node <math>v</math>. Then <math>v</math> is not seen by this graph search. Let <math>x</math> be the last node on <math>p</math> seen by this graph search, let <math>e</math> denote the edge over which <math>x</math> was seen, and let <math>e'</math> be the next edge on <math>p</math> (as seen in the direction from <math>u</math> to <math>v</math>). Since <math>e</math> was considered by the graph search and <math>e'</math> was not, we may conclude <math>e,e'\not\in M</math>. Let <math>p'</math> be the subpath of <math>p</math> from <math>u</math> to <math>x</math> and <math>p''</math> the path from <math>u</math> to <math>x</math> in the arborescence. As <math>e'\not\in M</math>, the last edge of <math>p'</math> is in <math>M</math>, so <math>p'</math> has even length. Since <math>e\not\in M</math>, <math>p''</math> has odd length. In summary, the concatenation <math>p'+p''</math> is a (usually non-simple) cycle of odd length. In particular, <math>G</math> is not bipartite.


'''Remark:'''
'''Remark on the proof:'''
This modified graph traversal in an undirected graph could be implemented as a regular graph traversal in a directed graph:
Go from <math>x</math> backwards along <math>p''</math> until the first node <math>y</math>, say, is seen that also belongs to <math>p</math>. In case <math>y\neq u</math>, the subpaths of <math>p'</math> and <math>p''</math> from <math>x</math> to <math>y</math> form a  [[Matchings in graphs#Blossoms|blossom]], and the graph search entered this blossom via its [[Matchings in graphs#Blossoms|stem]].
# Duplicate each matched node <math>v</math> giving <math>v_1</math> and <math>v_2</math>.
 
# Replace each edge <math>\{v,w\}</math> by two arcs, <math>(v,w)</math> and <math>(w,v)</math>.
== Correctness ==
# For each matched node <math>v</math>:
 
## Let all incoming arcs of <math>v</math> in <math>M</math> point to <math>v_1</math> and all outgoing arcs in <math>E\setminus M</math> leave <math>v_1</math>.
[[Cardinality-maximal matching#Berge's theorem|Berge's theorem]] immediately implies that the resulting matching is [[Sets and sequences#Maximal and minimal sets|cardinality-maximal]].
## Mirror-symmetrically, let all incoming arcs of <math>v</math> in <math>E\setminus M</math> point to <math>v_2</math> and all outgoing arcs of <math>v</math> in <math>M</math> leave <math>v_2</math>.
 
== Complexity ==
 
'''Statement:'''
The asymptotic complexity is in <math>\mathcal{O}(n\!\cdot\!m)</math>, where <math>n=|V|</math> and <math>m=|E|</math>.
 
'''Proof:'''
Since a matching cannot have more than <math>|V|/2</math> edges, the number of iterations is linear in the number of nodes. The complexity of an iteration is dominated by the graph search. The [[Basic graph definitions#Adjacency, incidence, and degree|statement and remark on isolated nodes]] apply.
 
== Connection to max-flow ==
 
This algorithm may be interpreted as an application of [[Ford-Fulkerson]]:
# Let <math>\{V_1,V_2\}</math> denote a [[Sets and sequences#Partitions|bipartition]] of <math>V</math> such that each edge is incident to one node of <math>V_1</math> and one node of <math>V_2</math>.
# Replace each undirected edge <math>\{v,w\}</math> by the directed arc <math>(v,w)</math> with infinite capacity, where <math>v\in V_1</math> and <math>w\in V_2</math> (so, all arcs go from <math>V_1</math> to <math>V_2</math>).
# Insert a new super-source <math>s</math> and a new super-target <math>t</math> in <math>V</math>.
# For each node <math>v\in V_1</math>, insert an arc <math>(s,v)</math> with capacity one.
# Mirror-symmetrically, for each node <math>w\in V_2</math>, insert an arc <math>(w,t)</math> with capacity one.
There is an obvious one-to-one correspondence between integral flows in that network and matchings in the original graph: An edge belongs to the matching if, and only if, the corresponding arc has positive flow value. The maximum flows correspond to the cardinality-maximal matchings. There is also a one-to-one correspondence between [[Matchings in graphs#Alternating and augmenting paths|augmenting paths]] in the original graph and [[Basic flow definitions#Flow-augmenting paths and saturated arcs|flow-augmenting paths]] in the induced flow network. [[Matchings in graphs#Alternating and augmenting paths|Augmenting]] the matching along an augmenting path is tantamount to [[Basic flow definitions#Augmenting along a path|augmenting]] the flow along the corresponding flow-augmenting path. Finally, the [[Max-flow min-cut|max-flow min-cut theorem]] reduces to the famous min-max theorem for bipartite matchings:
 
'''Definition:'''
# A '''node cover''' in an undirected graph is a set of nodes such that each edge is incident to at least one node in this set.
# A node in a node cover is said to '''cover''' the edges incident to this node.
 
'''Min-max theorem:'''
In a bipartite graph, the maximum cardinality of a matching equals the minimum cardinality of a node cover.
 
'''Reduction to the max-flow min-cut theorem:'''
Let <math>M</math> be a cardinality-maximal matching. No node can cover more than one edge of <math>M</math>. Therefore, no node set of less than <math>|M|</math> nodes can be a node cover. Thus, it suffices to construct a node cover of cardinality <math>|M|</math>.
 
The [[Max-flow min-cut|max-flow min-cut theorem]] implies that the minimum <math>(s,t)</math>-cut <math>(S,T)</math> has capacity <math>|M|</math>. In particular, <math>(S,T)</math> has finite capacity, so it is only crossed by arcs from <math>s</math> to <math>T</math> and by arcs from <math>S</math> to <math>t</math>. Now, let <math>S':=S\cap V_2</math> and <math>T':=T\cap V_1</math>. By construction, the node sets <math>S'</math> and <math>T'</math> are disjoint. The nodes in <math>S'</math> are in one-to-one correspondence to the arcs from <math>s</math> to <math>T</math>, and the nodes in <math>T'</math> are in one-to-one correspondence to the arcs from <math>S</math> into <math>t</math>. In summary, <math>|S'\cup T'|</math> is the capacity of <math>(S,T)</math>. Hence, it suffices to show that <math>S'\cup T'</math> is a node cover for the original graph.
 
For a contradiction suppose some edge <math>\{v,w\}</math> is not covered, where <math>v\in V_1</math> and <math>w\in V_2</math>, say. So, it is <math>v\in S</math> and <math>w\in T</math>, However, then <math>(v,w)</math> would go from <math>S</math> to <math>T</math>, which contradicts the fact that the capacity of <math>(S,T)</math> is finite.
 
'''Comment:'''
'''Tutte's theorem''' extends the min-max theorem to non-bipartite graphs.

Latest revision as of 08:49, 22 February 2015

Abstract view

Algorithmic problem: Cardinality-maximal matching in bipartite graphs.

Type of algorithm: loop.

Invariant: [math]\displaystyle{ M }[/math] is a matching in [math]\displaystyle{ G }[/math].

Variant: [math]\displaystyle{ |M| }[/math] is increased by one.

Break condition: There is no more augmenting path.

Induction basis

Abstract view: Initialize [math]\displaystyle{ M }[/math] to be a feasible matching, for example, the empty matching.

Induction step

Abstract view:

  1. Search for an augmenting path.
  2. If there is none, the loop terminates.
  3. Let [math]\displaystyle{ p }[/math] denote the augmenting path found in step 1.
  4. Augment [math]\displaystyle{ M }[/math] along [math]\displaystyle{ p }[/math].

Implementation of step 1: A graph traversal strategy is to be chosen that generates an arborescence rooted at the start node, for example, a DFS or a BFS. The modification to find augmenting paths must be applied. In a loop over all exposed nodes, a graph search is started at each of them. This loop may terminate early, namely when the first augmenting path is found.

Proof: Basically, we have to show that there is no more augmenting path if this repeated graph search does not find one. So suppose for a contradiction that a search from an exposed node [math]\displaystyle{ u }[/math] fails although there is an augmenting path [math]\displaystyle{ p }[/math] from [math]\displaystyle{ u }[/math] to some other exposed node [math]\displaystyle{ v }[/math]. Then [math]\displaystyle{ v }[/math] is not seen by this graph search. Let [math]\displaystyle{ x }[/math] be the last node on [math]\displaystyle{ p }[/math] seen by this graph search, let [math]\displaystyle{ e }[/math] denote the edge over which [math]\displaystyle{ x }[/math] was seen, and let [math]\displaystyle{ e' }[/math] be the next edge on [math]\displaystyle{ p }[/math] (as seen in the direction from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ v }[/math]). Since [math]\displaystyle{ e }[/math] was considered by the graph search and [math]\displaystyle{ e' }[/math] was not, we may conclude [math]\displaystyle{ e,e'\not\in M }[/math]. Let [math]\displaystyle{ p' }[/math] be the subpath of [math]\displaystyle{ p }[/math] from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ x }[/math] and [math]\displaystyle{ p'' }[/math] the path from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ x }[/math] in the arborescence. As [math]\displaystyle{ e'\not\in M }[/math], the last edge of [math]\displaystyle{ p' }[/math] is in [math]\displaystyle{ M }[/math], so [math]\displaystyle{ p' }[/math] has even length. Since [math]\displaystyle{ e\not\in M }[/math], [math]\displaystyle{ p'' }[/math] has odd length. In summary, the concatenation [math]\displaystyle{ p'+p'' }[/math] is a (usually non-simple) cycle of odd length. In particular, [math]\displaystyle{ G }[/math] is not bipartite.

Remark on the proof: Go from [math]\displaystyle{ x }[/math] backwards along [math]\displaystyle{ p'' }[/math] until the first node [math]\displaystyle{ y }[/math], say, is seen that also belongs to [math]\displaystyle{ p }[/math]. In case [math]\displaystyle{ y\neq u }[/math], the subpaths of [math]\displaystyle{ p' }[/math] and [math]\displaystyle{ p'' }[/math] from [math]\displaystyle{ x }[/math] to [math]\displaystyle{ y }[/math] form a blossom, and the graph search entered this blossom via its stem.

Correctness

Berge's theorem immediately implies that the resulting matching is cardinality-maximal.

Complexity

Statement: The asymptotic complexity is in [math]\displaystyle{ \mathcal{O}(n\!\cdot\!m) }[/math], where [math]\displaystyle{ n=|V| }[/math] and [math]\displaystyle{ m=|E| }[/math].

Proof: Since a matching cannot have more than [math]\displaystyle{ |V|/2 }[/math] edges, the number of iterations is linear in the number of nodes. The complexity of an iteration is dominated by the graph search. The statement and remark on isolated nodes apply.

Connection to max-flow

This algorithm may be interpreted as an application of Ford-Fulkerson:

  1. Let [math]\displaystyle{ \{V_1,V_2\} }[/math] denote a bipartition of [math]\displaystyle{ V }[/math] such that each edge is incident to one node of [math]\displaystyle{ V_1 }[/math] and one node of [math]\displaystyle{ V_2 }[/math].
  2. Replace each undirected edge [math]\displaystyle{ \{v,w\} }[/math] by the directed arc [math]\displaystyle{ (v,w) }[/math] with infinite capacity, where [math]\displaystyle{ v\in V_1 }[/math] and [math]\displaystyle{ w\in V_2 }[/math] (so, all arcs go from [math]\displaystyle{ V_1 }[/math] to [math]\displaystyle{ V_2 }[/math]).
  3. Insert a new super-source [math]\displaystyle{ s }[/math] and a new super-target [math]\displaystyle{ t }[/math] in [math]\displaystyle{ V }[/math].
  4. For each node [math]\displaystyle{ v\in V_1 }[/math], insert an arc [math]\displaystyle{ (s,v) }[/math] with capacity one.
  5. Mirror-symmetrically, for each node [math]\displaystyle{ w\in V_2 }[/math], insert an arc [math]\displaystyle{ (w,t) }[/math] with capacity one.

There is an obvious one-to-one correspondence between integral flows in that network and matchings in the original graph: An edge belongs to the matching if, and only if, the corresponding arc has positive flow value. The maximum flows correspond to the cardinality-maximal matchings. There is also a one-to-one correspondence between augmenting paths in the original graph and flow-augmenting paths in the induced flow network. Augmenting the matching along an augmenting path is tantamount to augmenting the flow along the corresponding flow-augmenting path. Finally, the max-flow min-cut theorem reduces to the famous min-max theorem for bipartite matchings:

Definition:

  1. A node cover in an undirected graph is a set of nodes such that each edge is incident to at least one node in this set.
  2. A node in a node cover is said to cover the edges incident to this node.

Min-max theorem: In a bipartite graph, the maximum cardinality of a matching equals the minimum cardinality of a node cover.

Reduction to the max-flow min-cut theorem: Let [math]\displaystyle{ M }[/math] be a cardinality-maximal matching. No node can cover more than one edge of [math]\displaystyle{ M }[/math]. Therefore, no node set of less than [math]\displaystyle{ |M| }[/math] nodes can be a node cover. Thus, it suffices to construct a node cover of cardinality [math]\displaystyle{ |M| }[/math].

The max-flow min-cut theorem implies that the minimum [math]\displaystyle{ (s,t) }[/math]-cut [math]\displaystyle{ (S,T) }[/math] has capacity [math]\displaystyle{ |M| }[/math]. In particular, [math]\displaystyle{ (S,T) }[/math] has finite capacity, so it is only crossed by arcs from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ T }[/math] and by arcs from [math]\displaystyle{ S }[/math] to [math]\displaystyle{ t }[/math]. Now, let [math]\displaystyle{ S':=S\cap V_2 }[/math] and [math]\displaystyle{ T':=T\cap V_1 }[/math]. By construction, the node sets [math]\displaystyle{ S' }[/math] and [math]\displaystyle{ T' }[/math] are disjoint. The nodes in [math]\displaystyle{ S' }[/math] are in one-to-one correspondence to the arcs from [math]\displaystyle{ s }[/math] to [math]\displaystyle{ T }[/math], and the nodes in [math]\displaystyle{ T' }[/math] are in one-to-one correspondence to the arcs from [math]\displaystyle{ S }[/math] into [math]\displaystyle{ t }[/math]. In summary, [math]\displaystyle{ |S'\cup T'| }[/math] is the capacity of [math]\displaystyle{ (S,T) }[/math]. Hence, it suffices to show that [math]\displaystyle{ S'\cup T' }[/math] is a node cover for the original graph.

For a contradiction suppose some edge [math]\displaystyle{ \{v,w\} }[/math] is not covered, where [math]\displaystyle{ v\in V_1 }[/math] and [math]\displaystyle{ w\in V_2 }[/math], say. So, it is [math]\displaystyle{ v\in S }[/math] and [math]\displaystyle{ w\in T }[/math], However, then [math]\displaystyle{ (v,w) }[/math] would go from [math]\displaystyle{ S }[/math] to [math]\displaystyle{ T }[/math], which contradicts the fact that the capacity of [math]\displaystyle{ (S,T) }[/math] is finite.

Comment: Tutte's theorem extends the min-max theorem to non-bipartite graphs.