Kruskal for maximum spanning forest

From Algowiki
Revision as of 22:54, 19 June 2015 by Luedecke (talk | contribs)
Jump to navigation Jump to search
Algorithmus von Kruskal

Algorithmic problem: Maximum spanning forest

Prerequisites:

Type of algorithm: loop

Auxiliary data:

  1. A sorted sequence S whose items are the edges of E. The edges are sorted in descending order, that is, the edge eE with the largest value of (e) is the first element of the sequence (ties arbitrarily broken). Below, ei denotes the i-th edge in S.
  2. A union-find data structure UF on V.

Abstract view

Invariant: After i0 iterations:

  1. The undirected graph Fi=(V,Ei), where Ei is the set of all selected edges immediately after the i-th iteration, is cycle-free; in particular, it is maximally cycle-free in the sense that, for j{1,...,i}, adding an edge eJFi would close a cycle in Fi.
  2. For the undirected graph Fi=(V,Ei), there is a maximum spanning forest Fi=(V,Ei) of G such that EiEi.
  3. The sets of UF are the connected components of Fi.

Variant: i increases by 1.

Break condition: i=|E|.

Induction basis

Abstract view:

  1. E0:=
  2. UF is initialized by V, so each node vV is a singleton in UF.

Implementation: Obvious.

Proof: Nothing to show.

Induction step

Abstract view: If inserting ei would close a cycle, reject ei; otherwise, insert ei.

Implementation:

  1. Let v,wV denote the endnodes of ei, that is, ei={v,w}.
  2. If v and w belong to different sets in UF:
    1. Insert ei in the result.
    2. Unite the sets of v and w in UF.

Correctness: The first and third invariants are obvious. So we focus on the second invariant. We may safely set Fi:=Fi1 if

  1. either eiEi1 and ei is inserted in the i-th iteration
  2. or eiEi1 and ei is rejected in the -th iteration.

Note that the case that eiEi1 and ei is rejected cannot occur. In fact, eiEi1 implies that (V,Ei1{ei}) is cycle-free, but rejecting ei implies that (V,Ei1{ei}) contains a cycle. So it remains to consider the case that ei is accepted but eiEi1.

As (ei)>0 and Fi1 is maximal, (V,Ei1{ei}) contains a cycle. Since Fi1 is cycle-free, there must be an edge eEi1Ei1 on that cycle. By assumption, (V,Ei1{ei}) is cycle-free as well. Therefore, e is none of the edges considered so far but one of ei+1,...,e|E|, which implies (e)(ei) and, hence, maximality of the spanning forest Fi:=(V,Ei1{e}{ei}). Clearly, this Fi includes Fi, so the second invariant is maintained again.

Complexity

Statement: The asymptotic complexity is in O(mT(n)+mlogm) in the worst case, where n=|V| and m=|E|, and T(n) is the complexity of a single union-find operation.

Proof: The summand O(mlogm) results from sorting all edges in the preprocessing. The main loop has m iterations, and each iteration takes O(T(n)) time.