B-tree: remove: Difference between revisions
Line 49: | Line 49: | ||
=== Correctness: === | === Correctness: === | ||
Analogously to the induction basis, we have to verify that the [[B-Trees|implementation invariants of the B-tree data structure]] and the above-mentioned loop invariants of the removal procedure are maintained by an iteration of the main loop. | |||
Again, implementation invariants #1 and #2 are trivially fulfilled. In case points to a leaf, implementation invariants #4, #7, and #8 are trivially maintained as well. In this case, Steps 1.1.3 and 1.2.3 guarantee #3, Step 1.1.2 guarantees #5, and Steps 1.1.2 and 4 guarantee #6. In case does not point to a leaf, implementation invariants #3-8 are guaranteed by the subroutines called in Step 2. | |||
Finally, consider the loop invariants of the removal procedure. #1 is trivial; #2 is guaranteed by Step 2.4; #3 and #4.1 by Step 2.3; #4.2 by Step 4, and #5 by the subroutines called in Step 2. | |||
== Complexity == | == Complexity == |
Revision as of 21:18, 29 September 2014
General Information
Algorithmic problem: Sorted sequence: remove
Type of algorithm: loop
Auxiliary data:
- Pointers, [math]\displaystyle{ p }[/math], [math]\displaystyle{ p' }[/math], [math]\displaystyle{ p_1 }[/math] and [math]\displaystyle{ p_2 }[/math] of type "pointer to B-tree node".
- A Boolean variable found, which is false, in the beginning and set true once the key [math]\displaystyle{ K }[/math] to be removed is seen.
Abstract View
Invariant:
- [math]\displaystyle{ p }[/math] points to a node of the B-tree.
- If [math]\displaystyle{ found = false }[/math], [math]\displaystyle{ K }[/math] is in the range of the node to which [math]\displaystyle{ p }[/math] points.
- It is [math]\displaystyle{ found = true }[/math] if, and only if, [math]\displaystyle{ K }[/math] is contained at least once in one of the current nodes of previous iterations.
- If [math]\displaystyle{ found = true }[/math]
- [math]\displaystyle{ p' }[/math] points to a node where [math]\displaystyle{ K }[/math] is currently stored.
- The immediate predecessor of [math]\displaystyle{ K }[/math] is in the range of the node to which [math]\displaystyle{ p }[/math] points.
- If [math]\displaystyle{ p }[/math] points to the root, at least two keys are currently stored in the root; otherwise, at least [math]\displaystyle{ M }[/math] keys are currently stored in the node to which [math]\displaystyle{ p }[/math] points.
Variant: [math]\displaystyle{ p }[/math] is redirected to a node one level deeper.
Break condition: [math]\displaystyle{ p }[/math] points to a leaf.
Induction Basis
Abstract view:
Implementation:
Proof:
Basically, we have to verify that the implementation invariants of the B-tree data structure and the above-mentioned loop invariants of the removal procedure are fulfilled after the preprocessing.
If the tree is empty (Step 1), the proof is trivial, so consider the case that the tree is non-empty.
Implementation invariants #1, #2, and #8 are trivially fulfilled. Implementation invariants #3, #5, #6, and #7 are guaranteed by Steps 3.1.3 and 4.1.3-5 and by the postconditions of the subroutines called in Step 4.2, respectively. Implementation invariant #4 is guaranteed by Step 3.1.2 and the postconditions of the subroutines called in Step 4, respectively.
The loop invariants of the removal procedure are only affected if Step 4 is executed. Loop invariants #1, #2, and #3 are obvious, #4 does not apply, and #5 is guaranteed by the subroutines called in Step 4.
Obviously, the case distinction in Step 4 covers all possible cases.
Induction Step
Abstract view:
Implementation:
Correctness:
Analogously to the induction basis, we have to verify that the implementation invariants of the B-tree data structure and the above-mentioned loop invariants of the removal procedure are maintained by an iteration of the main loop.
Again, implementation invariants #1 and #2 are trivially fulfilled. In case points to a leaf, implementation invariants #4, #7, and #8 are trivially maintained as well. In this case, Steps 1.1.3 and 1.2.3 guarantee #3, Step 1.1.2 guarantees #5, and Steps 1.1.2 and 4 guarantee #6. In case does not point to a leaf, implementation invariants #3-8 are guaranteed by the subroutines called in Step 2.
Finally, consider the loop invariants of the removal procedure. #1 is trivial; #2 is guaranteed by Step 2.4; #3 and #4.1 by Step 2.3; #4.2 by Step 4, and #5 by the subroutines called in Step 2.
Complexity
Statement: The asymptotic complexity is in [math]\displaystyle{ \Theta(\log n) }[/math] in the best and worst case.
Proof: Follows immediately from the facts that
- [math]\displaystyle{ M }[/math] is assumed to be fixed, and
- the height of a B-tree with [math]\displaystyle{ n }[/math] nodes is in [math]\displaystyle{ \Theta(\log n) }[/math] (cf. the remark clause of the B-Trees page).
Further Information
In the above specification, each node with exactly [math]\displaystyle{ M - 1 }[/math] keys is modified when visited. This is done for precautionary reasons only. With a slight modification, this can be avoided: when the leaf is reached, go back along the traversed path and modify each nodes with [math]\displaystyle{ M - 1 }[/math] keys until the first node with more than [math]\displaystyle{ M - 1 }[/math] keys is visited. Evidently, this reduces the number of modifications. However, chances are high that these nodes have to be modified sooner or later, so the true benefit is not clear. The version of the removal procedure presented here has primarily been selected because its loop invariant is simpler and more intuitive.