B-tree

From Algowiki
Jump to navigation Jump to search


General information

Abstract data structure: Sorted sequence

Implementation Invariant:

  1. There is a number [math]\displaystyle{ M\in\mathbb{N} }[/math], [math]\displaystyle{ M\gt 1 }[/math], which is specific for the B-tree object and is constant throughout the life time of the B-tree object. This value [math]\displaystyle{ M }[/math] is called the order of the B-tree.
  2. A B-tree of order [math]\displaystyle{ M }[/math] is a multi-way search tree with [math]\displaystyle{ 2M-1 }[/math] slots for keys and, consequently, [math]\displaystyle{ 2M }[/math] slots for children pointers, in each node.
    1. The slots for keys are denoted keys[math]\displaystyle{ [1],\ldots, }[/math]keys[math]\displaystyle{ [2M\!-\!1] }[/math].
    2. The slots for children are denoted children[math]\displaystyle{ [1],\ldots, }[/math]children[math]\displaystyle{ [2M] }[/math].
  3. The attribute [math]\displaystyle{ n }[/math] of a node stores the number of filled key slots in this node.
  4. In each tree node except for the root, it is [math]\displaystyle{ n\geq M-1 }[/math]; for the root, it is [math]\displaystyle{ n\geq 1 }[/math].
  5. The filled key slots are the ones at positions [math]\displaystyle{ 1,\ldots,n }[/math]; the filled children slots are the ones at positions [math]\displaystyle{ 0,\ldots,n }[/math] (except for leaves, of course, where no child slot is filled at all).
  6. The keys appear in ascending order in a B-tree node, that is, keys[math]\displaystyle{ [i]\lt }[/math] keys[math]\displaystyle{ [i\!+\!1] }[/math] for all [math]\displaystyle{ i\in\{1,\ldots,n-1\} }[/math].
  7. The children pointers appear in the order as used in the definition of multi-way search trees. In other words, for [math]\displaystyle{ R }[/math] the range of the considered node and for [math]\displaystyle{ i\in\{0,\ldots,n\} }[/math], the range of the node to which children points is
    1. [math]\displaystyle{ R\cap\{K\in\mathcal{K}\,|\,K\leq }[/math] keys[math]\displaystyle{ [1]\,\} }[/math] for [math]\displaystyle{ i=0 }[/math];
    2. [math]\displaystyle{ R\cap\{K\in\mathcal{K}\,|\, }[/math] keys[math]\displaystyle{ [i\!-\!1]\leq K\leq }[/math] keys[math]\displaystyle{ [i]\,\} }[/math] for [math]\displaystyle{ i\in\{1,\ldots,n-1\} }[/math];
    3. [math]\displaystyle{ R\cap\{K\in\mathcal{K}\,|\,K\geq }[/math] keys[math]\displaystyle{ [n]\,\} }[/math] for [math]\displaystyle{ i=n }[/math].
  8. All leaves of a B-tree are on the same height level.


Depth of a B-tree

Statement: The height of a B-tree of order [math]\displaystyle{ M }[/math] with [math]\displaystyle{ N }[/math] currently stored keys is [math]\displaystyle{ \lceil\log_M(N/(2\cdot(M-1))) \rceil+1\in\Theta(\log n) }[/math] in the worst case.

Proof: It suffices to show that the minimum total number of keys on height level [math]\displaystyle{ i\gt 0 }[/math] is [math]\displaystyle{ 2\cdot M^{h-1}\cdot(M-1) }[/math]. We apply an induction on [math]\displaystyle{ i }[/math].

  1. Induction basis: Since the root contains at least one key, there are at least two tree nodes on height level [math]\displaystyle{ i=1 }[/math], each containing at least [math]\displaystyle{ M-1 }[/math] keys.
  2. Induction step: As all leaves are on the same height level, the minimum number of tree nodes on level [math]\displaystyle{ i }[/math] is [math]\displaystyle{ M }[/math] times the minimum number of tree nodes on level [math]\displaystyle{ i-1 }[/math]. Therefore, the minimum number of keys on height level [math]\displaystyle{ i }[/math] is [math]\displaystyle{ M }[/math] times the minimum number of keys on height level [math]\displaystyle{ i-1 }[/math].