B-tree: Difference between revisions

From Algowiki
Jump to navigation Jump to search
No edit summary
 
(18 intermediate revisions by the same user not shown)
Line 2: Line 2:
[[Category:Tree]]
[[Category:Tree]]
[[Category:B-tree]]
[[Category:B-tree]]
== General information ==


'''Abstract data structure:''' [[Sorted sequence]]
'''Abstract data structure:''' [[Sorted sequence]]
Line 7: Line 9:
'''Implementation Invariant:'''
'''Implementation Invariant:'''
# There is a number <math>M\in\mathbb{N}</math>, <math>M>1</math>, which is specific for the B-tree object and is constant throughout the life time of the B-tree object. This value <math>M</math> is called the '''order''' of the B-tree.
# There is a number <math>M\in\mathbb{N}</math>, <math>M>1</math>, which is specific for the B-tree object and is constant throughout the life time of the B-tree object. This value <math>M</math> is called the '''order''' of the B-tree.
# A B-tree of order <math>M</math> is a [[Multi-way search tree|multi-way search tree]] with <math>2M-1</math> slots for keys and, consequently, <math>2M</math> slots for children pointers, in each node.
# A B-tree of order <math>M</math> is a [[Directed Tree#Multi-way Search Trees|multi-way search tree]] with <math>2M-1</math> slots for keys and, consequently, <math>2M</math> slots for children pointers, in each node.
## The slots for keys are denoted keys<math>[1],\ldots,</math>keys<math>[2M\!-\!1]</math>.
## The slots for keys are denoted keys<math>[1],\ldots,</math>keys<math>[2M\!-\!1]</math>.
## The slots for children are denoted children<math>[1],\ldots,</math>children<math>[2M]</math>.
## The slots for children are denoted children<math>[0],\ldots,</math>children<math>[2M-1]</math>.
# The attribute <math>n</math> of a node stores the number of filled key slots in this node.
# The attribute <math>n</math> of a node stores the number of filled key slots in this node.
# In each tree node except for the root, it is <math>n\geq M-1</math>; for the root, it is <math>n\geq 1</math>.
# In each tree node except for the root, it is <math>n\geq M-1</math>; for the root, it is <math>n\geq 1</math>.
# The filled key slots are the ones at positions <math>1,\ldots,n</math>; the filled children slots are the ones at positions <math>0,\ldots,n</math> (except for leaves, of course, where no child slot is filled at all).
# The filled key slots are the ones at positions <math>1,\ldots,n</math>; the filled children slots are the ones at positions <math>0,\ldots,n</math> (except for leaves, of course, where no child slot is filled at all).
# The keys appear in ascending order in a B-tree node, that is, keys<math>[i]<</math> keys<math>[i\!+\!1]</math> for all <math>i\in\{1,\ldots,n-1\}</math>.
# The keys appear in ascending order in a B-tree node, that is, keys<math>[i]\leq</math> keys<math>[i\!+\!1]</math> for all <math>i\in\{1,\ldots,n-1\}</math>.
# The children pointers appear in the order as used in the definition of [[Multi-way search tree|multi-way search trees]]. In other words, for , the [[Directed tree#Range|range]] of the node to which points is
# The children pointers appear in the order as used in the definition of [[Directed Tree#Multi-way Search Trees|multi-way search trees]]. In other words, for <math>R</math> the [[Directed Tree#Range|range]] of the considered node and for <math>i\in\{0,\ldots,n\}</math>, the [[Directed Tree#Range|range]] of the node to which children points is
## for ;
## <math>R\cap\{K\in\mathcal{K}\,|\,K\leq</math> keys<math>[1]\,\}</math> for <math>i=0</math>;
## for ;
## <math>R\cap\{K\in\mathcal{K}\,|\,</math> keys<math>[i\!-\!1]\leq K\leq</math> keys<math>[i]\,\}</math> for <math>i\in\{1,\ldots,n-1\}</math>;
## for .
## <math>R\cap\{K\in\mathcal{K}\,|\,K\geq</math> keys<math>[n]\,\}</math> for <math>i=n</math>.
# All leaves of a B-tree are on the same height level.
# All leaves of a B-tree are on the same height level.
== Depth of a B-tree ==
'''Statement:'''
The height of a B-tree of order <math>M</math> with <math>N</math> currently stored keys is <math>\lceil\log_M(N/(2\cdot(M-1))) \rceil+1\in\Theta(\log N)</math> in the worst case.
'''Proof:'''
It suffices to show that the minimum total number of keys on height level <math>i>0</math> is <math>2\cdot M^{i-1}\cdot(M-1)</math>. We apply an induction on <math>i</math>.
# ''Induction basis'': Since the root contains at least one key, there are at least two tree nodes on height level <math>i=1</math>, each containing at least <math>M-1</math> keys.
# ''Induction step'': As all leaves are on the same height level, the minimum number of tree nodes on level <math>i</math> is <math>M</math> times the minimum number of tree nodes on level <math>i-1</math>. Therefore, the minimum number of keys on height level <math>i</math> is <math>M</math> times the minimum number of keys on height level <math>i-1</math>.

Latest revision as of 13:49, 26 May 2015


General information

Abstract data structure: Sorted sequence

Implementation Invariant:

  1. There is a number [math]\displaystyle{ M\in\mathbb{N} }[/math], [math]\displaystyle{ M\gt 1 }[/math], which is specific for the B-tree object and is constant throughout the life time of the B-tree object. This value [math]\displaystyle{ M }[/math] is called the order of the B-tree.
  2. A B-tree of order [math]\displaystyle{ M }[/math] is a multi-way search tree with [math]\displaystyle{ 2M-1 }[/math] slots for keys and, consequently, [math]\displaystyle{ 2M }[/math] slots for children pointers, in each node.
    1. The slots for keys are denoted keys[math]\displaystyle{ [1],\ldots, }[/math]keys[math]\displaystyle{ [2M\!-\!1] }[/math].
    2. The slots for children are denoted children[math]\displaystyle{ [0],\ldots, }[/math]children[math]\displaystyle{ [2M-1] }[/math].
  3. The attribute [math]\displaystyle{ n }[/math] of a node stores the number of filled key slots in this node.
  4. In each tree node except for the root, it is [math]\displaystyle{ n\geq M-1 }[/math]; for the root, it is [math]\displaystyle{ n\geq 1 }[/math].
  5. The filled key slots are the ones at positions [math]\displaystyle{ 1,\ldots,n }[/math]; the filled children slots are the ones at positions [math]\displaystyle{ 0,\ldots,n }[/math] (except for leaves, of course, where no child slot is filled at all).
  6. The keys appear in ascending order in a B-tree node, that is, keys[math]\displaystyle{ [i]\leq }[/math] keys[math]\displaystyle{ [i\!+\!1] }[/math] for all [math]\displaystyle{ i\in\{1,\ldots,n-1\} }[/math].
  7. The children pointers appear in the order as used in the definition of multi-way search trees. In other words, for [math]\displaystyle{ R }[/math] the range of the considered node and for [math]\displaystyle{ i\in\{0,\ldots,n\} }[/math], the range of the node to which children points is
    1. [math]\displaystyle{ R\cap\{K\in\mathcal{K}\,|\,K\leq }[/math] keys[math]\displaystyle{ [1]\,\} }[/math] for [math]\displaystyle{ i=0 }[/math];
    2. [math]\displaystyle{ R\cap\{K\in\mathcal{K}\,|\, }[/math] keys[math]\displaystyle{ [i\!-\!1]\leq K\leq }[/math] keys[math]\displaystyle{ [i]\,\} }[/math] for [math]\displaystyle{ i\in\{1,\ldots,n-1\} }[/math];
    3. [math]\displaystyle{ R\cap\{K\in\mathcal{K}\,|\,K\geq }[/math] keys[math]\displaystyle{ [n]\,\} }[/math] for [math]\displaystyle{ i=n }[/math].
  8. All leaves of a B-tree are on the same height level.

Depth of a B-tree

Statement: The height of a B-tree of order [math]\displaystyle{ M }[/math] with [math]\displaystyle{ N }[/math] currently stored keys is [math]\displaystyle{ \lceil\log_M(N/(2\cdot(M-1))) \rceil+1\in\Theta(\log N) }[/math] in the worst case.

Proof: It suffices to show that the minimum total number of keys on height level [math]\displaystyle{ i\gt 0 }[/math] is [math]\displaystyle{ 2\cdot M^{i-1}\cdot(M-1) }[/math]. We apply an induction on [math]\displaystyle{ i }[/math].

  1. Induction basis: Since the root contains at least one key, there are at least two tree nodes on height level [math]\displaystyle{ i=1 }[/math], each containing at least [math]\displaystyle{ M-1 }[/math] keys.
  2. Induction step: As all leaves are on the same height level, the minimum number of tree nodes on level [math]\displaystyle{ i }[/math] is [math]\displaystyle{ M }[/math] times the minimum number of tree nodes on level [math]\displaystyle{ i-1 }[/math]. Therefore, the minimum number of keys on height level [math]\displaystyle{ i }[/math] is [math]\displaystyle{ M }[/math] times the minimum number of keys on height level [math]\displaystyle{ i-1 }[/math].