File Organization: B+-Tree Index Files

A B+-tree is a kind of balanced tree. The length of every path from root of the tree to the leaf of the tree are same. The number of children n is fixed for a particular tree. Each non-leaf node in the tree has between (n/2) and n children. Index used in B+-tree files is multilevel index but its structure differ from the index structure used in multilevel index-sequential files.

A typical node of B+-tree contains up to n–1 search key values and pointers. Ki represents search key value and Pi represents pointer to a file record, as shown in Figure 3.24.

In B+-trees search key values are in sorted order, thus, if i < j, then Ki < Kj. The number of pointers in a node is called Fanout of the node.

  • Leaf nodes : In a leaf node, a pointer Pi points to either a file record having search key value Ki or to a bucket of pointers where each pointer to a file record with search key value Ki (In case of secondary index in which index is made of non- prime key attributes and file is not sorted in search key value order). A leaf node for file Employee  (ID, Name,  Salary)  is  shown in  Figure  25.
  • Non-leaf nodes : The structure of non-leaf nodes are same as of leaf node with a single difference that pointer Pi points to the tree It contains at least (n/2) pointers and a maximum of n pointers.

  • Root nodes : A root node contains at least 2 pointers and a maximum of less than [n/2] A B+-tree contains only a single node if root node consists only a single pointer.

A pointer Pi with a search key value Ki in a non-leaf node points to a part of subtree having search key values less than Ki, and greater than or equal to Ki-1. Pointer Pm points to a part of subtree having search key values greater than or equal to Km-1, Pointer P1 points to the part of subtree having search key values less than K1.

A B+-tree for file employee is shown in Figure 3.26.

Searching a record : Searching a record with its key value starts from root node. It is possible to find desired record without going to leaf node in B+-tree.

Deletion of a record : Deletion in B+-tree is complicated process in some special cases. If desired entry is in leaf node then, simply delete it and if bucket is associated then bucket becomes empty as a result. If deletion causes a node to be less than half full, then it is combined with  its neighboring  nodes  and it  is  propagated all  the way  to  the root.

Insertion of a record : To insert a new record in B+-tree, first search a leaf node with same search key value. Simply add a new record to file or add a pointer in bucket, which points to the record. If search key value does not appear, simply insert the value in leaf node in correct position or create a new bucket with the appropriate pointer if necessary.

Advantages of B+-trees : The advantages of B+-trees are as follows:

  1. It provides a reasonable performance for direct access.
  2. It provides an excellent  performance  for sequential  and range accesses.
  3. Searching is faster.

Disadvantages of B+-trees : The disadvantages of B+-trees are as follows:

  1. Insertion is more complex than B-trees.
  2. Deletion is more complex than B-trees.
  3. Search key values are duplicated which results in wastage of memory space.

Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Database Management System, 2nd Edition-University Science Press (2017)

Leave a Reply

Your email address will not be published. Required fields are marked *