B-Tree (Balanced Tree)

What is B-Tree

B-Tree (Balanced Tree) is a self-balancing search tree in which each node can store multiple keys and have multiple children. It was developed in 1970 by Rudolf Bayer and Edward M. McCreight for efficient interaction with disk storage.


Key Properties

For a B-Tree of order t (minimum degree, t ≥ 2):

Property Value
Minimum keys per node t - 1 (except root)
Maximum keys per node 2t - 1
Minimum children per node t (for internal nodes)
Maximum children per node 2t
All leaves at the same level
Keys in node sorted ascending

Node Structure

┌─────────────────────────────────────────────┐
│  P0 │ K1 │ P1 │ K2 │ P2 │ K3 │ P3 │ ... │  │
└─────────────────────────────────────────────┘
  ↑         ↑         ↑         ↑
child      key      child      key
pointer             pointer

Structure Visualization

graph TD Root["[10 | 20 | 30]"] Root --> C1["[3 | 7]"] Root --> C2["[15 | 17]"] Root --> C3["[25 | 27]"] Root --> C4["[35 | 40]"] C1 --> L1["[1 | 2]"] C1 --> L2["[4 | 5 | 6]"] C1 --> L3["[8 | 9]"] C2 --> L4["[11 | 12]"] C2 --> L5["[16]"] C2 --> L6["[18 | 19]"] C3 --> L7["[21 | 22]"] C3 --> L8["[26]"] C3 --> L9["[28 | 29]"] C4 --> L10["[31 | 33]"] C4 --> L11["[37 | 38]"] C4 --> L12["[42 | 45]"]

Basic Operations

🔍 Search — O(log n)

  1. Start from the root
  2. In the current node, search for the first key ≥ target
  3. If an exact match is found → done
  4. If reached a leaf and not found → does not exist
  5. Otherwise → move to the corresponding child
Search 16 in the tree above:
Root [10|20|30]: 16 < 20 → go to P1
Node [15|17]:     15 < 16 < 17 → go to P1
Leaf [16]:        16 == 16 ✓

➕ Insertion — O(log n)

Key rule: cannot insert into a full node (2t-1 keys).

Algorithm (proactive splitting):

1. Descend from root to the target leaf
2. If a full node is encountered (2t-1 keys) — split it BEFORE entering
3. Insert key into the leaf

Node Splitting:

Before: Parent[...] → Full Node [K1 K2 K3 K4 K5]  (t=3, max=5)
                                         ↑
                                   median key K3

After: Parent[... K3 ...] 
              ↙           ↘
       [K1 K2]           [K4 K5]

The median rises to the parent, and the node is split in two.


➖ Deletion — O(log n)

Three cases:

Case Situation Action
1 Key in leaf Remove directly
2 Key in internal node Replace with predecessor/successor from leaf, remove from there
3 Node becomes empty after deletion Merge or redistribute with sibling

Merging:

Parent [...K...]          Parent [...]
     ↙        ↘    →              ↓
 [A B]       [D E]          [A B K D E]
(underflow) (underflow)       (merged)

Redistribution (Rotation):

Parent [...K...]          Parent [...B...]
     ↙        ↘    →             ↙        ↘
 [A]       [B C D]           [A K]       [C D]
(underflow) (rich sibling)

Operation Complexity

{
  "title": { "text": "B-Tree Operation Complexity" },
  "tooltip": { "trigger": "axis" },
  "legend": { "data": ["O(log n)", "O(n) — worst case of other structures"] },
  "xAxis": {
    "type": "category",
    "data": ["1K", "10K", "100K", "1M", "10M", "100M"]
  },
  "yAxis": {
    "type": "value",
    "name": "Operations"
  },
  "series": [
    {
      "name": "O(log n)",
      "type": "line",
      "data": [10, 13, 17, 20, 23, 27],
      "smooth": true,
      "lineStyle": { "color": "#5cb85c", "width": 3 }
    },
    {
      "name": "O(n) — worst case of other structures",
      "type": "line",
      "data": [1000, 10000, 100000, 1000000, 10000000, 100000000],
      "smooth": true,
      "lineStyle": { "color": "#d9534f", "width": 2, "type": "dashed" }
    }
  ]
}
Operation Average Worst Case
Search O(log n) O(log n)
Insertion O(log n) O(log n)
Deletion O(log n) O(log n)
Traversal O(n) O(n)

Tree height: h = O(log_t n) — the larger t, the shorter the tree


B-Tree vs B+Tree

B-Tree B+Tree
Data storage In any node Only in leaves
Leaf links No Yes (linked list)
Range queries Slower Very fast
Application File systems Most DBMS

Where It is Used