Binary Heap

What is Binary Heap

Binary Heap is a complete binary tree that satisfies the heap property: every node is greater than (or less than) its children. This makes the heap an ideal structure for implementing priority queues.

It was proposed by J.W.J. Williams in 1964 as the basis for the HeapSort algorithm.


Two Types of Heap

graph LR A["Max-Heap\n(parent ≥ children)\nRoot = maximum"] B["Min-Heap\n(parent ≤ children)\nRoot = minimum"] A ~~~ B
Max-Heap Min-Heap
Property parent ≥ child parent ≤ child
Root maximum element minimum element
Application HeapSort, top-K Dijkstra, schedulers

Key Properties

Property Value
Tree Type Complete binary (all levels filled except the last)
Last Level filled left to right
Storage usually in an array (no pointers)
Tree Height O(log n)
Order between siblings undefined

Array Storage

Main feature: the tree is stored in a regular array, without pointers:

        1
       / \
      3   2
     / \ / \
    7  4 5  6

Array: [ _, 1, 3, 2, 7, 4, 5, 6 ]  (1-based indexing)
        0  1  2  3  4  5  6  7

Index Navigation:

For a node at index i:
┌─────────────────────────────────┐
│  Parent      →  i / 2           │
│  Left child  →  2 * i           │
│  Right child →  2 * i + 1       │
└─────────────────────────────────┘

(for 0-based indexing: parent = (i-1)/2, left = 2i+1, right = 2i+2)

Max-Heap Visualization

graph TD N1(("100")) N2(("85")) N3(("90")) N4(("70")) N5(("60")) N6(("75")) N7(("80")) N8(("40")) N9(("50")) N10(("55")) N1 --> N2 N1 --> N3 N2 --> N4 N2 --> N5 N3 --> N6 N3 --> N7 N4 --> N8 N4 --> N9 N5 --> N10
Array: [100, 85, 90, 70, 60, 75, 80, 40, 50, 55]
Index:   0    1   2   3   4   5   6   7   8   9

Basic Operations

🔼 Sift Up (bubble up) — auxiliary

Used after insertion: a new element is added to the end and "bubbles up" to its correct position.

Inserting 95 into Max-Heap:

[100, 85, 90, 70, 60, 75, 80, 40, 50, 55, 95]
                                           ↑ added to the end

95 > 60 (parent) → swap
95 > 85 (parent) → swap
95 < 100 (parent) → stop

Result: [100, 95, 90, 70, 85, 75, 80, 40, 50, 55, 60]
def sift_up(heap, i):
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] > heap[parent]:      # for Max-Heap
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break

🔽 Sift Down (percolate down) — auxiliary

Used after root removal: the last element is placed at the root and "sinks down" to its correct position.

Removing root 100:
  Place last element (60) at the root
  [60, 95, 90, 70, 85, 75, 80, 40, 50, 55]

  60 < max(95, 90) = 95 → swap with left child
  [95, 60, 90, 70, 85, 75, 80, 40, 50, 55]

  60 < max(70, 85) = 85 → swap with right child
  [95, 85, 90, 70, 60, 75, 80, 40, 50, 55]

  60 > no children → stop  ✓
def sift_down(heap, i, n):
    while True:
        largest = i
        left, right = 2*i + 1, 2*i + 2
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            break
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

➕ Insertion — O(log n)

1. Add element to the end of the array
2. Sift Up from the added element
def push(heap, val):
    heap.append(val)
    sift_up(heap, len(heap) - 1)

🏆 Get Max/Min — O(1)

def peek(heap):
    return heap[0]    # just check the root

🗑️ Root Removal (Extract Max/Min) — O(log n)

1. Remember the root (result)
2. Put the last element in the root position
3. Decrease heap size by 1
4. Sift Down from the root
def pop(heap):
    heap[0], heap[-1] = heap[-1], heap[0]
    val = heap.pop()
    sift_down(heap, 0, len(heap))
    return val

🏗️ Heapify (Build Heap) — O(n)

Naive way (n insertions) = O(n log n)
Optimal way (Floyd, 1964) = O(n) — sift_down from middle to root:

def heapify(arr):
    n = len(arr)
    # Start from the last internal node
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
Why O(n)? Most nodes are leaves (n/2 nodes), 
which perform 0 operations. Nodes above perform O(1), O(2)...
Total: O(n) based on geometric progression formula.

HeapSort — O(n log n)

The heap is the basis of a classic sorting algorithm:

1. Build Max-Heap from array     → O(n)
2. n times: extract maximum      → O(n log n)
   - swap root with last element
   - decrease heap size
   - sift_down

In-place sorting, no additional memory!
def heap_sort(arr):
    n = len(arr)
    heapify(arr)                           # O(n)
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]   # root to the end
        sift_down(arr, 0, i)              # O(log n)

Operation Complexity

{
  "title": { "text": "Binary Heap Operation Complexity", "left": "center" },
  "tooltip": { "trigger": "axis" },
  "legend": { "data": ["O(1)", "O(log n)", "O(n)"], "bottom": 0 },
  "xAxis": {
    "type": "category",
    "data": ["1K", "10K", "100K", "1M", "10M"]
  },
  "yAxis": {
    "type": "value",
    "name": "Operations"
  },
  "series": [
    {
      "name": "O(1)",
      "type": "line",
      "data": [1, 1, 1, 1, 1],
      "smooth": true,
      "lineStyle": { "color": "#5cb85c", "width": 3 }
    },
    {
      "name": "O(log n)",
      "type": "line",
      "data": [10, 13, 17, 20, 23],
      "smooth": true,
      "lineStyle": { "color": "#f0ad4e", "width": 3 }
    },
    {
      "name": "O(n)",
      "type": "line",
      "data": [1000, 10000, 100000, 1000000, 10000000],
      "smooth": true,
      "lineStyle": { "color": "#d9534f", "width": 2, "type": "dashed" }
    }
  ]
}
Operation Complexity Note
Peek (min/max) O(1) just heap[0]
Push O(log n) sift up
Pop (extract) O(log n) sift down
Heapify O(n) Floyd's algorithm
HeapSort O(n log n)
Arbitrary Search O(n) heap doesn't support
Arbitrary Deletion O(log n) requires knowing the index

Pros and Cons

✅ Pros ❌ Cons
Min/Max in O(1) Arbitrary Search — O(n)
Compact array storage No efficient in-order traversal
No pointer overhead Not suitable for range queries
Cache-friendly (array) Unstable order structure
Heapify in O(n) Merging two heaps — O(n)

Where It is Used

Area Example
📊 Graph Algorithms Dijkstra, Prim — Min-Heap to select next vertex
🗓️ Task Schedulers OS selects next task by priority
📡 Streaming Data Top-K elements in real-time
🔀 K-way merge Merging K sorted arrays
🗜️ Data Compression Huffman algorithm — built using Min-Heap
🐍 Python heapq — built-in Min-Heap
☕ Java PriorityQueue — Min-Heap by default

Final Comparison of All Four Structures

B-Tree Trie Patricia Trie Binary Heap
Key Type any strings strings any (ord.)
Search O(log n) O(m) O(m) O(n)
Min/Max O(log n) O(1)
Insertion O(log n) O(m) O(m) O(log n)
Prefix Search
Memory moderate large compact minimal
Disk-friendly
Application DBMS, FS dictionaries, NLP routing, OS kernel queues, sorting