Red-Black Tree

A red-black tree is a self-balancing binary search tree where each node has a color (red or black), and rules are followed to guarantee logarithmic tree height.


Rules (Invariants)

# Rule
1 Each node is either red or black
2 The root is always black
3 All leaves (NIL) are black
4 If a node is red, both its children are black
5 Every path from a node to its descendant NIL leaves contains the same number of black nodes (black-height)

Node Structure

class RBNode:
    def __init__(self, key):
        self.key   = key
        self.color = "RED"      # new nodes are always red
        self.left  = None       # NIL leaf
        self.right = None
        self.parent = None

Visualization

graph TD 13B["🔲 13"]:::black 8R["🔴 8"]:::red 17B["🔲 17"]:::black 1B["🔲 1"]:::black 11B["🔲 11"]:::black 15R["🔴 15"]:::red 25R["🔴 25"]:::red 6R["🔴 6"]:::red 13B --> 8R 13B --> 17B 8R --> 1B 8R --> 11B 17B --> 15R 17B --> 25R 1B --> 6R classDef black fill:#2d2d2d,color:#fff,stroke:#000 classDef red fill:#c0392b,color:#fff,stroke:#8e1a0e

Operation Complexity

Operation Average Worst Case
Search O(log n) O(log n)
Insertion O(log n) O(log n)
Deletion O(log n) O(log n)
Memory O(n) O(n)

Tree height ≤ 2 · log₂(n+1) — guaranteed ceiling.


Insertion Algorithm

def insert(self, key):
    # 1. Standard BST insertion (new node is RED)
    node = RBNode(key)
    self._bst_insert(node)

    # 2. Restore invariants
    self._fix_insert(node)

def _fix_insert(self, z):
    while z.parent and z.parent.color == "RED":
        if z.parent == z.parent.parent.left:
            uncle = z.parent.parent.right

            if uncle and uncle.color == "RED":
                # Case 1: uncle is red → recoloring
                z.parent.color         = "BLACK"
                uncle.color            = "BLACK"
                z.parent.parent.color  = "RED"
                z = z.parent.parent
            else:
                if z == z.parent.right:
                    # Case 2: node is right child → left rotation
                    z = z.parent
                    self._rotate_left(z)
                # Case 3: node is left child → right rotation
                z.parent.color        = "BLACK"
                z.parent.parent.color = "RED"
                self._rotate_right(z.parent.parent)
        else:
            # Symmetric cases (right side)
            ...

    self.root.color = "BLACK"   # rule 2

Rotations

Left Rotation (x):          Right Rotation (y):

    x                               y
   / \          →                  / \
  A   y                           x   C
     / \                         / \
    B   C                       A   B
def _rotate_left(self, x):
    y = x.right
    x.right = y.left
    if y.left:
        y.left.parent = x
    y.parent = x.parent
    if not x.parent:
        self.root = y
    elif x == x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y

Comparison with Other Trees

Criterion RB-Tree AVL-Tree B-Tree
Balancing Less strict Strict By degree
Insertion/Deletion Faster Slower Fast on disk
Search Slightly slower Faster Slower in memory
Application std::map, TreeMap DB indexes File systems

Real-World Applications


💡 Tip: If you need to use an RB-Tree in DataPlayer queries, such structures are already built into PostgreSQL and MySQL indexes — you don't need to implement them manually.


Searching in a Red-Black Tree

Search Algorithm

Searching in an RB-Tree is identical to searching in a standard BST — coloring is completely ignored:

def search(node, key):
    if node is NIL or node.key == key:
        return node          # found or reached a leaf

    if key < node.key:
        return search(node.left, key)   # go left
    else:
        return search(node.right, key)  # go right
flowchart TD A["Start: root"] --> B{"key == node.key?"} B -- "Yes" --> C["✅ Found"] B -- "key < node.key" --> D["Go left"] B -- "key > node.key" --> E["Go right"] D --> F{"node == NIL?"} E --> F F -- "Yes" --> G["❌ Not found"] F -- "No" --> B

Main Question: Does Coloring Speed Up Searching?

Short Answer: No, not directly

The node color is not checked during search and does not affect the direction of traversal.

Then Why Colors?

Coloring is an indirect mechanism: it ensures tree balance during insertions and deletions, and balance guarantees fast searching.

Colors → adherence to invariants
       → tree remains balanced
       → height h = O(log n)
       → search runs in O(log n)

How Invariants Limit Height

Rules 4 and 5 together lead to a strict height limit:

The height of an RB-Tree does not exceed 2 · log₂(n+1)

Why this is so:

{
  "title": { "text": "Maximum Tree Height vs Number of Nodes", "left": "center" },
  "tooltip": { "trigger": "axis" },
  "legend": { "data": ["Unbalanced BST (worst case)", "Red-Black Tree (2·log₂n)", "AVL Tree (1.44·log₂n)"], "bottom": 0 },
  "xAxis": {
    "type": "category",
    "name": "n (nodes)",
    "data": ["1", "10", "50", "100", "500", "1000", "10000", "100000"]
  },
  "yAxis": { "type": "value", "name": "Height" },
  "series": [
    {
      "name": "Unbalanced BST (worst case)",
      "type": "line",
      "data": [1, 10, 50, 100, 500, 1000, 10000, 100000],
      "lineStyle": { "color": "#e74c3c" }
    },
    {
      "name": "Red-Black Tree (2·log₂n)",
      "type": "line",
      "data": [0, 7, 11, 13, 18, 20, 27, 34],
      "lineStyle": { "color": "#e67e22" }
    },
    {
      "name": "AVL Tree (1.44·log₂n)",
      "type": "line",
      "data": [0, 5, 8, 10, 13, 14, 19, 24],
      "lineStyle": { "color": "#2ecc71" }
    }
  ]
}

AVL vs RB-Tree: Search

AVL Red-Black
Max Height 1.44 · log₂n 2 · log₂n
Search (theory) Slightly faster Slightly slower
Search (practice) Comparable Comparable
Balance after insertion Strict Soft
Rotations on insertion More Fewer
Application Read-heavy Universal

In practice, the search speed difference between AVL and RB is minimal and often eclipsed by cache effects and implementation details. Therefore, RB-trees are more commonly chosen as a universal structure — they are faster for writes without significantly sacrificing read performance.


Summary