# 5.7: Red-Black trees

A red-black tree is a self-balancing binary search tree that assigns a color to each of its nodes. Its structure must adhere to a set of rules which dictate how nodes of each color can be arranged. These rules are enforced whenever the tree is modified: inserting a new node or deleting an old one may cause certain nodes to be rotated and recolored. This keeps the red-black tree balanced, guaranteeing a worst-case search complexity of O(log n).

The rules that a red-black tree must adhere to are as follows:

1. Each node must be either red or black.
2. The root is always black.
3. All leaves within the tree are black (leaves do not contain data and can be modelled as null or nil references in most programming languages).
4. Every red node must have two black child nodes.
5. Every path from a given node to any of its descendant leaves must contain the same number of black nodes.
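The rules above can be checked mechanically. The following is a minimal sketch, not part of the original text: the `Node` layout and the names `black_height` and `is_red_black` are assumptions made for illustration, and `None` stands for a black leaf as rule 3 allows.

```python
RED, BLACK = 1, 0

class Node:
    """Hypothetical minimal node; None plays the role of a black leaf."""
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def black_height(n):
    """Return the number of black nodes on any path from n down to a leaf.

    Raises ValueError if rule 4 or rule 5 is violated in the subtree."""
    if n is None:
        return 1  # rule 3: a nil leaf counts as one black node
    if n.color == RED:
        for child in (n.left, n.right):
            if child is not None and child.color == RED:
                raise ValueError("rule 4: red node %r has a red child" % (n.key,))
    lh = black_height(n.left)
    rh = black_height(n.right)
    if lh != rh:
        raise ValueError("rule 5: unequal black counts below %r" % (n.key,))
    return lh + (1 if n.color == BLACK else 0)

def is_red_black(root):
    """Check rules 2, 4 and 5; rules 1 and 3 are assumed to hold because
    colors are only ever RED or BLACK and leaves are None."""
    if root is not None and root.color != BLACK:
        return False  # rule 2: the root must be black
    try:
        black_height(root)
    except ValueError:
        return False
    return True

good = Node(2, BLACK, Node(1, RED), Node(3, RED))
bad = Node(2, BLACK, Node(1, RED, Node(0, RED)), Node(3, BLACK))
print(is_red_black(good), is_red_black(bad))
```

Here `good` is a black root with two red children, while `bad` contains a red node with a red child, so only the first passes the check.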

A red-black tree can be modelled as a 2-3-4 tree, which is a special case of a B-tree (below). A black node with one red child can be seen as linked together with that child into a 3-node, and a black node with two red children can be seen as a 4-node.
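As a small illustration of this correspondence, the sketch below classifies a black node by the 2-3-4 node it heads. The `Node` class and the helper names `node_kind` and `cluster_keys` are hypothetical and chosen only for this example.

```python
RED, BLACK = 1, 0

class Node:
    """Hypothetical minimal node; None plays the role of a black leaf."""
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def is_red(n):
    return n is not None and n.color == RED

def node_kind(n):
    """Return 2, 3 or 4: the kind of 2-3-4 node headed by black node n."""
    if n is None or n.color != BLACK:
        raise ValueError("only a black node heads a 2-3-4 node")
    return 2 + int(is_red(n.left)) + int(is_red(n.right))

def cluster_keys(n):
    """Keys of the 2-3-4 node headed by n, in sorted order."""
    keys = []
    if is_red(n.left):
        keys.append(n.left.key)
    keys.append(n.key)
    if is_red(n.right):
        keys.append(n.right.key)
    return keys

two = Node(5, BLACK)
three = Node(5, BLACK, Node(3, RED))
four = Node(5, BLACK, Node(3, RED), Node(8, RED))
for n in (two, three, four):
    print(node_kind(n), cluster_keys(n))
```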

A 4-node is split by coloring its two red children black, which produces two 2-nodes, and coloring the middle node red, which pushes it up into its parent. A parent with no red child thereby changes from a 2-node to a 3-node, and a parent that already has one red child becomes a 4-node (this last case does not occur when red nodes are always kept on the left).
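The split amounts to a recoloring of three nodes. The sketch below assumes a hypothetical `Node` class and a helper named `split_4node`; it shows a 2-node parent becoming a 3-node once its black child, a 4-node, is split.

```python
RED, BLACK = 1, 0

class Node:
    """Hypothetical minimal node; None plays the role of a black leaf."""
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def is_red(n):
    return n is not None and n.color == RED

def split_4node(n):
    """Split the 4-node headed by n: both red children become black
    and n itself becomes red, joining its parent's 2-3-4 node."""
    if not (is_red(n.left) and is_red(n.right)):
        raise ValueError("n does not head a 4-node")
    n.left.color = BLACK
    n.right.color = BLACK
    n.color = RED
    return n

# m heads a 4-node (3, 5, 8); its parent p is a 2-node before the split.
m = Node(5, BLACK, Node(3, RED), Node(8, RED))
p = Node(10, BLACK, m, Node(12, BLACK))
split_4node(m)
# p now has exactly one red child (m), so it reads as the 3-node (5, 10).
print(is_red(p.left), is_red(p.right))
```

The number of black nodes on each path through `m` is unchanged by the recoloring, which is why the split can be applied without disturbing rule 5.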

An in-line arrangement of two red nodes is rotated into a parent with two red children (a 4-node), which is then split as described above.

                                                                          | red
            A                             B                               B
       red / \      right rotate     red / \ red    split 4-node         / \
          B         ----------->        C   A       ----------->        C   A
     red / \                               /                               /
        C   D                             D                               D
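The right rotation in the diagram can be sketched as follows. The `Node` class and the function name `rotate_right` are assumptions for illustration; the keys A, B, C and D match the diagram, and D stands in for whatever subtree hangs to the right of B.

```python
RED, BLACK = 1, 0

class Node:
    """Hypothetical minimal node; None plays the role of a black leaf."""
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def rotate_right(a):
    """Rotate a's red left child b up into a's place and return b."""
    b = a.left
    a.left = b.right          # D moves across to become A's left child
    b.right = a               # A becomes B's right child
    b.color, a.color = a.color, b.color   # B takes A's color, A becomes red
    return b

# Build the left-hand picture: black A, red B on its left, red C below B.
d = Node("D", BLACK)
c = Node("C", RED)
b = Node("B", RED, c, d)
a = Node("A", BLACK, b, None)

root = rotate_right(a)
# Middle picture: B on top with two red children, i.e. a 4-node.
print(root.key, root.left.key, root.right.key, root.right.left.key)
```

Splitting the resulting 4-node (coloring C and A black and B red) gives the right-hand picture.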


An optimization mentioned by Sedgewick is that every red node inserted on the right is rotated left to become a left red node, so that only in-line left red nodes ever have to be rotated right before splitting. AA-trees (above) by Arne Andersson, described in a paper in 1993, appear to be an earlier exposition of this simplification. Andersson, however, suggested right-leaning 'red marking' rather than the left-leaning form suggested by Sedgewick, so AA trees seem to have precedence over left-leaning red-black trees. It would be quite a shock if the Linux CFS scheduler were described in the future as 'AA based'.
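A minimal sketch of this left-leaning normalization is shown below. The `Node` class and the names `rotate_left` and `lean_left` are hypothetical; the guard that skips the rotation when the left child is also red is one common formulation and is an assumption here, not something the text prescribes.

```python
RED, BLACK = 1, 0

class Node:
    """Hypothetical minimal node; None plays the role of a black leaf."""
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def is_red(n):
    return n is not None and n.color == RED

def rotate_left(h):
    """Rotate h's red right child x up into h's place and return x."""
    x = h.right
    h.right = x.left
    x.left = h
    x.color, h.color = h.color, x.color   # x takes h's color, h becomes red
    return x

def lean_left(h):
    """Turn a right-leaning red link at h into a left-leaning one."""
    if is_red(h.right) and not is_red(h.left):
        return rotate_left(h)
    return h

# A black node whose only red child hangs on the right.
h = Node(10, BLACK, None, Node(20, RED))
top = lean_left(h)
print(top.key, top.color, top.left.key, top.left.color)
```

After the call, the larger key is on top and black, and the smaller key is its red left child, so the same 3-node is represented in its left-leaning form.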

In summary, red-black trees are a way of detecting two insertions into the same side and levelling out the tree before things get worse. Two left-sided insertions will be rotated, and two right-sided insertions will look like two left-sided insertions after the left rotation that removes right-leaning red nodes. Two balanced insertions under the same parent could result in a 4-node split without rotation, so the question arises as to whether a red-black tree could be attacked with serial insertions of one-sided triads a < P < b, where the next triad's P' < a.
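One way to probe this question is to run the experiment. The sketch below is an assumption-laden illustration: it uses a compact left-leaning insertion in Sedgewick's style (not necessarily the variant the text has in mind), and `triad_keys` is a hypothetical generator that emits each triad in the order P, a, b with every later triad lying entirely below the previous one. It compares the resulting height, counted in nodes, with the known bound of 2 log2(n + 1) for a red-black tree of n nodes.

```python
import math

RED, BLACK = 1, 0

class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED                  # new nodes start out red
        self.left = self.right = None

def is_red(n):
    return n is not None and n.color == RED

def rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x

def rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x

def insert(h, key):
    if h is None:
        return Node(key)
    if key < h.key:
        h.left = insert(h.left, key)
    elif key > h.key:
        h.right = insert(h.right, key)
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)                # keep red nodes on the left
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)               # in-line reds become a 4-node
    if is_red(h.left) and is_red(h.right):
        h.color, h.left.color, h.right.color = RED, BLACK, BLACK  # split
    return h

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
        root.color = BLACK                # the root is always black
    return root

def height(n):
    """Number of nodes on the longest path from n down to a leaf."""
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

def triad_keys(m):
    """Yield m triads as P, a, b (a < P < b), each below the previous one."""
    for i in range(m - 1, -1, -1):
        a = 3 * i
        yield from (a + 1, a, a + 2)

for m in (10, 100, 1000):
    n = 3 * m
    print(n, height(build(triad_keys(m))), 2 * math.log2(n + 1))
```

If the measured height stays under the bound for this pattern, that is at least consistent with the rebalancing absorbing one-sided triads rather than being defeated by them; it is an experiment on one insertion order, not a proof.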

Illustrative Python code follows:

    RED = 1
    BLACK = 0


    class Node:
        def __init__(self, k, v):
            # all newly inserted nodes are RED
            self.color = RED
            self.k = k
            self.v = v
            self.left = None
            self.right = None


    def is_red(n):
        # None stands for a black leaf, so it is never red
        return n is not None and n.color == RED


    class RBTree:
        def __init__(self):
            self.root = None

        def insert(self, k, v):
            self.root = self._insert(self.root, k, v)
            # the root is always black
            self.root.color = BLACK

        def _insert(self, n, k, v):
            if n is None:
                return Node(k, v)
            if k < n.k:
                n.left = self._insert(n.left, k, v)
            elif k > n.k:
                n.right = self._insert(n.right, k, v)
            # an equal key leaves the existing node unchanged

            if is_red(n.right):
                # reds are always kept on the left: left rotate
                tmp = n.right
                n.right = tmp.left
                tmp.left = n
                n = tmp

                # color rotation is actually a swap
                n.color, n.left.color = n.left.color, n.color

            if is_red(n.left) and is_red(n.left.left):
                # right rotate in-line reds
                tmp = n.left
                n.left = tmp.right
                tmp.right = n
                n = tmp

                # color rotation is actually a swap
                n.color, n.right.color = n.right.color, n.color

                # No need to test before splitting the 4-node: the right
                # rotation has brought the red left grandchild up to become
                # the red left child, and the old red left child's parent is
                # now the red right child, so there are two red children.
                n.color = RED
                n.left.color = BLACK
                n.right.color = BLACK

            return n

        def find(self, k):
            return self._find_rb(k, self.root)

        def _find_rb(self, k, n):
            if n is None:
                return None
            if k < n.k:
                return self._find_rb(k, n.left)
            if k > n.k:
                return self._find_rb(k, n.right)
            return n.v

        def inorder(self):
            self.inorder_visit(self.root, "O")

        def inorder_visit(self, node, label=""):
            # print each value with the path of L/R steps that leads to it
            if node is None:
                return
            self.inorder_visit(node.left, label + "/L")
            print(label, "val=", node.v)
            self.inorder_visit(node.right, label + "/R")


    def test1(N):
        # insert 0..N-1 in ascending order, dump the tree, then look up every key
        t = RBTree()
        for i in range(N):
            t.insert(i, i)

        l = []
        t.inorder()
        for i in range(N):
            x = t.find(i)
            if x is not None:
                l.append((x, i))
        print("found", len(l))


    if __name__ == "__main__":
        test1(100000)
        test1(1000)
        test1(100)
        test1(10)