Thursday, 24 March 2011

Binary search tree–Deletion

“In a binary search tree, deleting a node is a challenge. Unlike insert where we do a plain check for in-order property to insert elements. That is, we add from left to right. But even after deleting, we need to get the in-order property to be satisfied”

Introduction

  In Binary search tree, deletion is a tricky algorithm and must be understood well to understand self-balancing trees. Binary search trees has a small problem when it comes to insertion and deletions. We won’t get a balanced tree ever!!

   Heap is balanced by default since we construct it from across with the single property that root is greater than the leaves. But BST property makes it grow either in left or the right side if the insertion of a sorted array is done.

Consider the insertion code we already have. Try giving,

    int A[]={1,2,3,4,5,6,6,7,8};

The above array will yield a tree which grows towards the right!!.. So, when we search such a binary tree, we get a worst case time if the tree is too huge!!..

This is where self-balancing trees came into picture.

  But what does deletion needs to do in a BST, it needs to follow the in-order sorted property. It also never balances a BST if it is right inclined. But there are proofs that random insertion and deletion gives somewhat balanced BST.

Concept

Deletion involves three cases. Consider you have given a node for deletion. For any node in a BST, there are 3 cases. Node can have 0, 1 or 2 children.

Case 1: node has no children = remove
Case 2: node has 1 children = move right or left, ignore node
Case 3: node has 2 children = find the successor and swap content

This is explained clearly in many books. So, please refer Cormen to get clear picture.

Code

 
BST* createBSTNode(int data)
{
    BST* newNode = new BST();
    newNode->data = data;
    newNode->left = NULL;
    newNode->right = NULL;
    newNode->parent = NULL;
    return newNode;
}
 
void bst_insert(BST*& tree, BST* node)
{
    BST* prev = NULL;
    BST* iter=tree;
    while(iter != NULL)
    {
        prev = iter;
        if(node->data < iter->data)
            iter = iter->left;
        else
            iter = iter->right;
    }
    // found node is parent to our node
    node->parent = prev;
 
    // if prev is NULL
    // then this is the first node
    // change the root NODE
    if(prev == NULL)
        tree = node;
    else {
        // now we need to decide where the node
        // should go: left or right
        if(node->data < prev->data)
            prev->left = node;
        else
            prev->right = node;
    }
 
}
 
BST* convert_array_bst(int* arr, int size)
{
    BST* tree=NULL;
    for(int i=0;i<size;i++) {
        bst_insert(tree, createBSTNode(arr[i]));
    }
    return tree;
}
 
BST* bst_search(BST* tree, int key)
{
    BST* skey = NULL;
    while(tree != NULL)
    {
        skey = tree;
        if(tree->data == key)
            return skey;
        else {
            if(key < tree->data)
                tree = tree->left;
            else
                tree = tree->right;
        }
    }
    return skey;
}
 
// left most is MIN
BST* min(BST* root)
{
    BST* minnode=NULL;
    while(root != NULL) {
        minnode = root;
        root = root->left;
    }
 
    return minnode;
}
 
BST* find_successor(BST* root)
{
    if(root == NULL) return NULL;
 
    // min of right subtree
    if(root->right != NULL) return min(root->right);
 
    // ancestor of node without right subtree
    BST* ancestor = root->parent;
    while(ancestor != NULL && ancestor->right == root)
    {
        root = ancestor;
        ancestor = ancestor->parent;
    }
    return ancestor;
}
 
void bst_delete(BST*& tree,BST* node)
{
    // Case 1: node has no children = remove
    // Case 2: node has 1 children = move right or left, ignore node
    // Case 3: node has 2 children = find the successor and swap content
    BST* y = NULL;
    BST* x = NULL;
 
    // Case 3: both nodes present, find successor
    if(node->left != NULL && node->right != NULL)
        y = find_successor(node);
    else
        y = node;
 
    // Case 2, check for right or left node
    if(y->left != NULL)
        x = y->left;
    else
        x = y->right;
 
    // Case 1 & 2: ignore node
    // Case 3: ignore successor
    if(x != NULL)
        x->parent = y->parent;
    
    if(y->parent == NULL) {
        // this means this is the root node DELETION!!
        tree = y;
    } else {
        // now change the y's parent node left & right pointers
        if( y == y->parent->left) {
            // y is the left node of parent then
            y->parent->left = x;
        } else {
            y->parent->right = x;
        }
    }
 
    // Case 1 & Case 2: y is always equal to node
    // Case 3: y = successor(node) 
    // Case 3: pending thing, swap content between node & y
    if(y != node) {
        node->data = y->data;
        // we can ignore left, right, parent
        // as successor is greater than left and lesser than right
    }
    free(y);
}
 
void main()
{
    int A[]={1,2,3,4,5,6,6,7,8};
    BST* sort = convert_array_bst(A, 9);
    BST* del = bst_search(sort, 6);
    bst_delete(sort, del);
    del = bst_search(sort, 6);
    bst_delete(sort, del);
 
}

Important points:

  • please study more about finding the successor.
  • deletion algorithm does all the 3 cases in a mixed way
  • we free(y) blindly!!.. but doesn’t break the tree structure.