Thursday, 20 January 2011

Make a partial max heap a perfect max heap.

 

Maintain the property that Parent > Children on every tree on the heap.

See also:

http://analgorithmaday.blogspot.com/2011/01/heap-data-structure.html

Introduction

   When you include some more elements in the array to the max heap, you need to still satisfy the property that the parent > children. To get this property true, we need to change the root of the tree or swap some elements(if its array) from parent to children. To simplify this operation, we use a simple induction method which is the reverse of the condition.

    Consider, parent > children is violated at index i. Also, assume that, the LEFT(i) and RIGHT(i) trees have not violated this condition yet. This is the initial condition using which we swap elements to make a perfect max heap.

More concepts

   Consider we have the following array which is partially a max heap, (Courtesy: Corman-Lewis)

Index: 1 2 3 4 5 6 7 8 9 10
Value: 16 4 10 14 7 9 3 2 8 1

You might note something interesting comes out from the above data set. If you see, 16->10->9->3 is sorted. i.e, the RIGHT(root) tree is sorted but not the LEFT(root) tree.

                             16

                      /               \

                4                         10

          /          \                 /        \

     14              7             9             3

/      \         /

2        8     1

 

This means, we need to convert the left sub-tree to match the max heap property. That, parent > child while already RIGHT(root) tree is following that.

Code

  • Calculate the left and right index for the current root
  • check whether the value at the left is greater than the root
    • If yes, this is the largest
    • no, then current root is the largest
  • Compare the largest element between LEFT(i) and i with RIGHT(i), if it becomes the largest, then largest = RIGHT index
  • If both are not true, then return the algorithm.. This is a breaking condition.
  • If left or right becomes the largest, continue the recursion in the direction of the largest, until left or right doesn't exceed the heap size. This is one more return condition
   1:  #define LEFT(x) 2*x+1 // since we start with index 0
   2:  #define RIGHT(x) 2*x+2
   3:  #define heap_size(x) (sizeof(x)/sizeof(x[0])) //similar to ARRAYSIZE macro
   4:   
   5:  template <typename T, size_t size>
   6:  void max_heapify(T (&arr)[size], int idx)
   7:  {
   8:      int l = LEFT(idx);
   9:      int r = RIGHT(idx);
  10:      int largest;
  11:      if( l <= heap_size(arr) && arr[l] > arr[idx])
  12:          largest = l;
  13:      else
  14:          largest = idx;
  15:   
  16:      if(r <= heap_size(arr) && arr[r] > arr[largest])
  17:          largest = r;
  18:   
  19:      if(largest != idx) {
  20:          int temp = arr[idx];
  21:          arr[idx] = arr[largest];
  22:          arr[largest] = temp;
  23:          max_heapify(arr, largest);
  24:      }
  25:   
  26:  }
  27:   
  28:  void main()
  29:  {
  30:      int A[]={16,4,10,14,7,9,3,2,8,1};
  31:      int val = heap_size(A);
  32:      max_heapify(A, 1);
  33:      
  34:  }

 

Problems found while coding:

  • As we always consider start index as 0, LEFT(i) will equal 2i+1 and RIGHT(i) will equal 2i+2
  • Interesting problem: in C/C++, array is passed by value by default which means you just get a pointer inside the function, not the full array. So, you cannot use sizeof(array) to get correct size of the array
  • To complicate this more, in C++, we have possibility of passing multiple references if we know the size of the array. So, we make use of templates to make it generic. (not advised for large scale use) Refer: http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=241
  • Read more about pointers and references here: http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=166
  • I will cover a extensive topic about pointers and references soon.