An Algorithm a Day: maxheap

Showing posts with label maxheap. Show all posts

Monday, 14 February 2011

Priority Queue using binary heap

“Advantage of using binary heap for priority queue is that, remove operations are faster!! so that it can be used in a real time schedulers”

http://analgorithmaday.blogspot.com/2011/02/fun-with-bitwiseor-and-not-xor-and-left.html

Introduction

Before jumping into priority queues which we already saw a metaphor example. We will study how to represent an array as a tree. Only if we understand how to represent array as a tree, it would be easy to do study tree representation using linked lists in future!!

What are all the terminologies?

PARENT = floor(i/2)

LEFT = 2i

RIGHT = 2i + 1

The above calculation is based on the assumption that “i” can never be zero or one.. So you cannot find the PARENT of a root, which would be 0 then.

But in all our calculations, we start with index 0. Since we start with zero, 1 and 2 becomes the right and left index of the root!!. so in our case,

PARENT = floor(i-1/2) && LEFT = 2i + 1 && RIGHT = 2i+2

Since we have already studied bitwise, we will use bitwise operators to represent these..

PARENT = i-1 >> 1 && LEFT = (i << 1) + 1 && RIGHT = (i << 1) + 2

The bit shift operations sometimes reduce the CPU instructions but not for the case when index is zero. :) for ex: for LEFT(i), mul and add is required.

int A[] = {15,13,9,5,12,8,7,4,0,6,2,1};

                   15                -----> 0

          13                 9      ------> 1 , 2 (2i+1, 2i+2)

    12         8         7       4    -----> 3, 4   5,6

0 6 2 1 -----> 7,8 9, 10

Concept

The various operations in a priority queue ADT.

HEAP-INSERT – it behaves similar to build max heap!! insert elements

HEAP-EXTRACT-MAX – remove the maximum element (the root) and max heapify!!

HEAP-MAXIMUM – return the top element of the binary heap – A[0]

HEAP-INCREASE-KEY – Change the key in a particular index and again make it a max heap

The heap sort can be implemented using the procedures given by the algorithm.

Code

#define PARENT(i)  (i-1)>>1 const int PQSIZE = 15; class PriorityQueue{    int arr[PQSIZE];    int idx; public:    PriorityQueue()    {        idx=0;    }     bool insert(int val)    {        if(idx > PQSIZE)            return false;         arr[idx] = INT_MIN;        change(idx, val);        idx++;        return true;     }     int maximum() const { return arr[0]; }     bool remove(int& val)    {        if(idx <= 0)            return false;         val = arr[0];        arr[0] = arr[--idx];        arr[idx] = INT_MIN;        max_heapify(arr, 0, idx);        return true;    }     bool change(int i, int val)    {        if(val < arr[i])            return false;                arr[i] = val;        // the below loop never runs if the input        // array is already in maxheap format!!        while(i>0 && arr[PARENT(i)] < arr[i]) {            int tmp=arr[PARENT(i)];            arr[PARENT(i)] = arr[i];            arr[i] = tmp;            i = PARENT(i);        }        return true;    }}; void main(){    PriorityQueue pq;    int A[] = {4,0,13,15,8,7,12,6,2,1,9,5};     for(int i=0; i<12; i++)    {        pq.insert(A[i]);    }    int maxval = pq.maximum();    int remval = 0;    for(int i=0; i<15; i++)    {        pq.remove(remval);    }}

Important points

remove, change, insert all operates over array with index 0
every remove does max heapify to rectify the correctness of the array
similar to max heapify, insert also does a correctness of the array!!

Sunday, 23 January 2011

Build a max heap with input array

“As we studied tree in brief, and understood an important concept that, every node is a root for some tree except the leaves. With this as the basics, we should build a heap. The important property of a heap is that mostly, the last elements of the array will become the leaves, that too it always in range [n/2]+1, [n/2]+2,…n”

http://analgorithmaday.blogspot.com/2011/01/make-partial-max-heap-perfect-max-heap.html

Metaphor

How would a forest tree grow? Bottom up right? :) This is not a chicken and egg problem btw. :P

But top to bottom model comes into picture only for a family hierarchy or a majestic empire cabinet. But it shows as clearly that, there are two ways a tree kind of structure can grow, top to bottom (impossible for a forest tree) and bottom to up. Lets take the example of bottom to up example as it matches with what we are seeing now.

For growing a tree, you start with the seeds. After the seeds are planted, roots grow first and make the tree grow upwards with leaves. But just understand the basic model of bottom up here. roots grow first and then the whole tree.

Concept

The bottom up tree growth from a seed is interesting!! But note that we start with the root. But a computer tree, has leaves at the bottom and roots at the top!! This never matches with a normal forest tree!!.. Bad design :D

But the methodology in which a computer tree is constructed follows the same historical ways: bottom up, top down. For a heap, we need to construct a tree bottom up like how a natural forest tree grows. But you might shout out like leaves should be at the bottom!! where would you fit the leaves?

We treat all the leaves as root in any bottom up traversal approach in computer algorithms. Since we grow from the bottom, we should get the leaves of a computer tree as input!!.. Since we only know the leaves, we should treat them as initial root obviously. The next input that we get might be assumed as the root of some of these leaves and the algorithm goes until you are left with only one root to process.

This is the approach the build max heap algorithm follows. For understanding this fully, you should understand how a normal tree is converted to a max heap tree. [http://analgorithmaday.blogspot.com/2011/01/make-partial-max-heap-perfect-max-heap.html]

The basic input to the max heapify algorithm is: the root of a tree. The tree can be already following or not following the max heap principle (parent > children). But max heapify traverses the full tree until it reaches a small tree in which parent > children. This basic principle can be used to build the heap.

The interesting point in a heap is, its leaves will be always [n/2]+1, [n/2]+2,..n. For example, if size of the heap is n=10, then 6,7,8,9,10 are the leaves. If these indices are leaves, the next level indices which will all be roots which means [n/2]…1 are the remaining nodes which roots.

Code

   1:  #define LEFT(x) 2*x+1 // since we start with index 0

   2:  #define RIGHT(x) 2*x+2

   3:  #define heap_size(x) (sizeof(x)/sizeof(x[0])) //similar to ARRAYSIZE macro

4:

   5:  template <typename T, size_t size>

   6:  void build_max_heap(T (&arr)[size])

   7:  {

   8:      for(int i=(int) floor((float)heap_size(arr)/(float)2);i>=0; i--)

   9:      {

  10:          // if size = 10, then,

  11:          // initial max_heapify call will be like, max_heapify(arr, 5)

  12:          // But note that, we run until 0 and zero is root for this code

  13:          max_heapify(arr, i);

  14:      }

  15:  }

16:

  17:  template <typename T, size_t size>

  18:  void max_heapify(T (&arr)[size], int idx)

  19:  {

  20:      int l = LEFT(idx);

  21:      int r = RIGHT(idx);

  22:      int largest;

  23:      if( l <= heap_size(arr) && arr[l] > arr[idx])

  24:          largest = l;

  25:      else

  26:          largest = idx;

27:

  28:      if(r <= heap_size(arr) && arr[r] > arr[largest])

  29:          largest = r;

30:

  31:      if(largest != idx) {

  32:          int temp = arr[idx];

  33:          arr[idx] = arr[largest];

  34:          arr[largest] = temp;

  35:          max_heapify(arr, largest);

  36:      }

37:

  38:  }

39:

  40:  void main()

  41:  {

  42:      int A[]={5,3,17,10,84,19,6,22,9};

  43:      build_max_heap(A);

  44:  }

Important observations while coding

As we start from zero, in the actual algorithm and explanation in Corman-Lewis, assume –1 for all the representation.
After the execution of the code, you always get the max element at the index 0.
The time taken to build a max heap is explained in book. As you can see a recursion and a tree involved, you would normally assume O(log n). Along with it, we have a loop running for n/2. So we would usually assume O(n log n).
But the above time is not so tight and its too much upper bound to assume. This heap building algorithm comes down to O(n) at perfect worst case.
Note that we never get, array out-of-bounds, as we verify LEFT(i) and RIGHT(i) with the array total size. This is an important step to never miss!!!
This is a recursion call from inside a loop. That’s why we assume O(n log n). But if we calculate correctly, this is the linear time algorithm and best algorithm for organizing large data sets.
This algorithm is the basis for heap sort algorithm. Don’t dive directly into heap sort without understanding max heapify and build max heap algorithms.

Thursday, 20 January 2011

Make a partial max heap a perfect max heap.

Maintain the property that Parent > Children on every tree on the heap.

Introduction

When you include some more elements in the array to the max heap, you need to still satisfy the property that the parent > children. To get this property true, we need to change the root of the tree or swap some elements(if its array) from parent to children. To simplify this operation, we use a simple induction method which is the reverse of the condition.

Consider, parent > children is violated at index i. Also, assume that, the LEFT(i) and RIGHT(i) trees have not violated this condition yet. This is the initial condition using which we swap elements to make a perfect max heap.

More concepts

Consider we have the following array which is partially a max heap, (Courtesy: Corman-Lewis)

Index:	1	2	3	4	5	6	7	8	9	10
Value:	16	4	10	14	7	9	3	2	8	1

You might note something interesting comes out from the above data set. If you see, 16->10->9->3 is sorted. i.e, the RIGHT(root) tree is sorted but not the LEFT(root) tree.

/ \

4 10

/ \ / \

14 7 9 3

/ \ /

2 8 1

This means, we need to convert the left sub-tree to match the max heap property. That, parent > child while already RIGHT(root) tree is following that.

Code

Calculate the left and right index for the current root
check whether the value at the left is greater than the root
- If yes, this is the largest
- no, then current root is the largest
Compare the largest element between LEFT(i) and i with RIGHT(i), if it becomes the largest, then largest = RIGHT index
If both are not true, then return the algorithm.. This is a breaking condition.
If left or right becomes the largest, continue the recursion in the direction of the largest, until left or right doesn't exceed the heap size. This is one more return condition

   1:  #define LEFT(x) 2*x+1 // since we start with index 0

   2:  #define RIGHT(x) 2*x+2

   3:  #define heap_size(x) (sizeof(x)/sizeof(x[0])) //similar to ARRAYSIZE macro

4:

   5:  template <typename T, size_t size>

   6:  void max_heapify(T (&arr)[size], int idx)

   7:  {

   8:      int l = LEFT(idx);

   9:      int r = RIGHT(idx);

  10:      int largest;

  11:      if( l <= heap_size(arr) && arr[l] > arr[idx])

  12:          largest = l;

  13:      else

  14:          largest = idx;

15:

  16:      if(r <= heap_size(arr) && arr[r] > arr[largest])

  17:          largest = r;

18:

  19:      if(largest != idx) {

  20:          int temp = arr[idx];

  21:          arr[idx] = arr[largest];

  22:          arr[largest] = temp;

  23:          max_heapify(arr, largest);

  24:      }

25:

  26:  }

27:

  28:  void main()

  29:  {

  30:      int A[]={16,4,10,14,7,9,3,2,8,1};

  31:      int val = heap_size(A);

  32:      max_heapify(A, 1);

33:

  34:  }

Problems found while coding:

As we always consider start index as 0, LEFT(i) will equal 2i+1 and RIGHT(i) will equal 2i+2
Interesting problem: in C/C++, array is passed by value by default which means you just get a pointer inside the function, not the full array. So, you cannot use sizeof(array) to get correct size of the array
To complicate this more, in C++, we have possibility of passing multiple references if we know the size of the array. So, we make use of templates to make it generic. (not advised for large scale use) Refer: http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=241
Read more about pointers and references here: http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=166
I will cover a extensive topic about pointers and references soon.

Monday, 14 February 2011

Priority Queue using binary heap

Sunday, 23 January 2011

Build a max heap with input array

Thursday, 20 January 2011

Make a partial max heap a perfect max heap.

About the Author