Tuesday, 22 February 2011

Partitioning Algorithm–In-place

“Think about a problem in which you are given a value. You need to use that value to move all the smaller elements to the left and all the greater elements to the right, and, most importantly, without using extra space. What would you do? How would you treat the same array as two arrays? Just as we represent a binary heap inside a single array, we do the same thing here!!”

Introduction

   Partitioning an array is useful for the next sorting algorithm we discuss, which is quick sort. Why it is quick is precisely because of some important properties of partitioning. The partition step supplies both the divide and the combine for the algorithm. It is part of the divide and conquer method, playing a role similar to the merge step discussed for merge sort.

How does the division happen?

Divide: Divide an array A[start…end] into two sub-arrays around a pivot index p: A[start…p-1], which contains values <= A[end] (the pivot), and A[p+1…end], which contains values > the pivot. The pivot itself lands at index p, its final sorted position.

This divide step is somewhat similar to binary search, where we split the problem set by > or <. In the same way, we split the data set into two parts:

            =
          /   \
        <       >

Concept

  What is the use of this division? And how does partition contribute the combine step as well? Note that in merge sort, the divide is actually done by the merge sort procedure itself; only the combine is done by merge. That is why merge sort recurses first, and then finally we do a merge. Recollect the algorithm of merge sort!!

    mid = (start + end)/2

     MERGE-SORT(arr, start, mid)

     MERGE-SORT(arr, mid+1, end)

     MERGE(arr, start, mid, end)
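As a sketch of the pseudocode above, here is one possible C++ version (the temporary buffer in merge is my own choice of implementation; the article's pseudocode leaves it unspecified):

```cpp
#include <vector>

// Merge the two already-sorted halves arr[start..mid] and arr[mid+1..end].
// The temporary buffer is the "extra space" merge sort pays for.
void merge(int* arr, int start, int mid, int end)
{
    std::vector<int> tmp;
    int i = start, j = mid + 1;
    while (i <= mid && j <= end)
        tmp.push_back(arr[i] <= arr[j] ? arr[i++] : arr[j++]);
    while (i <= mid) tmp.push_back(arr[i++]);
    while (j <= end) tmp.push_back(arr[j++]);
    for (int k = start; k <= end; k++)
        arr[k] = tmp[k - start];
}

// Recurse first, merge last: the combine step sits BELOW the recursive calls.
void merge_sort(int* arr, int start, int end)
{
    if (start >= end) return;
    int mid = start + (end - start) / 2;  // same as (start+end)/2, overflow-safe
    merge_sort(arr, start, mid);
    merge_sort(arr, mid + 1, end);
    merge(arr, start, mid, end);
}
```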

But for quick sort, note that the divide and the combine are both intrinsic to the partition algorithm. So first we do the partition, and then the quick sort recursion loops through the two halves. The lingo here:

Merge sort: First divide and then merge

Quick sort: First partition and then recurse on each half

Now you get why we discussed tree traversal along with this.. :) Whatever we want to do first always goes above the recursive calls.

How does a partition take care of the combine as well?

   It always rearranges the array so that the left and right sides have the property explained above, with the pivot sitting in its final sorted position. So if, over the course of the recursion, every element in a 3-element array gets its turn as the pivot, the array ends up sorted; each partition leaves the array partially sorted around its pivot. That is why partition doesn’t require a separate combine step.

Code

void swap(int& a, int& b)
{
    int tmp = a;
    a = b;
    b = tmp;
}
 
int partition(int* arr, int start, int end)
{
    /*
      1. decide a pivot element; let it be arr[end]
      2. categorize the remaining elements, start -> end-1
      3. keep 2 indices i, j: initially i = start-1, j = start
      4. invariant: arr[i+1 .. j-1] always contains elements > pivot
      5. the above means i marks the end of the smaller-elements region
      6. and j marks the end of the larger-elements region
     */
 
    /*
    Example:
      start=0                                                end=12
  i=-1   j=0
        3    6    2    1    9    4    12    19    13    10    8    5    11

        After the partition (pivot = 11), the pivot lands at index i+1 = 9:
        start                                                          end
        3    6    2    1    9    4    10    8    5    [11]   19   13   12
    The important point here is: after 4 we meet 12, 19, 13 (> 11), but
    10, 8 and 5 (< 11) still follow, so each must be swapped back into
    the smaller region. That is why we need the 2 indices i and j.
    */
 
    int i,j, pivot;
    pivot = arr[end];
    i=start-1;
    for(j=start; j<=end-1; j++)
    {
        if(arr[j] <= pivot) {
            i=i+1;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i+1], arr[end]);
    return i+1;
}
 
 
 
int main()
{
    int A[] = {3, 6, 2, 1, 9, 4, 12, 19, 13, 10, 8, 5, 11};
    int B[] = {6, 10, 2};
    partition(A, 0, 12);

    // sorting 3 elements by partitioning around every element
    int div = partition(B, 0, 2);
    // small list = 0 -> div-1
    // large list = div+1 -> 2
    int l1 = partition(B, div + 1, 2);
    partition(B, l1 + 1, 2);
    return 0;
}
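The partition shown above plugs straight into quick sort. Here is a minimal sketch (the partition body is repeated, using std::swap, so the snippet compiles on its own):

```cpp
#include <utility>  // std::swap

// Lomuto partition, the same scheme as above: pivot = arr[end].
int partition(int* arr, int start, int end)
{
    int pivot = arr[end];
    int i = start - 1;
    for (int j = start; j <= end - 1; j++)
        if (arr[j] <= pivot)
            std::swap(arr[++i], arr[j]);
    std::swap(arr[i + 1], arr[end]);
    return i + 1;  // final resting place of the pivot
}

// Partition first, then recurse on the two halves.
// No combine step is needed: the pivot is already in place.
void quick_sort(int* arr, int start, int end)
{
    if (start >= end) return;
    int p = partition(arr, start, end);
    quick_sort(arr, start, p - 1);
    quick_sort(arr, p + 1, end);
}
```

Note there is no call after the two recursive calls: compare this with merge sort, where MERGE comes last.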

Important points

  • partition just divides an array around a pivot!! and returns the pivot’s index. That index can be start or end as well!!.. since there is always the possibility that none of the elements are greater (or none are smaller) than the pivot.
  • Pivot selection is a very important concept in the partition algorithm, since the choice of pivot decides how balanced the two halves are, and hence how close to sorted you get.
  • There is no guarantee that the array gets sorted after only a few partitions; in general every element has to take its turn as a pivot, and every element in the range must be examined.
  • partition takes O(1) time when start and end are equal (a one-element range).
  • A single partition always takes Theta(n): it must compare every element in the range against the pivot, even when few swaps happen. That is why we say Theta(n) rather than just O(n); the n is a lower bound as well as an upper bound.
  • Doing multiple partitions in the worst case leads to Theta(n^2), since every partition peels off only one element and we still need to treat all the elements as pivots. That is why quicksort’s worst case is Theta(n^2). Quick sort is not always so quick as you hear!! :D
  • How can multiple partitions lead to Theta(n log n)? Balance is the main factor!! When each partition splits its range roughly in half, the recursion is only about log n levels deep, and each level does Theta(n) total partitioning work.
  • There is a lot of mathematics behind such recursive algorithms like quick sort and merge sort!!.. We will learn all of that in detail in the future.
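To see the Theta(n^2) worst case concretely, here is a small instrumented sketch (the comparison counter is my own addition, not part of the article's code). On an already sorted array the last element is always the largest, so every partition is maximally unbalanced and the comparison count follows n(n-1)/2:

```cpp
#include <utility>  // std::swap

static long comparisons = 0;  // instrumentation: one count per arr[j] <= pivot test

// Same Lomuto partition as in the article, with a comparison counter added.
int counted_partition(int* arr, int start, int end)
{
    int pivot = arr[end];
    int i = start - 1;
    for (int j = start; j < end; j++) {
        comparisons++;
        if (arr[j] <= pivot)
            std::swap(arr[++i], arr[j]);
    }
    std::swap(arr[i + 1], arr[end]);
    return i + 1;
}

void counted_quick_sort(int* arr, int start, int end)
{
    if (start >= end) return;
    int p = counted_partition(arr, start, end);
    counted_quick_sort(arr, start, p - 1);
    counted_quick_sort(arr, p + 1, end);
}
```

For a sorted array of 8 elements this performs 7 + 6 + ... + 1 = 28 comparisons, i.e. 8*7/2, which is the quadratic pattern described above.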