Showing posts with label partition. Show all posts

Monday, 30 May 2011

Alternative partition method

 

See also

http://analgorithmaday.blogspot.com/2011/02/partitioning-algorithmin-place.html

Introduction

  The quick sort partition algorithm is sometimes taught differently. :) So I thought of covering that too. There is no difference in performance or space usage; the only advantage is that this method is a little more straightforward.

Concept

  Instead of taking the pivot and advancing two indices i and j that both start from the left, we can maintain left & right, with left starting from the start and right starting from the end.

  As we know, the left side needs to hold the smaller items and the right side the greater items. While an item is already on the appropriate side, we can simply skip it. This is a little clearer, since the skipping is not so obvious in the old partition approach.

  After we skip the items that are already in an “ok” position, left and right point at a swappable pair. We swap and continue until left and right cross.

  Once all the swaps are done, the left & right portions hold the smaller & greater items respectively. We can then simply swap the pivot into the location where right stopped.

Code

int qpartition( int *a, int low, int high )
{
    int left, right;
    int pivot_item;
    pivot_item = a[low];
    left = low;
    right = high;
    while ( left < right ) {
        /* Move left while item <= pivot; stop at high so we never run past the array */
        while( left <= high && a[left] <= pivot_item ) left++;
        /* Move right while item > pivot */
        while( a[right] > pivot_item ) right--;
        if ( left < right ) swap(a[left],a[right]);
    }
    /* right is the final position for the pivot */
    a[low] = a[right];
    a[right] = pivot_item;
    return right;
}
 
int main()
{
    int A[] = {9,5,7,1,10,2,3};
    int loc = qpartition(A, 0, 6); /* pivot 9 ends up at index loc */
    return 0;
}
  • I found this approach a little confusing.. :) If you are not used to juggling multiple indices, stick with the old approach
  • The above code is adapted from: http://www.cs.auckland.ac.nz/~jmor159/PLDS210/qsort1a.html
  • Even though a[low] is the pivot, we include it in the comparison as well. This is another difference in this algorithm.

Friday, 27 May 2011

Randomized Quick sort

“Randomized quick sort uses the random partition method we discussed: we just select a random pivot in the array. The advantage is that we make the worst case of quick sort very unlikely!! But this improvement holds only in expectation!!”

Introduction

   As we have already seen a lot about this, we can jump directly into randomized quick sort.

Concept

  Just select a random pivot element in the partition algorithm.

Code

 
// swap(int&, int&) is the reference-based swap defined in the partitioning article
int random_partition(int* arr, int start, int end)
{
    // seed the generator once in main, not here: calling srand() on every
    // recursive call restarts the rand() sequence and breaks the randomness
    int pivotIdx = start + rand() % (end-start+1);
    int pivot = arr[pivotIdx];
    swap(arr[pivotIdx], arr[end]); // move pivot element to the end
    pivotIdx = end;
    int i = start - 1;

    for(int j=start; j<=end-1; j++)
    {
        if(arr[j] <= pivot)
        {
            i = i+1;
            swap(arr[i], arr[j]);
        }
    }

    swap(arr[i+1], arr[pivotIdx]);
    return i+1;
}
 
void random_quick_sort(int* arr, int start, int end)
{
    if(start < end) {
        int mid = random_partition(arr, start, end);
        random_quick_sort(arr, start, mid-1);
        random_quick_sort(arr, mid+1, end);
    }
}
int main()
{
    srand(time(NULL)); // seed once per program run
    int A[] = {2,5,7,1,10,8,9};
    random_quick_sort(A, 0, 6);
    return 0;
}
  • The pivot index must be start + rand() % (end-start+1). The recursion works with different start and end values each time.. :) so be careful about the formula for pivotIdx
  • We add start to rand() % (end-start+1) so that the index always falls inside the current sub-array
  • Recursing on start->mid-1 and mid+1->end is enough for quick sort.. NOTE: every partition puts its pivot element in its final sorted position
  • There is an alternate way of partitioning given at many other places.. Not a CLRS way!! :) We will see that as well in the next article..!! :D

Thursday, 26 May 2011

Random partition & its use-cases

“We studied a lot of theory about random numbers and how they become nearly random. There are mathematical proofs that randomized algorithms give good average performance. We are going to see randomized partitioning, which is the basis for a series of algorithms in sorting & selection”

See also

http://analgorithmaday.blogspot.com/2011/05/concept-behind-order-statistics_09.html

http://analgorithmaday.blogspot.com/2011/02/partitioning-algorithmin-place.html

http://analgorithmaday.blogspot.com/2011/05/relation-between-binary-search-tree-and.html

Introduction

   If you get a question like, “Find the k-th smallest element in an array”, what are the various methods you could follow?

  • sort the array with a comparison sort in at most O(n log n) and read the indexed element from the array in O(1)
  • sort the array using a linear-time sorting technique in O(n) with extra space, then read the indexed element from the array (special case)

Either way you resort to sorting to find the k-th smallest element, and the second, linear-time option needs extra space.

What method can you use other than all this? Partitioning. When we partition the array around a pivot, the important outcome is that the pivot element lands in exactly the position it would occupy in the sorted array. This happens for every pivot selection!! :)

  This behaviour is really wonderful and can be used to get the k-th smallest element as well. But note that a single partition will generally not place the pivot at a given position k. We need to do a few iterations with randomly selected pivots until a pivot lands in our required position k. Doing only a few iterations to reach k is what really matters here; otherwise we are effectively sorting again. :)

Concept

   Selecting random pivots? What does that mean? We have learned the partition algorithm with a fixed pivot (the end element). How do we randomize it? And how can we guarantee that we still get the k-th smallest element in O(n)? Moreover, if the final pivot position equals k on the first attempt, we are very lucky!! :) definitely O(n).

   There are many tricks built on random partitioning of an array; it is even used in randomly balanced binary trees. Remember the relation between quick sort & binary search trees? :)

    So let's first see how random partitioning works. It is as simple as generating a random pivot index and moving that element to the end. Then we apply the normal partition algorithm we already had!!

    But what you get out of such a partition is wonderful!!.. if you are lucky:

  • you get the k-th smallest or largest element
  • you get a perfectly balanced tree
  • you can avoid the worst case O(n^2) of quick sort even if the array is sorted..

Mostly it depends on the luck factor!! :) That's the reason why we use expectations to analyse these algorithms; partitioning is the basic example for explaining expectation-based asymptotic analysis. We avoid the extra log n factor of sorting: we can select the k-th element without fully sorting.

Code

// swap(int&, int&) is the reference-based swap defined in the partitioning article
int random_partition(int* arr, int size)
{
    int pivotIdx = rand() % size; // seed the generator once in main, not here
    int pivot = arr[pivotIdx];
    int i=-1;
    int j=0;
    swap(arr[pivotIdx], arr[size-1]); // move pivot element to the end
    pivotIdx = size-1;

    while(j < size)
    {
        if(arr[j] <= pivot)
        {
            i = i+1;
            swap(arr[i], arr[j]);
        }
        j++;
    }

    swap(arr[i+1], arr[pivotIdx]);
    return i+1;
}
 
int main()
{
    srand(time(NULL)); // seed once per program run
    int A[] = {2,5,7,1,10,8,9};
    int loc = random_partition(A, 7);
    return 0;
}
  • the loc you get is nothing but a k: it gives a random k every time you execute the algorithm.
  • Usually, applications need a specific k.. :) that's where the randomized selection algorithm comes into the picture.

Tuesday, 22 February 2011

Partitioning Algorithm–In-place

“Think about a problem in which you are given a value. You need to move all the elements less than that value to the left and all the greater elements to the right. And most importantly, without using extra space. What would you do? How would you treat the same array as 2 arrays? The same way we represent a binary heap in an array, we do this as well!!”

Introduction

   Partitioning an array is useful for the next sorting algorithm we discuss, which is quick sort. Why it is quick is down to some important properties of partitioning. The partition step provides both the divide and the combine for the algorithm; it is part of the divide and conquer method, similar to the merge step discussed for merge sort.

How does the division happen?

Divide: Divide an array A[start…end] into two sub-arrays, A[start…minindex] and A[maxindex…end-1]. A[start…minindex] will contain values < A[end] and A[maxindex…end-1] will contain values > A[end], where A[end] is the pivot.

This divide step is somewhat similar to binary search, where we split the problem set by > or <. In the same way, we divide the data set into two parts around the pivot:

         =
       /   \
      <     >

Concept

  What is the use of this division? And how does partition contribute the combine step equally? If you note, in merge sort the divide is actually done by the merge sort algorithm itself; only the combine is done by merge. That's why we make the merge sort recursive calls first, and then finally do a merge. Recollect the merge algorithm!!

    mid = (start+end)/2

     MERGE-SORT(arr,start,mid)

     MERGE-SORT(arr,mid+1, end)

     MERGE(arr,start,mid,end);

But for quick sort, if you note, the divide & combine are both intrinsic to the partition algorithm. So first we partition, and then the quick sort recursion loops through. The lingo here is:

Merge sort: First divide and then merge

Quick sort: First partition and then divide further

Now you get why we discussed tree traversal along with this.. :) Whatever we want to do first always goes above the recursive calls.

How does a partition take care of the combine as well?

   It always arranges the array into left and right parts with the property explained above. So if we select every element as a pivot in a 3-element array, it gets sorted, since each partition leaves the array partially sorted around its pivot. Partition therefore doesn't require a separate combine step.

Code

void swap(int& a, int& b)
{
    register int tmp;
    tmp = a;
    a=b;
    b=tmp;
}
 
int partition(int* arr, int start, int end)
{
    /* 
      1. decide a pivot element, let it be arr[end]
      2. categorize the remaining arr[start] -> arr[end-1]
      3. consider 2 indices i,j: initially i=start-1, j=start
      4. make sure that i+1 to j-1 always contains elements > pivot
      5. the above means, i marks the end of the smaller-elements array
      6. j marks the end of the larger-elements array
     */
 
    /*
    Example:
      start=0                                           end=12
  i=-1   j=0
        3    6    2    1    9    4    12    19    13    10    8    5    11

        After the partition (pivot = 11):
        start                                i    pivot            end
        3    6    2    1    9    4    10    8    5    [11]    19    13    12
    The important point here is: after 4, we see 10, 8 and 5, which are < 11
    but sit beyond larger elements, so they need to be moved back.
    Because of that, we need 2 indices.
    */
 
    int i,j, pivot;
    pivot = arr[end];
    i=start-1;
    for(j=start; j<=end-1; j++)
    {
        if(arr[j] <= pivot) {
            i=i+1;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i+1], arr[end]);
    return i+1;
}
 
 
 
int main()
{
    int A[]={3,6,2,1,9,4,12,19,13,10,8,5,11};
    int B[]={6,10,2};
    partition(A,0,12);

    //sorting 3 elements by partitioning around each of them
    int div = partition(B,0,2);
    //small list = 0 -> div-1
    //large list = div+1 -> 2
    int l1 = partition(B,div+1,2);
    int l2 = partition(B,l1+1,2);
    return 0;
}

Important points

  • partition just divides an array based on a pivot!! and returns the pivot's final index. That index can be start or end as well!!.. since it is always possible that none of the elements is greater (or none is lesser) than the pivot
  • pivot selection is the most important concept in the partition algorithm, since the choice of pivot decides how balanced the split is.
  • There is no guarantee that the array gets sorted by a minimal number of partitions without partitioning around each and every element. You need to visit all the elements in the array and partition around each.
  • partition takes O(1) time when start and end are equal.
  • a single partition takes Theta(n) in the worst case, and since swaps are sometimes skipped, the constant factor can be smaller!! That's why we say Theta(n) rather than just O(n)
  • Doing multiple partitions in the worst case leads to Theta(n^2), since we end up treating every element as a pivot. That's why quicksort's worst case is Theta(n^2). Quick sort is not so quick as you hear!! :D
  • How can multiple partitions lead to Theta(n log n)? Balanced splits are the main factor here!! each level of partitions over the shrinking sub-arrays still costs Theta(n), but with balanced splits there are only about log n levels.
  • There is a lot of mathematics studied with such recursive algorithms like quick sort and merge sort!!.. We will learn all those in detail in future