Monday 7 March 2011

Sorting in linear time

“Not so important but some ask!!.. You have a integer array for which you know that all the elements are less than some value… How efficiently you can sort such an array? You would tell merge sort, heap sort, quick sort.. But all will yield only O(n log n) max. That’s where different methodologies for sorting comes up”

Introduction

   These are quite old ways of sorting but useful when you want to do sorting in linear time without affecting the stability. There is new concept comes up here. Stability. Stability is nothing but maintaining the order of the repeated elements even in the sorted array. That is, is {2,2} which comes part of unsorted array in order when sorted, you should get the first ‘2’ first and the second following it (in the same order as unsorted array).

So, How stable is our sort we studied till now?

Say, you have {4,2,0,1,1,9}

In insertion sort, you will {1,1} in the sorted sequence in-order. So, its stable.

In merge sort, you create two lists and merge, i.e, if the merge happens between {2,0,1} and {1,9}, you still get {0,1,1,2,9} with {1,1} in same order. This is true in case of using extra space. But if there is no extra space, you would disobey the order since you need to swap or link.

In quick sort or heap sort, you do all swaps in-place. You even swap elements like {1,1} which changes its order!!.. So, they are not stable!!..

Most of the comparison sorts in which swapping is involved without considering the order, is a non-stable sort.

Till now, what we saw are all comparison sorts

Concept

   There is a new learning we got from comparison sorts and its performance. We will revise the performance of all the sorts we learnt till now.

Merge sort

Worst case – O(n log n)

Average case – Theta(n log n)

Best case – Gamma(n log n)

Insertion sort

Worst case – O(n^2), Average and Best might reduce the swaps required, but if its in-place, you always get O(n^2) with lesser constant time

Heap sort

Same as merge sort. max heapify takes O(n) and recursion pulls in O(log n) to get O(n log n)

Quick sort

Worst case – O(n^2) if the array is reverse sorted.

Best and Average – O(n log n) with very less constant time.

All the above sorts work with a basic property for sorting, comparisons. Comparison and swap are the costliest operations on all the above sorts which never makes to yield lesser than O(n log n) time.

In case of heap sort, max heapify requires only O(n). But O(log n) extra is required to up-bubble or down-bubble elements since we are based upon comparisons.

If we ask, can we ever sort in time less than O(n log n)? Then its a big yes!! but not using comparison as its base.

Conclusion

   In coming articles, we will start reading a completely different perspective of sorts which is used only for special cases and are not general purpose sorts. But sorting can be done in linear time with extra space when we are not using comparison sorts. Since most of these sorting is possible only on integers, they even call these sorting category as integer sorting methods.

No comments:

Post a Comment