- logb(xy) = logbx + logby.
- logb(x/y) = logbx - logby.
- logb(xn) = n logbx.
- logbx = logax / logab.
Tuesday, 21 August 2012
Understanding Logarithms
Monday, 9 May 2011
Count duplicates in an Integer array–Google Phone Interview Question
Question
This is one of the question asked to me on a phone interview with Google. Funny that it takes lots of time to think and the interviewer even after he gave ample time I just slept down as its 12:00 midnight. :)
This is the question: http://www.careercup.com/question?id=3483977
Find the repeated numbers in a sorted array using less than O(n) time complexity
So, if you have, A[] = {1,1,1,2,2,2,3,3,4}
you should print, 1=>3, 2=>3, 3=>2, 4=>1
Approach
The approach to this problem in O(n) way would be to just keep counting from 0 to n in the array. But the interviewer needs lesser time than that.
I came up with a approach but cannot make that into a working code:
The approach is, since the array is sorted, you can apply a modified version of binary traversal over the array and count the number of entries..
For example,
1 1 1 2 2 2 3 3 4
start mid end
If arr[start] == arr[end] then its simply means arr[end] symbol count is end-start+1
If not, keep splitting the array and compare arr[start] and arr[mid], if they are equal arr[mid] count is mid-start+1
similarly, its same for arr[mid+1] and arr[end]
But the important end condition to see here is, set of 3 numbers..
When the number of elements is 3, i.e, end-start = 2. you get following combinations
start mid end
3, 3, 3 = All same, so our start –> end takes care of this
2, 3, 3 = mid and end are different!! a shift!!
2, 2, 3 = start and mid are different!! a shift
3, 4, 5 = start, mid, end all are different!!
All the above cases should be handled perfectly to get the correct working code.
Code
map<int, int> blist;
void count_dup(int* arr, int start, int end)
{
if(end == 0)
blist[arr[start]]+=1;
if(start<end) {
if(arr[start] == arr[end]) {
blist[arr[end]] += end - start + 1;
return;
}
int mid = abs((start+end)/2);
if(end - start == 2 && arr[start] < arr[mid] && arr[mid] < arr[end]) {
blist[arr[start]] +=1;
blist[arr[mid]] +=1;
blist[arr[end]] +=1;
return;
}
if(arr[start] != arr[mid] && start == mid) {
blist[arr[start]]+=1;
blist[arr[mid]] += 1;
return;
}
if(arr[start] < arr[mid]) {
count_dup(arr, start, mid);
}
else {
blist[arr[mid]] += mid - start + 1;
}
if(arr[mid+1] != arr[end] && mid+1 == end) {
blist[arr[mid+1]]+=1;
blist[arr[end]] += 1;
return;
}
if(arr[mid+1] < arr[end]) {
count_dup(arr, mid+1, end);
}
else {
blist[arr[end]] += end - mid;
}
}
}
void main()
{
int A[] = {1,2,2,3,3,3,4,4,4,5,5,5,5,6,6,6,6,7,7,8,9};
count_dup(A, 0,20);
}
- Note that we only do recursive call if start, mid or mid+1, end is lesser.. That’s the way we do a binary traversal
- The last case of all 3 different should be handled first itself. In this case, we add to each start, mid, end +1
- In cases when, start-end = 1 you either get, start=mid or mid+1=end, these are places where we need to check the other 2 cases.. mid+1&end, start&mid
- Note that, since we only do, mid+1 and end.. mid & mid+1 get missed. This case is taken care by else part of the two recursions.
- Worst case!!.. If all the elements in the array are different, it visits each and every node. Omega(n)
- Best case!!. If all the elements in the array is the same, O(1). Yes, we just use index to calculate the count.
- Average case!!.. It would be Theta(log n) depending on the input data
The perfect easy way to code is here: http://codepad.org/p0guN137.
The approach is to form a binary search tree from the sorted array and while forming that, use the hash map to track the duplicates count. But it is also O(n) since we need to traverse the whole input set to form the tree. We can ignore the recursion when it goes around for duplicate items to get around O(log n). But using this way, it would convince the interviewer a little bit and also as we can ignore recursion in-between it reduces the time to less than O(n).
Wednesday, 23 February 2011
Quick sort
“Beauty of quick sort is, its so quick!!.. unlike other sorting algorithms based on divide and conquer, where divide, conquer, combine all are independent steps. In quick sort, divide and combine are done in one-shot. i.e, just divide actually sorts the whole array!!”
See Also
http://analgorithmaday.blogspot.com/2011/02/partitioning-algorithmin-place.html
Introduction
Quick sort is the mostly discussed and used algorithm in any code that you have written. That’s why its the simplest interview question to be asked!! :). If even this cannot be answered, then you will be thrown out. The first important step is partitioning. Please study the above link fully!!
Multiple partitioning is what quick sort does. But while doing multiple partitioning, it reduces the input set given for partitioning. So, partially sorted array at both the side of a partition, again gets partitioned further.
Concept
The important concept to be understood in a quick sort is that, we take each and every element in the array as pivot to get the final sorted array. It’s not possible to take one element as pivot and sort the whole array!!.. As a dummy, i too thought like that, but its not the case.
Take for example,
13 4 5 6 11 10 2
Partition=1: pivot=2,
13 4 5 6 11 10 (2)
Partition=2: pivot = 10
4 5 6 (10) 11 13
Partition=3: left, pivot=6, right pivot = 13
4 5 (6) 11 (13)
Partition=4: left pivot=5, right pivot=11
4 (5) (11)
Ignore the pivots in each partitions.. to decide the next pivot. This is also the main concept of quick sort!!
The tree representation,
13 4 5 6 11 10 (2)
/ \
4 5 ( 6) (10) 11 (13)
……….. Check some reference books to get a perfect quick sort tree..!! :)
So, final sorted list is, 2 4 5 6 10 11 13.
But note that, we don’t partition single element arrays.. :) That is base condition for quick sort. And also you could see that, the pivot rearranges the array, so if you consider a tree, it will be like, go from top to bottom to partition, to get all sorted, go from bottom to top!!..But pivots should occur in order, they are in the initial tree shown above!! But note that, all elements become a pivot!!
Code
void swap(int& a, int& b)
{
register int tmp;
tmp = a;
a=b;
b=tmp;
}
int partition(int* arr, int start, int end)
{
/*
1. decide a pivot element, let it be arr[end]
2. categorize arr the remainting start -> end-1
3. consider 2 indeces = i,j: initially i=start-1, j=start
4. make sure that i+1 to j-1 always contains element > pivot
5. the above means, i marks end of smaller elements array
6. j marks the end of larger elements array
*/
/*
Example:
start=0 end=12
i=-1 j=0
3 6 2 1 9 4 12 19 13 10 8 5 11
After you do this partition,
start i j end
3 6 2 1 9 4 10 8 5 [11] 12 19 13
The important point here is, after 4, we see 8 which is < 11.
So, it needs to be inserted.
Because of the above point we need 2 indeces.
*/
int i,j, pivot;
pivot = arr[end];
i=start-1;
for(j=start; j<=end-1; j++)
{
if(arr[j] <= pivot) {
i=i+1;
swap(arr[i], arr[j]);
}
}
swap(arr[i+1], arr[end]);
return i+1;
}
void quick_sort(int* arr, int start, int end)
{
if(start < end) {
int mid = partition(arr, start, end);
quick_sort(arr, start, mid-1);
quick_sort(arr, mid+1, end);
}
}
void main()
{
int A[]={3, 6, 2, 1, 9, 4, 12, 19, 13, 10, 8, 5, 11};
int B[]={6,10,2};
//sorting 3 elements
quick_sort(B, 0, 2);
//sorting more elements
quick_sort(A, 0, 12);
}
Important notes
- quick sort has the perfect predictable average case of theta(n log n).
- quick sort also has Theta(n2) worst case. Merge or heap sort is the best algorithm since even worst case is O(n log n). But, merge or heap sort has space complexity, bad average case and large constant time!!
- quick sort performs in its worst case when the array is sorted!! its because, all the splits will be bad!!.. always you will have the pivot at the end!!..
- If the pivot is at the end, even after the split, its a bad split!!.. good split means, you get the pivot at the middle after a partition
- To do quick sort in descending order, just change the <= to >= in partition code!!