An Algorithm a Day: google

Showing posts with label google. Show all posts

Monday, 9 May 2011

Count duplicates in an Integer array–Google Phone Interview Question

Question
This is one of the question asked to me on a phone interview with Google. Funny that it takes lots of time to think and the interviewer even after he gave ample time I just slept down as its 12:00 midnight. :)
This is the question: http://www.careercup.com/question?id=3483977
Find the repeated numbers in a sorted array using less than O(n) time complexity
So, if you have, A[] = {1,1,1,2,2,2,3,3,4}
you should print, 1=>3, 2=>3, 3=>2, 4=>1
Approach
The approach to this problem in O(n) way would be to just keep counting from 0 to n in the array. But the interviewer needs lesser time than that.
I came up with a approach but cannot make that into a working code:
The approach is, since the array is sorted, you can apply a modified version of binary traversal over the array and count the number of entries..
For example,
1 1   1 2   2   2   3   3   4
start        mid            end
If arr[start] == arr[end] then its simply means arr[end] symbol count is end-start+1
If not, keep splitting the array and compare arr[start] and arr[mid], if they are equal arr[mid] count is mid-start+1
similarly, its same for arr[mid+1] and arr[end]
But the important end condition to see here is, set of 3 numbers..
When the number of elements is 3, i.e, end-start = 2. you get following combinations
start     mid          end
3,       3,          3 = All same, so our start –> end takes care of this
2,          3,          3 = mid and end are different!! a shift!!
2,       2,              3 = start and mid are different!! a shift
3,       4,          5 = start, mid, end all are different!!
All the above cases should be handled perfectly to get the correct working code.
Code

 
map<int, int> blist;

void count_dup(int* arr, int start, int end)

{

    if(end == 0)

        blist[arr[start]]+=1;

    if(start<end) {

        if(arr[start] == arr[end]) {

            blist[arr[end]] += end - start + 1;

            return;

        }

        int mid = abs((start+end)/2);

        if(end - start == 2 && arr[start] < arr[mid] && arr[mid] < arr[end]) {

            blist[arr[start]] +=1;

            blist[arr[mid]] +=1;

            blist[arr[end]] +=1;

            return;

        }

        if(arr[start] != arr[mid] && start == mid) {

            blist[arr[start]]+=1;

            blist[arr[mid]] += 1;

            return;

        }

        if(arr[start] < arr[mid]) {

            count_dup(arr, start, mid);

        } 

        else {

            blist[arr[mid]] += mid - start + 1;

        }

        if(arr[mid+1] != arr[end] && mid+1 == end) {

            blist[arr[mid+1]]+=1;

            blist[arr[end]] += 1;

            return;

        }

        if(arr[mid+1] < arr[end]) {

            count_dup(arr, mid+1, end);

        }

        else {

            blist[arr[end]] += end - mid;

        }

    }

}

void main()

{

    int A[] = {1,2,2,3,3,3,4,4,4,5,5,5,5,6,6,6,6,7,7,8,9};

    count_dup(A, 0,20);

}

Note that we only do recursive call if start, mid or mid+1, end is lesser.. That’s the way we do a binary traversal
The last case of all 3 different should be handled first itself. In this case, we add to each start, mid, end +1
In cases when, start-end = 1 you either get, start=mid or mid+1=end, these are places where we need to check the other 2 cases.. mid+1&end, start&mid
Note that, since we only do, mid+1 and end.. mid & mid+1 get missed. This case is taken care by else part of the two recursions.
Worst case!!.. If all the elements in the array are different, it visits each and every node. Omega(n)
Best case!!. If all the elements in the array is the same, O(1). Yes, we just use index to calculate the count.
Average case!!.. It would be Theta(log n) depending on the input data

UPDATE:The above code is really brute force and not sure whether it works out for all cases and scenarios.

The perfect easy way to code is here: http://codepad.org/p0guN137.

The approach is to form a binary search tree from the sorted array and while forming that, use the hash map to track the duplicates count. But it is also O(n) since we need to traverse the whole input set to form the tree. We can ignore the recursion when it goes around for duplicate items to get around O(log n). But using this way, it would convince the interviewer a little bit and also as we can ignore recursion in-between it reduces the time to less than O(n).

Sunday, 1 May 2011

Find the k-th minimum in two sorted array

Question

You have two sorted array of sizes m and n. How will find the k-th smallest number when these two sorted arrays are merged?

Try to get logarithmic performing algorithm

Answer

I have not tried a logarithmic performing one. But a O(k) performance code is easy to do. Just take two indices in a merge algorithm and traverse it.

Code

int findKSmall(int* a, int m, int* b, int n, int k){    int i=0;    int j=0;    int elem;    while(k>0)    {        if(a[i] < b[j]) {            if(i>m)                continue;            elem = a[i];            i++;        }        else {            if(j > n)                continue;            elem = b[j];            j++;        }        k--;    }    return elem;}  void main(){    int a[] = {2,4,6,9};    int b[] = {8,10,14,21,32,44};    int val = findKSmall(a,4,b,6, 10);}

Friday, 29 April 2011

Traverse an array diagonally

Question

How would you traverse an array diagonally and display its contents ?

Say you have something like this,

1  2  3
4  5  6
7  8  9
In the above array, if you traverse from top left corner, then the list of number will be,

[1], [2,4], [3,5,7], [6,8], [9]

How would you get the above set?

Solution

Reference: http://stackoverflow.com/questions/1779199/traverse-matrix-in-diagonal-strips

     0     1     2

0   1     2     3

1    4    5     6

2    7    8     9

The list of indices for the required traversals are,

{[0,0]}  {[0,1][1,0]}  {[0,2][1,1][2,0]}  {[1,2][2,1]} {[2,2]}

If you can get the pattern, then you can do this easily. Total 5 slices will come which is array size N * 2-1. i.e, for 3x3 = 6-1 = 5.

you should have a incrementing counter at row/column when moving to the middle slice. After that you need a decrementing counter at row/column.

Code

void main(){    int x[3][3] = {1, 2, 3,                   4, 5, 6,                   7, 8, 9};    int N=3;    for (int slice = 0; slice < N*2-1; ++slice) {        int z = slice < N ? 0 : slice - N + 1;        for (int j = z; j <= slice - z; ++j) {            int c1=j;            int c2=(N-1)-(slice-j);            printf("%d ", x[c1][c2]);        }        printf("\n");    }        printf("\n");    for (int slice = 0; slice < N*2-1; ++slice) {        int z = slice < N ? 0 : slice - N + 1;        for (int j = z; j <= slice - z; ++j) {            int c1=j;            int c2=slice-j;            printf("%d ", x[c1][c2]);        }        printf("\n");    }}

here slice maintains the incrementing/decrementing index required. slice-N+1 will take care of that value
slice-j allows as move in both row and column.
When you need to move from top right, just subtract N-1 from slice-j so that you also travel from top right.

Output

Top right to bottom left

3
2 6
1 5 9
4 8
7

Top left to bottom right

1
2 4
3 5 7
6 8
9

Wednesday, 27 April 2011

Google CodeJam–Problem A: Fix-it

Question

You have a list of Unix file paths present in the system without repetition. Then there is another list of file paths which are required to be created based on the previous list.

The Unix file path is tree based. :) So, it obviously gives as the idea that, we need to do a tree.

The problem statement is here: http://code.google.com/codejam/contest/dashboard?c=635101#

You will be given a list of directory already present in the system like this,

/chicken /chicken/egg

and you will be asked to create the following new directories/chicken

/chicken/tom

/chicken/tom/jerry

We should basically check whether directory is already present in the system and also check how many new directories are required.

The output of the above code will be: 2

Since only tom & jerry directories need to be created newly.

Answer

class Tree{public:    string name;    vector<Tree*> list;    Tree() {}    Tree(string name)    {        this->name = name;    }}; Tree* getMatch(vector<Tree*> list, string name){    if(list.empty()) return NULL;     for(vector<Tree*>::iterator it = list.begin(); it != list.end(); ++it)    {            if((*it)->name == name) {            return *it;        }    }    return NULL;} void addDir(Tree* t, char* dirname){    char* back = strdup(dirname);    char* next = strtok(back , "/");    Tree* current = t;    while(next != NULL)    {        Tree* chk = getMatch(current->list, next);        if(chk == NULL) {            Tree *newNode = new Tree(next);            current->list.push_back(newNode);            current = newNode;        } else {            current = chk;        }        next = strtok(NULL, "/");    }} void main(){    int N;    ofstream outfile("result.txt");    scanf("%d",&N);    for(int Ti=1; Ti<=N; Ti++)    {        int c1,c2;        int count = 0;        char dirname[105];        Tree *fs = new Tree();        fs->name = "/";        scanf("%d %d",&c1,&c2);        for(int i=0; i<c1; i++) {            scanf("%s", dirname);            addDir(fs, dirname);        }        for(int i=0; i<c2; i++) {            scanf("%s", dirname);            char* back = strdup(dirname);            char* next = strtok(back , "/");            Tree* current = fs;            while(next != NULL)            {                Tree* chk = getMatch(current->list, next);                if(chk == NULL) {                    Tree *newNode = new Tree(next);                    current->list.push_back(newNode);                    current = newNode;                    ++count;                } else {                    current = chk;                }                next = strtok(NULL, "/");            }        }        outfile<<"Case #"<<Ti<<": "<<count<<endl;    }}

When reading the first input of currently present directories, just create a tree (with multiple children). Make sure that you make the next child dir in the path as a children to the previous parent dir path
After you read path given as present in the system into a tree, next is to read the list of paths for which presence need to verified + new directory need to be created.
Even in this case do the same first step. Just start creating the tree and ignore if directory already present. If not present, Just get the count of it which is not already present in the tree.
Also after taking the count, again add the new dir to the actual tree.
You are done with the solution!! :)

Monday, 9 May 2011

Count duplicates in an Integer array–Google Phone Interview Question

Sunday, 1 May 2011

Find the k-th minimum in two sorted array

Friday, 29 April 2011

Traverse an array diagonally

Wednesday, 27 April 2011

Google CodeJam–Problem A: Fix-it

About the Author