An Algorithm a Day: counting

Showing posts with label counting. Show all posts

Wednesday, 29 August 2012

Find power of 2

Introduction

This something simple to understand recursion. Just for a time pass code recipe. :)

Concept

Recursion base condition:

2⁰ = 1

Generic formula,

pow2(n) = { 1 , when n=0

2*pow2(n-1), when n>1

The above formula works out well to find the powers of 2.

Code

double pow2(double val){  if(val == 0)    return 1;   return 2*pow2(val-1);} void main(){    cout<<pow2(1000); }

Warning

This is not an alternate to pow function. The runtime implements it using for loop.

Monday, 3 October 2011

It’s a simple and usual question. If you need to add 2 50-digit numbers, what would you do? The first thing a dummy would look into is definitely the limits of the primitive data type. But note that, a 32-bit machine with a separate floating point processor can at max process 64-bit (for double). So, it comes to max 2^64 which is 20 digits. Then how would you solve for more digits?

Say, the question is you need to add some 100’s of 50-digit numbers and you need to find the first 20 digits.

Brain storm

I started solving the problem. Straight forward I knew that I need to write code for a simple adder system. Addition is very simple considering the operations required for doing it.

Take the LSD part (Least significant digit or the last digit in the number)
Add all the LSD digits of the given numbers (50^th digit)
Ex: if there are 100 numbers with digit max which can be 9, it can add up to max 900.
We need to maintain a carry which is the digits other than the last digit that we got in the sum
Take carry as well in addition of the next set of digits. (49-0^th digit)
Finally output the sum

The method I followed to find the last digit is not much optimized. That is, if you divide a number by 10, you get the carry value. If you mod the number by 10, you get the remainder which is nothing but the last digit.

This same question might get twisted for different bases. Say, you need to do the same for base 2. J Then the rules are different and our algorithm won’t work.

Code

int main()

{

char line[55];

short dignum[100][50];

int count=0;

ifstream myfile ("eulerinput.txt");

for(int i=0 ; i<100; i++) {

myfile.getline(line, 51);

for(int j=0; j < 50; j++) {

dignum[i][j] = line[j]-48;

cout<<dignum[i][j];

}

cout<<endl;

}

int carry=0; // never >999.. since 100 digits addition + carry can never exceed over 999

//adder

int dsum;

for(int i=49; i>=0; i--) {

dsum = 0;

for(int j=0; j < 100; j++) {

dsum += dignum[j][i];

}

dsum += carry;

carry = dsum/10;

cout<<dsum%10<<" "; //digits in reverse

}

cout<<endl<<dsum;

return 0;

}

Friday, 13 May 2011

Permutations & Combinations for Dummies

“I never understood concepts like permutation, combination which is quite confusing. But permutation is nothing but just a way to count all possibilities of a nested iteration!!”

Introduction

Just take, the following nested for loops

for(int i=0;i<m; i++) { for(int j=0; j<n;j++) {   cout<<i<<":"<<j<<endl; }}

Simply we can tell directly the total iterations are m*n. But the traversal done by the loop with an array will be like..

REPEAT –> read the row –> read from left to right –> move to next row –> UNTIL i*j == m*n

These are the various values comes out considering m=3, n=3,

{0,0}, {0,1}, {0,2}

{1,0}, {1,1}, {1,2}

{2,0}, {2,1}, {2,2}

Keeping i, we iterate with changing symbols in j.. Doesn’t it look like Cartesian Product with a Condition?

This kind of rearrangement of numbers are necessary to traverse the array. Similarly you need rearrangement of numbers in various applications!!..

For example, say, you need to find all possible rearrangements of a shuffled 56 deck card.. There are numerous ways if you need to find the rearrangements of all the 56 cards(56!).. But if you want to know rearrangements possible when the cards are distributed within 14 people..

You just need to count 14 possible rearrangements out of 56 cards to calculate. This is where you need permutation..

To understand permutation well, you need to understand set theory!! Especially Cartesian Product!.. Learn about set theory first: http://analgorithmaday.blogspot.com/2011/05/set-theorya-dummies-guide.html

Concept

Every multiplication can be represented with a set of additions. When you add a number that much times, you get the multiplied value. Similarly, if you have a set of 3 numbers.. how would need to find the number of ways it can be rearranged?

Strings

If you take Cartesian product of same set k times with each set having some n symbols in it, you get strings. S^k sets with each having n items.

i.e, If the set is S, S x S x S is a 3-set product. If you have only 2 elements in it, then total you get 2^3 distinct elements as a result of this product. This count of distinct elements can be got from n^k equation. Example, Take a binary string of length 3. You have max 2 symbols possible in a binary. Total possible sets are,

000,001,010,011,100,101,110,111

Which is 2^3.

Permutation

Permuting means, simply re-arranging without missing any symbols!!.. if you want the number of rearrangements what would you do ? That’s where permutation comes into picture.

For example, saw you have a set {a,b,c}. The total number of ways, you can order this variables are,

abc, acb, bca, bac, cab,cba

Total 6 ways, the above 3 char array can be permuted. This gives as a pattern, the first element is chosen n ways, the next ignoring the first element chosen in n-1 ways, the next in n-2 ways. This means, you get n(n-1)(n-2) which is nothing but n!

If in the same above array, you just need to know the permutation of only k-elements instead of all n elements. (n is the total elements in the array). ex: this is similar to finding the 14 cards possible rearrangements in 56 card deck..

then, the total ways are,

ab, ac, bc, ba, ca, cb

Again the same, 6 ways. But now, n, n-1, n-2 and upto n-k+1 is what required. So, The formula becomes like below,

Combination

We already saw about Strings which we get by Cartesian product of the same set and get the unique items out of the set with each symbol in an order.

Combination has the same property of getting unique items with each item at one place but ignores the order of items requirement. It considers symbols in both the set and takes out in-order uniquely!!.. i.e, You should remove transitive, symmetric symbols in a combination!!

Ex: Permutation for 2 element in a 4 element set, {a,b,c,d}

ab,ac,ad,ba,bc,bd, ca,cb,cd,da,db,dc

Total 12 sets are possible since we need to consider every re-arrangement in any order.

Combination for 2 element in a 4 element set, {a,b,c,d}

ab, ac, ad, bc,bd, cd

There are only 6 sets!!.

Formula:

Examples

Some real world example!!

http://betterexplained.com/articles/easy-permutations-and-combinations/

You have 8 players.. and 3 different cups to be won. Gold, Silver, Bronze. You give each cup to only one of the players and the cups will be given in the order, Gold->Silver->Bronze.

Gold: When you need to give Gold, you have all 8 players competing for it. So, total 8 options possible.

Silver: When you need to give Silver, you have already given Gold, so only 7 options possible

Bronze: When you need to give Bronze, you have already given Silver, so only 6 options possible.

Total ways in which the 8 players can win will be 8! ways..

But, there are only 3 cups, so Total ways in which the top 3 cups can be won, if given one-by-one in order to the persons will be: 8 * 7 * 6.

This is where Permutation comes into Picture!

If the price money is same, i.e, a Stainless steel plate. You don’t need to care about the order in which the 3 cups are given. So, its 3 cups given to 8 players. But 3 cups re-arrangement don’t need to be taken into account as all are equal. So 3! ways of arranging the cups(6 ways) are useless when it comes to just picking some 3 out-of-8.

This is combination and you get only: 8*7 ways according to the formula.

Monday, 9 May 2011

Count duplicates in an Integer array–Google Phone Interview Question

Question
This is one of the question asked to me on a phone interview with Google. Funny that it takes lots of time to think and the interviewer even after he gave ample time I just slept down as its 12:00 midnight. :)
This is the question: http://www.careercup.com/question?id=3483977
Find the repeated numbers in a sorted array using less than O(n) time complexity
So, if you have, A[] = {1,1,1,2,2,2,3,3,4}
you should print, 1=>3, 2=>3, 3=>2, 4=>1
Approach
The approach to this problem in O(n) way would be to just keep counting from 0 to n in the array. But the interviewer needs lesser time than that.
I came up with a approach but cannot make that into a working code:
The approach is, since the array is sorted, you can apply a modified version of binary traversal over the array and count the number of entries..
For example,
1 1   1 2   2   2   3   3   4
start        mid            end
If arr[start] == arr[end] then its simply means arr[end] symbol count is end-start+1
If not, keep splitting the array and compare arr[start] and arr[mid], if they are equal arr[mid] count is mid-start+1
similarly, its same for arr[mid+1] and arr[end]
But the important end condition to see here is, set of 3 numbers..
When the number of elements is 3, i.e, end-start = 2. you get following combinations
start     mid          end
3,       3,          3 = All same, so our start –> end takes care of this
2,          3,          3 = mid and end are different!! a shift!!
2,       2,              3 = start and mid are different!! a shift
3,       4,          5 = start, mid, end all are different!!
All the above cases should be handled perfectly to get the correct working code.
Code

 
map<int, int> blist;

void count_dup(int* arr, int start, int end)

{

    if(end == 0)

        blist[arr[start]]+=1;

    if(start<end) {

        if(arr[start] == arr[end]) {

            blist[arr[end]] += end - start + 1;

            return;

        }

        int mid = abs((start+end)/2);

        if(end - start == 2 && arr[start] < arr[mid] && arr[mid] < arr[end]) {

            blist[arr[start]] +=1;

            blist[arr[mid]] +=1;

            blist[arr[end]] +=1;

            return;

        }

        if(arr[start] != arr[mid] && start == mid) {

            blist[arr[start]]+=1;

            blist[arr[mid]] += 1;

            return;

        }

        if(arr[start] < arr[mid]) {

            count_dup(arr, start, mid);

        } 

        else {

            blist[arr[mid]] += mid - start + 1;

        }

        if(arr[mid+1] != arr[end] && mid+1 == end) {

            blist[arr[mid+1]]+=1;

            blist[arr[end]] += 1;

            return;

        }

        if(arr[mid+1] < arr[end]) {

            count_dup(arr, mid+1, end);

        }

        else {

            blist[arr[end]] += end - mid;

        }

    }

}

void main()

{

    int A[] = {1,2,2,3,3,3,4,4,4,5,5,5,5,6,6,6,6,7,7,8,9};

    count_dup(A, 0,20);

}

Note that we only do recursive call if start, mid or mid+1, end is lesser.. That’s the way we do a binary traversal
The last case of all 3 different should be handled first itself. In this case, we add to each start, mid, end +1
In cases when, start-end = 1 you either get, start=mid or mid+1=end, these are places where we need to check the other 2 cases.. mid+1&end, start&mid
Note that, since we only do, mid+1 and end.. mid & mid+1 get missed. This case is taken care by else part of the two recursions.
Worst case!!.. If all the elements in the array are different, it visits each and every node. Omega(n)
Best case!!. If all the elements in the array is the same, O(1). Yes, we just use index to calculate the count.
Average case!!.. It would be Theta(log n) depending on the input data

UPDATE:The above code is really brute force and not sure whether it works out for all cases and scenarios.

The perfect easy way to code is here: http://codepad.org/p0guN137.

The approach is to form a binary search tree from the sorted array and while forming that, use the hash map to track the duplicates count. But it is also O(n) since we need to traverse the whole input set to form the tree. We can ignore the recursion when it goes around for duplicate items to get around O(log n). But using this way, it would convince the interviewer a little bit and also as we can ignore recursion in-between it reduces the time to less than O(n).

Wednesday, 9 March 2011

Radix sort–Punch card sorting

“The oldest algorithm ever invented. Simpler and very useful. Used in mechanical devices even. So this is even before computers are invented. Involves pure mathematics and works for any numeral and text systems. The oldest method used to sort dictionaries”

Introduction

This is one of the oldest method and some what easy to understand until you get the point clear. The main idea behind this sort is: if there is array of digits, if you sort array from the least significant to most significant number, you finally get a sorted array.

Similarly, in a dictionary in which you sort the words lexicographically, Most significant character to least significant character sorting is done.. All these are Radix sorts.. It gives a good performance benefit as well, i.e, around O(n).

This sort is not so important. But gives as an idea about why stable sorting is important in certain cases. Radix sort will not work out if the order of the elements is not maintained while putting the elements at the sorted spot.

Concept

Radix sort supports all bases starting from 2.. But the algorithm we wrote and tested is only for base 10 for simplicity. We take each digit and sort the whole array around it.

For example,

{329,457,657,839,436,720,355}

On first sorting around the least significant digit, should yield,

{720,355,436,457,657,329,839}

{720,329,436,839,355,457,657}

{329,355,436,457,657,720,839}

Sort just the digits picked from 1’s place, 10’s place and 100th place. Even though intermediate steps looks like unsorted, You finally get a sorted array out of it. This is achieved just because the order of elements in maintained through out the iterations.

The above example illustrates only an array of integers with equal number of digits. This will also work for integers sizes < 3. The best way to start radix sort is to find out the maximum value and count the digits in it or to decide on the number of max digits.

Usually, in a mechanical punching card where this algorithm is used, they have static 12 digit number imprinted over it and sort starting from LSD to MSD.

Two kinds of radix sort is clearly visible now: LSD radix sort, MSD radix sort. MSD is mostly used for dictionary sort. We will discuss LSD sort in this article.

The simple algorithm is like this,

1. For 0 to number of digits

2. Sort the digits in a stable way in the main array

Code

#define BASE 10 void counting_sort_radix(int*& arr, int k, int size){    int *B, *C;    C = (int*) malloc(sizeof(int)*BASE);    B = (int*) malloc(sizeof(int)*(size+1));     for(int i=0;i<BASE; i++) C[i] = 0;     // only the last digit need to be used     // for sorting    for(int i=0;i<size;i++){        int val = arr[i]/k;        int idx = val%BASE;        C[idx]++;    }     for(int i=1;i<BASE;i++) C[i] += C[i-1];     for(int j=size-1;j>=0;j--) {        int val = arr[j]/k;        int idx = val%BASE;        B[C[idx]] = arr[j];        C[idx]--;    }     free(C);    for(int i=0; i<size; i++) arr[i] = B[i+1];    free(B);} void radix_sort(int* arr, int d, int size){    //d = number of digits    // must calculate d from the max val in array     for(int i=0;i<d; i++) {        counting_sort_radix(arr, pow((float) BASE, (float) i), size);    }} void main(){    int A[] = {329,457,657,89,46,720,355};     radix_sort(A, 3, 7);  }

Important points

We have reused counting sort as the base sorting algorithm for our radix sort algorithm
We go from 1st place to 100th place in above code.. it will run until 3 digits. So, input can only have max 3 digits
BASE is only tested for 10. It can be base 2, base 8(octet), base 16(hex).
We need to read different digit in different iterations. That’s why we have to modify the counting sort and get k=10^d as argument which is used to evaluate arr[j]/k and val%BASE. These two gives you the exact digit at necessary place we are looking for
The 10^d loop decides the place around which counting sort needs to be applied
B[] as usual requires size+1 since we never get a zero digit count. So, we copy from B[1] –> A[0] to balance it.
THE MOST IMPORTANT BUG FIX IN THIS CODE IS STABILITY!!.. The last loop which creates the B[] must run from length to 0 rather than running from 0 to length. If we run from 0 to length, we just get all elements in reverse order instead of forward order.
Why when we run from 0 to length, we get the order of elements in reverse ? Because cumulative sum is taken from front to back, denoting all the other elements go below that index from back to front. So, we must go in reverse to get the correct order
Without this stability condition, we will never get a sorted list from radix sort. Stability of the sorting used to sort digits is very important in radix sort method
Any stable sorting method can be used to sort the digits, not just counting sort.

An Algorithm a Day