An Algorithm a Day: basics

Showing posts with label basics. Show all posts

Thursday, 15 March 2012

Searching BST

“Scanning through an English dictionary is faster because, you directly go to the indexed character and scan through letter one-by-one to get to the exact text you are searching for”

Preface

Sorry for not updating for a long period of around 5 months. Really struck up with busy schedules. Now I forgot all the data structures and algorithms. So need to relearn and hence will revisit all my previous posts before going into medium difficulty content.

References

http://analgorithmaday.blogspot.in/2011/03/successor-and-predecessorbst.html

Introduction

How would you design a search algorithm which does a search through the English dictionary in really fast manner? Manually, you can easily check a dictionary based on lexicographic ordering. But how would you design a data structure which can naturally allow you to do that? you can do this with a binary search tree.

In previous article, we learnt about Hash Tables(http://analgorithmaday.blogspot.in/2011/10/hash-tables-dictionaries.html) that it is an array index to data mapping. This is a direct mapping hash table which can be generically implemented using binary search trees. If you check the implementations of map and hash_map in C++ STLs, these are implemented using a kind of binary search trees.

Concept

We decide left or right of the binary search tree based on the value we need to search for. It reduces loops logarithmically.

Bonus: We can also check how to get the 4th smallest element out of the binary search tree. This can be done by having an extra field which can calculate the rank of the node.

Rank of a node in BST is number of nodes in the left sub-tree + 1. This is nothing but number of keys less than the current value. This rank is helpful to identify the successor and predecessor easily.

If space is not a constraint, then we should go with ranking the nodes. But if its a constraint, then you should identify the successor of each node

Code

Searching a binary search tree (iteration)

class BST {public:    int data;    BST* parent;    BST* left;    BST* right;};  BST* createBSTNode(int data){    BST* newNode = new BST();    newNode->data = data;    newNode->left = NULL;    newNode->right = NULL;    newNode->parent = NULL;    return newNode;} void bst_insert(BST*& tree, BST* node){    BST* prev = NULL;    BST* iter=tree;    while(iter != NULL)    {        prev = iter;        if(node->data > iter->data)            iter = iter->left;        else            iter = iter->right;    }    // found node is parent to our node    node->parent = prev;     // if prev is NULL    // then this is the first node    // change the root NODE    if(prev == NULL)        tree = node;    else {        // now we need to decide where the node        // should go: left or right        if(node->data > prev->data)            prev->left = node;        else            prev->right = node;    } } BST* convert_arr_bst(int* arr, int size){    BST* tree=NULL;    for(int i=0;i<size;i++) {        bst_insert(tree, createBSTNode(arr[i]));    }    return tree;} void main(){    BST *tree, *tmp;    int k = 6;    bool found = false;    int arr[]= {2,30,12,10,22,6};    tree = convert_arr_bst(arr, 6);    tmp = tree;    while(tmp && !found)    {        if(tmp->data == k) found = true;        if(tmp->data < k) tmp = tmp->left;        else {            tmp = tmp->right;        }    }    if(found == true)        cout<<"Found"<<endl;    else        cout<<"Not Found"<<endl;}

Searching a binary search tree based on ranks

Will see this in next article about constructing binary search trees with ranks.

Friday, 14 October 2011

Division method–5th Grade way

"As you become old, you will definitely forget your primary school rhymes!!.. similar to that, you would also forgot multiplication and division as well!!.. This is the dummy’s way to understand, Division with reminder.. Take 1 in the hand, 5 in the middle.. :)”

Introduction

Suddenly, i found that i forgot the 5th Grade Long division method. I don’t know about you. But thought of covering it, as its not so simple.. :)

It’s not a pure dummies or a 5th grade guide. We need to understand Long division method clearly to go in-depth for doing Division of Large numbers or Getting reminder between two large numbers.

As you might guessed, division of large numbers (as heavy as 2048-bit size) are used now in RSA which is also proven to be cracked within years!!.. :)

As this method is required basics for RSA encryption algorithm, this article will be useful for a computer geek as well as a 5th grader!!..

Concept

The way we do long division is simple. But underlying logic behind division is very important one to understand. Multiplication if you take,

how we come up with 4*3 = 12 ? 4 + 4 + 4 or 4 added 3-times gives you 12. So for a successful multiplication, just successive additions are more than enough. Even in current systems, they just use the adder circuit at ALU to do this.

Then what does 12/3 does ? There are 2-types are division a system categorized into,

integer division
floating point division

12/3 will yield you just an integer. So, based on programming language and system, it selects to use integer division where floating point is completely ignored.

Try, int z = 12/5

You will just get z=2. At the same case, if you convert these to float, you get the mantissa & exponential part which might require more than 32-bit. (float is just 32-bit, double is 64-bit).

Try this now, float z=12.0/5.0;

The above is an example of floating point division.

At the core, division happens using subtraction. 12/3 means.. how many times subtracting three yields zero.

“Ok, i understood int, float. What is quotient and reminder ?”

(Take example: 12/3) - quotient(4) is the number you get after you divide with divisor (3) the dividend(12). Sometimes, just subtracting denominator with numerator will not yield zero. Subtract as much as you can so that, the value is less than your quotient. This value will become your reminder.If you want to continue subtracting, then it must be after a decimal point.

In the above division, 10 is the reminder and 17 is the value you get in integer division. now to find the exponential part, you can further divide from there, 10/25.. :) you get 0.4. So the resultant of this division is 17.4.

More explanation

Division is just a concept of finding out how many values are inside a value

Just take the below long division example,

This is the normal way, we have learnt the division. But actually what happens here is different and this needs to be understood to write code to do large number division.

The above is the right way our teachers should have explained. We don’t just take the first 2 digits, we always assume that we have 0’s to complement as. 23*3 = 69 if we do and just subtract the first 2 digit simply means that there are 0’s in the other places. we have to finally add, all these 300 + 60 + 8 to get the actual quotient. Remember that, 11 is the reminder we have got. If we further divide, 11/23.. you get decimal pointed values.

Thus, it doesn’t mean that you need to always have 23*3 for your operation. you can even do like the above way, if you want to step in a easy multiplicative way.

References

http://en.wikipedia.org/wiki/Binary_multiplier

http://www.conceptualstudy.org/Elementary%20Math/Understanding%20Division.htm

Some exercises

“Now, how would you find the reminder between 56.74 / 45.67647”

In C++, you have fmod. Normally, you can do the following,

first do integer division of the above numbers - 56/45
now to get the reminder do this, 56.74 – 45.67647 * quotient (which is 1 in our case)
this gives out our reminder as, 11.06

Wednesday, 8 June 2011

Infix, Prefix and Postfix notations

“Notations are created for easy processing of expressions. Polish and Reverse Polish Notations are popular ones. Normal notation is human readable.”

Introduction

There are 3 kinds of expression notations: infix, prefix and postfix.

Infix: is human readable. (2+3)*4
Prefix: is readable with stack. Expressions first, values next. *+234
Postfix: Values first, expressions next. 23+4*

Read more about the prefix notation here: http://en.wikipedia.org/wiki/Polish_notation

Prefix & postfix doesn’t need to use brackets and can be easily represented using stack or tree.

The main advantage is that we don’t need to care much about the precedence as in Infix notation.

Code

/*Scan the given prefix expression from right to leftfor each symbol {  if operand then   push onto stack  if operator then   {    operand1=pop stack    operand2=pop stack    compute operand1 operator operand2    push result onto stack   } }return top of stack as result*/ void main(){    char* str = "*-5 6 57";    StackList opStack;    int d, sum = 0;    int result=0;    int count = 0;    char* rstr = str;     while(*rstr != '\0') rstr++;    rstr--;     while(rstr > str-1) {         if(isdigit(*rstr)) {            d = *rstr - 48;            if(count == 0) sum = d;            sum += d*count*10;            count++;        } else {            if(sum != 0) {                opStack.push(sum);                sum = 0;                count = 0;            }            switch(*rstr)            {                case '+':                    opStack.push(opStack.pop() + opStack.pop());                    break;                case '*':                    opStack.push(opStack.pop() * opStack.pop());                    break;                case '-':                    opStack.push(opStack.pop() - opStack.pop());                    break;                case '/':                    opStack.push(opStack.pop() / opStack.pop());                    break;            }        }         rstr--;    }    cout<<opStack.pop();}

We need to come in reverse and push into stack the integers
When we see a operator, we need to operate the top 2 elements in the stack.
Note that, we use space as the delimiter between 2 digits. For operators, we don’t need that
In the stack we will have the only element which is the processed output of the expression

Thursday, 2 June 2011

Mis-conceptions with Bitwise operators

“Bitwise operations are really important in all kinds of embedded systems.. and the things that you do are all on base 2. mostly, these operators are not used much in application programming.. But very useful for systems & memory optimized programs.!! If you write algorithms to save memory, knowing bitwise is a must”

Introduction
    Let’s start with ground basics for binary..
   In a 32-bit binary, you should know that, based on the location of the 1 in it, 2^n varies..
That is, 0000 0001 – 2^0                0000 0100 - 2^2             0001 0000 - 2^4
            0000 0010 – 2^1                 0000 1000 - 2^3             0010 0000 - 2^5
So, the series is like 1, 2, 4, 8, 16,.. The in-between numbers permute. i.e, 2->4.. If you take the count of numbers between these, you will get a series, 0, 1, 3, 7, 9, 15…
This is how mapping between a binary and a decimal is done.. hexadecimal number is just the grouping of 4-bits.. that is, < 2^4..
With all these bits, you can do logical bitwise operations, AND, OR, NOT, XOR.
Concepts

OR – its used for masking. In a embedded system, there will be a 32-bit status register. To set or turn on a flag in this 32-bit, you will use a mask and | it with the already existing status bit.
AND – its used for verifying whether a particular bit is set. mostly used to check whether a particular status bit is set in the register
XOR – its for toggling the bits. used for set/reset!!.. only if you have different bits, you get 1. otherwise, its all zero. that’s why, it can convert 1 –> 0 and if 0 is there then 1.
NOT – mostly for turning off a flag.

If you have a 8-bit flag. Then you get the following constants defined,

static final int flagAllOff = 0;  //         000...00000000 (empty mask)
static final int flagbit1 = 1;    // 2^^0    000...00000001
static final int flagbit2 = 2;    // 2^^1    000...00000010
static final int flagbit3 = 4;    // 2^^2    000...00000100
static final int flagbit4 = 8;    // 2^^3    000...00001000
static final int flagbit5 = 16;   // 2^^4    000...00010000
static final int flagbit6 = 32;   // 2^^5    000...00100000
static final int flagbit7 = 64;   // 2^^6    000...01000000
static final int flagbit8 = 128;  // 2^^7    000...10000000

Example, Now, take a integer status register(statusreg) with all 0.

1. Set bit-3,7 in the status register (using OR)

   statusreg |= flagbit3

   statusreg |= flagbit7

2. test whether bit-7 is set or not, (using AND)

    if((statusreg & flagbit7) == flagbit7)  // bit 7 is set

3. toggle bit-5, (using XOR)

   statusreg ^= flagbit5

4. set the bit-5 to zero. Using NOT, AND operator

   statusreg & ~flagbit5

5. combining masks using OR. set bit 5 and 6.

   statusreg & (flagbit5 | flagbit6)

6. XOR has a special property.. if u XOR same number, you get zero.

  i.e, a = 01010011.. a ^ a = 0

5. Using the above principle for swapping. :)

a ^= b; // a = a^b
b ^= a; // b = b^a^b = a
a ^= b; // a = a^b^a = b

6. Extract just 8-bits from a value,

    ch &= 0xFF

More operations & More use-cases

   Left shift & right shift. Now, it should be clear that the number of left shifts you do,

you multiple more 2 to the number. The more right shifts you do, you divide number by 2.

Left shift – Add 0 to the right side.

Right shift – Add 0 to the left side.

All permissions even on applications are maintained with bits to easily understand it.

Errors caused till now on the articles

The major bitwise we misused is,

http://analgorithmaday.blogspot.com/2011/04/unique-items-in-array.html

You cannot get the unique items in the array if the numbers are greater than 32. And, for negative numbers, its only 31 numbers.

Since we use the position of 1’s to get the duplicate.  But it works well for strings, because there is no character which will round and overlap with the same bit.

i.e, if in an array, you have integers like, {90, 26, 7, 2}

and you want to check whether duplicates are there in this using the logic we discussed in the above link, for 90, you get the bit set @26. Now the next integer in array also sets bit at 26.. this will make 90 & 26 duplicate :)

This is the loophole !!.. Thanks to my friend who worked on optimizing RSA algorithm for  pointing out this mistake.. :D

Even longest string check won’t work!!.. :).. It will work only if you have lower case letters and you subtract the ASCII value with ‘a’.. So you will get only values within 26. :D

The below links are all wrong: :D

http://analgorithmaday.blogspot.com/2011/05/finding-longest-substring-in-string.html

http://analgorithmaday.blogspot.com/2011/04/unique-items-in-array.html

finding longest substring will work only if the duplication doesn’t collide.

We will see more about collisions soon.. With HASH TABLES.. :D

Friday, 13 May 2011

Permutations & Combinations for Dummies

“I never understood concepts like permutation, combination which is quite confusing. But permutation is nothing but just a way to count all possibilities of a nested iteration!!”

Introduction

Just take, the following nested for loops

for(int i=0;i<m; i++) { for(int j=0; j<n;j++) {   cout<<i<<":"<<j<<endl; }}

Simply we can tell directly the total iterations are m*n. But the traversal done by the loop with an array will be like..

REPEAT –> read the row –> read from left to right –> move to next row –> UNTIL i*j == m*n

These are the various values comes out considering m=3, n=3,

{0,0}, {0,1}, {0,2}

{1,0}, {1,1}, {1,2}

{2,0}, {2,1}, {2,2}

Keeping i, we iterate with changing symbols in j.. Doesn’t it look like Cartesian Product with a Condition?

This kind of rearrangement of numbers are necessary to traverse the array. Similarly you need rearrangement of numbers in various applications!!..

For example, say, you need to find all possible rearrangements of a shuffled 56 deck card.. There are numerous ways if you need to find the rearrangements of all the 56 cards(56!).. But if you want to know rearrangements possible when the cards are distributed within 14 people..

You just need to count 14 possible rearrangements out of 56 cards to calculate. This is where you need permutation..

To understand permutation well, you need to understand set theory!! Especially Cartesian Product!.. Learn about set theory first: http://analgorithmaday.blogspot.com/2011/05/set-theorya-dummies-guide.html

Concept

Every multiplication can be represented with a set of additions. When you add a number that much times, you get the multiplied value. Similarly, if you have a set of 3 numbers.. how would need to find the number of ways it can be rearranged?

Strings

If you take Cartesian product of same set k times with each set having some n symbols in it, you get strings. S^k sets with each having n items.

i.e, If the set is S, S x S x S is a 3-set product. If you have only 2 elements in it, then total you get 2^3 distinct elements as a result of this product. This count of distinct elements can be got from n^k equation. Example, Take a binary string of length 3. You have max 2 symbols possible in a binary. Total possible sets are,

000,001,010,011,100,101,110,111

Which is 2^3.

Permutation

Permuting means, simply re-arranging without missing any symbols!!.. if you want the number of rearrangements what would you do ? That’s where permutation comes into picture.

For example, saw you have a set {a,b,c}. The total number of ways, you can order this variables are,

abc, acb, bca, bac, cab,cba

Total 6 ways, the above 3 char array can be permuted. This gives as a pattern, the first element is chosen n ways, the next ignoring the first element chosen in n-1 ways, the next in n-2 ways. This means, you get n(n-1)(n-2) which is nothing but n!

If in the same above array, you just need to know the permutation of only k-elements instead of all n elements. (n is the total elements in the array). ex: this is similar to finding the 14 cards possible rearrangements in 56 card deck..

then, the total ways are,

ab, ac, bc, ba, ca, cb

Again the same, 6 ways. But now, n, n-1, n-2 and upto n-k+1 is what required. So, The formula becomes like below,

Combination

We already saw about Strings which we get by Cartesian product of the same set and get the unique items out of the set with each symbol in an order.

Combination has the same property of getting unique items with each item at one place but ignores the order of items requirement. It considers symbols in both the set and takes out in-order uniquely!!.. i.e, You should remove transitive, symmetric symbols in a combination!!

Ex: Permutation for 2 element in a 4 element set, {a,b,c,d}

ab,ac,ad,ba,bc,bd, ca,cb,cd,da,db,dc

Total 12 sets are possible since we need to consider every re-arrangement in any order.

Combination for 2 element in a 4 element set, {a,b,c,d}

ab, ac, ad, bc,bd, cd

There are only 6 sets!!.

Formula:

Examples

Some real world example!!

http://betterexplained.com/articles/easy-permutations-and-combinations/

You have 8 players.. and 3 different cups to be won. Gold, Silver, Bronze. You give each cup to only one of the players and the cups will be given in the order, Gold->Silver->Bronze.

Gold: When you need to give Gold, you have all 8 players competing for it. So, total 8 options possible.

Silver: When you need to give Silver, you have already given Gold, so only 7 options possible

Bronze: When you need to give Bronze, you have already given Silver, so only 6 options possible.

Total ways in which the 8 players can win will be 8! ways..

But, there are only 3 cups, so Total ways in which the top 3 cups can be won, if given one-by-one in order to the persons will be: 8 * 7 * 6.

This is where Permutation comes into Picture!

If the price money is same, i.e, a Stainless steel plate. You don’t need to care about the order in which the 3 cups are given. So, its 3 cups given to 8 players. But 3 cups re-arrangement don’t need to be taken into account as all are equal. So 3! ways of arranging the cups(6 ways) are useless when it comes to just picking some 3 out-of-8.

This is combination and you get only: 8*7 ways according to the formula.