An Algorithm a Day: bitwise

Showing posts with label bitwise. Show all posts

Thursday, 2 June 2011

Mis-conceptions with Bitwise operators

“Bitwise operations are really important in all kinds of embedded systems.. and the things that you do are all on base 2. mostly, these operators are not used much in application programming.. But very useful for systems & memory optimized programs.!! If you write algorithms to save memory, knowing bitwise is a must”

Introduction
    Let’s start with ground basics for binary..
   In a 32-bit binary, you should know that, based on the location of the 1 in it, 2^n varies..
That is, 0000 0001 – 2^0                0000 0100 - 2^2             0001 0000 - 2^4
            0000 0010 – 2^1                 0000 1000 - 2^3             0010 0000 - 2^5
So, the series is like 1, 2, 4, 8, 16,.. The in-between numbers permute. i.e, 2->4.. If you take the count of numbers between these, you will get a series, 0, 1, 3, 7, 9, 15…
This is how mapping between a binary and a decimal is done.. hexadecimal number is just the grouping of 4-bits.. that is, < 2^4..
With all these bits, you can do logical bitwise operations, AND, OR, NOT, XOR.
Concepts

OR – its used for masking. In a embedded system, there will be a 32-bit status register. To set or turn on a flag in this 32-bit, you will use a mask and | it with the already existing status bit.
AND – its used for verifying whether a particular bit is set. mostly used to check whether a particular status bit is set in the register
XOR – its for toggling the bits. used for set/reset!!.. only if you have different bits, you get 1. otherwise, its all zero. that’s why, it can convert 1 –> 0 and if 0 is there then 1.
NOT – mostly for turning off a flag.

If you have a 8-bit flag. Then you get the following constants defined,

static final int flagAllOff = 0;  //         000...00000000 (empty mask)
static final int flagbit1 = 1;    // 2^^0    000...00000001
static final int flagbit2 = 2;    // 2^^1    000...00000010
static final int flagbit3 = 4;    // 2^^2    000...00000100
static final int flagbit4 = 8;    // 2^^3    000...00001000
static final int flagbit5 = 16;   // 2^^4    000...00010000
static final int flagbit6 = 32;   // 2^^5    000...00100000
static final int flagbit7 = 64;   // 2^^6    000...01000000
static final int flagbit8 = 128;  // 2^^7    000...10000000

Example, Now, take a integer status register(statusreg) with all 0.

1. Set bit-3,7 in the status register (using OR)

   statusreg |= flagbit3

   statusreg |= flagbit7

2. test whether bit-7 is set or not, (using AND)

    if((statusreg & flagbit7) == flagbit7)  // bit 7 is set

3. toggle bit-5, (using XOR)

   statusreg ^= flagbit5

4. set the bit-5 to zero. Using NOT, AND operator

   statusreg & ~flagbit5

5. combining masks using OR. set bit 5 and 6.

   statusreg & (flagbit5 | flagbit6)

6. XOR has a special property.. if u XOR same number, you get zero.

  i.e, a = 01010011.. a ^ a = 0

5. Using the above principle for swapping. :)

a ^= b; // a = a^b
b ^= a; // b = b^a^b = a
a ^= b; // a = a^b^a = b

6. Extract just 8-bits from a value,

    ch &= 0xFF

More operations & More use-cases

   Left shift & right shift. Now, it should be clear that the number of left shifts you do,

you multiple more 2 to the number. The more right shifts you do, you divide number by 2.

Left shift – Add 0 to the right side.

Right shift – Add 0 to the left side.

All permissions even on applications are maintained with bits to easily understand it.

Errors caused till now on the articles

The major bitwise we misused is,

http://analgorithmaday.blogspot.com/2011/04/unique-items-in-array.html

You cannot get the unique items in the array if the numbers are greater than 32. And, for negative numbers, its only 31 numbers.

Since we use the position of 1’s to get the duplicate.  But it works well for strings, because there is no character which will round and overlap with the same bit.

i.e, if in an array, you have integers like, {90, 26, 7, 2}

and you want to check whether duplicates are there in this using the logic we discussed in the above link, for 90, you get the bit set @26. Now the next integer in array also sets bit at 26.. this will make 90 & 26 duplicate :)

This is the loophole !!.. Thanks to my friend who worked on optimizing RSA algorithm for  pointing out this mistake.. :D

Even longest string check won’t work!!.. :).. It will work only if you have lower case letters and you subtract the ASCII value with ‘a’.. So you will get only values within 26. :D

The below links are all wrong: :D

http://analgorithmaday.blogspot.com/2011/05/finding-longest-substring-in-string.html

http://analgorithmaday.blogspot.com/2011/04/unique-items-in-array.html

finding longest substring will work only if the duplication doesn’t collide.

We will see more about collisions soon.. With HASH TABLES.. :D

Tuesday, 24 May 2011

Finding the longest substring in a string without repetition

Question

This is an interesting question shared by my friend. Not so tough. But again, we need to remember the use of bitwise. Read more about here

http://analgorithmaday.blogspot.com/2011/04/unique-items-in-array.html

The question is to find the longest substring in a string which doesn’t have repetition.

The algorithm you write must not use any extra space and should perform with O(n).

Example:

If you are given a string like this,

abcdbbnjkuiokklmnuuurstnvwabcd

to find the longest substring without repetition, you should ignore repetitions in each substring. but must still consider the repeated string to be part of the next substring!!

bb (or) abcdb – is a invalid substring

But, bnjkuio – is a valid substring

Solution

char* longest_substring(char* str){    char* istr = str;    int checker=0;    int countmax = 0;    int maxl = 0;    char* maxidx = str;    while(*str != '\0')    {        if((checker & (1 << *str)))        {            if(maxl < countmax) {                maxl=countmax;                maxidx= str;             }            checker=0;            countmax=1; // include count of the duplicate        }        else {            ++countmax;        }        checker |= (1 << *str);        str++;    }    if(maxl < countmax) {        maxl = countmax;        maxidx = str;    }    char* start = maxidx-maxl;    if(maxl != 0)     start[maxl] = '\0';    return start;} void main(){    char* test = strdup("abcdbbnjkuiokklmnuuurstnvwabcd");    char* str = longest_substring(test);}

In the above code, we just keep track of the max length of the currently found substring in maxl. We also maintain the maxidx (pointer address)
We just track the max only when you get repetitions in the substring found till now
checker bit variable helps us in detecting the duplicate in a substring. We set it back to 0 when we are done with detecting duplicate in the first substring.
after everything is done, we just subtract the maxidx with the substring length to get the start of the substring. Just place a NULL character in maxl location to get the perfect longest substring
NOTE that, the input string is destroyed!!.. It doesn’t use extra space either for replicating the string argument or for finding duplicates within the string.
We do traversal only once the whole string and the 2 condition checks gives O(2n) performance. All other operations would be negligibly constant.
If there is a case like,

abcdefghijkklmnopqrstu

What will be the output of the code? If there are equal sized substrings, the function just gives the first found substring!!

Tuesday, 8 February 2011

Fun with bitwise–OR, AND, NOT, XOR and Left/ Right shift

“The funny thing about bitwise operator is, it operates only on bits on any type of data similar to how a computer hardware operates. To understand bitwise operations, we need to open up the Logical unit of a computer system. The various gates used are NOR, NAND, XOR, Shifters etc”

Metaphor

When it comes to bits, its purely a computer or electronics phenomenon. 1 means highest signal, 0 means a lowest signal. In electronics, its a square wave with 2 symbols. In computers, if we treat that as a binary language, then its a 2 symbol digit system. To understand how this system works, we should know how does a number system work.

In mathematics, the number system that we know has total 10 symbols, 0 – 9. After that what we have is just combination of these 10 symbols. Infinity is a new symbol btw!! :)

But, as you can see, just 10 symbols which is why decimal system is called a base 10 decimal system. What does it mean by base 10 ? Every number in the decimal system can be represented by 10^x where x is a number from -infinity to infinity. Since this scale is very minute, we use logarithms to avoid multiplication and division.

So, log₁₀ 5 = 10^x => how much if you raise power of 10, will you get 5 when 10^0.6 is done. So, log 5 = 0.6 :D

With logarithms as we know, we can just need to add and subtract. rather than multiply/divide, as we have all in powers now.

Very funny and obvious question. What is 10^-infinity ? Its zero amazingly :D. So, any based logarithm function with 0 as parameter will always give –infinity.

10^0 = 1. Anything power 0 is 1 but it excludes infinity!!.. Indians are the culprits who bought zero and made all this confusion!! :D

Concept

Getting the above concept. Base 10 is what human understands. Base 2 is what computer understands. With 32-bit systems, hexa decimal came in, where 16 symbols are present. 0-9 and then A-F, which means, Base 16 is also taken up by computers now. Base 8 is octet, which was used in older 8-bit systems. There are base 32 and base 64 as well used.

But the important thing here is, how will you convert between bases. Decimal system’s property is our base reference for everything as that is the master mind. :)

1’s(10^0) place, 10’s(10^1) place, 100’th(10^2) place, 1000’s(10^3) place!!.. same way, 2^0’s place, 2^1’s place, 2^2’s place.. is for binary!! :D

When you just add-up all the values of 2^0, 2^1, 2^2 for a binary, you get a decimal equivalent to it!! Note that, this is because, the binary system with 2 symbols grows in multiples of 2 in a decimal system!!..

Bitwise is similar to a logical hardware gates.

OR or | = Adder (all symbol changes are taken)

AND or & = Condition flag checker (same symbol, take down the symbol)

NOT or ! = Reverse bit (!0 = 1)

XOR or ^ = Extract unique values ( 0 XOR 0 and 1 XOR 1 = 1.. same symbol, then 1)

Left shift or << x = Add zero’s at the right end for the amount specified. 2^x is the mapping decimal equivalent for each digit.

Right shift or >> x = Add zero’s at the left side. same 2^x is the mapping

Some fun code

Some XOR fun!!

   1:  void swap(int *a, int *b)

   2:  {

   3:      *a ^= *b; // a ^ b

   4:      *b ^= *a; // b ^ a

   5:      *a ^= *b; // a ^ b

   6:  }

7:

   8:  void main()

   9:  {

  10:      int a=5, b=8;

  11:      int c=10, d=10;

  12:      swap(&a,&b);

  13:      swap(&c, &d); // it works

14:

  15:  }

Important things

We will see more topics about AND, OR in coming topics :)