Monday, 4 April 2011

Unique items in an Array

“These are the most common interview questions all time. Remove duplicates in an string, Find whether there is a duplicate etc. How to attach these problems?”
Question & Code
   1. Find whether a string contains unique items. The string contains only lower case letters.
 
bool isUniqueChars(char* str)
{
//assume lower case
int checker = 0;
while(*str != '\0')
{
int val = *str - 'a';
if((checker & (1 << val))) // when some 1 collide with checker, false
return false;
checker |= 1 << val; // checker will become series of 1's
str++;
}
return true;
}
 
void main()
{
char* mstr = "ravi kuma";
bool test = isUniqueChars(mstr);
}

   2. Find whether the integer array contains unique numbers. The integer value in array can be negative and also can be 32-bit size.
 
bool isUniqueDigits(int* arr, int size)
{
int checker=0;
for(int i=0; i < size; i++)
{
if((checker & 1 << arr[i]))
return false;
checker |= 1 << arr[i];
}
return true;
}
 
 
 
void main()
{
int arr[]={2,4,5,6,2,2,5,8,9,10,11};
bool test = isUniqueDigits(arr, sizeof(arr)/sizeof(arr[0]));
}

   3. What if the integer input in array is huge? shift operation will overflow? how does it work for negative numbers?
UPDATE:      The above code works for all kinds of integer input. We are using bitwise which handles overflows by moving to the negative numbers. Even negative number will work correctly, as we plainly check for the number of matching 1’s to determine uniqueness rather than any integer value.   NOTE: It works only for identifying max 31 symbols in a 32-bit string. You need a bool array for any amount of symbols. Don't go with bitwise if the symbols in the array is greater or overflow (value >32).. It never works out.
 
   4. How to remove duplicate characters from an array without using any additional data structure or space. You can only use temporary variables
   An hacky & confusing code
void removeUnique(char*& str)
{
int checker=0;
char* istr = str;
char* ehead = NULL;
while(*str != '\0')
{
int val = *str;
while(*str != '\0' && (checker & 1 << val))
{
if(ehead == NULL)
ehead = str;
// while there are duplicates
*str='\0';
str++;
val = *str;
}
if(*str == '\0') break;
checker |= 1<<val;
if(ehead != NULL) {
*ehead = *str;
*str = '\0';
ehead++;
}
str++;
}
str = istr;
}
 
void main()
{
char* text = strdup("How to Potentially Resolve Big Issues ");
removeUnique(text);
}
But for the above code, there is no constant space bargain necessary. A cleaner code would be to use a bool array for 256 bytes and use it to just add unique characters to the top of the array. All non-unique characters never reach the top.
You can also have taken the negative case in the above code. If the character is not repeated, then we can simple move *str to a the top of the array.
  5. The same question might be asked for integer array as well. The simple approach is to confirm that the integer is not repeated using bitwise, and simply but that in lower indices