An Algorithm a Day: no-extra-space

Showing posts with label no-extra-space. Show all posts

Tuesday, 24 May 2011

Finding the longest substring in a string without repetition

Question

This is an interesting question shared by my friend. Not so tough. But again, we need to remember the use of bitwise. Read more about here

http://analgorithmaday.blogspot.com/2011/04/unique-items-in-array.html

The question is to find the longest substring in a string which doesn’t have repetition.

The algorithm you write must not use any extra space and should perform with O(n).

Example:

If you are given a string like this,

abcdbbnjkuiokklmnuuurstnvwabcd

to find the longest substring without repetition, you should ignore repetitions in each substring. but must still consider the repeated string to be part of the next substring!!

bb (or) abcdb – is a invalid substring

But, bnjkuio – is a valid substring

Solution

char* longest_substring(char* str){    char* istr = str;    int checker=0;    int countmax = 0;    int maxl = 0;    char* maxidx = str;    while(*str != '\0')    {        if((checker & (1 << *str)))        {            if(maxl < countmax) {                maxl=countmax;                maxidx= str;             }            checker=0;            countmax=1; // include count of the duplicate        }        else {            ++countmax;        }        checker |= (1 << *str);        str++;    }    if(maxl < countmax) {        maxl = countmax;        maxidx = str;    }    char* start = maxidx-maxl;    if(maxl != 0)     start[maxl] = '\0';    return start;} void main(){    char* test = strdup("abcdbbnjkuiokklmnuuurstnvwabcd");    char* str = longest_substring(test);}

In the above code, we just keep track of the max length of the currently found substring in maxl. We also maintain the maxidx (pointer address)
We just track the max only when you get repetitions in the substring found till now
checker bit variable helps us in detecting the duplicate in a substring. We set it back to 0 when we are done with detecting duplicate in the first substring.
after everything is done, we just subtract the maxidx with the substring length to get the start of the substring. Just place a NULL character in maxl location to get the perfect longest substring
NOTE that, the input string is destroyed!!.. It doesn’t use extra space either for replicating the string argument or for finding duplicates within the string.
We do traversal only once the whole string and the 2 condition checks gives O(2n) performance. All other operations would be negligibly constant.
If there is a case like,

abcdefghijkklmnopqrstu

What will be the output of the code? If there are equal sized substrings, the function just gives the first found substring!!

Tuesday, 22 February 2011

Partitioning Algorithm–In-place

“Think about a problem in which you are given a value. You need to use that value to move all the lesser elements to the left and the greater elements to the right. And most important without using extra space. What would you do? How would you treat the same array as 2 arrays. How we represent a binary heap in an array the same way we do this as well!!”

Introduction

The partitioning of array is useful for the next algorithm we discuss about sorting, which is quick sort. Why it is quick, is just because of some important properties of partitioning. The partition algorithm gives divide and combine for any algorithm. Its a part of divide and conquer method similar to the merge method discussed for merge sort.

How the division happens?

Divide: Divide an array A[start….end] into two sub arrays, A[start…minindex] and A[maxindex…end-1]. A[start..minindex] will contain values < A[end] and A[maxindex..end-1] contains values > A[end]

This divide step is somewhat similar to binary search, where we divide the problem set by > or <. Similar way, we divide the data set into two parts,

=

/ \

< >

Concept

What is the use of this division? How the combine step is contributed equally by the partition. If you note, in merge sort, divide is actually done by the merge sort algorithm. Only the combine is done using merge. That’s why, we call merge sort recursion first, and then finally we do a merge. recollect the algorithm of merge!!

mid = start+end/2

MERGE-SORT(arr,start,mid)

MERGE-SORT(arr,mid+1, end)

MERGE(arr,start,mid,end);

But for quick sort, if you note, the divide & combine both are intrinsic to partition algorithm. So, first, we do partition and then quick sort recursion does the loop through. The lingos are here:

Merge sort: First divide and then merge

Quick sort: First partition and then divide further

Now, you get why we discussed tree traversal also along with this.. :) What we want to do first goes above the recursion loop always.

How a partition takes care of combine as well?

Always it makes the array to the left and right with the property explained above. So, if we select all the elements as pivot in a 3 element array, it gets sorted. Since you always get partially sorted based over the pivot. partition doesn’t require a separate combine step

Code

void swap(int& a, int& b){    register int tmp;    tmp = a;    a=b;    b=tmp;} int partition(int* arr, int start, int end){    /*       1. decide a pivot element, let it be arr[end]      2. categorize arr the remainting start -> end-1       3. consider 2 indeces = i,j: initially i=start-1, j=start      4. make sure that i+1 to j-1 always contains element > pivot      5. the above means, i marks end of smaller elements array      6. j marks the end of larger elements array     */     /*    Example:      start=0                                           end=12  i=-1   j=0                                                    3    6    2    1    9    4    12    19    13    10    8    5    11                After you do this partition,        start                            i            j       end        3    6    2    1    9    4    10    8    5    [11]    12    19    13        The important point here is, after 4, we see 8 which is < 11.    So, it needs to be inserted.    Because of the above point we need 2 indeces.    */     int i,j, pivot;    pivot = arr[end];    i=start-1;    for(j=start; j<=end-1; j++)    {        if(arr[j] <= pivot) {            i=i+1;            swap(arr[i], arr[j]);        }    }    swap(arr[i+1], arr[end]);    return i+1;}   void main(){    int A[]={3,    6,    2,    1,    9,    4,    12,    19,    13,    10,    8,    5,    11};    int B[]={6,10,2};    partition(A,0,12);     //sorting 3 elements    int div = partition(B,0,2);    //small list= 0->div-1    //large list = div->3    int l1 = partition(B,div+1,2);    int l2 = partition(B,l1+1,2);}

Important points

partition just divides an array based on a pivot!! and returns the mid index. The index can be start or end as well!!.. since there is always possibility that non of the elements are greater or lesser than pivot
pivot selection is the very important concept in partition algorithm. Since selection of pivot decides how much sorted you get.
There is not even a single guarantee that array get sorted by minimal partitions without doing a partition over each and every element. You need to see all the elements in the array and do a partition.
partition takes O(1) time, when start and end are equal or zero.
partition at its worst case takes Theta(n) and sometimes since swap is ignored, constant time is lesser!! That’s why, we tell Theta(n) rather than O(n)
Doing multiple partitions at worst case always leads to Theta(n²) since we need to treat all the elements as pivot. That’s why quicksort worst case is Theta(n²). Quick sort is not so quick as you hear!! :D
How can multiple partition lead to Theta(log n)? The constant time is the main factor here!! when partition is done on a reducing set, you get same Theta(n).
There are lots of mathematics studied with such recursive algorithms like quick sort and merge sort!!.. We will learn all those in detail in future

Tuesday, 11 January 2011

Merge lists without using extra space

Common Interview question

Merge two already sorted lists into a single sorted list without using extra space.

Code

The below code uses the memory allocated for a1, a2 to create a new list “c”. It doesn’t use any extra space from the heap which means this merge is RAM efficient. But in performance wise, we need to traverse the end of the list every time to insert the new least of two list element.

   1:  List* merge_list_nospace(List *a1, List *a2)

   2:  {

   3:      List* c=0;

   4:      if(a2==0)

   5:      {

   6:          c=a1;

   7:          return c;

   8:      }

   9:      else if(a1 == 0)

  10:      {

  11:          c=a2;

  12:          return c;

  13:      }

14:

  15:      while(a1 !=0 && a2!= 0)

  16:      {

  17:          if(a1->data < a2->data)

  18:          {

  19:              if(c == 0) {

  20:                  c = a1;

  21:                  a1=a1->next;

  22:                  c->next = 0;

  23:              } else {

  24:                  List *iter = c;

  25:                  while(iter->next != 0) iter = iter->next;

  26:                  iter->next = a1;

  27:                  a1 = a1->next;

  28:                  iter->next->next = 0;

  29:              }

  30:          }

  31:          else

  32:          {

  33:              if(c == 0) {

  34:                  c = a2;

  35:                  a2=a2->next;

  36:                  c->next = 0;

  37:              } else {

  38:                  List *iter = c;

  39:                  while(iter->next != 0) iter = iter->next;

  40:                  iter->next = a2;

  41:                  a2 = a2->next;

  42:                  iter->next->next = 0;

  43:              }

  44:          }

  45:      }

46:

  47:      List *iter = c;

  48:      if(c != 0)

  49:      {

  50:          while(iter->next !=0)

  51:              iter = iter->next;

  52:      }

53:

  54:      if(a2!=0)

  55:          iter->next=a2;

  56:      else

  57:          iter->next=a1;

58:

  59:      return c;

  60:  }

61:

  62:  void main()

  63:  {

  64:      List *list1 = createNode(2);

  65:      append(list1, createNode(3));

  66:      append(list1, createNode(5));

  67:      append(list1, createNode(10));

68:

  69:      List *list2 = createNode(1);

  70:      append(list2, createNode(4));

  71:      append(list2, createNode(9));

  72:      append(list2, createNode(12));

73:

  74:      List *list3 = merge_list_nospace(list1, list2);

  75:  }

Important points

Note that you should always maintain the head pointer of a list in a variable to iterate across the linked list
To create a perfect link always, iter->next = <address> is required.
BUG FIX!!! You need to test for empty lists input of a1 and a2. This is not done in previous case which will lead to null-pointer access crash!!
IMPORTANT!! AND VERY IMPORTANT!!.. C or C++ if you are not using C++ reference variable, you are not passing by reference. You generally pass everything as value. So you need a double-pointer, if you want to change the value of a pointer. This double pointer is what converted to a reference in C++