## Monday 14 March 2011

### String of String search

Introduction

You are given a simple question to implement strstr C API.  You should not use extra space, no extra library APIs and performance must be as near to O(n). How would you do that?

Note that O(n) is a must.. :)

How I did this?

I started a normal approach of comparing each character with the string and came with an algorithm which is O(n2). This is a brute force approach to pattern matching of string

int str_str(char* sstr, char* needle)
{
char* iter = 0;
char* stmp = 0;
char* initial = sstr;
while(*sstr != '\0') {
iter = needle;
stmp = sstr;
while(*iter != '\0') {
if(*stmp != *iter) {
stmp++;
break;
}
else
{
iter++;
stmp++;
}
}
if(iter-needle == stmp-sstr)
break;
sstr++;
}
if(*sstr == '\0')
return 0;
else
return sstr-initial;
}

void main()
{
char* search_string="aabaaabbbbbbaaaaabbabaaaaaaaaaa";
char* str = "aaa";
std::cout<<str_str(search_string, str);

}

Important points:

• sstr, needle is not const char* here unlike the C API
• we need copy of initial sstr to calculate the index at the last.
• sstr itself is used as a counter to move inside the string. When some match of iter happens, we continue the loop, if not, we break.
• we need to reset the iter and current sstr to start the comparisons fresh
• iter runs for the needle and checks in stmp range, if any initial condition fails, we return
• stmp++ is required in a failure case for our calculation of equality check in outer loop
• If sstr is \0, that means we reached end of string and didn’t find the match yet. So we return 0.
• Note that, even if the string is found at first index, the above code returns 0. :). So, you must take care of the return value
• The above algorithm is fully based on pointer arithmetic and a bit hacky!! If you are well versed in pointers, you can come with the above solution.
• The above code works. But gives O(n2). So, this is not the right algorithm interviewer is expecting. :). But if you come up at least with this, you can think about optimizing it after studying further algorithms.
• The optimization mainly happens in the inner loop. The iter run.. There are many algorithms for pattern matching (from Goodrich)
• Boyer-Moore Algorithm
• Knuth-Morris-Pratt
• There are many text processing algorithms which we will look into soon!!