Monday, 14 March 2011

String of String search



    You are given a simple question to implement strstr C API.  You should not use extra space, no extra library APIs and performance must be as near to O(n). How would you do that?

Note that O(n) is a must.. :)

How I did this?

   I started a normal approach of comparing each character with the string and came with an algorithm which is O(n2). This is a brute force approach to pattern matching of string

int str_str(char* sstr, char* needle)
    char* iter = 0;
    char* stmp = 0;
    char* initial = sstr;
    while(*sstr != '\0') {
        iter = needle;
        stmp = sstr;
        while(*iter != '\0') {
            if(*stmp != *iter) {
        if(iter-needle == stmp-sstr)
    if(*sstr == '\0')
        return 0;
        return sstr-initial;
void main()
    char* search_string="aabaaabbbbbbaaaaabbabaaaaaaaaaa";
    char* str = "aaa";
    std::cout<<str_str(search_string, str);

Important points:

  • sstr, needle is not const char* here unlike the C API
  • we need copy of initial sstr to calculate the index at the last.
  • sstr itself is used as a counter to move inside the string. When some match of iter happens, we continue the loop, if not, we break.
  • we need to reset the iter and current sstr to start the comparisons fresh
  • iter runs for the needle and checks in stmp range, if any initial condition fails, we return
  • stmp++ is required in a failure case for our calculation of equality check in outer loop
  • If sstr is \0, that means we reached end of string and didn’t find the match yet. So we return 0.
  • Note that, even if the string is found at first index, the above code returns 0. :). So, you must take care of the return value
  • The above algorithm is fully based on pointer arithmetic and a bit hacky!! If you are well versed in pointers, you can come with the above solution.
  • The above code works. But gives O(n2). So, this is not the right algorithm interviewer is expecting. :). But if you come up at least with this, you can think about optimizing it after studying further algorithms.
  • The optimization mainly happens in the inner loop. The iter run.. There are many algorithms for pattern matching (from Goodrich)
    • Boyer-Moore Algorithm
    • Knuth-Morris-Pratt
  • There are many text processing algorithms which we will look into soon!!


Unknown said...

Hello, Your collection of solutions to problems and algorithms are really nice. Keep up the good work. I have similar posts .. just started.. plz do visit and give your valuable comments. Regds.

vprajan said...

@Arabinda.. wow.. nice collection.. !! but not a dummies guide definitely.. :D Will do visit and learn something from it..

Post a Comment