Tuesday, 15 March 2011

Run length Encoding

“Run length encoding is a principle to compress repeated strings in a stream of strings. Remember encoding used in telegrams!!!.. its the same old technic”

See Also

http://en.wikipedia.org/wiki/Run-length_encoding

Introduction

    Its a good interview question to ask and should be easily solvable. Say, you are given a problem like this,

aaaaabbbbbbbcccbbbbcccdddddaaaa

You need to transform this as,

a5b7c3b4c3d4a4

Do this without using extra space!! and without using any SDK APIs

This is nothing but run length encoding in simple terms. Note that, the count of characters can even be 2 digits.. :)

Code

void rle(char* sstr)
{
    while(*sstr != '\0')
    {
        int cc=1;
        std::cout<<*sstr;
        while(*sstr == *(sstr+1)) {
            cc++;
            sstr++;
        }    
        std::cout<<cc;
        sstr++;
    }
    return;
}
 
void main()
{
    char* str="WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW";
    rle(str);
}

Important points

  • This has O(n) worst case
  • We still use cout :) instead of storing in an array !! Its ok to do this, to finish it fast. We can explain about storing it in an array after its asked for
  • After you get the number of equals.. you need to again increment it by 1. That is what is done at sstr++
  • Even at boundary conditions, sstr and \0 are not equal, this loop works
  • Such a simple code is what is expected!!.. don’t think too much!! :D