Monday, 22 April 2013

Searching in an list for given value - Dummies guide


Intro


   We have seen lots of sorting, data structures and algorithms. But main problem for dummies (atleast me) is that, selecting the right things for right problems. That comes only when you face problems. That too abstract problems. Now, am going to take up abstract problem and explain how to optimize to some level. Becoming a pro on kinds of problem comes with only experience. But no one is expert on all the problems you see in computer science domain.

   Lets start with a basic search in a list problem. We can tweak this single problem into multiple possible problems to get the pro solution.

Problem

   Searching the list!! First don't ever thing about the language!! Think about what are given, what is required etc. some questions to ask
  • What the list contains ? strings or numbers ?
  • How big the list is? Is it huge or a smaller list?
  • If it is so big, there is no scarcity in space. Can i use extra space to design the algorithm to be faster?
  • Extra space!! Yukk.. it shouldn't be equal to the list size ever. only constant space is a nice option.
  • How much fast it needs to execute? single scan is also costly ?
  • If you are little bit smarter, you would ask is the list sorted ?
Having got answers for all, we will start thinking about C/C++/Java. Yah.. void main().. :) No, then you are dummy again.. :P

Solution

  Attacking problems should be a step-by-step approach. First come up with an approach. Even writing algorithm is like doing a project.

 Consider, this is what required from an algorithm you write,
1. Search should take logarithmic time
2. List is huge
3. List is a fully sorted.
4. List has integer values

Approach: Brute force

  Single scan for the value in the list. Performance O(n). No space required. you need to never look at values lesser than the searching key or greater than that, to get to pinpoint solution. So what you do here,
    read/skip.. read/skip.. read/skip... Haaaww!! read/skip.. read/skip... Sleepy!! read/skip...

The reason to ignore brute force is: read/skip every item in the list to reach solution.. Not O(n) or any mugged up things on your head

Approach: Optimized way

   Some smarter dummies would have found by now, binary search. But they are still dummies since they mugged up this. They can solve this problem, but not similar problems. Other optimized ways for above brute force.

  1. instead of read/skip each element. We know that list is fully sorted. Jump in intervals of 3. Take the position 1 & position 3. If both are less than key (k),move next level and increase the interval to 6 and move on. When u find at some point, the value at the end is greater than k. Start from interval start till interval end. This reduces the processing time drastically!! Performance: O(number of intervals + final interval size where key found)
  2. find the median, usually around middle index of the list. Check whether the median is lesser or greater. Now, you halve the read perfectly. this is what is called binary search.
   3. There might be other methods as well to search. Have a thought!! :)

Binary search

Now, we found the approach. Next writing an algorithm.

every time, half the index and check value on that index, next greater/lesser. If lesser go the greater side and greater go the lesser side, again halve it and continue.

Iterative way is easier to think through first. but edge cases

 end of the list, beginning of the list

main condition: (k found in median), (k greater than median), (k lesser than median)
median index:  median = list[(start+end)/2];

Logic:

loop thru the list until median goes out of index
   median = (start+end)/2;
   if(list[median] == k)
      return true;
   else
       if(list[median] > k )
             end = median;
        else
              start = median;


 Simple approach isn't it!! Now, the recursive way.

 f(list, start, end, val) = { found val : if  val = list[mid]
                                         f(list, start, mid, val) : if val < list[mid]
                                         f(list, mid, end, val) : if val > list[mid]

Above is considered done. except conditions like mid != 0 or mid != end + 1


Will see more on attacking problems + code.. ;)