Monday, 3 January 2011

Divide and Conquer–Introduction

 

“When you have a bigger problem and don’t have a overall solution. You take a part of it and try to solve it. If the part gets solved, you apply it to different parts of the bigger problem. Ex: You have a farm of size 500x200 = 10000 sq.ft to plant. If a single person can plant, just 250 sq.ft/hr. How much person you need to plant the full farm if you want to complete it within 4 hr?  Just 10 people is enough. This is a typical real world example for divide and conquer”

Metaphors

As you can see in the above quotation, there are 3 things considered as input values for the problem,

  • The problem size which is 10,000 sq.ft
  • The basic solution to this big problem = 250 sq.ft/hr can be planted by a person
  • The requirement for completely solving the problem = 4 hrs to plant 10,000 sq ft.

The above input forms becomes the basics of a Divide and Conquer style algorithm. This means, only if  you have a huge data which can be processed as chunks, you can apply the methodology of Divide and Conquer to the problem.

Methodology

The usual steps in these style of algorithms are the following,

  • Divide: the problem size known should be divided to the processable size. (which is 250 sq ft/hr for the above example).
  • Conquer: After dividing the problem into independently processable units, process each of them using a small algorithm. NOTE THAT, THIS IS THE PLACE WHERE THE ACTUAL RUNNING OF THE ALGORITHM COMES IN. i.e, this is planting the crop in 250 sq.ft by a person
  • Combine: There should be some condition to prove that divide and conquer is actually working out to solve the bigger problem. This proof is identified at this stage and this is where termination condition is checked. i.e, if we allow 4 hrs for the whole plantation to complete, then we need to combine the problem from just 10 people.

Divide and Combine step will be automatic and related while design of the Conquer step is what that requires pure mathematics. As we see in the above problem, if the condition is to complete by 4 hrs, then it requires 10 people. If the condition is to complete by 2 hrs, then it requires 20 people.

But if 20 people on the field is a congestion, then the division of work is wrong. So, the conquer step should know how much to divide and how much to combine.

It will be like, Divide –> Conquer –> Combine. Where 2 divided sub problems is the input to conquer and Combined output of the 2 sub problems is the output from conquer.

This is the usual way an algorithm is designed.

Important Points

  • Divide –> Conquer –> Combine, here, Divide = Input, Conquer = Processing Function, Combine = Output
  • Divide and combine step always handles bigger Input/Output and conquer should be able to process and connect Input and Output.
  • Sometimes divide and conquer is done collectively while combine is done separately. Sometimes, divide is done first and then conquer and combine is done collectively. But the order of operations never change.
  • The merge algorithm we already saw is an example of Divide and Conquer, where we take Merge(Array, start, mid, end), which means we divide the array into 2 and process + combine it. Merge is an example of divide, then conquer+combine.
  • Recursion is normally used to easily represent mathematically the divide step. As recursion is equal to a mathematical function which gets assigned to itself. like,
    f(n) = f(n+1)
  • On computers, the above mathematical functions grow on stack like how these functions are plotted on a graph. This is the most confusing part dummies find it difficult to get. Finding the exact recursive functions required for solving a problem is itself a new topic.
  • Divide and Conquer doesn’t need to be done only by recursion. If you know exactly how to divide the problem, there is no need to do a recursion to find out the input for the conquer step. Recursion is like, mathematical summation which means it can also done using a for loop if you know the end conditions clearly.
  • Moreover, without knowing the end conditions when the combine ends, you can never design a successful recursion. For more clearance, we will see about Recursion in next article.