
Thursday, 13 December 2012

Amortized Analysis - Dummies Overview

Introduction

 "Amortized!!! what the hell is that?" - is the first voice i would hear from any dummy like me. :) But when it comes to algorithms, performance matters. Algorithm is not something we write daily, (write and throw away code). That's the reason why along with algorithms, mathematics becomes a must. But with modern software engineering/agile ways of programming, mathematics is hidden in the form of requirements.
  Amortized - in my term means, paying your home loan using EMI option for life long. :) But the EMI balance reduces in long run as i pay up my interest first. It's is also associated with depreciation in general terms.
 
  But how is depreciation related to algorithms?

Concept

   If a product depreciates fast, we generally don't consider it when it carries a huge price tag. In algorithm terms, a huge price tag (cost) is a big no-no, as we all know. But at the same time, algorithms want fast depreciation. That is, the cost per operation should come down as the input size increases. How is that possible!!?? Who would want a product that costs less but depreciates fast? Well, there are still people who buy made-in-China products. ;)
  
  Just kidding.. Arbitrarily fast depreciation is not possible, but there should at least be some balanced depreciation relative to the size of the data an algorithm is processing. This is where we come up with Big-Oh, Little-Oh, Smallest-Oh.. all that stuff!! :P

  But why that 'O'? There are many other letters in English. It's just a function name, similar to f(x) in mathematics. There are other notations such as little-o and big/little omega and theta for the best and average cases. 'O' is the one we use for indicating the worst case.

More concepts

    Amortized analysis gives us ways to calculate, like a banker would, the cost involved in an algorithm by running through each and every line of code. The standard techniques are clearly explained on many sites: the aggregate method, the accounting method, and the potential method (http://en.wikipedia.org/wiki/Amortized_analysis). Algorithm designers usually use these methods to optimize an algorithm for the long run over huge data sets.
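
  To see the aggregate method in action, here is a minimal Python sketch (the DynamicArray class and the doubling factor are my own toy choices, not taken from the Wikipedia article). Appending n items performs at most about 3n element writes in total, so each append is amortized O(1) even though a single append occasionally costs O(n):

    class DynamicArray:
        """Toy dynamic array: amortized O(1) append via capacity doubling."""

        def __init__(self):
            self.capacity = 1
            self.size = 0
            self.data = [None] * self.capacity
            self.writes = 0  # total element writes = our "cost" counter

        def append(self, item):
            if self.size == self.capacity:
                # Expensive step: double the capacity and copy everything over.
                self.capacity *= 2
                new_data = [None] * self.capacity
                for i in range(self.size):
                    new_data[i] = self.data[i]
                    self.writes += 1
                self.data = new_data
            self.data[self.size] = item
            self.size += 1
            self.writes += 1  # writing the new item itself

    if __name__ == "__main__":
        arr = DynamicArray()
        n = 10_000
        for i in range(n):
            arr.append(i)
        # Aggregate method: total cost <= 3n, so the cost per append <= 3.
        print(arr.writes / n)  # stays a small constant (below 3)

  The doubling choice is what makes the banker's arithmetic work out: each cheap append "saves up" for its share of the next big copy.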

  More to come from scratch about algorithms.
 



Thursday, 13 January 2011

Asymptotic Notation–In a mathematical way

 

“A simplified way to calculate the running time of an algorithm. The primitive way is to count the number of iterations and instructions and their constant running time on a machine. But tracking the running time on every machine is impossible. Instead we can use asymptotic notation to eliminate the machine-dependent constants.”

Metaphor

  The time taken to do something really matters these days. And not just time; resource usage matters too. There are many statistical ways to calculate how much time or resource a job takes, and we need the same kind of measures for a computer algorithm as well.

Concept

  Consider that your algorithm's running time can be represented as a function f(n), where n is the input size. We need to estimate f(n). How can we do this? Take the constant factor contributed by the machine as “c”. Beyond a certain input size, the running time becomes proportional to a simpler growth function, which we call g(n), multiplied by that constant: c.g(n). The exact value of “c” is never taken into account; instead, an inequality is created to avoid it.

There are three inequalities possible between f(n) and c.g(n) (note that the bounds need to hold only from some input size onwards, since the running time keeps growing with the size and type of input). For example, if f(n) = 3n + 5, then f(n) <= 4n for all n >= 5, so g(n) = n works with c = 4.

  • O (worst case): f(n) <= c.g(n) for all n beyond some starting size: an upper bound on the running time as the input size heads to infinity. Here, we consider the largest running time possible for a given input size. Note that f(n) is the actual running time on an input of size n, and g(n) is the simpler function we compare it against.
  • Ω (best case): f(n) >= c.g(n) for all n beyond some starting size: a lower bound on the running time, covering the case where f(n) runs at its fastest. This captures the lowest running time possible for some input.
  • Θ (average/tight case): c1.g(n) <= f(n) <= c2.g(n) for all n beyond some starting size: the running time is sandwiched between two constant multiples of g(n). This sits between the above two notations.

The above notations are called Big-Oh, Big-Omega and Big-Theta respectively. For most algorithms, people ask only about the Big-Oh case. Read more about how to determine this value here: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
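
To make the Big-Oh bullet concrete, here is a small Python sketch (my own toy example, not from the linked article). It counts the comparisons made by a linear search in the worst case; that count plays the role of f(n), and it never exceeds c.g(n) with g(n) = n and c = 1:

    def linear_search(items, target):
        """Return (index, comparisons); comparisons is the 'cost' f(n)."""
        comparisons = 0
        for i, value in enumerate(items):
            comparisons += 1
            if value == target:
                return i, comparisons
        return -1, comparisons

    if __name__ == "__main__":
        for n in (10, 100, 1000):
            items = list(range(n))
            # Worst case: the target is absent, so every element is compared.
            _, f_n = linear_search(items, -1)
            g_n = n  # our candidate g(n)
            print(n, f_n, f_n <= 1 * g_n)  # f(n) <= c.g(n) with c = 1

Every run prints True, which is exactly what the statement "linear search is O(n)" promises.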

Important points

  • The same notation can be used to represent space vs input size as well
  • Upper bound, lower bound, middle range: the upper bound covers all the iterations of an algorithm over the full input set, the lower bound covers only the best-case inputs, and the middle range covers what lies in between.
  • Read more about asymptotes here: http://en.wikipedia.org/wiki/Asymptote. It's a line that a curve approaches ever more closely as the curve runs off to infinity.
  • Remember drawing tangents and asymptotes on graph sheets in school? :) That is the math behind these notations, which basically describe an algorithm's behaviour as the input grows without bound.
  • There are also Little-Oh and Little-Omega. They are strict bounds: the inequality must hold for every constant c, while the Big notations only need it to hold for some constant c (see the sketch after this list).
  • I think this post became a pure mathematics post. As I said, we need to study our multiplication tables from 1st grade to do this perfectly.. ;)
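
As a rough numeric illustration of that strictness (my own toy check, with f and g as labelled in the comments), the ratio f(n)/g(n) tells the two notations apart: for Little-Oh it must shrink towards zero, no matter which constant you pick.

    def ratios(f, g, ns):
        """Print f(n)/g(n) for growing n; a ratio heading to 0 suggests f is o(g)."""
        for n in ns:
            print(n, f(n) / g(n))

    if __name__ == "__main__":
        ns = [10, 100, 1000, 10_000]
        # f(n) = n is o(n^2): the ratio heads to 0, so n < c*n*n eventually
        # holds for EVERY constant c > 0 (the strict Little-Oh requirement).
        ratios(lambda n: n, lambda n: n * n, ns)
        # f(n) = 5n is O(n) but NOT o(n): the ratio sticks at the constant 5.
        ratios(lambda n: 5 * n, lambda n: n, ns)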