
Thursday, 12 May 2011

Set Theory–A Dummies Guide

“If it’s possible to express the world as a set, a mathematician will do it!! :) The power of sets is so vast that they are used heavily in statistics and computing. To understand algorithms clearly, you need to understand set theory!!”

Background

    I am assuming that the reader already knows a little bit about sets. A set is a collection of unique objects. There is the natural number set (no negatives), the real number set (floating point), and the integer set (with negatives). You can also have sets of arbitrary symbols. But the major operations that can be done between two sets, A & B, are:

  • Intersection: common items between A & B, $A \cap B$
  • Union: all unique items from A & B (common items appear only once), $A \cup B$
  • Difference: $A - B$ removes from A every item that also appears in B; the remaining elements of A are what you get.
  • Disjoint: A & B have no items in common

I assume you understand the above general and basic concepts; a quick sketch of them in code follows, before we move on into more details.
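
Here is that sketch, using the C++ standard library (std::set keeps its elements sorted and unique, so the <algorithm> set operations apply directly; the values in A and B are just made up for illustration):

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>

int main()
{
    std::set<int> A = {1, 2, 3, 4};
    std::set<int> B = {3, 4, 5, 6};
    std::set<int> result;

    // Intersection: common items between A & B
    std::set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                          std::inserter(result, result.begin()));
    for (int x : result) std::cout << x << " ";   // 3 4
    std::cout << std::endl;

    result.clear();
    // Union: all unique items (common items appear only once)
    std::set_union(A.begin(), A.end(), B.begin(), B.end(),
                   std::inserter(result, result.begin()));
    for (int x : result) std::cout << x << " ";   // 1 2 3 4 5 6
    std::cout << std::endl;

    result.clear();
    // Difference A - B: items of A that are not in B
    std::set_difference(A.begin(), A.end(), B.begin(), B.end(),
                        std::inserter(result, result.begin()));
    for (int x : result) std::cout << x << " ";   // 1 2
    std::cout << std::endl;
    return 0;
}

A and B are disjoint exactly when the intersection above comes out empty.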

Point-By-Point

   Extra concepts which you might require in some problem solving & algorithms are listed below (a small worked example follows the list),

  • singleton set: a set with only 1 element in it.
  • partition of a set: if a set gets split into non-empty subsets that don’t overlap and together cover everything, that collection is called a partition. [Check here: http://en.wikipedia.org/wiki/Partition_of_a_set]
  • complement of a set: if you take a set A, the complement $\bar{A}$ holds the elements which are not present in A.
  • universal set: $A \cup \bar{A}$ is the universal set, which includes all the elements under consideration.
  • n-set: if you have “n” elements in a set, it’s called an n-set.
  • k-subset: a “k”-element subset taken from an “n”-element set is called a k-subset of the bigger set.
  • Cartesian product set: the product of two sets, pair by pair!! A x B = {a,b} x {a,b} = {(a,a),(a,b),(b,a),(b,b)}
  • binary relation: a subset of a Cartesian product of sets. For example, if A & B are subsets of the natural numbers and the relation keeps a pair only when a < b, for a in A, b in B, then the relation picks out exactly the ascending pairs between the two sets!!
  • equivalence relation: a special kind of binary relation!! It must be reflexive, symmetric and transitive; two elements are equivalent only if the relation ties them together, and the resulting groups are disjoint.
  • equivalence class: all elements equivalent to a given one form its class. E.g., if you need the even numbers out of the natural numbers, define a relation so that equivalence holds, like “a + b is even”; the class containing 0 is exactly the evens.
  • functions: a refinement of the Cartesian product. Instead of keeping arbitrary pairs, a function is a binary relation where each input element maps to exactly one output element.
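
To make the equivalence-class idea concrete, here is the promised worked example (my own illustration, not from any textbook): it buckets the numbers 0..9 by the relation “a ~ b whenever a + b is even”, i.e. by remainder mod 2:

#include <iostream>
#include <map>
#include <vector>

int main()
{
    // Relation: a ~ b iff a % 2 == b % 2 (equivalently, a + b is even).
    // It is reflexive, symmetric and transitive, so it is an equivalence
    // relation, and its classes partition the set {0,...,9}.
    std::map<int, std::vector<int>> classes;
    for (int n = 0; n < 10; n++)
        classes[n % 2].push_back(n);   // key = class representative

    for (auto& c : classes) {
        std::cout << "class of " << c.first << ": ";
        for (int x : c.second) std::cout << x << " ";
        std::cout << std::endl;
    }
    // prints: class of 0: 0 2 4 6 8   and   class of 1: 1 3 5 7 9
    return 0;
}

The two buckets are disjoint and together cover the whole set: an equivalence relation always hands you a partition for free.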

Practical Applications

  • The main application will be on Graphs!! :) We will study them soon. All the set concepts show up in a graph!!
  • Since it can be put as a graph, a Tree is just a special case of a Graph!! Tree concepts are all rooted in set theory!!
  • The binary trees we studied are made of three disjoint sets: the root node, the left subtree and the right subtree. Disjoint means these 3 sets share no nodes at all!! Any of them may also be empty. This makes the binary tree a special case of trees.
  • Partition of a set relates directly to substrings, subsets of an array, etc. (A sketch that enumerates partitions follows the example below.)

Example: The set { 1, 2, 3 } has these five partitions:

    • { {1}, {2}, {3} }, or 1/2/3.
    • { {1, 2}, {3} }, or 12/3.
    • { {1, 3}, {2} }, or 13/2.
    • { {1}, {2, 3} }, or 1/23.
    • { {1, 2, 3} }, or 123
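
Here is the promised sketch that enumerates partitions (my own recursive illustration): each element either joins an existing block or opens a new one, which prints exactly the five partitions above for n = 3:

#include <iostream>
#include <vector>

// Enumerate all partitions of {1..n}: each element either joins an
// existing block or opens a new one.
void enumerate(int element, int n, std::vector<std::vector<int>>& blocks)
{
    if (element > n) {                      // every element placed: print
        for (auto& b : blocks) {
            std::cout << "{ ";
            for (int x : b) std::cout << x << " ";
            std::cout << "} ";
        }
        std::cout << std::endl;
        return;
    }
    for (size_t i = 0; i < blocks.size(); i++) {   // join block i
        blocks[i].push_back(element);
        enumerate(element + 1, n, blocks);
        blocks[i].pop_back();
    }
    blocks.push_back({element});                   // open a new block
    enumerate(element + 1, n, blocks);
    blocks.pop_back();
}

int main()
{
    std::vector<std::vector<int>> blocks;
    enumerate(1, 3, blocks);   // prints the 5 partitions of {1,2,3}
    return 0;
}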

This maps neatly onto how Python lists behave!! :) Python slicing and Perl’s splice carve out exactly such pieces, and both languages offer union and intersection over arrays/lists (Python via its set type).

  • Red-black trees & graph colouring both use set theory principles to distinguish differently coloured nodes
  • Equivalence relations are used to extract subsets which exhibit equivalence, e.g. the famous pair-sum problem (http://www.codechef.com/problems/PAIRSUM/).
  • Functions over a set can be used to extract elements matching a binary condition!! In practice we do this with nested for loops & if conditions :)

Wednesday, 11 May 2011

Random number generation logic - Linear congruential generator

“Randomness is really important for probabilistic actions, like card shuffling, coin tossing etc. How can we represent that in a machine? There are lots of random number generators available!!”

Introduction

  Random number generators have a huge application space, mainly in cryptography. But we need random number generators for the articles we will soon start studying: HASH TABLES!! :) You need well-spread, unique keys for storing the values so that search is O(1), similar to an array.

  There are types of randomness, however!! Pseudo-randomness & “true” randomness, all pinned down with heavy mathematical theorems. “True” randomness is really important for security algorithms like RSA. But if the randomness fails and a pattern appears, the security system can be cracked as well!! :)

   Randomness is mystic, much like how our universe behaves :) So it’s impossible to get true randomness without tapping the world outside the machine.

  Watch this movie to understand how tough it is.. :) It’s about a mathematician trying to find patterns, who is driven to a violent breakdown: http://www.imdb.com/title/tt0138704/ (a psychological movie.. beware)

  Ok, down to business. Let’s get into some simple random generators!!

Concept

       LCG – the Linear Congruential Generator is a method to generate pseudo-random numbers up to a particular number of draws; after that, the numbers start repeating. The length of this repeating cycle is called the period, and the value the sequence starts from is the seed.

 

The generator is defined by the recurrence $X_{n+1} = (a X_n + c) \bmod m$, where

 $m,\ 0 < m$ — the "modulus"
 $a,\ 0 < a < m$ — the "multiplier"
 $c,\ 0 \le c < m$ — the "increment"
 $X_0,\ 0 \le X_0 < m$ — the "seed" or "start value"

Source: http://en.wikipedia.org/wiki/Linear_congruential_generator

The outputs range from 0 up to m-1. A bare linear congruence quickly falls into repeating numbers; the multiplier “a” is what stretches the cycle. We can keep “c” either zero or non-zero.

If c is zero and you have only “a”, it’s called the multiplicative congruential method.

Note that the values of m, a, c decide the amount of randomness you get from this function. Good values were found by experimentation and must satisfy well-studied properties. :D

Refer to the Wikipedia link for more details; a minimal sketch follows.
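
As a rough sketch (not any library’s exact implementation), here is an LCG in C++; the constants a = 1103515245, c = 12345, m = 2^31 are the well-known example values from the C standard’s sample rand(), and other libraries pick their own:

#include <iostream>

// Minimal LCG: X(n+1) = (a * X(n) + c) mod m
unsigned long lcg_state = 1;   // X0, the seed

unsigned long lcg_next()
{
    const unsigned long a = 1103515245;
    const unsigned long c = 12345;
    const unsigned long m = 2147483648UL;   // 2^31
    lcg_state = (a * lcg_state + c) % m;
    return lcg_state;
}

int main()
{
    for (int i = 0; i < 10; i++)
        std::cout << lcg_next() % 100 << " ";   // squeeze into [0,100)
    std::cout << std::endl;
    return 0;
}

Note that reducing the output modulo 100 at the end only squeezes the range; the period still comes from m, a and c.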

In C/C++ we mostly use the time() function as the seed, which gives us good precision, like below,

   srand(time(NULL));

After that, we squeeze the output into a range by taking the result modulo a value,

   rand() % 500;

The above generates a random number below 500, with X0 seeded from the system time and the library’s built-in multiplier and increment. But note: taking the result modulo 500 does not make the first 500 draws distinct!! You might hope to see 500 different numbers before any repeat, but that won’t happen: with only 500 possible outputs, repeats show up almost immediately. :D
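
You can verify that empirically; this little check (my own sketch) counts how many distinct values the first 500 draws produce and where the first repeat lands:

#include <cstdlib>
#include <ctime>
#include <iostream>
#include <set>

int main()
{
    srand(time(NULL));
    std::set<int> seen;
    int first_repeat = -1;
    for (int i = 0; i < 500; i++) {
        int r = rand() % 500;
        if (seen.count(r) && first_repeat < 0)
            first_repeat = i;            // draw index of the first collision
        seen.insert(r);
    }
    std::cout << "distinct values in 500 draws: " << seen.size() << std::endl;
    std::cout << "first repeat at draw: " << first_repeat << std::endl;
    return 0;
}

On a typical run the first repeat appears within the first few dozen draws, which is the birthday effect at work.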

How does this recursive function manage to spread the numbers around?

[Plot omitted: successive LCG values scattered over the range. The spread looks uniform precisely because m and a are well chosen, e.g. prime..!!]

Check more plots here: http://demonstrations.wolfram.com/LinearCongruentialGenerators/

Code

Try the below code:

#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;

int main()
{
    srand(time(NULL));
    for(int i=0;i<1000;i++)
        cout<<": "<<rand()%100<<endl;
    return 0;
}

I got the following as the last chunk of output, with “8” coming back after just a couple of extra iterations!!..

Output
 
: 8
: 18
: 37
: 8
: 33
: 14

Tuesday, 3 May 2011

Relation between Binary Search Tree and Quicksort–MIT lecture series (hand picked)

 

Related links

http://analgorithmaday.blogspot.com/2011/02/partitioning-algorithmin-place.html

http://analgorithmaday.blogspot.com/2011/02/quick-sort.html

http://analgorithmaday.blogspot.com/2011/03/binary-search-treeintroduction.html

Introduction

   We have studied lots of logarithmically performing algorithms which internally use binary trees; the tree is the reason for the logarithmic time. But have we studied it more deeply: why is the performance log n for a tree? How does the recurrence relation work? What math does quicksort involve?

Let’s start studying some math behind algorithms with some interesting hand-picked MIT lectures.

Video

Binary Search Tree and Quicksort

Explanation

   How would you create a BST from an array?

A[n] = {3, 1, 8, 2, 6, 7, 5}, where A is an array and n is its size

BST insert comes out like this,

          3
        /   \
      1       8
        \    /
         2  6
           / \
          5   7

When you do an in-order traversal of the above BST, you get the sorted array..

In-order: 1 2 3 5 6 7 8

So, this means, we have to do n BST inserts and then do an in-order traversal to get a sorted array. This itself is a new sorting method named BST sort. :)

BST-Sort(A)
   Tree T
   for i = 0 to n-1
      do Tree-Insert(T, A[i])
   
   Inorder-Traverse-Tree(T)
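
As a concrete version of the pseudocode (my own translation into C++, not the lecture’s code):

#include <iostream>

struct Node {
    int key;
    Node *left, *right;
    Node(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Standard BST insert: smaller keys go left, larger go right.
Node* tree_insert(Node* root, int key)
{
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = tree_insert(root->left, key);
    else                 root->right = tree_insert(root->right, key);
    return root;
}

// In-order traversal prints the keys in sorted order.
void inorder(Node* root)
{
    if (root == nullptr) return;
    inorder(root->left);
    std::cout << root->key << " ";
    inorder(root->right);
}

int main()
{
    int A[] = {3, 1, 8, 2, 6, 7, 5};
    Node* T = nullptr;
    for (int i = 0; i < 7; i++)   // n tree inserts: O(n h)
        T = tree_insert(T, A[i]);
    inorder(T);                   // O(n): prints 1 2 3 5 6 7 8
    std::cout << std::endl;
    return 0;
}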

What is the performance of the above sort?

There are ‘n’ tree inserts, so that part is O(nh), where h is the height of the tree.

Then an in-order traversal is done, which is O(n). So the total is O(nh + n) = O(nh).

For a reasonably balanced tree, the height follows from the number of nodes in it: h ≈ log n.

Example: here the number of nodes = 7, so the height would be h = log 7 ≈ log 2^3, i.e., h = 3.

But take the example of a sorted (or reverse-sorted) array.

A[n] = {1,2,3,4,5,6}

If you create a BST with these values, you get a tree like this,

1
 \
  2
   \
    3
     \
      4
       \
        5
         \
          6

What is the performance of the Tree-Insert in the above for loop now? In such a case, the height of the tree is equal to the number of nodes!!

h = log n holds (approximately) only for trees which are at least partially balanced. So, if the height equals the number of nodes, you get O(n^2) even for this BST sort.

Can’t you recollect that this matches the performance of the quicksort algorithm?

Yes, it matches!! :) Quicksort gives O(n^2) performance if the array is already sorted. In effect, quicksort traces out the same BST internally; just the order in which the comparisons are made is different. The video explains that beautifully.

So, to avoid that, randomized quicksort came into the picture.

Randomized quicksort and randomly built BSTs rest on the same idea, and self-balancing trees attack the same unbalanced-split problem deterministically!!!

If you understand how randomized quicksort works, AVL trees, red-black trees etc. become much easier to understand.

From the above example it’s also clear why a self-balancing tree is a must for getting log n time complexity!!

Monday, 7 March 2011

Sorting in linear time

“Not so important, but some ask!! You have an integer array for which you know that all the elements are less than some value… How efficiently can you sort such an array? You would say merge sort, heap sort, quick sort.. but all of them yield only O(n log n) at best. That’s where different methodologies for sorting come up”

Introduction

   These are quite old ways of sorting, but useful when you want to do sorting in linear time without giving up stability. A new concept comes up here: stability. Stability is nothing but maintaining the relative order of repeated elements in the sorted array. That is, if {2,2} appears as part of the unsorted array, then when sorted you should get the first ‘2’ first and the second one following it (in the same order as in the unsorted array).

So, how stable are the sorts we have studied till now?

Say, you have {4,2,0,1,1,9}

In insertion sort, you get {1,1} in the sorted sequence in order. So, it’s stable.

In merge sort, you split into two lists, sort each and merge. If the merge happens between the sorted halves {0,2,4} and {1,1,9}, you still get {0,1,1,2,4,9} with {1,1} in the same order. This holds when the merge uses extra space; if there is no extra space, you may disobey the order, since you need to swap or relink.

In quicksort or heapsort, you do all swaps in place. Elements like {1,1} can get swapped past each other, which changes their order!! So they are not stable!!

Most comparison sorts that swap elements without considering their original order are non-stable sorts. (A concrete check with the standard library follows.)
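
As a small illustration (this leans on the standard C++ library, not on any sort we wrote here), sort (key, position) pairs by key alone; std::stable_sort guarantees equal keys keep their input order, while plain std::sort makes no such promise:

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    // (key, original position) pairs: two records share the key 1.
    std::vector<std::pair<int,int>> v = {{4,0},{2,1},{0,2},{1,3},{1,4},{9,5}};
    auto by_key = [](const std::pair<int,int>& a,
                     const std::pair<int,int>& b) { return a.first < b.first; };

    // std::stable_sort keeps equal keys in input order,
    // so (1,3) is guaranteed to come out before (1,4).
    std::stable_sort(v.begin(), v.end(), by_key);
    for (auto& p : v)
        std::cout << "(" << p.first << "," << p.second << ") ";
    std::cout << std::endl;
    // (0,2) (1,3) (1,4) (2,1) (4,0) (9,5)
    // With plain std::sort the order of (1,3) and (1,4) is unspecified.
    return 0;
}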

Everything we have seen till now is a comparison sort.

Concept

   There is a lesson in what we learnt from comparison sorts and their performance. Let’s revise the performance of all the sorts we have learnt till now.

Merge sort

Worst case – O(n log n)

Average case – Theta(n log n)

Best case – Omega(n log n) (in fact merge sort is Theta(n log n) in all three cases)

Insertion sort

Worst case – O(n^2). On average it stays O(n^2) (with a smaller constant), but a nearly sorted input reduces the shifts needed, and the best case (already sorted) is actually O(n).

Heap sort

Same as merge sort. Building the max heap takes O(n), and each of the n extractions pulls in an O(log n) heapify, giving O(n log n).

Quick sort

Worst case – O(n^2), e.g. if the array is already sorted or reverse-sorted.

Best and Average – O(n log n) with very less constant time.

All the above sorts work from one basic operation: comparisons. Comparison and swap are the costliest operations in all of the above, and no comparison-based sort can beat the Omega(n log n) lower bound.

In the case of heapsort, building the heap requires only O(n). But an extra O(log n) per element is required to bubble elements up or down, since we still rely on comparisons.

If we ask, can we ever sort in time less than O(n log n)? It’s a big yes!! But not with comparison as the base operation.

Conclusion

   In coming articles, we will start reading about a completely different family of sorts, used only for special cases rather than as general-purpose sorts. Sorting can be done in linear time, with extra space, once we stop relying on comparisons. Since most of these sorts are possible only on integers, the category is even called integer sorting. A small taste follows.
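
As that taste, here is a minimal counting sort sketch (one of the integer sorts we will meet properly later); it is stable and runs in O(n + k) when every element is known to be below k:

#include <iostream>
#include <vector>

// Counting sort: stable, O(n + k), valid when every element is in [0, k).
std::vector<int> counting_sort(const std::vector<int>& a, int k)
{
    std::vector<int> count(k, 0), out(a.size());
    for (int x : a) count[x]++;                   // histogram
    for (int i = 1; i < k; i++)                   // prefix sums: count[x] =
        count[i] += count[i - 1];                 // number of elements <= x
    for (int i = (int)a.size() - 1; i >= 0; i--)  // walk backwards so equal
        out[--count[a[i]]] = a[i];                // keys keep input order
    return out;
}

int main()
{
    std::vector<int> a = {4, 2, 0, 1, 1, 9};
    for (int x : counting_sort(a, 10)) std::cout << x << " ";
    std::cout << std::endl;   // 0 1 1 2 4 9
    return 0;
}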

Wednesday, 23 February 2011

Quick sort

“The beauty of quicksort is that it’s so quick!! Unlike other sorting algorithms based on divide and conquer, where divide, conquer and combine are all independent steps, in quicksort divide and combine are done in one shot: the divide step itself ends up sorting the whole array!!”

See Also

http://analgorithmaday.blogspot.com/2011/02/partitioning-algorithmin-place.html

Introduction

   Quicksort is the most discussed algorithm and is used in more code than you might think. That’s why it’s the simplest interview question to be asked!! :) If even this cannot be answered, you will be shown the door. The first important step is partitioning. Please study the above link fully!!

  Repeated partitioning is what quicksort does. While partitioning again and again, it keeps shrinking the input handed to each partition. So the partially sorted pieces on both sides of a pivot get partitioned further.

Concept

   The important concept to understand in quicksort is that we take each and every element of the array as a pivot at some point on the way to the final sorted array. It’s not possible to take one element as pivot and sort the whole array!! As a dummy, I too thought so, but that’s not the case.

Take for example,

   13   4  5  6  11  10  2

Partition=1: pivot=2,

   13 4 5 6 11 10 (2)

Partition=2: pivot = 10

    4 5 6 (10) 11 13

Partition=3: left, pivot=6, right pivot = 13

4 5 (6)       11 (13)

Partition=4: left pivot=5, right pivot=11

4 (5)   (11)

The parenthesised pivots are in their final positions; ignore them when deciding the next pivots. This is also the main concept of quicksort!! (The exact in-place movement of elements is in the partition code below.)

The tree representation,

    13 4 5 6 11 10 (2)
            |
     4 5 6 (10) 11 13
        /        \
   4 5 (6)    11 (13)

……….. Check some reference books to get a perfect quick sort tree..!! :)

So, the final sorted list is 2 4 5 6 10 11 13.

But note that we don’t partition single-element arrays.. :) That is the base condition for quicksort. You can also see that each pivot rearranges the array. So if you picture it as a tree: to partition, go top to bottom; once everything is sorted, the pieces assemble bottom to top!! The pivots occur in the order shown in the tree above, and note that every element becomes a pivot!!

Code

#include <iostream>
 
void swap(int& a, int& b)
{
    int tmp = a;
    a = b;
    b = tmp;
}
 
int partition(int* arr, int start, int end)
{
    /* 
      1. decide a pivot element, let it be arr[end]
      2. categorize the remaining arr, start -> end-1
      3. keep 2 indices i, j: initially i=start-1, j=start
      4. invariant: i+1 to j-1 always contains elements > pivot
      5. which means i marks the end of the smaller-elements region
      6. j marks the end of the larger-elements region
     */
 
    /*
    Example:
      start=0                                           end=12
  i=-1   j=0                                            
        3    6    2    1    9    4    12    19    13    10    8    5    11
        
        After you do this partition,
        start                            i            j       end
        3    6    2    1    9    4    10    8    5    [11]    12    19    13    
    The important point here is, after 4, we see 8 which is < 11.
    So, it needs to be inserted.
    Because of the above point we need 2 indices.
    */
 
    int i,j, pivot;
    pivot = arr[end];
    i=start-1;
    for(j=start; j<=end-1; j++)
    {
        if(arr[j] <= pivot) {
            i=i+1;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i+1], arr[end]);
    return i+1;
}
 
void quick_sort(int* arr, int start, int end)
{
    if(start < end) {
        int mid = partition(arr, start, end);
        quick_sort(arr, start, mid-1);
        quick_sort(arr, mid+1, end);
    }
}
 
int main()
{
    int A[]={3,    6,    2,    1,    9,    4,    12,    19,    13,    10,    8,    5,    11};
    int B[]={6,10,2};
 
    //sorting 3 elements
    quick_sort(B, 0, 2);
    //sorting more elements
    quick_sort(A, 0, 12);
 
    //print the sorted arrays
    for(int i=0;i<13;i++) std::cout<<A[i]<<" ";
    std::cout<<std::endl;
    for(int i=0;i<3;i++) std::cout<<B[i]<<" ";
    std::cout<<std::endl;
    return 0;
}

Important notes

  • quicksort has the perfectly predictable average case of Theta(n log n).
  • quicksort also has a Theta(n^2) worst case. Merge or heap sort look safer since even their worst case is O(n log n); but merge sort pays extra space, and both carry larger constant factors!!
  • quicksort performs at its worst when the array is already sorted!! That’s because all the splits will be bad: the pivot always ends up at the end!!
  • If the pivot is at the end even after the split, it’s a bad split!! A good split means you get the pivot near the middle after a partition. (A randomized-pivot fix is sketched right after this list.)
  • To do quicksort in descending order, just change the <= to >= in the partition code!!
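
Following on from the worst-case notes above, the usual fix (the randomized quicksort mentioned in the MIT-lecture post) is to pick the pivot at random. A sketch, reusing the swap() and partition() from the code above:

#include <cstdlib>
 
// Assumes the swap() and partition() functions defined above.
int randomized_partition(int* arr, int start, int end)
{
    // Pick a random pivot and move it to the end, then reuse
    // the usual last-element partition. This makes consistently
    // bad splits on sorted input very unlikely.
    int r = start + rand() % (end - start + 1);
    swap(arr[r], arr[end]);
    return partition(arr, start, end);
}
 
void randomized_quick_sort(int* arr, int start, int end)
{
    if(start < end) {
        int mid = randomized_partition(arr, start, end);
        randomized_quick_sort(arr, start, mid-1);
        randomized_quick_sort(arr, mid+1, end);
    }
}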

Monday, 21 February 2011

Tree traversal for binary trees

“We have seen binary trees and specific use cases of binary trees!! But trees don’t have to be binary!! One root can have multiple children, and there can be cycles; once cycles appear, the structures are known as graphs!! The structures are all application-driven, but how you traverse them is not!! There are specific ways to visit the nodes in a tree.”

Introduction

     As we already studied heaps in a previous article, you should have noted two ways of moving inside the tree: up-heap and down-heap. In up-heap we move from bottom to top, in down-heap from top to bottom. This is a form of traversal which is application-specific, specific to the heap. Moreover, insertion happens left to right, filling a whole level before starting the next, which is a kind of level order, or breadth-first. Similarly, there are many ways a tree can be traversed.

Concept

  A tree must be understood as a type of graph from now on. In a graph, the main applications are built around traversal, which we will see soon. Similarly, in a tree, there are many applications linked with tree traversal. The types of traversal possible in a tree are,

  • Normal graph traversals – breadth-first (also known as level order in a binary tree), depth-first
  • Normal tree traversals – in-order, pre-order, post-order

  You can do any of the above traversals on a tree, depending upon the application.

Diagrams for each of the tree traversals, along with level order, are at the link below.

Courtesy: http://www.macs.hw.ac.uk/flex/BScCS/ds1/activity10.html

Code

#include <iostream>
#include <stack>
 
// parenthesised so the macros expand safely (the original i-1/2
// would evaluate as i - (1/2) = i, a genuine bug)
#define PARENT(i) (((i)-1)/2)
#define LEFT(i) (2*(i)+1)
#define RIGHT(i) (2*(i)+2)
#define ARRAYSIZE(a) (sizeof(a)/sizeof(a[0]))
 
int A[14] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14};
 
/*
The tree looks like this,
                    1
            2                3
       4        5        6        7
    8    9      10  11  12  13  14
preorder: root=>left=>right
1. print the root, then keep walking left, printing every node, until NULL
2. then back up and take the right child, repeating the same leftward walk
    1 2 4 8 9 5 10 11 3 6 12 13 7 14
postorder: left=>right=>root
8 9 4 10 11 5 2 12 13 6 14 7 3 1
*/
 
void preorder_stack(int idx)
{
    std::stack<int> obj;
    int j = idx;
    while(j < (int)ARRAYSIZE(A)) {
        int i = j;
        // walk down the left spine, printing each node
        // (every node we pass is the root of a subtree)
        while(i < (int)ARRAYSIZE(A)) {
            std::cout<<A[i]<<" ";
            obj.push(i);
            i=LEFT(i);
        }
        // back up one node and move to its right child
        j = RIGHT(obj.top());
        obj.pop();
        // keep backing up past nodes whose right child is out of range
        while(j >= (int)ARRAYSIZE(A) && !obj.empty()) {
            j = RIGHT(obj.top());
            obj.pop();
        }
    }
}
 
void preorder(int idx)
{
     if ( idx < ARRAYSIZE(A)){
         std::cout<<A[idx]<<" ";
         preorder(LEFT(idx));
         preorder(RIGHT(idx));
     }
}
 
void postorder(int idx)
{
    if(idx < ARRAYSIZE(A)) {
        postorder(LEFT(idx));
        postorder(RIGHT(idx));
        std::cout<<A[idx]<<" ";
    }
}
 
void inorder(int idx)
{
    if(idx < ARRAYSIZE(A)) {
        inorder(LEFT(idx));
        std::cout<<A[idx]<<" ";
        inorder(RIGHT(idx));
    }
}
int main()
{
  preorder_stack(0);
  std::cout<<std::endl;
  preorder(0);
  std::cout<<std::endl;
  postorder(0);
  std::cout<<std::endl;
  inorder(0);
  std::cout<<std::endl;
  return 0;
}

Important points

  • This code uses an array, which is unusual; mostly you will see linked-list based implementations
  • We will see more of linked lists in coming chapters before jumping into trees built over linked lists
  • Preorder – we need to visit all the roots down the left side and then the right, but whenever we see a root, we make sure the root is visited first!! then left and then right
  • Postorder – visit all left and right nodes before visiting the root
  • Inorder – visit the left node, then the root, then the right node.
  • Only for preorder have I written an iterative implementation using a stack. It can be reused for all the others and is very useful when you need to do traversal using iterations. (A level-order sketch follows below.)
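
Level order (breadth-first) was listed in the Concept section but not implemented above; here is a sketch using std::queue (add #include <queue> at the top) against the same array A and macros:

void levelorder(int idx)
{
    // visit nodes level by level using a queue of array indices
    std::queue<int> q;
    q.push(idx);
    while(!q.empty()) {
        int i = q.front(); q.pop();
        if(i >= (int)ARRAYSIZE(A)) continue;   // skip missing children
        std::cout<<A[i]<<" ";
        q.push(LEFT(i));
        q.push(RIGHT(i));
    }
    // for the array above this prints 1 2 3 4 5 6 7 8 9 10 11 12 13 14
}

For an array-backed complete tree this is simply the array order, which is exactly why heaps store their level-order layout in an array.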