## Thursday 6 January 2011

### Insertion sort using Linked List

Using good data structures for some problems yields good results and sometimes you can only use some data structures for effectively representing a dynamic domain. For example, static arrays can be used for static data set. How about a dynamic data set ? This is where Lists comes into picture.

Metaphor

Lists are sometimes called the dynamic array. In real world, Lists are like infinite set of elements which gets expanded dynamically. For example, take the latest cloud computing trend. When the load in the server increases, the system dynamic adds a new computing resource to maintain the load. The world is so dynamic in real world and as well as in computers. Data is always dynamic.

Static structure means array. Dynamic structures means Lists. Both these structure can take different shapes. Think about the static circular moon. It takes curve, semi-circle, ring shapes. Think about the dynamic stars. They take different shapes names constellations. There are different shapes determined and some unorganized shapes with different sizes are also possible. But in a moon, we have only fixed shapes of fixed size.

Concept

Now, let me tell some new buzz words, trees, stacks, queues, graphs. All these are data structures in computer science which can be dynamic + static. In other words, All these buzz words are shapes and they can grow dynamic or static way.

The dynamic and static is nothing but Linked list and Arrays. We already saw the metaphor about insertion sort. So diving into concept of Linked list.

In Linked list, there are many operations that can be implemented. But some of the normal and useful operations are

• createNode = for creating new dynamic node
• append = for adding the new dynamic node to the list at the end.

There is a specification for how to implement these APIs. This is called as Abstract Data Type(ADT: http://en.wikipedia.org/wiki/Abstract_data_type) which gives a mathematical model for the buzz words like stack, queue, trees, graphs. ADT is nothing but giving a definition of a type in a generalized manner for any kind of system(like 32-bit/4-byte integer means between –2147483647 to +2147483648).The operations defined by an ADT is implemented using either static data structures(arrays) or dynamic data structures(linked list).

I don’t want to bore more with the concepts of Linked List and ADT. But the basic concept behind Linked list is that each data item is linked with each other using pointers. Unlike array where we can use index to access data item, here we need to go through the pointers from the start to reach a particular data item. So, linked list always takes time to reach some element in it.

The important point here is random access within the list is not so easy without knowing the addresses of all the elements in the dynamic list. So the algorithms which stress on usage of particular dynamic data structure, places more stress on storing the addresses of each element.

Code

Linked list APIs. The design of the API is different based on the data structure we intend to use. (remember ADT)

`   1:  struct List`
`   2:  {`
`   3:      int data;`
`   4:      List *next;`
`   5:  };`
`   6:   `
`   7:  List* createNode(int data)`
`   8:  {`
`   9:      List* n = new List;`
`  10:      n->data = data;`
`  11:      n->next = 0;`
`  12:   `
`  13:      return n;`
`  14:  }`
`  15:   `
`  16:  void append(List* head, List* node)`
`  17:  {`
`  18:      List *iter = head;`
`  19:      while(iter->next != 0) iter = iter->next;`
`  20:      iter->next = node;`
`  21:      iter->next->next = 0;`
`  22:  }`

The below code explains insertion sort and also generation of linked list from the array elements. The main characteristic to be highlighted here is inserting the element at the right place. We use prev to indicate the previous node at any point. iter runs until sorting order is violated. So, the new node is inserted between prev and iter.

The special case here is when prev is zero. We need to change the head pointer. IMPORTANT NOTE: Changing head is very risky!!. Sometimes you would get interview questions like write an algorithm without changing the head pointer as well. But for this algorithm, its a must to change to head pointer. You can fight with the interviewer that, you cannot implement this algorithm without changing the head pointer!!

`   1:   `
`   2:  List* insertion_sort_list(int *arr, int size)`
`   3:  {`
`   4:      List *head;`
`   5:      head = createNode(arr[0]);`
`   6:      for(int i =1; i<size;i++)`
`   7:      {`
`   8:          List *prev = 0;`
`   9:          List *iter = head;`
`  10:          while(iter != 0)`
`  11:          {`
`  12:              if(iter->data < arr[i])`
`  13:              {`
`  14:                  prev = iter;`
`  15:                  iter = iter->next;`
`  16:              }`
`  17:              else`
`  18:                  break;`
`  19:          }`
`  20:          List *newNode = createNode(arr[i]);`
`  21:          if(prev == 0)`
`  22:              head = newNode;`
`  23:          else`
`  24:              prev->next = newNode;`
`  25:          newNode->next = iter;`
`  26:      }`
`  27:      return head;`
`  28:  }`
`  29:   `
`  30:  void main()`
`  31:  {`
`  32:      int A[] = { 5, 3, 7, 6, 2, 2};`
`  33:      List *list = insertion_sort_list(A, 6);`
`  34:  }`

Dummy questions

• Why not we swap data for the in-place version of insertion sort?

This is not a dummy question btw. Swapping can be used for in-place version of insertion sort on a array. Remember, we create a hole and swap elements to the hole to create the correct spot to insert the key. As we don’t know the addresses/positions of all the nodes in the list at one shot, this is not a feasible way. Changing head pointer and swapping addresses is a very easy solution!!

“Swap works well only when you know the index” – NOTE: This is true only for insertion sort algorithm.

• Where is the in-place insertion sort code for linked list?

Doing in-place insertion sort is insane. I tried doing it. But felt that, no one would understand it. But am so dummy that, I have to drop it.

If you get to implement something less buggy, please share. If i get it working, i will share it.

Important points

• Random access is not possible in a dynamic list without knowing the addresses of all the entries in the list. This is why in-place insertion sort would take more time, as we need to find the end of the sorted list always which is dynamic.
• Position indicators are generally absent in a list. There is no start and end in the list. But you have a head pointer which marks the start temporarily. head pointer can also change when we change the start entry. This is why, the start and end of a list is always considered null.
• Insertion sort can be done in-place just within linked list or input can be an array which can be used to create a dynamic linked list. Will discuss about in-place insertion sort using linked list when we forgot about insertion sort after some time.
• Saving the next pointers and previous pointers are important concepts in linked list involved implementation of algorithms
• Note that, the core insertion algorithm never change. Still we need to assume a sorted list, pick a number from unsorted list and put it at the right place. This is why i said, data structures never change the core algorithm steps or concept behind it. It just optimizes it or extends it.
• Recursion usage with linked list is really confusing. So, before jumping into recursion with linked list. We will see more examples of Recursion usage.