Sunday, 15 September 2013

Remove duplicates in a Linked List

 

Introduction

   Removing duplicates from a linked list is a popular interview question. Although its simple, some people may ask the implement the linked list as such along with this. Yes, you need to add items to the linked list, remove an item from the linked list to do the removeDuplicates function.

Problem

  Move step-by-step.  (The questions to ask before you start)

  1. Is the linked list sorted?
  2. How large is the linked list? Is the size of the linked list fixed?

The above questions are because of the following reasons,

  1. sorted list means duplicates are found in single iteration. (just a single scan and remove)
  2. huge list means – doing a easy sorting would help. Dynamic list means – no sorting, need to try other approaches like heap. If space allowed, you can use hash map

Always go with the brute force way. 2 for loop approach.

Solution

 
class List
{
    int data;
    List* next;
    
public:
    List(int d)
    {
        this->data = d;
        this->next = NULL;
    }
 
    void append(int item)
    {
        List* iter = this;
        
        while(iter->next != NULL) iter = iter->next;
 
        List* newNode = new List(item);
 
        iter->next = newNode;
    }
 
    void deleteNode(List*& node)
    {
        if(this == node)
        {
            List *tmp = node->next;
            node->data = node->next->data;
            node->next = node->next->next;
            delete tmp;
        } else {
            List *iter = this;
            while(iter && iter->next != node) iter = iter->next;
 
            if(iter != NULL) {
                iter->next = node->next;
                delete node;
                node = iter;
            }
        }
    }
    void remove_dup()
    {
        for(List *iter1 = this; iter1 != NULL; iter1 = iter1->next) {
            for(List *iter2 = iter1->next; iter2 != NULL; iter2 = iter2->next) {
                if(iter1->data == iter2->data) {
                    deleteNode(iter2);
                }
            }
        }
    }
 
    void printList()
    {
        List *iter = this;
        while(iter) {
            std::cout<<iter->data<<"->";
            iter = iter->next;
        }
    }
};
 
 
void main()
{
    List obj(10);
    List *ptr = &obj;
    obj.append(2);obj.append(3);obj.append(5);
    obj.append(3); obj.append(6);
    obj.append(2); 
 
    obj.deleteNode(ptr);
    obj.printList();
    obj.remove_dup();
    std::cout<<std::endl;
    obj.printList();
}

Things found while writing code:

  • this pointer is const!! :) which means List *const this; u cannot change what it points to. But u can change the object inside it. const member function has it like, List const *const this;
  • As we delete node using next pointers in the list, to free the pointer you should take a backup of the pointer.
  • Base test cases must be run to catch errors: 2 nodes, 1 node, first node, last node.
  • Reference to a pointer will help in modifying the pointer in a function.
  • The above code is a O(n2) complexity. Brute force. Best way is to do merge sort. – O(n log n) and when doing the merge, remove duplicates.

References

http://analgorithmaday.blogspot.in/2011/01/merge-lists-without-using-extra-space.html

http://analgorithmaday.blogspot.in/2011/01/merge-algorithm.html

http://analgorithmaday.blogspot.in/2011/01/merge-sort.html