Thursday, 27 January 2011

Pointers & References

“The basic confusion between C and C++ is in how arguments are passed. C only allows pass-by-value, and even this is misunderstood by many. To get the effect of pass-by-reference in C you pass a pointer, and a double pointer when the thing to be changed is itself a pointer. C++ references are a tidier way of doing the same job.”

Misunderstandings

    Misconceptions about pointers and references are widespread in the C/C++ world. References are one of the first things you learn in a C++ course, and also one of the first things to be misunderstood. In both C and C++, a pointer is passed by value like any other argument!! C has no pass-by-reference at all: the function always receives a copy of whatever is passed, and each parameter is a local variable of that function. C++ adds reference parameters, which typically work by passing an address on your behalf, as the last section explains.
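
To make "you always get a copy" concrete, here is a minimal sketch. The helper names (setToZero, setToZeroVia, passByValueDemo) are made up for illustration and are not from the original post.

```cpp
#include <cassert>

// Hypothetical helper: x is a local copy of the caller's argument.
void setToZero(int x) { x = 0; }

// Hypothetical helper: p is a local copy of an address. Writing through it
// reaches the caller's variable, but p itself is still only a copy.
void setToZeroVia(int* p) { *p = 0; }

int passByValueDemo() {
    int n = 42;
    setToZero(n);       // modifies only the copy inside setToZero
    assert(n == 42);    // the caller's n is untouched
    setToZeroVia(&n);   // the address is copied, the write goes to n
    return n;
}
```

Calling passByValueDemo() from main should return 0: the first call cannot touch n, the second one can because it was handed n's address.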

Concepts

    Pointers, as you know, are a very powerful and at the same time very risky low-level feature exposed in a high-level language. To understand pointers, it helps to look at some assembly code:

       str [ax], 0x56666

The illustrative assembly above reads the value stored at the address held in the ax register and stores it at the absolute memory location 0x56666. Here the ax register acts as the pointer. This is called indirect addressing, and it is the assembly-level feature that C exposes in the form of a pointer.
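
In C or C++ terms, indirect addressing is what the * operator does. A small sketch, assuming nothing beyond standard C++; storeThrough and indirectDemo are illustrative names, not part of the post:

```cpp
#include <cassert>

// Store `value` at whatever address p holds. This is the C++ counterpart
// of writing through an address register.
void storeThrough(int* p, int value) {
    assert(p != nullptr);   // an invalid address here is the classic crash
    *p = value;             // indirect write
}

int indirectDemo() {
    int cell = 0;           // some memory location
    int* p = &cell;         // p plays the role of the address register
    storeThrough(p, 0x56);
    return *p;              // indirect read through the same address
}
```

The variable p only holds an address; the read and the write both happen at whatever location that address names.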

   The most common crashes you will have seen in any application are “access violation”, “invalid memory access” and the like. Most of these come from indirect addressing through a bad address at the assembly level. In C, that means a bad pointer. :)

   A pointer just stores an address, and we cannot easily track when that address will be dereferenced. A global pointer lives for the whole run of the program, so a stale or wrong value in it can do damage anywhere. From the assembly it is clear that ax is a normal read/write (RW) register; similarly, a pointer variable can be overwritten at any time. It can even hold just 0x00 (which is named NULL). This is why pointers are the source of so many problems. Passing a pointer to a function is a little bit tricky!! See this code:

    void ptrTest(int* a, int* b)
    {
        *(&a) = b;   // same as a = b; only the local copy a changes
    }
    int main()
    {
        int A[] = {5, 13, 2, 25, 7, 17, 20, 8, 4};
        int B[] = {2, 3, 4, 5, 6};
        ptrTest(A, B);   // A and B convert to pointers to their first elements
        return 0;
    }
  • You have two arrays, A (say at address 0x1234ffff) and B (at 0x5678ffff).
  • In C and C++, pointers are passed by value, and as we learnt, arguments are always local to a function. So ptrTest has two local integer pointer variables, a and b. Please write the * next to int (int* a). I feel that is good practice and makes it clear that a pointer is just another data type, one that stores an address. (As a pointer only stores an address, it needs just 4 bytes, i.e. 32 bits, on a 32-bit system.)
  • The pointee type tells the compiler how many bytes a dereference touches and how far pointer arithmetic steps. :) It does not change the size of the pointer itself: even for a structure of size 1 MB, a pointer to it needs just 4 bytes on a 32-bit system, which is enough to reach any byte of a 4 GB address space. :)
  • In the above example, a and b get copies of the addresses 0x1234ffff and 0x5678ffff. The content of a is then overwritten with b, so inside ptrTest both a and b hold 0x5678ffff just before it returns.
  • But A and B still have the same addresses and contents, since we have not done a pass-by-reference here. :)
  • IMPORTANT NOTE: An array’s address can never be changed!! An array is not a pointer variable; it only converts to a pointer to its first element when passed like this, and it cannot be reassigned.
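
The points above can be checked with a small sketch. The Big structure and the two demo functions are hypothetical names added for illustration. The pointer-size comparison holds on mainstream platforms but is an assumption, not something the language standard promises.

```cpp
#include <cassert>

struct Big { char bytes[1024 * 1024]; };   // hypothetical 1 MB structure

// Same idea as ptrTest in the post: only the local copy a is changed.
void ptrTest(int* a, int* b) { a = b; }

int pointerCopyDemo() {
    int A[] = {5, 13, 2, 25, 7, 17, 20, 8, 4};
    int B[] = {2, 3, 4, 5, 6};
    int* before = A;        // address of A[0] before the call
    ptrTest(A, B);
    assert(before == A);    // A still names the same storage
    assert(B[0] == 2);      // B is untouched as well
    return A[0];
}

// Assumption: object pointers share one size regardless of the pointee type.
bool pointerSizeDemo() {
    return sizeof(Big*) == sizeof(char*) && sizeof(Big) > sizeof(Big*);
}
```

Note that no Big object is ever created; sizeof only asks the compiler about the types.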

 

    void ptrTest(int** a, int** b)
    {
        *a = *b;   // overwrite the caller's pointer with the other pointer's value
    }
    int main()
    {
        int A[] = {5, 13, 2, 25, 7, 17, 20, 8, 4};
        int B[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        int *ptrA = A;
        int *ptrB = B;
        ptrTest(&ptrA, &ptrB);   // afterwards ptrA points to B's first element
        return 0;
    }
  • In the above code, ptrA and ptrB are ordinary pointer variables holding copies of the arrays’ addresses. Unlike the arrays themselves, these pointers can be changed.
  • Passing &ptrA and &ptrB is how pass-by-reference is done here. The function gets a two-level address: a holds the address of ptrA, and ptrA in turn holds the address of A.
  • Since two addresses are chained instead of one, this is called a double pointer. If you pass a plain pointer (as in the first example), the function can read and write the data it points to, but it cannot change where the caller’s pointer points. This is the important point I am trying to explain!!
  • In ptrTest, writing to *a changes ptrA itself, which now points to B; ptrB is only read. Note that writes like this are a major cause of stack corruption issues when the pointer is wrong. :) We are writing into another function’s stack here. :) (ptrTest writes into the main function’s stack.)
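
A minimal sketch that checks these claims with assertions. The function doublePointerDemo is a made-up name, and the arrays are shortened for brevity.

```cpp
#include <cassert>

// Same as the double-pointer ptrTest in the post.
void ptrTest(int** a, int** b) { *a = *b; }

int doublePointerDemo() {
    int A[] = {5, 13, 2};
    int B[] = {1, 2, 3};
    int* ptrA = A;
    int* ptrB = B;
    ptrTest(&ptrA, &ptrB);
    assert(ptrA == B);   // the caller's ptrA was redirected to B
    assert(ptrB == B);   // ptrB was only read
    assert(A[0] == 5);   // the array contents are untouched
    return ptrA[0];      // now reads B[0]
}
```

Only the pointer variable in main changes; neither array moves and no element is copied.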

Reference – A simplified version of pass by reference.

   The double-address approach above is simplified by C++ references. Their important properties are covered in any basic C++ course.

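  • A reference is an alias for an existing object. Compilers typically implement it as a pointer whose stored address is read-only, unlike an ordinary pointer.
  • There are no null references. A reference must be bound to an existing variable when it is created.
  • A reference cannot be reseated to refer to a different variable.
  • Using a reference does not require the dereference operator “*”.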
    void refTest(int& a, int& b)
    {
        a = b;   // copies the value of b into the caller's variable
    }
    int main()
    {
        int a = 20, b = 40;
        int c = 30;   // not used in this example
        int& refA = a;
        int& refB = b;
        refTest(refA, refB);   // a is now 40; refA still refers to a
        return 0;
    }
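
To tie this back to the double-pointer example, here is a hedged sketch of the same pointer swap written with a reference to a pointer. The names redirect and referenceDemo are hypothetical and do not appear in the post.

```cpp
#include <cassert>

// Reference-to-pointer version of the double-pointer ptrTest.
void redirect(int*& a, int* b) { a = b; }

int referenceDemo() {
    int a = 20, b = 40;
    int& refA = a;
    refA = b;               // assigns b's value to a; it does not reseat refA
    assert(a == 40);
    assert(&refA == &a);    // refA still aliases a

    int A[] = {5, 13, 2};
    int B[] = {1, 2, 3};
    int* ptrA = A;
    redirect(ptrA, B);      // no & at the call site, no * in the callee
    assert(ptrA == B);
    return ptrA[0];         // reads B[0]
}
```

The effect on ptrA is the same as with int**, but the address-taking and dereferencing are done by the compiler instead of being written out by hand.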