ATL Under the Hood Part 1

In this series of tutorials, I am going to discuss some of the inner workings of ATL and the techniques that ATL uses. Let's start the discussion with the memory layout of a program. Let's make a simple program with a class that has no data members and take a look at its memory structure.

Program 1.

#include <iostream>
using namespace std;

class Class {
};

int main() {
  Class objClass;
  cout << "Size of object is = " << sizeof(objClass) << endl;
  cout << "Address of object is = " << &objClass << endl;
  return 0;
}

The output of this program is

Size of object is = 1
Address of object is = 0012FF7C
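
The size is 1 rather than 0 because C++ requires every complete object to occupy at least one byte, so that two different objects never share an address. Here is a minimal sketch of that rule; the names Empty and distinct_addresses are made up for illustration and are not part of the original programs.

```cpp
#include <cassert>

// An empty class, like Class in Program 1.
class Empty {
};

// sizeof is never zero for a complete object type.
static_assert(sizeof(Empty) >= 1, "an object occupies at least one byte");

// Returns true when two neighbouring array elements of an empty class
// still live at different addresses.
bool distinct_addresses() {
    Empty objects[2];
    // An array of two elements is exactly two elements wide.
    assert(sizeof(objects) == 2 * sizeof(Empty));
    return &objects[0] != &objects[1];
}
```

Calling distinct_addresses() from main should return true on any conforming compiler.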

Now if we add some data members, the size of the class becomes the sum of the storage of the individual member variables (plus any padding the compiler inserts for alignment). This is also true for templates. Now take a look at the template class CPoint.

Program 2.

#include <iostream>
using namespace std;

template <typename T>
class CPoint {
public:
  T m_x;
  T m_y;
};

int main() {
  CPoint<int> objPoint;
  cout << "Size of object is = " << sizeof(objPoint) << endl;
  cout << "Address of object is = " << &objPoint << endl;
  return 0;
}

Now the output of the program is

Size of object is = 8
Address of object is = 0012FF78
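
The size here is exactly the sum of the two members because both have the same type. When member types are mixed, the compiler may add padding bytes for alignment, so the size can be larger than the sum. The following is only a sketch: Pair and Mixed are hypothetical types, not taken from the programs above.

```cpp
#include <cassert>
#include <cstddef>

// Both members have the same type, so no padding is needed between them.
template <typename T>
struct Pair {
    T m_x;
    T m_y;
};

// Hypothetical struct mixing a char with an int. Most compilers insert
// padding after m_tag so that m_value is suitably aligned.
struct Mixed {
    char m_tag;
    int  m_value;
};

// Number of bytes in Mixed that belong to no member (padding).
std::size_t padding_in_mixed() {
    return sizeof(Mixed) - (sizeof(char) + sizeof(int));
}
```

On the common compilers where int is 4 bytes and 4-byte aligned, sizeof(Mixed) is typically 8, so padding_in_mixed() would report 3.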

Now add inheritance to the program. We are going to derive class CPoint3D from class CPoint and see the memory structure of this program.

Program 3.

#include <iostream>
using namespace std;

template <typename T>
class CPoint {
public:
  T m_x;
  T m_y;
};

template <typename T>
class CPoint3D : public CPoint<T> {
public:
  T m_z;
};

int main() {
  CPoint<int> objPoint;
  cout << "Size of object Point is = " \
           << sizeof(objPoint) << endl;
  cout << "Address of object Point is = " \
           << &objPoint << endl;

  CPoint3D<int> objPoint3D;
  cout << "Size of object Point3D is = " \
           << sizeof(objPoint3D) << endl;
  cout << "Address of object Point3D is = " \
           << &objPoint3D << endl;

  return 0;
}

The output of this program is

Size of object Point is = 8
Address of object Point is = 0012FF78
Size of object Point3D is = 12
Address of object Point3D is = 0012FF6C

This program shows the memory structure of the derived class. The memory occupied by the object is the sum of its own data members plus the data members of its base class.
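
To see where each member actually sits, we can measure its distance from the start of the object. This is a small sketch, not part of the original series; the helper offset_in is a made-up name, and the exact numbers depend on the compiler.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class CPoint {
public:
    T m_x;
    T m_y;
};

template <typename T>
class CPoint3D : public CPoint<T> {
public:
    T m_z;
};

// Byte distance of a member from the start of the object that contains it.
// Only addresses are compared; the member values are never read.
template <typename Object, typename Member>
std::ptrdiff_t offset_in(const Object& obj, const Member& member) {
    const char* start = reinterpret_cast<const char*>(&obj);
    const char* where = reinterpret_cast<const char*>(&member);
    return where - start;
}
```

For a CPoint3D<int> object p, one would expect offset_in(p, p.m_x), offset_in(p, p.m_y) and offset_in(p, p.m_z) to come out in increasing order, with the base class members first.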

Things become interesting when virtual functions join the party. Take a look at the following program.

Program 4.

#include <iostream>
using namespace std;

class Class {
public:
  virtual void fun() { cout << "Class::fun" << endl; }
};

int main() {
  Class objClass;
  cout << "Size of Class = " << sizeof(objClass) << endl;
  cout << "Address of Class = " << &objClass << endl;
  return 0;
}

The output of the program is

Size of Class = 4
Address of Class = 0012FF7C

And the situation becomes more interesting when we add more than one virtual function.

Program 5.

#include <iostream>
using namespace std;

class Class {
public:
  virtual void fun1() { cout << "Class::fun1" << endl; }
  virtual void fun2() { cout << "Class::fun2" << endl; }
  virtual void fun3() { cout << "Class::fun3" << endl; }
};

int main() {
  Class objClass;
  cout << "Size of Class = " << sizeof(objClass) << endl;
  cout << "Address of Class = " << &objClass << endl;
  return 0;
}

The output of this program is the same as that of the previous program. Let's do one more experiment to understand it better.

Program 6.

#include <iostream>
using namespace std;

class CPoint {
public:
  int m_ix;
  int m_iy;
  virtual ~CPoint() { };
};

int main() {
  CPoint objPoint;
  cout << "Size of Class = " << sizeof(objPoint) << endl;
  cout << "Address of Class = " << &objPoint << endl;
  return 0;
}

The output of the program is

Size of Class = 12
Address of Class = 0012FF68
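
The 12 bytes reported here are 4 more than the 8 bytes that two ints need on their own. The sketch below measures that difference directly by comparing the class with and without a virtual function. PlainPoint, VirtualPoint and virtual_overhead are illustrative names; the result is compiler dependent, and on a 64-bit build the difference is usually 8 rather than 4.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Two ints, no virtual functions.
struct PlainPoint {
    int m_ix;
    int m_iy;
};

// Same data plus a virtual destructor, as in Program 6.
struct VirtualPoint {
    int m_ix;
    int m_iy;
    virtual ~VirtualPoint() { }
};

static_assert(std::is_polymorphic<VirtualPoint>::value,
              "a class with a virtual function is polymorphic");

// Extra bytes that appeared when the class became polymorphic.
std::size_t virtual_overhead() {
    return sizeof(VirtualPoint) - sizeof(PlainPoint);
}
```

On the mainstream compilers, virtual_overhead() equals sizeof(void*), which suggests that the extra slot is a pointer rather than an int.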

The output of these programs shows that adding any virtual function to a class increases its size by the size of one pointer, which in 32-bit Visual C++ is 4 bytes, the same as one int. It means there are three int-sized slots in this class: one for x, one for y, and one to handle virtual functions, which is called the virtual pointer. First, let's find out whether the new slot, i.e. the virtual pointer, is at the start or at the end of the object.

To find out, we are going to access the memory occupied by the object directly. To do this, we store the address of the object in an int pointer and use the magic of pointer arithmetic.

Program 7.

#include <iostream>
using namespace std;

class CPoint {
public:
  int m_ix;
  int m_iy;
  CPoint(const int p_ix = 0, const int p_iy = 0) : 
     m_ix(p_ix), m_iy(p_iy) { 
  }
  int getX() const {
     return m_ix;
  }
  int getY() const {
     return m_iy;
  }
  virtual ~CPoint() { };
};

int main() {
  CPoint objPoint(5, 10);

  int* pInt = (int*)&objPoint;
  *(pInt+0) = 100;   // want to change the value of x
  *(pInt+1) = 200;   // want to change the value of y

  cout << "X = " << objPoint.getX() << endl;
  cout << "Y = " << objPoint.getY() << endl;

  return 0;
}

The important thing in this program is

  int* pInt = (int*)&objPoint;
  *(pInt+0) = 100;   // want to change the value of x
  *(pInt+1) = 200;   // want to change the value of y

Here we store the address of the object in an int pointer and then treat the object as an array of ints. The output of this program is

X = 200
Y = 10

Of course, this is not the result we wanted. The value 200, which we meant for m_iy, ended up in m_ix, and m_iy kept its original value of 10. This means that m_ix, the first member variable, starts at the second slot of the object's memory, not the first. In other words, the first slot holds the virtual pointer, and the data members of the object follow it. (The 100 we wrote to the first slot overwrote the virtual pointer itself.) Just change the following two lines

  int* pInt = (int*)&objPoint;
  *(pInt+1) = 100;   // want to change the value of x
  *(pInt+2) = 200;   // want to change the value of y

And we get the required result. Here is the complete program

Program 8.

#include <iostream>
using namespace std;

class CPoint {
public:
  int m_ix;
  int m_iy;
  CPoint(const int p_ix = 0, const int p_iy = 0) : 
     m_ix(p_ix), m_iy(p_iy) { 
  }
  int getX() const {
     return m_ix;
  }
  int getY() const {
     return m_iy;
  }
  virtual ~CPoint() { };
};

int main() {
  CPoint objPoint(5, 10);

  int* pInt = (int*)&objPoint;
  *(pInt+1) = 100;   // want to change the value of x
  *(pInt+2) = 200;   // want to change the value of y

  cout << "X = " << objPoint.getX() << endl;
  cout << "Y = " << objPoint.getY() << endl;

  return 0;
}

And output of the program is

X = 100
Y = 200

This clearly shows that whenever we add a virtual function to a class, the virtual pointer is added at the first location of its memory structure.
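
The same conclusion can be reached without writing into the object. The sketch below only compares addresses, so it also works on a 64-bit build where an int is narrower than a pointer. The function offset_of_x is a made-up helper, and the result depends on the compiler's object layout.

```cpp
#include <cassert>
#include <cstddef>

class CPoint {
public:
    int m_ix;
    int m_iy;
    CPoint(int p_ix = 0, int p_iy = 0) : m_ix(p_ix), m_iy(p_iy) { }
    virtual ~CPoint() { }
};

// Byte offset of m_ix from the start of the object. If the vptr sits at
// the front of the object, this equals the size of one pointer.
std::ptrdiff_t offset_of_x(const CPoint& point) {
    return reinterpret_cast<const char*>(&point.m_ix)
         - reinterpret_cast<const char*>(&point);
}
```

With Visual C++, GCC and Clang, offset_of_x is expected to equal sizeof(void*), i.e. 4 on the 32-bit build used in this article.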

Now the question arises: what is stored in the virtual pointer? Take a look at the following program to get an idea.

Program 9.

#include <iostream>
using namespace std;

class Class {
  virtual void fun() { 
     cout << "Class::fun" << endl; 
  }
};

int main() {
  Class objClass;

  cout << "Address of virtual pointer " \
             << (int*)(&objClass+0) << endl;
   cout << "Value at virtual pointer " << \
             (int*)*(int*)(&objClass+0) << endl;
   return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer 0046C060

The virtual pointer stores the address of a table called the virtual table, and the virtual table stores the addresses of all the virtual functions of that class. In other words, the virtual table is an array of addresses of virtual functions. Let's take a look at the following program to get an idea of it.

Program 10.

#include <iostream>
using namespace std;

class Class {
  virtual void fun() { cout << "Class::fun" << endl; }
};

typedef void (*Fun)(void);

int main() {
  Class objClass;

  cout << "Address of virtual pointer " \
           << (int*)(&objClass+0) << endl;
  cout << "Value at virtual pointer i.e. Address of virtual table " \
           << (int*)*(int*)(&objClass+0) << endl;
  cout << "Value at first entry of virtual table " \
           << (int*)*(int*)*(int*)(&objClass+0) << endl;

  cout << endl << "Executing virtual function" << endl << endl;
  Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
  pFun();
  return 0;
}

This program has some uncommon indirection with typecasts. The most important line of this program is

  Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);

Here Fun is a typedef for a function pointer.

  typedef void (*Fun)(void);

Let's dissect this lengthy, uncommon indirection. (int*)(&objClass+0) gives the address of the virtual pointer, which is the first entry in the object, typecast to int*. To get the value at this address, we use the indirection operator (i.e. *) and then typecast the result to int* again, i.e. (int*)*(int*)(&objClass+0). This gives the address of the first entry of the virtual table. To get the value at this location, i.e. the address of the first virtual function of the class, we use the indirection operator once more and typecast the result to the appropriate function pointer type. So

  Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);

means: get the value from the first entry of the virtual table and store it in pFun after typecasting it to the Fun type.
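
The first step of that chain, reading the virtual pointer out of the object, can also be written with memcpy instead of casts. The sketch below does only that step and uses it to check that two objects of the same class carry the same virtual pointer, while an object of a derived class carries a different one. It assumes the vptr is stored at offset 0, as the experiments above suggest; the C++ standard does not promise this, and read_vptr, Base and Derived are illustrative names.

```cpp
#include <cassert>
#include <cstring>

class Base {
public:
    virtual void fun() { }
    virtual ~Base() { }
};

class Derived : public Base {
public:
    virtual void fun() { }
};

// Copies the first pointer-sized bytes of a polymorphic object.
// ASSUMPTION: the vptr is stored at offset 0; this is compiler specific.
const void* read_vptr(const Base& object) {
    const void* vptr = nullptr;
    std::memcpy(&vptr, &object, sizeof(vptr));
    return vptr;
}
```

If the assumption holds, read_vptr gives the same value for every Base object, because all objects of a class share one virtual table.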

What happens when one more virtual function is added to the class? Now we want to access the second entry of the virtual table. Take a look at the following program to see the values in the virtual table.

Program 11.

#include <iostream>
using namespace std;

class Class {
  virtual void f() { cout << "Class::f" << endl; }
  virtual void g() { cout << "Class::g" << endl; }
};

int main() {
  Class objClass;

  cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
  cout << "Value at virtual pointer i.e. Address of virtual table " 
     << (int*)*(int*)(&objClass+0) << endl;

  cout << endl << "Information about VTable" \
       << endl << endl;
  cout << "Value at 1st entry of VTable " \
       << (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
  cout << "Value at 2nd entry of VTable " \
       << (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
   
  return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C0EC

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E

Now a question naturally comes to mind: how is the end of the vtable marked? The compiler itself knows the number of entries at compile time, but with the compiler used here the entry that follows the last function address is NULL. Let's change the program a little to see this.

Program 12.

#include <iostream>
using namespace std;

class Class {
  virtual void f() { cout << "Class::f" << endl; }
  virtual void g() { cout << "Class::g" << endl; }
};

int main() {
  Class objClass;

  cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
  cout << "Value at virtual pointer i.e. Address of virtual table " 
      << (int*)*(int*)(&objClass+0) << endl;

  cout << endl << "Information about VTable" \
       << endl << endl;
  cout << "Value at 1st entry of VTable " \
       << (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
  cout << "Value at 2nd entry of VTable " \
       << (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
  cout << "Value at 3rd entry of VTable " \
       << (int*)*((int*)*(int*)(&objClass+0)+2) << endl;
  cout << "Value at 4th entry of VTable " \
       << (int*)*((int*)*(int*)(&objClass+0)+3) << endl;

  return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C134

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E
Value at 3rd entry of VTable 00000000
Value at 4th entry of VTable 73616C43

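The output of this program shows that, in this build, the entry after the last function address is NULL. This is a detail of the compiler used here and should not be relied on. Let's call the virtual functions using the knowledge we have.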

Program 13.

#include <iostream>
using namespace std;

class Class {
  virtual void f() { cout << "Class::f" << endl; }
  virtual void g() { cout << "Class::g" << endl; }
};

typedef void(*Fun)(void);

int main() {
  Class objClass;

  Fun pFun = NULL;

  // calling 1st virtual function
  pFun = (Fun)*((int*)*(int*)(&objClass+0)+0);
  pFun();
   
  // calling 2nd virtual function
  pFun = (Fun)*((int*)*(int*)(&objClass+0)+1);
  pFun();

  return 0;
}

The output of this program is

Class::f
Class::g
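
The same mechanism can be modeled by hand, without relying on how any compiler lays out its objects. The following is only a sketch of the idea: an object holds a pointer to a table of function addresses, and a call picks an entry by index. None of these names (Object, Slot, class_vtable, call_slot) come from ATL or from the compiler.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-written model of the vptr/vtable mechanism.
struct Object;
typedef std::string (*Slot)(const Object*);

struct Object {
    const Slot* vptr;   // points at the first entry of a table of slots
};

std::string class_f(const Object*) { return "Class::f"; }
std::string class_g(const Object*) { return "Class::g"; }

// One table per "class", shared by all of its objects.
const Slot class_vtable[] = { &class_f, &class_g };

// Calls the function stored in the given slot of the object's table,
// passing the object itself the way a member function receives this.
std::string call_slot(const Object& object, int index) {
    return object.vptr[index](&object);
}
```

An object created as Object o = { class_vtable }; can then be used with call_slot(o, 0) and call_slot(o, 1), which mirrors the two calls through pFun in Program 13.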

Now let's look at multiple inheritance, starting with a simple case.

Program 14.

#include <iostream>
using namespace std;

class Base1 {
public:
  virtual void f() { }
};

class Base2 {
public:
  virtual void f() { }
};

class Base3 {
public:
  virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

int main() {
  Drive objDrive;
  cout << "Size is = " << sizeof(objDrive) << endl;
  return 0;
}

The output of this program is

Size is = 12

This program shows that when you derive a class from more than one base class, the derived class holds a virtual pointer for each of its base classes.
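
To make the count of virtual pointers visible on any pointer size, the size can be expressed in pointer-sized slots instead of bytes. This is a sketch under the assumption that the classes contain nothing but their virtual pointers; Single, Triple and pointer_slots are made-up names.

```cpp
#include <cassert>
#include <cstddef>

class Base1 { public: virtual void f() { } };
class Base2 { public: virtual void f() { } };
class Base3 { public: virtual void f() { } };

// One base class versus three base classes, no data members anywhere.
class Single : public Base1 { };
class Triple : public Base1, public Base2, public Base3 { };

// Number of pointer-sized slots an object of type T occupies.
template <typename T>
std::size_t pointer_slots() {
    return sizeof(T) / sizeof(void*);
}
```

With the mainstream compilers, pointer_slots<Single>() is 1 and pointer_slots<Triple>() is 3, which matches the 12 bytes seen above on a 32-bit build.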

And what happens when the derived class also has virtual functions? Let's see this program to better understand the concept of virtual functions with multiple inheritance.

Program 15.

#include <iostream>
using namespace std;

class Base1 {
  virtual void f() { cout << "Base1::f" << endl; }
  virtual void g() { cout << "Base1::g" << endl; }
};

class Base2 {
  virtual void f() { cout << "Base2::f" << endl; }
  virtual void g() { cout << "Base2::g" << endl; }
};

class Base3 {
  virtual void f() { cout << "Base3::f" << endl; }
  virtual void g() { cout << "Base3::g" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
public:
  virtual void fd() { cout << "Drive::fd" << endl; }
  virtual void gd() { cout << "Drive::gd" << endl; }
};

typedef void(*Fun)(void);

int main() {
  Drive objDrive;

  Fun pFun = NULL;

  // calling 1st virtual function of Base1
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+0);
  pFun();
   
  // calling 2nd virtual function of Base1
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+1);
  pFun();

  // calling 1st virtual function of Base2
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+0);
  pFun();

  // calling 2nd virtual function of Base2
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+1);
  pFun();

  // calling 1st virtual function of Base3
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+0);
  pFun();

  // calling 2nd virtual function of Base3
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+1);
  pFun();

  // calling 1st virtual function of Drive
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+2);
  pFun();

  // calling 2nd virtual function of Drive
  pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+3);
  pFun();

  return 0;
}

The output of this program is

Base1::f
Base1::g
Base2::f
Base2::g
Base3::f
Base3::g
Drive::fd
Drive::gd

This program shows that the virtual functions of the derived class are stored in the vtable of the first vptr.

We can get the offset of each base class vptr within the Drive class with the help of static_cast. Let's take a look at the following program to better understand it.

Program 16.

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
  virtual void f() { }
};

class Base2 {
public:
  virtual void f() { }
};

class Base3 {
public:
  virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

// any non-zero value: static_cast leaves a null pointer unchanged,
// so an address of zero would always give an offset of zero
#define SOME_VALUE   1

int main() {
  cout << (DWORD)static_cast<Base1*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
  cout << (DWORD)static_cast<Base2*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
  cout << (DWORD)static_cast<Base3*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
  return 0;
}

ATL uses a macro named offsetofclass, defined in ATLDEF.H, to do this. The macro is defined as

#define offsetofclass(base, derived) \
 ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

This macro returns the offset of the base class vptr in the derived class object model. Let's see an example to get an idea of this.

Program 17.

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
  virtual void f() { }
};

class Base2 {
public:
  virtual void f() { }
};

class Base3 {
public:
  virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
  ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
  cout << offsetofclass(Base1, Drive) << endl;
  cout << offsetofclass(Base2, Drive) << endl;
  cout << offsetofclass(Base3, Drive) << endl;
  return 0;
}

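The memory layout of the derived class is as follows (the original figure is not reproduced here): the vptr of Base1 comes first, followed by the vptr of Base2 and then the vptr of Base3.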

And output of this program is

0
4
8

The output of this program shows that this macro returns the offset of the vptr of the required base class. In Essential COM, Don Box uses a similar macro. Let's change the program a little and replace the ATL macro with Box's macro.

Program 18.

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
  virtual void f() { }
};

class Base2 {
public:
  virtual void f() { }
};

class Base3 {
public:
  virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define BASE_OFFSET(ClassName, BaseName) \
  (DWORD(static_cast<BaseName*>(reinterpret_cast<ClassName*>\
  (0x10000000))) - 0x10000000)

int main() {
  cout << BASE_OFFSET(Drive, Base1) << endl;
  cout << BASE_OFFSET(Drive, Base2) << endl;
  cout << BASE_OFFSET(Drive, Base3) << endl;
  return 0;
}

The output and purpose of this program are the same as those of the previous program.
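
Both macros cast a pointer to DWORD, which is only wide enough for a pointer on a 32-bit build. As a hedged alternative, the same offset can be measured on a real object and returned as std::ptrdiff_t. The template base_offset below is a made-up helper, not ATL's or Box's macro, and it assumes that the derived class can be default constructed.

```cpp
#include <cassert>
#include <cstddef>

class Base1 { public: virtual void f() { } };
class Base2 { public: virtual void f() { } };
class Base3 { public: virtual void f() { } };
class Drive : public Base1, public Base2, public Base3 { };

// Offset of the Base subobject inside a Derived object, measured on a
// real object rather than on a made-up address.
template <typename Base, typename Derived>
std::ptrdiff_t base_offset() {
    Derived object;
    Base* base = static_cast<Base*>(&object);
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&object);
}
```

On the 32-bit compiler used in this article, base_offset<Base1, Drive>(), base_offset<Base2, Drive>() and base_offset<Base3, Drive>() would be expected to give the same 0, 4 and 8 as the macros; on a 64-bit build the step is the size of a pointer instead.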

Let's do something practical and use this macro in our program. In fact, we can call the virtual function of the required base class by getting the offset of the base class vptr in the derived class's memory structure.

Program 19.

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
  virtual void f() { cout << "Base1::f()" << endl; }
};

class Base2 {
public:
  virtual void f() { cout << "Base2::f()" << endl; }
};

class Base3 {
public:
  virtual void f() { cout << "Base3::f()" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
  ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
  Drive d;

  void* pVoid = NULL;

  // call function of Base1
  pVoid = (char*)&d + offsetofclass(Base1, Drive);
  ((Base1*)(pVoid))->f();

  // call function of Base2
  pVoid = (char*)&d + offsetofclass(Base2, Drive);
  ((Base2*)(pVoid))->f();

  // call function of Base3
  pVoid = (char*)&d + offsetofclass(Base3, Drive);
  ((Base3*)(pVoid))->f();

  return 0;
}

The output of the program is

Base1::f()
Base2::f()
Base3::f()
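
The pointer adjustment done by hand in this program is what the compiler does for us whenever a pointer to the derived class is converted to a pointer to one of its bases. The sketch below shows the same three calls with static_cast. The functions return strings instead of printing so that the result is easy to check; call_f_as and pointer_was_adjusted are illustrative names.

```cpp
#include <cassert>
#include <string>

class Base1 { public: virtual std::string f() { return "Base1::f()"; } };
class Base2 { public: virtual std::string f() { return "Base2::f()"; } };
class Base3 { public: virtual std::string f() { return "Base3::f()"; } };
class Drive : public Base1, public Base2, public Base3 { };

// Lets the compiler adjust the pointer, then calls f() through it.
template <typename Base>
std::string call_f_as(Drive& d) {
    Base* base = static_cast<Base*>(&d);
    return base->f();
}

// True when the converted pointer no longer equals the object's address,
// i.e. when the compiler added a non-zero offset.
template <typename Base>
bool pointer_was_adjusted(Drive& d) {
    void* whole = &d;
    void* part = static_cast<Base*>(&d);
    return whole != part;
}
```

With the layouts seen above, the conversion to Base1 leaves the address unchanged, while the conversions to Base2 and Base3 move it forward by the offsets that offsetofclass reports.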

I have tried to explain the working of ATL's offsetofclass macro in this tutorial. I hope to explore other mysteries of ATL in the next article.



About the Author

Zeeshan Amjad

C++ Developer at Bechtel Corporation. zamjad.wordpress.com
