ATL Under the Hood - Part 1

27 Jan 2002
In this series of tutorials I am going to discuss some of the inner workings of ATL and the techniques that ATL uses.

Introduction

Let's start the discussion by looking at how objects are laid out in memory. We begin with a simple class that has no data members and examine its size and address.

Program 1.

#include <iostream>

using namespace std;

class Class {
};

int main() {
	Class objClass;
	cout << "Size of object is = " << sizeof(objClass) << endl;
	cout << "Address of object is = " << &objClass << endl;
	return 0;
}

The output of this program is

Size of object is = 1
Address of object is = 0012FF7C

If we add data members, the size of the object becomes the sum of the storage of its individual member variables (plus any padding the compiler inserts for alignment). The same is true for class templates. Let's take a look at a template class for a point.

Program 2.

#include <iostream>

using namespace std;

template <typename T>
class CPoint {
public:
	T m_x;
	T m_y;
};

int main() {
	CPoint<int> objPoint;
	cout << "Size of object is = " << sizeof(objPoint) << endl;
	cout << "Address of object is = " << &objPoint << endl;
	return 0;
}

Now the output of the program is

Size of object is = 8
Address of object is = 0012FF78

Now let's add inheritance. We derive the class CPoint3D from CPoint and look at the memory structure of the result.

Program 3.

#include <iostream>

using namespace std;

template <typename T>
class CPoint {
public:
	T m_x;
	T m_y;
};

template <typename T>
class CPoint3D : public CPoint<T> {
public:
	T m_z;
};

int main() {
	CPoint<int> objPoint;
	cout << "Size of object Point is = " << sizeof(objPoint) << endl;
	cout << "Address of object Point is = " << &objPoint << endl;

	CPoint3D<int> objPoint3D;
	cout << "Size of object Point3D is = " << sizeof(objPoint3D) << endl;
	cout << "Address of object Point3D is = " << &objPoint3D << endl;

	return 0;
}

The output of this program is

Size of object Point is = 8
Address of object Point is = 0012FF78
Size of object Point3D is = 12
Address of object Point3D is = 0012FF6C

This program shows the memory structure of a derived class: the memory occupied by the object is the sum of its own data members plus the data members of its base class.

Things become interesting when a virtual function joins the party. Take a look at the following program.

Program 4.

#include <iostream>

using namespace std;

class Class {
public:
	virtual void fun() { cout << "Class::fun" << endl; }
};

int main() {
	Class objClass;
	cout << "Size of Class = " << sizeof(objClass) << endl;
	cout << "Address of Class = " << &objClass << endl;
	return 0;
}

The output of the program is

Size of Class = 4
Address of Class = 0012FF7C

The situation becomes more interesting when we add more than one virtual function.

Program 5.

#include <iostream>

using namespace std;

class Class {
public:
	virtual void fun1() { cout << "Class::fun1" << endl; }
	virtual void fun2() { cout << "Class::fun2" << endl; }
	virtual void fun3() { cout << "Class::fun3" << endl; }
};

int main() {
	Class objClass;
	cout << "Size of Class = " << sizeof(objClass) << endl;
	cout << "Address of Class = " << &objClass << endl;
	return 0;
}

The output of this program is the same as that of the previous program. Let's do one more experiment to understand this better.

Program 6.

#include <iostream>

using namespace std;

class CPoint {
public:
	int m_ix;
	int m_iy;
	virtual ~CPoint() { };
};

int main() {
	CPoint objPoint;
	cout << "Size of Class = " << sizeof(objPoint) << endl;
	cout << "Address of Class = " << &objPoint << endl;
	return 0;
}

The output of the program is

Size of Class = 12
Address of Class = 0012FF68

The output of these programs shows that adding a virtual function to a class increases the size of its objects by the size of one pointer, which is 4 bytes in 32-bit Visual C++, and that adding further virtual functions does not increase it again. So this class has three slots: one for x, one for y, and one that handles the virtual functions, called the virtual pointer (vptr). Let's first find out where the new slot sits: at the start or at the end of the object. To do this we access the memory occupied by the object directly, by storing the address of the object in an int pointer and using the magic of pointer arithmetic. Note that this trick relies on int and pointers having the same size, as they do in 32-bit Visual C++.

Program 7.

#include <iostream>

using namespace std;

class CPoint {
public:
	int m_ix;
	int m_iy;
	CPoint(const int p_ix = 0, const int p_iy = 0) : 
		m_ix(p_ix), m_iy(p_iy) { 
	}
	int getX() const {
		return m_ix;
	}
	int getY() const {
		return m_iy;
	}
	virtual ~CPoint() { };
};

int main() {
	CPoint objPoint(5, 10);

	int* pInt = (int*)&objPoint;
	*(pInt+0) = 100;	// want to change the value of x

	*(pInt+1) = 200;	// want to change the value of y


	cout << "X = " << objPoint.getX() << endl;
	cout << "Y = " << objPoint.getY() << endl;

	return 0;
}

The important thing in this program is

	int* pInt = (int*)&objPoint;
	*(pInt+0) = 100;	// want to change the value of x

	*(pInt+1) = 200;	// want to change the value of y

Here we store the address of the object in an int pointer and then treat the object as an array of integers. The output of this program is

X = 200
Y = 10

Of course this is not the result we wanted. The value 200 has been stored in the location where the m_ix data member resides, and m_iy is unchanged. This means that m_ix, the first member variable, starts at the second position in the object's memory, not the first. In other words, the first slot is the virtual pointer (which our first write has overwritten with 100) and the data members of the object follow it. Just change the following lines:

int* pInt = (int*)&objPoint;
*(pInt+1) = 100;	// want to change the value of x

*(pInt+2) = 200;	// want to change the value of y

And we get the required result. Here is the complete program:

Program 8.

#include <iostream>

using namespace std;

class CPoint {
public:
	int m_ix;
	int m_iy;
	CPoint(const int p_ix = 0, const int p_iy = 0) : 
		m_ix(p_ix), m_iy(p_iy) { 
	}
	int getX() const {
		return m_ix;
	}
	int getY() const {
		return m_iy;
	}
	virtual ~CPoint() { };
};

int main() {
	CPoint objPoint(5, 10);

	int* pInt = (int*)&objPoint;
	*(pInt+1) = 100;	// want to change the value of x

	*(pInt+2) = 200;	// want to change the value of y


	cout << "X = " << objPoint.getX() << endl;
	cout << "Y = " << objPoint.getY() << endl;

	return 0;
}

The output of the program is

X = 100
Y = 200

This clearly shows that when a class has a virtual function, Visual C++ places the virtual pointer in the first slot of the object's memory.


Now the question arises: what is stored in the virtual pointer? Take a look at the following program to get an idea of this.

Program 9.

#include <iostream>

using namespace std;

class Class {
	virtual void fun() { cout << "Class::fun" << endl; }
};

int main() {
	Class objClass;

	cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
	cout << "Value at virtual pointer " << (int*)*(int*)(&objClass+0) << endl;
	return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer 0046C060

The virtual pointer stores the address of a table called the virtual table (vtable), and the virtual table stores the addresses of all the virtual functions of that class. In other words, the virtual table is an array of addresses of virtual functions. Let's take a look at the following program to get an idea of it.

Program 10.

#include <iostream>

using namespace std;

class Class {
	virtual void fun() { cout << "Class::fun" << endl; }
};

typedef void (*Fun)(void);

int main() {
	Class objClass;

	cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
	cout << "Value at virtual pointer i.e. Address of virtual table " 
		 << (int*)*(int*)(&objClass+0) << endl;
	cout << "Value at first entry of virtual table " 
		 << (int*)*(int*)*(int*)(&objClass+0) << endl;

	cout << endl << "Executing virtual function" << endl << endl;
	Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
	pFun();
	return 0;
}

This program uses some unusual indirection with typecasts. The most important line of this program is

	Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);

Here Fun is a typedef for a function pointer type.

	typedef void (*Fun)(void);

Let's dissect the lengthy indirection. (int*)(&objClass+0) gives the address of the object, which is also the address of its virtual pointer because the virtual pointer is the first entry in the object, and we typecast it to int*. To get the value stored at this address we apply the indirection operator (i.e. *) and typecast the result to int* again, i.e. (int*)*(int*)(&objClass+0). This gives the address of the first entry of the virtual table. To get the value at that location, i.e. the address of the first virtual function of the class, we apply the indirection operator once more and typecast the result to the appropriate function pointer type. So

	Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);

means: get the value from the first entry of the virtual table and store it in pFun after typecasting it to the Fun type. The call through pFun works here only because fun() never uses its this pointer; the function is invoked without a valid object.


What happens when one more virtual function is added to the class? Now we want to access the second entry of the virtual table. Take a look at the following program to see the values in the virtual table.

Program 11.

#include <iostream>

using namespace std;

class Class {
	virtual void f() { cout << "Class::f" << endl; }
	virtual void g() { cout << "Class::g" << endl; }
};

int main() {
	Class objClass;

	cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
	cout << "Value at virtual pointer i.e. Address of virtual table " 
		<< (int*)*(int*)(&objClass+0) << endl;

	cout << endl << "Information about VTable" << endl << endl;
	cout << "Value at 1st entry of VTable " 
		<< (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
	cout << "Value at 2nd entry of VTable " 
		<< (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
	
	return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C0EC

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E

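Now one question naturally comes to mind: where does the vtable end? The compiler itself knows the number of entries at compile time, since it knows every virtual function of the class, but in the output of this compiler the table is also followed by a NULL entry. Change the program a little bit to see this.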

Program 12.

#include <iostream>

using namespace std;

class Class {
	virtual void f() { cout << "Class::f" << endl; }
	virtual void g() { cout << "Class::g" << endl; }
};

int main() {
	Class objClass;

	cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
	cout << "Value at virtual pointer i.e. Address of virtual table " 
		 << (int*)*(int*)(&objClass+0) << endl;

	cout << endl << "Information about VTable" << endl << endl;
	cout << "Value at 1st entry of VTable " 
		 << (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
	cout << "Value at 2nd entry of VTable " 
		 << (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
	cout << "Value at 3rd entry of VTable " 
		 << (int*)*((int*)*(int*)(&objClass+0)+2) << endl;
	cout << "Value at 4th entry of VTable " 
		 << (int*)*((int*)*(int*)(&objClass+0)+3) << endl;

	return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C134

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E
Value at 3rd entry of VTable 00000000
Value at 4th entry of VTable 73616C43

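The output of this program shows that, with this compiler, the entry after the last function address is NULL. The 4th value is not part of the vtable at all: 73616C43 is simply the ASCII text "Clas" that happens to follow the table in memory. Let's call the virtual functions using the knowledge we have.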

Program 13.

#include <iostream>

using namespace std;

class Class {
	virtual void f() { cout << "Class::f" << endl; }
	virtual void g() { cout << "Class::g" << endl; }
};

typedef void(*Fun)(void);

int main() {
	Class objClass;

	Fun pFun = NULL;

	// calling 1st virtual function

	pFun = (Fun)*((int*)*(int*)(&objClass+0)+0);
	pFun();
	
	// calling 2nd virtual function

	pFun = (Fun)*((int*)*(int*)(&objClass+0)+1);
	pFun();

	return 0;
}

The output of this program is

Class::f
Class::g

Now let's look at multiple inheritance, starting with a simple case.

Program 14.

#include <iostream>

using namespace std;

class Base1 {
public:
	virtual void f() { }
};

class Base2 {
public:
	virtual void f() { }
};

class Base3 {
public:
	virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

int main() {
	Drive objDrive;
	cout << "Size is = " << sizeof(objDrive) << endl;
	return 0;
}

The output of this program is

Size is = 12

This program shows that when you derive a class from more than one base class, the derived class contains a virtual pointer for each of its base classes.


And what happens when the derived class also has virtual functions of its own? Let's look at this program to better understand virtual functions with multiple inheritance.

Program 15.

#include <iostream>

using namespace std;

class Base1 {
	virtual void f() { cout << "Base1::f" << endl; }
	virtual void g() { cout << "Base1::g" << endl; }
};

class Base2 {
	virtual void f() { cout << "Base2::f" << endl; }
	virtual void g() { cout << "Base2::g" << endl; }
};

class Base3 {
	virtual void f() { cout << "Base3::f" << endl; }
	virtual void g() { cout << "Base3::g" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
public:
	virtual void fd() { cout << "Drive::fd" << endl; }
	virtual void gd() { cout << "Drive::gd" << endl; }
};

typedef void(*Fun)(void);

int main() {
	Drive objDrive;

	Fun pFun = NULL;

	// calling 1st virtual function of Base1

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+0);
	pFun();
	
	// calling 2nd virtual function of Base1

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+1);
	pFun();

	// calling 1st virtual function of Base2

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+0);
	pFun();

	// calling 2nd virtual function of Base2

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+1);
	pFun();

	// calling 1st virtual function of Base3

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+0);
	pFun();

	// calling 2nd virtual function of Base3

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+1);
	pFun();

	// calling 1st virtual function of Drive

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+2);
	pFun();

	// calling 2nd virtual function of Drive

	pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+3);
	pFun();

	return 0;
}

The output of this program is

Base1::f
Base1::g
Base2::f
Base2::g
Base3::f
Base3::g
Drive::fd
Drive::gd

This program shows that the virtual functions of the derived class are stored in the vtable of the first vptr.


We can get the offset of each base class vptr within the Drive class with the help of static_cast. Let's take a look at the following program to better understand it.

Program 16.

#include <windows.h>

#include <iostream>

using namespace std;

class Base1 {
public:
	virtual void f() { }
};

class Base2 {
public:
	virtual void f() { }
};

class Base3 {
public:
	virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

// any non-zero value: static_cast leaves a null pointer unchanged, so 0 would hide the offset

#define SOME_VALUE	1

int main() {
	cout << (DWORD)static_cast<Base1*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
	cout << (DWORD)static_cast<Base2*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
	cout << (DWORD)static_cast<Base3*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
	return 0;
}

ATL uses a macro named offsetofclass, defined in ATLDEF.H, to do this. The macro is defined as

#define offsetofclass(base, derived) \
       ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

This macro returns the offset of the base class vptr in the object model of the derived class. Let's see an example to get an idea of this.

Program 17.

#include <windows.h>

#include <iostream>

using namespace std;

class Base1 {
public:
	virtual void f() { }
};

class Base2 {
public:
	virtual void f() { }
};

class Base3 {
public:
	virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
	((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
	cout << offsetofclass(Base1, Drive) << endl;
	cout << offsetofclass(Base2, Drive) << endl;
	cout << offsetofclass(Base3, Drive) << endl;
	return 0;
}

In the memory layout of the Drive class, the vptr of each base class follows the previous one: Base1 first, then Base2, then Base3.

The output of this program is

0
4
8

The output of this program shows that the macro returns the offset of the vptr of the requested base class. Don Box uses a similar macro in Essential COM. Let's change the program a little and replace the ATL macro with Box's macro.

Program 18.

#include <windows.h>

#include <iostream>

using namespace std;

class Base1 {
public:
	virtual void f() { }
};

class Base2 {
public:
	virtual void f() { }
};

class Base3 {
public:
	virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define BASE_OFFSET(ClassName, BaseName) \
	(DWORD(static_cast<BaseName*>(reinterpret_cast<ClassName*>\
	(0x10000000))) - 0x10000000)

int main() {
	cout << BASE_OFFSET(Drive, Base1) << endl;
	cout << BASE_OFFSET(Drive, Base2) << endl;
	cout << BASE_OFFSET(Drive, Base3) << endl;
	return 0;
}

The output and purpose of this program is the same as the previous program.

Let's do something practical and use this macro in our program. We can call the virtual function of whichever base class we want by getting the offset of that base class vptr in the memory structure of the derived class.

Program 19.

#include <windows.h>

#include <iostream>

using namespace std;

class Base1 {
public:
	virtual void f() { cout << "Base1::f()" << endl; }
};

class Base2 {
public:
	virtual void f() { cout << "Base2::f()" << endl; }
};

class Base3 {
public:
	virtual void f() { cout << "Base3::f()" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
	((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
	Drive d;

	void* pVoid = NULL;

	// call function of Base1

	pVoid = (char*)&d + offsetofclass(Base1, Drive);
	((Base1*)(pVoid))->f();

	// call function of Base2

	pVoid = (char*)&d + offsetofclass(Base2, Drive);
	((Base2*)(pVoid))->f();

	// call function of Base3

	pVoid = (char*)&d + offsetofclass(Base3, Drive);
	((Base3*)(pVoid))->f();

	return 0;
}

The output of the program is

Base1::f()
Base2::f()
Base3::f()

In this tutorial I have tried to explain how the offsetofclass macro of ATL works. I hope to explore other mysteries of ATL in the next article.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.