Purpose
This article is intended for programmers who have never been exposed to COM technologies. It does not cover COM fundamentals; instead, it covers the basic concept of an interface. Using C++, it explains how a class gradually becomes an interface and the reasons for that transformation. The article is only a brief introduction to what actually goes on behind the scenes of an interface.
Contents
Introduction
Components history
Using object oriented approach (OO)
Limitations of C++
The basic idea of an interface using C++
A basic COM interface using C++
Interface inheritance
Summary
Introduction
It is very common for programmers to break code functionality into small, simpler pieces. Each piece is called a component. The idea is to store these components in one or more libraries and to access them through an API (Application Programming Interface). Such libraries are simple to re-use and easy to access.
Components history
Many programmers have encountered problems while developing components. The problems related both to the component itself and to the client that uses it. Here are some of the major ones:
- Scope - The scope of a component was a problem for both the developer and the user of the component. Any change made to the component by its author could affect the application that uses it and force re-compilation of the entire client code.
- Versions - How can one force a programmer to check an interface version? Maintaining and publishing versions can be a problem.
- Communications - Communication between components is also a problem, especially if several people were involved in writing them.
- Language - If a component was written in C++, how can one access it from Visual Basic or C, for example?
There were several solutions to the problems of developing components, and one of the most powerful was the object oriented approach.
Using object oriented approach (OO)
One of the most common techniques for writing a component is the object oriented approach. It allows a programmer to look at an application in a more abstract way: as a collection of objects, where each object has its own unique qualities and can communicate with other objects. This makes it easier for a programmer to understand the complexity of the application and to find a proper solution that simplifies the problem. The design of an object is often easier to understand than a procedural algorithm. One of the languages that supports the object oriented concept is C++. The examples in this article are written in C++ (a basic knowledge of C++ is required).
Limitations of C++
C++ is one of the programming languages that implement the object oriented approach. From the component access point of view, however, C++ is limited:
- Size does matter I - Suppose we have 2 classes, class A and class B, and class B inherits from class A. Now suppose we want to add a new data member to class A (the parent). The size of class B (or the size of its virtual table) changes automatically, and we need to compile the code for class B again.
- Distinction - There is no clear way to tell where exactly the implementation of the classes lives (class A and class B from the previous item). A method can be implemented in the parent (class A) or in the child (class B), and C++ doesn't provide an easy way to distinguish between the two.
- Size does matter II - Sometimes a client that uses the component (classes A and B from the previous items) needs to know the component size up front. Not knowing the size may cause synchronization problems between the client and the component.
- Size does matter III - Suppose we add a virtual function to the parent class (class A from the previous items). The class now holds a pointer to a virtual table, and each inherited class holds such a pointer as well. Each inherited class therefore grows by the size of one pointer (4 bytes on a 32-bit platform), and if a client that uses the class relies on the class size, the communication may break.
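The size effects listed above can be observed with sizeof. What follows is a minimal sketch rather than code from a real component: ParentV1, ParentV2, ParentV3 and the matching Child classes are hypothetical names standing for successive versions of class A and class B. Exact sizes depend on the compiler and platform, so the sketch only compares sizes and never assumes a specific byte count.

```cpp
#include <cassert>
#include <cstddef>

// Version 1: the original parent (class A) and child (class B).
class ParentV1 {
public:
    int m_iValue;
};
class ChildV1 : public ParentV1 {
public:
    int m_iExtra;
};

// Version 2: only the parent changes; it gains one data member.
class ParentV2 {
public:
    int m_iValue;
    int m_iAdded;
};
class ChildV2 : public ParentV2 {
public:
    int m_iExtra;
};

// Version 3: only the parent changes; it gains a virtual function,
// so every object now also stores a pointer to the virtual table.
class ParentV3 {
public:
    virtual void Touch() {}
    int m_iValue;
};
class ChildV3 : public ParentV3 {
public:
    int m_iExtra;
};

// How many bytes the child grew by after each change to the parent.
std::size_t GrowthAfterNewMember()  { return sizeof(ChildV2) - sizeof(ChildV1); }
std::size_t GrowthAfterNewVirtual() { return sizeof(ChildV3) - sizeof(ChildV1); }

// The child's own declaration never changed, yet its size did.
void CheckChildGrew()
{
    assert(GrowthAfterNewMember() > 0);
    assert(GrowthAfterNewVirtual() > 0);
}
```

Calling CheckChildGrew() from a test or from main() is enough to confirm that a client compiled against ChildV1 would disagree with the newer versions about the object size.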
To solve the problems above we use interfaces. What exactly is an interface? How do we write code that defines an interface? How can we communicate with an interface? The following section answers these questions.
The basic idea of an interface using C++
Consider the following class:
class CExampleArray
{
public:
    int getLength() { return m_iLength; }
private:
    int m_iLength;
    int m_ArrVec[100];
};
If we create an instance of the above class, it may waste memory because m_ArrVec is predefined with a size of 100. And what if we need an array of more than 100 integers? This class is inadequate as a definition of a general array.
We can resolve the problem by replacing the fixed-size array with a pointer and storing the size of the array in another data member, as in the following class:
class CExampleArray
{
public:
    int getLength() { return m_iLength; }
private:
    int m_iLength;
    int* m_ArrVec;
    short m_iArrSize;
};
Since data members were changed and added, the class must be re-compiled, and so must any client that uses it. An instance of the new version no longer wastes memory, but any client that depends on the size of the array will have a problem (because one client can define an array with 20 integer elements and another can define an array with 100 integer elements).
Besides data members, which change the size of the class in memory, we can also add virtual functions, which alter the size as well:
class CExampleArray
{
public:
    virtual void ReverseArray();
    int getLength() { return m_iLength; }
private:
    int m_iLength;
    int* m_ArrVec;
    short m_iArrSize;
};
A virtual table will be created for this class because of the virtual method ReverseArray, and each instance of the class will contain a pointer to the virtual table (VPTR), which adds the size of one pointer (4 bytes on a 32-bit platform) to the total class size. Again, a client that relies on the class size might break.
Suppose we don't want to let a client modify the class data members. We can restrict the client from any data modification of our class and move the data handling to another class:
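// The implementation class is defined first so that CExampleArray can call into it.
class CExampleArrayDataImpl
{
public:
    int getLength() { return m_iLength; }
private:
    int m_iLength;
    int* m_ArrVec;
    short m_iArrSize;
};
class CExampleArray
{
public:
    virtual void ReverseArray();
    int getLength() { return m_pDataImpl->getLength(); }
private:
    CExampleArrayDataImpl* m_pDataImpl;
};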
Now all the data members are handled in the class CExampleArrayDataImpl. To retrieve or modify the data members, we access the implementation class CExampleArrayDataImpl by invoking its methods through the m_pDataImpl pointer. The size of the implementation class can change (for example by adding new data members or virtual functions), but this doesn't change the size of CExampleArray, because it only holds a pointer to the implementation class.
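The arrangement described above can be sketched as a complete, compilable pair of classes. This is only an illustration under stated assumptions: the constructors, the destructors and the helper FacadeIsPointerSized are not in the original, and the virtual ReverseArray method is left out so that the pointer is the only thing stored in CExampleArray.

```cpp
#include <cassert>
#include <cstddef>

// Implementation class: it owns the data and is free to grow
// without affecting the size of CExampleArray.
class CExampleArrayDataImpl {
public:
    explicit CExampleArrayDataImpl(int iLength)
        : m_iLength(iLength), m_ArrVec(NULL)
    {
        assert(iLength >= 0);
        m_ArrVec = new int[iLength]();   // zero-initialized storage
    }
    ~CExampleArrayDataImpl() { delete[] m_ArrVec; }
    int getLength() const { return m_iLength; }
private:
    // Declared but not defined: copying is disabled in this sketch.
    CExampleArrayDataImpl(const CExampleArrayDataImpl&);
    CExampleArrayDataImpl& operator=(const CExampleArrayDataImpl&);
    int m_iLength;
    int* m_ArrVec;
};

// Client-facing class: its only data member is the pointer.
class CExampleArray {
public:
    explicit CExampleArray(int iLength)
        : m_pDataImpl(new CExampleArrayDataImpl(iLength)) {}
    ~CExampleArray() { delete m_pDataImpl; }
    int getLength() { return m_pDataImpl->getLength(); }
private:
    CExampleArray(const CExampleArray&);
    CExampleArray& operator=(const CExampleArray&);
    CExampleArrayDataImpl* m_pDataImpl;
};

// Hypothetical helper: true when CExampleArray occupies exactly one pointer.
bool FacadeIsPointerSized()
{
    return sizeof(CExampleArray) == sizeof(CExampleArrayDataImpl*);
}
```

Adding members to CExampleArrayDataImpl changes only sizeof(CExampleArrayDataImpl); a client that holds a CExampleArray keeps seeing an object the size of one pointer.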
The following diagram shows how both classes (CExampleArray and CExampleArrayDataImpl) will be presented in memory:
The class CExampleArray still holds a data member (a pointer to CExampleArrayDataImpl). Our goal is to create a class without any data members, a class that is exposed only through its methods and lets derived classes implement those methods.
To do that, we need to make all the methods of the class CExampleArray pure virtual and remove all the data members from the class:
class CExampleArray
{
public:
    virtual void ReverseArray() = 0;
    virtual int getLength() = 0;
};
This is an abstract class: it contains only pure virtual function declarations (the " = 0 " marks a function as pure virtual). All the data members were removed.
This abstract class is known as an interface. An interface is basically a class that contains only pure virtual functions and has no data members.
A client that uses this interface only uses a pointer to it and does not need to rely on the interface size. All the data behind the interface is hidden from the client and lives in the classes that inherit from the interface. By convention, the name of an interface class is prefixed with 'I'. From now on, I will refer to the interface class CExampleArray as IExampleArray.
So far we've learned what an interface looks like. Now let's take a look at what a basic COM interface looks like:
A basic COM interface using C++
Let us look at how COM interfaces are created using C++. C++ does not allow us to create an instance of an abstract class, so the following line of code is not valid:
    IExampleArray* pNewInstance = new IExampleArray;
An interface can only be accessed through a pointer, which leads to the virtual table that exposes the interface's methods. An interface does not come by itself; it usually comes with a derived class that implements the methods the interface exposes. Such a class is often called a co-class. Here is an example of a co-class:
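class CExampleArrayImpl : public IExampleArray
{
public:
    virtual void ReverseArray() { /* implementation omitted */ }
    virtual int getLength() { /* implementation omitted */ return 0; } // placeholder return value
};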
Since the implementation class CExampleArrayImpl has no pure virtual functions, we can create an instance of it. We will now write a function that creates the implementation class (co-class) with operator new and returns a valid pointer to the interface (this works thanks to polymorphism in C++). We can use a global function to create an instance of the co-class, or a static method. A function that creates an instance of a co-class and returns a pointer to its interface is often called a class factory. Here is the global create-instance function:
IExampleArray* CreateArrInstance()
{
    return new CExampleArrayImpl;
}
Now if a client wants to use the interface, all it has to do is obtain a pointer to the interface object by calling the CreateArrInstance function, and then invoke the methods of the interface:
int UseTheInterfaceMethod()
{
    IExampleArray* pArr = CreateArrInstance();
    int iLength = pArr->getLength();
    return 0;
}
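Putting the pieces together, the interface, the co-class, the factory function and the client can be sketched as one self-contained listing. Several details are assumptions added for illustration: the std::vector storage and its three initial values, the virtual destructor on IExampleArray (added so the client can release the object with delete; COM itself relies on reference counting through Release instead), and the fact that UseTheInterfaceMethod returns the length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Interface: no data members, only virtual functions.
class IExampleArray {
public:
    virtual ~IExampleArray() {}          // assumption: allows delete through the interface
    virtual void ReverseArray() = 0;
    virtual int getLength() = 0;
};

// Co-class: the client never sees this declaration.
class CExampleArrayImpl : public IExampleArray {
public:
    CExampleArrayImpl()
    {
        // Illustrative content only.
        m_Values.push_back(1);
        m_Values.push_back(2);
        m_Values.push_back(3);
    }
    virtual void ReverseArray() { std::reverse(m_Values.begin(), m_Values.end()); }
    virtual int getLength() { return static_cast<int>(m_Values.size()); }
private:
    std::vector<int> m_Values;
};

// Factory function: the only place that names the co-class.
IExampleArray* CreateArrInstance()
{
    return new CExampleArrayImpl;
}

// Client code: works purely through the interface pointer.
int UseTheInterfaceMethod()
{
    IExampleArray* pArr = CreateArrInstance();
    assert(pArr != NULL);
    pArr->ReverseArray();
    int iLength = pArr->getLength();
    delete pArr;                         // runs the co-class destructor via the virtual destructor
    return iLength;
}
```

In a real component the interface declaration and the factory prototype would be the only parts shipped to the client in a header; the co-class would stay inside the library.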
The client doesn't see the class CExampleArrayImpl or use operator new directly. It only knows the virtual table of the IExampleArray interface. Interfaces can be inherited in the same way as classes in C++.
Interface inheritance
We can create another interface that inherits from IExampleArray and extends it (it has the same 2 methods as IExampleArray and adds a new method of its own):
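class IAnOtherExampleArray : public IExampleArray
{
public:
    // Inherited from IExampleArray; repeating the two declarations is optional.
    virtual void ReverseArray() = 0;
    virtual int getLength() = 0;
    // BOOL is the Windows SDK typedef (declared via windows.h).
    virtual BOOL Find(int iKey) = 0;
};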
Notice that inheritance is allowed between interfaces because they are classes after all. We can now create a new co-class (an implementation class for the IAnOtherExampleArray interface):
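class CAnOtherExampleArrayImpl : public IAnOtherExampleArray
{
public:
    virtual void ReverseArray() { /* implementation omitted */ }
    virtual int getLength() { /* implementation omitted */ return 0; }    // placeholder return value
    virtual BOOL Find(int iKey) { /* implementation omitted */ return FALSE; } // placeholder return value
};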
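A client that uses the above interfaces will see both IExampleArray and IAnOtherExampleArray but will not see the CExampleArrayImpl and CAnOtherExampleArrayImpl co-classes. To obtain a pointer to the new interface IAnOtherExampleArray, the client must use a dynamic cast as follows (the cast succeeds only if the object returned by CreateArrInstance actually implements IAnOtherExampleArray, for example a CAnOtherExampleArrayImpl; otherwise it yields a null pointer):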
int UseTheInterfaceMethod()
{
    IExampleArray* pArr = CreateArrInstance();
    int iLength = pArr->getLength();
    // dynamic_cast returns NULL when the object does not implement IAnOtherExampleArray.
    IAnOtherExampleArray* pNewArr = dynamic_cast<IAnOtherExampleArray*>(pArr);
    if (pNewArr != NULL)
        pNewArr->Find(1);
    return 0;
}
Remark: To make the above code work in the Visual Studio environment, make sure you compile it with the '/GR' option, which enables run-time type information; otherwise you will get a compilation warning (C4541).
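The behaviour of the dynamic cast with and without the extended interface can be sketched in a portable, self-contained form. The listing below is an illustration under assumptions, not the article's original code: it uses bool instead of the Windows BOOL type so that it compiles without windows.h, adds a virtual destructor to IExampleArray, gives the co-classes trivial placeholder bodies, and introduces a hypothetical helper named FindIfSupported.

```cpp
#include <cassert>
#include <cstddef>

class IExampleArray {
public:
    virtual ~IExampleArray() {}          // assumption: not part of the original interface
    virtual void ReverseArray() = 0;
    virtual int getLength() = 0;
};

// Extended interface: inherits both methods and adds Find.
class IAnOtherExampleArray : public IExampleArray {
public:
    virtual bool Find(int iKey) = 0;     // bool stands in for BOOL
};

// Co-class that implements only the basic interface.
class CExampleArrayImpl : public IExampleArray {
public:
    virtual void ReverseArray() {}
    virtual int getLength() { return 0; }
};

// Co-class that implements the extended interface.
class CAnOtherExampleArrayImpl : public IAnOtherExampleArray {
public:
    virtual void ReverseArray() {}
    virtual int getLength() { return 1; }
    virtual bool Find(int iKey) { return iKey == 1; }   // placeholder: pretends to hold only the value 1
};

// Hypothetical helper. Returns 1 if the key is found, 0 if it is not,
// and -1 if the object does not implement IAnOtherExampleArray.
int FindIfSupported(IExampleArray* pArr, int iKey)
{
    assert(pArr != NULL);
    IAnOtherExampleArray* pNewArr = dynamic_cast<IAnOtherExampleArray*>(pArr);
    if (pNewArr == NULL)
        return -1;                       // the cast failed, so Find must not be called
    return pNewArr->Find(iKey) ? 1 : 0;
}
```

Passing a CExampleArrayImpl to FindIfSupported takes the failure branch, while passing a CAnOtherExampleArrayImpl reaches Find, which is why the result of the cast should be checked before it is used.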
Summary
By now you hopefully have a better understanding of what lies beneath the interface concept. I did not cover COM itself in this article because there are excellent articles about the technology both on MSDN and on developer sites.