Custom (User Defined) Operators in C++

15 Dec 2008 | CPOL | 8 min read
C++ has no native support for adding new operators, but this article shows how, with macros and some clever overloading, it is possible to easily add your own anyway.

Introduction

C++ supports operator overloading, but you are not allowed to create your own operators. That is the word on the street anyway. In practice, this is only partly true. C++ gives you enough rope to hang yourself 10 times over. The good thing about this is that we have more than enough power to emulate almost anything that was left out, including creating our own operators.

What does it do?

In a word, syntax sugar. Rather than writing:

if (CString("foo bar baz").Find("bar") != -1)

You could define a "contains" operator, and write:

if ("foo bar baz" contains "bar")

This is obviously much easier, both to write and to read. You also avoid far-too-common-but-nasty errors, such as the following mistake:

if (CString("foo bar baz").Find("foo")) 

Now, be honest, how long did it take you to spot the error above? Or, did you give up? How long would it have taken if you were trying to find the line that was misbehaving? How long if the contents of the string weren't hard coded on the same line? This ugly, but sadly common, bug can be avoided entirely with our custom "contains" operator. Syntax sugar is one of the more powerful tools in a programmer's arsenal. When used correctly, it can dramatically increase output and reduce bugs.
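For readers who did give up: the condition tests the index returned by Find, not whether a match was found. The sketch below illustrates the pitfall under a few assumptions. std::string stands in for CString so that it compiles without MFC, and find_index, buggy_contains and fixed_contains are hypothetical helpers written for this illustration, with find_index mimicking the "zero-based index or -1" convention of CString::Find.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for CString::Find: returns the zero-based index of
// the first match, or -1 if there is none.
inline int find_index(const std::string& haystack, const std::string& needle)
{
   std::string::size_type pos = haystack.find(needle);
   return pos == std::string::npos ? -1 : static_cast<int>(pos);
}

// Same truth value as "if (s.Find(x))": index 0 (a match at the very start)
// reads as false, while -1 (no match) reads as true.
inline bool buggy_contains(const std::string& haystack, const std::string& needle)
{
   return find_index(haystack, needle) != 0;
}

// The intended test: anything other than -1 is a match.
inline bool fixed_contains(const std::string& haystack, const std::string& needle)
{
   return find_index(haystack, needle) != -1;
}

inline void demo_find_pitfall()
{
   assert(!buggy_contains("foo bar baz", "foo"));  // present, reported as absent
   assert(buggy_contains("foo bar baz", "qux"));   // absent, reported as present
   assert(fixed_contains("foo bar baz", "foo"));
   assert(!fixed_contains("foo bar baz", "qux"));
}
```

The buggy form only gives the right answer when the match happens to start somewhere after position 0, which is exactly why it survives casual testing.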

Background

If you don't know what operator overloading is, you should quickly Google it before attempting to use this code or read this article. It should only take you 5-10 minutes to get a good handle on this. Since any C++ developer worth his salt should understand operators and overloading, I'm not going to cover it here.

Using the code

Since most of you care more about getting your new toy, and less about how it works, we'll get you up and running first.

Just add the following two lines to your project, and you will have a few new operators, including the "contains" operator described earlier:

#include "CustomOperator.h"
// IMPORTANT: SampleOperators.h assumes a TCHAR enabled MFC environment
// If this does NOT describe your project, you will need to make some
// adjustments before you will be able to use it.
#include "SampleOperators.h"

Easy, huh?

If you want to add your own custom operators (that was the objective after all), you will need just a few lines more, but it isn't hard at all. If you don't want to understand how all of this works, your best bet is to read through the absurdly detailed comments in the two header files. Copy the samples to get your first couple of operators working, and then you should be good to go.

How does it all work?

If you don't know what a macro is, or if you have involuntary spasms when you think about template functions/classes, I recommend not trying to understand the code. The file CustomOperator.h handles all the really messy stuff for you. Just follow the examples in the sample files, and you should be up and running with "that one operator you always wanted".

For the rest of you (or the masochistically curious), the basic concept is fairly easy. Make the following line work, and you effectively have a custom operator:

#define contains == CCustomOperatorHelper_contains() ==

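The precedence of the above operator will be the same as the precedence of the == (equals) operator, because that is what we use on both sides. Since == groups left to right, the left operand meets the helper first, and the result of that then meets the right operand. I shouldn't have to tell you how powerful this is.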

Now, making the above line work in the single case isn't all that bad. You can do a darn good job for small tasks with a tiny class and two operator overloads.

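class CCustomOperatorHelper_contains
{
public:
   CCustomOperatorHelper_contains(){}
   CString m_sLeft; // copy of the left operand, filled in by the first ==
};
// Left half: "left == helper" stores the left operand in the helper.
// Note: the macro passes a temporary helper, and binding a temporary to a
// non-const reference relies on a Visual C++ language extension.
inline CCustomOperatorHelper_contains& operator == 
       (CString l, CCustomOperatorHelper_contains& mid)
{
   mid.m_sLeft = l;
   return mid;
}
// Right half: "helper == right" performs the actual search.
inline bool operator == (CCustomOperatorHelper_contains& mid, CString r)
{
   return mid.m_sLeft.Find(r) != -1;
}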

That, plus the original #define, and you have a functional operator. I did it in about 10 minutes. Why didn't they think of this before?
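The same proof of concept can be sketched in portable C++ without MFC. Treat the version below as an illustration under assumptions, not as the code from CustomOperator.h: std::string stands in for CString, and ContainsTag and ContainsLeft are hypothetical names. It passes the tag by value and returns a new object from the left half, so it does not depend on compiler-specific rules for binding temporaries to references.

```cpp
#include <cassert>
#include <string>

// Hypothetical names; std::string stands in for CString.
struct ContainsTag {};                       // the object the macro puts between the two ==
struct ContainsLeft { std::string left; };   // result of the left half; remembers the left operand

// Left half: "text == ContainsTag()" captures the left operand.
inline ContainsLeft operator==(const std::string& l, ContainsTag)
{
   return ContainsLeft{l};
}

// Right half: "ContainsLeft == pattern" performs the actual search.
inline bool operator==(const ContainsLeft& l, const std::string& r)
{
   return l.left.find(r) != std::string::npos;
}

// Define the macro after all #includes: newer standard library headers
// declare member functions named contains, which the macro would rewrite.
#define contains == ContainsTag() ==

inline void demo_contains()
{
   assert("foo bar baz" contains "bar");
   assert(!("foo bar baz" contains "qux"));
   // Same precedence as ==, so && binds more loosely, as you would hope:
   assert("foo bar baz" contains "foo" && 1 + 1 == 2);
}
```

Because the expansion is just a chain of ==, mixing the custom operator with a real == on the same level does not group the way it reads. An expression such as flag == text contains "x" is parsed as ((flag == text) == ContainsTag()) == "x", so parenthesize the custom operator in that situation.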

Really...is that it?

Sadly, no. The solution above is a great proof of concept, and not much else. It suffers from a number of fatal problems that are not easily solved. First, our operator makes three extraneous string copies (both operands are passed by value, and the left one is copied again into m_sLeft), meaning that we may have as much as quadrupled the runtime for this line of code. If the core operation were something much simpler than a find (for example, a compare), this line could easily take 100 times longer than the original, or worse. (Disclaimer: before some MS guru flames me for it, CString does supposedly have some internal copy-on-write logic to prevent problems in exactly these kinds of settings, but the nature of the problem remains, even if CString magically solves it this time.)

To add insult to injury, the architecture above doesn't allow the type of the left operand to vary, since the second half of the operation starts with the helper rather than the original left operand. As long as the helper only deals with CStrings, we can infer that the type was a CString, but if we want to expand that, things get messy awfully fast.

As fate would have it, one change lets us solve both issues. The catch is that we now need four classes for our single custom operator.

class CCustomOperator_param_base
{
public:
   virtual ~CCustomOperator_param_base(){}
};

template<class T_left> class CCustomOperatorHelper_contains_leftparam_T
                     : public CCustomOperator_param_base
{
public:
   CCustomOperatorHelper_contains_leftparam_T(T_left l)
   {
      m_l_buf = l;
      m_pl = &m_l_buf;
   }
   T_left m_l_buf; // buffer, since we have to copy the left side value
   T_left* m_pl; // pointer to the value on the left
};

template<class T_left> class CCustomOperatorHelper_contains_leftparamref_T
                     : public CCustomOperator_param_base
{
public:
   CCustomOperatorHelper_contains_leftparamref_T(T_left& l)
   {
      m_pl = &l;
   }
   T_left* m_pl; /* pointer to the value on the left */
};

class CCustomOperatorHelper_contains
{
public:
   CCustomOperatorHelper_contains(){m_pLeft = NULL;}
   ~CCustomOperatorHelper_contains(){delete m_pLeft;}
   CCustomOperator_param_base* m_pLeft;
};

For the uninitiated, that template <class T> stuff lets us defer judgment on variable types. The compiler will generate classes as needed to satisfy whatever variable types we use with those classes. If you have been using macros without templates, you have a real treat coming, because when you put them together, you can create some truly amazing syntactic sugar, and keep it all type safe.

I digress; back to the code. The classes above handle some of the juggling that we can't handle otherwise. The two template classes will be used as return values from the left half of our operation. This allows us to use different overloads for the second part, based on which type was on the left for the first part. We have two of these template classes because we have two different ways to pass parameters. We will use one class if we are dealing with references, and the other for everything else. This way, we can avoid copy operations for anything that is big or doesn't have a copy constructor (like CDWordArray). (Note: We can't just handle it all by reference, because a non-const reference would prevent implicit conversions and constants in our operands.)

To set up the first half, we need one overloaded operator per type that may be a left operand. In our original scenario, we had one type, but we'll change that to three, just so you can see how it works:

inline CCustomOperatorHelper_contains_leftparam_T<const TCHAR*>& 
         operator == (const TCHAR* l, CCustomOperatorHelper_contains& r)
{
   return *(CCustomOperatorHelper_contains_leftparam_T<const TCHAR*>*)(r.m_pLeft = 
            new CCustomOperatorHelper_contains_leftparam_T<const TCHAR*>(l));
}
inline CCustomOperatorHelper_contains_leftparamref_T<CString>& 
       operator == (CString& l, CCustomOperatorHelper_contains& r)
{
   return *(CCustomOperatorHelper_contains_leftparamref_T<CString>*)
           (r.m_pLeft = new CCustomOperatorHelper_contains_leftparamref_T<CString>(l));
}
inline CCustomOperatorHelper_contains_leftparamref_T<CDWordArray>& 
       operator == (CDWordArray& l, CCustomOperatorHelper_contains& r)
{
   return *(CCustomOperatorHelper_contains_leftparamref_T<CDWordArray>*)
           (r.m_pLeft = new CCustomOperatorHelper_contains_leftparamref_T<CDWordArray>(l));
}

The inside of the functions is a little sloppy, but it isn't a big deal once we slice this into macros anyway. Also notice that we used the by-value version for the const TCHAR* form, but the pass-by-reference version for CString and CDWordArray. This allows us to minimize copying all the way around, while still remaining compatible with constants. CDWordArray has no copy constructor, so only the ref version will work there.

The next step is to write the functions that do the real work. We keep them in separate functions from the final overloads, because that will allow us to make things super tidy when we slice it into macros.

inline bool _op_contains(CString& l, const TCHAR* r)
{
   return l.Find(r) != -1;
}
inline bool _op_contains(CString& l, int r)
{
   TCHAR a[20];
   _itot_s(r, a, 10);
   return l.Find(a) != -1;
}
inline bool _op_contains(const TCHAR* l, const TCHAR* r)
{
   // inefficient, but this is just an example, right?
   return CString(l).Find(r) != -1;
}
inline bool _op_contains(const TCHAR* l, int r)
{
   TCHAR a[20];
   _itot_s(r, a, 10);
   // inefficient, but this is just an example, right?
   return CString(l).Find(a) != -1;
}
inline bool _op_contains(CDWordArray& l, DWORD r)
{
   for (int a = 0; a < l.GetCount(); a++)
   {
      if (l[a] == r)
      {
         return true;
      }
   }
   return false;
}

You can see that we're adding the ability to accept an int on the right, but not the left, and we're also extending the code to handle searching a CDWordArray for a value. In our final code, ONLY these five operations will work. You can't mix the CDWordArray from the left with a const TCHAR* on the right. If you try, you will get a very meaningful compiler error at the line where you messed up.

The only thing that remains now is to define five operator overloads for the second half of our operation. Each overload will point to one of the above functions.

inline bool operator == 
  (CCustomOperatorHelper_contains_leftparam_T<const TCHAR*>& l, const TCHAR* r)
{
   return _op_contains(*l.m_pl, r);
}
inline bool operator == 
  (CCustomOperatorHelper_contains_leftparam_T<const TCHAR*>& l, int r)
{
   return _op_contains(*l.m_pl, r);
}
inline bool operator == 
  (CCustomOperatorHelper_contains_leftparamref_T<CString>& l, const TCHAR* r)
{
   return _op_contains(*l.m_pl, r);
}
inline bool operator == 
  (CCustomOperatorHelper_contains_leftparamref_T<CString>& l, int r)
{
   return _op_contains(*l.m_pl, r);
}
inline bool operator == 
  (CCustomOperatorHelper_contains_leftparamref_T<CDWordArray>& l, DWORD r)
{
   return _op_contains(*l.m_pl, r);
}

Assuming that you didn't delete the original #define, you now have a much better operator than the first one.
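For comparison, here is a compact, MFC-free sketch of the same idea: the left half records the type of the left operand, and the right half only compiles for supported pairs. It is a variation under stated assumptions rather than the article's implementation. std::string and std::vector<unsigned> stand in for CString and CDWordArray, the names ContainsTag, LeftRef and op_contains are hypothetical, the left operand is taken by const reference instead of being copied or heap allocated, and a single function template replaces the five hand-written right-half overloads.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct ContainsTag {};

// Remembers where the left operand lives. A temporary created for the left
// operand survives until the end of the full expression, so the pointer is
// still valid when the right half runs.
template <class T> struct LeftRef { const T* p; };

// Left half: one overload per supported left-hand type.
inline LeftRef<std::string> operator==(const std::string& l, ContainsTag)
{
   return LeftRef<std::string>{&l};
}
inline LeftRef<std::vector<unsigned> > operator==(const std::vector<unsigned>& l, ContainsTag)
{
   return LeftRef<std::vector<unsigned> >{&l};
}

// The operations that are actually supported.
inline bool op_contains(const std::string& l, const std::string& r)
{
   return l.find(r) != std::string::npos;
}
inline bool op_contains(const std::string& l, int r)
{
   return l.find(std::to_string(r)) != std::string::npos;
}
inline bool op_contains(const std::vector<unsigned>& l, unsigned r)
{
   return std::find(l.begin(), l.end(), r) != l.end();
}

// Right half: forwards to whichever op_contains matches. If none does, this
// template drops out and the compiler reports the error at the call site.
template <class L, class R>
inline auto operator==(LeftRef<L> l, const R& r) -> decltype(op_contains(*l.p, r))
{
   return op_contains(*l.p, r);
}

#define contains == ContainsTag() ==

inline void demo_typed_contains()
{
   std::vector<unsigned> ids = {10u, 20u, 30u};
   assert("foo bar baz" contains "bar");         // literal converted to std::string
   assert(std::string("route 66") contains 66);  // int on the right
   assert(ids contains 20u);
   assert(!(ids contains 99u));
   // ids contains "bar" would not compile: there is no matching op_contains.
}
```

The trade-off of the template is that the list of supported pairs lives only in the op_contains overloads, whereas the explicit overloads in the article spell out each pair twice but keep the binding visible.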

Is there a shortcut?

Obviously, going through this whole process each time you want an operator would get a little bit tiresome. That is a lot of extra very complicated code, just so that we can sweeten a few lines elsewhere.

To make life easy for everybody, I broke this out into a set of macros. Each distinct component can be defined with a macro. A complete operator could look something like the following admittedly worthless operator:

#define avg BinaryOperatorDefinition(_op_avg, /)
DeclareBinaryOperator(_op_avg)
DeclareOperatorLeftType(_op_avg, /, double);
inline double _op_avg(double l, double r)
{
   return (l + r) / 2;
}
BindBinaryOperator(double, _op_avg, /, double, double)

Each macro is defined in CustomOperator.h, and is accompanied by more commentary than you ever wanted.
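To see what such an operator boils down to, here is a hand-written equivalent of avg. It is only a sketch: AvgTag and AvgLeft are hypothetical names, and the real macros in CustomOperator.h may well expand to something different. What carries over is the precedence, which comes from the / used on both sides.

```cpp
#include <cassert>

struct AvgTag {};                  // placed between the two / by the macro
struct AvgLeft { double left; };   // left operand captured by the first /

// Left half: "value / AvgTag()" remembers the left operand.
inline AvgLeft operator/(double l, AvgTag) { return AvgLeft{l}; }

// Right half: "AvgLeft / value" computes the average.
inline double operator/(AvgLeft l, double r) { return (l.left + r) / 2; }

#define avg / AvgTag() /

inline void demo_avg()
{
   assert((2.0 avg 4.0) == 3.0);
   // avg binds like division, so it is evaluated before the addition:
   assert((1.0 + 2.0 avg 4.0) == 4.0);
   // ...and it groups left to right with neighbouring / operators:
   assert((8.0 / 2.0 avg 4.0) == 4.0);
}
```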

You may have noticed that the macros identify this as a "binary" operator, meaning it has two sides. This is because we can also create UnaryPost operators (unary operators that sit to the right of the operand). Here is a quick example of a UnaryPost operator (again, totally worthless; may you come up with a better use):

#define squared UnaryPostOperatorDefinition(_op_squared)
DeclareUnaryPostOperator(_op_squared)
inline double _op_squared(double l)
{
   return l*l;
}
BindUnaryPostOperator(double, _op_squared, double)

UnaryPost operators are a bit easier than binary operators. First, we can eliminate the left hand type garbage, because there is only actually one operation. Second, the order of operations isn't a big deal, and the UnaryPost operator always evaluates at the same priority as multiplication. Yes, I know, this is different from other postfix operators, but I believe the behavior will match the way that the code reads (and I couldn't do the other because some idiot decided that [] can only be overloaded as a member function; darn C2801 errors).
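One way a postfix operator can end up with multiplication precedence is to expand to a * followed by a tag object. The sketch below assumes that approach; SquaredTag is a hypothetical name, and the actual UnaryPostOperatorDefinition macro may be built differently. It also shows a consequence of that precedence that is worth knowing before you rely on it.

```cpp
#include <cassert>

struct SquaredTag {};   // hypothetical tag the macro appends after a *

// "value * SquaredTag()" is the whole operation; there is no right operand.
inline double operator*(double l, SquaredTag) { return l * l; }

#define squared * SquaredTag()

inline void demo_squared()
{
   assert((3.0 squared) == 9.0);
   assert((2.0 + 3.0 squared) == 11.0);   // 2 + (3 squared)
   assert((2.0 * 3.0 squared) == 36.0);   // (2 * 3) squared, not 2 * 9
}
```

Under this assumption, anything joined to the operand by * or / on the left is squared along with it, so parenthesize the operand when it sits inside a product.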

Is there anything you don't do?

Eventually, I plan to support ternary and multary operators (?: is a ternary operator; an example of a multary operator would be an inline switch statement). Unfortunately, this is not in place yet.

Unary prefixes (that sit to the left) are not officially supported, but can be emulated. There is a serious problem with attempting this, which is why I've chosen not to support it. A detailed explanation is available in the code.

Finally, your operator names have to be built from a-z, A-Z, 0-9 and _, just like every other identifier. The reason is that a macro name must be a valid identifier, which makes it impossible to declare operators like >>> or anything else with a symbol. You can still use really short names, though I would capitalize them and avoid using single letters.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)