(untagged)

Implementing 'foreach' for C++ as a Design Pattern

Rajiv Bhagwat

0.00/5 (No votes)

26 May 2005

Adding 'foreach' in C++ programs for simplifying iteration.

Introduction

Unlike the newer languages, 'C' and 'C++' do not have a built-in foreach operation. Looping over the containers is one of the most common operations. A foreach simplifies writing and reading of code. Here, a simple approach is suggested to avail this facility in C++.

If you search the net, you will find long winded discussions on how to implement this facility using complex macros, conditional operators ?:, multi line macro definitions, code having dependency on STL or boost or other libraries, preprocessor programs, convoluted expressions for type safety etc..

What I present here, uses three lines of macros that you need to place in your headers, then implement an interface and enjoy a cross platform solution with type safety. Simplicity has been the prime goal.

Using 'foreach'

Take a look at the following (desired) program fragments:

#include "files.hpp"

typedef const char *CSTR;

extern void find_replace(CSTR filename, CSTR from, CSTR to);

void change()
{
    Files files("*.htm");
    foreach(files)
    {
        find_replace(files.current(), "(c) 2005", "(c) 2006");
    }
}

and..

void bill()
{
    Clients     clients;
    Invoices     invoices;
    Services     services;

    foreach(clients)
    {
        cout << "Client: " << clients.current().name() << endl;
        invoices = clients.current();
        foreach(invoices)
        {
            cout << "Invoice #: " << invoices.current().number() << endl;
            services = invoices.current();
            foreach(services)
            {
                cout << services.current() << endl;
            }
        }
    }
}

Note that in each of these cases, there are no iterators, it is not known (totally encapsulated) whether these are STL containers or not!

Thus, we want to achieve two goals at the same time: mainly, simplicity in writing the code; and secondly, information hiding - so that the implementation (STL containers or not) does not dictate the use.

Well, in these examples, some of the 'containers' are STL compliant and some are not. This method allows you to achieve the same consistent usage nevertheless; without losing STL's efficiency, generality, and extensibility wherever you can; while keeping the interface utterly simple.

Correctly speaking, I should say, there are no open, 'visible' iterators! Looping often involves two types of containers: those holding simple (POD - plain old data: ints, doubles, chars) types and those holding abstract data types (objects of some class).

Simple PODs: For simple things like char line[80], looping using for(i=0; i < ...) is perfectly fine. The explicit for loop is quite OK here, as it expresses the operation clearly. Trying to use anything else would obscure things.
Objects: For this, usually an 'iterator' is declared and its * operator is used to access the elements in turn.

Take a look at the following function, taken from actual usage. Lots of people write like this, so, this would be considered 'normal'.

void bill()
{
    Client client;

    // Loop over the clients

    for(Client::const_iterator ri = client.begin(); 
                                    ri != client.end(); ++ri) 
    {
      cout << (*ri).name() << endl;

      // Loop over the Invoices

      for(Invoices::const_iterator fi = (*ri).begin(); 
                                     fi != (*ri).end(); ++fi) 
      {
         cout << "Invoice #:" << (*fi).number() << endl;
         // Loop over the Services

         for(Services::const_iterator si = (*fi).begin(); 
                                     si != (*fi).end(); ++si) 
             cout << (*si) << endl;

      }
      cout << endl;
    }
}

In case, it is needed to add namespace specifiers like 'Projectname::Client::const_iterator' and 'std::cout', 'std::endl' everywhere, you can imagine the reading excitement this is going to cause!

What I found utterly unacceptable for looping over object containers, as seen from this routine, was - declaring an iterator every time merely for looping, and having to type things like (*ri).number().

Do you want to express the intent more clearly, type less and make it more maintainable? Tired of inventing new names for the iterators every time? Then recheck the foreach based version above.

Implementing the container

The formula for clearer code is achieved by combining the container with the iteration capabilities and encapsulating it as a single class. The foreach (described later) works for any class having the following member functions:

class Cont {
    public:
        ...
        bool first();
        bool next();
        T current();  // T is the type of whatever that is contained herein

// optional:

        T & operator()();  // Reference, hence good for mutables

        T & operator[](size_t index);  // Sometimes useful..

    protected:
...
    };

I started using this pattern before the STL surfaced, hence the names resemble the older iteration style.

For STL based containers (or compatible - things made to behave like STL containers), this is equally simple to achieve:

template < class T > 
class Container : public vector<T>{ 
    protected : 
        Container::iterator it;
    public : 
//        Container(){}

//        ~Container(){}

        bool first(){ it = begin(); return it != end(); }
        bool next(){ ++it; return it != end(); }
        T current(){ return *it; }
...

This is an example of a class holding objects of type 'T' in a vector. It could really be any of the other STL containers, suitable for the application at hand. The first() and next() are defined in terms of begin() and end(). Note how current() provides access to the desired contained object. Also note the use of ++it, as it++ would mean something quite different. All such traps are hidden, the users see only foreach.

For other (non STL based) classes, first() positions to the first object, and next() to the next. Both return false if that can't be done. Surprising number of other objects will readily yield to this: files, configuration entries, fields in an HTML form, DBMS tables etc., where you would otherwise have to build code to make them STL iteration compliant, just to unify the looping.

The foreach is achieved by using the following macros:

#define JOIN2(a,b) a##b
#define JOIN(a,b) JOIN2(a,b)
#define foreach(a) for(bool JOIN(flag,__LINE__)=a.first();
            JOIN(flag,__LINE__);JOIN(flag,__LINE__)=a.next())

Contrary to the common macro naming conventions, I have left foreach in lower case as it comes out as a true single line for statement and avoids all possible problems with similar macros regarding dangling ifs, braces, etc. Simple, huh?

So, to recapitulate, this foreach takes in any 'container' object with first(), next() defined for it, and provides access to each element using current(). The access to mutable objects can be provided using the operator().

A new flag

Every time it is used, my foreach creates a new bool flag variable, unique because it uses the current source line number as part of its name. This flag is never used again, so the actual name does not matter. It just needs to be unique, every time. According to the new standard, it will be forgotten when the scope of the for is over.

Type safety is achieved as there is a separate class for each type of object container, the compiler does not need to know anything special about the objects being iterated over.

Caveats

My STL version as shown, evaluates end() once in first() and then again in next(). Change it if you have problems with that implementation. I use a template container version using direct STL containers and have not encountered a problem yet. For other things like 'Files' above, your implementations of first(), next() will depend upon whether you are using DOS, Win32 or Linux - all nicely hidden by the implementation.

My understanding of STL is close to minimal, I just am sold on its efficiency. So, I use it whenever the readability does not suffer. I can always buy a faster computer, can't put in the readability later.

Variations

Quite often, it makes sense to hold a copy of the contained item within the envelope, and use it as the 'current' value.

Customers cust;
foreach(cust){
    cout << cust.name << "\t" << cust.phone << endl;
    }

Here, attributes like name and phone are stored within the Customers class and set to the values of the current record in both first() and next(). Here, the container holds the object attributes and is the object itself.

Alternatively, you can derive Customers (note the ending 's', therefore a set of customers) from Customers (single object) class, isolating the object and iterating concerns. Then, exactly the same code is usable: (Here the container is derived from the object).

Customers cust;
foreach(cust){
    cout << cust.name << "\t" << cust.phone << endl;
    }

Once again, the usage does not tell the user anything about the underlying design aspects, they are not exposed.

Note: In the examples given much earlier, the container and the objects are separate, hence copy operations may be invoked on the objects, when referred using current(). Under such conditions, following the regular C++ guidelines, don't forget to define the copy constructor and the copy operator if it contains anything but PODs. But this is common C++ programming, nothing specific about this implementation of foreach.

Also, this foreach does not take two parameters as in foreach(i in iset), which would require the compiler to know about the container. So if and when the standard adds such a thing, you can easily distinguish between the two.

Epilog

I had second thoughts about publishing this technique, as it is particularly obvious and simple, once shown. For long time, I had the impression that others must have stumbled on such a trivial solution earlier. Discussions in a newsgroup indicated otherwise. If nothing else, please treat this as a simple pattern that I found stable and have used in my code.

I would compare this with #define dim(x) (sizeof(x)/sizeof(x[0])), which has saved me many chagrins, just define and use brainlessly. You can certainly code without it, and enjoy the debugging fun.

A comment about the target level: this article should be useful for beginners and advanced users, as beginners readily will see the simple 'user' code; advanced users will understand the impact of seemingly simple solutions. The intermediates will most likely have too much code written to even think about changing anything!

Beginners, please don't ask me the following:

A class to handle files such as "*.htm"! Better sources exist on the net.
A class to handle clients, invoices and services. You got to write them!

The example given is just that, an example. I do not endorse OpenRJ or derivatives. When I complained that the library and the usage examples could be simpler, the author wanted clarifications. My sample code had this foreach. It was revealed during further discussions that foreach could be useful to others. I wrote the test program first, then the alternate library implementations. The original library can be found at: Open-RJ (redirects to SourceForge). Thanks to Matthew Wilson for pointing me in the right direction!

References

Much more involved discussions have been done about 'foreach' in C++. Most of these have complex implementation suggestions. Some of the discussions are at:

Artima.com
CUJ article: foreach is Your Friend
DDJ, Mar 2004: Collection enumeration: loops, iterators, and nested functions, Walter Bright Matthew Wilson.

History

26^th May, 2005: The technique is simple, and quite stable. Any suggestions for improvement?

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here