Introduction
Unlike the newer languages, 'C' and 'C++' do not have a built-in foreach
operation. Looping over the containers is one of the most common operations. A foreach
simplifies writing and reading of code. Here, a simple approach is suggested to avail this facility in C++.
If you search the net, you will find long winded discussions on how to implement this facility using complex macros, conditional operators ?:
, multi line macro definitions, code having dependency on STL or boost or other libraries, preprocessor programs, convoluted expressions for type safety etc..
What I present here, uses three lines of macros that you need to place in your headers, then implement an interface and enjoy a cross platform solution with type safety. Simplicity has been the prime goal.
Using 'foreach'
Take a look at the following (desired) program fragments:
#include "files.hpp"
typedef const char *CSTR;
extern void find_replace(CSTR filename, CSTR from, CSTR to);
void change()
{
Files files("*.htm");
foreach(files)
{
find_replace(files.current(), "(c) 2005", "(c) 2006");
}
}
and..
void bill()
{
Clients clients;
Invoices invoices;
Services services;
foreach(clients)
{
cout << "Client: " << clients.current().name() << endl;
invoices = clients.current();
foreach(invoices)
{
cout << "Invoice #: " << invoices.current().number() << endl;
services = invoices.current();
foreach(services)
{
cout << services.current() << endl;
}
}
}
}
Note that in each of these cases, there are no iterators, it is not known (totally encapsulated) whether these are STL containers or not!
Thus, we want to achieve two goals at the same time: mainly, simplicity in writing the code; and secondly, information hiding - so that the implementation (STL containers or not) does not dictate the use.
Well, in these examples, some of the 'containers' are STL compliant and some are not. This method allows you to achieve the same consistent usage nevertheless; without losing STL's efficiency, generality, and extensibility wherever you can; while keeping the interface utterly simple.
Correctly speaking, I should say, there are no open, 'visible' iterators! Looping often involves two types of containers: those holding simple (POD - plain old data: int
s, double
s, char
s) types and those holding abstract data types (objects of some class).
- Simple PODs: For simple things like
char line[80]
, looping using for(i=0; i < ...)
is perfectly fine. The explicit for
loop is quite OK here, as it expresses the operation clearly. Trying to use anything else would obscure things.
- Objects: For this, usually an 'iterator' is declared and its
*
operator is used to access the elements in turn.
Take a look at the following function, taken from actual usage. Lots of people write like this, so, this would be considered 'normal'.
void bill()
{
Client client;
for(Client::const_iterator ri = client.begin();
ri != client.end(); ++ri)
{
cout << (*ri).name() << endl;
for(Invoices::const_iterator fi = (*ri).begin();
fi != (*ri).end(); ++fi)
{
cout << "Invoice #:" << (*fi).number() << endl;
for(Services::const_iterator si = (*fi).begin();
si != (*fi).end(); ++si)
cout << (*si) << endl;
}
cout << endl;
}
}
In case, it is needed to add namespace specifiers like 'Projectname::Client::const_iterator'
and 'std::cout'
, 'std::endl'
everywhere, you can imagine the reading excitement this is going to cause!
What I found utterly unacceptable for looping over object containers, as seen from this routine, was - declaring an iterator every time merely for looping, and having to type things like (*ri).number()
.
Do you want to express the intent more clearly, type less and make it more maintainable? Tired of inventing new names for the iterators every time? Then recheck the foreach
based version above.
Implementing the container
The formula for clearer code is achieved by combining the container with the iteration capabilities and encapsulating it as a single class. The foreach
(described later) works for any class having the following member functions:
class Cont {
public:
...
bool first();
bool next();
T current();
T & operator()();
T & operator[](size_t index);
protected:
...
};
I started using this pattern before the STL surfaced, hence the names resemble the older iteration style.
For STL based containers (or compatible - things made to behave like STL containers), this is equally simple to achieve:
template < class T >
class Container : public vector<T>{
protected :
Container::iterator it;
public :
bool first(){ it = begin(); return it != end(); }
bool next(){ ++it; return it != end(); }
T current(){ return *it; }
...
This is an example of a class holding objects of type 'T
' in a vector. It could really be any of the other STL containers, suitable for the application at hand. The first()
and next()
are defined in terms of begin()
and end()
. Note how current()
provides access to the desired contained object. Also note the use of ++it
, as it++
would mean something quite different. All such traps are hidden, the users see only foreach
.
For other (non STL based) classes, first()
positions to the first object, and next()
to the next. Both return false
if that can't be done. Surprising number of other objects will readily yield to this: files, configuration entries, fields in an HTML form, DBMS tables etc., where you would otherwise have to build code to make them STL iteration compliant, just to unify the looping.
The foreach
is achieved by using the following macros:
#define JOIN2(a,b) a##b
#define JOIN(a,b) JOIN2(a,b)
#define foreach(a) for(bool JOIN(flag,__LINE__)=a.first();
JOIN(flag,__LINE__);JOIN(flag,__LINE__)=a.next())
Contrary to the common macro naming conventions, I have left foreach
in lower case as it comes out as a true single line for
statement and avoids all possible problems with similar macros regarding dangling if
s, braces, etc. Simple, huh?
So, to recapitulate, this foreach
takes in any 'container' object with first()
, next()
defined for it, and provides access to each element using current()
. The access to mutable objects can be provided using the operator()
.
A new flag
Every time it is used, my foreach
creates a new bool
flag variable, unique because it uses the current source line number as part of its name. This flag is never used again, so the actual name does not matter. It just needs to be unique, every time. According to the new standard, it will be forgotten when the scope of the for
is over.
Type safety is achieved as there is a separate class for each type of object container, the compiler does not need to know anything special about the objects being iterated over.
Caveats
My STL version as shown, evaluates end()
once in first()
and then again in next()
. Change it if you have problems with that implementation. I use a template container version using direct STL containers and have not encountered a problem yet. For other things like 'Files' above, your implementations of first()
, next()
will depend upon whether you are using DOS, Win32 or Linux - all nicely hidden by the implementation.
My understanding of STL is close to minimal, I just am sold on its efficiency. So, I use it whenever the readability does not suffer. I can always buy a faster computer, can't put in the readability later.
Variations
Quite often, it makes sense to hold a copy of the contained item within the envelope, and use it as the 'current' value.
Customers cust;
foreach(cust){
cout << cust.name << "\t" << cust.phone << endl;
}
Here, attributes like name
and phone
are stored within the Customers
class and set to the values of the current record in both first()
and next()
. Here, the container holds the object attributes and is the object itself.
Alternatively, you can derive Customers
(note the ending 's', therefore a set of customers) from Customers
(single object) class, isolating the object and iterating concerns. Then, exactly the same code is usable: (Here the container is derived from the object).
Customers cust;
foreach(cust){
cout << cust.name << "\t" << cust.phone << endl;
}
Once again, the usage does not tell the user anything about the underlying design aspects, they are not exposed.
Note: In the examples given much earlier, the container and the objects are separate, hence copy operations may be invoked on the objects, when referred using current()
. Under such conditions, following the regular C++ guidelines, don't forget to define the copy
constructor and the copy operator if it contains anything but PODs. But this is common C++ programming, nothing specific about this implementation of foreach
.
Also, this foreach
does not take two parameters as in foreach(i in iset)
, which would require the compiler to know about the container. So if and when the standard adds such a thing, you can easily distinguish between the two.
Epilog
I had second thoughts about publishing this technique, as it is particularly obvious and simple, once shown. For long time, I had the impression that others must have stumbled on such a trivial solution earlier. Discussions in a newsgroup indicated otherwise. If nothing else, please treat this as a simple pattern that I found stable and have used in my code.
I would compare this with #define dim(x) (sizeof(x)/sizeof(x[0]))
, which has saved me many chagrins, just define and use brainlessly. You can certainly code without it, and enjoy the debugging fun.
A comment about the target level: this article should be useful for beginners and advanced users, as beginners readily will see the simple 'user' code; advanced users will understand the impact of seemingly simple solutions. The intermediates will most likely have too much code written to even think about changing anything!
Beginners, please don't ask me the following:
- A class to handle files such as "*.htm"! Better sources exist on the net.
- A class to handle clients, invoices and services. You got to write them!
The example given is just that, an example. I do not endorse OpenRJ or derivatives. When I complained that the library and the usage examples could be simpler, the author wanted clarifications. My sample code had this foreach
. It was revealed during further discussions that foreach
could be useful to others. I wrote the test program first, then the alternate library implementations. The original library can be found at: Open-RJ (redirects to SourceForge). Thanks to Matthew Wilson for pointing me in the right direction!
References
Much more involved discussions have been done about 'foreach
' in C++. Most of these have complex implementation suggestions. Some of the discussions are at:
History
26th May, 2005: The technique is simple, and quite stable. Any suggestions for improvement?