Introduction
A processing "pipeline" is one of the few examples of a real, physical concept with a direct, useful counterpart in software. Another popular example is thinking of software in terms of "objects." One advantage of such concepts is that, because we understand their physical counterparts, we can grasp and use them easily. And although there is certainly more to object-oriented programming than understanding the concept of a physical object, a pipeline really is just about that simple.
A pipeline, as defined here, is simply a collection of pipe segments connected together in different ways, according to pre-specified rules, in order to accomplish a task. In its simplest form, a pipeline only requires a head and a tail, and they can be one and the same (although this typically defeats the purpose).
A software pipeline can be used in many ways. Some examples I've seen are: performing long scientific algorithms, where each step is well-delineated; performing abstracted device reads/writes, as in buffered I/O (network, hard drive, etc.); and graphics processing.
Background
As a big fan of the interface-as-a-contract school of thought, when I sat down to design the pipeline, I wanted to establish a simple means by which to connect two objects, while also affording interface-level freedom throughout. Connecting one object to another in this context - via the object1.connectTo(&object2) method, or with the overloaded object1 += object2 operator - is synonymous with telling one object (object1) to send its output to another (object2).
In the event that you are unfamiliar with the interface-as-a-contract paradigm, I'll give you my brief synopsis. It says that one of the things you consider when you start a software development project is interface definition. Define interfaces in areas that you believe will be changed/expanded - in other words (not mine), "encapsulate the variation." If you incorporate good interfaces into your software design, you will reap the benefits many times over. This is particularly useful in a pipelined design, where it's typically prudent to define an interface at each connection-point. Then, users can implement a particular interface (agree to a contract) they are interested in working on and connect it right into the pipeline seamlessly.
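To make "encapsulate the variation" concrete, here is a minimal C++ sketch of the idea. The names (ICompressor, NullCompressor) are mine and purely illustrative, not part of this framework: the part likely to vary is hidden behind an interface, and any class that implements it can be swapped in.

```cpp
#include <string>

// Hypothetical example -- ICompressor/NullCompressor are illustrative
// names, not part of this framework. The interface is the "contract";
// implementing it is agreeing to that contract.
class ICompressor
{
public:
    virtual ~ICompressor() {}
    // Any compression scheme can live behind this one method.
    virtual std::string compress(const std::string &data) = 0;
};

// One party agreeing to the contract; callers only ever see ICompressor.
class NullCompressor : public ICompressor
{
public:
    virtual std::string compress(const std::string &data) { return data; }
};
```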
Using the Code
This framework provides a single class that defines a segment in the pipe and can determine, at compile time, whether or not it can successfully connect to another pipe segment. The user can specify a class, base class, or interface (abstract class) that defines the types of classes that his/her pipe segment(s) can connect to. A pipe segment may be simultaneously connected to other pipe segments, as well as have other pipe segments simultaneously connected to it.
Since the demo application provided is about the most boring application ever written, and I am not interested in praise for this article on the grounds of its ability to cure insomnia, I'll show how to use the framework to construct a marginally more interesting, purely hypothetical, HTTP server request processing pipeline. Note that there will not be any implementation of a server here (this code will not compile), just the following grossly oversimplified four-step pipeline:
- Receive a request and start the pipeline
- Authenticate the request
- Authorize the request
- Respond to the request
Our HTTP server processing pipeline will contain a pipe segment for each of these steps:
- HTTPRequestHandler - responds to an incoming HTTPRequest object by simply sending it off down the pipeline.
- HTTPRequestAuthenticator - accepts an HTTPRequest object, authenticates it, and passes it down the pipeline (if appropriate).
- HTTPRequestAuthorizer - accepts an HTTPRequest, authorizes it, and passes it down the pipeline (if appropriate).
- HTTPRequestResponder - accepts an HTTPRequest and responds to it in the appropriate manner.
The terrific thing about this example is that, in order to accomplish this enormous feat of coding prowess, we will only need to define one interface, which each of the above classes will implement:
- An IHTTPRequestHandler interface, defining a single method called HandleHTTPRequest(HTTPRequest *theRequest)
I'm not going to discuss the innards of the HTTPRequest class - I'm just going to assume it already exists - nor will I wax eloquent on the proper methodologies for doing HTTP request authentication or anything similar. Please do not write to me to let me know this is not the right way to code an HTTP server. I'm sure there's a lot more to it.
What we're working toward here is for the following code to be executed before the first request is handled by the HTTP server. This is where we build the pipeline. I realize this would probably cause some scoping problems if implemented cut-n-paste from here - again, this is only an example to give you a feel for things. I'll let you figure out how to get around scoping/variable lifetime issues:
...
HTTPRequestHandler theRequestHandler;
HTTPRequestAuthenticator theAuthenticator;
HTTPRequestAuthorizer theAuthorizer;
HTTPRequestResponder theResponder;
theRequestHandler += theAuthenticator;
theAuthenticator += theAuthorizer;
theAuthorizer += theResponder;

// ...or, equivalently, using connectTo (don't do both -- you'd
// connect each pair twice):
// theRequestHandler.connectTo(theAuthenticator.connectTo(
//     theAuthorizer.connectTo(theResponder)));
...
Hopefully, that's pretty self-explanatory.
Now, how do we get there? Well first we'll write the IHTTPRequestHandler
interface and define it something like this:
#include "HTTPRequest.h"
#include "PipeSegmentBaseAdapter.h"

class IHTTPRequestHandler : public PipeLineProcessing::PipeSegmentBaseAdapter
{
public:
    virtual void HandleHTTPRequest(HTTPRequest *request) = 0;
};
With that interface defined, we can start writing the pipe segment objects. They might look like this:
#include "HTTPRequest.h"
#include "PipeSegment.h"

class HTTPRequestHandler :
    public IHTTPRequestHandler,
    public PipeLineProcessing::PipeSegment<IHTTPRequestHandler>
{
public:
    virtual void HandleHTTPRequest(HTTPRequest *request) {
        // Pass the request along to every connected output segment.
        for (size_t i = 0; i < this->theOutput.size(); i++) {
            IHTTPRequestHandler *anOutputHandler =
                (IHTTPRequestHandler *)this->theOutput.at(i);
            anOutputHandler->HandleHTTPRequest(request);
        }
    }
};
class HTTPRequestAuthenticator :
    public IHTTPRequestHandler,
    public PipeLineProcessing::PipeSegment<IHTTPRequestHandler>
{
private:
    bool requestIsAuthentic(HTTPRequest *request) { return true; }

public:
    virtual void HandleHTTPRequest(HTTPRequest *request) {
        // Only authentic requests continue down the pipeline.
        if (requestIsAuthentic(request)) {
            for (size_t i = 0; i < this->theOutput.size(); i++) {
                IHTTPRequestHandler *anOutputHandler =
                    (IHTTPRequestHandler *)this->theOutput.at(i);
                anOutputHandler->HandleHTTPRequest(request);
            }
        }
    }
};
Above we've defined the first two pipeline segments. Hopefully they are readable enough that you can see they both have the same hierarchy tree. In pipeline terms, they both output to, and serve as input for, IHTTPRequestHandler-type objects. The convention here is that the first type listed in the inheritance definition specifies the interface you are implementing, if any. Second, you use PipeSegment<IHTTPRequestHandler> to specify which type of objects you will output to. If you haven't been sleeping well the last few nights and/or you have deep questions at this point, look at the code in the demo app supplied. It should clear things up -- or put you to sleep, whichever your ailment.
Please also notice the for loop in each class. This will be discussed in more detail later.
So, the other two classes will look similar -- maybe like this:
class HTTPRequestAuthorizer :
    public IHTTPRequestHandler,
    public PipeLineProcessing::PipeSegment<IHTTPRequestHandler>
{
private:
    bool requestIsAuthorized(HTTPRequest *request) { return true; }

public:
    virtual void HandleHTTPRequest(HTTPRequest *request) {
        // Only authorized requests continue down the pipeline.
        if (requestIsAuthorized(request)) {
            for (size_t i = 0; i < this->theOutput.size(); i++) {
                IHTTPRequestHandler *anOutputHandler =
                    (IHTTPRequestHandler *)this->theOutput.at(i);
                anOutputHandler->HandleHTTPRequest(request);
            }
        }
    }
};
class HTTPRequestResponder :
    public IHTTPRequestHandler,
    public PipeLineProcessing::IPipeTail
{
private:
    void respondToRequest(HTTPRequest *request) { }

public:
    // The tail of the pipeline: responds, and passes nothing further on.
    virtual void HandleHTTPRequest(HTTPRequest *request) {
        respondToRequest(request);
    }
};
And that's it! Now you can execute the code we first started with, and send the HTTPRequest to the HTTPRequestHandler. The HTTPRequest object will be authenticated, authorized, and responded to via our sweet little pipeline -- all automatically!
The only remaining question is, "what's with the for loop?" Well, this was the one evil part of the whole thing. I could not devise a means to automatically determine and invoke the right method on the output objects without complicating things enormously, adding extra library dependencies, and/or using functors/delegates, with which I am not yet too friendly. So I kept things simple and put the onus on the user to spell out when he/she wants to send the output down the rest of the pipeline.
The for loop iterates over the theOutput STL vector inherited from the PipeLineProcessing::PipeSegmentBase base class. This vector stores pointers to all of the pipe segment objects this pipe segment connects to (that is, sends its output to). As it iterates over these output handlers, it downcasts them to pointers of the type specified in the class's definition (as this type of pipe segment's template parameter). By calling the desired method on that type, the pipeline is continued.
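To make the mechanics concrete, here is a stripped-down sketch of what such a segment base might look like. This is not the framework's actual implementation (see the download for that) - SegmentBase, Segment, and DemoSegment are stand-in names - but it shows the essential idea: connections are stored generically in a vector, while the connection methods only accept the template-specified type.

```cpp
#include <cstddef>
#include <vector>

// Simplified sketch -- NOT the framework's actual code. SegmentBase
// stands in for PipeSegmentBase; Segment stands in for PipeSegment.
class SegmentBase
{
public:
    virtual ~SegmentBase() {}
};

template <class OutputType>
class Segment : public SegmentBase
{
public:
    // Only an OutputType can be connected; anything else is a compile
    // error, which is what makes the later downcast safe.
    void connectTo(OutputType *next) { theOutput.push_back(next); }
    void operator+=(OutputType &next) { connectTo(&next); }

    std::size_t outputCount() const { return theOutput.size(); }

protected:
    std::vector<SegmentBase *> theOutput; // stored generically, downcast later
};

// A tiny demo segment that outputs to its own kind.
class DemoSegment : public Segment<DemoSegment> { };
```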
The astute programmer should've woken up upon seeing the word "downcast," since this is considered an unsafe, dangerous cast. Unlike an upcast, which casts a derived object to a base class type and is therefore always safe, a downcast casts a base class to a derived class. This is dangerous because, in general, one never knows whether the object in hand at run time will actually be of the derived type, whereas a derived type can always be treated as its base type. To use a physical example: all soccer balls are balls (upcast), but not all balls are soccer balls (downcast).
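The ball example translates directly into C++. Ball and SoccerBall are, of course, illustrative names of mine; dynamic_cast is the language's run-time-checked way to attempt such a downcast (this framework avoids the run-time check by constraining connections at compile time instead):

```cpp
// Illustrative types, not part of the framework.
class Ball
{
public:
    virtual ~Ball() {} // polymorphic base, required for dynamic_cast
};

class SoccerBall : public Ball { };
class Basketball : public Ball { };

// Upcast: implicit and always safe -- every SoccerBall is a Ball.
// Downcast: dynamic_cast checks at run time and yields a null
// pointer when the Ball is not actually a SoccerBall.
bool isSoccerBall(Ball *ball)
{
    return dynamic_cast<SoccerBall *>(ball) != 0;
}
```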
So, how can we safely downcast a ball to a soccer ball? Only if we are certain that it is one. In this framework, that certainty is provided by the methods that add objects to the theOutput vector. While they are generic internally, the only methods exposed to clients to make connections - the += operator and the connectTo method - accept only the appropriate types of objects. In other words, we let the compiler do our checking: unless the object being added onto the end of the pipe is, or can safely be addressed as (that is, upcast to), the type specified in the output-defining portion of the class's definition, it will be flagged at compile time. This is one of the cooler tricks in this little project, and it is mostly thanks to the Kevlin Henney trick I found. I show examples of this in my boring demo.
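For the curious, that compile-time check boils down to something like the following constraints idiom. This is my paraphrase of the idea (the names DerivesFrom, Animal, and Dog are illustrative), not the framework's exact code: instantiating the template forces the compiler to attempt a Derived*-to-Base* conversion, which only succeeds if the inheritance relationship actually holds.

```cpp
// Paraphrase of the compile-time constraint idea -- not the framework's
// exact code. Instantiating DerivesFrom<D, B> compiles only if D* can
// be upcast to B*, i.e. only if D inherits from B.
template <class Derived, class Base>
struct DerivesFrom
{
    static void constraints(Derived *p)
    {
        Base *b = p; // fails to compile unless Derived inherits Base
        (void)b;
    }
    DerivesFrom()
    {
        void (*f)(Derived *) = constraints; // force instantiation
        (void)f;
    }
};

class Animal { };
class Dog : public Animal { };

// DerivesFrom<Dog, Animal> ok;   // compiles: Dog is an Animal
// DerivesFrom<int, Animal> bad;  // would be rejected at compile time
```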
This is my first submission to CodeProject, so please give it a try and leave comments.
Interesting Links
- VTK - The Visualization Toolkit, by Kitware. An open-source 3D graphics/visualization framework that is built on the concept of a graphics pipeline. I'm not sure if they use a design similar to the one I present here, as I have not looked at the code. But I have used the toolkit, and I hope they used some mechanism like this one to do it!
- A Generic Data Process Pipeline Library. A totally different take on a processing pipeline in C++. Largely macro based and useful if you have the exact same pipeline multiple times in your application.
- Cool Template Trick. This is a great little trick provided by Kevlin Henney. Use it anytime you want to require at compile time that a template parameter be of a particular base-type.
A Few Notes on the Demo
- I am hoping for the prestigious Code Project's Most Boring Demo Application award with this submission. Please do not download the demo app thinking that it is going to do something wild and crazy, or you will be disappointed. It really only serves as a starting point for someone looking to build their pipeline with this framework.
- Along with the demo, I have provided Visual Studio .NET 2003 solution and project files to ease building it in that environment.
- Additionally, I have included .cdtbuild, .cdtproject and .project files for building in the Eclipse CDT -- my IDE of choice.
- Since the framework is entirely header based (the .cpps contain only documentation), all you have to do is put them somewhere and make them accessible to your tool-chain. In this case, the project files I have included are set to look under the same root directory that contains the demo project's directory for a directory called "Pipeline". For example, if the "PipelineDemo" directory is under "projects" then the demo project will build correctly if it finds the "Pipeline" directory under "projects" as well. Otherwise, you will have to point it to the "Pipeline" folder, wherever that might be on your system.
License
G & A Technical Software (GATS) has released this code to the public domain. This means that you are allowed to take this code and use it with almost no legal obligations whatsoever. No copyleft nonsense, no requirement that you keep the header intact, nothing. The only stipulation, in fact, is that by using the software you release GATS from any and all liability -- neither of us will be held responsible for any difficulties that might arise from you or your company's use of the code. For more details, read the header on any file in the source.
Future
Like all open source software, this framework could be extended (hopefully easily) to provide even more power to the developer using it. Some of my ideas are:
- Multi-Directional/Multi-Channel Pipeline - oftentimes in a pipeline it is intuitive to have a two-way connection, one going conceptually "downstream" the other going "upstream." Currently, two disassociated pipelines must be maintained in order to accomplish this. It would be sweet if both of them could be defined in a single pipeline.
- Functors/Delegates - a way to avoid the aforementioned for loop would be sweet.
- Built-in Control Signals - a generic mechanism for basic control and analysis of the pipeline would be beneficial and useful to many potential users.
- Multi-Threaded Support - in an assembly line processing pipeline, each step in the process is working on its portion of the process concurrently. This would be a major accomplishment/improvement here if each pipe segment was launched in its own thread.
- Better Support for Error Handling and Exceptions - currently, there is no facility whatsoever for handling errors in the processing pipeline. Having one of these would definitely be advantageous.
- A Main Pipeline Type - giving the user of the framework the ability to define an entire pipeline in one statement, similar to the way they did it in the Borland article mentioned above, would be a pretty sweet addition - and it might make some of these other improvements/additions/features a little easier to accomplish.
Obviously, if you make any improvements that you would like to share, send them to me, and if I think they're good, I'll add them here for sure.
History
- Original post: September 23rd, 2006