Motivation
I want to write about C++ expression templates for the following reasons:
- I stumbled upon this topic recently and found it rather interesting.
- There is not much topical information available.
- I'm tired of reading articels which describe how awesome functional programming (with Haskell or Lisp) is. And that languages like C++ will surely never reach such elegance and beauty. An infinite repeated point is the ability to create ASTs (abstract syntax trees) and to work on them. I hope to demonstrate how to do this in C++.
First a short definition of expression templates to know exactly what we are talking about:
C++ expression templates are a meta programming technique based on templates which represent expressions. Spiced up with operator overloading it's possible to create DSLs (domain specific language). It is important to know that an expression template can be passed as function parameter and evaluated later (at runtime). We can refine this definition once we have reached a better understanding of this topic.
Not to be confused with compile-time computation (which is much more easier today with the *constexpr* keyword, but that is another topic).
Lets climb up the the three steps to expression heaven:
Step 1 - the classic
Let's consider a simple expression in the following form:
<svg height="113px" version="1.1" width="143px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><defs><g transform="translate(0.5,0.5)"><path d="M 71 21 L 24.7 85.82" fill="none" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 21.65 90.09 L 22.87 82.36 L 24.7 85.82 L 28.57 86.43 Z" fill="#000000" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 71 21 L 117.3 85.82" fill="none" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 120.35 90.09 L 113.43 86.43 L 117.3 85.82 L 119.13 82.36 Z" fill="#000000" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="51" y="1"><g transform="translate(63,5)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="17">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="9" y="14">[Not supported by viewer]<rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="1" y="91"><g transform="translate(13,95)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="16">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="8" y="14">[Not supported by viewer]<rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="101" y="91"><g transform="translate(113,95)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="16">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="8" y="14">[Not supported by viewer]
The classic way to express this in C/C++ is the following method:
inline int expr(int n)
{
return n + 42;
}
This is straight forward, but has nothing to do with expression templates. The compiler generates the AST during compile time and transforms this to maschine code. The only way to access this expression is to call this function. (Since C++14 you can declare the function as constexpr.)
Step 2- function pointers
To move closer to the goal to compute with ASTs we can compose function pointers. Each node of the expression will be presented by a function. And with functions that accept function pointers as argument we can build up the tree.
inline int forty_two()
{
return 42;
}
int add(int lhs, int(*rhs)())
{
return lhs + rhs();
}
inline int bin_expr(int(*op)(int, int(*)()), int lhs, int(*rhs)())
{
return op(lhs, rhs);
}
int main(int argc, char **argv) {
if (argc > 1)
{
const int i = std::stoi( argv[1] );
std::cout << bin_expr(&add, i, &forty_two) << std::endl;
return EXIT_SUCCESS;
}
return EXIT_FAILURE;
}
To be honest, this code is a mess. To implement different ASTs we have to produce endless many functions with slightly different signatures. And the resulting code is not very efficient. Added to this the compiler has not many chances to optimize the code. An object oriented implementation with abstract base classes would ease the pain but we skip this and go forward to step 3:
Step 3 - heaven
In a template based solution all is expressed by a template, the values, the constants and also the operations. To spare some work we use the function objects from the STL library functional.
To start we define our building blocks constants
and variables
. These templates acts as functions that produce a value. In the weird world of functional programming all values are expressed as functions. In our case we use templated function objects. This has the advantage to cover virtually every datatype.
template < typename T >
struct constant
{
typedef T result_type;
template < typename U >
constant (U const& c)
: c_(c)
{}
T const& operator()(void) const
{
return c_;
}
const T c_;
};
template < typename T >
struct variable
{
typedef T result_type;
variable(T& v)
: v_(v)
{}
T& operator()(void) const
{
return v_;
}
T& v_;
};
Template solutions require name commonality. So all evalutions will be done by an overloaded function call operator with the signature: T operator()(void) const
. Since we are open to all value types (int, double, std::complex, ...) a result_type
is to define in all building blocks to help the compiler to find the correct data type.
Next step is to build a non-terminal expression by using compile time recursion. For our example we define a binary expression:
template< typename BOP, typename LHE, typename RHE, typename T = typename BOP::result_type >
struct binary_expression
{
public:
typedef T result_type;
public:
binary_expression (LHE lhexp, RHE rhexp)
: bop_()
, lhe_(lhexp)
, rhe_(rhexp)
{}
T operator()(void) const
{
return bop_(lhe_(), rhe_());
}
private:
BOP bop_;
LHE lhe_;
RHE rhe_;
};
Thats all. It is sufficient to define the first AST:
int main(int argc, char **argv) {
if (argc > 1)
{
int i = std::stoi( argv[1] );
Variable< int > v(i);
Constant< int > c(42);
typedef binary_expression < std::plus< int >, variable< int >, constant< int > > plus_expr;
plus_expr be(v, c);
std::cout << be() << std::endl;
return EXIT_SUCCESS;
}
return EXIT_FAILURE;
If we start this programm with paramter 2 (./expr 2) the output is 44.
What strikes us is the utterly complex definition of the expression template plus_expr. To get this more readable operator overlading comes to rescue:
template < typename LHE typename RHE >
binary_expression < std::plus< typename lhe::result_type >, LHE, RHE >
operator+(LHE lhexp, RHE rhexp)
{
typedef typename LHE::result_type result_type;
return binary_expression < std::plus< result_type >, LHE, RHE >(lhexp, rhexp);
}
This creator function produces a binary expression that executes a plus operation on the specified template parameters. Now we can write:
auto expr = v + c;
This generates the same type as the plus_expr
type from the example above. And this works for other expression templates too. But notice the complete different semantics in this line of code: This is not a the summation of of two values. This is an expression that computes the summation of two values. There is an additional level of indirection.
And beyond
Lets try to build the following AST:
<svg height="205px" version="1.1" width="192px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><defs><g transform="translate(0.5,0.5)"><path d="M 71 113 L 24.7 177.82" fill="none" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 21.65 182.09 L 22.87 174.36 L 24.7 177.82 L 28.57 178.43 Z" fill="#000000" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 71 113 L 117.3 177.82" fill="none" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 120.35 182.09 L 113.43 178.43 L 117.3 177.82 L 119.13 174.36 Z" fill="#000000" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="51" y="93"><g transform="translate(63,97)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="17">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="9" y="14">[Not supported by viewer]<rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="1" y="183"><g transform="translate(13,187)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="16">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="8" y="14">[Not supported by viewer]<rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="101" y="183"><g transform="translate(113,187)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="16">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="8" y="14">[Not supported by viewer]<path d="M 120 21 L 73.7 85.82" fill="none" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 70.65 90.09 L 71.87 82.36 L 73.7 85.82 L 77.57 86.43 Z" fill="#000000" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 120 21 L 166.3 85.82" fill="none" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><path d="M 169.35 90.09 L 162.43 86.43 L 166.3 85.82 L 168.13 82.36 Z" fill="#000000" pointer-events="none" stroke="#000000" stroke-miterlimit="10"><rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="100" y="1"><g transform="translate(113,5)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="14">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="7" y="14">[Not supported by viewer]<rect fill="none" height="20" pointer-events="none" stroke="none" width="40" x="150" y="91"><g transform="translate(162,95)"><switch><foreignobject height="16" pointer-events="all" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" width="16">
<text fill="#000000" font-family="Helvetica" font-size="12px" text-anchor="middle" x="8" y="14">[Not supported by viewer]
First we define a creator function for the minus (-) operator:
template < typename LHE, typename RHE >
binary_expression< std::minus< typename LHE::result_type >, LHE, RHE >
operator-(LHE lhexp, RHE rhexp)
{
typedef typename LHE::result_type result_type;
return binary_expression< std::minus< result_type >, LHE, RHE >(lhexp, rhexp);
}
Now we can write the AST in a astonishingly simple way:
auto expr = (v + c) - c;
Compounding is optional since both operators have the same precedence and are assoziativ.
In order to use plain numerical literals a partial specialization of the *binary_expression* template is required:
template< typename BOP, typename LHE, typename T >
struct binary_expression< BOP, LHE, T, T >
{
public:
typedef T result_type;
public:
binary_expression(LHE lhexp, T rhexp)
: bop_()
, lhe_(lhexp)
, rhe_(rhexp)
{}
T operator()(void) const
{
return bop_(lhe_(), rhe_());
}
private:
BOP bop_;
LHE lhe_;
constant< T > rhe_;
};
With this we can write:
auto expr = (v + c) - 3;
For the sake of completeness here is the unary expression:
template< typename UOP, typename E, typename T = typename UOP::result_type >
struct unary_expression
{
public:
typedef typename UOP::result_type result_type;
public:
unary_expression(E exp, UOP op)
: uop_(op)
, expr_(exp)
{}
T operator()(void) const
{
return uop_(expr_());
}
private:
UOP uop_;
E expr_;
};
It works the same way as the binary expression template.
Conclusion
This short introduction shows how to build arithmetic expression templates in C++. This can be done relatively easily with some preparations. To dig deeper you can study libraries like Eigen, Armadillo or Boost.Fusion. These libraries can speed-up your applications significantly. Although this article only deals with arithmetic expression templates, these technique is also suitable for other purposes. A good example is the Boost.Spirit library, which allows to embed a kind of EBNF syntax in C++ to define a parser/generator for a specific DSL. The latest compilers support this very well. Only the handling of syntax errors in complex templates is an area for potential improvement.
History
6. June 2015 | initial version |
10. June 2015 | formatting issue fixed |
| |