During my design analysis for my Network Alchemy implementation, I thought that the tuple may be the answer to allow me to iterate over the types defined in a network message definition. Tuples are data structures that are a generalization of the std::pair
. Rather than having a limitation of 2 items in a tuple, potentially any number of items can be constructed within a custom tuple definition. The upper-limit will most likely be associated with the limit of your compilers ability to recurse down within a nested template structure. Conceptually, a tuple is similar to a compile-time linked list. If a tuple were implemented in terms of the std::pair
, it would be constructed like this:
C++
#include < utility >
using std::pair;
pair< int ,
pair< float,
pair< char, long >
>
> tuple;
The previous block of code defines a structure that contains 4 elements, where the head of the list is a defined type, and the tail is the remainder of the list. This particular tuple defines a type that contains: int
, float
, char
, and a long
.
What Exactly is a Tuple?
Before I can get into details about how a tuple is useful, I think it would be best to show the difference between a pair and a tuple, especially if you are already familiar with the std::pair
. The most common way to be introduced to the pair, is when you use std::map
. The key-value pairs are actually stored in the map as an instance of an std::pair
. There are only two parameters in the pair; the names of these parameters are first
and second
. This values represent the first and second items, respectively, which are defined in the pair. As this is a parameterized type, the parameter types of first
and second
is entirely dependent on each instantiation of std::pair
.
Until recently, I have rarely use the pair outside of my usage of the std::map
, primarily because of the accessor name, first
and second
. As I stated earlier, the tuple is a generalization of the pair. Accessing the different fields of a tuple differs from the pair, in that you reference an entry by its position, similar to an array. There is a parameterize function called, get
that returns the value of an element of a specified index in the tuple. Similarly, there is a function called tuple_element
that can be used to query the type of the element at a specified index. Here is an example that demonstrates the tuple:
C++
#include < iostream >
#include < tuple >
using std::tuple_element;
using std::get;
typedef std::tuple< int , char, size_t> tuple_t;
tuple_t data = std::make_tuple (10,'a', 100);
tuple_element< 0,tuple_t >::type valueA = get< 0 >(data);
tuple_element< 1,tuple_t >::type valueB = get< 1 >(data);
tuple_element< 2,tuple_t >::type valueC = get< 2 >(data);
There are only a few other functions to be aware of with regards to the tuple:
tuple_size
: Returns the number of elements in the tuple tuple_cat
: Concatenates two tuples to create a third tuple type tie
: Unpacks arguments from a tuple, and constructs a new tuple
Tie
is the only function that I think needs further explanation. There is a companion type used with tie
called ignore
. This is used to skip over elements in the incoming tuple to manipulate the final tuple that is constructed from the call to tie
. Ultimately, this is a utility function to more conveniently construct a new type from an existing tuple type. After all, the moral equivalent to variables in template meta-programming is a new type.
How Can A Tuple Be Useful?
Let's return back to Alchemy and discuss the tuple in terms of our proposed library. Remember in the previous entry that I was searching for a mechanism that would allow me to iterate over each of the types defined in a message structure?! Well, it appears that the tuple is a mechanism that can provide those capabilities for us. For example, consider if a tuple was defined with all of the elements that should appear in a message packet structure. With the handle full of functions provided by the C++ Standard Library, we can:
- Determine the number of elements in the tuple with
tuple_size
- Iterate the tuple to determine the type of an element at each index with
tuple_element
- Obtain the type of an element at a specified index with
get
- Most importantly, we can now reliably calculate the offset that a parameter should be placed in a raw buffer with this set of meta-functions
OffsetOf < Idx, TupleT >
Let's analyze what I just defined to calculate the offset in bytes, for the specified position of a parameter in a tuple definition. The very first thing that I chose to do for this implementation was to create a forward declaration of this OffsetOf
utility function. Simply to introduce the format of the template before we analyze the more complex aspects of this construct. This template requires a positive integer value to specify the index of the element, and a fully defined tuple. Because it's a template declaration, any type can actually be placed in the second parameter TupleT
. However, the first parameter's type is already defined as size_t
, so a positive integral value must be specified for the first parameter. I will demonstrate shortly that the actual types that can be specified are slightly more restrictive than I first indicated.
C++
template < size_t Index, typename TupleT>
struct OffsetOf;
The second struct
defined in the first block requires us to take off our imperative programming hat, and switch over to use our functional programming hat. This hat has a very awkward fit at first; it will probably feel a little tight and lead to headaches, unless you have worn one while programming a different language such as F#, Haskel or Lisp. My advice, don't fight it, when you break this hat in, you will be a better programmer for having done so. It opens your mind to a new way of thinking about problems. You will be able to see potential solutions from multiple perspectives.
As many meta-template solutions tend to be composed of recursive implementation, the second item in the first block contains the terminating case. That is, a template specialization is created for an element whose specified index is 0. The offset for all elements at index 0, is at 0-bytes. One of the characteristics of functional programming is the inability to create mutable state within a statement. We are able to define new functions, and placeholder variables of sorts. The former is accomplished by declaring a new type, the latter by defining a constant. This is the reason an enumeration named value
is assigned the value 0
; to hold the calculated offset for the element at index 0
. To access the value of the index, the OffsetOf
type must be specified with the desired index, and the enumeration value
must be referenced to get access to the value.
C++
template < size_t Index, typename TupleT>
struct OffsetOf;
template < typename TupleT >
struct OffsetOf< 0, TupleT >
{
enum { value = 0 };
};
C++
size_t offset = OffsetOf<0, tuple_t>::value;
By convention, constant values are given the name value
and type definitions for sub-types are given a name that indicates it is a type of some sorts. For example, look at the documentation for the C++ Standard containers such as vector
and map
. These containers have many typedef
s to declare the value_type, size_type, allocator_type
; then there are other general purpose names used for reference, pointer, iterator
and const
versions for these types.
Specialization for a Specific Tuple Type
C++
template < typename TupleT>
struct OffsetOf< IdxT, TupleT >
{
typedef TupleT container;
typedef std::tuple_element< IdxT-1,
container>::type prev;
enum { value = OffsetOf<(IdxT)-1, container>::value
+ sizeof(prev)};
};
There is truly only one statement in the declaration of OffsetOf
and that is the definition for the enumeration, value
. The two typedef
s, container
, and prev
are solely added for improved readability. There is, however, the possibility this extra type definitions could also prove useful later on. None-the-less, readability is my primary reasoning for adding them here. The calculation of the offset is a straight-forward recursive implementation, and the zero-defined specialization will terminate the loop. Essentially, the offset of the previous element is calculated plus the size of that element to calculate the offset of this
element. However, to calculate the offset of the previous element, we must know the offset and size of the element prior to that one as well. This behavior recurses until the zero terminator is reached. Then the values bubble back up, and the calculated offset is defined.
All of this behavior occurs at compile-time, with these calculations actually being performed by the compiler. No run-time code is generated from these definitions, only constant values that will be substituted in place anytime an instance of one of these templates is defined. While it is possible that heavy reliance on code written with templates can slow down the compile-times, it has been my experience that the increase in time is negligible. Also, for recursive solution such as this, many compilers have improved to the point where they store the values of previous template instantiation calculations. Therefore, the second time iterating through the loop for calculated offsets, the answer to the previous item will already be known, and a simple lookup is performed to get the value. I will continue to go into deeper detail regarding the work that occurs behind the scenes another time.
Will Tuples Work for Alchemy
I believe the answer to this question is "Yes." However, I do not think they will be the best fit. As I have been exploring how to use tuples, including accessing values, I think it would be difficult to achieve the natural struct-type syntax I have original set out to achieve. Primarily because of the syntax that is required to access the values from a tuple, the std::get
function Therefore, I am going to put this idea to the side, and explore some other options. If the other options do not turn out to be as good as the tuple, I can always return to this idea. I want to point out that I believe the std::tuple
would provide a much more robust and portable solution that the offsetof
MACRO we explored in the last entry.
Next Step
We will explore what I believe to be the most promising approach in the next entry, TypeLists
. I do not know the full etymology for the term, however, many of the sources that I have read credit Andrei Alexandrescu with creating the term in his book Modern C++ Design, which I believe was previously mentioned in an article he wrote that was published in Dr. Dobbs Journal. At TypeList
is a similar template-type construct, that is strictly a type. No data is contained in the TypeList
itself. This will give us the freedom to choose how to represent and manage all of the data, and at the same time be able to iterate over each of the field types as required.