A continuation of a series of blog entries that documents the design and implementation process of a library. The library is called Network Alchemy[^]. Alchemy performs low-level data serialization. It is written in C++ using template meta-programming.
At this point, I was still in the make it work phase of implementation. However, I learned quite a bit from experiencing the progression of design and multiple implementations. So much that I thought it was valuable enough to share.
After the initial creation of the Datum
type, it was relatively simple to abstract the design and implement the Bit-Field object. The next step was to combine the collection of BitField
objects into a single group that could be statically-verified as type-safe. This was also relatively simple. Again I just had to model the implementation of the message structure. I called this collection a BitList.
Primarily to differentiate it from the std::bitset
.
The real trouble appeared when I began the integration of the BitList
with the Message
in a way that was similar to the Datum
. At this point, I started to see the potential this library could provide. Through experience, I have learned that it is important to get the programming interface correct in the initial design. Any mistakes that were made, or code that is not as clean as you would like can be refactored or even re-implemented behind the original interface.
The BitList
My initial approach was straight-forward. As meta-programming was new to me, I looked for opportunities to take advantage of the compiler to verify invariants wherever I could apply them. Therefore, there is a type_check
that verifies the number of bits specified by the user does not exceed the capacity of the data type used for storage.
template < typename BaseT,
size_t B0, size_t B1=0, size_t B2=0, size_t B3=0,
size_t B4=0, size_t B5=0, size_t B6=0, size_t B7=0
>
struct BitSet
: type_check< (B0+B1+B2+B3+B4+B5+B6+B7)
<=(sizeof(typename BaseT::value_type)*8)>
, public BaseT
{...
I thought the best approach at the time was to store a reference to the primary value, to which all of the BitFields
referred. This was an attempt to avoid a mechanism where the child calls the parent for the value then the operation is performed. I was focused on the general usage of the object, and didn't pay too much attention to the construction of this type. As you can see, the construction of this BitList
is quite expensive compared to the amount of information that is encoded in the type.
typedef typename
BaseT::value_type value_type;
BitList(value_type &data_field)
: m_data(data_field)
, m_field_0(GetFieldAddress< 0 >(m_field_0))
, m_field_1(GetFieldAddress< 1 >(m_field_1))
, m_field_2(GetFieldAddress< 2 >(m_field_2))
, m_field_3(GetFieldAddress< 3 >(m_field_3))
, m_field_4(GetFieldAddress< 4 >(m_field_4))
, m_field_5(GetFieldAddress< 5 >(m_field_5))
, m_field_6(GetFieldAddress< 6 >(m_field_6))
, m_field_7(GetFieldAddress< 7 >(m_field_7))
{
m_field_0.attach(m_data); m_field_1.attach(m_data); m_field_2.attach(m_data); m_field_3.attach(m_data); m_field_4.attach(m_data);
m_field_5.attach(m_data);
m_field_6.attach(m_data);
m_field_7.attach(m_data);
}
There are two parts to the previous section of code, the initializer list, and the constructor body. The initializer list is used to associate address of the <var>BitField</var>
object with the local reference held by the <var>BitList</var>
container. This is how the container is able to enumerate over the values in the class. The second part attaches a reference of the internal data field with the child fields.
Because of this twisted setup, the majority of what remains is simple and looks much like the other value classes I have previously shown. The one oddity that remains is a set of static
constants defined to help navigate the set of sub-fields. The code below depends on the knowledge of the size and relative offset for each value.
static const size_t k_size_0 = B0;
static const size_t k_offset_0 = 0;
static const size_t k_size_1 = B1;
static const size_t k_offset_1 = value_if< (B1>0), size_t, k_size_0, 0 >::value;
static const size_t k_size_2 = B2;
static const size_t k_offset_2 = value_if< (B2>0 ), size_t,
k_offset_1 + k_size_1,
0>::value;
static const size_t k_size_3 = B3;
static const size_t k_offset_3 = value_if< (B3>0 ), size_t,
k_offset_2 + k_size_2,
0>::value;
Those definitions allow the actual value types fields to be defined with the code below:
value_type &m_data;
Bit_field< k_offset_0, B0, value_type> &m_field_0;
Bit_field< k_offset_1, B1, value_type> &m_field_1;
Bit_field< k_offset_2, B2, value_type> &m_field_2;
Bit_field< k_offset_3, B3, value_type> &m_field_3;
Bit_field< k_offset_4, B4, value_type> &m_field_4;
Bit_field< k_offset_5, B5, value_type> &m_field_5;
Bit_field< k_offset_6, B6, value_type> &m_field_6;
Bit_field< k_offset_7, B7, value_type> &m_field_7;
Unfortunately with this design, there will always be eight sub-fields per <var>BitList</var>
object, regardless of the actual number of bit-fields that are used. Imagine my surprise when shortly after I released this into the wild, I had requests to support 32 bit-fields per <var>Bitlist</var>
. At that point, I started to think of ways that I could eliminate all of these unused references. Unfortunately, I was not able to implement my ideas until the second version of this type. So those ideas will have to wait.
Euphemism
To organize the Hg::BitList
similarly to the message required an outer container to be defined for the user type. I am not going to show all of the details because they are not that interesting. Also, the current implementation for this feature in Alchemy barely resembles this concept at all. Here is a high-level overview of it's original structure.
Essentially, the MACRO below is a euphemism for the morass of partially defined parameterized errors waiting to be compiled.
HG_BEGIN_BIT_LIST(uint32_t, info)
HG_BIT_FIELD(0, 4, first)
HG_BIT_FIELD(1, 4, second)
HG_BIT_FIELD(2, 4, third)
HG_BIT_FIELD(3, 4, fourth)
HG_BIT_FIELD(4, 4, fifth)
HG_BIT_FIELD(5, 4, sixth)
HG_BIT_FIELD(6, 4, seventh)
HG_BIT_FIELD(7, 4, eighth)
HG_END_BIT_LIST
The name <var>info</var>
defines a new BitList
type the user will place in the type-list definition that defines the overall message format.
In the end, I spent much more time on this section of code that I am comfortable admitting. So let me just show you a few snippets of what the MACRO definition hides and move away from this.
struct C \
: public Base_bit_list \
{ \
typedef C this_type; \
typedef T value_type; \
typedef Bit_field<0,0,value_type> nil_bits_t;\
\
enum { k_offset_0 = 0 }; \
\
template < typename IndexT, \
typename BitT> \
BitT& GetField(const BitT &) \
{ return GetFieldData(BitT()); } \
\
nil_bits_t& GetFieldData(const nil_bits_t&) \
{ \
static nil_bits_t nil_bits; \
return nil_bits; \
}
For each field that is defined, a collection of definitions is created by the MACRO that is similar to this:
typedef FieldIndex< 2, this_type,4> idx_2;
typedef Bit_field < k_offset_2, 4, value_type > third_t;
enum { k_offset_3 = k_offset_2 + N };
third_t third;
third_t& GetFieldData(const third_t&)
{ return P; }
The definition above declares a new Hg::BitField
type that meets that criteria for the proper index, offset and size of that field. A new instance of that time is defined with the name the user has specified, and a function is provided that returns a reference to that field. The function requires a const
reference of an object of the same type as the desired field (I have since realized that pointers are much less expensive than const references and just as effective). Since all of the fields have different indices and offsets, there will be no problems with duplicate types.
The type definitions use a well-defined format that can be programmatically indexed. Therefore, developing functions to operate over the fields will be possible. However, there is still more work to integrate this type into the Alchemy framework.
Type-deduced Definitions
Throughout the development of this project, the template definitions continued to grow longer and more complex. In many cases, they weren't that difficult to deal with at all. Rather, they were tedious and required the parameterized types to be propagated through the depths of the interfaces to which they applied. It didn't take long for me to recognize that I could wrap all of the complex definitions into parent-type
, and reference my types much like the meta-functions I have been creating. The only difference is in this case I truly do want a simpler definition of the type.
Here is what the definition of a Hg::BitSet
with 8 fields that each used 4 bits would look like:
Hg::BitSet< uint32_t, 4,4,4,4,4,4,4,4 > info;
Because the type information is packaged in the convenient message type-list, all of the details required to reconstruct the Hg::BitList
definition is still available, even when packaged like this:
template< size_t Idx,
typename format_type>
struct Format_bitset
{
typedef
typename Hg::TypeAt < Idx,
format_type
>::type base_t;
typedef Hg::BitSet< base_t,
base_t::idx_0::k_size,
base_t::idx_1::k_size,
base_t::idx_2::k_size,
base_t::idx_3::k_size,
base_t::idx_4::k_size,
base_t::idx_5::k_size,
base_t::idx_6::k_size,
base_t::idx_7::k_size
> type;
};
The type-list
and index
make the Hg::BitList
much more accessible, considering all of the other utility functions and definitions that have been created to support the fundamental types.
Integrating as a Datum
At first glance, this does not appear to be a difficult issue to resolve. The primary issue is we need to make the Hg::BitList
compatible with the existing Hg::Datum
type. For a basic template, it would already meet that requirement. Through the use of static-polymorphism, a type only needs to contain the same member data and functions for all of the types to be compatible in an algorithm. They are not required to be tied together through inheritance, which is the basis of dynamic-polymorphism.
The difficulty I ran into, was the logic that I had created to programmatically navigate each field of a message were specialized for the Hg::Datum
type. To specialize on a second distinct type would cause ripples through the entire API, and cause almost complete duplicate implementations for alternative specialization. This seems to defeat the purpose of templates. So I searched for a solution to bring the two types together.
BitListDatum
The solution I used for this first version was to create a class, Hg::BitListDatum
, which derived from both of these classes. Here is the declaration of the type:
template< size_t IdxT,
typename format_type
>
struct BitListDatum
: public Hg::Format_bitlist< IdxT, format_type >::type
, public Hg::Datum< IdxT, format_type >
{ };
The purpose of this class is to become a shallow adapter that is compatible with the definitions and types already defined for the Hg::Datum
. The only challenge that I ran into was creating a definition that would allow the Hg::BitList
type to be returned for the user to interact with, but still be a Hg::Datum
. This is what I created:
typedef typename
Hg::Datum < Idx,
format_type,
StoragePolicy,
kt_offset
> base_type;
operator base_type&()
{
return *static_cast< base_type *>(this);
}
operator value_type() const
{
return static_cast< const base_type*>(this)->operator value_type();
}
The base_type
operator properly handles the interactions where a reference to the Hg::Datum
type is required. In all other cases, the user can interact with the value_type
operator that returns access to the child Hg::BitField
objects by name.
Summary
After dynamically-sized arrays, the support of bit-fields has been the most tedious feature to implement in Alchemy. In fact, I do not think implementing support for arrays would have gone nearly as smoothly for me if I hadn't learned the lessons from implementing bit-field support.
Bit-fields are a feature that are not used much. However, they tend to appear quite frequently when dealing with network protocols or data-storage formats. Once you consider that the bit-field feature is not well defined for C and C++, it becomes clear that an alternate solution for providing bit-level access to the data is important. Mask and shift code is tedious and error prone, not to mention difficult to read if you do not see it often. Considering that the simplification of software maintenance is one of the primary goals of Alchemy, providing a feature such as the Hg::BitList
is well worth the effort.