Introduction
Endianness is a problem that arises mostly when our programs have to deal with raw data. Until now, the common wisdom involved rolling your own functions or relying on non-standard compiler extensions (__builtin_bswapXX) or functions (htoleXX, htobeXX). However, this approach quickly leads to code duplication and a great deal of boilerplate.
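To illustrate the kind of boilerplate meant here, the following is a minimal sketch of the hand-rolled approach. All names (load_be32, store_be16, Header and so on) are hypothetical and not part of XEndian. Note that every structure needs its own load function and its own store function, and the whole set has to be repeated for little-endian data.

```cpp
#include <cstdint>

// Hypothetical hand-rolled helpers; none of these names come from XEndian.
inline std::uint32_t load_be32(const unsigned char * p)
{
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16)
         | (std::uint32_t(p[2]) <<  8) |  std::uint32_t(p[3]);
}

inline std::uint16_t load_be16(const unsigned char * p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

inline void store_be32(std::uint32_t v, unsigned char * p)
{
    p[0] = static_cast<unsigned char>(v >> 24);
    p[1] = static_cast<unsigned char>(v >> 16);
    p[2] = static_cast<unsigned char>(v >>  8);
    p[3] = static_cast<unsigned char>(v);
}

inline void store_be16(std::uint16_t v, unsigned char * p)
{
    p[0] = static_cast<unsigned char>(v >> 8);
    p[1] = static_cast<unsigned char>(v);
}

// A hypothetical fixed-layout structure, used only for illustration.
struct Header {
    std::uint32_t magic;
    std::uint16_t version;
};

// The field list is written once for loading...
inline Header load_header_be(const unsigned char * p)
{
    Header h{};
    h.magic   = load_be32(p);
    h.version = load_be16(p + 4);
    return h;
}

// ...and a second time for storing.
inline void store_header_be(const Header & h, unsigned char * p)
{
    store_be32(h.magic, p);
    store_be16(h.version, p + 4);
}
```

A little-endian variant would need four more helpers and two more per-structure functions, which is exactly the duplication described above.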
Although other solutions exist (such as the great Boost.Serialization library), these libraries deal with much more complex issues than my library, like versioning, different input/output formats, etc. All those features can make them somewhat heavy and are out of scope for XEndian.
Background
Design Rationale
XEndian is part of libhdbg, a work-in-progress library trying to offer a cross-platform debugging interface. As such, the main focus of XEndian has always been to remove code duplication in the loading and unloading of mostly fixed structures (think of the ELF file format). It was never meant for the serialization of ever-changing complex objects (although it can be used as such).
Using the Code
Say you have a custom structure named Foo, such as:
struct Foo {
    std::uint32_t a;
    std::uint16_t b;
    std::uint8_t c;
};
You only have to (partially) specialize the xe_impl_for_type template class like this:
template <class XeImpl>
struct xe_impl_for_type<Foo, XeImpl>
{
    template <class Rw, class Self, class Mem>
    static void serialize(Self & self, Mem * mem)
    {
        Rw::field( self.a, mem + offsetof(Self, a) );
        Rw::field( self.b, mem + offsetof(Self, b) );
        Rw::field( self.c, mem + offsetof(Self, c) );
    }
};
The XeImpl parameter encodes the selected endianness, while the Rw parameter encodes the operation. The Self and Mem parameters hide the const/non-const differences in parameters during loading and unloading. Now you can use Foo with the {le/be}_load, {le/be}_load_from, {le/be}_load_into and {le/be}_store family of functions like this:
int main()
{
    static const unsigned char foo_bytes[] = {
        0xdd, 0xcc, 0xbb, 0xaa,
        0x11, 0x22,
        0xff
    };

    // Load a whole Foo from raw bytes.
    const auto be_foo = be_load<Foo>(foo_bytes);
    const auto le_foo = le_load<Foo>(foo_bytes);

    // Load a single field through a pointer to the raw structure.
    const auto foo_p = reinterpret_cast<const Foo *>(foo_bytes);
    const auto be_foo_a = be_load_from(foo_p->a);
    const auto le_foo_a = le_load_from(foo_p->a);

    // Load into an existing object.
    Foo into_foo;
    be_load_into(foo_bytes, into_foo);
    le_load_into(foo_bytes, into_foo);

    // Store an object into a raw buffer.
    const Foo foo { 0x11223344, 0xaabb, 0xff };
    unsigned char buffer[ sizeof(Foo) ];
    be_store(foo, buffer);
    le_store(foo, buffer);
}
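To make the roles of the Rw, Self and Mem parameters more concrete, here is a simplified, stand-alone sketch of the same idea. It is not XEndian's actual implementation: BigEndian, LittleEndian, Load, Store, Pair, be_load_pair and le_store_pair are hypothetical names invented for this illustration. The point is that the field list is written once, and the operation policy decides whether each field is read from or written to memory.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical endianness policies: map byte i of a value
// (i = 0 is the least significant byte) to its position in memory.
struct BigEndian {
    static std::size_t index(std::size_t i, std::size_t n) { return n - 1 - i; }
};
struct LittleEndian {
    static std::size_t index(std::size_t i, std::size_t) { return i; }
};

// Hypothetical operation policy: read a field from memory.
template <class Endian>
struct Load {
    template <class T>
    static void field(T & value, const unsigned char * mem)
    {
        T result = 0;
        for (std::size_t i = 0; i < sizeof(T); ++i) {
            const T byte = mem[Endian::index(i, sizeof(T))];
            result = static_cast<T>(result | static_cast<T>(byte << (8 * i)));
        }
        value = result;
    }
};

// Hypothetical operation policy: write a field to memory.
template <class Endian>
struct Store {
    template <class T>
    static void field(const T & value, unsigned char * mem)
    {
        for (std::size_t i = 0; i < sizeof(T); ++i)
            mem[Endian::index(i, sizeof(T))] =
                static_cast<unsigned char>(value >> (8 * i));
    }
};

// Illustration-only structure.
struct Pair {
    std::uint32_t a;
    std::uint16_t b;
};

// Written once. When loading, Self is Pair and Mem is const unsigned char;
// when storing, Self is const Pair and Mem is unsigned char.
template <class Rw, class Self, class Mem>
void serialize(Self & self, Mem * mem)
{
    Rw::field(self.a, mem + offsetof(Pair, a));
    Rw::field(self.b, mem + offsetof(Pair, b));
}

inline Pair be_load_pair(const unsigned char * mem)
{
    Pair value{};
    serialize<Load<BigEndian>>(value, mem);
    return value;
}

inline void le_store_pair(const Pair & value, unsigned char * mem)
{
    serialize<Store<LittleEndian>>(value, mem);
}
```

Because Self and Mem are deduced, the same serialize body compiles for both directions, which is the duplication the real xe_impl_for_type specialization is meant to remove.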
Disassembly
The following code:
int main()
{
    static const unsigned char foo_bytes[] = {
        0xdd, 0xcc, 0xbb, 0xaa,
        0x22, 0x11,
        0xff
    };
    const auto be_foo = be_load<Foo>(foo_bytes);
    if(be_foo.a != 0xddccbbaa || be_foo.b != 0x2211 || be_foo.c != 0xff)
        return EXIT_FAILURE;
    const auto le_foo = le_load<Foo>(foo_bytes);
    if(le_foo.a != 0xaabbccdd || le_foo.b != 0x1122 || le_foo.c != 0xff)
        return EXIT_FAILURE;
}
...compiled with g++ with optimizations enabled gives the following disassembly:
0000000000400690 <main>:
400690: 8b 05 da 01 00 00 mov eax,DWORD PTR [rip+0x1da] # 400870 <main::foo_bytes>
400696: 0f b7 0d d7 01 00 00 movzx ecx,WORD PTR [rip+0x1d7] # 400874 <main::foo_bytes+0x4>
40069d: 89 c2 mov edx,eax
40069f: 0f ca bswap edx
4006a1: 66 c1 c1 08 rol cx,0x8
4006a5: 81 fa aa bb cc dd cmp edx,0xddccbbaa
4006ab: 74 06 je 4006b3 <main+0x23>
4006ad: b8 01 00 00 00 mov eax,0x1
4006b2: c3 ret
4006b3: 0f b7 c9 movzx ecx,cx
4006b6: 81 c9 00 00 ff 00 or ecx,0xff0000
4006bc: 81 f9 11 22 ff 00 cmp ecx,0xff2211
4006c2: 75 e9 jne 4006ad <main+0x1d>
4006c4: 3d dd cc bb aa cmp eax,0xaabbccdd
4006c9: 75 e2 jne 4006ad <main+0x1d>
4006cb: 0f b7 05 a2 01 00 00 movzx eax,WORD PTR [rip+0x1a2] # 400874 <main::foo_bytes+0x4>
4006d2: 48 ba 00 00 00 00 00 movabs rdx,0xff000000000000
4006d9: 00 ff 00
4006dc: 48 c1 e0 20 shl rax,0x20
4006e0: 48 09 d0 or rax,rdx
4006e3: 48 c1 e8 20 shr rax,0x20
4006e7: 3d 22 11 ff 00 cmp eax,0xff1122
4006ec: 0f 95 c0 setne al
4006ef: 0f b6 c0 movzx eax,al
4006f2: c3 ret
License
Licensed under the Apache License, Version 2.0
History
- 12/09/2014 - Published XEndian header, samples and unit tests
- 18/09/2014 - Less macros and even more DRY
- 20/12/2014 - Simplified interface, improved naming and added more examples