C++: Size Matters in Platform Compatibility

15 Aug 2019 | CPOL | 2 min read
Data width must stay unchanged for cross-platform interoperability

Introduction

For file storage and data communication to interoperate, the width of each data type must stay the same across platforms. This tip discusses the pitfalls of platform-dependent data widths and how to avoid them. Endianness, which deserves a tip of its own, is not covered here.

time_t

time_t commonly stores the number of seconds since 1 January 1970 (UTC). On 32-bit Linux, it has traditionally been a 32-bit integer, which overflows in the year 2038, a Y2K-style crisis. On 64-bit Linux, it is 64-bit. On modern Visual C++, time_t is 64-bit by default on both x86 and x64. Because its width is not guaranteed to be the same across platforms, it is best to store time as text and convert it to time_t on loading.
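As a minimal sketch of the "store time as text" advice, the code below writes a time_t as ISO 8601 UTC text and parses it back into a 64-bit count of seconds. The function names time_to_text, days_from_civil and text_to_seconds are made up for this illustration, and the sketch assumes time_t counts seconds since 1970, which holds on Linux and Windows but is not required by the C++ standard. The parser does no range validation.

```cpp
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>

// Format a time_t as ISO 8601 UTC text, e.g. "2019-08-15T00:00:00Z".
// std::gmtime returns a pointer to shared static storage, so this
// sketch is not thread-safe. Returns an empty string on failure.
std::string time_to_text(std::time_t t)
{
    char buf[32];
    const std::tm* utc = std::gmtime(&t);
    if (utc == nullptr ||
        std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", utc) == 0)
        return std::string();
    return std::string(buf);
}

// Days between 1970-01-01 and the given Gregorian calendar date.
std::int64_t days_from_civil(std::int64_t y, int m, int d)
{
    y -= (m <= 2);  // treat January and February as part of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const std::int64_t yoe = y - era * 400;                        // year of era
    const std::int64_t doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    const std::int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Parse "YYYY-MM-DDTHH:MM:SSZ" into seconds since 1970 as a 64-bit value,
// independent of the platform's time_t width.
// Returns false if the six numeric fields cannot be read.
bool text_to_seconds(const std::string& text, std::int64_t& out)
{
    int y, mo, d, h, mi, s;
    if (std::sscanf(text.c_str(), "%d-%d-%dT%d:%d:%dZ",
                    &y, &mo, &d, &h, &mi, &s) != 6)
        return false;
    out = days_from_civil(y, mo, d) * 86400 + h * 3600 + mi * 60 + s;
    return true;
}
```

The parsed value is held in std::int64_t rather than time_t, so a file written on a 64-bit platform can still be read on one with a 32-bit time_t, which can then decide how to handle dates it cannot represent.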

wchar_t

wchar_t, the wide character type used to hold Unicode text, is 16 bits wide and holds UTF-16 on Windows, but is 32 bits wide and holds UTF-32 on Linux and macOS, so the two are incompatible. A UTF-16 character takes 2 or 4 bytes depending on its code point (characters outside the Basic Multilingual Plane need a surrogate pair), while a UTF-32 character always takes 4 bytes, which wastes memory because most commonly used characters fit in 2 bytes. UTF-8 uses 1 byte for ASCII and 2 to 4 bytes for all other characters. For interoperability between Windows and other OSes, the solution is to store the text in UTF-8 and convert it to wchar_t upon loading. Another solution is to use the fixed-width character types char16_t and char32_t introduced in C++11.
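To illustrate storing text as UTF-8, here is a small hand-written encoder that turns a std::u32string (built on C++11's char32_t) into UTF-8 bytes. The function name to_utf8 is an assumption made for this sketch, as is the choice to replace invalid code points with U+FFFD rather than report an error.

```cpp
#include <string>

// Encode UTF-32 text as UTF-8 bytes suitable for a file or network packet.
// Surrogate values and code points above U+10FFFF are not valid Unicode
// scalar values; this sketch replaces them with U+FFFD.
std::string to_utf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            cp = 0xFFFD;
        if (cp < 0x80) {                       // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The resulting byte sequence is the same on every platform. Decoding on load is the inverse operation, and its output can be placed in whichever wide type the platform prefers.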

Integer Types

size_t and its signed counterpart ptrdiff_t, whose widths differ between x86 and x64, should be avoided in storage formats and communication packets. Types whose width is not fixed, such as long (32-bit on 64-bit Windows but 64-bit on 64-bit Linux), should be avoided as well. Use the fixed-width integer types introduced in C++11, such as uint32_t and int32_t.
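A minimal sketch of this advice: the in-memory count stays a size_t, but it is written to the buffer as a 32-bit field of known width. The helper names write_count and read_count are hypothetical. Since any byte-level format has to pick a byte order, the sketch assumes little-endian order purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Append a size_t count to a byte buffer as a fixed 4-byte field.
// Throws if the value cannot be represented in 32 bits.
void write_count(std::vector<std::uint8_t>& buf, std::size_t count)
{
    if (count > std::numeric_limits<std::uint32_t>::max())
        throw std::length_error("count does not fit in 32 bits");
    const std::uint32_t v = static_cast<std::uint32_t>(count);
    for (int shift = 0; shift < 32; shift += 8)   // least significant byte first
        buf.push_back(static_cast<std::uint8_t>(v >> shift));
}

// Read the 4-byte field back, starting at the given offset.
std::size_t read_count(const std::vector<std::uint8_t>& buf, std::size_t offset)
{
    if (offset > buf.size() || buf.size() - offset < 4)
        throw std::out_of_range("buffer too short");
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
    return static_cast<std::size_t>(v);
}
```

The field occupies four bytes whether the code is built for x86 or x64, and the explicit range check turns silent truncation into an error.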

Pointer Types

Pointer width depends on whether the code is built for x86 or x64. Pointers are sometimes used as an opaque index or identity; the Windows SDK's DWORD_PTR is one such example. Because memory addresses are distinct, a pointer-derived identity may be stored temporarily in a database, file, or network packet. The problem arises when code originally built for x86 is recompiled for x64: a 64-bit value is truncated when written to a 32-bit field, such as a database column. If a pointer must be stored, use the largest pointer width as the data width. Otherwise, it is best to derive the identity by other means, such as a GUID or a truly random number.
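If a pointer-derived identity has to be stored, one way to apply the "largest pointer width" rule is to widen it to 64 bits before it leaves the process. This is only a sketch: pointer_to_id is a hypothetical name, and std::uintptr_t is technically an optional type, though it is available on the mainstream x86 and x64 toolchains.

```cpp
#include <cstdint>

// Fail the build on any platform whose pointers would not fit the field.
static_assert(sizeof(void*) <= sizeof(std::uint64_t),
              "pointer does not fit in a 64-bit field");

// Widen a pointer-derived identity to 64 bits so that the stored width
// is the same in x86 and x64 builds.
std::uint64_t pointer_to_id(const void* p)
{
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(p));
}
```

The matching database column or packet field must then be 64 bits wide as well. Such an identity is only meaningful while the object it came from is alive in the process that produced it.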

History

  • 15th August, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)