Encoding / Decoding 7 bit User Data for SMS PDU (PDU Bit Packer)

10 Oct 2012
A library for packing and unpacking 7-bit user data for SMS according to the GSM 03.38 standard.

Introduction

Most developers who write applications that send SMS eventually hit the problem of encoding the message text into 7-bit packed bytes. I found many approaches and implementations on the web that attempt to solve this problem, but those I tried were either buggy or had serious performance issues.

The SMS application I was developing had to send and receive millions of messages daily, so I needed a simple, easy-to-use, high-performance library for packing and unpacking 7-bit PDU user data. I wrote one and decided to share it, so that other developers can spend their time on something else: improving their applications, solving other problems, or just having a good time with family and friends.

Packing and Unpacking

The most common encodings for SMS text (user data) are the GSM default alphabet (7 bits per character) and Unicode in its UCS-2 form (16 bits per character). The GSM encoding is commonly used for messages in Latin-script languages (English, German, Spanish, …). Because each character is represented by only 7 bits, its basic character table holds 128 characters.

The user data part of an SMS holds the encoded text. According to GSM 03.38, the user data can hold up to 140 bytes. With the GSM encoding, the 7-bit characters must be packed into 8-bit bytes, which lets a single SMS carry 160 characters in its user data field (140 × 8 = 1120 bits, and 1120 / 7 = 160).

The 7-bit binary representation of a character is called a septet, and the 8-bit binary representation is called an octet.

The process of filling septets (7-bit characters) into octets (8-bit bytes) is called packing. The reverse process, extracting septets from the packed data, is called unpacking.

The Packing Protocol 

The best way to understand how septets are packed into octets is by example. The following steps pack the text ‘12345678’ according to the GSM encoding:

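Step One: Convert the text to septets according to the GSM character table:

  Character   Hex    Septet
  1           0x31   0110001
  2           0x32   0110010
  3           0x33   0110011
  4           0x34   0110100
  5           0x35   0110101
  6           0x36   0110110
  7           0x37   0110111
  8           0x38   0111000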

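Step Two: Move the least significant bits of the next septet into the free high-order bits of the current one to create octets. The first octet takes 1 bit from the second septet, the second octet takes 2 bits from the third septet, and so on, until the eighth septet is absorbed completely into the seventh octet.

The final result will look like this (in hex):

  31 D9 8C 56 B3 DD 70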

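Step Three: If the last octet is not full, pad its remaining high-order bits with zeros to complete 8 bits. With eight septets, as in this example, the 56 bits fill seven octets exactly, so no padding is needed.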
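
The three steps above can be sketched in a few lines of C#. This is a minimal illustration of the packing rule, not the PduBitPacker implementation: the class name SeptetPackingSketch and the method Pack are hypothetical names chosen for this example, and the input is assumed to hold one septet per byte.

```csharp
using System;

public static class SeptetPackingSketch
{
    // Hypothetical helper for illustration only (not part of PduBitPacker).
    // Each input byte is expected to carry one septet in its low 7 bits.
    public static byte[] Pack(byte[] septets)
    {
        // n septets occupy ceil(7n / 8) octets.
        byte[] packed = new byte[(septets.Length * 7 + 7) / 8];

        for (int i = 0; i < septets.Length; i++)
        {
            int value = septets[i] & 0x7F; // ignore any stray eighth bit
            int bitPos = i * 7;            // position of this septet's lowest bit
            int index = bitPos / 8;        // octet that receives the low-order part
            int shift = bitPos % 8;        // bit offset inside that octet

            packed[index] |= (byte)((value << shift) & 0xFF);

            // A septet placed at offset 2 or higher does not fit in one octet,
            // so its high-order bits spill into the next octet.
            if (shift > 1)
            {
                packed[index + 1] |= (byte)(value >> (8 - shift));
            }
        }

        // Unused high-order bits of the last octet are left as zero (Step Three).
        return packed;
    }

    public static void Main()
    {
        byte[] septets = { 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38 };
        // Prints 31D98C56B3DD70
        Console.WriteLine(BitConverter.ToString(Pack(septets)).Replace("-", ""));
    }
}
```

Working from absolute bit positions keeps the loop free of special cases: the same two statements handle every septet, whatever its offset.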

Unpacking performs the same steps in reverse.
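
The reverse direction can be sketched the same way. Again, this is only an illustration under stated assumptions: SeptetUnpackingSketch and Unpack are hypothetical names, and the sketch takes the number of septets as an explicit parameter, which is a design choice of this example and not a description of how UnpackBytes works. The parameter is there because seven packed octets can hold either seven septets plus padding or eight septets, and the packed bytes alone do not say which. In a real PDU, the user data length field gives the septet count for 7-bit data.

```csharp
using System;

public static class SeptetUnpackingSketch
{
    // Hypothetical helper for illustration only (not part of PduBitPacker).
    // septetCount tells the method how many septets to extract.
    public static byte[] Unpack(byte[] packed, int septetCount)
    {
        if (septetCount < 0 || septetCount * 7 > packed.Length * 8)
        {
            throw new ArgumentOutOfRangeException(nameof(septetCount));
        }

        byte[] septets = new byte[septetCount];

        for (int i = 0; i < septetCount; i++)
        {
            int bitPos = i * 7;      // position of this septet's lowest bit
            int index = bitPos / 8;  // octet holding the low-order part
            int shift = bitPos % 8;  // bit offset inside that octet

            int value = packed[index] >> shift;

            // Fetch the high-order bits that spilled into the next octet.
            if (shift > 1)
            {
                value |= packed[index + 1] << (8 - shift);
            }

            septets[i] = (byte)(value & 0x7F);
        }

        return septets;
    }

    public static void Main()
    {
        byte[] packed = { 0x31, 0xD9, 0x8C, 0x56, 0xB3, 0xDD, 0x70 };
        // Prints 31-32-33-34-35-36-37-38
        Console.WriteLine(BitConverter.ToString(Unpack(packed, 8)));
    }
}
```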

Using The Library

The library is wrapped in the PduBitPacker class, which contains the following methods:

  • PackBytes: Packs an array of unpacked 7-bit values into a packed 8-bit array according to the GSM protocol. This method has 3 overloads.
  • UnpackBytes: Unpacks a packed 8-bit array into an array of unpacked 7-bit values according to the GSM protocol.
  • ConvertHexToBytes: A utility method that converts a hex string to a byte array.
  • ConvertBytesToHex: A utility method that converts a byte array to a hex string.
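
For readers who want to reproduce the hex strings in this article without the library, the two conversions can be approximated with standard .NET calls. The sketch below is an assumption about what such helpers might look like, not the library's code: HexSketch, HexToBytes and BytesToHex are hypothetical names, and the output format (uppercase, no separators) is chosen here to match the hex string used in the unpacking example.

```csharp
using System;

public static class HexSketch
{
    // Hypothetical stand-in for a hex-to-bytes helper: two hex digits per byte.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
        {
            throw new ArgumentException("Hex string must have an even number of digits.", nameof(hex));
        }

        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Convert.ToByte with base 16 parses one pair of hex digits.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Hypothetical stand-in for a bytes-to-hex helper: uppercase, no separators.
    public static string BytesToHex(byte[] bytes)
    {
        // BitConverter.ToString yields "31-D9-...", so the dashes are removed.
        return BitConverter.ToString(bytes).Replace("-", "");
    }
}
```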

An example of packing:

 // Filling the array with demo data to be packed
 // The byte array is the GSM default encoding character codes for the following text:
 // "12345678"
 byte[] unpackedBytes = { 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38 };
 // Pack the bytes 
 byte[] packedBytes = PduBitPacker.PackBytes(unpackedBytes);
 // Display the output as a hex string
 MessageBox.Show(PduBitPacker.ConvertBytesToHex(packedBytes)); 

An example of unpacking:

 // Fill the array with the packed bytes
 byte[] packedBytes = PduBitPacker.ConvertHexToBytes("31D98C56B3DD70");
 // Unpack the bytes 
 byte[] unpackedBytes = PduBitPacker.UnpackBytes(packedBytes);
 // Display the output as a hex string
 MessageBox.Show(PduBitPacker.ConvertBytesToHex(unpackedBytes));

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.