Introduction
I have been working on a VoIP application, and wanted to implement the G.711 specification, which I found out had two variants: A-law and µ-law. Compounded by the problem of the latter being referred to as "mu-law" and "u-law" in addition to µ-law (ALT-0181, by the way), the documentation for the two is quite poor. Without an ITU login, you can't see the actual standard, and few places online go in to exactly what happens in these encodings. Wikipedia just throws a couple of equations around, but is not terribly helpful. Eventually, I did find various implementations, only one of which was reasonably commented. Of course, that one was the one with the error in it...
So here, in my first CodeProject article, I will explain thoroughly the implementation of G.711 in both of its forms. The code is in C#, but it is simple enough to be ported to, say, Java.
Note that in most contexts, I will use mu instead of µ. Only a few comments and variable names use µ, it's just too strange to think of as a normal character. And although I know the alt-code by heart, it still takes much longer to type.
The Code
The project is arranged into five C# files. One is Program.cs, which is not really important until later, and even then, it isn't so terribly important. The other four are static classes, MuLawEncoder
, MuLawDecoder
, ALawEncoder
, and ALawDecoder
, which all do exactly as their names imply.
The static constructors do all of the real work, and store the results in a table. When the Encode
or Decode
methods are called, they look in the table. At the cost of only a little memory (64 KB per encoder, 0.5KB per decoder), it is definitely worth it.
µ-Law Encoding
The MuLawEncoder
class handles the µ-law encoding (surprise!) by looking up values in its own private byte array pcmToMuLawMap
. If the index is the unsigned 16-bit PCM value, the value is the unsigned 8-bit µ-law byte.
public const int BIAS = 0x84;
public const int MAX = 32635;
These are the constants used by the encoder. Both will come up later.
static MuLawEncoder()
{
pcmToMuLawMap = new byte[65536];
for (int i = short.MinValue; i <= short.MaxValue; i++)
pcmToMuLawMap[(i & 0xffff)] = encode(i);
}
The static constructor fills the table. It instantiates it as an array of 65536 bytes, and then goes from -37638 to 37637. To make the index of the array not be negative, i
is ANDed with 0xffff, making it positive.
The method encode(short i)
does exactly what you would expect. However, it is private
. This is because the only time it will ever be used is by the constructor.
private static byte encode(int pcm)
{
int sign = (pcm & 0x8000) >> 8;
if (sign != 0)
pcm = -pcm;
if (pcm > MAX) pcm = MAX;
pcm += BIAS;
int exponent = 7;
for (int expMask = 0x4000; (pcm & expMask) == 0;
exponent--, expMask >>= 1) { }
int mantissa = (pcm >> (exponent + 3)) & 0x0f;
byte mulaw = (byte)(sign | exponent << 4 | mantissa);
return (byte)~mulaw;
}
The comments say everything that needs to be said.
The public methods all access the table, rather than do reprocessing. They are all called MuLawEncode
, with different arguments. It is redundant, but so be it. The Encode overloads:
public static byte MuLawEncode(int pcm) { }
public static byte MuLawEncode(short pcm) { }
public static byte[] MuLawEncode(int[] data) { }
public static byte[] MuLawEncode(short[] data) { }
public static void[] MuLawEncode(byte[] data, byte[] target)
{ }
public static byte[] MuLawEncode(byte[] data) {
int size = data.Length / 2;
byte[] encoded = new byte[size];
for (int i = 0; i < size; i++)
encoded[i] = MuLawEncode((data[2 * i + 1] << 8) | data[2 * i]);
return encoded;
}
The last takes an array of bytes in Little-Endian order. Thus, it is special, and gets to be displayed.
The last thing in the MuLawEncoder
class is the ZeroTrap
. Apparently, it is not so great of a thing to send an all-zero µ-law byte, so when the trap is enabled, an all-zero µ-law byte is replaced instead, by 0x02. By default, this trap is disabled.
Normally, the zero trap is a boolean, but here, there is no need. Since the unsigned PCM value 33000 maps to 0x00, we know that if the table reads 0x00, the zero trap is off. See:
public bool ZeroTrap
{
get { return (pcmToMuLawMap[33000] != 0); }
set
{
byte val = (byte)(value ? 2 : 0);
for (int i = 32768; i <= 33924; i++)
pcmToMuLawMap[i] = val;
}
}
When the zero trap is assigned, the program will go through the table and assign either 0x00 or 0x02 to all of the places that map to 0x00 normally. These are the values in the range [32768, 33924].
µ-Law Decoding
In yet another major surprise, this is done in the MuLawDecoder
class. This uses the same table lookup technique as the above, but has an array of type short
, since the values are 16-bit signed PCM values.
static MuLawDecoder()
{
muLawToPcmMap = new short[256];
for (byte i = 0; i < byte.MaxValue; i++)
muLawToPcmMap[i] = decode(i);
}
private static short decode(byte mulaw)
{
mulaw = (byte)~mulaw;
int sign = mulaw & 0x80;
int exponent = (mulaw & 0x70) >> 4;
int data = mulaw & 0x0f;
data |= 0x10;
data <<= 1;
data += 1;
data <<= exponent + 2;
data -= MuLawEncoder.BIAS;
return (short)(sign == 0 ? data : -data);
}
Again, the comments explain the magic.
And again, the main function is overloaded:
public static short MuLawDecode(byte mulaw) { }
public static short[] MuLawDecode(byte[] data) { }
public static void MuLawDecode(byte[] data, out short[] decoded) { }
public static void MuLawDecode(byte[] data, out byte[] decoded)
{
int size = data.Length;
decoded = new byte[size * 2];
for (int i = 0; i < size; i++)
{
decoded[2 * i] = (byte)(muLawToPcmMap[data[i]] & 0xff);
decoded[2 * i + 1] = (byte)(muLawToPcmMap[data[i]] >> 8);
}
}
The out
parameters are used because otherwise there would be no way to separate the two MuLawDecode
functions that both take a byte
array. And again, the Little-Endian byte order is displayed.
A-law Encoding
A-law is even worse documented than µ-law. The implementations are even worse in terms of commenting, so this took a bit longer to figure out. In the end, it is indeed similar to µ-law's implementation, despite how different the A-law C code looks from the µ-law C code.
There is no zero trap, and the ALawEncode
overloads are identical to the MuLawEncode
overloads, so the only difference is the encode(short i)
method, and that the MAX
is 0x7fff instead of (0x7fff-0x84) like in MuLawEncoder
.
private static byte encode(int pcm)
{
int sign = (pcm & 0x8000) >> 8;
if (sign != 0)
pcm = -pcm;
if (pcm > MAX) pcm = MAX;
int exponent = 7;
for (int expMask = 0x4000; (pcm & expMask) == 0
&& exponent>0; exponent--, expMask >>= 1) { }
int mantissa = (pcm >> ((exponent == 0) ? 4 : (exponent + 3))) & 0x0f;
byte alaw = (byte)(sign | exponent << 4 | mantissa);
return (byte)(alaw^0xD5);
}
Even this has only subtle differences. The mask, lack of bias, and the zero exponent weirdness are the key differences.
A-Law Decoding
The only difference between this and MuLawDecoder
is the decode(short i)
method.
private static short decode(byte alaw)
{
alaw ^= 0xD5;
int sign = alaw & 0x80;
int exponent = (alaw & 0x70) >> 4;
int data = alaw & 0x0f;
data <<= 4;
data += 8;
if (exponent != 0)
data += 0x100;
if (exponent > 1)
data <<= (exponent - 1);
return (short)(sign == 0 ? data : -data);
}
That's it for the encoders and decoders.
Program.cs
The program included in the source package runs the ALawEncoder
and MuLawEncoder
on a random series of data, and averages the percent errors. It then displays the average errors for each codec for the full range.
Some results:
On the range [1,32767]: µ-Law: 1.14%, A-Law: 1.26%
On the range [-32767,-1]: µ-Law: 1.14%, A-Law: 1.16%
Conclusion
A-law and µ-law are not so complicated, especially when. laid out in plain sight. I hope this is useful. After all, G.711 can make 16-bit samples, take up 8 bits, or 50% compression. At 8KHz sampling, that turns a 128Kbps stream into a 64Kbps stream.
Edited July 28th, 2006 to fix error and add new overload suggested by Nathan Allan