Introduction
Many applications need to output sounds of one sort or another, and C# offers the SoundPlayer class to help with this. SoundPlayer is designed for playing WAV files and includes some helpful features that make it easy to load WAVs from a variety of sources. However, when it comes to playing a continuous audio stream, such as might be produced by an audio synthesizer or a proprietary streaming protocol, SoundPlayer does not offer much help.
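For the discrete-clip case that SoundPlayer does handle well, usage is pleasingly brief. A minimal sketch (the file path is a placeholder, not from my application):

```csharp
using System.Media;

class ChimeDemo
{
    static void Main()
    {
        // SoundPlayer copes happily with self-contained WAV files...
        using (var player = new SoundPlayer(@"C:\sounds\chime.wav")) // placeholder path
        {
            player.PlaySync(); // blocks until the clip finishes
        }
        // ...it is the continuous-stream case where it falls short.
    }
}
```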
In my own (custom streaming) application, although I was able to encapsulate the continuous stream into discrete memory-based WAVs, I could not get SoundPlayer to produce an adequately smooth playback experience. To overcome this, I returned to the classic multimedia audio interface provided by Windows. While straightforward enough in some programming paradigms, C# presented some interesting (and, for me, quite new) challenges.
Incorporating DLLs into C# is remarkably simple and, after the few tweaks necessary to accommodate C#’s data types, I was quickly able to create an interface to the default audio device on my PC. The lines below were literally all that was necessary to get started.
[DllImport("winmm.dll")]
public static extern int waveOutOpen(out IntPtr hWaveOut, int uDeviceID, WaveFormat lpFormat,
WaveDelegate dwCallback, IntPtr dwInstance, int dwFlags);
However, the waveOut device requires that data remain valid in the application while being either played or queued for playing; you can’t just fire the data at the device and forget it. The principal reason for this is that you only actually send a header to the device itself; the header contains a pointer detailing where the data live.

Anyone familiar with managed objects will be nodding their head at this point but, as a newcomer to C#, it was a surprise to me. Managed objects are C#’s mechanism for tidying up memory that a program has used but no longer requires; in most languages the programmer has this responsibility, but in C# it is taken care of behind the scenes by the garbage collector. One side effect of this is that objects, words, bytes, and so forth, can (and do) move around in memory. If we apply a traditional programming paradigm to C#, we find that, sooner or later, our pointers are no longer pointing at the data structures we thought they were. Typically, in a C# program, you are either not allowed to do these kinds of things or you get a stern warning when you try. If you’re interfacing to external libraries, of course, you don’t get these reminders, because C# doesn’t know what the external libraries are going to do with the information.
In order to use the waveOut interface we therefore need to create some unmanaged objects. Essentially these are areas of reserved memory that the C# garbage collector won’t interfere with. For this interface we need, as a minimum, two fixed elements. The first is a buffer where all the raw audio samples and, for good measure, the wave headers will be stored. The second is a UInt32 which is used in the waveOut callback. Interestingly, my first suspicion that all was not well with my code was the result of this callback. I registered a pointer to a managed UInt32 with the waveOut interface so that I could determine how many audio frames were currently queued for playout. This worked consistently for a few seconds, then stopped. Eventually the program, while seeming initially to carry on, became singularly unhappy. So what was happening? The garbage collector was kicking in and moving my frame counter to a new location; the address registered with the waveOut device, however, remained unchanged. Consequently, each time the callback was triggered after garbage collection, some other unsuspecting variable was being modified instead.
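An alternative to allocating unmanaged memory is to pin a managed object so the garbage collector cannot relocate it. The sketch below is not the approach this article takes, but it illustrates the same problem being solved from the managed side, using GCHandle:

```csharp
using System;
using System.Runtime.InteropServices;

class PinDemo
{
    static void Main()
    {
        uint[] counter = new uint[1];
        // GCHandleType.Pinned stops the collector moving the array,
        // so the address below stays valid until the handle is freed.
        GCHandle handle = GCHandle.Alloc(counter, GCHandleType.Pinned);
        try
        {
            IntPtr stableAddress = handle.AddrOfPinnedObject();
            // An address like this could safely be handed to native code
            // (e.g. as the dwInstance parameter of waveOutOpen).
            Marshal.WriteInt32(stableAddress, 42);
            Console.WriteLine(counter[0]); // prints 42
        }
        finally
        {
            handle.Free(); // unpin so the GC can manage the array again
        }
    }
}
```

Pinning does constrain the garbage collector, so long-lived pins of large objects are generally discouraged; for buffers that live for the whole session, AllocHGlobal (as used below) is a reasonable choice.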
My two unmanaged structures (IntPtrs) are created as follows:
audioRawP = Marshal.AllocHGlobal(audioRawSize);
rFrames = Marshal.AllocHGlobal(4);
These will not now be touched by background garbage collection; audioRawP is an IntPtr to audioRawSize bytes of memory, while rFrames is an IntPtr to four bytes of memory which will be cast to a UInt32. Because we have created these objects ourselves, we also have to dispose of them ourselves. To do this, the class is declared as an IDisposable type:
public class woLib : IDisposable
When a class is declared as IDisposable, we must include a Dispose() method. This method is called when the object is destroyed and ensures that any unmanaged memory is tidied up. Without it, we would end up with potential memory leaks in our programs. My Dispose method is:
public void Dispose()
{
CloseWODevice();
if (audioRawP != (IntPtr)0)
{
Marshal.FreeHGlobal(audioRawP);
Marshal.FreeHGlobal(rFrames);
}
}
I ensure that the application has finished with the memory, check that memory has actually been allocated (since audioRawP is initialised to 0), then free it. Interestingly, when pointers are created in this way, they become more intuitive to work with if you come from a more traditional programming background.
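Relying on callers to invoke Dispose() can be fragile; the standard dispose pattern adds a finalizer as a safety net. A minimal sketch of that pattern (the class and names here are illustrative, not part of woLib):

```csharp
using System;
using System.Runtime.InteropServices;

public class UnmanagedBuffer : IDisposable
{
    private IntPtr buffer;

    public UnmanagedBuffer(int size)
    {
        buffer = Marshal.AllocHGlobal(size);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // finalizer no longer needed
    }

    // Finalizer: runs if the caller forgot to call Dispose().
    ~UnmanagedBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(buffer);
            buffer = IntPtr.Zero; // guard against a double free
        }
    }

    static void Main()
    {
        using (var buf = new UnmanagedBuffer(1024))
        {
            // buffer usable here
        }
        Console.WriteLine("disposed cleanly");
    }
}
```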
The final result is a class called woLib which provides a very simple interface to the waveOut device. After creation, only two methods are required to get going. My own application requires raw PCM and G.711, so the initialisation method
public void InitWODevice(UInt32 sf, UInt32 chans, UInt32 bps, bool G711)
includes a Boolean, G711. This feeds into the waveOut device through the wFormatTag field of the WaveFormat structure:
if (G711) wFmt.wFormatTag = (short)WaveFormats.mulaw; else wFmt.wFormatTag = (short)WaveFormats.Pcm;
Clearly it is a simple matter to include a more generic interface, if necessary. I’ve included a few typical formats in the WaveFormats enumeration
public enum WaveFormats
{
Unknown = 0,
Pcm = 1,
Adpcm = 2,
Float = 3,
alaw = 6,
mulaw = 7
}
in case they should prove helpful. Finally, I have used only the default audio interface on my PC. When the waveOut device is opened
waveOutOpen(out hwo, -1, wFmt, woDone, rFrames, CALLBACK_FUNCTION);
I specify -1 as the required device, which selects the default. If you have more than one audio device present, you can choose one by replacing -1 with a value from 0 to n-1 for n devices. The multimedia interface includes a call to find out how many devices are present on your system, should you wish to implement it; it is imported from the same DLL and is declared as int waveOutGetNumDevs().
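Enumeration might be sketched like this; the waveOutGetNumDevs import is an addition of mine, not part of the class below:

```csharp
using System;
using System.Runtime.InteropServices;

class DeviceCount
{
    // Same DLL as the rest of the interface; returns the number of
    // waveform-audio output devices present on the system.
    [DllImport("winmm.dll")]
    public static extern int waveOutGetNumDevs();

    static void Main()
    {
        int n = waveOutGetNumDevs();
        Console.WriteLine(n + " waveOut device(s) available");
        // Valid device IDs are then 0 .. n-1; -1 (WAVE_MAPPER) selects the default.
    }
}
```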
Using the code
To use the code, an object is first created. In my program this appears as
static woLib WaveOut = new woLib();
Next, the device must be initialised thus:
WaveOut.InitWODevice(48000, 2, 16, false);
In this case the waveOut interface will expect data delivered at 48 kHz with 4 bytes per sample: two bytes (16 bits) per channel, and two channels. The false at the end means that this isn’t G.711 (which would be 8 bits per sample in any case). If InitWODevice is called after the device has already been initialised, it will close itself and reopen with the new settings. Once this is accomplished, it remains only to send data to the interface.
WaveOut.SendWODevice(pPCM, 4096);
In this case, I have 4096 bytes located at pPCM, which is an IntPtr to my raw PCM data. This represents 1024 actual audio samples, based on my current settings, which will last for around 21ms. If I were feeding data from a disk file, I would monitor the number of queued buffers
UInt32 queued = GetQueued();
and test against a minimum and maximum threshold. My raw buffer size is set at
private const int audioRawSize = 1000000;
1000000 bytes, so I can fit around 240 (4096-byte) frames on the queue before overflowing. I’d probably aim at maintaining half this, which means I’d only need to check every two or so seconds. Clearly, if you’re accessing data from a disk you can use much larger data chunks; mine come from an AAC decoder stream and thus arrive in 1024-sample blocks. There is no particular reason that the raw audio buffer size is set to a fixed value; you can easily extend InitWODevice to allow you to specify the size of this buffer, should you need to.
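A disk-fed version of this throttling might be sketched as follows. The threshold and the ReadNextChunk helper are illustrative assumptions of mine, not part of the class:

```csharp
// Keep up to ~120 frames queued (about 2.5 seconds of audio at
// ~21 ms per 4096-byte frame), topping the queue up when it runs low.
const uint MaxQueued = 120;
bool playing = true;

while (playing)
{
    while (WaveOut.GetQueued() < MaxQueued)
    {
        uint bytes;
        IntPtr chunk = ReadNextChunk(out bytes); // hypothetical disk reader
        if (chunk == IntPtr.Zero) { playing = false; break; }
        WaveOut.SendWODevice(chunk, bytes);
    }
    // Half the queue drains in a little over a second, so polling at this
    // rate comfortably keeps the device fed.
    System.Threading.Thread.Sleep(1000);
}
```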
The complete class is included below.
public class woLib : IDisposable
{
[StructLayout(LayoutKind.Sequential)]
public struct WaveHdr
{
public IntPtr lpData;
public int dwBufferLength;
public int dwBytesRecorded;
public IntPtr dwUser;
public int dwFlags;
public int dwLoops;
public IntPtr lpNext;
public int reserved;
}
public enum WaveFormats
{
Unknown = 0,
Pcm = 1,
Adpcm = 2,
Float = 3,
alaw = 6,
mulaw = 7
}
[StructLayout(LayoutKind.Sequential)]
public class WaveFormat
{
public short wFormatTag;
public short nChannels;
public int nSamplesPerSec;
public int nAvgBytesPerSec;
public short nBlockAlign;
public short wBitsPerSample;
public short cbSize;
}
public const int CALLBACK_FUNCTION = 0x00030000;
public const int CALLBACK_NULL = 0x00000000;
public const int BUFFER_DONE = 0x3BD;
public delegate void WaveDelegate(IntPtr dev, int uMsg, IntPtr dwUser, IntPtr dwParam1, IntPtr dwParam2);
[DllImport("winmm.dll")]
public static extern int waveOutOpen(out IntPtr hWaveOut, int uDeviceID, WaveFormat lpFormat, WaveDelegate dwCallback, IntPtr dwInstance, int dwFlags);
[DllImport("winmm.dll")]
public static extern int waveOutReset(IntPtr hWaveOut);
[DllImport("winmm.dll")]
public static extern int waveOutRestart(IntPtr hWaveOut);
[DllImport("winmm.dll")]
public static extern int waveOutPrepareHeader(IntPtr hWaveOut, ref WaveHdr lpWaveOutHdr, int uSize);
[DllImport("winmm.dll")]
public static extern int waveOutWrite(IntPtr hWaveOut, ref WaveHdr lpWaveOutHdr, int uSize);
[DllImport("winmm.dll")]
public static extern int waveOutClose(IntPtr hWaveOut);
private UInt32 pFrames = 0;
private IntPtr rFrames = (IntPtr)0;
private IntPtr hwo = (IntPtr)0;
private const int audioRawSize = 1000000;
private IntPtr audioRawP = (IntPtr)0;
private UInt32 audioRawIndex = 0;
private WaveDelegate woDone = new WaveDelegate(WaveOutDone);
unsafe static void WaveOutDone(IntPtr dev, int uMsg, IntPtr dwUser, IntPtr dwParam1, IntPtr dwParam2)
{
// dwUser receives the dwInstance pointer registered with waveOutOpen
// (rFrames); declaring it as IntPtr keeps it intact on 64-bit systems.
if ((uMsg == BUFFER_DONE) && (dwUser != IntPtr.Zero))
{
try
{
(*(UInt32*)dwUser)++;
}
catch
{
}
}
}
public void InitWODevice(UInt32 sf, UInt32 chans, UInt32 bps, bool G711)
{
if (hwo != (IntPtr)0) CloseWODevice();
if (audioRawP == (IntPtr)0)
{
audioRawP = Marshal.AllocHGlobal(audioRawSize);
rFrames = Marshal.AllocHGlobal(4);
}
WaveFormat wFmt = new WaveFormat();
wFmt.nAvgBytesPerSec = (int)(sf * (bps / 8) * chans);
wFmt.nBlockAlign = (short)(chans * (bps / 8));
wFmt.nChannels = (short)chans;
wFmt.nSamplesPerSec = (int)sf;
wFmt.wBitsPerSample = (short)bps;
if (G711) wFmt.wFormatTag = (short)WaveFormats.mulaw; else wFmt.wFormatTag = (short)WaveFormats.Pcm;
wFmt.cbSize = 0;
waveOutOpen(out hwo, -1, wFmt, woDone, rFrames, CALLBACK_FUNCTION);
ResetWODevice();
}
public void ResetWODevice()
{
if (hwo != (IntPtr)0) waveOutReset(hwo);
pFrames = 0;
if (rFrames != (IntPtr)0)
unsafe
{
UInt32* i = (UInt32*)rFrames.ToPointer();
*i = 0;
}
audioRawIndex = 0;
}
public bool IsInit()
{
return (hwo != (IntPtr)0);
}
public UInt32 GetFrames()
{
return pFrames;
}
public UInt32 GetQueued()
{
if (rFrames != (IntPtr)0)
unsafe
{
UInt32* i = (UInt32*)rFrames.ToPointer();
if (pFrames > *i)
return (pFrames - *i);
else
return 0;
}
else return 0;
}
public void CloseWODevice()
{
ResetWODevice();
if (hwo != (IntPtr)0) waveOutClose(hwo);
hwo = (IntPtr)0;
}
public void SendWODevice(IntPtr data, UInt32 bytes)
{
if ((hwo != (IntPtr)0) && (audioRawP!=(IntPtr)0))
unsafe
{
UInt32 pwhs = (UInt32) sizeof(WaveHdr);
if ((bytes + audioRawIndex + pwhs) > audioRawSize)
audioRawIndex = 0;
if ((bytes + pwhs) < audioRawSize)
{
byte* bpi = (byte*)audioRawP;
byte* bpw = (byte*)&bpi[audioRawIndex];
byte* bpr = (byte*)&bpi[audioRawIndex + pwhs];
WaveHdr* pwh = (WaveHdr*)bpw;
pwh->lpData = (IntPtr)bpr;
pwh->dwBufferLength = (int)bytes;
pwh->dwUser = (IntPtr)0;
pwh->dwFlags = 0;
pwh->dwLoops = 0;
byte[] raw = new byte[bytes];
Marshal.Copy(data, raw, 0, (int)bytes);
IntPtr bpd = (IntPtr)bpr;
Marshal.Copy(raw, 0, bpd, (int)bytes);
audioRawIndex += (pwhs + bytes);
waveOutPrepareHeader(hwo, ref *pwh, (int)sizeof(WaveHdr));
waveOutWrite(hwo, ref *pwh, (int)sizeof(WaveHdr));
pFrames++;
}
}
}
public void Dispose()
{
CloseWODevice();
if (audioRawP != (IntPtr)0)
{
Marshal.FreeHGlobal(audioRawP);
Marshal.FreeHGlobal(rFrames);
}
}
}
Points of Interest
Interfacing to real devices through C# can produce a clash of paradigms; it has been helpful to have a practical, and hopefully useful, example with which to work through this shift in understanding. C# does impose some frustrations; for example, where data are copied from the source into the unmanaged buffers, an extra copy appears to be necessary.
byte[] raw = new byte[bytes];
Marshal.Copy(data, raw, 0, (int)bytes);
IntPtr bpd = (IntPtr)bpr;
Marshal.Copy(raw, 0, bpd, (int)bytes);
In a world of optimized applications this produces a slightly uncomfortable feeling. I experienced exactly the same issue when getting a memory-based image onto my screen: it is first copied into memory, then into a bitmap, and finally attached to a picture box object. It’s hard to say how many physical copies were actually involved, but it feels like the long way around. Heaven help me when I start unravelling DirectX!
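On more recent frameworks (.NET Framework 4.6 onwards) the intermediate managed array can be avoided with Buffer.MemoryCopy, which copies directly between two unmanaged pointers. A sketch of the idea, separate from the class above:

```csharp
using System;
using System.Runtime.InteropServices;

class DirectCopy
{
    static unsafe void Main()
    {
        IntPtr src = Marshal.AllocHGlobal(16);
        IntPtr dst = Marshal.AllocHGlobal(16);
        try
        {
            Marshal.WriteInt64(src, 0x1122334455667788);
            // Single unmanaged-to-unmanaged copy; no managed byte[] detour.
            // Arguments: source, destination, destination size, bytes to copy.
            Buffer.MemoryCopy(src.ToPointer(), dst.ToPointer(), 16, 8);
            Console.WriteLine(Marshal.ReadInt64(dst).ToString("X"));
        }
        finally
        {
            Marshal.FreeHGlobal(src);
            Marshal.FreeHGlobal(dst);
        }
    }
}
```

This requires compiling with unsafe code enabled, which the woLib class already assumes.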
Starting to get to grips with unmanaged objects was the real key to this project. It took quite a bit of internet research even to find out the right question to ask, never mind its answer! As a complete newcomer to both classes and C#, my code may not be brilliantly written; however, a newcomer’s perspective may well help other newcomers, since we’re all going to trip over the same issues.