Recording and playing PCM audio on Windows 8 (VB)

15 Jan 2013 · Public Domain
How to record and play PCM audio on Windows 8 in .NET.

Introduction

This code records and plays raw audio in PCM format, on Windows 8. It's written in VB but the same techniques apply to C#.

It uses IAudioClient, IAudioRenderClient, and IAudioCaptureClient, which are part of the Windows Audio Session API (WASAPI).

Windows has several different API-families for recording and playing back audio:

  • MediaElement - All the UI frameworks have a MediaElement control - XAML for Windows 8, WPF, Silverlight, WinForms. They let you play audio and video. But they don't have an interface for recording, and they don't let you send your own raw audio data. (The best you can do is build a WAV file and tell them to play it).
  • waveOut and waveIn - These legacy APIs used to be the simplest and easiest libraries to use from .NET, for recording and playing raw audio. Unfortunately they're not allowed in Windows app-store apps.
  • WASAPI (Windows Audio Session API) - Introduced in Vista, this C++/COM API is the new recommended way to record and play audio.
  • XAudio2 - This is a low-level C++/COM API for audio, aimed at game developers, and allowed in Windows app-store apps. It is the successor to DirectSound.

NAudio. As a .NET developer, if you want audio, your first port of call should normally be http://naudio.codeplex.com/. NAudio is a high-quality library, available through NuGet and licensed under Ms-PL, for adding audio to VB and C# apps. It supports waveOut/waveIn, WASAPI, and DirectSound.

SharpDX. This is a managed wrapper for DirectX, at http://www.sharpdx.org/. Amongst other things, it wraps XAudio2. It has also been used in successful submissions to the Windows app store, so it's another candidate.

I wrote this article because I didn't know about SharpDX, and because (as of September 2012) NAudio doesn't work in Windows 8 app-store apps. That's because it includes calls to disallowed legacy APIs. Also, Windows 8 app-store apps require a different "entry-point" into WASAPI from the traditional entry-point that NAudio uses.

Microsoft has some great C++ tutorials on using WASAPI for "Capturing a stream" and "Rendering a stream". This article is basically a .NET and WinRT equivalent of those samples.

Using the code

COM interop. WASAPI is thoroughly COM based, so it needs pinvoke interop wrappers to make it work under .NET. Moreover, in an audio application, you deal with large quantities of data, and it's important to release resources in a timely fashion. VB and C# use the IDisposable mechanism for this, and rely on .NET garbage collection for everything else. C++/COM uses IUnknown.Release and reference-counting instead. It's difficult to bridge the gap between the two.

You can skip over this section if you just want to read about WASAPI. But I don't advise it, since you'll need these skills in order to use the WASAPI correctly.

I suggest this MSDN article "Improving Interop Performance" and this C++ Team blog article "Mixing deterministic and non-deterministic cleanup" and this blog by Lim Bio Liong "RCW Internal Reference Count".

VB.NET
<ComImport, Guid("BCDE0395-E52F-467C-8E3D-C4579291692E")>
Class MMDeviceEnumerator
End Class

<ComImport, Guid("A95664D2-9614-4F35-A746-DE8DB63617E6")>
<InterfaceType(ComInterfaceType.InterfaceIsIUnknown)>
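Interface IMMDeviceEnumerator
    ' Abridged for illustration: a real declaration must list every method in vtable order
    ' (EnumAudioEndpoints comes first; see the full interop library at the end).
    Sub GetDefaultAudioEndpoint(dflow As EDataFlow, role As ERole, ByRef ppDevice As IMMDevice)
End Interface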

10: Dim e As New MMDeviceEnumerator
20: Dim i = CType(e, IMMDeviceEnumerator)
...
30: Marshal.FinalReleaseComObject(i)
40: i = Nothing : e = Nothing

Here's an explanation of the above code. We will be concerned with the COM object, which maintains a reference count, and the .NET Runtime Callable Wrapper (RCW) for it, which has its own internal reference count. There's a one-to-one relationship between an RCW and an IUnknown IntPtr. The first time the IUnknown IntPtr enters managed code (e.g., through allocating a new COM object, or Marshal.GetObjectForIUnknown) the RCW is created, its internal reference count is set to 1, and it calls IUnknown.AddRef just once. On subsequent times that the same IUnknown IntPtr enters managed code, the RCW's internal reference count is incremented, but it doesn't call IUnknown.AddRef. The RCW's internal reference count gets decremented through natural garbage collection; you can also decrement it manually with Marshal.ReleaseComObject, or force it straight to 0 with Marshal.FinalReleaseComObject. When the RCW's internal reference count drops to 0, it calls IUnknown.Release just once. Any further method call on it will fail with the message: "COM object that has been separated from its underlying RCW cannot be used".

After line 10: We have a COM object with ref count "1", and "e" is a reference to its RCW, which has an internal reference count "1".

After line 20: The same COM object still has ref count "1", and "i" is a reference to the same RCW as before, which still has an internal reference count "1".

After line 30: The RCW's reference count dropped to "0", and IUnknown.Release was called on the COM object, which now has a ref count "0". Any further method or cast on "i" and "e" will fail.

Suggested practice: Make a wrapper class which implements IDisposable and which has a private field for the COM object (or more precisely, for its RCW). Never let your clients have direct access to the COM object. It's fine for local variables in your wrapper-class to reference the same RCW if needed. In IDisposable.Dispose, call Marshal.FinalReleaseComObject, set the field to Nothing, and make sure that your wrapper never accesses methods or fields of that field again.
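
The suggested practice can be sketched as a small base class. This is only a sketch under assumptions, not code from the download: ComWrapper and ComObject are hypothetical names, and the RCW is assumed to have been obtained elsewhere (for example an IAudioClient2 from the activation code later in this article).

```vb
Imports System
Imports System.Runtime.InteropServices

' Hypothetical base class (not part of the article's download): it owns one RCW
' and releases it deterministically in Dispose.
Public MustInherit Class ComWrapper(Of T As Class)
    Implements IDisposable

    Private _rcw As T   ' the RCW; never exposed publicly

    Protected Sub New(rcw As T)
        If rcw Is Nothing Then Throw New ArgumentNullException("rcw")
        _rcw = rcw
    End Sub

    ' Derived classes reach the RCW only through this property, so any use
    ' after Dispose fails with ObjectDisposedException instead of the
    ' "separated from its underlying RCW" error.
    Protected ReadOnly Property ComObject As T
        Get
            If _rcw Is Nothing Then Throw New ObjectDisposedException(Me.GetType().Name)
            Return _rcw
        End Get
    End Property

    Public Sub Dispose() Implements IDisposable.Dispose
        Dim rcw = _rcw
        _rcw = Nothing   ' clear the field first, so a second Dispose is a no-op
        If rcw IsNot Nothing AndAlso Marshal.IsComObject(rcw) Then
            Marshal.FinalReleaseComObject(rcw)
        End If
    End Sub
End Class
```

A derived class would take the COM interface in its constructor and expose only the operations its clients need (say, Start and Stop), so that callers can wrap it in a Using block and never touch the RCW themselves.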

Enumerating audio devices, and getting IAudioClient

.NET 4.5: This is the .NET 4.5 code for enumerating audio devices, and getting the IAudioClient for one of them.

VB
Dim pEnum = CType(New MMDeviceEnumerator, IMMDeviceEnumerator)

' Enumerating devices...
Dim pDevices As IMMDeviceCollection = Nothing
pEnum.EnumAudioEndpoints(EDataFlow.eAll,
    DeviceStateFlags.DEVICE_STATE_ACTIVE, pDevices)
Dim pcDevices = 0 : pDevices.GetCount(pcDevices)
For i = 0 To pcDevices - 1
    Dim pDevice As IMMDevice = Nothing : pDevices.Item(i, pDevice)
    Dim pProps As IPropertyStore = Nothing : pDevice.OpenPropertyStore(StgmMode.STGM_READ, pProps)
    Dim varName As PROPVARIANT = Nothing : pProps.GetValue(PKEY_Device_FriendlyName, varName)
    ' CStr(varName.Value) - is the name of the device
    PropVariantClear(varName)
    Runtime.InteropServices.Marshal.FinalReleaseComObject(pProps) : pProps = Nothing
    Runtime.InteropServices.Marshal.FinalReleaseComObject(pDevice)
Next
Runtime.InteropServices.Marshal.FinalReleaseComObject(pDevices) : pDevices = Nothing

' Or, instead of enumerating, we can get the default device directly.
' Use EDataFlow.eRender for speakers, and .eCapture for microphone.
Dim pDefaultDevice As IMMDevice = Nothing
pEnum.GetDefaultAudioEndpoint(EDataFlow.eRender, ERole.eConsole, pDefaultDevice)

' Once we have an IMMDevice, this is how we get the IAudioClient
Dim pAudioClient As IAudioClient2 = Nothing
pDefaultDevice.Activate(IID_IAudioClient2, CLSCTX.CLSCTX_ALL, Nothing, pAudioClient)
...
Runtime.InteropServices.Marshal.FinalReleaseComObject(pAudioClient) : pAudioClient = Nothing
Runtime.InteropServices.Marshal.FinalReleaseComObject(pDefaultDevice) : pDefaultDevice = Nothing
Runtime.InteropServices.Marshal.FinalReleaseComObject(pEnum) : pEnum = Nothing

Windows 8: The above COM objects aren't allowed in Windows app-store apps, so we have to use different techniques. Let's start with the code to enumerate audio devices and get the default one.

VB
' What is the default device for recording audio?
Dim defaultDeviceIdR = Windows.Media.Devices.MediaDevice.GetDefaultAudioCaptureId(
           Windows.Media.Devices.AudioDeviceRole.Default)

' Let's enumerate all the recording devices...
Dim audioSelectorR = Windows.Media.Devices.MediaDevice.GetAudioCaptureSelector()
Dim devicesR = Await Windows.Devices.Enumeration.DeviceInformation.FindAllAsync(
      audioSelectorR, {PKEY_AudioEndpoint_Supports_EventDriven_Mode.ToString()})

For Each device In devicesR
    ' use device.Id, device.Name, ...
Next


' What is the default device for playing audio?
Dim defaultDeviceIdP = Windows.Media.Devices.MediaDevice.GetDefaultAudioRenderId(
        Windows.Media.Devices.AudioDeviceRole.Default)

' Enumerate all the playback devices in the same way...
Dim audioSelectorP = Windows.Media.Devices.MediaDevice.GetAudioRenderSelector()
Dim devicesP = Await Windows.Devices.Enumeration.DeviceInformation.FindAllAsync(
    audioSelectorP, {PKEY_AudioEndpoint_Supports_EventDriven_Mode.ToString()})  

Microphone permission. In Windows 8 we need to get an IAudioClient for a chosen recording/playback device. This isn't straightforward. If you're using the APIs in an app-store app and you want to initialize an IAudioClient for a recording device, then the first time your application tries this, an alert will pop up asking the user for permission for the app to record audio. Windows 8 does this for privacy reasons, so that apps don't surreptitiously eavesdrop on their users. Windows will remember the user's answer and the user won't see that prompt again (unless they uninstall and reinstall the app). If the user changes their mind, they can launch your app and go to Charms > Settings > Permissions. Microsoft has written more detailed "Guidelines for devices that access personal data", including UI guidelines on how to present this to the user. Incidentally, permission is always implicitly granted to desktop apps, and no permission is needed for audio playback.

VB
Dim icbh As New ActivateAudioInterfaceCompletionHandler(
    Sub(pAudioClient As IAudioClient2)
        Dim wfx As New WAVEFORMATEX With {.wFormatTag = 1, .nChannels = 2,
                   .nSamplesPerSec = 44100, .wBitsPerSample = 16, .nBlockAlign = 4,
                   .nAvgBytesPerSec = 44100 * 4, .cbSize = 0}
        pAudioClient.Initialize(
                     AUDCLNT_SHAREMODE.AUDCLNT_SHAREMODE_SHARED,
                     AUDCLNT_FLAGS.AUDCLNT_STREAMFLAGS_EVENTCALLBACK Or
                     AUDCLNT_FLAGS.AUDCLNT_STREAMFLAGS_NOPERSIST,
                     10000000, 0, wfx, Nothing)
    End Sub)

Dim activationOperation As IActivateAudioInterfaceAsyncOperation = Nothing
ActivateAudioInterfaceAsync(defaultDeviceIdR, IID_IAudioClient2, Nothing,
                            icbh, activationOperation)

Try
    Dim pAudioClient = Await icbh 

    ...

    Runtime.InteropServices.Marshal.FinalReleaseComObject(pAudioClient)
    pAudioClient = Nothing 
Catch ex As UnauthorizedAccessException When ex.HResult = -2147024891
    ' OOPS! Can't record. Charms > Settings > Permissions to grant microphone permission
Finally 
    Runtime.InteropServices.Marshal.FinalReleaseComObject(activationOperation)
    activationOperation = Nothing 
End Try   

The way this works is that you construct your own object, in this case icbh, which implements IActivateAudioInterfaceCompletionHandler and IAgileObject. Next you call ActivateAudioInterfaceAsync, passing this object. A short time later, on a different thread, your object's IActivateAudioInterfaceCompletionHandler.ActivateCompleted method will be called. Inside the callback, you get hold of the IAudioClient interface, and then you call IAudioClient.Initialize() on it with your chosen PCM format. If Windows needs to pop up its permissions prompt, then the call to Initialize() will block while the prompt is shown. Afterwards, the call to Initialize() will either succeed (if permission is granted), or fail with an UnauthorizedAccessException (if it isn't). You must call Initialize from inside your callback, otherwise your app will block indefinitely on the Initialize call.

The MSDN docs for ActivateAudioInterfaceAsync say that ActivateAudioInterfaceAsync may display a consent prompt the first time it is called. That's incorrect. It's Initialize() that may display the consent prompt; never ActivateAudioInterfaceAsync. They also say that "In Windows 8, the first use of IAudioClient to access the audio device should be on the STA thread. Calls from an MTA thread may result in undefined behavior." That's incorrect. The first use of IAudioClient.Initialize must be inside your IActivateAudioInterfaceCompletionHandler.ActivateCompleted handler, which will have been invoked on a background thread by ActivateAudioInterfaceAsync.
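
A note on the 10000000 passed to Initialize as hnsBufferDuration in the code above: REFERENCE_TIME values count 100-nanosecond units, so that argument requests a one-second buffer. The following sketch shows the conversion to frames. HnsToFrames is a hypothetical helper, not part of the download, and the buffer the device actually allocates may differ from the request, which is why the later code asks GetBufferSize for the real size.

```vb
Imports System

Module BufferDurationSketch
    ' One second expressed in 100-nanosecond REFERENCE_TIME units.
    Const HnsPerSecond As Long = 10000000

    ' Hypothetical helper: converts a REFERENCE_TIME duration to a frame count
    ' at the given sample rate (rounding down).
    Function HnsToFrames(hns As Long, samplesPerSec As Integer) As Long
        Return (hns * samplesPerSec) \ HnsPerSecond
    End Function

    Sub Main()
        Console.WriteLine(HnsToFrames(10000000, 44100)) ' 44100 frames = 1 second
        Console.WriteLine(HnsToFrames(200000, 44100))   ' 882 frames = 20 ms
    End Sub
End Module
```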

The above code requires you to implement this icbh object yourself. Here's my implementation. I made it implement the "awaiter pattern" with a method called GetAwaiter: this lets us simply Await the icbh, as in the above code. This code shows a typical use of TaskCompletionSource, to turn an API that uses callbacks into a friendlier one that you can await.

VB
Class ActivateAudioInterfaceCompletionHandler
    Implements IActivateAudioInterfaceCompletionHandler, IAgileObject

    Private InitializeAction As Action(Of IAudioClient2)
    Private tcs As New TaskCompletionSource(Of IAudioClient2)

    Sub New(InitializeAction As Action(Of IAudioClient2))
        Me.InitializeAction = InitializeAction
    End Sub

    Public Sub ActivateCompleted(activateOperation As IActivateAudioInterfaceAsyncOperation) _
               Implements IActivateAudioInterfaceCompletionHandler.ActivateCompleted
        ' First get the activation results, and see if anything bad happened then
        Dim hr As Integer = 0, unk As Object = Nothing : activateOperation.GetActivateResult(hr, unk)
        If hr <> 0 Then
            tcs.TrySetException(Runtime.InteropServices.Marshal.GetExceptionForHR(hr, New IntPtr(-1)))
            Return
        End If

        Dim pAudioClient = CType(unk, IAudioClient2)

        ' Next try to call the client's (synchronous, blocking) initialization method.
        Try
            InitializeAction(pAudioClient)
            tcs.SetResult(pAudioClient)
        Catch ex As Exception
            tcs.TrySetException(ex)
        End Try
    End Sub

    Public Function GetAwaiter() As Runtime.CompilerServices.TaskAwaiter(Of IAudioClient2)
        Return tcs.Task.GetAwaiter()
    End Function
End Class

The MSDN docs for IActivateAudioInterfaceCompletionHandler say that the object must be "an agile object (aggregating a free-threaded marshaler)". That's incorrect. The object's IMarshal interface is never even retrieved. All that's required is that it implements IAgileObject.

Picking the audio format

In the above code, as an argument to the IAudioClient.Initialize method, I went straight for CD-quality audio (stereo, 16 bits per sample, 44100Hz). You can only pick formats that the device supports natively, and many devices (including the Surface) don't even support this format...

There are some other ways you can pick an audio format. I generally don't like them, because they return WAVEFORMATEX structures with a bunch of extra OS-specific data at the end. That means you have to keep the IntPtr that's given to you, if you want to pass it to Initialize(), and you have to Marshal.FreeCoTaskMem on it at the end. (Alternatively: what NAudio does is more elegant: it defines its own custom marshaller which is able to marshal in and out that extra data).

VB
' Easiest way to pick an audio format for testing:
Dim wfx As New WAVEFORMATEX With {.wFormatTag = 1, .nChannels = 2, .nSamplesPerSec = 44100,
                                  .wBitsPerSample = 16, .nBlockAlign = 4,
                                  .nAvgBytesPerSec = 44100 * 4, .cbSize = 0}


' Another way: you could get the preferred audio format of the device
Dim pwfx_default As IntPtr = Nothing : pAudioClient.GetMixFormat(pwfx_default) 
Dim wfx_default = CType(Runtime.InteropServices.Marshal.PtrToStructure(pwfx_default, GetType(WAVEFORMATEX)), WAVEFORMATEX)
If pwfx_default <> Nothing Then Runtime.InteropServices.Marshal.FreeCoTaskMem(pwfx_default) : pwfx_default = Nothing


' Or you could pass in your own requested format, and if it's not supported "as is",
' then it'll suggest a closest match. (If it's okay as-is, then pwfx_suggested = Nothing)
Dim pwfx_suggested As IntPtr = Nothing : pAudioClient.IsFormatSupported(AUDCLNT_SHAREMODE.AUDCLNT_SHAREMODE_SHARED, wfx, pwfx_suggested)
If pwfx_suggested <> Nothing Then
    Dim wfx_suggested = CType(Runtime.InteropServices.Marshal.PtrToStructure(pwfx_suggested, GetType(WAVEFORMATEX)), WAVEFORMATEX)
    Runtime.InteropServices.Marshal.FreeCoTaskMem(pwfx_suggested)
End If
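
The hard-coded values .nBlockAlign = 4 and .nAvgBytesPerSec = 44100 * 4 in the first format above follow from the other fields. As a sketch (BlockAlign and AvgBytesPerSec are hypothetical helper names, and this applies to plain PCM only), the arithmetic is:

```vb
Imports System

Module PcmFormatSketch
    ' Bytes per frame: one sample for every channel.
    Function BlockAlign(channels As Integer, bitsPerSample As Integer) As Integer
        Return (channels * bitsPerSample) \ 8
    End Function

    ' Bytes per second: frames per second times bytes per frame.
    Function AvgBytesPerSec(samplesPerSec As Integer, channels As Integer,
                            bitsPerSample As Integer) As Integer
        Return samplesPerSec * BlockAlign(channels, bitsPerSample)
    End Function

    Sub Main()
        Console.WriteLine(BlockAlign(2, 16))            ' 4
        Console.WriteLine(AvgBytesPerSec(44100, 2, 16)) ' 176400
    End Sub
End Module
```

Deriving the two fields this way keeps them consistent if you change the channel count or sample rate while experimenting with formats.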

Recording audio

Here's the code to record audio. In this case I have already allocated a buffer "buf" large enough to hold 10 seconds of audio, and I merely copy into that. You might instead want to work with smaller buffers that you re-use.

VB
Dim hEvent = CreateEventEx(Nothing, Nothing, 0, EventAccess.EVENT_ALL_ACCESS)
pAudioClient.SetEventHandle(hEvent)
Dim bufferFrameCount As Integer = 0 : pAudioClient.GetBufferSize(bufferFrameCount)
Dim ppv As Object = Nothing : pAudioClient.GetService(IID_IAudioCaptureClient, ppv)
Dim pCaptureClient = CType(ppv, IAudioCaptureClient)
Dim buf = New Short(44100 * 10 * 2 - 1) {} ' ten seconds of stereo audio (VB takes the upper bound, so 882000 elements)
Dim nFrames = 0
pAudioClient.Start()

While True
    Await WaitForSingleObjectAsync(hEvent)
    Dim pData As IntPtr, NumFramesToRead As Integer = 0, dwFlags As Integer = 0
    pCaptureClient.GetBuffer(pData, NumFramesToRead, dwFlags, Nothing, Nothing)
    Dim nFramesToCopy = Math.Min(NumFramesToRead, buf.Length \ 2 - nFrames)
    Runtime.InteropServices.Marshal.Copy(pData, buf, nFrames * 2, nFramesToCopy * 2)
    pCaptureClient.ReleaseBuffer(NumFramesToRead)
    nFrames += nFramesToCopy
    If nFrames >= buf.Length \ 2 Then Exit While
End While

pAudioClient.Stop()
Runtime.InteropServices.Marshal.FinalReleaseComObject(pCaptureClient)
pCaptureClient = Nothing
CloseHandle(hEvent) : hEvent = Nothing

The code is event-based. It works with a Win32 event handle, obtained with the Win32 function CreateEventEx. Every time a new buffer of audio data is available, the event gets set. Because I passed "0" as a flag to CreateEventEx, it was an auto-reset event, i.e., it gets reset every time I successfully wait for it. Here's the small helper function WaitForSingleObjectAsync which lets me use a nice Await syntax:

VB
Function WaitForSingleObjectAsync(hEvent As IntPtr) As Task
    Return Task.Run(Sub()
                        Dim r = WaitForSingleObjectEx(hEvent, &HFFFFFFFF, True)
                        If r <> 0 Then Throw New Exception("Unexpected event")
                    End Sub)
End Function
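
To illustrate the auto-reset behaviour described above, here is a small sketch that uses the managed System.Threading.AutoResetEvent instead of the raw Win32 handle. It is for illustration only; the article's code keeps the Win32 handle because IAudioClient.SetEventHandle needs one.

```vb
Imports System
Imports System.Threading

Module AutoResetSketch
    Sub Main()
        Using ev As New AutoResetEvent(False)
            ev.Set()                          ' signal once, as WASAPI does per buffer
            Console.WriteLine(ev.WaitOne(0))  ' True: the wait succeeds and resets the event
            Console.WriteLine(ev.WaitOne(0))  ' False: already reset, nothing new is signalled
        End Using
    End Sub
End Module
```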

Playing audio

Here's the code to play audio. Again, I already started with a buffer "buf" for all ten seconds of my waveform audio, and I play it all. You might wish to work with smaller buffers that you re-use.

VB
Dim hEvent = CreateEventEx(Nothing, Nothing, 0, EventAccess.EVENT_ALL_ACCESS)
pAudioClient.SetEventHandle(hEvent)
Dim bufferFrameCount As Integer = 0 : pAudioClient.GetBufferSize(bufferFrameCount)
Dim ppv As Object = Nothing : pAudioClient.GetService(IID_IAudioRenderClient, ppv)
Dim pRenderClient = CType(ppv, IAudioRenderClient)
Dim nFrame = 0
pAudioClient.Start()

While True
    Await WaitForSingleObjectAsync(hEvent)
    Dim numFramesPadding = 0 : pAudioClient.GetCurrentPadding(numFramesPadding)
    Dim numFramesAvailable = bufferFrameCount - numFramesPadding
    If numFramesAvailable = 0 Then Continue While
    Dim numFramesToCopy = Math.Min(numFramesAvailable, buf.Length \ 2 - nFrame)
    Dim pData As IntPtr = Nothing : pRenderClient.GetBuffer(numFramesToCopy, pData)
    Runtime.InteropServices.Marshal.Copy(buf, nFrame * 2, pData, numFramesToCopy * 2)
    pRenderClient.ReleaseBuffer(numFramesToCopy, 0)
    nFrame += numFramesToCopy
    If nFrame >= buf.Length \ 2 Then Exit While
End While

' and wait until the buffer plays out to the end...
While True
    Dim numFramesPadding = 0 : pAudioClient.GetCurrentPadding(numFramesPadding)
    If numFramesPadding = 0 Then Exit While
    Await Task.Delay(20)
End While

pAudioClient.Stop()
Runtime.InteropServices.Marshal.FinalReleaseComObject(pRenderClient) : pRenderClient = Nothing
CloseHandle(hEvent) : hEvent = Nothing
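
The frame arithmetic in the playback loop can be isolated into a pure function, which makes it easier to check by hand. This is a sketch only; FramesToWrite is a hypothetical name and the numbers in Main are made up for illustration.

```vb
Imports System

Module RenderMathSketch
    ' How many frames to hand to GetBuffer on this pass:
    ' the free space in the device buffer, capped by what is left to play.
    Function FramesToWrite(bufferFrameCount As Integer, padding As Integer,
                           totalFrames As Integer, framesWritten As Integer) As Integer
        Dim freeFrames = bufferFrameCount - padding
        Return Math.Min(freeFrames, totalFrames - framesWritten)
    End Function

    Sub Main()
        ' Buffer mostly full: only the free space can be written.
        Console.WriteLine(FramesToWrite(44100, 40000, 441000, 0))      ' 4100
        ' Near the end of the clip: only the remaining frames are written.
        Console.WriteLine(FramesToWrite(44100, 0, 441000, 440000))     ' 1000
    End Sub
End Module
```

With 16-bit stereo audio each frame is two Short values, which is why the loop multiplies frame counts by 2 when indexing into buf.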

P/Invoke interop libraries for WASAPI and IAudioClient

All that's left is a huge pinvoke interop library. It took me several days to piece all this together. I'm not a pinvoke expert by any means. I bet there are bugs in the definitions, and I'm sure they don't embody best-practice.

VB
Module Interop
    <Runtime.InteropServices.DllImport("Mmdevapi.dll", ExactSpelling:=True, PreserveSig:=False)>
    Public Sub ActivateAudioInterfaceAsync(<Runtime.InteropServices.MarshalAs( _
           Runtime.InteropServices.UnmanagedType.LPWStr)> deviceInterfacePath As String, _
           <Runtime.InteropServices.MarshalAs(Runtime.InteropServices.UnmanagedType.LPStruct)> _
           riid As Guid, activationParams As IntPtr, completionHandler As _
           IActivateAudioInterfaceCompletionHandler, _
           ByRef activationOperation As IActivateAudioInterfaceAsyncOperation)
    End Sub

    <Runtime.InteropServices.DllImport("ole32.dll", _
                ExactSpelling:=True, PreserveSig:=False)>
    Public Sub PropVariantClear(ByRef pvar As PROPVARIANT)
    End Sub

    <Runtime.InteropServices.DllImport("kernel32.dll", _
           CharSet:=Runtime.InteropServices.CharSet.Unicode, _
           ExactSpelling:=False, PreserveSig:=True, SetLastError:=True)>
    Public Function CreateEventEx(lpEventAttributes As IntPtr, lpName As IntPtr, _
           dwFlags As Integer, dwDesiredAccess As EventAccess) As IntPtr
    End Function

    <Runtime.InteropServices.DllImport("kernel32.dll", _
             ExactSpelling:=True, PreserveSig:=True, SetLastError:=True)>
    Public Function CloseHandle(hObject As IntPtr) As Boolean
    End Function

    <Runtime.InteropServices.DllImport("kernel32", _
          ExactSpelling:=True, PreserveSig:=True, SetLastError:=True)>
    Function WaitForSingleObjectEx(hEvent As IntPtr, milliseconds _
          As Integer, bAlertable As Boolean) As Integer
    End Function


    Public ReadOnly PKEY_Device_FriendlyName As New PROPERTYKEY With _
           {.fmtid = New Guid("a45c254e-df1c-4efd-8020-67d146a850e0"), .pid = 14}
    Public ReadOnly PKEY_Device_DeviceDesc As New PROPERTYKEY With _
           {.fmtid = New Guid("a45c254e-df1c-4efd-8020-67d146a850e0"), .pid = 2}
    Public ReadOnly PKEY_AudioEndpoint_Supports_EventDriven_Mode As New _
           PROPERTYKEY With {.fmtid = New Guid("1da5d803-d492-4edd-8c23-e0c0ffee7f0e"), .pid = 7}
    Public ReadOnly IID_IAudioClient As New Guid("1CB9AD4C-DBFA-4c32-B178-C2F568A703B2")
    Public ReadOnly IID_IAudioClient2 As New Guid("726778CD-F60A-4eda-82DE-E47610CD78AA")
    Public ReadOnly IID_IAudioRenderClient As New Guid("F294ACFC-3146-4483-A7BF-ADDCA7C260E2")
    Public ReadOnly IID_IAudioCaptureClient As New Guid("C8ADBD64-E71E-48a0-A4DE-185C395CD317")


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.Guid( _
       "BCDE0395-E52F-467C-8E3D-C4579291692E")> Class MMDeviceEnumerator
    End Class


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
       Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
       Runtime.InteropServices.Guid("A95664D2-9614-4F35-A746-DE8DB63617E6")>
    Public Interface IMMDeviceEnumerator
        Sub EnumAudioEndpoints(dataflow As EDataFlow, dwStateMask _
            As DeviceStateFlags, ByRef ppDevices As IMMDeviceCollection)
        Sub GetDefaultAudioEndpoint(dataflow As EDataFlow, role As ERole, ByRef ppDevice As IMMDevice)
        Sub GetDevice(<Runtime.InteropServices.MarshalAs( _
            Runtime.InteropServices.UnmanagedType.LPWStr)> pwstrId As String, ByRef ppDevice As IntPtr)
        Sub RegisterEndpointNotificationCallback(pClient As IntPtr)
        Sub UnregisterEndpointNotificationCallback(pClient As IntPtr)
    End Interface


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
       Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
       Runtime.InteropServices.Guid("0BD7A1BE-7A1A-44DB-8397-CC5392387B5E")>
    Public Interface IMMDeviceCollection
        Sub GetCount(ByRef pcDevices As Integer)
        Sub Item(nDevice As Integer, ByRef ppDevice As IMMDevice)
    End Interface


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
       Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
       Runtime.InteropServices.Guid("D666063F-1587-4E43-81F1-B948E807363F")>
    Public Interface IMMDevice
        Sub Activate(<Runtime.InteropServices.MarshalAs( _
            Runtime.InteropServices.UnmanagedType.LPStruct)> iid As Guid, _
            dwClsCtx As CLSCTX, pActivationParams As IntPtr, ByRef ppInterface As IAudioClient2)
        Sub OpenPropertyStore(stgmAccess As Integer, ByRef ppProperties As IPropertyStore)
        Sub GetId(<Runtime.InteropServices.MarshalAs( _
            Runtime.InteropServices.UnmanagedType.LPWStr)> ByRef ppstrId As String)
        Sub GetState(ByRef pdwState As Integer)
    End Interface


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
          Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
          Runtime.InteropServices.Guid("886d8eeb-8cf2-4446-8d02-cdba1dbdcf99")>
    Public Interface IPropertyStore
        'virtual HRESULT STDMETHODCALLTYPE GetCount(/*[out]*/ __RPC__out DWORD *cProps)
        Sub GetCount(ByRef cProps As Integer)
        'virtual HRESULT STDMETHODCALLTYPE GetAt(/*[in]*/
        '   DWORD iProp, /*[out]*/ __RPC__out PROPERTYKEY *pkey)
        Sub GetAt(iProp As Integer, ByRef pkey As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE GetValue(/*[in]*/
        '    __RPC__in REFPROPERTYKEY key, /*[out]*/ __RPC__out PROPVARIANT *pv)
        Sub GetValue(ByRef key As PROPERTYKEY, ByRef pv As PROPVARIANT)
        'virtual HRESULT STDMETHODCALLTYPE SetValue(/*[in]*/
        '  __RPC__in REFPROPERTYKEY key, /*[in]*/ __RPC__in REFPROPVARIANT propvar)
        Sub SetValue(ByRef key As PROPERTYKEY, ByRef propvar As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE Commit()
        Sub Commit()
    End Interface


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
       Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
       Runtime.InteropServices.Guid("1CB9AD4C-DBFA-4c32-B178-C2F568A703B2")>
    Public Interface IAudioClient
        Sub Initialize(ShareMode As AUDCLNT_SHAREMODE, StreamFlags As AUDCLNT_FLAGS, _
            hnsBufferDuration As Long, hnsPeriodicity As Long, ByRef _
            pFormat As WAVEFORMATEX, AudioSessionGuid As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE GetBufferSize(/*[out]*/ _Out_  UINT32 *pNumBufferFrames) = 0;
        Sub GetBufferSize(ByRef pNumBufferFrames As Integer)
        'virtual HRESULT STDMETHODCALLTYPE GetStreamLatency(/*[out]*/ _Out_  REFERENCE_TIME *phnsLatency) = 0;
        Sub GetStreamLatency(ByRef phnsLatency As Long)
        'virtual HRESULT STDMETHODCALLTYPE GetCurrentPadding(/*[out]*/ _Out_  UINT32 *pNumPaddingFrames) = 0;
        Sub GetCurrentPadding(ByRef pNumPaddingFrames As Integer)
        'virtual HRESULT STDMETHODCALLTYPE IsFormatSupported(/*[in]*/ _In_  
        '   AUDCLNT_SHAREMODE ShareMode, /*[in]*/ _In_  const WAVEFORMATEX *pFormat, 
        '   /*[unique][out]*/ _Out_opt_  WAVEFORMATEX **ppClosestMatch) = 0;
        Sub IsFormatSupported(ShareMode As AUDCLNT_SHAREMODE, ByRef pFormat _
                              As WAVEFORMATEX, ByRef ppClosestMatch As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE GetMixFormat(/*[out]*/ _Out_  WAVEFORMATEX **ppDeviceFormat) = 0;
        Sub GetMixFormat(ByRef ppDeviceFormat As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE GetDevicePeriod(/*[out]*/ _Out_opt_  
        '    REFERENCE_TIME *phnsDefaultDevicePeriod, /*[out]*/ 
        '    _Out_opt_  REFERENCE_TIME *phnsMinimumDevicePeriod) = 0;
        Sub GetDevicePeriod(ByRef phnsDefaultDevicePeriod As Long, ByRef phnsMinimumDevicePeriod As Long)
        'virtual HRESULT STDMETHODCALLTYPE Start( void) = 0;
        Sub Start()
        'virtual HRESULT STDMETHODCALLTYPE Stop( void) = 0;
        Sub [Stop]()
        'virtual HRESULT STDMETHODCALLTYPE Reset( void) = 0;
        Sub Reset()
        'virtual HRESULT STDMETHODCALLTYPE SetEventHandle(/*[in]*/ HANDLE eventHandle) = 0;
        Sub SetEventHandle(eventHandle As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE GetService(/*[in]*/ _In_  REFIID riid, /*[iid_is][out]*/ _Out_  void **ppv) = 0;
        Sub GetService(<Runtime.InteropServices.MarshalAs( _
            Runtime.InteropServices.UnmanagedType.LPStruct)> riid As Guid, _
            <Runtime.InteropServices.MarshalAs(Runtime.InteropServices.UnmanagedType.IUnknown)> ByRef ppv As Object)
    End Interface


    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
                     Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
                     Runtime.InteropServices.Guid("726778CD-F60A-4eda-82DE-E47610CD78AA")>
    Public Interface IAudioClient2
        Sub Initialize(ShareMode As AUDCLNT_SHAREMODE, StreamFlags As AUDCLNT_FLAGS, _
            hnsBufferDuration As Long, hnsPeriodicity As Long, _
            ByRef pFormat As WAVEFORMATEX, AudioSessionGuid As IntPtr)
        Sub GetBufferSize(ByRef pNumBufferFrames As Integer)
        Sub GetStreamLatency(ByRef phnsLatency As Long)
        Sub GetCurrentPadding(ByRef pNumPaddingFrames As Integer)
        Sub IsFormatSupported(ShareMode As AUDCLNT_SHAREMODE, _
            ByRef pFormat As WAVEFORMATEX, ByRef ppClosestMatch As IntPtr)
        Sub GetMixFormat(ByRef ppDeviceFormat As IntPtr)
        Sub GetDevicePeriod(ByRef phnsDefaultDevicePeriod As Long, ByRef phnsMinimumDevicePeriod As Long)
        Sub Start()
        Sub [Stop]()
        Sub Reset()
        Sub SetEventHandle(eventHandle As IntPtr)
        Sub GetService(<Runtime.InteropServices.MarshalAs( _
            Runtime.InteropServices.UnmanagedType.LPStruct)> riid As Guid, _
            <Runtime.InteropServices.MarshalAs(Runtime.InteropServices.UnmanagedType.IUnknown)> ByRef ppv As Object)
        'virtual HRESULT STDMETHODCALLTYPE IsOffloadCapable(/*[in]*/ _In_  
        '   AUDIO_STREAM_CATEGORY Category, /*[in]*/ _Out_  BOOL *pbOffloadCapable) = 0;
        Sub IsOffloadCapable(Category As Integer, ByRef pbOffloadCapable As Boolean)
        'virtual HRESULT STDMETHODCALLTYPE SetClientProperties(/*[in]*/ _In_  
        '  const AudioClientProperties *pProperties) = 0;
        Sub SetClientProperties(pProperties As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE GetBufferSizeLimits(/*[in]*/ _In_  
        '   const WAVEFORMATEX *pFormat, /*[in]*/ _In_  BOOL bEventDriven, /*[in]*/ 
        '  _Out_  REFERENCE_TIME *phnsMinBufferDuration, /*[in]*/ _Out_  
        '  REFERENCE_TIME *phnsMaxBufferDuration) = 0;
        Sub GetBufferSizeLimits(pFormat As IntPtr, bEventDriven As Boolean, _
                 phnsMinBufferDuration As IntPtr, phnsMaxBufferDuration As IntPtr)
    End Interface

    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType( _
        Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
        Runtime.InteropServices.Guid("F294ACFC-3146-4483-A7BF-ADDCA7C260E2")>
    Public Interface IAudioRenderClient
        'virtual HRESULT STDMETHODCALLTYPE GetBuffer(/*[in]*/ _In_  UINT32 NumFramesRequested,
        '   /*[out]*/ _Outptr_result_buffer_(_Inexpressible_(
        '  "NumFramesRequested * pFormat->nBlockAlign"))  BYTE **ppData) = 0;
        Sub GetBuffer(NumFramesRequested As Integer, ByRef ppData As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE ReleaseBuffer(/*[in]*/ _In_  
        '   UINT32 NumFramesWritten, /*[in]*/ _In_  DWORD dwFlags) = 0;
        Sub ReleaseBuffer(NumFramesWritten As Integer, dwFlags As Integer)
    End Interface
    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType(_
        Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
        Runtime.InteropServices.Guid("C8ADBD64-E71E-48a0-A4DE-185C395CD317")>
    Public Interface IAudioCaptureClient
        'virtual HRESULT STDMETHODCALLTYPE GetBuffer(/*[out]*/ _Outptr_result_buffer_(
        '   _Inexpressible_("*pNumFramesToRead * pFormat->nBlockAlign"))  
        '   BYTE **ppData, /*[out]*/ _Out_  UINT32 *pNumFramesToRead, /*[out]*/_Out_ 
        '   DWORD *pdwFlags, /*[out]*/_Out_opt_  UINT64 *pu64DevicePosition, 
        '   /*[out]*/_Out_opt_  UINT64 *pu64QPCPosition) = 0;
        Sub GetBuffer(ByRef ppData As IntPtr, ByRef pNumFramesToRead As Integer, _
               ByRef pdwFlags As Integer, pu64DevicePosition As IntPtr, pu64QPCPosition As IntPtr)
        'virtual HRESULT STDMETHODCALLTYPE ReleaseBuffer(/*[in]*/ _In_  UINT32 NumFramesRead) = 0;
        Sub ReleaseBuffer(NumFramesRead As Integer)
        'virtual HRESULT STDMETHODCALLTYPE GetNextPacketSize(
        '       /*[out]*/ _Out_  UINT32 *pNumFramesInNextPacket) = 0;
        Sub GetNextPacketSize(ByRef pNumFramesInNextPacket As Integer)
    End Interface
    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType(_
      Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
      Runtime.InteropServices.Guid("41D949AB-9862-444A-80F6-C261334DA5EB")>
    Public Interface IActivateAudioInterfaceCompletionHandler
        'virtual HRESULT STDMETHODCALLTYPE ActivateCompleted(/*[in]*/ _In_  
        '   IActivateAudioInterfaceAsyncOperation *activateOperation) = 0;
        Sub ActivateCompleted(activateOperation As IActivateAudioInterfaceAsyncOperation)
    End Interface
    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType(_
       Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
       Runtime.InteropServices.Guid("72A22D78-CDE4-431D-B8CC-843A71199B6D")>
    Public Interface IActivateAudioInterfaceAsyncOperation
        'virtual HRESULT STDMETHODCALLTYPE GetActivateResult(/*[out]*/ _Out_  
        '  HRESULT *activateResult, /*[out]*/ _Outptr_result_maybenull_  IUnknown **activatedInterface) = 0;
        Sub GetActivateResult(ByRef activateResult As Integer, _
            <Runtime.InteropServices.MarshalAs(Runtime.InteropServices.UnmanagedType.IUnknown)> _
            ByRef activateInterface As Object)
    End Interface

    <Runtime.InteropServices.ComImport, Runtime.InteropServices.InterfaceType(_
       Runtime.InteropServices.ComInterfaceType.InterfaceIsIUnknown), _
       Runtime.InteropServices.Guid("94ea2b94-e9cc-49e0-c0ff-ee64ca8f5b90")>
    Public Interface IAgileObject
    End Interface

    <Runtime.InteropServices.StructLayout(Runtime.InteropServices.LayoutKind.Sequential, _
               Pack:=1)> Structure WAVEFORMATEX
        Dim wFormatTag As Short
        Dim nChannels As Short
        Dim nSamplesPerSec As Integer
        Dim nAvgBytesPerSec As Integer
        Dim nBlockAlign As Short
        Dim wBitsPerSample As Short
        Dim cbSize As Short
    End Structure


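    ' The native PROPVARIANT is 16 bytes in a 32-bit process and 24 bytes in a 64-bit
    ' process: an 8-byte header followed by a union that is 8 or 16 bytes respectively.
    <Runtime.InteropServices.StructLayout(Runtime.InteropServices.LayoutKind.Sequential, _
               Pack:=1)> Public Structure PROPVARIANT
        Dim vt As UShort
        Dim wReserved1 As UShort
        Dim wReserved2 As UShort
        Dim wReserved3 As UShort
        Dim p As IntPtr
        Dim p2 As IntPtr ' pointer-sized (was Integer) so the 64-bit size is 24 bytes, not 20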
        ReadOnly Property Value As Object
            Get
                Select Case vt
                    Case 31 : Return Runtime.InteropServices.Marshal.PtrToStringUni(p) ' VT_LPWSTR
                    Case Else
                        Throw New NotImplementedException
                End Select
            End Get
        End Property
    End Structure

    <Runtime.InteropServices.StructLayout(Runtime.InteropServices.LayoutKind.Sequential, _
                Pack:=1)> Public Structure PROPERTYKEY
        <Runtime.InteropServices.MarshalAs(Runtime.InteropServices.UnmanagedType.Struct)> Dim fmtid As Guid
        Dim pid As Integer
        Public Overrides Function ToString() As String
            Return "{" & fmtid.ToString() & "} " & pid.ToString()
        End Function
    End Structure

    Enum EDataFlow
        eRender = 0
        eCapture = 1
        eAll = 2
        EDataFlow_enum_count = 3
    End Enum

    Enum ERole
        eConsole = 0
        eMultimedia = 1
        eCommunications = 2
        ERole_enum_count = 3
    End Enum


    Enum StgmMode
        STGM_READ = 0
        STGM_WRITE = 1
        STGM_READWRITE = 2
    End Enum

    Enum AUDCLNT_SHAREMODE
        AUDCLNT_SHAREMODE_SHARED = 0
        AUDCLNT_SHAREMODE_EXCLUSIVE = 1
    End Enum

    <Flags> Enum DeviceStateFlags
        DEVICE_STATE_ACTIVE = 1
        DEVICE_STATE_DISABLED = 2
        DEVICE_STATE_NOTPRESENT = 4
        DEVICE_STATE_UNPLUGGED = 8
        DEVICE_STATEMASK_ALL = 15
    End Enum

    <Flags> Enum AUDCLNT_FLAGS
        AUDCLNT_STREAMFLAGS_CROSSPROCESS = &H10000
        AUDCLNT_STREAMFLAGS_LOOPBACK = &H20000
        AUDCLNT_STREAMFLAGS_EVENTCALLBACK = &H40000
        AUDCLNT_STREAMFLAGS_NOPERSIST = &H80000
        AUDCLNT_STREAMFLAGS_RATEADJUST = &H100000
        AUDCLNT_SESSIONFLAGS_EXPIREWHENUNOWNED = &H10000000
        AUDCLNT_SESSIONFLAGS_DISPLAY_HIDE = &H20000000
        AUDCLNT_SESSIONFLAGS_DISPLAY_HIDEWHENEXPIRED = &H40000000
    End Enum

    <Flags> Enum EventAccess
        STANDARD_RIGHTS_REQUIRED = &HF0000
        SYNCHRONIZE = &H100000
        EVENT_ALL_ACCESS = STANDARD_RIGHTS_REQUIRED Or SYNCHRONIZE Or &H3
    End Enum

    <Flags> Enum CLSCTX
        CLSCTX_INPROC_SERVER = 1
        CLSCTX_INPROC_HANDLER = 2
        CLSCTX_LOCAL_SERVER = 4
        CLSCTX_REMOTE_SERVER = 16
        CLSCTX_ALL = CLSCTX_INPROC_SERVER Or CLSCTX_INPROC_HANDLER Or _
                     CLSCTX_LOCAL_SERVER Or CLSCTX_REMOTE_SERVER
    End Enum
End Module
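
Three of the WAVEFORMATEX fields are derived from the others: nBlockAlign is the size of one frame in bytes, and nAvgBytesPerSec is the frame size times the sample rate. The following is a minimal sketch of filling the structure for uncompressed PCM. The module name WaveFormatSketch and the helper MakePcmFormat are hypothetical names used only for illustration, and the structure is repeated so that the snippet compiles on its own.

```vb
Imports System.Runtime.InteropServices

Module WaveFormatSketch
    ' Same layout as the WAVEFORMATEX declaration in the listing above
    <StructLayout(LayoutKind.Sequential, Pack:=1)>
    Structure WAVEFORMATEX
        Dim wFormatTag As Short
        Dim nChannels As Short
        Dim nSamplesPerSec As Integer
        Dim nAvgBytesPerSec As Integer
        Dim nBlockAlign As Short
        Dim wBitsPerSample As Short
        Dim cbSize As Short
    End Structure

    Const WAVE_FORMAT_PCM As Short = 1

    ' Hypothetical helper: fills in the derived fields for integer PCM
    Function MakePcmFormat(sampleRate As Integer, channels As Short,
                           bitsPerSample As Short) As WAVEFORMATEX
        Dim fmt As WAVEFORMATEX
        fmt.wFormatTag = WAVE_FORMAT_PCM
        fmt.nChannels = channels
        fmt.nSamplesPerSec = sampleRate
        fmt.wBitsPerSample = bitsPerSample
        ' One frame holds one sample for every channel
        fmt.nBlockAlign = CShort((channels * bitsPerSample) \ 8)
        fmt.nAvgBytesPerSec = sampleRate * fmt.nBlockAlign
        fmt.cbSize = 0 ' no extra format bytes follow plain PCM
        Return fmt
    End Function

    Sub Main()
        Dim fmt = MakePcmFormat(44100, 2, 16)
        Console.WriteLine(fmt.nBlockAlign)                         ' 4
        Console.WriteLine(fmt.nAvgBytesPerSec)                     ' 176400
        Console.WriteLine(Marshal.SizeOf(GetType(WAVEFORMATEX)))   ' 18
    End Sub
End Module
```

The frame size is also what converts between the frame counts used by GetBuffer and ReleaseBuffer and byte counts: a buffer of N frames occupies N * nBlockAlign bytes.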

Notes

Disclaimer: although I work at Microsoft on the VB/C# language team, this article is strictly a personal, amateur effort based on public information and experimentation. Audio is not my professional area of expertise. I wrote the article in my own free time, not as a representative of Microsoft, and neither Microsoft nor I make any claims about its correctness.

License

This article, along with any associated source code and files, is licensed under a Public Domain dedication.