Introduction
As a VB.NET developer, I must admit that it is difficult to find decent DirectSound code on the internet. Most of the examples are either in cryptic C/C++ or in C# (a close relation of VB.NET). The benefit of the latter is that it uses the familiar .NET Framework.
This tutorial takes you through the process of creating a simple utility application which will display and play a WAV file or portions of it as selected from a simple user interface.
For those new to the world of managed DirectSound, I begin with a brief introduction to DirectSound and then move on to the topic of circular buffers. For the already initiated, I delve into the WAV file structure, culminating in a class that parses a WAV file. The use of system timers, which the application needs in order to work, follows. Last but not least, all the code is put together.
I assume that you are already familiar with DirectX and have installed the DirectX SDK on your development platform.
About the Application
“Circular Buffers” is an application developed in VB.NET (VS 2003). It demonstrates the following concepts:
- Playing a WAV file using a DirectSound static buffer
- Reading a WAV file and parsing it in preparation for playing
- Playing a WAV file stream using a DirectSound circular buffer
- Using the system timer to have precise control of timed events
- Visualizing WAV file data as a graph of sound values, with graphics double buffering
- Playing a WAV file data array using a DirectSound circular buffer
- Selecting portions of a WAV file and playing them using a static buffer
- Simple mixing of two sounds
Getting Started with DirectSound
DirectSound is part of the DirectX components and specifically handles playback of sound, including mono, stereo, 3-D sound and multi-channel sound. To begin, load the DirectSound reference into your VB.NET project as shown.
By adding a reference to DirectSound, you expose four basic objects required in this project to play and manipulate sounds:
Object | Purpose |
Microsoft.DirectX.DirectSound.Device | This is the main audio device object required to use DirectSound. |
Microsoft.DirectX.DirectSound.WaveFormat | Holds the required header properties of a WAV. For custom sounds, you must set all the parameters. |
Microsoft.DirectX.DirectSound.SecondaryBuffer | This is the buffer to which we write our sound data before the primary buffer mixes and plays the sound. You can have as many secondary buffers as RAM allows, but only one primary buffer, which is found in the hardware. |
Microsoft.DirectX.DirectSound.BufferDescription | Defines the capabilities of the audio device given the WAV format. 3-D sound, volume control, frequency and panning can be set. |
Minimum Required Code to Use DirectSound
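Private Sub cmdDefault_Click(……) Handles cmdDefault.Click
    Try
        ' Create the audio device and bind it to this window.
        SoundDevice = New Microsoft.DirectX.DirectSound.Device
        SoundDevice.SetCooperativeLevel(Me.Handle, _
            Microsoft.DirectX.DirectSound.CooperativeLevel.Normal)
        ' Load the whole WAV file into a static secondary buffer.
        SbufferOriginal = New Microsoft.DirectX.DirectSound.SecondaryBuffer( _
            SoundFile, SoundDevice)
        ' Play from the start and loop until stopped.
        SbufferOriginal.Play(0, _
            Microsoft.DirectX.DirectSound.BufferPlayFlags.Looping)
    Catch ex As Exception
        ' Report the failure instead of silently swallowing it.
        MsgBox(ex.Message)
    End Try
End Sub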
The code shown above is the minimum code required to play a sound from a WAV file. The steps required are:
- Create a new sound device, then set the cooperative level, passing a valid window handle as one parameter. The available levels are:
  - Priority: Allows the application to set the format of the primary buffer
  - Normal: Restricts all sound output to the default 8-bit format
  - WritePrimary: Allows the application to write to the primary buffer
- Create a secondary sound buffer, passing a valid file name or file stream and the audio device as input.
  - The sound file can only be a valid WAV file.
- Call the Play method of the secondary buffer.
Playing a WAV File using a DirectSound Static Buffer
A static buffer is what its name suggests: its content is loaded once and does not change over time. For continuous play, the secondary buffer's Play method is called with the looping option. The steps outlined above use a static buffer: the contents of the WAV file are loaded into the secondary buffer and played.
A static buffer is best used for small WAV files that do not consume much memory. The procedure cmdDefault_Click in the above code creates and plays a static buffer.
Reading a WAV File and Parsing it in Preparation of Playing
The WAV file format is a subset of Microsoft's RIFF specification for the storage of multimedia files. A RIFF file starts out with a file header followed by a sequence of data chunks. A WAVE file is often just a RIFF file with a single "WAVE" chunk which consists of two sub-chunks -- a "fmt" chunk specifying the data format and a "data" chunk containing the actual sample data.
The figure below depicts the WAV file structure. The class CWAVReader parses the WAV file structure, extracting the header details (WAV format) and the actual sound data.
The code below depicts the constructor, with a series of methods parsing the WAV file structure.
' Constructor to open stream
Sub New(ByVal SoundFilePathName As String)
    mWAVFileName = SoundFilePathName
    mOpen = OpenWAVStream(mWAVFileName)
    If mOpen Then
        mChunkID = ReadChunkID(mWAVStream)
        mChunkSize = ReadChunkSize(mWAVStream)
        mFormatID = ReadFormatID(mWAVStream)
        mSubChunkID = ReadSubChunkID(mWAVStream)
        mSubChunkSize = ReadSubChunkSize(mWAVStream)
        mAudioFormat = ReadAudioFormat(mWAVStream)
        mNumChannels = ReadNumChannels(mWAVStream)
        mSampleRate = ReadSampleRate(mWAVStream)
        mByteRate = ReadByteRate(mWAVStream)
        mBlockAlign = ReadBlockAlign(mWAVStream)
        mBitsPerSample = ReadBitsPerSample(mWAVStream)
        mSubChunkIDTwo = ReadSubChunkIDTwo(mWAVStream)
        mSubChunkSizeTwo = ReadSubChunkSizeTwo(mWAVStream)
        mWaveSoundData = ReadWAVSampleData(mWAVStream)
        mWAVStream.Close()
    End If
End Sub
NOTE: THE ORDER MUST BE MAINTAINED! THE CODE USES A BINARY STREAM READ WHICH ADVANCES THE FILE POINTER.
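To make the field order concrete, here is a minimal sketch that reads the same fields in the same sequence using System.IO.BinaryReader, which reads multi-byte integers as little-endian. It is not the CWAVReader class: the names WavHeader, ReadHeader and BuildSampleHeader are hypothetical, and the sketch assumes a canonical 44-byte PCM header with no extra chunks between "fmt " and "data".

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module WavHeaderSketch
    ' Hypothetical container for the header fields, in file order.
    Public Structure WavHeader
        Public ChunkID As String
        Public ChunkSize As Integer
        Public FormatID As String
        Public SubChunkID As String
        Public SubChunkSize As Integer
        Public AudioFormat As Short
        Public NumChannels As Short
        Public SampleRate As Integer
        Public ByteRate As Integer
        Public BlockAlign As Short
        Public BitsPerSample As Short
        Public SubChunkIDTwo As String
        Public SubChunkSizeTwo As Integer
    End Structure

    ' Four-character tags are plain ASCII stored in reading order.
    Private Function ReadTag(ByVal reader As BinaryReader) As String
        Return Encoding.ASCII.GetString(reader.ReadBytes(4))
    End Function

    ' Each read advances the stream, so the order below must match the file.
    Public Function ReadHeader(ByVal source As Stream) As WavHeader
        Dim reader As New BinaryReader(source)
        Dim h As New WavHeader()
        h.ChunkID = ReadTag(reader)
        h.ChunkSize = reader.ReadInt32()
        h.FormatID = ReadTag(reader)
        h.SubChunkID = ReadTag(reader)
        h.SubChunkSize = reader.ReadInt32()
        h.AudioFormat = reader.ReadInt16()
        h.NumChannels = reader.ReadInt16()
        h.SampleRate = reader.ReadInt32()
        h.ByteRate = reader.ReadInt32()
        h.BlockAlign = reader.ReadInt16()
        h.BitsPerSample = reader.ReadInt16()
        h.SubChunkIDTwo = ReadTag(reader)
        h.SubChunkSizeTwo = reader.ReadInt32()
        Return h
    End Function

    ' Builds an in-memory header (16-bit stereo, 44.1 kHz, no samples) for testing.
    Public Function BuildSampleHeader() As Byte()
        Dim ms As New MemoryStream()
        Dim w As New BinaryWriter(ms)
        w.Write(Encoding.ASCII.GetBytes("RIFF"))
        w.Write(36)
        w.Write(Encoding.ASCII.GetBytes("WAVE"))
        w.Write(Encoding.ASCII.GetBytes("fmt "))
        w.Write(16)
        w.Write(CShort(1))
        w.Write(CShort(2))
        w.Write(44100)
        w.Write(176400)
        w.Write(CShort(4))
        w.Write(CShort(16))
        w.Write(Encoding.ASCII.GetBytes("data"))
        w.Write(0)
        w.Flush()
        Return ms.ToArray()
    End Function

    Sub Main()
        Dim h As WavHeader = ReadHeader(New MemoryStream(BuildSampleHeader()))
        Console.WriteLine(h.ChunkID & " " & h.FormatID & " " & h.SampleRate.ToString())
    End Sub
End Module
```

Swapping any two reads in ReadHeader would silently assign the wrong bytes to both fields, which is why the order matters.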
Parsing the binary stream is not difficult. It involves reading the required number of bytes using the binary reader's ReadBytes method.
The code below reads the chunk ID from a WAV file. Notice that the chunk ID is stored in big-endian order: its characters appear in the file in reading order.
Computer architectures differ in terms of byte ordering. In some, the most significant byte of a value is stored first (left to right), which is referred to as big-endian. In others, the least significant byte is stored first (right to left), which is referred to as little-endian. A notable computer architecture that uses big-endian byte ordering is Sun's Sparc. Intel architecture uses little-endian byte ordering, as does the Compaq Alpha processor.
Private Function ReadChunkID(….) As String
    Dim DataBuffer() As Byte
    Dim DataEncoder As New System.Text.ASCIIEncoding
    ' Read the four ASCII characters of the tag, e.g. "RIFF".
    DataBuffer = WAVIOstreamReader.ReadBytes(4)
    ' ReadBytes returns fewer bytes at the end of the stream, so require all four.
    If DataBuffer.Length = 4 Then
        Return DataEncoder.GetString(DataBuffer, 0, 4)
    Else
        Return ""
    End If
End Function
Since we read the bytes and convert them to text in file order (the array is read from location 0 to location 3), this is a big-endian value.
Little-endian values require a more involved function. The binary stream is read, but the bytes are reversed and padded to ensure correct alignment, then converted to either text or a number. The code below is one such function, which takes a byte array and reverses it to return the little-endian value.
Private Function GetLittleEndianStringValue(..) As String
    Dim ValueString As String = "&h"
    Dim Index As Integer
    If DataBuffer.Length >= 4 Then
        ' Append the bytes in reverse order (most significant first)
        ' and pad each one to two hex digits where the length is 1.
        For Index = 3 To 0 Step -1
            ValueString &= DataBuffer(Index).ToString("X2")
        Next
    Else
        ValueString = "0"
    End If
    Return ValueString
End Function
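As an alternative to building a hex string, the same little-endian value can be assembled numerically with shifts. The sketch below is only an illustration: LittleEndianToInt32 is a hypothetical helper, not part of the article's class.

```vbnet
Imports System

Module LittleEndianSketch
    ' Combines four bytes stored least-significant-first into one Int32.
    Public Function LittleEndianToInt32(ByVal data() As Byte) As Integer
        If data Is Nothing OrElse data.Length < 4 Then
            Throw New ArgumentException("Expected at least 4 bytes.")
        End If
        ' data(0) is the low byte, data(3) the high byte.
        Return CInt(data(0)) Or _
               (CInt(data(1)) << 8) Or _
               (CInt(data(2)) << 16) Or _
               (CInt(data(3)) << 24)
    End Function

    Sub Main()
        ' 44100 is &HAC44, stored in the file as 44 AC 00 00.
        Dim raw() As Byte = {&H44, &HAC, &H0, &H0}
        Console.WriteLine(LittleEndianToInt32(raw))
    End Sub
End Module
```

Reading the bytes 44 AC 00 00 in file order as if they were big-endian would give &H44AC0000 instead, which is the kind of wrong sample rate that reveals an endian mistake.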
After reading the WAV's properties, the final step is to read the entire sound data. The sound is read as a series of Int16 data blocks: the raw bytes are wrapped in a memory stream, and the binary reader's ReadInt16 method is called on it repeatedly. The code below reads the actual sound values and converts the raw bytes to signed Int16 samples.
Public Function GetSoundDataValue() As Int16()
    Dim DataCount As Integer
    Dim tempStream As IO.BinaryReader
    tempStream = New IO.BinaryReader(New IO.MemoryStream(mWaveSoundData))
    tempStream.BaseStream.Position = 0
    ' Two bytes per sample; integer division drops a trailing odd byte.
    Dim SampleCount As Integer = CInt(tempStream.BaseStream.Length \ 2)
    ' VB array bounds are inclusive, so the upper bound is SampleCount - 1.
    Dim tempData(SampleCount - 1) As Int16
    While DataCount < SampleCount
        tempData(DataCount) = tempStream.ReadInt16()
        DataCount += 1
    End While
    tempStream.Close()
    tempStream = Nothing
    Return tempData
End Function
Using the System Timer to have Precise Control on Timed Events
The System.Timers.Timer works in much the same way as the Windows Forms timer, but does not require the Windows message pump. Other than that, the primary difference between server timers and Windows Forms timers is that the event handlers for server timers execute on thread pool threads. This makes it possible to maintain a responsive user interface even if the event handler takes a long time to execute. Another critical difference for audio programming is higher precision, since the timer is designed for use from worker threads and does not wait on the message queue.
The timer class's constructor takes a time interval and a function to call when the interval elapses. The System.Timers.Timer object is set to auto-reset and thus calls the function repeatedly, once per interval. A delegate (a pointer to a function or sub) is used for this: the timer object receives the interval and a delegate of type System.Timers.ElapsedEventHandler to call. This class is used to monitor the sound buffer and top up the data to play, and also to paint the progress of the play bar while music is playing.
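The wiring described above can be sketched in a small console program. This is a minimal illustration under stated assumptions, not the article's timer class: CountTicks and OnElapsed are hypothetical names, and the handler only counts calls where the real application would top up the sound buffer.

```vbnet
Imports System
Imports System.Threading
Imports System.Timers

Module TimerSketch
    Private tickCount As Integer

    ' Runs on a thread pool thread each time the interval elapses.
    Private Sub OnElapsed(ByVal sender As Object, ByVal e As ElapsedEventArgs)
        Interlocked.Increment(tickCount)
    End Sub

    ' Starts an auto-resetting timer, lets it run, and reports how often it fired.
    Public Function CountTicks(ByVal intervalMs As Double, ByVal runMs As Integer) As Integer
        tickCount = 0
        Dim pollTimer As New System.Timers.Timer(intervalMs)
        pollTimer.AutoReset = True
        AddHandler pollTimer.Elapsed, AddressOf OnElapsed
        pollTimer.Start()
        Thread.Sleep(runMs)
        pollTimer.Stop()
        pollTimer.Dispose()
        Return tickCount
    End Function

    Sub Main()
        Console.WriteLine("Ticks in one second: " & CountTicks(75, 1000).ToString())
    End Sub
End Module
```

Because the handler runs off the UI thread, the counter is updated with Interlocked.Increment; shared state touched by a real handler needs similar care.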
Playing a WAV File Stream using a DirectSound Circular Buffer
According to Wikipedia, “A circular buffer or ring buffer is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data streams.” Pictorially, a circular buffer is as shown in the figure below:
The write pointer identifies a location from which we can write sound data. The play pointer identifies the location where the sound buffer will play data from. The red boxes identify a location where data can be written to.
In the first scenario, the write pointer is at a location larger than the play pointer. As the play pointer advances, the write pointer must advance too, and must never lag so far behind that the sound distorts. In the second scenario, the write pointer has wrapped around and is now at a location smaller than the play pointer.
When the write pointer gets to location 7, it has to wrap around. The following code handles the wrap-around of the write pointer and returns the amount of data already played, which is the amount of space into which new data can be written.
Function GetPlayedSize() As Integer
    Dim Pos As Integer
    Pos = SbufferOriginal.PlayPosition
    If Pos < NextWritePos Then
        Return Pos + (SbufferOriginal.Caps.BufferBytes - NextWritePos)
    Else
        Return Pos - NextWritePos
    End If
End Function
The PlayPosition is a property of the secondary sound buffer and returns the position of the play pointer. NextWritePos is an internal pointer used to identify the location to which data is written.
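The same arithmetic can be tried without a sound device. In this sketch, plain integers stand in for PlayPosition, NextWritePos and the buffer size; FreeBytes is a hypothetical name, and the 8-byte buffer is an assumption chosen to match the locations 0 to 7 mentioned above.

```vbnet
Imports System

Module RingSpaceSketch
    ' Returns how many bytes have been played since the last write,
    ' i.e. how many bytes can be overwritten safely.
    Public Function FreeBytes(ByVal playPos As Integer, _
                              ByVal writePos As Integer, _
                              ByVal bufferBytes As Integer) As Integer
        If playPos < writePos Then
            ' Free space runs from writePos to the end, then from 0 to playPos.
            Return playPos + (bufferBytes - writePos)
        Else
            ' Free space is the single stretch between writePos and playPos.
            Return playPos - writePos
        End If
    End Function

    Sub Main()
        Console.WriteLine(FreeBytes(2, 5, 8)) ' write pointer ahead of play pointer
        Console.WriteLine(FreeBytes(6, 1, 8)) ' write pointer has wrapped around
    End Sub
End Module
```

Both calls give 5: in the first case the free region is split across the end of the buffer (locations 5 to 7 and 0 to 1), in the second it is contiguous (locations 1 to 5). Equal pointers give 0, so nothing is written until the play pointer moves.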
As the play pointer advances, we must continuously add data to the circular buffer. In this case, we read the data from a memory stream and write it to the circular buffer. A circular buffer is very useful when there is a large WAV file to play and you intend to play the data in small chunks rather than read all of it into memory. There are two ways of filling the circular buffer: notifications or polling. I have implemented the latter. A system timer is used to continuously fill the circular buffer with data.
At intervals of 75 milliseconds, the timer object calls the function PlayerEventHandler. This function calls other functions that determine the amount of data to write and then write this data from the stream into the secondary sound buffer.
Sub PlayerEventHandler(…)
    If PlayerPosition >= MYwave.SubChunkSizeTwo Then
        StopPlay()
    End If
    If IsArray = False Then
        WriteData(GetPlayedSize())
    Else
        WriteDataArray(GetPlayedSize())
    End If
End Sub
The function GetPlayedSize uses the concept of circular buffers to return the amount of data that can safely be written to the secondary sound buffer. The WriteData function thereafter writes the data to the secondary buffer. The code below demonstrates the functionality required to top up the secondary buffer (circular buffer).
Sub WriteData(ByVal DataSize As Integer)
    Dim Tocopy As Integer
    Tocopy = Math.Min(DataSize, TimeToDataSize(Latency))
    If Tocopy > 0 Then
        If SbufferOriginal.Status.BufferLost Then
            SbufferOriginal.Restore()
        End If
        SbufferOriginal.Write(NextWritePos, DataMemStream, Tocopy, _
            Microsoft.DirectX.DirectSound.LockFlag.None)
        PlayerPosition += Tocopy
        NextWritePos += Tocopy
        If NextWritePos >= SbufferOriginal.Caps.BufferBytes Then
            NextWritePos = NextWritePos - SbufferOriginal.Caps.BufferBytes
        End If
    End If
End Sub
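To see what advancing and wrapping the write pointer means for the data itself, the sketch below performs a wrapped write on a plain byte array standing in for the secondary buffer. WriteWrapped is a hypothetical helper, and it assumes the source is no longer than the ring; it illustrates the pointer bookkeeping only and says nothing about how DirectSound stores the bytes internally.

```vbnet
Imports System

Module RingWriteSketch
    ' Copies source into ring starting at writePos, continuing from index 0
    ' when the end is reached. Returns the new write position.
    ' Assumes source.Length <= ring.Length.
    Public Function WriteWrapped(ByVal ring() As Byte, _
                                 ByVal writePos As Integer, _
                                 ByVal source() As Byte) As Integer
        ' First part: as much as fits before the end of the ring.
        Dim firstPart As Integer = Math.Min(source.Length, ring.Length - writePos)
        Array.Copy(source, 0, ring, writePos, firstPart)
        ' Second part: whatever is left goes to the start of the ring.
        Dim remaining As Integer = source.Length - firstPart
        If remaining > 0 Then
            Array.Copy(source, firstPart, ring, 0, remaining)
        End If
        Return (writePos + source.Length) Mod ring.Length
    End Function

    Sub Main()
        Dim ring(7) As Byte
        Dim chunk() As Byte = {1, 2, 3, 4}
        Dim nextPos As Integer = WriteWrapped(ring, 6, chunk)
        Console.WriteLine("Next write position: " & nextPos.ToString())
    End Sub
End Module
```

Writing four bytes at position 6 of an 8-byte ring fills locations 6, 7, 0 and 1 and leaves the write pointer at 2, the same result as the subtraction at the end of WriteData.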
The System.Timers.Timer object is also used to update the screen with the location of the play pointer relative to the data being played, not relative to the secondary buffer. The function MyPainPoint is called at the same interval of 75 milliseconds, but from a separate timer, to reduce latency and sound distortion.
Sub MyPainPoint(ByVal obj As Object, ByVal Args As System.Timers.ElapsedEventArgs)
    If PlayerPosition >= MYwave.SubChunkSizeTwo Then
        StopPlay()
        lblPos.Text = MYwave.SubChunkSizeTwo.ToString
        lbltime.Text = MYwave.PlayTimeSeconds().ToString
    End If
    Dim XPos As Single = CSng((PlayerPosition / MYwave.SubChunkSizeTwo) * picWave.Width)
    Dim posgraphic As Graphics
    PlayPicture = CType(myPicture.Clone, Bitmap)
    posgraphic = Graphics.FromImage(PlayPicture)
    Dim Mypen As Pen = New Pen(Color.Red)
    posgraphic.DrawLine(Mypen, XPos, 0, XPos, picWave.Height)
    MyTime += TimerStep
    posgraphic.DrawImage(PlayPicture, picWave.Width, picWave.Height)
    lblPos.Text = PlayerPosition.ToString
    lbltime.Text = (MyTime / 1000).ToString
    Me.Invalidate(New Drawing.Rectangle(picWave.Left, picWave.Top, picWave.Width, _
        picWave.Height))
End Sub
The initial drawing of the sound data graph is stored as a bitmap. This bitmap is continuously cloned and a red line is drawn onto the cloned bitmap at different locations to give an effect of movement.
Visualizing WAV File Data as a Graph of Sound Values
The function DrawGraph takes an array of Int16 data and plots it on the picture control, centred on the vertical mid-point. Double buffering is used to speed up the drawing process: a bitmap is created first, the entire line graph is drawn onto it, and the finished bitmap is then transferred to the picture control. This process is faster than drawing directly onto the picture control.
Function DrawGraph(ByVal Data() As Int16) As Bitmap
    Dim myBitmap As System.Drawing.Bitmap
    ' One slot per sample; an extra slot would add a spurious 0 to the sort.
    Dim tempData(Data.Length - 1) As Integer
    Data.CopyTo(tempData, 0)
    ' Sort a copy so the smallest and largest samples sit at the two ends.
    Array.Sort(tempData)
    myBitmap = New Bitmap(picWave.Width, picWave.Height)
    Dim myGraphic As System.Drawing.Graphics
    myGraphic = Graphics.FromImage(myBitmap)
    myGraphic.Clear(Color.FromArgb(181, 223, 225))
    Dim YMax As Integer = tempData(Data.Length - 1)
    Dim YMin As Integer = tempData(0)
    Dim XMax As Integer = picWave.Width
    Dim Xmin As Integer = 0
    Dim PicPoint(Data.Length - 1) As System.Drawing.PointF
    Dim Count As Integer
    Dim Step1 As Single
    Dim Step2 As Single
    Dim step3 As Single
    Dim Mypen As New Pen(Color.FromArgb(24, 101, 123))
    For Count = 0 To Data.Length - 1
        ' Scale the sample to the value range, then to half the picture height,
        ' and offset it from the vertical mid-point.
        Step1 = CSng(Data(Count) / (YMax - YMin))
        Step2 = CSng(Step1 * picWave.Height / 2)
        step3 = CSng(Step2 + (picWave.Height / 2))
        PicPoint(Count) = New System.Drawing.PointF(CSng(XMax * _
            (Count / Data.Length)), step3)
    Next
    myGraphic.DrawLines(Mypen, PicPoint)
    myGraphic.DrawImage(myBitmap, picWave.Width, picWave.Height)
    Return (myBitmap)
End Function
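The three scaling steps inside the loop can be isolated and checked with plain numbers. In this sketch, SampleToY is a hypothetical helper that mirrors Step1 to step3; the guard for a zero range is an addition for the sake of the example and is not in the original function.

```vbnet
Imports System

Module GraphScaleSketch
    ' Maps a sample to a vertical pixel position around the mid-point of a
    ' picture that is "height" pixels tall.
    Public Function SampleToY(ByVal sample As Short, _
                              ByVal yMin As Integer, _
                              ByVal yMax As Integer, _
                              ByVal height As Integer) As Single
        Dim range As Integer = yMax - yMin
        ' Added guard: a constant signal has no range, so draw it on the mid-line.
        If range = 0 Then
            Return CSng(height / 2)
        End If
        Dim scaled As Double = sample / range       ' Step1
        Dim offset As Double = scaled * height / 2  ' Step2
        Return CSng(offset + (height / 2))          ' step3
    End Function

    Sub Main()
        Console.WriteLine(SampleToY(0, -100, 100, 200))
        Console.WriteLine(SampleToY(100, -100, 100, 200))
        Console.WriteLine(SampleToY(-100, -100, 100, 200))
    End Sub
End Module
```

For a 200-pixel picture and samples between -100 and 100, silence lands on the mid-line at 100, and the extremes land at 150 and 50. Since Y grows downwards in System.Drawing, positive samples are drawn below the mid-line.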
Playing a WAV File Data Array using a Direct Sound Circular Buffer
Playing data from an array is not very different from playing data from a memory stream. The only difference here is that the secondary buffer's Write method has an overload that reads data from an array. As the programmer, you have to fetch data from the source (a memory stream) and create the data array. As listed in the code, the memory stream is repositioned to the location of the last read (PlayerPosition) and data is read from there, up to the length that is safe to write.
Sub WriteDataArray(ByVal DataSize As Integer)
    Dim Tocopy As Integer
    Tocopy = Math.Min(DataSize, TimeToDataSize(Latency))
    If Tocopy > 0 Then
        If SbufferOriginal.Status.BufferLost Then
            SbufferOriginal.Restore()
        End If
        ' The upper bound is inclusive, so this array holds exactly Tocopy bytes.
        ReDim DataArray(Tocopy - 1)
        DataMemStream.Position = PlayerPosition
        ' Fill the whole array; reading Tocopy - 1 would leave the last byte at 0.
        DataMemStream.Read(DataArray, 0, Tocopy)
        SbufferOriginal.Write(NextWritePos, DataArray, _
            Microsoft.DirectX.DirectSound.LockFlag.None)
        PlayerPosition += Tocopy
        NextWritePos += Tocopy
        If NextWritePos >= SbufferOriginal.Caps.BufferBytes Then
            NextWritePos = NextWritePos - SbufferOriginal.Caps.BufferBytes
        End If
    End If
End Sub
Selecting Portions of a WAV File and Playing it using a Static Buffer
To play a selected portion of the sound file, highlight the portion and click capture 1 or capture 2. If both buttons are used on different portions, two different sounds can be played simultaneously (mixing).
To select a portion of the sound, press and hold the left mouse button over the picture control and move the mouse to the right. Once you let go of the left mouse button, the portion to be played will be highlighted. Click on the capture 1 button. Repeat the same for another portion and click capture 2.
The selection is made possible by the MouseDown, MouseMove and MouseUp events of the picture control. The rectangle is made transparent by using alpha blending. The code listed below is the implementation:
Sub DrawSelection()
    Dim posgraphic As Graphics
    Dim RubberRect As Rectangle
    RubberRect = New Rectangle(StartPoint.X, 0, _
        EndPoint.X - StartPoint.X, picWave.Height - 3)
    PlayPicture = CType(myPicture.Clone, Bitmap)
    posgraphic = Graphics.FromImage(PlayPicture)
    Dim Mypen As Pen = New Pen(Color.Green)
    Dim MyBrush As SolidBrush = New SolidBrush(Color.FromArgb(85, 204, 32, 92))
    posgraphic.DrawRectangle(Mypen, RubberRect)
    posgraphic.FillRectangle(MyBrush, RubberRect)
    posgraphic.DrawImage(PlayPicture, picWave.Width, picWave.Height)
    Me.Invalidate(New Drawing.Rectangle(picWave.Left, picWave.Top, _
        picWave.Width, picWave.Height))
End Sub
To play the selected portion, the data is read into a byte array. You must ensure that the data read is aligned on the BlockAlign value. If it is not, the result is noise, which is not very pleasant to hear!
Public Sub SetSegment(ByVal sender As System.Object, ByVal e As System.EventArgs) _
    Handles cmdSeg1.Click, cmdSeg2.Click
    Dim tag As Int16
    Dim theButton As Button
    Dim DataStart As Integer
    Dim DataStop As Integer
    Dim BufferSize As Integer
    Dim Format As Microsoft.DirectX.DirectSound.WaveFormat
    Dim Desc As Microsoft.DirectX.DirectSound.BufferDescription
    Dim MixBuffer As Microsoft.DirectX.DirectSound.SecondaryBuffer
    cmdBrowse.Enabled = False
    cmdDefault.Enabled = False
    cmdCustom.Enabled = False
    cmdCircular.Enabled = False
    cmdSeg1.Enabled = False
    cmdSeg2.Enabled = True
    cmdStop.Enabled = False
    theButton = CType(sender, Button)
    theButton.Enabled = False
    ' Convert the selected pixel range into byte offsets within the sound data.
    DataStart = CInt(MYwave.SubChunkSizeTwo * (StartPoint.X / picWave.Width))
    DataStop = CInt(MYwave.SubChunkSizeTwo * (EndPoint.X / picWave.Width))
    StartPoint = Nothing
    EndPoint = Nothing
    ' Round both offsets down to a block boundary so no sample frame is split.
    DataStart = DataStart - (DataStart Mod CInt(MYwave.BlockAlign))
    DataStop = DataStop - (DataStop Mod CInt(MYwave.BlockAlign))
    Dim DataSegment(DataStop - DataStart) As Byte
    DataMemStream.Position = DataStart
    DataMemStream.Read(DataSegment, 0, DataStop - DataStart)
    Format = New Microsoft.DirectX.DirectSound.WaveFormat
    Format.AverageBytesPerSecond = CInt(MYwave.ByteRate)
    Format.BitsPerSample = CShort(MYwave.BitsPerSample)
    Format.BlockAlign = CShort(MYwave.BlockAlign)
    Format.Channels = CShort(MYwave.NumChannels)
    Format.FormatTag = Microsoft.DirectX.DirectSound.WaveFormatTag.Pcm
    Format.SamplesPerSecond = CInt(MYwave.SampleRate)
    Desc = New Microsoft.DirectX.DirectSound.BufferDescription(Format)
    BufferSize = DataStop - DataStart + 1
    ' Round up to the next multiple of the block size so the buffer holds whole frames.
    If BufferSize Mod CInt(MYwave.BlockAlign) <> 0 Then
        BufferSize += CInt(MYwave.BlockAlign) - (BufferSize Mod CInt(MYwave.BlockAlign))
    End If
    Desc.BufferBytes = BufferSize
    Desc.ControlFrequency = True
    Desc.ControlPan = True
    Desc.ControlVolume = True
    Desc.GlobalFocus = True
    Try
        MixBuffer = New Microsoft.DirectX.DirectSound.SecondaryBuffer(Desc, SoundDevice)
        MixBuffer.Stop()
        MixBuffer.SetCurrentPosition(0)
        If MixBuffer.Status.BufferLost Then
            MixBuffer.Restore()
        End If
        MixBuffer.Write(0, DataSegment, Microsoft.DirectX.DirectSound.LockFlag.None)
        MixBuffer.Play(0, Microsoft.DirectX.DirectSound.BufferPlayFlags.Looping)
    Catch ex As Exception
        MsgBox(ex.Message)
    End Try
End Sub
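The alignment step in SetSegment can be isolated as two small functions. AlignDown and AlignUp are hypothetical names used only for this sketch; a block size of 4 corresponds to 16-bit stereo, where each sample frame occupies four bytes.

```vbnet
Imports System

Module BlockAlignSketch
    ' Rounds an offset down to the start of the sample frame that contains it.
    Public Function AlignDown(ByVal offset As Integer, ByVal blockAlign As Integer) As Integer
        Return offset - (offset Mod blockAlign)
    End Function

    ' Rounds a size up to the next whole number of sample frames.
    Public Function AlignUp(ByVal size As Integer, ByVal blockAlign As Integer) As Integer
        Dim remainder As Integer = size Mod blockAlign
        If remainder = 0 Then
            Return size
        End If
        Return size + (blockAlign - remainder)
    End Function

    Sub Main()
        Console.WriteLine(AlignDown(1003, 4))
        Console.WriteLine(AlignUp(1001, 4))
    End Sub
End Module
```

With a block size of 4, offset 1003 falls inside the frame that starts at 1000, so playback must begin there; starting at 1003 would pair the last byte of one sample with the first byte of the next, which is the noise described above. A size of 1001 bytes rounds up to 1004.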
Points of Interest
Playing with DirectSound is fun, especially when you get some real sound after hours and days of struggling. It took me a couple of days to get the circular buffer working. The WAV parser was something that got me thinking, especially the endian bit! I intend to build on this and incorporate FFT (fast Fourier transform) for real music mixing!
So that is it! I hope this is helpful for you VB.NET DirectSound enthusiasts who have not benefited from the C/C++ and C# examples found on the internet. Much of the work done here was trial and error, again due to scanty material and books!
Happy coding!