
C# MP3 Sound Capturing/Recording Component

20 Oct 2009 | CPOL | 8 min read
A .NET component capturing WAVE or MP3 sound from a sound card. LAME used for MP3 compression.
[Screenshot: Mp3 Capture Sample Application]

Introduction

Surprisingly, .NET Framework 3.5 has no component for capturing sound. Even the designers of WPF and Silverlight 2.0 focused so deeply on graphics that they forgot about applications that record sound from the user's microphone. The next version of Silverlight is said to provide such functionality.

However, what you often want is to store the recorded sound in an MP3 file (or send it as an MP3 stream). That is legally complicated because of MP3 patent constraints. For the same legal reason, we can assume that MP3 functionality will not appear in Microsoft technologies soon (there is WMA instead).

Here you will find an easy-to-use .NET 3.5 component that provides sound capturing functionality for Windows applications. It outputs data as raw PCM samples or as a regular WAV file. Alternatively, you can set a single property to use the LAME DLL and perform MP3 compression on the fly.

This article uses a subset of the C# MP3 Compressor libraries written by Idael Cardoso, which in turn are partially based on "A low level audio player in C#" by Ianier Munoz. See the LAME project website for technical and copyright information regarding the LAME project.

Background

I chose Managed DirectX (MDX) 1.1 to capture sound. The MDX project is currently frozen, since Microsoft moved to XNA Game Studio Express (a solution rather ill-suited to merely capturing bits of sound). MDX has been succeeded by the open source SlimDX project, which exposes roughly similar interfaces and delegates to the native DirectSound libraries.

MDX consists of .NET assemblies that delegate calls to the native DirectX DLLs of 2006. A DirectX version that is backward compatible with the 2006 interfaces can be expected on virtually every Windows computer (and yes, it also works on Vista).

The component captures sound from a sound card via MDX in raw PCM format. PCM is a simple sequence of sound sample values. Each sample is either 8-bit (0..255) or 16-bit (-32768..32767). Stereo sound is a sequence of sample pairs (left, right, left, right...). PCM suits streaming of raw data, where streaming means passing data in small chunks while the total volume of the data may be unknown. Raw PCM is the most basic output type of the Mp3SoundCapture component. If you write this output directly to a WAV file, you will not be able to play it, because a WAV file must additionally contain a RIFF header.
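To make the sample layout concrete, here is a minimal sketch that splits interleaved 16-bit stereo PCM bytes into separate left and right channels. It assumes little-endian byte order (the usual convention for PCM on Windows); the PcmLayout class and its method are hypothetical names used for illustration only and are not part of the Mp3SoundCapture component.

```csharp
using System;

public static class PcmLayout
{
    // Splits interleaved 16-bit stereo PCM (assumed little-endian) into two channels.
    public static void SplitStereo16(byte[] pcm, out short[] left, out short[] right)
    {
        if (pcm.Length % 4 != 0)
            throw new ArgumentException("Expected whole stereo frames (4 bytes each).", "pcm");

        int frames = pcm.Length / 4;
        left = new short[frames];
        right = new short[frames];
        for (int i = 0; i < frames; i++)
        {
            // One frame = left low byte, left high byte, right low byte, right high byte.
            left[i] = unchecked((short)(pcm[4 * i] | (pcm[4 * i + 1] << 8)));
            right[i] = unchecked((short)(pcm[4 * i + 2] | (pcm[4 * i + 3] << 8)));
        }
    }
}
```

Mono data would simply be one sample after another, and 8-bit data would use one unsigned byte per sample.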

The WAV (RIFF) format requires the raw PCM data to be prefixed with a RIFF header, which contains format information as well as the total length of the PCM data in the file. That is why you cannot really stream RIFF data: the total size of the stream is usually unknown in advance. WAV (RIFF) is the second output type of the Mp3SoundCapture component. You can write this output directly to a WAV file.
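The following sketch shows why the data length has to be known: the canonical 44-byte header for uncompressed PCM stores it twice, once as the overall RIFF chunk size and once as the size of the data chunk. The WavHeader class is a hypothetical helper written for this illustration; it is not how the component necessarily writes its header.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeader
{
    // Writes a canonical 44-byte RIFF/WAVE header for uncompressed PCM data.
    public static void Write(Stream output, int sampleRate, short bitsPerSample,
                             short channels, int dataLength)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per frame
        var w = new BinaryWriter(output); // writes little-endian, as RIFF requires
        w.Write(Encoding.ASCII.GetBytes("RIFF"));
        w.Write(36 + dataLength);          // size of everything after this field
        w.Write(Encoding.ASCII.GetBytes("WAVE"));
        w.Write(Encoding.ASCII.GetBytes("fmt "));
        w.Write(16);                       // size of the fmt chunk for PCM
        w.Write((short)1);                 // format tag 1 = PCM
        w.Write(channels);
        w.Write(sampleRate);
        w.Write(sampleRate * blockAlign);  // average bytes per second
        w.Write(blockAlign);
        w.Write(bitsPerSample);
        w.Write(Encoding.ASCII.GetBytes("data"));
        w.Write(dataLength);               // length of the PCM data that follows
        w.Flush();                         // the writer is not disposed, so the stream stays open
    }
}
```

A recorder that does not know the length up front typically writes placeholder sizes, appends the samples, and then seeks back to patch the two size fields, which only works on a seekable output such as a file.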

The third output type of the Mp3SoundCapture component is MP3. The bit rate (kbit/s) is the parameter that determines MP3 sound quality. Apart from that, the sound quality depends on the format of the PCM data being compressed. Not every combination of bit rate and sampling parameters is allowed. You can write the MP3 stream directly to an MP3 file.
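As a rough feel for what a bit rate means, the sketch below compares the data rate of uncompressed PCM with an MP3 bit rate. It assumes that "22kHz" stands for the common 22050 Hz sampling rate; the BitRateMath class is a hypothetical helper, not part of the component.

```csharp
public static class BitRateMath
{
    // Uncompressed PCM data rate in bits per second.
    public static int PcmBitsPerSecond(int sampleRateHz, int bitsPerSample, int channels)
    {
        return sampleRateHz * bitsPerSample * channels;
    }

    // How many times smaller the MP3 stream is than the PCM it was made from.
    public static double CompressionRatio(int pcmBitsPerSecond, int mp3KbitPerSecond)
    {
        return pcmBitsPerSecond / (mp3KbitPerSecond * 1000.0);
    }
}
```

Under that assumption, 22050 Hz, 16-bit, mono PCM amounts to 352,800 bit/s, so a 128 kbit/s MP3 stream is roughly 2.8 times smaller.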

Using the Code

Setting Up

In your application, add a reference to the Istrib.Sound.Mp3.dll assembly (see Istrib.Sound.Example.WinForms for an example). MP3 compression also requires the lame_enc.dll library to be located in the binaries directory. The Istrib.Sound.Mp3.dll assembly references the Managed DirectX assemblies (located in the DirectX subdirectory). The DirectX assemblies can also be installed in the GAC via the DirectX End-User Runtimes redistributable, available for download from the Microsoft site (here: the Nov 2008 version).

If you get a "Loader Lock" warning while debugging your application, see the Points of Interest section for a workaround.

The Istrib.Sound.Mp3.dll assembly contains a single component: Mp3SoundCapture. You may think of this component as a sound recorder. You set its properties before calling the Start(...) and Stop() methods (the "recorder buttons"). You pass a writable stream or a file path to each call to the Start method. A single instance of the component may capture sound many times, to many streams or files, one after another.

Component Construction

You may use the Visual Studio Component Designer to drag and drop the Mp3SoundCapture component from the Visual Studio toolbox onto the component surface, or create the component manually:

C#
mp3SoundCapture = new Mp3SoundCapture();

The component is ready to use immediately after construction. The default output is MP3 at 128 kbit/s, sampled at 22 kHz, 16-bit, mono. You specify the sampling parameters and output format by setting the component's properties.

Capturing Device (e.g. A Microphone)

You may use a default Windows recording device:

C#
mp3SoundCapture.CaptureDevice = SoundCaptureDevice.Default;

Or choose one of the installed system sound capture devices:

C#
// First() is a LINQ extension method; it needs "using System.Linq;" at the top of the file
mp3SoundCapture.CaptureDevice = SoundCaptureDevice.AllAvailable.First();

Output Format

You set one of the 3 output types:

  • Mp3SoundCapture.Outputs.Mp3 - MP3 format
  • Mp3SoundCapture.Outputs.RawPcm - Raw sample data (without a RIFF header)
  • Mp3SoundCapture.Outputs.Wav - WAV file data (including the RIFF header)
C#
mp3SoundCapture.OutputType = Mp3SoundCapture.Outputs.Mp3;

Sampling Parameters

For PCM or WAV output, you may select any available sampling parameters supported by the sound card (PcmSoundFormat.StandardFormats):

C#
mp3SoundCapture.WaveFormat = PcmSoundFormat.StandardFormats.First();

... or if you wish to hardcode it:

C#
mp3SoundCapture.WaveFormat = PcmSoundFormat.Pcm22kHz16bitMono;

Sampling parameters for the MP3 format are restricted to the values returned by Mp3SoundFormat.AllSourceFormats. Not every combination of sampling parameters and bit rate is allowed. If you choose the bit rate before the sampling parameters, you may use the Mp3BitRate.CompatibleSourceFormats property to list the compatible values.

C#
mp3SoundCapture.WaveFormat = myMp3BitRate.CompatibleSourceFormats.First();
//Or: mp3SoundCapture.WaveFormat = Mp3SoundFormat.AllSourceFormats.First();

MP3 Bit Rate

For the MP3 output format, you specify one of the available bit rates. Again, not every bit rate can be paired with every set of sampling parameters. If you choose the sampling parameters before the bit rate, you may use the PcmSoundFormat.GetCompatibleMp3BitRates() extension method to enumerate the compatible MP3 bit rates.

C#
mp3SoundCapture.Mp3BitRate = myPcmSoundFormat.GetCompatibleMp3BitRates().First();
//Or mp3SoundCapture.Mp3BitRate = Mp3BitRate.AllValues.First();

... or if you wish to hardcode it:

C#
mp3SoundCapture.Mp3BitRate = Mp3BitRate.BitRate128;

Volume Normalization Option

When an application records and stores many pieces of sound, it often needs to adjust their volume so that all of them end up at a similar level. Mp3SoundCapture offers the NormalizeVolume property to perform this transformation for you. Setting it to true causes every recorded piece to be normalized: the loudest section of the piece is turned up to the highest possible level, and all other sections are turned up proportionally.

C#
mp3SoundCapture.NormalizeVolume = true;

Note that the normalization algorithm must read the whole stream to find the loudest point, then rewrite the whole stream, adjusting the volume of each sample. This means that the entire stream must be buffered before it is directed to the output. Mp3SoundCapture uses a temporary file to buffer the data when normalizing. MP3 compression, if applied, is done after the normalization. When you have recorded a sizeable piece of sound, the bulk of the processing takes place after the Stop() method is called, not on the fly (as it does when NormalizeVolume is false). This may take time, which is why Mp3SoundCapture offers asynchronous stopping.
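The two passes can be sketched as plain peak normalization over 16-bit samples. This is only an illustration of the idea under simple assumptions (all samples already in memory, a single gain factor); it is not the component's actual implementation, which buffers to a temporary file, and PeakNormalizer is a hypothetical name.

```csharp
using System;

public static class PeakNormalizer
{
    // Scales 16-bit samples in place so that the loudest one reaches full scale.
    public static void Normalize(short[] samples)
    {
        // Pass 1: find the largest absolute sample value.
        int peak = 0;
        foreach (short s in samples)
        {
            int magnitude = Math.Abs((int)s); // widen first so -32768 does not overflow
            if (magnitude > peak) peak = magnitude;
        }
        if (peak == 0) return; // pure silence: nothing to scale

        // Pass 2: apply the same gain to every sample.
        double gain = (double)short.MaxValue / peak;
        for (int i = 0; i < samples.Length; i++)
        {
            double scaled = Math.Round(samples[i] * gain);
            if (scaled > short.MaxValue) scaled = short.MaxValue;
            if (scaled < short.MinValue) scaled = short.MinValue;
            samples[i] = (short)scaled;
        }
    }
}
```

Because the gain depends on the loudest sample of the whole recording, the second pass cannot begin until the first has seen every sample, which is exactly why the output cannot be produced on the fly.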

Capturing

To start capturing, call the Start(Stream) method, passing an open, writable stream. You must close the stream yourself after capturing has stopped, which requires some care when using asynchronous stopping. Alternatively, call the Start(string) method, passing an output file name.

To stop capturing, just call the Stop() method.

Asynchronous Stop()

As mentioned above, when normalizing, it may take some time after calling Stop() before all captured data is written to the output stream. Mp3SoundCapture has an option to leave the Capturing state immediately and hand all buffer processing over to a separate thread. You can then start the next recording session without waiting for the last bytes of data from the previous one. By default, the asynchronous behavior is disabled. To enable it, set:

C#
mp3SoundCapture.WaitOnStop = false;

Note that you must not close the output stream passed to the Start(Stream) method until the Mp3SoundCapture.Stopped event is fired. Use the Stopped event arguments to get a reference to the stream that is ready for closing or, if you used Start(string filePath), the path of the file that has just been closed by Mp3SoundCapture:

C#
private void mp3SoundCapture_Stopped(object sender, Mp3SoundCapture.StoppedEventArgs e)
{
    //Now the e.OutputFileName file is ready and contains all captured data
    dataAvailableLbl.Text = "Data available in " + e.OutputFileName;
    dataAvailableLbl.Visible = true;
}
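The contract described here (Stop() returns at once, and the stream may be closed only inside the Stopped handler) can be illustrated with a small self-contained stand-in. DeferredRecorder and its Feed method are hypothetical and exist only for this sketch; they do not reproduce the API or the internals of Mp3SoundCapture.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical stand-in for a recorder with asynchronous stop; NOT Mp3SoundCapture.
public sealed class DeferredRecorder
{
    // Raised on a worker thread once all data has been written to the stream.
    public event Action<Stream> Stopped;

    private Stream output;
    private byte[] pending = new byte[0];

    public void Start(Stream stream)
    {
        output = stream;
        pending = new byte[0];
    }

    // Simulates captured data that is still buffered when Stop() is called.
    public void Feed(byte[] data)
    {
        pending = data;
    }

    // Returns immediately; the buffered data is written on a thread-pool thread.
    public void Stop()
    {
        Stream target = output;
        byte[] data = pending;
        ThreadPool.QueueUserWorkItem(delegate
        {
            target.Write(data, 0, data.Length);
            target.Flush();
            Action<Stream> handler = Stopped;
            if (handler != null) handler(target); // only now may the caller close the stream
        });
    }
}
```

A caller subscribes to Stopped and closes the stream there, not right after Stop() returns. Since the handler runs on a worker thread in this sketch, a WinForms application would have to marshal any UI updates back to the UI thread.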

Points of Interest

In some development environment configurations, you may get a "Loader Lock" error (which is really a warning) when starting your application under the debugger. It is a well-known design issue in Managed DirectX. You may disable this error in the Visual Studio debugger settings (most people do this without observable consequences). I preferred not to do that. Instead, I found a workaround: if the project that references Istrib.Sound.Mp3.dll also explicitly references the Managed DirectX assemblies (Microsoft.DirectX and Microsoft.DirectX.DirectSound), the warning is not raised. Otherwise, the debugger shows the warning each time an assembly referencing the Managed DirectX libraries is loaded into the Application Domain.

However, in my experience, you cannot use the Visual Studio Add Reference... wizard to add the DirectX GAC assembly references. This is not an issue when you reference a local copy of the DirectX assemblies (as in the example: the DirectX subdirectory).

The workaround for the GAC problem is to add the references to the GAC assemblies manually, by editing your csproj file with a text editor:

XML
<ItemGroup>
    ...
    <Reference Include="Microsoft.DirectX">
      <Name>Microsoft.DirectX</Name>
    </Reference>
    <Reference Include="Microsoft.DirectX.DirectSound">
      <Name>Microsoft.DirectX.DirectSound</Name>
    </Reference>
    ...
</ItemGroup>

The Visual Studio Add Reference... wizard generates identical XML, except that it includes full assembly version information. That should work as well, but it does not, at least on my machines.

History

  • 29th November, 2008: Initial post
  • 1st December, 2008: More MP3 formats available
  • 19th October, 2009: Updated source code and demo project

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)