Converting Wav file to MP3 or other format using DirectShow

30 Jul 2002
A simple class that converts a stereo, 44 kHz, 16-bit WAV file to another format, including MP3. The class shows how to use the DirectShow API for audio conversion.

Sample Image - DShowEncoder.gif

Introduction

This article explains how to use the DirectShow API for simple audio conversion, in particular WAV to MP3 conversion. Audio codecs in the DirectShow API come in three types: native codecs, ACM codecs, and DMO (DirectX Media Object) codecs.

Only a few native audio codecs exist for audio compression. For MP3 encoding, the only one I have found is the LAME DirectShow wrapper from Elecard. Most MP3 encoders are in the ACM (Audio Compression Manager) format, which was introduced with the Windows Multimedia API. The CDSEncoder class and its related classes, CDSCodec and CDSCodecFormat, enumerate the ACM codecs and their respective compression parameters, construct a graph, and do the encoding.

The GraphBuilder and other filters

The graph consists of five filters:

  • File source (async) for reading the input wav file,
  • WAV Parser for wav parsing,
  • ACM codec for audio compression (in this case, an MP3 ACM codec wrapped by the ACM Wrapper filter),
  • WAV Dest, for wav output multiplexing,
  • File Writer for writing the output file.

Important note

The WAV Dest filter is not one of the standard filters; it has to be compiled from the DirectX SDK sample (SDK_root\Samples\Multimedia\DirectShow\Filters\WavDest). For convenience, a compiled WAV Dest filter is included in the demo zip, but you have to register it with regsvr32 wavdest.ax.

ACM codecs and the ACM Wrapper filter

All of the ACM codecs are listed in DirectShow under the Audio Compressors filter category (CLSID_AudioCompressorCategory) and cannot be instantiated directly. We have to go through the Device Enumerator to create them.

Note: Depending on your configuration, several ACM codecs for the same format may be installed on your computer. This can be the case for MP3 codecs. You can set their priority or deactivate some of them in the Control Panel, as shown in the following figure.

Codec configuration

The Device Enumerator, or how to browse ACM codecs

The Device Enumerator must be used to retrieve an instance of an ACM codec. It returns the list of codecs through the IEnumMoniker interface. For each moniker, a call to IMoniker::BindToObject() gives the filter interface (IBaseFilter), and a call to IMoniker::BindToStorage() gives an IPropertyBag from which the filter name can be read.

Configuring the ACM codec with IAMStreamConfig interface

Once the desired codec is instantiated, we hold an IBaseFilter interface that can be used to configure the filter. Since each filter has one or more pins, we have to find the output pin by walking the pins with the IEnumPins interface and calling IPin::QueryDirection() on each one.
From the output pin, we can query the IAMStreamConfig interface to configure the following properties:

  • Number of channels,
  • Samples per second,
  • Average bytes per second,
  • Bits per sample.

Note: For some codecs (including MP3), the call to IAMStreamConfig::SetFormat() must be made after the graph has been rendered.
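The four properties above correspond to fields of the WAVEFORMATEX structure that describes an audio format. For uncompressed PCM, such as the source WAV file, two of them are derived from the others. The sketch below illustrates that arithmetic only; WaveFormat is a local stand-in for WAVEFORMATEX so the code builds without the Windows headers, and MakePcmFormat is a hypothetical helper that is not part of the article's classes.

```cpp
#include <cstdint>

// Local stand-in for the WAVEFORMATEX fields discussed above (assumption:
// only the fields needed for this illustration are reproduced).
struct WaveFormat {
    std::uint16_t nChannels;       // number of channels
    std::uint32_t nSamplesPerSec;  // samples per second
    std::uint32_t nAvgBytesPerSec; // average bytes per second
    std::uint16_t nBlockAlign;     // bytes in one sample frame (all channels)
    std::uint16_t wBitsPerSample;  // bits per sample
};

// Hypothetical helper: fill in the derived fields for uncompressed PCM.
WaveFormat MakePcmFormat(std::uint16_t channels,
                         std::uint32_t samplesPerSec,
                         std::uint16_t bitsPerSample)
{
    WaveFormat wf{};
    wf.nChannels      = channels;
    wf.nSamplesPerSec = samplesPerSec;
    wf.wBitsPerSample = bitsPerSample;
    // One frame holds one sample for every channel.
    wf.nBlockAlign = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    // PCM byte rate is simply frames per second times bytes per frame.
    wf.nAvgBytesPerSec = samplesPerSec * wf.nBlockAlign;
    return wf;
}
```

For the stereo, 44.1 kHz, 16-bit input this gives a block alignment of 4 bytes and 176400 bytes per second. For a compressed format such as MP3 the formula does not apply; the values come from the formats the codec itself reports.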

The classes

CDSEncoder

CDSEncoder handles the following tasks:

  • Enumerate the Audio codecs (CLSID_AudioCompressorCategory),
  • Build, render, and run the graph.

class CDSEncoder : public CArray<CDSCodec*, CDSCodec*>
{
public:
  void BuildGraph(CString szSrcFileName, CString szDestFileName, 
    int nCodec, int nFormat);
  CDSEncoder();
  virtual ~CDSEncoder();

protected:
  void BuildCodecArray();
  HRESULT AddFilterByClsid(IGraphBuilder *pGraph, LPCWSTR wszName, 
    const GUID& clsid, IBaseFilter **ppF);
  BOOL SetFilterFormat(AM_MEDIA_TYPE* pStreamFormat, 
    IBaseFilter* pBaseFilter);

  IGraphBuilder *m_pGraphBuilder;
};

Because CDSEncoder inherits from CArray, the collection of codecs is exposed through the CArray methods, with each codec returned as a pointer to a CDSCodec object.

CDSCodec

CDSCodec handles the following tasks:

  • Enumerate the parameters supported by the codec,
  • Expose the codec name.

class CDSCodec : public CArray<CDSCodecFormat*, CDSCodecFormat*>
{
public:
  CDSCodec();
  virtual ~CDSCodec();

  CString m_szCodecName;
  IMoniker  *m_pMoniker;
  void BuildCodecFormatArray();
};

Because CDSCodec inherits from CArray, the collection of formats supported by the codec is exposed through the CArray methods, with each format returned as a pointer to a CDSCodecFormat object.

CDSCodecFormat

CDSCodecFormat exposes the properties of a single codec format:

  • Number of channels,
  • Samples per second,
  • Bytes per second,
  • Bits per sample.

class CDSCodecFormat  
{
public:
  WORD BitsPerSample();
  DWORD BytesPerSec();
  DWORD SamplesPerSecond();
  WORD NumberOfChannels();
  CDSCodecFormat();
  virtual ~CDSCodecFormat();

public:
  AM_MEDIA_TYPE* m_pMediaType;
};
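When presenting the formats to a user, the values returned by NumberOfChannels(), SamplesPerSecond() and BytesPerSec() can be turned into a readable label. The following is a minimal sketch of one way to do that; DescribeFormat is a hypothetical helper, not part of the article's classes, and it takes plain integers so it can be compiled without MFC or DirectShow.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: build a label such as "44100 Hz, stereo, 128 kbps"
// from the three values a codec format exposes.
std::string DescribeFormat(std::uint16_t channels,
                           std::uint32_t samplesPerSec,
                           std::uint32_t bytesPerSec)
{
    std::string channelText;
    if (channels == 1)
        channelText = "mono";
    else if (channels == 2)
        channelText = "stereo";
    else
        channelText = std::to_string(channels) + " ch";

    // bytes/s * 8 bits / 1000 = bytes/s / 125 (integer division truncates).
    std::uint32_t kbps = bytesPerSec / 125;

    return std::to_string(samplesPerSec) + " Hz, " + channelText + ", " +
           std::to_string(kbps) + " kbps";
}
```

A format reporting 2 channels, 44100 samples per second and 16000 bytes per second would be labelled as a 128 kbps stereo format.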

Known issues

Error checking

The goal of this article is to demonstrate the use of DirectShow for simple audio conversion. The classes do little error checking and are not as robust as production code needs to be. Please keep this in mind if you plan to use them in a production environment.

Source Wav format

There is no sample-rate conversion, so a 44 kHz WAV input can only produce 44 kHz output files.
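One way to work within this limitation is to offer only the codec formats whose sample rate matches the source file. The sketch below shows the idea under stated assumptions: FormatInfo is a hypothetical plain struct holding the values a CDSCodecFormat exposes, and UsableFormats is a hypothetical helper; neither exists in the article's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain copy of the values one codec format exposes.
struct FormatInfo {
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint32_t bytesPerSec;
};

// Return the indices of the formats whose sample rate equals the source's,
// since the graph performs no sample-rate conversion.
std::vector<std::size_t> UsableFormats(const std::vector<FormatInfo>& formats,
                                       std::uint32_t sourceSamplesPerSec)
{
    std::vector<std::size_t> usable;
    for (std::size_t i = 0; i < formats.size(); ++i) {
        if (formats[i].samplesPerSec == sourceSamplesPerSec)
            usable.push_back(i);
    }
    return usable;
}
```

Assuming the list is built in the same order as the codec's format array, the returned indices could serve as the format index passed to BuildGraph().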

Windows Media

The Windows Media formats can be used only with a certificate, which can be obtained through the Windows Media SDK from Microsoft.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.