Introduction
This article explains how to use the DirectShow API for simple audio
conversion, particularly WAV to MP3 conversion. Audio codecs in the DirectShow
API are of three types: native codecs, ACM codecs, and DMO (DirectX Media
Object) codecs.
There are only a few native codecs for audio compression. For MP3
encoding, the only one that I've found is the LAME
DirectShow wrapper from Elecard. Most MP3 encoders are in the ACM (Audio
Compression Manager) format, which was introduced with the Windows Multimedia
API.
The CDSEncoder class and its related classes, CDSCodec and CDSCodecFormat,
enumerate the ACM codecs and their respective compression parameters,
construct a graph, and perform the encoding.
The GraphBuilder and other filters
The graph consists of five filters:
- File Source (Async) for reading the input WAV file,
- WAV Parser for parsing the WAV data,
- ACM codec for audio compression (in this case, the MP3 ACM codec, wrapped by
the ACM Wrapper filter),
- WAV Dest for multiplexing the WAV output,
- File Writer for writing the output file.
Important note
The WAV Dest filter is not one of the standard filters; it needs to be
compiled from the DirectX SDK
(SDK_root\Samples\Multimedia\DirectShow\Filters\WavDest). For
convenience, the compiled WAV Dest filter is included in the demo zip, but you
have to register it by running regsvr32 wavdest.ax.
ACM codecs and the ACM Wrapper filter
All ACM codecs are listed in DirectShow under the Audio Compressors
filter category (CLSID_AudioCompressorCategory) and cannot
be instantiated directly. We have to use the Device Enumerator to obtain them.
Note: Depending on your configuration, several ACM codecs for the
same format may be installed on your computer. This can be the case for MP3
codecs. You can set their priority or deactivate some of them from the Control
Panel, as shown in the following figure.
The DeviceEnumerator or how to browse ACM codecs
The Device Enumerator must be used to retrieve an instance of an ACM codec.
It returns the list of codecs through the IEnumMoniker interface, so we can
get the filter interface (IBaseFilter) with a call to
IMoniker::BindToObject() and the filter name with a call to
IMoniker::BindToStorage().
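The enumeration described above can be sketched as follows. This is a minimal illustration, not the article's CDSEncoder code: ListAudioCompressorNames is a hypothetical helper name, and the sketch assumes COM has already been initialized on the calling thread. It follows the usual System Device Enumerator pattern and reads each codec's "FriendlyName" property. Because DirectShow exists only on Windows, the non-Windows branch is a stub that returns an empty list.

```cpp
#include <cassert>
#include <string>
#include <vector>

#ifdef _WIN32
#include <dshow.h>   // link with strmiids.lib, ole32.lib and oleaut32.lib

// Hypothetical helper: returns the friendly names of the filters registered
// in the Audio Compressors category. Assumes CoInitialize was already called.
std::vector<std::wstring> ListAudioCompressorNames()
{
    std::vector<std::wstring> names;

    ICreateDevEnum *pSysDevEnum = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, nullptr,
                                  CLSCTX_INPROC_SERVER, IID_ICreateDevEnum,
                                  reinterpret_cast<void**>(&pSysDevEnum));
    if (FAILED(hr))
        return names;

    IEnumMoniker *pEnum = nullptr;
    // S_FALSE (with pEnum left null) means the category is empty.
    hr = pSysDevEnum->CreateClassEnumerator(CLSID_AudioCompressorCategory,
                                            &pEnum, 0);
    if (hr == S_OK) {
        IMoniker *pMoniker = nullptr;
        while (pEnum->Next(1, &pMoniker, nullptr) == S_OK) {
            assert(pMoniker != nullptr);
            // BindToStorage gives access to the codec's properties.
            // pMoniker->BindToObject(nullptr, nullptr, IID_IBaseFilter, ...)
            // would instantiate the filter itself.
            IPropertyBag *pBag = nullptr;
            hr = pMoniker->BindToStorage(nullptr, nullptr, IID_IPropertyBag,
                                         reinterpret_cast<void**>(&pBag));
            if (SUCCEEDED(hr)) {
                VARIANT var;
                VariantInit(&var);
                if (SUCCEEDED(pBag->Read(L"FriendlyName", &var, nullptr)) &&
                    var.vt == VT_BSTR && SysStringLen(var.bstrVal) > 0) {
                    names.emplace_back(var.bstrVal);
                }
                VariantClear(&var);
                pBag->Release();
            }
            pMoniker->Release();
        }
        pEnum->Release();
    }
    pSysDevEnum->Release();
    return names;
}
#else
// DirectShow is Windows-only; on other platforms this sketch is a stub.
std::vector<std::wstring> ListAudioCompressorNames() { return {}; }
#endif
```

Keeping the IMoniker pointer instead of only the name, as CDSCodec does with its m_pMoniker member, makes it possible to bind the filter later, when the graph is actually built.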
Configuring the ACM codec with IAMStreamConfig interface
Once the desired codec is instantiated, we hold its IBaseFilter interface,
which is the starting point for configuring the filter. Since each
IBaseFilter has one or more pins, we have to find the output pin
using the IEnumPins interface and calls to IPin::QueryDirection().
With the output pin, we can query the IAMStreamConfig interface to
configure the following properties:
- Number of channels,
- Samples per second,
- Average bytes per second,
- Bits per sample.
Note: For some codecs (including MP3), the call to
IAMStreamConfig::SetFormat() must be made after the graph has been
rendered.
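The four properties above are not independent of each other. The sketch below shows how they relate, using a hypothetical stand-in struct (AudioFormatFields) rather than the real WAVEFORMATEX from the Windows headers, so that it compiles anywhere. It assumes the usual conventions: for uncompressed PCM the byte rate is the sample rate times the size of one sample frame, and for a constant-bitrate compressed format such as MP3 the byte rate is the bit rate divided by eight.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the four WAVEFORMATEX fields discussed above.
// The real structure is declared in the Windows headers and has more members.
struct AudioFormatFields {
    std::uint16_t channels;        // number of channels
    std::uint32_t samplesPerSec;   // samples per second
    std::uint32_t avgBytesPerSec;  // average bytes per second
    std::uint16_t bitsPerSample;   // bits per sample
};

// Uncompressed PCM: the byte rate follows from the other three values.
AudioFormatFields MakePcmFields(std::uint16_t channels,
                                std::uint32_t samplesPerSec,
                                std::uint16_t bitsPerSample)
{
    assert(channels > 0 && bitsPerSample % 8 == 0);
    // Size in bytes of one sample frame (one sample for every channel).
    const std::uint32_t blockAlign = channels * (bitsPerSample / 8u);
    return { channels, samplesPerSec, samplesPerSec * blockAlign, bitsPerSample };
}

// Constant-bitrate compressed audio: the byte rate is the bit rate / 8.
std::uint32_t BytesPerSecFromBitrate(std::uint32_t bitsPerSecond)
{
    return bitsPerSecond / 8;
}
```

For example, 16-bit stereo PCM at 44100 Hz gives 176400 bytes per second, while a 128 kbit/s MP3 format reports 16000 bytes per second. This is why the average bytes per second value is the one that effectively selects the MP3 bit rate.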
The classes
CDSEncoder
CDSEncoder performs the following tasks:
- Enumerate the audio codecs (CLSID_AudioCompressorCategory),
- Build, render, and run the graph.
class CDSEncoder : public CArray<CDSCodec*, CDSCodec*>
{
public:
    void BuildGraph(CString szSrcFileName, CString szDestFileName,
                    int nCodec, int nFormat);
    CDSEncoder();
    virtual ~CDSEncoder();
protected:
    void BuildCodecArray();
    HRESULT AddFilterByClsid(IGraphBuilder *pGraph, LPCWSTR wszName,
                             const GUID& clsid, IBaseFilter **ppF);
    BOOL SetFilterFormat(AM_MEDIA_TYPE* pStreamFormat,
                         IBaseFilter* pBaseFilter);
    IGraphBuilder *m_pGraphBuilder;
};
Because CDSEncoder inherits from CArray, the collection of codecs is
exposed through the CArray methods, with each codec returned as a
CDSCodec object.
CDSCodec
CDSCodec performs the following tasks:
- Enumerate the parameters supported by the codec,
- Expose the codec name.
class CDSCodec : public CArray<CDSCodecFormat*, CDSCodecFormat*>
{
public:
    CDSCodec();
    virtual ~CDSCodec();
    CString m_szCodecName;
    IMoniker *m_pMoniker;
    void BuildCodecFormatArray();
};
Because CDSCodec inherits from CArray, the collection of parameter sets
supported by the codec is exposed through the CArray methods, with each
set returned as a CDSCodecFormat object.
CDSCodecFormat
CDSCodecFormat exposes the properties of one set of codec parameters:
- Number of channels,
- Samples per second,
- Bytes per second,
- Bits per sample.
class CDSCodecFormat
{
public:
    WORD BitsPerSample();
    DWORD BytesPerSec();
    DWORD SamplesPerSecond();
    WORD NumberOfChannels();
    CDSCodecFormat();
    virtual ~CDSCodecFormat();
public:
    AM_MEDIA_TYPE* m_pMediaType;
};
Known issues
Error checking
The goal of this article is to demonstrate the use of DirectShow for simple
audio conversion. These classes do not check errors as thoroughly as they
should. Please keep this in mind if you plan to use them in a production
environment.
Source WAV format
There is no sample-rate conversion, so the output sampling rate must match
that of the input: you can only generate 44 kHz output files from a 44 kHz
WAV file.
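Since the graph does not resample, it can help to read the sampling rate of the source file before choosing a codec format. The sketch below is one possible way to do that and is not part of the article's classes: WavSampleRate and ReadLE32 are hypothetical helpers that walk the chunks of a RIFF/WAVE image held in memory and return the rate stored in the "fmt " chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads a little-endian 32-bit value starting at p.
static std::uint32_t ReadLE32(const std::uint8_t *p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: returns the sampling rate stored in the "fmt " chunk
// of a RIFF/WAVE image, or 0 if the data is not a well-formed WAV header.
std::uint32_t WavSampleRate(const std::vector<std::uint8_t> &data)
{
    if (data.size() < 12 ||
        std::memcmp(&data[0], "RIFF", 4) != 0 ||
        std::memcmp(&data[8], "WAVE", 4) != 0) {
        return 0;
    }

    std::size_t pos = 12;  // first chunk after the RIFF/WAVE header
    while (pos + 8 <= data.size()) {
        const std::uint32_t chunkSize = ReadLE32(&data[pos + 4]);
        if (std::memcmp(&data[pos], "fmt ", 4) == 0) {
            // fmt layout: format tag (2 bytes), channels (2 bytes),
            // samples per second (4 bytes), then the remaining fields.
            if (chunkSize < 16 || pos + 8 + 16 > data.size())
                return 0;
            return ReadLE32(&data[pos + 12]);
        }
        if (chunkSize > data.size() - pos - 8)
            return 0;  // truncated chunk, stop scanning
        pos += 8 + chunkSize + (chunkSize & 1);  // chunks are word-aligned
    }
    return 0;
}
```

The returned value can then be compared with CDSCodecFormat::SamplesPerSecond() so that only formats matching the source rate are offered before BuildGraph() is called.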
Windows Media
Windows Media formats can be used only with a certificate, which can be
obtained through the Windows Media SDK from Microsoft.