
Capturing audio to a WAV file from an MPEG movie file, using DirectShow

29 Dec 2004
This article describes the process of storing the audio data of a movie file (.mpeg, .mpg, .avi and .dat) on the hard disk in a WAV file using DirectShow filters.

Sample Image - Audio5.jpg

Introduction

This article chiefly focuses on the following DirectShow filters:

  • MPEG-1 Stream Splitter Filter
  • MPEG Audio Decoder Filter
  • Wav Dest Filter
  • File Writer Filter

MPEG-1 Stream Splitter Filter

This filter splits an MPEG-1 system stream into its component audio and video streams.

MPEG-1 Audio Decoder Filter

Decodes MPEG-1 Layer I and Layer II audio to PCM.

Wav Dest Filter

The Wav Dest filter writes an audio stream to a WAV file. It takes a single audio stream as input, and its output pin must be connected to the File Writer filter.

File Writer Filter

The File Writer filter can be used to write files to disk regardless of format. The filter simply writes to disk whatever it receives on its input pin, so it must be connected upstream to a multiplexer that can format the file correctly.

The Working

Figure 1: Screenshot of the desired Filter Graph.

Figure 1 shows the filter graph in which all the desired filters are connected in the proper order.

Note: The RenderFile method of the IGraphBuilder creates the File Source (Async.) Filter itself and adds it to the graph as well.

To get this process going, these filters first have to be created and then added to the graph.

The Mpeg2WavConvertor method of CMpeg2Wav Class

The Mpeg2WavConvertor method (a method of the CMpeg2Wav class, which has been developed in ATL and is housed in MpegToWav.dll) is the one that starts the whole process of saving the audio data from the supplied video file into a WAV file. It first creates all the required filters, adds them to the graph, and then runs the graph.

Note: All of the DirectShow work is done in two methods of CMpeg2Wav: Mpeg2WavConvertor and handleEvent (discussed later).

STDMETHODIMP CMpeg2Wav::Mpeg2WavConvertor(LPCWSTR file, HWND g_hwnd, LPCWSTR waveFile)
{
    if(g_hwnd == NULL)
    {
        MessageBox(NULL, "FAILED TO CREATE THE HANDLE" 
                   " OF THE WINDOW", NULL, NULL);
        exit(0);
    }
    //creating capture graph

    hr = CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, 
         CLSCTX_INPROC,  IID_ICaptureGraphBuilder2, (void **)&m_pCapture);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Unable to create capture graph", NULL, NULL);
    }
    //creating graph builder

    hr = CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, 
         IID_IGraphBuilder, (void **)&pGraphBuilder);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Create Graph Builder", NULL, NULL);
    }
 
    hr = m_pCapture->SetFiltergraph(pGraphBuilder);  
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed", NULL, NULL);
    }
    pGraphBuilder->QueryInterface(IID_IMediaControl, (void **)&pMC);
    pGraphBuilder->QueryInterface(IID_IMediaEventEx, (void **)&pEvent);
    hr = CoCreateInstance(CLSID_MPEG1Splitter, NULL, CLSCTX_INPROC,
                          IID_IBaseFilter, (void**)&pSplitter);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Create MpegSplitter Filter", NULL, NULL);
    }
    //creating Audio Decoder

    static const GUID CLSID_MPEG_Audio_Decoder = 
        { 0x4a2286e0, 0x7bef, 0x11ce, 
        { 0x9b, 0xd9, 0x0, 0x0, 0xe2, 0x02, 0x59, 0x9c } };
    hr = CoCreateInstance(CLSID_MPEG_Audio_Decoder, NULL, 
         CLSCTX_INPROC, IID_IBaseFilter, (void**)&pDecoder);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Create MpegAudioDecoder Filter", 
                   NULL, NULL);
    }
    //creating Video Decoder

    static const GUID CLSID_MPEG_Video_Decoder = 
        {0xfeb50740 , 0x7bef, 0x11ce, 
        {0x9b , 0xd9, 0x0, 0x0, 0xe2, 0x2, 0x59,  0x9c} };
    hr = CoCreateInstance(CLSID_MPEG_Video_Decoder, 
         NULL, CLSCTX_INPROC, IID_IBaseFilter, (void**)&pVideoDecoder);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Create MpegVideoDecoder Filter", NULL, NULL);
    }
    //creating Video Renderer for windows XP

    static const GUID CLSID_MPEG_Video_Renderer = 
        {0x6BC1CFFA , 0x8FC1, 0x4261, 
        { 0xAC, 0x22,0xCF , 0xB4, 0xCC, 0x38, 0xDB,  0x50} };
    hr = CoCreateInstance(CLSID_MPEG_Video_Renderer, NULL, 
         CLSCTX_INPROC, IID_IBaseFilter, (void**)&pVideoRenderer);
    if(FAILED(hr))
    {
        //creating Video Renderer for windows 2000

        static const GUID CLSID_Video_Renderer = 
               {0x70E102B0 , 0x5556, 0x11CE, 
               { 0x97, 0xC0, 0x00 , 0xAA, 0x00, 0x55, 0x59,  0x5A} };
        hr = CoCreateInstance(CLSID_Video_Renderer, NULL, 
             CLSCTX_INPROC, IID_IBaseFilter, (void**)&pVideoRenderer);

        if (FAILED(hr))
        {
            MessageBox(NULL, "Still Error!", "Error", MB_OK);
        }
     }

    //creating wavedest filter

    static const GUID CLSID_WavDest = 
           { 0x3c78b8e2, 0x6c4d, 0x11d1, 
           { 0xad, 0xe2, 0x0, 0x0, 0xf8, 0x75, 0x4b, 0x99 } };
    hr = CoCreateInstance(CLSID_WavDest, NULL, 
         CLSCTX_INPROC,  IID_IBaseFilter, (void **)&pWavDest);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Create WaveDest Filter", NULL, NULL);
    }
    //Creating FileSink2

    hr = CoCreateInstance(CLSID_FileWriter, NULL, CLSCTX_INPROC,
                          IID_IFileSinkFilter2, (void **)&pFileSink);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Create FileSink2 Filter", NULL, NULL);
    }
    // Get the File Writer's IBaseFilter interface from the file sink pointer

    hr = pFileSink->QueryInterface(IID_IBaseFilter, (void **)&pFileWriter);
    if(FAILED(hr))
    {
       MessageBox(NULL, "Unable to get the File Writer" 
                  " interface pointer from the File Sink", NULL, NULL);
    }
    hr = pFileSink->SetFileName(waveFile, NULL);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Couldn't create WAV file", NULL, NULL);
    }
    //adding writer to the graph

    hr = pGraphBuilder->AddFilter((IBaseFilter *)pFileWriter, L"File Writer");
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to add Writer To Graph", NULL, NULL);
    }
    //adding wavdest to the graph

    hr = pGraphBuilder->AddFilter(pWavDest, L"WAV DEST");
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to add WavDest To Graph", NULL, NULL);
    }
    //adding audioDecoder to the graph

    hr = pGraphBuilder->AddFilter(pDecoder, NULL);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to add audio decoder To Graph", NULL, NULL);
    }
    //adding video renderer to the graph

    hr = pGraphBuilder->AddFilter(pVideoRenderer, NULL);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to add video renderer To Graph", NULL, NULL);
    }
    //adding video Decoder to the graph

    hr = pGraphBuilder->AddFilter(pVideoDecoder, NULL);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to add video decoder To Graph", NULL, NULL);
    }
    //adding Mpeg Stream Splitter to the graph

    hr = pGraphBuilder->AddFilter(pSplitter, NULL);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to add MpegSplitter to graph", NULL, NULL);
    }
    hr = pFileSink->SetMode(AM_FILE_OVERWRITE);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to OverWrite", NULL, NULL);
    }
    hr = pGraphBuilder->RenderFile(file, NULL);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to RenderFile", NULL, NULL);
    }
    //Removing Video Renderer From the Graph

    hr = pGraphBuilder->RemoveFilter(pVideoRenderer);
    if(FAILED(hr))
    {
        MessageBox(NULL, "Couldn't remove the video" 
                         " renderer filter", NULL, NULL);
    }
    //Removing Video Decoder From the Graph

    hr = pGraphBuilder->RemoveFilter(pVideoDecoder);

    if(FAILED(hr))
    {
        MessageBox(NULL, "Couldn't remove the video decoder filter", NULL, NULL);
    }
    // Instruct the graph manager to send a Windows message to the window
    // of the calling application whenever an event occurs on the graph

    pEvent->SetNotifyWindow((OAHWND)g_hwnd, WM_GRAPHNOTIFY, 0);
    // Execute the graph

    hr = pMC->Run();
    if(FAILED(hr))
    {
        MessageBox(NULL, "Failed to Run", NULL, NULL);
    }
    return S_OK;
}

Please note that two more filters, the MPEG Video Decoder and the Video Renderer, are also created and added to the graph in the Mpeg2WavConvertor method. This is because the RenderFile method of IGraphBuilder (which is called after all the filters have been added to the graph, to connect them intelligently) would otherwise create these two filters itself and add them to the graph.

Figure 2: MPEG Video Decoder and Video Renderer created and added to the graph automatically by RenderFile.

Now, this is not the behavior we want: the effect of these two filters would be a window displaying the video of the supplied file, with no audio, because the audio data is being copied into a WAV file rather than rendered to the default DirectSound device. To solve this, both filters (MPEG Video Decoder and Video Renderer) are created and added to the graph, and after the call to IGraphBuilder's RenderFile method they are removed from the graph by calling IGraphBuilder's RemoveFilter method; only then is the graph run.
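
If you want to confirm exactly which filters RenderFile added before you start removing any, a quick way is to enumerate the graph. The following is a small debugging sketch of my own (it is not part of the article's sample code); it simply walks the graph with IGraphBuilder::EnumFilters and prints each filter's friendly name to the debugger output.

void DumpGraphFilters(IGraphBuilder *pGraph)
{
    IEnumFilters *pEnum = NULL;
    if (pGraph == NULL || FAILED(pGraph->EnumFilters(&pEnum)))
        return;

    IBaseFilter *pFilter = NULL;
    while (pEnum->Next(1, &pFilter, NULL) == S_OK)
    {
        FILTER_INFO info;
        if (SUCCEEDED(pFilter->QueryFilterInfo(&info)))
        {
            OutputDebugStringW(info.achName);      // the filter's friendly name
            OutputDebugStringW(L"\r\n");
            if (info.pGraph != NULL)
                info.pGraph->Release();            // QueryFilterInfo AddRefs the graph
        }
        pFilter->Release();
    }
    pEnum->Release();
}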

A li'l bit of trickery

It should also be noted that the filters (File Writer, Wav Dest, Audio Decoder, MPEG-1 Stream Splitter) are added to the graph in reverse order (File Writer > Wav Dest > Audio Decoder > MPEG-1 Stream Splitter). This is because the RenderFile method of IGraphBuilder (called after all the filters have been added) connects the filters intelligently through their respective input/output pins, and it also creates the File Source (Async.) filter for the supplied movie file and adds it to the graph. The order matters: if you add the filters in ascending order (MPEG-1 Stream Splitter > MPEG Audio Decoder > Wav Dest > File Writer) and then call RenderFile, the MPEG Audio Decoder is bypassed and ends up connected to nothing; instead, the audio output pin of the MPEG-1 Stream Splitter is connected directly to the input pin of the Wav Dest filter, as Figure 3 shows.
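
As an aside, the ordering dependency can be avoided entirely by connecting the audio chain explicitly, pin by pin, instead of leaving it to RenderFile. The following is only a sketch of that approach, written for this article discussion rather than taken from the sample: GetUnconnectedPin and ConnectPair are local helpers (not DirectShow APIs), it assumes the File Source filter has already been added and connected to the splitter, and a real implementation would also check each output pin's media type so that the splitter's video pin is not picked up by mistake.

// Find the first unconnected pin of the given direction on a filter.
HRESULT GetUnconnectedPin(IBaseFilter *pFilter, PIN_DIRECTION dir, IPin **ppPin)
{
    *ppPin = NULL;
    IEnumPins *pEnum = NULL;
    HRESULT hr = pFilter->EnumPins(&pEnum);
    if (FAILED(hr))
        return hr;

    IPin *pPin = NULL;
    while (pEnum->Next(1, &pPin, NULL) == S_OK)
    {
        PIN_DIRECTION pinDir;
        pPin->QueryDirection(&pinDir);

        IPin *pConnected = NULL;
        if (pinDir == dir && pPin->ConnectedTo(&pConnected) == VFW_E_NOT_CONNECTED)
        {
            pEnum->Release();
            *ppPin = pPin;                         // caller releases
            return S_OK;
        }
        if (pConnected != NULL)
            pConnected->Release();
        pPin->Release();
    }
    pEnum->Release();
    return VFW_E_NOT_FOUND;
}

// Connect the first free output pin of pUpstream to the first free input pin of pDownstream.
HRESULT ConnectPair(IGraphBuilder *pGraph, IBaseFilter *pUpstream, IBaseFilter *pDownstream)
{
    IPin *pOut = NULL, *pIn = NULL;
    HRESULT hr = GetUnconnectedPin(pUpstream, PINDIR_OUTPUT, &pOut);
    if (SUCCEEDED(hr))
        hr = GetUnconnectedPin(pDownstream, PINDIR_INPUT, &pIn);
    if (SUCCEEDED(hr))
        hr = pGraph->Connect(pOut, pIn);           // connects these two pins, adding intermediates if needed
    if (pOut != NULL) pOut->Release();
    if (pIn != NULL)  pIn->Release();
    return hr;
}

// Usage, with the filter pointers from Mpeg2WavConvertor:
//   ConnectPair(pGraphBuilder, pSplitter, pDecoder);    // splitter audio out -> decoder in
//   ConnectPair(pGraphBuilder, pDecoder,  pWavDest);    // decoder out        -> Wav Dest in
//   ConnectPair(pGraphBuilder, pWavDest,  pFileWriter); // Wav Dest out       -> File Writer in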

Figure 3: The MPEG Audio Decoder Filter is bypassed and not connected to any filter in the graph.

This is not what we want, because the data at the audio output pin of the MPEG-1 Stream Splitter has the format "WaveFormatEx: 44.100 kHz, 0 bit mono". If we instead have the MPEG Audio Decoder in the graph between the MPEG-1 Stream Splitter and Wav Dest, like this:

Figure 4: The MPEG Audio Decoder Filter is connected between the MPEG-1 Stream Splitter and Wav Dest filters.

Now the data arriving at the input pin of Wav Dest is "WaveFormatEx: 44.100 kHz, 16 bit mono", which is the data present at the output pin of the MPEG Audio Decoder, and that is the data Wav Dest should pump into the File Writer.
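
If you want to verify at run time that the Wav Dest input pin really is receiving 16-bit PCM, you can ask the pin for its connection media type. This is a small sketch of my own (not from the article's sample); it assumes the graph has already been connected and that the usual DirectShow headers are included.

void DumpWavDestInputFormat(IBaseFilter *pWavDest)
{
    IEnumPins *pEnum = NULL;
    if (pWavDest == NULL || FAILED(pWavDest->EnumPins(&pEnum)))
        return;

    IPin *pPin = NULL;
    while (pEnum->Next(1, &pPin, NULL) == S_OK)
    {
        PIN_DIRECTION dir;
        pPin->QueryDirection(&dir);

        AM_MEDIA_TYPE mt;
        if (dir == PINDIR_INPUT && SUCCEEDED(pPin->ConnectionMediaType(&mt)))
        {
            if (mt.formattype == FORMAT_WaveFormatEx && mt.pbFormat != NULL)
            {
                WAVEFORMATEX *pWfx = (WAVEFORMATEX *)mt.pbFormat;
                // With the MPEG Audio Decoder in place, expect something
                // like 44100 Hz, 16 bits per sample here.
                WCHAR msg[128];
                _snwprintf(msg, 128, L"%lu Hz, %u bit, %u channel(s)\n",
                           pWfx->nSamplesPerSec, pWfx->wBitsPerSample, pWfx->nChannels);
                msg[127] = L'\0';
                OutputDebugStringW(msg);
            }
            // Free the format block and any pUnk held by the media type.
            if (mt.cbFormat != 0)
                CoTaskMemFree(mt.pbFormat);
            if (mt.pUnk != NULL)
                mt.pUnk->Release();
        }
        pPin->Release();
    }
    pEnum->Release();
}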

The Calling Application

Now we come to the application that calls the Mpeg2WavConvertor method. This application is developed using Managed C++.

Figure 5: The application's GUI.

The loading of the Form

In the Form1_Load method of this application (the handler for the form's Load event, raised when the form is loaded), the "Convert" button is disabled to prevent the user from starting the whole process without supplying a video file, which of course would not work, and an unsigned char is set to the value 'a' (it is used by the application later and is discussed later in this tutorial).


private: unsigned char check;
 
private: System::Void Form1_Load(System::Object *  sender, System::EventArgs *  e)
{
    buttonConvert->Enabled = false;
    check = 'a';
}

The "Select Movie" button

The application asks the user to select the movie from which he or she wants to extract the WAV audio by clicking the "Select Movie" button.

private: System::Void buttonSelectMpg_Click(System::Object *  sender, 
                                              System::EventArgs *  e)
{
    OpenFileDialog *fileDialog3 = new OpenFileDialog();
    fileDialog3->InitialDirectory = S"C:\\";
    fileDialog3->Filter = "Video Files (*.mpg; *.dat; *.avi;" 
          " *.mpeg; *.m1v)|*.mpg; *.dat; *.mpeg;" 
          " *.avi; *.m1v|All Files (*.*)|*.*";
    fileDialog3->RestoreDirectory = true;
    fileDialog3->ShowDialog();
    String* mpegVidFile = fileDialog3->get_FileName();
    if (mpegVidFile == NULL || mpegVidFile == String::Empty)
        return;                        // nothing was selected (the dialog was cancelled)
    buttonConvert->Enabled = true;     // enable Convert only once a movie has been chosen
    mpegTextBox->Text = mpegVidFile;
}

In its button handler, an OpenFileDialog is created so that the user can select the movie file of his or her choice from the desired location on the hard disk. The filename is preserved so that it can be shown in the textbox and later supplied to the Mpeg2WavConvertor method (called in the "Convert" button handler), where IGraphBuilder's RenderFile method creates the File Source (Async.) filter for that particular video file.


The "Change" button

The user can also set the desired WAV filename by clicking the "Change" button next to the "Wav file" textbox.

private: System::Void buttonSaveWav_Click(System::Object *  sender, 
                                            System::EventArgs *  e)
{
    SaveFileDialog *saveFileDialog2;
    saveFileDialog2 = new SaveFileDialog();
    saveFileDialog2->InitialDirectory = S"C:\\" ;
    saveFileDialog2->Filter = S"Wav files (*.wav)|*.wav";
    saveFileDialog2->RestoreDirectory = true;
    saveFileDialog2->ShowDialog();
    String* waveFile = saveFileDialog2->get_FileName();
    if (waveFile == NULL)
    {
        return;
    }

    wavTextBox->Text = waveFile;
}

In its button handler, a SaveFileDialog is created so that the user can choose the filename and location of the WAV file. The filename is preserved so that it can be supplied to the Mpeg2WavConvertor method, which the File Writer filter will use to create the WAV file under the user's chosen name. By default, the filename is C:\test.wav.


The "Convert" button

It is the "Convert" button that actually calls the Mpeg2WavConvertor method of the CMpeg2Wav class to start the proceedings.

private: System::Void buttonConvert_Click(System::Object *  sender, 
                                            System::EventArgs *  e)
{
    // Prevent the user from clicking again while the conversion is in progress.
    buttonConvert->Enabled = false;

    String* mpgFile = mpegTextBox->get_Text();
    if (mpgFile == NULL || mpgFile->Length == 0)
    {
        MessageBox(NULL, "No MPEG file selected", "Error", MB_OK);
        buttonConvert->Enabled = true;   // nothing to convert, so re-enable the button
        return;
    }
    Interop::MpegToWavLib::_RemotableHandle *pointer = 
           (Interop::MpegToWavLib::_RemotableHandle*)get_Handle().ToPointer();
    ptrMpeg2Wav = new Interop::MpegToWavLib::Mpeg2WavClass();
    ptrMpeg2Wav->Mpeg2WavConvertor(mpegTextBox->Text, pointer, wavTextBox->Text);
    pointer = NULL;
}

This button handler calls the Mpeg2WavConvertor method to get things going. It supplies the filename of the selected video file (so that IGraphBuilder's RenderFile method can create the File Source (Async.) filter for that particular video, add it to the graph, and connect the filters as mentioned earlier), the filename of the WAV file (for the File Writer), and the handle of the parent window.

This handle is required to instruct the Filter Graph Manager to send a Windows message (in this case, it is WM_GRAPHNOTIFY) whenever a new event occurs.

#define WM_GRAPHNOTIFY  WM_APP + 1
 
pGraphBuilder->QueryInterface(IID_IMediaEventEx, (void **)&pEvent);
 
pEvent->SetNotifyWindow((OAHWND)g_hwnd, WM_GRAPHNOTIFY, 0);

This is helpful in letting the application know about the events that occur during the execution of the graph. In our case, we are interested only in the EC_COMPLETE and EC_USERABORT events. EC_COMPLETE indicates that playback has completed normally, and EC_USERABORT indicates that the user has interrupted playback (for example, by closing the application in the middle of the process).

So when our application receives a WM_GRAPHNOTIFY message, it calls the handleEvent method of the CMpeg2Wav class. The handling of this message is done by overriding the WndProc method in the calling application.

Overriding WndProc

protected:
void WndProc(Message* m) 
{
    switch (m->Msg) 
    {
      case WM_GRAPHNOTIFY:

          ptrMpeg2Wav->handleEvent(&check);
          if(check == 'z')
          {
              buttonConvert->Enabled = true;
              check = 'a';
          }

          break;
    }

    Form::WndProc(m);
}

Whenever the handleEvent method is called (that is, whenever the parent window of the calling application receives the WM_GRAPHNOTIFY message), it checks whether the event is EC_COMPLETE or EC_USERABORT. If it finds either of the two, it performs a general clean-up and changes the value of the supplied unsigned char (the method's lone parameter, passed by pointer) to 'z'; this acts as a sentinel by which the calling application knows that one of the two events has occurred. When the calling application sees the value change to 'z', it re-enables the "Convert" button, which was disabled when the user clicked it to start the process (this prevents the user from clicking "Convert" repeatedly while the WAV audio is being written to the hard disk).

The handleEvent method of CMpeg2Wav Class

STDMETHODIMP CMpeg2Wav::handleEvent(unsigned char *s)
{
    long evCode, param1, param2;
    while (hr = pEvent->GetEvent(&evCode, &param1, &param2, 0), SUCCEEDED(hr))
    {
        hr = pEvent->FreeEventParams(evCode, param1, param2);
        if ((EC_COMPLETE == evCode) || (EC_USERABORT == evCode))
        {
            pMC->Stop();
            pEvent->SetNotifyWindow(NULL, 0, 0);
            hr = pGraphBuilder->RemoveFilter(pFileWriter);
            hr = pGraphBuilder->RemoveFilter(pWavDest);
            hr = pGraphBuilder->RemoveFilter(pDecoder);
            hr = pGraphBuilder->RemoveFilter(pSplitter);
            SAFE_RELEASE(pFileWriter);
            SAFE_RELEASE(pWavDest);
            SAFE_RELEASE(pDecoder);
            SAFE_RELEASE(pSplitter);
            SAFE_RELEASE(pVideoDecoder);
            SAFE_RELEASE(pVideoRenderer);
            SAFE_RELEASE(pGraphBuilder);
            SAFE_RELEASE(pEvent);
            SAFE_RELEASE(pMC);
            *s = 'z';
            break;
        }
    }
    return S_OK;
}

A Few tips at the end

  • To run the demo, you need to register MpegToWav.dll and wavdest.ax (the Wav Dest filter); see the example commands after this list. wavdest.ax can also be obtained by compiling the WavDest sample provided in the DirectX SDK (drive name:\DXSDK\Samples\C++\DirectShow\Filters\WavDest).
  • In order to get the source code working, you first have to have wavdest.ax registered on your machine. Then compile the MpegToWav sample, which creates MpegToWav.dll. Open the Mpeg2WavTestApplication, add the newly created MpegToWav.dll to its References, then compile and run the application, and you are ready to create WAV files out of the videos you have, my friend.
  • The process described above of creating a WAV file from the audio of a supplied video file applies only to movie files of the type .mpg, .mpeg, .avi, etc., and not to streaming video file types such as .wmv and .asf.
  • This whole application was made using Microsoft Visual Studio .NET 2003, so readers with an older version of Visual Studio .NET will have to convert this application with some converter if they want to test the code.
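
For reference, both binaries are COM servers and are registered with regsvr32. The paths below are just examples (adjust them to wherever the files were built or copied), and the commands should be run from a command prompt with sufficient rights:

    regsvr32 wavdest.ax
    regsvr32 MpegToWav.dll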

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
