
DirectShow Video and Audio Trimming

18 Mar 2008, CPOL, 4 min read
How to cut specified parts out of a media file using DirectShow

Introduction

Users often ask how to trim video and audio, and then go looking for a utility that can do it. But how would you implement this task programmatically? This article shows how it looks from the programmer's side. The platform is Win32, and the media technology is DirectShow, the core Windows media framework. You also need the «K-Lite Codec Pack» installed with Media Player Classic, which is used for playing media files.

Our application consists of two parts:

  1. A DirectShow Transform Filter, which actually does the necessary media splitting
  2. A Console Application, which demonstrates how this filter can be used

The code shown in this article illustrates the idea and is not a complete application. However, it can serve as a solid foundation for popular application types such as video and audio splitters.

Background

Let's look at DirectShow transform filters: what are they? They let a programmer manipulate the media samples that pass through them. So what do we need in order to exclude specified intervals from a media stream? First, we discard every sample that we don't want to see in the output. If we stop here, the output file will lack the specified intervals, but each gap will be filled with the last sample played before the cut interval started. That is not the result we want. To avoid it, we must not only drop the samples inside the cut interval but also correct the times of the samples that follow, so that they continue from the point where the cut begins. The samples then leave our filter without pauses, and no break is visible. To generalize, we apply this rule to every excluded interval: each later sample is shifted back by the total duration of all cut intervals that precede it. For the details, see the application code.

Using the Code

To run the application, copy the chicken.wmv file to C:\. This is necessary because rendering uses a manually built graph, which reads the input file from that fixed location. The graph is not built in code because this is a demo application that aims to demonstrate the core principles, not to cover every need of a DirectShow application. Next, register the filter on your system with a shell command such as regsvr32 DShowMediaSplitFilt.ax. Then start Demo.exe. When it finishes, you will find the proceed.wmv file in C:\, which is the result of the time block exclusion.

The graph used here is shown below:

[Figure: DShowMediaSplit filter graph]

As you can see, two filter instances are used so that the video and audio cuts stay synchronized. The application also sets an IReferenceClock as the sync source for every filter in the graph. Here is the filter code:

C++
 // DirectShow Media Split Filter header file by Kovalev Maxim
#ifndef __DSHOWMEDIASPLITFILT_H__
#define __DSHOWMEDIASPLITFILT_H__
#pragma warning (disable:4312)
#include <streams.h> // DirectShow (includes windows.h)
#include <initguid.h> // Makes DEFINE_GUID define the GUIDs instead of only declaring them
#include <vector>
using std::vector;
// {1A21958D-8E93-45de-92DA-DD9544297B41}
DEFINE_GUID (CLSID_DShowMediaSplit, 0x1a21958d, 0x8e93, 0x45de, 0x92, 0xda,
    0xdd, 0x95, 0x44, 0x29, 0x7b, 0x41);
struct TimeBlock
{
 TimeBlock (LONGLONG startTime, LONGLONG endTime)
 {
  this->startTime = startTime;
  this->endTime = endTime;
 }
 
 LONGLONG startTime;
 LONGLONG endTime;
};
// {EC634576-4C86-464e-99D7-AC7AFEFB2FF0}
DEFINE_GUID (IID_IDShowMediaSplitFilt, 0xec634576, 0x4c86, 0x464e, 0x99, 0xd7,
    0xac, 0x7a, 0xfe, 0xfb, 0x2f, 0xf0);
interface IDShowMediaSplitFilt : public IUnknown
{
 STDMETHOD (SetIntervalList) (const vector <TimeBlock> &lst) = 0;
};
#endif
// DirectShow Media Split Filter source file by Kovalev Maxim
#include "DShowMediaSplitFilt.h"
// Setup data - allows the self-registration to work
const AMOVIESETUP_MEDIATYPE sudPinTypes =
 { &MEDIATYPE_NULL     // clsMajorType
 , &MEDIASUBTYPE_NULL  // clsMinorType
 };
const AMOVIESETUP_PIN psudPins [] = {
 { L"Input"            // strName
   , FALSE             // bRendered
   , FALSE             // bOutput
   , FALSE             // bZero
   , FALSE             // bMany
   , &CLSID_NULL       // clsConnectsToFilter
   , L""               // strConnectsToPin
   , 1                 // nTypes
   , &sudPinTypes      // lpTypes
 }
 ,
 { L"Output"           // strName
 , FALSE               // bRendered
 , TRUE                // bOutput
 , FALSE               // bZero
 , FALSE               // bMany
 , &CLSID_NULL         // clsConnectsToFilter
 , L""                 // strConnectsToPin
 , 1                   // nTypes
 , &sudPinTypes        // lpTypes
 }
 };
const AMOVIESETUP_FILTER sudTransformSample =
 { &CLSID_DShowMediaSplit           // clsID
 , L"DirectShow Media Split Filter" // strName
 , MERIT_DO_NOT_USE                 // dwMerit
 , 2                                // nPins
 , psudPins                         // lpPin
 };
struct TimeBlockEx : public TimeBlock
{
 TimeBlockEx (LONGLONG startTime, LONGLONG endTime,
  LONGLONG timeAtSkipped, LONGLONG timeAtNonSkipped)
  : TimeBlock (startTime, endTime)
 {
  this->timeAtSkipped = timeAtSkipped;
  this->timeAtNonSkipped = timeAtNonSkipped;
  skippedFound = false;
  nonSkippedFound = false;
  needToSkip = false;
 }
 
 LONGLONG timeAtSkipped;
 LONGLONG timeAtNonSkipped;
 bool skippedFound;
 bool nonSkippedFound;
 bool needToSkip;
};
class CDShowMediaSplitFilt : public CTransInPlaceFilter, public IDShowMediaSplitFilt
{
public:
 // CreateInstance. Provide the way for COM to create a "CDShowMediaSplitFilt" object
 static CUnknown *WINAPI CreateInstance (LPUNKNOWN pUnk, HRESULT *pHr);
 DECLARE_IUNKNOWN;
 
 // Method, which provides specified interface querying
 STDMETHODIMP NonDelegatingQueryInterface (REFIID riid, void **ppv);
 // Method for storing time blocks list into class members
 STDMETHODIMP SetIntervalList (const vector <TimeBlock> &lst);
private:
 CDShowMediaSplitFilt (TCHAR *tszName, LPUNKNOWN pUnk, HRESULT *pHr)
  : CTransInPlaceFilter (tszName, pUnk, CLSID_DShowMediaSplit, pHr) {}
 // Filter's transform method - where core work done
 HRESULT Transform (IMediaSample *pSample);
 // We accept any input type. We'd return S_FALSE for any we didn't like
 HRESULT CheckInputType (const CMediaType *pMediaTypeIn);
 // Time block's array, which stores split intervals
 vector <TimeBlockEx> timeBlockList_;
};
// Needed for the CreateInstance mechanism
CFactoryTemplate g_Templates [] = {
 {
  L"DirectShow Media Split Filter",
  &CLSID_DShowMediaSplit,
  CDShowMediaSplitFilt::CreateInstance,
  NULL,
  &sudTransformSample
 }
 };
int g_cTemplates = sizeof (g_Templates) / sizeof (g_Templates [0]);
CUnknown *WINAPI CDShowMediaSplitFilt::CreateInstance (LPUNKNOWN pUnk, HRESULT *pHr)
{
 CDShowMediaSplitFilt *pNewObject = new CDShowMediaSplitFilt (
  NAME ("DirectShow Media Split Filter"), pUnk, pHr);
 if (pNewObject == NULL)
  *pHr = E_OUTOFMEMORY;
 return pNewObject;
}
STDMETHODIMP CDShowMediaSplitFilt::NonDelegatingQueryInterface (REFIID riid, void **ppv)
{
 if (riid == IID_IDShowMediaSplitFilt)
  return GetInterface (static_cast <IDShowMediaSplitFilt *> (this),
  ppv);
 return CTransInPlaceFilter::NonDelegatingQueryInterface (riid, ppv);
}
STDMETHODIMP CDShowMediaSplitFilt::SetIntervalList (const vector <TimeBlock> &lst)
{
 for (size_t i = 0; i < lst.size (); i++)
 {
  TimeBlockEx timeBlockEx (lst.at (i).startTime, lst.at (i).endTime, 0, 0);
  timeBlockList_.push_back (timeBlockEx);
 }
 return S_OK;
}
HRESULT CDShowMediaSplitFilt::Transform (IMediaSample *pSample)
{
 LONGLONG sampleStartTime = 0;
 LONGLONG sampleEndTime = 0;
 // A sample without time stamps cannot be trimmed, so pass it through
 if (FAILED (pSample->GetTime (&sampleStartTime, &sampleEndTime)))
  return NOERROR;
 // Total time by which the sample has to be shifted
 LONGLONG totalDelta = 0;
 
 for (size_t i = 0; i < timeBlockList_.size (); i++)
 {
  // Check this sample time - is it needed to skip this
  if (sampleEndTime > timeBlockList_.at (i).startTime &&
   sampleEndTime < timeBlockList_.at (i).endTime)
   timeBlockList_.at (i).needToSkip = true;
  else
   timeBlockList_.at (i).needToSkip = false;
  
  // Search the first sample time, which is needed to be
  // skipped and store it
  if (sampleEndTime >= timeBlockList_.at (i).startTime &&
   !timeBlockList_.at (i).skippedFound)
  {
   timeBlockList_.at (i).timeAtSkipped = sampleEndTime;
   timeBlockList_.at (i).skippedFound = true;
  }
  // Search the first sample time, which need not to be
  // skipped and store it
  if (sampleEndTime >= timeBlockList_.at (i).endTime &&
   !timeBlockList_.at (i).nonSkippedFound)
  {
   timeBlockList_.at (i).timeAtNonSkipped = sampleEndTime;
   timeBlockList_.at (i).nonSkippedFound = true;
  }
  // Both times are known, so calculate "delta" as their
  // difference (a negative value) and add it to the total
  // shift, which accumulates over all cut time blocks
  // passed so far
  if (timeBlockList_.at (i).skippedFound &&
   timeBlockList_.at (i).nonSkippedFound)
  {
   LONGLONG delta = timeBlockList_.at (i).timeAtSkipped -
    timeBlockList_.at (i).timeAtNonSkipped;
   totalDelta += delta;
   // New sample times, shifted so that playback continues without a gap
   LONGLONG newSampleStartTime = sampleStartTime + totalDelta;
   LONGLONG newSampleEndTime = sampleEndTime + totalDelta;
   // Setting new times
   pSample->SetTime (&newSampleStartTime, &newSampleEndTime);
  }
  // If the current sample time is inside "i"-th interval,
  // we need to skip it
  if (timeBlockList_.at (i).needToSkip)
   return S_FALSE;
 }
 
 return NOERROR;
}
HRESULT CDShowMediaSplitFilt::CheckInputType (const CMediaType *pMediaTypeIn)
{
 return S_OK;
}
// Exported entry points for registration and unregistration
STDAPI DllRegisterServer ()
{
 return AMovieDllRegisterServer2 (TRUE);
}
STDAPI DllUnregisterServer ()
{
 return AMovieDllRegisterServer2 (FALSE);
}

The IDShowMediaSplitFilt interface is written in COM style and lets the calling application set the time blocks to be excluded from the input media file. All core functionality lives in the filter's Transform method. The filter code is heavily simplified: our filter accepts any media type and does not check whether the content it receives is encoded or decoded, although it is meant to work only with decoded content. Now let's look at the demo application source code, shown below:

C++
// DirectShow Media Split Filter demo application source file
// by Kovalev Maxim
#include <cstdlib> // EXIT_SUCCESS
#include <iostream>
using std::cout;
using std::endl;
#include "../DShowMediaSplitFilt/DShowMediaSplitFilt.h"
// Loads a manually created graph (.GRF) file. Errors are reported
// through C++ exceptions to keep the control flow simple
HRESULT LoadGraphFile (IGraphBuilder *pGraph, const WCHAR *wszName)
{
 IStorage *pStorage = NULL;
 IPersistStream *pPersistStream = NULL;
 IStream *pStream = NULL;
 HRESULT hr = S_OK;
 
 try
 {
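  // StgIsStorageFile returns S_FALSE (a success code) when the
  // file is not a storage file, so compare against S_OK
  hr = StgIsStorageFile (wszName);
  if (hr != S_OK)
  {
   if (SUCCEEDED (hr))
    hr = E_FAIL;
   throw "Graph file not found or not a storage file";
  }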
  
  hr = StgOpenStorage (wszName, 0,
   STGM_TRANSACTED | STGM_READ | STGM_SHARE_DENY_WRITE,
   0, 0, &pStorage);
  
  if (FAILED (hr))
   throw "Storage opening failed";
  
  hr = pGraph->QueryInterface (IID_IPersistStream,
   reinterpret_cast <void **> (&pPersistStream));
  
  if (FAILED (hr))
   throw "Couldn't query interface";
  
  hr = pStorage->OpenStream (L"ActiveMovieGraph", 0,
   STGM_READ | STGM_SHARE_EXCLUSIVE, 0, &pStream);
  
  if (FAILED (hr))
   throw "Couldn't open ActiveMovieGraph";
  
  hr = pPersistStream->Load (pStream);
  if (FAILED (hr))
   throw "Couldn't load stream data";
 }
 catch (const char *str) // string literals are thrown as const char *
 {
  cout << "Error occurred: " << str << endl;
 }
 
 if (pStream) pStream->Release ();
 if (pPersistStream) pPersistStream->Release ();
 if (pStorage) pStorage->Release ();
 
 return hr;
}
// App entry point. Here we demonstrate how to use our "DirectShow Transform Filter"
int main ()
{
 const WCHAR GRAPH_FILE_NAME [] = L"Graph.GRF";
 const LONGLONG SCALE_FACTOR = 10000000;
 
 IGraphBuilder *pGraph = NULL;
 IMediaControl *pControl = NULL;
 IMediaEvent *pEvent = NULL;
 HRESULT hr = S_OK;
 IBaseFilter *pVideoSplitFilt = NULL;
 IDShowMediaSplitFilt *pVideoSplitConfig = NULL;
 IBaseFilter *pAudioSplitFilt = NULL;
 IDShowMediaSplitFilt *pAudioSplitConfig = NULL;
 IReferenceClock *pClock = NULL;
 IEnumFilters *pFiltEnum = NULL;
 
 try
 {
  cout << "Process started" << endl;
  hr = CoInitialize (NULL);
  if (FAILED (hr))
   throw "Could not initialize COM library";
  
  hr = CoCreateInstance (CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER,
   IID_IGraphBuilder, reinterpret_cast <void **> (&pGraph));
  if (FAILED (hr))
   throw "Could not create the Filter Graph Manager";
  
  hr = LoadGraphFile (pGraph, GRAPH_FILE_NAME);
  if (FAILED (hr))
   throw "Couldn't load graph file";
  
  hr = pGraph->FindFilterByName (L"DirectShow Media Split Filter",
   &pVideoSplitFilt);
  if (FAILED (hr))
   throw "\"DirectShow Media Split Filter\" not found in graph";
  
  hr = pVideoSplitFilt->QueryInterface (IID_IDShowMediaSplitFilt,
   reinterpret_cast <void **> (&pVideoSplitConfig));
  if (FAILED (hr))
   throw "IDShowMediaSplitFilt config for video splitter couldn't be retrieved";
  
  hr = pGraph->FindFilterByName (L"DirectShow Media Split Filter 0001",
   &pAudioSplitFilt);
  if (FAILED (hr))
   throw "\"DirectShow Media Split Filter 0001\" not found in graph";
  
  hr = pAudioSplitFilt->QueryInterface (IID_IDShowMediaSplitFilt,
   reinterpret_cast <void **> (&pAudioSplitConfig));
  if (FAILED (hr))
   throw "IDShowMediaSplitFilt config for audio splitter couldn't be retrieved";
  
  vector <TimeBlock> timeBlockList;
  timeBlockList.push_back (TimeBlock (1 * SCALE_FACTOR, 3 * SCALE_FACTOR));
  timeBlockList.push_back (TimeBlock (5 * SCALE_FACTOR, 9 * SCALE_FACTOR));
  
  // Set the same cut intervals for both filter instances (video and audio)
  pVideoSplitConfig->SetIntervalList (timeBlockList);
  pAudioSplitConfig->SetIntervalList (timeBlockList);
  
  hr = CoCreateInstance (CLSID_SystemClock, NULL, CLSCTX_INPROC_SERVER,
   IID_IReferenceClock, reinterpret_cast <void **> (&pClock));
  if (FAILED (hr))
   throw "Failed to create reference clock";
  
  hr = pGraph->EnumFilters (&pFiltEnum);
  if (FAILED (hr))
   throw "Couldn't enumerate graph filters";
  
  IBaseFilter *pCurrFilt = NULL;
  while (pFiltEnum->Next (1, &pCurrFilt, 0) == S_OK)
  {
   pCurrFilt->SetSyncSource (pClock);
   pCurrFilt->Release ();
  }
  
  hr = pGraph->QueryInterface (IID_IMediaControl, reinterpret_cast <
   void **> (&pControl));
  if (FAILED (hr))
   throw "Media control couldn't be retrieved";
  
  hr = pGraph->QueryInterface (IID_IMediaEvent, reinterpret_cast <
   void **> (&pEvent));
  if (FAILED (hr))
   throw "Media event couldn't be retrieved";
  
  hr = pControl->Run ();
  if (FAILED (hr))
   throw "Graph couldn't be started";
  
  long evCode = 0;
  pEvent->WaitForCompletion (INFINITE, &evCode);
 }
 catch (const char *str) // string literals are thrown as const char *
 {
  cout << "Error occurred: " << str << endl;
 }
 
 if (pFiltEnum) pFiltEnum->Release ();
 if (pClock) pClock->Release ();
 if (pAudioSplitConfig) pAudioSplitConfig->Release ();
 if (pAudioSplitFilt) pAudioSplitFilt->Release ();
 if (pVideoSplitConfig) pVideoSplitConfig->Release ();
 if (pVideoSplitFilt) pVideoSplitFilt->Release ();
 if (pEvent) pEvent->Release ();
 if (pControl) pControl->Release ();
 if (pGraph) pGraph->Release ();
 
 CoUninitialize ();
 cout << "Done." << endl;
 
 return EXIT_SUCCESS;
}

As we can see, this is DirectShow-specific code that shows how filter properties can be set through a COM interface. In our application, the properties are the time blocks to exclude. Each time block is given in seconds and multiplied by a scale factor, which converts it to a DirectShow reference time (100-nanosecond units).

Points of Interest

Of course, there are many other ways to do a similar task. For example, we could use the DirectShow Editing Services. But in my opinion, non-standard approaches are more interesting to implement.

History

  • 19th March, 2008: Initial post

    Some time ago I was very interested in implementing the task described above, but I couldn't find any useful material or code for it, so I decided to write my own. The goal was to trim video programmatically using DirectShow. As we can see, the foundation was built successfully.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)