
.NET Implementation of an Ogg Vorbis Player

4 Apr 2007
An article on decoding Ogg Vorbis audio files in .NET.

Screenshot - OggPlayer.jpg

Introduction

The TgPlayOgg project is a .NET C# library that lets you play Ogg Vorbis files from managed code. TgPlayOgg decodes a given Ogg Vorbis file into usable sound data by calling into an unmanaged C++ project, TgPlayOgg_vorbisfile. TgPlayOgg also requires Managed DirectX for sound output.

Background

At TrayGames, we needed to add support for playing sound files to the multi-player online game development SDK (TGSDK) provided to third party developers. We started out using the MP3 audio format, but we were concerned about licensing issues (fees kick in after you reach a certain level of sales). After comparing alternatives, we chose to use the Ogg Vorbis format. Ogg Vorbis is a completely open, patent-free, professional audio encoding and streaming technology with all the benefits of Open Source.

Using the Code

If you download the source, there is an "OggPlayer.sln" solution file under the "OggPlayer Sample" folder that will build all of the projects mentioned in this article. A sample test application has been provided in the "Test App" folder under the TgPlayOgg project. This application demonstrates how to use the library. The steps are as follows:

  • Include a reference to the TgPlayOgg project and import the TG.Sound namespace.
  • Construct an instance of the OggPlay class (your app only needs one instance no matter how many Ogg Vorbis files are played simultaneously).
  • Attach your event handler to the PlayOggFileResult event. You are now ready to call PlayOggFile as many times as you want and whenever you want. Note that PlayOggFile returns immediately, since decoding and playback are done in separate threads. Your event handler will be called when a file has finished playing.
  • When you are all done using the instance of the OggPlay class, you will want to call Dispose to be sure that the unmanaged resources used by the DirectSound Device object get cleaned up.

Let's take a look at the highlights of the test application. First, it has a method that initializes the player and attaches a handler to the PlayOggFileResult event. Note that we must call the OggPlay constructor in a try block, since it makes calls into DirectSound that may raise exceptions. A second method lets the user choose an Ogg Vorbis sound file to open for playback.

using TG.Sound;

private void InitTestOfOggPlayer()
{
  try
  {
    oplay = new OggPlay(this, OggSampleSize.SixteenBits);
    oplay.PlayOggFileResult += new PlayOggFileEventHandler(PlayOggFileResult);

    textBox1.Text = "Initialization successful.\r\n";
  }
  catch(Exception e)
  {
    textBox1.Text = "Initialization failed: " + e.Message + "\r\n";
  }
}
  
private void Button1Click(object sender, System.EventArgs e)
{
  OggName = GetOggFileNameToOpen();
  
  if (OggName != null)
  {
    oplay.PlayOggFile(OggName, ++PlayId);
    textBox1.Text = "Playing " + OggName + " Id= " + PlayId.ToString() + "\r\n";
  }        
}

The Ogg Vorbis decoder may encounter errors while decoding the Ogg Vorbis data. A file may fail to play at all if it does not contain enough data to fill the initial buffers (the Ogg is too small) or if the file simply cannot be read. In addition, two error counts are reported for informational purposes only: when playback succeeds, the decoded waveform data was played, but it may not have sounded as intended if either count is nonzero. Our PlayOggFileResult handler displays a status message indicating success or failure, along with the two error counts. We'll learn what these error counts mean later.

private static void PlayOggFileResult(object sender, PlayOggFileEventArgs e)
{
  if (e.Success)
  {
    MainForm.textBox1.Text += "PlayOggFile(" + e.PlayId + ") succeeded ("
      + "ErrorHoleCount: " + e.ErrorHoleCount + ", ErrorBadLinkCount: " 
      + e.ErrorBadLinkCount + ").\r\n";
  }
  else
  {
    MainForm.textBox1.Text += "PlayOggFile(" + e.PlayId + ") failed: '" 
      + e.ReasonForFailure + "'\r\n";
  }

  PlayId--;
}

Note that exiting the calling application does not kill the playback threads if one or more Ogg Vorbis files are still playing. The threads keep running, although you can no longer hear them, until they finish the files they were playing, unless they are specifically told to stop. So, when your application exits, it should stop any Ogg Vorbis files that are still playing. This is why the test application handles the Form.Closing event by calling OggPlay.StopOggFile, which you will learn more about later.

protected void Form1_Closing(object sender, 
                      System.ComponentModel.CancelEventArgs e)
{
  // Determine if any Ogg files are still playing by checking the PlayId member
  if (PlayId > 0)
  {
    // Ask the user whether to exit while files are still playing
    if (MessageBox.Show("Ogg files are still playing," +
        " are you sure you want to exit?", "TrayGames Ogg Player",
        MessageBoxButtons.YesNo) == DialogResult.No)
    {
      // Cancel the Closing event from closing the form
      e.Cancel = true;

      // Wait for files to finish playing...
    }
    else
    {
      // Kill all outstanding playbacks
      while (PlayId > 0)
        oplay.StopOggFile(PlayId--);
    }
  }
}

You might also want to stop playback threads midstream if your application has a pause capability, or if you want to reset sounds when the user switches away from your game.

The Ogg Vorbis Wrapper

Vorbisfile, the high-level Ogg Vorbis API, accepts input in only two ways: through a C file pointer, or through a set of custom callback functions that read the Ogg Vorbis data. Custom callbacks are probably the better and more portable choice, but at the time I wasn't aware that .NET 1.1 gave any control over the calling convention of its methods. Its default calling convention is StdCall, while the Vorbisfile dynamic link libraries (DLLs) are compiled with the Cdecl calling convention. So, given C# and .NET 1.1, we decided to write some C/C++ code, including the callbacks that Vorbisfile needs, and compile it into a DLL. This is why we created the TgPlayOgg_vorbisfile wrapper project.
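To give a feel for the custom-callback route, here is a minimal sketch of four functions that follow the signatures Vorbisfile's ov_callbacks structure expects (read, seek, close, tell) but serve data from a memory buffer instead of a file. The MemorySource type and the mem_* names are assumptions made for illustration, and std::int64_t stands in for ogg_int64_t so that the sketch compiles without the Vorbis headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>   // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>

// Hypothetical in-memory data source; not part of the article's code.
struct MemorySource {
    const unsigned char* data;
    std::size_t size;
    std::size_t pos;   // invariant: pos <= size
};

// Same shape as ov_callbacks::read_func: returns the number of items read.
std::size_t mem_read(void* ptr, std::size_t size, std::size_t nmemb, void* datasource) {
    MemorySource* src = static_cast<MemorySource*>(datasource);
    if (size == 0 || nmemb == 0) return 0;
    std::size_t items = (src->size - src->pos) / size;
    if (items > nmemb) items = nmemb;
    std::memcpy(ptr, src->data + src->pos, items * size);
    src->pos += items * size;
    return items;
}

// Same shape as ov_callbacks::seek_func: 0 on success, -1 on failure.
int mem_seek(void* datasource, std::int64_t offset, int whence) {
    MemorySource* src = static_cast<MemorySource*>(datasource);
    std::int64_t base = 0;
    if (whence == SEEK_CUR) base = static_cast<std::int64_t>(src->pos);
    else if (whence == SEEK_END) base = static_cast<std::int64_t>(src->size);
    else if (whence != SEEK_SET) return -1;
    std::int64_t target = base + offset;
    if (target < 0 || target > static_cast<std::int64_t>(src->size)) return -1;
    src->pos = static_cast<std::size_t>(target);
    return 0;
}

// Same shape as ov_callbacks::tell_func: current read position.
long mem_tell(void* datasource) {
    return static_cast<long>(static_cast<MemorySource*>(datasource)->pos);
}

// Same shape as ov_callbacks::close_func; the caller owns the buffer here.
int mem_close(void*) { return 0; }
```

With the real headers available, functions like these would be placed in an ov_callbacks structure and handed to ov_open_callbacks together with a pointer to the data source. The wrapper shown in this article uses the file-pointer route instead.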

I have since learned that the DllImportAttribute class lets you provide the information needed to call a function exported from an unmanaged DLL, including its calling convention. So you should be able to modify the source code of this library to eliminate the TgPlayOgg_vorbisfile wrapper project and make the Vorbisfile API calls directly from the TgPlayOgg project. The CallingConvention enumeration in the .NET Framework Base Class Library (BCL) includes the StdCall, Cdecl, ThisCall, and Winapi calling conventions. Winapi selects the platform default automatically (StdCall on Windows, Cdecl on Windows CE). For example, to declare that SomeFunction in SomeLibrary.dll uses the Cdecl calling convention, you could use the following code:

[DllImport("SomeLibrary.DLL", EntryPoint="SomeFunction",  SetLastError=true,
CharSet=CharSet.Unicode, ExactSpelling=true,
CallingConvention=CallingConvention.Cdecl)]
public static extern bool SomeFunction(String param1, String param2);

For now, though, the TgPlayOgg_vorbisfile project makes the calls into the Ogg Vorbis API for us. It exposes three wrapper functions: init_file_for_ogg_decode, ogg_decode_one_vorbis_packet, and ogg_final_cleanup. You will never call these functions directly; the C# library calls them to decode the file. If you want to declare these functions in your own managed application, define a NativeMethods class (any class name will do) and add the function prototypes to it. Keeping the declarations for unmanaged DLL functions in a separate class is advisable because consuming DLL functions is error-prone, and encapsulating the declarations makes debugging easier. The "Vorbisapi.cs" file in the TgPlayOgg_vorbisfile project already has such a class definition containing the declarations:

// External C functions in the TgPlayOgg_vorbisfile unmanaged DLL.
// Without an EntryPoint, each method name must match the exported function name.

[DllImport("TgPlayOgg_vorbisfile.dll", CharSet=CharSet.Unicode,
  CallingConvention=CallingConvention.Cdecl)]
public unsafe static extern int init_file_for_ogg_decode(
  string fileName, void **vf_out);

[DllImport("TgPlayOgg_vorbisfile.dll", CallingConvention=CallingConvention.Cdecl)]
public unsafe static extern int ogg_decode_one_vorbis_packet(
  void *vf_ptr, void *buf_out, int buf_byte_size,
  int bits_per_sample, int *channels_cnt, int *sampling_rate,
  int *err_ov_hole_cnt, int *err_ov_ebadlink_cnt);

[DllImport("TgPlayOgg_vorbisfile.dll", CallingConvention=CallingConvention.Cdecl)]
public unsafe static extern int ogg_final_cleanup(void *vf_ptr);

What allows us to make these calls is the Platform Invoke (PInvoke) service. PInvoke will enable our managed code to call the unmanaged functions implemented in the DLL. It will locate and invoke the exported functions and marshal their parameters across the managed/unmanaged code boundary as needed. Note that PInvoke throws exceptions generated by the unmanaged function to the managed caller. Let's look at our unmanaged functions now.
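For PInvoke to find a function by name, the unmanaged DLL has to export it under that name and with the calling convention that the DllImport declaration promises. The listings in this article don't show the export declarations, so the following is only a sketch of how a C++ function might be exported. The TG_EXPORT and TG_CALL macros and the tg_bytes_per_frame function are hypothetical names, not part of the library.

```cpp
#include <cassert>

// Assumed export macros; the article does not show how its DLL exports functions.
#if defined(_WIN32)
  #define TG_EXPORT extern "C" __declspec(dllexport)
  #define TG_CALL __cdecl
#else
  #define TG_EXPORT extern "C"
  #define TG_CALL
#endif

// Hypothetical exported function. extern "C" turns off C++ name mangling so a
// managed declaration can refer to the function by its plain name, and
// TG_CALL pins the calling convention to Cdecl on Windows.
TG_EXPORT int TG_CALL tg_bytes_per_frame(int channels, int bits_per_sample) {
    assert(channels > 0 && (bits_per_sample == 8 || bits_per_sample == 16));
    return channels * (bits_per_sample / 8);
}
```

A matching managed declaration would then name the same function and specify CallingConvention.Cdecl, as in the DllImport examples above.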

The init_file_for_ogg_decode function opens and initializes the given Ogg Vorbis file for decoding. It sets up all the related decoding structures by calling the ov_open API function. Be aware that ov_open, once successful, takes complete possession of the file resource: after you have opened a file using ov_open, you must close it using ov_clear, not fclose or any other function. Our wrapper functions take care of all of this. Here's what our initialize function looks like:

int init_file_for_ogg_decode(wchar_t *filename, void **vf_out)
{
  // . . .

  int ov_ret = ov_open(file_ptr, static_cast<OggVorbis_File*>(vf_ptr), NULL, 0);

  if (ov_ret < 0)
  {
    // There was an error so clean up now
    fclose(file_ptr);
    free(vf_ptr);

    // Return the ifod_err_ code
    return err_code;
  }

  // Copy the memory pointer to the caller
  *vf_out = vf_ptr;

  return 0;  // success
}

The ogg_decode_one_vorbis_packet function writes PCM (Pulse Code Modulation) data into the given buffer and returns the number of bytes written into that buffer. First it calls ov_read, which returns up to the specified number of bytes of decoded PCM audio in the requested endianness, signedness, and word size. If the audio is multichannel, the channels are interleaved in the output buffer. This function is meant to be called in a loop to decode a Vorbis file. Our C# application, which we'll see later, does just that.
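To make the looping idea concrete, here is a small sketch of a caller that keeps asking for chunks until the decoder reports the end of the stream. DecodeChunkFn is a stand-in for a call such as ogg_decode_one_vorbis_packet (it is an assumption, not the library's API), which lets the sketch run without the Vorbis libraries. Unlike the real player, which streams each chunk to the sound buffer, this sketch simply collects everything in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for a "decode one chunk" call: fills buf with up to buf_size bytes
// and returns the byte count, or 0 at end of stream (assumed contract).
using DecodeChunkFn = std::function<int(char* buf, int buf_size)>;

// Calls decode_chunk until it reports end of stream and collects all PCM bytes.
std::vector<char> decode_all(const DecodeChunkFn& decode_chunk, int chunk_size) {
    assert(chunk_size > 0);
    std::vector<char> pcm;
    std::vector<char> chunk(static_cast<std::size_t>(chunk_size));
    for (;;) {
        int got = decode_chunk(chunk.data(), chunk_size);
        if (got <= 0) break;  // 0 means end of stream in this sketch
        pcm.insert(pcm.end(), chunk.begin(), chunk.begin() + got);
    }
    return pcm;
}
```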

Next, the function calls ov_info, which returns the vorbis_info struct for the specified bitstream. This lets us return the number of channels and the sampling rate of the bitstream to our C# application. Two errors are counted: OV_HOLE, which indicates there was an interruption in the data, and OV_EBADLINK, which indicates that an invalid stream section was supplied or the requested link is corrupt.

The Ogg Vorbis format allows for multiple logical bitstreams to be combined (with restrictions) into a single physical bitstream. Note that the Vorbisfile API could more or less hide the multiple logical bitstream nature of chaining from an application but, when reading audio back, the application must be aware that multiple bitstream sections do not necessarily use the same number of channels or sampling rate. The Ogg Vorbis documentation provides more information on Ogg logical bitstream framing.

int ogg_decode_one_vorbis_packet(void *vf_ptr, 
            void *buf_out, int buf_byte_size, 
            int ogg_sample_size, 
            int *channels_cnt, int *sampling_rate, 
            int *err_ov_hole_cnt, int *err_ov_ebadlink_cnt)
{
  // . . .

  for (bytes_put_in_buf = 0;;)
  {
    long ov_ret = ov_read(static_cast<OggVorbis_File*>(vf_ptr),
      static_cast<char*>(buf_out), buf_byte_size, 0,
      word_size, want_signed, &bitstream);

    if (ov_ret == 0)  // at EOF
    {
      break;
    }
    else if (ov_ret < 0)
    {
      // An error occurred, bad ogg data of some kind
      if (ov_ret == OV_HOLE)
        ++(*err_ov_hole_cnt);
      else if (ov_ret == OV_EBADLINK)
        ++(*err_ov_ebadlink_cnt);
    }
    else
    {
      assert(ov_ret <= buf_byte_size);

      vorbis_info* vi_ptr = ov_info(static_cast<OggVorbis_File*>(vf_ptr),
                                    bitstream);
      if (vi_ptr != NULL)
      {
        // Number of channels in the bitstream
        *channels_cnt = vi_ptr->channels;

        // Sampling rate of the bitstream
        *sampling_rate = vi_ptr->rate;
      }

      bytes_put_in_buf = ov_ret;
      break;
    }
  }
  
  return bytes_put_in_buf;
}
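Because each call reports the channel count and sampling rate of the chunk it just decoded, a caller can notice when a chained stream switches format partway through. The following is a minimal sketch of that check; PcmFormat and FormatTracker are hypothetical names, and the library's own bookkeeping may look different.

```cpp
#include <cassert>

// Format reported alongside a decoded chunk (hypothetical type).
struct PcmFormat {
    int channels;
    int sampling_rate;
};

// Remembers the format of the first chunk and flags any later change.
class FormatTracker {
public:
    FormatTracker() : has_format_(false), format_() {}

    // Returns true if playback can continue with this chunk's format.
    bool accept(const PcmFormat& chunk_format) {
        assert(chunk_format.channels > 0 && chunk_format.sampling_rate > 0);
        if (!has_format_) {
            format_ = chunk_format;   // first chunk defines the playback format
            has_format_ = true;
            return true;
        }
        return format_.channels == chunk_format.channels &&
               format_.sampling_rate == chunk_format.sampling_rate;
    }

private:
    bool has_format_;
    PcmFormat format_;
};
```

A player would create one tracker per file and stop, or rebuild its sound buffer, as soon as accept returns false.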

After a bitstream has been opened using ov_open and decoding is complete, the application must call ov_clear to clear the decoder's buffers and close the file. The ogg_final_cleanup function does this, and it also frees the memory pointed to by vf_ptr. See the Vorbisfile API documentation for more information on any of these functions.

int ogg_final_cleanup(void *vf_ptr)
{
  int ret = 0;
  
  if (vf_ptr != NULL)
  {
      // ov_clear returns non-zero on failure
      ret = ov_clear(static_cast<OggVorbis_File*>(vf_ptr));

      free(vf_ptr);
  }
  
  return ret;
}

The .NET Ogg Vorbis Library

The Microsoft .NET 1.1 Framework has no sound-playing classes, so there are basically two ways to play the waveform data produced by decoding an Ogg Vorbis file. The first is to write the waveform data out as a WAV file, and then use quartz.dll (on Win98 and later) to play that WAV file. The disadvantages are that WAV files can be very large (in our test, a 5.5 MB Ogg Vorbis file produced a 67 MB WAV file), and that playback can't begin until the entire WAV file has been written (decoding that 5.5 MB file and writing out the WAV took more than 20 seconds on a 1.6 GHz P4 PC). The second is to use Managed DirectX, which requires no WAV file and lets us play the waveform data as it is generated, so playback can begin much sooner. The TrayGames client already ensures that the Managed DirectX APIs are installed on target computers, so this was not an issue for us, and it's the choice we went with.
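The size gap follows directly from how uncompressed PCM is stored: every second of audio costs sampling rate × channels × bytes per sample. Here is a small sketch of that arithmetic; the function name is hypothetical, and the 44.1 kHz, 16-bit stereo format used in the comments is an assumption, since the format of the test file isn't stated.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of uncompressed PCM needed for the given duration.
// Example: 44100 Hz * 2 channels * 2 bytes = 176400 bytes per second.
std::int64_t pcm_byte_size(int sampling_rate, int channels,
                           int bits_per_sample, double seconds) {
    assert(sampling_rate > 0 && channels > 0);
    assert(bits_per_sample > 0 && bits_per_sample % 8 == 0);
    assert(seconds >= 0.0);
    std::int64_t bytes_per_second =
        static_cast<std::int64_t>(sampling_rate) * channels * (bits_per_sample / 8);
    return static_cast<std::int64_t>(bytes_per_second * seconds);
}
```

Under that assumed format, a 67 MB WAV file corresponds to a bit over six minutes of audio, all of which has to be decoded and written to disk before the first sample can be heard.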

The OggPlay class is the main class that your application will use. Its constructor creates a new DirectSound device, sets the device's cooperative level, and stores the sample size to use when decoding Ogg Vorbis files.

public OggPlay(Control owner, OggSampleSize wantedOggSampleSize)
{
  // Set DirectSoundDevice
  DirectSoundDevice = new Device();

  // NOTE: The DirectSound documentation recommends
  // CooperativeLevel.Priority for games
  DirectSoundDevice.SetCooperativeLevel(owner,
                    CooperativeLevel.Priority);

  // Set OggSampleSize
  OggFileSampleSize = wantedOggSampleSize;
}

The owner parameter is used by the DirectSound SetCooperativeLevel method, which defines its owner parameter as "The System.Windows.Forms.Control of the application that is using the Device object". This should probably be your application's main window. The wantedOggSampleSize parameter is either 8 bits or 16 bits. 8-bit sample size has lower quality but is faster and takes less memory than 16-bit sample size. If your application's Ogg Vorbis files are encoded with 8-bit sample size, then choose 8 (you can also choose 16, but it's wasteful and gains nothing if the Ogg Vorbis sources are only 8-bit). If your application's Ogg Vorbis files are encoded with 16-bit sample size, then choose 16 to get the full sound quality during playback, or choose 8, or give the user the option of choosing 8, if you want to minimize playback resource requirements. If your application's Ogg Vorbis files are a mixture (some are encoded with 8-bit sample size and others are encoded with 16-bit sample size), then choose whichever you think is best (either setting, 8 or 16 bits, will play all the Ogg Vorbis files).
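On the unmanaged side, the chosen sample size eventually has to become the word and sgned arguments of ov_read (1 or 2 bytes per sample, unsigned or signed). The part of the wrapper that does this is elided from the listing shown earlier, so the following is only a sketch of one plausible mapping. The ReadParams type and the function name are assumptions. The sketch follows the usual PCM convention that 8-bit samples are unsigned and 16-bit samples are signed.

```cpp
#include <cassert>

// Values that would be passed as ov_read's word and sgned arguments
// (hypothetical helper type, not part of the article's code).
struct ReadParams {
    int word_size;    // bytes per sample: 1 or 2
    int want_signed;  // 0 = unsigned samples, 1 = signed samples
};

// Maps a sample size in bits to ov_read-style parameters.
// Returns false for any size other than 8 or 16 and leaves *out unchanged.
bool sample_size_to_read_params(int bits_per_sample, ReadParams* out) {
    assert(out != nullptr);
    if (bits_per_sample == 8) {
        out->word_size = 1;
        out->want_signed = 0;  // 8-bit PCM is conventionally unsigned
        return true;
    }
    if (bits_per_sample == 16) {
        out->word_size = 2;
        out->want_signed = 1;  // 16-bit PCM is conventionally signed
        return true;
    }
    return false;
}
```

Since the word size is the number of bytes per sample, choosing 8 bits halves the amount of PCM data that has to be decoded, buffered, and played.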

The TgPlayOgg library declares two events with delegates and an event argument class (which defines data for both events) for playing and stopping Ogg Vorbis files. The PlayOggFileResult event (PlayOggFileEventHandler delegate) is used for event notification when the PlayOggFile method completes, while the StopOggFileNow event (StopOggFileEventHandler delegate) is used when the client wishes to interrupt playback prematurely. Here's a look at the data members of the event argument class.

public sealed class PlayOggFileEventArgs : EventArgs
{
  private bool success;

  // If !Success then this is the explanation for the failure
  private string reasonForFailure;

  // The value of the playId parameter when PlayOggFile() was called
  private int playId;

  // Count of OV_HOLE errors encountered during decoding;
  // indicates there was an interruption in the data
  public int ErrorHoleCount;

  // Count of OV_EBADLINK errors encountered during decoding;
  // indicates that an invalid stream section was supplied
  // to libvorbisfile
  public int ErrorBadLinkCount;

  // . . .

}

OggPlay provides two simple methods, PlayOggFile and StopOggFile. PlayOggFile plays the Ogg Vorbis file specified by the fileName parameter. The playId parameter is an arbitrary value chosen by the caller, and it is returned in the PlayOggFileResult event, which is raised by PlayOggFileThreadProc. In your event handler, you can use the returned playId to tell which PlayOggFile call the event belongs to. This is why your application should attach a handler to the PlayOggFileResult event.

public void PlayOggFile(string fileName, int playId)
{
  PlayOggFileEventArgs EventArgs = new PlayOggFileEventArgs(playId);
  
  // Decode the ogg file in a separate thread

  PlayOggFileThreadInfo pofInfo = new PlayOggFileThreadInfo(
    EventArgs, fileName, 
    OggFileSampleSize == OggSampleSize.EightBits ? 8 : 16,
    DirectSoundDevice, this);
  
  Thread PlaybackThread = new Thread(new 
         ThreadStart(pofInfo.PlayOggFileThreadProc));
  PlaybackThread.Start();
  Thread.Sleep(0);
}

StopOggFile raises the StopOggFileNow event. This event will be handled by the PlayOggFileThreadProc method. Your application does not need to attach to the StopOggFileEventHandler delegate, but PlayOggFileThreadProc does of course.

public void StopOggFile(int playId)
{
  PlayOggFileEventArgs EventArgs = new PlayOggFileEventArgs(playId);

  // The event is null until a playback thread has attached a handler
  if (StopOggFileNow != null)
    StopOggFileNow(this, EventArgs);
}

The OggPlay class contains the PlayOggFileThreadInfo class, which holds the state and the thread procedure for each playback thread created by the PlayOggFile method. In a way, this class sits between the managed and unmanaged environments: it does work on behalf of OggPlay by making calls into the unmanaged Ogg Vorbis wrapper described above. The main method in this class is PlayOggFileThreadProc, and we will look at some parts of it now.

The first thing that PlayOggFileThreadProc does is initialize the Ogg Vorbis file for decoding by calling into the Ogg Vorbis wrapper. If an error is encountered during initialization, it is reported through the PlayOggFileResult event (see below). Note that the filename, sample size, and DirectSound device are all passed to this class through its constructor. The constructor also registers the class's InterruptOggFilePlayback method to handle the StopOggFileNow event.

  int ErrorCode = NativeMethods.init_file_for_ogg_decode(FileName, &vf);
  
  if (ErrorCode != 0)
  {
    // . . .

    oplay.PlayOggFileResult(this, EventArgs);
    return;
  }

Next PlayOggFileThreadProc creates the PCM byte array and passes it to the ogg_decode_one_vorbis_packet function. This function will pass back the first chunk of decoded Ogg Vorbis data and its size.

  // Get next chunk of PCM data, pin these so GC can't relocate them
  fixed(byte *buf = &PcmBuffer[0])
  {
    fixed(int *HoleCount = &EventArgs.ErrorHoleCount)
    {
      fixed(int *BadLinkCount = &EventArgs.ErrorBadLinkCount)
      {
        // NOTE: The sample size of the returned PCM data -- either 8-bit
        //     or 16-bit samples -- is set by BitsPerSample
        PcmBytes = NativeMethods.ogg_decode_one_vorbis_packet(
          vf, buf, PcmBuffer.Length,
          BitsPerSample,
          &ChannelsCount, &SamplingRate,
          HoleCount, BadLinkCount);
      }
    }
  }

The first time we return from the ogg_decode_one_vorbis_packet function, we create DirectSound WaveFormat, BufferDescription, SecondaryBuffer, and Notify objects. WaveFormat is used to hold the format of the waveform audio data after it's been decoded. BufferDescription will describe the characteristics of the new buffer object, including the WaveFormat. The SecondaryBuffer has methods and properties used to manage the sound buffer. Notify allows us to set up notification triggers at different points during playback.

  int HoldThisManySamples = 
    (int)(SamplingRate * SecBufHoldThisManySeconds);
  
  // Set the format
  MyWaveFormat.AverageBytesPerSecond = AverageBytesPerSecond;
  MyWaveFormat.BitsPerSample = (short)BitsPerSample;
  MyWaveFormat.BlockAlign = (short)BlockAlign;
  MyWaveFormat.Channels = (short)ChannelsCount;
  MyWaveFormat.SamplesPerSecond = SamplingRate;
  MyWaveFormat.FormatTag = WaveFormatTag.Pcm;

  // Set BufferDescription
  MyDescription = new BufferDescription();

  MyDescription.Format = MyWaveFormat;
  MyDescription.BufferBytes =
    SecBufByteSize = HoldThisManySamples * BlockAlign;
  MyDescription.CanGetCurrentPosition = true;
  MyDescription.ControlPositionNotify = true;

  // Create the buffer
  SecBuf = new SecondaryBuffer(MyDescription, DirectSoundDevice);

  // Set 3 notification points, at 0, 1/3, and 2/3 SecBuf size
  MyNotify = new Notify(SecBuf);

  BufferPositionNotify[] MyBufferPositions = new BufferPositionNotify[3];

  MyBufferPositions[0].Offset = 0;
  MyBufferPositions[0].EventNotifyHandle =
            SecBufNotifyAtBegin.Handle;
  MyBufferPositions[1].Offset =
            (HoldThisManySamples / 3) * BlockAlign;
  MyBufferPositions[1].EventNotifyHandle =
            SecBufNotifyAtOneThird.Handle;
  MyBufferPositions[2].Offset =
            ((HoldThisManySamples * 2) / 3) * BlockAlign;
  MyBufferPositions[2].EventNotifyHandle =
            SecBufNotifyAtTwoThirds.Handle;

  MyNotify.SetNotificationPositions(MyBufferPositions);
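The listing uses BlockAlign and AverageBytesPerSecond without showing how they are computed. Assuming the standard PCM definitions (block align is channels × bytes per sample, and average bytes per second is sampling rate × block align), the buffer size and the three notification offsets work out as in the sketch below. The BufferLayout type and the function name are hypothetical, and hold_seconds stands in for SecBufHoldThisManySeconds, whose actual value isn't given.

```cpp
#include <cassert>

// Sizes and offsets derived from the decoded audio format (hypothetical type).
struct BufferLayout {
    int block_align;           // bytes per sample frame (all channels)
    int avg_bytes_per_second;  // sampling rate * block_align
    int buffer_bytes;          // total secondary buffer size
    int notify_offsets[3];     // byte offsets at 0, 1/3, and 2/3 of the buffer
};

BufferLayout make_buffer_layout(int sampling_rate, int channels,
                                int bits_per_sample, double hold_seconds) {
    assert(sampling_rate > 0 && channels > 0 && hold_seconds > 0.0);
    assert(bits_per_sample == 8 || bits_per_sample == 16);

    BufferLayout layout;
    layout.block_align = channels * (bits_per_sample / 8);
    layout.avg_bytes_per_second = sampling_rate * layout.block_align;

    // Work in whole sample frames so every offset stays frame-aligned.
    int hold_samples = static_cast<int>(sampling_rate * hold_seconds);
    layout.buffer_bytes = hold_samples * layout.block_align;
    layout.notify_offsets[0] = 0;
    layout.notify_offsets[1] = (hold_samples / 3) * layout.block_align;
    layout.notify_offsets[2] = ((hold_samples * 2) / 3) * layout.block_align;
    return layout;
}
```

Dividing the sample count first and multiplying by the block align afterwards, as the C# code does, keeps each notification point on a frame boundary, so a refill never starts in the middle of a sample.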

After these objects are prepared, we load the decoded PCM data into a MemoryStream object. This stream is written into the DirectSound buffer object and then played using the asynchronous Play method. This process is repeated until we reach the end of the Ogg Vorbis file. We must be aware that multiple bitstream sections do not necessarily use the same number of channels or sampling rate (we refer to this combination as the format). While we can handle a different format at the start of a new Ogg Vorbis file, we can't handle a format change during the playback of a file. Besides reaching the end of the file or encountering an error, a mid-file format change is therefore another reason the library will stop playback.

  // Copy the new PCM data into PCM memory stream
  PcmStream.SetLength(0);
  PcmStream.Write(PcmBuffer, 0, PcmBytes);
  PcmStream.Position = 0;
  PcmStreamNextConsumPcmPosition = 0;

  // Initial load of secondary buffer
  if (SecBufInitialLoad)
  {
    int WriteCount = (int)Math.Min(
      PcmStream.Length,
      SecBufByteSize - SecBufNextWritePosition);

    if (WriteCount > 0)
    {
      SecBuf.Write(
        SecBufNextWritePosition,
        PcmStream,
        WriteCount,
        LockFlag.None);

      SecBufNextWritePosition += WriteCount;
      PcmStreamNextConsumPcmPosition += WriteCount;
    }

    if (SecBufByteSize == SecBufNextWritePosition)
    {
      // Done filling the buffer
      SecBufInitialLoad = false;
      SecBufNextWritePosition = 0;

      // So start the playback
      // NOTE: Play does the playing in its own thread
      SecBuf.Play(0, BufferPlayFlags.Looping);

      // Yield rest of timeslice so playback can start right away
      Thread.Sleep(0);
    }
    else
    {
      continue;  // Get more PCM data
    }
  }
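The excerpt shows only the initial fill. Because the buffer is played with the Looping flag, later refills have to treat it as circular and wrap around when a write reaches the end. How TgPlayOgg handles this isn't shown here, so the following is a generic sketch of a wrap-around write rather than the library's code; ring_write is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Writes count bytes from src into a circular buffer starting at write_pos,
// wrapping to the start when the end is reached.
// Returns the position at which the next write should begin.
std::size_t ring_write(std::vector<unsigned char>& ring, std::size_t write_pos,
                       const unsigned char* src, std::size_t count) {
    assert(!ring.empty() && write_pos < ring.size() && count <= ring.size());

    // First part: from write_pos up to the end of the buffer.
    std::size_t first = ring.size() - write_pos;
    if (first > count) first = count;
    std::memcpy(ring.data() + write_pos, src, first);

    // Second part: whatever is left goes to the start of the buffer.
    std::memcpy(ring.data(), src + first, count - first);

    return (write_pos + count) % ring.size();
}
```

A streaming player would call something like this each time a notification event signals that the play cursor has moved past a third of the buffer, refilling only the region that has already been played.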

Points of Interest

Those are pretty much the highlights of the sample, TgPlayOgg, and TgPlayOgg_vorbisfile projects. These projects are interesting if you want to learn about decoding Ogg Vorbis audio files or as an example of how to call unmanaged code from the managed .NET environment. If you are interested in checking out the full TGSDK for producing your own multi-player online games, you can get it at the TrayGames web site. You may also want to check out the Ogg Vorbis web site to learn more about their encoding format and the many tools for manipulating it.

Revision History

  • 02 April 2007

    Updated this library to support Visual Studio .NET 2005 and made several bug fixes. The updated library is also available in the TGSDK, downloadable from the TrayGames Developer web site.

  • 07 March 2006

    Updated this library to support .NET 2.0, added a WaitForAllOggFiles method that will block until all outstanding Ogg files are finished playing, and made several bug fixes.

  • 22 August 2005

    Added more details on the Vorbisfile API functions that this library calls to decode an Ogg Vorbis sound file.

  • 11 August 2005

    Fixed some minor defects in the source code, updated the test application and fixed some grammatical errors in the article.

  • 18 July 2005

    Initial revision. Based on some welcomed feedback, I've updated the Ogg Vorbis Wrapper section of this article to talk about the .NET Framework Base Class Library DllImportAttribute class which can be used to call a function exported from an unmanaged DLL.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
