.NET Wrapper of FFmpeg Libraries

Maxim Kartavenkov

5.00/5 (28 votes)

15 Jul 2022CPOL50 min read

50K

2.3K

Article describes created .NET wrapper library

Article provides the documentation of the features of created .NET Wrapper of FFmpeg libraries. It contains description of architecture, core classes and include sample applications and results of code execution.

Introduction
Objective
How to read the documentation
Supported FFmpeg versions
Architecture
Core Classes
Advanced
Examples
Source Code
Compiled library
Building project
Licensing and distribution
History

Introduction

There are lots of attempts to create .NET wrappers of FFmpeg libraries. I saw different implementations, mostly just performing calling ffmpeg.exe with prepared arguments and parsing outputs from it. More productive results have some basic classes which can perform just one of few tasks of ffmpeg and whose classes are exposed as COM objects or regular C# classes, but their functionality is very limited and does not fully perform in .NET what actually can be done with full access to ffmpeg libraries in C++. I also have seen a totally .NET “unsafe” code implementation of wrapper, so the C# code must be also marked as “unsafe” and the functions of such wrapper have lots of pointers like “void *” and other similar types. So, usage of such code is also problematic and I am just talking about C# and not take other languages such as VB.NET.

Current implementation of .NET wrapper is representation of all exported structures from ffmpeg libraries, along with APIs which can operate with those objects from managed code either from C# or VB.NET or even Delphi.NET. It is not just calls of certain APIs of a couple predefined objects - all API exposed as methods. You may ask a good question: “how is that possible and why wasn't that done before?”, and the answers you can find in current documentation.

Objective

Article describes created .NET wrapper of FFmpeg libraries. It covers basic architecture aspects along with information: how that was done and what issues are solved during implementation. It contains a lot of code samples for different aims of usage of FFmpeg library. Each code is workable and has a description with the output results. It will be interesting for those who want to start learning how to work with the ffmpeg library or for those who plan to use it in production. Even for those who are sure that he knows the ffmpeg library from C++ side, this information also will be interesting. And if you are looking for a solution of how to integrate the whole ffmpeg libraries power directly into your .NET application - then you are on the right way.

How to Read the Documentation

The first part describes wrapper library architecture, main features, basic design of the classes, what kind of issues appear during implementation and how they were solved. Here is also introduced the base classes of the library with description of their fields and purposes. It contains code snippet and samples either in C++/CLI or C#. In the second part, you can find core wrapper classes of the most FFmpeg libraries structures. Each class documentation contains basic description and one or few sample code of usage for this class. In this part code samples in C#. Part three introduces some advanced topics of the library. In additional documentation include description of the example C# projects of the .NET wrapper of native C++ FFmpeg examples. There are also a few of my own sample projects, which, I think, will be interesting to look for FFmpeg users. Last part contains a description about how to build the project, license information and distribution notes.

Supported FFmpeg Versions

Current implementation supports FFmpeg libraries from 4.2.2 and higher. By saying higher I mean that it controls some aspects of newer versions with some definitions and/or dynamic library linking. For example native FFmpeg libraries have definitions of what APIs are exposed and what fields each structure contains and that is properly handled in the current structure.

Architecture

By saying that implementation of the current .NET wrapper isn’t simple or not easy it means say nothing. The result of the ability of this library to exist and work properly is in its architecture. Project is written on C++/CLI and compiled as a standalone .NET DLL which can be added as reference into .NET applications written in any language. Answers why decisions were made in favor of C++/CLI you can find in this documentation.
Library contains couple h/cpp files which are mostly represent underlying imported FFmpeg dll library:

AVCore.h/cpp	Contains core classes enumerations and structures. Also contains imported classes and enumerations from `AVUtil` library
AVCodec.h/cpp	Contains classes and enumerations for wrapper from `AVCodec` Library
AVFormat.h/cpp	Contains classes and enumerations for wrapper from `AVFormat` Library
AVFilter.h/cpp	Contains classes and enumerations for wrapper from `AVFilter` Library
AVDevice.h/cpp	Contains classes and enumerations for wrapper from `AVDevice` Library
SWResample.h/cpp	Contains classes and enumerations for wrapper from `SWResample` Library
SWScale.h/cpp	Contains classes and enumerations for wrapper from `SWScale` Library
Postproc.h/cpp	Contains classes and enumerations for wrapper from `Postproc` Library

Core Architecture

This topic contains the description of most benefits and a design guide of major parts of library architecture from base classes design to core aspects and resolved issues. This part along with C# also contains code snippets from wrapper library implementation which is in C++/CLI.

Classes

Most classes in the wrapper library implementation rely on structure or enumeration in FFmpeg libraries. Class has the same name as its native underlying structure. For example: the managed class AVPacket in the library represents the AVPacket structure of the libavcodec: it contains all fields of the structure along with exposed related methods. The fields implemented not as regular structure fields, but as managed properties:

MC++

public ref class AVPacket : public AVBase
						  , public ICloneable
{
private:
	Object^			m_pOpaque;
	AVBufferFreeCB^	m_pFreeCB;
internal:
	AVPacket(void * _pointer,AVBase^ _parent);
public:
	/// Allocate an AVPacket and set its fields to default values. 
	AVPacket();
	// Allocate the payload of a packet and initialize its fields with default values.
	AVPacket(int _size);
	/// Initialize a reference-counted packet from allocated data.
	/// Data Must be freed separately
	AVPacket(IntPtr _data,int _size);
	/// Initialize a reference-counted packet from allocated data.
	/// With callback For data Free
	AVPacket(IntPtr _data,int _size,AVBufferFreeCB^ free_cb,Object^ opaque);
	/// Initialize a reference-counted packet with given buffer
	AVPacket(AVBufferRef^ buf);
	// Create a new packet that references the same data as src.
	AVPacket(AVPacket^ _packet);
	~AVPacket();
public:
	property int _StructureSize { virtual int get() override; }
public:
    ///A reference to the reference-counted buffer where the packet data is
    ///stored.
    ///May be NULL, then the packet data is not reference-counted.
	property AVBufferRef^ buf { AVBufferRef^ get(); }

	///Presentation timestamp in AVStream->time_base units; the time at which
	///the decompressed packet will be presented to the user.
	///Can be AV_NOPTS_VALUE if it is not stored in the file.
	///pts MUST be larger or equal to dts as presentation cannot happen before
	///decompression, unless one wants to view hex dumps. Some formats misuse
	///the terms dts and pts/cts to mean something different. Such timestamps
	///must be converted to true pts/dts before they are stored in AVPacket.
	property Int64 pts { Int64 get(); void set(Int64); }

	///Decompression timestamp in AVStream->time_base units; the time at which
	///the packet is decompressed.
	///Can be AV_NOPTS_VALUE if it is not stored in the file.
	property Int64 dts { Int64 get(); void set(Int64); }
	
	...

Properties

The class which relies on native structure is inherited from the AVBase class. The classes represent the pointer to underlying structure. Exposing structure fields as properties gives the ability to hide whole internals and this allows to support different versions of ffmpeg by encapsulation implementations. For example: in certain ffmpeg versions some fields can be hidden in structure and accessed via Options APIs, like the “b_sensitivity” field of AVCodecContext it may be removed during build by having FF_API_PRIVATE_OPT definition set:

C++

#if FF_API_PRIVATE_OPT
    /** @deprecated use encoder private options instead */
    attribute_deprecated
    int b_sensitivity;
#endif

In wrapper library, you only see the property, but internally it chooses the way to access it:

MC++

int FFmpeg::AVCodecContext::b_sensitivity::get()
{
	__int64 val = 0;
	if (AVERROR_OPTION_NOT_FOUND == av_opt_get_int(m_pPointer, "b_sensitivity", 0, &val))
	{
#if FF_API_PRIVATE_OPT
		val = ((::AVCodecContext*)m_pPointer)->b_sensitivity;
#endif
	}
	return (int)val;
}

void FFmpeg::AVCodecContext::b_sensitivity::set(int value)
{
	if (AVERROR_OPTION_NOT_FOUND == 
        av_opt_set_int(m_pPointer, "b_sensitivity", (int64_t)value, 0))
	{
#if FF_API_PRIVATE_OPT
		((::AVCodecContext*)m_pPointer)->b_sensitivity = (int)value;
#endif
	}
}

In same time in C# code:

Or, in some versions of FFmpeg, fields may be not present at all. For example “refcounted_frames” of AVCodecContext:

MC++

int FFmpeg::AVCodecContext::refcounted_frames::get()
{
	__int64 val = 0;
	if (AVERROR_OPTION_NOT_FOUND == 
        av_opt_get_int(m_pPointer, "refcounted_frames", 0, &val))
	{
#if (LIBAVCODEC_VERSION_MAJOR < 59)
		val = ((::AVCodecContext*)m_pPointer)->refcounted_frames;
#endif
	}
	return (int)val;
}

void FFmpeg::AVCodecContext::refcounted_frames::set(int value)
{
	if (AVERROR_OPTION_NOT_FOUND == 
        av_opt_set_int(m_pPointer, "refcounted_frames", (int64_t)value, 0))
	{
#if (LIBAVCODEC_VERSION_MAJOR < 59)
		((::AVCodecContext*)m_pPointer)->refcounted_frames = (int)value;
#endif
	}
}

I’m sure you already think that it is very cool and no need to worry about the background. Such implementation is known as Facade pattern.

Constructors

If you know FFmpeg libraries APIs, you can agree that it contains APIs for allocating structures. There can be different APIs for every structure for such purposes. In wrapper classes, all objects have a constructor - so users no need to think about what API to use; and even more: the constructor can encapsulate a list of FFmpeg API calls to perform full object initialization. Different FFmpeg API as well as different object initialization are done as separate constructor with different arguments. As an example constructor of AVPacket object:

MC++

public:
	/// Allocate an AVPacket and set its fields to default values. 
	AVPacket();
	// Allocate the payload of a packet and initialize its fields with default values.
	AVPacket(int _size);
	/// Initialize a reference-counted packet from allocated data.
	/// Data Must be freed separately
	AVPacket(IntPtr _data,int _size);
	/// Initialize a reference-counted packet from allocated data.
	/// With callback For data Free
	AVPacket(IntPtr _data,int _size,AVBufferFreeCB^ free_cb,Object^ opaque);
	/// Initialize a reference-counted packet with given buffer
	AVPacket(AVBufferRef^ buf);
	// Create a new packet that references the same data as src.
	AVPacket(AVPacket^ _packet);

There is a special case of an internal constructor which is not exposed outside of the library. Most objects contain it. This is done for internal object management which I describe later in this document:

MC++

internal:
	AVPacket(void * _pointer,AVBase^ _parent);

Destructors

Also for users no need to worry about releasing memory and free object data. This is all done in object destructors and internally in finalizers. In C#, that is implemented automatically as an IDisposable interface of an object. So, once an object is not needed it is good to call Dispose method - to clear all dependencies and free allocated object memory. Users also have no need to think which FFmpeg API to call to free data and deallocate object resources as It is all handled by the wrapper library. In case the user forgot to call Dispose method: object and its memory will be freed by finalizer, but it is good practice to call Dispose directly in code.

// a wrapper around a single output AVStream
public class OutputStream
{
    public AVStream st;
    public AVCodecContext enc;

    /* pts of the next frame that will be generated */
    public long next_pts;
    public int samples_count;

    public AVFrame frame;
    public AVFrame tmp_frame;

    public float t, tincr, tincr2;

    public SwsContext sws_ctx;
    public SwrContext swr_ctx;
}

static void close_stream(AVFormatContext oc, OutputStream ost)
{
    if (ost.enc != null) ost.enc.Dispose();
    if (ost.frame != null) ost.frame.Dispose();
    if (ost.tmp_frame != null) ost.tmp_frame.Dispose();
    if (ost.sws_ctx != null) ost.sws_ctx.Dispose();
    if (ost.swr_ctx != null) ost.swr_ctx.Dispose();
}

Also, it is necessary to keep in mind that such objects as AVPacket or AVFrame can contain buffers which are allocated internally by ffmpeg API and which should be freed separately see av_packet_unref and av_frame_unref APIs. For such purposes those objects expose methods named Free, so, don’t forget to call it to avoid memory leaks.

Enumerations

Some FFmpeg library definitions or enumerations are done as .NET enum classes - this way it is easy to find what value can be set in certain fields of the structures. For example AV_CODEC_FLAG_*, AV_CODEC_FLAG2_* or AV_CODEC_CAP_* definitions:

C++

#define AV_CODEC_CAP_DR1                 (1 <<  1)
#define AV_CODEC_CAP_TRUNCATED           (1 <<  3)

Designed as .NET enum for better access from code:

MC++

// AV_CODEC_CAP_*
[Flags]
public enum class AVCodecCap : UInt32
{
	///< Decoder can use draw_horiz_band callback.
	DRAW_HORIZ_BAND = (1 <<  0), 
	///<summary>
	/// Codec uses get_buffer() for allocating buffers and supports custom allocators.
	/// If not set, it might not use get_buffer() at all or use operations that
	/// assume the buffer was allocated by avcodec_default_get_buffer.
	///</summary>
	DR1 = (1 <<  1),
	TRUNCATED = (1 <<  3),
	...

In C#, this looks like:

Some enumerations or definitions from FFmpeg are done as regular .NET classes. In that case, class also exposes some useful methods which rely on specified enumerator type:

MC++

/// AV_SAMPLE_FMT_*
public value class AVSampleFormat
{
public:
	static const AVSampleFormat NONE	= ::AV_SAMPLE_FMT_NONE;
	/// unsigned 8 bits
	static const AVSampleFormat U8 		= ::AV_SAMPLE_FMT_U8;
	/// signed 16 bits
	static const AVSampleFormat S16		= ::AV_SAMPLE_FMT_S16;
	/// signed 32 bits
	static const AVSampleFormat S32		= ::AV_SAMPLE_FMT_S32;
	/// float
	static const AVSampleFormat FLT		= ::AV_SAMPLE_FMT_FLT;
	/// double
	static const AVSampleFormat DBL		= ::AV_SAMPLE_FMT_DBL;
	/// unsigned 8 bits, planar
	static const AVSampleFormat U8P		= ::AV_SAMPLE_FMT_U8P;
	/// signed 16 bits, planar
	static const AVSampleFormat S16P	= ::AV_SAMPLE_FMT_S16P;
	/// signed 32 bits, planar
	static const AVSampleFormat S32P	= ::AV_SAMPLE_FMT_S32P;
	/// float, planar
	static const AVSampleFormat FLTP	= ::AV_SAMPLE_FMT_FLTP;
	/// double, planar
	static const AVSampleFormat DBLP	= ::AV_SAMPLE_FMT_DBLP;
protected:
	/// Sample Format
	int			m_nValue;
public:
	AVSampleFormat(int value);
	explicit AVSampleFormat(unsigned int value);
protected:
	//property int value { int get() { return m_nValue; } }
public:
	/// Return the name of sample_fmt, or NULL if sample_fmt is not
	/// recognized.
	property String^ name { String^ get() { return ToString(); } }
	/// Return number of bytes per sample.
	///
	/// @param sample_fmt the sample format
	/// @return number of bytes per sample or zero if unknown for the given
	/// sample format
	property int bytes_per_sample { int get(); }

	/// Check if the sample format is planar.
	///
	/// @param sample_fmt the sample format to inspect
	/// @return 1 if the sample format is planar, 0 if it is interleaved
	property bool is_planar { bool get(); }

Such classes expose all required things to work with such classes as value types along with implicit types conversions:

MC++

public:
	static operator int(AVSampleFormat a) { return a.m_nValue; }
	static explicit operator unsigned int(AVSampleFormat a) 
           { return (unsigned int)a.m_nValue; }
	static operator AVSampleFormat(int a) { return AVSampleFormat(a); }
	static explicit operator AVSampleFormat(unsigned int a) 
           { return AVSampleFormat((int)a); }
internal:
	static operator ::AVSampleFormat(AVSampleFormat a) 
           { return (::AVSampleFormat)a.m_nValue; }
	static operator AVSampleFormat(::AVSampleFormat a) 
           { return AVSampleFormat((int)a); }

Libraries

Library contains special static classes which can identify used libraries - their version, build configuration and license:

MC++

// LibAVCodec
public ref class LibAVCodec
{
private:
	static bool s_bRegistered = false;
private:
	LibAVCodec();
public:
	// Return the LIBAVCODEC_VERSION_INT constant.
	static property UInt32 Version { UInt32 get(); }
	// Return the libavcodec build-time configuration.
	static property String^ Configuration { String^ get(); }
	// Return the libavcodec license.
	static property String^ License { String^ get(); }
public:
	// Register the codec codec and initialize libavcodec.
	static void Register(AVCodec^ _codec);
	// Register all the codecs, parsers and bitstream filters which were enabled at
	// configuration time. If you do not call this function you can select exactly
	// which formats you want to support, by using the individual registration
	// functions.
	static void RegisterAll();
};

Each FFmpeg imported library has a similar class with described fields.

Those library classes can contain static initialization methods which are relay to internal calls of such APIs as: avcodec_register_all, av_register_all and others. Those methods may be called manually by the user, but the library manages to call them internally if needed. The API which is deprecated and may not be present as exported from FFmpeg library - internally designed dynamically linking. How that is done is described later in this document.

ToString() implementation

Some classes or enumerations have overridden the string conversion method: “ToString()”. In such a case, it returns the string value which is output by the related FFmpeg API to get a readable description of the type or structure content:

MC++

 String^ FFmpeg::AVSampleFormat::ToString()
{
	auto p = av_get_sample_fmt_name((::AVSampleFormat)m_nValue);
	return p != nullptr ? gcnew String(p) : nullptr;
}

In example, it calls that method to get default string representation of the sample format value:

var fmt = AVSampleFormat.FLT;
Console.WriteLine("AVSampleFormat: {0} {1}",(int)fmt,fmt);

AVRESULT

As you may know, most of FFmpeg API returns integer value, which may mean error or number processed data or have any other meaning. For the good understanding of the returned value wrapper library have the class with the name AVRESULT. It is the same as the AVERROR macro in FFmpeg, just designed as a class object with exposed useful fields and with .NET pluses. That class can be used as integer value without direct type casting as it is done as value type in its basis. By using the class, it is possible to get an error description string from the returned API value as the class has overridden the ToString() method so it can be easily used in string conversion:

/* Write the stream header, if any. */
ret = oc.WriteHeader();
if (ret < 0) {
    Console.WriteLine("Error occurred when opening output file: {0}\n",(AVRESULT)ret);
    Environment.Exit(1);
}

Given error description string is good for understanding what result has in execution of the method:

Console.WriteLine(AVRESULT.ENOMEM.ToString());

If AVRESULT, type used in code then possible to see result string under debugger:

In case we have integer result type and we know that it is an error, then just cast to AVRESULT type to see error description in variable values view while debugging:

This class also can be accessed with some predefined error values which can be used in code:

static AVRESULT CheckPointer(IntPtr p)
{
    if (p == IntPtr.Zero) return AVRESULT.ENOMEM;
    return AVRESULT.OK;
}

Some of the methods already designed to return AVRESULT objects, other types can be easily casted to it.

MC++

FFmpeg::AVRESULT FFmpeg::AVCodecParameters::FromContext(AVCodecContext^ codec)
{
	return avcodec_parameters_from_context(
		((::AVCodecParameters*)m_pPointer),(::AVCodecContext*)codec->_Pointer.ToPointer());
}

FFmpeg::AVRESULT FFmpeg::AVCodecParameters::CopyTo(AVCodecContext^ context)
{
	return avcodec_parameters_to_context(
		(::AVCodecContext*)context->_Pointer.ToPointer(), 
        (::AVCodecParameters*)m_pPointer);
}

FFmpeg::AVRESULT FFmpeg::AVCodecParameters::CopyTo(AVCodecParameters^ dst)
{
	return avcodec_parameters_copy(
		(::AVCodecParameters*)dst->_Pointer.ToPointer(),
        ((::AVCodecParameters*)m_pPointer));
}

AVLog

The way to inform applications from your component or from the internal of FFmpeg library uses the av_log API. Access to the same functionality of the logging is done with a static AVLog class.

if ((ret = AVFormatContext.OpenInput(out ifmt_ctx, in_filename)) < 0) {
    AVLog.log(AVLogLevel.Error, string.Format("Could not open input file {0}", in_filename));
    goto end;
}

It also has the ability to set up your own callback for the receiving log messages:

static void LogCallback(AVBase avcl, int level, string fmt, IntPtr vl)
{
    Console.WriteLine(fmt);
}

static void Main(string[] args)
{
    AVLog.Callback = LogCallback;
    ...

AVOptions

Most of native FFmpeg structure objects have properties which are hidden and can be accessed only with av_opt_* API’s. Current implementations of the wrapper library also have such abilities. For that purpose, the library contains class AVOptions. Initially, it was designed as a static object but later implemented as a regular class. It can be initialized with a constructor with AVBase or just with a pointer type argument.

/* Set the filter options through the AVOptions API. */
AVOptions options = new AVOptions(abuffer_ctx);
options.set("channel_layout", INPUT_CHANNEL_LAYOUT.ToString(), AVOptSearch.CHILDREN);
options.set("sample_fmt", INPUT_FORMAT.name, AVOptSearch.CHILDREN);
options.set_q("time_base", new AVRational( 1, INPUT_SAMPLERATE ),  AVOptSearch.CHILDREN);
options.set_int("sample_rate", INPUT_SAMPLERATE, AVOptSearch.CHILDREN);

Class exposes all useful methods for manipulation with options API. Along with it, there is an ability to enumerate options:

AVOptions options = new AVOptions(ifmt_ctx);
foreach (AVOption o in options)
{
    Console.WriteLine(o.name);
}

AVDictionary

Another most used structure in FFmpeg API is the AVDictionary. It is a Key-Value collection class which can be an input or an output of certain functions. In such functions, you pass some initialization options as a dictionary object and in return getting all available options or options which are not used by the function. The management of that structure pointer in such function methods is done in place there is such an API called. So, users just need to free the returned dictionary object in a regular way as it described earlier, and internally the structure pointer just replaced. For example, we have an FFmpeg API:

C++

int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);

And related implementation in the wrapper library:

MC++

AVRESULT Open(AVCodec^ codec, AVDictionary^ options);
AVRESULT Open(AVCodec^ codec);

So, according to FFmpeg API implementation details, structure designed to set options, and on returns, there is a new dictionary object with options which are not found. But on implementation, only the structure object is used which was passed as an argument. So this is working in a regular way and users have only one structure at all. Even if the call of the internal API failed, the user needed to free only the target structure object, which was created initially:

AVDictionary dct = new AVDictionary();
dct.SetValue("preset", "veryfast");
s.id = (int)_output.nb_streams - 1;
bFailed = !(s.codec.Open(_codec, dct));
dct.Dispose();

Extending Functionality

As mentioned, structure classes can contain fields of the related structure and methods which generated from FFmpeg API and also related to functionality of that structure. But classes can contain helper methods which allows extending functionality of FFmpeg. For example, class AVFrame has static methods for conversion from/to .NET Bitmap object into AVFrame structure:

MC++

public:
	System::Drawing::Bitmap^ ToBitmap();
public:
	static String^ GetColorspaceName(AVColorSpace val);
public:
	// Create Frame From Image
	static AVFrame^ FromImage(System::Drawing::Bitmap^ _image,AVPixelFormat _format);
	static AVFrame^ FromImage(System::Drawing::Bitmap^ _image);
	static AVFrame^ FromImage(System::Drawing::Bitmap^ _image, 
                    AVPixelFormat _format,bool bDeleteBitmap);
	static AVFrame^ FromImage(System::Drawing::Bitmap^ _image, bool bDeleteBitmap);
	// Convert Frame To Different Colorspace
	static AVFrame^ ConvertVideoFrame(AVFrame^ _source,AVPixelFormat _format);
	// Convert Frame To Bitmap
	static System::Drawing::Bitmap^ ToBitmap(AVFrame^ _frame);

Those methods encapsulate whole functionality for colorspace conversion and any other list of calls. More of that, there is an ability to wrap around existing Bitmap object without any copy or allocating new image data. In such a case, an internal helper object is created, which is freed with the parent AVFrame class. How that mechanism is implemented is described later in the document.

Class AVBase

All FFmpeg structure classes are inherited from the AVBase class. It contains common fields and helper methods. AVBase class is the root class of each FFmpeg object. Main tasks of the AVBase object are to handle object management and memory management.

Fields

As mentioned, AVBase class is the base for every exposed structure of FFmpeg library and it contains fields which are common to all objects. Those fields are: structure pointer and size of structure. Size of structure may be not initialized unless the structure is not allocated directly, so user does not need to rely on that field. Pointers can be casted to raw structure type, for access to any additional fields or raw data directly:

AVBase avbase = new AVPacket();
Console.WriteLine("0x{0:x}",avbase._Pointer);

In case if structure can be allocated with a common allocation method, it may contain structure size non zero value and in case if the structure allocated the related property is set to true:

AVBase avbase = new AVRational(1,1);
Console.WriteLine("0x{0:x} StructureSize: {1} Allocated {2}",
    avbase._Pointer,
    avbase._StructureSize,
    avbase._IsAllocated
    );

If a class object is returned from any method, to be sure if such object is valid, we can check the pointer field and compare it to non-zero value. But there is also _IsValid property in the AVBase class which performs the same:

/* allocate and init a re-usable frame */
ost.frame = alloc_picture(c.pix_fmt, c.width, c.height);
if (!ost.frame._IsValid) {
    Console.WriteLine("Could not allocate video frame\n");
    Environment.Exit(1);
}

Decided to add the “_” prefix into such internal and not related to FFmpeg structure properties, so the user knows that it is not a structure field.

Methods

One method available for testing objects and its fields is the TraceValues(). This method uses .NET Reflection and enumerates available properties of the object with the values. If your code is critical to use of .NET Reflection, then that code should be excluded from assembly. Although TraceValues code worked only in Debug build configuration. Example of execution of that method is:

var r = new AVRational(25, 1);
r.TraceValues();
r.Dispose();

There are also a couple protected methods of the class which can help to build inherited classes with your own needs. I just point few of them which may be interested:

MC++

// Validate object not disposed
void ValidateObject();

That method checks whether an object is disposed or not, and if disposed then it throws an ObjectDisposedException exception.

MC++

// Check that object is available to be accessed
// throw exception if not
bool _EnsurePointer();
bool _EnsurePointer(bool bThrow);

Those two methods check if the structure pointer is zero or not, including the case if the object is disposed. This method is better for internal use than the _IsValid property described earlier.

MC++

// Object Memory Allocation
void AllocPointer(int _size);

This is the helper method of allocation structure pointer. Usually, an argument for this method is used in the _StructureSize field, but this is not a requirement. What method performs internal memory allocation for the structure pointer field: _Pointer, setting up the destructor and set allocation flag of an object: _IsAllocated.

if (!base._EnsurePointer(false)) 
{
    base.AllocPointer(_StructureSize);
}

Other protected methods of the class are described in objects management or memory management topics.

Destructor

As already described, that structure pointer in AVBase class is a pointer to a real FFmpeg structure. But any structure can be allocated and freed with different FFmpeg APIs. More of that after creation and call any initialization methods of any structure there is possible to call additional methods for destruction or specific uninitialization before performing actual object free. For example, AVCodecContext allocation and free APIs.

C++

AVCodecContext *avcodec_alloc_context3(const AVCodec *codec);

/**
 * Free the codec context and everything associated with it and write NULL to
 * the provided pointer.
 */
void avcodec_free_context(AVCodecContext **avctx);

And at the same time, AVPacket object, along with allocation and free related APIs, have the underlying buffer dereferencing API, which should be called before freeing the structure.

C++

void av_packet_unref(AVPacket *pkt);

Depending on that, AVBase class has two pointers to APIs which perform structure destruction and free actual structure object memory:

MC++

typedef void TFreeFN(void *);
typedef void TFreeFNP(void **);
// Function to free object
TFreeFN * m_pFree;
// Object Destructor Function (may be set along with any free function)
TFreeFN * m_pDescructor;
// Free Pointer Function
TFreeFNP * m_pFreep;

There are two types of free API pointers, as different FFmpeg structures can have different API arguments to free underlying objects, and properly destroy it. If one API is set, then the other one is not used - those two pointers are mutual exclusion. Destructor API if set then called before any API to free structure. Example of how those pointers are initialized internally for AVPacket class:

MC++

FFmpeg::AVPacket::AVPacket()
	: AVBase(nullptr,nullptr)
	, m_pOpaque(nullptr)
	, m_pFreeCB(nullptr)
{
	m_pPointer = av_packet_alloc();
	m_pFreep = (TFreeFNP*)av_packet_free;
	m_pDescructor = (TFreeFN*)av_packet_unref;
	
	av_init_packet((::AVPacket*)m_pPointer);
	((::AVPacket*)m_pPointer)->data = nullptr;
	((::AVPacket*)m_pPointer)->size = 0;
}

Those pointers can be changed internally in code depending on which object methods are called. So, in some cases, for an object to be set to a different free function pointer or different destructor depending on the object state. This allows the object to properly destroy allocated data and clean up used memory.

Objects Management

Each wrapper library object can belong to another object as well as the underlying structure field containing a pointer to another FFmpeg structure. As an example: structure AVCodecContext contains codec field which points to AVCodec structure; av_class field - AVClass structure and others; AVFormatContext have similar fields: iformat, oformat, pb and others. More of that, the structure fields may not be read only and can be changed by the user, so in the code, we need to handle that. In related properties, it designed in the next way:

MC++

public ref class AVFormatContext : public AVBase
{
public:
	/// Get the AVClass for AVFormatContext. It can be used in combination with
	/// AV_OPT_SEARCH_FAKE_OBJ for examining options.
	static property AVClass^ Class { AVClass^ get(); }
protected:
	// Callback
	AVIOInterruptDesc^	m_pInterruptCB;
internal:
	AVFormatContext(void * _pointer,AVBase^ _parent);
public:
	AVFormatContext();
public:
	/// A class for logging and @ref avoptions. Set by avformat_alloc_context().
	/// Exports (de)muxer private options if they exist.
	property AVClass^ av_class { AVClass^ get(); }
	/// The input container format.
	/// Demuxing only, set by avformat_open_input().
	property AVInputFormat^ iformat { AVInputFormat^ get(); void set(AVInputFormat^); }
	/// The output container format.
	/// Muxing only, must be set by the caller before avformat_write_header().
	property AVOutputFormat^ oformat { AVOutputFormat^ get(); void set(AVOutputFormat^); }

Internally, the child object of such structure is created with no referencing allocated data - as initially, the object is part of the parent. So disposing of that object makes no sense. But, if such an object is set by the user, which means that it was allocated as an independent object initially, then the parent holds reference to the child until it will be replaced via property or the main object will be released. In that case, the call of child object disposing does not free the actual object - it just decrements the internal reference counter. In that manner: one object can be created and can be set via property to different parents, and in all cases, that single object will be used without coping and it will be disposed automatically, once all parents will be freed. In other words, disposing of the child object will be performed then it does not belong to any other parents. So, an object can be created manually, an object can be created from a pointer as part of the parent object, or an object can be set as a pointer property of another object, and from all of that object can be properly released. This is handled by having two reference counters. One used for in-object access another from outer-object calls and once all references become zero - object freed.

Most child objects are created once the program accessed the property for the first time. Then such an object is put into the children collection, and if the program makes a new call of the same property, then a previously created object from that collection will be used, so no new instance is created. There are helper templates in the AVBase class for child instance creation:

MC++

internal:
	// Create or Access Child objects
	template <class T>
	static T^ _CreateChildObject(const void * p, AVBase^ _parent) 
              { return _CreateChildObject((void*)p,_parent); }
	template <class T>
	static T^ _CreateChildObject(void * p,AVBase^ _parent) {  
		if (p == nullptr) return nullptr; 
		T^ o = (_parent != nullptr ? (T^)_parent->GetObject((IntPtr)p) : nullptr); 
		if (o == nullptr) o = gcnew T(p,_parent); return o; 
	}
	template <class T>
	T^ _CreateObject(const void * p) { return _CreateChildObject<T>(p,this); }
	template <class T>
	T^ _CreateObject(void * p) { return _CreateChildObject<T>(p,this); }

Children collection is freed once the main object is released. Class contains helper methods for accessing children objects in a collection:

MC++

protected:
	// Accessing children 
	AVBase^ GetObject(IntPtr p);
	bool AddObject(IntPtr p,AVBase^ _object);
	bool AddObject(AVBase^ _object);
	void RemoveObject(IntPtr p);

Most interesting method here is the AddObject which allows associating a child AVBase object with a specified pointer. If that pointer is already associated with another object, then it will be replaced, and the previous object disposed.

Memory Management

Each AVBase object can contain allocated memory for certain properties, internal data or other needs. That memory is stored in the named collection which is related to that object. If a memory pointer is recreated, then it is replaced in the collection. Once an object is freed - then all allocated memory is also free.

MC++

Generic::SortedList<String^,IntPtr>^ m_ObjectMemory;

There are couple methods for memory manipulation in a class:

MC++

IntPtr GetMemory(String^ _key);
void SetMemory(String^ _key,IntPtr _pointer);
IntPtr AllocMemory(String^ _key,int _size);
IntPtr AllocString(String^ _key,String^ _value);
IntPtr AllocString(String^ _key,String^ _value,bool bUnicode);
void FreeMemory(String^ _key);
bool IsAllocated(String^ _key);

Those methods help allocate and free memory and check whether it is allocated or not. Naming access done for controls allocation for specified properties.

MC++

void FFmpeg::AVCodecContext::subtitle_header::set(array<byte>^ value)
{
	if (value != nullptr && value->Length > 0)
	{
		((::AVCodecContext*)m_pPointer)->subtitle_header_size = value->Length;
		((::AVCodecContext*)m_pPointer)->subtitle_header = 
			(uint8_t *)AllocMemory("subtitle_header",value->Length).ToPointer();
		Marshal::Copy(value,0,
			(IntPtr)((::AVCodecContext*)m_pPointer)->subtitle_header,value->Length);
	}
	else
	{
		FreeMemory("subtitle_header");
		((::AVCodecContext*)m_pPointer)->subtitle_header_size = 0;
		((::AVCodecContext*)m_pPointer)->subtitle_header = nullptr;
	}
}
</byte>

Allocation of the memory is done by using av_maloc API of the FFmpeg library. There is also a static collection of allocated memory. That collection is used to call the static methods for memory allocation provided by the AVBase class.

MC++

IntPtr AllocMemory(String^ _key,int _size);
IntPtr AllocString(String^ _key,String^ _value);
IntPtr AllocString(String^ _key,String^ _value,bool bUnicode);
void FreeMemory(String^ _key);
bool IsAllocated(String^ _key);

Static memory collection located in a special AVMemory class. As that class is the base for AVMemPtr - which represents the special object of memory allocated pointer, and for AVBase - as that class is base for all imported structures and, as were mentioned, there may require memory allocation. Static memory is freed automatically once all objects which may use it are disposed.

Arrays

In FFmpeg structures, there are a lot of fields which represent arrays of data with different types. Those arrays can be fixed size, arrays ending with specified data value, arrays of arrays and arrays of other structures. Each type can be designed personally, but there is also some common implementation. For example, array of streams in format context implemented as its own class with enumerator and indexed property.

MC++

ref class AVStreams
	: public System::Collections::IEnumerable
	, public System::Collections::Generic::IEnumerable<AVStream^>
{
private:
	ref class AVStreamsEnumerator
		: public System::Collections::IEnumerator
		, public System::Collections::Generic::IEnumerator<AVStream^>
	{
	protected:
		AVStreams^	m_pParent;
		int			m_nIndex;
	public:
		AVStreamsEnumerator(AVStreams^ streams);
		~AVStreamsEnumerator();
	public:
		// IEnumerator
		virtual bool MoveNext();
		virtual property AVStream^ Current { AVStream^ get (); }
		virtual void Reset();
		virtual property Object^ CurrentObject 
        { virtual Object^ get() sealed = IEnumerator::Current::get; }
	};
protected:
	AVFormatContext^ m_pParent;
internal:
	AVStreams(AVFormatContext^ _parent) : m_pParent(_parent) {}
public:
	property AVStream^ default[int] { AVStream^ get(int index) 
                       { return m_pParent->GetStream(index); } }
	property int Count { int get() { return m_pParent->nb_streams; } }
public:
	// IEnumerable
	virtual System::Collections::IEnumerator^ GetEnumerator() sealed = 
    System::Collections::IEnumerable::GetEnumerator 
    { return gcnew AVStreamsEnumerator(this); }
public:
	// IEnumerable<AVStream^>
	virtual System::Collections::Generic::IEnumerator<AVStream^>^ 
    GetEnumeratorGeneric() sealed = 
    System::Collections::Generic::IEnumerable<AVStream^>::GetEnumerator 
	{
		return gcnew AVStreamsEnumerator(this);
	}
};

In that case, we have an array with pointers to other structure classes - AVStream, and each indexed property call - performing creation of the child object of the parent AVFormatContext object, as described in objects management topic.

MC++

FFmpeg::AVStream^ FFmpeg::AVFormatContext::GetStream(int idx)
{
	if (((::AVFormatContext*)m_pPointer)->nb_streams <= 
         (unsigned int)idx || idx < 0) return nullptr;
	auto p = ((::AVFormatContext*)m_pPointer)->streams[idx];
	return _CreateObject<AVStream>((void*)p);
}

In .NET, that looks just as regular property and array access:

var _input = AVFormatContext.OpenInputFile(@"test.avi");
if (_input.FindStreamInfo() == 0)
{
    for (int i = 0; i < _input.streams.Count; i++)
    {
        Console.WriteLine("Stream: {0} {1}",i,_input.streams[i].codecpar.codec_type);
    }
}

There are also possible cases where you need to set an array of the AVBase objects as property - in that situation, each object is put into the parent objects collection, so disposing of that object can be done safely. Along with the main structure array, the property which is related to the counter of objects in that array can be modified internally. Also, the previously allocated objects and array memory itself are freed. The next code displays how that is done:

MC++

void FFmpeg::AVFormatContext::chapters::set(array<avchapter^>^ value)
{
	{
		int nCount = ((::AVFormatContext*)m_pPointer)->nb_chapters;
		::AVChapter ** p = (::AVChapter **)((::AVFormatContext*)m_pPointer)->chapters;
		if (p)
		{
			while (nCount-- > 0)
			{
				RemoveObject((IntPtr)*p++);
			}
		}
		((::AVFormatContext*)m_pPointer)->nb_chapters = 0;
		((::AVFormatContext*)m_pPointer)->chapters = nullptr;
		FreeMemory("chapters");
	}
	if (value != nullptr && value->Length > 0)
	{
		::AVChapter ** p = (::AVChapter **)AllocMemory
        ("chapters",value->Length * (sizeof(::AVChapter*))).ToPointer();
		if (p)
		{
			((::AVFormatContext*)m_pPointer)->chapters = p;
			((::AVFormatContext*)m_pPointer)->nb_chapters = value->Length;
			for (int i = 0; i < value->Length; i++)
			{
				AddObject((IntPtr)*p,value[i]);
				*p++ = (::AVChapter*)value[i]->_Pointer.ToPointer();
			}
		}
	}
}

In most cases with read only arrays, where array values are not required to be changed, the properties returns a simple .NET array:

MC++

array<FFmpeg::AVSampleFormat>^ FFmpeg::AVCodec::sample_fmts::get()
{ 
	List<AVSampleFormat>^ _array = nullptr;
	const ::AVSampleFormat * _pointer = ((::AVCodec*)m_pPointer)->sample_fmts;
	if (_pointer)
	{
		_array =  gcnew List<AVSampleFormat>();
		while (*_pointer != -1)
		{
			_array->Add((AVSampleFormat)*_pointer++);
		}
	}
	return _array != nullptr ? _array->ToArray() : nullptr;
}

The usage of such arrays are done in regular way in .NET:

var codec = AVCodec.FindDecoder(AVCodecID.MP3);
Console.Write("{0} Formats [ ",codec.long_name);
foreach (var fmt in codec.sample_fmts)
{
    Console.Write("{0} ",fmt);
}
Console.WriteLine("]");

For other cases for arrays accessing in the wrapper library implemented base class AVArrayBase. It is inherited from the AVBase class and accesses memory chunks with specified data size. Class also contains a number of elements in that array.

MC++

public ref class AVArrayBase : public AVBase
{
protected:
	bool	m_bValidate;
	int		m_nItemSize;
	int		m_nCount;
protected:
	AVArrayBase(void * _pointer,AVBase^ _parent,int nItemSize,int nCount) 
		: AVBase(_pointer,_parent) , m_nCount(nCount), 
          m_nItemSize(nItemSize), m_bValidate(true) { }
	AVArrayBase(void * _pointer,AVBase^ _parent,
                int nItemSize,int nCount,bool bValidate) 
		: AVBase(_pointer,_parent) , m_nCount(nCount), 
                 m_nItemSize(nItemSize), m_bValidate(bValidate) { }
protected:
	void ValidateIndex(int index) { if (index < 0 || index >= m_nCount) 
                                    throw gcnew ArgumentOutOfRangeException(); }
	void * GetValue(int index)
	{
		if (m_bValidate) ValidateIndex(index);
		return (((LPBYTE)m_pPointer) + m_nItemSize * index);
	}
	void SetValue(int index,void * value)
	{
		if (m_bValidate) ValidateIndex(index);
		memcpy(((LPBYTE)m_pPointer) + m_nItemSize * index,value,m_nItemSize);
	}
public:
	property int Count { int get() { return m_nCount; } }
};

That class is the base and without types specifying. And the main template for the typed arrays is the AVArray class. It is inherited from AVArrayBase and already has access to the indexed property of the array elements. Along with it, the class supports enumerators. This class has special modifications for some types like AVMemPtr and IntPtr to make proper access to array elements. The template classes are used in most cases in the library, as they give direct access to underlying array memory - without any .NET marshaling and copy. For example, the class AVFrame/AVPicture has properties:

MC++

/// pointers to the image data planes
property AVArray<AVMemPtr^>^ data { AVArray<AVMemPtr^>^ get(); }
/// number of bytes per line
property AVArray<int>^ linesize { AVArray<int>^ get(); }

And in the same place, the AVMemPtr class has the ability to directly access the memory with its properties, as different typed arrays which helps modifying or reading data. Next example shows how to access the frame data:

var frame = AVFrame.FromImage((Bitmap)Bitmap.FromFile(@"image.jpg"),
                               AVPixelFormat.YUV420P);
for (int j = 0; j < frame.height; j++)
{
    for (int i = 0; i < frame.linesize[0]; i++)
    {
        Console.Write("{0:X2}",frame.data[0][i + j * frame.linesize[0]]);
    }
    Console.WriteLine();
}
frame.Dispose();

Class AVMemPtr

Initially, in the wrapper library all properties, which operate with pointers, were designed as IntPtr type. But it is not an easy way to work with memory in .NET directly with IntPtr. As in application may require to change picture data, generate image, perform some processing of audio and video. To make that easier, the AVMemPtr memory helper class was involved. However, it contains an implicit casting operator which allows it to be used in the same methods which have the IntPtr type as argument in .NET. The basic usage of this class as regular IntPtr value demonstrated in the next C# example:

[DllImport("msvcrt.dll", EntryPoint = "strcpy")]
public static extern IntPtr strcpy(IntPtr dest, IntPtr src);

static void Main(string[] args)
{
    // Allocate pointer from string 
    var p = Marshal.StringToCoTaskMemAnsi("some text");
    // Create AVMemPtr object and allocate memory buffer of 256 bytes
    AVMemPtr s = new AVMemPtr(256);
    // Copy zero ending string see exter API above
    strcpy(s, p);
    // Free Source memory
    Marshal.FreeCoTaskMem(p);
    // Shows what we have in memory
    Console.WriteLine("AVMemPtr string: \"{0}\"", Marshal.PtrToStringAnsi(s));
    s.Dispose();
}

In this example, we allocate memory from given text. The result of that memory pointer and the data can be shown in the next picture:

After execution of C++ strcpy API .NET wrapper, which performs zero ending string copy, you can see that the text copied into allocated pointer of AVMemPtr object:

And after we display the string value which we have in there:

Memory allocation and free in class done with FFmpeg API av_alloc and av_free. Class has comparison operators with different types. AVMemPtr also can be casted directly from IntPtr structure. Along with it, the class has the ability to access data as a regular bytes array:

var p = Marshal.StringToCoTaskMemAnsi("some text");
AVMemPtr s = p;
int idx = 0; 
Console.Write("AVMemPtr data: ");
while (s[idx] != 0) Console.Write(" {0}",(char)s[idx++]);
Console.Write("\n");
Marshal.FreeCoTaskMem(p);

Result of execution code above:

So we have allocated pointer data as IntPtr type and easily accessed it directly. AVMemPtr also can handle addition and subtraction operators they gave another AVMemPtr object which points to another address with resulted offset:

var p = Marshal.StringToCoTaskMemAnsi("some text");
AVMemPtr s = p;
Console.Write("AVMemPtr data: ");
while ((byte)s != 0) { Console.Write(" {0}", (char)((byte)s)); s += 1; }
Console.Write("\n");
Marshal.FreeCoTaskMem(p);

Each addition or subtraction creates a new instance of AVMemPtr but internally, it is implemented that the base data pointer stays the same even if data is allocated and the main instance is DIsposed. That is done also with counting references of main instance, for example, this code will work correctly:

var p = Marshal.StringToCoTaskMemAnsi("some text");
AVMemPtr s = new AVMemPtr(256);
strcpy(s, p);
Marshal.FreeCoTaskMem(p);
AVMemPtr s1 = s + 3;
s.Dispose();
Console.WriteLine("AVMemPtr string: \"{0}\"",Marshal.PtrToStringAnsi(s1));
s1.Dispose();

In next code, we can see that instances of s, s0 and s1 objects are different but comparison operators determine equals of the data:

var p = Marshal.StringToCoTaskMemAnsi("some text");
AVMemPtr s = p;
AVMemPtr s1 = s + 1;
AVMemPtr s0 = s1 - 1;
Console.WriteLine(" s {0} s1 {1} s0 {2}", s,s1,s0);
Console.WriteLine(" s == s1 {0}, s == s0 {1},s0 == s1 {2}", 
    (s == s1),(s == s0),(s0 == s1));
Console.WriteLine(" {0}, {1}, {2}", 
    object.ReferenceEquals(s,s1), object.ReferenceEquals(s,s0),
    object.ReferenceEquals(s0,s1));
Marshal.FreeCoTaskMem(p);

Class AVMemPtr also contains helper methods and properties which may be useful, one of them to determine if data allocated and size of buffer in bytes and debug helper method to perform dump pointer data to a file or Stream.

The main feature of this class is the ability to represent data as arrays of different types. This is done as different properties of AVArray templates:

MC++

property AVArray<byte>^   bytes				{ AVArray<byte>^ get();   }
property AVArray<short>^  shorts			{ AVArray<short>^ get();  }
property AVArray<int>^	  integers			{ AVArray<int>^ get();    }
property AVArray<float>^  floats			{ AVArray<float>^ get();  }
property AVArray<double>^ doubles			{ AVArray<double>^ get();  }
property AVArray<IntPtrv^ pointers			{ AVArray<IntPtr>^ get(); }
property AVArray<unsigned int>^	uints		{ AVArray<unsigned int>^ get(); }
property AVArray<unsigned short>^ ushorts	{ AVArray<unsigned short>^ get();  }
property AVArray<RGB^>^   rgb				{ AVArray<RGB^>^ get();    }
property AVArray<RGBA^>^  rgba				{ AVArray<RGBA^>^ get();   }
property AVArray<AYUV^>^  ayuv				{ AVArray<AYUV^>^ get();   }
property AVArray<YUY2^>^  yuy2				{ AVArray<YUY2^>^ get();   }
property AVArray<UYVY^>^  uyvy				{ AVArray<UYVY^>^ get();   }

So, it is easy fill data of AVFrame object, for example, of the audio with IEEE float type or signed short:

var samples = frame.data[0].shorts;
for (int j = 0; j < c.frame_size; j++)
{
    samples[2 * j] = (short)(int)(Math.Sin(t) * 10000);

    for (int k = 1; k < c.channels; k++)
        samples[2 * j + k] = samples[2 * j];
    t += tincr;
}

More of it, as you can see, there are some arrays with specified pixel format structures: RGB, RGBA, AYUV, YUY2 and UYVY. Using these properties is a designed way to address pixels with those formats which is described later.

Enumerators

FFmpeg APIs for accessing lists of resources for example enumerate existing input or output formats have special types of APIs - iterators. The new API has iterate with opaque argument and old ones use previous element as reference:

C++

const AVCodec *av_codec_iterate(void **opaque);

And the old one:

C++

AVCodec *av_codec_next(const AVCodec *c);

Initially, in the wrapper library were designed a way to use those API in the same manner as they are present in FFmpeg, but later implementations were done with usage of IEnumerable and IEnumerator interfaces. That gave us the way of performing enumeration with foreach C# operator instead of calling API directly:

foreach (AVCodec codec in AVCodec.Codecs)
{
    if (codec.IsDecoder()) Console.WriteLine(codec);
}

The code above can be done in a couple other modifications. Most of enumerator objects expose Count property and indexing operator:

var codecs = AVCodec.Codecs;
for (int i = 0; i < codecs.Count; i++)
{
    if (codecs[i].IsDecoder()) Console.WriteLine(codecs[i]);
}

It is also possible to use old way with calling API wrapper method:

AVCodec codec = null;
while (null != (codec = AVCodec.Next(codec)))
{
    if (codec.IsDecoder()) Console.WriteLine(codec);
}

There is no need to worry if FFmpeg libraries expose only one API: av_codec_iterate or av_codec_next. In the wrapper library, all FFmpeg APIs which are marked as deprecated are linked dynamically, so it checks runtime for what API to use. How that is implemented is described later in a separate topic.

AVArray classes which are done in the library also expose an enumerator interface:

var p = Marshal.StringToCoTaskMemAnsi("some text");
AVMemPtr s = new AVMemPtr(256);
strcpy(s, p);
Marshal.FreeCoTaskMem(p);
var array = s.bytes;
foreach (byte c in array)
{
    if (c == 0) break;
    Console.Write("{0} ",(char)c);
}
Console.Write("\n");
s.Dispose();

In the example above, we have allocated data length of memory while creating AVMemPtr object. But it is possible that enumeration may not work as arrays can be created dynamically or without size initialization - so the enumerators are not able to determine the total number of elements. I limit that functionality to avoid out of boundary crashes and other exceptions. For example, the next code will not work as the data size of AVMemPtr object is 0 due conversion from IntPtr pointer into AVMemPtr, so you can find the difference with the previous example:

var p = Marshal.StringToCoTaskMemAnsi("some text");
AVMemPtr s = p;
var array = s.bytes;
foreach (byte c in array)
{
    if (c == 0) break;
    Console.Write("{0} ",(char)c);
}
Console.Write("\n");
s.Dispose();

On the screenshot below, it is possible to see that data of AVMemPtr object pointed correctly to the text line, but as size is unknown and the enumeration is not available.

The wrapper library properly handles AVPicture/AVFrame classes data field enumeration. That is done with an internal function which handles detecting object field size. So the next code example works correctly:

AVFrame frame = new AVFrame();
frame.Alloc(AVPixelFormat.BGR24, 320, 240);
frame.MakeWritable();
var rgb = frame.data[0].rgb;
foreach (RGB c in rgb)
{
    c.Color = Color.AliceBlue;
}
var bmp = frame.ToBitmap();
bmp.Save(@"test.bmp");
bmp.Dispose();
frame.Dispose();

Lots of classes in a library support enumerators. Along with enumerating codecs in the examples above, enumerator types are used in enumeration input and output formats, parsers, filters, devices and other library resources.

Class AVColor

As already mentioned, the AVMemPtr class can produce data access as arrays of some colorspace types like RGB24, RGBA, AYUV, YUY2 or UYVY. Those colorspaces are designed as structures with the ability to access color components. Base class of those structures is the AVColor. It contains basic operators, helper properties and methods. For example, it supports assigning regular color value and overrides string representation:

AVFrame frame = new AVFrame();
frame.Alloc(AVPixelFormat.BGRA, 320, 240);
frame.MakeWritable();
var rgb = frame.data[0].rgba;
rgb[0].Color = Color.DarkGoldenrod;
Console.Write("{0} {1}",rgb[0].ToString(),rgb[0].Color);
frame.Dispose();

Along with it, the class contains the ability for color component access and implicit conversion into direct IntPtr type. Also, internally, the class supports colorspace conversions RGB into YUV and YUV to RGB.

var ayuv = frame.data[0].ayuv;
ayuv[0].Color = Color.DarkGoldenrod;
Console.Write("{0} {1}",ayuv[0].ToString(),ayuv[0].Color);

In the code above, the same color as in the previous example set into AYUV colorspace. So internally, it converted from ARGB into AYUV, which gave output:

Conversion results have little difference from original value, but that is acceptable. By default, used the BT.601 conversion matrix. It is possible to change matrix coefficients by calling SetColorspaceMatrices static method of AVColor class. Additionally class contains static methods for colorspace components conversion:

Color c = Color.DarkGoldenrod;
int y = 0, u = 0, v = 0;
int r = 0, g = 0, b = 0;
AVColor.RGBToYUV(c.R,c.G,c.B, ref y, ref u, ref v);
AVColor.YUVToRGB(y,u,v, ref r, ref g, ref b);
Console.Write("src: [R: {0} G: {1} B: {2}] dst: [R: {3} G: {4} B: {5}]",c.R,c.G,c.B,r,g,b);

It is necessary to keep in mind that YUY2 and UYVY control 2 pixels. So setting the Color property affects both those pixels. To control each pixel, you should use “Y0” and “Y1” class properties to change luma values.

Core Classes

The wrapper library contains lots of classes which, as were mentioned, represents each one of FFmpeg structures with exposing data fields as properties along with related usage methods. This topic describes most of those classes but not all of them. Description also contains basic usage examples of those classes, without describing class fields, as they are the same as in original FFmpeg libraries. All code samples are workable but in code parts skipped calling the Dispose() method of some global object destruction. Also, that is required to check error results of the methods calls, and you should do it in your code but in the samples, that is also skipped to decrease code size.

AVPacket

Class is the wrapper around the AVPacket structure of the FFmpeg libavcodec library. It has different constructors with the ability to pass its own buffer in there, and with a callback to detect when the buffer will be freed. After every successive return from any decoding or encoding operation, it is required to call the Free method to un-reference the underlying buffer. Class contains static methods to create AVPacket objects from specified .NET arrays with different types. Dump method allows you to save packet data into a file or stream.

// Create frame object from the image
var f = AVFrame.FromImage((Bitmap)Bitmap.FromFile(@"image.jpg"),AVPixelFormat.YUV420P);
// Create encoder context
var c = new AVCodecContext(AVCodec.FindEncoder(AVCodecID.MPEG2VIDEO));
// Setting up encoder parameters
c.bit_rate = 400000;
c.width = f.width;
c.height = f.height;
c.time_base = new AVRational(1, 25);
c.framerate = new AVRational(25, 1);
c.pix_fmt = f.format;
// Open context
c.Open(null);
// Create packet object
AVPacket pkt = new AVPacket();
bool got_packet = false;
while (!got_packet)
{
    // Encode frame until we got packet
    c.EncodeVideo(pkt,f,ref got_packet);
}
// Save packet int a file
pkt.Dump(@"pkt.bin");
// unref packet buffer
pkt.Free();
pkt.Dispose();

AVPicture

Class which is wrapped around related structure in the old version of the FFmpeg libavcodec library. In wrapper library, it is supported for all versions of FFmpeg in current implementation. Class contains methods for the av_image_* API’s of FFmpeg library and pointers to data and plane sizes. The AVFrame inherits from this class so it is recommended to use AVFrame directly instead.

AVFrame

Class for the AVFrame structure of the FFmpeg libavutil library. Class used as input for encoding audio or video data, or as output from the decoding process. It can handle audio and video data. After every successive return from any decoding or encoding operation it is required to call the Free method to un-reference the underlying buffer.

// Open Input File Context
var _input = AVFormatContext.OpenInputFile(@"test.mp4");
// Check for the streams
if (_input.FindStreamInfo() == 0) {
    // Get Video Stream Index
    int idx = _input.FindBestStream(AVMediaType.VIDEO, -1, -1);
    // Initialize Decoder For That Stream
    AVCodecContext decoder = _input.streams[idx].codec;
    var codec = AVCodec.FindDecoder(decoder.codec_id);   
    // Open Decoder Context
    if (decoder.Open(codec) == 0) {
        // Create frame and packet objects
        AVPacket pkt = new AVPacket();
        AVFrame frame = new AVFrame();
        int index = 0;
        // Reading Packets from the input file
        while ((bool)_input.ReadFrame(pkt)) {
            // Check the stream index 
            if (pkt.stream_index == idx) {    
                bool got_frame = false;
                // Decode Video 
                int ret = decoder.DecodeVideo(frame, ref got_frame, pkt);
                if (got_frame) {
                    // Once we got frame convert it into Bitmap object 
                    // and saves into a file
                    var bmp = AVFrame.ToBitmap(frame);
                    bmp.Save(string.Format(@"image{0}.png",++index));
                    bmp.Dispose();
                    // Free frame data
                    frame.Free();
                }
            }
            // Free packet data
            pkt.Free();
        }
    }
}

Frame objects can be initialized from existing .NET Bitmap object. In the code below, the frame is created from the image and converted into YUV 420 planar colorspace format.

var frame = AVFrame.FromImage((Bitmap)Bitmap.FromFile(@"image.jpg"),
            AVPixelFormat.YUV420P);

It is also possible to have wrap around existing bitmap data in that case, there is no need to specify target pixel format as the second argument. The pixel format of such frame data will be BGRA, and the image data will not be copied into the newly allocated buffer. But, internally, the child class object of the wrapped image data will be created which will be freed once the actual frame will be disposed.

There are also some helper methods which can be used for converting video frames:

var src = AVFrame.FromImage((Bitmap)Bitmap.FromFile(@"image.jpg"));
var frame = AVFrame.ConvertVideoFrame(src,AVPixelFormat.YUV420P);

And conversion back into .NET Bitmap object:

var bmp = AVFrame.ToBitmap(frame);
bmp.Save(@"image.png");

AVCodec

Class which is implemented managed access to AVCodec structure of the FFmpeg libavcodec library. As recommended, it contains only public fields.of the underlying structure. Class describes registered codec in the library.

var codec = AVCodec.FindEncoder("libx264");
Console.WriteLine("{0}",codec);

Class is able to enumerate existing codecs:

foreach (AVCodec codec in AVCodec.Codecs)
{
    Console.WriteLine(codec);
}

AVCodecContext

Class which is implemented managed access to AVCodecContext structure of the FFmpeg libavcodec library. Object is used for encoding or decoding video and audio data.

// Create Encoder Context
var c = new AVCodecContext(AVCodec.FindEncoder("libx264"));
// Initialize Encoder Parameters
c.bit_rate = 400000;
c.width = 352;
c.height = 288;
c.time_base = new AVRational(1, 25);
c.framerate = new AVRational(25, 1);
c.gop_size = 10;
c.max_b_frames = 1;
c.pix_fmt =  AVPixelFormat.YUV420P;
// Open Context
c.Open(null);
Console.WriteLine(c);

In the example above, we create a context object and initialize the H264 encoder of the video with resolution 352x288 with 25 frames per second and 400 kbps bit rate.

Next sample demonstrates opening the file and initializing the decoder for the video stream:

AVFormatContext fmt_ctx;
// Open Input File Context
AVFormatContext.OpenInput(out fmt_ctx, @"test.mp4");
// Check streams
fmt_ctx.FindStreamInfo();
// Get the Video Stream Object
var s = fmt_ctx.streams[fmt_ctx.FindBestStream(AVMediaType.VIDEO)];
// Create Decoder Context object for that stream
var c = new AVCodecContext(AVCodec.FindDecoder(s.codecpar.codec_id));
// Copy decoder parameters from the stream
s.codecpar.CopyTo(c);
// Open Codec Context
c.Open(null);
Console.WriteLine(c);

Class contains methods for initializing structure and decoding or encoding audio and video.

AVCodecParser

Class for accessing the AVCodecParser structure of the FFmpeg libavcodec library from managed code. Structure describes registered codec parsers. Class mostly used for enumerating registered parsers in the library:

foreach (AVCodecParser parser in AVCodecParser.Parsers)
{
    for (int i = 0; i < parser.codec_ids.Length; i++)
    {
        Console.Write("{0} ", parser.codec_ids[i]);
    }
}

AVCodecParserContext

Class for performing bitstream parse operation. It is wrapped around the AVCodecParserContext structure of the FFmpeg libavcodec library. It is created with the specified codec and contains different variations of the parse method.

// Create Decoder Context for MP3 audio
var decoder = new AVCodecContext(AVCodec.FindDecoder(AVCodecID.MP3));
// Specify desired decoder sample format
decoder.sample_fmt = AVSampleFormat.FLTP;
// Open the context
if (decoder.Open(null) == 0) {
    // Create parset context for our codec
    var parser = new AVCodecParserContext(decoder.codec_id);
    AVPacket pkt = new AVPacket();
    AVFrame frame = new AVFrame();
    // Opens the mp3 file
    var f = fopen(@"test.mp3", "rb");
    const int buffer_size = 1024;
    IntPtr data = Marshal.AllocCoTaskMem(buffer_size);
    // Reading the file data      
    int data_size = fread(data, 1, buffer_size, f);
    while (data_size > 0) {
        // Parse readed data 
        int ret = parser.Parse(decoder, pkt, data, data_size);
        if (ret < 0) break;
        data_size -= ret;
        // Once we have packet data
        if (pkt.size > 0) {
            bool got_frame = false;
            // Decode it
            decoder.DecodeAudio(frame,ref got_frame,pkt);
            if (got_frame) {
                // Output IEEE floats wrap into characters
                var p = frame.data[0].floats;
                for (int i = 0; i < frame.nb_samples; i++) {
                    Console.Write((char)(((p[i] + 1.0) * 54) + 32));
                }
                // Free frame buffer
                frame.Free();
            }
            // Free packet buffer
            pkt.Free();
        }
        // Shift outstanding data and continue file reading
        memmove(data, data + ret, data_size);
        data_size += fread(data + data_size, 1, buffer_size - data_size, f);
    }
    // Free allocated buffer and close file
    Marshal.FreeCoTaskMem(data);
    fclose(f);
}

Example demonstrates reading data from raw mp3 file, parsing it by chunks and decode with simple print clamped data into output. Example uses some wrappers of the unmanaged API’s: fopen, fread and fclose. Those function declarations can be found in other samples in this document.

AVRational

Class which is implemented managed access to AVRational structure of the FFmpeg libavutil library. It contains all methods which are related to that structure.

AVRational r = new AVRational(25,1);
Console.WriteLine("{0}, {1}",r.ToString(),r.inv_q().ToString());

AVFormatContext

Class which is implemented managed access to AVFormatContext structure of the FFmpeg libavformat library. Class contains methods for operating with output or input media data.

var ic = AVFormatContext.OpenInputFile(@"test.mp4");
ic.DumpFormat(0,null,false);

Most use case of the class is getting audio or video packets from input or writing such packets into output. Next sample demonstrates reading mp3 audio packets from an AVI file and saves them into a separate file.

// Open input file context
var ic = AVFormatContext.OpenInputFile(@"1.avi");
// Check streams
if (0 == ic.FindStreamInfo()) {
    // Get the audio stream index
    int idx = ic.FindBestStream(AVMediaType.AUDIO);
    if (idx >= 0 
        && ic.streams[idx].codecpar.codec_id == AVCodecID.MP3) {
        // Start reading packets in case if we have MP3 audio
        AVPacket pkt = new AVPacket();
        bool append = false;
        while (ic.ReadFrame(pkt) == 0) {
            if (pkt.stream_index == idx) {
                // Saves packets into a file
                pkt.Dump(@"out.mp3",append);
                append = true;
            }
            pkt.Free();
        }
        pkt.Dispose();
    }
}
ic.Dispose();

And another main task, as mentioned, is the output packets. The next example demonstrates outputting a static picture to a mp4 file with H264 encoding for a 10 seconds long:

string filename = @"test.mp4";
// Open output file context
var oc = AVFormatContext.OpenOutput(filename);
// Create frame from image file
var frame = AVFrame.FromImage((Bitmap)Bitmap.FromFile(@"image.jpg"),
            AVPixelFormat.YUV420P);
// Create and initialize video encoder context
var codec = AVCodec.FindEncoder(AVCodecID.H264);
var ctx = new AVCodecContext(codec);
// Add new stream 
var st = oc.AddStream(codec);
// Set stream settings
st.id = oc.nb_streams-1;
st.time_base = new AVRational( 1, 25 );
// Set encoder parameters
ctx.bit_rate = 400000;
ctx.width = frame.width;
ctx.height = frame.height;
ctx.time_base = st.time_base;
ctx.framerate = ctx.time_base.inv_q();
ctx.pix_fmt = frame.format;
// Open encoder
ctx.Open(null);
// Copy parameters into stream
st.codecpar.FromContext(ctx);
// Open underlaying file IO context
if ((int)(oc.oformat.flags & AVfmt.NOFILE) == 0) {
    oc.pb = new AVIOContext(filename,AvioFlag.WRITE);
}
// Display format information
oc.DumpFormat(0, filename, true);
// Start writing into the file
oc.WriteHeader();
AVPacket pkt = new AVPacket();
int idx = 0;
bool flush = false;
while (true) {
    // Output for 10 seconds long
    flush = (idx > ctx.framerate.num * 10 / ctx.framerate.den);
    frame.pts = idx++;
    bool got_packet = false;
    // Encode video frame
    if (0 < ctx.EncodeVideo(pkt, flush ? null : frame, ref got_packet)) break;
    if (got_packet) {
        // Configure timestamps
        pkt.RescaleTS(ctx.time_base, st.time_base);
        // Write resulted packet into the file
        oc.WriteFrame(pkt);
        pkt.Free();
        continue;
    }
    if (flush) break;
}
// End writing
oc.WriteTrailer();
pkt.Dispose();
frame.Dispose();
ctx.Dispose();
oc.Dispose();

And the application output results:

AVIOContext

Class which is implemented managed access to AVIOContext structure of the FFmpeg libavformat library. Class contains helper methods for writing/reading data from byte streams. The most interesting usage of this class is the ability to set up custom reading or writing callbacks. To demonstrates how to use those features, let's modify previous code sample with replacement AVIOContext creation code:

AVMemPtr ptr = new AVMemPtr(20 * 1024);
OutputFile file = new OutputFile(filename);
if ((int)(oc.oformat.flags & AVfmt.NOFILE) == 0) {
    oc.pb = new AVIOContext(ptr,1,file,null,OutputFile.WritePacket,OutputFile.Seek);
}

So right now, we create a context with defining output and seek callback. Those callbacks are defined in OutputFile class of the example:

class OutputFile
{
    // File Handle
    IntPtr stream = IntPtr.Zero;

    public OutputFile(string filename) {
        // Opens the file for writing
        stream = fopen(filename, "w+b");
    }
    ~OutputFile() {
        // Close the file
        IntPtr s = Interlocked.Exchange(ref stream, IntPtr.Zero);
        if (s != IntPtr.Zero) fclose(s);
    }
    
    // Write data callback
    public static int WritePacket(object opaque, IntPtr buf, int buf_size) {
        return fwrite(buf, 1, buf_size, (opaque as OutputFile).stream);
    }

    // File seek callback
    public static int Seek(object opaque, long offset, AVSeek whence) {
        return fseek((opaque as OutputFile).stream, (int)offset, (int)whence);
    }

    [DllImport("msvcrt.dll", EntryPoint = "fopen")]
    public static extern IntPtr fopen(
        [MarshalAs(UnmanagedType.LPStr)] string filename,
        [MarshalAs(UnmanagedType.LPStr)] string mode);

    [DllImport("msvcrt.dll")]
    public static extern int fwrite(IntPtr buffer, int size, int count, IntPtr stream);

    [DllImport("msvcrt.dll")]
    public static extern int fclose(IntPtr stream);

    [DllImport("msvcrt.dll")]
    public static extern int fseek(IntPtr stream, long offset, int origin);
}

By running the sample, you got the same output file and same results as in the previous topic, but now you can control each writing and seeking of the data.

Class also contains some helper static methods. It can check for URLs and enumerate supported protocols:

Console.Write("Output Protocols: \n");
var protocols = AVIOContext.EnumProtocols(true);
foreach (var s in protocols) {
    Console.Write("{0} ",s);
}

AVOutputFormat

Class which is implemented managed access to AVOutputFormat structure of the FFmpeg libavformat library. Structure describes parameters of an output format supported by the libavformat library. Class contains enumerator to see all available formats:

foreach (AVOutputFormat format in AVOutputFormat.Formats)
{
    Console.WriteLine(format.ToString());
}

Class has no public constructor, but it can be accessed with enumerator or static methods:

var fmt = AVOutputFormat.GuessFormat(null,"test.mp4",null);
Console.WriteLine(fmt.long_name);

AVInputFormat

Class which is implemented managed access to AVInputFormat structure of the FFmpeg libavformat library. The structure describes parameters of an input format supported by the libavformat library. Class also contains enumerator to see all available formats:

foreach (AVInputFormat format in AVInputFormat.Formats)
{
    Console.WriteLine(format.ToString());
}

Class has no public constructor. Can be accessed from static methods:

var fmt = AVInputFormat.FindInputFormat("avi");
Console.WriteLine(fmt.long_name);

It is also possible to detect format of giving buffer, also using static methods:

// Opens the file
var f = fopen(@"test.mp4", "rb");
// Setting up probe data and allocate buffer
AVProbeData data = new AVProbeData();
data.buf_size = 1024;
data.buf = Marshal.AllocCoTaskMem(data.buf_size);
// Read data from the file
data.buf_size = fread(data.buf,1,data.buf_size,f);
// Check the data 
var fmt = AVInputFormat.ProbeInputFormat(data,true);
Console.WriteLine(fmt.ToString());
// Free memory and close file
Marshal.FreeCoTaskMem(data.buf);
data.Dispose();
fclose(f);

AVStream

Class which is implemented managed access to AVStream structure of the FFmpeg libavformat library. Class has no public constructor and can be accessed from the AVFormatContext streams property.

var input = AVFormatContext.OpenInputFile(@"test.mp4");
if (input.FindStreamInfo() == 0){
    for (int idx = 0; idx < input.streams.Count; idx++)
    {
        Console.WriteLine("Stream [{0}]: {1}",idx,
            input.streams[idx].codecpar.codec_type);
    }
}
input.DumpFormat(0,null,false);

Streams can be created with the AVFormatContext AddStream method.

// Open Output Context
var oc = AVFormatContext.OpenOutput(@"test.mp4");
// Find Encoder
var codec = AVCodec.FindEncoder(AVCodecID.H264);
// Create Encoder Context
var c = new AVCodecContext(codec);
// Add Video Stream to Output
var st = oc.AddStream(codec);
st.id = oc.nb_streams-1;
st.time_base = new AVRational( 1, 25 );
// Initialize codec parameters
c.codec_id = codec.id;
c.bit_rate = 400000;
c.width    = 352;
c.height   = 288;
c.time_base  = st.time_base;
c.gop_size   = 12;
c.pix_fmt    = AVPixelFormat.YUV420P;
// Open encoder context
c.Open(codec);
// Copy codec parameters to stream
st.codecpar.FromContext(c);
oc.DumpFormat(0, null, true);

AVDictionary

Class which is implemented managed access to AVDictionary collection type of the FFmpeg libavutil library. This class is the name-value collection of the string values and contains methods, enumerators and properties for accessing them. There is no ability in FFmpeg API to remove special entries from the collection, so we have the same functionality in the wrapper library. AVDictionary class supports enumeration of the key-values entries. Class also supports copy data with usage of the ICloneable interface.

AVDictionary dict = new AVDictionary();
dict.SetValue("Key1","Value1");
dict.SetValue("Key2","Value2");
dict.SetValue("Key3","Value3");
Console.Write("Number of elements: \"{0}\"\nIndex of \"Key2\": \"{1}\"\n",
    dict.Count,dict.IndexOf("Key2"));
Console.Write("Keys:\t");
foreach (string key in dict.Keys) {
    Console.Write(" \"" + key + "\"");
}
Console.Write("\nValues:\t");
for (int i = 0; i < dict.Values.Count; i++) {
    Console.Write(" \"" + dict.Values[i] + "\"");
}
Console.Write("\nValue[\"Key1\"]:\t\"{0}\"\nValue[3]:\t\"{1}\"\n",
    dict["Key1"],dict[2]);
AVDictionary cloned = (AVDictionary)dict.Clone();
Console.Write("Cloned Entries: \n");
foreach (AVDictionaryEntry ent in cloned) {
    Console.Write("\"{0}\"=\"{1}\"\n", ent.key, ent.value);
}
cloned.Dispose();
dict.Dispose();

Result of execution code above is:

AVOptions

Helper class which exposes functionality accessing options of the library objects. Objects can be created from pointers or from AVBase class as were already mentioned. Example of enumerating existing object options:

// Create Encoder Context
var ctx = new AVCodecContext(AVCodec.FindEncoder(AVCodecID.H264));
Console.WriteLine("{0} Options:",ctx);
// Create options object for encoder context
var options = new AVOptions(ctx);
for (int i = 0; i < options.Count; i++)
{
    string value = "";
    if (0 == options.get(options[i].name, AVOptSearch.None, out value))
    {
        Console.WriteLine("{0}=\"{1}\"\t'{2}'",options[i].name,value,options[i].help);
    }
}

Class gives the ability to enumerate option names and option parameters and also gets or sets options values of the object.

var ctx = new AVCodecContext(AVCodec.FindEncoder(AVCodecID.H264));
var options = new AVOptions(ctx);
string initial = "";
options.get("b", AVOptSearch.None, out initial);
options.set_int("b", 100000, AVOptSearch.None);
string updated = "";
options.get("b", AVOptSearch.None, out updated);
Console.WriteLine("Initial=\"{0}\", Updated\"{1}\"",initial,updated);

Due to compatibility with old wrapper library versions, some properties of library classes internally set object options instead of usage direct fields. Example of how that done in a AVFilterGraph scale_sws_opts field is given below:

MC++

void FFmpeg::AVFilterGraph::scale_sws_opts::set(String^ value)
{
	auto p = _CreateObject<AVOptions>(m_pPointer);
	p->set("scale_sws_opts",value,AVOptSearch::None);
}

Available options also can be enumerated via AVClass object of the library:

// Get available options for AVCodecContext
var options = AVClass.GetCodecClass().option;
for (int i = 0; i < options.Length; i++)
{
    Console.WriteLine("\"{0}\"",options[i]);
}

As in FFmpeg, each context structure has AVClass object access, so the next code gives the same results:

var ctx = new AVCodecContext(AVCodec.FindEncoder(AVCodecID.H264));
var options = ctx.av_class.option;
for (int i = 0; i < options.Length; i++)
{
    Console.WriteLine("\"{0}\"",options[i]);
}

AVBitStreamFilter

Class which is implemented managed access to AVBitStreamFilter type of the FFmpeg libavcodec library. It contains the name and array of codec ids to which filter can be applied. All filters can be enumerated:

foreach (AVBitStreamFilter f in AVBitStreamFilter.Filters)
{
    Console.WriteLine("{0}",f);
}

Filter can be searched by name:

var f = AVBitStreamFilter.GetByName("h264_mp4toannexb");
Console.Write("{0} Codecs: ",f);
foreach (AVCodecID id in f.codec_ids)
{
    Console.Write("{0} ",id);
}

AVBSFContext

Class which is implemented managed access to AVBSFContext structure type of the FFmpeg libavcodec library. It is an instance of Bit Stream Filter and contains methods for filtering operations.

Example of class usage is as given below:

// Open Input File Context
var input = AVFormatContext.OpenInputFile(@"test.mp4");
if (input.FindStreamInfo() == 0) {
    // Get video stream index
    int idx = input.FindBestStream(AVMediaType.VIDEO);
    // Create Bit Stream Filter Context 
    var ctx = new AVBSFContext(
        AVBitStreamFilter.GetByName("h264_mp4toannexb"));
    // Set context parameters from the stream
    input.streams[idx].codecpar.CopyTo(ctx.par_in);
    ctx.time_base_in = input.streams[idx].time_base;
    // Initialize Context
    if (0 == ctx.Init()) {
        bool append = false;
        AVPacket pkt = new AVPacket();
        // Read Packets from the file
        while (0 == input.ReadFrame(pkt)) {
            // Process video packets
            if (pkt.stream_index == idx) {
                ctx.SendPacket(pkt);
            }
            pkt.Free();
            if (0 == ctx.ReceivePacket(pkt)) {
                // Save resulted packet data 
                pkt.Dump(@"out.h264",append);
                pkt.Free();
                append = true;
            }
        }
        pkt.Dispose();
    }
    ctx.Dispose();
}

The code above performs conversion of H264 video stream from opened mp4 media file into annexb format - format with start codes, and saves the resulting bitstream into a file. The file, which is produced, can be played with VLC media player, with graphedit tool or with ffplay.

In the hex dump of a file, it is possible to see that data is in annexb format:

AVChannelLayout

Helper value class of definitions AV_CH_* for audio channels masks from libavutil library. Class contains all available definitions as public static class fields:

Class represent the 64 bit integer value and cover implicit conversions operators. It also contains helper methods as wrap of FFmpeg library APIs and has overridden string representation of the value.

AVChannelLayout ch = AVChannelLayout.LAYOUT_5POINT1;
Console.WriteLine("{0} Channels: \"{1}\" Description: \"{2}\"",
    ch,ch.channels,ch.description);
AVChannelLayout ex = ch.extract_channel(2);
Console.WriteLine("Extracted 2 Channel: {0} Index: \"{1}\" Name: \"{2}\"",
    ex,ch.get_channel_index(ex),ex.name);
ch = AVChannelLayout.get_default_channel_layout(7);
Console.WriteLine("Default layout for 7 channels {0} "+
    "Channels: \"{1}\" Description: \"{2}\"",
    ch,ch.channels,ch.description);

The code above displays basic operations with the class. All class methods are based on FFmpeg library APIs, so it is easy to understand how to use them. The execution result of the code above:

There is also an AVChannels class in a wrapper library which is the separate static class with exported APIs for AVChannelLayout.

AVPixelFormat

Another helper value class for enumeration with the same name from libavutil library. It describes the pixel format of video data. Class contains all formats which are exposed by original enumeration. Along with it, the class extended with methods and properties based on FFmpeg APIs. It handles implicit conversion from to integer types.

AVPixelFormat fmt = AVPixelFormat.YUV420P;
byte[] bytes = BitConverter.GetBytes(fmt.codec_tag);
string tag = "";
for (int i = 0; i < bytes.Length; i++) { tag += (char)bytes[i]; }
Console.WriteLine("Format: {0} Name: {1} Tag: {2} Planes: {3} " +
    "Components: {4} Bits Per Pixel: {5}",(int)fmt,
    fmt.name,tag,fmt.planes, fmt.format.nb_components,
    fmt.format.bits_per_pixel);
fmt = AVPixelFormat.get_pix_fmt("rgb32");
Console.WriteLine("{0} {1}",fmt.name, (int)fmt);
FFLoss loss = FFLoss.NONE;
fmt = AVPixelFormat.find_best_pix_fmt_of_2(AVPixelFormat.RGB24,
    AVPixelFormat.RGB32,AVPixelFormat.RGB444BE,false,ref loss);
Console.WriteLine("{0} {1}",fmt.name, (int)fmt);

Execution of the code above gives a result:

AVSampleFormat

Also a helper class for libavutil enumeration with the same name. It manages the format description of the audio data. As other classes, it handles basic operations and exposes properties and methods which are done as FFmpeg APIs.

AVSampleFormat fmt = AVSampleFormat.FLTP;
Console.WriteLine("Format: {0}, Name: {1}, Planar: {2}, Bytes Per Sample: {3}",
    (int)fmt,fmt.name,fmt.is_planar,fmt.bytes_per_sample);
fmt = AVSampleFormat.get_sample_fmt("s16");
var alt = fmt.get_alt_sample_fmt(false);
var pln = fmt.get_planar_sample_fmt();
Console.WriteLine("Format: {0} Alt: {1} Planar: {2}",fmt, alt, pln);

FFmpeg APIs which operate with AVSampleFormat are also exported in a separate static class AVSampleFmt.

AVSamples

Helper static class which exposes methods for libavutil APIs and operates with audio data samples. It is able to allocate buffers for storing audio data with different formats and copy data from extended buffers into the AVFrame/AVPicture class.

Next code displays how to allocate a buffer for audio data, filling that buffer with silence and copying the decoded audio frame to that buffer.

// Open input file
var input = AVFormatContext.OpenInputFile(@"test.mp4");
if (input.FindStreamInfo() == 0) {
    // Open Decoder Context For the Audio Stream
    int idx = input.FindBestStream(AVMediaType.AUDIO);
    AVCodecContext decoder = input.streams[idx].codec;
    var codec = AVCodec.FindDecoder(decoder.codec_id);
    if (decoder.Open(codec) == 0) {
        AVPacket pkt = new AVPacket();
        AVFrame frame = new AVFrame();
        bool got_frame = false;
        // Read packets from file
        while (input.ReadFrame(pkt) == 0 && !got_frame) {
            if (pkt.stream_index == idx) {
                // Decode audio data
                decoder.DecodeAudio(frame,ref got_frame,pkt);
                if (got_frame) {
                    // Gets the size of the buffer
                    int size = AVSamples.get_buffer_size(frame.channels,
                        frame.nb_samples, frame.format);
                    // Allocate the buffer
                    AVMemPtr ptr = new AVMemPtr(size);
                    IntPtr[] data = new IntPtr[frame.channels];
                    // Setup channels pointers
                    AVSamples.fill_arrays(ref data, ptr, 
                        frame.channels, frame.nb_samples, frame.format);
                    // Set silence to created buffer
                    AVSamples.set_silence(data, 0, 
                        frame.nb_samples, frame.channels, frame.format);
                    ptr.Dump(@"silence.bin");
                    // Copy decode data into that buffer
                    AVSamples.copy(data, frame, 0, 0, frame.nb_samples);
                    ptr.Dump(@"data.bin");
                    ptr.Dispose();
                    frame.Free();
                }
            }
            pkt.Free();
        }
        frame.Dispose();
        pkt.Dispose();
    }
}

AVMath

Helper static class which exposes some useful mathematics APIs from libavutil library. Of course, it is not mandatory to use them as we have math from .NET, but in some cases, it can be helpful. For example, timestamps conversion from one base into another and timestamps comparing.

AVCodecContext c = new AVCodecContext(null);
//... Initialize encoder context
AVFrame frame = new AVFrame();
//... Prepare frame data 
frame.pts = AVMath.rescale_q(frame.nb_samples, 
            new AVRational(1, c.sample_rate), c.time_base);
//... Pass frame to encoder

In the code snippet above, performed conversion from sample rate basis into context time base timestamp which later will be passed to encoder.

SwrContext

Managed wrapper of resampling context structure of libswresample library. Class contains methods and properties to work with underlying library structure. It allows to set properties, initialize context, and perform resampling of the audio data.

The next code sample displays how to initialize resampling context and perform conversion. It opens the file with an audio stream. Decode audio data. Resampling resulted data into S16 stereo format with the same sample rate as were on input and saved the result into a binary file. Definitions of fopen, fwrite and fclose functions can be found in AVIOContext code samples.

// Open input file
var input = AVFormatContext.OpenInputFile(@"test.mp4");
if (input.FindStreamInfo() == 0) {
    // Initialize and open decoder context for audio stream
    int idx = input.FindBestStream(AVMediaType.AUDIO);
    AVCodecContext decoder = input.streams[idx].codec;
    var codec = AVCodec.FindDecoder(decoder.codec_id);
    decoder.sample_fmt = codec.sample_fmts != null ? 
                         codec.sample_fmts[0] : AVSampleFormat.FLTP;
    if (decoder.Open(codec) == 0) {
        AVPacket pkt = new AVPacket();
        AVFrame frame = new AVFrame();
        AVMemPtr ptr = null;
        SwrContext swr = null;
        // Open destination file
        IntPtr file = fopen(@"out.bin","w+b");
        // Reading packets from input
        while (input.ReadFrame(pkt) == 0) {
            if (pkt.stream_index == idx) {
                bool got_frame = false;
                // Decode audio
                decoder.DecodeAudio(frame,ref got_frame,pkt);
                if (got_frame) {
                    int size = AVSamples.get_buffer_size(frame.channels,
                        frame.nb_samples, frame.format);
                    if (swr == null) {
                        // Create resampling context
                        swr = new SwrContext(AVChannelLayout.LAYOUT_STEREO, 
                            AVSampleFormat.S16, frame.sample_rate, 
                            frame.channel_layout, frame.format, frame.sample_rate);
                        swr.Init();
                    }
                    if (ptr != null && ptr.size < size) {
                        ptr.Dispose();
                        ptr = null;
                    }
                    if (ptr == null) {
                        // Allocate output buffer 
                        ptr = new AVMemPtr(size);
                    }
                    // Perform resampling
                    int count = swr.Convert(new IntPtr[] { ptr },
                                frame.nb_samples,frame.data,frame.nb_samples);
                    if (count > 0) {
                        int bps = AVSampleFmt.get_bytes_per_sample(AVSampleFormat.S16) 
                                  * frame.channels;
                        // Saving all data into output file
                        fwrite(ptr,bps,count,file);
                    }
                    frame.Free();
                }
            }
            pkt.Free();
        }
        fclose(file);
        if (swr != null) swr.Dispose();
        if (ptr != null) ptr.Dispose();
        frame.Dispose();
        pkt.Dispose();
    }
}

The resulting file can be played with ffplay with format parameters. In case your audio data has 44100 Hz sample rate, otherwise replace rate in next command line arguments.

SwsContext

Managed wrapper of scaling context structure of libswscale library. Class supports scaling and colorspace conversion of the video data.

The next sample code displays how to initialize scaling context and perform conversion of the video frames into image files. It opens the file with a video stream. Decode video data. Initialize context and temporal data pointer, create a .NET Bitmap object associated with an allocated data pointer, and save each frame to a file.

// Open input file
var input = AVFormatContext.OpenInputFile(@"test.mp4");
if (input.FindStreamInfo() == 0) {
    // Open decoder context for the video stream
    int idx = input.FindBestStream(AVMediaType.VIDEO);
    AVCodecContext decoder = input.streams[idx].codec;
    var codec = AVCodec.FindDecoder(decoder.codec_id);
    if (decoder.Open(codec) == 0) {
        AVPacket pkt = new AVPacket();
        AVFrame frame = new AVFrame();
        AVMemPtr ptr = null;
        SwsContext sws = null;
        int image = 0;
        Bitmap bmp = null;
        // Reading packets
        while (input.ReadFrame(pkt) == 0) {
            if (pkt.stream_index == idx) {
                bool got_frame = false;
                // Decode video
                decoder.DecodeVideo(frame,ref got_frame,pkt);
                if (got_frame) {
                    if (sws == null) {
                        // Create scaling context object
                        sws = new SwsContext(frame.width, frame.height, frame.format,
                              frame.width, frame.height, AVPixelFormat.BGRA, 
                              SwsFlags.FastBilinear);
                    }
                    if (ptr == null) {
                        // Allocate picture buffer
                        int size = AVPicture.GetSize(AVPixelFormat.BGRA,
                                   frame.width, frame.height);
                        ptr = new AVMemPtr(size);
                    }
                    // Convert input frame into RGBA image
                    sws.Scale(frame,0,frame.height,new IntPtr[] { ptr },
                              new int[] { frame.width << 2 });
                    if (bmp == null)
                    {
                        // Create Bitmap object for given buffer
                        bmp = new Bitmap(frame.width, frame.height,
                            frame.width << 2, PixelFormat.Format32bppRgb, ptr);
                    }
                    // Saves bitmap into a file
                    bmp.Save(string.Format(@"image{0}.png",image++));
                    frame.Free();
                }
            }
            pkt.Free();
        }
        if (sws != null) sws.Dispose();
        if (ptr != null) ptr.Dispose();
        if (bmp != null) bmp.Dispose();
        frame.Dispose();
        pkt.Dispose();
    }
}

Such frame conversion is implemented by the library with the AVFrame.ToBitmap() method. You can see how that is done in AVFrame sample code. The difference in the current example is that the data buffer and bitmap object are allocated only once, which gives better performance.

AVFilter

Managed wrapper of AVFilter structure of libavfilter library. Class describes every filter item in the filtering platform. It defines object fields and methods. Class has no public constructor, and can be accessed from the static methods.

var f = AVFilter.GetByName("vflip");
Console.WriteLine("Name: \"{0}\"\nDescription: \"{1}\"\nFlags: {2}, ",
    f.name, f.description,f.flags);
for (int i = 0; i < f.inputs.Count; i++)
    Console.WriteLine("Input[{0}] Name: \"{1}\" Type: \"{2}\"", 
        i + 1, f.inputs[i].name, f.inputs[i].type);
for (int i = 0; i < f.outputs.Count; i++)
    Console.WriteLine("Output[{0}] Name: \"{1}\" Type: \"{2}\"", 
        i + 1, f.outputs[i].name, f.outputs[i].type);

Class also has the ability to enumerate existing filters available in the library.

foreach (var f in AVFilter.Filters)
{
    Console.WriteLine(f);
}

AVFilterContext

Managed class represents AVFilterContext structure of the libavfilter library. It is used to describe filter instances in a filter graph. It has no public constructor, it is only able to be created by calling AVFilterGraph class methods.

// Create filter graph object
AVFilterGraph graph = new AVFilterGraph();
// Create vertical flip filter context instance
AVFilterContext ctx = graph.CreateFilter(AVFilter.GetByName("vflip"),"My Filter");
Console.WriteLine("Name: \"{0}\" Filter: \"{1}\" Ready: \"{2}\"",
    ctx.name, ctx.filter, ctx.ready);

Each filter context output can be linked with the context input of the other filter. There are few special filters: sink and source. Source able to receive data and should be inserted first into the graph chain. And the sink is the endpoint of the chain. Sink provides output from the filter graph.

Source filter have special name: “buffer” for video and “abuffer” for the audio, creation such filter always must have the initialization parameters:

AVFilterGraph graph = new AVFilterGraph();
// Create video source filter context
AVFilterContext src = graph.CreateFilter(AVFilter.GetByName("buffer"),"in",
    "video_size=640x480:pix_fmt=0:time_base=1/25:pixel_aspect=1/1",IntPtr.Zero);

Destination filter names are “buffersink” and “abuffersink” for video and audio respectively. Extend the previous code by adding sink filter creation and connecting it with previously created video source.

// Create video sink filter context
var sink = graph.CreateFilter(AVFilter.GetByName("buffersink"),"out");
// Connect sink with source directly
src.Link(sink);
// Configure filter graph
graph.Config();
var input = sink.inputs[0];
Console.WriteLine("Type: \"{0}\" W: \"{1}\" H: \"{2}\" Format: \"{3}\"",
    input.type, input.w, input.h,((AVPixelFormat)input.format).name);

Once we call the configuring method of the graph, it sets up the chain parameters, so we have format settings on the sink input.

There are special classes which help to operate with sink and source filters context. AVBufferSrc - have methods for filter initialization and providing input frames into underlying source filter context. AVBufferSink - contains abilities for getting properties of the graph endpoint and receiving resulting frames.

The next complete example code performs a vertical flip filtering effect of the picture from one image file and saves the result into another file:

// Create frame from image file
var frame = AVFrame.FromImage((Bitmap)Bitmap.FromFile(@"image.jpg"),
    AVPixelFormat.YUV420P);
string fmt = string.Format(
        "video_size={0}x{1}:pix_fmt={2}:time_base=1/1:pixel_aspect1/1",
        frame.width, frame.height, (int)frame.format);

// Create filter graph
AVFilterGraph graph = new AVFilterGraph();
// Create source context
AVFilterContext src = graph.CreateFilter(AVFilter.GetByName("buffer"),
                      "in",fmt, IntPtr.Zero);
// Create vertical flip filter context
var flip = graph.CreateFilter(AVFilter.GetByName("vflip"),"My Filter");
// Create sink context
var sink = graph.CreateFilter(AVFilter.GetByName("buffersink"),"out");

// Connect filters
src.Link(flip);
flip.Link(sink);
graph.Config();

// Create sink and source helper objects
AVBufferSink _sink = new AVBufferSink(sink);
AVBufferSrc _source = new AVBufferSrc(src);
// Add frame for processing
_source.add_frame(frame);
frame.Free();
// Get resulted frame
_sink.get_frame(frame);
// Save Frame into a file
frame.ToBitmap().Save(@"out_image.jpg");

AVFilterGraph

Managed class for AVFilterGraph structure of the libavfilter library. Class manages filters chain and connections between them. It contains graph configuring and filter context creation methods which were described previously. In addition, graphs have the ability to enumerate all filters in a chain:

AVFilterGraph graph = new AVFilterGraph();
var src = graph.CreateFilter(AVFilter.GetByName("buffer"),"in",
    "video_size=640x480:pix_fmt=0:time_base=1/25:pixel_aspect=1/1",IntPtr.Zero);
var flip = graph.CreateFilter(AVFilter.GetByName("vflip"),"My Filter");
var sink = graph.CreateFilter(AVFilter.GetByName("buffersink"),"out");

foreach (AVFilterContext f in graph.filters)
{
    Console.Write("\"" + f + "\" ");
}

In an additional way of graph initialization, class allows it to be generated from the parameters string, which includes filter name and its parameters. Source and sink filters in that case should be created in a regular way with setting up the Inputs and output parameters for intermediate filter chain initialization.

// Create filters graph
AVFilterGraph graph = new AVFilterGraph();
// Create sink and source context
var src = graph.CreateFilter(AVFilter.GetByName("buffer"),"in",
    "video_size=640x480:pix_fmt=0:time_base=1/25:pixel_aspect=1/1",IntPtr.Zero);
var sink = graph.CreateFilter(AVFilter.GetByName("buffersink"),"out");

// Setup outputs
AVFilterInOut outputs = new AVFilterInOut();
outputs.name       = "in";
outputs.filter_ctx = src;

// Setup inputs
AVFilterInOut inputs  = new AVFilterInOut();
inputs.name       = "out";
inputs.filter_ctx = sink;

// Initialize filters change from configuration string
graph.ParsePtr("vflip,scale=2:3", inputs, outputs);
graph.Config();
foreach (AVFilterContext f in graph.filters)
{
    Console.Write("\"" + f + "\" ");
}

The code above demonstrates filters chain creation and configuring graph from initialization parameters. The output from the sample:

We add “in” and “out” filters to the graph, setup inputs and outputs structures and call method to build a chain with two intermediate filters: vertical flip and scaling.

AVDevices

Managed static class for accessing libavdevice library APIs. All methods are static and allow access to collection of devices.

AVInputFormat fmt = null;
do {
    fmt = AVDevices.input_video_device_next(fmt);
    if (fmt != null) {
        Console.WriteLine(fmt);
    }
} while (fmt != null);

The devices registered for ability accessing with the libavformat API. Additional registration API call is not required as all necessary API calls are performed once you access either AVDevices class or any libavformat wrapper classes APIs. Each format class contains available options to access the target device from the input device subsystem. Options can be listed from the input format:

var fmt = AVInputFormat.FindInputFormat("dshow");
foreach (var opt in fmt.priv_class.option) {
    Console.WriteLine(opt);
}

Options should be set on AVFormatContext creation. You can specify resolution or device number. For example, let's see how can be displayed list of available DirectShow devices:

AVDictionary opt = new AVDictionary();
opt.SetValue("list_devices", "true");
var fmt = AVInputFormat.FindInputFormat("dshow");
AVFormatContext ctx = null;
AVFormatContext.OpenInput(out ctx, fmt, "video=dummy", opt);

AVDeviceInfoList

Managed class describing list of AVDeviceInfo structures of libavdevice library. Class handling devices collection, and can be accessed from methods of AVDevices class, without a public constructor.

AVDeviceInfoList devices = AVDevices.list_input_sources(
        AVInputFormat.FindInputFormat("dshow"),null,null);
if (devices != null) {
    foreach (AVDeviceInfo dev in devices) {
        Console.WriteLine(dev);
    }
}

AVDeviceInfo

Managed class for AVDeviceInfo structure of libavdevice library. Class contains strings of device description and device name. It can be accessed from the AVDeviceInfoList class.

Advanced

Based on the documentation above, we can arrange some advanced topics. In most cases of wrapper library usage, it may not be needed, but it is good to know any other features of implementation.

Delegates and Callbacks

In some cases of API usage, it is required to have the ability to set up callbacks for an API or a structure method. Such callback functions are designed to be implemented as static methods and they have opaque object as parameter. Each callback method has a delegate:

MC++

public delegate void AVBufferFreeCB(Object^ opaque, IntPtr buf);

In the library, it is only needed to know how the delegate of the method looks, and make a callback method with the same arguments and return type. Callbacks can be set as an argument of function or object constructor or even designed as object property.

static void buffer_free(object opaque, IntPtr buf)
{
    Marshal.FreeCoTaskMem(buf);
}

static void Main(string[] args)
{
    int cb = 1024 * 1024;
    var pkt = new AVPacket(Marshal.AllocCoTaskMem(cb), cb, buffer_free, null);
    //...
    pkt.Dispose();
}

The example above shows how to create AVPacket with custom allocated data. Once the packet is disposed, then the passed free callback method is called. As were mentioned, it is possible to use opaque and cast it to another object which is used as another argument for callback creation:

static int read_packet(object opaque, IntPtr buf, int buf_size)
{
    var s = (opaque as Stream);
    long available = (s.Length - s.Position);
    if (available == 0) return AVRESULT.EOF;
    if ((long)buf_size > available) buf_size = (int)available;
    var buffer = new byte[buf_size];
    buf_size = s.Read(buffer,0,buf_size);
    Marshal.Copy(buffer, 0, buf,buf_size);
    return buf_size;
}

static void Main(string[] args)
{
    var stream = File.OpenRead(@"Test.avi");
    var avio_ctx = new AVIOContext(new AVMemPtr(4096), 
                   0, stream, read_packet, null, null);
    //...
}

The example above demonstrates callback implementation which passed in the constructor of an AVIOContext object.

All callbacks in a library have their own background - the native callback function which stays as a layer between managed and unmanaged code and can provide arguments in a proper way into managed callback methods. This can be easily displayed then we set breakpoint in the callback method of the AVPacket sample code above:

In the call stack, it is possible to see that buffer_free callback method is called from an internal library function which converts opaque pointer into GCHandle of the related AVPacket object. And that object calls the common free handler of that class which as the result made call the passed callback method which was saved as a variable along with the user's opaque value.

The callbacks, which are set as property, do not have the same implementation. For example: AVCodecContext structure class uses an opaque object which is set to context property - so same way, it is done in native FFmpeg library:

// Format selector class
class FormatSelector
{
    private AVPixelFormat m_Format;
    // Save preferred format for selection
    public FormatSelector(AVPixelFormat fmt) { m_Format = fmt; }
    
    // Format selector handler
    public AVPixelFormat SelectFormat(AVPixelFormat[] fmts)
    {
        foreach (var fmt in fmts)
        {
            if (fmt == m_Format) return fmt;
        }
        return AVPixelFormat.NONE;
    }
}

// Get Pixel format callback
public static AVPixelFormat GetPixelFormat(AVCodecContext s, AVPixelFormat[] fmts)
{
    var h = GCHandle.FromIntPtr(s.opaque);
    if (h.IsAllocated)
    {
        // Call our object selector
        return ((FormatSelector)h.Target).SelectFormat(fmts);
    }
    // Call the default selector
    return s.default_get_format(fmts);
}

static void Main(string[] args)
{
    // Create selector object
    var selector = new FormatSelector(AVPixelFormat.YUV420P);
    // Create context
    var ctx = new AVCodecContext(null);
    // Sets opaque
    ctx.opaque = GCHandle.ToIntPtr(GCHandle.Alloc(selector, GCHandleType.Weak));
    // Set callback
    ctx.get_format = GetPixelFormat;
    //...
}

In the code above, we demonstrate how to set up a callback for the format selection property of the AVCodecContext structure. We create a selector class object where we try to select the YUV420P pixel format. This class, we provide as an opaque object for the context as a GCHandle pointer. The selector object must stay alive until the callback method can be used as we set a Weak handle type. We have a GetPixelFormat callback method where we use an AVCodecContext opaque object as our format selector class and use its method for performing selection. As we have a Weak handle type, then the underlying object can be freed and in such case, we call the default format selector method of the AVCodecContext class.

Dynamic API

The wrapper library is designed for linking to the specified .lib files from the FFmpeg libraries. The current implementation links to the FFmpeg API function entries which are exposed with those .lib files. But, as were mentioned, the wrapper library is designed to support different versions of FFmpeg libraries, and on newer versions, APIs can be deprecated or added optionally depending on FFmpeg build options. To properly handle building such cases, some imported APIs are checked dynamically in code. And internal object method implementation can differ depending on what API is present in FFmpeg libraries. That implementation is hidden internally, so users just see the regular one method. For example, on some version of FFmpeg, av_codec_next is not present and replacement of the functionality is done with av_codec_iterate API:

C++

const AVCodec *av_codec_iterate(void **opaque);

#if FF_API_NEXT
/**
 * If c is NULL, returns the first registered codec,
 * if c is non-NULL, returns the next registered codec after c,
 * or NULL if c is the last one.
 */
attribute_deprecated
AVCodec *av_codec_next(const AVCodec *c);
#endif

Also on new FFmpeg versions, an avcodec_register_all API is also deprecated and not present as an exported API.

To handle such version differences and avoid build errors, access to those API is made dynamically. Here is how the implementation of assessing avcodec_register_all API looks:

MC++

void FFmpeg::LibAVCodec::RegisterAll()
{
	AVBase::EnsureLibraryLoaded();
	if (!s_bRegistered) 
	{
		s_bRegistered = true;
		VOID_API(AVCodec,avcodec_register_all)
		avcodec_register_all();
	}
}

And here is how the codec iteration is done:

MC++

bool FFmpeg::AVCodec::AVCodecs::AVCodecEnumerator::MoveNext()
{
	AVBase::EnsureLibraryLoaded();
	const ::AVCodec * p = nullptr;
	void * opaque = m_pOpaque.ToPointer();
	LOAD_API(AVCodec,::AVCodec *,av_codec_next,const ::AVCodec*);
	LOAD_API(AVCodec,::AVCodec *,av_codec_iterate,void **);
	if (av_codec_iterate != nullptr)
	{
		p = av_codec_iterate(&opaque);
	}
	else
	{
		if (av_codec_next != nullptr)
		{
			p = av_codec_next((const ::AVCodec*)opaque);
			opaque = (void*)p;
		}
	}
	m_pOpaque = IntPtr(opaque);
	m_pCurrent = (p != nullptr) ? gcnew AVCodec((void*)p, nullptr) : nullptr;
	return (m_pCurrent != nullptr);
}

In the code above, there are some helper macros: VOID_API and LOAD_API which load API from specified DLL modules and if one or another API is present - uses it, or in other case, skip API call. The dynamic DLLs are static parts of AVBase class and loaded in its constructor call for the first time.

MC++

internal:
	static bool s_bDllLoaded = false;
	static HMODULE m_hLibAVUtil = nullptr;
	static HMODULE m_hLibAVCodec = nullptr;
	static HMODULE m_hLibAVFormat = nullptr;
	static HMODULE m_hLibAVFilter = nullptr;
	static HMODULE m_hLibAVDevice = nullptr;
	static HMODULE m_hLibPostproc = nullptr;
	static HMODULE m_hLibSwscale = nullptr;
	static HMODULE m_hLibSwresample = nullptr;

The helper macro defined in AVCore.h file:

MC++

//////////////////////////////////////////////////////
#define LOAD_API(lib,result,api,...) \
	typedef result (WINAPIV *PFN_##api)(__VA_ARGS__); \
	PFN_##api api = (AVBase::m_hLib##lib != nullptr ? 
                    (PFN_##api)GetProcAddress(AVBase::m_hLib##lib,#api) : nullptr);
//////////////////////////////////////////////////////
#define DYNAMIC_API(lib,result,api,...) \
		LOAD_API(lib,result,api,__VA_ARGS__); \
		if (api)
//////////////////////////////////////////////////////
#define DYNAMIC_DEF_API(lib,result,_default,api,...) \
	LOAD_API(lib,result,api,__VA_ARGS__); \
	if (!api) return _default;

#define DYNAMIC_DEF_SYM(lib,result,_default,sym) \
	void * pSym = (AVBase::m_hLib##lib != nullptr ? 
                   GetProcAddress(AVBase::m_hLib##lib,#sym) : nullptr); \
	if (!pSym) return _default; \
	result sym = (result)pSym;
//////////////////////////////////////////////////////
#define VOID_API(lib,api,...) DYNAMIC_API(lib,void,api,__VA_ARGS__)
#define PTR_API(lib,api,...)  DYNAMIC_API(lib,void *,api,__VA_ARGS__)
#define INT_API(lib,api,...)  DYNAMIC_API(lib,int,api,__VA_ARGS__)
#define INT_API2(lib,_default,api,...) DYNAMIC_DEF_API(lib,int,_default,api,__VA_ARGS__)
//////////////////////////////////////////////////////

LOAD_API macro defines the API variable and loads it from the specified FFmpeg library.

DYNAMIC_API - loads an API and calls the next code line if the API persists. This is good if required to switch between dynamic or static API linkage, or use different way access.

MC++

FFmpeg::AVRational^ FFmpeg::AVBufferSink::frame_rate::get()
{
	::AVRational r = ((::AVFilterLink*)m_pContext->
                     inputs[0]->_Pointer.ToPointer())->frame_rate;
	DYNAMIC_API(AVFilter,::AVRational,av_buffersink_get_frame_rate,::AVFilterContext *)
	r = av_buffersink_get_frame_rate((::AVFilterContext *)m_pContext->
        _Pointer.ToPointer());
	return gcnew AVRational(r.num,r.den);
}

DYNAMIC_DEF_API - loads API and if it is not able to be loaded, return specified default value.

MC++

int FFmpeg::AVBufferSink::format::get()
{
	DYNAMIC_DEF_API(AVFilter,int,m_pContext->inputs[0]->format,
                    av_buffersink_get_format,::AVFilterContext *);
	return av_buffersink_get_format((::AVFilterContext *)m_pContext->
                                   _Pointer.ToPointer());
}

Other macros are just variations of return types.

In case your exported API is used as the static method, or in class which is not a subclass of AVBase then make sure that before using those macro AVBase::EnsureLibraryLoaded(); method is called, or just check AVBase::s_bDllLoaded variable.

Structure Pointers

As mentioned earlier: each AVBase class exposes a pointer to the underlying FFmpeg structure. This is done for the ability to extend library functionality or manually manage existing API. As an example, AVBase._Pointer field can be used for the direct object cast or for manual usage in exported API:

[DllImport("avcodec-58.dll")]
private static extern void av_packet_move_ref(IntPtr dst,IntPtr src);

public static AVPacket api_raw_call_example(AVPacket pkt)
{
    // Create packet object
    AVPacket dst = new AVPacket();
    // Use packet structure pointer directly in exported API
    av_packet_move_ref(dst._Pointer,pkt._Pointer);
    return dst;
}

Note: The object methods can internally manage destructor and free structure APIs, so such usage of raw API has risk of memory leaks.

Raw pointers also can be used to access any structure fields directly.

Extending the Library

Just to show the benefits of implementing an entire wrapper library in C++/CLI, this topic was added. It is possible to implement different parts of the code in C# or other .NET languages directly by accessing the existing library objects properties. It is also possible to extend existing functionality by adding support for any API, which may be missed, or for any other needs. As already specified, those API can be used with the pointers to structures, and to handle such structures is used the AVBase class. To wrap your own structure, it is just necessary to inherit such a structure from the AVBase class and manage its properties. Here is a simple example of such implementation:

public class MyAVStruct : AVBase
{       
    [StructLayout(LayoutKind.Sequential,CharSet=CharSet.Ansi)]
    private struct S
    {
        [MarshalAs(UnmanagedType.I4)]
        public int Value;
        [MarshalAs(UnmanagedType.ByValTStr,SizeConst=200)]
        public string Name;
    }

    public MyAVStruct() {
        if (!base._EnsurePointer(false)) {
            base.AllocPointer(_StructureSize);
        }
    }

    public override int _StructureSize { get { return Marshal.SizeOf(typeof(S)); } }

    public int Value
    {
        get { return Marshal.PtrToStructure<S>(base._Pointer).Value;}
        set {
            if (base._EnsurePointer())
            {
                var s = Marshal.PtrToStructure<S>(base._Pointer);
                s.Value = value;
                Marshal.StructureToPtr<S>(s, base._Pointer, false);
            }
        }
    }

    public string Name
    {
        get { return Marshal.PtrToStructure<S>(base._Pointer).Name;}
        set {
            if (base._EnsurePointer())
            {
                var s = Marshal.PtrToStructure<S>(base._Pointer);
                s.Name = value;
                Marshal.StructureToPtr<S>(s, base._Pointer, false);
            }
        }
    }
}

We have a structure named “S” for which we make a wrap object “MyAVStruct” for the ability to access it from .NET. We allocate a data pointer by calling the AllocPointer method of the AVBase class. Also, we have access to each field of that structure by marshaling the whole structure each time we get or set property value. An example of the call that structure:

// Create Structure instance
var s = new MyAVStruct();
if (s._IsValid) // Check if the pointer allocated
{
    // Set field values
    s.Value = 22;
    s.Name = "Some text";
    // Get values
    Console.WriteLine("0x{0:x} StructureSize: {1} 
                       Allocated: {2}\n Name: \"{3}\" Value: {4}",
    s._Pointer, s._StructureSize, s._IsAllocated,
    s.Name,s.Value);
}
// Destroy structure and free data
s.Dispose();

Of course, the structure implementation in code above is not optimized. It's just for displaying functionality. Keep in mind that those operations for marshaling structure each time are not performing either in case of C++/CLI implementation as those fields are accessed directly internally and this is a big plus. In .NET case, we can optimize the above structure:

// Our structure
public class MyAVStruct : AVBase
{
    // Underlaying Internal structure for wrapper
    [StructLayout(LayoutKind.Sequential,CharSet=CharSet.Ansi)]
    private struct S
    {
        [MarshalAs(UnmanagedType.I4)]
        public int Value;
        [MarshalAs(UnmanagedType.ByValTStr,SizeConst=200)]
        public string Name;
    }

    // private structure for optimization
    private S m_S = new S();
    // Updated flag
    private bool m_bUpdated = false;

    // Constructor 
    public MyAVStruct() {
        // Just to show so AVBase API usage
        if (!base._EnsurePointer(false)) {
            // Allocate Structure
            base.AllocPointer(_StructureSize);
            Update();
        }
    }

    // Method For Update private structure
    protected void Update() {
        m_S = Marshal.PtrToStructure<S>(base._Pointer);
        m_bUpdated = false;
    }

    // Size of Structure for allocation
    public override int _StructureSize { get { return Marshal.SizeOf(typeof(S)); } }

    // Pointer access
    public override IntPtr _Pointer {
        get {
            if (m_bUpdated && base._EnsurePointer()) {
                Marshal.StructureToPtr<S>(m_S, base._Pointer, false);
                m_bUpdated = false;
            }
            return base._Pointer;
        }
    }

    // Structure Fields Accessing
    public int Value { get { return m_S.Value; } 
                       set { m_S.Value = value; m_bUpdated = true; } }
    public string Name { get { return m_S.Name; } 
                         set { m_S.Name = value; m_bUpdated = true; } }

    // Example of Exported API from library
    [DllImport("avutil.dll", EntryPoint = "SomeAPIThatChangeS")]
    private static extern int ChangeS(IntPtr s);

    // Call API 
    public void Change() {
        if (ChangeS(this._Pointer) == 0){
            Update(); // Update structure fields
        }
    }
}

In the modified structure, we have the flag: “m_bUpdate” which controls the field updates. If any value of the field is updated, then the flag is set, and when we need to access raw pointer - structure marshaling is performed. Along with it, there is an example of some exported API calls with the structure. After calling the temporal internal structure updated with values from the actual pointer.

This is a simple example of implementation as you can see in the FFmpeg libraries structures are not simple and not possible to manage all of them this way, or at least not all fields, it is also hard in the case of marshaling each field directly from a pointer and casting then into one or another objects. But that was not hard to handle with C++/CLI, that's why it was selected for implementation.

Examples

I made a couple of C# examples to show how to use the wrapper library. Some of them are wrappers of examples which come with native FFmpeg documentation, so you can easily compare the implementation. I also add a few more of my own examples, which I think will be interesting. All samples code is located in the “Sources\Examples” folder. The project files can be loaded from the “Sources\Examples\proj” folder, or just the library solution file can be opened to access all examples. All C# samples which are wrappers of standard FFmpeg examples are working the same way as native. Some sample descriptions contain screenshots of the execution.

Audio_playback

Example of loading media file with libavformat decoding and resampling audio data and output it to playback with WinMM Windows API.

Avio_reading

Standard FFmpeg libavformat AVIOContext API example. Make libavformat demuxer access media content through a custom AVIOContext read callback.

Decode_audio

A C# wrapper of standard FFmpeg example of audio decoding with libavcodec API.

Decode_video

A C# wrapper of standard FFmpeg example of video decoding with libavcodec API.

Demuxing_decoding

A C# wrapper of standard FFmpeg demuxing and decoding example. Show how to use the libavformat and libavcodec API to demux and decode audio and video data.

Encode_audio

A C# wrapper of standard FFmpeg audio encoding with libavcodec API example.

Encode_video

A C# wrapper of standard FFmpeg video encoding with libavcodec API example.

Filter_audio

A C# wrapper of standard FFmpeg libavfilter API usage example. This example will generate a sine wave audio, pass it through a simple filter chain, and then compute the MD5 checksum of the output data.

Filtering_audio

A C# wrapper of standard FFmpeg API example for audio decoding and filtering.

Filtering_video

A C# wrapper of standard FFmpeg API example for decoding and filtering video.

Metadata

A C# wrapper of standard FFmpeg example. Shows how the metadata API can be used in application programs.

Muxing

A C# wrapper of standard FFmpeg libavformat API example. Output a media file in any supported libavformat format.

Remuxing

A C# wrapper of standard FFmpeg libavformat/libavcodec demuxing and muxing API example. Remux streams from one container format to another.

Resampling_audio

A C# wrapper of standard FFmpeg libswresample API use example. Program shows how to resample an audio stream with libswresample. It generates a series of audio frames, resamples them to a specified output format and rate and saves them to an output file.

Scaling_video

A C# wrapper of standard FFmpeg libswscale API use example. Program shows how to scale an image with libswscale. It generates a series of pictures, rescales them to the given size and saves them to an output file.

Screen_capture

Capture Screen with Windows GDI scale with libswscale, encode with libavcodec and save into a file with libavformat.

Transcoding

A C# wrapper of standard FFmpeg API example for demuxing, decoding, filtering, encoding and muxing.

Video_playback

Example of loading a media file with libavformat decoding and scaling video frames and output it to playback with Winforms and GDI+ API. This is just an example, I do not suggest using the same way for video displaying. For video output, it is better to use playback with GPU output not GDI+, as there is lots of CPU overhead.

Source Code

The code repository at the GitHub can be found at https://github.com/S0NIC/FFmpeg.NET.

Compiled Library

Upon requests, I added the Release build version of the wrapper library for x86 and x64 platforms with 4.2.2. lgpl version of FFmpeg. To use it in Visual Studio, add FFmpeg.NET.dll assembly reference from zip archive into your project. It is better to configure "x64" or "x86" platform on your application build settings as "Any CPU" loads runtime depending on system. Once application is compiled, copy the whole content of the compiled library build archive into resulted location. It is necessary to remember - wrapper library linked with those DLLs and uses build settings and configuration files of the related FFmpeg libraries, so, do not replace them.

Building Project

Build solution of the library located in the “Solutions” folder. On the target PC should be installed WIndows SDK (version 10 on my system) and .NET Framework (v 4.5.2), and the project can be built with Visual Studio 2017 (v141) platform toolset. It is possible to use different versions of SDK, .NET Framework and build toolsets. They can be configured in a project directly.

FFmpeg libraries and headers located in “ThirdParty\ffmpeg\x.x.x” where “x.x.x” is the FFmpeg version. There are up to four subfolders inside: “bin” with DLL libraries located in there, “lib” - with linker libraries, “include” - with exported API headers, also possible to put FFmpeg source into “sources” folder. Sources, if present, must be the same version. Depending on the build configuration, “lib” and “bin” contain “x64” and “x86” divisions.

The FFmpeg version with the path must be specified in project settings for compiler include path configuration (Properties\C/C++\General\Additional Include directories), linker library search path configuration (Properties\Linker\General\Additional Library directories) and post build library copy events (Properties\Build Events\Post-Build Event).

If sources are present, then the “HAVE_FFMPEG_SOURCES” definition may be set in project settings or in a precompiled header file.

If everything is set up correctly, then the building project will succeed.

Licensing and Distribution

The FFmpeg.NET Wrapper of FFmpeg libraries later "Wrapper Library" is provided "as is" without warranty of any kind; without even the implied warranty of merchantability or fitness for a particular purpose.

Wrapper Library is the source available software. It is free to use in non-commercial software and open source projects.

Usage terms are described in the attached license text file of the project.

History

15^th July, 2022: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

.NET Wrapper of FFmpeg Libraries

Table of Contents

License