
Access image metadata using Visual Studio's new object data binding feature

1 Feb 2007
Using a new class library to bind to photo metadata with a few lines of code.

Introduction

Metadata is information about an image (or other media type) embedded in the file. There are several types of metadata. The most commonly used is EXIF (Exchangeable Image File Format). Digital cameras save shooting data in an EXIF segment. Browser and image management applications such as the Photoshop Elements Organizer or (my favorite) Photo Mechanic save user-entered information such as the image caption; they can also edit shooting data such as the creation time. They update the original EXIF segment, but can also save data in XMP (Extensible Metadata Platform) segments. Access to XMP metadata is the primary focus of this article.

One of the neatest features of Visual Studio 2005 and .NET 2.0 is the ability to data bind directly to the properties of an object. I will describe a class library that is used to create bindable XMP data objects. The XMP data model in the Adobe XMP Specification contains a set of XMP schema definitions. Each schema lists a group of related properties. The classes in the library map to the schemas in the XMP data model. To increase its usability, the library also contains a class that binds to EXIF metadata.

In this article, I will first describe a sample application that illustrates how the class library is used. Look at the article by Rockford Lhotka for an in-depth description of the object data binding techniques I used. Next, I will briefly discuss some of the details of the class library. For more information about the construction of bindable classes, see the Code Project article on the subject and other articles that describe the INotifyPropertyChanged interface. Finally, I will describe how I generated the library from the information in the XMP Specification. In particular, I will reveal the trick I used to extract usable source data from the PDF format specification file.

Using the code

The screen shot shows some of the features of the sample application. The Load Data button loads metadata from an image file. The Write Data button writes the XMP metadata to an XML file. The Tab Control displays the properties of several photo metadata objects in separate tab pages.

The Tab Control contains five pages which contain the photo metadata we want to display.

One of the powerful features of Visual Studio 2005 is the ability to create the property display and editing fields by simply dragging an object data source from the Data Sources window to the form. Visual Studio also creates most of the required data binding source code. At the point shown here, a tab page to display the PDF Metadata properties is being created.

The first step is to create a Data Source bound to the PdfMetadata object. The screen shot shows the Object Selection page in the Data Source Configuration Wizard. It was displayed by clicking the Add New Data Source button in the Data Sources window and choosing an Object Data Source Type. Selecting the PdfMetadata object and clicking the Finish button adds PdfMetadata to the Data Sources list.

The next screen shot shows how to configure the property fields in the Data Sources window. In the view on the left, I have selected a "Detailed" layout, which displays the properties in separate fields. The view on the right shows how to configure the control type used for each property. In this case, I have chosen not to display the PDFVersion property.

After configuring the data source, all I had to do was drag the PdfMetadata item from the Data Sources window to the PDF Metadata tab. Visual Studio 2005 did the layout and created most of the data binding code.

To complete the process I added the following code:

private void LoadData(string imageFile) 
{ 
  xmpData = XmpMetadata.CreateNewXmpData(imageFile); 
  this.photoshopMetadataBindingSource.DataSource = xmpData
    .XmpPhotoshopMetadata; 
  this.dcMetadataBindingSource.DataSource = xmpData
    .XmpDcMetadata; 
  this.exifMetaDataBindingSource.DataSource = xmpData
    .ImageExifMetadata; 
  this.tiffMetadataBindingSource.DataSource = xmpData
    .XmpTiffMetadata; 
  this.pdfMetadataBindingSource.DataSource = xmpData
    .XmpPdfMetadata;
}

LoadData, which is called during the Load Data button event processing, receives the selected image file. The first line creates the xmpData object, which is a container for the metadata objects. The remaining lines link the photo metadata objects to the automatically generated BindingSource objects.

The Write Data button writes the XMP properties to an XML file. Where allowed by the XMP Specification (see the discussion below), editing changes to the displayed property fields are written to the output file. Your application could use the edited output to update the XMP segment in the source image file.

This about covers the basics. Again, I recommend that you review the Rockford Lhotka article for an in-depth discussion of how to use object data binding.

The remainder of this article is background information about the library and how it was created.

XMP Metadata Class Library Overview

XMP Metadata Schema

The fifteen XMP metadata classes map to the schemas defined in the Adobe XMP Specification. This mapping is illustrated using one of the shorter schemas, the "EXIF Schema for Additional EXIF Properties". This is a screen shot of the schema taken from the specification document.

Its schema name is "http://ns.adobe.com/exif/1.0/aux/". Its schema namespace prefix is aux. The class names are prefixed by the schema namespace prefix.

The first column contains the property name. The second column shows the value type of the property as defined in the specification. The third column is the "Category", which can be either "Internal" or "External". An "External" property, such as a "Caption", can be updated by the user. An "Internal" property, such as the camera "SerialNumber", cannot be changed. In the library, "Internal" properties are read-only. The last column is the property description. In the library, the property description is included as an XML comment.

The next screen shot shows the photo metadata classes displayed in the Object Browser.

The "EXIF Schema for Additional EXIF Properties" schema is mapped to the AuxMetadata class. The class name is derived from the schema namespace prefix. The object property names are derived from the schema property names. Because the selected property, Lens, belongs to the "Internal" category, only the get accessor is present. "External" properties have both accessors and can be edited. Because the schema comment is used as an XML comment in the AuxMetadata class code, it is displayed in the Summary section in the Object browser.

The XmpMetadata Class

The XmpMetadata object is the container for the metadata objects. When it is instantiated, it extracts the XMP Segment from the specified file and loads it into an XmlDocument object. The file can be a Jpeg or TIFF image file or an XML file. It parses the XmlDocument object to extract the individual schema elements, which are stored as XmlNode objects. The metadata objects, which are exposed as properties, are created from the XmlNode objects. They are instantiated when they are first referenced in the application.

It maintains a list of the instantiated metadata objects in the metadataList Dictionary object. The list is used in the UpdateXmpSegment method, which applies metadata property changes to the original XmlDocument object.

Its WriteXmpSegment method writes the XmlDocument object to the specified stream. The indent parameter specifies whether the output is formatted.
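The load, locate and write steps described above can be sketched roughly as follows. This is not the library's code: XmpDocumentSketch, FindSchemaNode and Write are hypothetical names, and the sketch assumes the XMP packet has already been loaded into an XmlDocument and that each schema's properties sit in an rdf:Description element, either as child elements or as attributes.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the article's library.
public static class XmpDocumentSketch
{
    public const string RdfNs = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Returns the first rdf:Description element that carries at least one
    // property (child element or attribute) in the given schema namespace,
    // or null when the packet has no data for that schema.
    public static XmlNode FindSchemaNode(XmlDocument doc, string schemaNamespace)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("rdf", RdfNs);
        foreach (XmlNode description in doc.SelectNodes("//rdf:Description", ns))
        {
            foreach (XmlNode child in description.ChildNodes)
            {
                if (child.NamespaceURI == schemaNamespace) return description;
            }
            foreach (XmlAttribute attr in description.Attributes)
            {
                if (attr.NamespaceURI == schemaNamespace) return description;
            }
        }
        return null;
    }

    // Writes the document to a stream; 'indent' toggles formatted output.
    // The stream is left open for the caller to close.
    public static void Write(XmlDocument doc, Stream output, bool indent)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = indent;
        settings.OmitXmlDeclaration = true;
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            doc.Save(writer);
        }
    }
}
```

A caller would pass a schema name such as "http://ns.adobe.com/exif/1.0/aux/" to FindSchemaNode and hand the resulting node to the matching metadata class.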

There is a bunch of rather nitpicky code to extract the embedded XMP segment from JPEG and TIFF image files. One part, which may be of general interest, searches a TIFF file for a specified Tag ID and returns the tag in a C# data structure. The TIFF code supports both "Big Endian" and "Little Endian" byte orders. The files I have looked at turn out to be "Little Endian", probably due to their origin on a PC. Files that originate on an older Mac may be "Big Endian".
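To give an idea of what such a tag search involves, here is a minimal sketch that reads the TIFF header, picks the byte order from the "II" or "MM" marker, and scans the first image file directory (IFD) for a tag ID. It is not the library's implementation: TiffTag, TiffTagSketch and TryFindTag are hypothetical names, only the first IFD is examined, and the sketch assumes a seekable stream that returns full reads (true of MemoryStream, and normally of FileStream).

```csharp
using System.IO;

// Hypothetical structure mirroring one 12-byte IFD entry.
public struct TiffTag
{
    public ushort Id;        // tag identifier (700 is the tag used for XMP packets)
    public ushort Type;      // TIFF field type code
    public uint Count;       // number of values of that type
    public uint ValueOffset; // file offset of the data; small values are stored
                             // inline instead and would need type-aware decoding
}

public static class TiffTagSketch
{
    public static bool TryFindTag(Stream s, ushort tagId, out TiffTag tag)
    {
        tag = new TiffTag();
        byte[] header = new byte[8];
        s.Position = 0;
        if (s.Read(header, 0, 8) != 8) return false;

        bool little;
        if (header[0] == 0x49 && header[1] == 0x49) little = true;       // "II"
        else if (header[0] == 0x4D && header[1] == 0x4D) little = false; // "MM"
        else return false;
        if (ReadU16(header, 2, little) != 42) return false;              // TIFF magic number

        s.Position = ReadU32(header, 4, little);                         // first IFD
        byte[] countBytes = new byte[2];
        if (s.Read(countBytes, 0, 2) != 2) return false;
        int entries = ReadU16(countBytes, 0, little);

        byte[] entry = new byte[12];
        for (int i = 0; i < entries; i++)
        {
            if (s.Read(entry, 0, 12) != 12) return false;
            if (ReadU16(entry, 0, little) == tagId)
            {
                tag.Id = tagId;
                tag.Type = ReadU16(entry, 2, little);
                tag.Count = ReadU32(entry, 4, little);
                tag.ValueOffset = ReadU32(entry, 8, little);
                return true;
            }
        }
        return false;
    }

    private static ushort ReadU16(byte[] b, int i, bool little)
    {
        return little ? (ushort)(b[i] | (b[i + 1] << 8))
                      : (ushort)((b[i] << 8) | b[i + 1]);
    }

    private static uint ReadU32(byte[] b, int i, bool little)
    {
        return little
            ? (uint)b[i] | ((uint)b[i + 1] << 8) | ((uint)b[i + 2] << 16) | ((uint)b[i + 3] << 24)
            : ((uint)b[i] << 24) | ((uint)b[i + 1] << 16) | ((uint)b[i + 2] << 8) | (uint)b[i + 3];
    }
}
```

Funnelling every multi-byte read through the two helpers keeps the byte-order decision in one place, so the scanning loop is identical for both kinds of file.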

The Photo Metadata Base Class

The PhotoMetadata class is the base class for the XMP metadata classes. It implements the required INotifyPropertyChanged interface. When it is instantiated, it receives the XmlNode object that was extracted from the XMP Segment. The object contains properties belonging to the class's XMP schema. It also receives the top level XmlDocument object, which it uses during property editing.

Its GetInitialValue method extracts the specified property value from the XmlNode object, which it returns as a string. GetInitialValue is called during the initial data load process. It receives the property name and property value type as defined in the XMP schema definition.

Its UpdateData method is called when a metadata property is changed. It receives the new property value and the XMP property name and property value type. It updates the XmlNode object with the changes.

Its WriteXML method writes the XmlNode object to an output stream. This turned out to be an evolutionary dead end that is not actually used in the sample application, but may be of some use to other applications.
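The division of labour in the base class can be pictured with a small sketch. It is a simplified stand-in rather than the PhotoMetadata source: BindableMetadataSketch, ReadValue, StoreAndNotify and AuxSketch are hypothetical names, the aux:Note property is invented purely to show an editable field, and only simple text values stored as child elements are handled.

```csharp
using System.ComponentModel;
using System.Xml;

// Hypothetical base class: holds the schema's XmlNode and raises
// PropertyChanged so that data-bound controls refresh after an edit.
public abstract class BindableMetadataSketch : INotifyPropertyChanged
{
    private readonly XmlNode schemaNode;

    public event PropertyChangedEventHandler PropertyChanged;

    protected BindableMetadataSketch(XmlNode schemaNode)
    {
        this.schemaNode = schemaNode;
    }

    // Returns the text of the child element with the given qualified
    // name (for example "aux:Lens"), or null when it is absent.
    protected string ReadValue(string xmpName)
    {
        XmlNode child = FindChild(xmpName);
        return child == null ? null : child.InnerText;
    }

    // Writes the new value back to the XML node, then notifies listeners.
    protected void StoreAndNotify(string propertyName, string xmpName, string value)
    {
        XmlNode child = FindChild(xmpName);
        if (child != null) child.InnerText = value;

        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null) handler(this, new PropertyChangedEventArgs(propertyName));
    }

    private XmlNode FindChild(string xmpName)
    {
        foreach (XmlNode child in schemaNode.ChildNodes)
        {
            if (child.Name == xmpName) return child;
        }
        return null;
    }
}

public class AuxSketch : BindableMetadataSketch
{
    private string m_Lens;
    private string m_Note;

    public AuxSketch(XmlNode node) : base(node)
    {
        m_Lens = ReadValue("aux:Lens");
        m_Note = ReadValue("aux:Note"); // invented property, for illustration only
    }

    // "Internal" category: get accessor only.
    public string Lens { get { return m_Lens; } }

    // "External" category: editable, reports its own changes.
    public string Note
    {
        get { return m_Note; }
        set
        {
            if (value != m_Note)
            {
                m_Note = value;
                StoreAndNotify("Note", "aux:Note", value);
            }
        }
    }
}
```

Comparing the new value with the stored one before notifying avoids redundant events when a bound control writes back an unchanged value.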

The XMP Schema Metadata Classes

Each of the XMP metadata classes, which are based on the PhotoMetadata class, exposes the properties defined in its XMP schema. They each contain a LoadInitialData method, which is called when the base class is instantiated. LoadInitialData extracts the initial property values from the XmlNode object.

The following is sample property code.

private string m_Contributor = null;
/// <summary>
/// Contributors to the resource (other than the authors).
/// </summary>
public string Contributor
{
  get
  {
    return m_Contributor;
  }
  set
  {
    string adjustedValue = AdjustedValue(value);
    if (adjustedValue != m_Contributor)
    {
      m_Contributor = adjustedValue;
      ElementUpdated("Contributor", "dc:contributor", 
                     "bag ProperName", adjustedValue);
    }
  }
}

This code creates the first property in the Dublin Core (DcMetadata) class. The XML documentation is taken from the property description in the XMP specification. The m_Contributor string was initialized by the LoadInitialData method. The "set" accessor is present here because the Contributor property is an "External" property and is therefore not read-only. The AdjustedValue method cleans up the input string. The ElementUpdated method is called to output the required notification event and to update the property value in the XmlNode object. Its arguments are the .NET property name, the XMP schema property name and property value type, and the updated value.

The Code Generator

Since there are fifteen XMP schemas and several hundred properties, it is impractical to write the XMP metadata class code by hand. Therefore, I used the code generator contained in the CodeGenerator project, which is part of the code distribution. The source data used by the code generator is in the XML files in the "doc" subdirectory.

I extracted the source files from the PDF file containing the "Adobe XMP Specification". My copy of the specification file is also in the "doc" subdirectory. I extracted the data using Adobe Acrobat Professional (my version was 6.0.5). I loaded the specification file and then extracted the pages containing the schema I wanted to generate, using the Document/Pages/Extract menu selection and entering the pages I wanted to extract. I then saved the extracted document as XML and hand-edited the result. I first removed any lines not contained in <table> tags. Then, if necessary, I merged multiple tables into a single table.

The code generator loads the raw XML created by this process into an XmlDocument object. Then the CleanXmlTable method merges continued lines until there is one and only one property per XmlNode object.

The remaining code is straightforward. The application iterates twice through each property node in the XmlDocument object. In the first pass, it generates the property code. In the second, it generates the property initialization statements.
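The two passes can be illustrated with a small sketch. It is not the CodeGenerator project's code: SchemaPropertySketch and GeneratorSketch are hypothetical names, the input is assumed to be one already-cleaned row per property, and the exact form of the emitted GetInitialValue call is an assumption based on the description of that method given earlier.

```csharp
using System.Text;

// Hypothetical row type: one cleaned-up line of a schema table.
public class SchemaPropertySketch
{
    public string Name;        // .NET property name, e.g. "Lens"
    public string XmpName;     // schema property name, e.g. "aux:Lens"
    public string ValueType;   // value type from the specification, e.g. "Text"
    public bool IsInternal;    // "Internal" category => read-only property
    public string Description; // becomes the XML summary comment
}

public static class GeneratorSketch
{
    // Pass 1: emit the backing field, the XML comment and the property.
    public static string GenerateProperty(SchemaPropertySketch p)
    {
        string field = "m_" + p.Name;
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("private string " + field + " = null;");
        sb.AppendLine("/// <summary>");
        sb.AppendLine("/// " + p.Description);
        sb.AppendLine("/// </summary>");
        sb.AppendLine("public string " + p.Name);
        sb.AppendLine("{");
        sb.AppendLine("  get { return " + field + "; }");
        if (!p.IsInternal)
        {
            // Only "External" properties receive a set accessor.
            sb.AppendLine("  set");
            sb.AppendLine("  {");
            sb.AppendLine("    string adjustedValue = AdjustedValue(value);");
            sb.AppendLine("    if (adjustedValue != " + field + ")");
            sb.AppendLine("    {");
            sb.AppendLine("      " + field + " = adjustedValue;");
            sb.AppendLine("      ElementUpdated(\"" + p.Name + "\", \"" + p.XmpName +
                          "\", \"" + p.ValueType + "\", adjustedValue);");
            sb.AppendLine("    }");
            sb.AppendLine("  }");
        }
        sb.AppendLine("}");
        return sb.ToString();
    }

    // Pass 2: emit the statement that loads the initial value.
    public static string GenerateInitializer(SchemaPropertySketch p)
    {
        return "m_" + p.Name + " = GetInitialValue(\"" + p.XmpName +
               "\", \"" + p.ValueType + "\");";
    }
}
```

Running both methods over every row and concatenating the results gives the property block and the body of a LoadInitialData method for one schema class.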

Additions

In addition to the classes discussed earlier, I have included the PhotoMechanicMetadata class in the library. It accesses data in the custom XMP schema created by Photo Mechanic (see www.camerabits.com). Photo Mechanic uses this custom schema to save user-entered metadata that does not fit into any other schema. My main reason for adding it is that, in my own applications, I want to have access to the color-coded photo quality rating information featured by Photo Mechanic. I created the class by reverse engineering the data in the custom schema.

As mentioned in the Introduction, the library also contains the ImageExifMetadata class that binds to EXIF metadata. The ImageExifMetadata class uses the EXIFExtractor class described in the Code Project article by Asim Goheer. ImageExifMetadata uses EXIFExtractor to access the EXIF properties in the image file and store them in a Hashtable. It contains generated property code to create bindable, read-only properties. It uses the function, LoadData, to initialize the properties from data in the Hashtable.

Omissions

A number of potentially useful features have been omitted from this library. Here are some:

My own applications do not require a complete editing capability, so editing has not been fully implemented. In particular, although new properties can be added, the result may not conform completely to the XMP specification. The necessary "Property Value Type" information is supplied in the ElementUpdated function call. This could be, but is not now, used to control the format of newly inserted property values.

This library can access data in TIFF, JPEG and XML files. The XMP specification lists several other file types which can support embedded XMP metadata. A common format which I would like to access is Adobe Photoshop (*.psd). One of the other listed types is PNG, which I like because it is an open standard, is lossless, has good file compression, and is widely supported. It would be useful to be able to access metadata in PNG files and perhaps some of the other listed image file types.

This list is a basis for future improvements.

Finally, because of a lack of test data, many of the classes have not actually been tested. Because of the way they were created using an automated process, this is not likely to be a problem, but, without testing, I can't be sure.

Points of interest

Where is the metadata? If you run the sample application using a file taken directly from your camera, you will see data on the EXIF Metadata tab, but the other tabs will probably be empty. The reason is that most cameras save metadata according to the old (and complex) EXIF standard, but do not support the XML-based XMP standard. You will need to save the file from an application that can be configured to write an XMP segment to the image file.

Some applications, such as Adobe Photoshop Elements, allow you to view the contents of the XMP segment. For example, in Elements, select File Info from the File menu. In the File Info window, select the "Advanced" tab. This will display a tree structure with the XMP properties organized by Schema.

History

This article was originally submitted on January 24, 2007.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
