Extracting Tags and Details from Your MP3 Collection into XML

Raven123

31 Jul 2005, 7 min read

Extracting various information (ID3vN tags and general information) from MP3 files in a directory tree into XML format

Download source code and project - 24.9 KB

Introduction

OK, first of all, let's define our goal. We want to take a folder, find all the MP3 files in it and its subfolders, read the information they contain and store it in some medium. The medium of choice will be XML. I have modified Erhan Hosca's code and used it as the "skeleton" for my class to avoid the hassle of enumerating directories and files myself. You can find it here.

The class doesn't work with compressed or encrypted tag data. I tested it on my MP3 collection (and believe me, it is really big) and found nothing like that, so I guess it's not really used by anyone. If you find any other bugs or omissions, please contact me. Frankly speaking, I wrote this code 2 years back to get used to C# (it's my second C# program after Hello World), so it might be far from optimal. I especially dislike the number of byte arrays it uses. On the other hand, it's fast and it can easily be rewritten to match the highest standards of C# purists. The idea is to show a way to parse binary files or, to be more precise, the layout of bytes and bits used in the MP3 files that we all know and love.

Reading the Tags

First of all, let's develop a class to read data from a single MP3 file. We create a blank solution, add a new class to it, and make the class static because there's no instance-specific information apart from the file name. Here's what we get:

namespace Mp3_Lister
{
    static class Mp3Reader
    {
    }
}

Let's throw something in the mix. First of all, we map the codes of some common ID3v2 tags to descriptive names for our XML attributes. To distinguish them from the ID3v1 tags, I have added a v2- prefix. Then, we add maps to easily decode the bit rate and sampling rate values from the cryptic bits that represent them in MP3 frame headers. Then we add some arrays for versions, layers, channel modes and genres, all ordered according to the codes that represent them in the file. In the end, there's a small static method that we use to get the genre name by its number. All of this is initialized in a static constructor, so it is available as soon as our class is first used.

private static Hashtable TagMap;
private static Hashtable BitrateMap;
private static Hashtable RateMap;
private static int picCounter;

static Mp3Reader()
{
    TagMap = new Hashtable();
    TagMap.Add("TIT2","v2-song-title");
    //...

    TagMap.Add("TCON","v2-genre");
    BitrateMap = new Hashtable();
    BitrateMap.Add("011","free");
    //...
    BitrateMap.Add("1523","bad");
    RateMap = new Hashtable();
    RateMap.Add("01","44100");
    //...

    RateMap.Add("33","bad");
}

private static double[] versions = {2.5,0,2,1};

private static int[] layers = {0,3,2,1};

private static string[] channelModes = {"Stereo",
               "JointStereo","DualChannel","Mono"};

private static string[] genres = {"Blues",
       "Classic Rock", "Country", "Dance", "Disco",
       "Funk","Grunge", "Hip-Hop",
        //...

       "Thrash Metal", "Anime", "Jpop", "Synthpop"};

public static string Genre(int index)
{
    return ((index < genres.Length) ?
                         genres[index] : "Unknown");
}

Now that we are done defining, let's add a simple method that removes the weird characters you sometimes encounter in MP3 tags, to avoid XML errors. The method also handles duplicate tags with the same name in an MP3 file by joining their values with "; ". It takes the name and the value of an attribute that may contain dangerous data from the MP3 file. The XML document and the XML element to work on are passed to our method by reference, so that we can attach data to them. That suits our pattern fine: the directory enumerating class gets the file data (size, modification dates, etc.), creates the XML element that represents this particular file and then passes it to our method, allowing us to add all the additional data that we are going to extract.

private static void SetXmlAttribute(string attributeName,
                 string attributeValue, ref XmlDocument xmlDoc,
                 ref XmlElement xmlElement)
{
    XmlAttribute xmlAttrib;
    string separator = "";

    if (xmlElement.GetAttributeNode(attributeName) == null)
    {
        xmlAttrib = xmlDoc.CreateAttribute(attributeName);
        xmlElement.Attributes.Append(xmlAttrib);
        xmlAttrib.Value = "";
    }
    else
    {
        separator = "; ";
        xmlAttrib = xmlElement.GetAttributeNode(attributeName);
    }

    for (int i = 0; i < attributeValue.Length; i++)
    {
        if ((attributeValue[i] < '\x20') &&
                (attributeValue[i] != '\t') &&
                (attributeValue[i] != '\n') &&
                (attributeValue[i] != '\r'))
        {
            attributeValue = attributeValue.Remove(i, 1);
            i--;
        }
    }
    //XmlDocument escapes quotes itself when the attribute is
    //serialized, so the cleaned value can be appended as it is
    xmlAttrib.Value += separator + attributeValue;
}
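
The character filter inside that method can also be looked at on its own. Here is a minimal sketch of the same rule (drop everything below 0x20 except tab, line feed and carriage return) written with a StringBuilder, so the string is not rebuilt on every removed character. The class and method names are made up for this illustration and are not part of the downloadable project.

```csharp
using System.Text;

static class XmlSafeSketch
{
    // Hypothetical helper: keeps only characters that XML 1.0 accepts
    // from the low ASCII range (tab, LF, CR) plus everything from 0x20 up.
    public static string StripControlChars(string value)
    {
        StringBuilder sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (c >= '\x20' || c == '\t' || c == '\n' || c == '\r')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Calling XmlSafeSketch.StripControlChars on a tag value before assigning it to an attribute should give the same result as the loop above.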

Then, let's add the main method that our class will use, the one that reads the actual file and extracts the information. I threw the file opening routine in right away. We also create a byte array for reading small header data.

public static void getTagInfo(string fileName,
     ref XmlElement tagInfo, ref XmlDocument xmlDoc)
{
    FileStream mp3File;
    long startPos = 0;
    byte[] ba = new byte[6];
    XmlAttribute xmlAttrib;
    try
    {
        mp3File = new FileStream(fileName,
                      FileMode.Open, FileAccess.Read);
    }
    catch (Exception e)
    {
        xmlAttrib = xmlDoc.CreateAttribute("file-error");
        xmlAttrib.Value = e.Message;
        tagInfo.Attributes.Append(xmlAttrib);
        return;
    }
}

OK, let's check for the header at the beginning of the file first. That's easy: if the file starts with "ID3", there's a header; if it doesn't, there's none. We create an attribute that indicates its presence.

mp3File.Read(ba, 0, 6);
xmlAttrib = xmlDoc.CreateAttribute("id3v2");
if ((((char)ba[0]).ToString() +
             ((char)ba[1]).ToString() +
             ((char)ba[2]).ToString()) == "ID3")
{
    xmlAttrib.Value = "1";
    tagInfo.Attributes.Append(xmlAttrib);
}
else
{
    xmlAttrib.Value = "0";
    tagInfo.Attributes.Append(xmlAttrib);
}

If the header is present, we probably want to parse it. First of all, we get the version of the tag from the fourth byte and use it to set some variables for further processing: thsize, the length of a tag name, and tfsize, the length of the per-tag flags. Then, we check the sixth byte, which holds the header flags, for the bit that signals an extended header.

int version = ba[3];
int thsize;
int tfsize;
thsize = (version > 2) ? 4 : 3;
tfsize = (version > 2) ? 2 : 0;
bool isExtended = false;
//check 6th byte of ba for flags
//( left bits : unsync-extended-experimental )

if ((byte)(ba[5] << 1) > 127)
{
    isExtended = true;
}
mp3File.Read(ba, 0, 4);

OK, what's next? Next, we implement a static function that will help us in our header parsing endeavors. It takes the length encoded in four bytes of the file and decodes it from that special bit-shifted format (each byte contributes only its lower seven bits) into the actual number. Here it goes:

private static int GetLength(byte[] ba)
{
    int len = (ba[3] + (byte)( ba[2] << 7 ));
    len += ((byte)(ba[2] >> 1) +  (byte)(ba[1] << 6))*256;
    len += ((byte)(ba[1] >> 2) +  (byte)(ba[0] << 5))*65536;
    len += (byte)(ba[0] >> 3)*16777216; //2^24, the weight of the top byte
    return len;
}
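
The ID3v2 specification calls this format a synchsafe integer: four bytes, most significant first, with the top bit of every byte left at zero. The same decoding can be sketched with masks and shifts, which may be easier to follow than the byte-by-byte arithmetic above. The class name below is only an illustration.

```csharp
static class SynchsafeSketch
{
    // Decodes a 4-byte synchsafe integer: every byte carries
    // 7 payload bits, most significant byte first.
    public static int Decode(byte[] b)
    {
        return ((b[0] & 0x7F) << 21)
             | ((b[1] & 0x7F) << 14)
             | ((b[2] & 0x7F) << 7)
             |  (b[3] & 0x7F);
    }
}
```

For example, the bytes 00 00 02 01 stand for 2 * 128 + 1 = 257.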

Next, we use it to find out the length of the header and, if it is present, of the extended header. We skip over the extended header and read the main header into our byte array. We also prepare some variables for extracting the tags.

int headerLength = GetLength(ba);
if (isExtended)
{
    mp3File.Read(ba, 0, 4);
    int extHeaderLength = GetLength(ba) - 4;
    ba = new byte[extHeaderLength];
    mp3File.Read(ba, 0, extHeaderLength);
}
ba = new byte[headerLength];
mp3File.Read(ba, 0, headerLength);
startPos = mp3File.Position;

int pos = 0;
byte[] tag = new byte[thsize];
byte[] len = new byte[thsize];
byte[] str;
string tagName, tagContent;
int tagLength = 0;

We add a simple loop to go through the tags. When we are out of the header, or when the tag name is not found where it was expected, we end the loop.

do
{
    if ((pos + 10) > headerLength)
    {
        break;
    }

}
while (tagLength > 0);

We get the tag name and length into place. Then, we check whether the contents of the tag are compressed or encrypted. If they are, we note that in the tag content. All the while, we carefully maintain our position in the byte array as we parse it (that's the pos variable). The compressed, encrypted and grouping flags tell us that there are additional bytes after the tag header; I only skip over them in my simple class.

tagName = ""; tagContent = "";
Array.Copy(ba, pos, tag, 0, thsize);
pos += thsize;
Array.Copy(ba, pos, len, 0, thsize);
pos += thsize;
if (tfsize > 0)
{
    int shift = 0;
    if (ba[pos + 1] > 127)
    {
        shift += 4;
        tagContent += "compressed; ";
    }
    if ((byte)(ba[pos + 1] << 1) > 127)
    {
        shift += 1;
        tagContent += "encrypted; ";
    }
    //third flag bit: grouping identity, one extra byte
    if ((byte)(ba[pos + 1] << 2) > 127)
    {
        shift += 1;
    }
    pos += (2 + shift);
}
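
The three checks above can also be written with bit masks. The following is a sketch that assumes the ID3v2.3 layout of the second flag byte (compression, encryption and grouping in the three highest bits); the class and constant names are invented for the example.

```csharp
static class FrameFlagsSketch
{
    // Highest three bits of the second flag byte (ID3v2.3 layout assumed).
    const byte Compressed = 0x80;
    const byte Encrypted = 0x40;
    const byte Grouped = 0x20;

    // Returns how many additional bytes sit between the frame header
    // and the frame data for the given flag byte.
    public static int ExtraBytes(byte formatFlags)
    {
        int extra = 0;
        if ((formatFlags & Compressed) != 0) extra += 4; // decompressed size
        if ((formatFlags & Encrypted) != 0) extra += 1;  // encryption method
        if ((formatFlags & Grouped) != 0) extra += 1;    // group identifier
        return extra;
    }
}
```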

After that, we calculate the tag length (it is stored as a plain big-endian number, so the GetLength function doesn't apply) and copy the tag content into yet another byte array.

//big-endian length; for a 4-byte field this is
//tagLength = len[0]*65536*256+len[1]*65536+len[2]*256+len[3];
tagLength = 0;
for (int i = 0; i < thsize; i++)
{
    tagLength = tagLength * 256 + len[i];
}
if ((tagLength < 0) || (tagLength > ba.Length - pos))
{
    //means someone was too bored to pad the end of the header with
    //\0s and used fancy text instead, so our length detection got confused
    SetXmlAttribute("potential-error", "1", ref xmlDoc, ref tagInfo);
    break;
}
//allocate only after the length has been checked
str = new byte[tagLength];
Array.Copy(ba, pos, str, 0, tagLength);
pos += tagLength;
tagName = System.Text.Encoding.ASCII.GetString(tag);

At last, we get the tag content to another array and add it to our XML element if everything's OK.

if ((tagLength > 0) && (tagName.Length > 0))
{
    tagContent = Mp3Reader.TransformV2Tag(tagName, str);
    tagName = ((Mp3Reader.TagMap.Contains(tagName)) ?
                 ((string)Mp3Reader.TagMap[tagName]) :
                                     "v2-tag-" + tagName);
    SetXmlAttribute(tagName,tagContent,
                      ref xmlDoc,ref tagInfo);
}

The last bit of code features the TransformV2Tag function. It can be used to transform the data of different tags into some legible form, or exposed as an event if you want to tinker with the code for a little while. In my simple case, it removes \0-s from the tags, converts the length data to a human-readable format and extracts pictures. There can be many pictures stored in the ID3v2 tags, but I treat them all in the same way, discarding the available type information (like cover, label logo, etc.) and description, and save them to files named imageN. The image name is then written into the tag content instead of the actual binary data. Here it goes:

public static string TransformV2Tag(string tagName,
                             byte[] tagContentArray)
{
    //the only binary tag we are going to handle
    string rv = "";
    if (tagName != "APIC")
    {
        rv = System.Text.Encoding.ASCII.GetString(tagContentArray);
    }

    string tmp;
    if (tagName == "TLEN")
    {
        rv = rv.Replace('\0', ' ').Trim();
        int sLength = Int32.Parse(rv);
        rv = "";
        int ln;
        sLength = sLength / 1000;
        if (sLength >= 3600)
        {
            ln = (int)Math.Floor((double)sLength / 3600);
            rv += ln.ToString();
            sLength -= ln * 3600;
            rv += ":";
        }
        if (sLength >= 60)
        {
            ln = ((int)Math.Floor((double)sLength / 60));
            tmp = ln.ToString();
            if (tmp.Length == 1) tmp = "0" + tmp;
            rv += tmp;
            sLength -= ln * 60;
            rv += ":";
        }
        else
        {
            rv += "00:";
        }
        tmp = sLength.ToString();
        if (tmp.Length == 1) tmp = "0" + tmp;
        rv += tmp;
    }
    if (tagName == "APIC")
    {
        byte[] tmpStart = new byte[40];
        Array.Copy(tagContentArray, tmpStart,
                    Math.Min(40,tagContentArray.Length));
        string tagContent =
           System.Text.Encoding.ASCII.GetString(tmpStart);

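        int zeroCount = 0, ii = tagContent.IndexOf("image/");
        if (ii < 0)
        {
            //no MIME type found, so we cannot locate the picture data
            return "empty";
        }
        //skip to the byte after the third \0, without running off the array
        while ((zeroCount < 3) && (ii < tagContentArray.Length))
        {
            if (tagContentArray[ii] == 0)
            {
                zeroCount++;
            }
            ii++;
        }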

        tagContent = tagContent.Remove(0,
                        tagContent.IndexOf("image/") + 6);
        string imgExt = tagContent.Substring(0,
                                tagContent.IndexOf('\0'));

        if ((tagContentArray.Length - ii) > 0)
        {
            FileStream picFile = new FileStream("image" +
                            picCounter.ToString() + "." +
                            imgExt, FileMode.Create,
                            FileAccess.Write);
            picCounter++;

            picFile.Write(tagContentArray, ii,
                            tagContentArray.Length - ii);
            picFile.Close();
            rv = "image" + (picCounter - 1).ToString() + "." + imgExt;
        }
        else
        {
            rv = "empty";
        }
    }

    rv = rv.Replace('\0', ' ').Trim();
    return rv;
}
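
The TLEN branch does its time arithmetic by hand. As an alternative sketch, the same "h:mm:ss" output can be produced with TimeSpan, and Int64.TryParse avoids an exception when the tag holds something that is not a number. DurationSketch and Format are names I made up here; returning an empty string for bad input is also my own assumption, not what the class above does.

```csharp
using System;

static class DurationSketch
{
    // Turns a TLEN value (track length in milliseconds, stored as text)
    // into "mm:ss", or "h:mm:ss" when the track is an hour or longer.
    public static string Format(string tlenMilliseconds)
    {
        long ms;
        string cleaned = tlenMilliseconds.Replace('\0', ' ').Trim();
        if (!long.TryParse(cleaned, out ms) || ms < 0)
        {
            return ""; // assumption: unreadable values are simply dropped
        }
        TimeSpan t = TimeSpan.FromMilliseconds(ms);
        int hours = (int)t.TotalHours;
        string mmss = t.Minutes.ToString("00") + ":" + t.Seconds.ToString("00");
        return (hours > 0) ? hours + ":" + mmss : mmss;
    }
}
```

A track of 215000 milliseconds comes out as "03:35", and 3600000 milliseconds as "1:00:00".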

OK, we are done with the ID3v2 stuff, let's get to parsing the good old v1 tag. v1 is not like v2: it uses the last 128 bytes of the file to store its data, without any tag headers or anything else. Every field starts and ends at a predefined position. Checking for the tag is easy: it starts with "TAG".

mp3File.Seek(-128, SeekOrigin.End);
ba = new byte[128];
mp3File.Read(ba, 0, 128);
if ((((char)ba[0]).ToString() + ((char)ba[1]).ToString() +
                          ((char)ba[2]).ToString()) == "TAG")
{
    xmlAttrib = xmlDoc.CreateAttribute("id3v1");
    xmlAttrib.Value = "1";
    tagInfo.Attributes.Append(xmlAttrib);
    string tagContent;
    tagContent = Mp3Reader.GetV1Tag(ba, 3, 33);
    if (tagContent.Length > 0)
    {
        SetXmlAttribute("song-title",
                  tagContent, ref xmlDoc, ref tagInfo);
    }
    tagContent = Mp3Reader.GetV1Tag(ba, 33, 63);
    if (tagContent.Length > 0)
    {
        SetXmlAttribute("artist", tagContent,
                              ref xmlDoc, ref tagInfo);
    }
    tagContent = Mp3Reader.GetV1Tag(ba, 63, 93);
    if (tagContent.Length > 0)
    {
        SetXmlAttribute("album-title", tagContent,
                              ref xmlDoc, ref tagInfo);
    }
    tagContent = Mp3Reader.GetV1Tag(ba, 93, 97);
    if (tagContent.Length > 0)
    {
        SetXmlAttribute("year", tagContent,
                              ref xmlDoc, ref tagInfo);
    }
    tagContent = Mp3Reader.GetV1Tag(ba, 97, 126);
    if (tagContent.Length > 0)
    {
        SetXmlAttribute("comment", tagContent,
                              ref xmlDoc, ref tagInfo);
    }
    tagContent = Mp3Reader.GetV1Tag(ba, 126, 127);
    if ((tagContent.Length > 0) && (ba[125] == '\0'))
    {
        SetXmlAttribute("track",
                      ((int)tagContent[0]).ToString(),
                      ref xmlDoc, ref tagInfo);
    }
    //the genre is a plain byte and 0 is a valid value (Blues),
    //so we read it directly instead of treating it as a string
    SetXmlAttribute("genre", Mp3Reader.Genre(ba[127]),
                      ref xmlDoc, ref tagInfo);
}
else
{
    xmlAttrib = xmlDoc.CreateAttribute("id3v1");
    xmlAttrib.Value = "0";
    tagInfo.Attributes.Append(xmlAttrib);
}

You see yet another function, GetV1Tag used here. It's a simple function that transforms a byte array holding some null-terminated string into the actual string. Here it goes:

private static string GetV1Tag(byte[] ba, int sp, int ep)
{
    string tagContent = "";
    for (int i = sp; i < ep; i++)
    {
        if (ba[i] == 0)
        {
            break;
        }
        tagContent += (char)ba[i];
    }
    return tagContent;
}
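
If you prefer to avoid building the string character by character, the same field extraction can be sketched with Array.IndexOf and an Encoding. The names below are hypothetical, and the sketch takes a start and a length rather than start and end positions. It also assumes plain ASCII content and trims trailing spaces, which some taggers use as padding instead of \0.

```csharp
using System;
using System.Text;

static class Id3v1Sketch
{
    // Reads a fixed-width field of the 128-byte tag and cuts it
    // at the first \0 byte, if there is one.
    public static string Field(byte[] tag, int start, int length)
    {
        int end = Array.IndexOf(tag, (byte)0, start, length);
        int count = (end < 0) ? length : end - start;
        return Encoding.ASCII.GetString(tag, start, count).TrimEnd();
    }
}
```

With this helper, the title would be Id3v1Sketch.Field(ba, 3, 30) and the artist Id3v1Sketch.Field(ba, 33, 30).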

Then, we have the rest of the MP3 file to work with. Let's extract something from it.

mp3File.Seek(startPos, SeekOrigin.Begin);
ba = new byte[2];
mp3File.Read(ba, 0, 2);
while ((ba[0] != 255) || (ba[1] < 224))
{
    ba[0] = ba[1];
    if (mp3File.Read(ba, 1, 1) == 0)
    {
        //end of file reached without finding a frame header
        mp3File.Close();
        return;
    }
}
byte tmp = ba[1];
ba = new byte[3];
ba[0] = tmp;
mp3File.Read(ba, 1, 2);
byte mpegLayer, mpegVersion, bitrateBits, rateBits, chanMode;
mpegVersion = (byte)(((byte)(ba[0] << 3)) >> 6);
mpegLayer = (byte)(((byte)(ba[0] << 5)) >> 6);
bitrateBits = (byte)(ba[1] >> 4);
rateBits = (byte)(((byte)(ba[1] << 4)) >> 6);
chanMode = (byte)(ba[2] >> 6);
xmlAttrib = xmlDoc.CreateAttribute("mpeg-version");
xmlAttrib.Value = versions[mpegVersion].ToString();
tagInfo.Attributes.Append(xmlAttrib);
xmlAttrib = xmlDoc.CreateAttribute("mpeg-layer");
xmlAttrib.Value = layers[mpegLayer].ToString();
tagInfo.Attributes.Append(xmlAttrib);
string mask = bitrateBits.ToString() +
       ((int)Math.Floor(versions[mpegVersion])).ToString() +
       layers[mpegLayer].ToString();
xmlAttrib = xmlDoc.CreateAttribute("bitrate");
xmlAttrib.Value = (string)Mp3Reader.BitrateMap[mask] + "Kbps";
tagInfo.Attributes.Append(xmlAttrib);
mask = rateBits.ToString() +
        ((int)Math.Ceiling(versions[mpegVersion])).ToString();
xmlAttrib = xmlDoc.CreateAttribute("sampling-rate");
xmlAttrib.Value = (string)Mp3Reader.RateMap[mask] + "Hz";
tagInfo.Attributes.Append(xmlAttrib);
xmlAttrib = xmlDoc.CreateAttribute("channel-mode");
xmlAttrib.Value = (string)Mp3Reader.channelModes[chanMode];
tagInfo.Attributes.Append(xmlAttrib);

mp3File.Close();

First of all, we skip the ID3v2 header and find the first frame that contains some sound. Every frame has its own header. We only look at the first one (that means the bit rate for VBR files will be shown incorrectly; I could read further frames, but that would greatly slow down the whole thing). From that header, we extract all the precious information regarding the sound quality using the Hashtables we defined before. At last, we close the file. Here we are.
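
The shift-left-then-shift-right trick used above isolates a group of bits by pushing the unwanted ones out of the byte. The same fields can be picked out with a shift and a mask, sketched below for a complete four-byte frame header (sync byte included). The class and method names are assumptions for the sake of the example; the returned values are the raw codes, which index into the versions, layers and channelModes arrays in the same way.

```csharp
static class FrameHeaderSketch
{
    // h holds the four bytes of an MPEG audio frame header,
    // starting with the 0xFF sync byte.
    public static bool IsSync(byte[] h) { return h[0] == 0xFF && (h[1] & 0xE0) == 0xE0; }
    public static int VersionBits(byte[] h) { return (h[1] >> 3) & 0x03; }
    public static int LayerBits(byte[] h) { return (h[1] >> 1) & 0x03; }
    public static int BitrateIndex(byte[] h) { return (h[2] >> 4) & 0x0F; }
    public static int SampleRateIndex(byte[] h) { return (h[2] >> 2) & 0x03; }
    public static int ChannelMode(byte[] h) { return (h[3] >> 6) & 0x03; }
}
```

For the header bytes FF FB 90 64, this gives version code 3 (MPEG 1), layer code 1 (Layer 3), bit rate index 9, sampling rate index 0 and channel mode 1 (JointStereo).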

Reading Multiple Files

Now, let's create another class and name it "Mp3DirectoryEnumerator". I will add a method that creates an XML element with a name attribute (I don't use any text nodes in this XML representation). There's also an XML attribute creation method copied right from the Mp3Lister class. The only difference is that it is not static, and it uses the instance-specific XML document instead of receiving one by reference.

class Mp3DirectoryEnumerator
{
    XmlDocument xmlDoc;

    // modified lister from
    // http://www.codeproject.com/csharp/XMLDirectoryTreeGen.asp
    public Mp3DirectoryEnumerator()
    {
    }

    private XmlElement XmlElement(string elementName,
                                          string elementValue)
    {
        XmlElement xmlElement = xmlDoc.CreateElement(elementName);
        xmlElement.Attributes.Append(XmlAttribute("name",
                                                   elementValue));
        return xmlElement;
    }
}
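
The attribute creation method itself is not shown in the listing above. Purely as an illustration of what such a pair of helpers could look like, here is a small self-contained sketch. The names MakeAttribute and MakeElement are my own and differ from the article's XmlAttribute and XmlElement methods, and the real method in the download may do more (for example, the character filtering shown earlier).

```csharp
using System.Xml;

class XmlHelperSketch
{
    private readonly XmlDocument xmlDoc = new XmlDocument();

    // Hypothetical stand-in for the attribute creation method.
    public XmlAttribute MakeAttribute(string name, string value)
    {
        XmlAttribute attr = xmlDoc.CreateAttribute(name);
        attr.Value = value;
        return attr;
    }

    // Creates an element that carries its value in a "name" attribute.
    public XmlElement MakeElement(string elementName, string elementValue)
    {
        XmlElement element = xmlDoc.CreateElement(elementName);
        element.Attributes.Append(MakeAttribute("name", elementValue));
        return element;
    }
}
```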

Let's now add the main functionality to our class.

public XmlDocument GetFileSystemInfoList(string StartFolder)
{
    xmlDoc = new XmlDocument();
    try
    {
        XmlDeclaration xmlDec =
             xmlDoc.CreateXmlDeclaration("1.0", null, "yes");
        xmlDoc.PrependChild ( xmlDec );
        XmlElement nodeElem = xmlDoc.CreateElement("list");
        xmlDoc.AppendChild(nodeElem);
        XmlElement rootElem = XmlElement("folder",
                      new DirectoryInfo(StartFolder).Name);
        nodeElem.AppendChild(AddElements(rootElem,
                                             StartFolder));
    }
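    catch (Exception ex)
    {
        //a document can only have one root element, so attach the
        //error to the existing root if there is one
        XmlNode parent = (xmlDoc.DocumentElement != null) ?
                         (XmlNode)xmlDoc.DocumentElement : xmlDoc;
        parent.AppendChild(XmlElement("error",ex.Message));
        return xmlDoc;
    }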
    return xmlDoc;
}

This method sets up our XML document and calls the AddElements method for the parent node. The method is recursive and it goes through all the subdirectories in the given directory. If there are MP3 files in the tree, it takes them and applies the Mp3Reader main method to them, extracting the file information.

private XmlElement AddElements(XmlElement startNode,
                                                   string Folder)
{
    try
    {
        DirectoryInfo dir = new DirectoryInfo(Folder);
        DirectoryInfo[] subDirs = dir.GetDirectories();
        FileInfo[] files = dir.GetFiles();
        foreach(FileInfo fi in files)
        {
            if ( fi.Extension.ToLower() == ".mp3" )
            {
                Console.Write(fi.FullName+"\r\n");
                XmlElement fileElem = XmlElement("file",fi.Name);
                fileElem.Attributes.Append(XmlAttribute("size",
                                          fi.Length.ToString()));
                fileElem.Attributes.Append(XmlAttribute("creation-time",
                                             fi.CreationTime.ToString()));
                fileElem.Attributes.Append(XmlAttribute("last-write-time",
                                             fi.LastWriteTime.ToString()));
                Mp3Reader.getTagInfo(fi.FullName,ref fileElem,ref xmlDoc);
                startNode.AppendChild(fileElem);
            }
        }
        foreach (DirectoryInfo sd in subDirs) 
        {
            XmlElement folderElem = XmlElement("folder",sd.Name);
            startNode.AppendChild(AddElements(folderElem,sd.FullName));
        }
        return startNode;
    }
    catch (Exception ex) 
    {
        return XmlElement("error",ex.Message);
    }
}
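
To see the recursion in isolation, here is a stripped-down sketch that builds the same folder/file structure without reading any tags. TreeSketch and Describe are made-up names, and the sketch leaves out the error handling of the method above.

```csharp
using System;
using System.IO;
using System.Xml;

static class TreeSketch
{
    // Returns a <folder> element for dir, with one <file> child per
    // MP3 file and one nested <folder> per subdirectory.
    public static XmlElement Describe(XmlDocument doc, DirectoryInfo dir)
    {
        XmlElement folder = doc.CreateElement("folder");
        folder.SetAttribute("name", dir.Name);

        foreach (FileInfo fi in dir.GetFiles())
        {
            if (!string.Equals(fi.Extension, ".mp3",
                               StringComparison.OrdinalIgnoreCase))
            {
                continue; // not an MP3 file
            }
            XmlElement file = doc.CreateElement("file");
            file.SetAttribute("name", fi.Name);
            file.SetAttribute("size", fi.Length.ToString());
            folder.AppendChild(file);
        }

        foreach (DirectoryInfo sub in dir.GetDirectories())
        {
            folder.AppendChild(Describe(doc, sub));
        }
        return folder;
    }
}
```

The returned element can be appended to the document's root and the whole thing written out with XmlDocument.Save.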

I have included a small WinForms project in the source to enable you to test the class at once.

License

This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below.
