MIDI Beat Detection using NAudio

5 Aug 2014
An outline of basic MIDI beat detection for games such as Guitar Hero.

Introduction

Over the past few weeks, I've been working on a Guitar Hero-esque game. While working out how to extract each note so that it could be displayed on screen, I found quite a few people attempting similar things, but no single definitive answer anywhere. I did come across the third-party library NAudio, which is available for download. The library offers a wide array of audio tools, but here we'll use only its MIDI support, since the MIDI file format is among the easiest to work with for beat detection.

Other tools

The other helpful tool when (re)using this code is Anvil Studio. It's a free MIDI editor; all we need to do with it here is open a MIDI file and save it again. This converts single-track MIDI files to multi-track ones, which are easier to use for beat detection.

Using the code

The first thing we need to do is get the frequency of each MIDI note. This can be done fairly easily, using a 440Hz piano tuning:

static double[] midi = new double[128];

static void getFrequencies()
{
    //get frequencies for midi notes 0-127 at A440 tuning (piano)
    double a = 440;
    for (int i = 0; i < midi.Length; i++)
    {
        //floating-point division: a / 32 is 13.75 Hz, and the exponent must keep its fraction
        midi[i] = (a / 32) * Math.Pow(2, (i - 9) / 12.0);
    }
}

This code takes each note of the MIDI scale (0-127) and converts it to a frequency in hertz.
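
As a quick sanity check on the formula, here is a minimal sketch of the same conversion written in the more common "A4 is note 69" form. The class and method names (MidiPitch, ToFrequency) are made up for this illustration. It is equivalent to the table above because 440 / 32 is 440 Hz lowered by five octaves (60 semitones), and 9 + 60 = 69.

```csharp
using System;

public static class MidiPitch
{
    // Equal-temperament frequency in hertz for a MIDI note number (0-127),
    // assuming A4 (note 69) is tuned to 440 Hz.
    public static double ToFrequency(int noteNumber)
    {
        // Each semitone multiplies the frequency by 2^(1/12).
        return 440.0 * Math.Pow(2.0, (noteNumber - 69) / 12.0);
    }
}
```

With this, MidiPitch.ToFrequency(69) gives 440 Hz, note 57 (one octave lower) gives 220 Hz, and note 60 (middle C) comes out at roughly 261.63 Hz.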

Next, we need to get various info about the MIDI file so that we can perform operations on it:

static string getFilename()
{
    Console.Write("Path to MIDI file (relative or absolute): ");
    string filen = Console.ReadLine();
    if (!File.Exists(filen))
    {
        Console.WriteLine("That file does not exist.");
        return getFilename();
    }
    return filen;
}

static int getTrackNumber()
{
    Console.Write("Melody Track #: ");
    string track = Console.ReadLine();
    int trackN;
    if (!int.TryParse(track, out trackN))
    {
        Console.WriteLine(track + " is not a valid number.");
        return getTrackNumber();
    }
    return trackN;
}

static string getSortType()
{
    Console.Write("Sort importance by [(d)uration or (v)olume]: ");
    string sortType = Console.ReadLine();
    if (sortType != "d" && sortType != "v")
    {
        Console.WriteLine("Invalid sort type " + sortType);
        return getSortType();
    }
    return sortType;
}

static string getOutputPath()
{
    Console.Write("Path to output (*.song): ");
    string output = Console.ReadLine() + ".song";
    foreach (char c in Path.GetInvalidFileNameChars())
    {
        if (output.Contains(c))
        {
            Console.WriteLine("The character '" + c + "' is not allowed in file names.");
            return getOutputPath();
        }
    }
    return output;
}

These functions provide the input filename, the melody track number, the sort type used to determine each note's importance, and the output filename. The importance determines the difficulty levels at which a note is played: high-importance notes appear on easy, middle-importance notes are added on medium, and low-importance notes are added on hard. This may seem backwards, but importance measures how much a note matters to the melody, so low-importance notes are ornamental while high-importance notes should be seen at every difficulty.
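
To make the importance rule concrete, here is a small sketch of how a game might decide whether to show a note. The Difficulty enum and the NoteFilter.IsShown helper are hypothetical names for this illustration, not part of the converter, and the sketch assumes the levels are cumulative (each harder level adds notes to the previous one).

```csharp
using System;

public enum Difficulty { Easy, Medium, Hard }

public static class NoteFilter
{
    // importance follows the article's convention:
    // true = high, false = middle, null = low.
    public static bool IsShown(bool? importance, Difficulty difficulty)
    {
        switch (difficulty)
        {
            case Difficulty.Easy:
                return importance == true;      // only the notes that carry the melody
            case Difficulty.Medium:
                return importance.HasValue;     // high and middle importance
            default:
                return true;                    // hard shows every note
        }
    }
}
```

A low-importance note (null) is therefore hidden on easy and medium and only appears on hard.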

Now is the part where we get to start using NAudio to process the MIDI for us. Before we even start looking at the code, we need to have all of our usings, including some for later in the article:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using NAudio;
using NAudio.Midi;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

The main audio processing namespaces we need access to are NAudio and NAudio.Midi. If you only care about the beat detection and not how I applied it, you don't need IO or XML namespaces.

Now, let's take a look at the Main function. Skim through to get a general idea of what we're doing and I'll break it down afterwards:

static void Main(string[] args)
{
    getFrequencies();

    string filen = getFilename();

    MidiFile file = new MidiFile(filen);
    Console.WriteLine("Here are some possible tracks for you to choose from:");
    for (int i = 0; i < file.Tracks; i++)
    {
        string instrument = "";
        var events = file.Events.GetTrackEvents(i);
        foreach (var x in events)
        {
            if (x is PatchChangeEvent)
            {
                instrument = (x as PatchChangeEvent).ToString().Split(':')[1].Replace(" " + x.Channel + " ", "");
                break;
            }
        }
        if (!string.IsNullOrWhiteSpace(instrument)) Console.WriteLine(i + ": " + instrument);
    }

    int trackN = getTrackNumber();

    string sortType = getSortType();

    string output = getOutputPath();

    var trackevents = file.Events.GetTrackEvents(trackN);
    //track 0 will have the tempo information
    var tempoGetter = file.Events.GetTrackEvents(0);
    List<MidiNote> notes = new List<MidiNote>();
    int tempo = 0;
    foreach (var e in tempoGetter)
    {
        //get the tempo and drop out of that track
        if (e is TempoEvent)
        {
            //tempo in milliseconds
            tempo = (e as TempoEvent).MicrosecondsPerQuarterNote / 1000;
            break;
        }
    }
    for (int i = 0; i < trackevents.Count; i++)
    {
        //for every event in the melody track
        MidiEvent e = trackevents[i];
        //the note on event, contains the time, volume, and the length of note
        var On = e as NoteOnEvent;
        //if it's a note turning ON (a note on with velocity 0 is really a note off, so skip it)
        if (On != null && On.Velocity > 0)
        {
            //the note event, contains the midi note number (pitch)
            var n = e as NoteEvent;
            //the absolute time (in delta ticks) over the delta ticks per quarter note times the number of milliseconds per quarter note = time in milliseconds
            notes.Add(new MidiNote(On.NoteLength, midi[n.NoteNumber], (long)((On.AbsoluteTime / (float)file.DeltaTicksPerQuarterNote) * tempo), On.Velocity));
        }
    }
    //uses known values to get unknown values needed for a guitar hero clone
    //get the min and max frequency
    notes.Sort((n, n2) => n.frequency.CompareTo(n2.frequency));
    int minFreq = notes.First().frequency;
    int maxFreq = notes.Last().frequency;

    //get the min and max volume
    notes.Sort((n, n2) => n.volume.CompareTo(n2.volume));
    int minVol = notes.First().volume;
    int maxVol = notes.Last().volume;

    //get the min and max note duration
    notes.Sort((n, n2) => n.duration.CompareTo(n2.duration));
    int minLen = notes.First().duration;
    int maxLen = notes.Last().duration;

    //sort by time
    notes.Sort((n, n2) => n.startTime.CompareTo(n2.startTime));

    //outputs the song data to {output}.song
    List<Note> nt = new List<Note>();
    foreach (MidiNote n in notes)
    {
        MidiNote N = n;
        //gets unknown values for button and importance based off of known values
        buttonSignificance(ref N, minFreq, maxFreq, minVol, maxVol, minLen, maxLen, sortType == "v");
        nt.Add((Note)N);
    }
    //serialize to XML document
    XmlTextWriter w = new XmlTextWriter(output, null);
    XmlSerializer serializer = new XmlSerializer(typeof(List<Note>));
    serializer.Serialize(w, nt);
    w.Close();
    Console.WriteLine("done");
    Console.ReadKey();
}

The first part of this function is optional, but helpful all the same:

MidiFile file = new MidiFile(filen);
Console.WriteLine("Here are some possible tracks for you to choose from:");
for (int i = 0; i < file.Tracks; i++)
{
    string instrument = "";
    var events = file.Events.GetTrackEvents(i);
    foreach (var x in events)
    {
        if (x is PatchChangeEvent)
        {
            instrument = (x as PatchChangeEvent).ToString().Split(':')[1].Replace(" " + x.Channel + " ", "");
            break;
        }
    }
    if (!string.IsNullOrWhiteSpace(instrument)) Console.WriteLine(i + ": " + instrument);
}

This code opens the file and lists its tracks. A MIDI file saved by Anvil Studio will have several tracks, with track zero holding information such as the tempo and other data that helps us work out exactly when each note happens. All other tracks contain a patch change event, which tells the MIDI player which type of sound to use, such as trumpet, piano, or violin. We check each track for its patch change event. PatchChangeEvent.ToString() is formatted "{time} PatchChange Ch: {channel number} {Patch name}", so taking everything after the ":" and removing the channel number yields the instrument name. Track 0 will have no PatchChangeEvents if formatted with Anvil, since it has no notes to play. This means that if the instrument is empty, we can exclude the track from the list displayed to the user.
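
The string handling is easier to follow in isolation. The sketch below applies the same split-and-replace steps to a plain string, so it runs without NAudio. The PatchText.ExtractInstrument helper and the sample text are assumptions for illustration, with the text modelled on the format described above.

```csharp
using System;

public static class PatchText
{
    // Pulls the instrument name out of text shaped like
    // "{time} PatchChange Ch: {channel number} {Patch name}".
    public static string ExtractInstrument(string eventText, int channel)
    {
        string[] parts = eventText.Split(':');
        if (parts.Length < 2) return "";    // not in the expected shape
        // parts[1] looks like " 1 Acoustic Grand"; strip the " 1 " prefix.
        return parts[1].Replace(" " + channel + " ", "");
    }
}
```

For example, PatchText.ExtractInstrument("0 PatchChange Ch: 1 Acoustic Grand", 1) returns "Acoustic Grand". One caveat of parsing text this way is that a patch name containing a colon would be cut short.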

Now, let's look at some timing:

var trackevents = file.Events.GetTrackEvents(trackN);
//track 0 will have the tempo information
var tempoGetter = file.Events.GetTrackEvents(0);
List<MidiNote> notes = new List<MidiNote>();
int tempo = 0;
foreach (var e in tempoGetter)
{
    //get the tempo and drop out of that track
    if (e is TempoEvent)
    {
        //tempo in milliseconds
        tempo = (e as TempoEvent).MicrosecondsPerQuarterNote / 1000;
        break;
    }
}

We first get the events for track 0 and for the melody track specified by the user. Track 0 contains the tempo information we need. We loop through every event in track 0; if it's a tempo event, we take its microseconds per quarter note and divide by 1000 to get milliseconds per quarter note. The code assumes the song has a single tempo, so the loop breaks after the first tempo event to save time.
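
To show what the tempo value means, here is a minimal sketch of the arithmetic with made-up helper names (MidiTiming, ToBpm, TicksToMilliseconds). It assumes one fixed tempo for the whole song, as the converter does, and keeps the microsecond precision until the final step.

```csharp
using System;

public static class MidiTiming
{
    // Beats per minute from the microseconds-per-quarter-note value
    // stored in a MIDI tempo event (60,000,000 microseconds per minute).
    public static double ToBpm(int microsecondsPerQuarterNote)
    {
        return 60000000.0 / microsecondsPerQuarterNote;
    }

    // Milliseconds elapsed at an absolute tick position, assuming one fixed tempo.
    public static long TicksToMilliseconds(long absoluteTicks, int ticksPerQuarterNote, int microsecondsPerQuarterNote)
    {
        double quarterNotes = absoluteTicks / (double)ticksPerQuarterNote;
        return (long)(quarterNotes * microsecondsPerQuarterNote / 1000.0);
    }
}
```

A tempo of 500,000 microseconds per quarter note is 120 beats per minute. At 480 ticks per quarter note, tick 960 is two quarter notes in, which is 1000 milliseconds.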

Before we start processing all of our info, we have to read in each note:

for (int i = 0; i < trackevents.Count; i++)
{
    //for every event in the melody track
    MidiEvent e = trackevents[i];
    //the note on event, contains the time, volume, and the length of note
    var On = e as NoteOnEvent;
    //if it's a note turning ON (a note on with velocity 0 is really a note off, so skip it)
    if (On != null && On.Velocity > 0)
    {
        //the note event, contains the midi note number (pitch)
        var n = e as NoteEvent;
        //the absolute time (in delta ticks) over the delta ticks per quarter note times the number of milliseconds per quarter note = time in milliseconds
        notes.Add(new MidiNote(On.NoteLength, midi[n.NoteNumber], (long)((On.AbsoluteTime / (float)file.DeltaTicksPerQuarterNote) * tempo), On.Velocity));
    }
}

As you may have noticed by now, there are a variety of different types of events contained in a MIDI file. In this code block, we'll be focusing on the NoteEvent and the NoteOnEvent. Every time a note-on event occurs, a new MidiNote is added to the list of notes. A MidiNote is defined like so:

public struct MidiNote
{
    public int duration;
    public int frequency;
    public long startTime;
    public int volume;
    public int button;
    public bool? importance;//null = low,false=mid,true=high
    public MidiNote(int duration, double frequency, long startTime, int volume)
    {
        this.duration = duration;
        this.frequency = (int)frequency;
        this.startTime = startTime;
        this.volume = volume;
        button = 0;
        importance = null;
    }
    public override string ToString()
    {
        return "@" + startTime + "-" + (startTime + duration) + ":" + frequency;
    }
}

Each note holds a frequency, duration, and volume, which are used to work out the button and importance. Each note also holds the time (in milliseconds) at which it starts.

Now, we need the minimum and maximum for each of volume, duration, and frequency so we can get our unknown values. We will then sort the notes by start time:

//uses known values to get unknown values needed for a guitar hero clone
//get the min and max frequency
notes.Sort((n, n2) => n.frequency.CompareTo(n2.frequency));
int minFreq = notes.First().frequency;
int maxFreq = notes.Last().frequency;

//get the min and max volume
notes.Sort((n, n2) => n.volume.CompareTo(n2.volume));
int minVol = notes.First().volume;
int maxVol = notes.Last().volume;

//get the min and max note duration
notes.Sort((n, n2) => n.duration.CompareTo(n2.duration));
int minLen = notes.First().duration;
int maxLen = notes.Last().duration;

//sort by time
notes.Sort((n, n2) => n.startTime.CompareTo(n2.startTime));
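
Sorting the whole list three times just to read off the first and last items works, but the same values can be found without reordering anything. Here is one possible alternative using LINQ. The Ranges.MinMax helper is a name invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Ranges
{
    // Returns the smallest and largest value of one field across all items,
    // leaving the order of the list untouched.
    public static Tuple<int, int> MinMax<T>(IList<T> items, Func<T, int> selector)
    {
        // Min and Max throw on an empty sequence, so fail with a clearer message.
        if (items.Count == 0) throw new ArgumentException("No items to measure.", "items");
        return Tuple.Create(items.Min(selector), items.Max(selector));
    }
}
```

With the notes list from above, Ranges.MinMax(notes, n => n.frequency) would give the frequency range in Item1 and Item2, and the single sort by start time is then the only sort needed. The explicit empty check also covers a melody track that contains no notes.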

Now, we must convert each MidiNote to a Note:

List<Note> nt = new List<Note>();
foreach (MidiNote n in notes)
{
    MidiNote N = n;
    //gets unknown values for button and importance based off of known values
    buttonSignificance(ref N, minFreq, maxFreq, minVol, maxVol, minLen, maxLen, sortType == "v");
    nt.Add((Note)N);
}

What's the difference between a MidiNote and a Note? The two main differences are that the Note has no duration or frequency. They aren't needed because the frequency becomes a button, and the duration is only used to work out the importance. A Note can also be serialized to XML:

[Serializable]
public class Note : IXmlSerializable
{
    int m, s, ms, b;
    bool? sig;

    public Note()
    { }

    public Note(int minute, int second, int milli, int button, bool? significance)
    {
        m = minute;
        s = second;
        ms = milli;
        sig = significance;
        b = button;
    }

    public System.Xml.Schema.XmlSchema GetSchema()
    {
        //GetSchema should always return null
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        //move to the next node. If it's a note, get the data
        if (reader.MoveToContent() == XmlNodeType.Element && reader.LocalName == "Note")
        {
            ms = int.Parse(reader.GetAttribute("milliseconds"));
            s = int.Parse(reader.GetAttribute("seconds"));
            m = int.Parse(reader.GetAttribute("minutes"));
            b = int.Parse(reader.GetAttribute("button"));
            string input = reader.GetAttribute("significance");
            bool sn;
            if (bool.TryParse(input, out sn))
            {
                sig = sn;
            }
            else
            {
                sig = null;
            }
            reader.Read();
        }
    }

    public void WriteXml(XmlWriter writer)
    {
        //write values to XML
        writer.WriteAttributeString("milliseconds", ms.ToString());
        writer.WriteAttributeString("seconds", s.ToString());
        writer.WriteAttributeString("minutes", m.ToString());
        writer.WriteAttributeString("significance", !sig.HasValue ? "null" : sig.ToString());
        writer.WriteAttributeString("button", b.ToString());
    }

    public static explicit operator Note(MidiNote n)
    {
        Note ret = new Note();
        ret.b = n.button;
        ret.sig = n.importance;
        long time = n.startTime;

        TimeSpan span = new TimeSpan(0, 0, 0, 0, (int)time);
        ret.m = span.Minutes;
        ret.s = span.Seconds;
        ret.ms = span.Milliseconds;

        return ret;
    }
}
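
The conversion operator leans on TimeSpan to break the start time into parts. The sketch below isolates that step under a hypothetical name (TimeSplit.Split). It differs from the listing in one detail, which is an assumption about what a game would want: it uses TotalMinutes so the minute count keeps growing past 59 instead of wrapping around after an hour.

```csharp
using System;

public static class TimeSplit
{
    // Breaks a start time in milliseconds into { minutes, seconds, milliseconds }.
    public static int[] Split(long startMilliseconds)
    {
        TimeSpan span = TimeSpan.FromMilliseconds(startMilliseconds);
        // TotalMinutes counts whole minutes across hours; Seconds and
        // Milliseconds are the leftover components within the minute.
        return new[] { (int)span.TotalMinutes, span.Seconds, span.Milliseconds };
    }
}
```

A note starting at 83,456 ms splits into 1 minute, 23 seconds, and 456 milliseconds.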

Let's also define the buttonSignificance(ref MidiNote, int, int, int, int, int, int, bool) function:

static void buttonSignificance(ref MidiNote note, int minFreq, int maxFreq, int minVel, int maxVel, int minLen, int maxLen, bool sortVol)
{
    //divide the frequencies into five steps
    float btnStep = (maxFreq - minFreq) / 5.0f;

    float button1Min = minFreq;
    float button2Min = button1Min + btnStep;
    float button3Min = button2Min + btnStep;
    float button4Min = button3Min + btnStep;
    float button5Min = button4Min + btnStep;
    //based off of the note frequency, get the button it needs to be
    if (note.frequency >= button1Min && note.frequency < button2Min) note.button = 0;
    if (note.frequency >= button2Min && note.frequency < button3Min) note.button = 1;
    if (note.frequency >= button3Min && note.frequency < button4Min) note.button = 2;
    if (note.frequency >= button4Min && note.frequency < button5Min) note.button = 3;
    if (note.frequency >= button5Min && note.frequency <= maxFreq) note.button = 4;

    if (sortVol)
    {
        //if sorting by volume, split volume into three steps and get importance
        float vStep = (maxVel - minVel) / 3.0f;
        float v1 = minVel;
        float v2 = v1 + vStep;
        float v3 = v2 + vStep;
        if (note.volume >= v1 && note.volume < v2) note.importance = null;
        if (note.volume >= v2 && note.volume < v3) note.importance = false;
        if (note.volume >= v3 && note.volume <= maxVel) note.importance = true;
    }
    else
    {
        //if sorting by duration, split duration into three steps and get importance
        float lStep = (maxLen - minLen) / 3.0f;
        float l1 = minLen;
        float l2 = l1 + lStep;
        float l3 = l2 + lStep;
        if (note.duration >= l1 && note.duration < l2) note.importance = null;
        if (note.duration >= l2 && note.duration < l3) note.importance = false;
        if (note.duration >= l3 && note.duration <= maxLen) note.importance = true;
    }
}

This function uses the note's frequency relative to the song's frequency range to pick the button (0-4), and the note's duration or volume relative to the longest or loudest notes to determine how important the note is. I prefer duration as the sort criterion; volume is also perfectly valid, although not as effective.
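
Both halves of the function do the same thing: split a range into equal steps and find which step a value lands in. As a sketch of that idea in one place, here is a compact helper (Buckets.IndexOf is a made-up name) that computes the step index directly instead of testing each range in turn.

```csharp
using System;

public static class Buckets
{
    // Maps value onto one of `count` equal-width buckets spanning [min, max].
    // The top edge (value == max) falls into the last bucket.
    public static int IndexOf(int value, int min, int max, int count)
    {
        if (max <= min) return count - 1;                 // degenerate range: treat every value as the maximum
        double step = (max - min) / (double)count;
        int index = (int)Math.Floor((value - min) / step);
        return Math.Max(0, Math.Min(count - 1, index));   // clamp into 0..count-1
    }
}
```

With five buckets the result is the button number (0-4). With three buckets, results 0, 1, and 2 would correspond to low, middle, and high importance.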

Lastly, we must serialize it to an XML document:

//serialize to XML document
XmlTextWriter w = new XmlTextWriter(output, null);
XmlSerializer serializer = new XmlSerializer(typeof(List<Note>));
serializer.Serialize(w, nt);
w.Close();
Console.WriteLine("done");
Console.ReadKey();

That sums it all up. The MIDI file will be converted to a song file (XML).
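
On the game side, the song file has to be read back in. The sketch below is one way that might look, using LINQ to XML rather than the Note class so that it stands alone. The SongReader name is hypothetical, and the sketch assumes only that the file contains Note elements carrying the attributes written by WriteXml; the root element name is not relied upon.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SongReader
{
    // Returns (start time in milliseconds, button) pairs for every Note element found.
    public static List<Tuple<long, int>> Read(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("Note")
            .Select(e => Tuple.Create(
                // Recombine the three time attributes into one millisecond value.
                (long)(int)e.Attribute("minutes") * 60000
                    + (int)e.Attribute("seconds") * 1000
                    + (int)e.Attribute("milliseconds"),
                (int)e.Attribute("button")))
            .ToList();
    }
}
```

To load a file rather than a string, the text could be read with File.ReadAllText first and passed to Read.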

Points of Interest

Overall, beat detection is a complex process. However, it can be made as simple (as we did here) or as complicated (real-time beat detection) as we want. This is one of many approaches to beat detection. If you are truly interested in this field, I encourage you to look into doing this with WAV files. It can be very difficult unless you do signal processing for a living. However, if you aren't limited by time (as I was) and are persistent, you can do neat things with beat detection.

Thanks for reading this, my first article on CodeProject; feedback is much appreciated.

History

8/5/2014 Original Post

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
