Introduction
Over the past few weeks, I've been working on a Guitar Hero-esque game. As I puzzled through how to extract each note so that it could be displayed on screen, I found quite a number of people trying to do something similar, but no single definitive answer anywhere. I did come across the third-party library NAudio, available for download here. This library has an amazing array of useful audio tools, but here we'll be using it for its MIDI support, since the MIDI file format is among the easiest to use for beat detection.
Other tools
The other helpful tool when (re)using this code is Anvil Studio. It's a free MIDI editor; however, all we need to do here is open a MIDI in it and save it again. This will convert single-track MIDIs to multi-track MIDIs for ease of use with beat detection.
Using the code
The first thing we need to do is get the frequency of each MIDI note. This can be done fairly easily, using a 440Hz piano tuning:
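// One entry per MIDI note number (0-127), so 128 entries in total.
static double[] midi = new double[128];
static void getFrequencies()
{
    int a = 440; // tuning reference: A4 (MIDI note 69) = 440 Hz
    for (int i = 0; i < 128; i++)
    {
        // Floating-point divisors avoid integer truncation of a / 32 and (i - 9) / 12.
        midi[i] = (a / 32.0) * Math.Pow(2, (i - 9) / 12.0);
    }
}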
This code takes each note of the MIDI scale (0-127) and converts it to a frequency in hertz.
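As a quick sanity check of the conversion, here is a minimal sketch. The class and method names are illustrative and not part of the original project. It uses the standard equal-temperament relation f(n) = 440 * 2^((n - 69) / 12), which is algebraically the same as (440 / 32) * 2^((n - 9) / 12). The divisors are written as floating-point literals because dividing two ints in C# truncates the result.

```csharp
using System;

static class PitchSketch
{
    // Equal-temperament frequency in hertz of a MIDI note number (0-127),
    // with A4 (note 69) tuned to 440 Hz.
    public static double NoteToFrequency(int noteNumber)
    {
        return 440.0 * Math.Pow(2.0, (noteNumber - 69) / 12.0);
    }
}
```

With this sketch, PitchSketch.NoteToFrequency(69) gives 440 Hz, note 57 (one octave lower) gives 220 Hz, and note 60 (middle C) comes out at roughly 261.63 Hz.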
Next, we need to get various info about the MIDI file so that we can perform operations on it:
static string getFilename()
{
Console.Write("Path to MIDI file (relative or absolute): ");
string filen = Console.ReadLine();
if (!File.Exists(filen))
{
Console.WriteLine("That file does not exist.");
return getFilename();
}
return filen;
}
static int getTrackNumber()
{
Console.Write("Melody Track #: ");
string track = Console.ReadLine();
int trackN;
if (!int.TryParse(track, out trackN))
{
Console.WriteLine(track + " is not a valid number.");
return getTrackNumber();
}
return trackN;
}
static string getSortType()
{
Console.Write("Sort importance by [(d)uration or (v)olume]: ");
string sortType = Console.ReadLine();
if (sortType != "d" && sortType != "v")
{
Console.WriteLine("Invalid sort type " + sortType);
return getSortType();
}
return sortType;
}
static string getOutputPath()
{
Console.Write("Path to output (*.song): ");
string output = Console.ReadLine() + ".song";
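// Check against invalid path characters rather than invalid file name characters,
// so that directory separators in the output path are accepted.
foreach (char c in Path.GetInvalidPathChars())
{
if (output.Contains(c))
{
Console.WriteLine("The character '" + c + "' is not allowed in paths.");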
return getOutputPath();
}
}
return output;
}
These functions provide the input filename, the output filename, the melody track number, and the criterion used to determine each note's importance. The importance decides the difficulty levels at which a note is played: high-importance notes on easy, middle-importance notes on medium, and low-importance notes on hard. This may seem backwards, but importance measures how essential a note is to the melody, so low-importance notes are ornamental, while high-importance notes should be seen at every difficulty.
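To make the importance-to-difficulty idea concrete, here is a hedged sketch of how a game might decide whether to show a note. The Difficulty enum and the IsShown helper are hypothetical and not part of the converter. The sketch assumes importance is stored as a bool? with true for high, false for middle, and null for low, and that each harder difficulty also shows everything the easier ones show.

```csharp
using System;

enum Difficulty { Easy, Medium, Hard }

static class DifficultySketch
{
    // importance: true = high, false = middle, null = low.
    public static bool IsShown(bool? importance, Difficulty difficulty)
    {
        if (importance == true) return true;                            // high: every difficulty
        if (importance == false) return difficulty != Difficulty.Easy;  // middle: medium and hard
        return difficulty == Difficulty.Hard;                           // low: hard only
    }
}
```

Under these assumptions, a low-importance (ornamental) note is hidden on easy and medium and only appears on hard.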
Now is the part where we get to start using NAudio to process the MIDI for us. Before we even start looking at the code, we need to have all of our usings, including some for later in the article:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using NAudio;
using NAudio.Midi;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
The main audio processing namespaces we need access to are NAudio and NAudio.Midi. If you only care about the beat detection and not how I applied it, you don't need the IO or XML namespaces.
Now, let's take a look at the Main function. Skim through to get a general idea of what we're doing and I'll break it down afterwards:
static void Main(string[] args)
{
getFrequencies();
string filen = getFilename();
MidiFile file = new MidiFile(filen);
Console.WriteLine("Here are some possible tracks for you to choose from:");
for (int i = 0; i < file.Tracks; i++)
{
string instrument = "";
var events = file.Events.GetTrackEvents(i);
foreach (var x in events)
{
if (x is PatchChangeEvent)
{
instrument = (x as PatchChangeEvent).ToString().Split(':')[1].Replace(" " + x.Channel + " ", "");
break;
}
}
if (!string.IsNullOrWhiteSpace(instrument)) Console.WriteLine(i + ": " + instrument);
}
int trackN = getTrackNumber();
string sortType = getSortType();
string output = getOutputPath();
var trackevents = file.Events.GetTrackEvents(trackN);
var tempoGetter = file.Events.GetTrackEvents(0);
List<MidiNote> notes = new List<MidiNote>();
int tempo = 500; // fallback: the MIDI default of 120 BPM (500 ms per quarter note) if no tempo event is found
foreach (var e in tempoGetter)
{
if (e is TempoEvent)
{
tempo = (e as TempoEvent).MicrosecondsPerQuarterNote / 1000;
break;
}
}
for (int i = 0; i < trackevents.Count; i++)
{
MidiEvent e = trackevents[i];
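// Skip note-ons that have no paired note-off (for example, velocity-0 note-ons used as note-offs),
// because NoteLength needs the matching off event.
if (e is NoteOnEvent && ((NoteOnEvent)e).OffEvent != null)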
{
var On = e as NoteOnEvent;
var n = e as NoteEvent;
notes.Add(new MidiNote(On.NoteLength, midi[n.NoteNumber], (long)((On.AbsoluteTime / (float)file.DeltaTicksPerQuarterNote) * tempo), On.Velocity));
}
}
notes.Sort((n, n2) => n.frequency.CompareTo(n2.frequency));
int minFreq = notes.First().frequency;
int maxFreq = notes.Last().frequency;
notes.Sort((n, n2) => n.volume.CompareTo(n2.volume));
int minVol = notes.First().volume;
int maxVol = notes.Last().volume;
notes.Sort((n, n2) => n.duration.CompareTo(n2.duration));
int minLen = notes.First().duration;
int maxLen = notes.Last().duration;
notes.Sort((n, n2) => n.startTime.CompareTo(n2.startTime));
List<Note> nt = new List<Note>();
foreach (MidiNote n in notes)
{
MidiNote N = n;
buttonSignificance(ref N, minFreq, maxFreq, minVol, maxVol, minLen, maxLen, sortType == "v");
nt.Add((Note)N);
}
XmlTextWriter w = new XmlTextWriter(output, null);
XmlSerializer serializer = new XmlSerializer(typeof(List<Note>));
serializer.Serialize(w, nt);
w.Close();
Console.WriteLine("done");
Console.ReadKey();
}
The first part of this function is optional, but helpful all the same:
MidiFile file = new MidiFile(filen);
Console.WriteLine("Here are some possible tracks for you to choose from:");
for (int i = 0; i < file.Tracks; i++)
{
string instrument = "";
var events = file.Events.GetTrackEvents(i);
foreach (var x in events)
{
if (x is PatchChangeEvent)
{
instrument = (x as PatchChangeEvent).ToString().Split(':')[1].Replace(" " + x.Channel + " ", "");
break;
}
}
if (!string.IsNullOrWhiteSpace(instrument)) Console.WriteLine(i + ": " + instrument);
}
This code opens the file and lists its tracks. A MIDI saved by Anvil Studio will have several tracks, with track zero holding information such as the tempo and other data that helps us work out exactly when each note happens. All other tracks contain a patch change event, which tells the MIDI player which type of sound to use, such as trumpet, piano, or violin. We check each track for its patch change event. PatchChangeEvent.ToString() is formatted "{time} PatchChange Ch: {channel number} {Patch name}", so taking everything after the ":" and removing the channel number yields the instrument name. Track 0 will have no PatchChangeEvents if formatted with Anvil, since it has no notes to play. This means that if the instrument is empty, we can exclude the track from the list displayed to the user.
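The string handling can be tried out without NAudio. The sketch below is a hypothetical helper that works purely on text in the format described above; the sample strings in the usage note are made up for illustration. It strips only the leading channel number, whereas a plain Replace removes every occurrence of the channel number surrounded by spaces, which could also clip a patch name that happens to contain it.

```csharp
using System;

static class PatchNameSketch
{
    // Extracts the patch name from text shaped like
    // "{time} PatchChange Ch: {channel} {patch name}".
    // Returns an empty string if the text does not have that shape.
    public static string InstrumentFrom(string eventText, int channel)
    {
        int colon = eventText.IndexOf(':');
        if (colon < 0) return "";
        string rest = eventText.Substring(colon + 1).TrimStart();
        string prefix = channel + " ";
        // Strip only the leading channel number, so digits inside the patch name survive.
        return rest.StartsWith(prefix, StringComparison.Ordinal)
            ? rest.Substring(prefix.Length).Trim()
            : "";
    }
}
```

For example, InstrumentFrom("0 PatchChange Ch: 1 Acoustic Grand", 1) returns "Acoustic Grand", and a name with an embedded number such as "Lead 1 (square)" on channel 1 is returned intact.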
Now, let's look at some timing:
var trackevents = file.Events.GetTrackEvents(trackN);
var tempoGetter = file.Events.GetTrackEvents(0);
List<MidiNote> notes = new List<MidiNote>();
int tempo = 500; // fallback: the MIDI default of 120 BPM (500 ms per quarter note) if no tempo event is found
foreach (var e in tempoGetter)
{
if (e is TempoEvent)
{
tempo = (e as TempoEvent).MicrosecondsPerQuarterNote / 1000;
break;
}
}
We first get the events for track 0 and for the melody track specified by the user. Track 0 contains the timing information we need. We loop through the events in track 0 until we find a tempo event, then convert its tempo from microseconds to milliseconds per quarter note. This code assumes the song keeps a single tempo throughout, so the loop breaks after the first tempo event; a song with tempo changes would need each tempo event applied to the notes that follow it.
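The timing arithmetic is easier to see in isolation. Here is a minimal sketch with hypothetical helper names, assuming one constant tempo for the whole song as above. A note's absolute time in ticks is divided by the file's ticks per quarter note to get a position in quarter notes, which is then multiplied by the length of a quarter note.

```csharp
using System;

static class TimingSketch
{
    // Converts an absolute time in ticks to milliseconds, assuming one constant tempo.
    public static long TicksToMilliseconds(long absoluteTicks, int ticksPerQuarterNote, int microsecondsPerQuarterNote)
    {
        double quarterNotes = absoluteTicks / (double)ticksPerQuarterNote;
        return (long)(quarterNotes * microsecondsPerQuarterNote / 1000.0);
    }

    // Beats (quarter notes) per minute for a given tempo value.
    public static double BeatsPerMinute(int microsecondsPerQuarterNote)
    {
        return 60000000.0 / microsecondsPerQuarterNote;
    }
}
```

As an example, a tempo of 500,000 microseconds per quarter note is 120 beats per minute, and with 480 ticks per quarter note, a note at tick 960 starts two quarter notes in, at 1000 ms.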
Before we start processing all of our info, we have to read in each note:
for (int i = 0; i < trackevents.Count; i++)
{
MidiEvent e = trackevents[i];
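// Skip note-ons that have no paired note-off (for example, velocity-0 note-ons used as note-offs),
// because NoteLength needs the matching off event.
if (e is NoteOnEvent && ((NoteOnEvent)e).OffEvent != null)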
{
var On = e as NoteOnEvent;
var n = e as NoteEvent;
notes.Add(new MidiNote(On.NoteLength, midi[n.NoteNumber], (long)((On.AbsoluteTime / (float)file.DeltaTicksPerQuarterNote) * tempo), On.Velocity));
}
}
As you may have noticed by now, there are a variety of different types of events contained in a MIDI file. In this code block, we'll be focusing on the NoteEvent and the NoteOnEvent. Every time a note-on event occurs, a new MidiNote is added to the list of notes. A MidiNote is defined like so:
public struct MidiNote
{
public int duration;
public int frequency;
public long startTime;
public int volume;
public int button;
public bool? importance;
public MidiNote(int duration, double frequency, long startTime, int volume)
{
this.duration = duration;
this.frequency = (int)frequency;
this.startTime = startTime;
this.volume = volume;
button = 0;
importance = null;
}
public override string ToString()
{
return "@" + startTime + "-" + (startTime + duration) + ":" + frequency;
}
}
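Each note defines a frequency, duration, and volume, which are used to determine the button and importance. It also defines the time (in milliseconds) at which the note starts. The duration is left in MIDI ticks, which is fine here because it is only ever compared against other durations.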
Now, we need the minimum and maximum of the volume, duration, and frequency so that we can work out each note's button and importance. We will then sort the notes by start time:
notes.Sort((n, n2) => n.frequency.CompareTo(n2.frequency));
int minFreq = notes.First().frequency;
int maxFreq = notes.Last().frequency;
notes.Sort((n, n2) => n.volume.CompareTo(n2.volume));
int minVol = notes.First().volume;
int maxVol = notes.Last().volume;
notes.Sort((n, n2) => n.duration.CompareTo(n2.duration));
int minLen = notes.First().duration;
int maxLen = notes.Last().duration;
notes.Sort((n, n2) => n.startTime.CompareTo(n2.startTime));
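The sorts above are only used to read off the smallest and largest values. As an alternative sketch (not the article's original approach), the same extremes can be found with LINQ's Min and Max without reordering the list. The MinMax helper below is hypothetical and generic, so it is shown without the MidiNote type; it also reports an empty track explicitly instead of failing on First().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RangeSketch
{
    // Returns the smallest and largest value of one note property,
    // selected by a lambda such as n => n.volume.
    public static Tuple<int, int> MinMax<T>(IEnumerable<T> notes, Func<T, int> selector)
    {
        List<int> values = notes.Select(selector).ToList();
        if (values.Count == 0) throw new ArgumentException("The track contains no notes.");
        return Tuple.Create(values.Min(), values.Max());
    }
}
```

With the notes list from Main, a call such as RangeSketch.MinMax(notes, n => n.volume) would yield the minimum in Item1 and the maximum in Item2. The final sort by start time is still needed either way.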
Now, we must convert each MidiNote to a Note:
List<Note> nt = new List<Note>();
foreach (MidiNote n in notes)
{
MidiNote N = n;
buttonSignificance(ref N, minFreq, maxFreq, minVol, maxVol, minLen, maxLen, sortType == "v");
nt.Add((Note)N);
}
What's the difference between a MidiNote and a Note? The main difference is that the Note has no duration or frequency. They aren't needed because the frequency becomes a button, and the duration is only used for sorting. A Note can also be serialized to XML:
[Serializable]
public class Note : IXmlSerializable
{
int m, s, ms, b;
bool? sig;
public Note()
{ }
public Note(int minute, int second, int milli, int button, bool? significance)
{
m = minute;
s = second;
ms = milli;
sig = significance;
b = button;
}
public System.Xml.Schema.XmlSchema GetSchema()
{
return null;
}
public void ReadXml(XmlReader reader)
{
if (reader.MoveToContent() == XmlNodeType.Element && reader.LocalName == "Note")
{
ms = int.Parse(reader.GetAttribute("milliseconds"));
s = int.Parse(reader.GetAttribute("seconds"));
m = int.Parse(reader.GetAttribute("minutes"));
b = int.Parse(reader.GetAttribute("button"));
string input = reader.GetAttribute("significance");
bool sn;
if (bool.TryParse(input, out sn))
{
sig = sn;
}
else
{
sig = null;
}
reader.Read();
}
}
public void WriteXml(XmlWriter writer)
{
writer.WriteAttributeString("milliseconds", ms.ToString());
writer.WriteAttributeString("seconds", s.ToString());
writer.WriteAttributeString("minutes", m.ToString());
writer.WriteAttributeString("significance", !sig.HasValue ? "null" : sig.ToString());
writer.WriteAttributeString("button", b.ToString());
}
public static explicit operator Note(MidiNote n)
{
Note ret = new Note();
ret.b = n.button;
ret.sig = n.importance;
long time = n.startTime;
TimeSpan span = new TimeSpan(0, 0, 0, 0, (int)time);
ret.m = span.Minutes;
ret.s = span.Seconds;
ret.ms = span.Milliseconds;
return ret;
}
}
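The conversion operator above splits the start time into minutes, seconds, and milliseconds. Here is a small stand-alone sketch of that split, using a hypothetical helper name. One deliberate difference from the listing: it uses TotalMinutes rather than Minutes, because TimeSpan.Minutes wraps around at 60 and would drop the hours for a start time beyond one hour.

```csharp
using System;

static class StartTimeSketch
{
    // Splits a start time in milliseconds into (minutes, seconds, milliseconds).
    public static Tuple<int, int, int> Split(long startTimeMs)
    {
        TimeSpan span = new TimeSpan(startTimeMs * TimeSpan.TicksPerMillisecond);
        // TotalMinutes keeps counting past 59, unlike the Minutes component.
        return Tuple.Create((int)span.TotalMinutes, span.Seconds, span.Milliseconds);
    }
}
```

For instance, a start time of 83,456 ms splits into 1 minute, 23 seconds, and 456 milliseconds.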
Let's also define the buttonSignificance(ref MidiNote, int, int, int, int, int, int, bool) function:
static void buttonSignificance(ref MidiNote note, int minFreq, int maxFreq, int minVel, int maxVel, int minLen, int maxLen, bool sortVol)
{
float btnStep = (maxFreq - minFreq) / 5.0f;
float button1Min = minFreq;
float button2Min = button1Min + btnStep;
float button3Min = button2Min + btnStep;
float button4Min = button3Min + btnStep;
float button5Min = button4Min + btnStep;
if (note.frequency >= button1Min && note.frequency < button2Min) note.button = 0;
if (note.frequency >= button2Min && note.frequency < button3Min) note.button = 1;
if (note.frequency >= button3Min && note.frequency < button4Min) note.button = 2;
if (note.frequency >= button4Min && note.frequency < button5Min) note.button = 3;
if (note.frequency >= button5Min && note.frequency <= maxFreq) note.button = 4;
if (sortVol)
{
float vStep = (maxVel - minVel) / 3.0f;
float v1 = minVel;
float v2 = v1 + vStep;
float v3 = v2 + vStep;
if (note.volume >= v1 && note.volume < v2) note.importance = null;
if (note.volume >= v2 && note.volume < v3) note.importance = false;
if (note.volume >= v3 && note.volume <= maxVel) note.importance = true;
}
else
{
float lStep = (maxLen - minLen) / 3.0f;
float l1 = minLen;
float l2 = l1 + lStep;
float l3 = l2 + lStep;
if (note.duration >= l1 && note.duration < l2) note.importance = null;
if (note.duration >= l2 && note.duration < l3) note.importance = false;
if (note.duration >= l3 && note.duration <= maxLen) note.importance = true;
}
}
This function uses the note's frequency relative to the song's frequency range to pick the button (0-4), and the note's duration or volume relative to the longest or loudest notes to determine how important the note is. I myself prefer to use duration as the sort criterion; volume is also perfectly valid, although not as effective in my experience.
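Both halves of the function do the same thing: split a range into equal-width bands and report which band a value falls into. The sketch below captures that idea in one hypothetical helper. It is a compact approximation of the logic above rather than a drop-in replacement, since it computes the band by division instead of comparing against accumulated thresholds, so values sitting exactly on a boundary may occasionally land differently.

```csharp
using System;

static class BucketSketch
{
    // Maps a value in [min, max] to one of `count` equal-width buckets,
    // numbered 0 to count - 1.
    public static int Bucket(int value, int min, int max, int count)
    {
        if (max <= min) return count - 1;               // degenerate range: use the top bucket
        float step = (max - min) / (float)count;
        int index = (int)((value - min) / step);
        return Math.Min(Math.Max(index, 0), count - 1); // value == max would otherwise give `count`
    }
}
```

Under this sketch, the button would be Bucket(note.frequency, minFreq, maxFreq, 5), and the importance tier would be Bucket(note.duration, minLen, maxLen, 3), with 0 mapping to null, 1 to false, and 2 to true.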
Lastly, we must serialize it to an XML document:
XmlTextWriter w = new XmlTextWriter(output, null);
XmlSerializer serializer = new XmlSerializer(typeof(List<Note>));
serializer.Serialize(w, nt);
w.Close();
Console.WriteLine("done");
Console.ReadKey();
That sums it all up. The MIDI file will be converted to a song file (XML).
Points of Interest
Overall, beat detection is a complex process. However, it can be made as simple (as we did here) or as complicated (real-time beat detection) as we want. This is one of many approaches to beat detection. If you are truly interested in this field, I encourage you to look into doing this with WAV files. It can be very difficult unless you do signal processing for a living. However, if you aren't limited by time (as I was) and are persistent, you can do neat things with beat detection.
Thanks for reading this, my first article on CodeProject; feedback is much appreciated.
History
8/5/2014 Original Post