Introduction
Most common Machine Learning (ML) libraries are written in Python, which makes them hard to use from .NET applications. The ML.NET library acts as a bridge between ML and .NET applications.
ML.NET is an open source library that can be used directly in .NET applications. In this article, I introduce how to use the ML.NET library in Visual Studio 2017 (I am using VS 2017 Community).
Background
A Binary Classification Problem
Assume that we have two groups of points in a two-dimensional space, Red and Blue, and we want to predict whether a point belongs to the Red group or the Blue group based on its coordinates (x and y). Our training data can look like this:
3 -2 Red
-2 3 Red
-1 -4 Red
2 3 Red
3 4 Red
-1 9 Blue
2 14 Blue
1 17 Blue
3 12 Blue
0 8 Blue
We have ten points. The first two values of each row are the coordinates (x and y) of a point, and the third value is the group that the point belongs to.
Because there are only two possible outputs, Blue or Red, our problem is a binary classification problem. There are many ML techniques for solving a binary classification problem. In this article, I will use Logistic Regression because it is one of the simplest ML algorithms.
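To make the idea behind Logistic Regression concrete, here is a minimal sketch in plain C# that does not use ML.NET at all. It passes a weighted sum of the coordinates through the logistic (sigmoid) function to get the probability that a point is Blue, and fits the weights with simple stochastic gradient descent. The class name LogisticSketch, the learning rate, and the number of epochs are my own assumptions for illustration, and this is not the optimizer that ML.NET uses internally.

```csharp
using System;

static class LogisticSketch
{
    // Logistic (sigmoid) function: maps any real number into the range (0, 1).
    public static double Sigmoid(double z)
    {
        return 1.0 / (1.0 + Math.Exp(-z));
    }

    // Probability that the point (x, y) is Blue, given weights w = { bias, wx, wy }.
    public static double ProbabilityBlue(double[] w, double x, double y)
    {
        return Sigmoid(w[0] + w[1] * x + w[2] * y);
    }

    // Stochastic gradient descent on the log loss; labels are 0 (Red) or 1 (Blue).
    public static double[] Train(double[][] points, int[] labels, double learningRate, int epochs)
    {
        var w = new double[3];
        for (int epoch = 0; epoch < epochs; epoch++)
        {
            for (int i = 0; i < points.Length; i++)
            {
                double x = points[i][0], y = points[i][1];
                double error = ProbabilityBlue(w, x, y) - labels[i];
                w[0] -= learningRate * error;
                w[1] -= learningRate * error * x;
                w[2] -= learningRate * error * y;
            }
        }
        return w;
    }

    static void Main()
    {
        // The ten training points from the article: five Red (0), then five Blue (1).
        double[][] points =
        {
            new[] { 3.0, -2.0 }, new[] { -2.0, 3.0 }, new[] { -1.0, -4.0 }, new[] { 2.0, 3.0 }, new[] { 3.0, 4.0 },
            new[] { -1.0, 9.0 }, new[] { 2.0, 14.0 }, new[] { 1.0, 17.0 }, new[] { 3.0, 12.0 }, new[] { 0.0, 8.0 }
        };
        int[] labels = { 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 };

        // Assumed hyperparameters, chosen only for this illustration.
        double[] w = Train(points, labels, 0.01, 5000);

        double p = ProbabilityBlue(w, 5.0, -7.0);
        Console.WriteLine("P(Blue) = " + p.ToString("F4") + " -> " + (p >= 0.5 ? "Blue" : "Red"));
    }
}
```

A point is assigned to Blue when the probability is at least 0.5 and to Red otherwise, which is the kind of decision the ML.NET model makes for us later in the article.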
Creating a .NET Application and Installing the ML.NET Library
For simplicity, we will create a C# Console Application (.NET Framework) and name it MyFirstMLDOTNET. In the Solution Explorer window, we also rename Program.cs to MyFirstMLDOTNET.cs.
We can install ML.NET by right-clicking the MyFirstMLDOTNET project and choosing Manage NuGet Packages.
In the NuGet window, we select the Browse tab and enter ‘ML.NET’ in the Search field. Finally, we select Microsoft.ML and click the Install button.
Click OK in the Preview Changes dialog and then click I Accept in the License Acceptance dialog. After a few seconds, Visual Studio will respond with a message in the Output window.
At this point, if we try to run our application, we may get an error message.
To solve this error, right-click the MyFirstMLDOTNET project and select Properties. In the Properties window, we select the Build item on the left side and change Any CPU to x64 in the Platform target item.
We also need to select version 4.7 (or later) of the .NET Framework because earlier versions cause errors. We can choose the version by selecting the Application item on the left side and picking it in the Target framework item. If version 4.7 (or later) is not installed, we can select Install other frameworks, which takes us to the Microsoft page where we can download and install the .NET Framework packages.
Now we can run our application again, and it succeeds.
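If you want to confirm that the Platform target change took effect, a small check like the following can help. It is only a sketch: the class name PlatformCheck and the Describe helper are hypothetical, while Environment.Is64BitProcess is a standard .NET property.

```csharp
using System;

static class PlatformCheck
{
    // Hypothetical helper that turns the flag into a readable message.
    public static string Describe(bool is64Bit)
    {
        return is64Bit ? "64-bit (x64)" : "32-bit";
    }

    static void Main()
    {
        // True when the current process runs as a 64-bit process.
        Console.WriteLine("Process is " + Describe(Environment.Is64BitProcess));
    }
}
```

If this reports a 32-bit process, the Platform target is probably still set to Any CPU with Prefer 32-bit enabled, or to x86.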
Using the Code
The Training Data
Before creating the ML model, we must create the training data file: right-click the MyFirstMLDOTNET project, select Add > New Item, select the Text File type, and enter myMLData.txt in the Name field.
Click the Add button. In the myMLData.txt window, we enter (or copy from above) the training data:
3 -2 Red
-2 3 Red
-1 -4 Red
2 3 Red
3 4 Red
-1 9 Blue
2 14 Blue
1 17 Blue
3 12 Blue
0 8 Blue
Click Save and close the myMLData.txt window.
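Because the loader later splits each row on a single space, a stray extra space or a misspelled label in this file can cause confusing errors. The following sketch, in plain C# without ML.NET, shows one way to sanity-check rows in this format. The class DataFileCheck and the method ParseLine are hypothetical names of my own, and the sample rows are hard-coded here, whereas in the real project they would come from reading myMLData.txt.

```csharp
using System;
using System.Globalization;

static class DataFileCheck
{
    // Hypothetical helper: parses one "x y label" row, or throws if it is malformed.
    public static Tuple<float, float, string> ParseLine(string line)
    {
        // Split on single spaces only, so extra or trailing spaces are reported as errors.
        string[] parts = line.Split(' ');
        if (parts.Length != 3)
            throw new FormatException("Expected 3 fields but found " + parts.Length + ": '" + line + "'");

        float x = float.Parse(parts[0], CultureInfo.InvariantCulture);
        float y = float.Parse(parts[1], CultureInfo.InvariantCulture);

        if (parts[2] != "Red" && parts[2] != "Blue")
            throw new FormatException("Unknown label: '" + parts[2] + "'");

        return Tuple.Create(x, y, parts[2]);
    }

    static void Main()
    {
        // Sample rows in the same format as myMLData.txt.
        string[] lines = { "3 -2 Red", "-1 9 Blue", "0 8 Blue" };
        foreach (string line in lines)
        {
            var row = ParseLine(line);
            Console.WriteLine("x=" + row.Item1 + ", y=" + row.Item2 + ", label=" + row.Item3);
        }
    }
}
```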
The Data Classes
After creating the training data file, we also need to create data classes. A class named myData defines the structure of the training data: two coordinates (x and y) and one label (Red or Blue):
public class myData
{
    [Column(ordinal: "0", name: "XCoord")]
    public float x;
    [Column(ordinal: "1", name: "YCoord")]
    public float y;
    [Column(ordinal: "2", name: "Label")]
    public string Label;
}
A class named myPrediction holds the predicted information:
public class myPrediction
{
    [ColumnName("PredictedLabel")]
    public string PredictedLabels;
}
Creating and Training the ML Model
We can create the ML model and train it:
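var pipeline = new LearningPipeline();
// Relative path from the build output folder up to the project folder
string dataPath = "..\\..\\myMLData.txt";
// Load the space-separated training data
pipeline.Add(new TextLoader(dataPath).CreateFrom<myData>(separator: ' '));
// Convert the text labels (Red/Blue) into numeric keys
pipeline.Add(new Dictionarizer("Label"));
// Combine the two coordinates into a single Features column
pipeline.Add(new ColumnConcatenator("Features", "XCoord", "YCoord"));
// Use logistic regression as the learning algorithm
pipeline.Add(new LogisticRegressionBinaryClassifier());
// Convert the predicted key back to the original text label
pipeline.Add(new PredictedLabelColumnOriginalValueConverter()
    { PredictedLabelColumn = "PredictedLabel" });
Console.WriteLine("\nStarting training \n");
var model = pipeline.Train<myData, myPrediction>();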
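The Dictionarizer, ColumnConcatenator and PredictedLabelColumnOriginalValueConverter steps can feel abstract, so here is a conceptual sketch in plain C# of what such steps do to the data: text labels become numeric keys, the two coordinates become one feature vector, and a predicted key is mapped back to its text label. The class PipelineConcepts and its methods are hypothetical names of my own, and the actual key values and internal representation used by ML.NET may differ.

```csharp
using System;
using System.Collections.Generic;

static class PipelineConcepts
{
    // Assigns each distinct label an integer key in order of first appearance.
    public static int[] ToKeys(string[] labels, out List<string> vocabulary)
    {
        vocabulary = new List<string>();
        var keys = new int[labels.Length];
        for (int i = 0; i < labels.Length; i++)
        {
            int key = vocabulary.IndexOf(labels[i]);
            if (key < 0)
            {
                vocabulary.Add(labels[i]);
                key = vocabulary.Count - 1;
            }
            keys[i] = key;
        }
        return keys;
    }

    // Combines the two coordinate columns into a single feature vector.
    public static float[] Concatenate(float x, float y)
    {
        return new[] { x, y };
    }

    static void Main()
    {
        string[] labels = { "Red", "Red", "Blue" };
        List<string> vocabulary;
        int[] keys = ToKeys(labels, out vocabulary);
        Console.WriteLine("Keys: " + string.Join(", ", keys));

        float[] features = Concatenate(3f, -2f);
        Console.WriteLine("Features length: " + features.Length);

        // Mapping a predicted key back to the original text label.
        Console.WriteLine("Key 1 means: " + vocabulary[1]);
    }
}
```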
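Evaluating the Model
We can evaluate our ML model as follows. For simplicity, we evaluate on the same data that we trained on, so the reported accuracy is optimistic compared with the accuracy on unseen data: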
var testData = new TextLoader(dataPath).CreateFrom<myData>(separator: ' ');
var evaluator = new BinaryClassificationEvaluator();
var metrics = evaluator.Evaluate(model, testData);
double acc = metrics.Accuracy * 100;
Console.WriteLine("Model accuracy = " + acc.ToString("F2") + "%");
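The accuracy metric is simply the fraction of points whose predicted label matches the actual label. As a rough sketch in plain C# (the class AccuracySketch is a hypothetical name, and the predictions below are made up for illustration, not output from the model):

```csharp
using System;

static class AccuracySketch
{
    // Accuracy = number of matching labels / total number of labels, in the range [0, 1].
    public static double Accuracy(string[] actual, string[] predicted)
    {
        if (actual.Length != predicted.Length || actual.Length == 0)
            throw new ArgumentException("Need two non-empty arrays of equal length.");

        int correct = 0;
        for (int i = 0; i < actual.Length; i++)
        {
            if (actual[i] == predicted[i]) correct++;
        }
        return (double)correct / actual.Length;
    }

    static void Main()
    {
        string[] actual = { "Red", "Red", "Blue", "Blue" };
        string[] predicted = { "Red", "Blue", "Blue", "Blue" }; // made-up predictions
        double acc = Accuracy(actual, predicted) * 100;
        Console.WriteLine("Model accuracy = " + acc.ToString("F2") + "%");
    }
}
```

With only ten training points, a single misclassified point changes the accuracy by ten percentage points, so this number should be read with care.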
Testing the Model
Finally, we can test our model with a new point:
myData newPoint = new myData(){ x = 5f, y = -7f};
myPrediction prediction = model.Predict(newPoint);
string result = prediction.PredictedLabels;
Console.WriteLine("Prediction = " + result);
Here is all of our code in the MyFirstMLDOTNET.cs file:
using System;
using Microsoft.ML.Runtime.Api;
using System.Threading.Tasks;
using Microsoft.ML.Legacy;
using Microsoft.ML.Legacy.Data;
using Microsoft.ML.Legacy.Transforms;
using Microsoft.ML.Legacy.Trainers;
using Microsoft.ML.Legacy.Models;
namespace MyFirstMLDOTNET
{
    class MyFirstMLDOTNET
    {
        public class myData
        {
            [Column(ordinal: "0", name: "XCoord")]
            public float x;
            [Column(ordinal: "1", name: "YCoord")]
            public float y;
            [Column(ordinal: "2", name: "Label")]
            public string Label;
        }
        public class myPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }
        static void Main(string[] args)
        {
            var pipeline = new LearningPipeline();
            string dataPath = "..\\..\\myMLData.txt";
            pipeline.Add(new TextLoader(dataPath).CreateFrom<myData>(separator: ' '));
            pipeline.Add(new Dictionarizer("Label"));
            pipeline.Add(new ColumnConcatenator("Features", "XCoord", "YCoord"));
            pipeline.Add(new LogisticRegressionBinaryClassifier());
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter()
                { PredictedLabelColumn = "PredictedLabel" });
            Console.WriteLine("\nStarting training \n");
            var model = pipeline.Train<myData, myPrediction>();
            var testData = new TextLoader(dataPath).CreateFrom<myData>(separator: ' ');
            var evaluator = new BinaryClassificationEvaluator();
            var metrics = evaluator.Evaluate(model, testData);
            double acc = metrics.Accuracy * 100;
            Console.WriteLine("Model accuracy = " + acc.ToString("F2") + "%");
            myData newPoint = new myData()
                { x = 5f, y = -7f};
            myPrediction prediction = model.Predict(newPoint);
            string result = prediction.PredictedLabels;
            Console.WriteLine("Prediction = " + result);
            Console.WriteLine("\nEnd ML.NET demo");
            Console.ReadLine();
        }
    }
}
When we run the application, it prints the model accuracy and the predicted group for the new point.
Points of Interest
This article gives only a basic introduction to ML.NET, the Machine Learning library for .NET developers. ML.NET is still under active development, and you can learn more about it through the ML.NET tutorials.
History
- 24th November, 2018: Initial version