Infer.NET – A Library for People Who Love Probability

Ngọc Minh Trần

11 Mar 2019, CPOL, 4 min read

An introduction to Infer.NET

Introduction

Infer.NET is an open-source library that can be used to create probabilistic programming systems. We can use Infer.NET to solve many different kinds of machine learning problems, such as classification and recommendation. In this article, I am going to introduce how to use the Infer.NET library in Visual Studio 2017 Community. Infer.NET supports both C# and F#; I am going to use C# in this article.

Background

Installing the Infer.NET Library in VS 2017 Community

For simplicity, we will create a C# Console App (.NET Framework) project and name it InferDotNetDemo. We also need to select version 4.7 (or later) of the .NET Framework, because we may run into errors with earlier versions.

In the Solution Explorer window, we also rename Program.cs to InferDotNetDemo.cs.

We can install Infer.NET by right-clicking on the InferDotNetDemo project and choosing Manage NuGet Packages.

In the NuGet window, we select the Browse tab and enter ‘Infer.net’ in the Search field. Finally, we select Microsoft.ML.Probabilistic.Compiler and click the Install button.

We click OK in the Preview Changes dialog, and then click I Accept in the License Acceptance dialog.

After a few seconds, Visual Studio will confirm the installation with a message in the Output window.

An Infer.NET Program

An Infer.NET program includes three key steps:

Step 1: Definition of a probabilistic model
All Infer.NET programs need a probabilistic model to be defined. In our demo, we define a model by creating random variables from distributions.

Step 2: Creation of an inference engine
All inference is performed by an inference engine, which is an instance of the InferenceEngine class.

Step 3: Execution of an inference query
Given an inference engine, you can query marginal distributions over variables using Infer().

Using the Code

Let’s start by including the Infer.NET namespaces:

using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Distributions;

And now, I am going to explain how to use the Infer.NET library for probabilistic programming through the two examples below.

An Introduction to Probability - A simple example

Assume that we have a box that contains six balls: one red ball and five blue balls. We are allowed to pick a ball at random twice, putting the ball back after each pick so that the two picks do not affect each other.

When we pick one ball, there are two possible outcomes: a red ball or a blue ball. The probability of picking the red ball is p = 1/6 ≈ 0.17 and the probability of picking a blue ball is q = 1 – p = 5/6 ≈ 0.83. If we pick a ball twice, four outcomes are possible: (red, red), (red, blue), (blue, red) and (blue, blue).
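The four outcomes and their probabilities can also be enumerated in a few lines of plain C#. This is only a sketch that does not use Infer.NET. The class name TwoPickOutcomes and its members are made up for illustration, and it assumes that the two picks are independent (the ball goes back into the box after each pick).

```csharp
using System;
using System.Collections.Generic;

public static class TwoPickOutcomes
{
    // Probability of drawing the red ball on a single pick (1 red, 5 blue).
    public const double P = 1.0 / 6.0;

    // Lists the four (first, second) outcomes with their probabilities,
    // assuming the picks are independent.
    public static List<(string First, string Second, double Probability)> Enumerate()
    {
        var outcomes = new List<(string First, string Second, double Probability)>();
        foreach (bool firstRed in new[] { true, false })
        {
            foreach (bool secondRed in new[] { true, false })
            {
                // Independence: multiply the single-pick probabilities.
                double prob = (firstRed ? P : 1 - P) * (secondRed ? P : 1 - P);
                outcomes.Add((firstRed ? "red" : "blue", secondRed ? "red" : "blue", prob));
            }
        }
        return outcomes;
    }

    public static void Main()
    {
        foreach (var (first, second, probability) in Enumerate())
        {
            Console.WriteLine($"{first,-5} {second,-5} {probability:F4}");
        }
    }
}
```

Run as a console program, this prints 0.0278 for (red, red), 0.1389 for each of the two mixed outcomes and 0.6944 for (blue, blue).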

To represent each pick, we can use a boolean variable where true represents “picking a red ball” and false represents “picking a blue ball”. A distribution over a boolean value with some probability of being true is called a Bernoulli distribution. So we can simulate each pick by creating a boolean random variable from a Bernoulli distribution.

Note that the first pick and the second pick are independent. Hence, the probability of picking a red ball on the first pick and a red ball on the second pick is 1/6 × 1/6 = 1/36 ≈ 0.0278. If we round p to 0.17, as the code in this article does, the product becomes 0.17 × 0.17 = 0.0289 (you can find out details about Bernoulli distribution here).

Now we can write some lines of C# code for our program. First, we are going to define a probabilistic model by creating random variables from a Bernoulli distribution with a probability of 1/6 ≈ 0.17 of being true:

Variable<bool> firstPicking = Variable.Bernoulli(0.17);
Variable<bool> secondPicking = Variable.Bernoulli(0.17);

Another way of making a random variable is to derive it using an expression containing other random variables like so:

Variable<bool> bothReds = firstPicking & secondPicking;

Here, bothReds is true only if both firstPicking and secondPicking are true, so it represents the situation where both picks are a red ball.

Second, now that we have created the variable bothReds, we are going to find out its distribution by creating an inference engine, which uses the default inference algorithm (Expectation Propagation):

InferenceEngine ie = new InferenceEngine();

We can use the Infer() method of this engine to query the marginal distribution of the bothReds variable:

Console.WriteLine("Probability both pickings are a red ball: " + ie.Infer(bothReds));
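If we want the probability as a plain number instead of a printed distribution, a sketch like the following may help. It uses the generic Infer<Bernoulli>() call and GetProbTrue(), and passes the Expectation Propagation algorithm to the engine explicitly instead of relying on the default. The class name TypedInferenceSketch and the method ProbabilityBothRed are hypothetical names chosen for this illustration, and the code needs the Microsoft.ML.Probabilistic.Compiler package installed as described earlier.

```csharp
using System;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Models;

public static class TypedInferenceSketch
{
    // Returns P(bothReds = true) as a double for a given single-pick probability.
    public static double ProbabilityBothRed(double pRed)
    {
        // Step 1: the same model as in the article.
        Variable<bool> firstPicking = Variable.Bernoulli(pRed);
        Variable<bool> secondPicking = Variable.Bernoulli(pRed);
        Variable<bool> bothReds = firstPicking & secondPicking;

        // Step 2: choose the algorithm explicitly (Expectation Propagation
        // is also what the parameterless constructor uses).
        InferenceEngine ie = new InferenceEngine(new ExpectationPropagation());

        // Step 3: the generic overload returns a typed Bernoulli distribution.
        Bernoulli posterior = ie.Infer<Bernoulli>(bothReds);
        return posterior.GetProbTrue();
    }

    public static void Main()
    {
        Console.WriteLine("P(both red) = " + ProbabilityBothRed(0.17));
    }
}
```

Working with the typed distribution makes it easier to use the result in later calculations, for example to compare it with the hand-computed product of the two probabilities.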

Here is all of our code in the InferDotNetDemo.cs file:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Distributions;

namespace ConsoleApp1
{
    class InferDemoProgram
    {
        static void Main(string[] args)
        {
            //**************Step 1: Define a probabilistic model*************************
            //creating a random variable for first picking
            Variable<bool> firstPicking = Variable.Bernoulli(0.17);
            //creating a random variable for second picking
            Variable<bool> secondPicking = Variable.Bernoulli(0.17);
            //creating a random variable based on other random variables
            Variable<bool> bothReds = firstPicking & secondPicking;
            //**************Step 2: Creating an inference engine*************************
            InferenceEngine ie = new InferenceEngine();
            //**************Step 3: Execution of an inference query*************************
            //Expectation Propagation is the default algorithm, so this branch runs
            if (!(ie.Algorithm is VariationalMessagePassing))
            {
                Console.WriteLine("Probability both pickings are a red ball: " + ie.Infer(bothReds));
            }
            else
                Console.WriteLine("Not run with Variational Message Passing!");
            Console.ReadKey();
        }
    }
}

When we run our application, it prints the inferred Bernoulli distribution of bothReds, whose probability of being true should be 0.17 × 0.17 = 0.0289.

And a more complex example

In this example, we are going to use a Gaussian distribution and a Gamma distribution to build a model that can predict the temperature of a day based on a known dataset. Our InferDotNetDemo program can look like this:

    class InferDemoProgram
    {
        static void Main(string[] args)
        {
            //**************Step 1: Define a probabilistic model*************************
            //Assume that we know temperature of three days:
            //Day 1: 13 Celsius, Day 2: 17 Celsius, and Day 3: 16 Celsius
            double[] temp = new double[3] { 13, 17, 16 };
            // Priors: a Gaussian over the average temperature (mean 15, precision 0.01)
            // and a Gamma over the precision (shape 2.0, scale 0.5); these parameter values are assumed
            Variable<double> averageTemp = Variable.GaussianFromMeanAndPrecision(15, 0.01).Named("Average Temperature");
            Variable<double> precision = Variable.GammaFromShapeAndScale(2.0, 0.5).Named("Precision");

            // Train the model
            // using the Range object to handle the array of data efficiently
            Range dataRange = new Range(temp.Length).Named("n");
            VariableArray<double> daysTemp = Variable.Array<double>(dataRange);
            daysTemp[dataRange] = Variable.GaussianFromMeanAndPrecision(averageTemp, precision).ForEach(dataRange);
            daysTemp.ObservedValue = temp;
            //**************Step 2: Creating an inference engine*************************
            InferenceEngine ie = new InferenceEngine();
            //**************Step 3: Execution of an inference query*************************
            //Make predictions
            //Add a prediction variable and retrain the model
            Variable<double> tomorrowsTemp = Variable.GaussianFromMeanAndPrecision(averageTemp, precision).Named("Tomorrows Predicted Temperature");
            //get the Gaussian distribution
            Gaussian tomorrowsTempDist = ie.Infer<Gaussian>(tomorrowsTemp);
            // get the mean 
            double tomorrowsMean = tomorrowsTempDist.GetMean();
            //get the standard deviation (the square root of the variance)
            double tomorrowsStdDev = Math.Sqrt(tomorrowsTempDist.GetVariance());
            //Expectation Propagation is the default algorithm, so this branch runs
            if (!(ie.Algorithm is VariationalMessagePassing))
            {
                // Write out the results.
                Console.WriteLine("Tomorrows predicted temperature: {0:f2} Celsius plus or minus {1:f2}", tomorrowsMean, tomorrowsStdDev);
                // Ask other questions of the model
                double probTempLessThan18Celsius = ie.Infer<Bernoulli>(tomorrowsTemp < 18.0).GetProbTrue();
                Console.WriteLine("Probability that the temperature is less than 18 Celsius: {0:f2}", probTempLessThan18Celsius);
            }
            else
                Console.WriteLine("Not run with Variational Message Passing!");
            Console.ReadKey();
        }
    }

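If we run the program, it prints tomorrow's predicted temperature together with its standard deviation, and the probability that the temperature is less than 18 Celsius.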
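As a rough plausibility check on the prediction, we can compute the sample mean and sample standard deviation of the three known temperatures in plain C#. This sketch does not perform Bayesian inference, so its numbers are not expected to match the Infer.NET output exactly, since the model also accounts for the priors and for uncertainty in the parameters. The class name SampleStatistics and its methods are hypothetical.

```csharp
using System;
using System.Linq;

public static class SampleStatistics
{
    // Arithmetic mean of the observations.
    public static double Mean(double[] values) => values.Average();

    // Unbiased sample standard deviation (divides by n - 1).
    public static double SampleStdDev(double[] values)
    {
        double mean = Mean(values);
        double sumOfSquares = values.Sum(v => (v - mean) * (v - mean));
        return Math.Sqrt(sumOfSquares / (values.Length - 1));
    }

    public static void Main()
    {
        // The same three temperatures used in the article's model.
        double[] temp = { 13, 17, 16 };
        Console.WriteLine("Sample mean: {0:f2}", Mean(temp));
        Console.WriteLine("Sample standard deviation: {0:f2}", SampleStdDev(temp));
    }
}
```

For the data above, this gives a sample mean of 15.33 and a sample standard deviation of 2.08, so a predicted temperature near 15 Celsius is plausible.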

Visualising our model

Infer.NET allows us to visualise the model on which inference is being performed, in the form of a factor graph. To do this, we need two steps:

Step 1: Download and install Graphviz
Step 2: Set the ShowFactorGraph property of the inference engine to true

InferenceEngine ie = new InferenceEngine();
ie.ShowFactorGraph = true;

If we run the program (the second example), we will get something like this:

Points of Interest

In this article, I have only introduced the basics of the Infer.NET library. If you want to discover more about this library, you can refer to some of the best sources below:

History

4th March, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)