Here we introduce the code for the architecture presented in the previous article, and we will then look at another architecture that would require more than an ordinary CPU to train efficiently.
To start, let’s install the Keras.NET package from the NuGet Package Manager, which we can find under Tools > NuGet Package Manager. Keras.NET relies on the Numpy.NET and pythonnet_netstandard packages; if they are not installed automatically, go ahead and install them as well.
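Alternatively, the installation can be done from the Package Manager Console with a single command (the dependencies above are pulled in automatically where possible):

Install-Package Keras.NET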
It is important to point out that Keras.NET requires a Python version between 2.7 and 3.7 to be installed on your operating system, along with the Python libraries NumPy and TensorFlow. In this example, we use Python 3.7 (64-bit).
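Assuming Python is already installed, both libraries can be installed from a command prompt with pip (which ships with the standard Python installers); note that the TensorFlow version you can use depends on your Python version:

pip install numpy tensorflow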
If you encounter any issues while executing the code in this article, try running the following method once at the beginning of your console application's Main method. It sets up the environment variables needed so that all DLLs are found:
private static void SetupPyEnv()
{
    // Adjust this path to your local Python 3.7 installation.
    string envPythonHome = @"C:\Users\arnal\AppData\Local\Programs\Python\Python37\";
    string envPythonLib = envPythonHome + @"Lib\;" + envPythonHome + @"Lib\site-packages\";

    // Make the Python runtime and its libraries visible to this process.
    Environment.SetEnvironmentVariable("PYTHONHOME", envPythonHome, EnvironmentVariableTarget.Process);
    Environment.SetEnvironmentVariable("PATH",
        envPythonHome + ";" + envPythonLib + ";" +
        Environment.GetEnvironmentVariable("PATH", EnvironmentVariableTarget.Machine),
        EnvironmentVariableTarget.Process);
    Environment.SetEnvironmentVariable("PYTHONPATH", envPythonLib, EnvironmentVariableTarget.User);

    // Point the embedded Python engine (pythonnet) at the same installation.
    PythonEngine.PythonHome = envPythonHome;
    PythonEngine.PythonPath = Environment.GetEnvironmentVariable("PYTHONPATH");
}
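As a minimal usage sketch, Main simply calls this method before any Keras.NET type is touched (the dataset loading and the Cnn class are shown below):

static void Main(string[] args)
{
    // Must run before the first Keras.NET/Python call in the process.
    SetupPyEnv();

    // ... load the dataset, build the Cnn, train and predict as shown below.
}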
We will see now how easy and transparent it is to create our CNN for coin recognition using Keras.NET. The following Cnn class contains all the logic of the model.
public class Cnn
{
    private DataSet _dataset;
    private Sequential _model;

    public Cnn(DataSet dataset)
    {
        _dataset = dataset;
        _model = new Sequential();
    }

    public void Train()
    {
        // First convolutional block: two 3x3 convolutions, max pooling and dropout.
        _model.Add(new Conv2D(32, kernel_size: (3, 3).ToTuple(),
                              padding: Settings.PaddingMode,
                              input_shape: new Shape(Settings.ImgWidth, Settings.ImgHeight, Settings.Channels)));
        _model.Add(new Activation(Settings.ActivationFunction));
        _model.Add(new Conv2D(32, (3, 3).ToTuple()));
        _model.Add(new Activation(Settings.ActivationFunction));
        _model.Add(new MaxPooling2D(pool_size: (2, 2).ToTuple()));
        _model.Add(new Dropout(0.25));

        // Second convolutional block with 64 filters.
        _model.Add(new Conv2D(64, kernel_size: (3, 3).ToTuple(),
                              padding: Settings.PaddingMode));
        _model.Add(new Activation(Settings.ActivationFunction));
        _model.Add(new Conv2D(64, (3, 3).ToTuple()));
        _model.Add(new Activation(Settings.ActivationFunction));
        _model.Add(new MaxPooling2D(pool_size: (2, 2).ToTuple()));
        _model.Add(new Dropout(0.25));

        // Fully connected classifier ending in a softmax over the classes.
        _model.Add(new Flatten());
        _model.Add(new Dense(Settings.FullyConnectedNodes));
        _model.Add(new Activation(Settings.ActivationFunction));
        _model.Add(new Dropout(0.5));
        _model.Add(new Dense(_dataset.NumberClasses));
        _model.Add(new Softmax());

        _model.Compile(loss: Settings.LossFunction,
                       optimizer: Settings.Optimizer,
                       metrics: new string[] { Settings.Accuracy });

        _model.Fit(_dataset.TrainX, _dataset.TrainY,
                   batch_size: Settings.BatchSize,
                   epochs: Settings.Epochs,
                   validation_data: new NDarray[] { _dataset.ValidationX, _dataset.ValidationY });

        // Evaluate the trained model on the validation set.
        var score = _model.Evaluate(_dataset.ValidationX, _dataset.ValidationY, verbose: 0);
        Console.WriteLine("Test loss: " + score[0]);
        Console.WriteLine("Test accuracy: " + score[1]);
    }

    public NDarray Predict(string imgPath)
    {
        // Normalize the image and add the batch dimension expected by the model.
        NDarray x = Utils.Normalize(imgPath);
        x = x.reshape(1, x.shape[0], x.shape[1], x.shape[2]);
        return _model.Predict(x);
    }
}
As we can see, we first have a constructor where we receive the dataset (imported and processed in the second article of this series) and create a new instance of the Sequential class, stored in the private variable _model. What is Sequential? It is an empty model that gives us the possibility of stacking layers, which is precisely what we need.
Then, in the Train method, we first create our stack of layers as described in the architecture presented in the previous article, and then compile the model and call the Fit method to begin training. The loss function used is categorical_crossentropy. What is a loss function? It is the function that measures how far the model's predictions are from the true labels; training aims to minimize it. Minimizing the loss is the job of the optimizer, an algorithm that adjusts the weights of the network (and, in adaptive methods, the effective learning rate) to reduce the loss.
At the end, the model is evaluated using the validation dataset. Another method is Predict, which, as the name suggests, predicts the label of new incoming data; it should be called only after training has finished. Starting the training phase is as simple as running the following:
var cnn = new Cnn(dataSet);
cnn.Train();
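For reference, here is a minimal sketch of the Settings helper from which the Cnn class reads its hyperparameters. The loss function matches the categorical_crossentropy mentioned above; the remaining values are assumptions for illustration and should be adapted to your dataset:

// Hypothetical values for illustration; only the loss function is fixed by the text above.
public static class Settings
{
    public const int ImgWidth = 64;                  // assumed input width in pixels
    public const int ImgHeight = 64;                 // assumed input height in pixels
    public const int Channels = 3;                   // RGB images
    public const string PaddingMode = "same";
    public const string ActivationFunction = "relu";
    public const int FullyConnectedNodes = 512;
    public const string LossFunction = "categorical_crossentropy";
    public const string Optimizer = "adam";
    public const string Accuracy = "accuracy";
    public const int BatchSize = 32;
    public const int Epochs = 25;
}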
Looking at the results obtained during training on the coin recognition problem we are working through in this series, we can see that we were able to reach 100% accuracy during training.
In the case of the Predict method, its output will be an NDarray containing, for each of the classes used to train the CNN, the probability that the image belongs to that class.
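As a minimal sketch of how that output could be consumed (the image path here is hypothetical, and we assume the argmax method Numpy.NET exposes on NDarray):

// Predict the class of a new image and report the most probable class index.
var probabilities = cnn.Predict(@"coins\test\sample.png");  // hypothetical path
var predictedClass = probabilities.argmax();                // index of the highest probability
Console.WriteLine("Predicted class index: " + predictedClass);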
So, what would be an architecture that requires a GPU rather than a CPU? The AlexNet architecture, for example, includes five convolutional layers and three fully connected layers, along with pooling and activation layers. A deep CNN like this performs far better on a GPU because of the sheer volume of computation involved: as a general rule, the more layers you add, the more weight calculations each training step must perform.
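To make the comparison concrete, here is a rough sketch of an AlexNet-style stack in Keras.NET. It is for illustration only, assuming the layer constructors mirror their Python Keras counterparts; the filter counts follow the original paper:

// AlexNet-style architecture sketch: five convolutional + three fully connected layers.
var alexNet = new Sequential();
alexNet.Add(new Conv2D(96, kernel_size: (11, 11).ToTuple(), strides: (4, 4).ToTuple(),
                       activation: "relu", input_shape: new Shape(227, 227, 3)));
alexNet.Add(new MaxPooling2D(pool_size: (3, 3).ToTuple(), strides: (2, 2).ToTuple()));
alexNet.Add(new Conv2D(256, kernel_size: (5, 5).ToTuple(), padding: "same", activation: "relu"));
alexNet.Add(new MaxPooling2D(pool_size: (3, 3).ToTuple(), strides: (2, 2).ToTuple()));
alexNet.Add(new Conv2D(384, kernel_size: (3, 3).ToTuple(), padding: "same", activation: "relu"));
alexNet.Add(new Conv2D(384, kernel_size: (3, 3).ToTuple(), padding: "same", activation: "relu"));
alexNet.Add(new Conv2D(256, kernel_size: (3, 3).ToTuple(), padding: "same", activation: "relu"));
alexNet.Add(new MaxPooling2D(pool_size: (3, 3).ToTuple(), strides: (2, 2).ToTuple()));
alexNet.Add(new Flatten());
alexNet.Add(new Dense(4096, activation: "relu"));
alexNet.Add(new Dropout(0.5));
alexNet.Add(new Dense(4096, activation: "relu"));
alexNet.Add(new Dropout(0.5));
alexNet.Add(new Dense(1000, activation: "softmax"));  // 1000 ImageNet classes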
Now that we have seen how to code our own CNN, we will move on to the area of pre-trained models. More on this in the next article!