
Build Simple AI .NET Library - Part 4 - Beyond Perceptron

Part 4 of a series of articles demonstrating a .NET AI library built from scratch
This is the fourth in a series of articles demonstrating how to build a .NET AI library from scratch. In this article, we will continue the discussion of the Perceptron and create more complicated layouts for more complicated problems.

Series Introduction

This is the 4th article on creating the .NET library. Below are the links for the earlier parts in the series:

My objective is to create a simple AI library that covers a couple of advanced AI topics such as genetic algorithms, ANNs, fuzzy logic, and other evolutionary algorithms. The only challenge in completing this series will be having enough time to work on the code and articles.

Having the code itself is not the main target, however; understanding these algorithms is. I hope this will be useful to someone someday.

Article Introduction - Part 4 "Beyond Perceptron"

In the last article, we created a Perceptron that acts as a binary linear classifier. In this one, we will continue the discussion of the Perceptron and create more complicated layouts for more complicated problems.

I strongly advise you to review Build Simple AI .NET Library - Part 2 - Machine Learning Introduction before moving any further in this article.

More Perceptron Examples

As mentioned, the Perceptron is the simplest processing element of an ANN. It is a powerful algorithm, yet a very limited one: remember, it is mainly used as a binary linear classifier.

What about more complicated classification problems, such as binary but non-linear classifications? Through a couple of examples, we will see how to develop more complex Perceptron layouts.

To get everyone on the same page, let's review some definitions:

  • Binary Classifier - a problem where the output has only 2 possible answers, classifications, or groups. The classification itself can be linear or non-linear.
  • Linear Classifier - the inputs are linearly separable: you can draw a straight line that separates the two groups.

    Image 1

  • Non-linear Classifier - the classification is not possible via a straight line:

    Image 2

    Image 3

  • Problem Dimensions - the input matrix (vector) can be considered as the features of the problem being optimized. In the last example of article 3, we had 2 inputs: one is X and the other is Y, the coordinates of each point. Another way to view the inputs is to treat the number of inputs as the dimensions of the problem, so a 4D problem has 4 different features. (AI can solve problems of higher dimensions that are impossible for the human brain to visualize.)

Using Perceptron to Optimize Binary Functions

To better understand the Perceptron and its limitations, we will check its use in implementing the binary functions NOT, OR, AND & XOR.

NOT Function

Image 4

So this is a 1-dimensional problem. Let's design the Perceptron as follows:

Image 5

h(x) = W0 + W1 * X1

As the output is 0 or 1, the Step activation function is a good choice.

Image 6

Then Y = Step(h(x))

From the NOT truth table above, the output Y is 1 when X is 0, so h(x) shall be >= 0 for X = 0:

h(x) = W0 + W1 * X1 >= 0 for X = 0

W0 >= 0, so let's select W0 = 1

h(x) = 1 + W1 * X1

The second possible value is Y = 0 for X = 1, so h(x) < 0 for X = 1:

1 + W1 * X1 < 0 for X = 1

1 + W1 < 0

W1 < -1, so let's select W1 = -1.5

Finally, h(x) = 1 - 1.5 * X

Image 7
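To make this concrete, here is a minimal C# sketch of this Perceptron with the derived weights. The class and method names are illustrative, not the API of the library built in this series:

```csharp
using System;

class Program
{
    // Step activation: returns 1 when the weighted sum is >= 0, otherwise 0.
    static int Step(double h) => h >= 0 ? 1 : 0;

    // NOT Perceptron: h(x) = W0 + W1 * X, with W0 = 1 and W1 = -1.5 as derived above.
    static int NotPerceptron(int x) => Step(1.0 - 1.5 * x);

    static void Main()
    {
        Console.WriteLine($"NOT 0 = {NotPerceptron(0)}"); // prints 1
        Console.WriteLine($"NOT 1 = {NotPerceptron(1)}"); // prints 0
    }
}
```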

OR Function

Image 8

This is a 2-dimensional problem; let's plot X1 & X2.

Image 9

These groups are linearly separable, as a straight line can be drawn to separate them, such as the following one:

Image 10

Again, we will use the Step activation function for this Perceptron:

Image 11

h(x) = W0 + W1 * X1 + W2 * X2

Y = Step(h(x))

From the truth table, we know that Y = 0 for X1 = X2 = 0, which means:

h(x) < 0 for X1 = X2 = 0

W0 < 0, so let's select W0 = -0.5

h(x) = -0.5 + W1 * X1 + W2 * X2

This corresponds to selecting the line from the graph that intercepts the X1 axis at 0.5 and the X2 axis at 0.5 (other lines can work as separators as well).

From the truth table, Y = 1 for X1 = 1 and X2 = 0, so h(x) >= 0 for X1 = 1 and X2 = 0:

-0.5 + W1 * 1 + W2 * 0 >= 0

-0.5 + W1 >= 0

W1 >= 0.5, so let's select W1 = 1

h(x) = -0.5 + 1 * X1 + W2 * X2

Also, Y = 1 for X1 = 0 and X2 = 1, so h(x) >= 0 for X1 = 0 and X2 = 1:

-0.5 + W2 >= 0

W2 >= 0.5, so let's select W2 = 1

Finally, h(x) = -0.5 + X1 + X2

Let's confirm the truth table:

X1 | X2 | Desired | h(x) = -0.5 + X1 + X2 | Y
1  | 1  | 1       | 1.5                   | 1
1  | 0  | 1       | 0.5                   | 1
0  | 1  | 1       | 0.5                   | 1
0  | 0  | 0       | -0.5                  | 0
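A short sketch (again with illustrative names, not the series library API) can confirm the same table programmatically:

```csharp
using System;

class Program
{
    static int Step(double h) => h >= 0 ? 1 : 0;

    // OR Perceptron: h(x) = -0.5 + X1 + X2, as derived above.
    static int OrPerceptron(int x1, int x2) => Step(-0.5 + x1 + x2);

    static void Main()
    {
        foreach (var (x1, x2) in new[] { (1, 1), (1, 0), (0, 1), (0, 0) })
        {
            double h = -0.5 + x1 + x2;
            Console.WriteLine($"X1={x1} X2={x2} h(x)={h,4} Y={OrPerceptron(x1, x2)}");
        }
    }
}
```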

AND Function

Image 12

Image 13

Similarly, this is a 2D problem, and the Perceptron shall be:

Image 14

By following the same procedure as for OR above, we can conclude values for W0, W1 & W2:

One possible combination is h(x) = -1.5 + X1 + X2

To verify the truth table:

X1 | X2 | Desired | h(x) = -1.5 + X1 + X2 | Y
1  | 1  | 1       | 0.5                   | 1
1  | 0  | 0       | -0.5                  | 0
0  | 1  | 0       | -0.5                  | 0
0  | 0  | 0       | -1.5                  | 0

So, the final Perceptron shall be:

Image 15
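The same pattern (illustrative code, not the series library) verifies the AND weights:

```csharp
using System;

class Program
{
    static int Step(double h) => h >= 0 ? 1 : 0;

    // AND Perceptron: h(x) = -1.5 + X1 + X2, as concluded above.
    static int AndPerceptron(int x1, int x2) => Step(-1.5 + x1 + x2);

    static void Main()
    {
        foreach (var (x1, x2) in new[] { (1, 1), (1, 0), (0, 1), (0, 0) })
            Console.WriteLine($"X1={x1} X2={x2} Y={AndPerceptron(x1, x2)}");
    }
}
```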

XOR Function

Image 16

Image 17

Here we have a problem: this function is not linearly separable; there is no single line that can separate the 2 groups.

So a single Perceptron cannot solve this problem, and this is the major limitation of the Perceptron (binary linear classifications only).

Yet the Perceptron is a powerful algorithm, and perhaps it can be used in other formations to solve more complicated problems.

Let's come back to the XOR function and try to understand it better. We will use Venn diagrams to help with that. Venn diagrams are graphical representations of different logical operations (here is more about Venn diagrams).

The Venn diagram for the OR gate shall be:

Image 18

and here is the Venn diagram for AND:

Image 19

Here is XOR:

Image 20

From the Venn diagrams, we can express the XOR gate as the UNION (OR) area excluding the INTERSECTION (AND) area; in other words:

A XOR B = (A + B) - (A.B)
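As a quick sanity check, this identity can be tested in code: (A OR B) AND NOT (A AND B) is the boolean form of "union minus intersection", and C# has a built-in XOR operator (^) to compare against. This is a hypothetical standalone snippet, not part of the series library:

```csharp
using System;

class Program
{
    static void Main()
    {
        foreach (var (a, b) in new[] { (true, true), (true, false), (false, true), (false, false) })
        {
            bool xor = (a || b) && !(a && b); // union (OR) minus intersection (AND)
            Console.WriteLine($"{a} XOR {b} = {xor} (built-in operator gives {a ^ b})");
        }
    }
}
```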

We already used the Perceptron to implement the AND & OR functions above, so why not use more than one Perceptron to implement the above expression? One possible implementation could be:

Image 21

AND Function

The 2D AND function is already implemented, and we may reuse it:

Image 22

OR Function

We did not implement a 3D OR function yet. To do so, let's start by rewriting the XOR function truth table with X1 AND X2 as a third input:

X1 | X2 | X1 AND X2 | Desired
1  | 1  | 1         | 0
1  | 0  | 0         | 1
0  | 1  | 0         | 1
0  | 0  | 0         | 0

So we need to find the weights of h(x) for the OR-function Perceptron that fulfill the above table, where:

h(x) = W0 + W1 * X1 + W2 * X2 + W3 * X3 (X3 = X1 AND X2)

The activation function will be Step as well.

Let's start with the last combination: X1 = 0, X2 = 0 & X1 AND X2 = 0, so Y = 0:

h(x) = W0 + W1 * X1 + W2 * X2 + W3 * X3 < 0 for X1 = 0, X2 = 0 & X3 = 0

W0 < 0, so let's select W0 = -1

For the combination X1 = 1, X2 = 0 & X1 AND X2 = 0, Y = 1:

h(x) = -1 + W1 * X1 + W2 * X2 + W3 * X3 >= 0 for X1 = 1, X2 = 0 & X3 = 0

-1 + W1 >= 0

W1 >= 1, so let's select W1 = 2

For the combination X1 = 0, X2 = 1 & X1 AND X2 = 0, Y = 1:

h(x) = -1 + 2 * X1 + W2 * X2 + W3 * X3 >= 0 for X1 = 0, X2 = 1 & X3 = 0

-1 + W2 >= 0

W2 >= 1, so let's select W2 = 2

For the combination X1 = 1, X2 = 1 & X1 AND X2 = 1, Y = 0:

h(x) = -1 + 2 * X1 + 2 * X2 + W3 * X3 < 0 for X1 = 1, X2 = 1 & X3 = 1

-1 + 2 + 2 + W3 < 0

3 + W3 < 0

W3 < -3, so let's select W3 = -4

Finally, h(x) = -1 + 2 * X1 + 2 * X2 - 4 * X3 (X3 = X1 AND X2)

Image 23

The final Perceptron network shall be:

Image 24
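Putting the pieces together, here is a minimal sketch of the resulting two-layer arrangement (illustrative names, not the series library API): the hidden AND Perceptron feeds the output Perceptron alongside the raw inputs:

```csharp
using System;

class Program
{
    static int Step(double h) => h >= 0 ? 1 : 0;

    // Hidden Perceptron: AND, h(x) = -1.5 + X1 + X2.
    static int And(int x1, int x2) => Step(-1.5 + x1 + x2);

    // Output Perceptron: h(x) = -1 + 2*X1 + 2*X2 - 4*X3, where X3 = X1 AND X2.
    static int Xor(int x1, int x2)
    {
        int x3 = And(x1, x2);
        return Step(-1.0 + 2 * x1 + 2 * x2 - 4 * x3);
    }

    static void Main()
    {
        foreach (var (x1, x2) in new[] { (1, 1), (1, 0), (0, 1), (0, 0) })
            Console.WriteLine($"X1={x1} X2={x2} XOR={Xor(x1, x2)}");
    }
}
```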

Well, let's try to reformulate the graphical representation of the above layout. Each Perceptron shall be denoted by its function.

Instead of having only the AND Perceptron in the middle, let's add one Perceptron that passes X1 through and another one that passes X2:

Image 25

Now, let's add dummy Perceptrons that receive the inputs and just pass them on to the next level of Perceptrons:

Image 26

Clearly, the above is a better representation, and it is called an MLP (Multi-Layer Perceptron) network. This is exactly the common layout of an ANN (Artificial Neural Network).

Inputs are received by a set of Perceptrons equal in number to the inputs; this is called the Input Layer.

Output is generated by Perceptrons, 1 Perceptron per output. This is called the Output Layer.

The processing Perceptrons in the middle, between the input layer and the output layer, form the Hidden Layer.

Image 27

Each ANN can have only 1 input layer and 1 output layer, but it can have one or multiple hidden layers. The number of hidden layers depends on the complexity of the problem being optimized.
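To sketch how such a layered structure might look in code (the types and names here are illustrative assumptions, not the library API from this series), each layer can be a jagged array of weights, where each row is one Perceptron and the first element is the bias W0. The XOR network above then becomes a hidden layer of 3 Perceptrons feeding an output layer of 1:

```csharp
using System;
using System.Linq;

class Program
{
    static int Step(double h) => h >= 0 ? 1 : 0;

    // Feed one layer: each row of weights is one Perceptron;
    // weights[i][0] is the bias W0 and the rest multiply the inputs.
    static int[] FeedLayer(double[][] weights, int[] inputs) =>
        weights.Select(w =>
            Step(w[0] + inputs.Select((x, i) => w[i + 1] * x).Sum()))
        .ToArray();

    static void Main()
    {
        // Hidden layer: pass X1, pass X2, and X1 AND X2.
        double[][] hidden =
        {
            new[] { -0.5, 1.0, 0.0 },  // passes X1 through
            new[] { -0.5, 0.0, 1.0 },  // passes X2 through
            new[] { -1.5, 1.0, 1.0 },  // AND
        };
        // Output layer: h(x) = -1 + 2*X1 + 2*X2 - 4*X3.
        double[][] output = { new[] { -1.0, 2.0, 2.0, -4.0 } };

        foreach (var (x1, x2) in new[] { (1, 1), (1, 0), (0, 1), (0, 0) })
        {
            int[] h = FeedLayer(hidden, new[] { x1, x2 });
            int y = FeedLayer(output, h)[0];
            Console.WriteLine($"X1={x1} X2={x2} Y={y}");
        }
    }
}
```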

We have demonstrated that adding just 1 Perceptron in addition to the output Perceptron can add considerable power to the network.

There are many algorithms for ANN training, and the ANN itself comes in many types as well. We will discuss the most common types and algorithms in future articles.

However, it all starts from the Perceptron concept and builds upon it; hence, it is important to understand the Perceptron in as much detail as possible, even though the examples might not seem complicated.

History

  • 17th September, 2017: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)