This example is dedicated to make machines learn the concept Carnivorous animal
Why Carnivorous animals?
A Carnivorous animal is a concept because it is group of things that got a common attributes and these attributes are valid for every member of the group And you can't find a bigger group of common attributes for this group of things
So if mammals group is M
And common attributes group is C
Then the concept mammals is M, C
Example
If we can have another attributes group C1 and C1 attributes are common in all M members C1 > C
Then M, C IS NOT a concept
Another example let us say concept TABLE, because we already know that every table got one plane and 4 legs at least, but also Chair got one plane and 4 legs!
Now, we will not destroy concept TABLE but we will say Chair is negative example
Because it got 4 legs and it is not a TABLE, and coffee table is positive example
Because it got 4 legs and it is a TABLE
See if we can make our program understands concept TABLE then we can make the antivirus understands concept Malicious Code
And we will not need to give the program any clever Semantic; the program should understand what is happening by itself
We will supply the positive and negative examples and let the program learn
The concept, after that we will ask the program about something, and it will answer us, and will give us the reason of its answer
We can supervise the learning by correcting system answers but I didn't get so far with the code
Valid concept we say concept M, C is valid {where M is objects, C is attributes} If it contains enough positive examples what is enough? We will get to that later
Coherent concept we say that valid concept is Coherent if it has no negative examples, and because it is difficult to find a Coherent concept we say
Semi Coherent has a very little amount of negative examples
Pertinent concept valid and Semi Coherent concept is called Pertinent concept
The Biggest Pertinent concepts Now you can guess, we are looking for
Benefits unlike neural networks, version space does not need a lot of positive and negative examples to learn And as you will see Bordat is capable of justification but NN can't
Definition of attributes' values
Because the network that we are going to build is a binary one, then the
Attributes' values belong to {0, 1}
And the attributes that will be supplied by new data table
Should be chosen carefully, so the attributes must be comprehensive and sufficient
And you should take enough time to make new data table because a big data table will make the program hang
The program constants
Because your table will be probably small to make learn process faster you should give small proper values for Alpha, Beta, Theta _Alpha, Theta _Beta
What do they do?
Alpha, Beta are the limits mentioned in the next PDF file that will control
Function "validity" which will judge concepts as we mentioned above
Theta _Alpha, Theta _Beta are used in decision system to accept or refuse objects
They are set according to the number of total attributes in the table And
They are extremely sensitive to the number of the resulted proper attributes
When we finish learning we will reach ideal attributes for our concepts
Identifying objects
The new objects will be offered by the user who will tell us the attributes
Of his objects We will compare his attributes with our ideal attributes
That we got from BORDAT, so we will use counter variable
Iteration will scan the user attributes and compare each one with ours
When they match we increase the counter
When they are different we decrease the counter
After this iteration we will compare our counter with Theta_Alpha
And Theta_Beta
If the counter is smaller than Theta_Beta limit to reject the odd objects then the user object doesn't belong to our concept
If the counter is bigger than Theta_Alpha limit to accept the familiar objects then the user object belongs to our concept
Otherwise we can't tell if the object is odd or familiar this is called
System silence
Now you can imagine this situation you put small table to make learning faster
And because attributes are not that much, a big Theta_Beta value will cause the program to reject all objects and a small Theta_Alpha will make the program
Accept all objects
Changing Alpha, Beta values will change Bordat's behavior
Huge Alpha will make all concepts not valid, and negative beta will not let any
Concept to be valid and solid
I think it is clever not to play with their values
Simply, how does Bordat work?
The input is a table of objects and attributes
The major concept is All objects, No attributes, we will start with that
Because we already know that there is no shared attributes between the objects, and if there are shared attributes then it is better to drop them because they are neutral and they will not help learning the concept
We will calculate the closure of objects group conceder its cardinal is n
STAR
We will seek concepts in subgroups whose cardinal is n1
for each valid and semi coherent concept put HASSE connection between the major concept father and that new concept sun
For each valid concept recursively repeat STAR so we can seek possible valid and semi coherent concepts in its sons
We stop when there are no more suns to generate from all fathers
Optimized Bordat
You can see that Bordat has this problem
If I generated the suns of some node X and we reached a son Z which is
Considered as pertinent concept although it is a son of another pertinent father Y
Y and X are from the same level and we didn't generate sons of Y because it is pertinent
So Z is not "biggest pertinent" because there is Y Y>Z and Y is pertinent
To solve this declare public array, it will contain the biggest pertinent concepts
And every time we find a new pertinent concept we will compare it with the elements of the array, if it is not contained in any concept of this array we will add it to the array, if it is contained in some concept of this array, we will not generate its sons
In our implementation we will not draw HASSE diagram but we will seek proper
Attributes that will identify user concepts, and in our way we will see a list of constructed concepts
Future development
Justification
By comparing proper attributes with the given attributes the new object
And justifying program's behavior when object belongs mention the shared
Attributes between "Proper attributes" group and "given attributes" group
When object doesn't belong mention the Attributes in "Proper attributes" group that doesn't belong to "given attributes" group
Fuzzy logic
It is possible to use fuzzy attributes in our learning, like to describe "Reputation"
Attribute of object "Customer", we will say "Very good" or
"Extremely bad" classification application, of course we must change function
Validity and other related things
In short description, some attributes can't be answered in the classical logic space
How to use the code
One important condition is the formation of data files
The first line of the table will be used as Attributes' names
The first column will be used as objects' names
A spaces column should precede every new column, so words in every row are always separated by one space or more
The data file format most be *RTF, below are the buttons' uses
Load data
Import data from a file first step
Set + objects
Confirm your choice of positive and negative examples in the list box
Save table
Saves a new user table in a file
Set Alpha Beta Theta's values
Set new values for constants values
Start Bordat
The program will use Bordat to learn your table
Ask about learned concept
The program will ask you to choose an object by defining its list of attributes
What is this?
It will give the judge of system about your object
Note exe icon is 48 x 48 so it may not appear in your desktop till you enable large icons not necessary in vista