This article shows how to develop a low-cost, scalable AI that classifies apples by their color, height and diameter into categories of red and green apples, and then allocates a corresponding sticker based on each apple's actual size.
Introduction
AI is amazing, and it is very exciting to learn and build new solutions using artificial intelligence.
In this article, we will develop a low-cost, scalable AI that classifies apples by their color, height and diameter into categories of red and green apples, and then allocates a corresponding sticker based on each apple's actual size.
Moreover, as some apples may fall below quality standards, the AI will also decide for us which apples to reject for being oversized or undersized.
For the purpose of our example, our custom sticker labels will consist of #Small R, #Mid R and #Big R, or #Small G, #Mid G and #Big Green, as shown in Figure 1.0 below.
This article also covers which sensors you can use to automate the measurements, and then provides the code for developing the AI.
The end goal is to create an AI that replicates human thinking and decides which sticker to apply to the apple being evaluated.
Figure 1.0
The table below lists the six categories of apple stickers that we are looking to work with.
Red apple Category | Green apple Category |
#Small R | #Small G |
#Mid R | #Mid G |
#Big R | #Big Green |
Background
Who will this Article Help?
This article is intended to help farmers and small reseller enterprises who wish to use AI-based automation to add value to their products, increase productivity and expand their operations, while keeping the labour cost of sorting and labeling fruit to a minimum.
Branding and labeling have become an expected part of selling produce, as they are an important factor in distinguishing products of value. People prefer to buy labeled products over unlabeled ones.
In addition, small farmers and micro enterprises exist in large numbers, yet no low-cost solution is available to help them develop automated fruit labeling and sorting systems that also take the diameter and height of the produce into account.
How will the AI Component Help?
Normally, when apples are picked, they come in all shapes and sizes and have to be further sorted and categorized during processing. With AI, the sorting and labeling become automated, which helps by:
- Saving labour costs
- Saving significant equipment purchasing costs
- Increasing product processing speed, which in turn increases delivery speed
- Providing a fully reusable AI that can be trained to identify and label other fruits and vegetables
What will this Article Provide?
It recommends sensors that you could use to collect the measurements.
It provides code and explains how to build, test and use the AI component.
Processing Fruit Sorting and Labeling Without AI
In order to plan our AI build, we need to first identify the issues that it needs to solve:
As the above illustration shows, it would be very time-consuming, tiresome and resource intensive to manually sort, measure and label each apple across the 6 sticker categories.
Since no tree produces apples of uniform size, imagine having 30,000 apples of various sizes and colours to process.
As a farmer, you may end up facing many additional challenges such as:
- trying to find temporary workers to accommodate the workload
- fruits rotting and getting spoilt because of slow processing times
- missing supply deadlines due to slow packaging and labeling, since you and your workers cannot operate around the clock
- being prone to errors in labeling
- increased cost of labeling as the more workers you hire, the more manual labeling machines you would need to buy
Getting Started with Artificial Intelligence
To get started, the first thing I would like to do is replicate my thinking and decision-making ability on a machine, in our case a Windows 10 PC with an Intel Core i7 CPU and 4 GB of RAM. This means setting up a development environment, choosing an algorithm and providing training data for it to learn from.
Once the machine has finished its training, we provide it with some testing data to see how accurately it can replicate my abilities and what its error rate is.
Finally, if our AI model's error rate is low and its accuracy is reasonably high, we will consider development complete and start using it in production.
Challenges of Automating Height and Diameter Measurement
Before we dive into developing the AI, it is important to establish how the apple's colour will be determined and how its diameter and height will be measured.
Tutorials often explain how to build an AI but give no clue about how to collect data cheaply and automatically, which is the most important part.
Below, we explain which sensors we plan to use to collect data.
If we cannot collect data feasibly for the AI to process, its development is pointless, because it makes no sense to overspend our budget just to put the AI to use.
A major problem lies ahead: how do we measure an apple's diameter and height using sensors?
Is there a cheap, cost-effective sensor available that is easy to use and easy to replace?
Right now, little information exists on how to measure the diameter of an object easily without resorting to intermediate mathematics and computer vision with extensive calibration. For our scenario, we developed a solution that approximates the diameter of the apple using three ultrasonic sensors, namely the HC-SR04.
The concept is illustrated in Figure 1.1 below.
Note: Ultrasonic sensor image taken from fritzing:
Measuring an Apple's Diameter
To measure the diameter of the apple, we place the left and right ultrasonic sensors facing each other, 30 centimeters apart, and denote this gap as Total Distance.
The apple is then brought to the center point between the two sensors by a conveyor belt.
Next, we fire each ultrasonic sensor in sequence to get a reading from each side.
In this example, the left sensor gives a reading of 11.8 cm, and the right sensor also reads 11.8 cm.
Finally, we subtract the left-hand and right-hand sensor readings from the Total Distance between the 2 sensors.
The result of these operations is the approximated diameter of our apple.
This can be represented using the formula below:
Approximated Diameter = Total Distance - (Left Sensor reading + Right Sensor reading)
6.4 cm = 30 cm - (11.8 cm + 11.8 cm)
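The formula above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the echo-time conversion assumes sound travels at about 343 m/s in air, which is not part of the original setup description.

```python
# Speed of sound in air at about 20 °C, in centimeters per second (assumed).
SPEED_OF_SOUND_CM_PER_S = 34300.0


def echo_to_distance_cm(echo_seconds):
    """Convert a round-trip echo time into a one-way distance in cm."""
    # The pulse travels to the apple and back, so halve the round trip.
    return echo_seconds * SPEED_OF_SOUND_CM_PER_S / 2.0


def approximate_diameter_cm(total_distance_cm, left_cm, right_cm):
    """Total Distance minus the sum of the left and right readings."""
    diameter = total_distance_cm - (left_cm + right_cm)
    if diameter <= 0:
        raise ValueError("readings span the whole gap; is an apple present?")
    return round(diameter, 2)


# The article's worked example: 30 cm gap, 11.8 cm read on each side.
print(approximate_diameter_cm(30.0, 11.8, 11.8))  # prints 6.4
```

Rounding to two decimals simply hides floating-point noise; the sensors themselves are unlikely to be that precise.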
Measuring an Apple's Height
Similarly, to measure the height of the apple:
We place the top ultrasonic sensor 15 centimeters above the ground, a distance we denote as Ground Height, and then fire it in sequence with the other sensors.
Next, we take the reading from the top sensor and subtract it from our Ground Height.
This can be represented using the below formula:
Apple Height = Ground Height - Top sensor reading
6.8 cm = 15 cm - 8.2 cm
The result of these operations is the approximated height of our apple.
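The height formula can be sketched the same way. Taking the median of several readings is an assumption added here to smooth out a stray echo; the article's setup only describes a single reading, and the function name is hypothetical.

```python
from statistics import median


def apple_height_cm(ground_height_cm, top_readings_cm):
    """Ground Height minus the (median) top sensor reading."""
    # Using the median of a few readings is an added assumption, meant to
    # damp the occasional stray echo; a single reading also works.
    top_reading = median(top_readings_cm)
    height = ground_height_cm - top_reading
    # A reading at or beyond the ground height means nothing is on the belt.
    return round(max(height, 0.0), 2)


print(apple_height_cm(15.0, [8.2]))            # prints 6.8
print(apple_height_cm(15.0, [8.1, 8.2, 8.4]))  # median is 8.2, prints 6.8
```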
Of course, you can use laser distance sensors instead of ultrasonic sensors. I chose the ultrasonic sensor since it is cheap, easily replaceable and readily available in my area.
The setup can easily be changed to accommodate other fruits or items with a larger height and diameter, without the need for complex calculations.
This conceptual setup is very cheap and lets you quickly adjust your sensors for other items that may need measuring, such as pumpkins, dalo, potatoes, pineapples, rock melons and passion fruits, wherever height and diameter are important factors in pricing or quality.
Note: Cliparts under CC0 license no attribution required.
Detecting the Color of an Apple
The next cheap, cost-effective sensor, which we plan to use to detect the color of the item, is the TCS3200 color sensor.
We measure the R (red) and G (green) values to tell apart the colors of the apples.
Since our scenario is a single production line that processes two varieties, red and green, color detection is needed for red and green values only.
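As a rough sketch of that red-versus-green decision, the snippet below compares the two channel values and returns the numeric color codes the article assigns later (1 for red, 2 for green). It assumes the raw TCS3200 output has already been converted so that a larger number means a stronger channel; the function name and the `min_margin` parameter are hypothetical.

```python
RED, GREEN = 1, 2  # the numeric color codes used in the article's CSV


def classify_color(red_value, green_value, min_margin=0.0):
    """Return RED or GREEN, or None when the channels are too close."""
    difference = red_value - green_value
    if abs(difference) <= min_margin:
        return None  # ambiguous reading; the caller may re-measure
    return RED if difference > 0 else GREEN


print(classify_color(200, 80))       # prints 1 (red)
print(classify_color(60, 180))       # prints 2 (green)
print(classify_color(100, 104, 10))  # prints None
```

A real sensor would need calibrating under the production line's lighting before a margin could be chosen.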
In our concept, once all the sensors have collected the relevant data (color, diameter and height), we post it to our AI model for evaluation.
Having determined which sensors can automate the sensory input for evaluation, we can now proceed to create our AI, which is the focus of this article.
Developing the AI
Artificial intelligence is our attempt to replicate human-like thinking on a machine.
To do this, we must first create training data based on how we currently do the work.
In our case, we currently sort and measure each apple, detect its color and then apply an appropriate sticker.
To train our AI to think like us, we will write down, say, 600 observations of our manual labeling process across the various categories. In a CSV file, we note the 3 main parameters that we use for decision making (diameter, height and color) along with our classification.
Below is a small excerpt of what our CSV file looks like:
Diameter_cm | Height_cm | Color | classification |
6.42 | 6.8 | R | #BigR |
5.46 | 6.0 | R | #MidR |
4.69 | 5.5 | G | #MidG |
If you take a look at the Color and classification columns, you will notice that they hold non-numeric (varchar) values.
To proceed, we convert this text into numeric form, since the classifier we use later works on numbers.
In our case, we assign the numeric value 1 to red and 2 to green under the Color column.
We likewise assign a number to each of the 7 categories into which an apple can be classified:
Red apple Category | Green apple Category |
#Small R = 1 | #Small G = 2 |
#Mid R = 3 | #Mid G = 4 |
#Big R = 5 | #Big Green = 6 |
Rejected = 7
After representing the categorical data in numeric form, our final CSV looks like this:
Diameter_cm | Height_cm | Color | classification |
6.42 | 6.8 | 1 | 5 |
5.46 | 6.0 | 1 | 3 |
4.69 | 5.5 | 2 | 4 |
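If you would rather not convert the columns by hand, the same encoding can be sketched with pandas. The three rows come from the excerpt above; the label spellings that the excerpt does not show (`#SmallR`, `#SmallG`, `#BigGreen`) are assumptions, so adjust them to match your own file.

```python
import pandas as pd

# The three rows from the article's CSV excerpt, before encoding.
raw = pd.DataFrame({
    "Diameter_cm": [6.42, 5.46, 4.69],
    "Height_cm": [6.8, 6.0, 5.5],
    "Color": ["R", "R", "G"],
    "classification": ["#BigR", "#MidR", "#MidG"],
})

COLOR_CODES = {"R": 1, "G": 2}
# Spellings not shown in the excerpt are assumed for illustration.
CLASS_CODES = {
    "#SmallR": 1, "#SmallG": 2,
    "#MidR": 3, "#MidG": 4,
    "#BigR": 5, "#BigGreen": 6,
    "Rejected": 7,
}

encoded = raw.assign(
    Color=raw["Color"].map(COLOR_CODES),
    classification=raw["classification"].map(CLASS_CODES),
)

# map() leaves NaN for any text it does not recognise, so check for that.
if encoded.isna().any().any():
    raise ValueError("found a color or label with no numeric code")

print(encoded)
```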
Next, we take another 600 observations and reserve them for testing. Notice that for testing we only use the 3 inputs: the diameter, height and color of the apple.
Below is a small excerpt of what our testing CSV file looks like:
Diameter_cm | Height_cm | Color |
6.42 | 6.8 | 1 |
5.46 | 6.0 | 1 |
4.69 | 5.5 | 2 |
When we begin testing, the AI will sometimes get the answers wrong and misclassify apples, so we continue training it until it learns to classify correctly.
The share of apples that the AI misclassifies is known as the error rate. As we continue to train the AI, it gets better and its predictions become more accurate.
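To make the error rate concrete, here is a minimal sketch in plain Python. The five category codes are invented purely for illustration and the function name is hypothetical.

```python
def error_rate(actual, predicted):
    """Fraction of observations whose predicted category is wrong."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Made-up category codes: the fourth apple is misclassified.
actual = [5, 3, 4, 1, 7]
predicted = [5, 3, 4, 2, 7]

rate = error_rate(actual, predicted)
print(f"error rate: {rate:.0%}, accuracy: {1 - rate:.0%}")
```

Accuracy is simply one minus the error rate, which is why the two figures are usually reported together.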
Coding the AI
Prerequisites
Install the Anaconda development environment
Install Python 3.5.2
Install scikit-learn
Install pandas
Importing our Dependencies
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
Next, load our observed data called apples.csv.
df=pd.read_csv('apples.csv')
If you would like to see the first few rows of the file along with its headers, you may use:
print (df.head())
Next, specify the columns, meaning the input features, that you wish to select from the CSV file.
In our case, we will use 3 input features of our apple: its Diameter, Height and Color.
input_features = df.loc[:, ['Diameter_cm', 'Height_cm', 'Color']]
Next, we select the column that contains the conclusion we reached after referencing the input features. That is, we looked at the apple's Diameter, Height and Color and then classified it into one of the following categories: #Small R, #Mid R, #Big R, #Small G, #Mid G, #Big Green or Rejected.
targetted_output = df.loc[:, 'classification']
Next, we proceed to split the rows inside our apples.csv file into Training and Testing data.
In our case, we take a test_size of 0.3, which reserves 30% of the rows for testing and leaves 70% for training our AI.
train_x_data, test_x_data, train_Y_data, test_Y_data = train_test_split(input_features, targetted_output, test_size=0.3, random_state=100)
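To see what the split does, the sketch below runs `train_test_split` on ten made-up rows and prints how many land on each side. The measurements and classifications are invented for this demonstration and are not taken from apples.csv.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Ten invented observations, already in numeric form.
demo = pd.DataFrame({
    "Diameter_cm": [6.4, 5.5, 4.7, 6.9, 5.1, 4.2, 7.3, 5.8, 4.5, 6.1],
    "Height_cm": [6.8, 6.0, 5.5, 7.0, 5.6, 4.9, 7.4, 6.1, 5.0, 6.5],
    "Color": [1, 1, 2, 2, 1, 2, 1, 2, 1, 2],
    "classification": [5, 3, 2, 6, 3, 2, 5, 4, 1, 6],
})

features = demo[["Diameter_cm", "Height_cm", "Color"]]
target = demo["classification"]

train_x, test_x, train_y, test_y = train_test_split(
    features, target, test_size=0.3, random_state=100
)

# With 10 rows and test_size=0.3, 7 rows train the model and 3 test it.
print(len(train_x), "training rows,", len(test_x), "testing rows")
```

Fixing `random_state` makes the shuffle repeatable, so the same rows end up in the test set on every run.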
Next, initialize the Decision Tree classifier. Here max_leaf_nodes caps the number of leaf nodes, meaning final decisions, that the tree is allowed to grow. For our case we choose 8; you may change this value to try to increase your accuracy.
apples_tree=DecisionTreeClassifier(max_leaf_nodes=8,random_state=0)
With the one simple line below, we train our AI. The tree uses the Gini impurity criterion, which is scikit-learn's default.
clf=apples_tree.fit(train_x_data,train_Y_data)
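The suggestion to vary max_leaf_nodes can be tried out as in the sketch below. Since apples.csv is not reproduced here, the sketch generates 600 synthetic apples using made-up size cut-offs (4, 5, 6 and 8 cm); those thresholds, the helper `make_label` and the candidate values 4, 8 and 16 are all assumptions for illustration only.

```python
import random

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_label(diameter_cm, color):
    """Label a synthetic apple using made-up size cut-offs (demo only)."""
    if diameter_cm < 4.0 or diameter_cm > 8.0:
        return 7  # Rejected
    size = 0 if diameter_cm < 5.0 else 1 if diameter_cm < 6.0 else 2
    return 2 * size + color  # 1..6, matching the article's category codes


rng = random.Random(0)
rows, labels = [], []
for _ in range(600):
    diameter = rng.uniform(3.5, 8.5)
    height = diameter + rng.uniform(-0.5, 0.5)
    color = rng.choice([1, 2])
    rows.append([diameter, height, color])
    labels.append(make_label(diameter, color))

train_x, test_x, train_y, test_y = train_test_split(
    rows, labels, test_size=0.3, random_state=100
)

# Train one tree per candidate size and record its test accuracy.
scores = {}
for leaf_count in (4, 8, 16):
    tree = DecisionTreeClassifier(max_leaf_nodes=leaf_count, random_state=0)
    tree.fit(train_x, train_y)
    scores[leaf_count] = accuracy_score(test_y, tree.predict(test_x))
    print(leaf_count, "leaf nodes ->", round(scores[leaf_count] * 100, 2), "%")
```

A tree with fewer leaf nodes than categories cannot predict every category, so very small values are expected to score poorly; on real data the best value has to be found by experiment.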
Finally, let's take the 30% of our data that was reserved for testing and actually test the accuracy of our AI.
test_predictions = apples_tree.predict(test_x_data)
print("Accuracy is", accuracy_score(test_Y_data, test_predictions) * 100)
After testing is complete, the accuracy is displayed. In our case, it was around 93.47%, with room for improvement.
That is good enough for our case, so with testing done, let's move on to meaningfully using our AI on a single apple.
For our scenario, we want to apply a sticker or reject an apple for being undersized or oversized. We do this by feeding a single array containing the Diameter, Height and Color to the predict() function.
apple_prediction=apples_tree.predict([[6.4,6.5,1]])
The result of the prediction is a big red apple, so the #Big R sticker is to be applied, which is correct.
Using very simple if-then-else logic, we print our AI's prediction of which sticker to apply.
Of course, in production, we would remove the print statements and call subroutines for the hardware that applies the actual sticker onto the apples.
label = apple_prediction[0]  # predict() returns an array; take its single value
if label == 1:
    print("#Small R")
elif label == 2:
    print("#Small G")
elif label == 3:
    print("#Mid R")
elif label == 4:
    print("#Mid G")
elif label == 5:
    print("#Big R")
elif label == 6:
    print("#Big Green")
elif label == 7:
    print("Rejected")
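As the number of categories grows, a lookup table can stand in for the chain of conditions, and it gives a natural place to call the hardware. The sketch below is one possible shape: `apply_sticker` and `divert_to_reject_bin` are hypothetical placeholders that only print, since the real subroutines depend on your labeling machine.

```python
STICKERS = {
    1: "#Small R", 2: "#Small G",
    3: "#Mid R", 4: "#Mid G",
    5: "#Big R", 6: "#Big Green",
}
REJECTED = 7


def apply_sticker(label):
    """Hypothetical stand-in for the routine that drives the labeler."""
    print("apply sticker:", label)


def divert_to_reject_bin():
    """Hypothetical stand-in for the routine that removes an apple."""
    print("reject apple")


def handle_prediction(code):
    """Dispatch one predicted category code and return the action taken."""
    code = int(code)  # accepts plain ints and NumPy integers alike
    if code == REJECTED:
        divert_to_reject_bin()
        return "Rejected"
    if code not in STICKERS:
        raise ValueError(f"unknown category code: {code}")
    apply_sticker(STICKERS[code])
    return STICKERS[code]


handle_prediction(5)  # prints "apply sticker: #Big R"
```

With the trained model, this would be called as `handle_prediction(apple_prediction[0])`.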
Points of Interest
Thanks very much for reading my article. If you like it, please leave a rating.
I learned that it takes more than just one component to build a meaningful product, and that until now no cheap solution existed for farmers to apply stickers based on the size of a fruit's diameter.
History
- 31st December, 2018: Added pictures and notes