Keras UI: A GUI to Manage Image Classification

Daniele Fontani

13 May 2019 (CPOL)

A tool for managing CNN training, built on Django, that helps to understand image classification

Introduction

Image classification reminds me of my college days, when I took many AI courses and played with it for the first time. Back then it was a very experimental field: to use it in a project, you often had to implement a lot yourself and fight with the code before you had something to test.

A few years ago, when I had the opportunity to play with it again, I found a very different scenario. Everything that was “experimental” in my college days is now “standard”, and the academic topics have become libraries. When I discovered OpenCV, TensorFlow and Keras, I was amazed but scared at the same time. You can put together a few lines of code copied from the internet and have a trained network without knowing anything about AI. That’s the power of progress.

In this article, I wanted to go a step further. In most cases, the image classification process with AI is much the same. You have some samples (images), each one labeled with a description of what it shows. For example, you may have 1000 animal images, and you must know which animal is in each one. Then you have to design a model that is able to learn from the samples the rules for deciding what is what. After that, you need to train the model, maybe making more than one attempt and tuning the parameters. This process is quite long, even though hardware progress makes it nothing compared to my college days. So relax and wait until the training process finishes. Finally, you have a trained model that is able to predict what is in an image based on what you taught it.

Well, why not create a web application to manage all this? Imagine a tool that allows you to upload images and define their labels. In this application, you can also define the model and start the training. In the end, you can download the model or consume it through web services, just sending an image and getting back the result.

In this scenario, you do not need a dev environment to play with AI. Just open the browser, define the model, click the training button and wait for the result.

Basically, that’s the project I built for this article and, because it sits on top of Keras, I called it KerasUI.

Github project

I will also take the opportunity to introduce image classification to newcomers.

Article Roadmap

Image Classification Explained to My Grandma
Keras UI: How It Works
Tutorial
Points of Interest

Note: This project is part of the image classification contest on CodeProject. Special thanks to Ryan Peden for the article "Cat or not", where I learnt the basics of managing the training process and got the images to test the tool.

Image Classification Explained to My Grandma

Image classification is the task of assigning to an image one label from a fixed set of categories. It is one of the core problems in computer vision and, despite its simplicity, has a large variety of practical applications. In plain words, if you give the computer an image of a dog, you want it to tell you “it’s a dog”.

This problem can be solved using artificial intelligence and computer vision. Computer vision helps to manipulate and preprocess images to get them into a form the computer can use (from a bitmap to a matrix of relevant values). Once the input is in a good form, you can apply an algorithm to predict the result.

The most common solution nowadays is to use a CNN (Convolutional Neural Network). This kind of neural network is very well suited to image processing and is trained against the dataset you provide.

A dataset is just a list of samples, each one labeled. The main point is that you teach the machine how to decide by example. Usually, the dataset is divided into a training set, a validation set, and a test set. You train the network on the training set and check how it behaves on the validation set, tuning until it works as expected. Finally, if you want objective feedback, you must use data the network has never seen during this process: the test set. This is required mostly because, if you always let the network train on the same data, it will drive the error down on those samples but will work only with the samples you provided. So, if you feed it something a little bit different, you won't get a good result. That’s called overfitting, and it is something to avoid because it means the network didn’t abstract the rules but just repeated what you told it. Think about the math expression 2*5+10: it’s like remembering that the result is 20 instead of being able to evaluate it.
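To make the split concrete, here is a minimal sketch in plain Python that shuffles a labeled list and cuts it into three disjoint parts. The function name, the 70/15/15 proportions and the sample file names are illustrative assumptions, not something KerasUI does for you.

```python
import random

def split_dataset(samples, train=0.7, validation=0.15, seed=42):
    """Shuffle labeled samples and split them into three disjoint lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)      # fixed seed: reproducible split
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical dataset: 100 (file name, label) pairs.
samples = [(f"img_{i}.jpg", "cat" if i % 2 else "dog") for i in range(100)]
train_set, validation_set, test_set = split_dataset(samples)
print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before cutting matters: if the samples are stored label by label, an unshuffled split could leave one part without any example of a category.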

CNN: The Convolutional Neural Networks

The architecture of a CNN (Source)

The Input

In image classification, we start from... images! It is not hard to see that an image is a two-dimensional matrix (width x height) composed of pixels. Each pixel is made of 3 values in RGB: red, green and blue. For a CNN, it is convenient to separate the 3 channels, so the final input matrix representing your image will be image_size x image_size x 3. Of course, if you have a black & white image, you don't need 3 channels but only one, so you'll have image_size x image_size x 1. If you also consider that your dataset is composed of N items, the whole input matrix will be N x image_size x image_size x 3. The size must be the same for all images and should not be full HD, to avoid excessive processing time. There are no written rules for this, and it is often a compromise: 256x256 may be a good value in some cases; in others, you will need more resolution.
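The shapes described above can be checked with a toy example. This is only a sketch with nested Python lists (a real pipeline would use NumPy arrays); the helper names and the tiny 4x4 image size are assumptions made for illustration. The grayscale weights are the usual ITU-R 601 luma coefficients.

```python
def shape_of(nested):
    """Return the dimensions of a regular nested list, outermost first."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

def to_grayscale(image_rgb):
    """Collapse each [r, g, b] pixel to a single-channel [luma] value."""
    return [[[round(0.299 * r + 0.587 * g + 0.114 * b)] for r, g, b in row]
            for row in image_rgb]

IMAGE_SIZE = 4   # tiny on purpose; the article suggests something like 256
N = 3            # number of samples in this toy dataset

# N images, each IMAGE_SIZE x IMAGE_SIZE pixels with 3 channels (R, G, B).
dataset_rgb = [[[[255, 128, 0] for _ in range(IMAGE_SIZE)]
                for _ in range(IMAGE_SIZE)] for _ in range(N)]
dataset_gray = [to_grayscale(img) for img in dataset_rgb]

print(shape_of(dataset_rgb))   # (3, 4, 4, 3)
print(shape_of(dataset_gray))  # (3, 4, 4, 1)
```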

The Convolution

Inside images, there are many details, hints, and shades that are not relevant for the network. All that detail can confuse the training, so the main idea is to simplify the image while keeping all the data that carries information. This is intuitive and easy to say, but how does it work in practice? The CNN uses a convolutional step, the core of this method, to reduce the size of the image while keeping its most relevant parts. The convolution layer has this name because it computes the convolution between a sliding piece of the matrix and a filter. The filter and the piece of the matrix being analyzed have the same size. The filter is called the kernel. To make the matrix size suitable for the kernel size, the matrix can be padded with zeroes around its borders. The convolution produces one scalar value for each position of the kernel. This can reduce the size: for example, a 4x4 kernel moved in non-overlapping steps (a stride of 4) over a 32x32 matrix (1024 elements) produces an 8x8 output (64 elements). The size of the kernel affects the final result, and it is often better to keep the kernel small and chain multiple convolutional layers, with some pooling layers in between (I'll talk about pooling later).

The Convolution operation. The output matrix is called Convolved Feature or Feature Map. Source
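As a rough sketch of the sliding-window computation, the function below multiplies the kernel with each window of a square matrix and sums the products. It assumes no padding and a square kernel, and, like most CNN libraries, it does not flip the kernel. With these assumptions, the output side is (n - k) // stride + 1.

```python
def convolve2d(matrix, kernel, stride=1):
    """Slide `kernel` over a square `matrix` (no padding); return the feature map."""
    k = len(kernel)
    out_size = (len(matrix) - k) // stride + 1
    feature_map = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0
            for a in range(k):
                for b in range(k):
                    total += matrix[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)                 # one scalar per kernel position
        feature_map.append(row)
    return feature_map

image = [[1] * 6 for _ in range(6)]           # 6x6 matrix of ones
kernel = [[1] * 3 for _ in range(3)]          # 3x3 kernel of ones

overlapping = convolve2d(image, kernel, stride=1)   # windows overlap
tiled = convolve2d(image, kernel, stride=3)         # windows do not overlap
print(len(overlapping), len(tiled))
```

The stride decides how strongly the layer shrinks its input: moving the kernel one cell at a time barely reduces the size, while moving it by its own width divides each side by the kernel size.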

The Pooling

The pooling layer is used to reduce the matrix size. There are many approaches, but the basic idea is: take a set of adjacent values and keep only one. The most common algorithm is max pooling, where you take the largest element of the set.

The Max pooling operation. Source, from O'Reilly media
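Max pooling is simple enough to write by hand. The sketch below assumes non-overlapping square windows and a matrix whose sides are multiples of the window size; the sample values are made up.

```python
def max_pool(matrix, size=2):
    """Replace each non-overlapping size x size window with its largest value."""
    pooled = []
    for i in range(0, len(matrix) - size + 1, size):
        row = []
        for j in range(0, len(matrix[0]) - size + 1, size):
            window = [matrix[i + a][j + b] for a in range(size) for b in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
print(max_pool(feature_map))  # [[6, 4], [8, 9]]
```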

The Fully Connected Layer

The final step is the neural network itself. Up to this point, we have only done "deterministic" operations, just algebraic computation. In this step, we have the real artificial intelligence. Everything done before had the sole purpose of generating data that the network can understand.
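For intuition, a fully connected layer followed by a softmax can be sketched in a few lines of plain Python. The weights and inputs below are invented numbers; in a real network they are learned during training.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: every output is a weighted sum of all inputs."""
    return [sum(x * w for x, w in zip(inputs, neuron_weights)) + b
            for neuron_weights, b in zip(weights, biases)]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [v - max(values) for v in values]   # improves numerical stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]

features = [0.5, 1.0, -0.5]          # flattened output of the previous layers
weights = [[0.2, 0.8, -0.5],         # weights feeding output neuron 0
           [0.5, -0.9, 0.3]]         # weights feeding output neuron 1
biases = [0.1, -0.1]

probabilities = softmax(dense(features, weights, biases))
print(probabilities)
```

Each output neuron corresponds to one label, and the softmax makes the outputs comparable as probabilities.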

Keras UI: How It Works

The process for training a neural network is often the same. I can summarize it in:

1. Define the dataset
2. Build a model (network configuration, preprocessing layers...)
3. Train
4. Check whether it works; if not, go back to step 1

It isn't a deterministic process: you may need to iterate many times before having something that works. In most cases, people build scripts and run them manually until they are satisfied. This approach works but has some limitations:

You work locally, so you consume your own physical resources. You may be limited by your hardware, or you may be forced to keep the computer on all night until the training finishes.
It is hard for a colleague to run their own tests, maybe because the whole dataset or the trained model sits on your PC.

So, my idea was to create a web application that can manage this process, leaving you free to change the model definition as if you were on your local machine. With a solution like this, you can start the training, shut down the PC and go out for a walk, because all the load is on the server side. You can even use a cheap PC or a smartphone to train a neural network.

Moreover, a system like this has another big benefit. Although the training process is long, heavy and not deterministic, the final result is a model with its weights (just a single file, using Keras). So why bother with a server if, once training is done, the hard part is over? The problem is that you would embed your model into the app. Maybe it is good enough for now, but what about the future? You may want to deliver a better model or relieve the user's device of the prediction load.

That's why, when designing KerasUI, I kept the possibility of exposing the model via an API. In this way, you can send the image and get back the result. All the computation happens on the server side, so you can change it as you want without any impact on the caller.

Usage

Run standalone.bat (on Windows) or sh standalone.bat (on UNIX). This installs the requirements, applies the migrations and runs the server; the same script works on both systems.
Create the admin user using python manage.py createsuperuser
Open http://127.0.0.1:8000/ in the browser

How to Manage Dataset

Keras UI allows uploading dataset items (images) into the web application. You can do it one by one or add a zip file with many images in one shot. It manages multiple datasets, so you can keep things separate. After the images are loaded, you can click the training button to run the training process. This will train the model you have defined without any further interaction from you. You will get back the training result and, if you are finicky, you can go to the log file and see what the system printed.

How to Test Using Web UI

So that you don't lose sleep over it, I provided a simple form where you can upload your image and get the result.

How to Use Django API UI or Postman to Test API

Everything you have seen so far in the web UI can be replicated using the API.

API Usage

This application uses OAuth2 to authenticate requests, so the first step is to get a token. Many flows are supported; in this example, I used the password flow. Please remember that you have to register the OAuth application first (it is not created by default at first run).

POST to http://127.0.0.1:8000/o/token/

Headers:
Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW
Content-Type: application/x-www-form-urlencoded

Body:
grant_type:password
username:admin
password:admin2019!

Response:

{
   "access_token": "h6WeZwYwqahFDqGDRr6mcToyAm3Eae",
   "expires_in": 36000,
   "token_type": "Bearer",
   "scope": "read write",
   "refresh_token": "eg97atDWMfqC1lYKW81XCvltj0sism"
}
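The same request can be prepared from Python with the standard library only. This is a sketch, not part of KerasUI: the function name is an assumption, and the client id and secret are simply the decoded form of the example Authorization header above, so yours will differ. The Basic header is the base64 encoding of client_id:client_secret.

```python
import base64
import urllib.parse
import urllib.request

def build_token_request(base_url, client_id, client_secret, username, password):
    """Build (but do not send) the password-grant token request."""
    credentials = f"{client_id}:{client_secret}".encode("ascii")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode("ascii")
    return urllib.request.Request(base_url + "/o/token/", data=body,
                                  headers=headers, method="POST")

request = build_token_request("http://127.0.0.1:8000",
                              "s6BhdRkqt3", "gX1fBat3bV",   # example client credentials
                              "admin", "admin2019!")
print(request.get_header("Authorization"))
# To actually send it (requires the server to be running):
#   import json
#   token = json.load(urllib.request.urlopen(request))["access_token"]
```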

The prediction API accepts either a JSON POST or a form POST. In the JSON POST, the image is sent as a base64 string. Having two ways to consume the service is useful because it supports different scenarios: you can link it to a form, call it directly with wget or curl, or use it from your application.

POST http://127.0.0.1:8000/api/test/

Headers:
Content-Type:application/json
Authorization:Bearer <token>

Body:
{
  "image":"<base64 image>",
  "dataset":1
}

Response:
{
   "result": "<LABEL PREDICTED>"
}
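Building the JSON body from Python takes only the standard library. The sketch below assumes the endpoint accepts a plain base64 string in the image field, as in the request above; the function name and the fake image bytes are illustrative.

```python
import base64
import json

def build_prediction_payload(image_bytes, dataset_id):
    """Return the JSON body for the JSON variant of the prediction endpoint."""
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "dataset": dataset_id,
    })

# In real use: image_bytes = open("cat.jpg", "rb").read()
fake_image = b"\x89PNG\r\n\x1a\n...not a real image..."
payload = build_prediction_payload(fake_image, dataset_id=1)
print(payload)
```

The payload is then sent with the Content-Type and Authorization headers shown above.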

You can check the full Postman file for more information or test it directly.

Tutorial

This chapter is a walkthrough of the technical part, explaining how the tool is built and how it works. The project is built on top of Django, so it follows Django guidelines and there isn't anything out of the ordinary.

What is Django

I'll cover the most important points, assuming you can follow along even if you don't know Django yet. For those who are new to Django, all you need to know is that it is mostly a RAD framework that allows defining the data model at a high level (i.e., you can say that there is an "email" or "image" field). The model is mapped directly to the database structure and automatically bound to the UI. Of course, you can configure almost everything by implementing classes or altering templates. Anything that is not related to data can be managed manually by creating custom routes/actions.

Well, I fear I have been too simplistic, and I hope I haven't made too many enemies among Django supporters. ;-)

Anyway, you can discover it here.

The Stack

The project stack:

Python
Django framework
Keras, TensorFlow, NumPy
SQLite (or another database you like)

Tools used:

Visual Studio Code
Postman
A web browser

Project Setup

As I mentioned, the project is based on Django, so the first thing to do is to create a Django project using the CLI.

This requires Django to be installed on the system, which we can do with pip.

pip install Django

Now that we have Django on board, just create the project.

django-admin startproject kerasui

This command will produce the following structure:

root/
    manage.py
    kerasui/
        __init__.py
        settings.py
        urls.py
        wsgi.py

These files are:

The outer root directory (root/ above) is just a container for your project. The inner kerasui/ directory is the actual Python package for your project. Its name is the Python package name you’ll need to use to import anything inside it (e.g. kerasui.urls).
manage.py: A command-line utility that lets you interact with this Django project in various ways. You can read all the details about manage.py in Django-admin and manage.py.
__init__.py: An empty file that tells Python that this directory should be considered a Python package. If you’re a Python beginner, read more about packages in the official Python docs.
kerasui/settings.py: Settings/configuration for this Django project. Django settings will tell you all about how settings work.
kerasui/urls.py: The URL declarations for this Django project; a “table of contents” of your Django-powered site. You can read more about URLs in URL dispatcher.
kerasui/wsgi.py: An entry-point for WSGI-compatible web servers to serve your project. See how to deploy with WSGI for more details.

Run It

To check that everything works, just run Django with the built-in server (in production, we will use the WSGI interface to integrate with our favorite web server):

python manage.py runserver

You can also set up Visual Studio Code to run Django.

This is the launch configuration for Django:

JavaScript

{
   // Use IntelliSense to learn about possible attributes.
   // Hover to view descriptions of existing attributes.
   // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387

   "version": "0.2.0",
   "configurations": [
       {
           "name": "Python: Django",
           "type": "python",
           "request": "launch",
           "program": "${workspaceFolder}\\kerasui\\manage.py",
           "args": [
               "runserver",
               "--noreload",
               "--nothreading"
           ],
           "django": true
       }
   ]
}

Settings Configuration

Here is the basic part of the configuration, which tells Django:

To use OAuth 2 and session authentication, so that regular web users log in and use the web site and the REST sandbox, while API users get a token and query the API services
To use SQLite (you can switch to any other DB)
To add all Django modules (and our two custom ones: the management UI and the API)
To enable CORS

Python

INSTALLED_APPS = [
   'python_field',
   'django.contrib.admin',
   'django.contrib.auth',
   'django.contrib.contenttypes',
   'django.contrib.sessions',
   'django.contrib.messages',
   'django.contrib.staticfiles',
   'oauth2_provider',
   'corsheaders',
   'rest_framework',  
   'management',
   'api',
]

MIDDLEWARE = [
   'django.middleware.security.SecurityMiddleware',
   'django.contrib.sessions.middleware.SessionMiddleware',
   # CorsMiddleware must come before any middleware that can generate
   # responses, such as CommonMiddleware
   'corsheaders.middleware.CorsMiddleware',
   'django.middleware.common.CommonMiddleware',
   'django.contrib.auth.middleware.AuthenticationMiddleware',
   'django.contrib.messages.middleware.MessageMiddleware',
   'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

ROOT_URLCONF = 'kerasui.urls'

REST_FRAMEWORK = {
   'DEFAULT_AUTHENTICATION_CLASSES': (
       'rest_framework.authentication.SessionAuthentication',
       'rest_framework.authentication.BasicAuthentication',
       'oauth2_provider.contrib.rest_framework.OAuth2Authentication',
   ),
   'DEFAULT_PERMISSION_CLASSES': (
       'rest_framework.permissions.IsAuthenticated',
   ),
   'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.LimitOffsetPagination',
   'PAGE_SIZE': 10,
}

DATABASES = {
   'default': {
       'ENGINE': 'django.db.backends.sqlite3',
       'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
   }
}

First Run

Django uses a migration system that produces migration files from the models you define. To apply the migrations, you just need to run the migrate command (use makemigrations to create migration files from the models).

The user database starts empty, so you need to create the admin user to log in. This is done with the createsuperuser command.

BAT

python manage.py migrate
python manage.py createsuperuser

How It's Built

The app is separated into 3 modules:

Management part: the web UI, the models and all the core stuff
Background worker: a Django command that can be executed in the background and is used to train models against the dataset
API: this part exposes an API to interact with the application from outside. For example, it allows a third-party application to add items to a dataset. However, the most common usage is to send an image and get the prediction result

Management

To create an app on Django:

BAT

python manage.py startapp management

This will create the main files for you. In this module, what we use most are the models and their representation:

models.py: Here are all the models with their field specifications. With these class definitions alone, everything is in place for working CRUD over the entities
admin.py: This layer describes how to show and edit data with forms.

The Data Model

Our data model is very simple. Assuming that we want to train only one model per dataset (this may be a limitation if you want to reuse a dataset with multiple models...), we have:

DataSet: This contains the model, the model settings, and the name of the dataset.
DataSetItem: This contains the dataset items, so one image per row with the label attached.

Here is just a sample of the models and their representation:

Python

# from admin.py

import threading

from django import forms
from django.conf import settings
from django.contrib import admin
from django.core.management import call_command

from .models import DataSet

class DataSetForm(forms.ModelForm):
   process = forms.CharField(widget=forms.Textarea(attrs={'rows': 40, 'cols': 115}),
             initial=settings.PROCESS_TEMPLATE)
   model_labels = forms.CharField(initial="[]")
   class Meta:
       model = DataSet
       fields = ['name', 'process', 'epochs', 'batchSize', 'verbose', 'model_labels', 'model']

# Admin action: start a background training for every selected dataset
def train(modeladmin, request, queryset):
   for dataset in queryset:
       DataSetAdmin.train_async(dataset.id)

class DataSetAdmin(admin.ModelAdmin):
   list_display = ('name', 'epochs', 'batchSize', 'verbose', 'progress')
   form = DataSetForm
   actions = [train]
   change_list_template = "dataset_changelist.html"

   # Runs the "train" management command synchronously
   @staticmethod
   def train(datasetid):
       call_command('train', datasetid)

   # Runs the same command in a daemon thread so the request returns at once
   @staticmethod
   def train_async(datasetid):
       t = threading.Thread(target=DataSetAdmin.train, args=(datasetid,))
       t.daemon = True
       t.start()

admin.site.register(DataSet, DataSetAdmin)

# from models.py

class DataSet(models.Model):
   name= models.CharField(max_length=200)
   process = models.CharField(max_length=5000, default=settings.PROCESS_TEMPLATE)
   model = models.ImageField(upload_to=path_model_name,max_length=300,
           db_column='modelPath',blank=True, null=True)
   batchSize = models.IntegerField(validators=[MaxValueValidator(100), 
               MinValueValidator(1)],default=10)
   epochs = models.IntegerField(validators=[MaxValueValidator(100), 
            MinValueValidator(1)],default=10)
   verbose = models.BooleanField(default=True)
   progress = models.FloatField(default=0) 
   model_labels= models.CharField(max_length=200)

   def __str__(self):
       return self.name

Django works with a code-first approach, so we need to run python manage.py makemigrations to generate the migration files that will be applied to the database.

python manage.py makemigrations

Background Worker

The background worker is the part of the application that works on a separate thread and manages a long-running process, in this case, the network training. Django supports "commands" that are designed to be called from the UI or the shell. To create a command, you don't need much more than to create a file with a class and put it into a special folder, management/commands, inside an app.

To create the background worker, we need a module to host it, and I used the management module. Inside it, we need to create a management folder with a commands subfolder (sorry that the name is the same as the main module; I hope this is not confusing). Each file in it can be run via python manage.py <commandname> or programmatically via call_command.

In our case, we start the command in a background thread via a regular Django admin action.

This is the relevant part:

Python

class DataSetAdmin(admin.ModelAdmin): 
   actions = [train]
   # ....  

   @staticmethod
   def train(datasetid):
       call_command('train',datasetid)

   @staticmethod
   def train_async(datasetid):
       t = threading.Thread(target=DataSetAdmin.train, args=(datasetid,))
       t.daemon = True
       t.start()
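The fire-and-forget pattern used here can be tried outside Django. In this sketch, long_running_job is a hypothetical stand-in for the training command, and the Event exists only so the demo can observe when the job ends.

```python
import threading
import time

finished = threading.Event()

def long_running_job(job_id):
    """Stand-in for call_command('train', datasetid)."""
    time.sleep(0.2)          # pretend to train
    finished.set()

def start_async(job_id):
    """Start the job in a daemon thread and return without waiting for it."""
    worker = threading.Thread(target=long_running_job, args=(job_id,), daemon=True)
    worker.start()
    return worker

worker = start_async(1)
returned_early = not finished.is_set()   # the caller gets control back first
worker.join()                            # only for the demo; the admin action does not wait
print(returned_early, finished.is_set())
```

One consequence of this design is worth keeping in mind: a daemon thread is stopped abruptly when the hosting process exits, so a training run started this way does not survive a server restart.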

API

The API is created in a separate app, to keep things isolated and clearer. This also helps to separate concerns and avoid drowning in a big pot of files.

BAT

python manage.py startapp api

Basically, all CRUD models can be exposed through the API; however, you need to specify how to serialize them:

Python

class DataSetItemSerializer(serializers.HyperlinkedModelSerializer):
   image = Base64ImageField()
   dataset=   serializers.PrimaryKeyRelatedField(many=False, read_only=True)
   class Meta:
       model = DataSetItem
       fields = ('label', 'image', 'dataset')

class DataSetSerializer(serializers.HyperlinkedModelSerializer):
   class Meta:
       model = DataSet
       fields = ('name', 'process')

You also need to create a ViewSet (the mapping between the model and the data presentation):

Python

class DataSetItemViewSet(viewsets.ModelViewSet):
   queryset = DataSetItem.objects.all()
   serializer_class = DataSetItemSerializer

class DataSetViewSet(viewsets.ModelViewSet):
   queryset = DataSet.objects.all()
   serializer_class = DataSetSerializer

Finally, you need to define all the routes and map each viewset to a URL. This is enough to consume the models as an API.

Python

router = routers.DefaultRouter()
router.register(r'users', views.UserViewSet)
router.register(r'datasetitem', views.DataSetItemViewSet)
router.register(r'dataset', views.DataSetViewSet)
router.register(r'test', views.TestItemViewSet, basename='test')
urlpatterns = [
   url(r'^', include(router.urls)),
   url(r'^api-auth/', include('rest_framework.urls', namespace='rest_framework')),
]
urlpatterns += staticfiles_urlpatterns()

The Training

This is the core part of the application. While all the other topics amount to little more than a web application, this one is closer to artificial intelligence, and here we start the cool part.

The algorithm is very easy:

Take all images from the dataset.
Normalize them and add them to a labeled list.
Create the model as specified in the dataset record.
Train it.

The part that has to be handled smartly is how to let users write their own custom model and train it. My idea was to let them write their own Python code, which is then used during the training process. This may not be the best in terms of security, because the user could write anything in it, including malicious code. Well, that's true, but this is not a SaaS project, so we can assume that only trusted people have administrator access. As a consumer, instead, you only query the saved model, so that path does not execute any user-supplied code.

This is the piece of code that queries dataset items and loads images:

Python

def load_data(self, datasetid):
       self.stdout.write("loading images")
       train_data = []
       images = DataSetItem.objects.filter(dataset=datasetid)
       # distinct labels define the positions of the one-hot vector
       labels = [x['label'] for x in  DataSetItem.objects.values('label').distinct()]

       for image in images:
           image_path = image.image.path
           # skip macOS folder metadata files
           if "DS_Store" not in image_path:
               index=[x for x in range(len(labels)) if labels[x]==image.label]
               label = to_categorical([index,],len(labels))
               img = Image.open(image_path)
               # grayscale, resized to a fixed square
               img = img.convert('L')
               # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
               img = img.resize((self.IMAGE_SIZE, self.IMAGE_SIZE), Image.LANCZOS)
               train_data.append([np.array(img), np.array(label[0])])
       return train_data

Take a look at:

Python

labels = [x['label'] for x in  DataSetItem.objects.values('label').distinct()]
label = to_categorical([index,],len(labels))

This assigns an order to all the labels, e.g., ["CAT","DOG"]; then to_categorical converts the positional index to the one-hot representation. In simpler words, this makes CAT = [1,0] and DOG = [0,1].
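The one-hot idea can be reproduced without Keras. The helper below is a plain-Python sketch of what the encoding means, not the to_categorical implementation.

```python
def one_hot(label, labels):
    """Return a list with 1 at the label's position and 0 everywhere else."""
    index = labels.index(label)
    return [1 if i == index else 0 for i in range(len(labels))]

labels = ["CAT", "DOG"]
print(one_hot("CAT", labels))  # [1, 0]
print(one_hot("DOG", labels))  # [0, 1]
```

Because the position of each 1 depends on the order of the list, the same label order has to be used again when reading predictions.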

To train the model:

Python

model=Sequential()
exec(dataset.process)
model.add(Dense(len(labels), activation = 'softmax'))
model.fit(training_images, training_labels,
          batch_size=dataset.batchSize, epochs=dataset.epochs, verbose=dataset.verbose)

Note that dataset.process is the Python model definition you entered in the web admin, and you can tune it as much as you want. The last layer is added outside the user code to be sure it matches the number of labels.
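To see the mechanism in isolation, here is a sketch that runs a user-provided snippet with exec against a stand-in model. FakeSequential and the strings used as layers are hypothetical; they replace Keras only so the example runs anywhere. Unlike the code above, the snippet is given an explicit namespace, which makes it clear which names the user code can reach.

```python
class FakeSequential:
    """Stand-in for keras.models.Sequential that only records added layers."""
    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

# What an administrator might type into the "process" field (hypothetical).
user_code = """
model.add("Conv2D(32, kernel_size=3)")
model.add("MaxPool2D(pool_size=2)")
model.add("Flatten()")
"""

labels = ["CAT", "DOG"]
model = FakeSequential()
# The snippet sees the names in this dictionary plus the Python builtins,
# so this is a convenience, not a sandbox.
exec(user_code, {"model": model})
# The application, not the user, appends the output layer.
model.add(f"Dense({len(labels)}, activation='softmax')")
print(model.layers)
```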

The fit method runs the training using all the data. As called here, Keras does not hold out any samples on its own; passing validation_split to fit would reserve a fraction of the data for validation. For now, this is enough; in the future, we can plan to let the user choose the percentage of data to use in each part or mark items one by one.

Finally, we store the trained model:

Python

datasetToSave=DataSet.objects.get(pk=datasetid)
datasetToSave.progress=100
datasetToSave.model_labels=json.dumps(labels)
# Keras writes to a path, so save to a temporary file first
temp_file_name=str(uuid.uuid4())+'.h5'
model.save(temp_file_name)
# close the file before deleting it (required on Windows)
with open(temp_file_name, mode='rb') as f:
    datasetToSave.model.save('weights.h5',File(f))
os.remove(temp_file_name)
datasetToSave.save()

Note that I also save the label order, because it must be the same one used at training time to match the one-hot convention.

The Prediction

There is a common method that, given the sample and the dataset, retrieves the model, loads it and makes the prediction. This is the piece of code:

Python

def predict(image_path,datasetid):
     dataset=DataSet.objects.get(pk=datasetid)
     modelpath=dataset.model.path
     model=load_model(modelpath)
     labels=json.loads(dataset.model_labels)

     img = Image.open(image_path)
     img = img.convert('L')
     # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
     img = img.resize((256, 256), Image.LANCZOS)
     # predict returns one row of scores per input image
     result= model.predict(np.array(img).reshape(-1,256,256, 1))
     scores=result[0]

     # index of the highest score (avoids shadowing the built-in max)
     idx=0
     for i in range(1,len(scores)):
        if scores[idx]<scores[i]:
           idx=i

     return labels[idx]

The model is loaded using load_model(modelpath) and the labels come from the database. The model outputs the prediction as a list of values; the index of the highest value is chosen and used to retrieve the label assigned to that network output at training time.
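The last step, going from scores back to a label, can be sketched on its own. The scores below are made up; the label list is stored and read back as JSON in the same way as the model_labels field.

```python
import json

def pick_label(scores, labels):
    """Return the label whose score is the highest."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index]

stored_labels = json.dumps(["CAT", "DOG"])      # saved next to the model at training time
batch_output = [[0.12, 0.88]]                   # one row of scores per input image

labels = json.loads(stored_labels)
print(pick_label(batch_output[0], labels))  # DOG
```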

Points of Interest

About the Project and the Idea of Creating a UI for Keras

This project is just a toy. To become a real product, it would need a lot of work and, as a potential user, I would think twice before choosing an on-prem solution, considering how well cloud solutions work and how cheap they are for the quality of service they offer.

The project was an excuse to play again with neural networks and to learn and experiment with Keras and artificial intelligence.

About CNN, AI, and Image classification

Since I took my first steps in this field, in 2008, there have been relevant changes. My first feeling, as an amateur, is that the whole process is now "deterministic". With standard technologies and good documentation, it is easier to get a network working. Experience is still important, and I don't want to compare AI with a regular database read/write operation, but the abundance of material, tutorials and guides allows a newbie to get something working.

The credit for this goes to the big players, as usual. They make AI accessible to developers by sharing their libraries, maybe just to let us know that it's easier to consume all of it from them as a service. ;-)

Github project

History

13th May, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)