Introduction
Recently I installed Opera 5 and was impressed by its gesture UI.
Moreover, several weeks ago I noticed a discussion about it in CodeProject's Lounge. Apparently other people want to know about it too :).
IMHO, a neural network is the most suitable tool for this purpose.
As I know a little about neural networks, I tried to implement such a feature myself.
What is a neural network? Hm, it's not easy to say. A definition rephrased from Zurada, J.M.:
"Neural network software is a software which can acquire, store, and utilize experiential knowledge."
Anyone interested in the theory can go directly to one of the many neural network sites.
Here is a small list of web resources about neural networks:
Implementation
Let's return to mouse gestures.
After some research I chose a multilayer perceptron and the standard back-propagation algorithm for training.
The main problem was how to represent the input data for the neural network.
The best results came from transforming the mouse path into a vector of cosines and sines.
For example:
path {170:82 172:83 175:85 177:86 ...}
transformed into
vector {0.45 0.55 0.45 0.71 ... 0.89 0.83 0.89 0.71 ...}
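The example values are consistent with taking the sine and cosine of each segment's direction and listing all the sines first, then all the cosines: the first segment, 170:82 to 172:83, gives sin ≈ 0.45 and cos ≈ 0.89. The following is a minimal Python sketch under that assumption. The name `path_to_vector` and the sines-first layout are inferred from the example and are not taken from GestureApp's source; the sketch also skips the intermediate angle and computes the sine and cosine directly from each segment.

```python
import math

def path_to_vector(path):
    """Turn a list of (x, y) points into [sines..., cosines...].

    Assumption: one sine/cosine pair per segment between consecutive
    points, sines first, inferred from the example values above.
    """
    sines, cosines = [], []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # skip repeated points, they have no direction
        sines.append(dy / length)
        cosines.append(dx / length)
    return sines + cosines

path = [(170, 82), (172, 83), (175, 85), (177, 86)]
vector = path_to_vector(path)
print([round(v, 2) for v in vector])  # [0.45, 0.55, 0.45, 0.89, 0.83, 0.89]
```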
Recognition algorithm.
- record the mouse path
- smooth the path down to its base points
- transform the points into a vector of angles
- compute the sines and cosines of the angles
- pass the values (cosines and sines) to the network's inputs
- apply the softmax function to the network's output vector
- find and verify the winner
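The last two steps of the list above can be sketched as follows. This is an illustration only: the function names, the example scores, and the `min_probability` threshold are assumptions, since the article does not say how GestureApp verifies the winner.

```python
import math

def softmax(outputs):
    """Convert raw network outputs into probabilities that sum to 1."""
    peak = max(outputs)  # subtract the max for numerical stability
    exps = [math.exp(o - peak) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]

def find_winner(outputs, names, min_probability=0.5):
    """Return (name, probability) of the best gesture, or None.

    min_probability is a hypothetical acceptance threshold.
    """
    probabilities = softmax(outputs)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] < min_probability:
        return None  # no output is convincing enough
    return names[best], probabilities[best]

names = ["left", "right", "up"]
print(find_winner([0.1, 3.0, 0.2], names))  # clear winner: "right"
print(find_winner([1.0, 1.0, 1.0], names))  # ambiguous: None
```

With 29 outputs that a log-sigmoid squashes into the range 0..1, the softmax probabilities stay small, so a real threshold would likely need tuning to the network's actual output range.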
Neural network architecture.
- input layer : 32 synapses
- hidden layer : 32 neurons
- output layer : 29 axons (one for each gesture)
- fully connected layers
- transfer function : log-sigmoid
- incremental training algorithm, standard back-propagation method
- momentum, variable learning rate (slowly reduced)
- input noise
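To make the layer sizes listed above concrete, here is a small sketch of a forward pass through a fully connected 32-32-29 network with a log-sigmoid transfer function. The bias term, the weight range, and the use of log-sigmoid on the output layer as well are assumptions; the helper names are hypothetical.

```python
import math
import random

def logsig(x):
    """Log-sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_inputs, n_neurons, rng):
    """Fully connected layer: one weight row (plus a bias) per neuron."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]

def forward(layer, inputs):
    """Weighted sum plus bias for each neuron, passed through logsig."""
    # zip stops at the end of inputs, so the bias (row[-1]) is added once
    return [logsig(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in layer]

rng = random.Random(0)               # fixed seed, for repeatable runs only
hidden = make_layer(32, 32, rng)     # 32 inputs -> 32 hidden neurons
output = make_layer(32, 29, rng)     # 32 hidden -> 29 gesture outputs

inputs = [0.0] * 32                  # placeholder for 32 sine/cosine values
scores = forward(output, forward(hidden, inputs))
print(len(scores))                   # 29, one score per gesture
```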
Application
Training
Before testing the recognition ability you must train the network (or load a saved image of a trained net from a file).
You can customize the parameters of the training process, namely: the maximum number of cycles, the momentum value, the learning rate, and the minimum value of the mean square error (in other words, the "target error").
The training process stops as soon as either condition is met: the maximum number of cycles or the target error.
During the training process you can keep an eye on the error graph, the current gesture (with noise), and a 2D presentation of the network.
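The two stop conditions and the training parameters can be illustrated with a deliberately tiny example: a single log-sigmoid neuron trained incrementally with momentum and a slowly reduced learning rate. This is a sketch of the idea, not GestureApp's training code; the toy data, the `decay` factor, and the default values are assumptions.

```python
import math

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, max_cycles=1000, target_error=0.01,
          learning_rate=0.5, momentum=0.5, decay=0.999):
    """Train one log-sigmoid neuron; return (weights, cycles, mse)."""
    n = len(samples[0][0])
    weights = [0.0] * (n + 1)      # last entry is the bias
    previous = [0.0] * (n + 1)     # last weight changes, for momentum
    cycles, mse = 0, float("inf")
    # stop on whichever comes first: max cycles or target error
    while cycles < max_cycles and mse > target_error:
        squared = 0.0
        for inputs, target in samples:         # incremental: update per sample
            x = list(inputs) + [1.0]
            out = logsig(sum(w * v for w, v in zip(weights, x)))
            error = target - out
            squared += error * error
            delta = error * out * (1.0 - out)  # gradient through logsig
            for i in range(n + 1):
                change = learning_rate * delta * x[i] + momentum * previous[i]
                weights[i] += change
                previous[i] = change
        mse = squared / len(samples)
        learning_rate *= decay                 # slowly reduce the learning rate
        cycles += 1
    return weights, cycles, mse

# Toy data (logical OR), only to exercise the loop.
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, cycles, mse = train(samples)
print(cycles, round(mse, 4))
```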
Testing
As soon as you have a trained net, you can test it.
Select the patterns (or test all of them), a speed value and a noise level.
Besides, you can familiarize yourself with the ideal presentation of the gestures by setting minimal noise and minimal speed.
Recognition
To have a mouse gesture recognized, hold down the right mouse button while moving the mouse.
For example, to recognize the "left" gesture, press the right mouse button and move the mouse to the left.
If the neural network recognizes the gesture, you will see the name, probability, and ideal presentation of the winner.
Because of the freeware nature of GestureApp, the mouse path must have at least 16 points :(.
Sorry, I haven't implemented a "stretch a path" feature so far.
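GestureApp has no such feature, so the following is only one way a "stretch a path" step could be approached: linear interpolation along the path's length, so that a short path still yields 16 points. The name `stretch_path` and the whole approach are assumptions made for illustration.

```python
import math

def stretch_path(path, count=16):
    """Resample a path to `count` points, evenly spaced along its length."""
    lengths = [math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(path, path[1:])]
    total = sum(lengths)
    if count < 2 or len(path) < 2 or total == 0:
        raise ValueError("need count >= 2 and at least two distinct points")
    result = []
    segment, walked = 0, 0.0   # current segment and the distance before it
    for i in range(count):
        target = total * i / (count - 1)
        # advance to the segment that contains the target distance
        while segment < len(lengths) - 1 and walked + lengths[segment] < target:
            walked += lengths[segment]
            segment += 1
        (x0, y0), (x1, y1) = path[segment], path[segment + 1]
        t = (target - walked) / lengths[segment] if lengths[segment] else 0.0
        result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return result

short_path = [(0, 0), (30, 0)]          # only two recorded points
print(len(stretch_path(short_path)))    # 16
```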
Note: the direction is very important.
The network is trained to recognize gestures, not 2D images.
Hence, you can draw the "circle" gesture in a thousand different ways, but the only
valid way is: press the mouse button and move the mouse to the right, then down, and so on.
Once more: it's a gesture, not a 2D image.
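A small sketch shows why direction matters when the input is built from cosines and sines of the path's segments: the same straight line drawn in the opposite direction produces opposite values. The helper `directions` is hypothetical and assumes the path has no repeated points.

```python
import math

def directions(path):
    """(cosine, sine) of each segment between consecutive points."""
    result = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        length = math.hypot(x1 - x0, y1 - y0)  # non-zero: no repeated points
        result.append(((x1 - x0) / length, (y1 - y0) / length))
    return result

left_to_right = [(0, 0), (10, 0), (20, 0)]
right_to_left = list(reversed(left_to_right))

print(directions(left_to_right))   # [(1.0, 0.0), (1.0, 0.0)]
print(directions(right_to_left))   # [(-1.0, 0.0), (-1.0, 0.0)]
```

Both paths cover exactly the same pixels, yet the network would see two different input vectors.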
Mouse gestures
Compatibility
Compatible with Win2k, WinXP, Win98, and WinMe. Unfortunately it doesn't work on WinNT because
it needs the AlphaBlend API.
Acknowledgement
Special Thanks:
My wife Julia for her nice artwork ;)
And thanks to:
Pedro Pombeiro for the Selection slider control