Note: Requires .NET 3.5 SP1 and multi-touch-enabled hardware to run, or use Multi-touch Vista to simulate multi-touch.
One of the newly introduced features in Windows 7 is the Touch API. It allows a developer to go beyond mouse and keyboard: the application can now interact with the user through touch. The Touch API supports rich predefined gestures like zoom, pan, and rotate. The multi-touch APIs provide raw touch data input and advanced manipulation of objects, including inertia (a physics model that mimics real-life behaviour with attributes like speed, velocity, acceleration, deceleration, and weight).
Touch Gestures is an application developed with the Windows 7 Multi-touch APIs that provides simple gesture recognition in a WPF application. It can recognize basic gestures like left, right, up, and down; it also supports some compound gestures made up of the basic ones, like Left then Right, Left then Down, and so on. The application uses a slight variation of the approach described in the Adding Mouse Gesture Support to .NET Windows Applications article.
The application contains five classes and one enumeration, which are:
- Classes
  - App: This class is the entry point of the application. We create an instance of MainWindow and run it. We also catch all unhandled exceptions at the domain level in this class; if an exception occurs that is not handled by try/catch blocks, we catch it here.
  - MainWindow: This class serves as the UI of the application. It displays all supported gestures as images on a canvas. When a gesture is recognized by the application, a boundary painted with a random color is shown around the gesture image (as shown in the screenshot above). The boundary disappears after a pre-defined interval.
  - GestureTracker: This class keeps a record of a particular gesture, simply by keeping a list of all the points which the gesture travelled. It also handles highlighting the recognized gesture image by showing a boundary around it; for this purpose, it uses a DispatcherTimer object. After the pre-defined interval, the timer stops and the rectangle disappears.
  - GestureTrackerManager: This class keeps a record of all touch gestures. If a user creates multiple touch gestures with multiple fingers (if supported by the hardware), this class tracks them all by storing them in a dictionary with the touch ID as the key. It also checks whether a particular touch ID is already being tracked.
  - GestureInterpreter: This class is a utility/helper class that identifies a gesture. It can identify both basic and compound gestures, given the list of all the points that the gesture moved through.
- Enumerations
  - TouchGestureType: This enum names all the directions for both simple and compound gestures. For basic ones, the names are obvious: Left, Right, Up, and Down. For compound gestures, the names are like LeftDown, which means that the gesture first moves left and then moves down. A sketch of this enum is shown right after this list.
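As a point of reference, here is a sketch of what TouchGestureType might look like. The basic values are named in the text; the None value and the exact set of compound values are assumptions added for illustration and may differ from the actual project.
public enum TouchGestureType
{
    None,        // fallback value; an assumption, may not exist in the real project

    // Basic gestures
    Left,
    Right,
    Up,
    Down,

    // Compound gestures: first direction, then second (e.g. LeftDown = left, then down);
    // the actual project may define more or fewer combinations
    LeftDown,
    LeftUp,
    LeftRight,
    RightDown,
    RightUp,
    RightLeft,
    UpLeft,
    UpRight,
    UpDown,
    DownLeft,
    DownRight,
    DownUp
}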
.NET 3.5 does not support multi-touch (multi-touch events and controls are part of .NET 4.0). In order to use multi-touch in .NET 3.5 SP1, we have to use the Windows 7 Integration Library Sample. For simplicity, it is placed in the Win7LibSample folder in the project.
To add multi-touch support to your application, add references to Windows7.Multitouch.dll and Windows7.Multitouch.WPF.dll (and add the corresponding using statements). To check whether multi-touch is enabled on the device, use the code provided below in the constructor of MainWindow.
using Windows7.Multitouch;
using Windows7.Multitouch.WPF;

if (!TouchHandler.DigitizerCapabilities.IsMultiTouchReady)
{
    MessageBox.Show("Multitouch is not available");
    Environment.Exit(1);
}

this.Loaded += (sender, args) => { Factory.EnableStylusEvents(this); };
Touch events are not handled by the MainWindow class. They are passed on to the GestureTrackerManager class, which takes the appropriate action for each event. Before going further, we have to subscribe to the StylusDown, StylusMove, and StylusUp events. We also declare and create a GestureTrackerManager object which will be used for the proper handling of events.
private readonly GestureTrackerManager gestureTrackerManager;

// Create the manager and forward the stylus events to it
gestureTrackerManager = new GestureTrackerManager(canvas);
StylusDown += gestureTrackerManager.Process_TouchDown;
StylusUp += gestureTrackerManager.Process_TouchUp;
StylusMove += gestureTrackerManager.Process_TouchMove;
Now, let us proceed to the events and how we handle them in the GestureTrackerManager class. One important point here is that we also pass a read-only reference to the canvas object to GestureTrackerManager and then on to the GestureTracker object. This enables adding and removing the rectangles around the images after recognizing a gesture. In GestureTrackerManager, we keep track of all the touch gestures by using a Dictionary with the touch ID as the key. The actual points (which the gesture travelled) are stored in the GestureTracker class.
private readonly Dictionary<int, GestureTracker> gestureTrackerMap;
private readonly Canvas canvas;
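The constructor of GestureTrackerManager is not listed in this article; a minimal sketch of how it might take the canvas reference and initialize the dictionary (an assumption based on the description above, not the exact source) looks like this:
public GestureTrackerManager(Canvas canvas)
{
    // Keep the (read-only) canvas reference so it can be handed to each GestureTracker
    this.canvas = canvas;

    // One GestureTracker per active touch ID
    gestureTrackerMap = new Dictionary<int, GestureTracker>();
}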
In the Stylus events, we get the current gesture location relative to the canvas from args.GetPosition(canvas), and then we check whether we are already tracking the gesture (using the StylusDevice.Id from args). If the user uses two fingers for two different gestures, the two IDs will be different. To check this, we see if we already have the ID in our dictionary; if yes, we get the corresponding GestureTracker object and send the location to it so that it can handle it (usually by adding it to the points list). If we are not yet tracking a particular ID, we add the ID to the dictionary and attach a new GestureTracker object to it. This part is handled by the GetGestureTracker function. The Stylus Down, Up, and Move events look like this:
public void Process_TouchDown(object sender, StylusEventArgs args)
{
    Point location = args.GetPosition(canvas);
    GestureTracker gestureTracker = GetGestureTracker(args.StylusDevice.Id);
    gestureTracker.ProcessDown(location);
}

public void Process_TouchMove(object sender, StylusEventArgs args)
{
    GestureTracker gestureTracker = GetGestureTracker(args.StylusDevice.Id);
    if (gestureTracker == null)
        return;

    Point location = args.GetPosition(canvas);
    gestureTracker.ProcessMove(location);
}

public void Process_TouchUp(object sender, StylusEventArgs args)
{
    Point location = args.GetPosition(canvas);
    GestureTracker gestureTracker = GetGestureTracker(args.StylusDevice.Id);
    if (gestureTracker == null)
        return;

    gestureTracker.ProcessUp(location);

    // The gesture is complete, so stop tracking this touch ID
    gestureTrackerMap.Remove(args.StylusDevice.Id);
}

private GestureTracker GetGestureTracker(int deviceId)
{
    GestureTracker gestureTracker = null;
    if (gestureTrackerMap.TryGetValue(deviceId, out gestureTracker))
        return gestureTracker;

    // Not tracked yet: create a tracker and register it for this touch ID
    if (gestureTracker == null)
    {
        gestureTracker = new GestureTracker(canvas);
        gestureTrackerMap.Add(deviceId, gestureTracker);
    }
    return gestureTracker;
}
Handling multiple touch inputs is really easy. All touch events carry StylusEventArgs, which includes the ID of each device/touch impression; in our code, we read it as args.StylusDevice.Id. To handle them all, we just have to keep track of the IDs, so we create a dictionary that uses these unique IDs as keys and the corresponding GestureTracker as the value, which keeps track of all the points which the input travelled.
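The point-collecting side of GestureTracker has not been shown yet; a minimal sketch of how ProcessDown and ProcessMove might simply collect the points (an assumption based on the description above, not the project's actual code) is:
// Inside GestureTracker: the gesture is simply recorded as the list of points it travelled
private readonly List<Point> points = new List<Point>();

public void ProcessDown(Point location)
{
    // First point of a new gesture
    points.Clear();
    points.Add(location);
}

public void ProcessMove(Point location)
{
    // Record every intermediate point
    points.Add(location);
}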
The GestureInterpreter class is used to recognize a gesture. It has two static functions: InterpretGesture and InterpretCompoundGesture; the first one is public, the other is private. We call InterpretGesture from the ProcessUp function of the GestureTracker class, passing it the list of points. The code for the ProcessUp function is:
public void ProcessUp(Point location)
{
    points.Add(location);

    // Identify the gesture from the complete list of points
    gesture = GestureInterpreter.InterpretGesture(points);

    Point imageCoordinates = FindImageCoordinatesByGesture();
    AddHighlightToImage(imageCoordinates.X, imageCoordinates.Y);
    timer.Start();
}
If the gesture is recognized, we highlight the gesture image by showing a rectangle around it and start a DispatcherTimer for a pre-defined duration. When the duration has elapsed, we remove the highlighting by removing the rectangle from the canvas. The code for highlighting and un-highlighting is very straightforward. Once we recognize the gesture, we store the gesture name in a local string variable, and it is used to find the coordinates of the gesture image by name.
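The FindImageCoordinatesByGesture helper called from ProcessUp is not listed in this article. A possible sketch of the lookup-by-name idea described above is shown below; the naming convention for the images is an assumption, and the real project may well do this differently.
private Point FindImageCoordinatesByGesture()
{
    // Assumes each gesture image on the canvas is named after its gesture
    // (e.g. an Image whose Name is "Left"); the real project may use another scheme.
    foreach (UIElement child in canvas.Children)
    {
        Image image = child as Image;
        if (image != null && image.Name == gesture.ToString())
            return new Point(Canvas.GetLeft(image), Canvas.GetTop(image));
    }
    return new Point(0, 0);
}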
While creating the rectangle object, we create a new GUID, remove the hyphens from it, and prefix it with rect. This is done to keep the name unique when registering and un-registering the name of the rectangle object. The code for highlighting and un-highlighting the image is:
private void AddHighlightToImage(double left, double top)
{
    Rectangle rectangle = new Rectangle();

    // Unique name: "rect" + a new GUID with the hyphens stripped out
    rectangleName = string.Format("rect{0}", GenerateNewGuidLessMinus());
    rectangle.Name = rectangleName;
    rectangle.Width = 72;
    rectangle.Height = 72;
    rectangle.StrokeThickness = 2;
    rectangle.RadiusX = 3;
    rectangle.RadiusY = 3;

    // Paint the boundary with a random color
    Random random = new Random((int)DateTime.Now.Millisecond);
    SolidColorBrush brush = new SolidColorBrush(Color.FromRgb(
        (byte)random.Next(0, 255),
        (byte)random.Next(0, 255),
        (byte)random.Next(0, 255)));
    rectangle.Stroke = brush;

    Canvas.SetLeft(rectangle, left);
    Canvas.SetTop(rectangle, top);
    canvas.RegisterName(rectangleName, rectangle);
    canvas.Children.Insert(0, rectangle);
}

private void highlightTimer_Tick(object sender, EventArgs e)
{
    RemoveHighlightFromImage();
    timer.Stop();
}

private void RemoveHighlightFromImage()
{
    Rectangle rectangle = (Rectangle)canvas.FindName(rectangleName);
    rectangle.Stroke = new SolidColorBrush(Colors.White);
    canvas.Children.Remove(rectangle);
    canvas.UnregisterName(rectangleName);
}
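The GenerateNewGuidLessMinus helper used above is not listed either; a minimal sketch of what it likely does, based on the description (strip the hyphens from a new GUID), would be:
private static string GenerateNewGuidLessMinus()
{
    // Element names cannot contain '-', so strip the hyphens from a new GUID
    return Guid.NewGuid().ToString().Replace("-", string.Empty);
}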
Okay, so here comes the tricky part (actually it is very easy). We have two kinds of gestures, namely Simple and Compound. Let us try to understand how we calculate both of them.
- Simple: We have two directions, horizontal and vertical. Let us take them as the X axis and the Y axis. First of all, we calculate the difference in X and in Y between the first and last points, called xDiff and yDiff. If xDiff is zero or almost negligible (we have a minimum threshold of 10 pixels), then we moved along the Y axis only, i.e., vertically, so the gesture is either Up or Down. Similarly, if yDiff is zero or almost negligible, then we moved along the X axis only, i.e., horizontally, so the gesture is either Left or Right. Also, if yDiff is negative (which means we started from a lower value and moved to a higher value), then we moved upwards, otherwise we moved downwards. Similarly, if xDiff is negative, we moved right, otherwise we moved left.
- Compound: When we have movement on both the X axis and the Y axis, it is a compound gesture. To recognize a compound gesture, we first identify the directions it moved (the same as for simple gestures), then we check its overall movement with respect to the starting point, which gives us both directions in the proper order. There is another way to do the same thing: we can calculate the angle of the overall movement as angle = Math.Atan2(yDiff, xDiff), and then identify the gesture using xDiff, yDiff, and the angle. A sketch of the interpretation logic follows this list.
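The exact InterpretGesture implementation is not shown in this article; the following is only a minimal sketch of the xDiff/yDiff idea described above. The 10-pixel threshold comes from the text; the rest (including the None fallback and the first-minus-last sign convention) is an assumption for illustration.
public static TouchGestureType InterpretGesture(List<Point> points)
{
    const double Threshold = 10; // minimum movement, in pixels, to count as a direction

    Point first = points[0];
    Point last = points[points.Count - 1];

    // Differences taken as first minus last, so (per the text) a negative
    // xDiff means we moved right and a negative yDiff means we moved up.
    double xDiff = first.X - last.X;
    double yDiff = first.Y - last.Y;

    bool horizontal = Math.Abs(xDiff) > Threshold;
    bool vertical = Math.Abs(yDiff) > Threshold;

    // Movement on both axes would be a compound gesture; that part
    // (InterpretCompoundGesture) is not sketched here.
    if (horizontal && vertical)
        return TouchGestureType.None;

    if (horizontal)
        return xDiff < 0 ? TouchGestureType.Right : TouchGestureType.Left;

    if (vertical)
        return yDiff < 0 ? TouchGestureType.Up : TouchGestureType.Down;

    return TouchGestureType.None;
}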
In the next articles, I will try to explain/implement:
- Similar functionality in .NET 4.0, which provides built-in multi-touch support in WPF.
- Complex gesture recognition with Neural Networks.