Note: Requires .NET 3.5 SP1 and multi-touch-enabled hardware to run, or use Multi-touch Vista to simulate multi-touch.
One of the newly introduced features in Windows 7 is the Touch API. It allows a developer to go beyond mouse and keyboard: the application can now interact with the user through touch. The Touch API supports rich predefined gestures like zoom, pan, and rotate. The multi-touch APIs provide raw touch data input and advanced manipulation of objects, including inertia (a physics model that mimics real-life behaviour with attributes like speed, velocity, acceleration, deceleration, and weight).
Touch Gestures is an application developed with the Windows 7 Multi-touch APIs that provides simple gesture recognition in a WPF application. It can recognize basic gestures like left, right, up, and down; it also supports some compound gestures made up of the basic ones, like Left then Right, Left then Down, and so on. The application uses a slight variation of the approach described in the Adding Mouse Gesture Support to .NET Windows Applications article.
The application contains five classes and one enumeration, which are:
- Classes
  - App: This class is the entry point of the application. We create an instance of MainWindow and run it. We also catch all unhandled exceptions at the domain level in this class; if an exception occurs that is not handled by try/catch blocks, we catch it here.
  - MainWindow: This class serves as the UI of the application. It displays all supported gestures as images on a canvas. When a gesture is recognized by the application, a boundary painted with a random color is shown around the gesture image (as shown in the screenshot above). The boundary disappears after a pre-defined interval.
  - GestureTracker: This class keeps a record of a particular gesture, simply by keeping a list of all the points which the gesture travelled. It also handles highlighting the recognized gesture image by showing a boundary around it; for this purpose, it uses a DispatcherTimer object. After the pre-defined interval, the timer stops and the rectangle disappears.
  - GestureTrackerManager: This class keeps a record of all touch gestures. If a user creates multiple touch gestures with multiple fingers (if supported by the hardware), this class tracks them all by storing them in a dictionary with the touch ID as the key. It also checks whether a particular touch ID is already being tracked.
  - GestureInterpreter: This class is a utility/helper class that identifies a gesture. It can identify both basic and compound gestures, given the list of all the points that the gesture moved through.
- Enumerations
  - TouchGestureType: This enum names all the directions for both simple and compound gestures. For basic ones, the names are obvious: Left, Right, Up, and Down. For compound gestures, the names are like LeftDown, which means that the gesture first moves left and then moves down. A sketch of this enum is shown right after this list.
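As a point of reference, here is a sketch of what TouchGestureType might look like. The basic values are named in the text; the None value and the exact set of compound values are assumptions added for illustration and may differ from the actual project.
public enum TouchGestureType
{
    None,        // fallback value; an assumption, may not exist in the real project

    // Basic gestures
    Left,
    Right,
    Up,
    Down,

    // Compound gestures: first direction, then second (e.g. LeftDown = left, then down);
    // the actual project may define more or fewer combinations
    LeftDown,
    LeftUp,
    LeftRight,
    RightDown,
    RightUp,
    RightLeft,
    UpLeft,
    UpRight,
    UpDown,
    DownLeft,
    DownRight,
    DownUp
}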
.NET 3.5 does not support multi-touch (multi-touch events and controls are part of .NET 4.0). In order to use multi-touch in .NET 3.5 SP1, we have to use the Windows 7 Integration Library Sample. For simplicity, it is placed in the Win7LibSample folder in the project.
To add multi-touch support to your application, add references to Windows7.Multitouch.dll and Windows7.Multitouch.WPF.dll (and add the corresponding using statements). To check whether multi-touch is enabled on the device, use the code provided below in the constructor of MainWindow.
using Windows7.Multitouch;
using Windows7.Multitouch.WPF;

if (!TouchHandler.DigitizerCapabilities.IsMultiTouchReady)
{
    MessageBox.Show("Multitouch is not available");
    Environment.Exit(1);
}

this.Loaded += (sender, args) => { Factory.EnableStylusEvents(this); };
Touch events are not handled by the MainWindow class. They are passed on to the GestureTrackerManager class, which takes the appropriate action for each event. Before going further, we have to subscribe to the StylusDown, StylusMove, and StylusUp events. We also declare and create a GestureTrackerManager object which will be used for the proper handling of events.
private readonly GestureTrackerManager gestureTrackerManager;

// Create the manager and forward the stylus events to it
gestureTrackerManager = new GestureTrackerManager(canvas);
StylusDown += gestureTrackerManager.Process_TouchDown;
StylusUp += gestureTrackerManager.Process_TouchUp;
StylusMove += gestureTrackerManager.Process_TouchMove;
Now, let us proceed to the events and how we handle them in the GestureTrackerManager class. One important point here is that we also pass a read-only reference to the canvas object to GestureTrackerManager and then on to the GestureTracker object. This enables adding and removing the rectangles around the images after recognizing a gesture. In GestureTrackerManager, we keep track of all the touch gestures by using a Dictionary with the touch ID as the key. The actual points (which the gesture travelled) are stored in the GestureTracker class.
private readonly Dictionary<int, GestureTracker> gestureTrackerMap;
private readonly Canvas canvas;
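The constructor of GestureTrackerManager is not listed in this article; a minimal sketch of how it might take the canvas reference and initialize the dictionary (an assumption based on the description above, not the exact source) looks like this:
public GestureTrackerManager(Canvas canvas)
{
    // Keep the (read-only) canvas reference so it can be handed to each GestureTracker
    this.canvas = canvas;

    // One GestureTracker per active touch ID
    gestureTrackerMap = new Dictionary<int, GestureTracker>();
}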
In the Stylus events, we get the current gesture location relative to the canvas from args.GetPosition(canvas), and then we check whether we are already tracking the gesture (using the StylusDevice.Id from args). If the user uses two fingers for two different gestures, the two IDs will be different. To check this, we see if we already have the ID in our dictionary; if yes, we get the corresponding GestureTracker object and send the location to it so that it can handle it (usually by adding it to the points list). If we are not yet tracking a particular ID, we add the ID to the dictionary and attach a new GestureTracker object to it. This part is handled by the GetGestureTracker function. The Stylus Down, Up, and Move events look like this:
public void Process_TouchDown(object sender, StylusEventArgs args)
{
    Point location = args.GetPosition(canvas);
    GestureTracker gestureTracker = GetGestureTracker(args.StylusDevice.Id);
    gestureTracker.ProcessDown(location);
}

public void Process_TouchMove(object sender, StylusEventArgs args)
{
    GestureTracker gestureTracker = GetGestureTracker(args.StylusDevice.Id);
    if (gestureTracker == null)
        return;

    Point location = args.GetPosition(canvas);
    gestureTracker.ProcessMove(location);
}

public void Process_TouchUp(object sender, StylusEventArgs args)
{
    Point location = args.GetPosition(canvas);
    GestureTracker gestureTracker = GetGestureTracker(args.StylusDevice.Id);
    if (gestureTracker == null)
        return;

    gestureTracker.ProcessUp(location);

    // The gesture is complete, so stop tracking this touch ID
    gestureTrackerMap.Remove(args.StylusDevice.Id);
}

private GestureTracker GetGestureTracker(int deviceId)
{
    GestureTracker gestureTracker = null;
    if (gestureTrackerMap.TryGetValue(deviceId, out gestureTracker))
        return gestureTracker;

    // Not tracked yet: create a tracker and register it for this touch ID
    if (gestureTracker == null)
    {
        gestureTracker = new GestureTracker(canvas);
        gestureTrackerMap.Add(deviceId, gestureTracker);
    }
    return gestureTracker;
}
Handling multiple touch inputs is really easy. All touch events carry StylusEventArgs, which includes the ID of each device/touch impression; in our code, we read it as args.StylusDevice.Id. To handle them all, we just have to keep track of the IDs, so we create a dictionary that uses these unique IDs as keys and the corresponding GestureTracker as the value, which keeps track of all the points which the input travelled.
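The point-collecting side of GestureTracker has not been shown yet; a minimal sketch of how ProcessDown and ProcessMove might simply collect the points (an assumption based on the description above, not the project's actual code) is:
// Inside GestureTracker: the gesture is simply recorded as the list of points it travelled
private readonly List<Point> points = new List<Point>();

public void ProcessDown(Point location)
{
    // First point of a new gesture
    points.Clear();
    points.Add(location);
}

public void ProcessMove(Point location)
{
    // Record every intermediate point
    points.Add(location);
}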
The GestureInterpreter class is used to recognize a gesture. It has two static functions: InterpretGesture and InterpretCompoundGesture; the first one is public, the other is private. We call InterpretGesture from the ProcessUp function of the GestureTracker class, passing it the list of points. The code for the ProcessUp function is:
public void ProcessUp(Point location)
{
    points.Add(location);

    // Identify the gesture from the complete list of points
    gesture = GestureInterpreter.InterpretGesture(points);

    Point imageCoordinates = FindImageCoordinatesByGesture();
    AddHighlightToImage(imageCoordinates.X, imageCoordinates.Y);
    timer.Start();
}
If the gesture is recognized, we highlight the gesture image by showing a rectangle around it and start a DispatcherTimer for a pre-defined duration. When the duration has elapsed, we remove the highlighting by removing the rectangle from the canvas. The code for highlighting and un-highlighting is very straightforward. Once we recognize the gesture, we store the gesture name in a local string variable, and it is used to find the coordinates of the gesture image by name.
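The FindImageCoordinatesByGesture helper called from ProcessUp is not listed in this article. A possible sketch of the lookup-by-name idea described above is shown below; the naming convention for the images is an assumption, and the real project may well do this differently.
private Point FindImageCoordinatesByGesture()
{
    // Assumes each gesture image on the canvas is named after its gesture
    // (e.g. an Image whose Name is "Left"); the real project may use another scheme.
    foreach (UIElement child in canvas.Children)
    {
        Image image = child as Image;
        if (image != null && image.Name == gesture.ToString())
            return new Point(Canvas.GetLeft(image), Canvas.GetTop(image));
    }
    return new Point(0, 0);
}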
While creating the rectangle object, we create a new GUID, remove the hyphens from it, and prefix it with rect. This is done to keep the name unique when registering and un-registering the name of the rectangle object. The code for highlighting and un-highlighting the image is:
private void AddHighlightToImage(double left, double top)
{
    Rectangle rectangle = new Rectangle();

    // Unique name: "rect" + a new GUID with the hyphens stripped out
    rectangleName = string.Format("rect{0}", GenerateNewGuidLessMinus());
    rectangle.Name = rectangleName;
    rectangle.Width = 72;
    rectangle.Height = 72;
    rectangle.StrokeThickness = 2;
    rectangle.RadiusX = 3;
    rectangle.RadiusY = 3;

    // Paint the boundary with a random color
    Random random = new Random((int)DateTime.Now.Millisecond);
    SolidColorBrush brush = new SolidColorBrush(Color.FromRgb(
        (byte)random.Next(0, 255),
        (byte)random.Next(0, 255),
        (byte)random.Next(0, 255)));
    rectangle.Stroke = brush;

    Canvas.SetLeft(rectangle, left);
    Canvas.SetTop(rectangle, top);
    canvas.RegisterName(rectangleName, rectangle);
    canvas.Children.Insert(0, rectangle);
}

private void highlightTimer_Tick(object sender, EventArgs e)
{
    RemoveHighlightFromImage();
    timer.Stop();
}

private void RemoveHighlightFromImage()
{
    Rectangle rectangle = (Rectangle)canvas.FindName(rectangleName);
    rectangle.Stroke = new SolidColorBrush(Colors.White);
    canvas.Children.Remove(rectangle);
    canvas.UnregisterName(rectangleName);
}
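The GenerateNewGuidLessMinus helper used above is not listed either; a minimal sketch of what it likely does, based on the description (strip the hyphens from a new GUID), would be:
private static string GenerateNewGuidLessMinus()
{
    // Element names cannot contain '-', so strip the hyphens from a new GUID
    return Guid.NewGuid().ToString().Replace("-", string.Empty);
}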
Okay, so here comes the tricky part (actually it is very easy). We have two kinds of gestures, namely Simple and Compound. Let us try to understand how we calculate both of them.
- Simple: We have two directions, horizontal and vertical. Let us take them as the X axis and the Y axis. First of all, we calculate the difference in X and in Y between the first and last points, called xDiff and yDiff. If xDiff is zero or almost negligible (we have a minimum threshold of 10 pixels), then we moved along the Y axis only, i.e., vertically, so the gesture is either Up or Down. Similarly, if yDiff is zero or almost negligible, then we moved along the X axis only, i.e., horizontally, so the gesture is either Left or Right. Also, if yDiff is negative (which means we started from a lower value and moved to a higher value), then we moved upwards, otherwise we moved downwards. Similarly, if xDiff is negative, we moved right, otherwise we moved left.
- Compound: When we have movement on both the X axis and the Y axis, it is a compound gesture. To recognize a compound gesture, we first identify the directions it moved (the same as for simple gestures), then we check its overall movement with respect to the starting point, which gives us both directions in the proper order. There is another way to do the same thing: we can calculate the angle of the overall movement as angle = Math.Atan2(yDiff, xDiff), and then identify the gesture using xDiff, yDiff, and the angle. A sketch of the interpretation logic follows this list.
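The exact InterpretGesture implementation is not shown in this article; the following is only a minimal sketch of the xDiff/yDiff idea described above. The 10-pixel threshold comes from the text; the rest (including the None fallback and the first-minus-last sign convention) is an assumption for illustration.
public static TouchGestureType InterpretGesture(List<Point> points)
{
    const double Threshold = 10; // minimum movement, in pixels, to count as a direction

    Point first = points[0];
    Point last = points[points.Count - 1];

    // Differences taken as first minus last, so (per the text) a negative
    // xDiff means we moved right and a negative yDiff means we moved up.
    double xDiff = first.X - last.X;
    double yDiff = first.Y - last.Y;

    bool horizontal = Math.Abs(xDiff) > Threshold;
    bool vertical = Math.Abs(yDiff) > Threshold;

    // Movement on both axes would be a compound gesture; that part
    // (InterpretCompoundGesture) is not sketched here.
    if (horizontal && vertical)
        return TouchGestureType.None;

    if (horizontal)
        return xDiff < 0 ? TouchGestureType.Right : TouchGestureType.Left;

    if (vertical)
        return yDiff < 0 ? TouchGestureType.Up : TouchGestureType.Down;

    return TouchGestureType.None;
}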
In the next articles, I will try to explain/implement:
- Similar functionality in .NET 4.0, which provides built-in multi-touch support in WPF.
- Complex gesture recognition with Neural Networks.