This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. This week's post shows how Huda has evolved from the application that was created at the end of the first week. My fellow competitors are planning to build some amazing applications, so I’d suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I’d like to take this opportunity to wish them good luck.
Week 2
Well, it’s week 2 now and development is carrying on apace. I’ve managed to get a fair bit into Huda in the first week, so the question has to be asked “Will Peter manage to maintain this awesome pace?” We all know that was the question you wanted to ask, but were far too nice to ask, so we’ll take it as read, shall we?
Monday
Today, I’m going to be working on adding some filters to the pictures that I’m opening. In the short term, all I’m going to do is hardcode the application to apply blur and Sepia filters to any picture that opens. This isn’t a huge task, and it’s a great way to get something in place that I can use to style the open filters window.
As I want to use filters from many places, I'm going to create a central location for all my filter code. I'll treat this as a service and use something that's known, rather nerdily, as Service Location to get a single reference to my filter code wherever it's needed. Ultimately, this is going to be the master model for the whole approach of managing and manipulating the filters, including the "save" method. One thing I want to do with this approach is get away from the concept of the user having to save work. It's no longer 1990, so why should the user have to remember to save their work? This is probably going to be the most controversial area for those who are used to traditional desktop applications but, given that we are building the next generation of application for people who are used to working with tablets and phones, I think this is an anachronism that we can dispense with quite happily.
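To make that a bit more concrete, here's a minimal sketch of what the central filter service might look like, assuming an interface registered with MVVM Light's SimpleIoc container (more on library choices in a moment). The names are illustrative rather than Huda's actual classes:

using GalaSoft.MvvmLight.Ioc;

public interface IFilterService
{
    void Apply(string filterName);   // run a named filter against the current photo
    void Save();                     // persist the edits without the user asking
}

public class FilterService : IFilterService
{
    public void Apply(string filterName) { /* look up and run the image filter */ }
    public void Save() { /* write the edit list away automatically */ }
}

// Registered once at start-up...
//   SimpleIoc.Default.Register<IFilterService, FilterService>();
// ...and resolved wherever it's needed:
//   IFilterService filters = SimpleIoc.Default.GetInstance<IFilterService>();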
Right, that's a central filter model in place, so all I need to do is hook up the photo selection code and *bang* we have our filters being applied. As that was so easy to do, why don't we get some feedback on the photograph as well? Let's calculate the RGB histograms and display them to the user. We'll put them at the bottom of the display, and add some animation to them so that we can start to get some feedback from them. Again, because I'm working in WPF, and I'm using the MVVM design pattern, the actual screens and animation are incredibly trivial to implement.
As an aside, here, you may be wondering what I’m using in terms of external libraries other than the Perceptual SDK. While I developed my own MVVM framework, I decided for this implementation to go with my good friend Laurent Bugnion’s MVVM Light, as there’s a good chance that the WPFers who end up downloading and playing with the source will have had exposure to that. I contemplated using Code Project uber developer Sacha Barber’s excellent Cinch implementation, but the fact that Laurent had converted MVVM Light over to .NET 4.5 was the winning vote for me here. For the actual photo manipulation, I’ve gone with AForge.NET. This is a superb library that does all that I need, and so much more. Again, I could have gone with home rolled code, but the benefit of doing so is far outweighed by the tight timescales.
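For the curious, applying the hardcoded filters and pulling out a channel histogram with AForge.NET boils down to something along these lines. This is a sketch rather than Huda's actual code, and it assumes the source is a standard 24bpp RGB bitmap:

using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

public static class FilterSketch
{
    public static Bitmap ApplyBlurAndSepia(Bitmap source)
    {
        // GaussianBlur(sigma, kernel size) softens the image...
        Bitmap blurred = new GaussianBlur(1.5, 5).Apply(source);

        // ...and Sepia gives it that old-photograph tint.
        return new Sepia().Apply(blurred);
    }

    public static int[] RedHistogram(Bitmap source)
    {
        // ImageStatistics exposes per-channel histograms, which is exactly
        // what the feedback view at the bottom of the display needs.
        ImageStatistics stats = new ImageStatistics(source);
        return stats.Red.Values;
    }
}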
Tuesday
It’s Tuesday, now, and Intel have arranged a Google hangout for today. This is our opportunity to speak directly to the judges and to give feedback on where we are, and what’s going on. I’ve left the office earlier than I normally would, so as to be in plenty of time for the hangout. All I need is a link from our contacts and I am good to go.
Okay, the hangout has started and I've still had no link – nothing. I can see the other parties, but I can't get in. I'll fire off an email and hope that someone opens it to let me in. Robert and Wendy are incredibly clued in, so I'm hopeful that I should be able to join soon.
The hangout has ended, and I'm feeling pretty despondent now. Nothing came through, so I haven't had a chance to talk things through with the others, or to talk to the judges. I'd hoped that they would be online as well, so we could get some feedback from them. Still, this did give me time to noodle around on a guitar for an hour or so – something that I haven't had any time for since the competition started – so that was good. For those who are interested, I was practicing the Steve Morse trick of alternate picking arpeggios that you would normally sweep pick; talk about a challenge. (Note – I was playing this on an Ibanez Prestige RG550XX – a very smooth guitar to play on.)
Anyway, I don't have time to let this get in the way of the coding. As all the other competitors noted, time is tight. Time is so tight that at least one team has allocated a project manager to help them prioritise and manage this as a deliverable. I can't do this, unfortunately, but at least I won't have to fill in status reports on my progress other than through the blog. One of my fellow solo coders, Lee, made some bold claims in his blog posting, and what I've seen from his blog video shows that his bold claims are backed up by bold ability – he's definitely one to watch here; his project really could revolutionise online webinars and Skype-like activities.
Today's code is doing some animation on the histogram views, and then hooking that animation up to the gesture camera. This is where things start to get serious. There are, effectively, two approaches I can take here. The first approach is to take the coordinates I'm getting back from the camera and use the VisualTreeHelper.HitTest method in WPF to identify the topmost window that the coordinates match with. This is fairly trivial to implement, but it's not the method I'm going to take, for several reasons, the top ones being:
- This returns the topmost item, regardless of where the mouse is. In other words, it will return the top item of the window, even if I'm not over something that I could be interested in. This means I'll get continuous reports of things like the position being over a layout Grid or Border.
- The hit test is slow. What it would be doing, on every frame from the camera, is looking through the VisualTree for the topmost item at that point. That's an expensive computational operation.
- Once I've identified that the topmost item is actually something I'm interested in, the code would have to trigger some operation there. This means that I would have to put knowledge of the views I'm showing into the code behind the main window. This is something that I want to avoid at all costs, because this hard coupling can lead to problems down the line.
The approach I'm taking is for each control that I want to interact with the gesture camera to subscribe to the gesture events coming out of it. What this means to the code is that each control has to hook into the same gesture camera instance – I've created what's known as a singleton in my Perceptual SDK that serves up a single instance of each perceptual pipeline that I'm interested in. In real terms, this currently intersperses the gesture logic in a few locations, because I want the code that's displaying the finger tracking image handled in the main window, but the code that's actually working out whether or not the finger is over the control needs to be in the control. Today, I'm just going to get this working in one control, but the principle should allow me to then generalise this out into functionality you can subscribe to if you want to have this handled for you.
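The singleton itself is nothing exotic. As a rough sketch of the shape (the real Goldlight.Perceptual.Sdk code differs in the detail), it looks something like this:

using System;

public sealed class PipelineProvider
{
    private static readonly Lazy<GesturePipeline> gesturePipeline =
        new Lazy<GesturePipeline>(() => new GesturePipeline());

    // Every view and control asks for the same instance, so they all see the
    // same stream of gesture events from the one camera pipeline.
    public static GesturePipeline Gesture
    {
        get { return gesturePipeline.Value; }
    }
}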
As I've added an animation to the histogram view, this seems to be a great place to start. What I'll do here is have one animation triggered when the gesture moves into the control, and a second animation triggered when the gesture moves out of it. I now have enough of a requirement to start coding. The first thing I have to do is hook into the gesture event inside the Loaded event for the control; this is done there so that we don't run into problems where, depending on the order in which the views start up, we try to hook into the event before the pipeline has been initialised.
Okay, now that I’ve got that event in place, I can work out whether or not the X and Y coordinates fall inside my control. Should be simple – I just need to translate the coordinates of my control into screen coordinates and as I’m just working with rectangular shapes, it’s a simple rectangle test. So, that’s what I code – half a dozen lines of boilerplate code and a call to trigger the animation.
Time to run the application. Okay, I’m waving my hand and the blue spot is following along nicely. Right, what happens when I move my hand over the control I want to animate? Absolutely nothing at all. I’ll put a breakpoint in the event handler just to ensure the event is firing. Yup. I hit the breakpoint no problem, but no matter what I do, the code doesn’t trigger the animation. My rectangle bounds checking is completely failing.
A cup of coffee later, and the answer hits me, and it's blindingly obvious when it does. I'm returning the raw gesture camera coordinates to Huda, and I'm converting them into display coordinates when I'm displaying the blue blob. What I completely failed to do, however, was remember to perform the same transformation in the user control. In order to test my theory, I'll just copy the transformation code into the control and test it there. This isn't going to be a long-term fix, it's just to prove I'm right with my thinking. Good programming practice indicates that you shouldn't repeat yourself (we even give this the rather snappy acronym of DRY – or Don't Repeat Yourself), so I'll move this code somewhere logical, and just let it be handled once. The logical thing for me to do is to actually return the transformed coordinates, so that's what I'll do.
Run the application again and perform the same tests. Success. Now I have the histogram showing when I move my hand over it, and disappearing when I move my hand away from the control. Job done, and time to turn this into a control that my user controls can inherit from. When I do this, I'm going to create GestureEnter and GestureLeave events that follow a similar pattern to the one WPF developers are familiar with for Mouse and Touch events.
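As a sketch of where this is heading (the property names on the event args are assumptions on my part; the real ones live in the Perceptual SDK wrapper shown later this week), the enter/leave test in a control looks roughly like this:

using System;
using System.Windows;
using System.Windows.Controls;

public partial class HistogramView : UserControl
{
    private bool isInside;

    public HistogramView()
    {
        InitializeComponent();
        // Subscribe in Loaded so the pipeline is guaranteed to be up before we hook in.
        Loaded += (s, e) => PipelineProvider.Gesture.HandMoved += OnHandMoved;
    }

    private void OnHandMoved(object sender, GesturePositionEventArgs e)
    {
        Dispatcher.BeginInvoke((Action)(() =>
        {
            // The pipeline now returns coordinates already transformed to screen
            // space, so a simple rectangle test is all that's needed.
            Point topLeft = PointToScreen(new Point(0, 0));
            Rect bounds = new Rect(topLeft, new Size(ActualWidth, ActualHeight));
            bool nowInside = bounds.Contains(new Point(e.X, e.Y));

            if (nowInside && !isInside) { /* raise GestureEnter, start the animation */ }
            if (!nowInside && isInside) { /* raise GestureLeave, reverse the animation */ }
            isInside = nowInside;
        }));
    }
}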
One thing that I didn't really get time to do last week was to start adding touch support beyond the standard "press the screen to open a photo", and that's something I'm going to start remedying now. I've decided to add pan and zoom to the photograph based on touch events; the same logic I use here will apply to gestures, so I will encapsulate it and just call it where I need to. Once I've got that in place, I think I'll call it a night. WPF makes this astonishingly simple, so the code I will be using looks a lot like this:
<ScrollViewer HorizontalScrollBarVisibility="Auto" VerticalScrollBarVisibility="Auto">
  <Image
      x:Name="scaleImage"
      Source="{Binding PreviewImage}"
      HorizontalAlignment="Stretch"
      VerticalAlignment="Stretch"
      Stretch="Fill"
      IsManipulationEnabled="True"
      ManipulationStarting="scaleImage_ManipulationStarting_1"
      ManipulationDelta="scaleImage_ManipulationDelta_1"
      RenderTransformOrigin="0.5,0.5">
    <Image.RenderTransform>
      <MatrixTransform />
    </Image.RenderTransform>
  </Image>
</ScrollViewer>
Now, some people think that MVVM requires you to remove all code from the code behind. Far be it from me to say that they are wrong but, they are wrong. MVVM is about removing the code that doesn't belong in the view, from the view. In other words, as the image panning and zooming relates purely to the view, it's perfectly fine to put the logic into the code behind. So, let's take advantage of the ManipulationDelta and ManipulationStarting events, which give us the ability to apply that ol' pan and zoom mojo. It goes something like this:
private void scaleImage_ManipulationDelta_1(object sender, ManipulationDeltaEventArgs e)
{
    // Grab the current transform so the new delta is applied on top of any
    // pan/zoom that has already happened.
    Matrix rectsMatrix = ((MatrixTransform)scaleImage.RenderTransform).Matrix;
    ManipulationDelta manipDelta = e.DeltaManipulation;

    // Work out where the centre of the image currently sits and use that as
    // the origin for the zoom.
    Point rectManipOrigin = rectsMatrix.Transform(
        new Point(scaleImage.ActualWidth / 2, scaleImage.ActualHeight / 2));
    rectsMatrix.ScaleAt(manipDelta.Scale.X, manipDelta.Scale.Y, rectManipOrigin.X, rectManipOrigin.Y);

    // Pan by however far the fingers moved this frame.
    rectsMatrix.Translate(manipDelta.Translation.X, manipDelta.Translation.Y);
    ((MatrixTransform)scaleImage.RenderTransform).Matrix = rectsMatrix;
    e.Handled = true;
}

private void scaleImage_ManipulationStarting_1(object sender, ManipulationStartingEventArgs e)
{
    // Use the window as the manipulation container so that the deltas are
    // reported in a stable coordinate space, unaffected by the image's own transform.
    e.ManipulationContainer = this;
    e.Handled = true;
}
I'm not going to dissect this code too much but, suffice it to say, the ScaleAt call is responsible for zooming the photo and the Translate call is responsible for panning it. Note that I could easily have added rotation if I'd wanted to, but free-style rotation isn't something I'm planning here.
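If I ever change my mind, I believe it would just be one extra line in the delta handler, along these lines:

rectsMatrix.RotateAt(manipDelta.Rotation, rectManipOrigin.X, rectManipOrigin.Y);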
Wednesday
Well, it's a new day, and one of the features I want is the ability to retrieve the item that a finger is pointing at, based on its X/Y coordinates. As WPF doesn't provide a method for this out of the box, it's something that I'll have to write myself. Fortunately, this isn't that complicated a task, and the following code should do nicely.
public static int GetIndexPoint(this ListBox listBox, Point point)
{
    int index = -1;
    for (int i = 0; i < listBox.Items.Count; ++i)
    {
        ListBoxItem item = listBox.Items[i] as ListBoxItem;
        if (item == null) continue;

        // Translate the item's bounds into the coordinate space of the
        // ListBox's parent, then test whether the point falls inside them.
        Point xform = item.TransformToVisual((Visual)listBox.Parent).Transform(new Point());
        Rect bounds = VisualTreeHelper.GetDescendantBounds(item);
        bounds.Offset(xform.X, xform.Y);
        if (bounds.Contains(point))
        {
            index = i;
            break;
        }
    }
    return index;
}
Tracking the mouse move, for instance, would look like this:
int lastItem = 0;
private Timer saveTimer;    // System.Timers.Timer (using System.Timers;)

public MainWindow()
{
    InitializeComponent();
    this.MouseMove += new MouseEventHandler(MainWindow_MouseMove);

    // Give the pointer a couple of seconds of dwelling over an item before
    // we treat it as a selection.
    saveTimer = new Timer();
    saveTimer.Interval = 2000;
}

void saveTimer_Elapsed(object sender, ElapsedEventArgs e)
{
    saveTimer.Stop();
    saveTimer.Elapsed -= saveTimer_Elapsed;

    // The timer fires on a worker thread, so marshal the selection back onto
    // the UI thread.
    Dispatcher.BeginInvoke((Action)(() =>
    {
        selectyThing.SelectedIndex = lastItem;
    }), DispatcherPriority.Normal);
}

void MainWindow_MouseMove(object sender, MouseEventArgs e)
{
    int index = selectyThing.GetIndexPoint(e.GetPosition(this));
    if (index != lastItem)
    {
        // The pointer has moved to a different item, so restart the dwell.
        lastItem = index;
        saveTimer.Stop();
        saveTimer.Elapsed -= saveTimer_Elapsed;
        if (lastItem != -1)
        {
            saveTimer.Elapsed += saveTimer_Elapsed;
            saveTimer.Start();
        }
    }
}
Thursday
Today, I've really been motoring, and managed to pull slightly ahead of my schedule, but I've also hit a bit of a gesture roadblock. I've updated the gesture code to retrieve multiple finger positions, as well as to retrieve a single node. This actually works really well from a code point of view, and it's great to see my "fingers" moving around on the screen. The problem is that one finger "wobbles" less than a hand, plus the item selection actually works better based on a single finger. At least I've got options now – and the gesture code is working well – plus, I've managed to hook up single selection of a list item based on my finger position. This allows the gesture code to mimic the touch code, something I'm going to explore more in week 3. My gesture selection code has now grown to this:
using Goldlight.Perceptual.Sdk.Events;
using Goldlight.Windows8.Mvvm;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace Goldlight.Perceptual.Sdk
{
    public class GesturePipeline : AsyncPipelineBase
    {
        private WeakEvent<GesturePositionEventArgs> gestureRaised = new WeakEvent<GesturePositionEventArgs>();
        private WeakEvent<MultipleGestureEventArgs> multipleGesturesRaised =
            new WeakEvent<MultipleGestureEventArgs>();
        private WeakEvent<GestureRecognizedEventArgs> gestureRecognised =
            new WeakEvent<GestureRecognizedEventArgs>();

        public event EventHandler<GesturePositionEventArgs> HandMoved
        {
            add { gestureRaised.Add(value); }
            remove { gestureRaised.Remove(value); }
        }

        public event EventHandler<MultipleGestureEventArgs> FingersMoved
        {
            add { multipleGesturesRaised.Add(value); }
            remove { multipleGesturesRaised.Remove(value); }
        }

        public event EventHandler<GestureRecognizedEventArgs> GestureRecognized
        {
            add { gestureRecognised.Add(value); }
            remove { gestureRecognised.Remove(value); }
        }

        public GesturePipeline()
        {
            EnableGesture();
            searchPattern = GetSearchPattern();
        }

        private int ScreenWidth = 1024;
        private int ScreenHeight = 960;

        // Called from the UI so that camera coordinates can be scaled to the display.
        public void SetBounds(int width, int height)
        {
            this.ScreenWidth = width;
            this.ScreenHeight = height;
        }

        public override void OnGestureSetup(ref PXCMGesture.ProfileInfo pinfo)
        {
            pinfo.activationDistance = 75;
            base.OnGestureSetup(ref pinfo);
        }

        public override void OnGesture(ref PXCMGesture.Gesture gesture)
        {
            if (gesture.active)
            {
                var handler = gestureRecognised;
                if (handler != null)
                {
                    handler.Invoke(new GestureRecognizedEventArgs(gesture.label.ToString()));
                }
            }
            base.OnGesture(ref gesture);
        }

        public override bool OnNewFrame()
        {
            try
            {
                if (!IsDisconnected())
                {
                    var gesture = QueryGesture();
                    PXCMGesture.GeoNode[] nodeData = new PXCMGesture.GeoNode[6];
                    PXCMGesture.GeoNode singleNode;
                    searchPattern = PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY;

                    // Multiple finger positions for the primary hand.
                    var status = gesture.QueryNodeData(0, searchPattern, nodeData);
                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = multipleGesturesRaised;
                        if (handler != null)
                        {
                            List<GestureItem> gestures = new List<GestureItem>();
                            foreach (var node in nodeData)
                            {
                                float x = node.positionImage.x - 85;
                                float y = node.positionImage.y - 75;
                                GestureItem item = new GestureItem(x, y, node.body.ToString(), ScreenWidth, ScreenHeight);
                                gestures.Add(item);
                            }
                            // Raise the event once, with the full set of nodes.
                            handler.Invoke(new MultipleGestureEventArgs(gestures));
                        }
                    }

                    // A single node for the simple "finger position" case.
                    status = gesture.QueryNodeData(0, GetSearchPattern(), out singleNode);
                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = gestureRaised;
                        if (handler != null)
                        {
                            handler.Invoke(new GesturePositionEventArgs(singleNode.positionImage.x,
                                singleNode.positionImage.y, singleNode.body.ToString(), ScreenWidth, ScreenHeight));
                        }
                    }
                }
                Sleep(20);
            }
            catch (Exception)
            {
                // Swallow camera glitches; the next frame will simply try again.
            }
            return true;
        }

        private readonly object SyncLock = new object();

        private void Sleep(int time)
        {
            lock (SyncLock)
            {
                Monitor.Wait(SyncLock, time);
            }
        }

        private PXCMGesture.GeoNode.Label searchPattern;

        private PXCMGesture.GeoNode.Label GetSearchPattern()
        {
            return PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_INDEX |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_MIDDLE |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_PINKY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_RING |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_THUMB |
                PXCMGesture.GeoNode.Label.LABEL_HAND_FINGERTIP;
        }
    }
}
I’ve hooked into the various events from here to handle the type of interactions we are used to seeing from mouse and touch. Again, there’s a bit of repeated code in here, but I’ll be converting this into WPF behaviors which you will be able to use in your own projects. I’ve said it before, I love writing Blend Behaviors – they really make your life easy. As a personal plea to Microsoft; if you want to get people to write for WinRT, give us the features we’re used to in XAML right now. The fact that I can’t use Blend Behaviors in Windows 8 “Modern” applications is one more barrier in the way of me actually writing for the app store.
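To give an idea of the direction, here's a rough sketch of how that enter/leave logic could be wrapped up as a Blend behavior using System.Windows.Interactivity – a sketch of the plan rather than finished Huda code:

using System.Windows;
using System.Windows.Interactivity;

public class GestureOverBehavior : Behavior<FrameworkElement>
{
    protected override void OnAttached()
    {
        base.OnAttached();
        AssociatedObject.Loaded += OnLoaded;
    }

    protected override void OnDetaching()
    {
        AssociatedObject.Loaded -= OnLoaded;
        PipelineProvider.Gesture.HandMoved -= OnHandMoved;
        base.OnDetaching();
    }

    private void OnLoaded(object sender, RoutedEventArgs e)
    {
        // Hook into the shared pipeline exactly as the user control did; the
        // bounds test and enter/leave bookkeeping would live here so that any
        // element can opt in from XAML.
        PipelineProvider.Gesture.HandMoved += OnHandMoved;
    }

    private void OnHandMoved(object sender, GesturePositionEventArgs e)
    {
        // Same rectangle test as before, raising routed events or invoking
        // commands against AssociatedObject.
    }
}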
Friday
Following the big strides forward I made with the gesture code yesterday, I'm going to have an easy one today and hook up the filter code. A lot of the time here will just be putting the plumbing in place to actually display the filters and select them. Initially, I'm only going to offer a few filters – I'll add more next week. The point here is to prove that I can do this easily. I'm not spending much time on the UI for the filters right now, but I will be adding to it in the near future; while the interface is quite spartan, for a seasoned WPF developer actually beefing it up isn't a major task. What I do have in place is that all the main interface areas are solid black, with a glowy border around them. Once I've added the filters in and hooked them up, I think I'll post a picture or two to show the before and after filter behaviour of Huda.
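For the WPF crowd, the plumbing is about as plain as MVVM gets – something along these lines, with illustrative names rather than Huda's actual classes, and leaning on the filter service sketched back on Monday:

using System.Collections.ObjectModel;
using GalaSoft.MvvmLight;
using GalaSoft.MvvmLight.Command;

public class FilterSelectionViewModel : ViewModelBase
{
    private readonly IFilterService filterService;

    public FilterSelectionViewModel(IFilterService filterService)
    {
        this.filterService = filterService;
        AvailableFilters = new ObservableCollection<string> { "Sepia", "Gaussian blur", "Channel" };

        // The view binds a filter name to this command; the central service does the work.
        ApplyFilter = new RelayCommand<string>(name => filterService.Apply(name));
    }

    public ObservableCollection<string> AvailableFilters { get; private set; }
    public RelayCommand<string> ApplyFilter { get; private set; }
}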
This screenshot is taken immediately after the very posh sheep photo was loaded:
If you look carefully at the bottom of the image, you’ll see the very faint Histogram view that I was talking about earlier. I’ve made this part of the interface transparent – gesture over it, or touch it and it becomes opaque. I’ll be doing something similar with other operations as well – the photo should be the view point, not the chrome such as the available filters.
I’ll apply the channel filter and blur like this:
Well, they have applied quite nicely. I’ll add other filters and the ability to set their values next week.
This Week's Music
- Deep Purple – Purpendicular
- AC/DC – Back in Black
- Andrea Bocelli – Sacred Arias
- Muse – Black Holes and Revelations
- Chickenfoot – Chickenfoot III
- Jethro Tull – Aqualung and Thick As A Brick
End of Week 2
Well, here we are. Another week of developing and blogging. Where do I think I’m at right now? In some ways, I’m ahead of schedule, but in other ways I feel I’m behind schedule. Oh, the code will be finished on time, and the application will do what I set out to do, but when I look at what my competitors are doing, I can’t help feeling that I could do so much more with this. The big problem, of course, is that it’s easy to get carried away with “cool” features but if they don’t add anything to the application, then they are worthless. While the things that the other teams are doing make sense in the problem domains they are using, some of these features really don’t make sense in a photo editor. Perhaps I could have been more ambitious with what I wanted to deliver at the start, but overall I’m happy with what I’m going to deliver and what is coming up.
As a corollary to this, have you checked out what my fellow contestants are planning? Please do so and comment on their posts. They are all doing some amazing things, and I would take it as a personal favour if you would take the time to offer your encouragement to them.
By the end of week 3, I fully intend to have the whole central premise linked in. If you remember, my statement for Huda was that edits should be non-destructive. Well, they aren't destructive, but they aren't saved anywhere right now either. That's what I need to hook in next week. The edits must be saved away, and the user must be able to edit them. If I haven't delivered this, then I've failed to deliver on the promise of the application. Fortunately, I already know how I want to go about the save, and it doesn't touch too much code that's already in place, so it should be a nice and natural upgrade.
While I've thanked my fellow competitors, I haven't taken the time to thank the judges and our support team (primarily the wonderful Bob Duffy and Wendy Boswell; what does the X stand for in your name?). There's a real sense of camaraderie amongst the competitors, more so than I would ever have expected in a contest. We are all prepared to give our time and our thoughts to help the other teams – ultimately, we were all picked because of our willingness to share with the wider audience, so there's a natural symbiosis in us sharing with each other. Personalities are starting to come to the fore, and that is wonderful. Anyway, back to the judges (sorry for the digression). Come Wednesday, I hit refresh on the browser so many times waiting for their comments. Surprisingly enough, for those who know me, this is not an ego issue. I'm genuinely interested to read what the judges have to say. This is partly because I want to see that they think the application is still worth keeping their interest in (and yes Sascha, I know precisely why Ctrl-Z is so important in photo apps, given the number of disastrous edits I've made in the past). The main reason, though, is that I want to see that I'm writing the blog post at the right level. As someone who is used to writing articles that explain code listings, I want to make sure that I don't make you nod off if you aren't a coder, but at the same time I have a duty of care towards my fellow developers to try and educate them. So, to the judges, as well as Bob and Wendy: thank you for your words and support.
You may have noticed that I haven't posted any YouTube videos this week. There's a simple reason for that. Right now, Huda is changing rapidly from day to day. It's hard to choose a point where I want to say "yes, stopping and videoing at this point makes sense", because I know that if I wait until the following day, the changes to the application would make as much, or even more, sense to show. My gut feeling, right now, is that the right time to make the next video is when we can start reordering the filters.
One thing that was picked up on by the judges was my posting the music I was listening to. You may be wondering why I did that. There was a reason for it, and it wasn't just a bit of daftness on my part. Right now, this contest is a large part of my life. I'm devoting a lot of time to it – that isn't a complaint, as I love writing code – but you can go just the slightest bit insane if you don't have some background noise. I can't work with the TV on, but I can code to music, so I thought it might be interesting for the non-coders to know how at least one coder hangs on to that last shred of what he laughingly calls his sanity. Last week, I didn't manage to pick up my guitar, so I am grateful that I managed to get 40 minutes or so to just noodle around this week. My down time is devoted to my family, including our newest addition, Harvey, and that is time when I don't even look at a computer. Expect to see more of Harvey over the next few weeks.
Harvey in all his glory.
So, thanks for reading this. I hope you reached this point without nodding off, and the magic word for this week is Wibble. We’ll see which judges manage to get it into their posts – that’ll see who’s paying attention. Until next week, bye.