Introduction
At PDC 2009, Reed Townsend presented some very exciting multitouch samples in his presentation, Windows Touch Deep Dive. The following video shows the presentation:

This blog post covers the code behind the custom 3D manipulations that were presented. You can download the 3D manipulations code sample from my server. Note that this project uses the DirectX SDK. If you don't already have the DirectX SDK installed, you can download it from the DirectX Developer Center. After installing the SDK, you will also need to add its header, library, and other include paths to Visual Studio. You may also need to add some additional libraries (D3D11.lib;d3d10_1.lib;) to the additional dependencies in the linker input if you are getting these errors:
1>D3DXDriver.obj : error LNK2019: unresolved external symbol _D3D10CreateDeviceAndSwapChain1@36
  referenced in function "private: long __thiscall D3DXDriver::CreateSwapChain(unsigned int,unsigned int)"
  (?CreateSwapChain@D3DXDriver@@AAEJII@Z)
1>D3DXDriver.obj : error LNK2019: unresolved external symbol _D3D11CreateDevice@40
  referenced in function "private: long __thiscall D3DXDriver::CreateSwapChain(unsigned int,unsigned int)"
  (?CreateSwapChain@D3DXDriver@@AAEJII@Z)
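Alternatively, if you'd rather not edit the project settings, MSVC lets you pull in the libraries directly from source; a minimal sketch:

// Link the Direct3D libraries from source instead of editing the
// linker's Additional Dependencies (MSVC-specific pragma).
#pragma comment(lib, "d3d11.lib")
#pragma comment(lib, "d3d10_1.lib")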
For the 3D manipulations demo that was shown at PDC 2009, a custom implementation of Windows Touch input is used to map gestures to transformations in 3D space. The following image shows the application in use:

Overview
For this demo, a few utility classes are created that simplify and organize the code. The D3DXDriver class encapsulates the Direct3D setup and control. The CComTouchDriver class encapsulates Windows Touch handling. The Camera class inherits from InertiaObj, a generic inertia class, and encapsulates the various transformations that are made to the camera object within the scene. The following dataflow shows how messages are propagated through the application in the utility classes:

In the diagram on the left, the WM_TOUCH message is generated by user input on the device and sent to the main window's WndProc method. From there, the message is passed to the CComTouchDriver class, which sends the event data to the Camera class, which in turn feeds the input to its input handler. The input then causes the manipulation processor (represented in the diagram on the right) to raise events such as ManipulationStarted and ManipulationDelta. The event handler for the ManipulationDelta event updates the camera position based on the event's values.
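To make those relationships concrete, here is a rough skeleton of the utility classes (simplified; the actual declarations in the sample differ):

// Sketch of the class relationships described above (signatures simplified).
class InertiaObj : public _IManipulationEvents   // receives manipulation events
{
public:
    virtual VOID ProcessInputEvent(TOUCHINPUT const * inData, int iNumContacts);
protected:
    IManipulationProcessor *m_manipulationProc;  // interprets raw touch input
    IInertiaProcessor      *m_inertiaProc;       // extrapolates motion after touch-up
};

class Camera : public InertiaObj                 // maps deltas to 3D transforms
{
public:
    VOID Reset();
    VOID Pan(D3DXVECTOR2 vPan, FLOAT zPan);
    VOID SphericalPan(D3DXVECTOR2 vPan);
};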
Demo component details
The following sections describe the various tasks that were completed to create this demo:
- Set up Direct3D
- Add WM_TOUCH support
- Add manipulations support
- Map manipulations to 3D navigation
- Add inertia and tweak the application
Set up Direct3D
This project uses the D3DXDriver class, which simplifies hooking up Direct3D to a project. For this project, the D3DXDriver class encapsulates the rendering methods and manages some of the scene objects. The Render method uses the camera object to set up the camera position, so that updates the camera object makes to itself are reflected when the scene renders.
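A sketch of that flow (the D3DXMatrixLookAtLH call is taken from the sample's rendering code shown later in this post; the surrounding structure is an assumption):

// Sketch: rebuild the view matrix from the camera's current state on
// every frame, then draw the scene objects.
VOID D3DXDriver::Render()
{
    D3DXMatrixLookAtLH(&m_mView,
        &(m_pCamera->m_vPos), &(m_pCamera->m_vLookAt), &(m_pCamera->m_vUp));
    // ... set the view matrix on the effect and render the boxes and axes ...
}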
Once you have Direct3D working using the driver, you can set up the basic elements of the scene.
Create boxes
The boxes that are seen in the scene are generated by a few calls to the Direct3D API. The following code shows you how the boxes are generated and randomly placed:
srand(1987);  // fixed seed so the scene is identical on every run
for (int i = 0; i < NUM_BOXES; i++)
{
    m_amBoxes[i] = new (std::nothrow) D3DXMATRIX();
    if (m_amBoxes[i] == NULL)
    {
        hr = E_FAIL;
        break;
    }
    D3DXMatrixIdentity(m_amBoxes[i]);
    // Place each box at a random position within the scene volume
    D3DXMatrixTranslation(
        m_amBoxes[i],
        (FLOAT)(rand() % 20 - 5),
        (FLOAT)(rand() % 10 - 5),
        (FLOAT)(rand() % 20 - 5));
}
The following code shows how the boxes are rendered:
for (int i = 0; i < NUM_BOXES; i++)
{
    RenderBox(m_amBoxes[i], FALSE, NULL);
}
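RenderBox itself isn't shown in this post; judging from its two call sites, its signature is presumably something like the following (a guess, including the color type):

// Inferred from the call sites: a world transform for the box, a flag for
// whether to apply a solid color, and the color to use for the axis boxes.
VOID RenderBox(D3DXMATRIX *pmWorld, BOOL fUseColor, D3DXVECTOR4 *pvColor);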
Create axes
The x, y, and z axes are added to the scene to provide reference points as you move the scene around. In the implementation, these objects are just stretched boxes, created in a manner similar to the randomly placed boxes from the previous step. The following code shows how the axis transforms are initialized:
FLOAT afScale[3], afTrans[3];
D3DXMATRIX mScale, mTranslate;
for (int i = 0; i < 3; i++)
{
    m_amAxes[i] = new (std::nothrow) D3DXMATRIX();
    if (m_amAxes[i] == NULL)
    {
        hr = E_FAIL;
        break;
    }
    D3DXMatrixIdentity(m_amAxes[i]);
    // Start from a thin box, then stretch and shift it along one axis
    for (int j = 0; j < 3; j++)
    {
        afScale[j] = AXIS_SIZE;
        afTrans[j] = 0.0f;
    }
    afScale[i] = AXIS_LENGTH;
    afTrans[i] = AXIS_LENGTH * 0.98f;
    D3DXMatrixScaling(&mScale, afScale[0], afScale[1], afScale[2]);
    D3DXMatrixTranslation(&mTranslate, afTrans[0], afTrans[1], afTrans[2]);
    D3DXMatrixMultiply(m_amAxes[i], &mScale, &mTranslate);
}
The following code shows how the axes are rendered in D3DXDriver.cpp:
for (int i = 0; i < 3; i++)
{
    RenderBox(m_amAxes[i], TRUE, &(g_vAxesColors[i]));
}
Set up initial position for the camera
The camera must be initialized so that the view a user sees includes the entire scene that was set up in the previous steps. The following code shows how the camera is set up in the camera class:
VOID Camera::Reset()
{
    m_vPos    = D3DXVECTOR3( 20.0f, 20.0f, -20.0f );  // back the camera away from the origin
    m_vLookAt = D3DXVECTOR3(  0.0f,  0.0f,   0.0f );  // focal point at the center of the scene
    m_vUp     = D3DXVECTOR3(  0.0f,  1.0f,   0.0f );
}
The following code shows how the camera object is used to render from the camera’s position:
D3DXMatrixLookAtLH(
&m_mView,
&(m_pCamera->m_vPos),
&(m_pCamera->m_vLookAt),
&(m_pCamera->m_vUp) );
Add WM_TOUCH support
By default, applications receive WM_GESTURE messages. Because this demo is a custom implementation of touch input, you need to call RegisterTouchWindow to get WM_TOUCH messages instead, and ultimately a manipulation processor is used to interpret some of the WM_TOUCH messages. The following code shows how RegisterTouchWindow is called in the InitWindow function in the project's main source file:
if (SUCCEEDED(hr))
{
    RegisterTouchWindow(*hWnd, 0);
    ShowWindow(*hWnd, nCmdShow);
}
An advantage of using WM_TOUCH messages and custom touch handling over WM_GESTURE messages is that you can perform two different types of manipulation simultaneously (zoom while panning, rotate while zooming, and so on). The following code shows how the main window's WndProc maps the WM_TOUCH message to the TouchProc function:
LRESULT CALLBACK WndProc( HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam )
{
    PAINTSTRUCT ps;
    HDC hdc;
    switch( message )
    {
    case WM_TOUCH:
        TouchProc(hWnd, message, wParam, lParam);
        break;
    (...)
    }
}
The following code shows how the TouchProc function converts the touch points to client coordinates and forwards them to the CComTouchDriver class:
LRESULT TouchProc( HWND hWnd, UINT , WPARAM wParam, LPARAM lParam )
{
    PTOUCHINPUT pInputs;
    HTOUCHINPUT hInput;
    int iNumContacts;
    POINT ptInputs;

    iNumContacts = LOWORD(wParam);    // number of touch points in this message
    hInput = (HTOUCHINPUT)lParam;
    pInputs = new (std::nothrow) TOUCHINPUT[iNumContacts];
    if (pInputs != NULL)
    {
        if (GetTouchInputInfo(hInput, iNumContacts, pInputs, sizeof(TOUCHINPUT)))
        {
            for (int i = 0; i < iNumContacts; i++)
            {
                // TOUCHINPUT coordinates are in hundredths of a screen pixel;
                // convert to pixels, then to client coordinates
                ptInputs.x = pInputs[i].x / 100;
                ptInputs.y = pInputs[i].y / 100;
                ScreenToClient(hWnd, &ptInputs);
                pInputs[i].x = ptInputs.x;
                pInputs[i].y = ptInputs.y;
                g_ctDriver->ProcessInputEvent(&pInputs[i]);
            }
        }
        delete [] pInputs;
    }
    CloseTouchInputHandle(hInput);
    return 0;
}
The following code shows how the CComTouchDriver class handles the input event:
VOID CComTouchDriver::ProcessInputEvent(TOUCHINPUT * inData)
{
    // A multi-finger tap is the custom "reset the view" gesture
    if ( IsMultiFingerTap( inData ) )
    {
        m_pCamera->Reset();
    }
    m_pCamera->ProcessInputEvent(inData, m_pointMap.size());
}
The driver first checks for the custom multi-finger tap gesture, which resets the camera, and then sends the input data on to the camera. The following code shows how the camera handles the input data:
VOID Camera::ProcessInputEvent(TOUCHINPUT const * inData, int iNumContacts)
{
    TrackNumContacts(inData->dwTime, iNumContacts);
    InertiaObj::ProcessInputEvent(inData, iNumContacts);
}
The following code shows how the InertiaObj class handles WM_TOUCH data:
VOID InertiaObj::ProcessInputEvent(TOUCHINPUT const * inData, int )
{
    DWORD dwCursorID = inData->dwID;
    DWORD dwTime     = inData->dwTime;
    DWORD dwEvent    = inData->dwFlags;
    FLOAT fpX = (FLOAT)inData->x, fpY = (FLOAT)inData->y;

    // Feed each touch event to the manipulation processor with its timestamp
    if (dwEvent & TOUCHEVENTF_DOWN)
    {
        m_manipulationProc->ProcessDownWithTime(dwCursorID, fpX, fpY, dwTime);
    }
    else if (dwEvent & TOUCHEVENTF_MOVE)
    {
        m_manipulationProc->ProcessMoveWithTime(dwCursorID, fpX, fpY, dwTime);
    }
    else if (dwEvent & TOUCHEVENTF_UP)
    {
        m_manipulationProc->ProcessUpWithTime(dwCursorID, fpX, fpY, dwTime);
    }
}
In summary, the touch data propagates from the main application, to the touch driver, to the camera.
Add manipulations and map manipulations to 3D navigation
This project uses CComTouchDriver, a class that encapsulates much of the touch input and provides places where the input handling can easily be customized, and InertiaObj, a class that encapsulates touch input handling for inertia. As described in the previous section, WM_TOUCH messages are handed to the touch driver in the main window's WndProc method, which routes them to the camera; the camera, through InertiaObj, implements the _IManipulationEvents interface. Once the messages reach the classes implementing the _IManipulationEvents interface, manipulation events are generated, and you can map those manipulations to 3D navigation. The following sections describe the various manipulation mappings.
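The fragments in the subsections below all run inside the ManipulationDelta event handler. As a sketch of the surrounding plumbing (the handler body here is an assumption; the _IManipulationEvents method signature comes from manipulations.h, and Delta is the small struct used by the smoothing code later in this post):

// Sketch only -- the exact handler lives in the sample's InertiaObj/Camera code.
HRESULT STDMETHODCALLTYPE Camera::ManipulationDelta(
    FLOAT x, FLOAT y,
    FLOAT translationDeltaX, FLOAT translationDeltaY,
    FLOAT scaleDelta, FLOAT expansionDelta, FLOAT rotationDelta,
    FLOAT cumulativeTranslationX, FLOAT cumulativeTranslationY,
    FLOAT cumulativeScale, FLOAT cumulativeExpansion, FLOAT cumulativeRotation)
{
    // Pack the per-frame deltas, smooth them, then map them to camera motion
    Delta delta = { translationDeltaX, translationDeltaY, rotationDelta, scaleDelta };
    SmoothManipulationDelta(&delta);
    // ... the zoom/pan fragments in the next subsections consume 'delta' ...
    return S_OK;
}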
Zoom / Pinch to move the camera’s distance or pan about the z-axis
These transforms are hooked up within the manipulation event handler to modify the camera's distance while keeping the camera locked to the focal point, or to pan along the z-axis:
SHORT sCtrlState = GetKeyState(VK_CONTROL);
if (sCtrlState < 0)   // high-order bit set: the Control key is down
{
    // Ctrl + pinch: pan the camera along the z-axis
    vPan = D3DXVECTOR2(0.0f, 0.0f);
    Pan(vPan, CalcZPan(delta.scaleDelta));
}
else
{
    // Pinch alone: zoom by changing the camera's distance to the focal point
    Scale(delta.scaleDelta);
}
Note that holding the Control key makes the pinch gesture perform an operation similar to zooming, but with a subtle difference: the camera's focal distance remains fixed and the camera itself moves through the coordinate space instead.
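Neither CalcZPan nor Scale appears in the fragments above. Hypothetical sketches, assuming CalcZPan turns the pinch's scale factor into a z-distance proportional to the camera radius and Scale moves the camera along the look-at vector:

// Hypothetical helpers -- the sample's actual implementations may differ.
FLOAT Camera::CalcZPan(FLOAT scaleDelta)
{
    D3DXVECTOR3 vRadius = m_vPos - m_vLookAt;
    // A pinch out (scaleDelta > 1) moves the camera forward along z
    return D3DXVec3Length(&vRadius) * (scaleDelta - 1.0f);
}

VOID Camera::Scale(FLOAT scaleDelta)
{
    // Shrink or grow the camera's distance to the focal point
    D3DXVECTOR3 vRadius = m_vPos - m_vLookAt;
    m_vPos = m_vLookAt + vRadius / scaleDelta;
}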
Spherical panning
Panning is done by rotating the camera about a focal point behind the scene created by the boxes. The following code shows how the camera is panned in the manipulation event handler:
if (m_uLagNumContacts >= 2)
{
    // Two or more fingers: flat pan in the camera plane
    vPan = D3DXVECTOR2(-delta.translationDeltaX, delta.translationDeltaY);
    Pan(vPan, 0);
}
else
{
    // One finger: orbit the camera around the focal point
    vSpherePan = D3DXVECTOR2(
        -delta.translationDeltaX,
        delta.translationDeltaY);
    SphericalPan(vSpherePan);
}
The following code shows how the Pan method is implemented:
VOID Camera::Pan(D3DXVECTOR2 vPan, FLOAT zPan)
{
    D3DXVECTOR3 vTPan, vRadius;
    RECT rClient;

    GetClientRect(m_hWnd, &rClient);
    FLOAT fpWidth  = (FLOAT)(rClient.right - rClient.left);
    FLOAT fpHeight = (FLOAT)(rClient.bottom - rClient.top);

    // Scale the pan by the size of the view frustum at the camera's distance,
    // so on-screen motion tracks the finger at any zoom level
    vRadius = m_vPos - m_vLookAt;
    FLOAT fpRadius = D3DXVec3Length(&vRadius);
    FLOAT fpYPanCoef = 2*fpRadius / tan( (D3DX_PI - FOV_Y) / 2.0f );
    FLOAT fpXPanCoef = fpYPanCoef * (fpWidth / fpHeight);

    vTPan.x = vPan.x * fpXPanCoef;
    vTPan.y = vPan.y * fpYPanCoef;
    vTPan.z = zPan;

    // Rebase the pan vector onto the camera's axes, then move both the
    // camera and its focal point so the view direction is preserved
    ScreenVecToCameraVec(&vTPan, vTPan);
    m_vPos    += vTPan;
    m_vLookAt += vTPan;
}
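The ScreenVecToCameraVec helper used above isn't shown in this post. A sketch of what it presumably does, rebasing a screen-aligned vector onto the camera's world-space axes (an assumption, not the sample's code):

// Hypothetical sketch: express a screen-space vector (x = right,
// y = up, z = forward) in world space using the camera's basis.
VOID Camera::ScreenVecToCameraVec(D3DXVECTOR3 *pvOut, D3DXVECTOR3 vIn)
{
    D3DXVECTOR3 vLook = m_vLookAt - m_vPos;
    D3DXVec3Normalize(&vLook, &vLook);
    D3DXVECTOR3 vRight;
    D3DXVec3Cross(&vRight, &m_vUp, &vLook);   // left-handed basis
    D3DXVec3Normalize(&vRight, &vRight);
    D3DXVECTOR3 vUp;
    D3DXVec3Cross(&vUp, &vLook, &vRight);
    *pvOut = vIn.x * vRight + vIn.y * vUp + vIn.z * vLook;
}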
The following code shows how the SphericalPan method is implemented:
VOID Camera::SphericalPan(D3DXVECTOR2 vPan)
{
    D3DXQUATERNION q;
    D3DXMATRIX mRot;
    D3DXVECTOR3 vRotAxis;

    D3DXVECTOR3 vRadius = m_vPos - m_vLookAt;
    FLOAT radius       = D3DXVec3Length(&vRadius);
    FLOAT cameraHeight = D3DXVec3Dot(&vRadius, &m_vUp);
    FLOAT xOrbitRadius = sqrt( pow(radius, 2) - pow(cameraHeight, 2) );

    // Scale the pan so a full drag sweeps a consistent arc on the orbit sphere
    FLOAT ySpherePanCoef = 2 * sqrt(2.0f * pow(radius, 2));
    FLOAT xSpherePanCoef = 2 * sqrt(2.0f * pow(xOrbitRadius, 2));
    vPan.x *= xSpherePanCoef;
    vPan.y *= ySpherePanCoef;

    // Treat the pan length as an arc on the sphere and compute the chord
    // connecting the camera's start and end positions
    D3DXVECTOR3 vTPan = D3DXVECTOR3(vPan.x, vPan.y, 0);
    FLOAT theta    = D3DXVec2Length(&vPan) / radius;
    FLOAT gamma    = (FLOAT)((D3DX_PI - theta) / 2.0f);
    FLOAT chordLen = (radius * sin(theta)) / sin(gamma);

    ScreenVecToCameraVec(&vTPan, vTPan);
    D3DXVec3Normalize(&vTPan, &vTPan);
    vTPan *= chordLen;

    // Tilt the chord so the new camera position stays on the sphere
    D3DXQuaternionRotationAxis(&q,
        D3DXVec3Cross(&vRotAxis, &vTPan, &m_vPos),
        -(FLOAT)((D3DX_PI / 2.0f) - gamma));
    D3DXMatrixRotationQuaternion(&mRot, &q);
    D3DXVec3TransformCoord(&vTPan, &vTPan, &mRot);
    D3DXVECTOR3 vNewPos = m_vPos + vTPan;

    // Flip the up vector if the camera crossed over the pole
    D3DXVECTOR3 vXBefore, vXAfter;
    vRadius = m_vPos - m_vLookAt;
    D3DXVec3Cross(&vXBefore, &vRadius, &m_vUp);
    D3DXVec3Normalize(&vXBefore, &vXBefore);
    vRadius = vNewPos - m_vLookAt;
    D3DXVec3Cross(&vXAfter, &vRadius, &m_vUp);
    D3DXVec3Normalize(&vXAfter, &vXAfter);
    D3DXVECTOR3 vXPlus = vXBefore + vXAfter;
    if ( D3DXVec3Length(&vXPlus) < 0.5f )
    {
        m_vUp = -m_vUp;
    }

    m_vPos = vNewPos;
}
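As a sanity check on the chord math: with gamma = (pi - theta) / 2, sin(gamma) equals cos(theta / 2), so radius * sin(theta) / sin(gamma) simplifies to 2 * radius * sin(theta / 2), which is exactly the length of the chord subtending an angle theta on a circle of the given radius.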
2-finger tap detection
This custom gesture is detected when more than one contact comes down and all contacts come up within a certain window of time. Handling the gesture is implemented by recording the time and position at which fingers come down, as well as the time at which they come up. To track the point inputs and calculate the distance the points have travelled, a map, m_pointMap, is created to store the points. To track the time and number of contacts, the start time for the input sequence is stored along with the maximum number of contacts seen. The following code shows how 2-finger tap detection is implemented:
(...)
unsigned int m_uMaxNumContactsSeen;
DWORD m_dwGestureStartTime;
std::map<DWORD, D3DXVECTOR2> m_pointMap;
FLOAT m_fpMaxDist;
(...)
BOOL CComTouchDriver::IsMultiFingerTap(TOUCHINPUT const * inData)
{
    BOOL fResult = FALSE;
    DWORD dwPTime    = inData->dwTime;
    DWORD dwEvent    = inData->dwFlags;
    DWORD dwCursorID = inData->dwID;
    FLOAT x = (FLOAT)(inData->x);
    FLOAT y = (FLOAT)(inData->y);

    if (dwEvent & TOUCHEVENTF_DOWN)
    {
        // First finger down starts a potential tap gesture
        if (m_pointMap.size() == 0)
        {
            m_dwGestureStartTime = dwPTime;
            m_fpMaxDist = 0;
        }
        try
        {
            m_pointMap.insert(std::pair<DWORD,
                D3DXVECTOR2>(dwCursorID, D3DXVECTOR2(x, y)));
            if (m_pointMap.size() > m_uMaxNumContactsSeen)
            {
                m_uMaxNumContactsSeen = m_pointMap.size();
            }
        }
        catch (const std::bad_alloc&)
        {
            // Allocation failed; poison the distance so this gesture can't qualify
            m_fpMaxDist = MAX_TAP_DIST + 1;
        }
    }
    else if (dwEvent & TOUCHEVENTF_UP)
    {
        // Track how far this contact moved between down and up
        std::map<DWORD, D3DXVECTOR2>::iterator it = m_pointMap.find(dwCursorID);
        if (it != m_pointMap.end())
        {
            D3DXVECTOR2 ptStart = (*it).second;
            D3DXVECTOR2 ptEnd   = D3DXVECTOR2( x, y );
            D3DXVECTOR2 vDist   = ptEnd - ptStart;
            FLOAT fpDist = D3DXVec2Length( &vDist );
            if (fpDist > m_fpMaxDist)
            {
                m_fpMaxDist = fpDist;
            }
        }
        m_pointMap.erase(dwCursorID);

        // When the last finger lifts, check whether this was a quick multi-finger tap
        if (m_pointMap.size() == 0)
        {
            if (m_uMaxNumContactsSeen >= 2 &&
                dwPTime - m_dwGestureStartTime < MAX_TAP_TIME)
            {
                fResult = TRUE;
            }
            m_uMaxNumContactsSeen = 0;
        }
    }

    // Any finger that travelled too far disqualifies the tap
    if (m_fpMaxDist > MAX_TAP_DIST)
    {
        fResult = FALSE;
    }
    return fResult;
}
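MAX_TAP_TIME and MAX_TAP_DIST are defined elsewhere in the sample; the actual values are not shown here, but they would look something like the following (illustrative numbers only, tune to taste):

// Illustrative thresholds only -- not the sample's actual values.
#define MAX_TAP_TIME 250     // ms from first finger down to last finger up
#define MAX_TAP_DIST 20.0f   // max distance, in client pixels, a contact may travel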
Manipulation smoothing
The input that you get by default has some variability, which is not optimal for this particular project: it causes jittery motion when panning around. This could be caused by noise on the input device, or by the manipulation processor interpreting the gesture as a combined pan and rotate. To fix this, a window of recent manipulation deltas is kept and averaged before the deltas are applied. Smoothing the deltas fixes the wobbly panning and zooming. The following code shows how the window of deltas is stored and averaged:
VOID Camera::SmoothManipulationDelta(Delta *delta)
{
    Delta sumDeltas;

    // Insert the newest delta into the circular window
    m_ucWindowIndex = m_ucWindowIndex % SMOOTHING_WINDOW_SIZE;
    m_pDeltaWindow[m_ucWindowIndex++] = *delta;

    DeltaIdentity(&sumDeltas);
    for (int i = 0; i < SMOOTHING_WINDOW_SIZE; i++)
    {
        sumDeltas.translationDeltaX += m_pDeltaWindow[i].translationDeltaX;
        sumDeltas.translationDeltaY += m_pDeltaWindow[i].translationDeltaY;
        sumDeltas.rotationDelta     += m_pDeltaWindow[i].rotationDelta;
        sumDeltas.scaleDelta        *= m_pDeltaWindow[i].scaleDelta;
    }

    // Arithmetic mean for translation and rotation; geometric mean for scale,
    // because scale factors combine multiplicatively
    sumDeltas.translationDeltaX /= SMOOTHING_WINDOW_SIZE;
    sumDeltas.translationDeltaY /= SMOOTHING_WINDOW_SIZE;
    sumDeltas.rotationDelta     /= SMOOTHING_WINDOW_SIZE;
    sumDeltas.scaleDelta = pow(sumDeltas.scaleDelta, 1.0f / SMOOTHING_WINDOW_SIZE);

#if SMOOTH_TRANSLATION
    delta->translationDeltaX = sumDeltas.translationDeltaX;
    delta->translationDeltaY = sumDeltas.translationDeltaY;
#endif
#if SMOOTH_ROTATION_AND_ZOOM
    delta->scaleDelta    = sumDeltas.scaleDelta;
    delta->rotationDelta = sumDeltas.rotationDelta;
#endif
}
Note that this is implemented by simply averaging the input deltas in the members of the _IManipulationEvents interface, via the InertiaObj class.
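The Delta struct and the DeltaIdentity helper are not shown above; their shape can be inferred from how SmoothManipulationDelta uses them (a sketch):

// Inferred from SmoothManipulationDelta: translation and rotation accumulate
// additively, scale accumulates multiplicatively.
struct Delta
{
    FLOAT translationDeltaX;
    FLOAT translationDeltaY;
    FLOAT rotationDelta;
    FLOAT scaleDelta;
};

VOID DeltaIdentity(Delta *pDelta)
{
    pDelta->translationDeltaX = 0.0f;   // additive identity
    pDelta->translationDeltaY = 0.0f;
    pDelta->rotationDelta     = 0.0f;
    pDelta->scaleDelta        = 1.0f;   // multiplicative identity
}

Initializing scaleDelta to 1.0 matters because scale is accumulated by multiplication and averaged with an Nth root rather than a division.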
Add inertia and other tweaks
Inertia is handled by starting a timer when a manipulation completes. The timer handler calls the Process method on the inertia processor encapsulated in the camera class. The following code shows how the timer is started in the InertiaObj class:
HRESULT STDMETHODCALLTYPE InertiaObj::ManipulationCompleted(
    FLOAT ,
    FLOAT ,
    FLOAT ,
    FLOAT ,
    FLOAT ,
    FLOAT ,
    FLOAT )
{
    HRESULT hr = S_OK;
    if (!m_bIsInertiaActive)
    {
        // The manipulation just finished: seed the inertia processor with the
        // final velocities and start the timer that drives the animation
        hr = SetupInertia(m_inertiaProc, m_manipulationProc);
        m_bIsInertiaActive = TRUE;
        SetTimer(m_hWnd, m_iTimerId, DESIRED_MILLISECONDS, NULL);
    }
    else
    {
        // The inertia processor itself raised this event: inertia has run
        // its course, so stop the timer
        m_bIsInertiaActive = FALSE;
        KillTimer(m_hWnd, m_iTimerId);
    }
    return hr;
}
The following code shows the handler for the WM_TIMER message in the 3dManipulation implementation file:
(...)
case WM_TIMER:
    g_ctDriver->ProcessChanges();
    break;
(...)
The following code shows how the CComTouchDriver class implements the ProcessChanges method to advance inertia on the camera:
VOID CComTouchDriver::ProcessChanges()
{
    BOOL bCompleted = FALSE;
    if (m_pCamera->m_bIsInertiaActive)
    {
        // Advance the inertia animation by one tick; when it finishes, the
        // inertia processor raises ManipulationCompleted, which kills the timer
        m_pCamera->m_inertiaProc->Process(&bCompleted);
    }
}
Inertia camera
The camera derives from the InertiaObj class, which implements the _IManipulationEvents interface in order to enable inertia features. When a manipulation completes, the inertia processor's parameters are configured from the manipulation processor's final velocities. The following code shows how inertia is configured:
HRESULT InertiaObj::SetupInertia(IInertiaProcessor* ip, IManipulationProcessor* mp)
{
    HRESULT hr = S_OK;

    // Set how quickly translation and rotation decay
    HRESULT hrPutDD  = ip->put_DesiredDeceleration(0.006f);
    HRESULT hrPutDAD = ip->put_DesiredAngularDeceleration(0.00002f);

    // Carry the manipulation's final velocities into the inertia processor
    FLOAT fVX;
    FLOAT fVY;
    FLOAT fVR;
    HRESULT hrGetVX  = mp->GetVelocityX(&fVX);
    HRESULT hrGetVY  = mp->GetVelocityY(&fVY);
    HRESULT hrGetAV  = mp->GetAngularVelocity(&fVR);
    HRESULT hrPutIVX = ip->put_InitialVelocityX(fVX);
    HRESULT hrPutIVY = ip->put_InitialVelocityY(fVY);
    HRESULT hrPutIAV = ip->put_InitialAngularVelocity(fVR);

    if (FAILED(hrPutDD)  || FAILED(hrPutDAD) || FAILED(hrGetVX)
     || FAILED(hrGetVY)  || FAILED(hrGetAV)  || FAILED(hrPutIVX)
     || FAILED(hrPutIVY) || FAILED(hrPutIAV))
    {
        hr = E_FAIL;
    }
    return hr;
}