
Face and Head Tracking using the Intel® RealSense™ SDK

13 May 2015

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Intel® Developer Zone offers tools and how-to information for cross-platform app development, platform and technology information, code samples, and peer expertise to help developers innovate and succeed. Join our communities for the Internet of Things, Android*, Intel® RealSense™ Technology and Windows* to download tools, access dev kits, share ideas with like-minded developers, and participate in hackathons, contests, roadshows, and local events.

Tips, Use Cases, APIs, and Sample Code (BlockHead)

  • Required OS: Windows 8.1 Desktop

The Intel® RealSense™ SDK has a set of APIs that developers can call for implementing the following:

  • Face Location Detection
  • Landmark Detection
  • Pose Detection
  • Expression Detection
  • Face Recognition
  • Note: Emotion Detection is also available, but only at the experimental (not gold) stage

Data from these interactions can be captured and used by an app in near real time. Here are some tips for taking full advantage of the face analysis module when developing RSSDK software:

  • Good lighting is important for 2D RGB tracking
  • Avoid shadows, backlighting, or strong directional lighting, including sunlight
  • Use slow movements for best tracking
  • Range up to 1.2 meters
  • Do not tilt head outside of 30 degrees (on any axis) from the screen.
  • Remember that the camera has a limited field of view, so the longest tracking distance is at the center of the screen
  • For examples of what each modality can track, see the face_viewer examples in the RSSDK/bin folder

Face Location Detection

  • Track up to 4 faces with marked rectangles for face boundaries
  • You can choose which 4 faces to detect
  • Only one face has landmarks
  • For facial recognition, see below
  • Enable: PXC[M]FaceConfiguration
  • Retrieve face location data: QueryDetection
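As an illustrative C# sketch only (it assumes the SDK's C# wrapper, libpxcclr.cs.dll, and an initialized PXCMSenseManager named sm with sm.EnableFace() already called; member names follow the R1-era wrapper and cannot be compiled without the SDK installed), face location data can be retrieved per frame roughly like this:

```csharp
// Sketch: assumes "sm" is an initialized PXCMSenseManager with face enabled
// and a frame has been acquired via sm.AcquireFrame(true).
PXCMFaceModule faceModule = sm.QueryFace();
PXCMFaceConfiguration config = faceModule.CreateActiveConfiguration();
config.detection.isEnabled = true;      // enable face location detection
config.detection.maxTrackedFaces = 4;   // track up to four faces
config.ApplyChanges();
config.Dispose();

PXCMFaceData faceData = faceModule.CreateOutput();
faceData.Update();  // refresh face data to the current frame
for (int i = 0; i < faceData.QueryNumberOfDetectedFaces(); i++)
{
    PXCMFaceData.Face face = faceData.QueryFaceByIndex(i);
    PXCMFaceData.DetectionData detection = face.QueryDetection();
    PXCMRectI32 rect;
    if (detection != null && detection.QueryBoundingRect(out rect))
        Console.WriteLine("Face {0}: x={1} y={2} w={3} h={4}",
                          i, rect.x, rect.y, rect.w, rect.h);
}
```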

Landmark Detection

  • Works on faces with/without facial hair and glasses
  • 3D tracking of 78 facial landmark points supporting avatar creation, emotion recognition and facial animation.
  • It is best to track only the landmarks you need (even just, say, the tip of the nose)
  • Eye gaze location tracking is not specifically supported.
  • Enable: PXC[M]FaceConfiguration
  • Retrieve detected landmarks: QueryLandmarks
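A hedged C# sketch of landmark retrieval (names per the SDK's R1-era C# wrapper; it assumes landmarks were enabled on the active PXCMFaceConfiguration and that a PXCMFaceData.Face named face was obtained from the current frame's PXCMFaceData):

```csharp
// Assumes config.landmarks.isEnabled = true was set and ApplyChanges() called.
PXCMFaceData.LandmarksData landmarks = face.QueryLandmarks();
if (landmarks != null)
{
    PXCMFaceData.LandmarkPoint[] points;
    if (landmarks.QueryPoints(out points))
    {
        foreach (PXCMFaceData.LandmarkPoint p in points)
        {
            // p.image holds the 2D pixel position; p.world holds 3D coordinates
            Console.WriteLine("landmark at ({0:F0}, {1:F0})", p.image.x, p.image.y);
        }
    }
}
```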

Pose Detection (yaw, pitch, roll)

  • Detect head orientation along 3d axes – Yaw, pitch and roll
  • Works best with frontal axis +/- 15 degrees of yaw, roll and pitch close to 0 degrees
  • Works best with face within 30 degrees of the screen (yaw and pitch)
  • Can be used as a coarse estimate of where the user is looking
  • Detects where the user’s head is pointing, which can be used to control side-to-side motion of a character
  • Enable: PXC[M]FaceConfiguration
  • QueryPose retrieves the detected pose data
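Pose retrieval might look like the following C# sketch (wrapper names from the R1-era libpxcclr.cs.dll, stated here as assumptions; face is a PXCMFaceData.Face from the current frame):

```csharp
// Assumes config.pose.isEnabled = true was set on the active configuration.
PXCMFaceData.PoseData pose = face.QueryPose();
PXCMFaceData.PoseEulerAngles angles;
if (pose != null && pose.QueryPoseAngles(out angles))
{
    // Yaw could drive side-to-side character motion, or serve as a
    // coarse "where is the user looking" signal.
    Console.WriteLine("yaw={0:F1} pitch={1:F1} roll={2:F1}",
                      angles.yaw, angles.pitch, angles.roll);
}
```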

Facial Expression Detection

  • Best for sensing natural expressions, some emotions and engagement.
  • Reliably capturing expression information (e.g., EXPRESSION_MOUTH_OPEN, EXPRESSION_SMILE) can be problematic when the user is wearing eyeglasses.
  • 30 frames per second with image sizes of 48 x 48 pixels
  • Facial hair and glasses can make emotion detection harder
  • Supports 6 primary emotions – Anger, Disgust, Fear, Joy, Sadness, Surprise
  • Uses 2D RGB data
  • Query emotion data using intensity (0 to 1) or evidence (on a log scale)
  • Possible to combine emotions (e.g., disgust + fear + anger = negative)
  • Use the ExpressionsConfiguration interface
  • QueryExpressions retrieves any detected expression data
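The following C# sketch shows one way to enable and query an expression (names per the R1-era C# wrapper, assumed rather than verified; config is the active PXCMFaceConfiguration and face a PXCMFaceData.Face from the current frame):

```csharp
// Enabling expressions on the active configuration:
PXCMFaceConfiguration.ExpressionsConfiguration expConfig = config.QueryExpressions();
expConfig.Enable();
expConfig.EnableAllExpressions();
config.ApplyChanges();

// Per frame, given a PXCMFaceData.Face "face":
PXCMFaceData.ExpressionsData expressions = face.QueryExpressions();
PXCMFaceData.ExpressionsData.FaceExpressionResult smile;
if (expressions != null && expressions.QueryExpression(
        PXCMFaceData.ExpressionsData.FaceExpression.EXPRESSION_SMILE, out smile))
{
    Console.WriteLine("smile intensity: {0}", smile.intensity);
}
```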

Face Recognition

  • 30 frames per second with image sizes of 48 x 48 pixels
  • Compares current face to others from references in a recognition database
  • Use the RecognitionConfiguration interface to create multiple recognition databases
  • QueryUserID is called to perform recognition of the current face against the active recognition database.
  • Recognition also works against a photograph of a face, which poses a spoofing/security risk
  • There is currently a known issue with reusing existing databases and with the number of users; it will be fixed in the R2 release. (See the forum discussions for more information.)
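A hedged C# sketch of enrollment and recognition (wrapper names follow the R1-era libpxcclr.cs.dll and are assumptions; config is the active PXCMFaceConfiguration and face a PXCMFaceData.Face from the current frame):

```csharp
// Enabling recognition on the active configuration:
PXCMFaceConfiguration.RecognitionConfiguration recConfig = config.QueryRecognition();
recConfig.Enable();
config.ApplyChanges();

// Per frame, given a PXCMFaceData.Face "face":
PXCMFaceData.RecognitionData recognition = face.QueryRecognition();
if (recognition != null)
{
    int userId = recognition.QueryUserID();  // negative when not recognized
    if (userId < 0)
        recognition.RegisterUser();          // enroll face in the active database
    else
        Console.WriteLine("recognized user {0}", userId);
}
```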

Emotion Detection is available in the experimental phase only.

The Intel RealSense SDK provides capabilities for both managed and unmanaged code:

  • C++ - pxcfacemodule.h
  • C# .NET4 – libpxcclr.cs.dll
  • Unity – libpxcclr.unity.dll
  • Java – libpxcclr.java.jar (not all modalities in R1 release library).

Look for the following APIs in the Intel RealSense SDK face analysis module (PXCFaceModule):

  • PXC[M]FaceConfiguration
  • PXC[M]FaceData
  • PXC[M]Emotion (experimental)
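To tie the pieces together, a minimal C# acquisition loop might look like the sketch below. This is illustrative only: it assumes the SDK's C# wrapper (libpxcclr.cs.dll), a front-facing Intel RealSense 3D camera, and member names from the R1-era wrapper.

```csharp
PXCMSenseManager sm = PXCMSenseManager.CreateInstance();
sm.EnableFace();  // add the face module to the pipeline

PXCMFaceModule faceModule = sm.QueryFace();
PXCMFaceConfiguration config = faceModule.CreateActiveConfiguration();
config.detection.isEnabled = true;   // face location
config.landmarks.isEnabled = true;   // 78 landmark points
config.pose.isEnabled = true;        // yaw/pitch/roll
config.ApplyChanges();
config.Dispose();

if (sm.Init() == pxcmStatus.PXCM_STATUS_NO_ERROR)
{
    PXCMFaceData faceData = faceModule.CreateOutput();
    while (sm.AcquireFrame(true) == pxcmStatus.PXCM_STATUS_NO_ERROR)
    {
        faceData.Update();
        // ...query faces, landmarks, pose, expressions here...
        sm.ReleaseFrame();
    }
    faceData.Dispose();
}
sm.Dispose();
```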

Here are some use cases that you can accomplish with the face module APIs:

Gaming/App Enhancements:

  • Head tracking and orientation can be used to allow navigation, parallax, view changes, or to peek around corners.
  • Landmark tracking can be used to identify user’s expressions

Face Augmentation:

Use head tracking to augment the user’s face on the screen in either a cartoonish or realistic way.

Avatar Creation:

Create cartoonish or realistic-looking avatars that mimic the user’s face. Be sure to stick to more abstracted or cartoonish avatars to avoid the uncanny-valley effect and to get more robust and predictable facial mimicry.

Affective Computing:

  • You can identify and respond to a user’s mood and level of engagement, either implicitly or explicitly.
  • Note that the Emotion APIs in the Intel RealSense SDK are currently experimental.

You can download the sample code, BlockHead, from the Intel Developer Zone.

BlockHead demonstrates the use of the Intel RealSense SDK for Windows* in a C#/WPF desktop application. The sample utilizes three features of the Intel RealSense SDK: (Note: The full functionality of this sample app requires a front-facing Intel RealSense 3D Camera.)

  • Captures and displays the color stream from the RGB camera
  • Retrieves face location and head pose estimation data
  • Retrieves and evaluates facial expression data

BlockHead was written by Bryan Brown who is a software applications engineer in the Developer Relations Division at Intel. His professional experience includes a mix of software, electronic, and systems design engineering. His technical interests focus on applications of natural interaction and brain-computer interface technologies, with active participation in several alpha developer programs for various emerging technologies in these areas.

Follow Bryan on twitter: @BryanBrownHMT

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here