Introduction
Are you a fan of the Game of Thrones show? Do you root for the Lannisters, Starks, or Targaryens? In this article, we will build an application that helps you identify which character from the show you, or somebody else, resemble the most, using artificial intelligence to detect and compare faces.
Just as a quick look ahead: the application we will build here finds a 42% similarity between the face in my profile picture and Stannis Baratheon, based on the set of images it searched through.
Cognitive Services
One of the services available in Microsoft Azure is Cognitive Services. This is a collection of machine learning and artificial intelligence algorithms provided as services through REST APIs that enable you to easily add intelligent features to your applications. These services are grouped into five categories:
- Vision: provides image-processing algorithms to detect, identify, analyze, and organize faces in photos, detect emotion, automatically moderate images and videos, and more.
- Speech: APIs that enable identifying and authenticating people based on their voice, conducting real-time speech translation, or converting between speech and text to understand user intent.
- Knowledge: extract questions and answers from text, find academic resources, and more.
- Language: provides APIs that evaluate sentiment and topics, understand commands from users, detect languages, conduct machine translation in real time, etc.
- Search: delivers searching functionalities powered by Bing, such as searching for news, images, videos, web documents, entities, and more.
You can read more about cognitive services here. The service we will be using for this article is called Face API and is part of the Vision category.
Face API
This service is actually itself a collection of several functionalities:
- Detecting faces in a picture, along with attributes such as age, gender, emotion, pose, smile, facial hair, and a set of face landmarks (position of the mouth, nose, eyes).
- Identifying faces from private repositories of up to a million faces.
- Finding similar faces between a given face and a collection of prerecorded faces.
- Grouping faces based on their visual similarities.
In this article, we will use the APIs for detecting and identifying faces. You can actually try some of these APIs online.
Here is how it detects my face in my current profile picture, along with some attributes. Although it gets the age off by a few years (three, to be precise), it does a pretty good job at analyzing the face and the expression.
Getting Started with Face API
In order to use any of the Azure cognitive services, you need to have an account. There are several pricing tiers for the Face API: a free tier allows 30,000 transactions per month, but only 20 per minute; paid tiers allow various monthly transaction quotas at different prices, with up to 10 transactions per second. These are, of course, subject to change, and you should check the pricing details if you want to use the services in a real-world application. However, a 30-day free trial is also available, so you can try the services and decide whether they meet your needs.
One thing to note is that a transaction basically represents an API call. For instance, in order to find a similar face, we must first detect the face(s) in a picture and then find similarities. This is done in two distinct API calls, which represent two transactions. If you use the free tier, you are limited to only 20 API calls per minute.
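If you populate a face list in a batch on the free tier, this 20-calls-per-minute limit matters. A minimal way to stay under it is to pace the calls, as in this sketch (it uses the AddFaceToFaceList helper implemented later in this article; imagesFolder and faceListId are placeholders):
// on the free tier, 20 calls per minute means at least 3 seconds between calls
foreach (var file in Directory.GetFiles(imagesFolder, "*.jpg"))
{
    var image = File.ReadAllBytes(file);
    await FaceApiUtils.AddFaceToFaceList(faceListId, image);
    await Task.Delay(TimeSpan.FromSeconds(3));
}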
The only thing you need to get started is a set of API keys. I will assume you have an Azure account. What you have to do is:
- Log on to the Azure portal to create a new resource.
- Search for Face API and select to create one.
- Fill in the form with the name, subscription, location, pricing tier, and resource group, and hit Create. For the pricing tier, you can select F0, which is free; in the image below, it is S0 because I already have a free resource and you cannot create multiple free ones.
- Go to the created resource and under Resource Management, you will find the Keys. You need to copy one of them to use it in the application.
- You also need the endpoint for the API. This is available from the overview panel.
Understanding the API
The Face API reference documentation is available here. There you can find details about each API call, such as functionality, arguments, results, possible errors, etc.
In order to find similarities between a face and characters from the Game of Thrones show, we must first build a list of faces to search through later. The REST API provides resources to create, update, and delete such a list, as well as to add and remove faces from it. At a minimum, the application should:
- Create a list: You need to make a PUT HTTP request to [endpoint]/facelists/{faceListId}, where endpoint is the one you copied from the overview panel and faceListId is a required parameter representing the identifier of the list (valid characters are lowercase letters, digits, '-', and '_', and the maximum length is 64). This will create an empty list to which you can later add faces.
- Add faces to a list: You need to make a POST HTTP request to [endpoint]/facelists/{faceListId}/persistedFaces[?userData][&targetFace]. faceListId is the identifier of the list you used when creating it, userData is an optional string of at most 1KB that can be used for any application-defined purpose, and targetFace is an optional parameter indicating which face in the image should be added. It is necessary when there is more than one face in the image; in that case, if this parameter is missing, the call fails. If this parameter is used, it should be a value returned by the Face Detect API. The actual image can be passed in two ways: as a URL in a JSON object, using application/json as the content-type, or as binary data, using application/octet-stream as the content-type. (An example of the raw exchange for creating a list is shown after this list.)
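For illustration, here is roughly what the raw exchange for creating a list looks like; the list ID, name, and user data are placeholders, and the JSON body matches the FaceListCreateRequest helper class used later in this article:
PUT [endpoint]/facelists/got-characters HTTP/1.1
Ocp-Apim-Subscription-Key: ...(insert your key)...
Content-Type: application/json

{
  "name": "Game of Thrones characters",
  "userData": "Faces used by the demo application"
}
A successful call returns an empty response body; a failed one returns a JSON object with an error code and message, matching the FaceApiErrorResponse helper class shown later.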
In the application provided with this article, all of these functionalities (retrieving existing lists, creating new ones, deleting existing ones, adding and removing faces from a list) are implemented. You can find them in the FaceApi namespace.
After building the list of faces to search within, we can actually look for similarities. To do so, we must call, in this order, the following APIs:
- Detect: a POST HTTP request to [endpoint]/detect[?returnFaceId][&returnFaceLandmarks][&returnFaceAttributes]. The image can be passed either as a URL or as a binary stream, just as with the API for adding a face to a list. When the call is successful, the result contains an array of faces and, for each face, an ID, a rectangle, landmarks, and attributes. Only the ID is necessary for the next call. This identifier is stored on the server for 24 hours before it expires.
- Find Similar: a POST HTTP request to [endpoint]/findsimilars. The content is a JSON object that contains the face ID returned by Detect, and either a face list ID or an array of IDs of candidate faces, also returned by Detect. Optionally, you can specify how many matches you want in return and how the search should be performed. When successful, the response contains the persisted face ID that was found similar and a confidence score between 0 (no similarity) and 1 (identical). (An example request and response are shown after this list.)
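For illustration, a Find Similar request and its response look roughly like this; the IDs and the list name are placeholders, and the fields match the FaceFindSimilarRequest and FaceFindSimilarResponse helper classes used later in this article:
POST [endpoint]/findsimilars HTTP/1.1
Ocp-Apim-Subscription-Key: ...(insert your key)...
Content-Type: application/json

{
  "faceId": "...(temporary ID returned by Detect)...",
  "faceListId": "got-characters",
  "maxNumOfCandidatesReturned": 1,
  "mode": "matchFace"
}
A successful response is an array of matches, for example:
[ { "persistedFaceId": "...(ID assigned when the face was added to the list)...", "confidence": 0.42 } ]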
Note that there are actually two types of face lists that can be created:
- Standard lists (described here and used in the demo application). These lists can contain up to 1,000 faces, and you can have up to 64 such lists per subscription.
- Large lists, which can contain up to 1,000,000 faces each; you can also have up to 64 large face lists per subscription. These lists, however, require training before they can be used with algorithms such as Find Similar. Training is an asynchronous operation and may take some time; after starting it, you can query its status to make sure it has completed before using the list (see the sketch after this list).
The algorithms that require a face list work the same regardless of whether you specify a standard face list or a large face list.
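As a hedged sketch (the demo application uses only standard lists, so this is not taken from its source code), starting the training of a large face list and polling its status could look like this, following the same conventions as the helpers shown later in the article; the TrainingStatus helper class is introduced here just for the example:
class TrainingStatus
{
    public string Status { get; set; }
}

public static async Task TrainLargeFaceList(string largeFaceListId)
{
    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", AppSettings.Key1);
        // start the asynchronous training operation
        var response = await client.PostAsync(
            $"{AppSettings.Endpoint}/largefacelists/{largeFaceListId}/train", null);
        response.EnsureSuccessStatusCode();
        // poll the training status until the operation completes
        while (true)
        {
            var body = await client.GetStringAsync(
                $"{AppSettings.Endpoint}/largefacelists/{largeFaceListId}/training");
            var status = JsonConvert.DeserializeObject<TrainingStatus>(body).Status;
            if (status == "succeeded")
                break;
            if (status == "failed")
                throw new Exception("Training of the large face list failed.");
            await Task.Delay(TimeSpan.FromSeconds(1));
        }
    }
}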
There is one more important aspect to keep in mind: creating a list of faces and adding faces to it does not actually preserve the images on the server, only information about the faces. If you later need to display the images in your application, you need to keep them in a place where you can retrieve them from. This is the case with the demo application, which displays the image of the Game of Thrones character that matched a particular face the best.
Consuming the APIs from C#
Calling the APIs mentioned earlier from C# is quite simple. Here are the implementations (you can find these in the FaceApiUtils class in the source code):
- Creating a face list, with a user-defined face list ID, a name, and a description. If the call fails, this function throws an exception.
public static async Task<bool> CreateFaceList(string faceListId, string name, string description)
{
using (var client = new HttpClient())
{
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", AppSettings.Key1);
var uri = $"{AppSettings.Endpoint}/facelists/{faceListId}";
var request = new FaceListCreateRequest()
{
Name = name,
UserData = description
};
var json = JsonConvert.SerializeObject(request);
var content = new StringContent(json, Encoding.UTF8, "application/json");
var response = await client.PutAsync(uri, content);
if(!response.IsSuccessStatusCode)
{
var body = await response.Content.ReadAsStringAsync();
var error = JsonConvert.DeserializeObject<FaceApiErrorResponse>(body);
throw new FaceApiException(error.Error.Code, error.Error.Message);
}
return true;
}
}
- Adding a face to a face list. The face is identified from an image uploaded as a binary stream. If the call is successful, the function returns the persisted ID of the face, otherwise it throws an exception.
public static async Task<string> AddFaceToFaceList(string faceListId, byte[] image)
{
using (var client = new HttpClient())
{
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", AppSettings.Key1);
var uri = $"{AppSettings.Endpoint}/facelists/{faceListId}/persistedFaces";
var request = new ByteArrayContent(image);
request.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
var response = await client.PostAsync(uri, request);
if (!response.IsSuccessStatusCode)
{
var body = await response.Content.ReadAsStringAsync();
var error = JsonConvert.DeserializeObject<FaceApiErrorResponse>(body);
throw new FaceApiException(error.Error.Code, error.Error.Message);
}
else
{
var body = await response.Content.ReadAsStringAsync();
var result = JsonConvert.DeserializeObject<FaceAddResponse>(body);
return result.PersistedFaceId;
}
}
}
Here are the helper types used in these functions:
class FaceListCreateRequest
{
public string Name { get; set; }
public string UserData { get; set; }
}
class FaceApiError
{
public string Code { get; set; }
public string Message { get; set; }
}
class FaceApiErrorResponse
{
public FaceApiError Error { get; set; }
}
class FaceAddResponse
{
public string PersistedFaceId { get; set; }
}
class FaceApiException : Exception
{
public string Code { get; private set; }
public FaceApiException(string code, string message) : base(message)
{
Code = code;
}
}
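Before moving on, note how these helpers are meant to be used together with local storage. Because the server retains only face data and not the picture itself (as discussed earlier), the image should be saved to disk right after it is added to a list. Here is a minimal sketch of that pattern; imagePath is a placeholder, AppSettings.FaceListBaseFolder is assumed to expose the FaceListBaseFolder setting from the config file shown later, and the code in the actual application may differ:
var image = File.ReadAllBytes(imagePath);
var persistedFaceId = await FaceApiUtils.AddFaceToFaceList(faceListId, image);
// keep the original picture locally, named after the persisted face ID,
// in a sub-folder named after the face list
var folder = Path.Combine(AppSettings.FaceListBaseFolder, faceListId);
Directory.CreateDirectory(folder);
File.WriteAllBytes(Path.Combine(folder, persistedFaceId + Path.GetExtension(imagePath)), image);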
The other two Face API calls to implement are Detect and Find Similar. They are shown below:
- Detect face takes an image as binary content and sends it to the server. When successful, it returns a list of detected faces. There are various pieces of information for each face, but the only one that is necessary here is the ID.
public static async Task<List<FaceDetectResponse>> DetectFace(byte[] image)
{
using (var client = new HttpClient())
{
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", AppSettings.Key1);
var uri = $"{AppSettings.Endpoint}/detect";
var content = new ByteArrayContent(image);
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
var response = await client.PostAsync(uri, content);
if (response.IsSuccessStatusCode)
{
var responseBody = await response.Content.ReadAsStringAsync();
var result =
JsonConvert.DeserializeObject<List<FaceDetectResponse>>(responseBody);
return result;
}
else
{
var errorText = await response.Content.ReadAsStringAsync();
var errorResponse =
JsonConvert.DeserializeObject<FaceApiErrorResponse>(errorText);
throw new FaceApiException
(errorResponse.Error.Code, errorResponse.Error.Message);
}
}
}
- Find similar uses the temporary face ID returned by Detect, a face list ID and a number of matches. When successful, it returns a list of similarities, each containing the permanent face ID and its confidence score.
public static async Task<List<FaceFindSimilarResponse>>
    FindSimilar(string faceId, string faceListId, int count)
{
using (var client = new HttpClient())
{
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", AppSettings.Key1);
var uri = $"{AppSettings.Endpoint}/findsimilars";
var body = new FaceFindSimilarRequest()
{
FaceId = faceId,
FaceListId = faceListId,
MaxNumOfCandidatesReturned = count,
Mode = FaceFindSimilarRequestMode.MatchFace
};
var bodyText = JsonConvert.SerializeObject(body);
var httpContent = new StringContent(bodyText, Encoding.UTF8, "application/json");
var response = await client.PostAsync(uri, httpContent);
if (response.IsSuccessStatusCode)
{
var responseBody = await response.Content.ReadAsStringAsync();
var result =
            JsonConvert.DeserializeObject<List<FaceFindSimilarResponse>>(responseBody);
return result;
}
else
{
var errorText = await response.Content.ReadAsStringAsync();
var errorResponse =
JsonConvert.DeserializeObject<FaceApiErrorResponse>(errorText);
throw new FaceApiException(errorResponse.Error.Code, errorResponse.Error.Message);
}
}
}
The additional helper classes used by the above functions are defined as follows:
class FaceDetectResponse
{
public string FaceId { get; set; }
public Rectangle FaceRectangle { get; set; }
public FaceLandmarks FaceLandmarks { get; set; }
public FaceAttributes FaceAttributes { get; set; }
}
class Rectangle
{
public int Width { get; set; }
public int Height { get; set; }
public int Left { get; set; }
public int Top { get; set; }
}
class Point
{
public double X { get; set; }
public double Y { get; set; }
}
class FaceFindSimilarRequest
{
public string FaceId { get; set; }
public string FaceListId { get; set; }
public List<string> FaceIds { get; set; }
public int MaxNumOfCandidatesReturned { get; set; }
public string Mode { get; set; }
}
class FaceFindSimilarRequestMode
{
public const string MatchPerson = "matchPerson";
public const string MatchFace = "matchFace";
}
class FaceFindSimilarResponse
{
public string PersistedFaceId { get; set; }
public string FaceId { get; set; }
public double Confidence { get; set; }
}
Notice: In all these code samples, AppSettings.Key1 and AppSettings.Endpoint are variables whose values are read from the application config file and represent the key and endpoint of your Face API Azure resource.
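Putting the pieces together, finding the best match for a photo comes down to two awaited calls, one transaction each. Here is a usage sketch; the image path and the face list ID (got-characters) are placeholders:
var image = File.ReadAllBytes(@"photos\me.jpg");
// first transaction: detect the face and obtain its temporary ID
var faces = await FaceApiUtils.DetectFace(image);
// second transaction: search the face list for the single best match
var matches = await FaceApiUtils.FindSimilar(faces[0].FaceId, "got-characters", 1);
Console.WriteLine($"Best match: {matches[0].PersistedFaceId}, confidence: {matches[0].Confidence:P0}");
A real application should, of course, check that the list of detected faces is not empty before indexing into it.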
Building an Application
Although it would be more fun to have a mobile app detecting the most similar characters from the Game of Thrones show, we will build a simple WPF application that you can run on your PC. If you run the application provided with the article, you'll have to use your own application keys, build your own list, and add your own images to the list, as these are not shareable between subscriptions.
The WPF application has three main windows:
- The start-up window that enables you to select an action: either manage the face lists or find a similar face.
- The face lists management window. Here, you can view existing face lists (the ones used in this app are standard lists), add new ones, or delete existing ones. For each face list, you can view the images used to add faces to the list, add more faces (from a file or an entire folder), and delete existing faces. Because the server does not retain the images themselves, these are stored in a subfolder of the working directory. The name of the working directory is specified in the application config file. The name of each image in this folder is the persisted face ID returned by the server.
- The window for finding similar faces. This allows you to select an image from disk and a target face list. When a face is successfully detected and matched, it displays the most similar match along with the confidence score.
The app.config file contains several application settings: the endpoint for the Face API resource, the access keys, and the name of the folder where the images whose face has been added to a face list are preserved. Notice that for each list, a sub-folder with the list ID as the name will be created. Images are stored in the sub-folder associated with the list.
<appSettings>
<add key="Endpoint" value="https://westeurope.api.cognitive.microsoft.com/face/v1.0" />
<add key="Key1" value="...(insert your key)..." />
<add key="Key2" value="...(insert your key)..." />
<add key="FaceListBaseFolder" value="facelists" />
</appSettings>
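For completeness, a minimal version of the AppSettings class that exposes these values could look like the following sketch (the actual class in the source code may differ); it reads the values with ConfigurationManager from System.Configuration:
using System.Configuration;

static class AppSettings
{
    // values read from the appSettings section of app.config
    public static string Endpoint => ConfigurationManager.AppSettings["Endpoint"];
    public static string Key1 => ConfigurationManager.AppSettings["Key1"];
    public static string Key2 => ConfigurationManager.AppSettings["Key2"];
    public static string FaceListBaseFolder => ConfigurationManager.AppSettings["FaceListBaseFolder"];
}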
To get the application running with reasonable results, you should build a list of several hundred images. The list that I've used had over 400, but you can use up to 1,000. If you want to build a larger list, you must use large face lists instead of standard lists. In this case, there is a different set of face list APIs, although they are mostly similar to the standard ones, with the exception that they require training before they can be used.
As you can see from the last image, when comparing the face in my profile picture against the list of Game of Thrones character images that I built, the application detects a 42% similarity with Stannis Baratheon. I tried having a bit of fun looking for similarities with GoT characters for different celebrities. Here is how the current ATP and WTA leaders, Roger Federer and Simona Halep, score against my list of GoT faces.
History
- 12th March, 2018: Initial version