Introduction
Microsoft and others are extremely good at taking big technologies and building services around them that bring what may seem difficult within reach of the general development community. If you have the time to learn the core fundamentals of a particular branch of computing, that's great, but where other things take priority and you still need to use *that thing*, you need to look to services. One of the big things that I see changing everything at the moment is Machine Learning. Not all of us can go deep into the subject, so it's great that there are some amazing Machine Learning and Artificial Intelligence platforms, libraries and APIs you can now hook into. Recently, Microsoft rolled out further advances in the area, provided as a service under the banner of 'Cognitive Services'. They are really quite impressive and worth a look.
The first question I usually ask of a new technology offering is 'what can it do for me?' From there I decide whether it's worth looking into further. I often find that the examples given by vendors are not immediately applicable to me, but they sit at the back of my head and become useful at a later stage. Let's take a look at some of what's there and see if it's worth further investigation. Hopefully it might inspire you to dig a bit deeper later on.
The offering is provided by Microsoft as a series of REST-based API services, available from the Cognitive Services website (see the useful links at the end of this article). By exposing the power of the underlying services through a web-based API, Microsoft lets us integrate very powerful features into modern web and mobile applications. The APIs cover a broad range that includes Vision, Speech, Language, Knowledge and Search.
The vision service currently covers Computer vision, Content moderation, Emotion, Face and Video. I'm going to run through some quick examples of how these are useful.
Using the Face Detection API, you can upload an image (or point to an online URL), and the API will return information about any faces located in the image. In this example you can see my handsome visage and what are called 'face landmarks' that the API identified. These can be used in conjunction with other services as you will see later.
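To make the idea of 'upload an image or point to a URL' a little more concrete, here is a minimal Python sketch of what such a call might look like. Treat the details as assumptions rather than gospel: the region in the endpoint, the returnFaceLandmarks parameter, the Ocp-Apim-Subscription-Key header and the faceLandmarks field are written from memory of the 2017 documentation, and the helper names are my own. The sketch only builds the request and reads a response you already have, so nothing is sent over the network.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the region and path may differ for your subscription,
# so check the official API reference before relying on it.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/face/v1.0/detect"


def build_detect_request(image_url, subscription_key, landmarks=True):
    """Build (but do not send) a POST request asking for faces in an online image."""
    query = urllib.parse.urlencode({"returnFaceLandmarks": str(landmarks).lower()})
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumed header name for the API key.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(
        f"{ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


def summarize_faces(response_text):
    """Return (number of faces, sorted landmark names of the first face)."""
    faces = json.loads(response_text)
    names = sorted(faces[0].get("faceLandmarks", {})) if faces else []
    return len(faces), names
```

To actually send the request you would pass the result to urllib.request.urlopen and hand the decoded response body to summarize_faces.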
One use of the API is verifying whether facial images belong to the same person. By examining the probability score returned, you can gauge the confidence of the match.
The Face API and its moving-picture counterpart, the Video API, can also be used to extract deeper understanding from images. The following examples show the service identifying the gender of a person in the image, together with an age estimate.
Emotions can also be detected and reported on.
Let's look at another very powerful example. This one analyzes an uploaded image of a swimmer and is capable of giving us information about what's happening in the image - very impressive!
Before we move on, let's look at analyzing text that's embedded within images. These two examples demonstrate text being extracted that's both typeset and handwritten. As you can see, all data is of course returned in JSON format.
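How you act on that score is up to you. As a rough sketch, assuming the verification response is a JSON object with a numeric 'confidence' field (the field name and the 0.5 default threshold here are my assumptions, not something the service dictates), you might wrap the decision like this:

```python
import json


def is_confident_match(response_text, threshold=0.5):
    """Return (match?, confidence) for a verification response.

    Assumes the response is a JSON object with a numeric 'confidence'
    field; the threshold is a hypothetical cut-off you should tune
    for your own application.
    """
    result = json.loads(response_text)
    confidence = float(result["confidence"])
    return confidence >= threshold, confidence
```

A stricter application, such as one that unlocks an account, would pass a higher threshold than one that simply groups holiday photos.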
The text analysis service is equally powerful, offering a lot of opportunity to add value to the systems we develop. Among other things, text analysis can extract key phrases, detect the topic of the text, and identify the language the text is written in. As of this writing (April 2017), sentiment analysis is only provided in English, French, Spanish and Portuguese, but more languages are to follow.
Another text-based fundamental provided as part of the service is translation. Translation is supported for 9 languages for speech, and for a massive 60 languages for text. One of the things you discover as you move between industries is that each sector has its own domain-specific language in which it communicates. To help with this, there is a custom translation system that you can feed with your own custom dictionaries.
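Here is a small Python sketch of what preparing a sentiment request and reading the result might look like. The 'documents' payload shape, the 'score' field and the two-letter language codes are assumptions based on my reading of the documentation at the time, and the function names are my own. The language check simply mirrors the four languages mentioned above.

```python
import json

# The four sentiment languages named in the article (April 2017),
# written as assumed two-letter codes.
SENTIMENT_LANGUAGES = {"en", "fr", "es", "pt"}


def build_sentiment_payload(texts, language="en"):
    """Return a JSON request body with one numbered document per text."""
    if language not in SENTIMENT_LANGUAGES:
        raise ValueError(f"sentiment analysis not available for {language!r}")
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    return json.dumps({"documents": documents})


def scores_by_id(response_text):
    """Map each document id to its sentiment score (assumed 'score' field)."""
    return {doc["id"]: doc["score"] for doc in json.loads(response_text)["documents"]}
```

Numbering the documents yourself makes it easy to line the returned scores back up with the original texts.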
One specialist, domain-specific area catered for by Cognitive Services is the Academic Knowledge API. This service offers a number of very interesting ways to interact with academic research papers. Available functionality includes paper similarity matching, graph search that lets you follow citations (with queries that can be expressed as lambda expressions), and natural language interpretation options.
Despite a plethora of content being published on a website, users can sometimes get lost and simply cannot find the information they require. In this case it's useful to create a 'frequently asked questions' section. However, like the content itself, this can be bothersome to both create and maintain. The 'QnA' service allows you to create an FAQ service from existing content, using natural language analysis to derive the questions from that content.
I think that most web developers have at some stage been asked to develop, or at least interact with, some kind of eCommerce website. The three APIs in the 'Recommendations API' should prove very useful for this sector. As the name suggests, this API uses cognitive intelligence to recommend products to customers. Options include 'Frequently bought together', 'Item to item', and 'Personalized user recommendations'. These APIs allow you to offer the kind of services normally only found on bigger sites, such as 'customers who liked this product also...', 'because you watched this movie you might also like...' etc.
There's a lot of seriously interesting stuff in Cognitive Services that is worth checking out. Even if you don't have a use for the service now, it may trigger a thought in the future, so it is worth spending some time familiarizing yourself with the options. As an aside, you can sign up for free, and there is a very reasonable monthly allowance you can use for testing without having to pay anything.
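To give a feel for what 'Frequently bought together' means, here is a deliberately naive Python sketch that counts which items share an order with a given item. This is plain co-occurrence counting that I am using purely for illustration. It is not how the Recommendations API works internally, and the function name and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequently_bought_together(orders, item, top_n=3):
    """Rank the items that most often share an order with `item`."""
    pair_counts = Counter()
    for order in orders:
        # Count every unordered pair once per order, ignoring duplicates.
        for a, b in combinations(sorted(set(order)), 2):
            pair_counts[(a, b)] += 1
    related = Counter()
    for (a, b), count in pair_counts.items():
        if a == item:
            related[b] += count
        elif b == item:
            related[a] += count
    # Highest count first, with ties broken alphabetically.
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


orders = [
    ["camera", "sd card"],
    ["camera", "sd card", "tripod"],
    ["camera", "tripod"],
    ["laptop", "mouse"],
    ["camera", "sd card"],
]
print(frequently_bought_together(orders, "camera"))
```

A hosted service earns its keep by going well beyond this, but the sketch shows the kind of question being answered.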
Useful links:
Main Cognitive Services website
Microsoft Cognitive Services APIs
Free API key subscription
Intro videos:
Getting started with MS Cognitive Services
Cognitive Services overview
Cognitive Services with Xamarin Forms
History
21/Apr/17 - Version 1 published