Writing SAPI 5.1 Apps in C#

Maksim K

10 Feb 2006

A simple C# program that showcases the basic SR and TTS features of SAPI 5.1

Download source files - 4 KB

Introduction

C# and SAPI 5.1 make Speech Recognition (SR) and Text to Speech (TTS) development fun, easy and very rewarding. Within hours, you'll be able to produce exciting demo apps to impress your friends and colleagues. It could be a command and control add-in for your existing app, or something that makes your computer speak with a HAL-like voice.

Getting Started with SAPI

SAPI (the Microsoft Speech API) can be downloaded from this URL. The attached examples were written with SAPI 5.1 and Visual Studio 8 (Visual Studio 2005).

I will not cover how to install SAPI, as that is covered very well in the provided documentation, but I will highlight a couple of setup steps that took me more than 15 minutes to figure out. To make SAPI accessible in your project, add a reference to the COM component called "Microsoft Speech Object Library" from the Class View window, and then add "using SpeechLib;" to each code file that uses it.
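A quick way to confirm that the reference and the using directive are in place is a tiny console program like the sketch below. It is not part of the attached code; the class name SapiSmokeTest is made up for illustration, and it assumes SAPI 5.1 and at least one TTS voice are installed.

```csharp
using System;
using SpeechLib; // resolves only after the COM reference has been added

public static class SapiSmokeTest
{
    public static void Main()
    {
        // Creating any SAPI object proves the interop assembly is wired up.
        SpVoice voice = new SpVoice();

        // Empty attribute strings mean "no filter", so every installed voice is returned.
        ISpeechObjectTokens voices = voice.GetVoices("", "");
        Console.WriteLine("Installed TTS voices: " + voices.Count);
    }
}
```

If the project fails to compile with "The type or namespace name 'SpeechLib' could not be found", the COM reference is most likely still missing.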

Overview of the Functionality Covered in the Attached Code

The SpSharedRecoContextClass class is your interface to the speech recognition engine. Create an instance of this class and then subscribe to the events it raises. In the example, I only handle the Recognition event, which fires when the engine has settled on its best guess at the phrase/utterance it heard. You shouldn't need to do anything special to hook this class up to your default microphone; however, if you want to process speech from a source other than the microphone, you should use a sister class called SpInProcRecoContextClass.
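The following is a minimal sketch of that wiring, not a copy of the attached Form1.cs. The form name RecoForm and the text box are invented for illustration, and dictation is switched on only so that the event has something to report. It is written as a Windows Forms app because the shared recognizer delivers its events through the message loop.

```csharp
using System;
using System.Windows.Forms;
using SpeechLib;

public class RecoForm : Form
{
    // Keep these as fields so they are not garbage collected while listening.
    private SpSharedRecoContext recoContext;
    private ISpeechRecoGrammar grammar;
    private TextBox output = new TextBox();

    public RecoForm()
    {
        output.Multiline = true;
        output.Dock = DockStyle.Fill;
        Controls.Add(output);

        // The shared context listens on the default audio input.
        // (new SpSharedRecoContextClass() is the equivalent spelling used in the article.)
        recoContext = new SpSharedRecoContext();
        recoContext.Recognition +=
            new _ISpeechRecoContextEvents_RecognitionEventHandler(OnRecognition);

        // Nothing is recognized until a grammar is loaded and activated.
        grammar = recoContext.CreateGrammar(0);
        grammar.DictationLoad("", SpeechLoadOption.SLOStatic);
        grammar.DictationSetState(SpeechRuleState.SGDSActive);
    }

    // Called each time the engine commits to a best guess for an utterance.
    private void OnRecognition(int streamNumber, object streamPosition,
        SpeechRecognitionType recognitionType, ISpeechRecoResult result)
    {
        // Arguments 0, -1, true request the whole phrase with text replacements applied.
        output.AppendText(result.PhraseInfo.GetText(0, -1, true) + Environment.NewLine);
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new RecoForm());
    }
}
```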

The ISpRecoGrammar interface provides some basic tools to control the type of recognition you want. The two basic types are Dictation and Context Free Grammar (CFG) driven recognition. Dictation will give you decent enough quality for a cool demo, but in my opinion it falls short of the minimum quality needed for commercial apps. With CFGs, on the other hand, you can build command and control type apps of fairly good quality.
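The two modes can be sketched as a pair of helper methods. The helper names, the grammar ids 1 and 2, and the grammarPath parameter are assumptions made for illustration. The code uses ISpeechRecoGrammar, the automation interface that CreateGrammar returns in the SpeechLib interop assembly.

```csharp
using SpeechLib;

public static class GrammarSetup
{
    // Free-form dictation: the engine tries to transcribe anything it hears.
    public static ISpeechRecoGrammar UseDictation(SpSharedRecoContext context)
    {
        ISpeechRecoGrammar grammar = context.CreateGrammar(1);
        grammar.DictationLoad("", SpeechLoadOption.SLOStatic); // "" selects the default topic
        grammar.DictationSetState(SpeechRuleState.SGDSActive);
        return grammar;
    }

    // Command and control: only phrases described by the CFG file are recognized.
    public static ISpeechRecoGrammar UseCommandGrammar(SpSharedRecoContext context,
        string grammarPath)
    {
        ISpeechRecoGrammar grammar = context.CreateGrammar(2);
        grammar.CmdLoadFromFile(grammarPath, SpeechLoadOption.SLOStatic);
        // Rule id 0 addresses the grammar's top-level rules rather than one specific rule.
        grammar.CmdSetRuleIdState(0, SpeechRuleState.SGDSActive);
        return grammar;
    }
}
```

A call such as GrammarSetup.UseCommandGrammar(recoContext, "DirectoryService.xml") would restrict recognition to the phrases in that grammar, which is usually where the quality gain over dictation comes from.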

The SpVoice class gives you a very straightforward interface to TTS. I've not played around with this class much yet, as the SpVoice.Speak() method has given me all the functionality I've needed so far.
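As a rough sketch of what that single method covers, the console program below speaks one sentence synchronously and one asynchronously. The class name and the spoken text are made up for illustration.

```csharp
using SpeechLib;

public static class SpeakDemo
{
    public static void Main()
    {
        SpVoice voice = new SpVoice();

        // Default flags: the call blocks until the sentence has been spoken.
        voice.Speak("Hello from SAPI.", SpeechVoiceSpeakFlags.SVSFDefault);

        // Async flag: the call returns immediately and speech continues in the background.
        voice.Speak("This sentence is spoken asynchronously.",
            SpeechVoiceSpeakFlags.SVSFlagsAsync);

        // Wait for the queued speech to finish before the program exits (-1 = no timeout).
        voice.WaitUntilDone(-1);
    }
}
```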

An ISpeechRecoResult object is passed to your Recognition event handler. It gives you access to the text that was understood by the SR engine, and it can be your portal to a lot of other cool under-the-hood information about the probabilities and alternatives that the engine considered while evaluating the utterance.
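A helper along the following lines could pull that information out of a result. It is a sketch rather than the attached code: the class and method names are invented, the limit of five alternates is arbitrary, and the null checks are defensive because alternates may not be available for every result.

```csharp
using System.Text;
using SpeechLib;

public static class ResultInspector
{
    // Builds a readable summary of one recognition result.
    public static string Describe(ISpeechRecoResult result)
    {
        StringBuilder summary = new StringBuilder();
        ISpeechPhraseInfo phrase = result.PhraseInfo;

        // The recognized text for the whole phrase.
        summary.AppendLine("Text: " + phrase.GetText(0, -1, true));

        // The engine's confidence bucket (low, normal or high) for the matched rule.
        if (phrase.Rule != null)
        {
            summary.AppendLine("Confidence: " + phrase.Rule.Confidence);
        }

        // Up to five other interpretations the engine considered for the whole phrase.
        ISpeechPhraseAlternates alternates = result.Alternates(5, 0, -1);
        if (alternates != null)
        {
            for (int i = 0; i < alternates.Count; i++)
            {
                string text = alternates.Item(i).PhraseInfo.GetText(0, -1, true);
                summary.AppendLine("Alternate " + (i + 1) + ": " + text);
            }
        }
        return summary.ToString();
    }
}
```

Calling ResultInspector.Describe(result) from inside the Recognition handler and writing the string to a text box is enough to see how sure the engine was about each utterance.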

Overview of the Included Code

Form1.cs -- Bulk of the code is here
Form1.Designer.cs -- Class declaration and UI code
DirectoryService.xml -- A simple Context Free Grammar
DirectoryService.cfg -- Compiled version of the CFG; you should be able to produce it yourself by running gc.exe, which is included in the SAPI SDK

Summary

Overall, I found the time I spent playing around with the SAPI SDK to be quite rewarding. Perhaps the biggest frustration was the included documentation: it was quite cryptic and circular at times, and it is mainly geared towards VC++ developers. Being a novice C# developer, I found it quite challenging at times to figure out the datatypes I needed to pass to various methods. I also ran into a number of instances where methods and classes were either unique to the C# API or not supported.

Feel free to drop me a note with your SAPI experience.

Maksim Kozyarchuk (maksim_kozyarchuk@yahoo.com)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
