This article provides a step-by-step guide on implementing a simple text-to-speech (TTS) application using C# and the System.Speech.Synthesis namespace. It explains the code used, including creating a SpeechSynthesizer instance and synthesizing speech from text, and offers suggestions for exploring more advanced TTS features.
Introduction
This tip explains how to implement a text-to-speech (TTS) application in C# using the System.Speech.Synthesis namespace. TTS technology has many practical use cases, such as in accessibility tools and speech-enabled applications. By following the step-by-step guide provided, readers will be able to create a simple console application that synthesizes speech from text. The article also provides an explanation of the code used and offers suggestions for exploring more advanced features of the SpeechSynthesizer class.
Implementing Text-to-Speech in C# using System.Speech.Synthesis
Text-to-speech (TTS) technology has been around for a while and has found many use cases, such as in language learning, accessibility tools for visually impaired individuals, and in speech-enabled applications. In this article, we will explore how to implement a simple TTS application using C# and the System.Speech.Synthesis namespace.
Prerequisites
Before we begin, you need to have the following installed on your machine (System.Speech is available on Windows only):
- .NET Framework 4.6.1 or higher
- Visual Studio 2017 or higher
Implementation
We will be using the System.Speech.Synthesis namespace, which provides classes for synthesizing speech from text. Follow the steps below to create a console application in C# and implement TTS.
- Open Visual Studio and create a new Console Application project.
- Add a reference to the System.Speech assembly. Right-click on the project in Solution Explorer, select Add Reference, and then choose System.Speech from the list of assemblies.
- In the Program.cs file, add the following code:
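using System;
using System.IO;
using System.Speech.Synthesis;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // SpeechSynthesizer implements IDisposable, so release it with a using block.
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            // Basic TTS: speak through the default audio device.
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("Hello, world!");

            // Changing the voice: ask for a female adult voice.
            // The result depends on which voices are installed.
            synth.SelectVoiceByHints(VoiceGender.Female, VoiceAge.Adult);
            synth.Speak("Hello, I am a female voice!");

            // Changing the rate and volume: Rate ranges from -10 to 10, Volume from 0 to 100.
            synth.Rate = -2;
            synth.Volume = 100;
            synth.Speak("Hello, I am speaking slower and louder!");

            // Pausing and resuming: Speak blocks until the utterance has finished,
            // so the pause falls between the two utterances, not inside one.
            synth.Speak("Hello, I will pause for 3 seconds now.");
            synth.Pause();
            Thread.Sleep(3000);
            synth.Resume();
            synth.Speak("I am back!");

            // Saving speech to a WAV file.
            synth.SetOutputToWaveFile("output.wav");
            synth.Speak("Hello, I am saving my speech to a WAV file!");

            // Streaming speech to memory.
            MemoryStream stream = new MemoryStream();
            synth.SetOutputToWaveStream(stream);
            synth.Speak("Hello, I am being streamed to a memory stream!");

            // Switch back to the speakers; otherwise the prompt below
            // would also be written to the memory stream and never heard.
            synth.SetOutputToDefaultAudioDevice();

            // ToArray copies only the bytes written; GetBuffer would return
            // the whole internal buffer, including unused capacity.
            byte[] speechBytes = stream.ToArray();
            Console.WriteLine("Captured {0} bytes of audio.", speechBytes.Length);

            // Building a richer prompt: every StartVoice and StartStyle
            // is closed by a matching EndVoice or EndStyle before speaking.
            PromptBuilder builder = new PromptBuilder();
            builder.StartVoice(VoiceGender.Female, VoiceAge.Adult, 1);
            builder.AppendText("Hello, my name is Emily.");
            builder.EndVoice();
            builder.StartVoice(VoiceGender.Female, VoiceAge.Teen, 2);
            builder.AppendText("I am from New York City.");
            builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Strong });
            builder.AppendText("I really love chocolate!");
            builder.EndStyle();
            builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Reduced });
            builder.AppendText("But I'm allergic to it...");
            builder.EndStyle();
            builder.EndVoice();
            synth.Speak(builder);
        }

        Console.WriteLine("Press Enter to exit.");
        Console.ReadLine();
    }
}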
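The SelectVoiceByHints call in the listing only expresses a preference, and what you actually hear depends on the voices installed on the machine. As a minimal sketch, the program below lists the enabled voices so you can see what is available. The VoiceLister class and its DescribeVoices helper are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Speech.Synthesis;

static class VoiceLister
{
    // Hypothetical helper: returns one descriptive line per enabled installed voice.
    public static List<string> DescribeVoices(SpeechSynthesizer synth)
    {
        var lines = new List<string>();
        foreach (InstalledVoice voice in synth.GetInstalledVoices())
        {
            if (!voice.Enabled)
            {
                continue; // skip voices that are installed but disabled
            }

            VoiceInfo info = voice.VoiceInfo;
            lines.Add(string.Format("{0} ({1}, {2}, {3})",
                info.Name, info.Culture, info.Gender, info.Age));
        }
        return lines;
    }

    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            foreach (string line in DescribeVoices(synth))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

A name printed by this program can be passed to synth.SelectVoice(name) when you want one specific voice rather than a hint-based match.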
Code Outline
- Basic TTS
  Creates a SpeechSynthesizer instance and synthesizes the text "Hello, world!" using the default audio device.
- Changing the Voice
  Selects a female adult voice and synthesizes the text "Hello, I am a female voice!" using that voice.
- Changing the Rate and Volume
  Sets the speech rate to -2 (slower) and the volume to 100 (the maximum), and synthesizes the text "Hello, I am speaking slower and louder!".
- Pausing and Resuming Speech
  Synthesizes the text "Hello, I will pause for 3 seconds now.", pauses the synthesizer for 3 seconds, and then resumes it and synthesizes the text "I am back!". Because Speak blocks until the utterance has finished, the pause falls between the two utterances rather than in the middle of one.
- Saving Speech to a WAV File
  Sets the output of the SpeechSynthesizer to a WAV file named "output.wav", and synthesizes the text "Hello, I am saving my speech to a WAV file!".
- Setting the Speech Stream
  Sets the output of the SpeechSynthesizer to a memory stream, synthesizes the text "Hello, I am being streamed to a memory stream!", and gets the resulting speech bytes from the memory stream.
- Changing the Voice and Emphasis
  Uses the PromptBuilder class to create a more complex prompt, changing the voice for certain parts of the prompt and applying strong or reduced emphasis to others. The resulting prompt is then synthesized using the SpeechSynthesizer.
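To pause in the middle of an utterance, the speech has to run asynchronously so that the calling thread is free to call Pause while audio is still playing. The following is a minimal sketch of that idea, not a definitive implementation: AsyncPauseDemo and SpeakWithPause are hypothetical names, and the delays are arbitrary values chosen for illustration.

```csharp
using System;
using System.Speech.Synthesis;
using System.Threading;

static class AsyncPauseDemo
{
    // Hypothetical helper: starts speaking, pauses after playMs, resumes after pauseMs,
    // then waits until the synthesizer reports that the prompt has completed.
    public static void SpeakWithPause(SpeechSynthesizer synth, string text, int playMs, int pauseMs)
    {
        using (var done = new ManualResetEventSlim(false))
        {
            EventHandler<SpeakCompletedEventArgs> onCompleted = (sender, e) => done.Set();
            synth.SpeakCompleted += onCompleted;
            try
            {
                // SpeakAsync returns immediately; speech continues in the background.
                synth.SpeakAsync(text);

                Thread.Sleep(playMs);   // let part of the text play
                synth.Pause();
                Thread.Sleep(pauseMs);  // stay silent for a while
                synth.Resume();

                done.Wait();            // block until the prompt has finished
            }
            finally
            {
                synth.SpeakCompleted -= onCompleted;
            }
        }
    }

    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();
            SpeakWithPause(
                synth,
                "This sentence is long enough that the pause should land somewhere in the middle of it.",
                1500,
                3000);
        }
    }
}
```

Where exactly the audio stops depends on the speech engine, so the pause may not land at the precise moment Pause is called.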
These code examples demonstrate some of the basic and advanced functionality of the SpeechSynthesizer class, including changing the voice, rate, and volume, pausing and resuming speech, and saving synthesized speech to a file or memory stream.
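If you capture speech in memory often, the memory-stream steps can be wrapped in a small reusable helper. The sketch below is one possible shape for it: SpeechCapture, SynthesizeToWav, and the file name capture.wav are hypothetical names, not part of System.Speech. It reads the result with ToArray rather than GetBuffer, because GetBuffer returns the stream's whole internal buffer, including unused capacity.

```csharp
using System;
using System.IO;
using System.Speech.Synthesis;

static class SpeechCapture
{
    // Hypothetical helper: synthesizes text and returns the audio as WAV-formatted bytes.
    public static byte[] SynthesizeToWav(string text)
    {
        using (var synth = new SpeechSynthesizer())
        using (var stream = new MemoryStream())
        {
            synth.SetOutputToWaveStream(stream);
            synth.Speak(text);

            // Detach the stream from the synthesizer before reading it.
            synth.SetOutputToNull();

            // Copy only the bytes that were actually written.
            return stream.ToArray();
        }
    }

    static void Main()
    {
        byte[] wav = SynthesizeToWav("Hello from a byte array!");
        Console.WriteLine("Captured {0} bytes.", wav.Length);

        // The bytes can be stored, sent over a network, or written to disk.
        File.WriteAllBytes("capture.wav", wav);
    }
}
```

Returning a byte array keeps the caller independent of the synthesizer's lifetime, at the cost of holding the whole clip in memory.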
History
- 16th March, 2023: Initial version