Introduction
VoiceXML is a markup language for creating voice user interfaces. It uses speech recognition and/or touchtone (DTMF keypad) for input, and pre-recorded audio and text-to-speech synthesis (TTS) for output. It is based on the World Wide Web Consortium's Extensible Markup Language (XML), and leverages the web paradigm for application development and deployment. With VoiceXML, speech recognition application development is greatly simplified by using familiar web infrastructure, including tools and web servers.
With a common language, application developers, platform vendors, and tool providers can all benefit from code portability and reuse. For this sample, we are going to use Voicent Gateway as our VoiceXML server. A free version of the gateway can be downloaded from the Voicent website. You should be able to port this sample to other VoiceXML gateway servers with very little change.
In this article, we are going to develop an automated customer satisfaction survey application for an automobile service shop. This sample application will do the following:
- Read from a list of customers who brought their cars in for service.
For this sample, we are going to randomly select the customer name, phone number, date of the service, and the maker of the car. The idea is the same if you are using a database for the customer list.
- Automatically call these customers and collect their ratings (1-5) on the service provided.
The survey message: "Hi, this is ACME car service calling [customer name]. We have provided your car [maker of the car] for maintenance service on [date of the service]. Please rate our service by number 1 to 5, with 5 being the best. Thank you for your time."
Since no feedback can be collected from an answering machine, the answering machine message is: "Hi, this is ACME car service calling [customer name]. We have provided your car [maker of the car] for maintenance service on [date of service]. We'd like to thank you for your business. Please contact us if we can be of further help."
- Produce a survey report.
We want to know how many calls were made, how many calls were answered by an answering machine, how many calls found the line busy, and how many customers chose each rating.
To make an outbound call, a call request has to be sent to Voicent Gateway. The Gateway Tutorial covers the basic features of the call request handler. To make a call, simply send an HTTP POST request to the call scheduler.
Once a call request is scheduled with the Call Scheduler, it is put in the calling queue according to its call time. At the specified call time, the gateway will make the outbound call.
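As a rough illustration of the POST request described above, the following sketch builds a form-encoded request body and includes a helper that could send it. Only starturl is named in this article; the phoneno parameter name and any endpoint URL you pass to post() are placeholders assumed for this example, so check the Gateway Tutorial for the actual call request interface.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class CallRequestSketch {

    /** Encodes the parameters as an application/x-www-form-urlencoded body. */
    static String formEncode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return body.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    /** Builds the body of one call request. "phoneno" is an assumed parameter name. */
    static String buildCallRequest(String phoneNumber, String startUrl) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("phoneno", phoneNumber); // hypothetical name, not confirmed by this article
        params.put("starturl", startUrl);   // named in this article
        return formEncode(params);
    }

    /** Sends the body as an HTTP POST and returns the HTTP status code. */
    static int post(String endpoint, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        // Only prints the encoded body; no network call is made here.
        System.out.println(buildCallRequest("555-0100", "http://mydomain/myapp/start.jsp"));
    }
}
```

Keeping the body construction separate from the network call makes the encoding easy to check without a running gateway.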
When the system makes an outbound call, control flows through these steps:
- Fetch the starting VXML file according to starturl specified in the call request.
- Dial the phone number if there is a dial tone on the line.
- Detect the line status, such as no answer, line busy, answering machine, or live human pickup.
- Fetch the starting VXML file for the answering machine if the call is answered by an answering machine. The starting VXML file fetched initially is discarded.
- Execute the starting VXML file after the gateway detects a live human pickup (someone says "hello"), or after it detects the answering machine beep.
- The gateway interacts with the person called according to the VXML application.
- The gateway disconnects the call.
- The gateway saves the call status.
As stated in the previous section, the gateway is aware of two starting VXML files, one for human pickup and the other for an answering machine. The starting VXML file for human pickup is defined in starturl. The gateway first fetches the VXML file with no parameters. If the call is answered by an answering machine, the gateway discards the previously fetched VXML file and then fetches the VXML file with the parameter ans=t.
For example, suppose your starturl is defined as http://mydomain/myapp/start.jsp. The gateway gets the starting VXML file by sending an HTTP request to that URL, i.e., http://mydomain/myapp/start.jsp. If the call is answered by an answering machine, the gateway sends another HTTP request to the same URL with the parameter ans=t, i.e., http://mydomain/myapp/start.jsp?ans=t.
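The URL selection just described can be sketched in a few lines of Java. This is only an illustration of the rule, not the gateway's actual code; in particular, using "&" when the starturl already carries a query string is an assumption.

```java
final class StartUrlResolver {

    /**
     * Returns the URL to fetch: the starturl unchanged for a live pickup,
     * or the starturl with ans=t appended for an answering machine.
     */
    static String resolve(String startUrl, boolean answeringMachine) {
        if (!answeringMachine) {
            return startUrl;
        }
        // Assumption: append with '&' if the starturl already has a query string.
        String separator = startUrl.contains("?") ? "&" : "?";
        return startUrl + separator + "ans=t";
    }

    public static void main(String[] args) {
        String startUrl = "http://mydomain/myapp/start.jsp";
        System.out.println(resolve(startUrl, false));
        System.out.println(resolve(startUrl, true));
    }
}
```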
The following example JSP file plays live.wav for human pickup and answering.wav for an answering machine. The wave files must reside in the audio directory. For the time being, record any message in these two wave files. We'll add more features as this tutorial develops.
<?xml version="1.0"?>
<vxml version="1.0">
<%
    String ans = request.getParameter("ans");
    boolean isAnsweringMachine = ("t".equals(ans));
%>
<form id="td">
    <block>
<% if (isAnsweringMachine) { %>
        <audio src="audio/answering.wav"/>
<% } else { %>
        <audio src="audio/live.wav"/>
<% } %>
    </block>
</form>
</vxml>
Except for trivial applications, most voice applications require dynamically generated VXML files. This is especially true when your application needs to be integrated with a web site or a database.
Dynamically generated VXML files let the server decide, per call, what the caller hears and what input is collected. In this sample, we'll collect a key press from 1 to 5 from our customer. The following is the updated start.jsp file:
<?xml version="1.0"?>
<vxml version="1.0">
<%
    String ans = request.getParameter("ans");
    boolean isAnsweringMachine = ("t".equals(ans));
%>
<form id="td">
<% if (isAnsweringMachine) { %>
    <block>
        <audio src="audio/answering.wav"/>
    </block>
<% } else { %>
    <field name="rating">
        <prompt timeout="10s">
            <audio src="audio/live.wav"/>
        </prompt>
        <dtmf>
            1 | 2 | 3 | 4 | 5
        </dtmf>
        <filled>
            <submit next="recordrating.jsp" namelist="rating"/>
        </filled>
    </field>
<% } %>
</form>
</vxml>
As you can see, the key press is captured by the "rating" field of the VXML file. This value is then submitted to the next JSP file, recordrating.jsp:
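<?xml version="1.0"?>
<vxml version="1.0">
<%
    // Key pressed by the customer, submitted from the "rating" field of start.jsp.
    String key = request.getParameter("rating");
    // Totals are kept in application scope under "rate1" .. "rate5". The lock keeps
    // simultaneous calls on a multi-line system from losing an update.
    synchronized (application) {
        Integer rateTotal = (Integer) application.getAttribute("rate" + key);
        int newTotal = (rateTotal == null) ? 1 : rateTotal.intValue() + 1;
        application.setAttribute("rate" + key, Integer.valueOf(newTotal));
    }
%>
<form id="td">
    <block>
        <audio src="audio/thankyou.wav"/>
    </block>
</form>
</vxml>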
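Because several lines may report ratings at the same moment, the tally has to tolerate concurrent updates and should ignore anything other than the keys 1 to 5. The sketch below shows one possible alternative to storing Integer attributes in application scope: a small counter class built on ConcurrentHashMap and AtomicInteger. The class name RatingTally and its methods are hypothetical and not part of the sample source.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class RatingTally {

    private final ConcurrentHashMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    /** Records one rating. Returns false, and counts nothing, unless the key is "1".."5". */
    boolean record(String key) {
        if (key == null || !key.matches("[1-5]")) {
            return false;
        }
        // computeIfAbsent creates the counter once; incrementAndGet is atomic.
        counts.computeIfAbsent("rate" + key, k -> new AtomicInteger()).incrementAndGet();
        return true;
    }

    /** Returns the total stored under a name such as "rate3", or 0 if none was recorded. */
    int total(String name) {
        AtomicInteger counter = counts.get(name);
        return counter == null ? 0 : counter.get();
    }

    public static void main(String[] args) {
        RatingTally tally = new RatingTally();
        tally.record("5");
        tally.record("5");
        tally.record("3");
        System.out.println("rate5 = " + tally.total("rate5"));
    }
}
```

A single instance of such a class could be stored once as an application attribute, so that each request only calls record() instead of reading and rewriting a counter.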
Line status in VoiceXML is handled by VXML exceptions. The following are the exception values:
"telephone.noanswer"
"telephone.noline"
"telephone.linebusy"
"telephone.linedrop"
You can catch these exceptions in your own VXML code and handle them as needed, for example by submitting the status to a JSP page:
<form id="td">
    ...
    <catch event="telephone.noline">
        <submit next="..." namelist="..."/>
    </catch>
    <catch event="telephone.linebusy">
        <submit next="..." namelist="..."/>
    </catch>
    ...
</form>
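On the server side, a page that receives one of these events could translate it into a report category and keep a count. The following is a minimal sketch under that assumption; the event names come from the list above, while the class name, the labels, and the idea of passing the event name to the server are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LineStatusTally {

    private final Map<String, Integer> counts = new LinkedHashMap<>();

    /** Maps a gateway event name to a report label (labels are illustrative). */
    static String label(String event) {
        switch (event) {
            case "telephone.noanswer": return "No answer";
            case "telephone.noline":   return "No line";
            case "telephone.linebusy": return "Line busy";
            case "telephone.linedrop": return "Line dropped";
            default:                   return "Other";
        }
    }

    /** Adds one occurrence of the event to the tally. */
    void record(String event) {
        counts.merge(label(event), 1, Integer::sum);
    }

    /** Returns how many events were recorded under the given report label. */
    int count(String reportLabel) {
        return counts.getOrDefault(reportLabel, 0);
    }

    public static void main(String[] args) {
        LineStatusTally tally = new LineStatusTally();
        tally.record("telephone.linebusy");
        tally.record("telephone.noanswer");
        tally.record("telephone.linebusy");
        System.out.println("Line busy: " + tally.count("Line busy"));
    }
}
```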
The survey message as described in the introduction section:
"Hi, this is ACME car service calling [customer name]. We have provided your [maker of the car] for maintenance service on [date of the service], please rate our service by number 1 to 5, with 5 being the best. Thank you for your time."
When the gateway calls a customer, it gets the dynamically generated VXML file through the URL specified in the starturl parameter. When we send a call request to the gateway, we do not know when the gateway will call back for the VXML file, and on a multi-line system there will be concurrent requests to the starturl. The page therefore needs some way to tell which customer a given request is for.
To solve this problem, you can embed customer information in the starturl string, for example a customer ID or, as in this sample, the customer name, car maker, and service date as query parameters. When the gateway calls back, it submits these values back to the starturl. The updated start.jsp file is listed below:
<?xml version="1.0"?>
<vxml version="1.0">
<%
    String ans = request.getParameter("ans");
    boolean isAnsweringMachine = ("t".equals(ans));
    // Customer details passed in through the starturl query string.
    String customer_name = request.getParameter("customer_name");
    String car_maker = request.getParameter("car_maker");
    String service_date = request.getParameter("service_date");
%>
<form id="td">
<% if (isAnsweringMachine) {
    // Count calls answered by an answering machine for the survey report.
    synchronized (application) {
        Integer ansTotal = (Integer) application.getAttribute("anstotal");
        int newTotal = (ansTotal == null) ? 1 : ansTotal.intValue() + 1;
        application.setAttribute("anstotal", Integer.valueOf(newTotal));
    }
%>
    <block>
        <audio src="audio/acme.wav"/>
        <%=customer_name%>
        <audio src="audio/we_provide.wav"/>
        <%=car_maker%>
        <audio src="audio/maintenance.wav"/>
        <%=service_date%>
        <audio src="audio/thanks.wav"/>
    </block>
<% } else { %>
    <field name="rating">
        <prompt timeout="10s">
            <audio src="audio/acme.wav"/>
            <%=customer_name%>
            <audio src="audio/we_provide.wav"/>
            <%=car_maker%>
            <audio src="audio/maintenance.wav"/>
            <%=service_date%>
            <audio src="audio/please_press_15.wav"/>
        </prompt>
        <dtmf>
            1 | 2 | 3 | 4 | 5
        </dtmf>
        <filled>
            <submit next="recordrating.jsp" namelist="rating"/>
        </filled>
    </field>
<% } %>
</form>
</vxml>
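Two details matter when customer data travels through the starturl: the values must be URL-encoded when the starturl is built, and they should be XML-escaped before being written into the VXML output, since a name containing "&" or "<" would otherwise produce a malformed document. The helper below is a sketch of both steps. The parameter names match the JSP above; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class StartUrlParams {

    /** Builds a starturl that carries the customer details as query parameters. */
    static String buildStartUrl(String baseUrl, String customerName,
                                String carMaker, String serviceDate) {
        return baseUrl
                + "?customer_name=" + encode(customerName)
                + "&car_maker=" + encode(carMaker)
                + "&service_date=" + encode(serviceDate);
    }

    /** Escapes the five XML special characters so the value is safe inside VXML. */
    static String escapeXml(String value) {
        if (value == null) {
            return "";
        }
        // '&' must be replaced first so later entities are not escaped twice.
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;")
                    .replace("'", "&apos;");
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        System.out.println(buildStartUrl("http://mydomain/myapp/start.jsp",
                "John Smith", "Toyota", "2006-07-10"));
        System.out.println(escapeXml("Tom & Jerry's <Auto>"));
    }
}
```

In the JSP, the escaped value would be written out in place of the raw request parameter.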
Now that we have the application VXML files developed, we can start implementing the survey controls and report features, beginning with the survey start page.
To simplify the sample, the start page takes only a list of comma-separated phone numbers. On the server side, the application randomly assigns the other necessary values, such as the car maker and the date of service. In a real survey application, the start page would more likely run database queries to build the call list. The features related to Voicent Gateway, however, should be exactly the same.
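The server-side handling of the phone list can be sketched as follows. This is an assumed implementation, not the code shipped with the sample: the class name, the method names, and the list of car makers are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SurveyInput {

    // Sample values only; the real sample may use a different list.
    static final String[] CAR_MAKERS = {"Toyota", "Honda", "Ford"};

    /** Splits a comma-separated list, trimming whitespace and dropping empty entries. */
    static List<String> parsePhoneNumbers(String csv) {
        List<String> numbers = new ArrayList<>();
        if (csv == null) {
            return numbers;
        }
        for (String part : csv.split(",")) {
            String number = part.trim();
            if (!number.isEmpty()) {
                numbers.add(number);
            }
        }
        return numbers;
    }

    /** Picks a car maker at random from the sample list. */
    static String pickCarMaker(Random random) {
        return CAR_MAKERS[random.nextInt(CAR_MAKERS.length)];
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (String number : parsePhoneNumbers(" 555-0100, 555-0101,,555-0102 ")) {
            System.out.println(number + " -> " + pickCarMaker(random));
        }
    }
}
```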
After you click the Start Survey button, surveyHandler.jsp sends all the calls to the gateway and then returns a survey report page.
When you click the "Show current survey report" button, the browser sends a POST request to the same surveyHandler.jsp page, with action=report set. The handler will do the following:
// Count the calls still waiting to be made and the calls that failed.
int callsToBeMade = 0;
int callsFailed = 0;
for (int i = 0; i < m_callRecords.size(); i++) {
    CallRecord rec = (CallRecord) m_callRecords.get(i);
    if (rec.m_callStatus == null) {
        // Status not known yet: try to fetch it.
        getCallStatus(rec);
        if (rec.m_callStatus == null) {
            // Still no status, so the call has not been made yet.
            callsToBeMade++;
            continue;
        }
    }
    if (rec.m_callStatus.equals("Call Failed"))
        callsFailed++;
}
// Read the totals stored in application scope by start.jsp and recordrating.jsp.
int answeringTotal = getRateTotal(application, "anstotal");
int rate1 = getRateTotal(application, "rate1");
int rate2 = getRateTotal(application, "rate2");
int rate3 = getRateTotal(application, "rate3");
int rate4 = getRateTotal(application, "rate4");
int rate5 = getRateTotal(application, "rate5");
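Once the five rating totals are available, a report can also show the number of responses and the average rating. The sketch below shows that arithmetic; it is an optional extension and an assumption about what a report might include, and the class and method names are hypothetical.

```java
final class SurveyReport {

    /** Returns the number of customers who rated; counts[i] holds the total for rating i + 1. */
    static int totalRatings(int[] counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    /** Returns the weighted average rating, or 0.0 if nobody has rated yet. */
    static double averageRating(int[] counts) {
        int total = totalRatings(counts);
        if (total == 0) {
            return 0.0;
        }
        int weightedSum = 0;
        for (int i = 0; i < counts.length; i++) {
            weightedSum += (i + 1) * counts[i];
        }
        return (double) weightedSum / total;
    }

    public static void main(String[] args) {
        int[] counts = {1, 0, 2, 3, 4}; // rate1 .. rate5
        System.out.println("Responses: " + totalRatings(counts));
        System.out.println("Average rating: " + averageRating(counts));
    }
}
```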
For details, please check the source code for the sample. Audio files are also included in the sample. They were generated automatically using the Voicent Natural Text-to-Speech engine.
History
- 18th July, 2006: Initial post