Using XQuery and VoiceXML - IBM

Create a dynamic VoiceXML application

Skill Level: Intermediate

Martin Brown, Developer and writer, Freelance

06 May 2008

XQuery and VoiceXML are a perfect combination. XQuery provides a very simple and direct method to generate XML documents from other XML documents. Because you can pick and choose the different elements that you want from the source XML file, and format the output file in any way you wish, you can easily produce a VoiceXML document that contains the exact information you need. In this tutorial, you see how to employ XQuery with XML documents to build complex and dynamic systems that take input and information from a VoiceXML environment and combine them with existing XML documents to produce interactive applications.

Section 1. Before you start

This tutorial is for developers interested in implementing a VoiceXML solution using XQuery. Familiarity with XQuery is helpful but not required. Readers should be familiar with basic XML and RSS concepts.

About this tutorial

Frequently used acronyms

• DTMF: Dual-tone multi-frequency
• GNU: GNU's Not UNIX
• HTML: Hypertext Markup Language
• RSS: Rich Site Summary
• URL: Uniform Resource Locator
• XML: Extensible Markup Language
• XSLT: Extensible Stylesheet Language Transformations

© Copyright IBM Corporation 1994, 2008. All rights reserved.

XQuery provides a method to select and process different elements of an XML document and output them as a different XML or other structure. One of the main benefits of XQuery over other approaches is that you can contain the processing and the XML within the same file. This can make an XQuery document easier to process and manage than a traditional combination of XSLT and XML, or of a programming language (Perl, Python, Java, or others) and XML.

In this tutorial, you will build an application (see Resources to access a live demo) that takes a list of potential RSS feeds, organized by topic. The application provides the caller with the opportunity to choose a topic, then a feed within the topic list. The system then reads out the news generated from that feed.

Prerequisites

The following tools are required to follow along with this tutorial.

• The Qexo tool, which is part of the GNU Kawa implementation, is used for the examples in this tutorial.

• The SAXON XSLT and XQuery Processor can handle XQuery document processing.

Section 2. Basic XQuery - VoiceXML interaction

The fundamentals of XQuery and VoiceXML are actually very simple. The former is a method to process XML files into a format that you want. The latter is a format to describe voice-based applications and interfaces, using XML. Using the two together means that you can take advantage of the querying and formatting capability in XQuery to help generate and format VoiceXML.

Using XQuery with VoiceXML

The fundamentals of XQuery are well described in other articles and tutorials (see Resources), and using it with VoiceXML is largely no different from using it with other applications. You do, however, need to be careful about how you organize your application, how the application interacts with the different components, and how you use and exchange information between VoiceXML and the XQuery script.

At its core, XQuery combines the functionality of a basic XML parser with the selection language of XPath and some additional programming and logic. This enables you to make decisions and selections and to iterate over individual components within the source XML file.
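As a minimal sketch of those three ideas (selection, decision, and iteration), the following query runs against a small inline document rather than a file. The element and attribute names (catalog, item, kind) are hypothetical and chosen only for illustration; they are not part of the tutorial's application.

```xquery
(: Hypothetical inline data; a real application would load it with doc(). :)
let $catalog :=
  <catalog>
    <item kind="news">Morning headlines</item>
    <item kind="sport">Match report</item>
    <item kind="news">Evening bulletin</item>
  </catalog>
return
  <selected>{
    (: XPath selects the item elements; "at $pos" tracks the iteration. :)
    for $item at $pos in $catalog/item
    (: The where clause makes the decision: keep only the news items. :)
    where $item/@kind = "news"
    return <entry position="{$pos}">{data($item)}</entry>
  }</selected>
```

The result contains two entry elements, with positions 1 and 3, because the positional variable counts items in the selected sequence before the where clause filters them.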

Therefore, creating VoiceXML from XQuery is a case of generating the correct XQuery statements that will convert your base XML document into a VoiceXML document that you can use with your VoiceXML application.

The format of the VoiceXML file creates the application that you can interact with vocally. The sample code in this tutorial is available in an online demo using the Voxeo system (see Downloads), which allows you to dial and use the application over a standard telephone line or through a Skype connection. The VoiceXML standard is now spawning extensions for other interfaces, including Web browsers and standalone applications.

Component overview

For the examples in this tutorial, you will use Qexo, the GNU Kawa implementation of the XQuery standard. The Kawa library provides a very simple method to execute XQuery documents and produce the necessary XML.

At the command line, you can use Qexo to execute an XQuery document by specifying the location of the Kawa JAR file, the --xquery option, and then the XQuery document. For example:

$ java -jar kawa-1.9.1.jar --xquery myxqueryfile.xql

In this example, the JAR file was placed in the current directory, but you can place it anywhere.

Most VoiceXML application platforms (including Voxeo) rely on a Web application or location that provides the VoiceXML components. For interactivity, you need to be able to run the XQuery dynamically. This in turn generates the VoiceXML, and the voice platform (Voxeo) generates the voice and responds to user input. For the dynamic element, you can use Kawa in combination with Tomcat to run XQuery documents as applications.

Parameters or options selected by the user are then passed from the VoiceXML application as Web application parameters to the dynamic component, which parses them and uses them to generate new content.

For example, when you dial in to the example service, you connect to Voxeo. Voxeo accesses the Web application (based on XQuery). When a user selects an option, the selection is sent to the Web application with the appropriate Web parameters to execute the next phase.

You can see a basic overview of the application components in Figure 1.

Figure 1. The components required to run an XQuery and VoiceXML service

Figure 1 only shows the component overview. Now look at the application from the perspective of the user.

VoiceXML application overview

For the demonstration application in this tutorial, you will build an application that takes a list of potential RSS feeds, organized by topic, and gives the caller the opportunity to choose a topic and then a feed within that topic's list. The system then reads out the news generated from that feed.

You can see a basic layout of the Voice application interface in Figure 2.

Figure 2. The VoiceXML application process

For example, a typical conversation might resemble the one in Listing 1.

Listing 1. Simple application sequence

User: [dials in]
System: Welcome to the RSS feed reader. Please select a topic. For News, press 1. For Technology, press 2.
User: Press 1
System: Thank you, listing feeds in the News topic. Choose a feed. For BBC Latest News press 1. For CNN press 2.
User: Press 1
System: [starts reading news]

Although the example shows that the user pressed a number to choose the option, speaking the option is also valid.

The basic sequence is to generate a VoiceXML file that relates to the topics, then provide another VoiceXML document that provides the list of the available news feeds. Finally, once the user selects the news feed, the voice system reads out the news items.

Now look at the application, starting with the file that lists your potential topics and feeds.

Section 3. Converting data to VoiceXML with XQuery

The most basic process is to convert the file that contains your topic information and feed information into the VoiceXML required by your voice platform.

Creating a list of topics and feeds

XQuery can take a standard XML document and, by combining or reformatting the contents, generate a modified version of the information as XML, HTML, or another *ML derivative.

The XQuery script uses XPath to select elements, and enables you to change or summarize the output of a document according to your needs. Because the output is another XML or HTML document, the basic structure of the file looks like XML/HTML, but with the added components required to output and format the information in the way that you want.

For your application, start by creating an XML document that holds the list of available feeds. The rest of the application uses this feed information to present options to the user and, in turn, to locate the RSS content that it reads out.

Throughout this example, you will use files to hold the information to make the process of selecting and displaying the information that much easier. Listing 2 shows the main feedlist.xml file.

Listing 2. The feedlist.xml file containing topics and feed information

BBC Latest News bbc.xml
CNN Top Stories cnn.xml
MCSLP mcslp.xml
Computerworld computerworld.xml

The information (in Listing 2) is split into top level topics, using an attribute to identify the topic type, and then into individual feeds. For each feed, you need only two elements: the feed name and the XML file that holds the actual feed data.
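To make that structure concrete, here is a sketch of how such a document might be shaped and queried. The feedlist and topic elements and the category attribute match the XPath used later in this tutorial; the element names feed, name, and file are assumptions, as is the placement of the MCSLP and Computerworld feeds under a technology topic. The data is declared inline so that the query runs without any file on disk.

```xquery
(: Inline stand-in for feedlist.xml. The names feed, name, and file
   are assumptions made for this illustration. :)
let $feedlist :=
  <feedlist>
    <topic category="news">
      <feed><name>BBC Latest News</name><file>bbc.xml</file></feed>
      <feed><name>CNN Top Stories</name><file>cnn.xml</file></feed>
    </topic>
    <topic category="technology">
      <feed><name>MCSLP</name><file>mcslp.xml</file></feed>
      <feed><name>Computerworld</name><file>computerworld.xml</file></feed>
    </topic>
  </feedlist>
return
  <summary>{
    (: One output element per topic, with a count of its feeds. :)
    for $topic in $feedlist/topic
    return
      <topic name="{$topic/@category}" feeds="{count($topic/feed)}">{
        (: Each feed contributes its name and the file holding its data. :)
        for $feed in $topic/feed
        return <feed file="{data($feed/file)}">{data($feed/name)}</feed>
      }</topic>
  }</summary>
```

Keeping only a name and a file reference per feed means the menus can be built from this one small document, while the feed contents stay in their own files.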

Converting the topic list into VoiceXML

A very simple example of how this can work is to convert a list of possible news sources, your RSS feed list, into a simple menu that you can present to the user over VoiceXML.

To create your list of topics you need five elements:

• The basic VoiceXML document structure
• Introductory information
• The process to list the options to the user
• The process to generate validating values for the user input
• The definition for what to do with the option once the user selects it

Within VoiceXML, prompts are introduced using the prompt tag. The text-to-speech (TTS) functionality of the voice platform speaks the contents of these elements.

For a system that allows user selection or input, you create a form, and within the form you define a field for each item of input that you want to receive. Each choice that a field accepts should have a corresponding option tag, which holds the value that the field should contain once the user makes that selection. The option tag also provides a textual version of the value that the user can speak.

Finally, once the form is filled in, you specify what happens with the provided information.

Listing 3 shows an XQuery script that converts the feedlist.xml file into a VoiceXML document.

Listing 3. An XQuery script that converts the feedlist.xml file into a VoiceXML document

(: Generates one option per topic; the position becomes the DTMF key. :)
declare function local:list-topics-options($url) {
    for $topic at $pos in doc($url)/feedlist/topic/@category
    return <option dtmf="{$pos}">{data($topic)}</option>
};

(: Generates one spoken prompt per topic, naming the key to press. :)
declare function local:list-topics-names($url) {
    for $topic at $pos in doc($url)/feedlist/topic/@category
    return <prompt>For {data($topic)} press {data($pos)}.</prompt>
};

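<vxml version="2.1">
<form>
<block><prompt>Welcome to the Feed Reader service.</prompt></block>
<field name="topic">
<prompt>Choose a topic.</prompt>
{local:list-topics-names("feedlist.xml")}
{local:list-topics-options("feedlist.xml")}
<filled><prompt>Thank you - listing feeds in the <value expr="topic"/> topic.</prompt></filled>
</field>
</form>
</vxml>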

The XQuery document works like this:

• The list-topics-options function accepts a file name as an argument (the XML document to process) and generates the option tags required to validate user input. It extracts the list of available topics (from the category attribute) using the specified XPath. In addition to pulling out the topic, it also captures the position of each topic within the returned sequence and uses this as the equivalent DTMF (keypad) value for the option.

• The list-topics-names function iterates over the same list but, in this case, outputs a prompt block that specifies the value and its DTMF option. The voice platform reads out the list generated by this function.

• The rest of the document is the VoiceXML document wrapper that contains the definition for the form, the field (topic), and what to do with the result when you receive it from the user.
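To try the two functions without a feedlist.xml file on disk, a variant can take the parsed feedlist element instead of a file name. This is a sketch under assumptions rather than the tutorial's own script: the option element with its dtmf attribute and the prompt element are inferred from the description above, and the inline data carries only the category attributes that the functions read.

```xquery
(: Same logic as the two functions above, but each takes the parsed
   feedlist element instead of a file name. :)
declare function local:list-topics-options($feedlist) {
  for $topic at $pos in $feedlist/topic/@category
  return <option dtmf="{$pos}">{data($topic)}</option>
};

declare function local:list-topics-names($feedlist) {
  for $topic at $pos in $feedlist/topic/@category
  return <prompt>For {data($topic)} press {$pos}.</prompt>
};

(: Inline stand-in for feedlist.xml; only the category attributes matter here. :)
let $feedlist :=
  <feedlist>
    <topic category="news"/>
    <topic category="technology"/>
  </feedlist>
return
  <field name="topic">
    <prompt>Choose a topic.</prompt>
    {local:list-topics-names($feedlist)}
    {local:list-topics-options($feedlist)}
  </field>
```

Passing the element rather than a URL keeps the functions independent of where the data comes from, which is convenient when testing with a command-line processor.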

The output makes more sense when you look at the generated VoiceXML document.

Making VoiceXML interactive

Name the XQuery script topiclist.xql, and then use Qexo to generate the VoiceXML file. You will get the output in Listing 4.

Listing 4. The generated VoiceXML topic list

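<vxml version="2.1">
<form>
<block><prompt>Welcome to the Feed Reader service.</prompt></block>
<field name="topic">
<prompt>Choose a topic.</prompt>
<prompt>For news press 1.</prompt>
<prompt>For technology press 2.</prompt>
<option dtmf="1">news</option>
<option dtmf="2">technology</option>
<filled><prompt>Thank you - listing feeds in the <value expr="topic"/> topic.</prompt></filled>
</field>
</form>
</vxml>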

The generated VoiceXML can be identified in separate blocks like this:

1. The opening section reads out the title of the service.

2. The next section specifies a new input field (topic): <field name="topic">.

3. The voice platform will read out each potential option: <prompt>For news press 1.</prompt>.

4. To specify what happens for the different options, you specify the value and its DTMF equivalent: <option dtmf="1">news</option>.

Once the user makes a selection, the voice platform populates the field in the VoiceXML form with the corresponding value. For example, in Listing 4, 'news' is the first option. If the user speaks 'news' or presses 1 on the keypad, the topic field will contain the value 'news'.

VoiceXML is largely a one-way solution: you generate VoiceXML and output the prompts and other information so that users can listen to the options and information.

VoiceXML does not offer interaction in the same way as a typical Web interface or application. Within VoiceXML, you can configure the parameters that a prompting operation collects, but the VoiceXML does not describe how to do anything with those parameters. Instead, it has to hand off the parameters to an application that is capable of generating VoiceXML in response.
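As a sketch of that hand-off, the receiving query can take the chosen topic and generate the next VoiceXML document, here the feed menu for that topic. Everything in this example beyond the feedlist/topic/@category structure is an assumption: the function name local:feed-menu, the element names feed, name, and file, the field name feed, and the vxml version. The literal "news" stands in for the parameter that the voice platform would send; how that parameter reaches the query depends on the hosting environment.

```xquery
(: Builds the feed menu for one topic. The element names feed, name,
   and file are assumptions; the real feedlist.xml may use others. :)
declare function local:feed-menu($feedlist, $topic) {
  let $feeds := $feedlist/topic[@category = $topic]/feed
  return
    <vxml version="2.1">
      <form>
        <field name="feed">
          <prompt>Choose a feed.</prompt>
          {
            (: Spoken menu: one prompt per feed with its keypad digit. :)
            for $feed at $pos in $feeds
            return <prompt>For {data($feed/name)} press {$pos}.</prompt>
          }
          {
            (: value carries the file name; the text is what the caller can say. :)
            for $feed at $pos in $feeds
            return
              <option dtmf="{$pos}" value="{data($feed/file)}">{data($feed/name)}</option>
          }
        </field>
      </form>
    </vxml>
};

(: Inline stand-in for feedlist.xml. :)
let $feedlist :=
  <feedlist>
    <topic category="news">
      <feed><name>BBC Latest News</name><file>bbc.xml</file></feed>
      <feed><name>CNN Top Stories</name><file>cnn.xml</file></feed>
    </topic>
    <topic category="technology">
      <feed><name>MCSLP</name><file>mcslp.xml</file></feed>
    </topic>
  </feedlist>
(: "news" stands in for the parameter received from the voice platform. :)
return local:feed-menu($feedlist, "news")
```

Each round trip follows the same pattern: VoiceXML collects a value, the application receives it as a parameter, and a query turns it into the next VoiceXML document.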
