Talking Web Clients with JavaScript and the Speech API

Introduction

I was surfing the web and saw the announcement of Microsoft Surface. Surface right now is a tabletop Windows Vista machine with cameras pointing out, no keyboard, and no mouse. Interactions are done with the surface of the device. Combined with the cameras and Bluetooth capabilities, Surface interacts with objects in the real world. For example, lay an enabled cell phone on the device and Surface downloads your pictures or possibly synchronizes your phone contacts with your Outlook contacts. I can’t do the device justice here; it’s one of those things you will have to see to believe (www.microsoft.com/surface). It’s worth seeing.

I was so intrigued by a device that interacts naturally with the physical world that I became interested in the day-to-day possibilities of more natural interaction with computing devices. (I am also interested in writing a book about Surface. Hint: Microsoft, send me a Surface machine, for research purposes of course.) This article is an offshoot of that interest.

This article demonstrates how to make your web clients read some of their text content aloud to users. I am aware of accessibility tags in HTML, but I wanted to know how a programmer could use text-to-speech for routine development purposes: first, making debug and trace information audible, and second, letting everyday end users hear content. And, I will admit, it’s fun too. In this article, you will experiment with JavaScript from the command line, learn a JavaScript debugging technique that might be useful, and see how to load the Speech API and ask it to read the ALT (text) attribute of HTML controls.

Even if you aren’t that interested in new computing techniques or speech, you will find the JavaScript techniques helpful.

Running JavaScript from the Command Line

JavaScript can be used for more than snappy web clients and as part of the plumbing that makes Ajax work. (And, Ajax is a very cool capability of web programming.) JavaScript can be run from the command line to perform routine tasks.

If you create a .js file and double-click it in Explorer or type the file name at the command line (cmd.exe), the Windows Script Host (WScript.exe) will run the JavaScript like a program. Try this example:

  1. Open Explorer.
  2. Create a new text file and name it Hello.js.
  3. Type: WScript.Echo("Microsoft Surface is cool!");
  4. Save the file and double-click it in Explorer.

The WScript host will execute the script and display a standard message box containing the text. Listing 1 demonstrates JavaScript that will create a desktop URL shortcut (to Microsoft’s Popfly—Code mashup tool). (Figure 1 shows a Mashup sample I created on Popfly, which is based on Microsoft’s Silverlight technology.)

Listing 1: JavaScript that creates a desktop shortcut to Microsoft’s Popfly.

var shell      = WScript.CreateObject("WScript.Shell");
var desktop    = shell.SpecialFolders("Desktop");
var url        = shell.CreateShortcut(desktop + "\\Popfly.url");
url.TargetPath = "http://www.popfly.ms";
url.Save();
WScript.Echo("Shortcut added to Popfly");

Figure 1: My FlickringVirtualEarth Mashup that shows the location of pictures from Flickr on VirtualEarth.
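
Command-line scripts can also read arguments. The sketch below is only an illustration: the function names buildGreeting, getArguments, and echo are made up for this example, and the fallback to console.log is an assumption added so the same file also runs outside Windows Script Host.

```javascript
// Build the message separately so the logic does not depend on the host.
function buildGreeting(names) {
   var list = names.length ? names.join(", ") : "world";
   return "Hello, " + list + "!";
}

// Collect command-line arguments when running under Windows Script Host.
// Outside WSH (an assumption for portability) this returns an empty array.
function getArguments() {
   var args = [];
   if (typeof WScript !== "undefined") {
      for (var i = 0; i < WScript.Arguments.length; i++) {
         args.push(WScript.Arguments(i));
      }
   }
   return args;
}

// Write text with WScript.Echo when it exists, otherwise with console.log.
function echo(text) {
   if (typeof WScript !== "undefined") {
      WScript.Echo(text);
   } else if (typeof console !== "undefined") {
      console.log(text);
   }
}

echo(buildGreeting(getArguments()));
```

Saved as Greet.js (a hypothetical file name), the script could be started with cscript Greet.js Paul. Under cscript.exe, WScript.Echo writes to the console instead of opening a message box.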

Adding Speech to a Web Client

You can use the Speech API with just a few lines of code. You will need to create an ActiveXObject passing the name of the Speech API component, SAPI.SpVoice. (You can download the Speech API for free, if it’s not installed on your computer already. Check out http://msdn2.microsoft.com/en-us/library/ms723627.aspx.) Next, you can pick a voice, and optionally set the speech rate and volume. Finally, send the text you’d like to be spoken (see Listing 2).

Note: Other browsers may not support ActiveX objects, but this technique works great on the most popular browser, IE.

Listing 2: The bare-bones code it takes to read some hard-coded text.

var voice = new ActiveXObject("SAPI.SpVoice");
voice.Speak("Microsoft Surface is cool!");
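
Because ActiveX may be unavailable or blocked, it can help to wrap the creation of the voice object. The following is a minimal sketch, not part of the original listings; createVoice and speakIfPossible are hypothetical helper names.

```javascript
// Returns a SAPI voice object, or null when the host cannot create one
// (no ActiveX support, ActiveX blocked by security settings, or SAPI missing).
function createVoice() {
   try {
      return new ActiveXObject("SAPI.SpVoice");
   } catch (e) {
      return null;
   }
}

// Speaks the text when a voice is available and reports whether it did.
function speakIfPossible(voice, text) {
   if (!voice) {
      return false;
   }
   voice.Speak(text);
   return true;
}

var voice = createVoice();
speakIfPossible(voice, "Microsoft Surface is cool!");
```

With this arrangement, a host without ActiveX simply stays silent instead of stopping the script with an error.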

You can select from the available voices and adjust the rate and volume with code like that shown in Listing 3. The rate can range from -10 (slowest) to 10 (fastest), with 0 as the default; the volume can range from 0 to 100, which is the loudest; and the voice selected has to be an installed voice.

Listing 3: Speech script that sets the rate of speech, volume, and picks from an available voice.

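var voice  = new ActiveXObject("SAPI.SpVoice");   // same object as in Listing 2
var voices = voice.GetVoices();
var len    = voices.Count;
WScript.Echo("Length: " + len);
if (len >= 2)
{
   // Voices are indexed from 0, so index 1 is the second installed voice.
   WScript.Echo("Voice: " + voices.Item(1).GetDescription());
   voice.Voice = voices.Item(1);
}
voice.Rate   = 1;     // -10 (slowest) to 10 (fastest); 0 is the default
voice.Volume = 75;    // 0 to 100
voice.Speak("Microsoft PopFly is cool too!");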

The code above gets the collection of available voices. The count is retrieved and displayed, followed by a description of the voice at index 1. On my machine, I have three voices and the voice at index 1 is LH Michelle. The speech rate is set to 1 (slightly faster than the default of 0) and the volume is set to 75, three-quarters of the maximum.
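
Hard-coding index 1 only works when the voices happen to be installed in that order. As a hedged alternative, the sketch below looks a voice up by its description and keeps the rate and volume inside SAPI’s ranges. The helper names clamp, findVoiceIndex, and configureVoice are assumptions made for this example.

```javascript
// Keep a number inside the range [low, high].
function clamp(value, low, high) {
   return Math.max(low, Math.min(high, value));
}

// Return the index of the first voice whose description contains `name`
// (case-insensitive), or -1 when none matches.
// `voices` is the collection returned by GetVoices().
function findVoiceIndex(voices, name) {
   var wanted = name.toLowerCase();
   for (var i = 0; i < voices.Count; i++) {
      var description = String(voices.Item(i).GetDescription());
      if (description.toLowerCase().indexOf(wanted) !== -1) {
         return i;
      }
   }
   return -1;
}

// Select a voice by name if it is installed, then set rate and volume.
// Returns the index that was selected, or -1 if the default voice was kept.
function configureVoice(voice, name, rate, volume) {
   var voices = voice.GetVoices();
   var index = findVoiceIndex(voices, name);
   if (index !== -1) {
      voice.Voice = voices.Item(index);
   }
   voice.Rate = clamp(rate, -10, 10);      // SAPI rate range
   voice.Volume = clamp(volume, 0, 100);   // SAPI volume range
   return index;
}
```

With a SAPI.SpVoice object in a variable named voice, a call such as configureVoice(voice, "Michelle", 1, 75) would give the same settings as Listing 3 on a machine where that voice is installed, and would keep the default voice elsewhere.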

Listing 4 combines the elements you have seen so far with some new features. I have introduced the debugger keyword that causes the debugger to break precisely at that line, making it easier to target specific code for debugging. I have also introduced a try..catch block and some cleanup code (see Listing 4).

Figure 2: By adding a debugger statement to your code, you will be prompted to select a debugger (usually Visual Studio or Microsoft Script Editor (shown)) and execution will be suspended at the debugger statement.

Listing 4: Speech capability bound to the onclick event of an <img> control. (The numbering is for reference only.)

 1: <html  >
 2: <head runat="server">
 3:    <title>Talking Page</title>
 4:    <script language="javascript">
 5:    try
 6:    {
 7:       var voice = new ActiveXObject("SAPI.SpVoice");
 8:    }
 9:    catch(oException){}
10:
11:    function SpeakIt()
12:    {
13:       try
14:       {
15:          debugger;
16:          var phoneNumberText = event.srcElement.alt;
17:          if(voice)
18:          {
19:            // set voice to LH Michelle
20:            voice.Voice = voice.GetVoices()(1);
21:            voice.Rate   = 2;      // a bit faster than the default (0)
22:            voice.Volume = 100;    // maximum volume
23:            voice.Speak(phoneNumberText, 1);
24:          }
25:          event.cancelBubble = true;
26:       }
27:       catch(oException)
28:       {
29:          alert(oException.message);
30:       }
31:    }
32:
33: </script>
34: <script for="window" event="onunload" language="javascript">
35: if(voice) voice = null;   // release the SAPI object when the page unloads
36: </script>
37: </head>
38: <body>
39:    <form id="form1" runat="server">
40:    <div>
41:       <img alt="Paul's Number is (555) 555-1212 extension 3333"
               src="images/phone.GIF"
42:            onclick="SpeakIt()">
43:    </div>
44:    </form>
45: </body>
46: </html>

Tip: You will have to enable ActiveX content for your browser by selecting Tools|Internet Options, navigating to the Security tab, selecting Local intranet, clicking Custom level, and choosing Enable or Prompt for the ActiveX controls and scripting settings.

The HTML in Listing 4 is a basic .HTML page. (This code will work in an .ASPX page too.) In the header section, you have a <Script> block that has a startup script that creates the ActiveXObject instance of the SAPI.SpVoice object—see lines 5 through 9. The SpeakIt function—lines 11 through 31—is bound to the onclick event of the <img> tag. The SpeakIt function reads the text in the ALT attribute of the <img> tag.
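
The lookup of the ALT text can be separated from the speaking, which makes it easier to handle elements that have no ALT attribute. This is a sketch under assumptions: getSpokenText and speakElement are hypothetical names, and falling back to the TITLE attribute is a choice made for this example rather than something Listing 4 does.

```javascript
// Pick the text to speak for an element: ALT first, then TITLE, else "".
function getSpokenText(element) {
   if (!element) {
      return "";
   }
   return element.alt || element.title || "";
}

// Click handler sketch. It accepts the event object as an argument and
// falls back to window.event, which is how IE exposes the current event.
function speakElement(e, voice) {
   var evt = e || window.event;
   var source = evt.srcElement || evt.target;
   var text = getSpokenText(source);
   if (voice && text) {
      voice.Speak(text, 1);   // 1 = speak asynchronously
      return true;
   }
   return false;              // nothing to say, or no voice available
}
```

In a page like Listing 4, the image could call it with onclick="speakElement(event, voice)", where voice is the object created in the startup script.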

The debugger statement on line 15 causes the script debugger to break into the code each time SpeakIt runs, as long as script debugging is enabled in the browser. (Once you are comfortable with the code, simply comment line 15 out.) A try..catch block (lines 13 and 27) responds to errors by displaying the text of the exception object. Finally, when the window closes, the voice object is released.
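
The same voice object can make trace information audible, which was one of the motivations mentioned in the introduction. Below is a minimal sketch, assuming a hypothetical makeTracer helper and a caller-supplied fallback function for hosts where speech is unavailable.

```javascript
// Build a trace function. It speaks each message when a voice is available
// and otherwise hands the message to `fallback` (for example, alert).
function makeTracer(voice, fallback) {
   return function (message) {
      var text = "Trace: " + message;
      if (voice) {
         try {
            voice.Speak(text, 1);   // 1 = asynchronous, so the script keeps running
            return "spoken";
         } catch (e) {
            // Speaking failed; fall through to the fallback below.
         }
      }
      fallback(text);
      return "logged";
   };
}
```

A page could create the tracer once, for instance var trace = makeTracer(voice, function (t) { alert(t); });, and then call trace("entering SpeakIt") wherever a spoken checkpoint is useful.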

Summary

By combining JavaScript, ActiveX, and debugging techniques, you can add some fun and advanced features to your web clients. The code demonstrated in this article works on the command line with the Windows Script Host or in HTML or .ASPX pages on web clients.

In this article, in addition to learning about the Windows Script Host, you learned a handy JavaScript debugging technique, how to create ActiveXObjects, and how to incorporate the Microsoft Speech API into web clients. Using the ActiveXObject does limit you to IE for the most part, but incorporating speech into your web applications is fun for you and can be engaging for your users.

I also introduced Microsoft Surface and Popfly. These are new and compelling products (or tools) that will soon be available. Check them out and let me know what you think, or blog about them.

About the Author

Paul Kimmel is the VB Today columnist for www.codeguru.com and has written several books on object-oriented programming and .NET. Check out his new book UML DeMystified from McGraw-Hill/Osborne. Paul is a software architect for Tri-State Hospital Supply Corporation. You may contact him for technology questions at pkimmel@softconcepts.com.

If you are interested in joining or sponsoring a .NET Users Group, check out www.glugnet.org.

Copyright © 2007. All Rights Reserved.
By Paul Kimmel. pkimmel@softconcepts.com
