Current speech recognition technology cannot translate speech to text completely and accurately. Microsoft knows this and, with the Longhorn Speech API, gives developers a way to work around the shortcoming without the hassle of training the computer on the user's voice. With this burden lifted from the user, the technology offers a much more pleasant user experience (referred to as UX in current Microsoft documentation), and it even learns to increase its accuracy over time.
This article examines two technologies that Microsoft provides to improve UX, which is the crux of the Longhorn movement: recognizers in speech applications and the tile element in Windows GUI applications.
The Speech Recognizers
Recognizers are an interesting concept. In Longhorn, and in .NET development in general, they provide multiple suggestions for a piece of input. The recognizer seems to have grown out of the Tablet PC codebase, which is known for its user-assisted conversion of ink input into strings that can be manipulated from the Windows API.
Because the Speech API is not referenced by default, you must add a reference to System.Speech. You'll find the assembly with the rest of the common auxiliary system assemblies: c:\windows\Microsoft.NET\Windows\v6.0.4030\System.Speech.dll. You can then add the recognizer to the application by importing the System.Speech.Recognition namespace and declaring a SystemRecognizer (known in this project simply as recognizer). As the recognizer's event handlers show (namely the Recognizer_RejectedRecognition handler), whenever the user does not agree with a suggestion, the handler calls the DisplayResult method.
DisplayResult determines how the application handles alternate suggestions for user input. The RecognitionResult variable, result, holds in its Alternates object a list of possible values the user might have meant when speaking input for the program. These RecognitionPhraseAlternates are then added to the alternates list box, which lets the user choose from the possibilities after the initial rejection.
This sample also adds an event handler to the alternates list box. When the user selects an alternate, the handler reads it back through the Voice object in the System.Speech.Synthesis namespace. For further processing, you can add code that builds the selected alternate into the final output string. In this case, however, assume that the selected alternate is the final output string.
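The rejection-and-selection flow described in this section can be sketched without the Longhorn API. The Python below is a language-neutral model only, not the System.Speech API: the Alternate, RecognitionResult, and AlternatesBox classes, the on_rejected_recognition function, and the sample phrases are all hypothetical stand-ins chosen for illustration. It assumes the alternates arrive as an ordered list of text values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alternate:
    """Hypothetical stand-in for one recognition alternate."""
    text: str


@dataclass
class RecognitionResult:
    """Hypothetical stand-in for a result that carries its alternates."""
    text: str
    alternates: List[Alternate] = field(default_factory=list)


class AlternatesBox:
    """Minimal model of the list box that shows the alternates."""

    def __init__(self) -> None:
        self.items: List[str] = []
        self.spoken: List[str] = []  # what a synthesizer would read back
        self.final_output: Optional[str] = None

    def display_result(self, result: RecognitionResult) -> None:
        # Replace any earlier suggestions with this result's alternates,
        # keeping the order in which they were reported.
        self.items = [alt.text for alt in result.alternates]

    def select(self, index: int) -> str:
        # Selecting an entry records it for read-back and, as in the
        # sample, treats it as the final output string.
        chosen = self.items[index]
        self.spoken.append(chosen)
        self.final_output = chosen
        return chosen


def on_rejected_recognition(box: AlternatesBox, result: RecognitionResult) -> None:
    # Plays the role of the rejection handler: pass the result on for display.
    box.display_result(result)


if __name__ == "__main__":
    # Illustrative data only; these phrases are not taken from the sample.
    rejected = RecognitionResult(
        text="get my nose",
        alternates=[Alternate("get my news"), Alternate("get my notes")],
    )
    box = AlternatesBox()
    on_rejected_recognition(box, rejected)
    print(box.items)
    print(box.select(0))
```

Keeping the read-back and the final output in the same select step mirrors the sample's assumption that the chosen alternate is the final string. An application that composes longer output would separate the two.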
Introducing the Tile
New to the Longhorn version of Windows is the tile, an element of the Windows GUI that represents the evolution of the Windows Notification Area application (known in the industry as the system tray). As such, the tile can be a superset of a system tray application. Because the concept is new, it currently has a few monikers:
- "toast," because of its ability to pop up
- "flyout," for reasons this article discusses later
- "tile," for the view of a flyout
- Or simply, the new tray
The tile is an application that provides continuously updated information to the Windows user. By default, it is housed inside a bar at the right side of the screen (currently known as the sidebar). Once a tile application has been created and compiled in Whidbey, the user can right-click the upper part of the sidebar and check the tile to install it, which adds it to the sidebar. The tile also offers flyout functionality, which is very similar to opening a program that has been minimized to the system tray in order to adjust its settings. Through a flyout, the user can control the information currently shown in the tile view.
Developers who are new to Longhorn may have difficulty acclimating to the syntax for creating tiles and flyouts. Longhorn provides only the bare minimum of functionality for new developers, with no value-added services on top. However, newcomers will find a MyTile.cs file in the sample tile application for this article, "TileSample.zip," that shows the technique for adding a flyout class (the aptly named Flyout.XAML and its linked Flyout.XAML.CS). The XAML contains SimpleText and Button elements, which provide some descriptive text and a way to close the flyout. The method for closing a flyout may vary from application to application, but in this case a simple Button handles the task. The TileForeground class and its corresponding XAML define the default behavior and layout, respectively, of the actual sidebar tile.
The tile has many uses, including tiles that provide weather information, messaging and queuing information, and appointment and calendar information. A systems administrator, for instance, could watch for possible incoming attacks and other pertinent network activity in real time via a tile, customized with a flyout.
Available Speech Technology
For those who cannot wait for the technology mentioned in this article, some speech recognition is available today with the newly released Speech Application Server and the corresponding Speech Application SDK. This SDK allows a developer to create rich, accessible applications and Web sites that work according to the rules defined in GRXML (grammar-recognition XML) files, which were mentioned in the previous Longhorn Speech API article. In addition to documentation and samples, the SDK includes Visual Studio add-ins for speech-enabled applications. These add-ins provide an intuitive way to create GRXML files: to define how an application recognizes speech, the developer only needs to drag elements such as grammars, words, and lists into a visual designer space.
Currently, the add-ins support only Visual Studio 2003, and they won't install on Whidbey. However, the resulting GRXML files can be used in Longhorn applications and, more specifically, with the Longhorn Speech API. In fact, the examples in the previous article, "The Longhorn Speech API, an Initial Glance," were built by creating a grammar file and importing it via the Grammar.Load() method (see the CommandTest2 - GetMyNews application from the previous article).
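To give a sense of what such a grammar file contains, here is a minimal sketch that assumes the file follows the W3C SRGS XML form, with grammar, rule, one-of, and item elements. The sample grammar, its rule name, and its phrases are hypothetical and are not taken from the previous article's code. The list_phrases helper is likewise an illustration written in Python with the standard library. It is not part of the Speech Application SDK and does not show how Grammar.Load() works.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace used by W3C SRGS XML grammars.
SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical grammar: one public rule offering three command phrases.
SAMPLE_GRAMMAR = """<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="command">
  <rule id="command" scope="public">
    <one-of>
      <item>get my news</item>
      <item>get my mail</item>
      <item>get my weather</item>
    </one-of>
  </rule>
</grammar>
"""


def list_phrases(grammar_xml: str) -> Dict[str, List[str]]:
    """Map each rule id to the literal phrases found in its item elements."""
    root = ET.fromstring(grammar_xml)
    phrases: Dict[str, List[str]] = {}
    for rule in root.findall(f"{{{SRGS_NS}}}rule"):
        rule_id = rule.get("id", "")
        phrases[rule_id] = [
            item.text.strip()
            for item in rule.iter(f"{{{SRGS_NS}}}item")
            if item.text and item.text.strip()
        ]
    return phrases


if __name__ == "__main__":
    for rule_id, items in list_phrases(SAMPLE_GRAMMAR).items():
        print(rule_id, items)
```

A sketch like this reads only literal item text. Real grammars can also contain rule references, repeats, and semantic tags, which it ignores.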
Download the Code
The sample applications provided, RecognizerSample and TileSample, show what you can do by implementing the techniques mentioned in the article. All of the provided code compiles under PDC builds of Longhorn (4051), Whidbey (m2.030828-1205), .NET Framework (1.2.30703), and the Longhorn SDK. These samples will also compile under the WinHEC version of Longhorn, but you must change the MSAvalon namespaces to the System.Windows.* namespaces and create MSBuild path variables. The provided project file compiles via MSBuild.