How to Start a Universal Windows Platform (UWP) App Using Cortana

For many years now, the ability to speak a command and have your device fulfill the request has been an essential tool for some, and a plaything for others. The former group (and this list is in no way exhaustive) includes users such as drivers, medical professionals, and those who find keyboards unusable. With technology such as Cortana, voice recognition reaches a new level.

Now, Cortana offers far more than simply working out what you’ve said; the Cortana Intelligence Suite comes to mind when dealing with data analytics. For now, though, we’ll look at a much simpler use, and that is starting a UWP app using Cortana.

The App

The first thing you’ll need is an empty UWP app. I’m using Visual Studio 2015 to create the project, and once created, your solution may look like this…

Figure 1: The generated solution from the empty universal app template

Before we jump into writing code, we need to turn our attention to the voice command definitions. The definitions we’ll create shortly are how you tell Cortana what to listen for to start your app; to create them, you’ll need to add an XML file to your project.

I’ve added one named ExampleDefinition.xml to the root of my project folder, alongside App.xaml and MainPage.xaml. To get things started, let’s add the root element with its XML namespace defined, which makes things a little easier via IntelliSense. The element we’ll add is this…

<VoiceCommands
   xmlns="http://schemas.microsoft.com/voicecommands/1.2">

</VoiceCommands>

Inside this VoiceCommands element, we’re going to add a required element named CommandSet, which has some attributes we need to take note of. The first is the xml:lang attribute, which must be defined; the second is the Name attribute, which is optional. Name can be just some arbitrary string, and I added my CommandSet element like so…

<VoiceCommands
      xmlns="http://schemas.microsoft.com/voicecommands/1.2">
   <CommandSet xml:lang="en-gb"
               Name="CortanaTestCommandSet_en-gb">

   </CommandSet>
</VoiceCommands>

Now, we can jump into the meaty stuff. The first element we come to, which is optional but must appear first if you do opt to include it, is the AppName element. This element allows you to specify a more user-friendly name for your application, which can differ from its actual name if, for example, the real name is too long to say. This name is then listened for when the user speaks the command that starts your application; we’ll come to where it fits into the command shortly. First, this is what the element looks like when added to what we already have…

<VoiceCommands
      xmlns="http://schemas.microsoft.com/voicecommands/1.2">
   <CommandSet xml:lang="en-gb"
               Name="CortanaTestCommandSet_en-gb">
      <AppName>Cortana Test</AppName>
   </CommandSet>
</VoiceCommands>

There is an alternative to the AppName element, called CommandPrefix, and they are mutually exclusive. Again, this is something we’ll come back to in a moment.

Before we do, the next element we need to include after AppName is an Example element, which allows you to give an example of what the user can say. I have something like this…

<Example>Show photos from Gavin</Example>

At this point, the definition file is still missing some required elements, so it isn’t usable yet. But I’ll quickly show you what the above achieves once the definition file is installed.

When I press the Cortana icon and speak the words ‘What can I say?’, Cortana shows me a list of things I can do with Cortana. And, if we look at the bottom of the list, we can see our UWP app and the example we specified, under the app name we defined. Again, this won’t work for you yet, but it does give you an idea of where we’re going with this article.

Figure 2: Our app name and example showing on the Cortana Canvas after speaking ‘What can I say?’

If you were to click one of these items in the list, you’d be presented with another list showing examples of all the commands you can use with this app. So now, let’s actually define a command Cortana will listen for.

After our Example element, we need to add the Command element and give it a name.

<Command Name="show">
</Command>

The elements that follow will be children of the Command element, starting with an example. The example we define as a child of the Command element appears where we described earlier: in the list shown when you click the app after saying the words ‘What can I say?’. When added, the result looks like this…

<Command Name="show">
   <Example>Show photos from Gavin</Example>
</Command>

Following the Example element, we now add the element we’ve been waiting for: the ListenFor element.

My ListenFor element is as follows…

<ListenFor RequireAppName="AfterPhrase">
           Show [today's] {item} from {username}
</ListenFor>

…and has a few points of note we need to talk about.

The first is the RequireAppName attribute, which is used in conjunction with the AppName element we defined earlier. I have entered AfterPhrase above, which means that after the user has spoken the phrase ‘Show photos from {username}’, they also need to say ‘on Cortana Test’, where ‘Cortana Test’ is the name we defined in the AppName element at the start. There are a few other options, such as BeforePhrase, which speaks for itself. But also recall I mentioned that the AppName element can be replaced with CommandPrefix. The CommandPrefix element achieves exactly the same as what we have here; it appends ‘Cortana Test’ after the phrase specified in the ListenFor element, but without the need for the RequireAppName attribute.
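
For reference, here’s a minimal sketch of what the CommandPrefix alternative might look like, based purely on the description above; treat it as an assumption to check against the voice command schema documentation…

<CommandSet xml:lang="en-gb" Name="CortanaTestCommandSet_en-gb">
   <!-- CommandPrefix replaces AppName (the two are mutually
        exclusive) and removes the need for the RequireAppName
        attribute on each ListenFor element. -->
   <CommandPrefix>Cortana Test</CommandPrefix>
   <!-- Example, Command, and the rest follow as before. -->
</CommandSet>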

Moving on, you’ll notice that some of the text in the phrase is wrapped in brackets, both square and curly. We come to [today’s] first, and this one is simple: square brackets mark that part of the phrase as optional, meaning it may or may not be spoken by the user. The curly brackets are a reference to a PhraseList or PhraseTopic element that you specify. Neither of these elements is a child of Command, and we’ll specify them now. However, we haven’t finished with the children of the Command element yet, and we’ll return to them in due course.

As a direct child of CommandSet, I have these elements defined…

<PhraseList Label="item">
   <Item>images</Item>
   <Item>photos</Item>
</PhraseList>
<PhraseTopic Label="username">
   <Subject>Person Names</Subject>
</PhraseTopic>

From the phrase specified in ListenFor, we can see that ‘item’ and ‘username’ reference these two elements. The first, ‘item’, is a phrase list, and as such is a hardcoded list of the possible words or phrases the user can speak in a given command. For this example, I have ‘images’ and ‘photos’ set as the two possibilities the user needs to speak for this command to execute. The second is a little more complicated, possibly more so than there is time for in this article. In short, a phrase topic gives you a way to listen for specific words and phrases in a less hardcoded fashion than a phrase list provides. The children of the phrase topic, the Subject elements, provide a given set of options, such as ‘Person Names’, to help refine and constrain the possible words or phrases a user might say.
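
As a brief illustration of those refinements, a PhraseTopic can also carry a Scenario attribute alongside its Subject children. The sketch below assumes the ‘Natural Language’ scenario value; check the voice command schema documentation for the definitive list of scenarios and subjects…

<PhraseTopic Label="username" Scenario="Natural Language">
   <!-- Scenario hints at the style of speech to expect, and each
        Subject constrains it further; both values here are
        illustrative assumptions. -->
   <Subject>Person Names</Subject>
</PhraseTopic>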

Going back to the children of the Command element as promised, let’s look at the last two items.

<Feedback>Showing {item} from {username}</Feedback>
<Navigate/>

We have Feedback and Navigate. The former is the feedback Cortana gives to the user when the command succeeds. On this occasion, I’ve specified that I want Cortana to show or say something like ‘Showing images from Mike’. As a note of caution: if you reference a phrase list or phrase topic in your Feedback, you must include the same references in your ListenFor.

It is also worth knowing that you can’t combine brackets in the phrase given to Cortana to listen for, or in the feedback given to the user. For example, [{username}] is not valid.
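
If you do want a reference to be effectively optional, one workaround (my own sketch, not part of the original definition) is to give the Command a second ListenFor variation without the reference; a Command may contain several ListenFor elements. Note that the Feedback then drops {username}, per the caution above…

<Command Name="show">
   <Example>Show photos from Gavin</Example>
   <!-- Cortana matches whichever variation the user speaks. -->
   <ListenFor RequireAppName="AfterPhrase">
      Show [today's] {item} from {username}
   </ListenFor>
   <ListenFor RequireAppName="AfterPhrase">
      Show [today's] {item}
   </ListenFor>
   <Feedback>Showing {item}</Feedback>
   <Navigate/>
</Command>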

Now, finally, we come to our last element: Navigate. It has a very simple function, which is to point the command towards a specific page in your application. I’ve left the element empty for now.

Let’s take a look at the full voice command definition XML file…

<?xml version="1.0" encoding="utf-8" ?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
   <CommandSet xml:lang="en-gb" Name="CortanaTestCommandSet_en-gb">
      <AppName>Cortana Test</AppName>
      <Example>Show photos from Gavin</Example>
      <Command Name="show">
         <Example>Show photos from Gavin</Example>
         <ListenFor RequireAppName="AfterPhrase">
            Show [today's] {item} from {username}
         </ListenFor>
         <Feedback>Showing {item} from {username}</Feedback>
         <Navigate/>
      </Command>
      <PhraseList Label="item">
         <Item>images</Item>
         <Item>photos</Item>
      </PhraseList>
      <PhraseTopic Label="username">
         <Subject>Person Names</Subject>
      </PhraseTopic>
   </CommandSet>
</VoiceCommands>

Giving the Voice Commands Definition to Cortana

Installing the definitions into Cortana is something we’ll do on application start. Open your App.xaml.cs and create an async void method (the method that installs the definitions is asynchronous and awaitable) with the following code…

async void RegisterExampleCommands()
{
   // Load the definition file from the app's install folder.
   // (Package lives in Windows.ApplicationModel; add a using
   // directive if you don't already have one.)
   var voiceDefinitions = await Package.Current.InstalledLocation
      .GetFileAsync("ExampleDefinition.xml");

   // Hand the definitions over to Cortana.
   await Windows.ApplicationModel.VoiceCommands
      .VoiceCommandDefinitionManager
      .InstallCommandDefinitionsFromStorageFileAsync(voiceDefinitions);
}

Then, call this method from the constructor of the App class, run the app, and your definitions should be installed into Cortana. If you now close the app down, you can use your voice command to start it up. You also can say ‘What can I say?’ to Cortana and look through the results to verify that your definitions were indeed installed.
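
For completeness, here’s what that call might look like in the App constructor; everything apart from the RegisterExampleCommands line is standard template code…

public App()
{
   this.InitializeComponent();
   this.Suspending += OnSuspending;

   // Install the voice command definitions with Cortana on startup.
   // This is fire-and-forget (async void); consider wrapping the
   // install call in try/catch in production code.
   RegisterExampleCommands();
}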

Feedback from Cortana Given to Your App

When Cortana starts your app, it starts in a specific way, and not via the normal route you’d expect when you click the icon. Go into your App.xaml.cs and override the OnActivated method, like so…

protected override void OnActivated(IActivatedEventArgs args)
{
}

The first thing we need to do is determine how the application was started. There are many ways the application can be activated, including (but not limited to) Search, File, Contact, and Wallet Action. The one we’re interested in is VoiceCommand. Let’s look for this kind of activation with the following code in our OnActivated method.

if (args.Kind == ActivationKind.VoiceCommand)
{
}

Once you know the application was started via a voice command, you can get at the data Cortana gives you. Consider the following code…

if (args.Kind == ActivationKind.VoiceCommand)
{
   var commandArgs = args as VoiceCommandActivatedEventArgs;
   var speechResult = commandArgs.Result;

   // Each semantic property is a list of strings; the first
   // entry holds the value the user actually spoke.
   var username = speechResult.SemanticInterpretation.
      Properties["username"][0];
}

From the above, we can grab hold of the username the user spoke as part of the voice command. Be sure to match the key you look up to the Label you gave your PhraseTopic or PhraseList.

I’ll leave it to your good self to explore further, because the event args passed in with a voice command offer quite a lot.
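
To give you a starting point, here’s a sketch of what a fuller handler might look like. RulePath and Text are members of the SpeechRecognitionResult returned by Cortana; the Frame handling and the MainPage navigation target are my own assumptions for illustration, not something the voice command system dictates…

protected override void OnActivated(IActivatedEventArgs args)
{
   base.OnActivated(args);

   if (args.Kind != ActivationKind.VoiceCommand)
      return;

   var commandArgs = (VoiceCommandActivatedEventArgs)args;
   var speechResult = commandArgs.Result;

   // RulePath[0] holds the Name of the matched Command ("show").
   string commandName = speechResult.RulePath[0];

   // The raw text Cortana recognized, e.g. "show photos from gavin".
   string spokenText = speechResult.Text;

   // The semantic properties we defined in the VCD file.
   string username = speechResult.SemanticInterpretation
      .Properties["username"][0];

   // Hypothetical navigation: ensure a root Frame exists, then pass
   // the spoken username to MainPage (an assumption for this sketch).
   var rootFrame = Window.Current.Content as Frame;
   if (rootFrame == null)
   {
      rootFrame = new Frame();
      Window.Current.Content = rootFrame;
   }
   rootFrame.Navigate(typeof(MainPage), username);
   Window.Current.Activate();
}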

I find this is quite a fun thing to play with, but at the same time, there are practical applications of this tech that can be rewarding. My own experience comes from building software for the manufacturing industry, and it’s certainly helped.

If you have any questions on this article, you can find me on Twitter @GLanata.
