Voice Command Enabling Your Software

IBM and Dragon have made quite a splash with their voice software enabling users to dictate text to their computers and use voice to control applications. This article presents an MFC class based on the Microsoft Speech SDK which will enable you to make your application work with these engines to accept voice commands.

To use this class you must have a copy of the Microsoft Speech SDK version 3 or above. I believe this is available on MSDN. It is also available from Microsoft's web site.

These classes basically wrap the IVoiceCmd COM interface. They are specifically designed to simplify the interface so you can put speech into your applications with minimum effort. They also provide some features above and beyond the SDK, including a built-in "What can I say?" command and confirmation prompts. My CDialog and CScrollView derived classes show how you can also voice enable Windows' built-in message boxes. If you wish to use this feature you will need to create similar classes for CView, CFormView etc. I leave this as an exercise for the reader.

The CCommandMenu Class

Before I leap into code I will outline each of the public methods on the CCommandMenu class which does most of the hard work.

   static CCommandMenu* CreateWindowCommandMenu(CWnd* pWnd, const CString& sVerify = "Do It",
   const CString& sWhatCanISay = "What Can I Say", const DWORD dwVerifyTimeout = 3000,
   CWnd* pMessageTarget = NULL);

This method is the only way in which a CCommandMenu object can be created. It returns NULL if the object cannot be created. The most typical reason it fails is that a suitable engine has not been installed on the user's machine. The parameters are ...

  • pWnd - This is the window that the commands are associated with. It must be a top-level window. Do not pass a CView derived class here. If you are voice enabling a view class then you must pass in the CMainFrame class here. The commands in this menu will only be available when this window is the active window.
  • sVerify - This is the phrase the user must say to confirm a voice command that requires confirmation, so the computer knows it heard the command correctly.
  • sWhatCanISay - This is the phrase the user can say to ask the computer which voice commands it understands.
  • dwVerifyTimeout - This is the number of milliseconds the computer allows for the user to confirm their command before it assumes it heard the command wrongly.
  • pMessageTarget - This is the window to send notifications to. If this parameter is NULL then notifications are sent to pWnd. If you are voice enabling a view and want the view to receive the voice messages then pass a pointer to the view here. If you are voice enabling a dialog then this can be NULL.
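The way sVerify and dwVerifyTimeout work together can be pictured as a small state machine. The sketch below is only an illustration under stated assumptions: it does not use MFC or the Speech SDK, ConfirmationGate and its methods are hypothetical names rather than part of CCommandMenu, and timestamps are passed in by the caller so the logic can be tried without a real timer.

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of the confirmation step; not the class's real code.
class ConfirmationGate
{
public:
    ConfirmationGate(const std::string& verifyPhrase, std::uint32_t timeoutMs)
        : m_verifyPhrase(verifyPhrase), m_timeoutMs(timeoutMs),
          m_pending(false), m_pendingId(0), m_heardAtMs(0) {}

    // A command that requires confirmation was recognised at time nowMs.
    void CommandHeard(std::uint32_t id, std::uint32_t nowMs)
    {
        m_pending = true;
        m_pendingId = id;
        m_heardAtMs = nowMs;
    }

    // A phrase was recognised at time nowMs. Returns true and fills
    // confirmedId only if the verify phrase arrived within the timeout.
    bool PhraseHeard(const std::string& phrase, std::uint32_t nowMs,
                     std::uint32_t& confirmedId)
    {
        if (!m_pending)
            return false;                 // nothing is waiting for confirmation
        if (nowMs - m_heardAtMs > m_timeoutMs)
        {
            m_pending = false;            // too late: drop the pending command
            return false;
        }
        if (phrase != m_verifyPhrase)
            return false;                 // some other phrase: keep waiting
        confirmedId = m_pendingId;
        m_pending = false;
        return true;
    }

private:
    std::string   m_verifyPhrase;
    std::uint32_t m_timeoutMs;
    bool          m_pending;
    std::uint32_t m_pendingId;
    std::uint32_t m_heardAtMs;
};
```

With the default arguments ("Do It" and 3000), a command heard at 1000 ms is confirmed by "Do It" at 2500 ms, while the same phrase 4000 ms after the command is ignored.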

   BOOL Deactivate(void);
   BOOL Activate(void);

These methods activate and deactivate the command menu. If you are adding several voice commands to the menu then you will find it runs faster if you deactivate the menu before the first add and then re-activate it after the last add.
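One way to make sure the menu is always re-activated after a batch of adds is a small scope guard. This is a sketch, not something the class provides: ScopedDeactivate is a hypothetical helper, and FakeMenu is a stand-in that exists only so the guard can be compiled and tried without MFC.

```cpp
// Hypothetical helper: deactivates a menu while the guard is alive and
// re-activates it when the guard goes out of scope, including on early return.
template <typename Menu>
class ScopedDeactivate
{
public:
    explicit ScopedDeactivate(Menu& menu) : m_menu(menu) { m_menu.Deactivate(); }
    ~ScopedDeactivate() { m_menu.Activate(); }

    ScopedDeactivate(const ScopedDeactivate&) = delete;
    ScopedDeactivate& operator=(const ScopedDeactivate&) = delete;

private:
    Menu& m_menu;
};

// Stand-in used only to exercise the guard; the real class is CCommandMenu.
struct FakeMenu
{
    bool active = true;
    int  adds   = 0;

    bool Deactivate() { active = false; return true; }
    bool Activate()   { active = true;  return true; }
    bool Add(const char* /*phrase*/, unsigned long /*id*/) { ++adds; return true; }
};
```

Because the guard only relies on the public Activate and Deactivate methods listed above, it should work with the real class as ScopedDeactivate<CCommandMenu> guard(*pMenu); where pMenu is the non-NULL pointer returned by CreateWindowCommandMenu.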


   BOOL Add(const CString& sCmd, const DWORD dwID, const BOOL fConfirmation,
 const CPoint& ptConfirmationPromptPos, const BOOL fSendWM_COMMAND);


This method adds a voice command that your window is listening for. The parameters are ...

  • sCmd - This is the word or phrase the computer will listen for. To improve the reliability of voice command recognition, make the commands sound as distinct from one another as possible without confusing your users.
  • dwID - This is an identifier for your command. It must be unique otherwise you won't be able to tell which command was spoken. If fSendWM_COMMAND is TRUE then it must be the control id of the button or menu you want to receive the WM_COMMAND message.
  • fConfirmation - If this is true then the user will be prompted to confirm their command before the command is executed. This should be used for any command that does anything that is not reversible.
  • ptConfirmationPromptPos - This is a point relative to the window where the confirmation prompt tooltip should be displayed.
  • fSendWM_COMMAND - If this is true then the voice command notification is sent to your window via a WM_COMMAND message. This simplifies your coding on the window but requires you to have a button for each voice command.
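Since duplicate ids make commands indistinguishable, it can help to keep the commands in a table and check the ids before registering them. The following is a minimal sketch; VoiceCommandSpec and IdsAreUnique are hypothetical names that merely mirror the Add parameters and are not part of the class.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one voice command, mirroring the Add() parameters
// (the prompt position is left out to keep the sketch free of MFC types).
struct VoiceCommandSpec
{
    std::string   phrase;         // sCmd
    unsigned long id;             // dwID
    bool          confirm;        // fConfirmation
    bool          sendWmCommand;  // fSendWM_COMMAND
};

// Returns true if every id in the table is distinct.
inline bool IdsAreUnique(const std::vector<VoiceCommandSpec>& table)
{
    std::set<unsigned long> seen;
    for (const VoiceCommandSpec& spec : table)
    {
        if (!seen.insert(spec.id).second)
            return false;  // this id was already used by an earlier entry
    }
    return true;
}
```

A debug-only check such as ASSERT(IdsAreUnique(table)) before looping over the table and calling Add for each entry would catch a copy-and-paste mistake early.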

   BOOL Remove(const CString& sCmd);
   BOOL Remove(const DWORD dwID);

These methods remove a voice command from the menu.

   BOOL RemoveAll(void);

This method removes all voice commands from the menu.

   BOOL EnableItem(const DWORD dwID, const BOOL fEnable);
   BOOL EnableItem(const CString& sCmd, const BOOL fEnable);

These methods are used to enable and disable voice commands. When a voice command is disabled it will not appear on the "What can I say?" list.
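The effect of EnableItem on the "What can I say?" list can be modelled in a few lines. This is an illustrative sketch only: WhatCanISayModel is a hypothetical stand-in and says nothing about how the real class stores its commands.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a command menu that tracks an enabled flag per item.
class WhatCanISayModel
{
public:
    void Add(const std::string& phrase, unsigned long id)
    {
        m_items.push_back(Item{phrase, id, true});  // new items start enabled
    }

    // Returns false if no item with this id exists.
    bool EnableItem(unsigned long id, bool enable)
    {
        for (Item& item : m_items)
        {
            if (item.id == id)
            {
                item.enabled = enable;
                return true;
            }
        }
        return false;
    }

    // The phrases that would be offered to the user: enabled items only.
    std::vector<std::string> Phrases() const
    {
        std::vector<std::string> result;
        for (const Item& item : m_items)
        {
            if (item.enabled)
                result.push_back(item.phrase);
        }
        return result;
    }

private:
    struct Item
    {
        std::string   phrase;
        unsigned long id;
        bool          enabled;
    };
    std::vector<Item> m_items;
};
```

After adding "Up" and "Down" and disabling "Down", Phrases() returns only "Up", which matches the behaviour described for the real list.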

   static BOOL IsAvailable(void);

This method can be called at any time to see if voice input is available on this machine.

Sample Project

Now that we understand the CCommandMenu class it is time to add some voice commands to a project. Our sample project is a very simple dialog-based project.

Step 1

In order to use this class your project needs to have the appropriate OLE code inserted. It may already be there, but it is worth checking.

In stdafx.h check that the following include statement is present:

#include <afxdisp.h>        // MFC OLE automation classes

In your application's InitInstance check that you are initialising the COM libraries:

	// Initialize OLE libraries
	if (!AfxOleInit())
	{
		AfxMessageBox("OLE did not initialise.");
		return FALSE;
	}

Step 2

We must change the class from which our CSampleDlg is derived. By default this was CDialog and we need to change it to CVoiceCommandDialog. A simple search and replace in the SampleDlg.h and SampleDlg.cpp files does the trick.

Step 3

In our dialog box constructor we can override some of the default settings used for message boxes.

   // enable voice input
   SetVoiceEnabled(TRUE);

   // enable voice input on message boxes
   SetMessageBoxVerify(TRUE);

Step 4

Now we need to add some voice commands. This is most easily done in the OnInitDialog member function. Don't forget to check that m_pVoiceCmd is not NULL before using it.

   if (m_pVoiceCmd != NULL)
   {
      m_pVoiceCmd->Deactivate();
      m_pVoiceCmd->Add("Close", m_Close.GetDlgCtrlID(), TRUE, CPoint(0,0), TRUE);
      m_pVoiceCmd->Add("Message box", m_MessageBox.GetDlgCtrlID(), TRUE, CPoint(0,0), TRUE);
      m_pVoiceCmd->Add("Up", 1, FALSE, CPoint(0,0), FALSE);
      m_pVoiceCmd->Add("Down", 2, FALSE, CPoint(0,0), FALSE);
      m_pVoiceCmd->Activate();
   }
   else
   {
      MessageBox("Voice command not available.", "Warning", MB_OK);
   }

Step 5

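We need to add a message handler to receive the command notifications. Only commands added with fSendWM_COMMAND set to FALSE ("Up" and "Down" in this sample) arrive here; the others are delivered as ordinary WM_COMMAND messages.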

BEGIN_MESSAGE_MAP(CSampleDlg, CVoiceCommandDialog)
   ON_COMMAND_CONFIRMED()

and insert its declaration in the dialog's header file

   afx_msg LONG OnCommandConfirmed(UINT, LONG);
   DECLARE_MESSAGE_MAP()

and then actually write the handler

afx_msg LONG CSampleDlg::OnCommandConfirmed(UINT dwID, LONG lUnused)
{
   // Read the current control values into the member variables first
   UpdateData(TRUE);

   switch(dwID)
   {
   case 1:
      if (m_Edit < 10)
      {
         m_Edit++;
      }
      break;
   case 2:
      if (m_Edit > 0)
      {
         m_Edit--;
      }
      break;
   default:
      ASSERT(FALSE);
      break;
   }

   UpdateData(FALSE);
   return 0;
}
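The handler's logic can also be pulled out into a small function with named ids, which keeps the magic numbers 1 and 2 in one place. This is a sketch of one possible refactoring, not the sample's actual code: kCmdUp, kCmdDown and ApplyVoiceCommand are hypothetical names, and the limits 0 and 10 are taken from the handler above.

```cpp
// Hypothetical names for the ids passed to Add() in Step 4.
enum VoiceCommandId : unsigned int
{
    kCmdUp   = 1,
    kCmdDown = 2
};

// Returns the new counter value for a confirmed command, kept within [0, 10].
// Unknown ids leave the value unchanged.
inline int ApplyVoiceCommand(int value, unsigned int id)
{
    switch (id)
    {
    case kCmdUp:
        return value < 10 ? value + 1 : value;
    case kCmdDown:
        return value > 0 ? value - 1 : value;
    default:
        return value;
    }
}
```

Assuming m_Edit is an integer member as the handler suggests, the body of the switch could then shrink to m_Edit = ApplyVoiceCommand(m_Edit, dwID); and the same constants could be passed to Add in Step 4.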

Step 6

Lastly don't forget to add my source files to your project so it will link correctly...

  • SpeachInput.CPP
  • TimerWindow.CPP
  • Tip.CPP
  • VoiceCommandDialog.CPP

... And that is all there is to it.

Download source - 60 KB



Comments

  • from where i get speech.h

    Posted by Legacy on 08/29/2003 12:00am

    Originally posted by: zak

    hi
    plz tell me from where i get speech.h
    i dont find that on microsoft web site.
    plz Solve my prblem.
    can i use sapi.h istead of speech.h
    if yes then how can i
    bye
    ZAK

  • Where can I find the speech.h file

    Posted by Legacy on 02/17/2003 12:00am

    Originally posted by: Puneet

    Hi,

    I am using VC6 and Speech SDK 5.0. I am not able to compile the program since it reports speech.h is not found.

    Where can I find this file. The links on the other comments are not working.

    Regards
    Puneet

  • More Voice

    Posted by Legacy on 07/29/2002 12:00am

    Originally posted by: sujeet

    I am using Microsoft Speech SDK 5.1 in my project. Its working fine n smooth. Only problem i am facing is restriction of voice as I can use only three default voices provided by Microsoft (Mike, Sam and Marry). Is there any way to incorporate more voices? 
    
    

    sujeet katiyar

  • Where can I get this microsoft speech sdk 3.0 or greater?? Is it free?

    Posted by Legacy on 09/19/2001 12:00am

    Originally posted by: peter drozd

    I tried to fined the SDK in the web and could not find it. Is there a place to get it and is it free?
    
    

    -peter

  • Where file speech.h ... !?

    Posted by Legacy on 08/13/2001 12:00am

    Originally posted by: ZeALoT

    I need this speech.h file?!
    Where this file in site http://research.microsoft.com/stg ?
    Help me? please ...

  • Speech.h link as per 03/12/01

    Posted by Legacy on 03/13/2001 12:00am

    Originally posted by: Asd

    http://www.dcs.napier.ac.uk/~bsc4147/SpeechNav/speech.h

    Enjoy it!

  • Image in Transparent Window

    Posted by Legacy on 08/01/2000 12:00am

    Originally posted by: Jackie

    Hi All,

    I'd like to load an image in Transparent Window (not a dialog) with voice recognition. Does someone know how to do it?

    Thanks so much.
    Jackie

  • About other languages support

    Posted by Legacy on 03/09/2000 12:00am

    Originally posted by: Hank

    Can MS Speech SDK support other languages? Such as chinese.
    If it can, how to implement? Thanks a lots.
