Here are the properties for the Direct Speech Synthesiser control.
Age (index As Long) As Long (Read only)
Approximate age of the voice. This property can be one of the following values:
- TTSAGE_BABY: ~1 year old.
- TTSAGE_TODDLER: ~3 years old.
- TTSAGE_CHILD: ~6 years old.
- TTSAGE_ADOLESCENT: ~14 years old.
- TTSAGE_ADULT: 20–60 years old.
- TTSAGE_ELDERLY: over 60 years old.
CallBacksEnabled As Integer
Takes True or False, like the Voice Command control. When CallBacksEnabled is set to False, events are not raised.
CountEngines As Long (Read only)
Number of speech synthesis voices installed on this computer.
Note: This is the highest number that can be used as an index to indexed properties and methods.
CurrentMode As Long
Index for the currently selected voice.
Dialect (index As Long) As String (Read only)
Dialect specific to the language.
Features (index As Long) As Long (Read only)
Text-to-speech features that are available in the control. Valid settings (can be more than one):
- TTSFEATURE_ANYWORD: The speech engine will attempt to read all words.
- TTSFEATURE_PCOPTIMIZED: The voice is optimised for use with computer speakers.
- TTSFEATURE_PHONEOPTIMIZED: The voice is optimised for use over the telephone with an 8 kHz sampling rate.
- TTSFEATURE_TAGGED: The engine can interpret tagged text to control the voice output.
- TTSFEATURE_VISUAL: The engine can provide mouth position information.
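Because more than one feature can be set at once, the Features value is most naturally treated as a set of bit flags and tested with a bitwise AND. The sketch below is in Python purely for illustration (the control itself is driven from Visual Basic). It assumes the features combine as bit flags, and the numeric values are placeholders, not the real constants; `has_feature` is a hypothetical helper.

```python
# Placeholder bit values for illustration only. The real values of these
# constants come from the speech API's own definitions.
TTSFEATURE_ANYWORD = 0x01
TTSFEATURE_TAGGED = 0x02
TTSFEATURE_VISUAL = 0x04


def has_feature(features: int, flag: int) -> bool:
    """Return True if every bit of `flag` is set in `features`."""
    return (features & flag) == flag


# A voice that reads any word and supplies mouth positions, but no tags.
features = TTSFEATURE_ANYWORD | TTSFEATURE_VISUAL

print(has_feature(features, TTSFEATURE_VISUAL))  # True
print(has_feature(features, TTSFEATURE_TAGGED))  # False
```

In Visual Basic the same test would be written with the And operator against the value returned by Features for the chosen index.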
FileName As String
When this property is set to a file name, subsequent text-to-speech output is recorded to that file instead of being played on the wave device. To re-enable the speakers and stop recording to a file, set FileName to "".
Gender (index As Long) As Long (Read only)
Gender of the voice.
HWnd As Long (Read only)
Initialized As Integer
Returns or sets the initialised state of the control. Most methods and properties will automatically initialise the control. Valid Settings: 0 = Not initialised, 1 = Initialised.
JawOpen As Integer
Angle to which the jaw is open. This is a linear range from &HFF for completely open to &H00 for completely closed.
LanguageID (index As Long) As Long (Read only)
Bits 0 through 9 identify the primary language, such as English, French, Spanish, and so on. Bits 10 through 15 indicate the sublanguage, which is essentially a locale setting.
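The bit layout above can be unpacked with a mask and a shift. This is a minimal sketch in Python, used only for illustration; `split_language_id` is a hypothetical helper, and the sample value &H0409 is the standard Windows identifier for English (United States).

```python
def split_language_id(language_id: int) -> tuple[int, int]:
    """Split a 16-bit language ID into (primary language, sublanguage)."""
    primary = language_id & 0x3FF              # bits 0 through 9
    sublanguage = (language_id >> 10) & 0x3F   # bits 10 through 15
    return primary, sublanguage


primary, sub = split_language_id(0x0409)
print(hex(primary), hex(sub))  # 0x9 0x1
```

Here primary language 9 is English and sublanguage 1 is the United States locale.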
LastError As Long (Read only)
Result code from the last method or property call.
LastWordPosition As Long
Offset, in bytes, from the beginning of the text-to-speech buffer to the word that is currently being played.
Note: This is the byte offset in Unicode.
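Since the offset counts bytes rather than characters, it has to be converted before it can be used to index into the text. The Python sketch below assumes that "Unicode" here means UTF-16 (two bytes per code unit), as is usual on Windows, and that the offset is even and points at the start of a word; `word_at_byte_offset` is a hypothetical helper.

```python
def word_at_byte_offset(text: str, byte_offset: int) -> str:
    """Return the word that starts at `byte_offset` in the UTF-16 buffer.

    Assumes the buffer is UTF-16 little-endian and the offset is even.
    """
    encoded = text.encode("utf-16-le")
    rest = encoded[byte_offset:].decode("utf-16-le")
    # Take everything up to the next whitespace, if anything is left.
    return rest.split(None, 1)[0] if rest.strip() else ""


text = "Hello brave world"
# "brave" starts at character 6, which is byte 12 in UTF-16.
print(word_at_byte_offset(text, 12))  # brave
```

For plain text without supplementary characters, the character index is simply the byte offset divided by two.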
LipTension As Integer
Lip tension. This is a linear range from &HFF if the lips are very tense to &H00 if they are completely relaxed.
LipType As Integer
When set to 0, red female lips are drawn. When set to 1, pink male lips are drawn.
MaxPitch As Long (Read only)
Maximum legal value for Pitch.
MaxSpeed As Long (Read only)
Maximum legal value for Speed.
MaxVolumeLeft As Long (Read only)
Maximum legal value for VolumeLeft.
MaxVolumeRight As Long (Read only)
Maximum legal value for VolumeRight.
MfgName (index As Long) As String (Read only)
Name of the engine manufacturer.
MinPitch As Long (Read only)
Minimum legal value for Pitch.
MinSpeed As Long (Read only)
Minimum legal value for Speed.
MinVolumeLeft As Long (Read only)
Minimum legal value for VolumeLeft.
MinVolumeRight As Long (Read only)
Minimum legal value for VolumeRight.
ModeName (index As Long) As String (Read only)
Name of the text-to-speech mode.
MouthEnabled As Integer
When MouthEnabled is set to 0, the mouth does not animate.
MouthUpturn As Integer
Extent to which the mouth turns up at the corners (that is, how much it smiles). This is a linear range from &HFF for the maximum upturn (that is, the mouth is fully smiling) to &H00 if the corners of the mouth turn down. If this member is &H80, the mouth is neutral.
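When drawing a custom mouth it can be handy to turn this byte into a signed amount of smile. The Python sketch below is one possible mapping, not something the control provides: it sends &H00 to -1.0, &H80 to 0.0 and &HFF to 1.0, scaling each side of neutral separately because the two halves of the range are not the same size. `upturn_to_signed` is a hypothetical helper.

```python
NEUTRAL = 0x80  # value at which the mouth is neutral


def upturn_to_signed(value: int) -> float:
    """Map a MouthUpturn byte to the range -1.0 (frown) .. 1.0 (smile)."""
    if not 0x00 <= value <= 0xFF:
        raise ValueError("MouthUpturn must be between &H00 and &HFF")
    if value >= NEUTRAL:
        # Upper half: &H80..&HFF maps to 0.0..1.0
        return (value - NEUTRAL) / (0xFF - NEUTRAL)
    # Lower half: &H00..&H80 maps to -1.0..0.0
    return (value - NEUTRAL) / NEUTRAL


print(upturn_to_signed(0x00), upturn_to_signed(0x80), upturn_to_signed(0xFF))
# -1.0 0.0 1.0
```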
Pitch As Long
The current baseline pitch, in hertz, for a text-to-speech mode. The actual pitch of the voice typically fluctuates above this baseline. It usually does not go below it.
Speaker (index As Long) As String (Read only)
Name of the voice.
Speaking As Integer
Indicates whether the synthesizer voice is speaking. A value of 1 means the voice is speaking; a value of 0 means it is not.
Speed As Long
Sets or returns the average speed for a text-to-speech mode, in words per minute.
SuppressExceptions As Integer
When set to 1, the control never raises exceptions. You must then check LastError to get the error code of the last method or property invocation.
TeethLowerVisible As Integer
Extent to which the lower teeth are visible. This is a linear range from &HFF for the maximum extent (that is, the lower teeth and gums are completely exposed) to &H00 for the minimum (the lower teeth are completely hidden). If this member is &H80, only the teeth are visible.
TeethUpperVisible As Integer
Extent to which the upper teeth are visible. This is a linear range from &HFF for the maximum extent (that is, the upper teeth and gums are completely exposed) to &H00 for the minimum (the upper teeth are completely hidden). If this member is &H80, only the teeth are visible.
TonguePosn As Integer
Tongue position. This is a linear range from &HFF if the tongue is against the upper teeth, to &H00 if it is relaxed. If this member is &H80, the tongue is visible.
VolumeLeft As Long
Sets or returns the current volume for the left channel of text-to-speech mode.
VolumeRight As Long
Sets or returns the current volume for the right channel of text-to-speech mode.
Next, the methods.
AboutDlg (hWnd As Long, title As String)
Displays an About dialog box that identifies the text-to-speech engine and contains the copyright notice.
AudioPause
Pauses text-to-speech output.
Note: If an application calls AudioPause and then TextData, the data is queued so that there is no latency when AudioResume is called. Applications can use this to ensure that the text-to-speech engine speaks right away.
AudioReset
Stops speech and cancels all queued speech data. When the queue is empty, the engine calls the TextDataDone event.
AudioResume
Resumes text-to-speech output that has been paused.
GeneralDlg (hWnd As Long, title As String)
Displays a General dialog box that gives the user general control of the text-to-speech engine and gives the user access to engine-specific controls.
GetPronunciation (CharSet As Long, Text As Long, Sense As Long, Pronounce As String, PartOfSpeech As Long, EngineInfo As String)
Returns pronunciation information for Text in Sense, Pronounce, PartOfSpeech, and EngineInfo.
LexiconDlg (hWnd As Long, title As String)
Displays a dialog box that lets the user view and edit pronunciations. For example, the user can edit the phonetics of mispronounced words.
Select (index As Long)
Selects the text-to-speech engine that Speak will use. See the CountEngines property for the range of valid index values.
Speak (text As String)
Speaks the given text. By default, the Microsoft female voice is played on the wave device. Speak is asynchronous; that is, the method returns before all of the text is played.
TextData (characterset As Long, flags As Long, text As String)
Starts the process of converting text into audio data to be spoken. It works like Speak, but lets you set additional flags. TextData is asynchronous; that is, the method returns before all of the text is played.
TranslateDlg (hWnd As Long, title As String)
Displays a Translation dialog box that lets the user control symbols, currencies, abbreviations, and number-translation techniques.
And lastly, the Events.
AttribChanged (which_attribute As Long)
Is called when an engine attribute has changed. Possible values of which_attribute:
- TTSNSAC_LANGUAGE: Indicates that the language has changed.
- TTSNSAC_MODE: Indicates that the mode or voice has changed.
- TTSNSAC_PITCH: Indicates that the baseline pitch for the voice has changed.
- TTSNSAC_SPEED: Indicates that the baseline average speed for the voice has changed.
- TTSNSAC_VOLUME: Indicates that the baseline volume for the voice has changed.
AudioStart (hi As Long, lo As Long)
Is called when audio data starts playing.
AudioStop (hi As Long, lo As Long)
Is called when audio data stops playing.
ClickIn (x As Long, y As Long)
Is called when the user clicks in the object's icon.
TextDataDone (hi As Long, lo As Long, Flags As Long)
Is called when text-to-speech data processing ends. TextDataDone is called once per TextData call. The Flags argument gives the reason the processing ended.
TextDataStarted (hi As Long, lo As Long)
Is called when text-to-speech data processing begins. TextDataStarted is called once per TextData call.
Visual (timehi As Long, timelo As Long, Phoneme As Integer, EnginePhoneme As Integer, hints As Long, MouthHeight As Integer, bMouthWidth As Integer, bMouthUpturn As Integer, bJawOpen As Integer, TeethUpperVisible As Integer, TeethLowerVisible As Integer, TonguePosn As Integer, LipTension As Integer)
Is called whenever the shape of the mouth should change. Also notifies an application which phoneme is being used in the current digital-audio stream. This allows users to implement their own mouths.
WordPosition (hi As Long, low As Long, byteoffset As Long)
Notifies the application of the word that is currently being played. Used for synchronization. An application can use the information returned by WordPosition to highlight the word being played.