Simple Windows Text to Speech

Windows, from at least version 7 onward (and the equivalent server versions), has an excellent built-in speech engine that does both text-to-speech and speech recognition.

The speech engine is exposed as a system library (the System.Speech assembly), so it is easy to call from PowerShell.

This flow makes use of that feature and uses the exec node to shell out to PowerShell, calling the speech engine using a one-line PowerShell script.

Send the text to be spoken into the exec node as the message payload; the node inserts it into the PowerShell command as the argument to $speak.Speak.
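
One thing to watch with this approach: the payload is spliced into a single-quoted PowerShell string, so an apostrophe in the text would end the string early. The following is a minimal sketch of a clean-up step that could sit in a function node ahead of the exec node. The name escapeForPowerShell is made up for illustration, dropping double quotes is a simplification (the whole script is itself wrapped in double quotes on the command line), and this is not a complete sanitiser for untrusted input.

```javascript
// Hypothetical helper: make text safe to place inside a single-quoted
// PowerShell string that itself sits inside a double-quoted -command argument.
function escapeForPowerShell(text) {
    return String(text)
        .replace(/[\r\n]+/g, " ")   // keep the command on one line
        .replace(/"/g, "")          // drop double quotes (simplification)
        .replace(/'/g, "''");       // PowerShell escapes ' inside '...' by doubling it
}

// In a function node this would be roughly:
//   msg.payload = escapeForPowerShell(msg.payload);
//   return msg;
console.log(escapeForPowerShell("It's the doorbell")); // It''s the doorbell
```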

See the comment node for some details. This is the same approach that the popular say.js library uses on Windows.

You could easily take this further and control the speed and volume, change voices, and so on.

For example, adding the code $speak.rate = -5; just before $speak.Speak... will slow down the speech rate (rate goes from -10 to +10 with 0 being normal rate).

To change the volume, add $speak.Volume = 50 where the number is from 0 to 100. The default is 100 which is max volume.
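
The rate and volume settings above can be sketched as a small JavaScript helper that assembles the PowerShell script text, for example in a function node. The names clamp and buildSpeakScript and the options object are assumptions for illustration, not part of the flow. The ranges and defaults are the ones stated above (rate -10 to +10 with 0 as normal, volume 0 to 100 with 100 as default).

```javascript
// Clamp a value into [min, max] as an integer; use the fallback for non-numbers.
function clamp(value, min, max, fallback) {
    const n = Number(value);
    if (!Number.isFinite(n)) return fallback;
    return Math.min(max, Math.max(min, Math.round(n)));
}

// Hypothetical helper: build the one-line PowerShell script, setting
// Rate and Volume just before Speak is called.
function buildSpeakScript(text, options = {}) {
    const rate = clamp(options.rate, -10, 10, 0);        // 0 is the normal rate
    const volume = clamp(options.volume, 0, 100, 100);   // 100 is the default
    const safeText = String(text).replace(/'/g, "''");   // escape for '...'
    return [
        "Add-Type -AssemblyName System.speech;",
        "$speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;",
        `$speak.Rate = ${rate};`,
        `$speak.Volume = ${volume};`,
        `$speak.Speak('${safeText}')`
    ].join(" ");
}

console.log(buildSpeakScript("The doorbell has been pressed.", { rate: -5, volume: 50 }));
```

The returned string is only the script body; it would still need to be wrapped in double quotes after powershell -NoProfile -command, as the exec node in this flow does.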

If you want more control, change $speak.Speak to $speak.SpeakSsml and pass in an SSML XML string.

This is an example of SSML:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" 
    xml:lang="en-GB">
    <voice xml:lang="en-GB">
        <prosody rate="1">
            <p>Normal pitch. </p>
            <p><prosody pitch="x-high"> High Pitch. </prosody></p>
        </prosody>

        <p><say-as interpret-as="vxml:currency">USD45.329</say-as></p>
        <p><say-as interpret-as="vxml:currency">GBP10.45</say-as></p>
    </voice>
</speak>

You can find more information in the W3C SSML standard or in the Microsoft SSML documentation.
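
If the SSML is generated from message text rather than written by hand, characters such as & and < have to be escaped or the XML will be invalid. Here is a minimal sketch in JavaScript, with escapeXml and buildSsml as hypothetical helper names and en-GB as an assumed default language. It only produces the SSML string; getting that string through the command-line quoting and into $speak.SpeakSsml is a separate problem not addressed here.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
}

// Hypothetical helper: wrap each string in <p>...</p> inside a minimal
// SSML 1.0 document.
function buildSsml(paragraphs, lang = "en-GB") {
    const body = paragraphs
        .map((p) => `<p>${escapeXml(p)}</p>`)
        .join("");
    return `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ` +
        `xml:lang="${escapeXml(lang)}">${body}</speak>`;
}

console.log(buildSsml(["Fish & chips", "Is 3 < 4?"]));
```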

The full list of controls is shown by this output from PowerShell:

PS > $speak | gm


   TypeName: System.Speech.Synthesis.SpeechSynthesizer

Name                          MemberType Definition                                                                                                  
----                          ---------- ----------                                                                                                  
BookmarkReached               Event      System.EventHandler`1[System.Speech.Synthesis.BookmarkReachedEventArgs] BookmarkReached(System.Object, Sy...
PhonemeReached                Event      System.EventHandler`1[System.Speech.Synthesis.PhonemeReachedEventArgs] PhonemeReached(System.Object, Syst...
SpeakCompleted                Event      System.EventHandler`1[System.Speech.Synthesis.SpeakCompletedEventArgs] SpeakCompleted(System.Object, Syst...
SpeakProgress                 Event      System.EventHandler`1[System.Speech.Synthesis.SpeakProgressEventArgs] SpeakProgress(System.Object, System...
SpeakStarted                  Event      System.EventHandler`1[System.Speech.Synthesis.SpeakStartedEventArgs] SpeakStarted(System.Object, System.S...
StateChanged                  Event      System.EventHandler`1[System.Speech.Synthesis.StateChangedEventArgs] StateChanged(System.Object, System.S...
VisemeReached                 Event      System.EventHandler`1[System.Speech.Synthesis.VisemeReachedEventArgs] VisemeReached(System.Object, System...
VoiceChange                   Event      System.EventHandler`1[System.Speech.Synthesis.VoiceChangeEventArgs] VoiceChange(System.Object, System.Spe...
AddLexicon                    Method     void AddLexicon(uri uri, string mediaType)                                                                  
Dispose                       Method     void Dispose(), void IDisposable.Dispose()                                                                  
Equals                        Method     bool Equals(System.Object obj)                                                                              
GetCurrentlySpokenPrompt      Method     System.Speech.Synthesis.Prompt GetCurrentlySpokenPrompt()                                                   
GetHashCode                   Method     int GetHashCode()                                                                                           
GetInstalledVoices            Method     System.Collections.ObjectModel.ReadOnlyCollection[System.Speech.Synthesis.InstalledVoice] GetInstalledVoi...
GetType                       Method     type GetType()                                                                                              
Pause                         Method     void Pause()                                                                                                
RemoveLexicon                 Method     void RemoveLexicon(uri uri)                                                                                 
Resume                        Method     void Resume()                                                                                               
SelectVoice                   Method     void SelectVoice(string name)                                                                               
SelectVoiceByHints            Method     void SelectVoiceByHints(System.Speech.Synthesis.VoiceGender gender), void SelectVoiceByHints(System.Speec...
SetOutputToAudioStream        Method     void SetOutputToAudioStream(System.IO.Stream audioDestination, System.Speech.AudioFormat.SpeechAudioForma...
SetOutputToDefaultAudioDevice Method     void SetOutputToDefaultAudioDevice()                                                                        
SetOutputToNull               Method     void SetOutputToNull()                                                                                      
SetOutputToWaveFile           Method     void SetOutputToWaveFile(string path), void SetOutputToWaveFile(string path, System.Speech.AudioFormat.Sp...
SetOutputToWaveStream         Method     void SetOutputToWaveStream(System.IO.Stream audioDestination)                                               
Speak                         Method     void Speak(string textToSpeak), void Speak(System.Speech.Synthesis.Prompt prompt), void Speak(System.Spee...
SpeakAsync                    Method     System.Speech.Synthesis.Prompt SpeakAsync(string textToSpeak), void SpeakAsync(System.Speech.Synthesis.Pr...
SpeakAsyncCancel              Method     void SpeakAsyncCancel(System.Speech.Synthesis.Prompt prompt)                                                
SpeakAsyncCancelAll           Method     void SpeakAsyncCancelAll()                                                                                  
SpeakSsml                     Method     void SpeakSsml(string textToSpeak)                                                                          
SpeakSsmlAsync                Method     System.Speech.Synthesis.Prompt SpeakSsmlAsync(string textToSpeak)                                           
ToString                      Method     string ToString()                                                                                           
Rate                          Property   int Rate {get;set;}                                                                                         
State                         Property   System.Speech.Synthesis.SynthesizerState State {get;}                                                       
Voice                         Property   System.Speech.Synthesis.VoiceInfo Voice {get;}                                                              
Volume                        Property   int Volume {get;set;}

[
    {
        "id": "b620e05d.96c1e",
        "type": "comment",
        "z": "462586cc.55d938",
        "name": "TTS",
        "info": "Simple text to speech synthesis \nusing the native Windows speech engine.\n\nUses a call to PowerShell.\n\nNote that this is exactly the same as using the\nsay.js library on Windows.",
        "x": 117,
        "y": 3836,
        "wires": []
    },
    {
        "id": "dc7acf4c.ebb32",
        "type": "debug",
        "z": "462586cc.55d938",
        "name": "",
        "active": true,
        "console": "false",
        "complete": "false",
        "x": 550,
        "y": 3900,
        "wires": []
    },
    {
        "id": "dcfa278a.5bbe18",
        "type": "inject",
        "z": "462586cc.55d938",
        "name": "",
        "topic": "",
        "payload": "The Doorbell! has been pressed.",
        "payloadType": "str",
        "repeat": "",
        "crontab": "",
        "once": false,
        "x": 190,
        "y": 3900,
        "wires": [
            [
                "4b0928d8.f08b08"
            ]
        ]
    },
    {
        "id": "4b0928d8.f08b08",
        "type": "exec",
        "z": "462586cc.55d938",
        "command": "powershell -NoProfile -command \"Add-Type -AssemblyName System.speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer; $speak.Speak('",
        "addpay": true,
        "append": "')\"",
        "useSpawn": "",
        "timer": "",
        "name": "TTS",
        "x": 350,
        "y": 3900,
        "wires": [
            [
                "dc7acf4c.ebb32"
            ],
            [
                "dc7acf4c.ebb32"
            ],
            [
                "dc7acf4c.ebb32"
            ]
        ]
    }
]
TotallyInformation

Flow Info

created 3 months, 1 week ago

Node Types

Core
  • comment (x1)
  • debug (x1)
  • exec (x1)
  • inject (x1)

Tags

  • TTS
  • Windows
  • Voice
  • Sound
To use this flow, copy the flow JSON to your clipboard and then import it into Node-RED using the Import From > Clipboard (Ctrl-I) menu option.