Text-To-Speech via COM - Examples


10 replies to this topic
jballi
  • Members
  • 1029 posts
  • Last active:
  • Joined: 01 Oct 2005
------------------------------
Note: The Text-To-Speech examples in this post require the standard COM library. These examples will work on AutoHotkey Basic and the ANSI version of AutoHotkey_L.

Text-To-Speech examples that use the built-in COM objects in AutoHotkey_L can be found here:
http://www.autohotke...pic.php?t=83162
------------------------------

Introduction
Since the release of the COM Standard Library in 2007 (known earlier as CoHelper or COM Helper), I've always assumed that I would find some need to dig into the library and figure it out. Well, the "must have" requirement never materialized, but I recently discovered that I could use the library for a Text-To-Speech feature I was adding to a for-fun project, so I set about figuring out how to use the COM Standard Library for Text-To-Speech.

In short, the experience has been exhausting. I've spent much of my free time over the last couple of weeks doing research (mostly on MSDN) and writing trial-and-error code. My progress varied. I would make a big breakthrough one day and then I would get stuck on something that I couldn't figure out for a day or two.

The good news is that I've uncovered a lot of good stuff. The bad news is that I've barely scratched the surface of the COM Standard Library behemoth. Of the ~58 functions available in the library, I've used 7. For most of the basic Text-To-Speech operations, you only need 4.
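
To give a feel for how few calls basic speech needs, here is a minimal sketch. The post does not name the four functions, so the choice of COM_Init, COM_CreateObject, COM_Invoke, and COM_Release (plus COM_Term for cleanup) is an assumption, as is the exact shape of the script. It also assumes COM.ahk is installed in a library folder and that SAPI 5 is present.

```autohotkey
; Minimal sketch: speak one sentence with the COM Standard Library (COM.ahk).
; Assumes AutoHotkey Basic or ANSI AutoHotkey_L with COM.ahk in a Lib folder.
COM_Init()                                   ; initialize COM for this script
pVoice := COM_CreateObject("SAPI.SpVoice")   ; create the SAPI voice object
if (!pVoice)
{
    MsgBox, Could not create SAPI.SpVoice. Is SAPI 5 installed?
    COM_Term()
    ExitApp
}
; With no flags, Speak is synchronous: it returns when the text has been spoken.
COM_Invoke(pVoice, "Speak", "Hello from AutoHotkey.")
COM_Release(pVoice)                          ; release the voice object
COM_Term()                                   ; uninitialize COM
ExitApp
```

This is not taken from the example scripts in the archive; it only shows the general create, invoke, release pattern.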

To help in the research and testing phase of this campaign, I created several example/proof-of-concept scripts. I thought others might benefit from this effort so I cleaned up the scripts a bit and I'm posting them here for your review.

Examples
There are currently four example scripts:

  • Single Instance. This script attempts to duplicate most of the functionality of the TTSApp demo that is provided with the Microsoft Speech SDK 5.1. See the Issues/Considerations section for additional information.

  • Multiple Instances. This script demonstrates the use of multiple SpVoice instances.

  • Wait Until Done. This script demonstrates a number of techniques for monitoring the end of a SpVoice stream.

  • Raw Dump. This is the first script that I wrote to figure out how to do all of the Text-To-Speech stuff via COM. It's not pretty but it includes almost everything that can be done with the SpVoice object. Use a debugger (I use DebugView) to see all of the values.

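
One way to wait for the end of a SpVoice stream can be sketched as follows: start speaking asynchronously, then call the SpVoice WaitUntilDone method. This is only one possible technique and is not necessarily what the Wait Until Done script does. The flag value 1 (SVSFlagsAsync) comes from the SAPI documentation; the assumption that COM_Invoke passes plain integers through as numeric arguments is my understanding of the library, not something stated in this post.

```autohotkey
; Sketch: start speaking asynchronously, then block until the stream ends.
; Assumes COM.ahk is installed and SAPI 5 is available.
COM_Init()
pVoice := COM_CreateObject("SAPI.SpVoice")
SVSFlagsAsync := 1                            ; SpeechVoiceSpeakFlags value (SAPI docs)
COM_Invoke(pVoice, "Speak", "One. Two. Three.", SVSFlagsAsync)  ; returns immediately
; ... the script is free to do other work here ...
COM_Invoke(pVoice, "WaitUntilDone", -1)       ; -1 = wait with no timeout
MsgBox, Speech finished.
COM_Release(pVoice)
COM_Term()
```

Passing a timeout in milliseconds instead of -1 lets the script poll periodically rather than block indefinitely.
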
Screenshots
(Three screenshots; images not preserved.)

The Code
The pertinent files are included in this archive: TTS Examples.zip (includes source, icons, and example files)

Requirements
These scripts use the COM Standard Library. In addition, the Microsoft Speech SDK 5.1 must be installed if using any Windows version earlier than Windows XP. See the References section for more information.

Issues/Considerations
A few considerations:
  • Limited Testing. Although these scripts should work on most Windows versions (Windows 98+), I was only able to test using Windows XP. In addition, I only have one audio output (my sound card), so I was unable to test using an alternate audio output.

  • Animation. The Example1GUI script includes animation. Although the animation is fairly accurate, I was unable to completely remove the flickering. It's annoying, I know. Live with it. Also, Drugwash reports that the animation does not work on Windows 98 because the OS does not support transparent icons. Sorry 'bout that.

References
COM Standard Library
This is the AutoHotkey library that makes it all possible. If you haven't done it already, download and install it.
http://www.autohotke...pic.php?t=22923

AutoHotkey Standard Library
If you're not sure where to put the COM Standard Library...
http://www.autohotke...nctions.htm#lib

Microsoft Speech SDK 5.1
This software is not necessary if you are using Windows XP or later. However, the installation includes two additional voices (Microsoft Mike and Microsoft Mary) so it should be worth the trouble to install it.
http://tinyurl.com/yptaoo

DebugView
All of the example scripts (especially Example4.ahk) dump useful information to a debugger. DebugView is not the only debugger out there but it's my favorite.
http://technet.micro...s/bb896647.aspx

SpVoice Interface (SAPI 5.3)
A must-have guide to the SpVoice (Text-To-Speech engine) interface. This guide includes a list of all of the SpVoice methods as well as what you need to call them.
http://tinyurl.com/mqngf4

SpVoice (Events) Interface (SAPI 5.3)
A must-have guide to the SpVoice Events interface. This guide includes a list of all the SpVoice events as well as all of the parameters that are passed for each event.
http://tinyurl.com/lsjq48

TTS() Text To Speech using COM
Use Text-To-Speech via COM but without the COM Standard Library.
http://www.autohotke...pic.php?t=16552

Final Thoughts
I'm not an expert on this topic. Not even close. These examples were written by trial-and-error and by extracting syntax and ideas from some of the scripts that have been posted on the AutoHotkey forum and from the example code released with the Microsoft Speech SDK 5.1. If I've made any logic or code blunders (major or minor), I'm hoping that someone will be kind enough to bring them to my attention. I hope to benefit from your experience.

I hope that someone can make use of this information.

---------------------------------------------------------------------------
Release Notes

v0.1
Original release.

v0.2
Minor improvements.


n-l-i-d
  • Guests
  • Last active:
  • Joined: --
Very nicely done, and professionally presented!

8)

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
Very nice. BTW, you may find the early history of the COM implementation interesting. It was realized through the function pointer support that Chris added to DllCall.
ComCall via DllCall

icefreez
  • Members
  • 180 posts
  • Last active: May 06 2015 10:08 PM
  • Joined: 15 May 2007
Yes very well documented and demonstrated. I will be sure to keep this bookmarked should I need to implement text 2 speech in the future.

erictheturtle
  • Members
  • 101 posts
  • Last active: Sep 04 2011 02:07 PM
  • Joined: 27 Jun 2007
Ah Sean, always the humble one ;)

jballi, this really is a stunning and professional looking demo for the Microsoft Speech SDK, COM, and Autohotkey in general.

As far as I'm concerned, programming is easy compared to the time and effort it takes to clearly present, communicate, and document the code. So thanks for putting in that extra effort into making this.
-m35

Drugwash
  • Members
  • 1078 posts
  • Last active: May 24 2016 04:20 PM
  • Joined: 07 Sep 2008
Nicely done, good job!

The first 3 examples do work in 98SE with only SAPI 4 installed (I dumped SAPI 5 some time ago due to buggy behavior). There's no animation at all in the first example, though, and the icons are all 32-bit so transparent areas are actually black. It's an OS limitation but I had to report it for completeness' sake.

The fourth example throws a COM error a few times in a row (if I click Continue): Error 2 (0x80020003) - Member not found. Function GetAttribute, error 0x8004503A. After that, it does speak something, counts to ten, and shows a message box saying to click OK when speaking has finished. No GUI, if one was ever intended to show (I haven't looked through the code).

I'd say it went pretty well for this "dinosaur" of mine. :) Thank you so much for your efforts, jballi! ;)

jballi
  • Members
  • 1029 posts
  • Last active:
  • Joined: 01 Oct 2005
Thanks for the kind words everybody. :)

I'm not surprised that it works on Windows 98 but I'm shocked that it works (mostly) without SAPI5. The COM errors that you are getting in Example 4 are likely because one or more of the attributes that the GetAttribute method is trying to get are not available with SAPI4. I'm just guessing.

Sorry 'bout the animation. I just threw it in to see if I could get it to work. I didn't even think for a second that transparent icons wouldn't work on Windows 98.

Thanks for the feedback.

JoeSchmoe
  • Members
  • 304 posts
  • Last active: Feb 28 2013 05:39 PM
  • Joined: 17 Feb 2008
Hi JBalli,

Thanks for your incredible efforts. I'm really looking forward to using these. I found them using the forum search, believe it or not.

I'm running Win XP and Vista and would like to access Microsoft Mike and Mary. Following the link you sent to the SDK, there is an option to download just those voices: "If you want to get only the Mike and Mary voices redistributable for Windows XP, download Mike and Mary redistributables (Sp5TTIntXP.exe)."

Do you know where to put the resulting file once it has unzipped? (The executable doesn't put it anywhere; it just unzips a file titled "Sp5TTIntXP.Msm".)

Thanks again!

jballi
  • Members
  • 1029 posts
  • Last active:
  • Joined: 01 Oct 2005

I'm definitely not an expert on this topic but I'll give it a shot...

By itself, the Sp5TTIntXP.Msm file is worthless. This is a Windows Installer Merge Module which is used by a developer when creating a Whatever.MSI file for installing an application that uses a Text-To-Speech component.

The best way to get the Mike and Mary voices is to install the basic Speech SDK 5.1 (68 MB). Of course, you can write your own installer for the voices but it's not worth it IMHO. I found some info on how to do it here:
http://blogs.msdn.co.../21/410561.aspx

I hope this is helpful.

Flest
  • Members
  • 8 posts
  • Last active: Jun 30 2010 06:05 AM
  • Joined: 29 Apr 2010
I'm getting an error when trying to select different audio formats under Example1GUI.ahk. And with each new format I select, it speaks but it is very distorted.

Each time I select a new audio format, an error like this appears:
(Screenshot of the error message; image not preserved.)

This is the output from debug when starting Example1GUI.ahk, selecting a new format (48kHz 16 Bit Stereo), then pressing speak.
[4104] pSpVoice=
[4104] m_nThreadId = [0]
[4104] Subroutine: SpeakFlags
[4104] SpeakFlags:=3
[4104] Subroutine: AudioOutputStreamFormatType
[4104] Subroutine: Speak
[4104] pSink=1448752
[4104] Subroutine: Done

Any idea why this is happening? Any help would be appreciated.

jballi
  • Members
  • 1029 posts
  • Last active:
  • Joined: 01 Oct 2005

It's always difficult to debug a problem from a distance, especially if you're unable to duplicate the problem. A few things to try:

  • Make sure you have the latest version of AutoHotkey. Note: This script should work with AHK_L but I've never tried it.

  • Verify that you have the latest version of the COM library and that it is installed in the correct location. The link is in the top post.

  • Verify that you're using SAPI5. If using Windows XP or later, you should be good to go. However, I've never tested this on anything but XP so I can't verify what will or will not work. Last resort: Consider installing/re-installing Microsoft Speech SDK 5.1. The link is in the top post.

  • Using a debugger, run the Example4.ahk script and carefully review the output. You might get a clue as to what the problem is.

Good luck!