Voice Recognition COM


21 replies to this topic
Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007

I rewrote the script in the following topic to use the COM Standard Library, which is recommended over the CoHelper version:
http://www.autohotke...pic.php?t=20493

; Requires the COM Standard Library (COM.ahk) in a standard library (Lib) folder.
#Persistent
OnExit, CleanUp

COM_Init()
plistener := COM_CreateObject("SAPI.SpSharedRecognizer")
; paudioin is never set in this script, so "+0" is passed here.
COM_Invoke(plistener, "AudioInput", paudioin ? "+" . paudioin : "+0")
pcontext := COM_Invoke(plistener, "CreateRecoContext")
pgrammar := COM_Invoke(pcontext, "CreateGrammar")
COM_Invoke(pgrammar, "DictationSetState", 0) ; 0 = dictation inactive
prules := COM_Invoke(pgrammar, "Rules")
prulec := COM_Invoke(prules, "Add", "wordsRule", 0x1|0x20) ; top-level (0x1) and dynamic (0x20) rule
COM_Invoke(prulec, "Clear")
pstate := COM_Invoke(prulec, "InitialState")

; Add here the words to be recognized!
COM_Invoke(pstate, "AddWordTransition", "+" . 0, "One")
COM_Invoke(pstate, "AddWordTransition", "+" . 0, "Two")
COM_Invoke(pstate, "AddWordTransition", "+" . 0, "Three")
;;

COM_Invoke(prules, "Commit")
COM_Invoke(pgrammar, "CmdSetRuleState", "wordsRule", 1)
COM_Invoke(prules, "Commit")
pevent := COM_ConnectObject(pcontext, "On")
Return

CleanUp:
COM_Release(pevent)
COM_Release(pstate)
COM_Release(prulec)
COM_Release(prules)
COM_Release(pgrammar)
COM_Release(pcontext)
COM_Release(plistener)
COM_Term()
ExitApp

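; Called when the recognition context reports a recognized phrase.
OnRecognition(prms, this)
{
    presult := COM_DispGetParam(prms, 3, 9) ; event parameter 3 (the result object), read as VT_DISPATCH (9)
    pphrase := COM_Invoke(presult, "PhraseInfo")
    sText := COM_Invoke(pphrase, "GetText") ; the recognized text
    COM_Release(pphrase)
    ; Add custom operations from here!
}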
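
As one possible way to fill in the "custom operations" placeholder, the sketch below maps a recognized word to an action. The function name DescribeAction and the returned strings are hypothetical; only the words One, Two and Three come from the script above. It runs on its own without the COM library.

```autohotkey
; Hypothetical helper (not part of the original script): returns a
; description of what to do for a recognized word.
DescribeAction(sText)
{
    if (sText = "One")          ; "=" compares case-insensitively in expressions
        return "first action"
    else if (sText = "Two")
        return "second action"
    else if (sText = "Three")
        return "third action"
    return "no action"
}

; Stand-alone demonstration; inside OnRecognition you would pass sText instead.
MsgBox % DescribeAction("Two")
```

Inside OnRecognition, the call would go where the placeholder comment is, with the real work (Run, Send, and so on) replacing the returned strings.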


etopsirhc
  • Members
  • 62 posts
  • Last active: Apr 03 2011 12:27 AM
  • Joined: 12 Mar 2008
OK, I'm having a problem with your script and I have no idea how to fix it. Here's the error:

Error: Call to nonexistent function.
Specifically: COM_Init()
(points to the line in the code)
The program will exit.

How can I fix this? :?:

tic
  • Members
  • 1934 posts
  • Last active: May 30 2018 08:13 PM
  • Joined: 22 Apr 2007

I rewrote the script in the following topic to use the COM Standard Library

:?:

etopsirhc
  • Members
  • 62 posts
  • Last active: Apr 03 2011 12:27 AM
  • Joined: 12 Mar 2008
I have that, but it still doesn't work.

  • Guests
  • Last active:
  • Joined: --
Read about Standard Library:
http://www.autohotke...nctions.htm#lib

etopsirhc
  • Members
  • 62 posts
  • Last active: Apr 03 2011 12:27 AM
  • Joined: 12 Mar 2008
OK, I got it to kind of work, but now it says:

No Event Interface Exists! Now exit the application.

So how do I fix that now?

  • Guests
  • Last active:
  • Joined: --
Read the original thread linked at the top. You need to install Speech SDK 5.1.

DranDane
  • Members
  • 53 posts
  • Last active: Feb 04 2009 04:30 PM
  • Joined: 26 Jun 2007
Thank you for this sample. It's good to see it's possible to use ahk with speech recognition.

BUT I'm still thinking that speech recognition should be added in the core C++ code of ahk and be implemented like hotstrings.


"What time is it"::
  ;gives the time
return


maximina
  • Members
  • 17 posts
  • Last active: Nov 04 2008 08:20 PM
  • Joined: 17 Oct 2007

Read the original thread linked at the top. You need to install Speech SDK 5.1.


Which parts of the SDK are needed?

TodWulff
  • Members
  • 142 posts
  • Last active: Sep 15 2013 04:16 PM
  • Joined: 29 Dec 2007

I rewrote the script in the following topic to use the COM Standard Library, which is recommended over the CoHelper version:
http://www.autohotke...pic.php?t=20493

^^^Nice Tool!^^^

Hey Sean.

Thanks for the script. I have been able to get it running and working standalone reasonably well (I occasionally get the no interface dialog, but I suspect that is because I am being too impatient (opening a second instance of the script while the first one is still releasing its resources, etc.) and confusing the underlying MS framework).

At any rate, I wanted to take a second to ask you if you'd (or someone else who is also knowledgeable) be willing to take a minute and explain the script's functionality a bit (as a teaching/learning exercise, as I am still absorbing the various concepts...). I'd like to evaluate incorporating the speech recognition into my project, but not understanding how it is actually able to do what it is doing causes me some inability to properly evaluate it.

So, I'll pose my questions:

From the manual, I understand the following:

A script that is not persistent and that lacks hotkeys, hotstrings, OnMessage, and GUI will terminate after the auto-execute section has completed. Otherwise, it will stay running in an idle state, responding to events such as hotkeys, hotstrings, GUI events, custom menu items, and timers.


So, with #Persistent in effect (and the COM library included), the script executes up to the assignment of pevent and then goes idle, waiting for events to occur.

This is what I am struggling with: How the hell is the script triggered to cause the OnRecognition function to kick off? i.e. what faculties of AHK are you employing to cause this functionality to exist. An explanation and/or a link to the manual would be greatly appreciated.

I suspect that it has to do with COM and COMevents, but I am not able to glean how this is actually implemented from your code. I suspect that it is probably a concept that is easily grasped, once 'verbalized'.?.
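
For readers with the same question, one plausible reading is that the "On" passed to COM_ConnectObject is a name prefix: when the SAPI object raises its Recognition event, the library builds the name "OnRecognition" and calls the function of that name dynamically. The sketch below only imitates that lookup in plain AutoHotkey. The variables prefix, eventName and fn and the demo arguments are assumptions for illustration, and no COM is involved.

```autohotkey
; Hypothetical stand-ins: the prefix given to COM_ConnectObject and the
; name of the event raised by the COM object.
prefix := "On"
eventName := "Recognition"

; Build the handler name and call it dynamically if such a function exists.
fn := prefix . eventName
if (IsFunc(fn))
    %fn%("demo params", 0)
return

; Same signature as the handler in Sean's script; here it only reports the call.
OnRecognition(prms, this)
{
    MsgBox, OnRecognition was called with: %prms%
}
```

Under that reading, OnRecognition is an ordinary user-defined function. Renaming it, or changing the prefix without renaming it, would leave the event with nothing to call.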

A 2nd question related to the Speech SDK: I suspect that the speech SDK facilities that are being employed are indeed NOT speaker independent - i.e. that some training would have to have taken place previously, so that the SDK's tool set has a frame of reference for the user. Is this indeed the case, or is it using generic enough speech recognition patterns to allow for user-independent voice recognition?

Thanks so much. Your time in reviewing my questions, considering them, and replying is greatly appreciated.

Have a great day.

-t
When replying, please feel free to address me as Tod. My AHK.net site...

BoBo¨
  • Guests
  • Last active:
  • Joined: --
@ TodWulff
Just to let you know ... don't be angry if a German would prefer not to call you [Tod]

TodWulff
  • Members
  • 142 posts
  • Last active: Sep 15 2013 04:16 PM
  • Joined: 29 Dec 2007

@ TodWulff
Just to let you know ... don't be angry if a German would prefer not to call you [Tod]

OK, I am obviously a bit ignorant here, admittedly, so please do me the favor of bringing me up to speed so as to remove any egg that I might have on my face... TIA. Anxiously anticipating an explanation... :)

[EDIT] Didn't notice the link. Sorry. Yikes. Fully understood. Thanks! [/EDIT]

Bling170
  • Members
  • 1 posts
  • Last active: Apr 24 2008 05:45 AM
  • Joined: 24 Apr 2008
Bravo on the conversion. But may I ask, what are some of the optional parameters I can pass in, to tweak and adjust the way things work? Any flexibility here.... does anybody know? :)

infogulch
  • Moderators
  • 717 posts
  • Last active: Jul 31 2014 08:27 PM
  • Joined: 27 Mar 2008

Which parts of the SDK are needed?

That download page says this:

Important File Download Details

* If you want to download sample code, documentation, SAPI, and the U.S. English Speech engines for development purposes, download the Speech SDK 5.1 file (SpeechSDK51.exe).

* If you want to use the Japanese and Simplified Chinese engines for development purposes, download the Speech SDK 5.1 Language Pack file (SpeechSDK51LangPack.exe) in addition to the Speech SDK 5.1 file.

* If you want to redistribute the Speech API and/or the Speech engines to integrate and ship as a part of your product (e.g. AHK) **Hint**Hint** ;), download the Speech 5.1 SDK Redistributables file (SpeechSDK51MSM.exe).

* If you want to get only the Mike and Mary voices redistributable for Windows XP, download Mike and Mary redistributables (Sp5TTIntXP.exe).

* If you only want the documentation, download the Documentation file (sapi.chm).

(To maximina: so you'd probably want to just go with the first one, "SpeechSDK51.exe".)

Hey guys, look at that - Microsoft's dropping hints for AHK. And wink smileys too, interesting... :D:D **LooksAway&Whistles**

infogulch
  • Moderators
  • 717 posts
  • Last active: Jul 31 2014 08:27 PM
  • Joined: 27 Mar 2008
This is really cool, thanks. :D I'll have to get a headset or something so it can understand me better. :p

Anyway, I have some questions about this. Any answers greatly appreciated. :)

1.__________
As with TodWulff, I don't understand how the "OnRecognition()" function is called. I tried looking through the COM lib, but it references itself so much that my head was spinning by the 3rd function. :p (Bit manipulation is a bit over my head anyway.)
It seems to work similarly to "OnMessage()". Does COM set that up, or is OnRecognition a built-in function in its own right?
I have a simplified idea of how it works; please correct me if I'm wrong (I probably am, so that means "please correct me", I guess ;)):

You pass the word you want to recognize to COM along with some other info, COM tells the SDK which word to look for, and dynamically sets up an OnMessage based on that. Then when the SDK recognizes a word, it sends a message to the script telling it that the word was recognized, COM receives that message and sends it to OnRecognition, which uses COM again to turn it back into the text you get from the var "sText".

Is that even close? Even if it is, I still don't understand how OnRecognition is defined as the function to go to when a word is found, unless that's just part of COM.

2.__________
How would I clear words from being recognized anymore? Can I do it individually or does it have to be done all at once? If all at once, is it "COM_Release(pstate)"? (Because you use "pstate" in the COM_Invoke for recognizing a word, and that's in the OnExit subroutine)

3.__________
Is there a way to tell it to receive all words the SDK recognizes? I know it would be unreliable, so I would only use it momentarily, but I would like to know. (Does it have to do with the "+" . 0 parameter passed to COM_Invoke?)

Thanks again for any help. :D Voice recognition is really cool, and I would love to understand it more so I can make better use of its potential. 8)
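
Regarding question 2, one approach that stays within the calls Sean's script already uses is to clear the whole rule and add back only the words that should remain. This is an untested sketch. ReplaceWords is a hypothetical helper, it assumes prules and prulec from the script at the top of the thread and COM.ahk in a Lib folder, and it assumes the rule can be rebuilt while the grammar is active.

```autohotkey
; Hypothetical helper, not part of the COM library or SAPI.
; words is a comma-separated list of the words to recognize from now on.
ReplaceWords(prules, prulec, words)
{
    COM_Invoke(prulec, "Clear")                   ; drop every word in the rule
    pstate := COM_Invoke(prulec, "InitialState")  ; fetch the initial state again after clearing
    Loop, Parse, words, `,
        COM_Invoke(pstate, "AddWordTransition", "+" . 0, A_LoopField)
    COM_Release(pstate)
    COM_Invoke(prules, "Commit")                  ; commit the rebuilt rule, as the original script does
}

; Example: keep "One" and "Three", drop "Two".
ReplaceWords(prules, prulec, "One,Three")
```

COM_Release(pstate) in the CleanUp subroutine only releases the script's reference to the state object at exit, so it is probably not the way to stop words from being recognized while the script runs.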