Howdy,
In your opinion, is it possible for AHK to hook into the Google speech-to-text API and make a dictation script?
I cannot use the SAPI.SpVoice object because I am Italian and I use Windows 7 (Italian is not among the available languages for STT, only for TTS), so I thought I could use Google's API, but I don't really know where to start.
Speech to text//Dictation with Google Api
Re: Speech to text//Dictation with Google Api
Hi jekko1976,
jekko1976 wrote: in your opinion, is it possible for AHK to hook the Google speech to text API and make a dictation script?
It depends on what you mean by possible. If it's for your personal use, I guess it is... deep in hack country > Dictation-interface. The repo includes a showcase script providing a basic interface that puts the recognized speech, if any, on the clipboard. Just tested with Chrome version 66.0.3359.181 (64-bit) and AHK v1.1.28.00 (Windows 8.1) in Russian, Spanish, French and... Italian:
[attachment=0]ladolcevita.PNG[/attachment]
...and it seems to still work. You can optionally hide the Chrome instance, which is automatically closed when the script exits. The Dictation class can call a user-defined callback for the following events: onInterimResult and onResult. I made it before GeekDude pulled his outstanding chrome.ahk library out of his magician's hat, so it should now be possible to execute the JavaScript contained in Dictation.injection.js without using any extension.
Hope this helps
Re: Speech to text//Dictation with Google Api
As an alternative - if you are willing to dictate into your smartphone/tablet/Android device - you could create a personal bot in Telegram messenger and use the Google speech API that is integrated with it. (I assume the iPhone's speech recognition could be used in the same way with a bot.)
The bot, running on your computer via AHK and using the Telegram API, could automatically process everything sent to it (by you, or also by others if you allow it) and store it in a text/Word/email/whatever file (everything you can do with AHK, basically).
For interfacing with the bot this way, you will need just a tiny part of the Telegram bot API (https://core.telegram.org/bots/api). I can help you with that - I think I posted a basic example of how to connect to a bot and read out all the messages it gets some time ago. And with dictation, there is not even (much) parsing involved (only by Google, obviously), which should make it easy. Let me know if you are interested - it shouldn't take too long to set up.
You could even dictate messages while on the road and with your bot offline. You could still extract these messages at your computer via the bot at a later time (you will have 24 hours). After that, you could still copy the text from the Telegram Desktop app and paste it somewhere by hand (or by means of AHK). So, the messages are not lost.
You could probably do something similar with your smartphone in combination with the 'WhatsApp Desktop' app (same Google speech recognition) - but it will be less flexible and reliable, because they don't have a bot API. The biggest problem is probably that you simply cannot send a WhatsApp message to yourself (if I am not mistaken - perhaps you could create a one-person group or just send it to your secretary instead). But with a personal Telegram bot - no problem; it is much cooler and no secretary is needed.
But A_Ahkuser's discovery looks very interesting, too.
Re: Speech to text//Dictation with Google Api
Dear all,
First of all, thank you all for the hints; I didn't expect this post to become such a treasure chest for me.
I know GeekDude and his immense programming capabilities very well. I would like to ask him for some hints on using the chrome.ahk library to drive dictation features with AHK.
As for gregster: I am a big fan of Telegram! It would be a big improvement for me to drive it via AHK! Could you send me some examples of how to do it?
Thank you all very much!
Re: Speech to text//Dictation with Google Api
Alright, I will try to put something up here, tonight.
Re: Speech to text//Dictation with Google Api
gregster wrote: Alright, I will try to put something up here, tonight.
No no, wait, wait, wait...
The last thing I want is to waste your time.
I have already managed to connect AHK to Telegram with this:
https://autohotkey.com/boards/viewtopic.php?t=24919
It works fine, BUT for now I am only able to send messages from AHK ----> to Telegram.
What I want to do is read messages in Telegram and store them in AHK variables. This is more complicated.
Thank you for the interest in my issue
Re: Speech to text//Dictation with Google Api
OK, then I can skip the introductory stuff on how to set up a bot and get a bot token and chat ID.
Now, here is a stripped-down version of this script: https://autohotkey.com/boards/viewtopic ... am#p192355
Just add your bot token and chat ID, and include Coco's JSON library for easier parsing of the bot's responses. Add the name and path of a text file to save the messages (if you don't, there are some msgboxes, too, which can be used to check).
The msgboxes are mainly for testing/debugging, or for the case that you haven't yet added the path of a text file to collect all the text messages sent to the bot. They can be removed once it works (and then the update frequency of the timer can be increased, if you like).
Since anybody can enter a private chat with your bot and send messages, I added a check so that only messages from known chat IDs are added to the text file. That's why you will have to add your own chat ID (the msgboxes will show it for checking); of course, you can add your co-worker's/wife's/second phone's/whoever's chat ID as well, or remove this check completely.
If this script is not running on your computer while you send a message from your phone, it will still get all messages from the last 24 hours, when you start it next time. Older messages will be discarded by the API.
But there is also a webapp and a desktop app that can be used on Windows to copy the messages by hand later, if necessary (unfortunately, Google Speech is not accessible from these Windows apps).
Of course, you can change how the messages are processed by AHK, for example, if you want to save them separately. Let me know if something is unclear.
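As one possible way to save messages separately, the single text file could be swapped for one file per sender. This is only a minimal sketch: `SenderFile`, `SaveMessage` and the `logDir` folder are hypothetical names that are not part of the script in this post, and writing the file as UTF-8 is an assumption.

```autohotkey
; Sketch only: SenderFile, SaveMessage and logDir are hypothetical names.
logDir := A_ScriptDir "\telegram_logs"      ; assumed target folder

; Build one file path per sender, e.g. <dir>\123456789.txt
SenderFile(dir, fromId)
{
    return dir "\" fromId ".txt"
}

; Append one message to its sender's file, prefixed with a local timestamp.
SaveMessage(dir, fromId, text)
{
    if (!InStr(FileExist(dir), "D"))        ; create the folder on first use
        FileCreateDir, %dir%
    path := SenderFile(dir, fromId)
    FormatTime, stamp,, yyyy-MM-dd HH:mm:ss ; current local time
    line := "[" stamp "] " text "`n"
    FileAppend, %line%, %path%, UTF-8
}
```

Inside the script's `For key, msg in stack` loop, a call such as `SaveMessage(logDir, from_id, mtext)` could then stand in for the `FileAppend` line.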
Btw, Google speech can be started with the key just to the left of the space key on the Telegram keyboard; you might have to hold it for a short time until a small popup appears. Choose the microphone icon there and it will be remembered as the default action (at least for some time). At least, it was like this on the Android phones I have seen...
Code:
#include json.ahk ; Coco's JSON library, get it here: https://autohotkey.com/boards/viewtopic.php?t=627
;--------------------------------------------------------------------------------------------------
botToken := "xxxxxxxxx:yyyyyyyyyyyyyyyyyyy" ; add your Telegram bot token
chatID := 000000001 ; add your chat ID
textfile := "" ; add file (path and) name to save the Telegram messages
;--------------------------------------------------------------------------------------------------
oCustomers := {} ; object holding the user IDs that are allowed to send messages to your bot
oCustomers[chatID] := "My Name" ; add your chat ID (and name, if you want) to the customer object for testing purposes
offset := "" ; Telegram message offset

; Check for new updates
SetTimer, UpdateTimer, 15000 ; set to 1000 ms = 1 second or similar, if you want (first comment out the msgboxes and add a textfile path)
return
;--------------------------------------------------------------------------------------------------
Esc::ExitApp ; hit Escape to stop the script
;--------------------------------------------------------------------------------------------------
UpdateTimer: ; polls your bot for new user input
    stack := {} ; message stack
    oUpdates := "" ; reset, so a failed request or parse cannot re-process the previous response
    updates := GetUpdates(botToken, (offset+1)) ; get (new) updates from your bot as JSON string; keep track of old messages
    msgbox % "JSON response:`n" updates ; remove, if you use a textfile
    try oUpdates := JSON.Load(updates) ; create an AHK object from the JSON string
    If oUpdates.ok ; check if JSON answer was "ok" : true
    {
        loop % oUpdates.result.MaxIndex() ; determine number of new messages (updates)
            stack.Push(oUpdates.result[A_index]) ; add all updates (= messages) to stack
        For key, msg in stack
        {
            from_id := first_name := mtext := last_name := username := ""
            from_id := msg.message.from.id ; which ID sent the message?
            mtext := msg.message.text ; what was the message text?
            ;first_name := msg.message.from.first_name
            ;last_name := msg.message.from.last_name
            ;username := msg.message.from.username
            msgbox % "userId: " from_id "`n" mtext ; remove when you add a textfile to collect the messages
            offset := msg.update_id ; keep track of processed messages -> gets updated on Telegram server only with next call of GetUpdates(...)
            if (textfile != "") ; checks if there is a textfile to save to
                if oCustomers.HasKey(from_id) ; check for known users... optional
                    FileAppend, %mtext% `n`n, %textfile% ; appends message to textfile and adds two linefeeds
        }
    }
return
;------------------------------------------ Telegram functions ------------------------------------
GetUpdates(token, offset="", updlimit=100, timeout=0)
{
    If (updlimit>100) ; the API returns at most 100 updates per call
        updlimit := 100
    ; Offset = Identifier of the first update to be returned.
    url := "https://api.telegram.org/bot" token "/getupdates?offset=" offset "&limit=" updlimit "&timeout=" timeout
    updjson := URLDownloadToVar(url)
    return updjson
}
;----------------------------------- additional functions -----------------------------------------
URLDownloadToVar(url,ByRef variable=""){ ; function originally by Maestrith, I think
    try ; keep script from breaking if API is down or not reacting
    {
        hObject:=ComObjCreate("WinHttp.WinHttpRequest.5.1")
        hObject.Open("GET",url)
        hObject.Send()
        variable:=hObject.ResponseText
        return variable
    }
}
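The offset bookkeeping in the script can also be looked at in isolation. The Bot API treats an update as confirmed once getUpdates is called with an offset higher than its update_id, and a `timeout` above 0 switches the call to long polling. The following is a sketch of that idea only: `BuildUpdatesUrl` and `NextOffset` are hypothetical helper names that are not part of the script above, and `oUpdates` is assumed to be the object that `JSON.Load` returns for a getUpdates response.

```autohotkey
; Sketch only: BuildUpdatesUrl and NextOffset are hypothetical helper names.

; Build a getUpdates request URL. A timeout above 0 asks Telegram to hold the
; request open (long polling) until an update arrives or the timeout runs out.
BuildUpdatesUrl(token, offset := "", limit := 100, timeout := 0)
{
    url := "https://api.telegram.org/bot" token "/getUpdates?limit=" limit "&timeout=" timeout
    if (offset != "")                 ; leave the parameter out on the very first call
        url .= "&offset=" offset
    return url
}

; Return the offset for the next call: the highest update_id seen plus one.
; If the response holds no updates, the current offset is returned unchanged.
NextOffset(oUpdates, currentOffset := "")
{
    if (!IsObject(oUpdates.result))
        return currentOffset
    highest := ""
    for index, update in oUpdates.result
    {
        if (highest = "" || update.update_id > highest)
            highest := update.update_id
    }
    return (highest = "") ? currentOffset : highest + 1
}
```

In the script, something like `offset := NextOffset(oUpdates, offset)` after the loop, with `offset` then passed on without the extra `+1`, should amount to the same bookkeeping. With long polling, the HTTP client's own timeouts would have to allow for the longer wait, and the request blocks the script while it is open.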
Re: Speech to text//Dictation with Google Api
gregster wrote: Ok, then I can skip the introductory stuff on how to set up a bot and get a bot token and chat ID
Dear gregster,
I set everything up as you described, and it works like a charm. All this is uber-cool!!
Now I can interface Telegram with AHK with full features!
Thank you very much for your assistance
Re: Speech to text//Dictation with Google Api
I am glad that I could help. I am planning to post a few more Telegram-related scripts as soon as I have finished my (object-oriented) Telegram API wrapper (but recently I haven't had time to work on it). There are a lot of other things that can be done - custom buttons and keyboards, automated responses, upload and download of files, images etc.
Don't hesitate to ask if you want to expand your Telegram bot script!