Background:
- viewtopic.php?f=82&t=127698#p571301
- https://github.com/kdalanon/ChatGPT-AutoHotkey-Utility/blob/main/ChatGPT%20AutoHotkey%20Utility.ahk
- https://platform.openai.com/docs/api-reference/making-requests
- https://platform.openai.com/docs/guides/speech-to-text/quickstart
Folks have used the JXON library and WinHttp to interact with ChatGPT successfully. I am not that interested in ChatGPT itself, but I would like to get live transcription from OpenAI's hosted Whisper. My experience with Colab versions of Whisper and other instances has not been very positive. Transcribing audio through OpenAI itself, however, has been fast and painless. With the Python snippets that OpenAI publishes, uploading an audio file is a cinch, and it is not difficult to call the Python script from AHK with a hotkey.
However, I would prefer an "all-AHK" method.
Below is a bare-bones script, tested to work with ChatGPT, that displays a random quote from Oscar Wilde:
Code:
#Requires AutoHotkey v2.0.2
#SingleInstance
#Include "_jxon.ahk"
API_Key := "sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
API_URL := "https://api.openai.com/v1/chat/completions"
API_Model := "gpt-4"
Prompt := "Give me a random aphorism by Oscar Wilde"
Messages := '{ "role": "user", "content": "' Prompt '" }'
F2::
{
    WHR := ComObject("WinHttp.WinHttpRequest.5.1")
    WHR.open("POST", API_URL, true)    ; true = asynchronous request
    WHR.SetRequestHeader("Content-Type", "application/json")
    WHR.SetRequestHeader("Authorization", "Bearer " API_Key)
    JSON_Request := '{ "model": "' API_Model '", "messages": [' Messages '] }'
    WHR.SetTimeouts(60000, 60000, 60000, 60000)
    WHR.Send(JSON_Request)
    WHR.WaitForResponse()
    try
    {
        if (WHR.status == 200)
        {
            ; responseBody is a SafeArray of bytes; read its data pointer and decode as UTF-8
            SafeArray := WHR.responseBody
            pData := NumGet(ComObjValue(SafeArray) + 8 + A_PtrSize, 'Ptr')
            length := SafeArray.MaxIndex() + 1
            JSON_Response := StrGet(pData, length, 'UTF-8')
            ; Parse the JSON and pull out the text of the first choice
            var := Jxon_Load(&JSON_Response)
            JSON_Response := var.Get("choices")[1].Get("message").Get("content")
            MsgBox(JSON_Response)
        }
    }
}
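One fragile spot in the script above is that Prompt is spliced into the JSON text unescaped, so a prompt containing a double quote or a line break would produce invalid JSON. Below is a minimal sketch of an escaping helper, assuming only the common cases (backslash, double quote, newline, carriage return, tab) need handling. JsonEscape is a hypothetical name, not part of _jxon.ahk.

```autohotkey
#Requires AutoHotkey v2.0

; Hypothetical helper: escape a string so it can sit inside a JSON string literal.
; Only backslash, double quote, LF, CR and tab are handled; other control characters are not.
JsonEscape(str)
{
    str := StrReplace(str, "\", "\\")    ; backslash first, so later escapes are not doubled
    str := StrReplace(str, '"', '\"')
    str := StrReplace(str, "`n", "\n")
    str := StrReplace(str, "`r", "\r")
    str := StrReplace(str, "`t", "\t")
    return str
}

; The prompt text here is made up for illustration; note the embedded double quotes.
Prompt := 'Give me a random aphorism by Oscar Wilde and wrap it in "quotes"'
Messages := '{ "role": "user", "content": "' JsonEscape(Prompt) '" }'
```

With this in place, Messages can be dropped into JSON_Request exactly as before.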
However, I am unable to get the Whisper transcription request below to work. I think the hold-up is in how the file is uploaded as "multipart/form-data".
Code:
#Requires AutoHotkey v2.0.2
#SingleInstance
#Include "_jxon.ahk"
API_Key := "sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
SAPI_URL := "https://api.openai.com/v1/audio/transcriptions"
;;FilePath := '{ "C:\PathToFile\Recording.m4a" }'
path := "C:\PathToFile\Recording.m4a"
API_Model := "whisper-1"
JSON_Request := '{ "model": "' API_Model '" }'
F2::
{
    SplitPath path, &fileName
    f := FileOpen(path, "r")
    sfArray := ComObjArray(VT_UI1:=0x11, f.length)
    pvData := NumGet(ComObjValue(sfArray) + 8+A_PtrSize, 'Ptr')
    f.RawRead(pvData + 0, f.length)
    f.Close()
    WHR := ComObject("WinHttp.WinHttpRequest.5.1")
    ;; WHR.open("POST", SAPI_URL, true)
    WHR.open("PUT", "https://api.openai.com/v1/audio/transcriptions" fileName, true)
    WHR.SetRequestHeader("Content-Type", "multipart/form-data")
    WHR.SetRequestHeader("Authorization", "Bearer " API_Key)
    WHR.SetTimeouts(60000, 60000, 60000, 60000)
    ;; WHR.Send(FilePath)
    ;; WHR.Send(JSON_Request)
    WHR.Send(sfArray)
    WHR.WaitForResponse
    try
    {
        if (WHR.status == 200)
        {
            SafeArray := WHR.responseBody
            pData := NumGet(ComObjValue(SafeArray) + 8 + A_PtrSize, 'Ptr')
            length := SafeArray.MaxIndex() + 1
            JSON_Response := StrGet(pData, length, 'UTF-8')
            var := Jxon_Load(&JSON_Response)
            JSON_Response := var.Get("choices")[1].Get("message").Get("content")
            MsgBox(JSON_Response)
        }
    }
}
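For comparison, a multipart/form-data upload generally differs from the script above in three ways: the request is a POST to the unmodified endpoint URL, the Content-Type header carries a boundary parameter, and the body wraps the "model" field and the file bytes in boundary-delimited parts. Below is a minimal sketch under those assumptions, not verified against the live API. BuildMultipart and Utf8Bytes are hypothetical helper names, the boundary string is arbitrary, and the application/octet-stream part type is an assumption. The raw JSON reply is shown without parsing. According to the OpenAI docs, the default reply carries the transcript in a "text" field rather than in "choices".

```autohotkey
#Requires AutoHotkey v2.0
#SingleInstance

API_Key := "sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
path := "C:\PathToFile\Recording.m4a"

; Hypothetical helper: encode a string as UTF-8 bytes.
; The returned Buffer includes one trailing null byte.
Utf8Bytes(str)
{
    buf := Buffer(StrPut(str, "UTF-8"))
    StrPut(str, buf, "UTF-8")
    return buf
}

; Hypothetical helper: build a multipart/form-data body with a "model" field
; and a "file" part, returned as a SafeArray of bytes (VT_UI1).
BuildMultipart(filePath, model, boundary)
{
    SplitPath filePath, &fileName
    CRLF := "`r`n"
    head := "--" boundary CRLF
    head .= 'Content-Disposition: form-data; name="model"' CRLF CRLF
    head .= model CRLF
    head .= "--" boundary CRLF
    head .= 'Content-Disposition: form-data; name="file"; filename="' fileName '"' CRLF
    head .= "Content-Type: application/octet-stream" CRLF CRLF    ; assumed part type
    tail := CRLF "--" boundary "--" CRLF

    fileData := FileRead(filePath, "RAW")    ; Buffer holding the raw file bytes
    headBuf := Utf8Bytes(head)
    tailBuf := Utf8Bytes(tail)
    headLen := headBuf.Size - 1              ; drop the trailing null byte
    tailLen := tailBuf.Size - 1

    body := ComObjArray(0x11, headLen + fileData.Size + tailLen)
    pData := NumGet(ComObjValue(body) + 8 + A_PtrSize, "Ptr")    ; data pointer of the SafeArray
    DllCall("RtlMoveMemory", "Ptr", pData, "Ptr", headBuf, "UPtr", headLen)
    DllCall("RtlMoveMemory", "Ptr", pData + headLen, "Ptr", fileData, "UPtr", fileData.Size)
    DllCall("RtlMoveMemory", "Ptr", pData + headLen + fileData.Size, "Ptr", tailBuf, "UPtr", tailLen)
    return body
}

F2::
{
    boundary := "----AhkFormBoundary" A_TickCount    ; arbitrary; must not occur in the file data
    body := BuildMultipart(path, "whisper-1", boundary)

    WHR := ComObject("WinHttp.WinHttpRequest.5.1")
    WHR.Open("POST", "https://api.openai.com/v1/audio/transcriptions", true)
    WHR.SetRequestHeader("Content-Type", "multipart/form-data; boundary=" boundary)
    WHR.SetRequestHeader("Authorization", "Bearer " API_Key)
    WHR.SetTimeouts(60000, 60000, 60000, 60000)
    WHR.Send(body)
    WHR.WaitForResponse()

    ; Decode the reply bytes as UTF-8 and show them together with the status code
    arr := WHR.ResponseBody
    pResp := NumGet(ComObjValue(arr) + 8 + A_PtrSize, "Ptr")
    MsgBox("HTTP " WHR.Status "`n" StrGet(pResp, arr.MaxIndex() + 1, "UTF-8"))
}
```

Showing the status code and raw reply even on failure makes it easier to see what the server objected to, since a non-200 reply typically carries a JSON error message.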
Any ideas on how I might use WinHttp to send an audio file through the OpenAI speech-to-text API? If needed, I can post an API key for anyone to test with, though OpenAI can be fairly trigger-happy about banning a key if it senses what it considers an "unusual" pattern of usage.
Thanks in advance!