Here's another post for anyone following this thread who is interested in using AHK with Docparser. My previous post shows how to download the parsed data as key-value pairs (nested JSON objects). It also mentions that some header data precedes the actual parsed data from the PDF file. It turns out that this header data contains a link to the Excel file, and that link works perfectly with AHK's UrlDownloadToFile. The key is media_link_data, and the value following it is the URL of the XLSX file that contains the parsed data. A minor modification of one of the scripts above downloads the XLSX file, as shown in the tested script below:
Code:
oHTTP:=ComObjCreate("WinHttp.WinHttpRequest.5.1")
If !IsObject(oHTTP)
{
MsgBox,4112,Fatal Error,Unable to create HTTP object
ExitApp
}
ExcelFileDownload:="c:\temp\DocparserDownload.xlsx"
FileDelete,%ExcelFileDownload%
ResponseFileDownload:="c:\temp\DocparserDownload.txt"
FileDelete,%ResponseFileDownload%
UsernamePasswordBase64:="this is your Base64-encoded username with a null password"
ParserID:="this is the unique ID of your parser"
DocumentID:="this is the unique ID of your document"
URL:="https://api.docparser.com/v1/results/" . ParserID . "/" . DocumentID
oHTTP.Open("GET",URL)
oHTTP.SetRequestHeader("Authorization","Basic " UsernamePasswordBase64)
oHTTP.Send()
Response:=oHTTP.ResponseText
FileAppend,%Response%,%ResponseFileDownload%
MediaLinkStr:="""media_link_data"":""" ; key for Excel URL
MediaLinkPos:=InStr(Response,MediaLinkStr)
If (MediaLinkPos=0)
{
MsgBox,4112,Fatal Error,%MediaLinkStr% not found in response
ExitApp
}
MediaLinkStrLen:=StrLen(MediaLinkStr)
ExcelLinkBegin:=MediaLinkPos+MediaLinkStrLen
ExcelLinkEnd:=InStr(Response,"""",,ExcelLinkBegin)-1
If (ExcelLinkEnd<0) ; InStr returns 0 when the quote is not found, which makes ExcelLinkEnd -1
{
MsgBox,4112,Fatal Error,Closing quote on Excel URL not found in response
ExitApp
}
ExcelLinkLen:=ExcelLinkEnd-ExcelLinkBegin+1
ExcelLink:=SubStr(Response,ExcelLinkBegin,ExcelLinkLen)
ExcelLink:=StrReplace(ExcelLink,"\") ; backslash is escape char for forward slashes in URL - remove all
UrlDownloadToFile,%ExcelLink%,%ExcelFileDownload%
If (ErrorLevel<>0)
{
MsgBox,4112,Fatal Error,Error Level=%ErrorLevel% trying to download:`n%ExcelLink%`nto:`n%ExcelFileDownload%
ExitApp
}
; Excel file was downloaded
ExitApp
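For anyone who prefers a regular expression to the InStr/SubStr arithmetic, here is a minimal sketch of the same extraction. The function name GetMediaLink is hypothetical (it is not part of the script above), and the pattern assumes the value of media_link_data is a plain quoted string with no embedded quotes. I have not run this version against the live API.

```autohotkey
; Hypothetical helper, not part of the original script.
; Returns the URL stored under "media_link_data", or "" if the key is missing.
GetMediaLink(Response)
{
  ; Match "media_link_data":"<url>" (optional whitespace around the colon)
  ; and capture everything up to the closing quote in Match1
  If !RegExMatch(Response,"""media_link_data""\s*:\s*""([^""]*)""",Match)
    Return ""
  Return StrReplace(Match1,"\") ; remove the backslashes that escape the forward slashes
}
```

With this in place, the script could call ExcelLink:=GetMediaLink(Response) and treat an empty result as the "not found" case.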
My plan is to turn all the code developed in this thread into functions, something along these lines:
Code:
BasicAuthentication(UsernamePasswordBase64)
ListParsers(UsernamePasswordBase64)
UploadPDF(UsernamePasswordBase64,ParserID,Filename)
DownloadParsedData(UsernamePasswordBase64,ParserID,DocumentID,JSONfilename,XLSXfilename)
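As a rough idea of the shape these could take, here is a sketch of a shared request helper that the functions above might call. The name DocparserGet and its ErrorText parameter are my own assumptions, not anything from the Docparser API, and treating any status other than 200 as a failure is also an assumption. It is untested.

```autohotkey
; Hypothetical helper, not from the scripts earlier in this thread.
; Returns the response body, or "" on failure with the reason in ErrorText.
DocparserGet(UsernamePasswordBase64,URL,ByRef ErrorText)
{
  ErrorText:=""
  Try
  {
    oHTTP:=ComObjCreate("WinHttp.WinHttpRequest.5.1")
    oHTTP.Open("GET",URL)
    oHTTP.SetRequestHeader("Authorization","Basic " . UsernamePasswordBase64)
    oHTTP.Send()
    Status:=oHTTP.Status
    Response:=oHTTP.ResponseText
  }
  Catch e
  {
    ; COM failures (no network, bad URL, and so on) end up here
    ErrorText:="HTTP request failed: " . e.Message
    Return ""
  }
  If (Status<>200) ; assumption: anything other than 200 counts as a failure
  {
    ErrorText:="HTTP status " . Status
    Return ""
  }
  Return Response
}
```

A caller would check ErrorText after each call, for example Response:=DocparserGet(UsernamePasswordBase64,URL,ErrorText), and show a MsgBox if ErrorText is not empty.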
I also plan to make it more robust, with lots of error checking, including Try-Catch pairs where appropriate. But that's for the future, with an unknown ETA. In the meantime, I hope that the code here helps anyone looking to call the Docparser API with AHK.
Regards, Joe