Easy OCR
Posted: 23 Apr 2023, 23:18
This is a UWP OCR library for AHK v2. No extra installations are needed (except perhaps Windows language packs).
Credit to malcev, whose OCR function I relied on heavily, and special thanks to feiyue, whose FindText library has been of great help.
The library is available here: https://github.com/Descolada/OCR
If you have any suggestions or comments about what should be changed or improved, please leave a comment. Also feel free to create pull requests on GitHub.
Some examples
Example 1
Displays all text found on the desktop, then highlights the results line by line.
Code:
#Requires AutoHotkey v2
#include OCR.ahk
result := OCR.FromDesktop()
MsgBox "All text from desktop: `n" result.Text
MsgBox "Press OK to highlight all found lines for 3 seconds."
for line in result.Lines
    result.Highlight(line, -3000)
Sleep 3000 ; A negative time is non-blocking, so keep the script alive while the highlights are shown
ExitApp
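To inspect what was recognized on each line rather than only highlighting it, the lines can be listed one by one. This is a minimal sketch; it assumes each line object exposes a Text property in the same way the result object does, which the example above does not show directly.

```autohotkey
#Requires AutoHotkey v2
#include OCR.ahk

result := OCR.FromDesktop()
out := ""
; result.Lines is iterated as in Example 1; line.Text is an assumption mirroring result.Text
for i, line in result.Lines
    out .= i ": " line.Text "`n"
MsgBox out
```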
Example 2
Finds some text in Notepad and selects it with MouseClickDrag.
Code:
#Requires AutoHotkey v2
#include OCR.ahk
Run "notepad.exe"
WinWaitActive "ahk_exe notepad.exe"
Send "Lorem ipsum "
Sleep 40
result := OCR.FromWindow("A",,2)
; found stays unset if FindString throws because the text was not found
try found := result.FindString("Lorem")
if !IsSet(found) {
    MsgBox '"Lorem" was not found in Notepad!'
    ExitApp
}
result.Highlight(found)
CoordMode "Mouse", "Window"
MouseClickDrag("Left", found.x, found.y, found.x + found.w, found.y + found.h)
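The try / IsSet pattern used with FindString can be wrapped in a small function if it is needed in several places. This is only a sketch: TryFindString is a hypothetical helper name, not part of OCR.ahk, and it assumes FindString throws when nothing matches, as the examples in this post suggest.

```autohotkey
#Requires AutoHotkey v2
#include OCR.ahk

; Hypothetical helper (not part of OCR.ahk): returns the match object,
; or 0 if FindString throws because nothing was found.
TryFindString(result, needle) {
    try {
        return result.FindString(needle)
    } catch {
        return 0
    }
}

result := OCR.FromWindow("A",,2)
found := TryFindString(result, "Lorem")
if found
    result.Highlight(found)
else
    MsgBox '"Lorem" was not found in the active window!'
```

Returning 0 keeps the call site to a plain if/else, at the cost of hiding any other error FindString might raise.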
Example 3
Reads text from under the cursor and displays it in real time.
Code:
#Requires AutoHotkey v2
#include OCR.ahk
CoordMode "Mouse", "Screen"
CoordMode "ToolTip", "Screen"
DllCall("SetThreadDpiAwarenessContext", "ptr", -3) ; Needed for multi-monitor setups with differing DPIs
global w := 150, h := 50, minsize := 5, step := 3
Loop {
    MouseGetPos(&x, &y)
    ; OCR a w-by-h rectangle centered on the cursor and show the text just below it
    Highlight(x-w//2, y-h//2, w, h)
    ToolTip(OCR.FromRect(x-w//2, y-h//2, w, h, "en-us").Text, , y+h//2+10)
}
; Arrow keys resize the capture rectangle; Max() keeps it from shrinking below minsize
Right::global w+=step
Left::global w := Max(minsize, w - step)
Up::global h+=step
Down::global h := Max(minsize, h - step)
; Draws a rectangular frame from four borderless GUIs; call with no arguments to remove it
Highlight(x?, y?, w?, h?, showTime:=0, color:="Red", d:=2) {
    static guis := []
    if !IsSet(x) {
        for _, r in guis
            r.Destroy()
        guis := []
        return
    }
    if !guis.Length {
        Loop 4
            guis.Push(Gui("+AlwaysOnTop -Caption +ToolWindow -DPIScale +E0x08000000"))
    }
    Loop 4 {
        i:=A_Index
        , x1:=(i=2 ? x+w : x-d)
        , y1:=(i=3 ? y+h : y-d)
        , w1:=(i=1 or i=3 ? w+2*d : d)
        , h1:=(i=2 or i=4 ? h+2*d : d)
        guis[i].BackColor := color
        guis[i].Show("NA x" . x1 . " y" . y1 . " w" . w1 . " h" . h1)
    }
    ; Positive showTime blocks for that long; negative removes the frame later via a timer
    if showTime > 0 {
        Sleep(showTime)
        Highlight()
    } else if showTime < 0
        SetTimer(Highlight, -Abs(showTime))
}
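The rectangle arithmetic in this example (centering the capture area on the cursor, and keeping its size from dropping below a minimum) can be pulled out into standalone functions. A minimal sketch; CenteredRect and Shrink are hypothetical names introduced here for illustration and are not part of OCR.ahk.

```autohotkey
#Requires AutoHotkey v2

; Hypothetical helper: top-left corner and size of a w-by-h rectangle centered on (x, y).
; The // operator is integer division, so all arguments must be integers.
CenteredRect(x, y, w, h) {
    return {x: x - w//2, y: y - h//2, w: w, h: h}
}

; Hypothetical helper: shrink a dimension by step, but never below minsize.
Shrink(value, step, minsize) {
    return Max(minsize, value - step)
}
```

For example, with the cursor at (500, 300) and a 150 by 50 rectangle, CenteredRect(500, 300, 150, 50) gives x = 425 and y = 275, and Shrink(6, 3, 5) returns 5 rather than 3.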
Example 4
Tries to find a search phrase from the active window.
Code:
#Requires AutoHotkey v2
#include OCR.ahk
CoordMode "Mouse", "Window"
Loop {
    ib := InputBox("Insert search phrase to find from active window: ", "OCR")
    Sleep 100 ; Small delay to wait for the InputBox to close
    if ib.Result != "OK"
        ExitApp
    result := OCR.FromWindow("A",,2)
    try found := result.FindString(ib.Value)
    catch {
        MsgBox 'Phrase "' ib.Value '" not found!'
        continue
    }
    ; MouseMove is set to CoordMode Window, so no coordinate conversion necessary
    MouseMove found.x, found.y
    result.Highlight(found)
    break
}
Example 5
Shows how to wait for text, search keywords in the results object, and click results.
Code:
#Requires AutoHotkey v2
#include OCR.ahk
Run "https://www.w3schools.com/tags/att_input_type_checkbox.asp"
WinWaitActive "HTML input type",,10
if !WinActive("HTML input type") {
    MsgBox "Failed to find test window!"
    ExitApp
}
; Wait for text "Yourself" to appear, case-insensitive search, indefinite wait. Search only the active window.
result := OCR.WaitText("Yourself",, OCR.FromWindow.Bind(OCR, "A"))
; Find the Word for "Yourself" in the result, and click it.
result.Click(result.FindString("Yourself"))
; Wait for text to appear that matches RegExMatch with needle "I have a bike(\s|$)".
; RegEx matching is used here to accept either a space at the end or the end of string, because
; it might be in the middle of the found text or at the end.
; Search only the active window.
result := OCR.WaitText("I have a bike(\s|$)",, OCR.FromWindow.Bind(OCR,"A"),,RegExMatch)
; Here we don't have to use RegEx, because the string will be split by spaces and compared word-by-word.
result.Click(result.FindString("I have a bike"))
There are probably multiple improvements to be made in the FromWindow method, since I'm not too familiar with Gdi+...
Edit history:
13.08.2024: updated examples