OCR.ahk - Library for recognizing text in images


88 replies to this topic
Zaelia
  • Members
  • 754 posts
  • Last active: Jan 17 2015 02:38 AM
  • Joined: 31 Oct 2008
Of course you can. For example, GOCR offers some extra options such as:

gocr.exe -C 0-9 towant.pnm ; recognize only the characters 0 to 9
gocr.exe -m 130 toknow.pnm ; build up the database (mode 2+128)
gocr.exe -m 2 towant.pnm ; use the database
gocr.exe -a 100 towant.pnm ; accept no errors (perfect for computer fonts if the db is complete)

However, GOCR has limits. Image conversion is a problem on Windows: converter executables are rare or do not handle the alpha channel. PNM is the "open" BMP, but the conversion depends on your original file type... so there is some image processing to do, and sometimes you have to reshape or reformat a character, which takes a good understanding of the PNM format (ASCII to binary, selecting a color channel for basic captchas, the db only uses .pbm (no gray), two characters can be merged, ...).

So my advice is to use GOCR for "code numbers" or computer fonts, BUT GOCR is still a work in progress :) Beyond that there is Tesseract, which works on TIFF: it is slower, much harder to configure, and much larger, but it suits handwriting, scans, and language support.

Edit: you can also use gocr like this (in "gocr.exe -" the lone minus means use stdin/stdout; this can be good if there is no image processing to do and a fast result is required):
xxx2pnm.exe root.xxx | gocr.exe - > result.txt
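For anyone calling GOCR directly from a script, here is a minimal AutoHotkey sketch of the numeric-only call. It assumes gocr.exe is on the PATH or in the script's folder and that the image is already in PNM format. The file names are just the placeholders used above, and this is not necessarily how OCR.ahk invokes GOCR internally.

```autohotkey
; Sketch only: both file names are placeholders, not fixed names from any library
inputFile := "towant.pnm"
outputFile := "result.txt"

; cmd.exe performs the > redirection; "Hide" suppresses the console window
RunWait, %ComSpec% /c "gocr.exe -C 0-9 %inputFile% > %outputFile%", , Hide

; Read back whatever GOCR wrote and show it
FileRead, digits, %outputFile%
MsgBox, %digits%
```

RunWait blocks until GOCR has finished, so the result file is complete by the time FileRead runs.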
"You annoy me, therefore I exist."

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
bmoore45 & Zaelia: I posted a new version that allows numeric-only mode... it turned out to be easy, so I went ahead and did it (thanks Zaelia for informing me of those command-line params). Usage is as follows:

output := GetOCR(0, 0, 100, 100, "numeric")

Aren't you glad that I didn't put an annoying gif here?

bmoore45
  • Members
  • 273 posts
  • Last active: Mar 14 2012 01:00 PM
  • Joined: 10 Jul 2011
sorry for the stupid q...but how do I use that line to get numeric only output?

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009

sorry for the stupid q...but how do I use that line to get numeric only output?


If you got it to run in your own script, make it just like you have now, except the fifth parameter should have the text "numeric" in the string. If you're still having issues, feel free to PM the code to me and I'll take a look at it.

bmoore45
  • Members
  • 273 posts
  • Last active: Mar 14 2012 01:00 PM
  • Joined: 10 Jul 2011
Oh right, I got it now, thank you. I was trying to do it with OCR-preview and couldn't work out that magicalText is the name of the output variable, lol.

This works great for me, btw. I haven't had a single error yet in what the OCR returns, and I have processed probably about 10,000 digits so far. Awesome stuff camerb, ty!

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009

I haven't had a single error yet ... and have processed probably about 10,000 digits so far.

Wow, that's excellent. I've noticed that numeric-only mode makes it more reliable, but had no clue that it would be that good. Probably depends a lot on the font you're using, too.

Foo
  • Members
  • 37 posts
  • Last active: Nov 23 2011 02:22 AM
  • Joined: 09 Feb 2006
I tried to modify the example to capture a fixed area of the screen rather than the area under the mouse, and to do so once after a hotkey press rather than repeatedly.

I tried to store magicalText in a variable and then send it to Excel.

It seemed to work at one point, but after I changed the coordinates it no longer seems to grab the text from the active window.

This is the modified ocr-example.ahk:

/**
 * OCR library test script by camerb
 *
 * This tiny script serves as an example of the intended usage of the OCR library.
*/

#SingleInstance force
#Include OCR.ahk

;sometimes this helps to ensure more consistent results when switching from one window to another
CoordMode, Mouse, Screen

outputFile=OCRtext.txt
;widthToScan=100
;heightToScan=40

Loop
{
   ;figuring out what region we will scan (the area around the mouse)
   ;MouseGetPos, mouseX, mouseY
   ;topLeftX := mouseX - (widthToScan / 2)
   ;topLeftY := mouseY - (heightToScan / 2)
   topLeftX = 300
   topLeftY = 300 
   widthToScan = 600  ;this may actually be the ending x of the right side of the box??
   heightToScan = 600 ;the right edge of the box, y coordinate? Hopefully?

   ;NOTE: this is where the magical OCR function is called
   magicalText := GetOCR(topLeftX, topLeftY, widthToScan, heightToScan)

   ;I prefer to look at this output using a BareTail, some like Tooltips
   ;but I left a msgbox with a timeout here because that works for everyone
   ;liveMessage=Here is the text that GetOCR() found near your mouse:`n%magicalText%`n`nPress ESC at any time to exit
   t::
   ;MsgBox %magicalText%
	test=%magicalText%
	Xl := ComObjActive("Excel.Application")
	XL.Range("A1").Value := test
   return
   
   ;FileAppend, %magicalText%`n, %outputFile%
   ;MsgBox, , , %liveMessage%, 2
   ;ToolTip, %liveMessage%
   Sleep, 100
}
;end of script (obviously this never really exits)

Esc:: ExitApp


camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
Well, there are a few things wrong with what you're trying to do...

First, you should note that "CoordMode, Mouse, Screen" changes the coordinates so that everything is relative to the top-left corner of the screen, not the active window. (However, I am aware that there is a bug in the lib that keeps "CoordMode, Mouse, Window" from functioning correctly. I'm looking at fixing that because, honestly, that's how I primarily want to use the lib myself.)

Second, you seem to have a hotkey definition ("t::") inside of a loop. That's not a great idea. In fact, hotkeys should not be in the auto-execute section at all.
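A minimal sketch of that restructuring, with the hotkey moved out of the loop and placed after the auto-execute section. It assumes GetOCR takes (x, y, width, height) in screen coordinates and that Excel is already open; the region values are placeholders carried over from the script above.

```autohotkey
#SingleInstance force
#Include OCR.ahk

CoordMode, Mouse, Screen

; Region to scan: top-left corner plus width and height (placeholder values)
topLeftX := 300
topLeftY := 300
widthToScan := 600
heightToScan := 600
return  ; end of the auto-execute section

; Each press of t runs one OCR pass and writes the result to cell A1
t::
magicalText := GetOCR(topLeftX, topLeftY, widthToScan, heightToScan)
xl := ComObjActive("Excel.Application")  ; assumes an Excel instance is running
xl.Range("A1").Value := magicalText
return

Esc::ExitApp
```

The Loop is gone entirely: the hotkey makes the script persistent, and the OCR call only runs when the key is pressed.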

ip0t
  • Guests
  • Last active:
  • Joined: --
Can we make it work on a minimized Firefox browser?

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
ipot: No, that is not possible.

nimda
  • Members
  • 4368 posts
  • Last active: Aug 09 2015 02:36 AM
  • Joined: 26 Dec 2010

ipot: No, that is not possible.

Well... if you have it in the background but not minimized you can use GDI+ to grab a bitmap...

GeekDude
  • Spam Officer
  • 391 posts
  • Last active: Oct 05 2015 08:13 PM
  • Joined: 23 Nov 2009
My take on it, I call it OCRDrag

#SingleInstance Force
#Include OCR.ahk

CoordMode, Mouse, Screen

; Hold Ctrl + right mouse button and drag to select the region to read
^RButton::
Gui, Color, blue
Gui, +LastFound -Caption +Border +AlwaysOnTop
WinSet, Transparent, 50
MouseGetPos, Xpos1, Ypos1	; anchor corner, where the drag started
Loop {
	MouseGetPos, Xpos2, Ypos2	; corner currently under the mouse
	Xpos3 := Abs(Xpos2 - Xpos1)	; width of the selection
	Ypos3 := Abs(Ypos2 - Ypos1)	; height of the selection
	Xpos4 := (Xpos2 > Xpos1) ? Xpos1 : Xpos2	; left edge
	Ypos4 := (Ypos2 > Ypos1) ? Ypos1 : Ypos2	; top edge
	Gui, Show, x%Xpos4% y%Ypos4% w%Xpos3% h%Ypos3%
	If !GetKeyState("RButton", "P")	; stop once the button is released
		Break
	Sleep, 10	; keep the drag loop from pegging the CPU
}
Gui, Destroy
OCR := GetOCR(Xpos4, Ypos4, Xpos3, Ypos3)
MsgBox, 4,, %OCR%	; Yes/No box showing the recognized text
IfMsgBox, Yes
	clipboard := OCR
Return

While holding Ctrl and the right mouse button, you can drag the mouse to select an area for character recognition. The script then runs GetOCR on the selected area and shows the result in a message box; if you select Yes, it copies the text to the clipboard.

Foo
  • Members
  • 37 posts
  • Last active: Nov 23 2011 02:22 AM
  • Joined: 09 Feb 2006
When I try to use your example to grab just a small section of text on the screen, it grabs far more area than I would like.
I am trying to tell it to read just a small square from x187 y252 to x264 y249.
It seems to be grabbing a large rectangular section, taller than I am anticipating.
Do I have the coordinates in the proper order? From the top-left corner to the bottom-right corner?

#SingleInstance Force
#Include OCR.ahk

^Rbutton::

OCR := GetOCR(187, 252, 264, 249)
MsgBox %OCR%

IfMsgBox, Yes
   clipboard := OCR

Return


GeekDude
  • Spam Officer
  • 391 posts
  • Last active: Oct 05 2015 08:13 PM
  • Joined: 23 Nov 2009
The way the OCR function works, you pass the x coordinate, then the y coordinate, then the WIDTH of the area you are searching, then the HEIGHT of the area you are searching. To find the width and height, use this formula:
InputBox, TopLeftX, X1, Enter the x position of the top left corner.
InputBox, TopLeftY, Y1, Enter the y position of the top left corner.
InputBox, LowerRightX, X2, Enter the x position of the lower right corner.
InputBox, LowerRightY, Y2, Enter the y position of the lower right corner.

w := LowerRightX - TopLeftX
h := LowerRightY - TopLeftY

x := TopLeftX
y := TopLeftY

MsgBox, %x%`, %y%`, %w%`, %h%
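The same conversion can be wrapped in a small function that does not care which corner is given first. This is only a sketch: CornersToRect is a hypothetical helper, not part of OCR.ahk, and it needs AutoHotkey_L for object support.

```autohotkey
; Hypothetical helper: turn any two opposite corners into x, y, width, height
CornersToRect(x1, y1, x2, y2) {
    rect := {}
    rect.x := (x1 < x2) ? x1 : x2   ; left edge is the smaller x
    rect.y := (y1 < y2) ? y1 : y2   ; top edge is the smaller y
    rect.w := Abs(x2 - x1)          ; width
    rect.h := Abs(y2 - y1)          ; height
    return rect
}

; The corners from the question above, in the order they were given
r := CornersToRect(187, 252, 264, 249)
msg := r.x ", " r.y ", " r.w ", " r.h
MsgBox, %msg%   ; shows 187, 249, 77, 3

; With OCR.ahk included, the call would then be:
; text := GetOCR(r.x, r.y, r.w, r.h)
```

For those corners the region is only 3 pixels tall, so passing 264 and 249 as width and height would indeed scan a much larger area than intended.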


Foo
  • Members
  • 37 posts
  • Last active: Nov 23 2011 02:22 AM
  • Joined: 09 Feb 2006
Thank you for that explanation. I was able to capture text from an area of the screen exactly how I want to with that understanding.

I appreciate your help very much.

Thanks again.