OCR.ahk - Library for recognizing text in images


88 replies to this topic
JimBob
  • Guests
  • Last active:
  • Joined: --
Is this broken on the newest AHK_L x64?

I've tried the preview and while I see a pink rectangle, I'm not seeing any tooltip. I tried changing the code to MsgBox, and no msgbox came up. I tried to do my own script with just text = GetOCR(0, 0, 500, 500) and then a MsgBox, didn't work. I had the #Include OCR.ahk there so that wasn't the issue.

JimBob
  • Guests
  • Last active:
  • Joined: --
Did some debugging, and found that a Gdip loop wasn't ending. Searched around and got an updated Gdip x64 library and now I see the ToolTip. However, the variable storing the OCR results is empty and I see no results with the preview and example files. Just says:

"Blah blah OCR Results:



Press ESC to exit."

Any ideas as to what's broken now?

JimBob
  • Guests
  • Last active:
  • Joined: --
Sorry for the spam, really need to get this working.

So I think this line may be the issue:

runCmd=gocr.exe %additionalParams% in.pnm

I stopped the code after that and commented out FileDelete, in.pnm

However, I don't see the pnm file in my folder. The jpeg is there. Is this again an issue with AHK_L x64?

JimBob
  • Guests
  • Last active:
  • Joined: --
Whoops, I copy/pasted the wrong line of code above.

It should be

convertCmd=djpeg.exe -pnm -grayscale %fileNameDestJ% in.pnm

So in short I think it's an issue with either the way it's calling djpeg.exe or djpeg.exe itself.

  • Guests
  • Last active:
  • Joined: --
it works, Thanks!

Guest Cheruvian
  • Guests
  • Last active:
  • Joined: --
Hey, I'm trying to figure out this library. I'm checking a static area of my screen and it returns the correct text most of the time, but other times the result is just blank. Is there a reason for this? I have literally just changed the example code to check a static area instead of following the mouse, yet something is not right.

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
Well, it depends on the readability of the text. Not all text is readable: even though you can read the text on the screen, that does not necessarily mean it is readable using OCR. Larger fonts do much better than smaller fonts, and the same goes for text whose color clearly differs from what surrounds it.

Perhaps you should show your code and the text on the screen that you are trying to read.
Aren't you glad that I didn't put an annoying gif here?

cheruvian
  • Members
  • 10 posts
  • Last active: Mar 02 2012 03:27 AM
  • Joined: 22 Aug 2010
Like I said, it reads the text correctly (as correctly as you can expect any OCR to read it, at least), but I've been testing, and ~40% of the time, on the exact same image/screen, it returns blank, while the other 60% of the time it returns the correct answer...

;;;;;;;;To Do;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;	Last Query Sent @
;	Next Query To be Sent @
;	Countdown
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

#Include OCR.ahk
CoordMode, Mouse, Screen

Tooltip, %A_Hour%:%A_Min%,0,0

sleep 10000

magicalText := GetOCR(593, 278, 150, 50, "debug")						;Get the square and OCR it
StringUpper,magicalText,magicalText

click 821, 294															;Click on inputbox
click 821, 294
click 821, 294

sleep 500
send %magicalText%{Enter}												;Send Text
reload																		;Do it again


Cheruvian Guest
  • Guests
  • Last active:
  • Joined: --
Figured it out... OCR.ahk was not waiting for in.pnm to exist, so depending on the load on my processor, converting the jpg to a pnm could take longer, and at times GOCR would be run on a nonexistent image, hence the blank result. I have modified OCR.ahk to make sure that every file exists and is deleted at the appropriate time:


	/**
	 *   OCR library by camerb
	 *   v0.93 - 2011-09-06
	 *
	 * This OCR lib provides an easy way to check a part of the screen for
	 * machine-readable text. You should note that OCR isn't a perfect technology,
	 * and will frequently make mistakes, but it can give you a general idea of
	 * what text is in a given area. For example, a common mistake that this OCR
	 * function makes is that it frequently interprets slashes, lowercase L,
	 * lowercase I, and the number 1 interchangeably. Results can also vary
	 * greatly based upon where the outer bounds of the area to scan are placed.
	 *
	 * Future plans include a function that will check if a given string is
	 * displayed within the given coordinates on the screen.
	 *
	 * Home thread: http://www.autohotkey.com/forum/viewtopic.php?t=74227
	 * With inspiration from: http://www.autohotkey.com/forum/viewtopic.php?p=93526#93526
	*/


	#Include GDIp.ahk
	#Include CMDret.ahk


	; the options parameter is a string and can contain any combination of the following:
	;   debug - for use to show errors that GOCR spits out (not helpful for daily use)
	;   numeric (or numeral, or number) - the text being scanned should be limited to
	;            numbers only (no letters or special characters)
	GetOCR(topLeftX="", topLeftY="", widthToScan="", heightToScan="", options="")
	{
	   ;TODO validate to ensure that the coords are numbers

	   prevBatchLines := A_BatchLines
	   SetBatchlines, -1 ;cuts the average time down from 140ms to 115ms for small areas

	   ;process options from the options param, if they are there
	   if options
	   {
		  if InStr(options, "debug")
			 isDebugMode:=true
		  if InStr(options, "numeral")
			 isNumericMode:=true
		  if InStr(options, "numeric")
			 isNumericMode:=true
		  if InStr(options, "number")
			 isNumericMode:=true
	   }

	   if (heightToScan == "")
	   {
		  ;TODO throw error if not in the right coordmode
		  ;CoordMode, Mouse, Window
		  WinGetActiveStats, no, winWidth, winHeight, no, no
		  topLeftX := 0
		  topLeftY := 0
		  widthToScan  := winWidth
		  heightToScan := winHeight
	   }

	   fileNameDestJ = ResultImage.jpg
	   jpegQuality = 100

	   pToken:=Gdip_Startup()
	   pBitmap:=Gdip_BitmapFromScreen(topLeftX "|" topLeftY "|" widthToScan "|" heightToScan)
	   Gdip_SaveBitmapToFile(pBitmap, fileNameDestJ, jpegQuality)
	   Gdip_DisposeImage(pBitmap) ;free the bitmap so repeated calls don't leak memory
	   Gdip_Shutdown(pToken)

	   ; Wait for jpg file to exist
	   while NOT FileExist(fileNameDestJ)
		  Sleep, 10

		  ;msgbox check
	   ;convert the jpg file to pnm
	   convertCmd=djpeg.exe -pnm -grayscale %fileNameDestJ% in.pnm
		
	   ;run the OCR
	   ;runCmd=gocr.exe -i in.pnm
	   if isNumericMode
		  additionalParams .= "-C 0-9 "
	   runCmd=gocr.exe %additionalParams% in.pnm

	   ;run both commands using my mixed cmdret hack
	   CmdRet(convertCmd)
	   
	   while NOT FileExist("in.pnm")
		  Sleep, 10
		  
	   result := CmdRet(runCmd)
	  

	   ;suppress warnings from GOCR (we don't care, give us nothing)
	   if InStr(result, "NOT NORMAL")
		  gocrError:=true
	   if InStr(result, "strong rotation angle detected")
		  gocrError:=true
	   if InStr(result, "# no boxes found - stopped") ;multiple warnings show up with this in the string
		  gocrError:=true

	   if gocrError
	   {
		  if NOT isDebugMode
			 result=
			
	   }

	   ; Cleanup
	   
	
	   FileDelete, in.pnm
	   while FileExist("in.pnm")
		  Sleep, 10
	   FileDelete, %fileNameDestJ%	
	   while FileExist(fileNameDestJ)
		  Sleep, 10
	   SetBatchlines, %prevBatchLines%

	   return result
	}

	;RunWaitEx(CMD, CMDdir, CMDin, ByRef CMDout, ByRef CMDerr)
	;{
	   ;VarSetCapacity(CMDOut, 100000)
	   ;VarSetCapacity(CMDerr, 100000)
	   ;RetVal := DllCall("cmdret.dll\RunWEx", "AStr", CMD, "AStr", CMDdir, "AStr", CMDin, "AStr", CMDout, "AStr", CMDerr)
	   ;Return, %RetVal%
	;}

	;GhettoCmdRet_RunReturn(command)
	;{
	   ;file := "joe.txt"
	   ;command .= " > " . file
	   ;Run %comspec% /c "%command%"
	   ;FileRead, returned, %file%
	   ;return returned
	;}

	CMDret(CMD)
	{
	   if RegExMatch(A_AHKversion, "^\Q1.0\E")
	   {
		  StrOut:=CMDret_RunReturn(cmd)
	   }
	   else
	   {
		  VarSetCapacity(StrOut, 20000)
		  RetVal := DllCall("cmdret.dll\RunReturn", "astr", CMD, "ptr", &StrOut)
		  strget:="strget"
		  StrOut:=%StrGet%(&StrOut, 20000, "CP0")
	   }
	   Return, %StrOut%
	}
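
The wait loops in the listing above poll forever if djpeg.exe never produces its output. A minimal sketch of a bounded wait, assuming a hypothetical helper named WaitForFile (it is not part of OCR.ahk) and an arbitrary two-second default timeout:

```autohotkey
; Hypothetical helper, not part of OCR.ahk: wait for a file with a timeout.
; Returns true if the file exists before timeoutMs elapses, false otherwise.
WaitForFile(path, timeoutMs=2000)
{
   startTime := A_TickCount
   while NOT FileExist(path)
   {
      if (A_TickCount - startTime > timeoutMs)
         return false
      Sleep, 10
   }
   return true
}

; Possible use inside GetOCR(), in place of the unbounded loop that follows
; CmdRet(convertCmd). If the wait times out, result stays empty and the
; cleanup code still runs:
; if WaitForFile("in.pnm")
;    result := CmdRet(runCmd)
```

With this shape a failed conversion shows up as an empty result instead of a hung script.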



burton666
  • Guests
  • Last active:
  • Joined: --
Could anyone help me on how to use the ocr function in a basic script?

I have created a small script which sends a bunch of commands/text to a telnet prompt. And I have created a few shortcuts (Like:

"::A010::Plock_Low.A.01.0{enter}{enter}
Send BOX{enter}{enter})

But the problem is that I have to enter it at two places in the telnet prompt, and the second line should not be sent the second time ("Send BOX{enter}{enter}").

So how do I use the OCR function to detect which screen info is in the window and make it "understand" whether the extra BOX line should be sent?

There is some specific text that differs depending on if it is the first or second time the line should be sent and if I use the ocr-preview it always sees the text.

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
burton666: Are you searching for text that will be different every time you run the script? Or are you just searching for one of two different types of text? I suspect it is the latter, and you should just use ImageSearch to determine if you are on the first or second run. That would be much easier than messing around with OCR. Not to mention that your script would [probably] run faster as well.
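
A rough sketch of the ImageSearch approach, assuming a hypothetical file first_prompt.png that holds a screenshot of the text shown only the first time the line should be sent:

```autohotkey
; Sketch only: first_prompt.png is a hypothetical screenshot of the text that
; appears only on the first screen. Search the whole primary screen for it.
CoordMode, Pixel, Screen
ImageSearch, foundX, foundY, 0, 0, %A_ScreenWidth%, %A_ScreenHeight%, first_prompt.png
if (ErrorLevel = 0)        ; image found: this is the first screen
   Send BOX{enter}{enter}
else if (ErrorLevel = 2)   ; the search could not be run (e.g. image file missing)
   MsgBox, ImageSearch could not open first_prompt.png
; ErrorLevel = 1 means the image was not found, so the BOX line is skipped
```

Narrowing the search rectangle to the part of the telnet window where the text appears would likely make the check faster.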

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
Posted v0.95, which handles waiting for the file to exist, and does a few other miscellaneous things better. This should make the lib more reliable across different computers.

S1eepy
  • Members
  • 1 posts
  • Last active: Jun 03 2012 05:02 PM
  • Joined: 03 Jun 2012
I've been using this to convert some text for a spreadsheet I'm creating, but it seems to refuse to recognise the number 2.
I've got it set to numeric mode, and everything else is working perfectly; it's just "2" that it doesn't register. When not in numeric mode, it sees "2" as "_". In numeric mode it just ignores it completely.

It doesn't matter where the number 2 is in the chain of numbers, it just skips it. Any idea how I can get it to read the number correctly?

camerb
  • Moderators
  • 573 posts
  • Last active: Sep 14 2015 03:32 PM
  • Joined: 19 Mar 2009
Sorry I didn't post this earlier. I actually wrote this yesterday, but didn't hit "submit"... my mistake.

Are you always doing this with the same font, same background and foreground color and all that? You may get better results by switching it in/out of greyscale mode. I noticed the 2 is pretty thin, so that may work well.

Does it put anything in its place? Like a space? For instance, if you try to process "7625" and GetOCR() returns "76 5", you can probably assume that the gap is always a 2. If you're only having issues with the 2, and it is very repeatable, then I'd say just convert the space to a 2 immediately after the GetOCR() call.

Like I say in the opening post, OCR does not read every character correctly. It doesn't guarantee anything, but it will give you a good gist of the text that was there.
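
The space-to-2 workaround might look something like the sketch below. The coordinates are placeholders, and it assumes the missing 2 always comes back as a single space between two digits:

```autohotkey
#Include OCR.ahk

; Sketch only: placeholder coordinates; adjust to the area being scanned.
ocrText := GetOCR(100, 100, 200, 50, "numeric")

; Replace a single space that sits between two digits with "2",
; so that "76 5" becomes "7625". Spaces elsewhere are left alone.
ocrText := RegExReplace(ocrText, "(?<=\d) (?=\d)", "2")
MsgBox, %ocrText%
```

Restricting the replacement to spaces between digits keeps leading or trailing whitespace in the GOCR output from turning into stray 2s.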

dexter323
  • Guests
  • Last active:
  • Joined: --
Whenever I define a variable with GetOCR, such as:

Power := GetOCR(topLeftX, topLeftY, widthToScan, heightToScan, numeric)
ToolTip, %Power%

it works for the tooltip, and it can export to ini and some other things (the numbers it finds)

BUT if I try to use the variable in math stuff such as below it doesn't work:

Power := GetOCR(topLeftX, topLeftY, widthToScan, heightToScan, numeric)
DoublePower := Power * 2
ToolTip, %DoublePower%