Vis2 - OCR(), ImageIdentify()

Archandrion
Posts: 31
Joined: 26 May 2018, 22:23

Re: Vis2 - OCR(), ImageIdentify()

04 Jun 2018, 00:54

Thanks for the reply. I had actually played around with it before seeing your reply, but found that passing the internal dialogue variable to the translator (at the point where it would normally be shown in the OCR language) was still too slow to keep up with the subtitles in the video once the translation time was added. Another problem is that some languages translate rather poorly without human input, because many phrases carry cultural context that subtitle groups usually find a way to convey.

A tool that attempts to translate subtitles in real time for live video with OCR, however imperfect, would still be extremely useful. I've seen Word Lens, but there does not appear to be a good desktop alternative. I'm looking into OpenCV, Tesseract and FFmpeg to do OCR on the video while it is playing with audio, as the native imshow appears to be for video frames only, but that's Python so it's outside of this forum's scope.

Another question, probably not specific to the main function of Vis2: is there a way to make a Vis2.Graphics.Subtitle.Render last only until it is called again, maybe some subtitle.destroy option? Just removing the duration amount results in overlapping subtitles. I couldn't tell beforehand how long a particular subtitle would last, so I couldn't use the duration option.
iseahound
Posts: 235
Joined: 13 Aug 2016, 21:04
GitHub: iseahound

Re: Vis2 - OCR(), ImageIdentify()

04 Jun 2018, 02:31

Just curious, are you ripping subtitles from anime? I'm assuming your source is hardcoded, which is quite rare these days. There are a few points I'd like to make:

1) The input image is upscaled by 3.5x during the preprocessing step. Some users, like you, want speed and others want accuracy, so this is an okay balance for small fonts. Your hardcoded subtitles should be quite large, so if you Ctrl+F the script and search for 3.5, you can decrease it to 2, or even just 1. This should decrease processing times by 5x.

2) Using Python is not outside of Vis2's scope. I'd thought about adding OpenCV, but that requires my users to install a 138+ MB binary. On the bright side, maximally stable extremal regions can be detected quickly for full-screen OCR.

3) Vis2.Graphics.Subtitle.Render has been released separately on this forum: https://autohotkey.com/boards/viewtopic.php?t=36384 (check that for documentation). First, a subtitle object is initialized with Vis2.obj.Subtitle := new Vis2.Graphics.Subtitle(). Then, each time .Render() is called, whatever was previously rendered to the screen is overwritten. .Destroy() is a valid method, but it completely destroys the object and doesn't clear it off the screen. Perhaps you are looking for .Hide() and its siblings .Show() and .ToggleVisible()? You can also set a time parameter to have the subtitle time out and self-destruct, as in Vis2.Graphics.Subtitle.Render("This will last 5 seconds", "time: 5000")
In case you were wondering, I just copied and pasted the Subtitle class, so both versions are the same.
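
To illustrate the "create once, render many times" pattern from point 3, here is a minimal sketch. ShowLines() is a hypothetical helper and not part of Vis2; it only relies on the object having a .Render(text) method, as in the calls quoted in this thread. The commented usage at the bottom assumes Vis2 is already loaded in the script.

```autohotkey
; Sketch only. ShowLines() is a hypothetical helper, not part of Vis2.
; It sends every line through ONE subtitle object, so each Render() call
; replaces the previous text instead of stacking a new overlay on top.
ShowLines(subtitle, lines, delayMs := 0) {
    for index, text in lines {
        subtitle.Render(text)
        if (delayMs > 0)
            Sleep, %delayMs%
    }
    return lines.Length()   ; number of lines rendered
}

; Assumed usage once Vis2 is loaded (constructor as quoted in this thread):
; TranslatedSubtitle := new Vis2.Graphics.Subtitle()
; ShowLines(TranslatedSubtitle, ["First line", "Second line"], 2000)
```

The important part is that the constructor runs once, outside whatever loop or timer produces the text. Calling it again for every new line is what would leave several overlays on screen.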
Archandrion
Posts: 31
Joined: 26 May 2018, 22:23

Re: Vis2 - OCR(), ImageIdentify()

04 Jun 2018, 08:25

To answer your question: in part, yes. It is animation, although Chinese rather than Japanese. Since I can't tell the duration beforehand and it varies a lot, I can't really use the time option. With a GUI, for example, GuiControl,,Var, %TranslatedText% basically does what I need. I am trying to make the following work correctly:

Code:

                   
if !(bypass)
    Vis2.obj.Subtitle.Render(Vis2.obj.dialogue, Vis2.obj.style1_back, Vis2.obj.style1_text)
TranslatedSubtitle.Render(A_TickCount, "xCenter y760" Vis2.obj.style1_back, Vis2.obj.style1_text)

A_TickCount would be replaced by the actual translated text, but for this example it demonstrates the overlay when OCR() is called from the demo in interactive GUI mode. Where would I need to place TranslatedSubtitle := new Vis2.Graphics.Subtitle() so that the render updates rather than overlays?
renmacro
Posts: 20
Joined: 05 Mar 2018, 23:30

Re: Vis2 - OCR(), ImageIdentify()

04 Jun 2018, 08:55

So I ended up making a Tesseract training file, as it was now messing up 1/2/7's, and even with majority consensus it still wouldn't arrive at the correct number. And yes, training Tesseract was as bad as people make it out to be =\. But the trained one would now fail 0/6/8/9's.

I then changed my loop to run half of the verifications with my training data and half with the original eng_best, and now it seems to be very good for what I'm trying to do. My next step would have been to make a font from screenshots converted to vectors, train Tesseract on it, and see how that went. I hope my next post here isn't me doing that...
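
For anyone following along, the majority-consensus step could look roughly like the sketch below. MajorityVote() and the sample readings are hypothetical; the actual loop and how it collects readings from the two models aren't shown in this post.

```autohotkey
; Sketch only. MajorityVote() is a hypothetical helper, not part of Vis2 or Tesseract.
; It returns the reading that occurs most often in an array of OCR results.
; On a tie, the reading that appears first in the array wins.
MajorityVote(readings) {
    best := ""
    bestCount := 0
    for i, candidate in readings {
        count := 0
        for j, other in readings {
            ; Prefix with a letter so "0127" and "127" compare as text, not as numbers.
            if (("s" . other) == ("s" . candidate))
                count += 1
        }
        if (count > bestCount) {
            bestCount := count
            best := candidate
        }
    }
    return best
}

; Placeholder readings: imagine half came from a custom trained model
; and half from eng_best.
readings := ["127", "121", "127", "727", "127", "121"]
consensus := MajorityVote(readings)   ; "127" occurs three times here
```

Mixing two models only helps if they tend to fail on different digits, which matches the 1/2/7 versus 0/6/8/9 split described above.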
Last edited by renmacro on 04 Jun 2018, 11:42, edited 1 time in total.
iseahound
Posts: 235
Joined: 13 Aug 2016, 21:04
GitHub: iseahound

Re: Vis2 - OCR(), ImageIdentify()

04 Jun 2018, 09:50

You can place it with the other constructors. It's in a function called start(), under the line Vis2.obj.Subtitle := New Vis2.Graphics.Subtitle()

You should hit up the anime scene. They OCR subtitles from TV rips, but since it's a specialized process they might not have released their code on GitHub.

@renmacro if your text is fixed width and height and numbers/letters only, it might be time to pick up TensorFlow.
You should be able to get >95% accuracy easily.
Archandrion
Posts: 31
Joined: 26 May 2018, 22:23

Re: Vis2 - OCR(), ImageIdentify()

04 Jun 2018, 21:21

iseahound wrote: You can place it with the other constructors. It's in a function called start(), under the line Vis2.obj.Subtitle := New Vis2.Graphics.Subtitle()


Thanks for the help, it worked.
euras
Posts: 321
Joined: 05 Nov 2015, 12:56

Re: Vis2 - OCR(), ImageIdentify()

23 Jun 2018, 10:03

iseahound wrote: You can place it with the other constructors. It's in a function called start(), under the line Vis2.obj.Subtitle := New Vis2.Graphics.Subtitle()


Hi iseahound, wonderful tool first of all! I'm trying to understand a couple of things here.
First: how do I specify both the language I want to use and the screen coordinates? I have tried it this way, but it doesn't work:

Code:

txt := OCR([232, 411, 640, 40, "nor"]).clipboard()

Second: does the coordinates method share the same functions as the mouse click-and-drag method? When I use coordinates mode to get the text, I get incorrect text, but when I use the click-and-drag method, the text is converted without errors...

Code:

MsgBox % OCR("https://i.stack.imgur.com/sFPWe.png", , [0,330,999,400])

I get: prown dog jumped over the lazy Tox.

But coordinate mode would be much more efficient if it worked perfectly...

I'm using the demo.ahk file.
iseahound
Posts: 235
Joined: 13 Aug 2016, 21:04
GitHub: iseahound

Re: Vis2 - OCR(), ImageIdentify()

24 Jun 2018, 23:47

Which link?

Code:

txt := OCR([232, 411, 640, 40], "nor").clipboard()
