Read MS Word (docx) Fast

20 Mar 2017, 12:26

I wrote a short script for fun.

It reads the content of a docx file
(just the text strings).

First of all,
I made a testing.docx file.
It has 100,000 lines (sentences).
Each line has 10 random alphanumeric characters,
for instance "d4j9415852".
All of the lines are unique (no duplicates).
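
For anyone who wants to reproduce the test data, here is a minimal sketch of how such lines could be generated. It is not the script used for the original test. The alphabet (lowercase letters plus digits), the variable names, and the output file name testing.txt are all assumptions made for illustration.

```autohotkey
; Sketch only (AutoHotkey v1 syntax): build 100,000 unique lines of
; 10 random alphanumeric characters and write them to a text file.
chars := "abcdefghijklmnopqrstuvwxyz0123456789"   ; assumed alphabet
seen  := {}     ; lines generated so far, used to reject duplicates
lines := ""
count := 0
while (count < 100000)
{
    line := ""
    Loop, 10
    {
        Random, pos, 1, 36            ; pick one of the 36 characters
        line .= SubStr(chars, pos, 1)
    }
    key := "k" line                   ; prefix keeps all-digit lines from becoming integer keys
    if (seen.HasKey(key))
        continue                      ; duplicate, draw again
    seen[key] := true
    lines .= line "`r`n"
    count += 1
}
FileDelete, %A_ScriptDir%\testing.txt ; hypothetical output file
FileAppend, %lines%, %A_ScriptDir%\testing.txt
```

The resulting text file can then be opened in Word and saved as a docx. With 36^10 possible strings a duplicate is very unlikely, but the check makes the "all unique" property certain.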

I extracted its contents into a txt file.

When I ran the COM PIA code,
it took 3.4 seconds.

The "Fast" code took only 0.34 seconds.

OMG, it is 10 times faster.
That is good, real good!!!

It is only at the level of pseudocode, but it works.


Faster !!

Code (the COM version, for comparison):

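startTime := A_TickCount
word := ComObjCreate("Word.Application")               ; start a Word instance through COM
doc := word.Documents.Open(A_ScriptDir "\myWord.docx") ; full path; assumes the file sits next to the script
doc.SaveAs(A_ScriptDir "\_Extracted.txt", 2)           ; 2 = wdFormatText (plain text)
doc.Saved := 1                                         ; mark as saved so Word does not prompt
elapsed := A_TickCount - startTime                     ; elapsed time in milliseconds
word.Quit()                                            ; close Word so no WINWORD.EXE process is left running
MsgBox % elapsed " ms"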

When you use someone else's script, please at least leave a brief comment saying thank you.
Enough with the rude, thieving behavior.

