This reads the contents of a docx file (simply the string data).
First of all, I made a Testing.docx file.
It has 100,000 lines (sentences).
Each line has 10 random alphanumeric characters, for instance "d4j9415852".
All of the lines are unique (no duplicates).
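For anyone who wants to build similar test data, here is a minimal sketch of one way to generate the lines. It is not necessarily how the original file was made. The alphabet (lowercase letters plus digits) and the output name TestLines.txt are assumptions, and the resulting text still has to be pasted into Word and saved as Testing.docx by hand.

```autohotkey
; Assumed alphabet: the post only says "alpha-numeric", so lowercase + digits is a guess.
chars := "abcdefghijklmnopqrstuvwxyz0123456789"
seen := {}      ; lines generated so far, used to reject duplicates
lines := ""
count := 0
while (count < 100000)
{
    line := ""
    Loop, 10
    {
        Random, idx, 1, 36              ; 36 = number of characters in chars
        line .= SubStr(chars, idx, 1)
    }
    if (seen.HasKey(line))              ; already used, so draw a new line
        continue
    seen[line] := true
    lines .= line "`n"
    count++
}
outFile := A_ScriptDir "\TestLines.txt" ; hypothetical output name
FileDelete, % outFile                   ; FileAppend would otherwise add to an old file
FileAppend, % lines, % outFile
```

The `seen` object only ever rejects a candidate, so no line that reaches the file can be a repeat.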
Then I extracted its contents into a txt file.
The COM (PIA) code took 3.4 seconds.
The "Fast" code took only 0.34 seconds.
OMG, that is 10 times faster.
That is good, real good!
It is only at the level of pseudo code, but it works.
Regards
Faster version:
myWord := A_ScriptDir "\Testing.docx"   ; Shell.Application needs full paths
startTime := A_TickCount
SplitPath, myWord,, docDir
tempFolder := docDir "\_Word_UnZip"
tempName := RegExReplace(myWord, "i)\.docx$") ".zip"
; A docx is a zip archive, so copy it to a .zip name and let the shell unpack it.
FileCopy, % myWord, % tempName, 1
FileCreateDir, % tempFolder
shell := ComObjCreate("Shell.Application")
; 4 = no progress dialog, 16 = answer "Yes to All"
shell.Namespace(tempFolder).CopyHere(shell.Namespace(tempName).Items, 4|16)
FileDelete, % tempName
FileEncoding, UTF-8
FileRead, wordContents, % tempFolder "\word\document.xml"
; Collect the text of every <w:t> run, including <w:t xml:space="preserve">.
myContent := ""
pos := 1
While pos := RegExMatch(wordContents, "<w:t(?:\s[^>]*)?>(.*?)</w:t>", m, pos)
{
    myContent .= m1 "`n"
    pos += StrLen(m)
}
FileRemoveDir, % tempFolder, 1
resultFile := RegExReplace(myWord, "i)\.docx$") "_Extracted.txt"
FileDelete, % resultFile   ; FileAppend would otherwise append to an old result
FileAppend, % myContent, % resultFile
MsgBox % A_TickCount - startTime
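To show what the extraction loop does without needing a docx at all, here is a small sketch that runs the same kind of regex over an in-memory string. The sample XML is made up for illustration, and DecodeXmlEntities is a hypothetical helper, not part of the script above. Two things it illustrates: a run written as `<w:t xml:space="preserve">` is missed by a plain `<w:t>` pattern, and the text inside `<w:t>` is XML-escaped, so `&amp;` and friends need decoding.

```autohotkey
; Made-up sample of the kind of markup found in word\document.xml.
sampleXml := "<w:p><w:r><w:t>d4j9415852</w:t></w:r></w:p>"
    . "<w:p><w:r><w:t xml:space=""preserve"">a &amp; b </w:t></w:r></w:p>"

extracted := ""
pos := 1
; Each search starts where the previous match ended, so every <w:t> run is visited once.
while pos := RegExMatch(sampleXml, "<w:t(?:\s[^>]*)?>(.*?)</w:t>", m, pos)
{
    extracted .= DecodeXmlEntities(m1) "`n"
    pos += StrLen(m)
}
MsgBox % extracted

; Hypothetical helper: decodes only the five predefined XML entities.
DecodeXmlEntities(text)
{
    text := StrReplace(text, "&lt;", "<")
    text := StrReplace(text, "&gt;", ">")
    text := StrReplace(text, "&quot;", """")
    text := StrReplace(text, "&apos;", "'")
    return StrReplace(text, "&amp;", "&")   ; done last so "&amp;lt;" becomes "&lt;", not "<"
}
```

Note that each `<w:t>` is a text run, not a line. In the test file every line is a single run, but Word may split formatted or edited text into several runs per paragraph, so a real document may need the runs grouped by `<w:p>` instead.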
COM version:
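myWord := A_ScriptDir "\Testing.docx"   ; Word resolves relative paths against its own default folder
startTime := A_TickCount
word := ComObjCreate("Word.Application")
doc := word.Documents.Open(myWord)
doc.SaveAs(A_ScriptDir "\_Extracted.txt", 2)   ; 2 = wdFormatText
doc.Saved := 1   ; mark as saved so Quit does not prompt
word.Quit()
MsgBox % A_TickCount - startTime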
When you use someone else's script, please leave at least a brief comment saying thank you. And stop the shameless stealing.