This may not be the best forum for this request, but the other forums didn't seem right either, so here goes...
I'm looking for a script that will find repeating words in text. I've seen word processors that that do do it, but I did a quick search and didn't find any scripts out there that do it. I may be using the wrong search terms.
Does anyone know of or have a script to do this? If not, I will be forced to write my own.
Thanks.
Script to Find Repeating Words
Re: Script to Find Repeating Words
You mean like this?
Code: Select all
#SingleInstance, Force
SampleText := "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit."
StrReplace(SampleText, "Lorem", "Lorem", Occurrence)
MsgBox, % """Lorem"" occurs exactly " Occurrence " times in this sample text."
Re: Script to Find Repeating Words
Thank you for your interest.
No. I'm sorry, I realize that I wasn't very specific. When I say repeating words, I mean words that are repeated next to each other. "The the" is a common problem in many of the documents that I write. Repeated connectives like "with" are also common. Although some words are repeated more often than others, any word can be inadvertently (and incorrectly) repeated, and I need to find them all.
Re: Script to Find Repeating Words
Code: Select all
#NoEnv
sampletext := "The the quick brown fox jumps with with excitement over the lazy dog."
RepeatedTextArray := {}
Loop, Parse, sampletext, %A_Space%
{
    ; "=" compares case-insensitively, so "The" matches "the"
    if (A_LoopField = LastFoundWord)
        RepeatedTextArray[LastFoundWord] := LastFoundWord . A_Space . A_LoopField
    LastFoundWord := A_LoopField
}
; replace each "word word" pair with a single copy of the word
for word, RepeatedWord in RepeatedTextArray
    sampletext := StrReplace(sampletext, RepeatedWord, word)
MsgBox % sampletext
Re: Script to Find Repeating Words
- Here's an attempt. It's essentially just a parsing loop using space as the delimiter. You then have to consider how to deal with any non-letters. I have considered commas and full stops in the example below.
- If you want to know the exact position where a repeated match occurs, you could use my script, and then use RegExMatch on the original text to find the positions.
- RegExReplace could be used to remove the second occurrence of each repeated word automatically, but some repeats are valid.
Code: Select all
q:: ;list repeated words
vText := "abc abc, def ghi def def. ghi abc abc def, def"
;note: 'def, def' won't be considered as a repeated pair, since there is a punctuation mark in the middle: replacing the comma with a space leaves two spaces, and so an empty item, between the two words (such behaviour could be changed by also replacing multiple spaces with single spaces via RegExReplace)
vText := StrReplace(vText, ",", " ")
vText := StrReplace(vText, ".", " ")
vTempLast := " ", vOutput := "" ;vTemp is compared to vTempLast each time; since we're parsing using space as the delimiter, an item will never be a space, so a space is a safe initial value
Loop, Parse, vText, % " "
{
    vTemp := A_LoopField
    if !(vTemp = "") && (vTemp = vTempLast)
        vOutput .= vTempLast " " vTemp "`r`n"
    vTempLast := vTemp
}
Clipboard := vOutput
MsgBox, % vOutput
return
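The RegExReplace idea mentioned in the list above could look roughly like the sketch below. It assumes AutoHotkey v1; the variable names `vFixed` and `vCount` are made up for illustration. It collapses every run of adjacent repeats into one word, so it will also "fix" legitimate repeats such as "had had".

```autohotkey
; Sketch only (AutoHotkey v1): collapse adjacent repeated words in one pass.
vText := "The the quick brown fox jumps with with excitement."
; i)           = case-insensitive matching
; \b(\w+)      = capture one whole word
; (?:\s+\1\b)+ = one or more repeats of that same word, separated by whitespace
vFixed := RegExReplace(vText, "i)\b(\w+)(?:\s+\1\b)+", "$1", vCount)
MsgBox, % vFixed "`n`nReplacements made: " vCount
```

With the sample string this should leave "The quick brown fox jumps with excitement." and report two replacements. Reviewing the matches before replacing is probably safer for real documents.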
Re: Script to Find Repeating Words
I was hoping that that there was some some ready-to-go script out there but this is a good start. Thanks everybody!
Re: Script to Find Repeating Words
From the "just in case you care" department...
I found a good RegEx pattern to help with this requirement from this this web site. It is far (far) from being done but this is what I have so far.
Code: Select all
#NoEnv
#SingleInstance Force
Text=
(ltrim
This is just a test Test this
This this is is
just just a test Test of the
the emergency broadcast center center.
)
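StartPos:=1
Loop
{
    ; i) = ignore case; \b([\w]+) captures a word; \s+\1\b = whitespace followed by the same word again
    FoundPos:=RegExMatch(Text,"i)\b([\w]+)\s+\1\b",FoundString,StartPos)
    if (FoundPos)
    {
        MsgBox Found: %FoundString%
        ; continue searching after the end of this match
        StartPos:=FoundPos+StrLen(FoundString)
    }
    else
    {
        MsgBox Nothing (else) found.
        Break
    }
}
return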
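As a possible next step, the same pattern can collect every match with its character position into a single report instead of showing one MsgBox per match. This is only a sketch under AutoHotkey v1 assumptions; `Report` and `Found` are illustrative names, and `Found1` is the first subpattern that RegExMatch stores in its default output mode.

```autohotkey
; Sketch only (AutoHotkey v1): report each adjacent repeat and where it starts.
Text := "This this is is just a test Test of the the emergency broadcast center center."
Report := "", StartPos := 1
while (FoundPos := RegExMatch(Text, "i)\b(\w+)\s+\1\b", Found, StartPos))
{
    Report .= "Position " FoundPos ": " Found "`n"
    ; Resume at the second word of the pair, so a triple such as
    ; "the the the" is reported as two overlapping pairs.
    StartPos := FoundPos + StrLen(Found) - StrLen(Found1)
}
MsgBox, % (Report = "" ? "No repeated words found." : Report)
```

Restarting at the second word rather than after the whole match is a design choice: it costs a little extra scanning but avoids missing the last word of a triple.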
Re: Script to Find Repeating Words
jballi wrote: this this
We see what you did there...
Re: Script to Find Repeating Words
Not the whole shebang, but it might give you a starting point.
Code: Select all
#NoEnv
#SingleInstance Force
#Persistent
SetBatchLines -1
SampleText =
(LTrim
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure
dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure
dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.
)
Result := findAllRepeatingWords(SampleText)
Gui Display: New, +AlwaysOnTop, Results
Gui Display: Margin, 4, 4
for word, timesRepeated in Result
{
    idx := A_Index
    res .= Format("{}: {} ({})`n", (idx < 10 ? "0" idx : idx), word, timesRepeated)
}
Gui Display: Add, Edit, w200, % res
Gui Display: Show, xCenter yCenter
Return
DisplayGuiClose:
DisplayGuiEscape:
{
    ExitApp
    return
}
findAllRepeatingWords(str) {
    str := RegExReplace(str, "\W", A_Space) ; replace non-word chars with spaces (gets rid of punctuation and line breaks). what about apostrophes?
    Words := StrSplit(str, A_Space)
    Reps := countRepetitions(Words)
    return pruneRepetitions(Reps)
}
; return words mapped to their number of occurrences
countRepetitions(Arr) {
    Result := {}
    for each, word in Arr
    {
        if (word != "")
        {
            if (Result.HasKey(word))
                Result[word]++
            else
                Result[word] := 1
        }
    }
    return Result
}
; get rid of single-occurrence words
pruneRepetitions(Arr) {
    Result := {}
    for word, timesRepeated in Arr
    {
        if (timesRepeated != 1)
            Result[word] := timesRepeated
    }
    return Result
}
Re: Script to Find Repeating Words
swagfag: Definitely not what I was looking for but your script provides information about a document that might be useful in the future. Thanks for sharing.