Hello, everybody who is way smarter than me. Is it possible to create a query which does the following:
Takes in a string of words (small one)
Deletes every word which does not conform to following:
1. Does not start with capital letter E
2. Contains anything other than letters and integers
3. Is shorter or longer than 8 characters
I know it sounds crazy, but I need it for work.
Thanks guys.
REGEX riddle Topic is solved
- AlphaBravo
Re: REGEX riddle
Olegreddo wrote:3. Is shorter or longer than 8 characters
Condition #3 is crazy!!
Re: REGEX riddle
Code: Select all
f := FileOpen("wordlist.txt", "r")
MsgBox, % RegExReplace(f.Read(), "((\b[^E`n]*?\b)|(\bE[A-Za-z0-9]*?\b(?CCallout)))")
Callout(Match) {
    return StrLen(Match) = 8
}
Code: Select all
Eagers
Eagerest
Ealdorman
eanlings
earaches
eardrops
Eardrums
earflaps
earful
earliest
Ear!obes
earlocks
earlship
earmarked
earmarks
Earmuffs
earnests
earnings
earphone
Earpiece
earplugs
earrings
Please excuse my spelling I am dyslexic.
Re: REGEX riddle
A little confused by what appears to be double negatives in your requirements.
I assume that a valid match is a word that:
1) Does not begin with Capital E
2) Is letters and numbers only (No decimal points, spaces or symbols)
3) Is exactly 8 characters long
Here is my regex: (\b[^E\s\W_][a-zA-Z0-9]{7}\b)
And here is the sample data - valid matches marked with a *
Code: Select all
Eagers
Eagerest
Ealdorman
* eanlings
earache_
e rdrops
Eardrums
* earflaps
earful
* earliest
Ear!obes
* earlocks
* earlship
earmarked
* earmarks
Earmuffs
* earnests
* earnings
* earphone
Earpiece
* earplugs
* earrings
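As a rough check of the pattern above, the following AutoHotkey v1 sketch lists every match in a short sample string. The sample words are taken from the list; the variable names (Haystack, Needle, Found, Pos) are illustrative assumptions, not part of the original post.

```autohotkey
; AutoHotkey v1 sketch: list every match of the pattern in a sample string.
; Variable names and the shortened word list are illustrative assumptions.
Haystack := "Eagerest eanlings earache_ earflaps earmarked Earmuffs"
Needle := "\b[^E\s\W_][a-zA-Z0-9]{7}\b"
Found := ""
Pos := 1
; RegExMatch returns the position of the next match, or 0 when there is none.
while (Pos := RegExMatch(Haystack, Needle, Match, Pos))
{
    Found .= Match "`n"
    Pos += StrLen(Match)    ; resume the search after the current match
}
MsgBox, % RTrim(Found, "`n")    ; shows eanlings and earflaps
```

Words starting with a capital E, words containing an underscore, and words that are not exactly 8 characters long are skipped, which is the behavior the starred list describes.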
Re: REGEX riddle
I believe the criteria is:
Keep every word that conforms to the following:
1. Does not start with capital E
2. Contains at least one non-alphanumeric character
3. Is not length 8
e.g. 'e_34567' and 'e_3456789' would match the criteria.
Please confirm or contradict this.
Perhaps provide examples of strings that do/don't match the criteria.
In any case, a script which can be easily modified,
if I have misunderstood:
Code: Select all
q::
vText := "e234567 e2345678 e23456789 E234567 E2345678 E23456789"
vText .= " e_34567 e_345678 e_3456789 E_34567 E_345678 E_3456789"
vOutput := ""
VarSetCapacity(vOutput, StrLen(vText)*2)
Loop, Parse, vText, %A_Space%
{
    vTemp := A_LoopField
    if (SubStr(vTemp, 1, 1) == "E") ;check first letter case sensitive
        continue
    if (StrLen(vTemp) = 8) ;check length
        continue
    if !RegExMatch(vTemp, "[^A-Za-z0-9]") ;check for any non-alphanumeric characters
        continue
    vOutput .= vTemp " "
}
vOutput := SubStr(vOutput, 1, -1)
Clipboard := vOutput
MsgBox % "done"
Return
Re: REGEX riddle
Most of them are length 8
If those were his reqs, it would be a very odd sample data set to provide. But then you get all sorts on here
Oh, that was Odin's data set. OK, yeah, I officially have no friggin clue what the reqs are.
Re: REGEX riddle
evilC wrote:Oh, that was Odin's data set. OK, yeah, I officially have no friggin clue what the reqs are.
The data set was taken from a previous answer that apparently has been deleted, although I don't think it was wrong. At least the approach in the deleted post was valid.
Re: REGEX riddle
Deciphering the OP's Match requirements:
1) String must start with a capital "E"
2) Can only be 8 Characters long
3) No special characters can be in the string, only Alpha and Digits
A Non-RegEx solution:
Code: Select all
StringCaseSense, On
For e, v in StrSplit(list, "`n", "`r")
    if v is alnum
        NewList .= (StrLen(v)==8?(SubStr(v,1,1)=="E"? v "`n":_):_)
Using Capn's list we get these results:
Code: Select all
Eagerest
Eardrums
Earmuffs
Earpiece
Re: REGEX riddle
You guys are amazing. I have, like, 10% of your brain capacity.
Re: REGEX riddle
jeeswg wrote:I believe the criteria is:
Keep every word that conforms to the following:
1. Does not start with capital E
2. Contains at least one non-alphanumeric character
3. Is not length 8
e.g. 'e_34567' and 'e_3456789' would match the criteria.
Please confirm or contradict this.
Perhaps provide examples of strings that do/don't match the criteria.
In any case, a script which can be easily modified,
if I have misunderstood:
Code: Select all
q::
vText := "e234567 e2345678 e23456789 E234567 E2345678 E23456789"
vText .= " e_34567 e_345678 e_3456789 E_34567 E_345678 E_3456789"
vOutput := ""
VarSetCapacity(vOutput, StrLen(vText)*2)
Loop, Parse, vText, %A_Space%
{
    vTemp := A_LoopField
    if (SubStr(vTemp, 1, 1) == "E") ;check first letter case sensitive
        continue
    if (StrLen(vTemp) = 8) ;check length
        continue
    if !RegExMatch(vTemp, "[^A-Za-z0-9]") ;check for any non-alphanumeric characters
        continue
    vOutput .= vTemp " "
}
vOutput := SubStr(vOutput, 1, -1)
Clipboard := vOutput
MsgBox % "done"
Return
You got all my requirements in reverse. Thanks for your code. Can you reverse it? Thanks
Re: REGEX riddle
Olegreddo wrote:You got all my requirements in reverse. Thanks for your code. Can you reverse it? Thanks
We have been saying that there are so many negatives in there that we are not sure which cancel which out.
Why not just repeat your reqs and make it less ambiguous? A number of people have gone to a bunch of effort to write POC code based off ambiguous reqs - the least you can do is repeat the reqs in an unambiguous format.
Re: REGEX riddle
Olegreddo wrote:Deletes every word which does not conform to [...] Does not start with capital letter E.
Do you see why people are confused?! What does "does not does not" mean!?
Provide some sample text!!!!!
Before and after!
Re: REGEX riddle
kon wrote:
Olegreddo wrote:Deletes every word which does not conform to [...] Does not start with capital letter E.
Do you see why people are confused?! What does "does not does not" mean!?
Provide some sample text!!!!!
Before and after!
evilC wrote:
Olegreddo wrote:You got all my requirements in reverse. Thanks for your code. Can you reverse it? Thanks
We have been saying that there are so many negatives in there that we are not sure which cancel which out.
Why not just repeat your reqs and make it less ambiguous? A number of people have gone to a bunch of effort to write POC code based off ambiguous reqs - the least you can do is repeat the reqs in an unambiguous format.
My apologies to everyone for not being 100% clear.
Here are the conditions:
The incoming string is a small number of words.
The outcome must be the word that meets all of these conditions:
1) String must start with a capital "E"
2) Can only be 8 Characters long
3) No special characters can be in the string, only Alpha and Digits
Example:
Incoming "RE: Insured Some Company, new GLPD water damage claim in NJ, E2D84587"
Outcome: "E2D84587"
Re: REGEX riddle
Here is your regex: \bE[a-zA-Z0-9]{7}\b
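To see how this pattern lines up with the three stated conditions, here is a minimal AutoHotkey v1 sketch that tests a few candidate words one at a time. The candidates are borrowed from test strings posted in this thread; the variable names (Words, Verdict, Report) are assumptions made for illustration. Because each word is tested on its own, the sketch anchors with ^ and $ instead of \b.

```autohotkey
; AutoHotkey v1 sketch: check single words against the three conditions.
; E            -> must start with a capital E (RegEx is case-sensitive by default)
; [a-zA-Z0-9]  -> only letters and digits are allowed
; {7}          -> 7 more characters, so 8 in total
Words := "e2345678 E234567 E2345678 E23456789 E_345678 E2D84587"
Report := ""
Loop, Parse, Words, %A_Space%
{
    ; ^ and $ tie the test to the whole word instead of a part of it.
    Verdict := RegExMatch(A_LoopField, "^E[a-zA-Z0-9]{7}$") ? "keep" : "drop"
    Report .= A_LoopField " -> " Verdict "`n"
}
MsgBox, % Report    ; only E2345678 and E2D84587 are reported as "keep"
```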
- AlphaBravo
Re: REGEX riddle
Deletes every word which does not conform to following:
1. Does not start with capital letter E
2. Contains anything other than letters and integers
3. Is not 8 characters long
Code: Select all
H = e234567 e2345678 e23456789 E234567 E2345678 E23456789
MsgBox % RegExReplace(H, "\b(E(?![a-zA-Z0-9]{7}\b)[a-zA-Z0-9]+\s?)|.", "$1")
Re: REGEX riddle
evilC wrote:Here is your regex: \bE[a-zA-Z0-9]{7}\b
Wow, great compact statement. The only problem is that it removes the part which I would like to keep. In other words it cuts out my desired string, instead of cutting out the unneeded part.
Re: REGEX riddle
AlphaBravo wrote:Deletes every word which does not conform to following:
1. Does not start with capital letter E
2. Contains anything other than letters and integers
3. Is not 8 characters long
Code: Select all
H = e234567 e2345678 e23456789 E234567 E2345678 E23456789
MsgBox % RegExReplace(H, "\b(E(?![a-zA-Z0-9]{7}\b)[a-zA-Z0-9]+\s?)|.", "$1")
Thanks for the help. However, even in your example it leaves in "E23456789" which is 9 characters, not 8.
Also, when I try my string "RE: Insured Some Company, new GLPD water damage claim in NJ, E2D84587" it fails.
Re: REGEX riddle Topic is solved
Oh so you want to strip them out of the string, not just match them?
What about the space before/after? Leave it? Remove one of them? Which one?
RegExReplace(haystack, "\bE[a-zA-Z0-9]{7}\b", "") will just strip out the codes.
RegExReplace(haystack, " ?\bE[a-zA-Z0-9]{7}\b", "") will strip out the code, plus an optional space before it.
Last edited by evilC on 03 Feb 2017, 15:49, edited 2 times in total.
Re: REGEX riddle
evilC wrote:Oh so you want to strip them out of the string, not just match them?
What about the space before/after ? leave it ? remove one of them? which one?
RegexReplace(haystack, "\bE[a-zA-Z0-9]{7}\b", "") will just strip out the codes.
Thanks a lot. So my goal is that when I run it on this string "RE: Insured Some Company, new GLPD water damage claim in NJ, E2D84587" the only word left in the new string is "E2D84587".
Re: REGEX riddle
Code: Select all
H := "RE: Insured Some e2D84587 Company, new E2D84589 GLPD water damage E23456789 claim in NJ, E2D84587"
RegExMatch(H, "\bE[a-zA-Z0-9]{7}\b", var)
MsgBox, % var ; E2D84589
MsgBox % RTrim(RegExReplace(H, "(\bE[a-zA-Z0-9]{7}\b\s?)|.+?", "$1")) ; E2D84589 E2D84587
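The RegExReplace call above works because the first alternative captures each wanted code into $1, while the second alternative (.+?) consumes everything else one character at a time and is replaced with an empty $1. If the codes are needed individually rather than as one string, a loop over RegExMatch is another option. The sketch below is only an illustration: it assumes AutoHotkey v1.1.21 or later for Push() and Length(), and the names Codes, Pos, and List are made up for this example.

```autohotkey
; AutoHotkey v1 sketch (assumes v1.1.21+ for Push/Length): collect each code separately.
H := "RE: Insured Some e2D84587 Company, new E2D84589 GLPD water damage E23456789 claim in NJ, E2D84587"
Codes := []
Pos := 1
; Each pass finds the next match at or after Pos; the loop ends when RegExMatch returns 0.
while (Pos := RegExMatch(H, "\bE[a-zA-Z0-9]{7}\b", Match, Pos))
{
    Codes.Push(Match)
    Pos += StrLen(Match)    ; continue searching after this match
}
List := ""
for Index, Code in Codes
    List .= Code "`n"
MsgBox, % "Found " Codes.Length() " code(s):`n" List    ; E2D84589 and E2D84587
```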