Hello All,
I have been struggling with regexmatch for a few days; trying to research on the community sites, but I am still just as stuck as ever.
I am self-taught, so please explain in as much detail as you can; it will help me out a lot.
Goal:
I am trying to pull data from a webpage using RegExMatch (I have looked into the HTML replaces, but that made even less sense to me). Right now, I have a script pulling all text from a webpage into a variable, and I am trying to use regex patterns to pull out the data I want. I have one working pattern, but someone else provided it. I am trying to adapt it, and I get the concept, but I can't build a line on my own that works.
Example:
I am trying to pull flight details from an entire body of text. The text has a segment of "Flight/Vessel (text needed) Place of Loading", and I am trying to grab everything between these two text points. My biggest issue is determining how to match literal text.
I have a working regexmatch on this same body of text that looks like this:
pulling from: RANDOMtext Business (B1)
RegExMatch(BODY, "(?<=\()(.{2,2})(?=\))", SVC)
SVC = B1
This pulls B1, the required text. I was given this line by someone else here in the community, but I have yet to fully understand AHK regex. I am somewhat familiar with patterns, as I have used Perl regex before; I just can't quite grasp AHK.
If anyone is up to the challenge, could you please explain where I am going wrong, or what I need to do? The more "pieces" you explain, the better, as I will be adapting any answers provided to other pieces of my script.
Thanks in advance!
RegExMatch Help Topic is solved
-
- Posts: 65
- Joined: 09 Apr 2018, 15:53
Re: RegExMatch Help
For matching stuff between parens, your result is stored in capture group 1:
Code: Select all
needle := "\(([^\)]+)"
\( ; match literal open-paren
( ; start capture group 1
[^\)]+ ; match anything that isn't a close-paren
) ; end capture group 1
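To see where that capture group ends up, here is a minimal AutoHotkey v1 sketch. The sample text is the one from the first post; the variable names haystack and match are only illustrative. In v1, the third parameter of RegExMatch is an output variable: it receives the whole match, and each capture group lands in the same name with a number appended.

```autohotkey
; AutoHotkey v1 sketch. haystack/match are illustrative names, not from the thread.
haystack := "RANDOMtext Business (B1)"
needle := "\(([^\)]+)"

; RegExMatch returns the position of the match (0 if nothing matched).
if (RegExMatch(haystack, needle, match))
{
    ; match  = everything the pattern matched, including the literal "("
    ; match1 = capture group 1, the text after the "("
    MsgBox, % "Whole match: " match "`nGroup 1: " match1
}
```

The whole match should be (B1 because the needle consumes the open-paren but stops before the close-paren, while match1 should be just B1.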
Re: RegExMatch Help
I apologize, I do not think I stated the line well - Flight/Vessel ... Place of Loading - There are no parentheses or quotes on this line. I need everything between these two literal words/phrases in a huge body of text, this is just a small segment representing the data I need.
How do I match a literal word/phrase?
I keep seeing \btext\b to match exact text, but I have tried this and can't get it to work.
Pulled from another post:
You can use the anchor \b for word boundaries:
"\bonly these words\b"
I have tried things similar to RegExMatch(BODY, "\bFlight/Vessel\b(.*)\bPlace of Loading\b", FLT)
but it didn't work as expected.
Re: RegExMatch Help Topic is solved
I see.
This regex does exactly what you're expecting it to, that is, grabbing everything between those two literal text delimiters:
Code: Select all
RegExMatch(BODY, "\bFlight/Vessel\b(.*)\bPlace of Loading\b", FLT)
Can you show how it's not doing what is expected of it?
Otherwise, with lookarounds and no capturing groups: (?<=\bFlight/Vessel \b).*(?=\b Place of Loading\b)
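A rough AutoHotkey v1 sketch of the lookaround version, using the sample line from this thread. One change that is my own assumption rather than part of the pattern above: .* is swapped for the lazy .*? so the match stops at the first "Place of Loading" if the phrase appears more than once in the text.

```autohotkey
; AutoHotkey v1 sketch; the sample text is the test line used in this thread.
BODY := "Flight/Vessel this is a test Place of Loading"

; (?<=...) and (?=...) only assert that the delimiters are present;
; they are not part of the match, so FLT holds just the text in between.
; .*? (lazy) is an assumption added here; the pattern above uses greedy .*
if (RegExMatch(BODY, "(?<=\bFlight/Vessel \b).*?(?=\b Place of Loading\b)", FLT))
    MsgBox, % "[" FLT "]"
```

Because the spaces sit inside the lookarounds, FLT should come back without leading or trailing spaces. Note that the \b after "Flight/Vessel " expects the wanted text to start with a word character, and . does not cross line breaks unless the s option is used, so a multi-line body may need adjusting.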
Re: RegExMatch Help
Welp, after you responded saying this should work, I started to copy over my entire code with the examples I have... only to stop and notice that my variables FTL and FLT are not the same. I had been using a different variable the entire time.
However, now that the variables are matched up and it is working better, I still have a quick additional question.
Using the text:
Flight/Vessel this is a test Place of Loading
I want to pull specifically "this is a test", but the regex is pulling the whole line. (RegExMatch(BODY, "\bFlight/Vessel\b(.*)\bPlace of Loading\b", FLT))
Here is my whole janky testing code. I am just trying to manipulate the above line to pull only what is between those two phrases; currently it is pulling everything from Flight to Loading (including "Flight/Vessel" and "Place of Loading").
Code: Select all
!h::
SetKeyDelay, 1
count++
if (count = "1")
{
    MsgBox, 262144,, Select (click on) the MS Word window, then hit ok
    WinGetTitle, WORD, A
    Sleep, 250
    MsgBox, 262144,, Select (click on) the Note Pad window, then hit ok
    WinGetTitle, NOTE, A
    Sleep, 250
}
WinActivate, %WORD%
Sleep, 50
Send, ^a
Sleep, 50
Send, ^c
Sleep, 50
HTML:=RegExReplace(clipboard,"\R")
Sleep, 100
clipboard = ;
; FLT:= "test"
WinActivate, %NOTE%
Sleep, 50
RegExMatch(HTML, "\bFlight/Vessel\b(.*)\bPlace of Loading\b", FLT)
Sleep, 50
Send, %FLT%
Return
EDIT: I think I found the solution. It looks like the variable FLT holds the match for the whole pattern, while each parenthesized subpattern is stored in a numbered variable (i.e. FLT1, FLT2, etc., depending on the number of (...) groups within the pattern). I think I am on the right track; correct me if I am wrong, or elaborate more if needed. Though I think you have answered my questions.
Thanks, Swag
Re: RegExMatch Help
Yes, it was due to the capturing group, gj
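For anyone landing here later, a small AutoHotkey v1 sketch of the difference between FLT and FLT1, using the test line from this thread. The flight variable and the Trim() call are additions of mine for illustration; they are not in the original script.

```autohotkey
; AutoHotkey v1 sketch; BODY is the test line from this thread.
BODY := "Flight/Vessel this is a test Place of Loading"
RegExMatch(BODY, "\bFlight/Vessel\b(.*)\bPlace of Loading\b", FLT)

; FLT  = the whole match, delimiters included
; FLT1 = capture group 1, the text between the delimiters,
;        still carrying the spaces on either side
flight := Trim(FLT1)  ; "flight" is a hypothetical name; Trim strips the spaces

MsgBox, % "FLT: [" FLT "]`nFLT1: [" FLT1 "]`nTrimmed: [" flight "]"
```

In the hotkey script above, that would presumably mean sending %FLT1% (or a trimmed copy of it) instead of %FLT%.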