Collecting vars directly from string instead of notepad

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
Hybridnyckel
Posts: 11
Joined: 24 May 2017, 08:53

Collecting vars directly from string instead of notepad

15 Aug 2017, 02:38

Hi!
I'm collecting data from a website with a procedure similar to this: select all, copy, paste into Notepad, then search for a static word that is always 3 rows above the value I want.

Code: Select all

Send, ^a^c
ClipWait
Run, Notepad.exe
WinWaitActive, Notepad ; btw, is this the best way to ensure that next row doesn't execute too early?
Send, ^v
sleep 100
clipboard =
Send, ^{Home}^f
Sleep, 100
Send, static word{Enter}!{F4}
Sleep, 100
Send, {Home}{Down 3}^{Right}^+{Left}^c ; this right-then-left-thing is to avoid an initial tab. Guess I could Trim it instead.
ClipWait
var1 := clipboard
clipboard =
I guess it's a lot faster doing all this without pasting and reading from notepad. The text is quite long, and only a few rows are of any interest. Any ideas on how to do this? IfInString? Loop, Parse? Thanks in advance.
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Collecting vars directly from string instead of notepad

15 Aug 2017, 02:55

That could be done with a single RegExMatch command.
Hybridnyckel
Posts: 11
Joined: 24 May 2017, 08:53

Re: Collecting vars directly from string instead of notepad

15 Aug 2017, 03:45

Any suggestions on how to do that? Getting the position of "static word" returns a value like 34653 or something like that. How do I get the content of the row three rows below the row at that position? Isn't it possible to work with rows instead of positions?
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Collecting vars directly from string instead of notepad

15 Aug 2017, 08:00

You don't get a position. The RegExMatch statement would return the exact text you're looking for by finding the static word as well as the appropriate number of CR/LF characters (along with the unknown text between them) then get the text you want. Don't think about it as knowing what position or row in which it appears. The RegExMatch will make that invisible to you. Like I said, one command, not a loop and series of statements or anything like that.
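
To make the idea above concrete, here is a minimal sketch rather than a definitive solution. The sample text in Haystack is invented for illustration; in practice it would be the clipboard contents. It assumes the wanted value is the first run of non-whitespace characters on the third row below the static word. `[^\r\n]*` is used instead of `.*` so that a row can never run past a line break, whichever newline style the text uses.

```autohotkey
; Assumed sample text; with real data use: Haystack := clipboard
Haystack := "header`r`nBla static word`r`nrow one`r`nrow two`r`n`t42 units`r`nfooter"

; (?:[^\r\n]*\R){3}  skips the rest of the static-word row plus the next two rows
; [ \t]*             skips a leading tab or spaces on the target row
; \K                 drops everything matched so far from the result
; \S+                the value itself
if (RegExMatch(Haystack, "static word(?:[^\r\n]*\R){3}[ \t]*\K\S+", var1))
    MsgBox % var1   ; shows 42
else
    MsgBox, Static word or value not found.
```

RegExMatch still returns a position (0 when nothing matches), which is what the `if` tests here; the text itself lands in var1.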
Hybridnyckel
Posts: 11
Joined: 24 May 2017, 08:53

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 02:54

All the examples in the RegExMatch documentation return a position. One of them also stores a subpattern in a variable. Is that what I should be using? I guess I first want to get a string that begins with my static word and ends a few linefeeds later.
I'm trying to do something like

Code: Select all

Haystack := clipboard
MsgBox % RegExMatch(Haystack, "m)static word.*`r`n.*`r`n.*`r`n.*(/d)", Outputvar)
to get the first digit on the third row beneath the static word row, but I don't really know what I'm doing. :?
Hybridnyckel
Posts: 11
Joined: 24 May 2017, 08:53

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 03:05

For example
If I copy this entire post to the clipboard and use it as haystack
how do I use the word Green in my NeedleRegEx
to pick up the third word
two lines below that word,
so that the result is an outputvar with the value "below"?
Guest

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 03:42

It can ALSO return a position, but do study the Outputvar in this example and read the docs, note the difference between Outputvar and Outputvar1 here

Code: Select all

Haystack := "Elvis Presley`nImpersonators Reflect`non His Legacy 40 Years After..."
RegExMatch(Haystack, "m)Elvis.*`n.*`n.*(\d\d)", Outputvar)
MsgBox % Outputvar "`n----------`n" Outputvar1
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 05:20

Nice example showing pretty much exactly what the OP wants. Note that you can store the exact target match in Outputvar directly by putting \K in front of it. \K drops everything matched up to that point from the result, so only the part after it is stored:

Code: Select all

Haystack := "Elvis Presley`nImpersonators Reflect`non His Legacy 40 Years After..."
RegExMatch(Haystack, "Elvis.*`n.*`n.*\K\d\d", Outputvar)
MsgBox % Outputvar
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 05:34

Hybridnyckel wrote: For example
If I copy this entire post to the clipboard and use it as haystack
how do I use the word Green in my NeedleRegEx
to pick up the third word
two lines below that word,
so that the result is an outputvar with the value "below"?

Code: Select all

Haystack =
(
For example
If I copy this entire post to the clipboard and use it as haystack
how do I use the word Green in my NeedleRegEx
to pick up the third word
two lines below that word,
so that the result is an outputvar with the value "below"?
)
RegExMatch(Haystack, "U)Green.*\n.*\n.*\w+\s+\w+\s+\K\w+(?=\s)", Outputvar)
MsgBox % Outputvar
Hybridnyckel
Posts: 11
Joined: 24 May 2017, 08:53

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 08:47

Oh thanks a lot Guest & boiler! It's getting clearer now! The only thing I don't understand is how the Ungreedy option works in this case. Tried to remove it and then got Outputvar = "value". I don't get how it gets there!
There's one more thing before I'm ready to go. Sometimes there is an extra line with alphabetic characters between the line of my static word and the line of my wanted var, and sometimes an empty line, so the section of text can have four different patterns:

Bla Bla Blastatic Word
Occasional line with some word and a ":" and some more words. No digits.
Occasional empty line
[tab or spaces] one or two digits that are to become the value of my var.

I'm ending the needle with "\K\d\d?" to put the 1 or 2 digits in the var. But how do I make the needle flexible, so that the two different occasional lines don't affect the outcome?
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Collecting vars directly from string instead of notepad

16 Aug 2017, 09:36

I had to use Ungreedy because it wouldn't stop at finding the very next word after the first two. And once I made it ungreedy, I had to add the look-ahead for whitespace after it so it would continue including the whole word until the space and not just the first letter found.
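
A likely reason the greedy version overshoots, offered as a sketch and not as the only explanation: per the AHK RegEx documentation the default newline is `r`n, so a dot also matches a lone `n, and a continuation section joins its lines with lone `n. A greedy `.*` can therefore run across lines. Assuming that is what happens here, switching the newline to `n with the `n) option keeps each `.*` on its own line, and the needle works without Ungreedy or the look-ahead:

```autohotkey
Haystack =
(
how do I use the word Green in my NeedleRegEx
to pick up the third word
two lines below that word,
so that the result is an outputvar with the value "below"?
)

; `n) makes a lone linefeed the newline, so "." no longer matches it
; and each ".*" stays within a single line.
RegExMatch(Haystack, "`n)Green.*\n.*\n\w+\s+\w+\s+\K\w+", Outputvar)
MsgBox % Outputvar   ; shows below
```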

What you're now describing complicates it a good bit. With some more complex RegEx needle definitions, it should be possible. I believe it would be relatively advanced, though. In that case, you might just want to use AHK to identify the correct line that contains your desired data, using RegEx or InString on each line or something.
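
The line-by-line approach suggested above could look something like this. It is only a sketch: the sample text and the variable names found, digits and var1 are made up for illustration, and it assumes the wanted value is the first line after the static word that starts with one or two digits (after optional tabs or spaces).

```autohotkey
; Assumed sample text covering the "extra line" and "empty line" cases.
Haystack := "header 11`r`nBla Bla Blastatic Word`r`nNote: some more words`r`n`r`n`t 42`r`nfooter 99"

found := false
var1 := ""
Loop, Parse, Haystack, `n, `r   ; split on linefeeds, strip carriage returns
{
    if (!found)
    {
        ; Still looking for the line that contains the static word.
        if (InStr(A_LoopField, "static Word"))
            found := true
        continue
    }
    ; After the static word: take the first line that begins with digits.
    if (RegExMatch(A_LoopField, "^\s*\K\d{1,2}", digits))
    {
        var1 := digits
        break
    }
}
MsgBox % var1   ; shows 42
```

Lines without leading digits, including the occasional text line and the empty line, are simply skipped, so all four patterns are handled by the same loop.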

"\d\d" will match exactly two digits (i.e., it won't match your number if it's a single digit). You want to use "\d+" to allow it to find one or more digits in a row. Or use "\d{1,2}" so that it will match only 1 or 2 digits and not more (so if it found "123" it would match only "12"). In your case, the more generic "\d+" should be fine.
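
A quick way to see the difference between the three needles; the sample strings and variable names are made up for the demonstration:

```autohotkey
RegExMatch("value: 123", "\d\d", exactlyTwo)      ; exactlyTwo = 12
RegExMatch("value: 123", "\d+", oneOrMore)        ; oneOrMore  = 123
RegExMatch("value: 123", "\d{1,2}", oneOrTwo)     ; oneOrTwo   = 12
RegExMatch("value: 7", "\d\d", singleDigit)       ; no match, singleDigit is empty
RegExMatch("value: 7", "\d{1,2}", singleOk)       ; singleOk   = 7

MsgBox % exactlyTwo "`n" oneOrMore "`n" oneOrTwo "`n[" singleDigit "]`n" singleOk
```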
Hybridnyckel
Posts: 11
Joined: 24 May 2017, 08:53

Re: Collecting vars directly from string instead of notepad

20 Aug 2017, 13:21

Kind of solved it, but not for AHK.
"static.Word[\s\S]*?(\d{1,2})" works fine when I try it in Expresso (a regex tool). [\s\S]*? works as a wildcard that includes newlines. The question mark makes it ungreedy, so that it stops at the next match rather than the last one in the string. This seems to be the most common solution suggested on Stack Overflow.
(\d{1,2}) matches one or two digits.
Works perfectly in Expresso, but not in AHK. Is there something I can do to make it compatible?
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Collecting vars directly from string instead of notepad

20 Aug 2017, 21:25

I don't know anything about Expresso, but make sure it's set to use the PCRE flavor of regular expressions. If you use regex101.com to develop and test your regular expressions (with PCRE selected, which is the default), then they should work in AHK.
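
For reference, the needle from the previous post is valid PCRE as far as I can tell, so two guesses at what might differ in AHK (both are assumptions, since the thread doesn't show the failing script): AHK's RegEx is case-sensitive unless the i) option is given, and a parenthesized subpattern is stored in Outputvar1, not in Outputvar. A sketch with invented sample text:

```autohotkey
; Assumed sample text with an extra text line and an empty line.
Haystack := "Bla Bla Blastatic Word`r`nSome label: more words`r`n`r`n`t 12`r`nfooter 99"

; i) ignores case; [\s\S]*? lazily skips anything, including newlines.
if (RegExMatch(Haystack, "i)static.word[\s\S]*?(\d{1,2})", Outputvar))
    MsgBox % Outputvar1   ; shows 12 (Outputvar holds the whole match)
else
    MsgBox, No match.
```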
