Wishes for ~= (RegExMatch)

Coco
Posts: 771
Joined: 29 Sep 2013, 20:37

02 Nov 2013, 12:47

What would be nice is to have the RegEx shorthand (~=) return an object if the O option is specified in the RegEx pattern, or blank if no match is found.
Example (not implemented at the moment):

Code: Select all

match := ("Hello123" ~= "O)\d+") ; 'O' is specified
MsgBox, % match.Pos "`n" match.Value
Wade Hatler
Posts: 60
Joined: 03 Oct 2013, 19:49
Location: Seattle Area

Re: Why there's no i) setting for ~= command?

02 Nov 2013, 17:26

I love the ~= syntax myself and use it all the time, but I also frequently wish I could get at the results. The only problem with the match syntax above is that it doesn't allow you to check for the results easily because an object is treated as false in an IF statement, so in most cases I think it would just move the problem downstream.

For example, what I currently do all the time is something like:

Code: Select all

if (RegexMatch(Haystack, "Needle", R_)) {
    DoSomethingWith(R_1)
}
If I don't actually need substring results, then I would shorten it to:

Code: Select all

if (Haystack ~= "Needle") {
    DoSomething()
}
So if the expression returned an object, it would more or less destroy the easy utility of using the result in an IF statement, and it would break a lot of existing code.

I can think of a couple alternatives, one of which is a little bit weird. You could simply allow an extra parameter after the regex pattern, something like the third parameter in RegexMatch, and if present it gets filled in. The first example would look something like:

Code: Select all

if (Haystack ~= "Needle", R_) {
    DoSomethingWith(R_1)
}
This could be done, but it suffers from the obvious problem that it's really weird syntax unlike anything else in the language.

Another alternative would be to follow the Perl convention of creating a global (or actually, a local variable with a common name) that contains the match of the most recent Regex. In Perl, you have something like $1, $_, etc., and they get reused every time you evaluate another expression. I would like this, although you would suffer a tiny performance hit in the cases where you don't need the results. I doubt that it would be a big hit, but it would be non-zero. If you simply always passed an implicit $ to the existing RegexMatch for this syntax, it would do pretty much the same thing as the R_ in my example.

That seems like a good solution for me, although I don't actually find the current syntax to be terribly troubling anyway. AHK Regexs are an absolute joy to use compared to most other regex flavors (and I've used a bunch), so I'm pretty happy with them the way they are.
lexikos
Posts: 9637
Joined: 30 Sep 2013, 04:07

Re: Wishes for ~= (RegExMatch)

02 Nov 2013, 20:55

I've split your posts from the original thread, which has been moved.

~= is something I added impulsively, to prove that it was possible (and at the time, required only a few lines of code). Aside from a mention in the changelog, it was undocumented for nearly two years. I kept it because of its low cost and because a few people were using it, but in retrospect, I regret adding it.

Note that ~= is documented as "Shorthand for RegExMatch." It is low cost because it is simply replaced with a function call, and is not handled at run-time. Differentiating between ~= and RegExMatch would increase the cost of ~= in terms of code size, if only by a small amount.
Coco wrote: What would be nice is to have RegEx shorthand (~=) return an object if the O option is specified in the RegEx pattern
That would require differentiating between ~= and RegExMatch, unless the behaviour of RegExMatch is also changed. A better idea might be simply to return an object if the O option is specified and UnquotedOutputVar is omitted. However, the O option is currently invalid in v2 since it is the default mode.
Wade Hatler wrote: an object is treated as false in an IF statement
Your statement is false... Objects are treated as true. No object would be returned if there is no match (as with the output var currently), so boolean expressions would continue to work. What might have confused you is that if you try to treat an object as a string or number, you get an empty string. This behaviour does not apply to boolean operators or If/While/Until statements.
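A minimal v1.1 sketch of the distinction described above, assuming the O option is used so that the output variable receives a match object. The variable name m is only illustrative.

```autohotkey
; AutoHotkey v1.1 sketch; "m" is an illustrative variable name.
RegExMatch("Hello123", "O)\d+", m)  ; with the O option, m receives a match object

if (m)  ; boolean context: an object counts as true
    MsgBox, m is treated as true

; String context: concatenating the object itself yields an empty string,
; so nothing appears between the brackets.
MsgBox, % "[" m "]"

; The matched text has to be read through the object instead.
MsgBox, % "[" m.Value(0) "]"
```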
Wade Hatler wrote: Another alternative would be to follow the Perl convention of creating a global (or actually, a local variable with a common name) that contains the match of the most recent Regex.
I'm against that. I considered allowing x~="(?<foo>bar)" to write into the 'foo' variable, since it would probably require only a small modification of the code used for the UnquotedOutputVar parameter. However, that would require a more complex implementation of ~=, unless the change was to also affect RegExMatch. (Edit: Also, that functionality doesn't exist in v2, which always uses the object mode.)


I typically don't use ~=, because its meaning is obscure and quite often any time saved is lost by having to convert the expression to a function call to add parameters. Instead, I use AutoHotkey as it was originally intended! :P

Code: Select all

:*:rem(::RegExMatch(
:*:rer(::RegExReplace(
Wade Hatler

Re: Wishes for ~= (RegExMatch)

02 Nov 2013, 23:04

That's funny. I have almost those exact same hotstrings ;)

Good to know about the objects. I thought sure they were treated as false, but I must have confused some different tests.

All in all, I'm super-happy with the current regex support, so all is well.
Wade Hatler

Re: Wishes for ~= (RegExMatch)

02 Nov 2013, 23:46

P.S. I don't actually use it because it's easier to type. I use it because it's easier to read later. The typing is pretty much a wash.
lexikos

Re: Wishes for ~= (RegExMatch)

03 Nov 2013, 00:42

Wade Hatler wrote: I use it because it's easier to read later.
I avoid using it because it isn't...
Coco

Re: Wishes for ~= (RegExMatch)

03 Nov 2013, 01:08

lexikos wrote: A better idea might be simply to return an object if the O option is specified and UnquotedOutputVar is omitted. However, the O option is currently invalid in v2 since it is the default mode.
How about this (for v1.1+ only):
Specifying O returns an object if no UnquotedOutputVar is specified. Specifying an UnquotedOutputVar keeps the current behavior: return Pos and store a MatchObject in UnquotedOutputVar. This would work for ~= and would not break existing scripts (hopefully, or the damage should be minimal), as most scripts that use the O option would most likely have an UnquotedOutputVar specified. It would also, in a way, ease the transition to v2, as most users would get used to using a MatchObject (only if they specify O). However, I'm not sure how much code this change would cost; you be the judge, lexikos.
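For illustration only, the proposed return value can be approximated today in v1.1 with a small wrapper. RegExMatchO is a hypothetical helper name, not a built-in, and the sketch assumes the caller includes the O option in the pattern.

```autohotkey
; Hypothetical helper for AutoHotkey v1.1, not a built-in function.
; Assumes the caller includes the O option in NeedleRegEx (e.g. "O)\d+");
; without it, Match would hold the matched text instead of an object.
RegExMatchO(Haystack, NeedleRegEx, StartingPos := 1) {
    if (RegExMatch(Haystack, NeedleRegEx, Match, StartingPos))
        return Match  ; match object, treated as true in If/While/Until
    return ""         ; blank when there is no match
}

; Usage: the result works both as the condition and as the match object.
if (match := RegExMatchO("Hello123", "O)\d+"))
    MsgBox, % match.Pos(0) "`n" match.Value(0)
```

This only mimics the proposal in script code; it says nothing about what changing RegExMatch or ~= itself would cost.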

~= is convenient for v2, since if var in/contains has been removed there. I've been using ~= to replicate the removed behavior.

Code: Select all

if (var ~= "i)(red|yellow|blue)") ; if var contains
if (var ~= "i)^(red|yellow|blue)$") ; if var in
lexikos

Re: Wishes for ~= (RegExMatch)

03 Nov 2013, 16:31

Users can already get used to using a MatchObject. How will using the O option to return an object ease the transition to v2, when v2 doesn't have the O option and doesn't return an object (but stores it in the output var)?
