Regular Expressions (RegEx) for AutoHotkey


112 replies to this topic

Poll: What should the names of the RegEx functions be (if you HAD to pick one of these)? (42 member(s) have cast votes)


  1. RegExMatch() and RegExReplace() (43 votes [84.31%])


  2. RegMatch() and RegReplace() (8 votes [15.69%])


majkinetor
  • Moderators
  • 4512 posts
  • Last active: Jul 29 2016 12:40 AM
  • Joined: 24 May 2006

Modifiers of replace strings

This is a cool idea. Why can't we extend it even further and use regular string functions, so that instead of "$u2" we write "ToUpper($2)"?

So the replacement part of the RegEx would be treated as a "replacement expression" that takes $N (N >= 1) as input.

Finally, the performance of StringReplace has been improved a lot, as well as the overall performance of the := operator when long strings are involved.

:D

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005

Why can't we extend it even further and use regular string functions, so that instead of "$u2" we write "ToUpper($2)"?

I meant that for simple, fairly frequent operations. For more complex ones, I still think my proposal to accept a function as the replace parameter would be very flexible. I didn't post it again, but I hope Chris will consider it for some future release.
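Until something like that exists, the idea can be approximated by hand. The following is only a minimal sketch in AutoHotkey v1 syntax: ReplaceEachMatch and TitleCase are hypothetical names invented for this illustration, not built-in functions, and TitleCase assumes that capture group 1 is the first character of the match.

```autohotkey
; Hypothetical callback (not a built-in): gets the whole match and capture
; group 1, and returns the text that should replace the match.
; Assumes group 1 is the first character of the match.
TitleCase(whole, first)
{
    StringUpper, first, first              ; upper-case group 1 in place
    return first . SubStr(whole, 2)        ; append the rest of the match unchanged
}

; Hypothetical helper: emulates "a function as the replace parameter" by
; walking the matches with RegExMatch and calling TitleCase on each one.
ReplaceEachMatch(haystack, needle)
{
    result := ""
    pos := 1
    Loop
    {
        found := RegExMatch(haystack, needle, m, pos)
        if (found = 0 or m = "")           ; no more matches (or an empty match): stop
            break
        result .= SubStr(haystack, pos, found - pos)   ; text before this match
        result .= TitleCase(m, m1)                     ; computed replacement
        pos := found + StrLen(m)                       ; continue after the match
    }
    return result . SubStr(haystack, pos)  ; keep whatever follows the last match
}

MsgBox % ReplaceEachMatch("hello big world", "(\w)\w*")
```

With this input the message box should show "Hello Big World". The loop stops on an empty match so that a pattern able to match nothing cannot spin forever; a real implementation would need a more careful rule there.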
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

majkinetor
You are right. That is definitely a must-have.

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004

Well, I could complete the chart so you just have to copy and paste it.

Although I probably wouldn't copy & paste the chart due to length reasons, some of its content may well be used -- at least to inspire examples and descriptions.

So feel free to work on it more, especially if there will be more links to it from the forum/web.

PhiLho
I must file a bug report on the latest beta, the one that puts the options at the start of the RE.
In the previous version, extended REs (multiline, with comments) worked well. With this version, they are broken. I first saw this in a complex regex: [TIPS] A collection/library of regular expressions, but I was able to reproduce it with a much simpler RE. It seems it's the comments that break the regex. Perhaps the option parser looks too far ahead. I don't know how you coded it, but I think it should stop either at the first opening parenthesis or at the first character that isn't in the list of option characters.
s = Booo    Gaaah
re = i)^(\w+)\s+(\w+)$ ; This works
re = ; This works too
(
ix)
^
(\w+)
	\s+
(\w+)
$
)
re = ; This is broken
(
ix)
^
# Capture #1
(\w+)
	\s+
# Capture #2
(\w+)
$
)
re = ; This shows the issue, the expression stops on the #
(
ix)
^
(\w+) #
	\s+
(\w+)
$
)
fp := RegExMatch(s, re, out)
MsgBox (%out%) %out1% & %out2%


majkinetor
This is interesting:

I am pleasantly surprised that this works, as it doesn't work in some other RE software I tried.

Remove empty lines
var = 
(
First text line




Last text line	
)

v := RegExReplace(var, "[`n]+", "`n")
msgbox %v%

Now try this replacement there instead:
v := RegExReplace(var, "$", "1")

You will get:
First text line




Last text line1
In EditPlus and sed I get:
First text line1




Last text line1


I am still confused about why I don't get:

First text line1
1
1
1
1
Last text line1

It appears that empty lines don't have ^ and $ "fields".

The same thing happens if you swap $ for ^, but in the opposite direction. It appears that AHK treats this variable as a single line.

So the fact that AHK sees several lines without $ fields is what lets [`n]+ match, which is very cool.

Compatibility is in question though....

majkinetor
There is also a bug in the new version.

var := "aaaaaaab"
StringReplace i, var, b, a
msgbox %i%

Returns unchanged var.

PhiLho
majkinetor, there is no bug in AHK, only in your code: unlike RegExReplace, StringReplace replaces only the first occurrence by default.
var := "aaaaaaabbbbbbbbbaaaaaa"
StringReplace i, var, b, , All
msgbox %i%
I often make the same mistake...
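To make the difference concrete, here is a small side-by-side sketch in AutoHotkey v1 syntax. The variable names and the sample string are made up for the illustration.

```autohotkey
var := "aXbXc"

; Without the All option, StringReplace changes only the first occurrence.
StringReplace, once, var, X, -

; With All, every occurrence is replaced.
StringReplace, every, var, X, -, All

; RegExReplace replaces every match unless its Limit parameter says otherwise.
viaRegEx := RegExReplace(var, "X", "-")

MsgBox once = %once%`nevery = %every%`nviaRegEx = %viaRegEx%
```

Here once should come out as a-bXc, while every and viaRegEx should both be a-b-c.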

And for $ to work on each line, you have to:
1) Specify the multiline option, m (otherwise $ matches only at the end of the string);
2) Change the end-of-line symbol with the `n option, since continuation sections join their lines with `n.

var =
(
First text line




Last text line   
)

v := RegExReplace(var, "m`n)$", "1")
msgbox %v%
