jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

jeeswg
Posts: 2190
Joined: 19 Dec 2016, 01:58
Location: UK


11 Feb 2017, 22:16

[this page was called 'RegEx handy examples (RegExMatch, RegExReplace)']

I've tried to collect all of the most important RegEx techniques that I've used or may like to use.

Please notify me of any corrections or improvements/simplifications.
Do post any handy examples not included, or links to other examples.
Also do mention any useful techniques doable in RegEx but not mentioned in AutoHotkey's help.

Btw if other people want to create their own lists of RegEx handy samples, including some that are closely/loosely based on mine, please go ahead and post your link here.
For example at the extreme end: copying the code, but only changing the variable names, is fine.

522 lines (initially)




LINKS (DOCUMENTATION):

Regular Expressions (RegEx) - Quick Reference
https://www.autohotkey.com/docs/misc/RegEx-QuickRef.htm
RegExMatch
https://www.autohotkey.com/docs/commands/RegExMatch.htm
RegExReplace
https://www.autohotkey.com/docs/commands/RegExReplace.htm
Regular Expression Callouts
https://autohotkey.com/docs/misc/RegExCallout.htm
SetTitleMatchMode
https://www.autohotkey.com/docs/commands/SetTitleMatchMode.htm#RegEx

LINKS (SPECIFIC EXAMPLES):
[remove items from a list if they start with a particular character]
Help with RegExReplace - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=33768

[trim numbers]
ZTrim() : Remove redundant leading/trailing zeroes from a number - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=33960&p=159193#p159193

[using RegEx to repeat a string]
Replicate() : Repeats a string N times - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=33977&p=157428#p157428

[backreferencing within needles]
RegExReplace More than One Needle in a Script? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=35229

[only keep lines that contain string (replace lines that don't contain string)]
TF library TF_RegExReplaceInLines - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=36575

[remove consecutive duplicate lines]
Put here requests of problems with regular expressions - Page 9 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-9#entry94923

[find a 3-digit number that is not '008']["\d{3}(?<!008)"]
Put here requests of problems with regular expressions - Page 20 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-20

[need a regex to extract the $100 if the string contains HELLO and WORLD, otherwise, extract last word]
Put here requests of problems with regular expressions - Page 29 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-29

Put here requests of problems with regular expressions - Page 63 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-63

Code:

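;A) anchors the needle, and here each successive match starts where the previous one ended, so every group is replaced
msgbox % RegExReplace("123 456 789", "A)(\d)\d\d ?", "$1") ; outputs "147"
;^ only matches at the very start of the haystack, so only the first group is replaced
msgbox % RegExReplace("123 456 789", "^(\d)\d\d ?", "$1") ; outputs "1456 789"
;with the m) (multiline) option, ^ also matches at the start of each line
msgbox % RegExReplace("123`n456`n789", "`nm)^(\d)\d\d\R?", "$1") ; outputs "147"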
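
As a rough illustration of the "\d{3}(?<!008)" needle listed among the links above, here is a minimal sketch in AutoHotkey v1 syntax. The haystacks are made-up sample text, not taken from the linked thread.

```autohotkey
; The look-behind (?<!008) is checked after \d{3} has matched,
; so it rejects a match whose three digits are exactly "008".
RegExMatch("008 123", "\d{3}(?<!008)", vMatch)
MsgBox % vMatch ; 123

; Caveat: in a longer run of digits the match can simply start one character later.
RegExMatch("0081", "\d{3}(?<!008)", vMatch)
MsgBox % vMatch ; 081
```

If the number has to stand alone, word boundaries (\b) around the digits would probably be needed as well.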


LINKS (SYNTAX NOT MENTIONED IN https://autohotkey.com/docs/misc/RegEx-QuickRef.htm):
[the \G anchor]
Add Thousands Separator - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/50019-add-thousands-separator/

[*ACCEPT: quit if error]
Default/Portable installation StdLib - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=13&t=10434&p=74978#p74978
Default/Portable installation StdLib - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=13&t=10434&p=75262#p75262

[*SKIP]
Put here requests of problems with regular expressions - Page 44 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-44
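
To give a feel for the \G anchor mentioned in this list, here is a small sketch in AutoHotkey v1 syntax. The sample haystack is made up, and the example is only meant to show the general behaviour, not the method used in the linked thread.

```autohotkey
; \G only matches where the previous match ended (or at the start of the haystack),
; so replacing stops at the first item that does not fit the needle.
MsgBox % RegExReplace("123 456 abc 789", "\G(\d)\d\d ?", "$1") ; 14abc 789

; Without \G the search skips over "abc " and carries on.
MsgBox % RegExReplace("123 456 abc 789", "(\d)\d\d ?", "$1") ; 14abc 7
```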

LINKS:
[get all matches]
extracting items from a list using RegEx (various methods) (get all matches) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=30448

[\Q, \E and escaping characters with backslashes]
simplest way to make a RegEx needle literal? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=30420
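
A minimal sketch of \Q...\E quoting, in AutoHotkey v1 syntax. The variable name vLiteral and the sample text are hypothetical, chosen only for illustration.

```autohotkey
vLiteral := "1+1=2 (really?)" ; hypothetical text containing RegEx special characters

; \Q...\E makes everything in between literal.
; Note: this breaks if vLiteral itself contains "\E".
MsgBox % RegExMatch("sum: 1+1=2 (really?)", "\Q" vLiteral "\E") ; 6

; Without quoting, "+", "(", ")" and "?" are treated as RegEx syntax and nothing matches.
MsgBox % RegExMatch("sum: 1+1=2 (really?)", vLiteral) ; 0
```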

[PCRE REGULAR EXPRESSION SYNTAX SUMMARY]
[for syntax not mentioned in https://autohotkey.com/docs/misc/RegEx-QuickRef.htm]
pcresyntax specification
http://www.pcre.org/original/doc/html/pcresyntax.html

pcre.txt
http://www.pcre.org/pcre.txt

[TIPS] A collection/library of regular expressions - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12374-tips-a-collectionlibrary-of-regular-expressions/

Regular Expressions: a simple, easy tutorial
http://phi.lho.free.fr/programming/RETutorial.en.html

Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
https://regex101.com/

Tutorial: An AHK Introduction to RegEx - Tutorials - AutoHotkey Community
https://autohotkey.com/board/topic/39733-tutorial-an-ahk-introduction-to-regex/

AutoHotkey Expression Examples: "" %% () and all that
http://www.daviddeley.com/autohotkey/xprxmp/autohotkey_expression_examples.htm#N

[look-behind v. '\K']
Put here requests of problems with regular expressions - Page 28 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-28
\K is an alternative to look-behind. The traditional look-behind (?<=...) cannot contain quantifiers of varying size (i.e. *, ? and +), whereas the text before \K can. That aside, it's simply easier to use, IMO.
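
A short sketch of the difference, in AutoHotkey v1 syntax (the sample text is made up):

```autohotkey
vText := "price:   42" ; made-up sample text

; "(?<=price:\s*)\d+" would be rejected as an invalid needle,
; because \s* makes the look-behind variable-length.
; \K instead drops everything matched so far from the reported match.
RegExMatch(vText, "price:\s*\K\d+", vMatch)
MsgBox % vMatch ; 42

; A fixed-length look-behind is fine, but here it only copes with exactly one space.
RegExMatch("price: 42", "(?<=price: )\d+", vMatch)
MsgBox % vMatch ; 42
```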

Best way to learn RegEx for AHK? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=22&t=13030

[includes some RegEx benchmark tests]
jeeswg's benchmark tests - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=37876&p=174191#p174191

==================================================

NEW SECTIONS:

[2017-07-07] SPLIT PATH
[2017-07-07] SLICE STRING / PAD STRING
[2017-07-07] UPPERCASE / LOWERCASE / TITLE CASE
[2017-07-07] NOTES: CHARACTER TYPES
[2017-07-07] NOTES: CHARACTER TYPES (SCRIPT NAMES)
[2017-10-05] NOTES: CHARACTERS
[2017-10-05] COLUMNS: INCREASE WHITESPACE BETWEEN COLUMNS
[2017-10-05] TRIM: RECREATING AUTOHOTKEY'S TRIM/LTRIM/RTRIM FUNCTIONS
[2017-10-05] SEPARATE LEADING WHITESPACE / CODE / COMMENTS
[2017-10-05] DATES
[2017-10-05] BACKREFERENCES
[2017-10-05] GET ALL MATCHES
[2017-10-05] SYNTAX SPECIFIC TO AUTOHOTKEY
Last edited by jeeswg on 05 Oct 2017, 17:57, edited 39 times in total.
noname
Posts: 340
Joined: 19 Nov 2013, 09:15

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 11:09

Nice collection, I already found some I will use. Thanks for posting :)
kon
Posts: 1760
Joined: 29 Sep 2013, 17:11

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 11:37

This looks like a useful collection.

Just some suggestions, do with them what you will:

";Greed: By default..." -- This line is the only one that is very long. Perhaps some linefeeds are in order?

vList is used throughout. I suggest giving it a value of some sample text. Then add comments to each regular expression showing the expected result. With a few modifications, this could be an actual working script that runs and shows the user the results of each regular expression.

The lack of indenting is painful for me to read, but I realize this is just personal preference. I don't think I'm in the minority though.

Sites like https://regex101.com/ are worth mentioning IMO. There are many, but that's the one I use. They are especially helpful when answering "Ask for Help" questions because you can save a link to your regular expression and post it for others to read. They explain what each symbol does, which saves a lot of typing-out explanations.

Edit:

Code:

;Is it possible to achieve the AutoHotkey RegEx 'options',
;by putting text in the needle proper?
;e.g. i m s U (case-insensitive matching, multiline, DotAll, ungreedy)
;e.g. `n `r `a
;see: https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#Options

See https://www.autohotkey.com/docs/misc/Re ... htm#subpat specifically "Change options on-the-fly..."
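
For example, a minimal sketch for the case-insensitive option in AutoHotkey v1 syntax. The haystack is made up, and only (?i) is shown here.

```autohotkey
; AutoHotkey-style option, written before the closing parenthesis:
MsgBox % RegExMatch("Hello WORLD", "i)world") ; 7

; The same option set inside the needle itself:
MsgBox % RegExMatch("Hello WORLD", "(?i)world") ; 7

; An inline option only applies from the point where it appears:
MsgBox % RegExMatch("Hello WORLD", "hello (?i)world") ; 0 ("hello" is still case-sensitive)
MsgBox % RegExMatch("Hello WORLD", "Hello (?i)world") ; 1
```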
jeeswg
Posts: 2190
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 12:35

@noname
Thanks so much for your comments.
At the moment AHK v2 doesn't have if var in/contains/is type,
(I believe,) which had prompted me to seek various RegEx methods
in the meantime, leading to this collection.

@kon
[UPDATE:]
I split the line ';Greed: By default...'
I added indentation.
I added https://regex101.com/.
() I'm still to add in vText in more places.

@kon
'Greed: By default'
That line is long, it's a quote from the help,
I might split it up.

'vList is used throughout'
I used examples with vList as it happens,
I think you mean vText.
That's a tricky one, because that would double/triple
the lines used, and really bulk everything out.
To begin with at least, there are some that would definitely benefit
from example text, like the 'remove columns' ones,
so I'll add example text to those.

'The lack of indenting'
I counted 15 lines starting with 'if ',
with only 7 followed by lines (which would thereby
require indentation), in basic 2-line/4-line blocks.
[EDIT: and 7 lines beginning with 'Loop']
I'm sympathetic to people wanting indentation for
larger scripts.
It wouldn't be too hard for me to add that in,
and I might do, I'm curious though, is:

so much worse than:

People seem to be really allergic to non-indentation.
I think there should be a poll on this forum,
as to whether I should use indentation,
I've heard it a fair amount.
I don't mind when people don't use indentation,
and I use Notepad, so I get zero assistance when reading scripts.
Btw these comments on indentation are aimed at everyone,
not you specifically, just that I hadn't got round to posting them.

To be honest I see indentation as a silly coding fad,
that often makes code less readable.
I do use it sometimes, when I have 2 or 3 pairs of curly brackets.
It definitely helps in those situations.
It seems to be pretty widespread, the demand for indentation,
I'm working on functions to add it in for my big scripts,
and library functions before I share them (together with manual checking),
even though personally I think over-indentation is unnecessary and undesirable,
a mark of indoctrination, with blank lines, comments and good variable names
being far more important.
I also like to use barriers (e.g. equal signs), uppercase comments e.g. ';STAGE 1 -',
and notations like ';;;SECTION' in some instances (3 semicolons being easy to search for).
Haha I'm going hard on indentation, because I've seen what looks like
some scary groupthink on the matter, I'm not particularly ideological re. programming.

Thanks so much for your comments, it means a lot coming
from you, you've done some clever things on this forum.
Last edited by jeeswg on 19 Feb 2017, 12:47, edited 2 times in total.
kon
Posts: 1760
Joined: 29 Sep 2013, 17:11

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 12:58

Yes, I meant vText.

I edited my post right before you posted. Not sure if you saw my edit so I'll just point it out here.

Regarding the sample text and expected results.
I've answered a lot of "Ask for Help" questions on regex. They usually play-out in two ways.
1. The OP provides sample text for both the haystack and the expected results.
2. OP tries to explain the problem in plain English.

#1 is usually answered with one reply.
#2 usually becomes a "moving target" style question as people post solutions that appear to conform to OP's request, but due to OP's lack of understanding of regex the solutions need to be adjusted to satisfy new requirements. These threads tend to drag on for quite a while.

I think this is a good argument for any regex tutorial to have sample text and show the expected results. It doesn't have to increase the length of the tutorial by 2-3x. Maybe just add a comment to the same line?
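
Something along these lines, as a rough sketch in AutoHotkey v1 syntax. The sample haystack and needles are made up for illustration and are not taken from the tutorial.

```autohotkey
vText := "abc 123 def" ; one shared sample haystack (made up for this sketch)

; Each line shows the expected result in a trailing comment.
MsgBox % RegExReplace(vText, "\d+", "#") ; abc # def
MsgBox % RegExReplace(vText, "[a-z]+", "#") ; # 123 #
MsgBox % RegExMatch(vText, "\d+") ; 5
```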
jeeswg
Posts: 2190
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 13:14

Hmm, classic insight.
Yes I've thought for a while that the antidote to manuals being difficult
to understand in a foreign language,
is lots of before/after examples.
Actually even if the manual is in your language.

Wow, one magical haystack, will think of what I can do.

It was exhausting producing this tutorial, although I needed it for my own use anyway,
yeah, the way I like to do things, is get something quality *finished*,
and then eventually have a rethink and come back to it,
when you can bear to re-explore the material, and have possibly come up against some new ideas.

[EDIT:]
Thanks for the link re. 'Change options on-the-fly', I had noticed that but not understood it at the time. I found it mentioned here under Options (PhiLho was quite good with this RegEx stuff):
Regular Expressions: a simple, easy tutorial
http://phi.lho.free.fr/programming/RETutorial.en.html

I've updated the queries section to reflect this.
guest3456
Posts: 2021
Joined: 09 Oct 2013, 10:31

Re: RegEx handy examples (RegExMatch, RegExReplace)

18 Feb 2017, 09:04

jeeswg wrote:People seem to be really allergic to non-indentation.
I think there should be a poll on this forum,
as to whether I should use indentation,
I've heard it a fair amount.
I don't mind when people don't use indentation,
and I use Notepad, so I get zero assistance when reading scripts.
Btw these comments on indentation are aimed at everyone,
not you specifically, just that I hadn't got round to posting them.

To be honest I see indentation as a silly coding fad,
that often makes code less readable.


lol. yet another example of multiple people telling you that you're wrong, and you continuing to be stubborn and hardheaded. are you ever going to come down from your high horse? you would probably get the same responses on stackoverflow, and then complain that "those arrogant stackoverflow people deleted my post because i didn't indent my code" or some other nonsense. no poll is necessary. this is one of the most basic style aspects of programming, regardless of the language. god forbid you ever use Python, where indentation is REQUIRED

but, if you like the fact that no one reads your posts or your code, then continue with what you're doing

if however you actually want input from others, you should probably present your code in ways that they want to read it.

