Replace several characters with a single same character? Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
CaptainProton
Posts: 46
Joined: 06 May 2017, 13:38

Replace several characters with a single same character?

15 May 2017, 10:21

I'm trying to strip several characters (in this case, characters that are illegal in Windows file names) from a string. Each of them should either be removed completely or replaced with a single space character (" "). One practical solution is this one, adapted from here:

Code: Select all

testVar := "T*h<is> i:s a| t?e\s/t"

replace :=   {"<":"", ">":"", ":":"", "*":"", "/":"", "\":"", "|":"", "?":"", "*":""}

For inputVar, outputVar in replace
   StringReplace, testVar, testVar, %inputVar%, %outputVar%, All

msgbox %testVar%
But, I'm wondering if there is something more practical. Any ideas?
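
For comparison, here is a minimal sketch of the same loop written with the StrReplace() function instead of the StringReplace command, assuming AutoHotkey v1.1.21 or later. StripChars is a hypothetical helper name, not a built-in. The character list passed in below also contains the double quote, which Windows rejects in file names as well.

```autohotkey
; Hypothetical helper: remove every character listed in chars from str.
StripChars(str, chars) {
    ; Loop, Parse with no delimiter walks the list one character at a time
    Loop, Parse, chars
        str := StrReplace(str, A_LoopField)   ; ReplaceText omitted = replace with nothing
    return str
}

testVar := "T*h<is> i:s a| t?e\s/t"
MsgBox, % StripChars(testVar, "<>:""/\|?*")   ; "" is a literal double quote in a v1 string
```

Keeping the characters in one string avoids the object literal, so a duplicate entry is easier to spot.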
CaptainProton
Posts: 46
Joined: 06 May 2017, 13:38

Re: Replace several characters with a single same character?

15 May 2017, 12:15

evilC wrote: That is pretty efficient already. You could maybe do it in one pass with a regex, but I would say that is already good enough.
Can you please show me how it can be done with Regex?
CaptainProton
Posts: 46
Joined: 06 May 2017, 13:38

Re: Replace several characters with a single same character?

15 May 2017, 13:13

evilC wrote:

Code: Select all

testVar := "T*h<is> i:s a| t?e\s/t"
str := RegexReplace(testVar, "[<>:\*|\?\\/]", "")
msgbox % str
Awesome! Thank you!
CaptainProton
Posts: 46
Joined: 06 May 2017, 13:38

Re: Replace several characters with a single same character?

15 May 2017, 13:30

Sorry, but I need to ask you one more thing... I managed to figure out how to also trim down extra spaces to one space like this:

Code: Select all

testVar := RegExReplace(testVar, " +", " ")
However, is it possible to combine it with the one you placed above? Basically, is it possible to replace the characters above with nothing ("") and also replace multiple spaces with only a single space (" ") with a single command?
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: Replace several characters with a single same character?

15 May 2017, 13:31

From:
RegEx handy examples (RegExMatch, RegExReplace) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=28031

Code: Select all

;check for invalid filename characters [Chr(1) to Chr(31) and \/:*?"<>|]
if RegExMatch(vName, "[" Chr(1) "-" Chr(31) "\\/:*?""<>|]") ;invalid file name characters (40)
if RegExMatch(vPath, "[" Chr(1) "-" Chr(31) "/*?""<>|]") ;invalid file path characters (38) (allow : and \)
E.g. Chr(1) to Chr(31) can include tab/LF/CR.

Btw, in the original post * is listed twice. Also, " is an invalid file name character and is missing from the list.
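
As a rough sketch, the file name check above could be wrapped in a small function. HasInvalidNameChars is a hypothetical name used only for illustration. The pattern is the one from the snippet above, and RegExMatch returns the position of the first match, or 0 if there is none.

```autohotkey
; Hypothetical helper: true (1) if name contains a character
; that is not allowed in a Windows file name, otherwise false (0).
HasInvalidNameChars(name) {
    return RegExMatch(name, "[" Chr(1) "-" Chr(31) "\\/:*?""<>|]") > 0
}

MsgBox, % HasInvalidNameChars("report.txt")    ; 0
MsgBox, % HasInvalidNameChars("a:b.txt")       ; 1, colon
MsgBox, % HasInvalidNameChars("line`tbreak")   ; 1, tab is Chr(9)
```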
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: Replace several characters with a single same character?

15 May 2017, 13:39

It may be possible to do 2 in 1 like so:

Code: Select all

q:: ;replace invalid filename characters + trim multiple spaces
testVar := "   T*h<is>   i:s   a|   t?e\s/t   "
vText := RegExReplace(testVar,"(?<= ) +|[" Chr(1) "-" Chr(31) "\\/:*?""<>|]","")
MsgBox, % "[" vText "]"
return
The problem with combining them is that each task has a different replacement text.

However, replacing n spaces with 1 space is the same as replacing the 2nd space onwards with nothing.

So: replace any consecutive spaces that have a space before them with nothing.

Btw although this appears to work, I would want to triple-check it before industrial usage.
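
One edge case worth checking is an invalid character sitting between two spaces. The look-behind inspects the original string, so the space after the removed character is still seen as preceded by that character and is kept. The sketch below compares the one-pass pattern with a plain two-pass version. CleanOnePass and CleanTwoPass are hypothetical helper names, and "a | b" is just an assumed test input.

```autohotkey
; Hypothetical helper: the combined pattern from this post.
CleanOnePass(str) {
    ; drop invalid characters, and any run of spaces that follows a space
    return RegExReplace(str, "(?<= ) +|[" Chr(1) "-" Chr(31) "\\/:*?""<>|]", "")
}

; Hypothetical helper: the same two tasks done one after the other.
CleanTwoPass(str) {
    ; pass 1: drop invalid characters
    str := RegExReplace(str, "[" Chr(1) "-" Chr(31) "\\/:*?""<>|]", "")
    ; pass 2: collapse each run of spaces to a single space
    return RegExReplace(str, " +", " ")
}

; one pass leaves two spaces between a and b, two passes leave one
MsgBox, % "[" CleanOnePass("a | b") "]`n[" CleanTwoPass("a | b") "]"
```

If inputs like this can occur, the two-pass version is the safer choice, at the cost of a second RegExReplace call.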
CaptainProton
Posts: 46
Joined: 06 May 2017, 13:38

Re: Replace several characters with a single same character?

15 May 2017, 14:06

Wow... ok, would you mind explaining what exactly is going on in there? That looks nuts.

Well done!
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: Replace several characters with a single same character?

15 May 2017, 15:00

I'm basically doing this: replace anything that matches a or b with nothing, in a single pass:
RegExReplace(testVar,"a|b","")

(?<= ) +
This uses a look-behind assertion. Replace one or more spaces if preceded by a space (the preceding space is not replaced).

Regular Expressions (RegEx) - Quick Reference
https://autohotkey.com/docs/misc/RegEx-QuickRef.htm
Look-ahead and look-behind assertions: The groups (?=...), (?!...), (?<=...), and (?<!...) are called assertions because they demand a condition to be met but don't consume any characters.
[" Chr(1) "-" Chr(31) "\\/:*?""<>|]
This is a character class: replace any character that matches one of these characters. Certain characters have a special meaning inside a character class, e.g. a hyphen indicates a range: "a-z" is the same as Chr(97) "-" Chr(122).
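
The two pieces can be tried separately. This is a small sketch with made-up input strings, not part of the original script. The first call uses only the look-behind alternative, and the other two write the same range in both notations.

```autohotkey
; look-behind on its own: a run of spaces is removed only where a space precedes it,
; so the first space of each run survives
MsgBox, % "[" RegExReplace("a   b    c", "(?<= ) +", "") "]"            ; [a b c]

; the same range written two ways (RegEx is case-sensitive by default)
MsgBox, % RegExReplace("Hello World", "[a-z]", "")                      ; H W
MsgBox, % RegExReplace("Hello World", "[" Chr(97) "-" Chr(122) "]", "") ; H W
```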

One wonders how RegEx works behind the scenes. These work, for example:

Code: Select all

MsgBox, % RegExReplace("abababab", "a|b", "") ;result is blank
MsgBox, % RegExReplace("CababCDCDababD", "D|a|b|C", "") ;result is blank
MsgBox, % RegExReplace("CababCDCDababD", "D$|a|b|^C", "") ;result is CDCD ;^ and $ are anchors
FanaticGuru
Posts: 1906
Joined: 30 Sep 2013, 22:25

Re: Replace several characters with a single same character?

15 May 2017, 15:29

evilC wrote:

Code: Select all

testVar := "T*h<is> i:s a| t?e\s/t"
str := RegexReplace(testVar, "[<>:\*|\?\\/]", "")
msgbox % str
Not that it matters much but I find it easier to use \Q \E when you are dealing with a bunch of characters that might need to be escaped. Mainly because it keeps me from having to look up all the special characters as I don't have them memorized.

Code: Select all

testVar := "T*h<is> i:s a| t?e\s/t"
str := RegexReplace(testVar, "[\Q<>:*|?\/\E]", "")
msgbox % str
Anything between \Q and \E is treated as literals. Also a little easier to read.
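
A quick way to convince yourself that the two spellings are equivalent is to run both on the same input and compare. This is only a sketch, and the variable names escaped and quoted are made up for it.

```autohotkey
testVar := "T*h<is> i:s a| t?e\s/t"

; every special character escaped by hand
escaped := RegExReplace(testVar, "[<>:\*|\?\\/]", "")

; \Q...\E treats everything in between as literal characters
quoted := RegExReplace(testVar, "[\Q<>:*|?\/\E]", "")

; == is the case-sensitive comparison in v1
MsgBox, % escaped "`n" quoted "`n" (escaped == quoted ? "same result" : "different result")
```

Both calls should give "This is a test" for this input.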

FG
tidbit
Posts: 1272
Joined: 29 Sep 2013, 17:15
Location: USA

Re: Replace several characters with a single same character?

15 May 2017, 16:09

In regex, pretty much everything inside a [] character class is already literal, except for a few things (] and \, plus ^ at the start and - between two characters). There's no need for \Q\E or a large number of \'s (though it doesn't hurt if you're unsure).
[<>:*|?\\/]
And if you want to toss in a space, just do it, anywhere. No need to escape it.
[<>:* |?\\/] (I added it before the |)
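
A short sketch of these classes in use, with the thread's test string. The last line is an added assumption for illustration: if a double quote is needed in the class, it is doubled because of AutoHotkey v1 string syntax, not because of regex.

```autohotkey
testVar := "T*h<is> i:s a| t?e\s/t"

; only the backslash is escaped
MsgBox, % RegExReplace(testVar, "[<>:*|?\\/]", "")     ; This is a test

; same class plus a space, so the spaces are removed too
MsgBox, % RegExReplace(testVar, "[<>:* |?\\/]", "")    ; Thisisatest

; "" inside a v1 expression string is one literal double quote
MsgBox, % RegExReplace("say ""hi"" now", "[<>:*|?\\/""]", "")   ; say hi now
```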