
Put here requests of problems with regular expressions


1074 replies to this topic
Frankie
  • Members
  • 2930 posts
  • Last active: Feb 05 2015 02:49 PM
  • Joined: 02 Nov 2008
Thanks. I'll download grep and try it out.

Edit: Works great
Any code ⇈ above ⇈ requires AutoHotkey_L to run

keybored
  • Members
  • 351 posts
  • Last active: Apr 26 2013 09:08 AM
  • Joined: 18 Jun 2006
Lexikos thanks!
It works on the example and some other code I'm working on, but not these actual files. For now I have another way to remove the headers.

I have repetitive stress injury that the doctors can't identify. I avoid writing code and commenting in the forums too much. I am very grateful for all the help in getting more done easily. I will review your comments and try and understand better little by little. Thanks Everyone!

lilljimpa
  • Members
  • 127 posts
  • Last active: Feb 18 2011 05:48 PM
  • Joined: 18 Apr 2007
I don't quite understand what is wrong here... I got this help from someone and it doesn't work any longer.
page =
(
<tr>
   <td width='20%'><b>Name</b>:</td>
   <td width='30%'><a style='text-decoration:none;' href='http://208.100.49.152/mde/gang/18.php'>[WC]</a> <a title='Mobster' style='' href='http://208.100.49.152/mde/profile/3452.php'>The Big Guy</a></td> ;want "The Big Guy"
   <td width='20%'><b>HP</b>:</td>

   <td width='30%'>850 / 850 [100%]</td>
</tr>
      
<tr>
)

RegExMatch(page,"<b>Name(?:.|`r)<a title='Mobster'[^\>]+\>([^\<]+?)\</a>",UName)
MsgBox %UName1%

you'll have to excuse me...I'm from Sweden, so my English is not that good...(but now it's better cuz JSLover/Guest is helping me)...

  • Guests
  • Last active:
  • Joined: --
I escaped each % in the sample text with a backtick, and in the regex I changed (?:.|`r) to (?:.|`r?`n)* so that it can skip any number of characters and line breaks:
page=
(
<tr>
   <td width='20`%'><b>Name</b>:</td>
   <td width='30`%'><a style='text-decoration:none;' href='http://208.100.49.152/mde/gang/18.php'>[WC]</a> <a title='Mobster' style='' href='http://208.100.49.152/mde/profile/3452.php'>The Big Guy</a></td> ;want "The Big Guy"
   <td width='20`%'><b>HP</b>:</td>

   <td width='30`%'>850 / 850 [100`%]</td>
</tr>
      
<tr>
)

RegExMatch(page,"<b>Name(?:.|`r?`n)*<a title='Mobster'[^\>]+\>([^\<]+?)\</a>",UName)
MsgBox %UName1%
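
A possible alternative, offered only as a sketch: the s) (DotAll) option makes . match line breaks as well, so the gap between Name and the Mobster link can be written as a lazy .*? instead of an alternation. The sample row below is a trimmed copy of the one above with shortened href values (an assumption made for brevity), and it assumes AutoHotkey v1 syntax.

```autohotkey
; Trimmed copy of the sample row; the shortened hrefs are placeholders.
page =
(
<tr>
   <td width='20`%'><b>Name</b>:</td>
   <td width='30`%'><a href='gang/18.php'>[WC]</a> <a title='Mobster' style='' href='profile/3452.php'>The Big Guy</a></td>
   <td width='20`%'><b>HP</b>:</td>
</tr>
)

; s)   = DotAll: "." also matches line breaks
; .*?  = lazy, so it stops at the first Mobster link after Name
RegExMatch(page, "s)<b>Name.*?<a title='Mobster'[^>]*>([^<]+)</a>", UName)
MsgBox %UName1%
```

With this sample, UName1 should contain The Big Guy.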


sharethewisdom
  • Members
  • 57 posts
  • Last active: Dec 18 2014 08:44 AM
  • Joined: 24 Feb 2008
I want to replace every positive whole number of exactly 3 digits placed between ~ characters, if it is not equal to any value in var2.

var1 contains a long string in which ~numbers~ and ~Inumbers~ are present.
If a number from ~numbers~ matches the number in var2 (parse loop), it is replaced by var3. Now I want to replace all the ~numbers~ that don't match a number in var2 with "404.html".
The same goes for ~Inumbers~, i.e. replace them with "not_found.jpeg".

first regex(var1,)
matches: ~040~ | ~180~ | ~008~
Non-Matches: ~string~ | ~0string15~ | ~-180~ | ~180,3~ | ~4~

second regex(var1,)
matches: ~I040~ | ~I180~ | ~I008~

Please help me out, I'm making a mess of it.

TheGood
  • Members
  • 589 posts
  • Last active: Mar 22 2014 03:22 PM
  • Joined: 30 Jul 2007

OK, well, first to retrieve all the numbers in the string:
i := 1
iMatches0 := 0
Loop {
	i := RegExMatch(var1, "~\d{3}~", number, i)
	
	;Check if we found anything
	If Not i
		Break
	
	;Add location to array
	iMatches0 += 1
	iMatches%iMatches0% := i
	
	;Resume after this match, otherwise the same match is found again forever
	i += StrLen(number)
}

;Now you have an array filled with the location of every occurrence of ~numbers~

;Now, just check each number against those in var2 (maybe build an array of booleans bNotInVar2)

;And to replace (last match first, so earlier locations stay valid as var1 changes length):
Loop %iMatches0% {
	n := iMatches0 - A_Index + 1
	If (bNotInVar2%n%)
		var1 := RegExReplace(var1, "~\d{3}~", "404.html", count, 1, iMatches%n%)
}

;And it's very similar for the second regex (replace "~\d{3}~" by "~I\d{3}~")

Hope it helps!
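
The "check each number against those in var2" step is left open above. One way it might look is sketched below. It assumes var2 is a comma-separated list of the known numbers (the thread does not say how var2 is stored), and it uses the v1 "If var in list" test. The sample var1 is shortened for illustration.

```autohotkey
var1 = ~180~ | ~008~ | ~string~ | ~040~
var2 = 008,180	; assumption: known numbers as a comma-separated list

i := 1
iMatches0 := 0
while (i := RegExMatch(var1, "~(\d{3})~", number, i))
{
	iMatches0 += 1
	iMatches%iMatches0% := i
	; number1 holds the three digits without the tildes.
	; "If var in list" compares as text, so 008 only matches 008.
	If number1 in %var2%
		bNotInVar2%iMatches0% := false
	Else
		bNotInVar2%iMatches0% := true
	; Resume the search after the current match
	i += StrLen(number)
}
```

After the loop, bNotInVar2N is true for every match N whose number is missing from var2. With the sample above that should be only the third match, ~040~.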

Lexikos
  • Administrators
  • 9844 posts
  • AutoHotkey Foundation
  • Last active:
  • Joined: 17 Oct 2006
Is this what you mean?
var1 = ~180~ | ~008~ | ~string~ | ~0string15~ | ~-180~ | ~040~ | ~180,3~ | ~4~ 
var2 = 008
var3 = 404.html
MsgBox % RegExReplace(var1, "(?<=~)\d{3}(?<!\Q" var2 "\E)(?=~)", var3)


TheGood
  • Members
  • 589 posts
  • Last active: Mar 22 2014 03:22 PM
  • Joined: 30 Jul 2007

Nice! I thought var2 represented an array of numbers. :)

sharethewisdom
  • Members
  • 57 posts
  • Last active: Dec 18 2014 08:44 AM
  • Joined: 24 Feb 2008
:evil: There is no emoticon that expresses how I feel after pressing Ctrl+W instead of Ctrl+X, twice, after typing a long post trying to explain how my script works :cry:
So please await my further reply, or read the last few posts here, but it is too much to read, I know.

here is a bit of a summary:

BrokenLink := "404.html"
Loop, *.txt
{
	FileRead Template, template.html
	FileRead Content, %A_LoopFileFullPath%
	StringReplace Template, Template, ~Content~, %Content%
	Loop, content\*.txt
	{
		; firstlink.001.txt `n secondlink.002.txt `n ...
		TxtList = %TxtList%%A_LoopFileName%`n
	}
	Loop, parse, TxtList, `n
	{
		if A_LoopField = 
			break
		StringSplit TxtSplit, A_LoopField, .
		LinkName := TxtSplit1
		StringLower TxtSplit1, TxtSplit1
		StringReplace TxtSplit1, TxtSplit1, %A_SPACE%, _, All
		;RegExReplace(Template, "(?<=~)\d{3}(?<!\Q" TxtSplit2 "\E)(?=~)", BrokenLink)
		Loop
		{
			StringReplace Template, Template, ~%TxtSplit2%~, %TxtSplit1%.html, UseErrorLevel ;~001~ becomes firstlink.html,...
			if ErrorLevel = 0  ; No more replacements needed.
				break
		}
		Loop
		{
			StringReplace Template, Template, ~%TxtSplit2%_name~, %LinkName%, UseErrorLevel ;~001_name~ becomes "firstlink"
			if ErrorLevel = 0
				break
		}
	}

	FileDelete %ThisHtmlNameSplit1%.html
	FileAppend %Template%, %ThisHtmlNameSplit1%.html
	ExitApp
}

TheGood wrote: "I thought var2 represented an array of numbers."

TxtSplit2 is var2, so it is a string parsed from the array.

Lexikos wrote: "Is this what you mean?"

Thanks. The result ~404.html~ should be 404.html, though.
I'm too angry to look at it any further. The keys on my new laptop are much smaller on the left, and an external editor for bbc is not installed, so I didn't bother typing it out here :x
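
For reference, a rough sketch of how both points might be handled in one pass: the tildes are consumed by the match itself, so the result is 404.html rather than ~404.html~, and several known numbers are excluded at once with a negative lookahead. It assumes the known numbers are available as a comma-separated list in var2, which is not how the script above stores them. The names alt and result are made up for the example.

```autohotkey
var1 = ~180~ | ~008~ | ~string~ | ~-180~ | ~040~ | ~180,3~ | ~4~
var2 = 008,180	; assumption: numbers that do have a page, comma-separated

; Turn the list into a regex alternation: 008,180 becomes 008|180
StringReplace, alt, var2, `,, |, All

; ~(?!(?:008|180)~)\d{3}~
;   the lookahead rejects any known number followed by a closing ~
;   the tildes are part of the match, so they are replaced as well
result := RegExReplace(var1, "~(?!(?:" alt ")~)\d{3}~", "404.html")
MsgBox %result%
```

With the sample string, only ~040~ should be replaced, and ~180~ and ~008~ stay as they are for the normal link replacement. The ~Inumbers~ case would presumably work the same way with "~I(?!(?:" alt ")~)\d{3}~" and "not_found.jpeg".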

biotech
  • Members
  • 172 posts
  • Last active: Jan 08 2011 03:16 PM
  • Joined: 23 Feb 2006
Hello guys, I need some help comparing two lists of data. Normally I would do it with VLOOKUP in Excel, but this time it is different since the data isn't the same. I have two columns of book titles that I need to compare. The titles are slightly different, so search and find didn't work, and the lists are huge, so I would like to automate it. What approach should I use? I guess this is simple for someone who knows how to use regex.


Thanks in advance!

  • Guests
  • Last active:
  • Joined: --

That is not a job for Regex. You are looking for a 'fuzzy' match function (a simple form of AI). If you search the forums you will find some posts suggesting algorithms for doing this. (try searching for fuzzy - I believe Laszlo has posted on this a few times)

TheGood
  • Members
  • 589 posts
  • Last active: Mar 22 2014 03:22 PM
  • Joined: 30 Jul 2007

Try this!

Slanter
  • Members
  • 739 posts
  • Last active: Jul 08 2011 05:26 AM
  • Joined: 28 May 2008
Or better yet Fuzzy string searching with Damerau–Levenshtein distance by Titan
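
To give an idea of what such a distance function does, here is a minimal sketch of the plain Levenshtein distance. This is not Titan's script and not the Damerau variant (transpositions are not counted as a single edit). The function name Levenshtein is made up for the example, and it needs AutoHotkey_L because it uses objects.

```autohotkey
; Plain Levenshtein distance: the number of single-character insertions,
; deletions and substitutions needed to turn a into b.
; Case-insensitive, because "=" ignores case by default in AutoHotkey v1.
Levenshtein(a, b)
{
	la := StrLen(a)
	lb := StrLen(b)
	prev := {}
	Loop % lb + 1
		prev[A_Index - 1] := A_Index - 1	; distance from "" to the first j chars of b
	Loop % la
	{
		i := A_Index
		cur := {}
		cur[0] := i	; distance from the first i chars of a to ""
		Loop % lb
		{
			j := A_Index
			cost := (SubStr(a, i, 1) = SubStr(b, j, 1)) ? 0 : 1
			best := prev[j] + 1	; deletion
			if (cur[j - 1] + 1 < best)
				best := cur[j - 1] + 1	; insertion
			if (prev[j - 1] + cost < best)
				best := prev[j - 1] + cost	; substitution
			cur[j] := best
		}
		prev := cur
	}
	return prev[lb]
}

MsgBox % Levenshtein("kitten", "sitting")	; shows 3
```

For two lists of titles, one could compute this for each pair and keep the pair with the smallest distance, perhaps divided by the longer title's length so that long and short titles are comparable. On huge lists that is slow, since every title is compared with every other.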
Unless otherwise stated, all code is untested

(\__/) This is Bunny.
(='.'=) Cut, copy, and paste bunny onto your sig.
(")_(") Help Bunny gain World Domination.

poo_noo
  • Members
  • 251 posts
  • Last active: Jan 28 2015 08:33 PM
  • Joined: 08 Dec 2006
Below is a link to an "Essential Guide To Regular Expressions: Tools and Tutorials"

http://www.smashingm... ... resources/

Might be handy
Paul O

  • Guests
  • Last active:
  • Joined: --
How can I extract only the number that follows test in a variable? The format is:

org/test(somenumber)'