Put here requests of problems with regular expressions


1074 replies to this topic
PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
xml =
(
<elem1> ... </elem1>Foo
<elem2> ... </elem2>
Something else...
<elem3> ... </elem3>
...
Hi! <elem125> ... </elem125>
)
res := RegExReplace(xml, "<elem(\d+)>.*?</elem\1>", "<elem$1>{$1}</elem$1>")
MsgBox %res%


vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

majkinetor
  • Moderators
  • 4512 posts
  • Last active: May 20 2019 07:41 AM
  • Joined: 24 May 2006
LOL, LOL

Very funny guys...

To Titan:
That was not the point!

To PhiLho:
ROTFL... At first I was confused when I saw just one RE and wondered how I had missed it. Then I saw you had used my naming scheme against me :))

OK... well, it should be a general-purpose XML fixer, so the elements will not be named "elementN" but rather something like:
<DUE_DATE>
<PAYMENT_CODE>
<PAYMENT_AMOUNT>
<MOD_REF_CREDIT_NUMBER>
<REF_CREDIT_NUMBER>
<MOD_REF_DEBIT_NUMBER>
<REF_DEBIT_NUMBER>
<PURPOSE>

IMO, it can only be done with a replacement function that uses a static variable to count the number of calls...


It would be really cool to be able to set a function for the replacement part of RegExReplace, similar to OnMessage.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
OK, it was unclear, and I still don't get it completely.
You should provide an example of the input and the wanted output.

biotech
  • Members
  • 172 posts
  • Last active: Jan 08 2011 03:16 PM
  • Joined: 23 Feb 2006
Is it possible that searching a file with RegEx is so slow? I have a pretty large file, but even so, it takes several seconds to look up a word in the file from the characters typed at the beginning of the word. I'm trying to make a dictionary lookup, but no luck so far. Please help.


#NoTrayIcon 

Gui, -Caption  +Resize +AlwaysOnTop
Gui, Font, S12 CDefault Bold, Verdana
Gui, Add, Edit, x0 y0 w177 h25  AltSubmit vValue gQuery 

query=
word=
matchCount=0

Return

F12::
x:=A_ScreenWidth - 186
y:=A_ScreenHeight - 56
Gui, Show, x%x% y%y% h25 w177, New GUI Window
return

Query:
Gui, Submit,NoHide
found:=lookupValue(Value)
displayResults(found)
return

lookupValue(string)
{
  proceed:=false
  length:=StrLen(string)
  if(length>3){ 
    Loop, Read, data.csv
    {
      Loop,Parse,A_LoopReadLine,csv
      {                    
        if(A_Index=1){                 
            expression:="i)^" . string . "" 
            if(RegExMatch(A_LoopField,expression)>0){
              MsgBox match! %A_LoopField%       
              proceed:=true
              }
           }  
        if(A_Index=2 && proceed=true){
         ;do something
        }
      }
    }
    MsgBox done searching file...
  }
}

displayResults(string)
{
 tooltip, %string%
}


GuiClose:
ExitApp

F11::
ExitApp

Esc::
ExitApp


Lemming
  • Members
  • 184 posts
  • Last active: Feb 03 2014 11:03 AM
  • Joined: 20 Dec 2005
I believe the slowness is not due to the regex, but rather the file-reading loop. It reads lines one at a time from the file.

If your data file is not overly big, try loading it to a variable first, then use a string parsing loop to process it (you already know how to do this).

FileRead, DataInMem, data.csv


biotech
  • Members
  • 172 posts
  • Last active: Jan 08 2011 03:16 PM
  • Joined: 23 Feb 2006
Yes, you were right. This seems trivial in hindsight; it performs much faster now.
Thanks!

#NoTrayIcon 

Gui, -Caption  +Resize +AlwaysOnTop
Gui, Font, S12 CDefault Bold, Verdana
Gui, Add, Edit, x0 y0 w177 h25  AltSubmit vValue gQuery 

query=
word=
matchCount=0

Return

F12::
x:=A_ScreenWidth - 186
y:=A_ScreenHeight - 56
Gui, Show, x%x% y%y% h25 w177, New GUI Window
return

Query:
Gui, Submit,NoHide
found:=lookupValue(Value)
displayResults(found)
return

lookupValue(string)
{
  proceed:=false
  length:=StrLen(string)
  if(length>2){ 
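    ; Load the whole file into memory once instead of reading it line by line
    FileRead, file, data_english.csv
    ; Split on linefeeds and drop any carriage returns
    Loop, Parse, file, `n, `r
    {
      ; Cheap substring pre-check before running the regex
      if (InStr(A_LoopField, string)) {
        ; Case-insensitive match anchored at the start of the line
        expression := "i)^" . string
        if (RegExMatch(A_LoopField, expression, out) > 0) {
          tooltip %A_LoopField%
          matchCount += 1
          word%matchCount% = %A_LoopField%
          piece := word%matchCount%
          displayResults(piece)
        }
      }
    }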
  }
}


displayResults(string)
{
 MSGBOX %string%
}




GuiClose:
ExitApp

F11::
ExitApp

Esc::
ExitApp


Lemming
  • Members
  • 184 posts
  • Last active: Feb 03 2014 11:03 AM
  • Joined: 20 Dec 2005
Anyone know if Ahk's Regex can handle backreferences in the pattern search section i.e. the "NeedleRegEx" section? The docs only mention backreferences for the replacement section.

I'm trying to implement the duplicate line remover from here:
http://www.regular-e...icatelines.html

But this does not work:

DataWithDupes =
(
AutoHotkey is a free, open-source utility for Windows. 
With it, you can:
With it, you can:
Automate almost anything by sending keystrokes and mouse clicks. 
You can write a mouse or keyboard macro by hand or use the macro recorder.
Create hotkeys for keyboard, joystick, and mouse. 
Create hotkeys for keyboard, joystick, and mouse. 
Virtually any key, button, or combination can become a hotkey.
Expand abbreviations as you type them. 
Expand abbreviations as you type them. 
For example, typing "btw" can automatically produce "by the way".
Create custom data entry forms, user interfaces, and menu bars. 
See GUI for details.
See GUI for details.
Remap keys and buttons on your keyboard, joystick, and mouse.
Respond to signals from hand-held remote controls 
via the WinLIRC client script.
via the WinLIRC client script.
)

MsgBox, Data with duplicate lines:`n`n%DataWithDupes%

NoDupes := RegExReplace(DataWithDupes, "m)^(.*)(\r?\n\1)+$", "$1")

MsgBox, Dupe lines removed:`n`n%NoDupes%

Yea, I know that dupe line problem has been solved with earlier versions of Ahk. I just wanted a regex to do it.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
Regardless of the expression you gave:
DataWithDupes =
( Join`r`n
AutoHotkey is a free, open-source utility for Windows.
AutoHotkey is a free, open-source utility for Windows.
With it, you can:
With it, you can:
With it, you can, most of the time:
Automate almost anything by sending keystrokes and mouse clicks.
You can write a mouse or keyboard macro by hand or use the macro recorder.
Create hotkeys for keyboard, joystick, and mouse.
Create hotkeys for keyboard, joystick, and mouse.
Virtually any key, button, or combination can become a hotkey.
Expand abbreviations as you type them.
Expand abbreviations as you type them.
Expand abbreviations as you type them.
For example, typing "btw" can automatically produce "by the way".
Create custom data entry forms, user interfaces, and menu bars.
See GUI for details.
See GUI for details.
Remap keys and buttons on your keyboard, joystick, and mouse.
Respond to signals from hand-held remote controls
via the WinLIRC client script.
via the WinLIRC client script.

)

;~ MsgBox, Data with duplicate lines:`n`n%DataWithDupes%

NoDupes := RegExReplace(DataWithDupes, "(\r?\n|^)(.*?)(?:\r?\n\2)+", "$1$2")

MsgBox, (%ErrorLevel%) Dupe lines removed:`n`n|%NoDupes%|
It wasn't easy to get it right in all cases... :-)
[EDIT] I found the right formula... Not far from the given one, indeed. I just don't use the multiline mode.
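
A possible explanation for why the multiline version fails, offered as an assumption rather than something stated in this thread: AutoHotkey's RegEx treats `r`n as the newline sequence by default, while a plain continuation section joins its lines with `n only, so ^, $ and the dot do not see the line boundaries. A minimal sketch under that assumption adds the `n option next to m (the sample data below is made up for illustration):

```autohotkey
; Hypothetical sample data; lines are separated by `n only.
Data := "one`none`ntwo`nthree`nthree`nthree`nfour"

; m   = multiline, so ^ and $ match at every line boundary
; `n  = treat a lone linefeed as the newline
;       (assumption: your AutoHotkey version supports this option)
NoDupes := RegExReplace(Data, "m`n)^(.*)(\n\1)+$", "$1")

MsgBox, %NoDupes%
```

If the option behaves as assumed, each run of identical consecutive lines should collapse to a single copy. Data that uses `r`n line endings would need the original \r?\n form and the default newline setting instead.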

Anyone know if Ahk's Regex can handle backreferences in the pattern search section i.e. the "NeedleRegEx" section?

You could have just tried with a simpler test... ;-)
MsgBox % RegExMatch("Love, Love, All you need is Love", "(\w+), \1, [\w\s]+\1")


Lemming
  • Members
  • 184 posts
  • Last active: Feb 03 2014 11:03 AM
  • Joined: 20 Dec 2005
Thanks PhiLho. It appears to work. The second regex you provided for dupe words would probably be useful later on too.

4lex
  • Guests
  • Last active:
  • Joined: --
Hi all,

Stumped again with RegExMatch :(

I need to pass a variable to the RegExMatch function, something like this:
RegExMatch(second,"NAME=CB([0-9]) VALUE=([0-9]+),,([0-9]+),>%Part%<",SubExp)
but it looks like the %Part% is being taken literally.

I tried sticking parts and all of my RegExMatch statement into another variable, one example like this:
RegString="NAME=CB([0-9]) VALUE=([0-9]+),,([0-9]+),>%Part%<"
RegExMatch(second,%RegString%,SubExp)
but then my RegExMatch statement 'contains an illegal character'

I've been escaping stuff all over the place, but don't really understand how that works, and what characters I should be escaping, so had no success there :(

Can anyone set me straight?

Cheers,
Alex

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
Remember that quoted strings don't expand %%! (in v.1...)
RegExMatch(second,"NAME=CB([0-9]) VALUE=([0-9]+),,([0-9]+),>" . Part . "<",SubExp)
And %% is rarely used in expressions (including function calls) either, and only for very specific purposes...
Your second try could have been:
RegString=NAME=CB([0-9]) VALUE=([0-9]+),,([0-9]+),>%Part%< ; No quotes!
RegExMatch(second,RegString,SubExp) ; No %!
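
One caveat with concatenating a variable into a pattern: if Part can contain regex metacharacters such as + or (, they will be interpreted as pattern syntax. A minimal sketch, assuming PCRE's \Q...\E literal quoting is acceptable here and that Part never contains the sequence \E itself (the sample values of Part and second are hypothetical):

```autohotkey
; Hypothetical sample values, chosen so that Part contains metacharacters.
Part := "A+B (new)"
second := "NAME=CB1 VALUE=12,,34,>A+B (new)<"

; \Q ... \E makes everything between them literal text in the pattern.
Needle := "NAME=CB([0-9]) VALUE=([0-9]+),,([0-9]+),>\Q" . Part . "\E<"

if (RegExMatch(second, Needle, SubExp))
    MsgBox, % "CB" . SubExp1 . ": " . SubExp2 . " / " . SubExp3
else
    MsgBox, No match
```

Without the \Q...\E wrapper, the + and the parentheses in Part would change the meaning of the pattern and the literal text would not be found.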


4lex
  • Guests
  • Last active:
  • Joined: --
Thanks, once again. Dunno where I'd be with AHK were it not for this forum - cheers.

jps
  • Members
  • 279 posts
  • Last active: Sep 11 2011 07:26 PM
  • Joined: 02 Sep 2006
This is my first time using regex.

An example of what could be in A_LoopReadLine
<td class="alt1"><img src="images/smilies/eyebrow.gif" border="0" alt="Eyebrow" /></td>

I'm obviously doing something very wrong, because with this code %FileName% is always blank:
RegExMatch(A_LoopReadLine, "\\.*\.gif", FileName)

I expected it to contain everything from the first backslash to the .gif.

What am I doing wrong?

Ultimately that's not even what I want it to find, but as I said, this is my first time using regex and I was trying to work my way there slowly.

Ultimately I want it to find:
eyebrow.gif

EDIT: I'm an idiot. I mixed up my backslashes and forward slashes. Let me change it and see if it works.

EDIT2:
RegExMatch(A_LoopReadLine, "/.*\.gif", FileName)
Now that I've fixed my code it behaves as I expected, but I'm really struggling to write what I actually need.

To recap, I need to extract the name of a GIF file that will be preceded by a forward slash.

YMP
  • Members
  • 424 posts
  • Last active: Apr 05 2012 01:18 AM
  • Joined: 23 Dec 2006
F11::
  Text=<td class="alt1"><img src="images/smilies/eyebrow.gif" border="0" alt="Eyebrow" /></td>
  RE=(?<=\/)[^\/]+\.gif
  RegExMatch(Text, RE, FileName)
  Msgbox, % FileName
Return


jps
  • Members
  • 279 posts
  • Last active: Sep 11 2011 07:26 PM
  • Joined: 02 Sep 2006
Thank you very much. It does exactly what I need. Now I just have to figure out how it does it :D
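
For anyone else working through it, here is a commented breakdown of the pattern from the post above. It is a sketch of how the three parts cooperate, not an explanation given by the pattern's author. The forward slash needs no escaping in this regex flavour, so the \/ in the original can be written as a plain /.

```autohotkey
Text = <td class="alt1"><img src="images/smilies/eyebrow.gif" border="0" alt="Eyebrow" /></td>

; (?<=/)  lookbehind: the match must be preceded by "/",
;         but the slash itself is not part of the match
; [^/]+   one or more characters that are not "/",
;         so the match cannot reach back past an earlier slash
; \.gif   a literal ".gif" (the backslash stops "." from meaning "any character")
RegExMatch(Text, "(?<=/)[^/]+\.gif", FileName)

MsgBox, %FileName%
```

The attempt that starts after "images/" fails because [^/]+ hits the next slash before any ".gif" appears, so the first successful match starts after "smilies/" and yields just the file name.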