AHK RegEx Tester v2.1


43 replies to this topic
toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
A small Gui that helps to evaluate/test regex.

Screenshot of the first version. The latest version has small extra info above the Needle field.
Posted Image

ScriptName = AHK RegEx Tester
Version = 2.1
;by toralf
;requires AHK 1.0.46+
;www.autohotkey.com/forum/topic17844.html

/*
Version history:

2.1)
- subpattern can be encapsulated (thanks titan)
- tab with text fields allow theme (thanks titan)  
- a small fix for the layout of the result
2)
- script generates ini file right next to script to store data
- remembers last position and size of GUI (thanks majkinetor)
- up to 10 regex can be stored (thanks majkinetor)
- haystack is remembered between sessions (thanks majkinetor)
- regex can be copied to clipboard with a button (thanks Helpy)
1)
- Initial release
*/

;Get script/app name
SplitPath, A_ScriptName, , , , OutNameNoExt
;get ini file name
IniFile = %OutNameNoExt%.ini

SeparatorChars = @µ§&#°¤¶®©¡¦
DefaultSeparator = @
DefaultRegEx = The (.*?) (?P<Name>.*?) (.*?) (.*?) the
DefaultHaystack = The quick brown fox jumps over the street.

Separator := ReadIniKey("RegEx","Separator",DefaultSeparator)
RegExList := ReadIniKey("RegEx","RegEx",DefaultRegEx)
StringReplace, RegExList, RegExList, %Separator%, `n, All

Separator := ReadIniKey("Haystack","Separator",DefaultSeparator)
Haystack := ReadIniKey("Haystack","Haystack",DefaultHaystack)
StringReplace, Haystack, Haystack, %Separator%, `n, All

Gui, 1:+Resize +MinSize +LastFound +Delimiter`n
Gui1HWND := WinExist()
Gui, 1:Add, Text, , Haystack
Gui, 1:Add, Edit, w220 r8 vEdtHaystack gEvaluateRegEx , %Haystack%   
Gui, 1:Add, Text, w220, Needle (RegEx)`nNote: Use \n instead of ``n, etc.`nUnlike in AHK quotes (") must not be escaped.  
Gui, 1:Add, ComboBox, w220 r10 vCbbRegEx gEvaluateRegEx , %RegExList%
GuiControl, Choose, CbbRegEx, 1 
Gui, 1:Add, Tab,  w220 r2.3 +Theme vTabRegExType gEvaluateRegEx , Match`nReplace
  Gui, 1:Tab, Match
    Gui, 1:Add, Text, Section +BackgroundTrans , OutputVar:
    Gui, 1:Add, Edit, x+2 ys-4 r1 w144 vEdtUnquotedOutputVar gEvaluateRegEx , Out
    Gui, 1:Add, Text, xs Section +BackgroundTrans, StartingPos:
    Gui, 1:Add, Edit, x+2 ys-4 r1 w28 Right vEdtMStartingPos gEvaluateRegEx , 1
    Gui, 1:Add, Text, x+15 ys +BackgroundTrans, # Subpattern:
    Gui, 1:Add, Edit, x+2 ys-4 r1 w28 Right Number vEdtNumSubpattern gEvaluateRegEx , 5
  Gui, 1:Tab, Replace
    Gui, 1:Add, Text, Section +BackgroundTrans, Replacement:
    Gui, 1:Add, Edit, x+2 ys-4 r1 w129 vEdtReplacement gEvaluateRegEx , $3
    Gui, 1:Add, Text, xs Section +BackgroundTrans, Limit:
    Gui, 1:Add, Edit, x+2 ys-4 r1 w28 Right vEdtLimit gEvaluateRegEx , -1
    Gui, 1:Add, Text, x+25 ys +BackgroundTrans, StartingPos:
    Gui, 1:Add, Edit, x+2 ys-4 r1 w28 Right vEdtRStartingPos gEvaluateRegEx , 1
Gui, 1:Tab
Gui, 1:Add, Text, xm, Result 
Gui, 1:Add, Edit, w220 r8 vEdtResult ,
Gui, 1:Add, Button, vBtnClose gGuiClose , Close
Gui, 1:Add, Button, x+10 vBtnStoreRegEx gBtnStoreRegEx , Store Regex
Gui, 1:Add, Button, x+10 vBtnCopyToCB gBtnCopyToCB , Copy Regex
Gui, 1:Show, Hide, %ScriptName% v%Version%
RestoreGuiPosSize(Gui1HWND, 1)  ;restore old size
Gui, 1:Show

GoSub, EvaluateRegEx 
Return

;user has changed any data in the gui
; => update the regex result
EvaluateRegEx:
  If UpdateComboBox
      Return
  Gui, 1:Submit, NoHide                                ;get all data
  If (!CbbRegEx OR !EdtHaystack){    ;if no haystack or needle
      GuiControl, 1:, EdtResult,                            ;no result and
      Return                                                ;no evaluation
    }
  If (TabRegExType = "Match") {                      ;if match is selected
      ;what results need to be shown
      IsOutputVar := False                           ;set default options
      IsPositionAndLength := False
      If EdtUnquotedOutputVar is not space           ;output var is wanted
        {
          IsOutputVar := True                        ;set option
          
          Loop, %EdtNumSubpattern% {                 ;set internal vars to nothing
              Output%A_Index% =
              OutputPos%A_Index% =
              OutputLen%A_Index% =
            }
          
/* original regex to get the subpattern specified in CbbRegEx, not allowing encapsulation of subpattern
"s)(?<!\\)\((?:\?(?:P?<(\w+)>|'(\w+)')).+?(?<!\\)\)"
s)                                                 ;dotall 
  (?<!\\)                                          ;no \ before
         \(                                        ;a "("
           (?:                                     ;no subpattern start
              \?                                   ;a "?"
                (?:                                ;no subpattern start
                   P?                              ;maybe a "P"
                     <(\w+)>                       ;a subpattern word enclosed with "<>"
                            |                      ;or
                             '(\w+)'               ;a subpattern word enclosed with "''"
                                    )              ;no subpattern end
                                     )             ;no subpattern end
                                      .+?          ;any ungreedy text
                                         (?<!\\)   ;no \ before
                                                \) ;a ")"
*/ 
                                              
/* simplified regex to get the subpattern specified in CbbRegEx, allowing encapsulation of subpattern
"(?<!\\)\(\?(?:P?<(\w+)>|'(\w+)')"
(?<!\\)                           ;no \ before
       \(                         ;a "("
         \?                       ;a "?" (required for both name styles)
           (?:                    ;no subpattern start
              P?                  ;maybe a "P"
                <(\w+)>           ;a subpattern word enclosed with "<>"
                       |          ;or
                        '(\w+)'   ;a subpattern word enclosed with "''"
                               )  ;no subpattern end
*/
          pos = 1                                    ;get named subpattern
          sub = 0
          Loop{
              ;the "?" sits outside the alternation so that (?'name' is matched, not ('name'
              If pos := RegExMatch(CbbRegEx,"(?<!\\)\(\?(?:P?<(\w+)>|'(\w+)')", Name , pos)
          		  {
            			sub++                              ;subpattern index
            			Name0 := Name1 = "" ? Name2 : Name1
                  Output%Name0% =                    ;set subpattern var to nothing 
                  OutputPos%Name0% =
                  OutputLen%Name0% =
                  sub%sub% := Name0                  ;subpattern array
                  pos += StrLen(Name)                ;calculate next starting position
                }
          		Else Break                             ;no more named subpattern found
            }

          If RegExMatch(CbbRegEx, "^[\w`]*P[\w`]*\)")  ;Positions and length are wanted
              IsPositionAndLength := True
        }
          
      ;do the regex
      FoundPos := RegExMatch(EdtHaystack, CbbRegEx, Output, EdtMStartingPos)

      ;show results
      Result = ErrorLevel = %ErrorLevel%`nFoundPos = %FoundPos%`n
      
      If IsOutputVar {                               ;show output var results
          Result .= EdtUnquotedOutputVar " = " Output "`n"

        	Loop, %sub% {                                ;named subpattern
              If A_index = 1
                  Result .= "`n------- Named subpattern --------`n"
              If IsPositionAndLength {
                  T := "OutputPos" sub%A_Index%
                  Result .= EdtUnquotedOutputVar "Pos" sub%A_Index% " = " %T% "`n"
                  T := "OutputLen" sub%A_Index%
                  Result .= EdtUnquotedOutputVar "Len" sub%A_Index% " = " %T% "`n"
              }Else{
            	    T := "Output" sub%A_Index%
                  Result .= EdtUnquotedOutputVar sub%A_Index% " = " %T% "`n"
                }
        	  }

          If (EdtNumSubpattern > 0) {                ;numbered subpattern
              Result .= "`n------- Subpattern --------`n"
              Loop, %EdtNumSubpattern% {
                  If IsPositionAndLength {
                      Result .= EdtUnquotedOutputVar "Pos" A_Index " = " OutputPos%A_Index% "`n"
                      Result .= EdtUnquotedOutputVar "Len" A_Index " = " OutputLen%A_Index% "`n"
                  }Else
                      Result .= EdtUnquotedOutputVar A_Index " = " Output%A_Index% "`n"
                }
            }
        }
  }Else {                                            ;replace is selected
      ;do regex
      NewStr := RegExReplace(EdtHaystack, CbbRegEx, EdtReplacement, Count, EdtLimit, EdtRStartingPos)

      ;show result
      Result = ErrorLevel = %ErrorLevel%`nCount = %Count%`nNewStr = %NewStr%`n
    }
  GuiControl, 1:, EdtResult, %Result%                ;update gui
Return              

BtnCopyToCB:
  Gui, 1:Submit, NoHide
  Clipboard = %CbbRegEx%
Return

BtnStoreRegEx:
  Gui, 1:Submit, NoHide
  RegExList := StoreRegEx(RegExList, CbbRegEx)
Return

StoreRegEx(RegExList, CbbRegEx){
    Global UpdateComboBox
    RegExList = %CbbRegEx%`n%RegExList%     ;add current regex
    StringSplit, RegExList, RegExList, `n   ;create an array
    RegExList =                             ;empty list
    Loop, %RegExList0% {                    ;loop through array
        ID := A_Index
        AlreadyInList := False              ;check if item is already in list
        Loop, % ID - 1 {
            If (RegExList%ID% == RegExList%A_Index%) {
                AlreadyInList := True
                Break
              }
          } 
        If !AlreadyInList {                 ;if item is not in list, add it
            RegExList .= RegExList%ID% "`n"
            i++                             ;stop after 10 items
            If (i = 10)
                Break
          }        
      }
    StringTrimRight, RegExList, RegExList, 1   ;remove last `n
    GuiControl, 1:, CbbRegEx, `n%RegExList%    ;update combobox
    GuiControl, 1:Choose, CbbRegEx, 1
    Return RegExList
  }

GuiClose:
  Gui, 1:Submit, NoHide
  StoreListInIni("Haystack", EdtHaystack)
  StoreListInIni("RegEx", RegExList)
  StoreGuiPosSize(Gui1HWND, 1)  
  ExitApp
Return

StoreListInIni(Name, List){
    Global SeparatorChars
    Loop, Parse, SeparatorChars
      {
        If (InStr(List, A_LoopField) = 0){
            Separator = %A_LoopField%
            Break
          } 
      } 
    StringReplace, List, List, `n , %Separator%, All
    WriteIniKey(Name, "Separator", Separator)
    WriteIniKey(Name, Name, List)
  }

;return key value from ini file
ReadIniKey(Section,Key,Default=""){
    global IniFile
    DefaultTestValue = kbcewlkj1u234z98hr2310587fh
    IniRead, KeyValue, %IniFile%, %Section%, %Key%, %DefaultTestValue%
    If (KeyValue = DefaultTestValue) {
        WriteIniKey(Section,Key,Default)
        KeyValue = %Default%
      } 
    Return KeyValue
  }

;write key value to ini file
WriteIniKey(Section,Key,KeyValue){
    global IniFile
    IniWrite, %KeyValue%, %IniFile%, %Section%, %Key%
  }

;restore previous gui position and size
RestoreGuiPosSize(GuiUniqueID, GuiID = 1){
    GuiX := ReadIniKey("Gui" GuiID,"GuiX","")
    GuiY := ReadIniKey("Gui" GuiID,"GuiY","")
    GuiW := ReadIniKey("Gui" GuiID,"GuiW","")
    GuiH := ReadIniKey("Gui" GuiID,"GuiH","")
    DetectHiddenWindows, On
    WinMove, ahk_id %GuiUniqueID%, , %GuiX%, %GuiY%, %GuiW%, %GuiH%
    DetectHiddenWindows, Off
  }

;store current gui position and size
StoreGuiPosSize(GuiUniqueID, GuiID = 1){
    WinGetPos, GuiX, GuiY, GuiW, GuiH, ahk_id %GuiUniqueID%
    If (GuiX > -100 AND GuiX < A_ScreenWidth - 20){
        WriteIniKey("Gui" GuiID, "GuiX", GuiX)
        WriteIniKey("Gui" GuiID, "GuiY", GuiY)
        WriteIniKey("Gui" GuiID, "GuiW", GuiW)
        WriteIniKey("Gui" GuiID, "GuiH", GuiH)
      }
  }

GuiSize:
  Anchor("EdtHaystack","w")
  Anchor("CbbRegEx","w")
  Anchor("TabRegExType","w")
  Anchor("EdtUnquotedOutputVar","w")
  Anchor("EdtReplacement","w")
  Anchor("EdtResult","wh")
  Anchor("BtnClose","y")
  Anchor("BtnStoreRegEx","y")
  Anchor("BtnCopyToCB","y")
Return

Anchor(c, a, r = false) { ; v3.5.1 - Titan
  	static d
  	GuiControlGet, p, Pos, %c%
  	If !A_Gui or ErrorLevel
  		Return
  	i = x.w.y.h./.7.%A_GuiWidth%.%A_GuiHeight%.`n%A_Gui%:%c%=
  	StringSplit, i, i, .
  	d .= (n := !InStr(d, i9)) ? i9 :
  	Loop, 4
  		x := A_Index, j := i%x%, i6 += x = 3
  		, k := !RegExMatch(a, j . "([\d.]+)", v) + (v1 ? v1 : 0)
  		, e := p%j% - i%i6% * k, d .= n ? e . i5 : ""
  		, RegExMatch(d, RegExReplace(i9, "([[\\\^\$\.\|\?\*\+\(\)])", "\$1")
  		. "(?:([\d.\-]+)/){" . x . "}", v)
  		, l .= InStr(a, j) ? j . v1 + i%i6% * k : ""
  	r := r ? "Draw" :
  	GuiControl, Move%r%, %c%, %l%
  }

Edit:
- Updated to capture all named subpattern styles
- Added a note to the GUI to mention that quotes must not be escaped and that you should use \n instead of `n.
- update to version 2

Latest Edit:
- update to version 2.1, see script for history of changes

I hope I haven't done any duplicate work here.
Ciao
toralf
 
I use the latest AHK version (1.1.15+)
Please ask questions in forum on ahkscript.org. Why?
For online reference please use these Docs.

TheIrishThug
  • Members
  • 419 posts
  • Last active: Jan 18 2012 02:51 PM
  • Joined: 19 Mar 2006
Looks very nice. I like that it can show a defined number of sub-patterns, Regex Coach stops after 10.

One function in Regex Coach that I use a lot is the step feature that shows you where the program is in the strings as it goes through testing. Great for quickly finding out what part is wrong when you have long regex's. You add that and your script will be everything I need.

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005

One function in Regex Coach that I use a lot is the step feature that shows you where the program is in the strings as it goes through testing.

Don't dream. RegEx Coach uses its own regex engine (written in Lisp), so it can hook into it as much as it wants... We don't have this level of control with PCRE.

toralf, I wanted to mention something in the related Ask for Help topic, but then I forgot... :-(
The fastest and most reliable way to get a list of captures is to ask PCRE itself. You can see in Regular expressions: a wrapper around the PCRE DLL how I did it.
Alas, this requires shipping a PCRE DLL along with your script, which is not practical.
Also note that there are new ways to name a capture, per last version:

In PCRE, a subpattern can be named in one of three ways: (?<name>...) or (?'name'...) as in Perl, or (?P<name>...) as in Python.

I use the Perl style, as I find it more elegant.
Now, since your GUI is for tests, one can convert the names before testing.
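A minimal sketch of such a conversion, assuming AutoHotkey v1.0.46+ and an invented needle: it rewrites the two Perl-style names into the Python style before the expression is tested. The variable names (Needle, Converted) are only for illustration and are not part of toralf's script.

```autohotkey
; Sketch for AutoHotkey v1.0.46+; the needle is an invented example.
Needle := "(?<First>\w+) (?'Second'\w+)"

; Rewrite (?<name> and (?'name' into the Python style (?P<name>.
; (?<!\\) skips escaped parentheses. Lookbehinds such as (?<= or (?<!
; are left alone because "=" and "!" are not word characters.
Converted := RegExReplace(Needle, "(?<!\\)\(\?(?:<(\w+)>|'(\w+)')", "(?P<$1$2>")

MsgBox, %Converted%
```

Only one of the two capture groups takes part in each match, so `$1$2` inserts whichever name was found.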
Posted Image vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
Should be easy to extend. Just replace line 60
If pos := RegExMatch(EdtRegEx,"si)(?<!\\)\((?:\?P<(\w+)>).+?(?<!\\)\)", Name , pos) 

with something like this
If pos := RegExMatch(EdtRegEx,"s)(?<!\\)\((?:\?(?:P?<(\w+)>|'(\w+)')).+?(?<!\\)\)", Name , pos)
Note: The 'name' style doesn't work here.
Ciao
toralf
 

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
Updated first post to capture all named subpattern styles.
Style 'Name' now works.
Ciao
toralf
 

adamrgolf
  • Members
  • 442 posts
  • Last active: May 22 2017 09:16 PM
  • Joined: 28 Dec 2006
I like this a lot.

I think there may be a problem with quotes, but I'm not sure:

removed

what I posted in the haystack area is the same as the way book1.txt is laid out

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
This is not obvious to handle, because strings obtained from the GUI (with GuiControlGet, for example) don't need their double quotes escaped the way literal strings in expressions do.
In short, toralf's utility tests the expression as PCRE receives it, not as you would write it in AutoHotkey source. You will have a similar problem if you use AHK escapes like `r or `n instead of \r or \n or, of course, if you include variables.

I believe toralf shouldn't do anything for that, it is up to the user to clean up the expression before testing it.

adamrgolf
  • Members
  • 442 posts
  • Last active: May 22 2017 09:16 PM
  • Joined: 28 Dec 2006
ah, I see, what is the correct way to search for that then with the tester?

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
just use this
"(.*)"
It should give you the result.

Maybe I could add an edit field below the results that shows the regex as it would look if typed directly into the RegEx function as a string. On the other hand, you could put the regex in a variable first and then pass the variable to the RegEx function. That would save you from escaping the quotes. Then only the ` has to be escaped, or replaced by the \ form, which is recommended anyway. I guess I'll leave it as it is and think about how to add a short help text to the GUI that makes users aware of this.
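To illustrate the difference, here is a small sketch for AutoHotkey v1 with an invented haystack. It runs the same needle twice: once typed as a literal string inside an expression, where each quote must be doubled, and once stored in a variable with a traditional assignment, where it is typed exactly as in the tester. The names PosA, PosB, A and B are made up for this example.

```autohotkey
; Sketch for AutoHotkey v1; haystack and variable names are invented.
Haystack = He said "hello" to me

; Needle written as a literal string in an expression:
; each literal quote has to be doubled.
PosA := RegExMatch(Haystack, """(.*)""", A)

; Needle stored with a traditional assignment:
; typed exactly as it would be in the tester's Needle field.
Needle = "(.*)"
PosB := RegExMatch(Haystack, Needle, B)

; A1 and B1 hold the first subpattern of each match.
MsgBox, A1 = %A1%`nB1 = %B1%
```

Both calls hand the same pattern to PCRE, so the two captured subpatterns should be identical.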
Ciao
toralf
 

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
Updated the first post:
- Added a note to the GUI to mention that quotes must not be escaped and that you should use \n instead of `n.
Ciao
toralf
 

majkinetor
  • Moderators
  • 4512 posts
  • Last active: May 20 2019 07:41 AM
  • Joined: 24 May 2006
Great job.

Can you save the current position and GUI size in the Registry? The last sentences used and the regular expressions? A combo box instead of an edit for regex and test strings, holding the last few entries. Ability to save regex ?

All easy tasks, and they will make this my preferred tool for quick RE experiments.

Helpy
  • Guests
  • Last active:
  • Joined: --

Ability to save regex ?

And one click put on clipboard.

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
I'll think about it. Thanks for the suggestions.

Side note: I do not like code that writes to the registry, since it is not obvious how to remove the entries (unless there is a proper install/uninstall routine). I prefer INI files in the same folder as the script. If that is OK for you, that's how it will be implemented. Please post if you have other ideas.
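As a rough sketch of the INI approach in AutoHotkey v1: the file name, section and key below are assumptions chosen for illustration, not necessarily what the final script will use.

```autohotkey
; Sketch for AutoHotkey v1; file, section and key names are hypothetical.
IniFile = %A_ScriptDir%\RegExTesterDemo.ini   ; INI file next to the script

; Store a value: IniWrite, Value, Filename, Section, Key
IniWrite, 120, %IniFile%, Gui1, GuiX

; Read it back: IniRead, OutputVar, Filename, Section, Key, Default
IniRead, GuiX, %IniFile%, Gui1, GuiX, 0

MsgBox, GuiX = %GuiX%
```

Removing the stored settings then only means deleting that one file.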
Ciao
toralf
 

majkinetor
  • Moderators
  • 4512 posts
  • Last active: May 20 2019 07:41 AM
  • Joined: 24 May 2006
I said Registry on purpose. I always use the Registry for history lists, as those are not an important part of the configuration and users usually don't want them in the config file. But INI will do; do what you think is good.

toralf
  • Moderators
  • 4035 posts
  • Last active: Aug 20 2014 04:23 PM
  • Joined: 31 Jan 2005
Update to version 2, see first post:

- script generates ini file right next to script to store data
- remembers last position and size of GUI (thanks majkinetor)
- up to 10 regex can be stored (thanks majkinetor)
- haystack is remembered between sessions (thanks majkinetor)
- regex can be copied to clipboard with a button (thanks Helpy)


Ciao
toralf
 