Put here requests of problems with regular expressions


1074 replies to this topic
TLM
  • Administrators
  • 3864 posts
  • Last active:
  • Joined: 21 Aug 2006

I put together this little trick to add repeated characters without a loop.
I know RegEx is slow, but shouldn't it return 0 (error) or 15 (expected) characters rather than 9 (anomalous)?

SetFormat, Float, % (".14",_:=0.,C:=RegExReplace(_+0,"[\d.]",Chr(32)))
Msgbox % ">" C "<" ; <-- should be 15 spaces not 8 see below code.


This one works for obvious reasons

SetFormat, Float, % (".14",_:=0.)
C:=RegExReplace(_+0,"[\d.]",Chr(32))
Msgbox % ">" C "<"

You come to expect an error or no error, not a kind-of error. Unless I'm not thinking about this correctly.

To debug this, returning the string from memory shows the length

SetFormat, Float, % (".14",_:=0.,L:=StrPut(RegExReplace(_+0,"[\d.]",Chr(32)),&C))
Msgbox % ">" StrGet(&C) "< " L

don't duplicate, iterate!


just me
  • Members
  • 1496 posts
  • Last active: Nov 03 2015 04:32 PM
  • Joined: 28 May 2011
Hi TLM,
 
your RegExReplace() is processed before the new format is set. So the default 0.6 (8 characters) is used.
 
MsgBox, 0, A_FormatFloat, % A_FormatFloat . " = " . StrLen(0.0 + 0) . " characters."
SetFormat, Float, % (".14", _:=0., C := RegExReplace(_+0, "[\d.]", Chr(32)))
Msgbox % ">" C "<" . " - StrLen( C ) = " . StrLen( C ) ; <-- should be 15 spaces not 8 see below code.

Prefer ahkscript.org for the time being.


smorgasboard
  • Members
  • 660 posts
  • Last active: Jan 14 2016 08:53 AM
  • Joined: 18 Jul 2012

hi all, merry christmas!

i want to sort lines. the lines are of two types: either they contain a number of the form (optional minus)(1 to 3 digits)(dot)(2 digits), or they don't. these numbers are the marks, and some of them are negative.

i want only the lines that contain this pattern to be sorted by it. each of those lines should keep everything it already has, including the number, and get a serial no. added at the far left.

the lines that don't have this pattern can be left alone, or they might as well be deleted.

thanks in advance

 

MARKS OF CANDIDATES
                                                        IN ROLL NUMBER ORDER

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~
Record# ROLL        NAME                        TOTAL CAT1 CAT2 CAT3 GENDER DEPTL
 108901 9212000290 JEENSHA SATHAR               28.25 6    0    0    2      2
 108902 9212000292 DINDU KG                     38.50 6    0    0    2      2

result is

 

 

MARKS OF CANDIDATES
                                                        IN ROLL NUMBER ORDER

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~
Record# ROLL        NAME                        TOTAL CAT1 CAT2 CAT3 GENDER DEPTL
 1. 108902 9212000292 DINDU KG                     38.50 6    0    0    2      2

2. 108901 9212000290 JEENSHA SATHAR               28.25 6    0    0    2      2



smorgasboard
  • Members
  • 660 posts
  • Last active: Jan 14 2016 08:53 AM
  • Joined: 18 Jul 2012

@TLM thanks again!!

 

you hit the nail on the head! the above problem is solved again by you. :)



adrianh
  • Members
  • 616 posts
  • Last active: Apr 07 2016 03:35 PM
  • Joined: 28 Oct 2012
Hey smorgasboard, I know that you already got an answer from TLM and that's cool, but I wanted to give your problem a try anyway as it interested me and this is what I came up with:
x := "
(
 
MARKS OF CANDIDATES
                                                        IN ROLL NUMBER ORDER

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~
Record# ROLL        NAME                        TOTAL CAT1 CAT2 CAT3 GENDER DEPTL
 108901 9212000290 JEENSHA SATHAR               28.25 6    0    0    2      2
 108902 9212000292 DINDU KG                     38.50 6    0    0    2      2
result is
 
 
MARKS OF CANDIDATES
                                                        IN ROLL NUMBER ORDER

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~
Record# ROLL        NAME                        TOTAL CAT1 CAT2 CAT3 GENDER DEPTL
 1. 108902 9212000292 DINDU KG                     38.50 6    0    0    2      2
2. 108901 9212000290 JEENSHA SATHAR               28.25 6    0    0    2      2
)"


x := RegExReplace(x, "m`a)(?:(.*?(\d{1,3}\.\d\d ).*$\R?)|.*$\R?)", "$2$1")
Sort x, N
x := RegExReplace(x, "m`a)\d{1,3}\.\d\d (.*$\R?)", "$1")
MsgBox %x%

Here is a more detailed and commented version:
;// I'm slowly migrating to this as my newline match.
NL := "\R"

;// This sub regex is the number that must exist in the line to match
NUMBER := "\d{1,3}\.\d\d "

;// This regex describes a line that I want and puts it in the 1st capture group.
;// Further, put the number of interest in the 2nd capture group.
WANT := "(.*?(" NUMBER ").*$" NL "?)" 

;// This regex states match that line that I want *or* still match the line 
;// but don't put it in a capture group.
DO_OR_DONT_WANT := "(?:" WANT "|.*$" NL "?)"

;// If what I want is found, capture groups 1 and 2 are populated; otherwise 
;// they are empty.  Put the 2nd capture group in front of the 1st.
x := RegExReplace(x, "m`a)" DO_OR_DONT_WANT, "$2$1")

;// sort the lines
Sort x, N

;// Remove the leading number used to sort on
x := RegExReplace(x, "m`a)" NUMBER "(.*$" NL "?)", "$1")

;// you know ;)
MsgBox %x%
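
The original request also asked for a serial number on the left, and the sample result lists the higher mark first. Below is a minimal sketch of that variant, assuming AutoHotkey v1.1 (AHK_L). The function name RankLines and the optional minus sign in the number pattern are assumptions added for illustration. They are not part of the code above.

```autohotkey
;// Hypothetical helper (not from the original posts): keeps only the lines that
;// contain a mark, sorts them highest first and prefixes a serial number.
RankLines(text) {
    ;// Assumption: marks may be negative, so allow an optional leading minus.
    NUMBER := "-?\d{1,3}\.\d\d "

    ;// Copy the mark to the front of each wanted line; drop all other lines.
    keyed := RegExReplace(text, "m`a)(?:(.*?(" NUMBER ").*$\R?)|.*$\R?)", "$2$1")

    ;// N = numeric sort on the leading key, R = reverse (descending) order.
    Sort keyed, N R

    ;// Remove the leading key that was only used for sorting.
    sorted := RegExReplace(keyed, "m`a)^" NUMBER "(.*$\R?)", "$1")

    ;// Prefix each remaining line with its position in the sorted list.
    result := ""
    serial := 0
    Loop, Parse, sorted, `n, `r
    {
        if (A_LoopField = "")
            continue
        serial += 1
        result .= serial ". " LTrim(A_LoopField) "`n"
    }
    return result
}
```

It can be called as MsgBox % RankLines(x). The leading whitespace of each line is trimmed so that the serial numbers line up at the left margin.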

I found the problem interesting and learned a few things from it. Here's hoping that it does the same for you. :)

I would also be interested in seeing what TLM's solution was. He comes up with some great stuff!

Thanks, and happy holidays!


Adrian

my library base
AHK_L is the bomb! With a whole lot of bug fixes, Unicode support, associative array objects, array like objects, classes and variadic functions, why wouldn't you switch?


TLM
  • Administrators
  • 3864 posts
  • Last active:
  • Joined: 21 Aug 2006

"your RegExReplace() is processed before the new format is set. So the default 0.6 (8 characters) is used."

 

Thank you just me, your advice led me to the correct solution ( once again :D )

VarSetCapacity(_,15*2,1),S:=RegExReplace(_,".",Chr(32)),_:=""
Msgbox % ">" S "<"



just me
  • Members
  • 1496 posts
  • Last active: Nov 03 2015 04:32 PM
  • Joined: 28 May 2011
You don't even need your magic var "_" in the current version:
S := RegExReplace((S, VarSetCapacity(S, 15 << !!A_IsUnicode, 1)), ".", " ")
Msgbox % " S: >" . S . "< - StrLen(S) = " . StrLen(S)
;)
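
For anyone puzzling over 15 << !!A_IsUnicode: VarSetCapacity() takes a size in bytes, and a character is 2 bytes in Unicode builds and 1 byte in ANSI builds. Here is a small sketch of just that calculation. BytesForChars is a hypothetical name used only for this illustration.

```autohotkey
; Hypothetical helper, shown only to unpack the expression above.
BytesForChars(chars) {
    ; A_IsUnicode is 1 in Unicode builds and empty in ANSI builds, so
    ; !!A_IsUnicode is 1 or 0, and the shift doubles the count only for Unicode.
    return chars << !!A_IsUnicode
}
```

BytesForChars(15) gives 30 on a Unicode build and 15 on an ANSI build.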



TLM
  • Administrators
  • 3864 posts
  • Last active:
  • Joined: 21 Aug 2006

you know I had a feeling there was a bit shift.. I just couldn't conceptualize it tytyty

I see what you did by extending hStack param, slick :D .. <<< no nested parenthesis revealed the uni

double inversion is cute (returns correct chr count with/out it) and your modal alert is ohso js concat lololol




just me
  • Members
  • 1496 posts
  • Last active: Nov 03 2015 04:32 PM
  • Joined: 28 May 2011

btw your modal alert is ohso js concat lololol

 

 

Maybe I'll change it when using v2, still not sure.




TLM
  • Administrators
  • 3864 posts
  • Last active:
  • Joined: 21 Aug 2006

Wait, is V2 using the concat operator *shock* ??




just me
  • Members
  • 1496 posts
  • Last active: Nov 03 2015 04:32 PM
  • Joined: 28 May 2011
http://l.autohotkey.net/v2-changes.htm
Expressions:
Auto-concat now requires at least one space or tab in all cases (the v1 documentation says there "should be" a space).

 




TLM
  • Administrators
  • 3864 posts
  • Last active:
  • Joined: 21 Aug 2006

Please see note: http://www.autohotke...-63#entry562252




adrianh
  • Members
  • 616 posts
  • Last active: Apr 07 2016 03:35 PM
  • Joined: 28 Oct 2012
Could you give a sample input? I wouldn't have thought that a foreign string (one that comes from outside of the AHK script) would be interpreted the same way as one that is within.

EDIT: what is the resulting string going to be used for?



Adrian



guest3456
  • Members
  • 1704 posts
  • Last active: Nov 19 2015 11:58 AM
  • Joined: 10 Mar 2011

i can't stand auto concat, the . operator makes code so much more readable



adrianh
  • Members
  • 616 posts
  • Last active: Apr 07 2016 03:35 PM
  • Joined: 28 Oct 2012
I find the opposite actually, but maybe it depends on how it's being used?


Adrian
