one-loop sorting of strings by their length

dashunbaba
Posts: 2
Joined: 20 Oct 2017, 05:37

one-loop sorting of strings by their length

20 Oct 2017, 05:49

Code: [Select all] [Expand] [Download] (sort_by_length.ahk)GeSHi © Codebox Plus

SpeedMaster
Posts: 100
Joined: 12 Nov 2016, 16:09

Re: one-loop sorting of strings by their length

20 Oct 2017, 14:04

Ok, but you messed up the array index :(

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus

dashunbaba
Posts: 2
Joined: 20 Oct 2017, 05:37

Re: one-loop sorting of strings by their length

22 Oct 2017, 09:22

That's right. I just set the length of a string as its index so the longer ones go to the end automatically.
Why do you need the index to be right, though? The elements are in the right order. Job done.
SpeedMaster
Posts: 100
Joined: 12 Nov 2016, 16:09

Re: one-loop sorting of strings by their length

22 Oct 2017, 16:11

dashunbaba wrote:The elements are in the right order. Job done.

No, your function does not work !! :problem:

Here is an example of a failed test using your function sort_by_length()

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus

Helgef
Posts: 2483
Joined: 17 Jul 2016, 01:02
Contact:

Re: one-loop sorting of strings by their length

22 Oct 2017, 16:46

You have the right idea, but need to make a small fix,

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


Cheers.
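
For readers who cannot see the collapsed code blocks, here is a minimal sketch of the length-as-index idea with one bucket per length, so that strings of equal length do not overwrite each other. It is reconstructed from the `arr_temp[this_len].push(val)` snippet quoted later in this thread, not copied from the posts above; the name SortByLengthBuckets is made up for illustration, and AHK v1.1 is assumed.

```autohotkey
; Hypothetical helper (not from the thread): bucket strings by their length.
SortByLengthBuckets(arr) {
    buckets := []                       ; buckets[len] holds every string of that length
    for index, val in arr {
        len := StrLen(val)
        if !IsObject(buckets[len])      ; first string of this length: create its bucket
            buckets[len] := []
        buckets[len].Push(val)
    }
    sorted := []
    for len, bucket in buckets          ; integer keys are enumerated in ascending order
        for index, val in bucket
            sorted.Push(val)
    return sorted
}

words := ["ccc", "a", "bb", "dd", "e"]
for index, val in SortByLengthBuckets(words)
    out .= val " "
MsgBox % out                            ; shows: a e bb dd ccc
```

Strings of the same length keep their input order, because each bucket is filled in the order the input is read.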
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

16 Nov 2017, 16:04

Thanks to all three of you for the great idea, for finding the inconsistency, and for the fix.
Curious about speed, I added four more versions:
- two derived from this thread
- one I have used until now for large data
- a beauty from another genius for short instances
For testing I used a large wordlist.

The fastest version in this example is not the best choice for every purpose.
For my use case, Helgef's version is the fastest. I'm dealing with numbers in the range of 1M, but only
ca. 5% of them are used, and the same value appears rarely, only 2 to ca. 20 times (never much more).
So plain loops spend ca. 95% of their cycles idle, and array-assisted loops do double work.

It seems that push() is slightly faster than classical assignment for small indices, while with increasing index push() becomes progressively slower.

To see/check the result, please shorten the wordlist to 20,000 lines. The display used here doesn't handle much more.

Please use extended tests below!

bye!
Attachments
wordlist.rar
(1.54 MiB) Downloaded 10 times
Last edited by rommmcek on 18 Nov 2017, 03:52, edited 1 time in total.
Helgef
Posts: 2483
Joined: 17 Jul 2016, 01:02
Contact:

Re: one-loop sorting of strings by their length

16 Nov 2017, 17:14

Some of your ternaries look weird, e.g.

Code:

(arr_temp[this_len]="")?(arr_temp[this_len]:=[])
arr_temp[this_len].push(val)

I do not see any :.

Maybe, try this one too, I tried to make it as ugly as possible,

Code:

f(arr){
    lens := []                                          ; keys of lens = the string lengths that occur
    for k, ss in arr
        lens[l := strlen(ss)] := "", slen%l% .= ss "`n"  ; append ss to the variable for its length
    for l in lens                                       ; integer keys are enumerated in ascending order
        t .= slen%l%
    return strsplit(rtrim(t, "`n"), "`n" )
}

Also, you do not include the time it takes to do the strSplit for the for-loop functions. That is, the_array:=StrSplit(the_arr, "`n", "`r").
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

16 Nov 2017, 23:45

Ho, ho, a very brisk and crisp response!

Not ugly at all! Essentially the same idea, but cleaner, more condensed, and still faster... Thanks!
I especially like the witty array name "lens"!
Nevertheless, it inherently does double work, so your first code is still the fastest for my numbers thing!

I'm no pro. I thought : was optional, like else is in an if statement. The time measurement is the same for all versions except the plain loops and the Sort function, but those are hard to compare exactly, since they both take a variable as input. Besides, this is only ca. 5% and the error is almost 2%, so I guess it's not too bad.

Edit: Oh, I see you meant:

Code:

(arr_temp[this_len]="")?(arr_temp[this_len]:=[], arr_temp[this_len].push(""))
:(arr_temp%this_len%.=val "`n")

It's better, but still doesn't reach yours. Maybe it led you to your masterpiece!
It's faster but sorts wrongly!!!
Last edited by rommmcek on 18 Nov 2017, 03:54, edited 1 time in total.
Helgef
Posts: 2483
Joined: 17 Jul 2016, 01:02
Contact:

Re: one-loop sorting of strings by their length

17 Nov 2017, 06:06

Hello rommmcek :).
First, thanks for sharing your tests and ideas. Second, yes, it does extra work. Here is almost the same again, this time with less work,

Code:

g(str){
    lens := []
    loop parse, str, `n, `r
        lens[l := strlen(A_LoopField)] := "", slen%l% .= A_LoopField "`n"
    varsetcapacity(str,varsetcapacity(str)+2)   ; reuse str as the output buffer; requesting a capacity blanks its old contents
    for l in lens
        str .= slen%l%
    return strsplit(rtrim(str, "`n"), "`n" )
}

In a real use case, I'd use ByRef for the input parameter. That avoids copying it, unless you need the input unchanged for something else.

A ternary consists of ? and :, and the : is not optional.

This: the_array:=StrSplit(the_arr, "`n", "`r") takes time. To make fair comparisons between the loop parse and for-loop methods, you should include the time it takes.
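
As a rough sketch of what "including the time" could look like, the StrSplit call is simply moved inside the timed region. The f() below is the one posted earlier in this thread, re-indented; the tiny input string is a stand-in for the real wordlist, and A_TickCount is assumed to be precise enough (its resolution is typically around 10 to 16 ms, so a real test needs a large input or a finer timer).

```autohotkey
f(arr) {                                     ; f() from this thread, re-indented
    lens := []
    for k, ss in arr
        lens[l := StrLen(ss)] := "", slen%l% .= ss "`n"
    for l in lens
        t .= slen%l%
    return StrSplit(RTrim(t, "`n"), "`n")
}

the_arr := "ccc`r`na`r`nbb`r`ndd"            ; stand-in input, not the real wordlist
start := A_TickCount
the_array := StrSplit(the_arr, "`n", "`r")   ; the split is now part of the measurement
sorted := f(the_array)
elapsed := A_TickCount - start
MsgBox % "Split + sort took about " elapsed " ms"
```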

You can improve the sort function. An equivalent version,

Code:

ShortestFirst(a1, a2, ShortestFirst) {
    return (d := strlen(a1) - strlen(a2)) ? d : -ShortestFirst
}

or alternative,

Code:

ShortestFirst(a,b){
    return strlen(a)-strlen(b)
}

Very untested.

Cheers.
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

18 Nov 2017, 03:49

This is fine-tuning par excellence! You are unstoppable!
All three work fine:
- the first is still a bit faster (as expected)
- the second, using Sort, is ca. 1/3 faster than before
- the last one, using Sort, is ca. 4/5 faster than before, making Sort acceptable even for large instances
Unbelievable!

I first encountered the ternary on the forum without knowing its name (to look it up in the docs).
O.k., I sometimes make a binary out of it, but I swear it behaves as expected (like an if without else).
Should I do: (a>b)?(c:=a+b):() if I don't need the third part?

Time measurement depends on the starting point. You are right! The name of the thread is "one-loop sorting of strings by their length", and I wrongly assumed that the respected dashunbaba started from an array, so I didn't want to put his work (with your fix) at a disadvantage. After all, we all ended up with an array! I fixed the tests and added your new functions.

Note: Further changes were made to the tests following Helgef's suggestions. See the post below.
Thanks again for your extraordinary commitment!
Last edited by rommmcek on 19 Nov 2017, 02:22, edited 3 times in total.
jeeswg
Posts: 3010
Joined: 19 Dec 2016, 01:58
Location: UK

Re: one-loop sorting of strings by their length

18 Nov 2017, 09:07

Re. ShortestFirst:
return (d := strlen(a1) - strlen(a2)) ? d : -ShortestFirst
versus
return strlen(a)-strlen(b)
The first maintains the order of any items with the same length, i.e. it gives a stable sort, e.g.
a,b,c,d,e -> a,b,c,d,e
versus
a,b,c,d,e -> b,c,d,e,a
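
A small sketch of the stable variant in use, assuming AHK v1.1 and the documented behaviour of the Sort command's F option, where the optional third parameter is the offset of the second item from the first in the original list. The function name LengthStable and the sample list are made up for illustration.

```autohotkey
; Illustrative name, not from the thread.
LengthStable(a1, a2, offset) {
    ; offset > 0 means a2 came after a1 in the original list,
    ; so returning -offset keeps equal-length items in their input order.
    return (d := StrLen(a1) - StrLen(a2)) ? d : -offset
}

list := "bb`na`ncc`nd`nee"
Sort, list, F LengthStable
MsgBox % StrReplace(list, "`n", ",")   ; shows: a,d,bb,cc,ee
```

With the shorter comparator that returns 0 for equal lengths, the relative order of "a" and "d" (and of "bb", "cc", "ee") is not guaranteed.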

Re. ternary:

Code:

if (a=b) ? Func1() : Func2() ;AHK v1/v2
if (a=b) ? Func1() ;AHK v1 only
if (a=b) && Func1() ;AHK v1/v2 (workaround for the line above)
if (a!=b) && Func2() ;AHK v1/v2


[EDIT:] Btw what is meant here by 'binary operator'?
v2-changes
https://autohotkey.com/v2/v2-changes.htm
•Binary operator with less than two operands.

[EDIT:] Ah, quite simple.
Oracle Unary and Binary Operators - w3resource
https://www.w3resource.com/oracle/operators/index.php
What are unary and binary operators in C++?

unary : A unary operator is an operator that operates on only one operand. Here is the format : operator operand. Example: +2460, -300. binary : An operator is referred to as binary if it operates on two operands. Here is the format : operand1 operator operand2.
Helgef
Posts: 2483
Joined: 17 Jul 2016, 01:02
Contact:

Re: one-loop sorting of strings by their length

18 Nov 2017, 09:58

Code:

(a=b) ? Func1() ;AHK v1 only

No, a ternary is condition ? expression_if_condition_true : expression_if_condition_false in v1 too.

Should I do: (a>b)?(c:=a+b):() if I don't need the third part?

I do x ? y : "".
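
Applied to the (a>b)?(c:=a+b) case from the question, the same pattern might look like this minimal AHK v1.1 sketch; the values of a and b are arbitrary.

```autohotkey
a := 5, b := 3
(a > b) ? (c := a + b) : ""   ; both branches present; the empty string is the "do nothing" branch
MsgBox % c                    ; shows: 8
```

If the condition is false, the "" branch is evaluated, nothing is assigned, and c stays empty.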
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

18 Nov 2017, 11:22

Hi jeeswg!

After posting I noticed something could be wrong with return strlen(a)-strlen(b), but I was not sure (I could have had a bug in the other script). Now, with your demo, I think the function shifts an equal value one position down, in a cycle of 8 (eight), each time an equal value is encountered, so an input of:
- a,b,c,d,e,f,g,h or
- a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p or
- a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x or
- etc. (not tested further)
yields the correct result.

Can you confirm that (a>b)?(c:=a+b) is compatible with v1.1?

Ternary means: made up of three parts or things; threefold; triple (Webster). So I dared to use the term binary, since
Binary means: made up of two parts or things; twofold; double (Webster).

bye!

P.s.: Really appreciate your work on the forum! In my opinion one of the most consistent & systematic lately!
jeeswg
Posts: 3010
Joined: 19 Dec 2016, 01:58
Location: UK

Re: one-loop sorting of strings by their length

18 Nov 2017, 11:38

- I made some comments about stable sort v. unstable sort here:
Wish List 2.0 - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=13&t=36789
- In sort functions, you should specify what happens if a > b, and a < b, and a = b.
- If you return 0, for a = b, then the Sort command is allowed to rearrange the items in the final list (it might make sorting quicker). If the items are actually identical, then returning 0 is fine, e.g. in case sensitive sorting.
- If you return -offset, for a = b, then when 2 items are considered equal, the earlier item remains earlier.

- My comment about binary v. ternary operator wasn't aimed at anyone in particular. I'd been meaning to ask about it.

- Instead of (a>b)?(c:=a+b) I would do (a>b)?(c:=a+b):0 as this is AHK v1/v2 compatible.
- Other examples of a ternary operator where you do nothing if the condition is not met:
var := (a=b) ? "new value" : var
(a=b) ? (var := "new value") : 0
- I believe that you can do this in AHK v1: (a>b)?(c:=a+b).
- I don't know if there is some code that you could put after it, that would give you surprising results.
- I don't know if this is a relic of AutoIt. I don't know if Chris, the creator of AutoHotkey, was aware of this behaviour, or was happy to allow it.
- What is clear is that it's not really a ternary operator if there are only 2 items, and that lexikos thought it shouldn't be permissible in AHK v2. And that by using && instead of ? you can achieve the same thing in both AHK v1 and AHK v2.
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

18 Nov 2017, 11:53

Thanks again to both, really enlightening!

P.s.: I'm still hesitating to use v2, since for me there is still a lot to learn about v1.1.
P.p.s.: I edited my ternaries in the tests & made note about last function.
Helgef
Posts: 2483
Joined: 17 Jul 2016, 01:02
Contact:

Re: one-loop sorting of strings by their length

18 Nov 2017, 14:43

jeeswg wrote:- I believe that you can do this in AHK v1:(a>b)?(c:=a+b).

It is a syntax error / invalid expression. For whatever reason, v1 doesn't yield an error. I will try to spell it out ;)

Code:

msgbox % ("can you do it" ? " Yes you can!") ? "It worked!" : "It didn't work!"

And that by using && instead of ? you can achieve the same thing in both AHK v1 and AHK v2.

You certainly cannot. x && y is not a general substitution for x ? y : "", not in v1, and not in v2. :arrow: x && y is not even two-way compatible.
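
A minimal sketch of the difference in AHK v1.1, where the point is the value the expression returns; the variable names are made up for illustration.

```autohotkey
x := 1, y := "hello"
viaTernary := x ? y : ""   ; yields the value of y
viaAnd     := x && y       ; in v1.1 this yields a boolean, 1 or 0
MsgBox % "ternary: " viaTernary "`n&&: " viaAnd   ; shows "ternary: hello" and "&&: 1"
```

So the two forms are interchangeable only when the result of the expression is ignored. As far as I know, v2 makes && return the operand that decided the result, which would explain why x && y does not behave the same in both versions.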


rommmcek wrote:I noticed something could be wrong with return strlen(a)-strlen(b)
Note that return strlen(a)-strlen(b) is sufficient for sorting strings by length. If order of appearance is important, indeed, use,

Code:

return (d := strlen(a1) - strlen(a2)) ? d : -ShortestFirst

which is equivalent to, but slightly faster than, your first sort function. If you want alphabetical order within each length group, use e.g.

Code:

return (d:=strlen(a)-strlen(b)) ? d : a > b ? 1 : a < b ? -1 : 0 ; untested

Your sort tests need to be changed to use the same input, for a fair comparison. (The Sort command modifies its input.)
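
A short usage sketch for the length-then-alphabetical comparator, assuming AHK v1.1 and non-numeric strings (strings that look like numbers would be compared numerically by > and <). The function name LengthThenAlpha and the sample list are made up for illustration.

```autohotkey
; Illustrative name, not from the thread.
LengthThenAlpha(a, b) {
    ; Length difference first; for equal lengths fall back to string comparison.
    return (d := StrLen(a) - StrLen(b)) ? d : a > b ? 1 : a < b ? -1 : 0
}

list := "pear`nfig`napple`nkiwi`nplum"
Sort, list, F LengthThenAlpha
MsgBox % StrReplace(list, "`n", ",")   ; shows: fig,kiwi,pear,plum,apple
```

Because Sort overwrites list, a benchmark that compares several comparators needs a fresh copy of the input for each run.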

I'm still hesitating to use v2, since for me there is still a lot to learn about v1.1.

That is a mistake. v2 is easier to learn; if you had used it, you would have been notified of your invalid ternaries before the script had even started to execute. Of course, there are other pros and cons, and you should use whatever you like.

Cheers.
jeeswg
Posts: 3010
Joined: 19 Dec 2016, 01:58
Location: UK

Re: one-loop sorting of strings by their length

18 Nov 2017, 15:24

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus



- But Helgef, if you learn v2, and all the scripts are currently written in v1, problem.
- To learn AHK, I just learnt what I needed at the time. If the thing is an obvious problem, there's probably a forum thread on it. Otherwise you might need a bit of ingenuity.
- Eventually I felt I'd learnt enough that it was worth going through the whole documentation to fill in the gaps. You can separate AutoHotkey.chm into separate htm files, order the file paths into a txt list, and then use an AutoHotkey script to navigate to the next page in the list when you trigger a hotkey.
- Learning v2 will probably be slightly easier than learning v1 because some of the more fiddly things have been fixed (e.g. InStr/SubStr), and sometimes when there were two ways of doing things, there is now one (e.g. StringReplace removed, StrReplace kept).
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

19 Nov 2017, 02:34

@jeeswg: Consistent & systematic as ever!
@Helgef: Precise as ever! Whispering: why do I sometimes think you are somebody else, a doppelganger? (In the best possible sense! Who else could be so committed & proficient? If I'm wrong, this is the greatest compliment!)
P.s.: I modified the tests above to ensure the same starting conditions and added the new sort function. Please see the new code below!
Last edited by rommmcek on 21 Nov 2017, 20:14, edited 1 time in total.
Helgef
Posts: 2483
Joined: 17 Jul 2016, 01:02
Contact:

Re: one-loop sorting of strings by their length

20 Nov 2017, 14:54

@ jeeswg, although the discussion is interesting, we are way off-topic (again). Your x && y examples are fine when there is no interest in the return of the expressions. I recommend that no one omits the : for the ternary, there is no point, and there are no guarantees that syntax errors will not yield an error in the future.
@ rommmcek,
:arrow: binary code function for sorting strings by length. Feedback appreciated. The source of the binary code is written in C; it is circa 200 lines, compared to the AHK versions, which can be like 3 to 10 lines :lol: :clap: .

Cheers.
rommmcek
Posts: 299
Joined: 15 Aug 2014, 15:18

Re: one-loop sorting of strings by their length

20 Nov 2017, 17:25

@Helgef:
I was planning to supplement the tests with numeric input to show how different the results can be! And you came up with such a whopper, at lightning speed, in record time! (I certainly have not fathomed the whole dimension of it yet!)
And the prize?
I'll play for you one of your favorite songs (in AutoHotkey of course), if you confide us a few titles!

P.s.: For me it is a delectation pure!
