Anagrams

Post gaming related scripts
Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

14 Jul 2017, 08:44

When I woke up, somehow I knew exactly

I love when that happens :angel:

Cheers
littlegandhi1199
Posts: 42
Joined: 29 Aug 2016, 23:58

Re: Anagrams

14 Jul 2017, 09:19

Helgef wrote:Hello. I have several comments.
First,
Helgef wrote:I see little need to improve something that works more or less instantly, in a on-user-typing context.
I still think this is true, however, trying to improve serves other purposes, such as learning and just having fun ;) So I will try to contribute with some ideas.

First some general comments. When we try to improve performance, we need to start where it matters. To know where that is, we have to benchmark, or measure what we can, and see where we actually spend our time. For example, I found that for input string
"abcdefghijklmno" the line

Code: [Select all] [Download] GeSHi © Codebox Plus

GuiControl,, LBox, % "|" (WordList ? WordList : "no anagrams for " String)

takes almost half a second :x however,

Code: [Select all] [Download] GeSHi © Codebox Plus

GuiControl, hide, LBox
GuiControl,, LBox, % "|" (WordList ? WordList : "no anagrams for " String)
GuiControl, show, LBox

takes ~150 ms, simple improvment! :dance: Here are my measurments, for input string "abcdefghijklmnop", 16 letters,

Code: [Select all] [Expand] [Download] (Untitled.txt)GeSHi © Codebox Plus


So combinations() is the culprit, no surprise really. The sort we needn't think about.
The problem with combinations() as you noted, is the repetions, there are minor issues with the implementation, but really the problem is the algorthim. About the implementation, one thing that stands out is that you do

Code: [Select all] [Download] GeSHi © Codebox Plus

Keyword := make_Keyword(NextShorter)

but then return NextShorter and then you do

Code: [Select all] [Download] GeSHi © Codebox Plus

Keyword := make_Keyword(SubString)

again, you could return the KeyWord instead, its not gonna save you but still it is a bit wasteful. Note, then you need

Code: [Select all] [Download] GeSHi © Codebox Plus

AdjLength := StrLen(KeyWord) // 2 + 1

for correct lookups. (I think :lol:)
Regarding the algorithm, during the no-spoiler period, I did implement a general combination function,

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


I try to explain in comments. It is general because it doesn't concatenate strings, it takes any collection of items and do all combinations, without repeats and organise the result in arrays. So this is not directly usable for your case, but could easily be adepted, I know because I have done it :lol: .
Now, reading littlegandhi1199 s explanation I think this is the same idea. However, @littlegandhi1199, regarding repetions, there are no repetions for unique items, but if items are not unique you get repetions. So you will have to sort that out anyways, as wolf_II have noticed here. I think your implemention is correct there wolf_II, in my implemention, I worked with strings, not arrays, then I do Sort, result, U to remove duplicates.

Sorry for the long post :problem:


wolf_II wrote:@Helgef: I was too tired to realize what I did and didn't do. When I woke up, somehow I knew exactly why I get repetitions with input=ABBA. Cause the repeated letters of the input. D'oh. Those can't possibly be avoided, They MUST be filtered. Then I checked for responses in this thread, and there it was: "if items are not unique you get repetitions. So you will have to sort that out anyway".

I need some time to read your general solution, I will implement the hiding of the Listbox when updating it, I will check out sort, result, U and and and ...
:D Thank you. :D

I write some more in this thread, but I want to write code first.


Have fun and good luck wolf. I will try to understand Helgef, because I'm sure those comments only benefit me so I thank you! Still very annoying that pre-perfection his is somehow faster. I suppose some functions are just faster then simple array lookups..
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

14 Jul 2017, 11:13

@Helgef

Code: [Select all] [Download] GeSHi © Codebox Plus

GuiControl, hide, LBox
GuiControl,, LBox, % "|" (WordList ? WordList : "no anagrams for " String)
GuiControl, show, LBox
Very nice, thanks for sharing! :thumbup:



Code: [Select all] [Download] GeSHi © Codebox Plus

Keyword := make_Keyword(NextShorter)
I need the keyword for If Not Store[n].HasKey(Keyword) and Store[n, Keyword] := A_Index but I want the result to be "an array with all the combinations of String" which must not contain any ,.



Code: [Select all] [Download] GeSHi © Codebox Plus

AdjLength := StrLen(KeyWord) // 2 + 1
docs wrote:Retrieves the count of how many characters are in a string.

I have seen in the past questions about this in the forum, I think maybe the docs would benefit from an example like:

Code: [Select all] [Download] GeSHi © Codebox Plus

; ANSI string
AString := "abdc"
MsgBox, % AString "`n`n" StrLen(AString)

; Unicode string
WString := "ШЙфЖ"
MsgBox, % WString "`n`n" StrLen(WString)
I seem to remember once before you said something to the effect "I'm never entirely sure". I am not sure at all either, but I think AdjLength := StrLen(KeyWord) will work without // 2 + 1.



I love when that happens :angel:
:D

And now I have a bit of reading to do with your general combination function.



@littlegandhi1199: Thank you again for your explanations earlier. :thumbup:
My initial approach was (as I mentioned) convoluted when I started with the full length string and removed more and more letters. It is much clearer to start with a single letter and build up from there.



This is my current implementation:

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus

I like this much better than version 1.17, but its speed is not as much improved as I hoped, if at all.
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

14 Jul 2017, 12:11

@Helgef:
Step-by-step, I'm working through your combination function. I'm glad to see 2**n-1 in there. This tells me I was on track.
But more importently, I have written a "n choose k" function lately, which avoids floats by separating the division from the multiplication.
When I did them on the same line it always ended up as float. Is it worth (for speed) to not worry about floats like you did?

AND: Is your code correct? Seems to be off by 1 for: 50 choose 5, with Round() being the main suspect?

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus

Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

14 Jul 2017, 12:12

Regarding,

Code: [Select all] [Download] GeSHi © Codebox Plus

AdjLength := StrLen(KeyWord) // 2 + 1

and

Code: [Select all] [Download] GeSHi © Codebox Plus

I need the keyword for If Not Store[n].HasKey(Keyword)

The suggesting was not to remove that, but to do Result.Push(Keyword), so you do not have to do the call to make_Keyword() in the for-loop, but instead

Code: [Select all] [Download] GeSHi © Codebox Plus

For each, Keyword in Combinations(String) {
AdjLength := StrLen(Keyword) // 2 + 1 ; KeyWord has "," in it, so we need to adjust the string length
For each, Anagram in DICT[AdjLength, Keyword]
WordList .= WordList ? "|" Anagram : Anagram, Count++
}

Reading it again, I see I wasn't very clear. But I give you another hint, we can actually reduce the number of calls to make_Keyword to 1 call per ShowAnagrams, with very small modifications. :lol:

"I'm never entirely sure"

I'm not entirely sure I did 8-)

I did look at your recursive Combinations(), it is very nice, doing recursive is very elegant, but as you say, it will probably not be very fast due to the same elegancy. The recursion which will probably be quite deep for longer strings. There is some noise in the result too it seems, for "abcd" I get one item "abcda", unless my printing is incorrect. Otherwise it seems correct.

Edit:
wolf_II wrote:I have written a "n choose k" function lately, which avoids floats by separating the division from the multiplication.

Very good, yours is much better, I just looked up the formula (I can never remember) and typed it in, quick test on a few numbers and it seemed ok. Thanks for catching, I will steal yours. :thumbup:
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

14 Jul 2017, 13:19

I understand, Thanks. I need to change the "fundamental design" of the function, and return KEYWORDS instead of the substrings themselves.

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus





There is some noise in the result too it seems, for "abcd" I get one item "abcda"

I can not confirm that, sorry.
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

14 Jul 2017, 13:52

version 1.18

I also updated the first post.



Edit: Time measurements with input = "thequickbrownfox" show Combinations(String) takes about 2 seconds (was about 1 second with v1.17) :(
Recursion is nice and readable, but slow. Helgef said so already, I just post my measurements.
User avatar
jeeswg
Posts: 1486
Joined: 19 Dec 2016, 01:58
Location: UK

Re: Anagrams

14 Jul 2017, 16:05

@wolf_II @Helgef @littlegandhi1199 (ordered by first appearance on this thread)

I have various functions and scripts here:
combinations and permutations (and anagrams) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=34244

Including this post, that explains the basic principles:
combinations and permutations (and anagrams) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=34244&p=158874#p158874

One thing unlike what I've done is this Permutations function, a recursive function:
Permutations() - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=34230

Anyhow I'd wondered if some/all of you would like to try to do a post like mine, explaining the basic principles of what you've achieved so far.

I'd also point out the usefulness of the language: 'combinations/permutations with/without repetition', and 'alphabetised strings', to help think/talk about the issues.

And it's possible that you'll benefit from some of the techniques that I've outlined. Thanks.
kon
Posts: 1748
Joined: 29 Sep 2013, 17:11

Re: Anagrams

14 Jul 2017, 17:10

Just for fun I made a simple one of my own. After I wrote it I read through some of the versions that wolf_II posted and I see many similarities. Just thought I would share. Barely tested.

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


Edit: --Depth should have been Depth - 1.
Last edited by kon on 14 Jul 2017, 20:45, edited 1 time in total.
Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

14 Jul 2017, 17:29

wolf_II wrote:version 1.18


Edit: Time measurements with input = "thequickbrownfox" show Combinations(String) takes about 2 seconds (was about 1 second with v1.17) :(
Recursion is nice and readable, but slow. Helgef said so already, I just post my measurements.

My initial testing was positive, v 1.18 seems like an improvement, although slight. Results may ofc vary between different strings, that is a good observation.
From my perspective you have the following options
  • Leave it as it is, beacause it is very nice now (Boring ;) )
  • Improve your implementation of the latest algorithm to avoid recursion.
  • Revert to the previous algorithm, which wasn't recursive, but a bit wasteful, and see if you can better it by returning keywords from combinations().
  • Also, in v 1.18, there is minimal effort to reduce the number of calls to make_keyword(), you need 1 call total per search (per input string). I'm not going to spoil the fun for you, you will be very happy when you see it, and it improves performance a lot.

@jeeswg, hi.
Nice topic, I did post a combine() no repetions, script in this thread, here (last codebox).

@kon, nice of you to join, I will take a look at your scrpt :wave:

Cheers.
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

14 Jul 2017, 17:35

@kon: Nice one, thanks for sharing. :D :thumbup:



@jeeswg: Here is a collection of combinations/permutations with/without repetitions: Basically the same as I posted in your thread. To follow your request, I have added two functions, whose purpose is solely to complete the list of functions, and maybe save me reading through the wiki in the future, just to dig out what means what.

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus





@Helgef: yes, I will look into avoiding recursion.
Last edited by wolf_II on 14 Jul 2017, 19:28, edited 1 time in total.
Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

14 Jul 2017, 17:50

@ kon, very nice :thumbup:. I have used the array[strsplit(word)*] method for word look-ups in an auto complete script, it is very fast and convenient, both implementation-wise and execution-wise. It is a bit wasteful on memory usage though, but I don't really mind.
However, it is not immediately clear to me what the "__" does :crazy:
Also, nice choise of delimeter, chr(1), never though of that, I struggle to decide between "|", "`n" and"," :lol:
kon
Posts: 1748
Joined: 29 Sep 2013, 17:11

Re: Anagrams

14 Jul 2017, 18:20

Thanks for reading :)

To say that array[strsplit(word)*] is a bit wasteful memory-wise is putting it mildly :)

The "__" key is a bit tough to explain so maybe an example is best. Compare the two.

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus



Chr(1) - I got the idea from PhiLho here: https://autohotkey.com/board/topic/7112 ... /?p=187505

Edit: To further explain the "__" key; single-character keys are part of the word. 'value' would work instead of '__'. It could be any key, the only requirement is that it is more than 1 character in length.
Edit2: Found and corrected an error in my previous post.
Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

15 Jul 2017, 04:41

Thank you very much for you explanation kon. I think I get it, but I do not get why it need be strlen>1 in this case, it would suffice it is not a letter, right? I make a small modification to to allow permutated input, eg, we find all anagrams for
police even if we type epolic.

Very nicely done kon, it is short and sweet, if it where not for the recursion, this would be the fastest one I am sure.
My mod, I put the result in the clipboard and only show the number of found anagrams,

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


Note: I changed the wordlist file.
Edit: code fix.
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

15 Jul 2017, 09:53

@Helgef:
Progress update: After I realized I need to let go of the idea that my Combinations() HAS TO return an array of substrings, It took only another 24 hours :D until I realized the real meaning of it: I can stop fiddling with the input string and start fiddling with the keywords instead. Thank you for not spoiling it at the start, and make me see for myself. Here is the new function:

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus

It starts with the full string, (as seen in v1.17) and removes "letter by letter" from the respective keywords.



Helgef wrote:Also, in v 1.18, there is minimal effort to reduce the number of calls to make_keyword()
I like to try to apply the idea to v1.18 (building up to full string) as well now. Since it involves "minimal effort" it might take me 24 hours again. :D



Here are comparisons of v1.17, v1.18 and v1.19, all using the 110k words file, with input = "thequickbrownfox":
  • v1.17 - 1.091 seconds
  • v1.18 - 1.915 seconds :(
  • v1.19 - 0.415 seconds :dance:
kon
Posts: 1748
Joined: 29 Sep 2013, 17:11

Re: Anagrams

15 Jul 2017, 10:45

Helgef wrote:Thank you very much for you explanation kon. I think I get it, but I do not get why it need be strlen>1 in this case, it would suffice it is not a letter, right?
Yes that would work provided you could guarantee that the character would not be in any word.

Helgef wrote:I make a small modification to to allow permutated input, eg, we find all anagrams for
police even if we type epolic.
The version I posed above does find police with the input 'epolic'. Perhaps I misunderstand. Good call sorting the string before sending it to Combinations();that was a glaring oversight on my part. I didn't test speed, but it will surely be an improvement.

Helgef wrote:Very nicely done kon, it is short and sweet, if it where not for the recursion, this would be the fastest one I am sure.

Thanks. Instead of a limit on the length of the input string, the limit is on the depth of recursion. I thought the stated goal of the OP was not to find ALL matches. ie: "My goal is to write the script such that it would display for example, possible 7- or 8-letter solutions for any given 9-letter string." I may have missed some discussion where it was decided to find ALL combinations, but I thought I would just point out the advantage for longer strings. ie: strlen>22 is still instant, provided the depth of recursion is limited.

@wolf_II Nice new Combinations() function. :thumbup: Similar to my comments above about limiting the depth of recursion... If you don't generate combinations that are less than, say (StrLen(String) - 4), that would speed things up a lot for longer stings. In both of our scripts, it could even be adjustable. ie: In my case, the user could choose the depth of recursion. Or in your case, the user could decide how many letters to remove from the string. ie: for an input string > 18 characters long it's not worth the time it takes to generate all the small words so you might as well skip all the words less than a length of your choosing.
Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

15 Jul 2017, 11:20

kon wrote:The version I posed above does find police with the input 'epolic'.

Indeed it does, my mistake :oops:
I may have missed some discussion where it was decided to find ALL combinations

Indeed, our ambitions grew in parallel with our greed. :twisted:
You make very good points about letting the user decide about trimming the input, it might not be desired to view (and wait for) all those short anagrams when searching for longer ones.

wolf_II wrote:v1.19 - 0.415 seconds
Very good, and you did find out where to call make_keyword :superhappy:

:idea: We could have used a found in x ms indicator in the status bar.
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

15 Jul 2017, 12:42

Anagrams.png
Anagrams.png (16.77 KiB) Viewed 94 times

@kon: Thanks for the compliment and the suggestion. :D
@HelgefI will move the time display to the status bar, I first wanted to show the time here. (using my existing code)
@all: This screen shot is just my previous time measurement put in again, Check out the time for full length input (26). Also a preview of how I implemented kon's suggestion.

I feel the wording for "Use minimum letters:" and "use all but x" still needs to improve. Any ideas anyone?
It should be something to the effect: "use all, but allow to drop up to 5 letters". But I wanted the text to fit. Maybe I use two lines.
wolf_II
Posts: 1395
Joined: 08 Feb 2015, 20:55

Re: Anagrams

15 Jul 2017, 18:16

I finished the conversion of the recursive algorithm:

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


I performs better than before (input = "thequickbrownfox", 110 k words) time = 0.62 seconds (before 1.915), but the previous algorithm without recursion is still the fastest. (0415 seconds).
I'm a little surprised to be honest, I did not expect for make_Keyword() the have that much of an influence.
Helgef
Posts: 1788
Joined: 17 Jul 2016, 01:02

Re: Anagrams

16 Jul 2017, 06:18

The gui looks great. Maybe something like:

Code: [Select all] [Download] (Untitled.txt)GeSHi © Codebox Plus

Exclude words of length: [1,2 .., show all.]
My version, a variant of the combine() function I posted earlier, cannot do a minumum, since it builds from bottom up :cry:. It could exclude from the output, but no gain in performance.

About the latest combinations(), it is really too bad the recursion slows it down so much, beacuse it really looks nice. Even though you do circa 30000 calls to make_keyword more than you need, per thequickbrownfox search, it seems to only improve by about 20 % when modified to do only one call to make_keyword. I am using the 370 k wordlist btw.

Cheers.

Return to “Gaming”

Who is online

Users browsing this forum: No registered users and 8 guests