Anagrams

Post gaming related scripts
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

16 Jul 2017, 08:32

version 1.23

I have updated the first post.

Note: The code will always display full length anagrams, e.g. 4-letter anagrams for "east", even if 4 or more letters are to be excluded.

That's caused by Result := [Keyword] in Combinations(). I like it, I leave it in, it's a feature!! :D



@Helgef: Thank you for the suggestion. Much better. :thumbup:
Last edited by wolf_II on 16 Jul 2017, 13:11, edited 1 time in total.
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

16 Jul 2017, 11:36

wolf_II wrote:version 1.23

Excellent, it is very smooth with the restrictions. You might want to include the QPC() function.
I like it, I leave it in, it's a feature!!

I agree :thumbup:

Now the question is, are you done with it?
Can of worms
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

16 Jul 2017, 13:11

Helgef wrote:You might want to include the QPC() function.
Oops, I will edit the code above and the first post. Thanks for catching.

Helgef wrote:are you done with it?
I have on my harddrive a version 2.00 actually :D That has no added features (yet), it's just a re-arranging of code that I do when code gets to above 6-8 kByte. I divide it into several files and keep my scrolling down in the editor. Also I have some helpers to go along, I can easily switch between word lists. And I can look for e.g. anagram lists, where every word is 8 letters long and the number of anagrams is at least 6.

I'm all for exploring the can of worms, if you are serious. :thumbup:
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

16 Jul 2017, 14:46

Here is the version 2.00 as a zip file. I did not include the needed word lists to keep the file size small. With 2 extra files ZIP size is 1.4 MByte vs 8 kBytes.
I included two empty files where my script (a project now) expects to find them. It's also easy to add different files or more files.
The script "Choose word list.ahk" will find them, they need to be one word per line.

Anagrams v2.00.zip
(8.03 KiB) Downloaded 24 times
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

16 Jul 2017, 17:02

This really turned out great wolf_II , really nice setup! :bravo:

About the, now open, can. I am serious :beard: I have not given it much though though, but looking at you v2.00, I think it would be natural to build it in there, there is already an infrastructure there which will probably be useful. When I think of scentence anagrams, I think that all the letters in the scentence can be permutated, not just the words. Eg,

Code: [Select all] [Download] (Untitled.txt)GeSHi © Codebox Plus

eleven plus two
twelve plus one

If you put in elevenplustwo in your anagram searcher, it will find all the words, twelve, plus and one, but there is a lot of noise. I'm thinking, the user enters a sentence, and selects how many words the new sentence should be, and we try to find something. Do you have any ideas?

Cheers.
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

17 Jul 2017, 01:49

My first thoughts:
  • I noticed that Input = "one plus twelve" outputs 473 anagrams skipping 2-letter words and less (110 k words). Its the same when I input the phrase without spaces. That turns out nice.
  • Next I noticed, that I want to have an *.ini file where I store the last input and the last Skip value from the DDL. That might be better than repeatedly editing the defaults.
  • While I'm at it I want to also store the window position, I already have some code to do that, that's easy and will reduce some nuisance when testing later.
  • I think maybe we can (from a new function) call Combinations(String, Skip) in a loop, and see where this leads us. I have to write and test a function to see if that is a dead end or not.
  • I will be away from keyboard in a bit, but I think I can get started soon.



Do you consider it better to start a new thread for Anagrams v2.xx ?
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

17 Jul 2017, 02:53

To make testing easier, here is version 2.01 as a zip-file: :D
Anagrams v2.01.zip
(10.05 KiB) Downloaded 18 times

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus

Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

17 Jul 2017, 07:01

Hello.
I have considered your thoughts, and collected some of my own.
I noticed that Input = "one plus twelve" outputs...
Good observation.
...call Combinations(String, Skip) in a loop...
I think we need to call it only once. Given a sentence, eg, eleven plus two the collection of (key) words we get from Combinations() are the only ones which can form an anagram of the input. However, I am not ruling that idea out, I have had similar thoughts, one could call Combinations() on substrings of some selection(s).
v2.01
Nice updates.

I have a proposal for an algorithm, please tell me if I am unclear, or made a mistake. I try to explain:
The basic idea is more or less the same as with the one word anagram, that is, that all anagrams fall under the same keyword, this is true for the sentences too. So we make a keyword of the user input, eg,

Code: [Select all] [Download] GeSHi © Codebox Plus

make_keyword("eleven plus two") ; Also remove spaces

we get the keyword: eeellnopstuvw (I have dropped the ,, you should too ;) ). This is what we will try to build from what Combinations() gives us. I'd like to define the word sub-anagram first, so we speak the same language.
A word W1 is a sub-anagram of the word W2, if the word W1 can be made by some combination of the letters in the word W2.
Example: If W1:="bd" and W2:="abcde", then W1 is a sub-anagram of W2, but W3:="abf" is not. That is exactly what we get from combinations(), sub-anagrams of the input.

A small example, with some selected output from combinations(),

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


In the real implemention, we should only work with keywords, and I think it will be convenient to modify combinations() to returns keywords in sub-arrays where each subarray only contains equally lengthed keywords. I think string length will be an important filter.

I think the theory is ok, the question is if it will be practical to code it.

I definetely think that V2.xx should live in this thread.

Final note, the number of letters in "eleven plus two" happens to be 13, which makes it even more awesome.

Cheers.
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

17 Jul 2017, 12:20

Helgef wrote:please tell me if I am unclear ...
sub-anagram ...
string length will be an important filter ...
if it will be practical to code it...
Sorry for the lazy quote.

I think you are clear. Well at least I think I understood the changes I need to do:
  • Change Combinations() to return something different again.
  • Adjust ShowAnagrams: to deal with the changed return format.
  • And lastly, make_Keyword() must drop the spaces.
I like "sub-anagram". I used string length to sort by and stored it as a value for each key. Have a look:



Changed Combinations(). I could re-use the former Store array and return that.

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus




Adjustments in ShowAnagrams: for the new returned format.

Code: [Select all] [Download] GeSHi © Codebox Plus

    WordList := "", Count := 0, Start := QPC()
For each, Set in Combination := Combinations(Input, Skip)
For Keyword, Size in Set
For each, Anagram in DICT[Size, Keyword]
WordList .= WordList ? "|" Anagram : Anagram, Count++
Sort, WordList, D| F LengthSort
TimeTaken := QPC() - Start ; time in s



Changed the Needle in make_Keyword() to drop spaces.

Code: [Select all] [Download] GeSHi © Codebox Plus

    Keyword := LTrim(RegExReplace(String, "(.) *", ",$1"), ", ")



Let me try to put in words, where I would go next: Suppose we want a two-word solution to a 11-letter input. Do we expect all letters to be used for that?
If yes, then one word must be at least 6 letters or longer. Would that be a reasonable attempt at reducing the number of computations necessary?
That's just the start point, and we might be able to half the size of lookups in the Combination object in each iteration. The exit condition for the loop would then be a still to define minimum word length. :think:

But I guess that would not be saving much since the lookups cost about O(1) and reducing the number of them to O(n*log n) will give benefit in the range of me previously compacting the custom sort. Which came for free, so I took it.

Have Fun :)

Edit: I corrected a typo in Combinations(), aka bug.
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

17 Jul 2017, 15:57

Changed Combinations()

Code: [Select all] [Download] GeSHi © Codebox Plus

Result[n, NextShorter]
Perfect.
make_Keyword()must drop the spaces
I also mean we should drop the commas, but maybe we wait, we might need them (again) :think: (In all version where you have reduced the calls to make_keyword to 1, the commas has been nothing but superfluous)
Suppose we want a two-word solution to a 11-letter input. Do we expect all letters to be used for that?
I would say yes. In general I think it will be rare to find two-word solutions to longer inputs. We'll see.
If yes, then one word must be at least 6 letters or longer
Indeed. This will limit the computations considerably, we only have to consider combinations of pairs of lengths: (6,5), (7,4) .... (10,1).
The exit condition for the loop would then be a still to define minimum word length.
The exit condition will probably be no more solutions possible, due to string length conditions.

Cheers.
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

17 Jul 2017, 18:54

I tested what it would require to drop the commas: It would mean adjusting Combinations() and no change needed either for ShowAnagrams: or make_DICT. I currently think the way to do it would be inside make_Keyword() function we keep the most recently posted RegEx needle then sort while we have a delimiter for sorting and drop the commas on the last line.

Code: [Select all] [Download] GeSHi © Codebox Plus

;-------------------------------------------------------------------------------
make_Keyword(String) { ; return a sorted string of chars
;-------------------------------------------------------------------------------
Keyword := LTrim(RegExReplace(String, "(.) *", ",$1"), ", ")
Sort, Keyword, D,
Return, StrReplace(Keyword, ",")
}

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus



This all tested OK for me. Shall we look at how and where we decide if we are looking for multiple-word solutions. Depending on this would weather we ignore the chosen "Skip" from DDL. If there are 3 spaces in the input we go for a (maybe up to 4-word, maybe exactly 4-word) multi-word solution and ignore the "Skip" DDL. I'm sure that needs to happen before we call Combinations(). We could add an optional parameter to Combinations() which, when present, passes the number of spaces in the input. A different way is: when spaces are detected we call a different function to deal with multi-word solutions that calls Combinations() maybe only once along its way. Presently I gravitate towards a second function.

Have fun :D



LOL this feels so odd after removing commas from the post. Like I lost my ability to recognize sentences and put a delimiter between them.
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

17 Jul 2017, 19:08

Would a function Complement(Keyword, Subkey) be useful? Where the complement of subkey with regards to keyword is e,g, Complement("abcdef", "bdf") = "ace".
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

17 Jul 2017, 19:10

Return, StrReplace(Keyword, ",")

Exactly! Maybe we can omitt the Ltrim to, I think the sort ignores "blanks" by default.
It makes combinations() more readable too, when we dont have weird indices due to the commas. :thumbup:

I think we might want to look at a small range to begin with, maybe 2-4 words. I think a sentence containing multiple short words a not very likly to make any sense anyways.
The spaces can be removed before we call combinations, they have no value there, the string length should not take the spaces into the count.
I think we should call combinations from ShowSentenceAnagrams, that is, two separate routines.

:lol:
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

17 Jul 2017, 19:11

wolf_II wrote:Would a function Complement(Keyword, Subkey) be useful? Where the complement of subkey with regards to keyword is e,g, Complement("abcdef", "bdf") = "ace".
If we go with the combinations in loop idea, we should have that function. I think, I am out for today.
Edit: Maybe it is a good idea regardless... :think:
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

18 Jul 2017, 07:31

Regarding LTrim + Sort: I don't think it's possible to drop LTrim. I tested with the following script:
test without LTrim



Regarding Complement(): I have written and tested a nice version with a For loop and compared it with a second version with a vanilla Loop.
Vanilla style is between 15 and 20 % faster on my PC. I prefer the For loop but the faster one is not too bad.
test with Complement()



Regarding object.HasKey(key) vs ObjHasKey(object, key) that littlegandhi1199 brought up a while ago here.
Here is my test script, I have found a difference in favour of ObjHasKey() but at about 1/10 of what was claimed. :(
There was no code with the claim. Here is my code for everybody to test and verify/falsify. I have found about 50 ms difference over a million calls to HasKey IMHO, this is insignificant.
incorrect test with HasKey

better test



I'm ready to start working on a "Search for multiple-word solutions" function and a way to call it from ShowAnagrams:.
Last edited by wolf_II on 18 Jul 2017, 10:28, edited 1 time in total.
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

18 Jul 2017, 09:31

Ltrim
This is of less importance of course, but I am thinking,

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus



Regardin Complement(), nice function :thumbup:
I'm not surprised the loop is slightly faster. If this function will be called a million times, we might want to skip the for loop. Loop,Parse seems even a tiny bit faster,

Code: [Select all] [Download] GeSHi © Codebox Plus

;-------------------------------------------------------------------------------
Complement3(Keyword, Subkey) { ; remove from Keyword all the letters from Subkey
;-------------------------------------------------------------------------------
Loop, Parse, Subkey
Keyword := StrReplace(Keyword, A_LoopField,,, 1)
Return, Keyword
}


Regarding HasKey. There is a problem with your test.

Code: [Select all] [Download] GeSHi © Codebox Plus

 T1_Start := QPC() ;-----------------
Loop, % 1000*1000
If oTest.HasKey("asdf")
Msgbox, % T1 := QPC() - T1_Start ; Added visual ------------

Modifying it, I observe a differance in favour of ObjHaskey

Code: [Select all] [Expand] [Download] GeSHi © Codebox Plus


Code: [Select all] [Download] (Untitled.txt)GeSHi © Codebox Plus

0.347061
0.262215 (ObjHaskey)

132%

If it is significant to the application at hand, we will see.


I'm ready

I look forward to see what you will come up with.
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

18 Jul 2017, 10:45

Regarding LTrim:
make_Keyword() is called from Make_DICT() where there are no spaces.
make_Keyword() is called from Combinations() where there are no spaces.
Currently I'm half way done with MultiWordAnagrams() and I don't call it there either.
I understand my testing with strings containing spaces was missing the point. Thanks for putting me back on track. :D



Regarding Complement3:
I did not even think of trying Loop, Parse. Thanks for sharing, or more like thanks for your relentless teaching. I think I learned more about general programming while doing anagrams than the last couple of years put together. :thumbup:



Regarding HasKey:
Yep, I messed that up badly. Somehow my Count variable did not stick. I will edit the post above. Who is messing with my stuff, there is nobody here?



Quick progress update: I can find "eleven tow" and "eleven two" from " onetwelve". There are the swapped-word solutions still missing but what puzzles me at present is where is the "twelve one" and "twelve eon". Well, back to debugging.
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

18 Jul 2017, 10:58

Currently I'm half way done with MultiWordAnagrams()

Are you sure?
Well, back to debugging.

Just kidding. :D

Quick progress update: I can find "eleven tow" and "eleven two" from " onetwelve".
Sounds very promising.
wolf_II
Posts: 1581
Joined: 08 Feb 2015, 20:55

Re: Anagrams

18 Jul 2017, 12:29

version 2.03

Anagrams v2.03.zip
(10.47 KiB) Downloaded 14 times



This version does not deal with the number of space in the input. When there are any I pass the SpaceCount as a parameter but my algorithm ignores it for now.
Number of solutions to " onetwelve" = 32 (for small word list)
Number of solutions to " onetwelve" = 186 (for large word list)
8-)
Helgef
Posts: 2420
Joined: 17 Jul 2016, 01:02
Contact:

Re: Anagrams

18 Jul 2017, 13:21

Well done! I shall examine your code closer later.
I observed, for example, go od gives dublicates in the result.

Cheers.

Return to “Gaming”

Who is online

Users browsing this forum: No registered users and 5 guests