Re: Anagrams
Posted: 18 Jul 2017, 13:58
by wolf_II
Bugfix for duplicates:
Code:
;-------------------------------------------------------------------------------
MultiWordAnagrams(String, SpaceCount) { ; return an array with the sub-keywords
;-------------------------------------------------------------------------------
    global Max_Input_Length, DICT
    Solutions := Combinations(String, Max_Input_Length)
    Keyword := make_Keyword(String) ; String has no spaces!
    Len0 := StrLen(String) ; # of letters in String (Input without spaces)
    Len1 := (Len0 + 1) // 2 ; # of letters in longer word
    Result := [], Previous := []
    Loop, % Len0 - Len1 {
        Index1 := Len1 + A_Index - 1 ; iterate from e.g. 5 to 8 (for 9 letters)
        For Key, Size in Solutions[Index1] {
            Compl := Complement(Keyword, Key)
            If Not Previous.HasKey(Compl)
            And DICT[Size].HasKey(Key)
            And DICT[Len0 - Size].HasKey(Compl) {
                Previous[Key] := True
                For each, Anagram1 in DICT[StrLen(Key), Key]
                    For each, Anagram2 in DICT[StrLen(Compl), Compl] {
                        Result.Push(Anagram1 " " Anagram2)
                        Result.Push(Anagram2 " " Anagram1)
                    }
            }
        }
    }
    Return, Result
}
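The helpers make_Keyword() and Complement() are called above but not shown in this post. A minimal sketch of what they might look like follows. It assumes that a keyword is simply the letters of a word in sorted order, and that Complement() removes each letter of Key once from Keyword. Both bodies are guesses for illustration, not the versions in the actual script.

```autohotkey
MsgBox, % make_Keyword("eleven")          ; eeelnv
MsgBox, % Complement("eeelnv", "eel")     ; env
Return

; Hypothetical helper: a word's keyword is its letters in sorted order.
make_Keyword(String) {
    Letters := ""
    Loop, Parse, String
        Letters .= A_LoopField ","
    Letters := RTrim(Letters, ",")
    Sort, Letters, D,                     ; sort the comma-separated letters
    Return, StrReplace(Letters, ",")
}

; Hypothetical helper: remove each letter of Key once from Keyword.
; Letters that Keyword does not contain are skipped, so the result is
; longer than expected whenever Key is not a sub-keyword of Keyword.
Complement(Keyword, Key) {
    Loop, Parse, Key
        Keyword := StrReplace(Keyword, A_LoopField, "", Count, 1)
    Return, Keyword
}
```

With helpers of this shape, a length check such as StrLen(Key) + StrLen(Compl) = StrLen(Keyword) is enough to tell whether Key really fits inside Keyword.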
Re: Anagrams
Posted: 19 Jul 2017, 03:27
by Helgef
Hello.
Your solution with the complement is very good.
Do you plan to extend this to finding sentences of more than two words?
If so, I would not do the equivalent of this:
Code:
Result.Push(Anagram2 " " Anagram1)
Instead, I propose additional GUI features,
1) Given a selection in the listbox, open a new window with permutations of those words.
2) Copy selection to clipboard.
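As a rough illustration of the two proposed features, the sketch below builds every ordering of a few selected words and puts the list on the clipboard. The function name Permutations() and the sample words are made up for this example, and the ListView/GUI wiring is left out entirely.

```autohotkey
List := ""
For each, Sentence in Permutations(["eleven", "plus", "two"])
    List .= Sentence "`n"
Clipboard := List                         ; feature 2: copy to clipboard
MsgBox, % List                            ; feature 1: show the permutations
Return

; Return an array with one string per ordering of the words in Words.
; Repeated words in Words lead to repeated sentences.
Permutations(Words, Prefix := "") {
    Result := []
    If (Words.Length() = 0) {
        Result.Push(Trim(Prefix))
        Return, Result
    }
    For Index, Word in Words {
        Rest := Words.Clone()             ; copy, then drop the chosen word
        Rest.RemoveAt(Index)
        For each, Sentence in Permutations(Rest, Prefix " " Word)
            Result.Push(Sentence)
    }
    Return, Result
}
```

Generating the orderings only on request keeps the main result list at one entry per word combination.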
The ini file doesn't seem to remember the spaces of the input.
I find it slightly addictive; it is particularly fun with names.
Re: Anagrams
Posted: 19 Jul 2017, 09:50
by wolf_II
Helgef wrote: Do you plan to extend this to finding sentences of more than two words?
Yes, I do. And I have been working at it already.
Regarding additional GUI features: I can do that, I'll start right now. Then I can release an additional feature of my own that is ready to go: save the position of the text caret to ini-file, and restore it correctly.
Regarding the ini-file: yes, I noticed, and I fixed it by changing "Gui, Submit" to "Gui, Main: Submit" in CleanUp().
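For context: a plain "Gui, Submit" acts on the default GUI of the current thread, which is not necessarily the window named Main. The sketch below shows the named form in a tiny stand-alone script. The control variable Input, the ini file name and the section/key names are assumptions for illustration, not taken from the real Anagrams script.

```autohotkey
Gui, Main: New
Gui, Main: Add, Edit, vInput w200, eleven plus two
Gui, Main: Show,, Anagrams
Return

MainGuiClose:
    CleanUp()
ExitApp

CleanUp() {
    global Input
    ; Naming the GUI makes Submit read the controls of "Main", whichever
    ; GUI happens to be the default of the thread that calls CleanUp().
    Gui, Main: Submit, NoHide
    IniWrite, %Input%, %A_ScriptDir%\Anagrams.ini, Settings, Input
}
```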
Version 2.04 will also outline my approach to MultiWordAnagrams rather than complete it.
Re: Anagrams
Posted: 19 Jul 2017, 12:25
by wolf_II
I have changed the ListBox to a ListView to support right-clicks with AHK built-in means. Along the way I did not change the generation of WordList, which is still a pipe-separated list. Maybe I will change that to an array later. What do you think?
The popup window is not showing any permutations yet, just the selected Items from the main GUI. Would that be close to what you had in mind?
Next, the MultiWordAnagrams() function. That's not done either. I have put in some code (obviously not debugged or tested) that outlines where I'm heading with this function. In order to make progress, I want to write yet another helper function, get_Pairs(Keyword, n), where n is the length of the first sibling, which is the complement of the second sibling with regard to the given Keyword. Inside this function I also want to make sure that the first sibling has a real-word anagram in DICT[].
After reading my own thoughts on the get_Pairs() function, I had an idea ... It's like looking for all anagrams of exactly size n of the given Keyword/String ... pair those up with their complements and Bob's your uncle!
Re: Anagrams
Posted: 19 Jul 2017, 12:36
by Helgef
Quick question, could you give an example for get. Pair function? I'm on the phone, hence the awkward typing.
Edit, I saw your edit now
Re: Anagrams
Posted: 19 Jul 2017, 12:51
by wolf_II
I saw that you've seen my edit. Maybe it will help me or others when I write down an example anyway.
The idea is to split off subkeys from Keyword (input = "one two three") such that Pair[1] is the keyword("one"), and then look for TwoWordAnagrams(Pair[2]), where in this example Pair[2] = keyword("twothree").
My hope is that I can reduce the string length fast enough for the recursion to be practical.
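To make the recursion idea concrete, here is a small self-contained sketch: split one dictionary word off the keyword, then recurse on the complement with one word fewer. It is not the code from the script. The toy word list, the sorted-letter keyword format and the names Sentences(), SortLetters() and RemoveLetters() are all assumptions made for this example.

```autohotkey
; Toy dictionary (assumption): sorted-letter keyword -> array of real words.
DICT := {}
For each, Word in ["one", "eon", "two", "tow", "three", "ether"] {
    Key := SortLetters(Word)
    If (!DICT.HasKey(Key))
        DICT[Key] := []
    DICT[Key].Push(Word)
}

Out := ""
For each, Sentence in Sentences(SortLetters("onetwothree"), 3)
    Out .= Sentence "`n"
MsgBox, % Out
Return

; All sentences of exactly WordCount words that use up every letter of Keyword.
Sentences(Keyword, WordCount) {
    global DICT
    Result := []
    If (WordCount = 1) {                  ; base case: one word must fit exactly
        If (DICT.HasKey(Keyword))
            For each, Word in DICT[Keyword]
                Result.Push(Word)
        Return, Result
    }
    For Key, Words in DICT {
        Rest := RemoveLetters(Keyword, Key)
        If (StrLen(Key) + StrLen(Rest) != StrLen(Keyword))
            Continue                      ; Key needs letters Keyword lacks
        For each, Tail in Sentences(Rest, WordCount - 1)
            For each, Word in Words
                Result.Push(Word " " Tail)
    }
    Return, Result
}

; Letters of Word in sorted order.
SortLetters(Word) {
    Letters := ""
    Loop, Parse, Word
        Letters .= A_LoopField ","
    Letters := RTrim(Letters, ",")
    Sort, Letters, D,
    Return, StrReplace(Letters, ",")
}

; Remove each letter of Key once; a sorted Keyword stays sorted.
RemoveLetters(Keyword, Key) {
    Loop, Parse, Key
        Keyword := StrReplace(Keyword, A_LoopField, "", Count, 1)
    Return, Keyword
}
```

This naive version walks the whole dictionary at every level and returns every ordering of the same words as a separate sentence, so it only shows the shape of the recursion, not a practical implementation.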
Re: Anagrams
Posted: 19 Jul 2017, 12:59
by Helgef
My initial thought is that such an approach would be too limiting. But I haven't looked at the new code yet.
Re: Anagrams
Posted: 19 Jul 2017, 13:37
by Helgef
I'm not sure I get this. In your example it looks like you want the complement.
I'd say we should stick with the pipe string, since that is what Sort and GuiControl want.
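A short sketch of why the pipe string is convenient: Sort can order it in place, and a ListBox accepts it directly through GuiControl. The word list and the control name MyList are invented for this example; a ListView would need its rows added one by one instead.

```autohotkey
WordList := "tone|note|eon|one"
Sort, WordList, D|                        ; in place: eon|note|one|tone

Gui, Add, ListBox, vMyList r5 w120
GuiControl,, MyList, |%WordList%          ; leading pipe replaces existing items
Gui, Show
Return

GuiClose:
ExitApp
```

If an array is needed somewhere, StrSplit(WordList, "|") converts the string on demand.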
Side note of lesser importance: I think we can skip the length sort when we do multiple-word anagrams.
The listview and pop-up look perfect.
Nice touch with enabling the multi-selection.
Cheers.
Re: Anagrams
Posted: 19 Jul 2017, 15:14
by wolf_II
Regarding Sort: I did not notice at first. You're right, it'll be included in the next version.
Regarding the can of worms, aka The deepest Rabbit Hole I have ever encountered:
I'm afraid I might be lost. I have stared into the abyss and now the abyss is staring back ...
I give you a glimpse at where I stand: (failure to progress update)
Code:
;-------------------------------------------------------------------------------
MultiWordAnagrams(String, SpaceCount) { ; return an array with the "sentences"
;-------------------------------------------------------------------------------
    Result := []
    Keyword := make_Keyword(String)
    If (SpaceCount = 1)
        Return, TwoWordAnagrams(Keyword)
    ; otherwise: get word pairs
    Loop, % SpaceCount - 1 { ; iterate once for 2 spaces etc
        ; iterate over requested size for Pair[1]
        Loop, % (StrLen(Keyword) + 1) // 2 { ; ???
            For each, Pair in get_Pairs(Keyword, A_Index)
                ; get_Pairs makes sure the Pair[1] will be in DICT
                For each, MultiWord in TwoWordAnagrams(Pair[2])
                    Result.Push(Pair[1] " " MultiWord)
        }
    }
    Return, Result
}
;-------------------------------------------------------------------------------
get_Pairs(Keyword, n) { ; return an array of [keyword(size=n), its complement]
;-------------------------------------------------------------------------------
    Result := []
    SIZE := StrLen(Keyword)
    ; all SubKeywords of Keyword of size n that have a real-word anagram
    ; here I expect to see (SIZE choose n) candidates
    For each, SubKeyword in _Combinations(Keyword, n) ; ???
        ; push one [SubKeyword, complement] pair, not two separate items
        Result.Push([SubKeyword, Complement(Keyword, SubKeyword)])
    Return, Result
}
;-------------------------------------------------------------------------------
_Combinations(Keyword, n) { ; return an array of all SubKeywords of size n
;-------------------------------------------------------------------------------
    global DICT
    Result := []
    ; code
    Return, Result
}
I might be onto a practical implementation, I might be miles away. Currently I have split the get_Pairs() function into a trivial part and a loopy part, which should be similar to the tried and tested Combinations(String, Skip) function. I plan to combine the two parts later, but for now I keep them separate.
I will give this project a rest of at least 24 hours before I return to it. I look forward to hearing from you; guidance/spoilers/criticism are all equally welcome.
Re: Anagrams
Posted: 19 Jul 2017, 23:14
by wolf_II
I'm back! I can find 3,308 multi-word anagrams for "one plus twelve" (0.75 sec, 110k words). 261 anagrams for "who are you" in 0.04 sec, 930 anagrams for "behind blue eyes" in 0.93 sec.
The code is supposed to support more spaces but that's not working yet. And I have to double check if there are duplicates sneaking in somewhere.
This is the get_Pairs() function, which I am happy with. The bugs lie in MultiWordAnagrams(), where I seem to duplicate the result instead of recursing.
Code:
;-------------------------------------------------------------------------------
get_Pairs(Keyword, n) { ; return a collection of [Key, Sibling]
;-------------------------------------------------------------------------------
    global DICT
    Result := []
    SIZE := StrLen(Keyword)
    For each, Collection in DICT[n] {
        For each, Key in Collection {
            Sibling := Complement(Keyword, Key)
            If (StrLen(Key) + StrLen(Sibling) = SIZE) ; valid Key
                Result.Push([Key, Sibling])
        }
    }
    Return, Result
}
Re: Anagrams
Posted: 20 Jul 2017, 02:21
by Helgef
I will give this project a rest of at least 24 hours before I return to it
[8 hrs...]
I'm back!
Good job
It is evident that there will be a lot of results. It isn't immediately clear to me how you are doing it. I will take a closer look later.
Cheers.
Re: Anagrams
Posted: 20 Jul 2017, 03:12
by wolf_II
Bonus #1: I have written another helper script to easily view the number of words for each length. Looking at that helped me to change direction for the get_Pairs() function.
Bonus #2: I have also written a Spinner() function, which I use for the first time here. It's a careful attempt at using Gdip. Gdip itself is needed but not included in this zip. If you have it in your user library or your standard library, AHK will find it. I'm certain Helgef does not need to be told; others might need to know.
Get Gdip from here.
@Helgef: I'm sure I made no sense before, and maybe contradicted myself along the way too. The code is now cleaned up.
Re: Anagrams
Posted: 20 Jul 2017, 10:32
by Helgef
Hello.
Your function is fast and very user friendly. However, it seems you miss a lot. I did a brute-force attempt; it is slow as molasses (a few minutes), but I find 20000 sentences for eleven plus two. I didn't verify this list though. I'm using the (slightly modified) comine() function I posted earlier in this thread. There should only be repetitions if there are repetitions in the input, but of course I could have made a mistake.
I attach the list if you want it as a reference.
Most of it is pretty useless though, like
Maybe we should have a better wordlist for this, i.e., one with more common words. I do not recognise any of those words. (I'm using the 110k list for this.)
I have peeked at your code, but I don't really get the concept of Key, Sibling. I will take a closer look later. For now, I vastly prefer your method, because it is fast and the output is useful; I wouldn't want all those wens lev el po ut's in there.
Finally, I didn't see the bonuses. Where is the spinner?
Re: Anagrams
Posted: 20 Jul 2017, 10:40
by wolf_II
Spinner.ahk is in the local library Anagrams\Lib.
Word list - Lengths.ahk is the name of the bonus helper file that calls spinner.
Thanks for the reference file for "elevenplustwo".
Re: Anagrams
Posted: 20 Jul 2017, 11:01
by wolf_II
I noticed I have not yet included the requested copy-to-clipboard feature; I'll do that now.
I had a look at your reference file. I find 1,105 three-word anagrams. I'll look at those for a start, because my algorithm can only compare to this sub-solution. I get the exact same two solutions when I input "elevenplustwo" and add a single space. Adding another space gives me 3,315 solutions. I have to write the clipboard feature right now.
Re: Anagrams
Posted: 20 Jul 2017, 11:16
by Helgef
I see it, nice bonus.
I did a spinner too (all credits to tidbit though).
Re: Anagrams
Posted: 20 Jul 2017, 12:27
by wolf_II
I remember seeing that spinner/particles before. But the particle class is way over my head.
However, it seems you miss a lot.
My first impression is that there is a difference in output for eleven plus two (containing two spaces).
In the reference file I see the whole series on multi-word anagrams from 2 to 6 words.
Maybe we should have a better wordlist for this
Yes, I was tempted to put in "I" after I saw that "a" was the only single-letter word. But the discussion of how to alter a word list has likely been had somewhere else already. Word lists for dictionaries, spell checkers, suggestion lists, thesaurus ... I would add a more suitable list in a pinch. After some superficial checks I abandoned most alternatives I tried. I thought littlegandhi1199's 12dicts-6.0.2 might be a good starting point. Well, the 12 dicts are obviously not his, but the choice to use them was his.
Edit: Maybe we should use his idea and combine a suitable word list? How did you do yours?
Re: Anagrams
Posted: 20 Jul 2017, 13:11
by Helgef
I get the exact same two solutions when I input "elevenplustwo" and add a single space. Adding another space gives me 3,315 solutions
I find permutations in yours, e.g.,
are both present for eleven plus two.
The code generating the reference file isn't aware of any spaces; they are removed. It will generate the same result for any number of spaces and any permutation of eleven plus two. Your code is much quicker than my brute force, even if I stop it at ~1100 matches, when most of the useful results are found.
Edit: Maybe we should use his idea and combine a suitable word list? How did you do yours?
I didn't look too closely at littlegandhi1199's code; it wasn't indented, which makes it difficult for me. I do not understand the last question.
Cheers.
Re: Anagrams
Posted: 20 Jul 2017, 14:18
by wolf_II
There was a Helper.ahk in one of his uploads, which took 4 word lists of his choice and looped through them. Outputs were len4.txt, len5.txt and so on, as well as a reject.txt.
My last question was about wl2.txt, from here. How did you make this word list? Maybe you found it somewhere?
I think I understand where the permutations come from: when I split off a part from Input, I run a 2-word search on its complement.
eleven plus two => eels + uptownlev --- eels is in DICT, look for 2-word anagrams for its complement.
eleven plus two => uptown + eelslev --- uptown is in DICT, look for 2-word anagrams for its complement.
I need to weed out those permutations. Thanks for pointing this out.
I think when the permutations are taken care of I will get only 1/3 of the count, which equals your number ... looks promising.
Re: Anagrams
Posted: 20 Jul 2017, 14:57
by wolf_II
Code:
;-------------------------------------------------------------------------------
MultiWordAnagrams(String, SpaceCount) { ; return an array with the "sentences"
;-------------------------------------------------------------------------------
    Result := [], Previous := []
    Keyword := make_Keyword(String)
    If (SpaceCount = 1)
        Return, TwoWordAnagrams(Keyword)
    ; otherwise: get word pairs, iterate over requested size for Pair[1]
    ; (a Loop's count is evaluated once, before the loop starts, so A_Index
    ; there is not this loop's counter; Pair[2] must keep at least 2 letters)
    Loop, % StrLen(Keyword) - 2 {
        For each, Pair in get_Pairs(Keyword, A_Index) {
            For each, MultiWord in TwoWordAnagrams(Pair[2]) {
                Candidate := Pair[1] " " MultiWord
                Sort, Candidate, D%A_Space% ; canonical word order as lookup key
                If Not Previous.HasKey(Candidate) {
                    Previous[Candidate] := True
                    Result.Push(Pair[1] " " MultiWord)
                }
            }
        }
    }
    Return, Result
}
This weeds out the permutations rather than avoiding them, hmmm.
But I now get 1,105 3-word anagrams for eleven plus two, though obviously with no time benefit over before.
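The weeding step can be tried in isolation. The sketch below sorts the words of each sentence to get a canonical key and keeps only the first sentence seen per key, which is the same trick as the Sort line in the function above. The four sample sentences and the variable names are made up for this example.

```autohotkey
Seen := {}
Unique := []
For each, Sentence in ["eels uptown lev", "uptown eels lev", "lev uptown eels", "eleven plus two"] {
    Canon := Sentence
    Sort, Canon, D%A_Space%               ; e.g. "eels lev uptown"
    If (!Seen.HasKey(Canon)) {
        Seen[Canon] := True
        Unique.Push(Sentence)             ; keep the first spelling found
    }
}
MsgBox, % Unique.Length()                 ; 2
```

The first three sentences share one canonical key, so only two entries survive. Every permutation is still generated before it is thrown away, which fits the observation that the run time does not improve.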