Trie Data Structure

Post your working scripts, libraries and tools
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Trie Data Structure

14 Aug 2017, 10:42

An AHK implementation of a trie. Tries are a type of tree where the values are read from root to leaf, this makes them suited for things like typing suggestions, because you are able to search through only the branches that start with a specific prefix.

Source

Code: [Select all] [Expand] [Download] (Trie.ahk)GeSHi © Codebox Plus


Intended Use
The only function that is meant to be manually used from outside the Class is __Set since trie[key] := "" behaves in an undesirable way for this class.
To get data from the trie use the syntax trie[key]

Test: Typing Suggestions
Image

Code: [Select all] [Expand] [Download] (Suggestions.ahk)GeSHi © Codebox Plus

Attachments
English British.txt
(624.3 KiB) Downloaded 68 times
Last edited by Capn Odin on 21 Aug 2017, 20:57, edited 8 times in total.
Please excuse my spelling I am dyslexic.
Helgef
Posts: 2404
Joined: 17 Jul 2016, 01:02
Contact:

Re: Trie Data Structure

14 Aug 2017, 15:39

I just tried the example, nice one :thumbup:. I will look closer at your code later :geek:
Thanks for sharing, cheers.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

14 Aug 2017, 15:54

Looks promising even if in this example loads and works rather slow.
When I type in "summe" I miss "summer" in the suggestion list.
Attachments
summer.png
summer.png (2.91 KiB) Viewed 961 times
Helgef
Posts: 2404
Joined: 17 Jul 2016, 01:02
Contact:

Re: Trie Data Structure

14 Aug 2017, 16:04

You can remove the tooltip for fast load, or make it conditional, eg, if !mod(index,1000)
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Trie Data Structure

14 Aug 2017, 16:12

rommmcek wrote:Looks promising even if in this example loads and works rather slow.
When I type in "summe" I miss "summer" in the suggestion list.

Yes, the reason is that the trie doesn't know that "summer" is a finished word since the branche continue to "summerhouse", I am trying to find a solution that is elegant and not too memory hungry, since it takes too much memory already.
Please excuse my spelling I am dyslexic.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

14 Aug 2017, 16:16

Oh, thanks! It was ToolTip hindering quick load! But still works slow, specially after first letter.
When entering "sobe", "sober" is missing. There must be a bug.

P.s.: Right now I see, you are working on it! Hope there is a good solution!
Last edited by rommmcek on 15 Aug 2017, 02:53, edited 1 time in total.
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Trie Data Structure

14 Aug 2017, 16:19

Helgef wrote:You can remove the tooltip for fast load, or make it conditional, eg, if !mod(index,1000)

Thanks, I assumed the bottleneck was making the trie.
Please excuse my spelling I am dyslexic.
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Trie Data Structure

14 Aug 2017, 19:26

It should support subkeys now, rip simplicity.
Please excuse my spelling I am dyslexic.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

15 Aug 2017, 02:47

Yeah, now it shows all the words. I liked it before better (except for the missing word)! Now looks like any other alphabetical suggestion.
P.s.: Block searching after first letter! Takes too much time and is not of great help.

Edit: Sorry, suggestions are essentiality the same, of course now are complete. But the question remains, what's the advantage? Nevertheless I'm interested in any alternatives, since I have a lot of problems with orthography!
Helgef
Posts: 2404
Joined: 17 Jul 2016, 01:02
Contact:

Re: Trie Data Structure

15 Aug 2017, 05:44

The trie performs well imo, it is mostly the example that has performance issues, which is good news :thumbup:
Most time is spent updating the Listbox, improve by,

Code: [Select all] [Download] GeSHi © Codebox Plus

GuiControl, -Redraw, Suggestions 
GuiControl, , Suggestions, % "|" Join("|", Entries[Address]*)
GuiControl, +Redraw, Suggestions

Also, consider limiting the number of hits in join, eg,

Code: [Select all] [Download] GeSHi © Codebox Plus

Join(sep, params*) {
for index, param in params {
str .= param sep
if (index>50) ; added
break
}
return SubStr(str, 1, -StrLen(sep))
}

Usually one will narrow the search if presented 20+ suggestions anyways.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

15 Aug 2017, 06:38

Thanks, Helgef! You are a real Pro!
I added in the Search: label after Gui, Submit, NoHide:Now it works perfect even with half a million wordlist!

It's remains to see how it performs in everyday usage!
Helgef
Posts: 2404
Joined: 17 Jul 2016, 01:02
Contact:

Re: Trie Data Structure

15 Aug 2017, 08:16

:thumbup: That is a good addition rommmcek
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Trie Data Structure

15 Aug 2017, 11:29

Helgef wrote:Most time is spent updating the Listbox
Yes, I reached the same conclusion, but had forgotten about Redraw, was trying SendMessage instead, but I decided it wasn't worth the effort.

Edit: Why is Redraw only in ListBox and not in Styles ?

Helgef wrote:Usually one will narrow the search if presented 20+ suggestions anyways.
I am all for narrowing the results in the test, but I want to keep the trie generic.

rommmcek wrote:Sorry, the question remains, what's the advantage?
The test is merely to showcase the trie. The advantage of a try is that you can limit how much you need to search though, by specifying a prefix.

Every thing that starts with a t will be in the left subtrie, meaning that you have removed half of the example trie already, this is why it is efficient.
Image

Edit: I am thinking about adding the ability to remove values.
Please excuse my spelling I am dyslexic.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

15 Aug 2017, 13:15

Yeah, I'm dazzled using tool, having wordlist stored in DataBase and so a bit superior then the Trie by itself.
Is there any way to store the Trie in the Database and then access the keys still faster?
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Trie Data Structure

15 Aug 2017, 14:29

I am sorry, rommmcek, I don't understand what you are asking.
Please excuse my spelling I am dyslexic.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

15 Aug 2017, 15:33

Forget it! Not knowing procedures, how to store data in DB or Trie, the question obviously has no sense!
Anyway, thanks for your effort!
Helgef
Posts: 2404
Joined: 17 Jul 2016, 01:02
Contact:

Re: Trie Data Structure

15 Aug 2017, 15:53

Capn Odin wrote:
Helgef wrote:Usually one will narrow the search if presented 20+ suggestions anyways.
I am all for narrowing the results in the test, but I want to keep the trie generic.

Yes :thumbup: , I mean only for the example.
User avatar
Joe Glines
Posts: 470
Joined: 30 Sep 2013, 20:49
Facebook: https://www.facebook.com/theAutomatorGuru/
Google: https://plus.google.com/105328929654286634910
GitHub: joetazz
Location: Dallas
Contact:

Re: Trie Data Structure

16 Aug 2017, 07:12

Very cool Capn Odin! :) Thank you for sharing this during the webinar! I look forward to updates. :)

BTW- I tried it on a file with 2,235,964 words and it was not to happy. lol I'm sure this is beyond what you were imagining to use and not a realistic use case but thought you might like to know.
cheers,
Joe
User avatar
Capn Odin
Posts: 1240
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Trie Data Structure

16 Aug 2017, 08:07

Joe Glines wrote:Very cool Capn Odin! :) Thank you for sharing this during the webinar! I look forward to updates. :)

BTW- I tried it on a file with 2,235,964 words and it was not to happy. lol I'm sure this is beyond what you were imagining to use and not a realistic use case but thought you might like to know.
cheers,
Joe

Would you be willing to share the file ?
I believe the problem is the memory usage, I have tried to optimise it, for a norwegian dictionary with 300,000 words, I reduced the memory usage from 251 MB to 149 MB.
Please excuse my spelling I am dyslexic.
rommmcek
Posts: 266
Joined: 15 Aug 2014, 15:18

Re: Trie Data Structure

16 Aug 2017, 11:45

I think the only way to takle extra large wordlists (1M+) in Ahk is MCode. Analogues to what did feiyue in FindText - Capture screen image into text and then find it.

Return to “Scripts and Functions”

Who is online

Users browsing this forum: A_User and 20 guests