I’m working at a REPL completion/hinting feature for my programming language using the libraries ocamline and linenoise by keeping keywords that are in the environment inside a trie data structure.
It looks like it is working fine, except that it works only after an odd number of insertions. After the first two insertions in the trie the completion callback fails to retreive the last character of the keyword to complete, and after the fourth insertion it stops working completely ONLY AFTER EVEN NUMBERED INSERTIONS.
This is buzzing me a while and I don’t know where to inspect tbh. The trie is naive and keeps characters in its nodes. Here’s the code:
You’re giving us a lot of code to look through. If you think the error is in your trie implementation, it might be worth looking for a way to reproduce the problem just using the trie itself and a smallish number of calls to the trie API. In fact if you do this you’ll probably see the problem yourself. [Smiling colon hypen parenthesis here.] But even if you don’t, it will make it much easier for others to think about what’s going wrong.
Actually looking at Daniel’s code immediately made obvious that my proposed type signature above is not quite right: you want to store values at each node of the trie, as a key may be a prefix of another. Next iteration:
type 'a t =
| Node of 'a option * (char * 'a t) list
Even in the case where you only want sets not maps (no information carried for each word in the trie), you need to turn the 'a option in a bool parameter to know if each prefix is in fact a word in your dictionary.
That’s a rather odd implementation, I’m not sure it is a prefix tree at all. A trie should be a tree, where each node has multiple children and optional data. Here is a simple implementation of an imperative trie that focuses on performance. Here are the docs. Feel free to use it.
Thank you everybody for the suggestions. I’ve managed to make a working draft with an immutable trie. I’ll focus on optimizing it in the next days. Thanks especially to @gasche for the proposed signature, which is what I’ve used. If you want to try it I’ve pushed to the completion branch of gobba. Do you know any good micro benchmarking library that I can use to test my language? Something that produces already plotted charts or easily plottable data.