Interest in a SQLite-backed fuzzy search library?

Hi all,

I am writing to gauge interest in a lightweight, SQLite-backed fuzzy search library in OCaml,
specifically whether it is worth the effort of separating the search component out of an existing project and packaging it as a proper library.

Fuzzy search here refers to fuzzy (or typo-tolerant) search of phrases in text data, as implemented in Docfd for document search. The primary reason for an OCaml layer is that the full-text search extension for SQLite does not handle fuzzy search.
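
To make "typo tolerant" concrete, here is a minimal sketch of one common way to do word-level fuzzy matching: filtering candidate words by Levenshtein edit distance. This is only an illustration and not necessarily how Docfd does it; `edit_distance`, `fuzzy_candidates` and `max_dist` are names made up for this example.

```ocaml
(* Levenshtein edit distance between two strings, computed with two
   rolling rows so memory use is proportional to the length of [b]. *)
let edit_distance (a : string) (b : string) : int =
  let la = String.length a and lb = String.length b in
  (* prev.(j) = distance between the first (i - 1) chars of a and first j chars of b *)
  let prev = Array.init (lb + 1) (fun j -> j) in
  let cur = Array.make (lb + 1) 0 in
  for i = 1 to la do
    cur.(0) <- i;
    for j = 1 to lb do
      let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
      cur.(j) <-
        min
          (min (prev.(j) + 1) (cur.(j - 1) + 1)) (* deletion, insertion *)
          (prev.(j - 1) + cost)                  (* substitution or match *)
    done;
    Array.blit cur 0 prev 0 (lb + 1)
  done;
  prev.(lb)

(* Hypothetical helper: keep the words of [vocab] that are within
   [max_dist] edits of [query]. *)
let fuzzy_candidates ~max_dist (vocab : string list) (query : string) :
    string list =
  List.filter (fun w -> edit_distance w query <= max_dist) vocab
```

In a SQLite-backed setup the candidate words would presumably come from a table of indexed words rather than an in-memory list; that part is left out of the sketch.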

The main advantage of this approach over in-memory indices is very low memory usage, while search stays reasonably fast up to a certain data set size. And since it is backed by SQLite, it is easy to distribute standalone.

Cheers,
Darren


A project I have been planning requires something like this, so whether or not you end up distributing it, I’m quite happy to learn that it exists and I will be taking a look at your code.

(So far I have only researched it a little, just enough to find out that DB systems have very limited support for fuzzy search and one needs to implement it on top of them.)

After some further reflection, I think there may be more value in documenting how Docfd did it, for people looking to implement something similar, than in packaging the search component into a search library that lacks polish and features. Ultimately it is a very thin layer over SQLite that is neither technically complex nor novel, but I think the insights and numbers from my journey would be a good reference, since Docfd started with JSON+GZIP, then moved to CBOR+GZIP, before settling on SQLite.

Bringing the search component up to a level of features suitable for a general search library, e.g. somewhat on par with Python’s Whoosh, would also be too much additional investment for me.
