Hi all!
I’m happy to introduce wu-manber-fuzzy-search, my library for doing fuzzy searches using the Wu and Manber fuzzy search algorithm.
The novel part of this library, compared to agrep/ocamlagrep, is that it also provides a right-leaning variant of the algorithm. The variant reports better matches and error counts for the first match. Here’s an example of the difference:
# open Wu_Manber;;
# StringSearch.(search ~k:2 ~pattern:"brown" ~text:"quick brown fox" |> report);;
- : string = "Pattern matched with 2 errors at character 9 of text"
# StringSearch.(search_right_leaning ~k:2 ~pattern:"brown" ~text:"quick brown fox" |> report);;
- : string = "Pattern matched with 0 errors at character 11 of text"
It’s a pure OCaml implementation, using Optint.Int63.t as bit-vectors. I don’t currently support all the extensions that agrep/ocamlagrep supports, and it will definitely not match their performance, since that is OCaml+C versus pure OCaml.
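For readers who haven’t seen the bit-vector formulation before, here is a rough sketch of the basic (non-right-leaning) Wu and Manber search loop. This is not the library’s code: the name `bitap_search` is made up for illustration, it uses a native OCaml `int` as the bit-vector instead of Optint.Int63.t, and it returns a 0-based end index rather than a report string. It keeps one bit-vector per allowed error count, where bit i of vector d means "the first i+1 pattern characters match a suffix of the text read so far with at most d errors".

```ocaml
(* Illustrative sketch only, not the wu-manber-fuzzy-search API.
   Returns Some (errors, end_index) for the first text position where
   the pattern matches with at most k errors, or None. *)
let bitap_search ~k ~pattern ~text =
  let m = String.length pattern in
  if m = 0 || m >= Sys.int_size then
    invalid_arg "bitap_search: pattern must fit in a native int";
  if k < 0 || k >= m then
    invalid_arg "bitap_search: k must satisfy 0 <= k < pattern length";
  (* masks.(c) has bit i set iff pattern.[i] is the character with code c *)
  let masks = Array.make 256 0 in
  String.iteri
    (fun i c ->
      let code = Char.code c in
      masks.(code) <- masks.(code) lor (1 lsl i))
    pattern;
  let accept = 1 lsl (m - 1) in
  (* With d errors allowed, the first d pattern characters can be deleted
     before reading any text, so vector d starts with its low d bits set. *)
  let r = Array.init (k + 1) (fun d -> (1 lsl d) - 1) in
  let n = String.length text in
  let rec scan j =
    if j >= n then None
    else begin
      let mask = masks.(Char.code text.[j]) in
      (* prev_old holds vector d-1 as it was before this text character *)
      let prev_old = ref r.(0) in
      r.(0) <- ((r.(0) lsl 1) lor 1) land mask;
      for d = 1 to k do
        let old = r.(d) in
        r.(d) <-
          (((old lsl 1) lor 1) land mask)                   (* match *)
          lor !prev_old                                     (* insertion *)
          lor (((!prev_old lor r.(d - 1)) lsl 1) lor 1);    (* substitution, deletion *)
        prev_old := old
      done;
      (* report the smallest error count that reaches the last pattern bit *)
      let rec best d =
        if d > k then None
        else if r.(d) land accept <> 0 then Some (d, j)
        else best (d + 1)
      in
      match best 0 with
      | Some _ as hit -> hit
      | None -> scan (j + 1)
    end
  in
  scan 0
```

On the example above, this sketch stops as soon as it has read "quick bro", because "brown" already matches there with two deletions. That is the early, high-error first match that the right-leaning variant is meant to improve on.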
The documentation for the library can be found here.
Expect more bitvector, Levenshtein distance, and fuzzy search shenanigans in the near future!
Update: It’s on opam now!