[ANN] Fmlib 0.5.5 with emphasis on combinator parsing with lexers

hbr · September 26, 2023, 8:29am

I am pleased to annouce version 0.5.5 of the functional library fmlib. The main part of version 0.5.5 is the sublibrary Fmlib_parse which has now support for combinator parsing with the separation of parsing and lexing. The new version is available via opam.

Combinator parsers usually do not separate the parser from the lexer. This is a disadvantage in parsing languages which have keywords which look like identifiers. A combinator parser usually needs a lot of backtracking (which costs sperformance) when parsing such a language.

In order to improve the situation Fmlib_parse allows the separation of parsing and lexing. The lexer usually needs very little or no backtracking to parse the lexical tokens. Any backtracking in the parser does not pushback larges sequences of characters to the input stream, it just pushes back already correctly parsed lexical tokens (which is cheaper) and does not need to rescan the pushed back lexical tokens character by character. See parse_lex (fmlib_parse.parse_lex) for details.

Furthermore the sublibrary Fmlib_browse supports the function map on virtual doms and attributes of the virtual dom and allows subscriptions on_animation. Look into doc (fmlib_browser.doc) to see how Fmlib_browser supports web applications written in ocaml.

hummy123 · September 26, 2023, 8:47am

Fmlib is a cool library. I remember finding it before when I tried looking at functional B-Tree implementations and coming across some discussion when someone said to use separate arrays for keys and values instead of using a (key * value) tuple.

I did a brief implementation of that (just append and lookup) and found that the resulting B-Tree was slower than an ordinary AVL or other balanced tree (I think likely because of the array-copying cost). So the advice that was given to you back then may be worth ignoring (or maybe my implementation was bad).

hbr · September 26, 2023, 1:02pm

Hello Humsa. This is an old discussion. If you consider the pure access to keys and values, then it is better to use two arrays, because there is one indirection less (the tuple) as long as you need either the key or the value. In terms of insertion and deletion of key value pairs having two separate arrays is more work, because you need array allocation and array copying for insertion/deletion of keys and the same for insertion/deletion of values.

In terms of insertion and deletion I think that avl or other balanced trees are more efficient than btrees (or similar array based trees). The question is how much?

I use btrees if insertion and deletion are not the dominant operations. If finding is the dominant operation then from my understanding btrees are unbeatable. Therefore they are used in nearly all database applications where lookup is the dominant operation. In my applications this is usually the case. There are much more lookup operations than insertion/deletion operations.

However this is just my theoretical analysis. I have not made any experiments and measurements to support this. I would be very interested in measurements in practical settings.

Regards
Helmut

Topic		Replies	Views
[ANN] Fmlib 0.5.8 with unicode parsers Ecosystem	2	285	February 5, 2024
[ANN] Release of Fmlib 0.5.7 Ecosystem opam , announce	0	379	November 5, 2023
[ANN] Release of `fmlib.0.1.0` Ecosystem	0	668	February 23, 2021
B Trees in Ocaml via Fmlib_std 0.3.0 Community	19	1532	May 24, 2021
[ANN] Layout Parsing and Nicely Formatted Error Messages Ecosystem	0	844	February 16, 2022

[ANN] Fmlib 0.5.5 with emphasis on combinator parsing with lexers

Related topics