We’d like to get rid of the fragile use of Obj.magic in code generated by atdgen to read biniou records. This might worsen performance, but we won’t know until we run proper benchmarks, which is a bit of work.
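To make the concern concrete, here is a simplified sketch of the general pattern. It is not the code atdgen actually emits: the `item` and `value` types and both reader functions are hypothetical, and fields are modelled as name/value pairs that may arrive in any order. The first reader forges placeholder values with Obj.magic and relies on a bitmask so that they are never observed; the second allocates an option per field but stays within the type system.

```ocaml
(* Hypothetical record type, standing in for a type defined in an ATD file. *)
type item = { id : int; data : string list }

(* Hypothetical decoded field values; a real reader would pull these
   from an input buffer instead. *)
type value = Int of int | Strings of string list

(* Unsafe style: placeholders are forged with Obj.magic, and a bitmask
   tracks which fields were really set. Forgetting the final check
   would let a forged value escape into well-typed code. *)
let read_unsafe (fields : (string * value) list) : item =
  let id : int ref = ref (Obj.magic 0) in
  let data : string list ref = ref (Obj.magic 0) in
  let seen = ref 0 in
  List.iter
    (fun (name, v) ->
      match name, v with
      | "id", Int i -> id := i; seen := !seen lor 1
      | "data", Strings l -> data := l; seen := !seen lor 2
      | _ -> () (* unknown or ill-typed fields are skipped *))
    fields;
  if !seen <> 3 then failwith "missing field";
  { id = !id; data = !data }

(* Safe style: each field starts as None, so a missing field is caught
   by the final pattern match rather than by a hand-written bitmask. *)
let read_safe (fields : (string * value) list) : item =
  let id = ref None and data = ref None in
  List.iter
    (fun (name, v) ->
      match name, v with
      | "id", Int i -> id := Some i
      | "data", Strings l -> data := Some l
      | _ -> ())
    fields;
  match !id, !data with
  | Some id, Some data -> { id; data }
  | _ -> failwith "missing field"
```

The extra option allocations in the safe version are the kind of cost that could show up as a slowdown on reads, which is why benchmarks are needed before drawing conclusions.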
We’d be grateful to get answers to the following questions:
Do you use biniou or do you know someone who uses it?
Why do you use it?
Can you accept a slight slowdown of read operations? How much?
Other feedback?
Do you use biniou or do you know someone who uses it?
Yes, I do, here: https://gitlab.com/smondet/vecosek
Why do you use it?
The workflow looks like this:
- generate a piece of data (called “The Scene” defined in an ATD description), and serialize it to JSON
- load the JSON with another application
For scenes ≥ 15 MiB of JSON, the parsing became the bottleneck of the loading procedure (2 or 3 seconds which is fine in normal conditions but a bit annoying while testing modifications to the generation part), so I saw in Biniou a quick/cheap way to speed it up (the parsing time is now not noticeable compared to the other things happening at load time).
The application still supports JSON (for users that may generate scenes in ≠ ways).
Can you accept a slight slowdown of read operations? How much?
Yes, I don’t think any reasonable slowdown would affect my workflow (bottleneck now seems to be a bunch of sequential RPC/system calls).
Also, in the worst case, since JSON compatibility will still be there, using the Marshal module as a speed-hack is not out of the question.
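For reference, a Marshal-based fallback of that kind could look roughly like this with the standard library alone. The `scene` type and the two helper names are made up for illustration and are not taken from vecosek. Note that Marshal.from_channel is not type-safe: the annotation on the result is trusted, not checked.

```ocaml
(* Hypothetical stand-in for the real scene type. *)
type scene = { name : string; events : (int * string) list }

(* Write a scene to [path] using the OCaml-specific Marshal format. *)
let save_scene (path : string) (s : scene) : unit =
  let oc = open_out_bin path in
  Fun.protect
    ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc s [])

(* Read it back. The type annotation is NOT verified at run time, so
   loading a file written for a different type is undefined behaviour. *)
let load_scene (path : string) : scene =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> (Marshal.from_channel ic : scene))
```

Marshalled data is only usable by OCaml programs that agree on the exact type definition, which is why it only makes sense here as a cache next to the JSON format rather than as a replacement for it.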
Other feedback?
Always happy to see Obj.magic calls go away. Thanks for this!
Great feedback, thanks.
I wonder if you’ve tried using biniou tables, which allow for a more compact representation of lists of records.
An example is:
type item = {
  id : int;
  data : string list;
}
type items = item list <biniou repr="table"> (* <-- this annotation *)
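To give an intuition for why a table can be more compact, here is a small in-memory model of the idea as I understand it: field names are stored once in a header instead of once per record. This is only an illustration with hypothetical types and function names; it does not reproduce biniou’s actual wire format.

```ocaml
(* Hypothetical in-memory model, not biniou's actual encoding. *)
type value = Int of int | Strings of string list

(* Row style: every record carries its own field names. *)
type rows = (string * value) list list

(* Table style: field names appear once, rows hold values only. *)
type table = { header : string array; body : value array list }

(* Convert rows to a table, taking the column order from the first row.
   Assumes every row has the same fields; List.assoc raises Not_found
   otherwise. *)
let to_table (rows : rows) : table =
  match rows with
  | [] -> { header = [||]; body = [] }
  | first :: _ ->
    let header = Array.of_list (List.map fst first) in
    let body =
      List.map
        (fun row -> Array.map (fun name -> List.assoc name row) header)
        rows
    in
    { header; body }

(* Count how many field names each layout has to store. *)
let names_in_rows (rows : rows) : int =
  List.fold_left (fun n row -> n + List.length row) 0 rows

let names_in_table (t : table) : int = Array.length t.header
```

With n records of k fields each, the row layout stores n × k field names while the table layout stores k, so the saving grows with the length of the list.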
mjambon (https://discuss.ocaml.org/u/mjambon), July 16:
Great feedback, thanks. I wonder if you’ve tried using biniou tables, which allow for a more compact representation of lists of records: http://atd.readthedocs.io/en/latest/atdgen.html#arrays-and-tables.
I haven’t tried (I usually like to keep my ATD files as annotation-free as possible), but if I need to speed things up again I will give this a shot.
infer uses biniou to transfer data from a C++ program (clang) to OCaml (infer). More precisely, clang is run with a plugin attached. The clang plugin has a custom biniou backend (and a couple of JSON backends supported by atdgen too). The plugin dumps biniou data about the AST of the current source file on stdout, and infer reads the output from the program and deserialises it using biniou.
These serialised ASTs are quite big (can be 100s of MB). Deserialisation can be a bottleneck but my guess is that compilation time usually dominates it anyway so some perf regression on reads is probably ok for infer.
If you’re interested in using infer to test biniou performance, here’s an example of how to make infer read biniou, assuming a fresh clone of infer. The test uses the code of the plugin itself since it produces quite a lot of code:
$ ./build-infer.sh
$ cd facebook-clang-plugins/libtooling
$ # that file generates ~150MB of biniou data
$ infer -g capture -- clang++ -std=c++14 -fPIC -g -I../clang/install/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -fvisibility-inlines-hidden -fno-exceptions -fno-rtti -fno-common -Woverloaded-virtual -Wcast-qual -fno-strict-aliasing -pedantic -Wno-long-long -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-uninitialized -Wno-missing-field-initializers -Wno-vla-extension -Wno-c99-extensions -O3 -DNDEBUG -c ASTExporter.cpp -o build/ASTExporter.o
$ # the above produces a debug script which can produce biniou data saved on disk, run it
$ sh ASTExporter.cpp.ast.sh
$ # now for the test: this reads the biniou data and runs the infer frontend (compiles from clang AST to infer intermediate language)
$ time infer capture --clang-biniou-file ASTExporter.cpp.ast.biniou -- clang++ -std=c++14 -fPIC -g -I../clang/install/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -fvisibility-inlines-hidden -fno-exceptions -fno-rtti -fno-common -Woverloaded-virtual -Wcast-qual -fno-strict-aliasing -pedantic -Wno-long-long -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-uninitialized -Wno-missing-field-initializers -Wno-vla-extension -Wno-c99-extensions -O3 -DNDEBUG -c ASTExporter.cpp -o build/ASTExporter.o
As a note, the atdgen specification is generated from the comments in the source of ASTExporter.h, and is then copied into the source tree of infer under infer/src/atd/. This directory also contains the OCaml code generated by atdgen once infer has been built. For instance, after running a build of infer (./build-infer.sh, or subsequently make), you’ll find the file “infer/src/atd/clang_ast_b.ml”.
Thanks for biniou and for the heads up by the way!