[ANN] Areas and Adversaries

I figured people might be bored of British pub names by now so I did another thing: a generator for titles of table-top role-playing games.

$ opam install areas-and-adversaries
...
$ areas-and-adversaries
Woods & Wizards

The code is on GitLab: Raphaël Proust / areas-and-adversaries

It was a good excuse to experiment with non-dune build systems (to scope things out). I went for a plain Makefile in the end, which works well.

I also wanted to figure out a better way to embed data in an executable. I ended up wondering about moving as much of the processing as possible into the build phase. What I ended up with is a small program which prints a compilation unit (.ml) which has mostly array literals. Still have some open questions on that, any input welcome:

  • Should I have used MetaOCaml to print the code? The data/munch.ml would probably be more readable, but the build would probably be less so.
  • How could I generate this kind of processed-data code for data-structures which don’t have a literal (maps, sets, hash tables, etc.)? How can I minimise the initialisation cost of the program for such situations?
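
A minimal sketch of the kind of generator described above, i.e. a small program that prints a compilation unit made of array literals. This is not the actual data/munch.ml: the function name `array_literal`, the binding name `areas` and the word list are hypothetical, and a real build would read the words from a data file.

```ocaml
(* Hypothetical build-time generator: returns the text of an OCaml
   compilation unit that defines a single array literal. *)
let array_literal ~name words =
  let buf = Buffer.create 256 in
  Printf.bprintf buf "let %s = [|\n" name;
  (* "%S" prints each word as an escaped OCaml string literal. *)
  List.iter (fun w -> Printf.bprintf buf "  %S;\n" w) words;
  Buffer.add_string buf "|]\n";
  Buffer.contents buf

let () =
  (* Placeholder data; a Makefile rule could redirect this into a .ml file. *)
  print_string (array_literal ~name:"areas" [ "Woods"; "Caves"; "Towers" ])
```

The generated unit is then compiled like any other module, so the executable does no parsing at start-up.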

At some point I investigated using malfunction as an alternative implementation to crunch (which I assume you are aware of, hence your munch.ml) because I had the perception that parsing large OCaml source files with huge data literals was slow. I made a small proof-of-concept that, as far as I remember, read data from a hard-coded file. I got stuck on figuring out how to make dune use this. For a Makefile setup this wouldn’t be a problem, I think. Unfortunately, I’m afraid the code only lives on the drive of my laptop, which has since died. Anyway, I think an approach with Malfunction could be interesting :smiley:

You can use Marshal: just dump the marshalled data to a file during your build phase and read it back at runtime.

Cheers,
Nicolas
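
A minimal sketch of the Marshal suggestion above, assuming the data is a `string array`. The function names `dump` and `load` and the temporary file are placeholders for illustration only.

```ocaml
(* Build phase: write the marshalled value to a file. *)
let dump path (v : string array) =
  let oc = open_out_bin path in
  Fun.protect
    ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc v [])

(* Runtime: read the value back. Marshal is not type-safe, so the
   annotation is a promise made by the programmer, not a check. *)
let load path : string array =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> Marshal.from_channel ic)

let () =
  (* Round trip through a temporary file, for illustration. *)
  let path = Filename.temp_file "areas" ".bin" in
  dump path [| "Woods"; "Wizards" |];
  assert (load path = [| "Woods"; "Wizards" |]);
  Sys.remove path
```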

I have used crunch in my previous random generator but I was unsatisfied with having the fake I/O and the parsing (even though it’s just some line splitting).

I was more focused on the cost of initialisation during the execution of the binary (e.g., avoiding a Hashtbl.of_seq (Array.to_seq <big-literal-here>) which does a whole lot of hashing, the same hashing every time as well). I hadn’t considered compilation time. That’s an interesting consideration and maybe I’ll use that next?

I’ve considered using marshalled data in a file but it still requires reading a file. It also requires an error-recovery mechanism for when the file is not present (paths having changed, XDG variables having been modified, etc.)

I think it’d be useful to have a small library that deals with the error-recovery: load the file if present and if it works, otherwise regenerate the file and rewrite it to disk. Do you know if anything like that exists?
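
A minimal sketch of what such a helper could look like, using only the standard library. The name `load_or_regenerate` and its labelled arguments are hypothetical; this is not an existing library.

```ocaml
(* Hypothetical helper: try to load a marshalled cache file; on failure
   (missing file, truncated or corrupt data) regenerate the value and
   rewrite the file. Caveat: Marshal is not type-safe, so a well-formed
   file written for a different type is NOT detected here. *)
let load_or_regenerate ~path ~(generate : unit -> 'a) : 'a =
  let load () =
    let ic = open_in_bin path in
    Fun.protect
      ~finally:(fun () -> close_in_noerr ic)
      (fun () -> (Marshal.from_channel ic : 'a))
  in
  match load () with
  | v -> v
  | exception (Sys_error _ | End_of_file | Failure _) ->
    let v = generate () in
    (* Best effort: a failure to write the cache is ignored. *)
    (try
       let oc = open_out_bin path in
       Fun.protect
         ~finally:(fun () -> close_out_noerr oc)
         (fun () -> Marshal.to_channel oc v [])
     with Sys_error _ -> ());
    v
```

A caller would pass the expensive construction as `generate`, for example a function building a `Hashtbl.t` from an array literal, so that the hashing only happens when the cache file is missing or unreadable.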

Not necessarily: you can embed the marshalled data as a string literal in your executable.

Cheers,
Nicolas

You can also marshal into a string, and generate a .ml file which defines this string.
At runtime you unmarshal from the string, which will be part of (the static data of) your executable.
There’s a bit of work to make sure the string is escaped properly, but I think Printf.printf "%S" will work (you can also use the {foo|... |foo} syntax if you can check in advance for occurrences of |foo} in the string).
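
A minimal sketch of this approach, assuming the data is a `string array`. The names `unit_source`, `decode` and `data` are hypothetical; in a real build the output of `unit_source` would be written to a .ml file and the runtime would call `decode` on the value that file defines.

```ocaml
(* Build phase: produce the source of a compilation unit that defines
   the marshalled bytes as a string literal. "%S" does the escaping. *)
let unit_source (v : string array) : string =
  Printf.sprintf "let data = %S\n" (Marshal.to_string v [])

(* Runtime: unmarshal from the embedded string, starting at offset 0.
   Marshal is not type-safe; the annotation is an unchecked promise. *)
let decode (data : string) : string array = Marshal.from_string data 0

let () =
  let v = [| "Woods"; "Wizards" |] in
  (* What the build step would write into the generated .ml file. *)
  print_string (unit_source v);
  (* What the program would do with the embedded string. *)
  assert (decode (Marshal.to_string v []) = v)
```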

That doesn’t work if you want your file to be well-encoded UTF-8 text. Escaping with "%S" will be more robust from that point of view.

I don’t really like "%S": it uses decimal escapes and puts everything on one long line.

If you can afford a bit more code, the stdlib based code here makes a let for you, with hex escapes and restricted to 80 columns (not sure why it tries to compute the length of the buffer so precisely :joy: perhaps this wanted to use a bytes directly).
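
To illustrate the idea (this is an independent sketch, not the code linked above), a stdlib-only function can print a let with hex escapes and short lines. The name `hex_let` and the choice of 18 escapes per line are assumptions.

```ocaml
(* Hypothetical helper: render [s] as an OCaml [let] whose string literal
   uses only hex escapes and stays under 80 columns. A backslash at the
   end of a line continues the literal: the newline and the leading
   blanks of the next line are skipped by the OCaml lexer. *)
let hex_let ~name (s : string) : string =
  (* 18 escapes of 4 characters = 72 columns, plus indentation. *)
  let per_line = 18 in
  let buf = Buffer.create ((String.length s * 4) + 64) in
  Printf.bprintf buf "let %s =\n  \"" name;
  String.iteri
    (fun i c ->
      if i > 0 && i mod per_line = 0 then Buffer.add_string buf "\\\n   ";
      Printf.bprintf buf "\\x%02x" (Char.code c))
    s;
  Buffer.add_string buf "\"\n";
  Buffer.contents buf

let () =
  (* Example: embed a marshalled array as a hex-escaped literal. *)
  print_string (hex_let ~name:"data" (Marshal.to_string [| "Woods" |] []))
```

Escaping every byte makes the output larger than "%S" would, but it keeps the generated file plain ASCII with predictable line lengths.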

Also, hxd.caml (available via opam install hxd) does that too (but without the let) :slight_smile:

The marshal suggestions are nice. I’m now thinking of a ppx which turns

let foo[@marshalled] = <some expression>

into

let foo = Marshal.from_string "<%S-escaped marshalled value of the expression>" 0

It wouldn’t do anything within the dev profile (of dune), to get a faster feedback loop, and would only be active for the release profile. I think it can work with some dune ocaml top-module <file-name>. I’ll give it a go at some point.

Sorry, I think my personal anecdote was confusing. I didn’t mean you should necessarily re-implement crunch, too. Instead, you could use malfunction to basically build a highly specialized compiler :smiley:

Have you tried ppx_blob?

let csv = [%blob "../resource/something.csv"] in
frobnicate csv

and that is all it takes to embed the contents of a file, at compile time. So if you change the resource, you had better delete the *.cm* files so as to force an update.

That looks nice! It doesn’t solve the issue of parsing/initialisation but it does offer a simple mechanism for including a marshalled blob. Thanks for the pointer.