Extracting "Extra C options" etc. from archive files

I need a tool that can read a cma/cmxa file and tell me the information it contains in:


Extra C object files:
Extra C options:
Extra dynamically-loaded libraries:

Ideally something written in C. If that’s not available: is the format of these files formally documented somewhere?

Thanks,

Gregg

The binary format of these files may change from one OCaml version to the next, as they consist of marshaled OCaml values.

The structure of these values is “documented” in the .mli files here and here.

Unless you want to implement a Marshal decoder in C, you can try to call OCaml from C or simply shell out to ocamlobjinfo and parse its output.
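To make the "parse its output" option concrete, here is a minimal sketch in C. It assumes each of the three fields is printed on its own line, starting with the label exactly as quoted in the question and followed by space-separated values. The function name `parse_extra_line` and the `LABELS` table are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Labels as quoted in the question. Assumption: each one starts a line of
   ocamlobjinfo output and is followed by space-separated values. */
static const char *const LABELS[] = {
    "Extra C object files:",
    "Extra C options:",
    "Extra dynamically-loaded libraries:",
};
enum { N_LABELS = sizeof LABELS / sizeof LABELS[0] };

/* If `line` starts with one of LABELS, copy the rest of the line (leading
   blanks and the trailing newline stripped, truncated to fit) into `out`
   and return the index of the label. Return -1 if no label matches. */
static int parse_extra_line(const char *line, char *out, size_t out_size)
{
    if (out_size == 0)
        return -1;
    for (int i = 0; i < N_LABELS; i++) {
        size_t n = strlen(LABELS[i]);
        if (strncmp(line, LABELS[i], n) != 0)
            continue;
        const char *value = line + n;
        value += strspn(value, " \t");          /* skip blanks after the label */
        size_t len = strcspn(value, "\r\n");    /* stop at the end of the line */
        if (len >= out_size)
            len = out_size - 1;
        memcpy(out, value, len);
        out[len] = '\0';
        return i;
    }
    return -1;
}
```

Feeding it every line read from the tool's output (for instance with `fgets`) and keeping the lines where the result is not -1 should be enough to collect the three fields.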

Thanks. That looks less complicated than I expected, I might give it a try. (In case you were wondering why C, this is for build tooling so I want to avoid the bootstrapping problems associated with using ocaml libraries.)

It is likely more complicated than you expect. For example, starting with OCaml 5.1, marshaled values may be compressed with zstandard.
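To illustrate why compression matters for a hand-written reader, here is a rough sketch in C that classifies the first four bytes of a marshaled value. The three magic constants are what I understand the runtime's `intext.h` to define; treat them as assumptions and check them against the compiler version you target. The names `classify_marshal_header` and `marshal_kind` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed Marshal header magic numbers (believed to match the OCaml
   runtime's intext.h; verify before relying on them). */
#define MAGIC_SMALL      0x8495A6BEu  /* regular header */
#define MAGIC_BIG        0x8495A6BFu  /* header for very large values */
#define MAGIC_COMPRESSED 0x8495A6BDu  /* compressed payload */

enum marshal_kind {
    MARSHAL_UNKNOWN,
    MARSHAL_SMALL,
    MARSHAL_BIG,
    MARSHAL_COMPRESSED
};

/* Classify the marshaled value starting at `p`. The magic number is read
   as a 32-bit big-endian integer; fewer than 4 bytes yields UNKNOWN. */
static enum marshal_kind classify_marshal_header(const unsigned char *p,
                                                 size_t len)
{
    if (len < 4)
        return MARSHAL_UNKNOWN;
    uint32_t magic = (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16
                   | (uint32_t)p[2] << 8  | (uint32_t)p[3];
    switch (magic) {
    case MAGIC_SMALL:      return MARSHAL_SMALL;
    case MAGIC_BIG:        return MARSHAL_BIG;
    case MAGIC_COMPRESSED: return MARSHAL_COMPRESSED;
    default:               return MARSHAL_UNKNOWN;
    }
}
```

A reader that only handles uncompressed data could use a check like this to bail out cleanly on a compressed value instead of misreading it.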

In any case the code here in omod has simple readers for OCaml compilation artefacts written in OCaml.

But if this is for build systems, I suspect you are better off parsing ocamlobjinfo’s output (invoked with -no-approx and -no-code). Depending on what you are looking for it may be a bit annoying to parse, but it has the advantage of being stable across compiler releases, which is not the case for the OCaml data structures stored in compilation objects (see for example the cmi reader here, whose code needs to be conditionalized for versions before 4.08).
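For build tooling written in C, the invocation might be assembled along these lines. This is only a sketch: `build_objinfo_command` is a hypothetical helper, it assumes `ocamlobjinfo` is on the `PATH`, and it quotes the archive path for a POSIX shell.

```c
#include <stdio.h>
#include <string.h>

/* Write "ocamlobjinfo -no-approx -no-code '<path>'" into `cmd`.
   The path is wrapped in single quotes for a POSIX shell, and any single
   quote inside it is emitted as '\''. Returns 0 on success, or -1 if the
   command does not fit in `cmd_size` bytes. */
static int build_objinfo_command(const char *path, char *cmd, size_t cmd_size)
{
    static const char prefix[] = "ocamlobjinfo -no-approx -no-code '";
    size_t pos = sizeof prefix - 1;

    if (cmd_size <= pos)
        return -1;
    memcpy(cmd, prefix, pos);

    for (const char *p = path; *p != '\0'; p++) {
        int is_quote = (*p == '\'');
        size_t n = is_quote ? 4 : 1;
        if (pos + n >= cmd_size)        /* keep room for what follows */
            return -1;
        if (is_quote)
            memcpy(cmd + pos, "'\\''", 4);
        else
            cmd[pos] = *p;
        pos += n;
    }

    if (pos + 2 > cmd_size)             /* closing quote plus terminator */
        return -1;
    cmd[pos++] = '\'';
    cmd[pos] = '\0';
    return 0;
}
```

The resulting string can be handed to POSIX `popen(cmd, "r")` and the output read line by line with `fgets`, looking for the three "Extra ..." labels.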

Thanks, I’ll take a look.

That would work, but it presupposes an OCaml installation, which I’d rather avoid. Another possibility would be to use strings or some other binutil, but tbh that doesn’t look very promising. OTOH I do smell a nice dirty trick opportunity. I only need those three fields, which are always at the end of the file. If I can figure out their exact format I should be able to just read the file backwards and pull the values. Wouldn’t be pretty but might work.

Not after 5.1 I’m afraid.

OCaml’s marshal implementation is written in C, so it’s probably not that much code to write the parser in C and compile it against libcamlrun (it might even be possible to do it without initialising the entire runtime).