Here is a blog post that I wrote recently about the growth in size of the OCaml distribution over recent years:
I hope you find it interesting!
–
Fabrice, OCamlPro
Making the compiler distribution smaller also helps with the size of local opam switches. Right now, a local switch takes up hundreds of megabytes, which can get painful fast if you have a lot of separate projects.
I hadn’t really thought about it, but there’s a sizeable chunk of executables in ~/.opam/default/bin:
$ opam switch
# switch compiler description
→ default ocaml.4.14.0 default
$ cd ~/.opam/default/bin
$ ls
cppo ocamlc.opt ocamllex.byte ocamlopt.byte odoc
dune ocamlcmt ocamllex.opt ocamlopt.opt omd
lambda-term-actions ocamlcp ocamllsp ocamloptp rescript_syntax
mel ocamlcp.byte ocamlmklib ocamloptp.byte safe_camlp4
melc ocamlcp.opt ocamlmklib.byte ocamloptp.opt usegtrip
meldep ocamldebug ocamlmklib.opt ocamlprof utftrip
menhir ocamldep ocamlmktop ocamlprof.byte utop
ocaml ocamldep.byte ocamlmktop.byte ocamlprof.opt utop-full
ocamlbuild ocamldep.opt ocamlmktop.opt ocamlrun ydump
ocamlbuild.byte ocamldoc ocamlobjinfo ocamlrund
ocamlbuild.native ocamldoc.opt ocamlobjinfo.byte ocamlruni
ocamlc ocamlfind ocamlobjinfo.opt ocamlyacc
ocamlc.byte ocamllex ocamlopt octavius
$ du -sh .
427M .
Are all of these really necessary? E.g. do we really need to distribute ocamlopt.byte and friends in a binary distribution?
EDIT: side note, what is octavius? It doesn’t appear in the source code. And it’s pretty mysterious:
$ octavius -help
File "-help" does not exist
$ octavius --help
File "--help" does not exist
$ octavius
Usage: octavius FILE
Yeah, I wonder who’s using the bytecode executables nowadays. They’re pretty huge too.
I think octavius is part of odoc?
The Docker images published under ocaml/opam don’t include the bytecode executables, and as far as I know that has never caused a CI issue!
While reviewing one of the PRs on the compiler’s build system recently, I was reminded of the -linkall flag on ocamlcommon. It’d be really nice to get rid of that, but that involves either splitting ocamlcommon or doing some very fiddly work on the type checker’s global state.
I don’t really understand the need to have -linkall on a library.
The only two cases I have met where it was used:
1. You have modules that perform side effects without being called by other modules, and you are afraid that, without -linkall, these modules wouldn’t be linked and the side effects would not be performed. You can usually fix this problem by having an init() function in the module that performs the side effects only once, and calling it from all the other modules that need the side effects.
2. You have modules that you want to link even when you don’t need them, because you use Dynlink and want them to be available for plugins. I think that, in such cases, you should put -linkall when linking the executable, not on the library. If this does not give you the granularity to choose which libraries should be linked-all, then another flag should be added to the compiler to fix this.
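To make the first case concrete, here is a minimal sketch of the init() idea. It is only an illustration: the nested modules stand in for separate .ml files of a library, and every name in it (Registry, Plugin_a, Client) is made up. Because everything lives in one file, it shows the shape of the pattern and not the linker behaviour itself.

```ocaml
(* Hypothetical sketch: each nested module stands in for one .ml file
   of a library. None of these names come from the compiler sources. *)

module Registry = struct
  (* Shared table that side-effecting modules add entries to. *)
  let handlers : (string, int -> int) Hashtbl.t = Hashtbl.create 8
  let register name f = Hashtbl.replace handlers name f
  let find name = Hashtbl.find_opt handlers name
end

module Plugin_a = struct
  let initialized = ref false
  let init_count = ref 0

  (* The side effect lives in init () instead of at module top level.
     The flag makes repeated calls harmless. *)
  let init () =
    if not !initialized then begin
      initialized := true;
      incr init_count;
      Registry.register "double" (fun x -> 2 * x)
    end
end

module Client = struct
  (* Calling Plugin_a.init is an explicit reference to Plugin_a, so a
     program that uses Client pulls Plugin_a in without -linkall. *)
  let run name x =
    Plugin_a.init ();
    match Registry.find name with
    | Some f -> Some (f x)
    | None -> None
end

let () =
  assert (Client.run "double" 21 = Some 42);
  assert (Client.run "double" 1 = Some 2);
  (* init ran its body only once despite two calls *)
  assert (!Plugin_a.init_count = 1);
  assert (Client.run "missing" 0 = None)
```

The cost is that every module relying on the side effect has to remember to call init (), which is exactly the kind of explicit dependency that -linkall papers over.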
Original discussion for ocamlcommon is in [github patch] add -linkall flag to ocamlcommon archives · Issue #6509 · ocaml/ocaml · GitHub. The problem is forward references, which can’t be solved trivially by init functions. The pattern here is something like foo.ml:
let forward_fn = ref Fun.id
let api_call x =
  let y = do_something_here x in
  !forward_fn y
and bar.ml:
let _ =
  Foo.forward_fn := (fun x -> (* ... *))
let api_call x =
  let y = Foo.api_call x in
  (* ... *)
and the fear is a program that uses Foo.api_call but makes no references itself to Bar. All fixable, but it’s not trivially solved by having an init function (Foo cannot call Bar.init since Bar references Foo), which is why I say “fiddly”.
I suppose that in the real example, you would split bar.ml into a foo_init:
let _ =
  Foo.forward_fn := (fun x -> (* ... *))
and bar.ml:
let () =
  (* ... *)
  let y = Foo.api_call x in
  (* ... *)
So, why not rename foo.ml into foo_needs_init.ml and create a file foo.ml:
let () = Foo_init.init ()
let api_call = Foo_needs_init.api_call
i.e. hide the part that needs initialization inside an internal module, which is then exported by an external module that performs the initialization?
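Put together, the wrapper idea could look roughly like the sketch below. This is an assumption-laden illustration, not the actual ocamlcommon layout: nested modules stand in for the files foo_needs_init.ml, foo_init.ml and foo.ml, the arithmetic is a placeholder for the real work, and Foo_init is given an explicit init () function so that the call in foo.ml type-checks.

```ocaml
(* Hypothetical single-file sketch; each nested module stands in for
   one .ml file of the proposed layout. *)

module Foo_needs_init = struct
  (* Forward reference, filled in later by Foo_init. *)
  let forward_fn : (int -> int) ref = ref Fun.id

  let api_call x =
    let y = x + 1 in   (* placeholder for do_something_here *)
    !forward_fn y
end

module Foo_init = struct
  let initialized = ref false

  (* Installs the forward reference, at most once. It refers to
     Foo_needs_init only, never to Foo, so there is no cycle. *)
  let init () =
    if not !initialized then begin
      initialized := true;
      Foo_needs_init.forward_fn := (fun x -> x * 10)
    end
end

module Foo = struct
  (* Public module: runs the initialization, then re-exports the API.
     Any program that references Foo now also references Foo_init. *)
  let () = Foo_init.init ()
  let api_call = Foo_needs_init.api_call
end

let () =
  (* A client that mentions only Foo still gets the initialized behaviour. *)
  assert (Foo.api_call 1 = 20)
```

This only works if the code installed by Foo_init can be written without depending on Foo itself; whether that holds for the real modules is the part that may turn out to be fiddly.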
Good news: there is a PR that will get some nice space savings with very minimal changes (only some Makefile hackery): Compile non-speed-critical tools to bytecode only by xavierleroy · Pull Request #11993 · ocaml/ocaml · GitHub
Looking for a review!
P.S. this is actually a follow-up to last week’s PR Reduce size of installed bytecode executables by xavierleroy · Pull Request #11981 · ocaml/ocaml · GitHub which already got some significant space savings.