I am trying to distribute a REPL that embeds some libraries, so that it can be run on a computer without ocaml/opam.
Typically, I want to let users run OCaml scripts that use a driver, e.g.
let () =
  let b = My_lib.open_base "/foo/bar" in
  (* ... *)
  My_lib.close_base b
I created a dummy library that depends on everything I want to embed and ran dune top, which gives me some directives.
Running ocaml -init <file_with_dune_top_output> works fine, so I edited my file to use relative paths, and it still worked.
Then I copied all the cma files used in the init file from .opam/4.11.1/lib/**/*.cma into ./etc/lib/**/*.cma.
And that is where things stopped working. It does not crash or anything, it just launches the REPL, but my libs are unbound.
Before I try to track down a hypothetical missing dependency in my ./etc/lib, is there any chance that distributing the ocaml binary and a few cma files would give the user a toplevel able to execute their scripts?
If not, is there anything I can do to obtain such a tool, i.e. an OCaml REPL with my libraries loaded that can be used on a computer without OCaml?
EDIT: I do not actually need an interactive REPL, even if it would be nice. I want to execute .ml files.
You might want to look at ocamlmktop (documentation).
If you want to run the resulting toplevel on a computer that doesn’t have an OCaml runtime installed, you’ll likely want to use the -custom flag.
Not really something that can be used off-the-shelf, but this can also be worked around by using a custom load function in Persistent_env.Persistent_signature; what is needed is some way to serialize the .cmi files in some form inside the main executable, and a way to read them back in the load function.
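To make the "serialize the .cmi files inside the main executable" part more concrete, here is a minimal sketch of a generator that marshals values and renders them as OCaml string literals for a generated data module. It only uses the standard library: fake_cmi is a hypothetical stand-in for the real Cmi_format.cmi_infos record from compiler-libs, and to_literal, entry, generate and of_literal are names made up for this example. Marshal is not type-safe, so the reading side presumably has to be built with the same compiler version as the writing side.

```ocaml
(* Hypothetical stand-in for Cmi_format.cmi_infos (the real record lives in
   compiler-libs and would be obtained by reading a .cmi file). *)
type fake_cmi = { name : string; exports : string list }

(* Marshal a value and render the bytes as an OCaml string literal. *)
let to_literal (v : 'a) : string =
  Printf.sprintf "%S" (Marshal.to_string v [])

(* Render one array entry of the form ("source path", "marshalled bytes"). *)
let entry ~src v = Printf.sprintf "(%S, %s)" src (to_literal v)

(* Render the body of the generated module: an array named cmis. *)
let generate entries =
  "let cmis = [|\n  "
  ^ String.concat ";\n  " (List.map (fun (src, v) -> entry ~src v) entries)
  ^ "\n|]\n"

(* Inverse of to_literal, used here only to check the round trip. *)
let of_literal (lit : string) : 'a =
  Marshal.from_string (Scanf.sscanf lit "%S" (fun s -> s)) 0

let () =
  let cmi = { name = "Driver"; exports = [ "open_base"; "close_base" ] } in
  let back : fake_cmi = of_literal (to_literal cmi) in
  assert (back = cmi);
  (* Print the text that would be written to the generated source file. *)
  print_string (generate [ ("driver.cmi", cmi) ])
```

The generated text can be written to a source file and linked into the toplevel, where a custom load function reads the entries back with Marshal.from_string.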
First, I tried the method described in the dune documentation.
$ dune build
$ echo "open Driver ;;" |_build/default/bin/testop.bc
OCaml version 4.11.1
Findlib has been successfully loaded. Additional directives:
#require "package";; to load a package
#list;; to list the available packages
#camlp4o;; to load camlp4 (standard syntax)
#camlp4r;; to load camlp4 (revised syntax)
#predicates "p,q,...";; to set these predicates
Topfind.reset();; to force that packages will be reloaded
#thread;; to enable threads
# open Driver ;;
Error: Unbound module Driver
Then, I tried using ocamlmktop:
$ ocamlmktop -o foo.exe -linkall -custom -I _build/default/lib/.driver.objs/byte _build/default/lib/driver.cma
$ echo "open Driver ;;" | ./foo.exe
OCaml version 4.11.1
Findlib has been successfully loaded. Additional directives:
#require "package";; to load a package
#list;; to list the available packages
#camlp4o;; to load camlp4 (standard syntax)
#camlp4r;; to load camlp4 (revised syntax)
#predicates "p,q,...";; to set these predicates
Topfind.reset();; to force that packages will be reloaded
#thread;; to enable threads
# open Driver ;;
Error: Unbound module Driver
#
Did I miss something obvious? Did I misunderstand what ocamlmktop actually does?
You need to make sure the toplevel sees driver.cmi, either by invoking it with an appropriate -I flag, or by invoking #directory at the beginning of the session or in a file provided via -init.
Ok, sorry about that, you mentioned it earlier, but I was expecting the module to be openable even without the .cmi. Don’t ask me why, or what the behavior should be in that case, I clearly have no idea.
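As an illustration of the -init variant, an init file could look roughly like the fragment below. This is only a sketch: the directory is the one passed to -I in the ocamlmktop command earlier in the thread, and on a target machine it would have to be replaced by wherever the .cmi files are actually shipped.

```ocaml
(* init.ml (hypothetical name), passed as: ./foo.exe -init init.ml *)

(* Tell the toplevel where to find driver.cmi; adjust to the real location. *)
#directory "_build/default/lib/.driver.objs/byte";;

(* The module is already linked in; this only makes it visible to user code. *)
open Driver;;
```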
It now loads if I give the -I flag. It still is a strange behavior to have to do so, I guess, since I do not really understand what it means for a module to be loaded (as ocamlmktop is supposed to do) but not openable/usable.
I’ll try to fix my problem by importing the cmi files, and will mark this as resolved after that, thanks.
It means that the module is loaded in there. Compiled code loaded inside the toplevel can use it. But it is hidden from the interactive user.
The code written by the interactive user needs to be typechecked and compiled which is what the .cmi files are used for, like in the regular compilation pipeline.
mktoplevel_data.cma contains all the needed cmis marshalled into a string, and mktoplevel.cma does this:
let () =
  (* Do not print the version banner and do not load any init file. *)
  Clflags.noversion := true ;
  Clflags.noinit := true ;
  (* Keep the default loader so that it can be used as a fallback. *)
  let old = !Persistent_env.Persistent_signature.load in
  (* Index the embedded interfaces by compilation unit name. *)
  let ht = Hashtbl.create (Array.length Data.cmis) in
  Array.iter begin fun (src, cmi) ->
    let filename = src in
    let cmi = Marshal.from_string cmi 0 in
    print_endline ("Loading " ^ cmi.Cmi_format.cmi_name ^ " -- " ^ src) ;
    let x = Persistent_env.Persistent_signature.{ filename ; cmi } in
    Hashtbl.add ht cmi.Cmi_format.cmi_name x
  end Data.cmis ;
  (* Serve interfaces from the table first, then fall back to the default loader. *)
  Persistent_env.Persistent_signature.load := begin fun ~unit_name ->
    match Hashtbl.find_opt ht unit_name with
    | Some t as x -> print_endline ("Found " ^ unit_name ^ " -- " ^ t.filename) ; x
    | None -> print_endline ("Fallback " ^ unit_name) ; Unix.sleep 1 ; old ~unit_name
  end
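The lookup-then-fallback logic can be tried without compiler-libs using a small stand-alone model. Everything here is a stand-in: signature mimics the shape of Persistent_env.Persistent_signature.t, load mimics the overridable hook, and install is a name invented for this sketch. It is not the compiler's API, just the same pattern in miniature.

```ocaml
(* Hypothetical stand-in for Persistent_env.Persistent_signature.t. *)
type signature = { filename : string; payload : string }

(* Stand-in for the overridable loader hook. The default finds nothing,
   as if no .cmi file existed on disk. *)
let load : (unit_name:string -> signature option) ref =
  ref (fun ~unit_name:_ -> None)

(* Replace the hook with a table lookup that falls back to the old loader. *)
let install (embedded : (string * signature) list) =
  let old = !load in
  let table = Hashtbl.create (List.length embedded) in
  List.iter (fun (name, s) -> Hashtbl.replace table name s) embedded;
  load := fun ~unit_name ->
    match Hashtbl.find_opt table unit_name with
    | Some _ as hit -> hit
    | None -> old ~unit_name

let () =
  install
    [ ("Driver", { filename = "<embedded>/driver.cmi"; payload = "..." }) ];
  (* An embedded unit is served from the table. *)
  assert (!load ~unit_name:"Driver" <> None);
  (* An unknown unit goes through the fallback, which finds nothing here. *)
  assert (!load ~unit_name:"Missing" = None)
```

In this model, a unit that is neither embedded nor found by the fallback is simply reported as missing, which is the situation an "Unbound module" error corresponds to.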
If I build this on my computer and launch gwrepl.exe, I get this (which is what is expected):
Hum… but there should be no stdlib.cmi at all, since there is no OCaml on the target machine (isn’t that the purpose of the -custom option?). Also, note that although I give the complete path in the log, it is actually meaningless, since interfaces are served directly from embedded values (marshalled Cmi_format).
I’m interested too, but I didn’t even know about @nojb’s trick. If you have time, you could maybe try to make a simple self-contained example with repro instructions, so that we can look at it in more detail.
You could try adding Clflags.no_std_include := true to your code, which should make both versions (on your computer and through docker) behave similarly. It will likely mean that both will fail, but it might be easier to debug.
You could also try adding Clflags.nopervasives := true. I suspect that the code in the compiler treats initially opened modules differently (requiring a .cmi file to be present on disk for them, even though it might not be read if Persistent_env.Persistent_signature.load is modified), and the nopervasives flag tells the compiler not to open the Stdlib module automatically. If that works, you will have to prefix your inputs with open Stdlib though (if you plan to provide an init file, you can put it at the top of that file instead).
If the nopervasives trick works, then it likely means that the Unbound module Stdlib error you get is a bug in the compiler, and it would be nice to file an issue about it.
I’m coming late to the discussion, but indeed the compiler has some special logic around the opening of the “initial module” (Stdlib unless nopervasives is set to true), which looks like it may be going around Persistent_signature.load, so I suspect doing what @vlaviron suggests will probably be enough to fix the issue at hand.
Filing an issue to investigate whether this is an expected behaviour or a bug (which may be related to some later changes done to support the “prefixed” Stdlib) would also be a good idea.
Cheers,
Nicolas
PS the relevant bit of the compiler is the function Typemod.initial_env which is called with initially_opened_module equal to Some "Stdlib" when nopervasives is false.
# open Stdlib ;;
Error: Unbound module Stdlib
# open Stdlib.List ;;
Found Stdlib -- /home/geneweb/.opam/4.09.1+flambda+no-flat-float-array/lib/ocaml/stdlib.cmi
Found Stdlib__list -- /home/geneweb/.opam/4.09.1+flambda+no-flat-float-array/lib/ocaml/stdlib__list.cmi
This looks like the code that wrongly relies on the filesystem is not restricted to the initial opens.
Out of curiosity, if you try to open Stdlib twice, does it work the second time?
I can’t find any reason why Stdlib.List would work and not Stdlib, but I would be less surprised if it turns out that trying to load a module twice could first fail then work.