(tuareg, ocaml REPL) Request for feedback about a proper way to auto-require packages

Dear all,

I’m more and more interested in finding ways to address this use case:

currently, when working with an OCaml project involving many dependencies, it is difficult to perform tests in a REPL (C-c C-s) with the proper packages loaded.
Granted, one can manually evaluate #use "topfind";; then #require "jwto";; or so for each package, but it quickly becomes cumbersome.

@yurug told me recently about the availability of dune utop, which is really cool!
However:

  • it implies the project is managed by dune (which may not be the case for a proof-of-concept development or so, with a single compilation unit and a .merlin listing the libs directories)
  • it requires utop, while I’d prefer a generic approach, just starting the ocaml REPL in a vanilla tuareg+merlin setup when doing C-c C-s.

I had opened a related issue and proposed a workaround in auto-require packages when doing C-c C-s (tuareg issue #234); a refinement of it is pasted below:

Assume I want to automate a session that would interactively look like:

#use "topfind";;
#thread;;
#require "lwt";;
#require "lwt.unix";;
#require "compiler-libs";;
#load "ocamlcommon.cma";;
#load "ocamlbytecomp.cma";;

and by automating, I mean just typing C-c C-s (or maybe C-u C-c C-s in tuareg) to get an equivalent REPL session.

The current workaround I have is to run in a terminal:

args=($(OCAMLFIND_COMMANDS='ocamlc=echo' ocamlfind ocamlc -linkpkg $(for arg in lwt lwt.unix compiler-libs; do echo "-package $arg"; done) -thread ocamlcommon.cma ocamlbytecomp.cma)); args[0]="ocaml"
rlwrap "${args[@]}"

(or run echo "${args[@]}" then copy-and-paste the full command in Tuareg’s prompt
OCaml REPL to run: [/usr/local/bin/opam config exec -- ocaml by default].)

As for the OCAMLFIND_COMMANDS='ocamlc=echo' trick: the -thread option is needed on ocamlfind’s command line, but unlike ocamlfind ocamlc, ocaml does not support the -thread option. However, that option comes first in the printed argument list, so overwriting args[0] with "ocaml" conveniently replaces it.
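A possible alternative to rewriting the ocamlfind command line is to generate an init file and start the toplevel with ocaml -init <file>. The sketch below only renders the directives of the session above as text; init_script and the file name session_init.ml are hypothetical names chosen for illustration, and the package list is the one used in this post.

```ocaml
(* Sketch only: [init_script] is a hypothetical helper, not part of
   findlib, tuareg or merlin. It renders toplevel directives as text. *)
let init_script ?(thread = false) ~packages ~cmas () =
  let lines =
    List.concat
      [ [ "#use \"topfind\";;" ];
        (if thread then [ "#thread;;" ] else []);
        (* %S prints each name as a quoted OCaml string literal *)
        List.map (Printf.sprintf "#require %S;;") packages;
        List.map (Printf.sprintf "#load %S;;") cmas ]
  in
  String.concat "\n" lines ^ "\n"

let () =
  print_string
    (init_script ~thread:true
       ~packages:[ "lwt"; "lwt.unix"; "compiler-libs" ]
       ~cmas:[ "ocamlcommon.cma"; "ocamlbytecomp.cma" ]
       ())
```

Saving the output as session_init.ml and entering ocaml -init session_init.ml at Tuareg’s "OCaml REPL to run" prompt should give an equivalent session. Note that -init replaces the default .ocamlinit instead of adding to it.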

Admittedly, this is hacky, but I guess the underlying use case can be important in one’s workflow; so I hope one such feature could be available soon in tuareg.

So, first sub-question:

would some of you be aware of simpler solutions to achieve this? (possibly based on ocamlfind or opam, but not requiring utop)

Also, as I had mentioned in the issue #234, a starting-point for specifying the packages to load could be the .merlin file, but I believe there are typically no PKG lines in dune-generated .merlin files (if I’m not mistaken), so maybe some post-processing of the .merlin is needed.

Hence the second sub-question:

assuming merlin is installed and a .merlin file enumerates some additional directories to load, would it be feasible to use the existing elisp code of Merlin, or an ocamlmerlin command, to get the list of .cma files to load (and/or -I CLI flags) for the current buffer?
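For a hand-written .merlin that does contain PKG lines, extracting the package names is straightforward. The following is a minimal sketch, assuming the usual "PKG name1 name2 ..." line format; merlin_packages is a hypothetical helper and does not use any Merlin API (List.concat_map needs OCaml 4.10 or later).

```ocaml
(* Sketch only: [merlin_packages] is a hypothetical helper. It collects the
   findlib package names listed on the PKG lines of a .merlin file, given
   the file contents as a string. Other directives (S, B, FLG, ...) are
   ignored. *)
let merlin_packages contents =
  String.split_on_char '\n' contents
  |> List.concat_map (fun line ->
         match String.split_on_char ' ' (String.trim line) with
         | "PKG" :: pkgs -> List.filter (fun p -> p <> "") pkgs
         | _ -> [])

let () =
  let sample = "S src\nB _build/default/src\nPKG lwt lwt.unix\nPKG compiler-libs\n" in
  List.iter (Printf.printf "#require %S;;\n") (merlin_packages sample)
```

This does nothing for dune-generated .merlin files without PKG lines, which is the harder case raised above.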


IIUC, you’re asking for something pretty automatic that would infer which requires need to be executed to set up the toplevel so that you can load files from that directory, yes? That seems like a tough goal. I tend myself to drop “include_ml” files that contain the script of commands, and then I just

#use "include_ml";;

right after C-c C-s in Emacs/Tuareg. After long enough, it becomes pretty much instinctive – M-r u s e pulls it right up. Or, y’know, M-r i n c l.

Somehow I never managed to get anything real done with the whole eval phrase in subshell in caml-mode.

The problem is that the infrastructure is totally inapt at handling redefinitions in your code. Namely, when you change your definitions, you want to re-evaluate all dependents to be sure they actually use your redefinition and not the version previously sent to the toplevel.
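A tiny illustration of the redefinition problem (the names rate and price are made up for this example): sending a new definition of rate to the toplevel does not update price, which keeps using the binding it was evaluated with.

```ocaml
let rate = 2
let price x = x * rate

(* "Re-evaluating" only [rate], as one would after editing its definition *)
let rate = 3

let () =
  (* [price] still refers to the old [rate]; it was not re-evaluated *)
  Printf.printf "rate = %d, price 10 = %d\n" rate (price 10)
(* prints "rate = 3, price 10 = 20" *)
```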

I think if one ever wants to be serious about this style of development (which would be great!) the road is to provide a Proof General-like experience where you interactively replay the definitions of your project as you tweak them.


Have you tried ocaml-top? It seems to do that re-evaluation stuff.


You meant this ? No. I wonder if that works on whole projects.

But in any case I’m not interested in using another editor :–) (Aside, I don’t think it’s such a good idea to give newcomers systems that working programmers do not use. It makes it difficult to help them when they get into trouble. But I’m not a teacher, maybe that works for the classroom.)

In general I think the current toplevel interactive experience is not super useful except to test little snippets or summing up your monthly bills. It would be worth revisiting it (also with Jupyter/Mathematica notebooks in mind).

Note that dune also has dune top that will output the sequence of #directory and #load directives needed to set up the usual ocaml toplevel.

Cheers,
Nicolas


This is quite useful, however one annoyance I have with it is that if I’m working on a dune project that only has executable stanzas and no libraries, then both dune top and dune utop don’t load anything.

Is there some kind of underlying reason why this is the case or is this just another dune idiosyncrasy? Why can’t dune return the required external libraries for executables as well?

Also, it’s not like dune doesn’t have that information present - I was able to hack together a small emacs function to work out the external libraries using output from dune:

(require 'seq)     ; seq-filter, seq-map
(require 'subr-x)  ; string-empty-p

(defun dune-retrieve-external-libraries ()
  "Return a list of the external OCaml libraries used by the
current directory -- assumes a dune-based project."
  (let ((str (shell-command-to-string "opam exec -- dune external-lib-deps @all")))
    (setq str (split-string str "\n"))
    ;; Ignore dune's change directory log values
    (when (string-prefix-p "Entering " (car str))
      (setq str (cdr str)))
    ;; Drop the information text
    (when (string-prefix-p "These are the " (car str))
      (setq str (cdr str)))
    ;; Drop empty lines, then strip the leading "- " from each entry
    (setq str (seq-filter (lambda (vl) (not (string-empty-p vl))) str))
    (setq str (seq-map (lambda (vl) (substring vl 2)) str))
    str))

But it would be nice if dune did this itself.
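The same filtering can be sketched in OCaml, for instance to turn the list into #require directives. This is only a sketch: the "- name" line format is assumed from the elisp function above, not taken from dune’s documentation, and external_libs, require_directives and the sample text are hypothetical. Unlike the elisp version, it keeps only lines starting with "- " instead of dropping specific header lines.

```ocaml
(* Sketch only: hypothetical helpers mirroring the elisp function above. *)
let has_prefix ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Keep the lines of the form "- name" and strip the leading "- " *)
let external_libs output =
  String.split_on_char '\n' output
  |> List.filter_map (fun l ->
         if has_prefix ~prefix:"- " l then
           Some (String.sub l 2 (String.length l - 2))
         else None)

let require_directives libs =
  List.map (Printf.sprintf "#require %S;;") libs

let () =
  (* Made-up sample text in the assumed format *)
  let sample = "Entering directory 'demo'\nThese are the libraries:\n- lwt\n- lwt.unix\n" in
  List.iter print_endline (require_directives (external_libs sample))
```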

In general dealing with external libraries is complicated because of the mismatch between opam package names and “findlib” names. In fact dune does not have access to this mapping, and the external-lib-deps command applied some heuristics which were not always right, and moreover it had a complicated implementation.

(By the way, I wrote “had” because it was decided to remove this command from the upcoming release 3.0 of Dune…)

That said, feel free to open an issue at Issues · ocaml/dune · GitHub with your use-case if you wish so that the developer team can keep it in mind.

Cheers,
Nicolas


Ah, right, so I guess the issue with automating this process is the fact that there isn’t a consistent way to map from opam names to findlib names.

Indeed, when I look at the output of dune top on a project with a library, it provides absolute paths to the .cma files for the libraries.

If we set aside the findlib names, why can’t dune top generate these same load directives if the project only contains an executable?


Yes, the OCamlPro ocaml-top. And I do not suggest it for whole projects. The way I use it is to have, at the top of the project, several files named <something>_top.ml that look something like this:

#use "topfind";;

#require "lib_1";;
#require "lib_2";;

#directory "src";;
#directory "src/util";;
#mod_use "file_1.ml";;
#mod_use "file_2.ml";;

(* Some copy-paste *)

#show Cmdliner.Term;;

(* some code which I'm trying to get right *)

A real example is here: ocp_indent_top.ml, I was trying to figure out how to use ocp-indent library

It is a bit tedious to set up, but the feedback I get is quick, and yes, if some definitions are changed, they are reloaded if they are at the beginning or in the middle of the file.

Without looking at the details, I imagine this is probably doable but that no-one has asked for it before… The main issue is probably coming up with the right user interface. In any case, I suggest you open an issue at Issues · ocaml/dune · GitHub in order to get a discussion started around it.

Cheers,
Nicolas

In general, your life will be nicer if you structure your dune projects as one or more libraries with almost all the logic (down to defining e.g. a Cmdliner.Term.t) and then have a trivial executable entrypoint. Having non-trivial content in executables also makes things harder for testing and documentation generation.

@roddy you are right but I don’t think what @Gopiandcode wants is unreasonable.

At some point you still do get quite a bit of tool specific gluing logic that is not super useful to turn into a library but that for some reason you might want to explore in action interactively. I don’t see what’s the problem with simply loading the compilation units of the executable in the toplevel.

I have a similar mode in the (unpublished) brzo build tool and I grew quite fond of it. For this reason I nowadays often write my main invocation as:

let () = if !Sys.interactive then () else main ()

(I’m always delighted as to how confusing that must read to, say, a C programmer :–) which allows using the program both as an executable and interactively from the toplevel.
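A slightly fuller sketch of that pattern, with made-up names (greeting, main and the file name prog.ml are only for illustration):

```ocaml
(* A definition one may want to explore interactively *)
let greeting name = "hello, " ^ name

let main () = print_endline (greeting "world")

(* Run [main] only when the code is not loaded into the toplevel *)
let () = if !Sys.interactive then () else main ()
```

Compiled as an executable, this prints "hello, world". Loaded in an interactive toplevel with #use "prog.ml";; it defines greeting and main without running main, so both can be called by hand.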


I don’t disagree that cramming your entire project into a single executable stanza isn’t the best project structure, and, for the record, when I plan to make a public executable I do try to split out the project into separate libraries.

However, when I’m starting out a project in the prototyping phase, I often find myself working with projects with just a single executable[1] - in these cases, it’s quite annoying to find that dune utop doesn’t work as expected, and to have to resort to manually writing an .ocamlinit file.

[1] My usual workflow when starting a project is to use dune init exe <name> to create a skeleton executable project (which contains only an executable), and then set dune exec ./<name>.exe as my compile command and then use a mixture of printf debugging to explore potential program designs. Then, only once I start to finalize on a project structure, do I split out the executable into libraries.


As Nicolas mentioned, it’s just that nobody bothered implementing this in dune yet.

However, I did not forget to consider this feature when implementing the first version of toplevel support ($ dune utop). Dune’s toplevel is intended to recursively load all the code under a directory. If the directory contains many executables (and tests), it would be quite annoying to execute all their side effects. At this point I gave up and decided to only load libraries.

Thinking about this again, perhaps dune top could introduce a directive to load individual executables. E.g.

# use_output "dune top";;
# dune_exe "foo";; (* load the compilation units corresponding to (executable (name foo)) *)

Ah, I see how that could be a problem - maybe executing the side effects of an executable wouldn’t be ideal, but would it be possible to just load the external libraries it depends on? Currently when running dune utop in a project with only an executable target, nothing at all gets loaded.

If the entry point of your executable is guarded by a

let _ = if !Sys.interactive then () else entry_point ()

then you can probably use #load_rec

#load <str>
  Load in memory a bytecode object, produced by ocamlc.

#load_rec <str>
  As #load, but loads dependencies recursively

That seems like an ad-hoc, but sensible workaround. I see you’ve already made a proposal on the bug tracker, is it the same thing as this proposal? If not, how does it compare?

I believe this workaround is pretty much the same as what I had proposed on the bug tracker - admittedly I had overlooked the question of how to deal with side-effects of executables, but the main aim was to attain a similar execution context as the executable (i.e. with the same libraries, as best as possible, loaded as well).

I’ll modify the issue to clarify issues with side-effects.