Dynamically link native code into utop?

I think I know the answer to this question, but I wanted to make sure that my understanding is correct. (I’ve been looking for an answer, and everything points toward the following, but I haven’t found an explicit statement of it.)

I believe that there is no easy way to cause utop to dynamically link a native code library (e.g. built from OCaml source). I know that I can load a bytecode library into utop at runtime, and I know that it’s possible to build a custom utop that links in a native code library, but I don’t believe one can easily link native code in during the operation of utop or via command line options when I start it up.

Is this correct?

(If there is a non-simple way to dynamically link native code libraries into utop, that’s interesting, and I want to know about it, but I’m unlikely to pursue it at this time assuming that creating a custom utop is easier.)

Yes. In other words, the OCaml toplevel only allows you to load .cmo and .cma compilation objects (and their C bindings if you are on a platform that supports C dynamic linking).

There is a native toplevel in the ocaml repository but I don’t think it’s working. I vaguely remember there were plans to revive it, but I don’t know where that went. Another line of work on native toplevels can be found here.

Thanks @dbuenzli. I’m happy to know about the native toplevel projects. I should probably stay away from any strategy that’s not well-supported currently, so I’ll focus on learning to make a custom toplevel.

I’m not sure why you need this but note that a custom toplevel won’t bring you native OCaml execution at all.

It must be used for OCaml libraries with C bindings on platforms that do not support dynlink or can be used for prepackaging a toplevel with libraries so that you don’t have to #use them. But as far as OCaml code is concerned it always remains bytecode execution.
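
One quick way to confirm this from inside any toplevel is to ask the runtime which backend is executing the code. A minimal sketch, assuming OCaml 4.04 or later (where Sys.backend_type is available); the helper name backend_name is made up for illustration:

```ocaml
(* Report which OCaml backend is executing this code.
   Sys.backend_type is in the standard library (OCaml >= 4.04).
   [backend_name] is a hypothetical helper, not part of any library. *)
let backend_name () =
  match Sys.backend_type with
  | Sys.Native -> "native"
  | Sys.Bytecode -> "bytecode"
  | Sys.Other name -> name

let () = print_endline ("running as: " ^ backend_name ())
```

Pasting this into the regular ocaml toplevel or utop (custom or not) should report bytecode, whereas the same file compiled with ocamlopt should report native.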

Oh.

I’m writing a model using Owl in which some operations involve moderately complicated processing in OCaml code–i.e. some things have to be done using calculations on individual values, rows, and columns of matrices. I don’t think the algorithms can be translated into pure matrix operations, which could then be handed off to OpenBLAS. Now that I’ve gotten the code working on toy examples, I’m finding that if I process reasonably large matrices such as 1000x1000, some operations can take a very long time. No doubt my code can be optimized further, although I suspect the speed improvements would not be worth the additional effort. I have to investigate further. In any event, I was thinking that moving from bytecode to native code might improve speed significantly, and would be easy to test.
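
To make the "easy to test" part concrete, here is a rough sketch of the kind of comparison meant. It is not the actual model code: normalize_rows is a hypothetical stand-in for per-row processing, and it uses plain float array array values from the standard library rather than Owl matrices, so the timings are only indicative.

```ocaml
(* Hypothetical stand-in for per-row processing that cannot be expressed
   as a single matrix operation: divide each row by its maximum. *)
let normalize_rows (m : float array array) : float array array =
  Array.map
    (fun row ->
      let mx = Array.fold_left max neg_infinity row in
      if mx = 0.0 then Array.copy row
      else Array.map (fun x -> x /. mx) row)
    m

(* Time a thunk using processor time from the standard library. *)
let time_it f =
  let t0 = Sys.time () in
  let result = f () in
  (result, Sys.time () -. t0)

let () =
  let n = 1000 in
  (* Build an n x n matrix with strictly positive entries. *)
  let m =
    Array.init n (fun i -> Array.init n (fun j -> float_of_int (i + j + 1)))
  in
  let _, seconds = time_it (fun () -> normalize_rows m) in
  Printf.printf "normalize_rows on %dx%d: %.3fs\n" n n seconds
```

Saving this as, say, bench.ml and building it once with ocamlc bench.ml -o bench.byte and once with ocamlopt bench.ml -o bench.native gives two executables whose printed timings can be compared directly.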

I know how to make a native code application. I have done that with a related project. However, what’s most useful, for many purposes, is to be able to experiment with the models in a REPL of some kind. That will always be true, even when I get to the point where it’s useful to standardize some sequences of operations as a separate application. utop is a great environment for experimenting with the model, except that it’s inconvenient to sit and wait many minutes for an operation to finish. I’d rather run the slow operations in native code, but I want to be able to do that in a REPL.

Do I have to write my own REPL into a native code application in order to do this? Or is there some way to do this that’s not too difficult, given that utop and ocaml already include routines that allow construction of a REPL? (I’m thinking about traditional lisps where you really can just compose the functions read, eval, and print in a loop to get a … REPL.)

Yes, you could do your own mini-language that interprets and hooks into your own (native) primitives of your native application.

Maybe you can try to reuse lua-ml for this, here’s a nice paper about it. (Disclaimer: I never tried to use that software.)
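
To give an idea of what the mini-language route could look like at its very simplest, here is a sketch of a command loop that dispatches to native functions. Everything in it is an assumption made for illustration: the primitives table, the "name arg1 arg2 ..." line syntax, and the function names are invented, and a real model would expose its own operations instead.

```ocaml
(* Hypothetical native primitives exposed to the command loop. *)
let primitives : (string * (float list -> float)) list =
  [ ("sum", List.fold_left ( +. ) 0.0);
    ("prod", List.fold_left ( *. ) 1.0);
    ("max", List.fold_left max neg_infinity) ]

(* Parse every argument as a float; None if any of them is not a number. *)
let parse_args (args : string list) : float list option =
  List.fold_right
    (fun s acc ->
      match acc, float_of_string_opt s with
      | Some xs, Some x -> Some (x :: xs)
      | _ -> None)
    args (Some [])

(* Evaluate one line of the form "name arg1 arg2 ...". *)
let eval_line (line : string) : (float, string) result =
  let words =
    List.filter (fun s -> s <> "") (String.split_on_char ' ' (String.trim line))
  in
  match words with
  | [] -> Error "empty input"
  | name :: args ->
      (match List.assoc_opt name primitives with
       | None -> Error ("unknown primitive: " ^ name)
       | Some f ->
           (match parse_args args with
            | None -> Error "arguments must be numbers"
            | Some xs -> Ok (f xs)))

(* Read, evaluate, print, repeat until end of input.
   The program's entry point would call [repl ()]. *)
let rec repl () =
  print_string "> ";
  match read_line () with
  | exception End_of_file -> ()
  | line ->
      (match eval_line line with
       | Ok v -> Printf.printf "%g\n" v
       | Error msg -> Printf.printf "error: %s\n" msg);
      repl ()
```

Compiled with ocamlopt, the primitives run as native code, but the "language" is only as rich as the table you maintain by hand, which is the main cost of this approach compared to a real toplevel.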

There is a native toplevel in the ocaml repository but I don’t think it’s working

It works ok. Its internals are a bit of a hack, but it’s basically fine. We use it quite a bit at Jane Street in the form of a native code utop. Unfortunately, I think the native code utop support isn’t released externally anywhere – and might be quite tricky to release since our internal utop has diverged significantly from the external one (e.g. using async instead of lwt).

Nooooo …
It’s a lot more work than is justified by the payoff.

(The nice thing about the roll-your-own-repl trick in some lisps is that read and eval are functions that already handle the entire language, plus any functions you define. Not all lisps are the same though. It’s easy in Common Lisp, but not so straightforward in Clojure. I understand that OCaml can’t have a function like CL’s eval, though, given the type system.)

It’s puzzling to me that one can make a custom utop that calls via C bindings, but it can’t call native OCaml code directly. I am probably misunderstanding something. If I wrote a C wrapper for an OCaml library, could I compile the OCaml library source to native code and then use the wrapper to link the whole thing into a custom utop? Maybe that would be easier, actually, than writing an interpreter.

I guess the main reason is that you need a single runtime system and gc to be in charge as those are different for bytecode and native code (e.g. think only about the calling conventions).

Maybe you should still trust @lpw25 and try the ocamlnat route, you won’t have all the bells and whistles of utop but rlwrap ocamlnat will already bring you a long way. Another issue may be ocamlfind’s toplevel support (?) but you can load things a bit more manually for the time being.

I don’t understand why people talk so much about multicore, but never about ocamlnat. I don’t use the toplevel that much because it’s much much slower than (flambda) native code for the kind of programs I write, which is a shame. A native utop would likely change my workflow.

That’s an interesting question. Speculating, I wonder if the community has just had more of a focus on speed in standalone applications than extremely efficient computation in a toplevel, which is already fast enough for many purposes.

I also think that multicore is something that functional languages can do particularly well, in theory, and it’s something that can convince imperative programmers to make the big leap to functional programming, so it makes sense to me to emphasize catching up to other functional languages on this dimension.

I obviously agree with you, though, that a fast toplevel is valuable.

One thing that I like about the OCaml community is that it’s rooted more in academia than some other language communities. I have the feeling that the OCaml community is more sensitive to the needs of scientific research. I’m not against business applications–I’ve been in that world as well, and OCaml has obviously benefited from its adoption by businesses–but in some language communities, it’s difficult to get anyone to care about tools that would primarily be of use in research. A fast toplevel is that kind of tool, I believe. (Maybe that’s wrong, though, since Jane Street uses native code toplevels.) Perhaps if ocamlnat or something like that became well supported, that would be a big selling point for some researchers.

I spent some time today to see where ocamlnat stands. I managed to get it to work both on linux (via docker) and on macos.

You can try it by using this opam v2 repo (see the README for instructions).

With pure ocaml code it seems to work fine. Here’s a sample interaction using gg:

bash-4.3$ rlwrap ocamlnat -noinit 
        OCaml version 4.05.0 - native toplevel

# #directory "/home/opam/.opam/4.05.0+ocamlnat/lib/gg";;
# #load "bigarray.cmxs";;
# #load "gg.cmxs";;
# #install_printer Gg.V2.pp;;
# open Gg;;
# V2.(ox + oy);;
- : Gg.v2 = (1 1)

A few notes:

  1. ocamlnat is invoked with -noinit to prevent it from reading your .ocamlinit, which likely has a #use "topfind" that fails (more on that below).
  2. As a result include directories and dependencies are loaded manually. Note: it’s the cmxs you need to load. You’ll need to make sure the libraries you use properly build them.
  3. Toplevel support libraries that install printers like gg_top.cmxs fail because they try to use the Toploop module rather than Opttoploop.
  4. I tried with a library with simple C bindings (mtime), it works out of the box on linux but on macos it needs a few adjustments on the cli to find both the static C library (?) and the dynamic one. See the interaction at the end of the message.

1 and 3 are due to the unfortunate API design of the byte and native toploop API that splits the module names into Toploop and Opttoploop rather than expose the same names and treat the problem as a library variant. This implies that any consumer of the API has to conditionalize itself, which doesn’t scale; this problem is documented in MPR 7589.

For a library with C bindings on macos the following env vars had to be adjusted (and there are weird warnings whose origin I didn’t investigate; I suspect a configuration thing that is no longer relevant in ocaml itself):

> DYLD_LIBRARY_PATH=$(opam var stublibs) LIBRARY_PATH=$(opam var mtime:lib)/os rlwrap ocamlnat -noinit
        OCaml version 4.05.0 - native toplevel

# #directory "/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/mtime";;
# #directory "/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/mtime/os";;
# #load "mtime.cmxs";;
# #use "mtime_top_init.ml";;
ld: warning: directory not found for option '-L/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/ocaml/camlp4'
ld: warning: directory not found for option '-L/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/ocaml/camlp4'
ld: warning: directory not found for option '-L/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/ocaml/camlp4'
ld: warning: directory not found for option '-L/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/ocaml/camlp4'
# #load "mtime_clock.cmxs";;
# Mtime_clock.now ();;
ld: warning: directory not found for option '-L/Users/dbuenzli/.opam/4.05.0+ocamlnat/lib/ocaml/camlp4'
- : Mtime.t = 92862271359034ns

Thanks @dbuenzli !

This looks great. I’ll first have to get opam 2.0 installed properly. So far I’m having trouble with that, but that’s a different issue.

(I had tried building ocamlnat on my own with 4.05.0 and got an error that I have not yet had a chance to investigate:

File "src/typing/printtyp.mli", line 19, characters 5-16:
Error: Unbound module Outcometree

This is something you’ve no doubt resolved in your version, so I’ll work on getting opam 2.0 set up.)

I have opam 2.0 and ocamlnat installed now. (!) Thanks again for all of your help, @dbuenzli.

This question might be of interest to others, and I’m sure there are many people who could answer it.

When I now try to install packages for the 4.05.0+ocamlnat switch, the first required action listed is always to “install ocaml 4.05.0”. I cancel the install when I see that. What does this mean? Is it going to mess up the switch, or replace ocamlnat, or have no effect, or … ? Is it possible to install opam packages (e.g. owl, oasis, batteries, core)? (With owl, I often install from source, but it’s normally built with oasis.)

I assume that opam doesn’t know that the switch I’m in contains a 4.05.0 compiler. Is there a way to tell it that this dependency is satisfied? (Maybe using the --switch or --ignore-constraints-on options to opam install?) I’ve been looking through opam docs online to see if there’s some way to make this switch tell opam that it provides 4.05.0, but I haven’t found it yet.

Thanks-

Letting opam install “ocaml 4.05.0” on the ocamlnat switch didn’t seem to cause problems. I still wonder what this is, though.

I had a look at the logs on a fresh switch and it seems nothing special happens on that one. This may be part of the (currently suboptimal in my opinion) switch initialization story of opam v2.

Note that if you need the development version of owl, you should be able to opam pin add owl --dev (you can also pin to a specific commit, see opam pin --help).

i’m much in the same situation: need for speed for interactive exploratory work in utop, driving a library that i have written in ocaml. as you mention it’s ok when the only bottleneck is matrix operations but this is not always the case. i too have been hoping for a newly revived ocamlnat…

this is great, thanks for taking a look into ocamlnat and especially the toploop api design issues. hopefully this can set gears into motion so that we can have a plug-and-play native utop!

I didn’t know that. It works! Magic.