I’ve started learning OCaml recently and one issue that I struggle with is finding a workflow that gives me quick feedback about changes I’ve made (I’m looking to reduce compile/run cycles). To give you some context - most of the time I’m programming in Lisps and Ruby, where it’s relatively easy to modify a running application. With all Lisps my workflow is:
I make some change (e.g. I update some function)
I compile the changed bit of code (e.g. a function)
I immediately run this with the help of the REPL
Based on the results I got I go back to the beginning or move on to the next thing that needs doing
That’s known as interactive programming in some circles and I totally love it. With Ruby I can’t do the same, but at least with web apps there’s some degree of code reloading that allows you to modify a running app.
With OCaml, however, I’m not sure how to get the quick feedback, as it seems I need to constantly compile and run stuff. I’m trying to work like with Lisps - I keep a toplevel open alongside my source code and occasionally send some code there, but the toplevel is limited in many ways (you’re just dumping text there at the global level). dune utop helps to some extent, but it can’t reload changes. I’ve heard that some people were saying they didn’t even use a toplevel, so I’m wondering if I’m missing something. I’d be curious to hear any tips and tricks from your OCaml workflows.
Have a look at expectation testing with ppx_expect. That’s what I use to get this kind of quick feedback (and you can keep the results as a test).
Instead of sending code to a REPL, you write your test in an expect block and run dune runtest. Anything you print out in the expect gets captured, and dune shows you the output as a diff.
That said, I’ve never had good luck with REPLs, so I might be missing something about that workflow.
I don’t use a toplevel. I mean, sometimes, rarely, when I am in the mood. But as someone who used Clojure professionally I never really use the OCaml toplevel (except for developing around toplevel tooling). In fact, I mostly don’t even run my code, because I can just ask the compiler whether it will be correct or not, no need to run it.
Slight tongue-in-cheek aside, what I do a lot is use the LSP support in my editor (Neovim) where it tells me what code type checks, so I avoid a lot of trips to the shell telling me that what I wrote is utter nonsense. Then there is a bit of unit testing/expect testing/cram tests (depends on the project) that tell me whether what I am doing is actually working.
It is a bit of trial-and-error but I really disliked the fact I always had stale state in the Clojure REPL that would make everything act weirdly and restarting it took forever, so just running dune and starting from a clean slate has made the process much faster and having tests makes my tinkering more reproducible. It’s a different philosophy but honestly, I much prefer this over REPL driven development.
I’ve always been the type of person who prototypes first and writes the tests later (unless I’m fixing some bug where I usually start with a regression test), but I guess I have the REPL-driven workflow to blame for this as well. Perhaps this is my chance to change some habits. Admittedly being test-driven is pretty effective once an API is clearly formed in one’s mind, but you still have to wonder what to do when dealing with exploratory programming.
Any test can be exploratory if it’s an expect test, imho. The core requirement is to have printers for your types (e.g. ppx_deriving.show), so you can just create values and use
Format.printf "my thing is now %a@." pp_thing my_thing;;
and see the output in dune. There’s a few annoyances (e.g. when it crashes you need a bit of elbow grease to get the backtrace) but for the most part it’s a nice workflow. I don’t apply it to everything, though.
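As a rough sketch of the printer idea above, using only the standard library: the `thing` type, its fields, and `render_thing` are hypothetical names made up for illustration, and the printer is written by hand where `[@@deriving show]` would normally generate something of the same shape.

```ocaml
(* Hypothetical type, used only for illustration. *)
type thing = { name : string; tags : string list }

(* Hand-written printer with the usual [Format.formatter -> 'a -> unit]
   shape, so it can be passed to %a. *)
let pp_thing fmt { name; tags } =
  Format.fprintf fmt "{ name = %S; tags = [%a] }" name
    (Format.pp_print_list
       ~pp_sep:(fun fmt () -> Format.pp_print_string fmt "; ")
       (fun fmt s -> Format.fprintf fmt "%S" s))
    tags

let my_thing = { name = "widget"; tags = [ "a"; "b" ] }

(* Render to a string, handy when comparing against an expected value. *)
let render_thing t = Format.asprintf "%a" pp_thing t

(* In an expect test this print would sit inside the test body, and the
   captured output would be what dune shows as a diff. *)
let () = Format.printf "my thing is now %a@." pp_thing my_thing
```

Once a printer like this exists for a type, any intermediate value can be dumped the same way, which is most of what the "exploratory" part needs.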
I’ve been writing Elixir for work and I really do enjoy developing with iex (the Elixir REPL) first to test out new behaviors.
make changes
run recompile
call function in iex
…
eventually write tests
But I can attest that it isn’t too different from what others are saying if you have dune runtest running all the time. It’s the same speed of feedback for me. Every new function or module I start, I usually set up a similar expect block. I hadn’t used ppx_expect but think I should now.
Having worked in Clojure and ClojureScript for about a decade prior to moving on (first to Haskell, briefly Rust, and then to OCaml), I absolutely understand and sympathize with your yearning for a quality REPL experience. There’s good and bad news:
the bad news is that non-lisps simply don’t have “good” REPLs as we would expect, for all sorts of reasons. However,
the good news is that OCaml provides IMO an almost uniformly superior development workflow to what I was accustomed to (in Clojure, but also in other lisps, even those with very integrated development environments like Racket)
(Parenthetically, utop is very good given the constraints that it must operate within given OCaml’s compilation and execution model, much better than the equivalents [when I last used them] in Rust and Haskell. You should use utop for quick spot experiments and such, but turn away once you’re working on code that should persist past a given toplevel session.)
Everything that makes it possible to work productively in OCaml comes from:
the compiler toolchain being fast (no, seriously, really fast)
dune & co. being fast (no, seriously, really fast)
editor tooling (specifically, type hints, completions, go-to-definition, etc) being fast (enough), accurate, and “total” (i.e. it’s extraordinarily rare for type information not to be exactly where you’d expect, even in the face of e.g. nontrivial ppx transforms)
None of this is the case in e.g. Clojure (or any other lisp IME), so use those attributes to your advantage:
Always have a terminal buffer on the side running e.g. dune runtest -w --no-buffer. Especially as you’re just starting out, unless you’re doing something pathological, you should get test feedback in a matter of milliseconds, every single time you hit save. This kind of rapid, total feedback should feel superior to typical Clojure workflows, whether you’re loading buffers into the REPL piecemeal (lots of mental accounting around which namespaces to load in which order), or having some tooling do full reloads on save (which tend to be quite slow even with smaller codebases).
As your codebases get larger, use dune aliases to restrict the set of things that are run when you hit save; e.g. if I’ve defined a testarrangement executable in a test or tests stanza (which could define dozens or hundreds of separate test executables in total), then (rule (alias arrangements) (action (run ./testarrangement.exe))) will give me a @arrangements alias I can use to trigger just that one via dune build @arrangements --no-buffer -w
In hindsight, so much of what we hype up as “exploratory programming” in the REPL is really just coping with the lack of useful type information. You don’t have to run to a REPL anymore to see if a given function takes a list or a seq, or see what the shape of a given bit of data is like. Since the vscode extension or merlin in emacs reveal the types in your program quite nicely, use them; you don’t have to rerun tests solely to verify that the type-checker has done its job properly. (Fully getting over this compulsion took me months!)
Note that you absolutely don’t have to use any kind of particular test-driven approach; these executables are just modules that have top-level effectful expressions, so you can do anything you want in them, and produce any kind of feedback that’s helpful to you (terminal output, automatically dumping visualizations to disk that vscode or emacs can readily detect and show/update, trigger an audio ding to indicate a success condition, etc).
Further on this point, you can always keep a “scratch” program off to the side, set off with its own alias, which you can abuse as you see fit.
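To make the "just modules with top-level effectful expressions" point concrete, here is a minimal sketch of what such a scratch executable might look like. The file name, the `check` helper, and the `sum` function are all hypothetical stand-ins; nothing here depends on a test framework.

```ocaml
(* scratch.ml: a hypothetical scratch executable, rebuilt and rerun by
   the watcher on every save. *)

let failures = ref 0

(* Print a PASS/FAIL line and count failures. *)
let check name cond =
  if cond then Printf.printf "PASS %s\n" name
  else begin
    incr failures;
    Printf.printf "FAIL %s\n" name
  end

(* Stand-in for whatever project code is being explored. *)
let rec sum = function [] -> 0 | x :: rest -> x + sum rest

(* Top-level effects: these run every time the executable runs. *)
let () =
  check "sum of empty list" (sum [] = 0);
  check "sum of 1..3" (sum [ 1; 2; 3 ] = 6);
  Printf.printf "%d failure(s)\n%!" !failures;
  (* A non-zero exit code lets the build report the run as failed. *)
  if !failures > 0 then exit 1
```

Since the body is ordinary code, the prints could just as well be replaced by writing a file to disk or any other feedback that is useful at the moment.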
At the end of the day, compared with Clojure/ClojureScript/Racket/CL/etc, I end up with the same kind of feeling of having a conversation with my program and the compiler, but with both much more granularity (because I always have type visibility at every level of the program), much broader scope (because I feel safe to make much larger changes in between e.g. test runs), yet with lower latency at every step.
Caveats to all of this include:
Yes, you’re throwing away aggregate program state every time the watcher restarts things. 99.8% of the time (my best guesstimate), that is completely okay (n.b. OCaml is fast, much faster for the same tasks than Clojure and even Java). In the extremely rare cases where your program / tests depend on some data that requires significant database interactions or a lot of compute to arrive at, it’s okay to have that work cache its results locally via a Marshal’d file so that subsequent runs are snappy again. I’ve only felt like I’ve had to do this once in ~three years.
very occasionally, I’ve witnessed the vscode extension somehow lose track of its type information (reporting everything to be weak types); restarting the language server or sometimes just making a single whitespace edit knocks things back into order (it thankfully happens so sporadically I’ve not bothered filing an issue)
You can’t open a REPL or toplevel into the running state of a remote system, something I used to use occasionally for debugging purposes. This is a genuine downside of non-lisp systems. I’ve compensated by being much, much more diligent about logging and error reporting otherwise, and by taking steps to be able to ship runtime state out of deployed systems when necessary for examination in a development environment.
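For the caching caveat in the list above, a minimal sketch of a Marshal-backed cache could look like the following. The `cached` helper, the `load_table` example, and the cache file name are assumptions for illustration, not anything the original poster described in detail.

```ocaml
(* Return the value stored at [path] if the file exists; otherwise run
   [compute], store its result at [path], and return it.
   Caveats: Marshal is not type-safe, so the type expected by the caller
   must match what was written earlier; delete the file whenever the type
   or the computation changes. Values containing closures cannot be
   marshalled with the empty flag list used here. *)
let cached (path : string) (compute : unit -> 'a) : 'a =
  if Sys.file_exists path then begin
    let ic = open_in_bin path in
    Fun.protect
      ~finally:(fun () -> close_in ic)
      (fun () -> (Marshal.from_channel ic : 'a))
  end
  else begin
    let v = compute () in
    let oc = open_out_bin path in
    Fun.protect
      ~finally:(fun () -> close_out oc)
      (fun () -> Marshal.to_channel oc v []);
    v
  end

(* Hypothetical usage: the list literal stands in for the slow database
   or compute-heavy work; the annotation pins down the type read back. *)
let load_table () : (string * int) list =
  cached "expensive_table.cache" (fun () -> [ ("answer", 42) ])
```

The first call pays the full cost and later runs only read the file, which keeps the restart-from-scratch workflow fast without holding state in a live process.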
The OCaml toplevel experience can be vastly improved, and I think there’s few features from the Lisp world that we cannot adequately imitate. It’s just something the vast majority of devs have no interest in doing because there’s more efficient ways to interact with the compiler.
I’ve witnessed the vscode extension somehow lose track of its type information (reporting everything to be weak types); restarting the language server or sometimes just making a single whitespace edit knocks things back into order (it thankfully happens so sporadically I’ve not bothered filing an issue)
It’s an issue I’m aware of. It’s a matter of dune notifying the LSP server when a module’s dependencies or dune file have changed. Currently, this is something that the LSP server rechecks every time you edit.
The way I see it, ppx_expect is almost strictly superior to using the toplevel. So the way forward in making development more interactive is making this feature more accessible. On the ppx_expect side, this could mean dropping dependencies. On the dune side, this would mean reducing the ceremony required to add modules that are pp’d with ppx_expect.
One improvement I’d love to see is the ability to run a subset of tests. Maybe using a regex to select tests to run? I don’t think it would be hard to add this feature to the test runner, but the way dune integrates inline tests makes it annoying to pass arguments to the runner.
Another downside of the current dune support is that you only see the test diff from a single module at a time. If you’re working on some code that changes a lot of tests, you either have to promote changes until you get to see the test you care about, or comment out the other tests.
Does dune have a way to run tests of only changed code? Because of OCaml’s modularity, it should be possible to limit the impact of code changes on unit test runs in a similar way to limiting unnecessary rebuilding of modules, especially using -opaque…
Dune already does that. The granularity for inline tests is at the level of the library stanza. So that means if a single test needs re-running inside a library with inline tests, all of them will need to re-run.
W.r.t. the development cycle in OCaml, I find myself working in 3 distinct ways depending on which point of a project I am at:
small experiments - sometimes I need a particular function that isn’t provided by the stdlib, or some other bespoke well-defined function. In these cases, I write the function directly in the source code - using merlin’s typing information etc. to develop as usual. Once I’ve done that, just to check the behaviour is as I would expect, I copy over the definition to utop - if the function depends on any larger state from my project, I write a dummy module to mock these parts.
exploring a stateful system - sometimes, I’m interacting with some kind of stateful existing system that can be hard to run in utop (for example, it might be a pain to get the library working in bytecode) - e.g. gtk. In these cases, I set up a small executable module that performs the minimum required to set up the initial state, and do my iterative experiments by recompiling after each change. Because the module is small, and the OCaml compiler/dune is fast, this leads to a quick iterative editing experience.
working on a well-known codebase - in all other cases, I’m working on a codebase that I either know fairly well, or has sufficient documentation and typing-discipline to make it easy to understand what’s going on. In these cases, I can rely on the types and documentation to make changes/add new functions, and if needed, use tests to document expected behaviours.
As I’m speaking to an Emacs celebrity @bbatsov (thank you for your great packages!), I’d be remiss if I didn’t mention my own little experiment in this direction - interactive-utop-mode:
It’s not quite the same level of interactivity as lisp, granted, but with judicious use of forking, you can emulate a slightly more interactive repl experience.
Disclaimer: I don’t actually use this that much myself, it’s just more of a proof of concept to show how it works.
Big +1 to expect tests for interactive programming. I wrote a bit about this in an old blog post:
As to how to make things yet better:
At present, I think ppx_expect itself depends only on a couple other ppx’s, and Base and Stdio, so I think it’s already reasonably light, and the libraries in question are all portable.
From my perspective, a more important direction is to improve the editor integration. Our internal expect test workflow is delightful, and quite a bit better than what’s available publicly. When an error pops up from the build, you hit a key to see the diff, hit another key to accept the diff if you want to. It would be great if we could get something similar to that in vscode.
This thread is the type of stuff I hope to provide users from experienced developers in the How I Start series. I wanted to check here if anyone was interested in providing an OCaml post? One I’ve been looking to get, along with Rust, for a long time
I wrote up the beginnings of a How I Start (opinionated version) in OCaml last year here. It could do with an expansion into how to build a simple thing in OCaml like the Haskell or Erlang versions. When I get time next month I am going to come back to it.
Off topic but the Haskell version could use some modernisation around the editor and build tooling setup. Is there a place to PR an updated version of that too?
On my personal OCaml dev workflow, I use opam/dune with Emacs and either merlin or ocaml-lsp-server.
Everything is driven via the editor or a terminal, using LSP to tell me about errors and types. I generally never use the toplevel. Only when I have some demo code to write do I reach for mdx, which shows example code and how it works and is executable via dune.
Compare that to my Haskell workflow, which used to be REPL-first with an Emacs session plus a cabal repl:
1. Start up cabal repl
2. Make some change to a source file
3. Issue :load or :reload at the repl
4. Then cabal test or have a file watching version running in the background
This was before LSP really worked properly in Haskell; now the Haskell LSP server is good but often slow or requires extra configuration in the form of hie.yaml files. I still need to do step 4 for running tests in a terminal, but my workflow looks identical to my OCaml one.