[ANN] ocaml-lsp-server 1.8.0

On behalf of the ocaml-lsp team, I’m pleased to announce version 1.8.0. This release contains some quality-of-life bug fixes, better diagnostic locations, and a few new code actions. Happy hacking.

The full change log is rendered below for your convenience:

Features

  • Add a new code action, Add missing rec keyword, which is available when
    adding a rec keyword can fix an Unbound value ... error, e.g.,

    let fact n = if n = 0 then 1 else n * fact (n - 1)
                                       (* ^^^^ Unbound value fact *)
    

    Adding rec to the definition of fact will fix the problem. The new code
    action offers adding rec.

  • Use ocamlformat to properly format type snippets. This feature requires the
    ocamlformat-rpc opam package to be installed. (#386)

  • Add completion support for polymorphic variants, when it is possible to pin
    down the precise type. Examples where completion will work (<|> stands for
    the cursor) (#473):

    Function application:

    let foo (a: [`Alpha | `Beta]) = ()
    
    foo `A<|>
    

    Type explicitly shown:

    let a : [`Alpha | `Beta] = `B<|>
    

    Note: this is actually a bug fix, since we were ignoring the backtick when
    constructing the prefix for completion.

  • Parse merlin errors (best effort) into a more structured form. This allows
    reporting all locations as “related information” (#475)

  • Add support for Merlin Construct command as completion suggestions, i.e.,
    show complex expressions that could complete the typed hole. (#472)

  • Add a code action Construct an expression that is shown when the cursor is
    at the end of the typed hole, i.e., _|, where | is the cursor. The code
    action simply triggers the client (currently only VS Code is supported) to
    show completion suggestions. (#472)

  • Change the formatting-on-save error notification to a warning notification
    (#472)

  • Code action to qualify (“put module name in identifiers”) and unqualify
    (“remove module name from identifiers”) module names in identifiers (#399)

    Starting from:

    open Unix
    
    let times = Unix.times ()
    let f x = x.Unix.tms_stime, x.Unix.tms_utime
    

    Calling “remove module name from identifiers” with the cursor on the open
    statement will produce:

    open Unix
    
    let times = times ()
    let f x = x.tms_stime, x.tms_utime
    

    Calling “put module name in identifiers” will restore:

    open Unix
    
    let times = Unix.times ()
    let f x = x.Unix.tms_stime, x.Unix.tms_utime
    

Fixes

  • Do not show “random” documentation on hover

  • Correctly rename a variable used as a named/optional argument (#478)

  • When reporting an error at the beginning of the file, use the first line not
    the second (#489)


I have a technical question. I notice that the code base uses dune fibers for concurrency. What benefits do fibers have over, say, Lwt/Async?


I wanted to write about that in more detail at some point. I can give a quick preview here I guess. The gist of it is that fibers are much more lightweight and provide simpler error handling semantics. You do lose out on choice (Deferred.choose and friends) though.

Like deferred/lwt, a fiber is also a monadic value that represents asynchronous computation. In deferred/lwt, binding on the computation waits for it to finish, and then stores the result in a ref. Subsequent binds will reuse the saved value. A fiber stores nothing and will always re-run the computation from scratch to produce the value. For example, this will print foo twice.

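open Fiber.O (* brings let* into scope *)

let foo = Fiber.of_thunk (fun () -> Fiber.return (print_endline "foo"))

(* foo is bound twice, so its thunk runs twice and "foo" is printed twice *)
let bar =
  let* () = foo in
  let* () = foo in
  Fiber.return ()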

To save the computed value, one must opt in and create an Ivar manually. What’s the advantage of this approach? Well, bind becomes a lot simpler (and cheaper). I’ll even paste the implementation to prove how simple it is:

let ( >>= ) t f k = t (fun x -> f x k)

Lwt’s and Async’s bind are far more complex (and slow).
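To make the re-run semantics concrete, here is a minimal toy model, not dune's actual Fiber module. It assumes a fiber is just a function taking a continuation, which is the shape the bind above implies. The names fiber, run, memo and twice are made up for this sketch, and memo uses a plain ref cell as a stand-in for the opt-in caching that an Ivar would provide in real code.

```ocaml
(* Toy model (assumption): a fiber is a function that takes a continuation. *)
type 'a fiber = ('a -> unit) -> unit

let return (x : 'a) : 'a fiber = fun k -> k x

(* Same shape as the bind quoted above. *)
let ( >>= ) (t : 'a fiber) (f : 'a -> 'b fiber) : 'b fiber =
  fun k -> t (fun x -> f x k)

(* Delay building the fiber until it is actually run. *)
let of_thunk (f : unit -> 'a fiber) : 'a fiber = fun k -> f () k

(* Run a fiber that completes synchronously (enough for this toy). *)
let run (t : 'a fiber) : 'a =
  let result = ref None in
  t (fun x -> result := Some x);
  match !result with
  | Some x -> x
  | None -> failwith "fiber did not complete"

(* Hypothetical opt-in cache: the first run stores the value, later runs reuse it. *)
let memo (t : 'a fiber) : 'a fiber =
  let cache = ref None in
  fun k ->
    match !cache with
    | Some x -> k x
    | None -> t (fun x -> cache := Some x; k x)

(* Counts how many times the body of foo is executed. *)
let runs = ref 0

let foo : unit fiber =
  of_thunk (fun () -> incr runs; print_endline "foo"; return ())

(* Bind on the same fiber twice. *)
let twice (t : unit fiber) : unit fiber =
  t >>= fun () -> t >>= fun () -> return ()

(* Without caching, the body of foo runs once per bind. *)
let () = run (twice foo)
let runs_plain = !runs

(* With the opt-in cache, the body runs only on the first bind. *)
let () = run (twice (memo foo))
let runs_memo = !runs - runs_plain
```

In this model bind allocates a closure and nothing else, which is where the "simpler and cheaper" claim comes from; the cost is that sharing a result has to be requested explicitly.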

Moving on to error handling. Lwt has the disaster that is Lwt.async. Enough said. There’s Lwt.dont_wait now, but it’s still fishy to me that yet another error handling channel is needed in addition to Lwt.catch. With Async, things are cleaner with monitors, but IMO, a completely separate concept for error handling is overkill.

To explain how fiber does error handling, I’ll first explain the other composition primitive for fibers (aside from sequencing with bind). A fiber may spawn sub fibers that will be executed concurrently. The simplest primitive for doing that is the fork:

(* the main concurrency primitive *)
val fork_and_join : (unit -> 'a Fiber.t) -> (unit -> 'b Fiber.t) -> ('a * 'b) Fiber.t

Before a fiber can finish and produce a value, it must wait for all of its sub fibers to terminate. If a fiber raises an exception instead of producing a value, the exception will be passed to the fiber’s error handler. If there’s no error handler set, the exception will be passed to the parent fiber’s error handler (if the error handler itself raises, the error will always be passed to the parent’s error handler).
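The "wait for all sub fibers" rule can be sketched in the same toy continuation-passing model (again an assumption for illustration, not the real Fiber implementation). Error handlers are left out entirely; the sketch only shows that the parent's continuation fires once both children have produced a value. The suspended fiber and the pending ref are hypothetical helpers standing in for a fiber that is waiting on something external.

```ocaml
(* Toy model (assumption): a fiber is a function that takes a continuation. *)
type 'a fiber = ('a -> unit) -> unit

let return (x : 'a) : 'a fiber = fun k -> k x

(* Start both sub fibers; call the parent's continuation only when
   both results are available. *)
let fork_and_join (fa : unit -> 'a fiber) (fb : unit -> 'b fiber)
    : ('a * 'b) fiber =
  fun k ->
    let a = ref None and b = ref None in
    let finish () =
      match !a, !b with
      | Some x, Some y -> k (x, y)
      | _ -> ()
    in
    fa () (fun x -> a := Some x; finish ());
    fb () (fun y -> b := Some y; finish ())

(* Hypothetical suspended fiber: it stashes its continuation so that
   it can be resumed later from outside. *)
let pending : (int -> unit) option ref = ref None
let suspended : int fiber = fun k -> pending := Some k

let result : (int * string) option ref = ref None

(* The right-hand child finishes immediately, the left one is still pending. *)
let () =
  fork_and_join (fun () -> suspended) (fun () -> return "done")
    (fun pair -> result := Some pair)

(* The parent has not finished yet. *)
let before = !result

(* Resume the pending child; now the parent can finish. *)
let () =
  match !pending with
  | Some k -> k 41
  | None -> ()

let after = !result
```

Because the parent cannot complete while a child is outstanding, no spawned computation is left running after the parent has produced its value.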

The rules above are exhaustive and leave very little room for ambiguity. In practice, I find it very convenient to write robust error handling code this way. For example with lsp, it’s easy to install a handler in one place for a single request and know that errors will not leak outside of it. It is also easy to make sure that no spawned computation is left “dangling” after a request is served.

There are also some practical benefits that made me choose Fiber:

  • Depending on async/lwt would impose version constraints for end users on these key libraries. In contrast, fiber is a single module that is trivial to vendor and is invisible to users.
  • Fiber contains no C and is hence very portable. This can be handy if one wants to get ocamllsp working with an experimental compiler, for example.

There are downsides as well, of course. Fiber is quite minimal and therefore comes without a scheduler. I managed to throw something together by mostly copying dune’s scheduler, but it’s not something I’d recommend to anybody. If fiber ever becomes public, we’d need something like libuv to provide us with a serious and reusable scheduler.

That’s all for now. It would be much more interesting to write about how dune itself uses fibers, but that is better written by other people from the team.


I had a chance to look at the fibers code a bit. My impression is that it is extremely clean. But how do you achieve IO concurrency? Do you use Unix poll? select?

Yes, it’s extremely useful that it does not use any C. This probably explains why ocaml-lsp-server works quite well on my multicore switch (though multicore is supposed to work with the existing OCaml C API too).

You mentioned that there is currently a hacked-up scheduler for fibers. Would fibers be a good candidate to use on top of domainslib on multicore?


I had a chance to look at the fibers code a bit. My impression is that it is extremely clean. But how do you achieve IO concurrency? Do you use Unix poll? select?

Details like this are supposed to be handled by the scheduler. A good scheduler could provide select-based primitives on file descriptors, for example. In lsp, I actually don’t use poll or select and, for portability reasons, achieve all the concurrency that I need with threads. Sure, this doesn’t “scale”, but it’s enough for what lsp is doing. To keep things even simpler, lsp does all file IO in the same main thread. I have not observed any problems with this approach yet.

You mentioned that there is currently a hacked up scheduler for fibers. Would fibers be a good candidate to use upon domainslib on multicore?

It could be. I would strongly favor a monad-free API. In relation to lsp, there’s not much to gain from parallelism because the compiler API it relies on isn’t re-entrant.
