OCaml stdlib and death by a thousand papercuts

Slightly off topic: I’ve noticed that the Complex module does not have an equal function like all the other number types. Is there a reason for this?

To expand on Chet’s answer (at the risk of flogging a dead horse), take an even simpler data structure, namely a monomorphic list of int:

#use "topfind" ;;
#require "ppx_deriving.show" ;;

module M1 = struct
  type t = Nil
         | Cons of int * t [@@deriving show]
end ;;
let _ =
  let open M1 in
  let lst = Cons (1 , Nil) in
  print_endline (show lst) ;;

There, the show function has type M1.t -> string, and the pp function has the corresponding type Format.formatter -> M1.t -> unit. But if the list is made polymorphic, the show and pp functions become combinators: higher-order functions that take an additional argument, a monomorphic printer for the element type.

module M2 = struct
  type 'a t = Nil
            | Cons of 'a * 'a t [@@deriving show]
end ;;
let _ =
  let open M2 in
  let lst = Cons (1 , Nil) in
  print_endline (show Format.pp_print_int lst)

This is basically back to the approach in Format.pp_print_list.
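
For comparison, here is a minimal sketch of that same shape using only the stdlib. The names pp_int_list and show_int_list are made up for illustration; the point is that the element printer and the separator are passed in by hand, exactly as with the derived combinator above.

```ocaml
(* Stdlib only: the element printer is handed to the list printer explicitly. *)
let pp_int_list ppf xs =
  let pp_sep ppf () = Format.pp_print_string ppf "; " in
  Format.fprintf ppf "[%a]"
    (Format.pp_print_list ~pp_sep Format.pp_print_int) xs

(* A show function is just the pp function rendered into a string. *)
let show_int_list xs = Format.asprintf "%a" pp_int_list xs

let () = print_endline (show_int_list [1; 2; 3])
```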


If the proof you’re asking for is adoption numbers, or feedback from the target group, then putting hurdles in the way of adoption is obviously going to interfere with that.

The point here is to remove hurdles from basic functionality, which benefits newcomers much more than others. Users that have already invested in understanding the Format module, package system, build system and how PPXes work have already overcome these significant hurdles, so for them it would be much less beneficial. For newcomers, having it available out-of-the-box largely is the feature.

OCaml’s rigorous quality control is one of the best parts of the language, I think, but it certainly has its disadvantages as well. An out-of-the-box experience doesn’t necessarily require including it in the language, though; it can be provided by a “blessed distribution” as well, as has been brought up earlier in this thread.

It’s self-defeating for the goal of making the language more approachable to newcomers. Whether that should be a prioritized goal is a different matter. I’m just saying that if it is, then this process gets in the way of that. I also think there’s plenty of room to make the language more approachable without it becoming “completely unmaintainable”.


I really sympathize with the lack of a standard utility for printing OCaml values. One of the first things I did once I started to use OCaml intensively was to write an Obj-based formatter for OCaml values. It is just 20 lines of self-contained code without any external dependency. Whenever I need to debug something, I just copy-paste these few lines of code into the current file and I am done. Sure, its output is not as pretty as a custom type-aware formatter. Sure, it will crash on cyclic values. Sure, it does not support some features from OCaml, e.g., objects. But the amount of time it has saved me over the last 20 years is inestimable.
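
To give an idea of what such a helper can look like, here is a rough sketch. It is not the poster's code, and the names dump and show_any are hypothetical. It only sees the runtime representation, so constant constructors, booleans and chars print as plain ints, blocks print with their numeric tag, and cyclic values will make it loop.

```ocaml
(* Hypothetical Obj-based debug printer: walks the runtime representation. *)
let rec dump (o : Obj.t) : string =
  if Obj.is_int o then string_of_int (Obj.obj o : int)
  else
    let tag = Obj.tag o in
    if tag = Obj.string_tag then Printf.sprintf "%S" (Obj.obj o : string)
    else if tag = Obj.double_tag then string_of_float (Obj.obj o : float)
    else if tag = Obj.double_array_tag then
      let a : float array = Obj.obj o in
      let items = Array.to_list (Array.map string_of_float a) in
      "[|" ^ String.concat "; " items ^ "|]"
    else if tag <= Obj.last_non_constant_constructor_tag then
      (* Tuples, records, arrays and non-constant constructors land here. *)
      let fields = List.init (Obj.size o) (fun i -> dump (Obj.field o i)) in
      Printf.sprintf "<%d>(%s)" tag (String.concat ", " fields)
    else
      (* Closures, objects, lazy and custom blocks are not traversed. *)
      Printf.sprintf "<opaque:%d>" tag

let show_any x = dump (Obj.repr x)

let () = print_endline (show_any (1, "a", [2; 3]))
```

Since type information is gone at runtime, a list shows up as nested tag-0 blocks ending in 0, which is ugly but usually enough for debugging.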


I think having a proper in the stdlib would be a good thing.

I feel like I should mention again the value of ensuring at every step during development that you can load your project into the toplevel. Between tracing and custom printers, I’ve been able to debug pretty effectively.

An example (albeit from a Camlp5-based project)

It loads everything needed to test regexp PPX extensions in this project.

Similar things can be done automatically via dune, IIUC. Perhaps a dune user could explain how that’s done.

I meant that an example with a monomorphic print is somewhat simpler than a polymorphic one, so it’s not a great example if you are trying to argue that “It isn’t that hard” in a more general sense. That is all.

And since it wasn’t even clear to me that the same approach using deriving show also works for that kind of setup, it triggered even more of a ‘but you are avoiding the real elephant in the room’ reaction in me.

That last part has now been clarified and disarmed somewhat, because it’s clear that it does also work for polymorphic data types.

Though I will say that it is still more complicated and cumbersome when you have a print ‘combinator’ that requires additional parameters to tell it how to print elements, rather than a simple print function.

Now, assuming that all the StdLib types would provide matching show / pp functions then this is maybe not too bad… but:

  • I don’t think all StdLib types provide matching ‘show/pp’ functions, or if they do, they may be hard to find. The Format.pp_print_list combinator is an example: if there was a ‘List’ module with a List.t, then I’d look for that function at List.pp, not in the Format module.
  • Even assuming the StdLib provided a uniform and easy-to-find pp/show for every type it defines, it is still cumbersome to have to fill in all these combinator parameters, and this can become arbitrarily complex depending on the complexity of your datatype. For example, try filling in the combinator params for an (int * float) list tree.
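
To make the second point concrete, here is roughly what filling in those parameters looks like with only the stdlib. This is a sketch under assumptions: the tree type below is a hypothetical binary tree, pp_tree is written by hand in the shape a derived printer would have, and pp_pair and pp_list are small helpers made up for this example.

```ocaml
(* Hypothetical tree type; pp_tree takes the element printer as an argument. *)
type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree

let rec pp_tree pp_elt ppf = function
  | Leaf x -> Format.fprintf ppf "(Leaf %a)" pp_elt x
  | Node (l, r) ->
      Format.fprintf ppf "(Node (%a, %a))" (pp_tree pp_elt) l (pp_tree pp_elt) r

(* A pair printer written by hand for this example. *)
let pp_pair pp_a pp_b ppf (a, b) = Format.fprintf ppf "(%a, %a)" pp_a a pp_b b

(* A thin wrapper over Format.pp_print_list that adds brackets. *)
let pp_list pp_elt ppf xs =
  let pp_sep ppf () = Format.pp_print_string ppf "; " in
  Format.fprintf ppf "[%a]" (Format.pp_print_list ~pp_sep pp_elt) xs

(* One printer argument per layer of the type (int * float) list tree. *)
let pp_value : Format.formatter -> (int * float) list tree -> unit =
  pp_tree (pp_list (pp_pair Format.pp_print_int Format.pp_print_float))

let show_value v = Format.asprintf "%a" pp_value v

let () = print_endline (show_value (Node (Leaf [(1, 2.5)], Leaf [])))
```

The printer expression mirrors the type expression layer by layer, which is mechanical but has to be redone for every new instantiation.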

If modular implicits are kind of like Haskell type-classes then perhaps they would indeed help. How likely are we to get that in OCaml any time soon?

Multiple responses:

(1) first, I’m not going to defend this gap. It’s real.

(2) I -do- think that the right thing to compare against is Rust, not Golang, b/c Rust like OCaml is a type-erasure language, and Golang has a massive company behind it with reasons for trying to make it accessible to newbies, that communities like Rust and OCaml simply cannot match.

Even so, Rust does a much better job.

(3) now to your example. The generated code -is- a “print function that requires additional parameters to tell it how to print elements”. That is what “print combinator” means, after all, yes? So:

# type 'a tree = Leaf of 'a | Node of 'a * 'a [@@deriving show];;
type 'a tree = Leaf of 'a | Node of 'a * 'a
val pp_tree : 'a Fmt.t -> 'a tree Fmt.t = <fun>
val show_tree : 'a Fmt.t -> 'a tree -> String.t = <fun>

# show_tree Format.pp_print_int (Leaf 47) ;;
- : String.t = "(.Leaf 47)"

But also, you could do:

# [%show: int tree] (Leaf 47) ;;
- : string = "(.Leaf 47)"

An embarrassing gap in deriving.show is that there is no equivalent for pp:

# [%pp: int tree] ;;
Line 1, characters 2-4:
1 | [%pp: int tree] ;;
      ^^
Error: Uninterpreted extension 'pp'.
# 

but that’s really just an oversight. It would be trivial to implement. I think I’ll do so in pa_ppx.

(4) As for the Stdlib types, and pretty-printers, I guess I would point at Fmt (by @dbuenzli). Honestly, I rarely-if-ever use Format, since Fmt is based on the same stuff, and is much, much, MUCH more nicely-packaged. So with Fmt, one would write:

# Fmt.(pf stdout "%a@." (pp_tree int) (Leaf 47)) ;;
(.Leaf 47)
- : unit = ()

or (haha)

# Fmt.(pf stdout "%a@." (list (pp_tree int)) [Leaf 47]) ;;
(.Leaf 47)
- : unit = ()
# Fmt.(pf stdout "%a@." (array (pp_tree int)) [|Leaf 47|]) ;;
(.Leaf 47)
- : unit = ()

(5) I can only agree that it’s somewhat cumbersome to fill in these type parameters, when the compiler knows them very well already (why do I need to tell the compiler – it already knows!) And the answer is modular implicits, which I’m pretty sure would solve this problem. Certainly that’s how it got solved in Rust, and the OCaml modular implicits proposals have all been at least as powerful.
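
For readers who haven't seen the idea, here is a sketch of what gets written by hand today, in plain current OCaml. This is not modular implicits syntax, and SHOW, Show_int, Show_list and show are hypothetical names for illustration. The printer "instance" is an ordinary module that must be built and passed explicitly; as I understand the proposals, the compiler would pick it from the argument type instead.

```ocaml
(* A printer "instance" is a module matching this signature. *)
module type SHOW = sig
  type t
  val show : t -> string
end

module Show_int = struct
  type t = int
  let show = string_of_int
end

(* Instances for compound types are built with functors. *)
module Show_list (S : SHOW) = struct
  type t = S.t list
  let show xs = "[" ^ String.concat "; " (List.map S.show xs) ^ "]"
end

(* Generic entry point: the instance is passed as a first-class module. *)
let show (type a) (module S : SHOW with type t = a) (x : a) = S.show x

(* Today the instance has to be named and assembled by hand at the use site. *)
module Show_int_list = Show_list (Show_int)

let s = show (module Show_int_list) [1; 2; 3]

let () = print_endline s
```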

As to when that’s happening? I have no idea. I fear it’s lower on the priority-list than either you or I would like.

(6) All this as given, I think a big problem is that there’s no standard way to approach OCaml for newbies, that rolls in all these things that you’re being informed of by other commenters. You have to scrabble around to find the stuff, and eventually ask here. Rust (for instance) isn’t like that: the online books and default tooling are quite adequate to put you on the right set of rails to move forward.

I don’t have any answers there: the Haskell community also seems to have done a better job of this sort of documentation, so it isn’t some “type theorists just RTFSC, Luke” issue. I started with the old Luca Cardelli ML in 1986, then SML-NJ, then Caml-light, so I’ve never felt the need for such documentation (there was always the source code after all). And I was never one to comment my code, so … that doesn’t help either.

I’m sorry I don’t have good answers for you. Maybe others have better ones.

I agree this would be nice, but it’s not required to be “as good as Rust”. You don’t have to specify types when printing in Rust, but you have to specify types in other places because the type inference is not that powerful.
