Ocaml stdlib and death by a thousand papercuts

cvine · December 31, 2024, 2:57pm

If we are concerned with the “paper cuts” encountered by beginners which may make them give up too soon, I wonder if making them deal with Format style pretty printing at the outset is the way to go. To use boxes, break hints and so forth they will probably have to spend up to half a day reading up the documentation on the Format module (which is quite a hard initial read). Only then will they have a handle on such minor things as whether to use pp_print_string or pp_print_text.

I do not know the answer, but as you can probably tell from my earlier posting I wonder whether explaining user-defined printers for Printf.fprintf debugging isn’t an easier start. That would require the standard library to provide show functions for its types as well as pp functions, though.
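As a rough illustration of the Printf route described above, here is a minimal sketch. The int_pair type and the names show_int_pair and output_int_pair are hypothetical, chosen only for this example; the standard library does not provide them.

```ocaml
(* Hypothetical example type, used only for illustration. *)
type int_pair = { first : int; second : int }

(* A "show" function: turns the value into a string, for use with %s. *)
let show_int_pair p = Printf.sprintf "(%d, %d)" p.first p.second

(* A user-defined printer for %a with Printf.printf / Printf.fprintf:
   it receives the output channel first, then the value. *)
let output_int_pair oc p = output_string oc (show_int_pair p)

let () =
  let p = { first = 1; second = 2 } in
  (* %s with the show function *)
  Printf.printf "p = %s\n" (show_int_pair p);
  (* %a with the channel printer *)
  Printf.printf "p = %a\n" output_int_pair p;
  (* the same printer works for debugging output on stderr *)
  Printf.eprintf "debug: p = %a\n" output_int_pair p
```

No boxes or break hints are involved here; the cost is that the printer is tied to out_channel rather than to a Format.formatter.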

yawaramin · December 31, 2024, 3:08pm

Well, there’s Fmt.Dump which is much more plug-and-play.

EmileTrotignon · December 31, 2024, 3:10pm

Why would they do that? It’s perfectly fine to completely ignore that while using the Format module.
Regular Printf is annoying because sprintf does not accept the same types as printf.

cvine · December 31, 2024, 3:21pm

I do not agree that it is possible for beginners to use Format.fprintf and its format strings successfully without learning first about boxes, break hints, flushing/newlines and the like.

Whilst irrelevant to this point, Format.sprintf versus Format.asprintf has its own idiosyncrasy.
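The difference between these functions shows up in the type of printer that %a expects. The sketch below uses hypothetical helper names (pp_pair, show_pair); only the Format and Printf functions themselves come from the standard library.

```ocaml
(* A Format-style printer: Format.formatter -> int * int -> unit. *)
let pp_pair ppf (a, b) = Format.fprintf ppf "(%d, %d)" a b

(* A string-returning printer: unit -> int * int -> string.
   This is the shape that %a expects under both Printf.sprintf
   and Format.sprintf. *)
let show_pair () (a, b) = Printf.sprintf "(%d, %d)" a b

(* Format.asprintf accepts the same formatter-based printer that
   Format.printf and Format.fprintf accept. *)
let s1 = Format.asprintf "pair = %a" pp_pair (1, 2)

(* Format.sprintf and Printf.sprintf need the string-returning shape;
   passing pp_pair here would be a type error. *)
let s2 = Format.sprintf "pair = %a" show_pair (1, 2)
let s3 = Printf.sprintf "pair = %a" show_pair (1, 2)

let () = List.iter print_endline [ s1; s2; s3 ]
```

So a pp function written for Format.printf can be reused with Format.asprintf, but not with either sprintf.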

EmileTrotignon · December 31, 2024, 3:34pm

What is the issue with just doing Format.printf "foo = %a\n" Foo.pp myfoo? No knowledge about boxes is needed here.

This assumes that Foo.pp exists, and indeed I think it should in most cases.

cvine · December 31, 2024, 3:42pm

Well if that really is a valid use of Format.printf and its flushing (I have never tried it) then the documentation ought to make a better job of saying that, say under “Rule of thumb for casual users of this library”, the text of which at present makes no mention of that (which maybe brings us back to where we started).

In any event, a beginner using the Format module is going to have their eyes drawn to the documentation on that module, with all that that entails.

Edit: According to the documentation: “The behavior of pretty-printing commands is unspecified if there is no open pretty-printing box.”

jbeckford · December 31, 2024, 6:32pm

That seems a bit dangerous to learn as a beginner because there is no flushing directive. It works for Format.asprintf because strings automatically flush, but in general it should be something like Format.*printf "foo = %a@." Foo.pp myfoo. Otherwise, we end up with questions like Why does this code output nothing?.
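A minimal sketch of the flushing form, assuming a hypothetical Foo module with made-up fields (nothing in the thread defines Foo). It also opens a box explicitly, which sidesteps the question of what happens when no box is open rather than answering it.

```ocaml
(* Hypothetical module: the record fields are invented for the example. *)
module Foo = struct
  type t = { name : string; count : int }

  (* A Format-style printer: Format.formatter -> t -> unit. *)
  let pp ppf { name; count } =
    Format.fprintf ppf "{ name = %S; count = %d }" name count
end

let () =
  let myfoo = { Foo.name = "x"; count = 3 } in
  (* "@[" and "@]" open and close a box explicitly;
     "@." prints a newline and flushes the formatter. *)
  Format.printf "@[foo = %a@]@." Foo.pp myfoo;
  (* Format.asprintf returns the finished string, so there is
     no separate flushing step to remember. *)
  let s = Format.asprintf "foo = %a" Foo.pp myfoo in
  print_endline s
```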

cvine · December 31, 2024, 6:44pm

Does Format.*printf implicitly open a box when a leading ‘@[’ is missing from the format string? Otherwise “The behavior of pretty-printing commands is unspecified if there is no open pretty-printing box.”

jbeckford · December 31, 2024, 6:50pm

Dunno. Will defer to a Format expert!

jhw · December 31, 2024, 6:50pm

Characterizing that as a “defeat” is a choice. One could just as easily characterize it as a win.

There are plenty of programming languages available where there has been a relentless focus on making the language approachable to newcomers who are familiar with other, more popular programming languages, at the expense of making the basic tool chain completely unmaintainable except by large enterprise organizations with vast intellectual property rents to preserve.

I’m not sure making OCaml into yet another one of those is a wise idea.

Kris_De_Volder · December 31, 2024, 6:56pm

Actually… no. I probably spent more time learning OCaml / dune / opam than I spent on Go and its ‘toolchain’. I’m a ‘relative newcomer’ to both, I’d say. I think it’s just much easier to get into Go and its toolchain, for a number of reasons. For one, the language is a lot more ‘simplistic’; this is both good and bad (one of the ‘good’ things is that it’s just very easy to pick up compared to OCaml; the bad… if you are the kind of person who likes static types, Go is often a bit ‘disappointing’).

The Go toolchain is also a lot ‘simpler’ than OCaml’s, not to mention that in OCaml you have different choices (like this ‘Esy’ thing). In Go I generally spend a lot less time ‘figuring things out’ and just write code about the things I’m actually interested in.

In OCaml it’s a lot harder to find out the right way to do something (if there is such a thing). Case in point… I am overwhelmed by the responses in this thread about my apparent difficulties with ‘printing values’. And you can see that there are quite a few suggestions and even some ‘competing’ opinions on how to do it (e.g. using ppx or not).

Apparently these are well-meaning and helpful suggestions (and I don’t mean to imply they are not helpful; they are, and in fact I might need to dig deeper into some of them :-).)

But if these are supposed to convince me that “it’s not that hard after all”, it kind of does the opposite… the fact that we even have this kind of discussion on how to ‘print values the right way’ tells us something.

In Go there is no such discussion, it just is not a thing you have to talk or think about at all. The way printing works for any value is a kind of ‘magic’ that permeates everything and it mostly just works “well enough”. In the odd case where you don’t like the default behavior, there’s no difficult decision or learning process, all you need to understand is that you can implement the ‘Stringer’ interface to define how your particular types of things convert into strings in any context.

Kris_De_Volder · December 31, 2024, 7:32pm

Putting aside for a moment the fact that we have to think/decide about ‘competing opinions’ on how to print values again (should I use format or not)… there is one big difference between Go and Ocaml here.

You are telling me it’s ‘easy’ to implement a custom printer for my own ‘IntPair’ datatype. But if I defined something similar in Go I wouldn’t even have to bother doing that. It would print something ‘good enough’ for that sort of thing without me lifting a finger.

So the big difference here is that in Go you don’t think about printing until much later. I only need to think about how to print values the first time I run into a situation where the ‘magic implementation’ that is provided out of the box isn’t good enough.

In Ocaml… I am confronted with the ‘difficult problem’ of how to print values the first time I define any custom data type.

Also the ‘IntPair’ is perhaps not the best example, because it’s only ‘easy’ because the example itself is a particularly simple one. What if my type is ‘generic’ and has some type parameters (like a 'a Tree)… now you have to deal with the fact that you actually don’t know the type of 'a or how to print it… Yes, I guess we can pass in some function to print values of type 'a to our own printer function, or maybe we use a first class module? Or maybe a functor? Or … something else maybe? Maybe different folks will recommend different ‘best ways’ to approach it… I would not be surprised.

In Go it’s all so much simpler; the magic assumption that you ‘can print anything’ kind of avoids that pitfall, and you could implement a ‘Stringer’ for your Tree type relying on the fact that there is a ‘standard’ way to print any kind of value.

And look I’m not saying I would rather use Go… I really wouldn’t.

But I wish Ocaml could have some better ‘out of the box’ support for some kind of ‘magic printer of values’ so that I don’t have to think about printing until I actually have a case where I really want to take control over how exactly my own datatype(s) get printed. And when that time does come… (I’m sure it will at some point) I would hope I don’t have to decide between several different conventions (pp / format / show / ppx) to ‘override’ the default printer. (I.e. there should be a single, easy to understand and standard way to provide your own custom printer).

Kris_De_Volder · December 31, 2024, 8:09pm

Agreed… that is a good thing. But… it is also a bad thing.

In the same way that Go’s apparent simplicity is a good thing… and also a bad thing.

As is usually the case in our world, there’s a flip-side. The fact that it’s possible for these things to co-exist and be explored… is good.

But the flip-side is that now a newcomer is confronted with a bunch of different choices they have to make about which of these competing alternatives they should buy into. And they may be at a point in their journey where they really wouldn’t have a clue about how to make that choice, because they don’t know what the tradeoffs are… or which corners they may be backing themselves into.

I think that is exactly the kind of ‘death of a thousand paper cuts’ this thread is about.

Having to worry about choosing between ppx or Camlp5 when you don’t even know what those things really are… or how they are different. For this specific example, I kind of had to suss out that ‘Camlp5’ seemed like “the old thing” and “ppx” is the “new and supposedly better thing”. So I decided (or maybe it’s better to say… made a possibly ill-informed choice) that I should avoid anything mentioning Camlp5 like the plague.

Generally… since I don’t really understand the difference between those two things, I really don’t care much which one I would use. But lacking any better ‘criterion’ to make the choice, I’d rather pick the “most current / best supported thing”. In that sense I think it might be useful to have a ‘curated’ set of things that are the “most supported thing right now to do X” that is recommended for “most people” (or beginners like me) who have no better way to make the choice between all the alternatives anyway.

I’m not sure how realistic that (curated list) is, however… my impression of the OCaml community is that there’s usually some ‘constructive disagreement’ and it will be rather hard to get that kind of ‘consensus’ in the community and maintain it over time to boot.

This ‘diversity of opinions’ is a good thing… and also a bad thing.

Chet_Murthy · December 31, 2024, 11:21pm

Kris,

Two thoughts:

(1) The “party line” here is that dune + ppxlib is the way to go. And I don’t deviate from that in any of my posts: when I advertise something about Camlp5, I always do so with a caveat that if you’re not a fanatic, you should stick to the official infrastructure. [My own thoughts about the worthiness of that infra, vs. what I use, are simply not relevant.]

(2) I think it would be great if some group of people were to arrive at a curated collection of infrastructure + packages, and document it, so that newcomers would use that by default. I don’t know why it hasn’t happened: I -have- noticed that (e.g.) Rust has far better documentation for newcomers, than OCaml has. It is what it is, don’t know how to fix that.

cvine · December 31, 2024, 11:29pm

How to print generics raises different issues. OCaml print functions are monomorphic because OCaml does not keep runtime type information. In the case of 'a tree you could adopt the approach in the Format.pp_print_list combinator and supply a monomorphic print function for the particular element type. No doubt you could do something similar with a functor, but what you are left with after applying the functor would be monomorphic. In the case of containers of literals, there is some type information available via the Obj module which for example the dump function in the BatPervasives module of Batteries used to extract (I don’t know if it still does). Possibly you could do something similar in a ppx, I don’t know.
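A minimal sketch of the two approaches mentioned above, using lists so that Format.pp_print_list can be shown directly. The names pp_comma, pp_list, PRINTABLE, ListPrinter and IntListPrinter are hypothetical and exist only for this example.

```ocaml
(* Separator printer: a comma followed by a break hint. *)
let pp_comma ppf () = Format.fprintf ppf ",@ "

(* Combinator style: polymorphic in 'a only because the caller
   supplies a monomorphic printer for the element type. *)
let pp_list pp_elt ppf xs =
  Format.fprintf ppf "@[<hov 1>[%a]@]"
    (Format.pp_print_list ~pp_sep:pp_comma pp_elt) xs

(* Functor style: the element type and its printer come in a module. *)
module type PRINTABLE = sig
  type t
  val pp : Format.formatter -> t -> unit
end

module ListPrinter (E : PRINTABLE) = struct
  let pp ppf (xs : E.t list) = pp_list E.pp ppf xs
end

(* After application the resulting printer is monomorphic (int list). *)
module IntListPrinter = ListPrinter (struct
  type t = int
  let pp = Format.pp_print_int
end)

let () =
  Format.printf "%a@." (pp_list Format.pp_print_string) [ "a"; "b" ];
  Format.printf "%a@." IntListPrinter.pp [ 1; 2; 3 ]
```

In both styles the element printer is chosen by the programmer at the call site; nothing is looked up from the type at run time.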

The introduction of type classes would make printing more tractable, by enabling the compiler to pick the printing function by reference to type, but that is some way off I believe. OCaml is a statically and strictly typed language. Advantages have their disadvantages. I think the current situation using combinators is acceptable. (Your question implies that Go offers parametric polymorphism, but is that actually true? I don’t know Go well.)

Lisp has very capable and versatile printing functions, as does Python, but they are dynamically typed.

Kris_De_Volder · January 1, 2025, 12:30am

Yes. See for example. It didn’t use to have generics but they’ve had support for ‘generics’ for a while now. There are some limitations to what you can do with them that are rather hard to understand when you ‘stub your toes’ on them (speaking from experience :-). It is an ‘uncharacteristically complicated’ part of Go and I think a lot of Go programmers and the libraries still shy away from using it and instead resort to using interface{} to type things a lot of the time. (You could probably program in Go today for quite a while before you realized it does have support for generics… that’s how little it is used.)

Chet_Murthy · January 1, 2025, 12:53am

If you look carefully at the example they provide, you can see that Golang provides parametric polymorphism, but does NOT provide type-erasure. This is an important difference: in OCaml, all types can be erased in the core language.

Kris_De_Volder · January 1, 2025, 1:53am

Perhaps that explains those ‘strange restrictions’ in Go generics I’ve stubbed my toes against :-).

Regardless though… I think it makes my point that it’s a bit deceptive to say that “It is not that hard in Ocaml either”, because the reason it’s ‘not that hard’ in your example is that you side-stepped those complexities.

I wonder… how would we use a ‘ppx-deriving-show’ for a generic type? Is that supported? I looked at the docs for it here GitHub - ocaml-ppx/ppx_deriving: Type-driven code generation for OCaml but don’t see examples of how to deal with generics and printing. So maybe its not supported at all? Or is it somehow possible to have it generate a pp function for my 'a Tree which takes a function to format elements as a parameter?

Chet_Murthy · January 1, 2025, 2:12am

It’s supported just as you described. Here’s an example I cooked up just now for show (run in the toplevel):

# module M = struct type 'a tree = Leaf of int | Node of 'a tree * 'a tree [@@deriving show] end ;;
module M :
  sig
    type 'a tree = Leaf of int | Node of 'a tree * 'a tree
    val pp_tree : 'a Fmt.t -> 'a tree Fmt.t
    val show_tree : 'a Fmt.t -> 'a tree -> String.t
  end
#

and

# #show Fmt.t ;;
type 'a t = Format.formatter -> 'a -> unit

So everything looks like it ought. The big gap is the lack of typeclasses (“modular implicits”) that would obviate the need to -call- these pp_tree etc functions.

P.S. I’m not going to look at docs for ppx_deriving.show, but I will note that the unit-tests for that package have generic types, e.g. here’s one:

type 'a pt = { v : 'a } [@@deriving show]
let test_parametric ctxt =
  assert_equal ~printer ("{ "^filemod^"v = 1 }")
                        (show_pt (fun fmt -> Format.fprintf fmt "%d") { v = 1 })

type 'a btree = Node of 'a btree * 'a * 'a btree | Leaf
[@@deriving show]

That’s directly from the unit-test that existed at the time that I wrote the Camlp5-based workalike. So ppxlib-based ppx_deriving.show supported generics just fine at that time.

ETA: haha, I got my cooked-up example wrong (!!) I did Leaf of int instead of Leaf of 'a. Here’s the fixed transcript:

# module M = struct type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree [@@deriving show] end ;;
module M :
  sig
    type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree
    val pp_tree : 'a Fmt.t -> 'a tree Fmt.t
    val show_tree : 'a Fmt.t -> 'a tree -> String.t
  end
#

Chet_Murthy · January 1, 2025, 2:20am

Regardless though… I think it makes my point that it’s a bit deceptive to say that “It is not that hard in Ocaml either”, because the reason it’s ‘not that hard’ in your example is that you side-stepped those complexities.

I’m not sure what you mean here, could you expand? The big thing that OCaml lacks for the uses you’ve described is typeclasses/modular-implicits. I also bitterly wish for that. After that, sure, OCaml hasn’t had the sort of work you can see in Golang (remember: they have a ginormous company pushing it, so you should -expect- that) or Rust (there, no excuses, they’ve simply done a better job with their documentation and making a ‘default path’ easy to use).
