Pretty printer for custom data types best practices?

Why is this complexity needed to print a value? What advantage does it provide over regular object printing in other languages, or over Reason’s Console.log, which prints any object without defining custom printers or ppxes?

Obviously because Console.log is not available if you don’t compile to JS.

Console.log works on native (it’s also in the title: Reason Native).
The other Reason Native packages from the link are pretty good as well. Unfortunately they are not published on opam, since the authors consider publishing to opam a more manual process than it needs to be.

Compilation does not preserve type information in OCaml. This means that you are left with two options: either print the raw memory representation at runtime, as the Console library that you linked does, or define custom printers that interpret the raw data in a semantically meaningful format…
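
To make the second option concrete, here is a minimal sketch of a hand-written printer built on the standard Format module. The `shape` type and the names `pp_shape` and `show_shape` are made up for illustration and do not come from any library discussed in this thread.

```ocaml
(* Hypothetical example type, invented for this sketch. *)
type shape =
  | Circle of float
  | Rect of { w : float; h : float }

(* Types are erased at runtime, so the printer spells out the structure itself. *)
let pp_shape ppf = function
  | Circle r -> Format.fprintf ppf "Circle %g" r
  | Rect { w; h } -> Format.fprintf ppf "Rect {w = %g; h = %g}" w h

(* Convenience wrapper that renders to a string. *)
let show_shape s = Format.asprintf "%a" pp_shape s

let () = print_endline (show_shape (Rect { w = 2.; h = 3. }))
```

Deriving plugins generate roughly this kind of function from the type declaration, which is why they need to be attached to the declaration itself.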

This answer should probably go in the FAQ.

@octachron
There is a middle way: retrieve the type information the compiler puts into .cmt files
and use the compiler’s own facilities to interpret the raw data. The OCaml interpreter
usefully prints values of any kind, and that facility is made universally available by
the Genprint library on opam. It’s not ideal; a built-in Rust-like {} would be.
I believe OCamlPro has an internal compiler version that retains type info inside
compiled units for the same or a similar purpose.

@mudrz
Genprint has to use a ppx but only to hide its mechanics - it doesn’t require
deriving-like additions to source files containing type declarations but rather
consults the .cmt files alongside already compiled modules.

@progman, your library breaks as soon as polymorphism is involved. You cannot really say that it works in OCaml when it is restricted to a strictly monomorphic subset of the language.

@octachron

type 'a t={x:int; y:'a}

let _=
  let f v=
    [%pr v]
  in
  let a={x=1;y=true} in
  f a;
  let b={x=2;y=0.0} in
  f b

indeed I would have expected:

=> {x = 1; y = <poly>} 
=> {x = 2; y = <poly>} 

but instead got:

=> <poly>
=> <poly>

I shall investigate. Thanks for the feedback.

PS: is the objection more fundamental than I am imagining above?


There is no bug: the function f is simply unconstrained in type, so the print statement
has nothing to work with.

let f (v : 'a t) = [%pr v]

Or simply

let rec map f = function
  | [] -> []
  | x :: r ->
    let y = f x in
    [%pr x x]; [%pr y y]; y :: map f r

And this limitation is intrinsic to your library since it can only print values with monomorphic type. I am not saying that your library is never useful, but that is a pretty severe limitation.
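
For comparison, the usual way hand-written or derived printers cope with polymorphism is to take the printer for each type parameter as an argument. The following is only a sketch of that convention, reusing the `'a t` record from above. The names `pp` and `show` are illustrative and are not part of Genprint.

```ocaml
type 'a t = { x : int; y : 'a }

(* The caller supplies the printer for 'a, since no runtime type info exists. *)
let pp pp_y ppf v = Format.fprintf ppf "{x = %d; y = %a}" v.x pp_y v.y

let show pp_y v = Format.asprintf "%a" (pp pp_y) v

let () =
  (* Each call site knows the concrete type and picks the matching printer. *)
  print_endline (show Format.pp_print_bool { x = 1; y = true });
  print_endline (show Format.pp_print_float { x = 2; y = 0.0 })
```

The cost is that every polymorphic function that wants to print its argument must receive a printer too, which is the complexity the original question asked about.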

For the record:

module A = Set.Make(Int);
let a = A.of_list([1,2,3,4,5]);
Console.log(a);

will print

{{0, 1, 0, 1}, 2, {0, 3, {0, 4, {0, 5, 0, 1}, 2}, 3}, 4}

I would appreciate it if we didn’t push wrong or incomplete solutions and argue that they work in the general case. Console.log clearly doesn’t.
This is expected: the documentation clearly says that it works for basic data types, and that’s about it. As @octachron pointed out, it suffers from the same issues as most other solutions based on memory representation.
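
By contrast, a printer that goes through the Set interface shows the elements rather than the balanced tree behind them. This is a minimal sketch in OCaml syntax using only the standard library. The names `pp_set` and `show_set` are made up here.

```ocaml
module A = Set.Make (Int)

(* Print the elements through the Set API rather than the underlying tree. *)
let pp_set ppf s =
  let pp_sep ppf () = Format.fprintf ppf ", " in
  Format.fprintf ppf "{%a}"
    (Format.pp_print_list ~pp_sep Format.pp_print_int)
    (A.elements s)

let show_set s = Format.asprintf "%a" pp_set s

let () = print_endline (show_set (A.of_list [ 1; 2; 3; 4; 5 ]))
```

Because the set type is abstract, only code written against the module’s interface can know that the tree nodes and heights in the raw dump are implementation details.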

@octachron
I see what you mean, and indeed my intent is only to print where the type is captured: monomorphism, as you say. However, I would consider that I had simply made a mistake by placing
a print statement where it wasn’t possible to know the type in any respect, as in your example.
My example (my error, no bug after all) demonstrates the display of partial structure,
which might be sufficient in some circumstances; otherwise, seeing a <poly> in a data field of interest would suggest that I sample the data flow elsewhere.
I suppose, yes, that could be seen as a severe limitation. All I can say is that in practice I don’t find it so, but that is perhaps because I’m aware that OCaml is not dynamically typed, and so on.

There is the further issue that we often have data types that contain information we do not (at the moment) wish to print out during debugging. For instance:

  1. ASTs contain “location” information. The proper maintenance of that information is important, but once we’ve verified that and moved on to actually doing type-checking, analysis, etc., the last thing we want when we print out ASTs is that location information.
  2. The same might be said of the environments that decorate ASTs after type-checking. Or even the types themselves.
  3. Even when printing ASTs, we might want to print their surface syntax as it would be parsed (because it is much more compact and more comprehensible to the reader).

What precisely we want to print, at any particular moment of debugging or error-message, is context-dependent. A flexible and powerful pretty-printing system is one that both allows most pretty-printing to be driven by type-structure (hence, “deriving”) but also allows us to step in and control what doesn’t get printed, and sometimes how it gets printed.
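
As a rough illustration of that kind of control, here is a sketch of a printer for a toy AST that prints surface syntax and shows locations only on request. The `loc` and `expr` types, the `~locs` flag, and the `@line:col` notation are all invented for this example.

```ocaml
(* Hypothetical AST with source locations, invented for this sketch. *)
type loc = { line : int; col : int }

type expr =
  | Int of loc * int
  | Add of loc * expr * expr

(* One printer; the ~locs flag decides whether locations are shown. *)
let rec pp ~locs ppf e =
  let pp_loc ppf l =
    if locs then Format.fprintf ppf "@@%d:%d" l.line l.col
  in
  match e with
  | Int (l, n) -> Format.fprintf ppf "%d%a" n pp_loc l
  | Add (l, a, b) ->
    Format.fprintf ppf "(%a + %a)%a" (pp ~locs) a (pp ~locs) b pp_loc l

(* Locations are hidden unless the caller asks for them. *)
let show ?(locs = false) e = Format.asprintf "%a" (pp ~locs) e
```

A derived printer would typically print every field, so this is the sort of place where one steps in by hand or overrides the printer for a single type.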

@Chet_Murthy
I agree with all you say. This library only does what the interpreter does: dump an annotated tree of one’s data
type to the console (subject to depth/term limits). It’s primitive but often all that’s necessary,
for instance if I only want to confirm the presence of a certain value, where writing
a custom printer or a deriving (with the attendant delay for lots of superfluous recompilation)
is just overkill. It’s lightweight and is a boon to experimentation … as with the interpreter.
I will reinforce this point even more on the project page.

Note that it’s not only me. For example these nice hexdumping combinators were designed by @pqwy.

I think you are right. The tty stuff could benefit from a second pass.

Well, I hope you find the new semantic tags up to your standard for this task. They should compose much, much better than the original iteration and provide good abstraction.

Yes, the treatment of colors in section 6 of Format Unraveled looks like it should work nicely.

@jjb Actually, no. In the paper (and the original version of Format) they use strings as tags. Using strings is very problematic, since it provides neither good encapsulation (the strings are public) nor composition, and is inconvenient for transmitting structured data. If I remember correctly, @dbunzli said that was a deal breaker for him.

Recently, we changed Format to use extensible variants as tags instead, which fixes all these issues.
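
A small sketch of what this looks like with the `Format.stag` extensible variant (OCaml 4.08 or later). The `Color` constructor and the choice of ANSI escape codes are assumptions made for the example, not something Format defines.

```ocaml
(* Hypothetical tag carrying structured data (an ANSI color code). *)
type Format.stag += Color of int

(* Teach a formatter how to mark our tag, deferring to the previous
   handlers for every tag we do not know about. *)
let with_colors ppf =
  let base = Format.pp_get_formatter_stag_functions ppf () in
  Format.pp_set_formatter_stag_functions ppf
    { base with
      Format.mark_open_stag = (function
        | Color n -> Printf.sprintf "\027[%dm" n
        | other -> base.Format.mark_open_stag other);
      Format.mark_close_stag = (function
        | Color _ -> "\027[0m"
        | other -> base.Format.mark_close_stag other) };
  Format.pp_set_mark_tags ppf true

(* Render the word "error" in red (code 31) into a string. *)
let render () =
  let buf = Buffer.create 32 in
  let ppf = Format.formatter_of_buffer buf in
  with_colors ppf;
  Format.pp_open_stag ppf (Color 31);
  Format.pp_print_string ppf "error";
  Format.pp_close_stag ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf
```

Since `Color` can be kept private to a module and carries an integer rather than an encoded string, two libraries can install handlers on the same formatter without clashing on tag names.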

Oh, excellent, even better! I hadn’t kept up with that development, that is a good step indeed.

Thanks for the clarification, octachron. I had never thought about how printing works in other languages, or that meta information is erased in compiled OCaml code.
This was very helpful.

Pushing for wrong solutions and arguing was not my intent; I was trying to understand the limitations. I am relatively new to OCaml, and while meta information being erased during compilation might be common knowledge, I was not aware of it. So I was only seeing some overly complex APIs for printing values.

This thread helped me understand the purpose of all of Jane Street’s to_sexp functions.

Indeed. I believe sexp is a good choice for implementing a uniform serialization of your data structures. I wonder whether the OCaml stdlib could include it by default, along with mechanisms to pretty-print sexps.
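
To show the idea without depending on any library, here is a hand-rolled sketch. The `sexp` type, `to_string`, and `sexp_of_point` below are stand-ins written for this example and are not Jane Street’s actual API.

```ocaml
(* A tiny stand-in sexp type; real sexp libraries define their own. *)
type sexp = Atom of string | List of sexp list

(* Uniform rendering: written once, reused by every converted type. *)
let rec to_string = function
  | Atom s -> s
  | List l -> "(" ^ String.concat " " (List.map to_string l) ^ ")"

(* Hypothetical record and its converter. *)
type point = { x : int; y : int }

let sexp_of_point p =
  List
    [ List [ Atom "x"; Atom (string_of_int p.x) ];
      List [ Atom "y"; Atom (string_of_int p.y) ] ]

let () = print_endline (to_string (sexp_of_point { x = 1; y = 2 }))
```

Each type only needs a converter to the common tree, and printing, diffing, or serializing are then written once against that tree.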
