Using ppx_deriving.show for trace messages

jumpnbrownweasel · July 22, 2023, 12:14am

I just started using ppx_deriving.show. It adds newlines to (edit: embeds them in the content of) the output of both pp and show which is appropriate for pretty printing. But what about tracing? Normally trace messages don’t have newlines so that grep’ing the trace file prints the entire event.

I could write separate pp functions for tracing, but in that case I might as well not use ppx_deriving. I could replace the newlines with spaces in the string that show returns and print it, but that’s not an efficient way to print traces.

Has anyone else run into this? I didn’t see anything in the ppx_deriving issues.

Chet_Murthy · July 22, 2023, 12:40am

Can you share an example? I don’t recall this as a problem with the show deriver, and it seems unlikely, since how could it compose (that is, the pp function for type a * b is composed of the function for each of a and b).

jumpnbrownweasel · July 22, 2023, 12:58am

Sure, I just saw it. Let me distill it a little.

jumpnbrownweasel · July 22, 2023, 1:10am

utop # #require "ppx_deriving.show" ;;
utop # type t = {dialog_idx: int; reply_idx: int; reply: string list; n_circular: int} [@@deriving show] ;;
type t = {
  dialog_idx : int;
  reply_idx : int;
  reply : string list;
  n_circular : int;
}
val pp : Format.formatter -> t -> unit = <fun>
val show : t -> string = <fun>
utop # show {dialog_idx = 1; reply_idx = 2; n_circular = 3; reply =  ["one"; "two"; "three"]} ;;
- : string =
"{ dialog_idx = 1; reply_idx = 2; reply = [\"one\"; \"two\"; \"three\"];\n  n_circular = 3 }"

The newline is after the list.

Chet_Murthy · July 22, 2023, 1:13am

now do:

Fmt.(str "%a" pp [value of type t here])

[I think that’s how it’s done – I’m not going to test it out.]

That is to say, there’s a difference between “show” and “pp” in that “show” intended for use at the toplevel, where “pp” is intended for composition. I rarely-if-ever use show.

jumpnbrownweasel · July 22, 2023, 1:14am

pp places the newline in the same place as show, so I naively assumed that the same code is used for both. But perhaps they’re different in some cases.

jumpnbrownweasel · July 22, 2023, 1:14am

utop # asprintf "%a" pp {dialog_idx = 1; reply_idx = 2; n_circular = 3; reply =  ["one"; "two"; "three"]} ;;
- : string =
"{ dialog_idx = 1; reply_idx = 2; reply = [\"one\"; \"two\"; \"three\"];\n  n_circular = 3 }"

Edit: they’re the same.

jumpnbrownweasel · July 22, 2023, 1:24am

In case this is what you’re going to ask next:

utop # printf "%a\n" pp r ;;
{ dialog_idx = 1; reply_idx = 2; reply = ["one"; "two"; "three"];
  n_circular = 3 }
- : unit = ()

Chet_Murthy · July 22, 2023, 1:34am

OH, NOW I SEE! You meant newlines in the interior of the printout. Yes, that’s normal and expected. And truthfully, why would you do otherwise? We all had the same problem in Java, back in the day. And removing those newlines meant an unholy hell of trying to reinfer them during log-analysis.

Let’s back up: your problem here is that you have a byte-stream, and you want to put records into it for later demarcated parsing. As Marshall Rose said oh-so-long-ago, there are three ways of doing this:

length-count (prepend a length-count)
sentinel (a special character at the end of each record) – this is your “let’s use newline as the sentinel”
byte-stuffing (convert that sentinel into something else when it occurs inside the record)

[it’s been a long, long time, so I might be misremembering]

So when I wrote a high-performance logging framework for Java, I chose (went back and looked at the code) length-count (#1). But I’ve used the sentinel approach – only, not “\n”, but rather “%|%” which never shows up in Java log output, for more “dash it off one-offs”.

Log-messages are also meant to be processed by humans, which is why removing newlines and whitespace is actually not-very-helpful.

jumpnbrownweasel · July 22, 2023, 1:38am

I understand your point, but I disagree. Grepping logs is common and doesn’t give you the entire matching event if newlines are embedded. This is from my experience in debugging production systems. But it is fine for us to disagree.

Chet_Murthy · July 22, 2023, 1:42am

Oh, I understand that you need to grep things. That’s what Perl scripts are for, eh? A Perl script like:

$/ = "%|%" ;
while (<>) {
  print $_ if $_ =~ /foo bar buzz/s;
}

and of course, it’s trivial to make a version that takes a cmdline pattern. And compiles that pattern into a thunk before the while-loop.

Again going back to Java, stack-tracebacks are common. When I build large systems in OCaml they’re common also. So grepping isn’t very useful: I don’t want the line containing the pattern I was searching for, but rather the log-record. And as I showed above, it’s pretty trivial to write a script to grep on log-records, not lines.

Chet_Murthy · July 22, 2023, 2:03am

You cann also put out “\0” (null) as a sentinel/terninator; then you can use “grep -z” that treats the null as the line-ending character. I never did that, b/c I found it ugly; I thought “%|%” was much better, b/c it stayed in ASCII.

jumpnbrownweasel · July 22, 2023, 2:11am

Thanks for all this info, it is useful. But really my desire is to get rid of the newlines (or replace them with something) when using deriving.show. I realized just now that I have been accustomed to newline-free output when using Rust’s derived Debug format, so that is part of why I was surprised. I think my mistake is in not realizing that deriving.show is all about pretty-printing, not debug output, including the show function. And that not everyone (maybe no one else) wants newline-free output.

Chet_Murthy · July 22, 2023, 2:17am

Oh, lots of people want newline-free output. But I think once you start seeing large things getting pretty-printed (e.g. stack tracebacks), if it’s newline-free, your eyes roll back in their sockets, and if that happens a few too many times, it really, really hurts. So you learn to like newlines and whitespace-preservation.

jumpnbrownweasel · July 22, 2023, 2:21am

What we did (in Java production systems) is to put everything about an error except for the stack trace in a single line, followed by a stack trace with newlines as usual. But of course that’s not the only option as you point out. It’s just what I am used to doing.

Chet_Murthy · July 22, 2023, 2:23am

I thought I should add that, no, pp/show are definitely for debug, not pretty-printing. No end-user wants to see OCaml types. So a pattern you’ll see is that there are functions pp, and also functions pp_hum. With the latter as the “pretty” printing functions. This realy is a difference of opinion about newlines in debug output. And you’re not alone, as I said before.

octachron · July 23, 2023, 7:08am

To answer the original question, the interpretation of new lines can be tuned in the formatter when using Format. For instance, we can set a low level device that outputs '↵` on newlines with

let set_escaped_newline ppf =
  let low_level_device = Format.pp_get_formatter_out_functions ppf () in
  let out_newline () = low_level_device.out_string "↵" 0 3  in
  let low_level_device = { low_level_device with out_newline } in
  Format.pp_set_formatter_out_functions ppf low_level_device

If we set up a version of sprintf that uses this kind of low_level device

let sprintf fmt =
  let b = Buffer.create 20 in
  let ppf = Format.formatter_of_buffer b in
  let () = set_escaped_newline ppf in
  Format.kfprintf
    (fun ppf -> Format.pp_print_flush ppf (); Buffer.contents b)
    ppf fmt

then

sprintf "@[<v>%a@]" Fmt.Dump.(list Fmt.int) (List.init 25 Fun.id)

returns

"[0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; ↵ 20; 21; 22; 23; 24]"

jumpnbrownweasel · July 23, 2023, 7:59am

Wow, thank you!! That will give me what I need.

Topic		Replies	Views
`Format`-based `ppx_deriving.show` syntax Learning	4	452	February 28, 2023
Deriving show not working as expected Learning ppx_deriving	2	2042	February 25, 2022
Ppx_deriving implementation for pretty printing sets & maps Learning	8	2154	December 16, 2021
How to derive show for pairs, lists, etc.? Learning ppx_deriving	1	451	January 1, 2024
Is there plugin `show` based on `ppxlib.Deriving` Learning ppx , ppx_deriving , ppxlib	1	1789	July 1, 2019

Using ppx_deriving.show for trace messages

Related topics