Using non-ASCII characters in pretty printing?

Consider the following code snippet:

type wrapped_int=Wr of int;;

let print_wrapped_int (Wr i)=
  "\xe3\x80\x90  "^(string_of_int i)^"  \xe3\x80\x91";;

let print_out_wrapped_int (fmt:Format.formatter) x=
   Format.fprintf fmt "@[%s@]" (print_wrapped_int x);;  

#install_printer print_out_wrapped_int;;


let example1=Wr 7;;
let example2=print_string(print_wrapped_int example1);;

The output in utop is as follows:

$ utop
--------+-------------------------------------------------------------+--------
        | Welcome to utop version 2.0.1 (using OCaml version 4.04.1)! |
        +-------------------------------------------------------------+

Type #utop_help for help about using utop.

-( 15:43:29 )-< command 0 >-------------------------------------{ counter: 0 }-
utop # type wrapped_int=Wr of int;;
type wrapped_int = Wr of int                                                   
-( 15:43:32 )-< command 1 >-------------------------------------{ counter: 0 }-
utop # 
let print_wrapped_int (Wr i)=
  "\xe3\x80\x90  "^(string_of_int i)^"  \xe3\x80\x91";;
val print_wrapped_int : wrapped_int -> string = <fun>                          
-( 15:43:43 )-< command 2 >-------------------------------------{ counter: 0 }-
utop # 
let print_out_wrapped_int (fmt:Format.formatter) x=
   Format.fprintf fmt "@[%s@]" (print_wrapped_int x);;  
val print_out_wrapped_int : Format.formatter -> wrapped_int -> unit = <fun>    
-( 15:43:43 )-< command 3 >-------------------------------------{ counter: 0 }-
utop # 
#install_printer print_out_wrapped_int;;
-( 15:43:43 )-< command 4 >-------------------------------------{ counter: 0 }-
utop # 
let example1=Wr 7;;
val example1 : wrapped_int = ?  7  ?
-( 15:43:43 )-< command 5 >-------------------------------------{ counter: 0 }-
utop # let example2=print_string(print_wrapped_int example1);;
【  7  】val example2 : unit = () 

Here I am not satisfied with the display for example1, as the non-ASCII characters are rendered as a mere ?. I know however that my terminal is fully capable of displaying the characters correctly, as shown by example2.

Is this an inescapable limitation of pretty-printing in OCaml, or is there a workaround?

The problem must be on your side. example1 renders correctly here.

Note however that Format.formatter will not be able to perform good layout on UTF-8, see here and here for a full explanation and (necessarily incomplete) solutions.

Thank you for your feedback.

What do you mean? Since when is rendering one character as another “correct rendering”?

I read those links thoroughly and did not find them helpful.
What would your “necessarily incomplete” fix to my code snippet be?

I tried your example1 on my machine and it renders correctly like example2. So the problem seems to be with your terminal.

This is unrelated to your initial problem, but you will hit it soon enough. You need to print UTF-8 encoded characters using the provided functions; otherwise the Format module will think of your UTF-8 encoded character as being three characters long.

Though if you are not dealing with general UTF-8 input and are only using a fixed set of uchars, you could do it yourself, e.g. via

     Format.fprintf ppf "@<1>%s" "\xe3\x80\x90"
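To make that concrete, here is a minimal sketch of the width hint applied to the printer from the question. The name pp_wrapped_int is mine, and the width of 1 is simply taken from the line above; if your terminal draws these brackets as double-width, @<2> may be the better hint. Note that the hint only changes how Format counts columns for layout, not which bytes are written, so it is not expected to cure the ? rendering by itself.

```ocaml
type wrapped_int = Wr of int

(* Each bracket is 3 bytes of UTF-8. "@<1>%s" asks Format to count the
   following string as 1 column wide instead of using its byte length.
   The width 1 is an assumption carried over from the suggestion above. *)
let pp_wrapped_int (ppf : Format.formatter) (Wr i) =
  Format.fprintf ppf "@[@<1>%s %d @<1>%s@]" "\xe3\x80\x90" i "\xe3\x80\x91"

let () =
  (* Without a hint, Format would count 3 columns for this string. *)
  assert (String.length "\xe3\x80\x90" = 3);
  Format.printf "%a@." pp_wrapped_int (Wr 7)
```

In the toplevel this would then be registered with #install_printer pp_wrapped_int;; as in the original snippet. The hint goes directly before each %s it applies to, inside the box opened by @[.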

I also realized just now that the problem appears only when I use utop, not when I use the ordinary ocaml toplevel. Which did you try on your machine?

Utop on macOS.

I tried the following:

type wrapped_int=Wr of int;;


let print_out_wrapped_int (fmt:Format.formatter) (Wr i)=
   Format.fprintf fmt "@[<1>%s%s@<1>%s@]" 
   "\xe3\x80\x90" 
   (" "^(string_of_int i)^" ")  
   "\xe3\x80\x91";;  

#install_printer print_out_wrapped_int;;


let example1=Wr 7;;

But I still get that ? 7 ? in utop. Grr.