Pretty printing binary ints

Stdlib’s Printf allows you to print ints as unsigned octals or hexadecimals, but not as unsigned binary integers.

Does anyone have a library that can pretty print ints as binary ints? I’m in the middle of writing one, but thought I’d ask just in case I missed something.

Ideally I’m looking for something that would output 0b1_0010 or simply 10010 when I run Printf.printf "%a" pp_binary_int 18.

Bonus points if there’s a way to specify minimum widths: 0b0001_0010 for Printf.printf "%a" (pp_binary_int ~min_width:8) 18.


Interesting, looks like Printf at one point supported it but then it was removed: minor feature request: printf and binary digits · Issue #6249 · ocaml/ocaml · GitHub


There doesn’t seem to be much demand for this feature. If someone can come up with a compelling argument, a good choice of letter (given that B and b are already taken), and a patch, we might include it.

Hmm… My use case is that I’m abusing int as a small bit-vector and need it for debugging. I don’t think this would count as “much demand”.

When pretty-printing stuff in anger, I’ve more-or-less stopped using Printf in favor of Fmt. It’s got enough bells-and-whistles that even for printing single lines, it’s the way to go. But either way, you can just use the “provide a function that converts to string” (for Fmt, “provide a function that pretty-prints”) method to provide custom pretty-printers for your special types.

I hope that was clear; if not, I can give an example. It’s not as sweet as “%b”, but then again, it’s not very painful either.
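For illustration, here is a minimal sketch of that approach using only the stdlib. It uses Format rather than Fmt so that it runs without extra dependencies (an Fmt printer has the same `Format.formatter -> 'a -> unit` shape). The names `to_binary_string`, `pp_binary_int` and `output_binary_int` are made up for this example and do not come from any library.

```ocaml
(* Hypothetical helper: binary digits of [n], read as an unsigned value.
   [lsr] is a logical shift, so negative inputs terminate too. *)
let to_binary_string n =
  let rec go acc n =
    let acc = (if n land 1 = 1 then "1" else "0") ^ acc in
    let n = n lsr 1 in
    if n = 0 then acc else go acc n
  in
  go "" n

(* Printer for Format's (and Fmt's) "%a". *)
let pp_binary_int ppf n = Format.pp_print_string ppf (to_binary_string n)

(* Printer for Printf's "%a", which expects an out_channel instead. *)
let output_binary_int oc n = output_string oc (to_binary_string n)

let () =
  Format.printf "%a@." pp_binary_int 18;
  Printf.printf "%a\n" output_binary_int 18
```

Both calls print 10010. Note that the two "%a" conventions differ: Printf wants a function taking an out_channel, while Format and Fmt want one taking a formatter.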


Oh, I already use Fmt when I’m pretty-printing complicated types. Love that library!

I’m still going to have to write a function that converts ints to (strings of) binary ints.

I haven’t looked at that issue, but maybe you could scoop up the code [that got removed] from the associated PR? Just a thought.

This is code that I have used in the past (from memory):

let int_size = Sys.word_size - 1
let int2bin =
  let buf = Bytes.create int_size in
  fun n ->
    for i = 0 to int_size - 1 do
      let pos = int_size - 1 - i in
      Bytes.set buf pos (if n land (1 lsl i) <> 0 then '1' else '0')
    done;
    (* skip leading zeros *)
    match Bytes.index_opt buf '1' with
    | None -> "0b0"
    | Some i -> "0b" ^ Bytes.sub_string buf i (int_size - i)

Cheers,
Nicolas

Another option is to use Bitvector code from BAP bap/lib/bitvec at master · BinaryAnalysisPlatform/bap · GitHub

Done with my library: GitHub - ifazk/pp-binary-ints: An OCaml library for pretty printing ints as unsigned binary integers. Will post an announcement once the opam PR goes through.

I made it very customizable.

  • You can print with 0b prefixes and _ separators.
  • You can choose to print zeros just like the non-zeros, with prefixes and separators.
  • If you use zero padding, you can control how many leading zeros show up with the ~min_width argument.
  • It correctly handles the edge cases when adding _ separators: you won’t get leading underscores.
  • And of course, it includes pretty printers that work with Format and Fmt, not just to_string functions.
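To make the separator edge case concrete, here is a rough sketch of how prefixes, separators and zero padding can fit together. This is not the pp-binary-ints API: the function `to_string` and its `~prefix`, `~separators` and `~min_digits` arguments are hypothetical, and `~min_digits` counts digits only, which may differ from how the library's `~min_width` counts.

```ocaml
(* Binary digits of [n] (read as unsigned), most significant first. *)
let digits n =
  let rec go acc n =
    let acc = (if n land 1 = 1 then '1' else '0') :: acc in
    let n = n lsr 1 in
    if n = 0 then acc else go acc n
  in
  go [] n

(* Hypothetical formatter, written for illustration only. *)
let to_string ?(prefix = false) ?(separators = false) ?(min_digits = 0) n =
  let ds = digits n in
  let len = List.length ds in
  (* Zero-pad on the left up to [min_digits] digits. *)
  let ds = List.init (max 0 (min_digits - len)) (fun _ -> '0') @ ds in
  let total = List.length ds in
  let buf = Buffer.create (total + (total / 4) + 2) in
  if prefix then Buffer.add_string buf "0b";
  List.iteri
    (fun i d ->
      (* Put an underscore before a digit when a multiple of four digits
         remains, but never before the first digit, so the output cannot
         start with a separator. *)
      if separators && i > 0 && (total - i) mod 4 = 0 then
        Buffer.add_char buf '_';
      Buffer.add_char buf d)
    ds;
  Buffer.contents buf

let () =
  print_endline (to_string ~prefix:true ~separators:true 18);
  print_endline (to_string ~prefix:true ~separators:true ~min_digits:8 18)
```

This prints 0b1_0010 and then 0b0001_0010. In this sketch zero is padded and separated like any other value, so `to_string ~separators:true ~min_digits:8 0` gives 0000_0000.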

The “zeros just like the non-zeros” needs some explanation. I noticed some inconsistencies when printing zero with Printf.

# Printf.printf "%0#11d" 4444444;;
004_444_444- : unit = ()
# Printf.printf "%0#11d" 0;;
00000000000- : unit = ()

Notice the 00000000000; I would have expected 000_000_000. I made both behaviours available.


I would call this a bug. Did you consider reporting it?

No, I later realized that the two aren’t inconsistent.

Stdlib takes “zero-padding” a little too literally. Notice what happens when we use a much longer width.

# Printf.printf "%0#22d" 4444444;;
00000000000004_444_444- : unit = ()
# Printf.printf "%0#22d" 0;;
0000000000000000000000- : unit = ()

Still, I would expect padding zeros to be thousand-separated (that leaves the corner case when the mandated width is a multiple of 4, where you’d start with an underscore, but in that case we can blame the user for mandating an inadequate width with the alternate format).

NB: I just noticed another unexpected behavior: the + and <space> flags (prefix non-negative numbers with a plus or a space, for alignment purposes) disable the alternate format altogether.

For zero padding, it looks like this is known and intended behaviour, meant to keep things simple in Stdlib: see Printf: alternative format for integers by ygrek · Pull Request #1182 · ocaml/ocaml · GitHub. At least one comment there argues that better handling belongs in external libraries.

Looks like the + and <space> flags are not allowed to be used together and are only there for legacy behaviour. Try passing the -strict-formats flag to the ocaml toplevel or the compiler.

On my system (macOS 12.6, OCaml 4.14), Sys.int_size and Sys.word_size - 1 both return 63.

Just curious if there is a reason for using the latter over the former?

No, there isn’t. In fact, it is better to use Sys.int_size which is correct even for other backends such as js_of_ocaml (which uses 32-bit integers).
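For reference, a variant of the earlier int2bin sized by Sys.int_size might look like the sketch below. It has not been tested on js_of_ocaml here. It also builds a fresh string per call instead of reusing a shared buffer, which is a change of mine and not part of the original snippet.

```ocaml
(* Sketch: int2bin sized by Sys.int_size (63 on 64-bit native code). *)
let int2bin n =
  let size = Sys.int_size in
  (* Character [pos] of the string is bit [size - 1 - pos] of [n]. *)
  let s =
    String.init size (fun pos ->
        if n land (1 lsl (size - 1 - pos)) <> 0 then '1' else '0')
  in
  (* Skip leading zeros. *)
  match String.index_opt s '1' with
  | None -> "0b0"
  | Some i -> "0b" ^ String.sub s i (size - i)

let () = print_endline (int2bin 18)
```

This prints 0b10010.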

Cheers,
Nicolas