I’ve been reading this blog post, Parse, don’t validate, and think the ideas are pretty good.
Part of the main idea is that you should use types that make invalid values impossible to represent. The post uses the example of a non-empty list, which I’ll translate into OCaml.
type 'a non_empty = NonEmpty of 'a * 'a list
let non_empty = function
| [] -> None
| hd :: tl -> Some (NonEmpty (hd, tl))
In this way, you can define a head function that never fails:
let head (NonEmpty (hd, _)) = hd
Accepting only instances of non_empty ensures the input is never invalid.
“Aha!” you will surely say. “You are simply pushing error handling to the point at which you create the non_empty type.”
That is correct. The data must still be checked and errors handled. The article claims that it’s better to do this in the data preparation phase than in the processing phase. That is, you parse everything first to ensure only valid values, use types which can only represent the valid values and then compute a result based on these values.
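To make the "parse first, compute later" split concrete, here is a minimal sketch. The names parse_scores and best are hypothetical, invented purely for illustration; only non_empty and head mirror the definitions above.

```ocaml
(* Same definitions as above, repeated so the snippet stands alone. *)
type 'a non_empty = NonEmpty of 'a * 'a list

let non_empty = function
  | [] -> None
  | hd :: tl -> Some (NonEmpty (hd, tl))

let head (NonEmpty (hd, _)) = hd

(* Hypothetical parsing step: the only place that has to deal with
   the empty case. *)
let parse_scores (raw : int list) : (int non_empty, string) result =
  match non_empty raw with
  | None -> Error "expected at least one score"
  | Some scores -> Ok scores

(* Hypothetical processing step: total, because the argument type
   cannot represent an empty list. *)
let best (NonEmpty (hd, tl)) = List.fold_left max hd tl
```

Everything downstream of parse_scores can call head or best without an option or an exception in sight.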
I was considering how one might do this with Map.t, and one idea was simply:
(* just assume this is functorized where `M` is an instance of Map.Make *)
type 'a non_empty = NonEmpty of {l: 'a M.t; v: M.key; d: 'a; r: 'a M.t}
let non_empty = function
| Empty -> None
| Node {l; v; d; r; _} -> Some (NonEmpty {l; v; d; r})
I realize there’s probably a better way to do this (using choose and remove, for example), but the more important point is that, as far as I know, the version above is actually impossible to write, since there’s no way for user code to destructure Map.t.
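For comparison, here is a rough sketch of the choose-and-remove alternative, which only needs the public Map.S interface. The functor name NonEmptyMap and its small signature are my own assumptions, not anything from the standard library, and unlike the destructuring version it pays for an extra remove on the way in and an add on the way out.

```ocaml
(* Hypothetical wrapper built only from the public Map.S interface. *)
module NonEmptyMap (M : Map.S) : sig
  type 'a t
  val of_map : 'a M.t -> 'a t option
  val to_map : 'a t -> 'a M.t
  val choose : 'a t -> M.key * 'a
end = struct
  (* Invariant: [key] is not bound in [rest]. *)
  type 'a t = { key : M.key; data : 'a; rest : 'a M.t }

  (* The only place where emptiness has to be checked. *)
  let of_map m =
    match M.choose_opt m with
    | None -> None
    | Some (key, data) -> Some { key; data; rest = M.remove key m }

  let to_map { key; data; rest } = M.add key data rest

  (* Total: a value of type ['a t] always carries one binding. *)
  let choose { key; data; _ } = (key, data)
end

module SM = Map.Make (String)
module NE = NonEmptyMap (SM)
```

Because the type is abstract in the signature, the only way to obtain an NE.t is through of_map, so the non-emptiness guarantee holds by construction.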
At another time, I was working on an iterator type inspired by Base.Sequence. I really like these Jane Street libraries, but I always hesitate to include them as dependencies in libraries I intend to distribute: they are fairly large and very opinionated, and while I tend to agree with many of their opinions, I don’t want to force them on potential users of my library. So I have occasionally ended up making poor-man’s copies of individual modules from Base (including Sequence and Sexp) that play nicely with Stdlib for use in my libraries.
Anyway, I was working on my iterator type, and I was considering how I might efficiently convert instances of Map.t into my type. To do so, I looked at the implementation of Map.Make.to_seq (ocaml/stdlib/map.ml at 137dd26adc3345547b6eef6da744ac0d66fbc209 in ocaml/ocaml on GitHub).
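For anyone who doesn’t want to follow the link: the technique, as I read it, is to keep an explicit "enumeration" of the subtrees still to be visited, so that an in-order traversal can be paused and resumed one binding at a time. The sketch below reproduces the idea on a toy binary tree defined right here, because Map.t itself cannot be pattern-matched from outside. The tree type is an assumption made for illustration, and the real node layout in the stdlib differs in detail.

```ocaml
(* Toy stand-in for the map's internal tree; not the stdlib's real type. *)
type ('k, 'v) tree =
  | Empty
  | Node of ('k, 'v) tree * 'k * 'v * ('k, 'v) tree

(* Pending work: a binding, its right subtree, and whatever comes after. *)
type ('k, 'v) enumeration =
  | End
  | More of 'k * 'v * ('k, 'v) tree * ('k, 'v) enumeration

(* Walk down the left spine, stacking each node with its right subtree. *)
let rec cons_enum t e =
  match t with
  | Empty -> e
  | Node (l, k, v, r) -> cons_enum l (More (k, v, r, e))

(* Emit the next binding lazily, then continue into its right subtree. *)
let rec seq_of_enum e () =
  match e with
  | End -> Seq.Nil
  | More (k, v, r, rest) -> Seq.Cons ((k, v), seq_of_enum (cons_enum r rest))

let to_seq t = seq_of_enum (cons_enum t End)
```

Each step only touches one left spine, so nothing is allocated for parts of the tree that are never consumed.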
It’s a clever idea. I thought I might apply a similar idea to my iterator, but again remembered that I cannot, as a user, destructure instances of Map.t, nor get access to cool implementation functions like cons_enum, so I will probably have to simply convert it to a Seq and then convert that to my type. The overhead is not a big deal, but I’m sure any programmer can understand how bad this feels, especially when the point is to have an iterator which is more efficient than Seq.
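Here is roughly what that fallback looks like. The iter type below is a hypothetical stand-in for my iterator (a hidden state plus a step function), and of_seq, fold and of_string_map are illustrative names, not an existing API.

```ocaml
(* Hypothetical iterator: an existentially hidden state plus a step
   function that yields the next element and the next state. *)
type 'a iter = Iter : 's * ('s -> ('a * 's) option) -> 'a iter

(* A Seq.t can serve as the state directly; stepping forces one node. *)
let of_seq (s : 'a Seq.t) : 'a iter =
  let step s =
    match s () with
    | Seq.Nil -> None
    | Seq.Cons (x, rest) -> Some (x, rest)
  in
  Iter (s, step)

(* Consume the iterator from left to right. *)
let fold f init it =
  match it with
  | Iter (s0, step) ->
    let rec go acc s =
      match step s with
      | None -> acc
      | Some (x, s') -> go (f acc x) s'
    in
    go init s0

module SMap = Map.Make (String)

(* Map -> Seq -> iter: every step still pays for the Seq closure. *)
let of_string_map (m : 'a SMap.t) : (string * 'a) iter =
  of_seq (SMap.to_seq m)
```

It works, but each element still goes through a Seq node and its closure, which is exactly the cost I was hoping the iterator would avoid.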
So my question is, is there a way to get “super secret access” to some of these implementation functions/constructors? I realize this is totally against the spirit of encapsulation, but I thought I might ask anyway.