How to constrain a function to receive only a type variant?

I’m a little confused by OCaml’s type system. Have a look at this code:

let usage () = print_endline "Usage: rev.exe PATH"

let do_exit (Error msg) code =
  print_endline @@ "ERROR => " ^ msg
  ; usage ()
  ; exit code
;;

let run path =
  match Rev_lib.valid_path_of_string path with
  | Ok vp -> Rev_lib.rev_file vp
  | Error _ as err -> do_exit err 2
;;

let () =
  match Sys.argv with
  | [| _; path |] -> run path
  | _ -> do_exit (Error "missing argument") 1
;;

The compiler complains about the do_exit function:

Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.

I understand why it complains, but I don’t want to express in code the notion that the do_exit function could handle the other side of the result type, i.e. Ok.

I don’t want do_exit to receive just a string either; I want to be more specific.

What am I missing?

For such a simple use case perhaps just raising an exception will do the trick? Is there a reason to wrap a message in Error just to unwrap it again and print it?

| _ ->
  usage ();
  failwith "missing argument"

I’m trying to play with the notion of “parse don’t validate”.

For instance, this is what my binary is calling into

type path = Valid_path of string

let valid_path_of_string p =
  if not @@ Sys.file_exists p then
    Error ("file does not exist: " ^ p)
  else
    Ok (Valid_path p)
;;

let rev_file (Valid_path p) = print_endline @@ "Ready to work on: " ^ p

In rev_file, I clearly signaled that my file path is “valid”, whatever that means.

It works here because I’ve only got one variant. As soon as I add another one, I get into the same problem.

So in short, I want to force callers to feed data in a certain constrained way, to minimize defensive coding later down the chain.

The simplest way to do this is to define a specific type for the argument of do_exit and translate into this specific type at the callsite:

type error =
  | Path_error of string

let do_exit (Path_error msg) code =
  ...

let run path =
  match Rev_lib.valid_path_of_string path with
  | Ok vp -> Rev_lib.rev_file vp
  | Error msg -> do_exit (Path_error msg) 2
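
A self-contained sketch of this first approach follows. It makes several assumptions that are not in the original: a second, hypothetical constructor (Usage_error) to show that a dedicated type stays exhaustive as it grows, a check_path helper standing in for Rev_lib.valid_path_of_string, and a function that returns the message as a string instead of printing it and exiting, so that it is easy to try out:

```ocaml
(* Dedicated error type for the reporting function.
   Usage_error is a hypothetical second case added for illustration. *)
type error =
  | Path_error of string
  | Usage_error of string

(* Total over [error]: every constructor is handled, so no warning 8.
   Returns the message rather than printing and calling [exit]. *)
let error_message = function
  | Path_error msg -> "ERROR => " ^ msg
  | Usage_error msg -> "ERROR => bad usage: " ^ msg

(* Stand-in for Rev_lib.valid_path_of_string (an assumption, not the real library). *)
let check_path p =
  if Sys.file_exists p then Ok p else Error ("file does not exist: " ^ p)

(* The generic [Error msg] is translated into the specific type at the call site. *)
let run path =
  match check_path path with
  | Ok p -> "Ready to work on: " ^ p
  | Error msg -> error_message (Path_error msg)
```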

Another way is to use polymorphic variants, where each use of a constructor is assigned its own “type” (see OCaml - Polymorphic variants):

let do_exit (`Error msg) code =
  print_endline @@ "ERROR => " ^ msg;
  usage ();
  exit code

let run path =
  match Rev_lib.valid_path_of_string path with
  | `Ok vp -> Rev_lib.rev_file vp
  | `Error _ as err -> do_exit err 2

let () =
  match Sys.argv with
  | [| _; path |] -> run path
  | _ -> do_exit (`Error "missing argument") 1

This looks clever (no need for an extra type definition), but keep in mind that the typing of polymorphic variants is more complex than that of ordinary variants, which makes the code harder to reason about. Because of this, in most situations the first, simpler approach is preferable.

Cheers,
Nicolas

Of course, it makes so much sense now!

Nice feedback thanks, very much appreciated

You could also convince the typechecker that the Ok case is not going to happen using an empty type and adding a type annotation on your Error msg argument to say it has type (empty, string) Result.t:

type empty = |

let usage () = print_endline "Usage: rev.exe PATH"

let do_exit (Error msg : (empty, string) Result.t) code =
  print_endline @@ "ERROR => " ^ msg
  ; usage ()
  ; exit code

let run path = print_endline path

let () =
  match Sys.argv with
  | [| _; path |] -> run path
  | _ -> do_exit (Error "missing argument") 1

Very cool @zapashcanon!! This looks like a very flexible solution!

It’s the only one so far that allowed me to pass along the error as-is, like this:

| Error _ as err -> do_exit err 2

That equivalent code wouldn’t work with polymorphic variants.

I thought about a slight variation this morning where the rule is simple: just “capture” the variant into a custom type as such:

type 'a error = Error of 'a

let do_exit (Error msg) code =
  print_endline @@ Ansi.red ^ "ERROR => " ^ msg ^ Ansi.rst
  ; usage ()
  ; exit code
;;

let run path =
  match Rev_lib.valid_path_of_string path with
  | Ok vp -> Rev_lib.rev_file vp
  | Error msg -> do_exit (Error msg) 2
;;

And the type checker is happy. It doesn’t look like there are any downsides to doing that.

If this is an uncommon practice, maybe adding an extra type annotation would be useful, for clarity:

let do_exit ((Error msg): string error) code =
  print_endline @@ Ansi.red ^ "ERROR => " ^ msg ^ Ansi.rst
  ; usage ()
  ; exit code
;;

Quoting: “It’s the only one so far that allowed me to pass along the error as-is [...] That equivalent code wouldn’t work with polymorphic variants.”

It would work:

let usage () = print_endline "Usage: rev.exe PATH"

let do_exit (`Error msg) code =
  print_endline @@ "ERROR => " ^ msg
  ; usage ()
  ; exit code

let foo _path = `Error "meh"

let run path =
  match foo path with
  | `Ok vp -> ignore vp
  | `Error _ as err -> do_exit err 2

let () = run ".."

But polymorphic variants are less efficient than regular ones. In this case, it also assumes that you control what the foo function returns, which may not always be the case.

Quoting: “And the type checker is happy. It doesn’t look like there are any downsides to doing that.”

The downside is that this code is probably going to allocate, contrary to using the empty type with | Error _ as err -> do_exit err 2. To avoid this, you may need to add the [@@unboxed] attribute to your type definition.
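
As a hedged sketch of that annotation: a type with a single constructor carrying a single argument can be marked [@@unboxed], in which case the wrapper shares the runtime representation of its payload and wrapping does not allocate. The names error_message and parse_positive below are hypothetical, and the message is returned rather than printed:

```ocaml
(* Hypothetical function returning a regular Stdlib result.
   Defined before the custom type so that [Error] here is the result's. *)
let parse_positive n : (int, string) result =
  if n > 0 then Ok n else Error "not positive"

(* Single constructor, single argument: eligible for [@@unboxed]. *)
type 'a error = Error of 'a [@@unboxed]

let error_message (Error msg : string error) = "ERROR => " ^ msg

let run n =
  match parse_positive n with
  | Ok v -> string_of_int v
  (* The pattern is the result's Error (chosen from the scrutinee's type);
     the expression on the right builds our own unboxed Error. *)
  | Error msg -> error_message (Error msg)
```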

But if all you want is to be explicit you could also just do the following:

let do_exit ~error:msg code =
  ...

let run path =
  match Rev_lib.valid_path_of_string path with
  | Ok vp -> Rev_lib.rev_file vp
  | Error msg -> do_exit ~error:msg 2
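
A minimal sketch of the labeled-argument version follows. Since the original elides the function body, this uses a separate, hypothetical exit_message function that returns the text it would print, plus a check_path helper standing in for Rev_lib.valid_path_of_string:

```ocaml
(* The label documents the argument's role; its type is still plain string. *)
let exit_message ~error:msg code =
  Printf.sprintf "ERROR => %s (exit code %d)" msg code

(* Stand-in for Rev_lib.valid_path_of_string (an assumption, not the real library). *)
let check_path p =
  if Sys.file_exists p then Ok p else Error ("file does not exist: " ^ p)

let run path =
  match check_path path with
  | Ok p -> "Ready to work on: " ^ p
  | Error msg -> exit_message ~error:msg 2
```

The label only names the argument: any string can still be passed as ~error, so this is a readability aid rather than a type-level constraint.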

Oh I see. I didn’t understand I would also need to rewrite the types at their origin.

let valid_path_of_string (p : string) : [> `Error of string | `Ok of path ] =
  if not @@ Sys.file_exists p then
    `Error ("file does not exist: " ^ p)
  else
    `Ok (Valid_path p)

Right, I see that now.

Got ya, good to know!

In general, I would prefer to convey meaning by a type rather than by a name. But labeled arguments look like a good option, at least something to consider.

Thanks!

I’m honestly impressed that this flow of type information exists and works in the type checker. I expected it to complain that the types (path, string) result and (empty, string) result won’t unify. And indeed it does if you don’t tell it explicitly that we’re on the Error branch (e.g. by binding the whole value with | err -> ... directly). But it turns out the compiler is smart enough to figure out that the type of the Ok payload doesn’t matter.

Me too! I had to compile it to convince myself it works.
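
The behaviour discussed here can be reproduced with a small sketch. The names message_of_error and parse_positive are hypothetical, and the message is returned instead of printed so that the example has no side effects:

```ocaml
type empty = |

(* Only Error can be passed: an Ok would need a value of type [empty],
   which has no constructors, so the single-case pattern is exhaustive. *)
let message_of_error (Error msg : (empty, string) result) = "ERROR => " ^ msg

(* Hypothetical function whose Ok side is an ordinary, inhabited type. *)
let parse_positive n : (int, string) result =
  if n > 0 then Ok n else Error "not positive"

let run n =
  match parse_positive n with
  | Ok v -> string_of_int v
  (* [err] is typed from the pattern [Error _], which says nothing about
     the Ok parameter, so it can be used at type (empty, string) result. *)
  | Error _ as err -> message_of_error err

(* By contrast, binding the whole scrutinee, as in
     | err -> message_of_error err
   keeps the type (int, string) result and is rejected by the type checker. *)
```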

Another solution:

include (struct
  type 'a error = 'a
  let error_wrap s = s [@@inline]
  let error_unwrap s = s [@@inline]
end : sig
  type _ error
  val error_wrap : 'a -> 'a error
  val error_unwrap : 'a error -> 'a
end)

let do_exit (msg : string error) =
  print_endline @@ "ERROR => " ^ (error_unwrap msg)

let foo path =
  if Random.bool () then Ok path else Error "meh"

let () =
  match foo "..." with
  | Ok vp ->
      print_endline vp
  | Error e ->
      do_exit (error_wrap e)