Generalize the fields of a record to apply validation (& other ops)

type-system

#1

Supposing I have a record with several fields, and each field belongs to a basic type:

type person_info =
    { name       : string;
      surname    : string;
      nickname   : string;
      age        : int;
      ...
    };;

Now, I want to add some “metadata” and a few common functions like validation, so I can take an unvalidated record and produce a valid one. My first simple approach was to use a tuple, where the first component is the actual field value and the second component holds all of this “metadata”:

type 'a field_meta = {
  id: int;
  validate: 'a -> bool;}
type 'a field = ('a* 'a field_meta)
type person_info = {
  name: string field;
  age: int field;}
let name_meta = { id = 1; validate = (fun _  -> true) }
let age_meta = { id = 2; validate = (fun _  -> true) }
let record_a =
  {
    name = ("Joan", name_meta);
    age = (22, age_meta)
  }
let record_b =
  {
    name = ("Paul", name_meta);
    age = (24, age_meta)
  }

This makes it fairly comfortable to validate records:

(* apply a field's own validation function to its value *)
let validate (v, meta) = meta.validate v

let valid r =
  [validate r.name; validate r.age] |> List.for_all (fun ok -> ok)

I guess one could replace the tuples with modules as well, to better encapsulate this “meta data” and remove boilerplate. I like the simplicity and power of tuples :slightly_smiling_face:

What is bugging me now is that I can’t find a way to create a list from the record’s field values, because each of them has a different type. let recList r = [r.name; r.age] fails to compile. This is understandable, but it also means that for every new function in the spirit of validate, with the shape “any field type in, one type out”, I will have to manually walk through all the fields in the record. An example would be a size function that returns an int and is applied to every field to get the size of the whole record.
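To make the boilerplate concrete, here is a minimal sketch of the manual approach. It assumes a size function is added to the meta record next to validate; the helper name size_of is made up for illustration.

```ocaml
(* Assumption: size is added to the meta record for illustration only. *)
type 'a field_meta = { id : int; validate : 'a -> bool; size : 'a -> int }
type 'a field = 'a * 'a field_meta
type person_info = { name : string field; age : int field }

(* Per-field helpers: polymorphic in the type of the field value. *)
let validate (v, meta) = meta.validate v
let size_of (v, meta) = meta.size v

(* Every record-level function has to enumerate all the fields again. *)
let valid r = validate r.name && validate r.age
let size r = size_of r.name + size_of r.age
```

Adding a field to person_info means touching both valid and size, and any other function of the same shape.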

I imagine there must be a more modular way to do this. So my question is: is there some way in OCaml to abstract these shared behaviors between fields (functions like validate and size) so creating a list from the value of each field becomes possible? I’ve been reading a lot about different features and techniques in the language (GADTs, existential types, functors) but I couldn’t figure out a clear path to a solution.

Thanks!


#2

I’m not sure I really understand what you’re trying to achieve, so I have some questions before helping you.

First, is the validate (or size) function, for a given type, specific to each value or is it the same for all the values of a given type? I mean, these functions are not necessarily the same for int and string types, but is it the same for all int values?

Second, and more importantly, why do you want to create a list of fields? Is it to fold the validate function (or any other operator) over them, as you did with your valid function?


#3

Thanks for your answer, I’ll try to clarify :slightly_smiling_face:

It would be specific to each field (so, two fields of type string could have different validation functions).

Yes, exactly. Considering there can be many of these “trait” functions that apply to the different fields –like validate, size, etc– I would like to create a listFromFields function just once, so I can map or fold over it without having to recreate the list in each function.


I guess my more generic question is: if there is a type field('a), and there are some functions that take values of field('a), field('b), etc. but have an output type that doesn’t depend on the input (validate outputs a boolean regardless of the input type, size outputs an int), is there a way to “lift” the collection of types field('a), field('b), etc. and consolidate them into a single type that “hides” all those internal variables 'a, 'b, …?


#4

Ok, I see what you want to do. I do not have time to give a complete answer now (surely tomorrow), but in the meantime you can play with a simple use of a GADT to define an existential type.

type wrapper = W : 'a * ('a -> bool) -> wrapper

let rec map = function 
  | [] -> []
  | W (v, f) :: ws -> f v :: map ws
;;
val map : wrapper list -> bool list = <fun>

let l = [ W (1, fun _ -> true) ; W ("foo", fun _ -> true)];;
val l : wrapper list = [W (<poly>, <fun>); W (<poly>, <fun>)]

map l;;
- : bool list = [true; true]
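The same trick could be applied directly to the tuple-based field type from post #1. This is only a sketch: the names any_field, Any, fields and validate_any are made up for illustration, and the validation functions are placeholders.

```ocaml
(* Field types from post #1, repeated so the snippet is self-contained. *)
type 'a field_meta = { id : int; validate : 'a -> bool }
type 'a field = 'a * 'a field_meta

(* Existential wrapper: the type parameter 'a is hidden inside Any. *)
type any_field = Any : 'a field -> any_field

type person_info = { name : string field; age : int field }

(* Hypothetical helper: the single place that enumerates the fields. *)
let fields r = [ Any r.name; Any r.age ]

(* Unpacking Any yields a value and a meta record that agree on the
   hidden type, so applying one to the other is well typed. *)
let validate_any (Any (v, meta)) = meta.validate v

let valid r = List.for_all validate_any (fields r)

(* Placeholder data and validation rules. *)
let joan =
  { name = ("Joan", { id = 1; validate = (fun s -> s <> "") });
    age = (22, { id = 2; validate = (fun n -> n >= 0) }) }
```

Any other function whose result type does not mention 'a can be written the same way on top of fields.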

#5

Finally, I’ve some time.

Here is a possible solution to your problem. I’m using modules and first-class modules to hold the meta information on fields because I prefer this syntactic construction, but conceptually it is the same thing you did with simple records.

(* the module type for operators on fields *)
module type Meta = sig
  type t
  val id : int
  val validate : t -> bool
  val size : t -> int
end

(* a meta field is a value of some type 'a with operations over it *)
type 'a meta_field = 'a * (module Meta with type t = 'a)

(* a field is a meta_field on some type 'a *)
type field = F : 'a meta_field -> field

(* we can check if a field is valid *)
let valid_field (F (v, (module M))) = M.validate v

(* we can compute the size of a field *)
let field_size (F (v, (module M))) = M.size v

(* the type of a person with meta fields *)
type person = {
  name : string meta_field;
  surname : string meta_field;
  age : int meta_field;
}

(* the function you wanted to compute : the list of fields of a person record *)
let fields_list {name; surname; age} = [F name; F surname; F age]

(* We can now use it to check if a person is valid and compute its size *)
let valid person = List.for_all valid_field (fields_list person)

let size person =
  fields_list person
  |> List.map field_size
  |> List.fold_left (+) 0

(* the different modules of operators for the distinct fields *)
module Name_meta = struct
  type t = string
  let id = 1
  let validate _ = true
  let size = String.length
end

module Surname_meta = struct
  type t = string
  let id = 2
  let validate s = String.contains s 'f'
  let size = String.length
end

module Age_meta = struct
  type t = int
  let id = 3
  let validate i = i > 25
  let size i = i
end

(* and an example *)
let batman = {
  name = ("Bruce", (module Name_meta));
  surname = ("Wayne", (module Surname_meta));
  age = (34, (module Age_meta))
}

(* use of the previous functions *)
valid batman;;
- : bool = false

size batman;;
- : int = 44
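The id value is not used above. As a sketch of one possible use, the list of fields can also report which fields fail validation. The helpers invalid_id and invalid_ids and the sample values below are made up for illustration; List.filter_map needs OCaml 4.08 or later.

```ocaml
module type Meta = sig
  type t
  val id : int
  val validate : t -> bool
  val size : t -> int
end

type 'a meta_field = 'a * (module Meta with type t = 'a)
type field = F : 'a meta_field -> field

(* Hypothetical helper: the id of a field when its validation fails. *)
let invalid_id (F (v, (module M))) =
  if M.validate v then None else Some M.id

(* Ids of all invalid fields, in list order. *)
let invalid_ids fields = List.filter_map invalid_id fields

module Name_meta = struct
  type t = string
  let id = 1
  let validate _ = true
  let size = String.length
end

module Age_meta = struct
  type t = int
  let id = 3
  let validate i = i > 25
  let size i = i
end

(* Sample values; the annotations give the packed modules their type. *)
let name : string meta_field = ("Bruce", (module Name_meta))
let young_age : int meta_field = (20, (module Age_meta))

let failing = invalid_ids [ F name; F young_age ]
```

With these sample values only the age field fails, so failing contains the single id 3.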

Hope this is what you’re searching for. If you have some questions, don’t hesitate to ask.


#6

Thanks a lot @kantian, this is fantastic, and immensely helpful.

Here are some inline comments, just to make sure I understand what’s going on:

(* a meta field is a value of some type 'a with operations over it *)
type 'a meta_field = 'a * (module Meta with type t = 'a)

The with type part lets us keep the module type simple, without type variables (i.e. just type t). Also, the module keyword is needed to turn the module into a first-class module, so that it can be used like a value of a “plain type”.
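A small sketch of packing and unpacking, to check my understanding. The names age_ops and validate_age are made up for illustration.

```ocaml
module type Meta = sig
  type t
  val id : int
  val validate : t -> bool
  val size : t -> int
end

module Age_meta = struct
  type t = int
  let id = 3
  let validate i = i > 25
  let size i = i
end

(* Packing: (module ...) turns the module into an ordinary value.
   The constraint "with type t = int" keeps t visible in the type. *)
let age_ops : (module Meta with type t = int) = (module Age_meta)

(* Unpacking: (val ...) turns the value back into a module. *)
let validate_age n =
  let module M = (val age_ops) in
  M.validate n
```

Without the with type t = int constraint, t would be abstract in the packed value and M.validate could not be applied to an int.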

(* a field is a meta_field on some type 'a *)
type field = F : 'a meta_field -> field

This is (to me) the core part of the solution, right? By leveraging GADTs to define an existential type, we can wrap many different types under a single one, without “leaking” the differences on the output (it’s just a field). [Side note: is there any book or blog posts where I can learn more about these features and techniques?]


I’m going to experiment with this a bit in order to learn more.

Thanks again for taking the time to write such a detailed response!


#7

The with type part belongs to the module language; it lets you define a new signature from a given one (in this case an abstract type becomes a concrete one).

module type Meta = sig
  type t
  val id : int
  val validate : t -> bool
  val size : t -> int
end

module type Meta_on_int = Meta with type t = int;;
module type Meta_on_int = sig
  type t = int
  val id : int
  val validate : t -> bool
  val size : t -> int
end

Conceptually, a value of type (module Meta with type t = 'a) is equivalent to a value of this record type:

type 'a meta = {
  id : int;
  validate : 'a -> bool;
  size : 'a -> int;
}
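One way to see the equivalence is to convert in both directions. This is only a sketch: the function names to_record and of_record are made up, and the locally abstract type (type a) is what lets the signature constraint mention the type parameter.

```ocaml
module type Meta = sig
  type t
  val id : int
  val validate : t -> bool
  val size : t -> int
end

type 'a meta = {
  id : int;
  validate : 'a -> bool;
  size : 'a -> int;
}

(* From a first-class module to a record. *)
let to_record (type a) (module M : Meta with type t = a) : a meta =
  { id = M.id; validate = M.validate; size = M.size }

(* From a record back to a first-class module. *)
let of_record (type a) (r : a meta) : (module Meta with type t = a) =
  (module struct
    type t = a
    let id = r.id
    let validate = r.validate
    let size = r.size
  end)

(* Round trip on a sample record. *)
let sample = { id = 7; validate = (fun n -> n > 0); size = (fun n -> n) }
let round_trip = to_record (of_record sample)
```

The round trip keeps the id and the behaviour of both functions, which is the sense in which the two representations carry the same information.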

Yes, this is it. See the reference manual.

You’re right. This existential type allows you to embed each field in a unique type, hence you can define the homogeneous list you wanted.

Sorry, but I’m not aware of such resources.

You’re welcome.


#8

Great example. Thanks for sharing!

I wonder if it is possible to make the module part generic as well.

For example, the person type can be validated now. But let’s say I want to serialize a person (and its properties) to JSON or to a database.

What would the definition of a more generic person type look like?