Selectively bringing constructors or record fields into scope

Continuing the discussion from Generic numeric module, Problem with types:

I noticed that open seems to be the only way to bring constructors and record fields into scope. I tried:

module M = struct
  type t = { foo : int }

  let ( = ) a b = a.foo = b.foo
end

let () =
  (* None of the following ways to bring "foo" into scope
     works in this context: *)

  (* let foo = M.foo in *)
  (* let foo (x : M.t) = x.foo in *)
  (* let open M in *)
  (* let open (M : sig type t = M.t end) in *)

  (* Instead we have to copy the whole type definition here: *)
  let open (M : sig type t = { foo : int } end) in

  let _ = { foo = 5 } in
  assert (1 = 1)

I find the syntax open (Module_name : sig type type_name = { … } end) unsatisfying, to the extent that I would rather avoid it.

What is the idiomatic approach?

  • Outside a module, work with functions instead of constructors and record fields?
  • Design modules for opening, e.g. by avoiding ambiguous names and operators like = in the example above?
  • Always provide helper functions in addition to constructors and record fields?
  • Always qualify constructors and record fields or use type annotations (see below)?

Am I overlooking an easy approach to this issue? Perhaps it’s not possible to nicely bring specific constructors or record fields into scope? If that is the case, then I probably have to resort to using type annotations or qualification:

-  let _ = { foo = 5 } in
+  let (_ : M.t) = { foo = 5 } in

Or:

-  let _ = { foo = 5 } in
+  let _ = { M.foo = 5 } in

But this is also a bit unsatisfying when I make large use of specific records or fields in my code.

let open M in will bring every record label and constructor defined in M into scope. What exactly did you try?

In my experience, prefixing record labels and constructors with their module is not too heavy, i.e. {M.foo = 5}, since 1) you only need to do it for the first label/constructor, and 2) thanks to type-based disambiguation, within a function, one or two qualifications are enough to disambiguate every use, e.g. let _ = if true then {M.foo = 5} else {foo = 7} typechecks correctly.
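
A minimal sketch of both points, assuming a variant of M with a second field bar (added here only for illustration, as is the function pick):

```ocaml
module M = struct
  type t = { foo : int; bar : bool }

  let ( = ) a b = a.foo = b.foo
end

(* 1) Only the first label carries the M. prefix; bar is resolved from it. *)
let r = { M.foo = 5; bar = true }

(* 2) Once the then-branch has fixed the type to M.t, the else-branch
   needs no qualification at all. *)
let pick flag =
  if flag then { M.foo = 1; bar = false } else { foo = 7; bar = true }

let () =
  (* M was never opened, so ( = ) is still the one from Stdlib. *)
  assert (1 = 1);
  assert (r.M.foo = 5);
  assert ((pick false).M.foo = 7)
```

Since M is never opened, its ( = ) cannot leak into the surrounding code.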

Cheers,
Nicolas

I would like to bring constructors and record fields into scope selectively.

Consider:

module M = struct
  type t = { foo : int }

  let ( = ) a b = a.foo = b.foo
end

let () =
  let open M in

  let _ = { foo = 5 } in
  assert (1 = 1)

This will fail to compile:

File "./demo.ml", line 11, characters 10-11:
11 |   assert (1 = 1)
               ^
Error: The constant 1 has type int but an expression was expected of type M.t

An issue I ran into before, as discussed in the other thread.

So I was curious if I can bring constructors and record fields into scope selectively, i.e. picking which items of M are brought into scope.

The answer is no but to further expand on @nojb’s answer you can delete your let open M and write:

let _ = M.{ foo = 5 } in

The syntax is nice, but comes with a risk:

module M = struct
  type t = { foo : int; bar : bool }

  let ( = ) a b = a.foo = b.foo
end

let () =
  let a = 1 in
  let b = 2 in
  let _ = M.{ foo = 5; bar = (a = b) } in
  ()

The above example won’t compile, again for the same reason as discussed in the other thread.

Another example where this causes problems:

module M = struct
  type t = { foo : int }
end

let () =
  let z = 5 in
  let x = M.{ foo = z } in
  assert (x.foo = z)

That program compiles and runs successfully. However, assume that in a new version of the module, there is a new item:

 module M = struct
   type t = { foo : int }
+  let z = "zero"
 end

Now this would lead to a compiler error. But even worse, consider:

 module M = struct
   type t = { foo : int }
+  let z = 0
 end

Now the program still compiles, but M.{ foo = z } silently picks up M.z instead of the local z, so the assertion fails at runtime.
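
To make the z = 0 scenario concrete, here is a small sketch that puts the two spellings side by side (the names x and y are only for illustration):

```ocaml
module M = struct
  type t = { foo : int }

  let z = 0 (* the item added in the hypothetical new version *)
end

let z = 5

(* Inside M.{ ... } identifiers are looked up in M first, so z is M.z here. *)
let x = M.{ foo = z }

(* Qualifying only the label keeps z bound to the outer definition. *)
let y = { M.foo = z }

let () =
  assert (x.M.foo = 0);
  assert (y.M.foo = z)
```

The qualified-label form is not affected by values added to M later, because only the label is looked up in M.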

I know that a lot of these scenarios are unlikely to happen, but they still can happen, so I’d like to try to maintain some code/namespace hygiene here to minimize those risks. And I’m not sure what’s the best approach.

I am tempted to say that the best approach is simply to qualify your record field: {M.foo = 5}.

Cheers,
Nicolas

Or make your type abstract and provide a constructor and accessors for it.
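
A rough sketch of what that could look like for the toy module; the names make, foo and equal are assumptions chosen for illustration, not something from the thread:

```ocaml
module M : sig
  type t (* abstract: the record fields are not visible outside *)

  val make : foo:int -> t
  val foo : t -> int
  val equal : t -> t -> bool
end = struct
  type t = { foo : int }

  let make ~foo = { foo }
  let foo r = r.foo
  let equal a b = a.foo = b.foo
end

let x = M.make ~foo:5

let () =
  assert (M.foo x = 5);
  assert (M.equal x (M.make ~foo:5))
```

Users then never need a field or constructor in scope, at the cost of losing pattern matching on the record.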

Would that mean that exposing concrete types, including variant constructors (those starting with a capital letter) and record fields, is likely something to only do where there is some “intimacy” between the library and its user?

In that case, one could argue, a user of a module would need to closely follow changes in implementation details anyway, such that using open is less of a risk.

But I’m not sure. My toy example aside, the real-world example I have is a module Relation like this:

(** Equations and inequalities. *)

(** Comparison operators. *)
type comp_op =
  | Leq  (** Left-hand side is less than or equal to right-hand side. *)
  | Eq  (** Left-hand side is equal to right-hand side. *)
  | Geq  (** Left-hand side is greater than or equal to right-hand side. *)

type 'a relation = {
  lhs : 'a array;  (** Coefficients of left-hand side of the relation. *)
  mutable op : comp_op;  (** Comparison operator. *)
  mutable rhs : 'a;  (** Value on right-hand side of the relation.*)
}
(** Relation type (equation or inequality) *)

I would like to be able to write things like:

open Relation

let f { lhs; op; rhs } = …

That works as long as I have control over what’s inside the Relation module and if I know there won’t be any surprising updates.

But I struggle with how to name my type: should I name it relation or t? The former makes it possible to use open in a good way, while the latter would be nicer with qualified names. Where I need qualification and don’t want to use open, I would prefer if I could write:

-let f ({ lhs; op; rhs } : 'a Relation.relation) = lhs
+let f ({ lhs; op; rhs } : 'a Relation.t) = lhs

I feel like both choices (Relation.t and Relation.relation for my type name) aren’t satisfying. I also feel like I need to make a choice when designing modules whether they are meant for opening or not, leading to very different naming schemes.

Note that it sounds like you are also missing the use of a local module alias:

module R = Relation
let f { R.lhs; op; rhs } = ...

which often offers a good compromise between conciseness and explicitness when a module is very often used in a given module.
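
As a self-contained sketch, assuming a Relation module shaped like the one posted above (is_equation and r are hypothetical names):

```ocaml
module Relation = struct
  type comp_op = Leq | Eq | Geq

  type 'a relation = {
    lhs : 'a array;
    mutable op : comp_op;
    mutable rhs : 'a;
  }
end

module R = Relation

(* Qualifying the first field is enough; op and rhs follow from it. *)
let is_equation { R.lhs = _; op; rhs = _ } =
  match op with
  | R.Eq -> true
  | R.Leq | R.Geq -> false

let r = { R.lhs = [| 1; 2 |]; op = R.Leq; rhs = 10 }

let () = assert (not (is_equation r))
```

Nothing from Relation is opened here, so later additions to it cannot shadow local names.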

For your naming issue, I generally define both

module Relation = struct
  type t = ...
  type relation = t
end

whenever I design a module which is intended to be opened in some scope. But yes, globally opening a module (even more so from an external library) introduces a good amount of coupling with the opened module.

Note that you can also selectively re-export a type:

type 'a relation = 'a Relation.t =
  { lhs : 'a array; mutable op: comp_op; mutable rhs:'a }

Oh, so I could do this:

module M = struct
  type t = { foo : int }
  let z = "zero"
end

type t = int

let () =
  let z : t = 5 in
  let module Helper = struct
    type new_type_name = M.t = { foo : int }
  end in
  let open Helper in
  let x = { foo = z } in
  assert (x.foo = z)

Not sure what the take-away is. I feel like the best is to decide whether a module is designed to be used with open or not. (I don’t quite like the approach to define multiple names for the same type just in case.) I also think that when a module is not designed to be used with open and when you can’t control updates in that module, you should (ideally) refrain from using open or M.{…} syntax.

And if you really want to avoid qualifications, you can either use a short module alias as you pointed out above, or create a new module for opening that re-exports the desired items. It’s a bit verbose, but I guess there’s always a way.

I do think, however, the language could provide a bit less friction here, but probably that is not easy to fix (and not sure if there would be other downsides with a different behavior).

For selectively importing types you may consider ppx_import

type%import relation = Relation.t

Idiomatically I would do this:

module M = struct
  type t = { foo : int }
end

let () =
  let z = 5 in
  let x = { M.foo = z } in
  assert (x.foo = z)

Consider:

module A = struct type t = {foo : int} end
module B = struct type t = {foo : float} end
module C = struct
  type t = A.t * B.t

  let f : t -> float = function
    | a, b -> Float.of_int a.foo *. b.foo
end

let () = Printf.printf "%f\n" (C.f ({foo = 2}, {foo = 3.}))

The last line and C.f’s definition work even though none of these modules have been brought into scope. This seems to be fairly new but I’m not sure when it was added exactly. I’d expect some mention like “type-directed disambiguation of constructors and record fields” but that term only comes up elsewhere in the manual.

a.A.foo is another way to minimally access A. I’ve had the same worry about unintended shadowing in M.(lots of code here) and that worry’s encouraged me to make local aliases instead as already suggested, let module L = Longer_name in

let f Relation.{ lhs; op; rhs } = ...

is how I like that, which looks cleaner to me than Relation.lhs, and doesn’t come with the same worry about unintended shadowing from a later update to Relation’s signature.
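
A short sketch of the three forms mentioned in this post, using a hypothetical module Longer_name with two fields:

```ocaml
module Longer_name = struct
  type t = { foo : int; bar : int }
end

(* Qualified field access: nothing else from the module is in scope. *)
let get_foo a = a.Longer_name.foo

(* Local alias: short to write, still explicit about where bar comes from. *)
let sum_bar rs =
  let module L = Longer_name in
  List.fold_left (fun acc r -> acc + r.L.bar) 0 rs

(* Module-qualified record pattern: it only binds foo and bar. *)
let diff Longer_name.{ foo; bar } = foo - bar

let r = { Longer_name.foo = 5; bar = 2 }

let () =
  assert (get_foo r = 5);
  assert (sum_bar [ r; r ] = 4);
  assert (diff r = 3)
```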

After some more experimenting, the following was actually the solution to my initial problem of wanting to “selectively bring constructors or record fields into scope”:
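
The snippet referred to here did not survive in this copy of the thread; it is presumably the type re-export suggested above. A minimal sketch under that assumption, with a Relation module shaped like the one posted earlier (the ( = ) inside it and the function tighten are hypothetical, added only to show what does and does not come into scope):

```ocaml
module Relation = struct
  type comp_op = Leq | Eq | Geq

  type 'a relation = {
    lhs : 'a array;
    mutable op : comp_op;
    mutable rhs : 'a;
  }

  (* Hypothetical item that we do NOT want in scope. *)
  let ( = ) (_ : int) (_ : int) = false
end

(* Re-export each type together with its representation. *)
type comp_op = Relation.comp_op = Leq | Eq | Geq

type 'a relation = 'a Relation.relation = {
  lhs : 'a array;
  mutable op : comp_op;
  mutable rhs : 'a;
}

(* Fields and constructors can now be used unqualified. *)
let tighten r =
  r.op <- Eq;
  r.rhs

let r : int Relation.relation = { lhs = [| 1 |]; op = Leq; rhs = 3 }

let () =
  (* ( = ) is still the Stdlib one, since Relation was never opened. *)
  assert (tighten r = 3);
  assert (r.op = Eq)
```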

Here, the fields lhs, op, and rhs are in scope and act as record fields, i.e. I can do some_record.lhs anywhere in my code, without having to qualify (which was what I wanted to achieve).

Interestingly, it doesn’t seem possible to rename those fields locally, i.e. if there are collisions, you’ll have to resort to qualifying their use again.

The ppx_import preprocessor suggested above may be a shortcut, but I didn’t try it and didn’t want to pull in more dependencies in my case.

Thank you for all your insight.


I find module organization and code hygiene one of the more difficult aspects when learning OCaml, though other languages come with some surprises as well. I wonder if programmers are generally aware that when they write Some_module.(x + y + z) to bring + into scope, an update of Some_module could also (surprisingly) shadow z. This is easy to overlook or forget, I guess, and seems to be a bit dangerous, as demonstrated by the last z = 0 example in my post above.

The current OCaml manual generally warns of these dangers by saying in Chapter 2: The module system:

In particular, opened modules can shadow identifiers present in the current scope, potentially leading to confusing errors: […]

The manual does state that local opening is a “partial” solution to the problem:

A partial solution to this conundrum is to open modules locally, making the components of the module available only in the concerned expression. […]

But what does “partial” really mean? Strictly speaking, it is actually only safe under the following conditions:

  • if either only items of the locally opened module are used, as for example in: Alcotest.(check (list int)) (example taken from here; note that all check, list, and int come from Alcotest here),
  • or if it is unlikely that certain items will be added to the opened module in future versions or if updates are thoroughly monitored or made impossible (by fixing versions).

So my conclusion here is to be more aware of the risks when opening modules (including opening locally).

Warning 44 (disabled by default, so it has to be enabled, e.g. with -w +44) solves this problem:

# module M = struct
  let ( + ) = ( ^ )
  let z = "oops!"
end;;

# let x = "hello, ";;
# let z = "world!";;

# let _ = M.(x + z);;
Line 1, characters 0-1:
Warning 44 [open-shadow-identifier]: this open statement shadows the value identifier + (which is later used)

Line 1, characters 0-1:
Warning 44 [open-shadow-identifier]: this open statement shadows the value identifier z (which is later used)

Hmm, I see, but shadowing some identifiers actually is a pretty nice feature in my opinion, as it allows redefining <, >, =, etc. I also use it to shadow functions like abs when working with generic numbers.

You can also do let ( + ) = M.( + ) and so on.
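
For instance, reusing the module M from the warning example above (greeting and shadowed are hypothetical names):

```ocaml
module M = struct
  let ( + ) = ( ^ )
  let z = "oops!"
end

let x = "hello, "
let z = "world!"

(* Bring in exactly one operator; z stays the local one. *)
let greeting =
  let ( + ) = M.( + ) in
  x + z

(* For comparison: the local open also picks up M.z. *)
let shadowed = M.(x + z)

let () =
  assert (greeting = "hello, world!");
  assert (shadowed = "hello, oops!")
```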

Yes, exactly. In most cases I would like to be in control of what I bring into scope. When it comes to constructors and record fields, however, it wasn’t as easy to find out how it works, but @octachron gave the answer above: re-export the type together with its representation.

I searched the manual, and it seems like there is no example given, but it is mentioned here in the manual:

Re-exported variant type or record type: an equation, a representation.

In this case, the type constructor is defined as an abbreviation for the type expression given in the equation, but in addition the constructors or fields given in the representation remain attached to the defined type constructor. The type expression in the equation part must agree with the representation: it must be of the same kind (record or variant) and have exactly the same constructors or fields, in the same order, with the same arguments. Moreover, the new type constructor must have the same arity and the same type constraints as the original type constructor.

Maybe the syntax isn’t so bad, considering that I explicitly want to bring the constructors/fields into scope, so writing them down seems reasonable.
