Why does the order of fields in a record type definition matter?

OCaml made the decision to use user-written interface files both as high level specifications that constrain implementations and as low level interface definitions between compilation units.

Indeed, that was the crucial design decision that unlocked efficient compilation with lots of optimization opportunities, while combining it with the elegant module language and its abstraction powers.

That’s very interesting! I didn’t know that. Is this story written down in any more detail anywhere?

For a start, it’s worth reading Xavier Leroy’s master’s thesis: https://xavierleroy.org/publi/ZINC.pdf

I have a distinct memory (which might be false, b/c it was >30yr ago) of reading it and getting the message:

“I’ve told you this long and complicated theoretical story, but really, I just wanted to make it run like C as much as I could”

At the time (1992), there was an ML called SML/NJ that was … well, it was a massive memory hog, b/c it chose to do the mathematically correct thing a few too many times. Like, left-to-right call-by-value evaluation. Heap-allocated closures like crazy. You had to run it on the biggest desktops you had. Caml-light came along, and because of -judicious- implementation choices (which were reflected in the semantics you had to know to reason about your programs), it could run on … basically anything in the office.

At INRIA in 1992 in Batiment 11 there were two color Sparc-10 machines (on the desks of the group leaders). They were basically the only machines on which SML/NJ ran well. Whereas, caml-light ran on every machine in the lab, including old SUN-2 machines (68010, yeah baby!), Sony NeWS stations, and on and on. Sure, caml-light was a bit slower. But it fit in constrained memory, and that was a big, big deal. Coq in caml-light ran on every one of those machines. Coq … ported to SML/NJ (I did the port, only to realize the error of my ways) could only run on those two Sparc-10s, b/c on any other machine, it spent all its time paging, paging, paging.

With the passage of time OCaml came along, with a native-code compiler, etc, but the initial … “attitude” of caml-light, that it was designed to fit the hardware of the day, was critical to its success. And you can see it in the world: SML/NJ … where is it? I haven’t seen it running in the wild in decades.

As an aside, I’d note that one of the seminal grandfather languages of computer science, ALGOL-60, was call-by-name. It was theoretically lovely, but … well, there’s a reason that the languages that achieved real longevity are all call-by-value languages (and Haskell simulates call-by-value with its monads).

Theoretical beauty and five bucks’ll get you a double-espresso at Peet’s. But if you want to get stuff done, you’re going to have to make compromises.

ETA: BTW, caml-light’s officially-unspecified, but in practice right-to-left, evaluation order had real consequences. Coq (for instance) had several places where code either didn’t work, or was inexplicably exponential-time, when ported to caml-light from “Caml Heavy” (the old Caml that ran on LeLisp). Problems like that came up from time to time when people were porting code, or when they were writing code naively assuming (as I did, as everybody did) that evaluation was left-to-right. So these low-level choices for the sake of tractable implementation were reflected in every level of reasoning about the programs we wrote and ran on caml-light.


BTW, these sorts of stories of which design decisions were critical for performance, or memory efficiency, or whatever, are pretty much unpublished in my experience. B/c to publish them would involve drawing comparisons with other implementations, other designs. And that’s going to lead to contention. So nobody points out that somebody else’s baby is ugly.

Viewing a record as a set (or list) of fields corresponds to the structural view of records. OCaml records are not structural, they are nominal, meaning that a record type is identified neither by a set nor a list of fields, but really by its name. The compatibility check between interface and implementation can be arbitrary as long as it is sound. It would be trivial to extend the compatibility check to allow reordering of fields in type definitions (e.g. the typechecker could just reorder fields into some canonical order before doing anything else), without sacrificing efficiency. But as you noticed, for practical reasons and conceptual simplicity this is not done (just to mention one: it would be more complicated to write FFI bindings if fields were reordered by the compiler).
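A minimal sketch of the nominal view (the type and field names here are invented for illustration):

```ocaml
(* Two record types with identical fields are still distinct types,
   because OCaml records are nominal: identity comes from the name,
   not from the field set. *)
type celsius = { temp : float }
type fahrenheit = { temp : float }

(* Type-directed disambiguation picks the right label from the
   annotation. *)
let freezing : celsius = { temp = 0.0 }

(* This would be a type error, despite the identical field layout:
   let f : fahrenheit = freezing *)

let () = assert (freezing.temp = 0.0)
```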

Incidentally, structural records will be arriving in OCaml soon-ish, following https://www.youtube.com/watch?v=WM7ZVne8eQE.

Cheers,
Nicolas


Maybe a more mathematical way to explain why order matters for records is to consider that a record declaration is a pair of a tuple type and a labelling of the tuple indices, in other words

type t = { foo: int; bar:float; baz: string }

could be expressed in fantasy syntax as

type t = (int * float * string) with 0 ⇔ "foo" | 1 ⇔ "bar" | 2 ⇔ "baz" 

Labels in this view are then merely nicer names for the indices of the tuple.
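This “labels are indices” view can be made concrete with the unsafe `Obj` module, which lets you peek at the runtime block. This is a sketch for illustration only, not something to do in real code:

```ocaml
type t = { foo : int; bar : float; baz : string }

let r = { foo = 1; bar = 2.0; baz = "x" }

(* Because the record mixes float and non-float fields, it is an
   ordinary block: each field sits at its declaration index
   (foo at 0, bar at 1, baz at 2). *)
let () =
  assert ((Obj.obj (Obj.field (Obj.repr r) 0) : int) = 1);
  assert ((Obj.obj (Obj.field (Obj.repr r) 2) : string) = "x")
```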


Yes, although those won’t allow reordering, either: bar:int * baz:string is not compatible with baz:string * bar:int

This and some other issues (or I should say friction, not really issues) raised before could be mitigated by allowing types to be specified only once, either in the implementation or the interface, with the other end generated accordingly, without needing duplication. From implementation to interface is already standard behavior, and a limited form of generating in the other direction already exists for when we’re defining recursive modules. Plus, there’s ppx_import.

Then if both directions are allowed language-wide, runtime considerations like field/variant order become transparent to the user here.

OCaml could also just silently (or with a warning) reorder impl fields to match sig declaration, like it does with modules (which are also records at runtime). I wonder why an error was chosen instead.


If order matters (i.e., a given field is at a given position) and the compiler silently reorders fields, then it means that you can no longer write foreign function interfaces. If you care about interfacing with other languages, you need to choose one or the other. Either the order of fields does not matter (e.g., OCaml objects) but performance is worse, or the order matters and the compiler is not allowed to reorder them.
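To illustrate what an FFI stub relies on (a hypothetical example; the `Obj` peeking here stands in for what a C stub would do with the `Field(v, i)` macro):

```ocaml
type point = { x : int; y : int }

let p = { x = 1; y = 2 }

(* A C stub reading this record would use Field(p, 0) for x and
   Field(p, 1) for y. If the compiler silently reordered the fields,
   the stub would silently read the wrong one, with no type error
   anywhere. *)
let () =
  assert ((Obj.obj (Obj.field (Obj.repr p) 0) : int) = 1);
  assert ((Obj.obj (Obj.field (Obj.repr p) 1) : int) = 2)
```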


To be clear, the only way for the compiler to silently reorder fields is if you have a mismatched declaration of a type in both the interface and the implementation. That is, you manually wrote both. By the time you’re doing FFI, you are already concerned with internal implementation details, and a simple rule in the FFI section of the manual (“the order of fields/variants in a declared or generated interface file or signature is the single canonical order at runtime”) should suffice. This would be no different a programmer error than changing the fields of a type but forgetting to update the stubs which access it.

That seems like a very restrictive view of the type system and module system of OCaml. Consider the following example:

module type T = sig
  type t = { a: int; b: int }
end

module M (T: T) = struct
end

module U : sig
  type t = { b: int; a: int }
end = struct
  type t = { b: int; a: int }
end

module N = M(U)

Let us assume that, as you propose, the very last line is not a hard error. According to your suggestion, what is the order of the fields of the type t in both modules M and U? (They have to agree, since there is no compile error.)

I see what you mean, and I can’t say the types are distinct because ocaml functors don’t make them distinct and that’s a more integral part of the module language design.

If you include T.t in M’s definition, the resulting type of its application would be equal to U.t and, in the future, Whatever.t. Suppose I say module types are a suggestion, whereas signatures/interfaces attached to implementations are meaningful (for example, is an unboxed annotation effective in a module type, or only when it’s attached to an implementation?*). Then the canonical representation would be the one specified in U.t’s interface, and the language would have to accept that U.t and Whatever.t are equal types, by reordering the declaration in T to match each of them at their respective functor applications. That means two types which differ in runtime representation would be accepted by the type checker as equal. That’s unsound. If going the other way and saying the canonical order is the one specified by T, then a function which accepts N.t should still accept U.t, which would also be unsound.

And going back to the module system, although some reordering happens, in this case it still rejects the module types when they mismatch in module field order. I’m talking about substituting records for fcms in your example. I can see why this isn’t a good suggestion now, compared to the removal of declaration redundancy one.

* Edit: even unboxed annotations aren’t silently ignored when they’re specified in a module type; they also produce a mismatch error. Consider:

module type TransparentBox = sig
  type 'a t = Box of 'a [@@unboxed]
end
module AllocBox = struct
  type 'a t = Box of 'a
end
module F (B : TransparentBox) = struct
  type 'a t = 'a B.t
end
module Fail = F(AllocBox)

I wouldn’t attribute too much of ocaml’s compilation performance to this without concrete evidence to back it up though.
I just tried it in one of my projects and deleting all the .mli files (which should make it behave roughly like GHC) has no noticeable impact on compile times (it takes ~5.5 seconds in both cases after a dune clean).
Different setups might benefit from it more of course, but is that really worth leaking implementation details like this into what is supposed to be an abstract specification?
In the comment you linked, Xavier Leroy specifically mentions that he likes how mli files let you abstractly specify a module’s interface without having to care about implementation details like the order of declarations. This exactly undermines that benefit for an (at least in the case I measured) quite minor performance improvement.

The speed benefit is mostly for incremental builds, not clean builds. See [PROPOSAL] Support -opaque mode for faster compilation · Issue #1058 · ocaml/dune · GitHub

I don’t understand how being able to reorder signature item declarations differently from structure items undermines the benefit :man_shrugging: