Why is OCaml convention to use `type t = ... `

zeroexcuses · May 29, 2023, 10:54am

In Rust, we would write something like:

pub struct FooBar { ... }

in Java, it’d be something like

public class FooBar { ... }

in OCaml, we have:

module FooBar = struct
  type t = ...
end

This seems to make things unnecessarily hard to grep, for the following reasons:

In a large project, I may have 2-5 FooBar, but not too many; so a grep result on FooBar is often correct. In contrast, we have lots of type t’s.
It makes type signatures hard to grep. Again, in Java/Rust, we just grep for ‘FooBar’ and filter out the irrelevant ones. In OCaml, I can’t even grep for FooBar.t because there might have been an open FooBar earlier, so now a plain t is actually FooBar.t.
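
A minimal sketch of the second point, assuming a hypothetical FooBar module with made-up fields and functions: once the module is opened, the signature mentions only t, so a textual search for FooBar.t misses it.

```ocaml
(* Hypothetical module following the convention from the question. *)
module FooBar = struct
  type t = { id : int; label : string }

  let make id label = { id; label }
end

(* Qualified use: a search for "FooBar.t" finds this signature. *)
let describe (x : FooBar.t) : string =
  Printf.sprintf "%d:%s" x.FooBar.id x.FooBar.label

(* After the open, the same type is written as plain [t], so a search
   for "FooBar.t" no longer matches the line below. *)
open FooBar

let rename (x : t) (label : string) : t = { x with label }
```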

In a purely constructive, non-complaining way, I’m really curious: why is the OCaml convention this way? This seems like the absolute worst thing one can do for grep/code-search (especially when OCaml/Merlin is not as good as Rust/IntelliJ or Java/IntelliJ at resolving references, finding all usages, and so on).

Thanks!

EDIT: To phrase the question more constructively: suppose you git clone a 10k+ LOC OCaml project. Given the above issues with searching through the code, how do you grok the code?

nojb · May 29, 2023, 12:15pm

This is a fairly recent convention; as far as I know it was popularized by the Jane Street codebase. Historically, types had meaningful names, except in functor signatures and the like. For example, in the compiler codebase (ocaml/ocaml on GitHub) you will find relatively few types named t.

Because of type inference the type names are not written that much in the source code, so in my experience looking up the definition of a type given its name is relatively infrequent. Rather, I usually find myself looking for a type definition given the name of one of its constructors, or the name of a record label, etc., and these are not usually concerned with the “t” convention.
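
As an illustration of that point, here is a small sketch with a hypothetical shape type. The type name appears only at its definition; the functions are typed entirely by inference from the constructors they match on.

```ocaml
(* Hypothetical type; its name never appears in the functions below. *)
type shape =
  | Circle of float
  | Rect of float * float

(* Inferred as shape -> float purely from the constructors in the patterns. *)
let area = function
  | Circle r -> 3.14159 *. r *. r
  | Rect (w, h) -> w *. h

(* Inferred as shape list -> float; again no annotation is needed. *)
let total_area shapes =
  List.fold_left (fun acc s -> acc +. area s) 0.0 shapes
```

In code like this, searching for a constructor such as Circle leads to the type definition, whatever the type itself is called.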

Cheers,
Nicolas

Release-Candidate · May 29, 2023, 12:48pm

A type name of FooBar.t is used instead of the redundant FooBar.FooBar. So, if your module or type does not fit this pattern, use another name for it instead of just t.

lindig · May 29, 2023, 1:29pm

nojb wrote: “This is a fairly recent convention; as far as I know it was popularized by the Jane Street codebase.”

This is not a new convention. It is already present in Caml Light sources from 29 years ago and is also used in Standard ML. The convention is: when a module defines a single central abstract data type, it is called t such that Fifo.t is how it is used. This avoids names like Fifo.fifo where type and module have the same name.

https://github.com/camllight/camllight/blob/master/sources/src/lib/hashtbl.mli#L5
      
(* Hash tables and hash functions *)

(* Hash tables are hashed association tables, with in-place modification. *)

type ('a, 'b) t;;
        (* The type of hash tables from type ['a] to type ['b]. *)

value new : int -> ('a,'b) t
        (* [new n] creates a new, empty hash table, with initial size [n].
           The table grows as needed, so [n] is just an initial guess.
           Better results are said to be achieved when [n] is a prime
           number. Raise [Invalid_argument "hashtbl__new"] if [n] is
           less than 1. *)

  and clear : ('a, 'b) t -> unit
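
In modern OCaml, the convention described in this post (one central abstract type per module, named t) might look like the following sketch. The Fifo module, its two-list representation, and the first_of helper are assumptions made for illustration; they are not taken from any of the codebases mentioned.

```ocaml
(* Hypothetical Fifo module: one central abstract type, named t. *)
module Fifo : sig
  type 'a t

  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> ('a * 'a t) option
end = struct
  (* Two-list queue: [front] is in order, [back] is reversed. *)
  type 'a t = { front : 'a list; back : 'a list }

  let empty = { front = []; back = [] }

  let push x q = { q with back = x :: q.back }

  let pop q =
    match q.front with
    | x :: front -> Some (x, { q with front })
    | [] -> (
        match List.rev q.back with
        | [] -> None
        | x :: front -> Some (x, { front; back = [] }))
end

(* At use sites the type reads as Fifo.t rather than Fifo.fifo. *)
let first_of (q : int Fifo.t) : int option = Option.map fst (Fifo.pop q)
```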

nojb · May 29, 2023, 1:46pm

You are right, but I was referring to the related convention of defining a module for every single type so that (following the existing convention that you mention) every type ends up being called t. I believe that that convention (one module per type) was popularized by the Jane Street codebase.

Incidentally, whether types of the form Fifo.fifo are a good idea or not depends on whether the module Fifo is expected to be opened or not. Nowadays there is a tendency to avoid opening modules globally, which may also partly explain the shift towards the one-module-per-type style.

Cheers,
Nicolas

lindig · May 29, 2023, 1:51pm

I agree that the Jane Street codebase is disciplined about not opening modules. The curse of many global opens is that the origin of an identifier is difficult to track unless you use an IDE. A local let open provides a middle ground between readability and convenience.
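
The middle ground mentioned here can be sketched as follows. Vec2 and its functions are hypothetical names chosen for the example; the two local-open forms are standard OCaml syntax.

```ocaml
(* Hypothetical module whose short names would be ambiguous if opened globally. *)
module Vec2 = struct
  type t = { x : float; y : float }

  let make x y = { x; y }
  let add a b = { x = a.x +. b.x; y = a.y +. b.y }
  let scale k v = { x = k *. v.x; y = k *. v.y }
end

(* Fully qualified: every identifier shows its origin, at the cost of noise. *)
let midpoint_qualified a b = Vec2.scale 0.5 (Vec2.add a b)

(* Local open with [let open]: unqualified names are limited to this body. *)
let midpoint_let_open a b =
  let open Vec2 in
  scale 0.5 (add a b)

(* The [M.( ... )] form scopes the open to a single expression. *)
let midpoint_paren a b = Vec2.(scale 0.5 (add a b))
```

All three functions compute the same value; they differ only in how far the unqualified names reach.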

Frederic_Loyer · May 29, 2023, 1:56pm

It is not only an OCaml convention. Modula-3 uses it too (with a capital T).

yawaramin · May 29, 2023, 3:02pm

This is just my opinion, but modules which are designed to be opened usually don’t have types named identically to the module name; they have types with different names.

Frederic_Loyer · May 29, 2023, 3:53pm

I try to avoid open when possible. The exception is Jane Street modules, since Base or Core are meant to be used instead of Stdlib. I could also make a local exception for libraries which provide operators (>>= for example). But I guess that libraries which declare a type t are not meant to be opened: opening two of them causes a conflict, and we can no longer refer to the first t easily.
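
A small sketch of that conflict, using two hypothetical modules (Apple and Pear) that both follow the type t convention:

```ocaml
(* Two hypothetical modules, each with a central type [t] and a [make]. *)
module Apple = struct
  type t = { weight : int }

  let make weight = { weight }
end

module Pear = struct
  type t = { ripeness : int }

  let make ripeness = { ripeness }
end

open Apple
open Pear

(* Here [t] and [make] mean Pear.t and Pear.make: the second open
   shadows the names brought in by the first. *)
let p : t = make 3

(* The Apple versions are now reachable only through the qualified path. *)
let a : Apple.t = Apple.make 150
```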

K_N · May 29, 2023, 4:32pm

One thing that the other languages that you cite do not have is the mechanism of functors. E.g. if you look at Map.Make, it takes as argument a module containing a type and a comparison function. For this to work, the type has to be named and, by convention, it seems everyone settled on type t = ..., which allows you to simply write Set.Make(Foobar) instead of

Set.Make(struct
  include Foobar
  type t = foobar
  let compare = compare_foobar
end)
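
For contrast, here is a sketch of the short form. It assumes a hypothetical Foobar module that already follows the convention; the record fields, the Foobar_set name, and the ids helper are made up for the example. Set.Make and Set.OrderedType are from the standard library.

```ocaml
(* Hypothetical module following the convention: the type is [t] and the
   ordering is [compare], which is what Set.OrderedType asks for. *)
module Foobar = struct
  type t = { id : int; name : string }

  (* Orders by [id] only, so two values with the same id count as equal. *)
  let compare a b = Int.compare a.id b.id
end

(* No adapter struct is needed. *)
module Foobar_set = Set.Make (Foobar)

(* Elements come back in increasing order according to [Foobar.compare]. *)
let ids set = List.map (fun f -> f.Foobar.id) (Foobar_set.elements set)
```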

With respect to grepping through code, I usually look for value names (especially functions) rather than types, since types don’t often occur in code (because of type inference). If I remember correctly, ocp-grep, part of the ocp-index opam package, is able to find qualified identifiers even in contexts where they are unqualified. For instance, from the root of the OCaml compiler sources you can just do:

$ ocp-grep Format.fprintf .

And find all uses of fprintf (from the Format module) in the sources of the compiler, both qualified and unqualified.

benjamin-thomas · May 29, 2023, 6:44pm

Just for the interest of the discussion, I’ll note that I’ve come across this naming convention with Elixir’s “quite imperfect” type specification system (you type your dynamic program after the fact)

defmodule Dictionary.Impl.WordList do
  @type t :: list(String.t())

  @spec word_list() :: t()
  def word_list() do
    "../../assets/words.txt"
    |> Path.expand(__DIR__)
    |> File.read!()
    |> String.split("\n", trim: true)
  end

  @spec random_word(t()) :: String.t()
  def random_word(word_list) do
    word_list
    |> Enum.random()
  end
end

Chet_Murthy · May 29, 2023, 10:26pm

I would take exception to calling it a result of some Jane Street convention: I certainly didn’t learn it from Jane Street code, since I rarely, if at all, use their libraries. Instead, I started doing it because in complex code, you cannot be sure that a datatype will not have constructor names that conflict with those of other datatypes. And this is something specific to OCaml, since in Standard ML you can just use casts to get around this problem. Combine this with heavy use of functors that produce many (seeming) “copies” of the same type, and using module paths to disambiguate constructors and type names becomes instinctive.
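
A minimal sketch of the constructor-name problem described here, with two hypothetical modules (Json and Sql) whose central types share the constructors Null and Int:

```ocaml
(* Two hypothetical modules whose central types share constructor names. *)
module Json = struct
  type t = Null | Int of int | String of string
end

module Sql = struct
  type t = Null | Int of int | Text of string
end

(* The module path says which [Null] or [Int] is meant, in expressions
   and in patterns alike. *)
let json_null = Json.Null
let sql_null = Sql.Null

let sql_of_json : Json.t -> Sql.t = function
  | Json.Null -> Sql.Null
  | Json.Int n -> Sql.Int n
  | Json.String s -> Sql.Text s
```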

lambda_foo · May 31, 2023, 5:22am

For the OCaml-specific pieces I’d set up an opam switch and use merlin/ocaml-lsp-server. The more general advice on reading an unfamiliar codebase deserves a long response. The OCaml navigation via the shapes support added in 4.14 is solid. I do want a usable OCaml-aware search tool; grep is the worst choice for this, but sometimes the only one you have.

I don’t see the OCaml convention of FooBar.t as any different from looking for type class instances in Haskell or traits in Rust and Scala. You need to know the language-specific pieces.

zeroexcuses · May 31, 2023, 5:27am

This could just be a fault of my OCaml / Neovim setup. The difference here, compared to Rust / Scala is that the Rust / Scala IntelliJ plugins are rock solid – they have never failed [1] me on goto-def or find-all-refs. In contrast, neovim/lsp/merlin setup is quite finicky for me.

[1] The one exception to this: sometimes Rust struggles with macro rules that take too long to expand.

EDIT: In particular, in Rust/Scala, I almost never reach for grep. In OCaml, I have a terminal dedicated to rg.

lambda_foo · May 31, 2023, 5:39am

As an Emacs user it pains me to say this, but the VSCode integration for OCaml is the most well-maintained editor integration. Whether or not something works there tells you whether it is possible at all.

Using Emacs, I often find that the lsp-mode bindings for OCaml lack some polish, and I need to go look for the right Lisp function or defcustom to get the equivalent behaviour. For example, the VSCode integration gives you the mli documentation when you hover over the corresponding value in the ml file.
