This seems to make things unnecessarily hard to grep, for the following reasons:
In a large project, I may have 2-5 FooBar, but not too many; so a grep result on FooBar is often correct. In contrast, we have lots of type t’s
It makes type signatures hard to grep. Again, in Java/Rust, we just grep for ‘FooBar’ and filter out the irrelevant hits. In OCaml, I can’t even grep for FooBar.t, because there might have been an open FooBar earlier, so a bare t is actually FooBar.t
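To make the second point concrete, here is a minimal sketch. The FooBar module and both functions are made up for illustration:

```ocaml
(* Hypothetical module; only here to illustrate the search problem. *)
module FooBar = struct
  type t = { id : int }
  let make id = { id }
end

(* Qualified: a text search for "FooBar.t" finds this signature. *)
let id_qualified (x : FooBar.t) = x.FooBar.id

open FooBar

(* After the open, the same type is written as a bare t,
   so a text search for "FooBar.t" misses this signature. *)
let id_unqualified (x : t) = x.id
```

Both functions take the same type, but only the first one shows up in a search for FooBar.t.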
In a purely constructive, non-complaining way, I’m really curious: why is the OCaml convention this way? This seems like the absolute worst thing one can do for grep/code search (especially when OCaml/Merlin is not as good as Rust/IntelliJ or Java/IntelliJ at resolving references, finding all usages, …).
Thanks!
EDIT: To phrase the question more constructively; you git clone a 10k+ LOC OCaml project. Given the above issues with searching through the code, how do you grok the code ?
This is a fairly recent convention; as far as I know it was popularized by the Jane Street codebase. Historically, types had meaningful names, except in functor signatures and the like. For example, in the compiler codebase on GitHub (ocaml/ocaml: The core OCaml system: compilers, runtime system, base libraries) you will find relatively few types named t.
Because of type inference the type names are not written that much in the source code, so in my experience looking up the definition of a type given its name is relatively infrequent. Rather, I usually find myself looking for a type definition given the name of one of its constructors, or the name of a record label, etc., and these are not usually concerned with the “t” convention.
A type name of FooBar.t is used instead of the redundant FooBar.FooBar. So, if your module or type does not fit this pattern, use another name for it instead of just t.
This is a fairly recent convention; as far as I know it was popularized by Jane Street codebase.
This is not a new convention. It is already present in Caml Light sources from 29 years ago and is also used in Standard ML. The convention is: when a module defines a single central abstract data type, it is called t such that Fifo.t is how it is used. This avoids names like Fifo.fifo where type and module have the same name.
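A minimal sketch of that convention, assuming a made-up Fifo module (the two-list implementation is just one possible choice, not taken from any real library):

```ocaml
(* Hypothetical Fifo module: its single central abstract type is named t. *)
module Fifo : sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> ('a * 'a t) option
end = struct
  (* Two-list queue: pop from front, push onto back (stored reversed). *)
  type 'a t = { front : 'a list; back : 'a list }

  let empty = { front = []; back = [] }

  let push x q = { q with back = x :: q.back }

  let pop q =
    match q.front with
    | x :: front -> Some (x, { q with front })
    | [] ->
      (match List.rev q.back with
       | [] -> None
       | x :: front -> Some (x, { front; back = [] }))
end

(* Users write Fifo.t, not Fifo.fifo. *)
let q : int Fifo.t = Fifo.(empty |> push 1 |> push 2)
```

From the outside the type reads as int Fifo.t, which is the whole point of calling it t.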
You are right, but I was referring to the related convention of defining a module for every single type so that (following the existing convention that you mention) every type ends up being called t. I believe that that convention (one module per type) was popularized by the Jane Street codebase.
Incidentally, whether types of the form Fifo.fifo are a good idea or not depends on whether the module Fifo is expected to be opened or not. Nowadays there is a tendency to avoid opening modules globally, which may also partly explain the shift towards the one-module-per-type style.
I agree that the Jane Street code base is disciplined about not opening modules. The curse of many global opens is that the origin of an identifier is difficult to track without an IDE. A local let open provides a middle ground between readability and convenience.
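For readers who have not seen the local forms, a small sketch (Vec2 is a hypothetical module, used only for illustration):

```ocaml
(* Hypothetical module, for illustration only. *)
module Vec2 = struct
  type t = { x : float; y : float }
  let make x y = { x; y }
  let add a b = { x = a.x +. b.x; y = a.y +. b.y }
end

(* Local open: Vec2's names are unqualified only inside this expression. *)
let sum1 =
  let open Vec2 in
  add (make 1. 2.) (make 3. 4.)

(* The same thing with the Vec2.( ... ) expression form. *)
let sum2 = Vec2.(add (make 1. 2.) (make 3. 4.))
```

Outside those two expressions, add and make still have to be written as Vec2.add and Vec2.make, so their origin stays visible to a plain text search.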
This is just my opinion, but modules which are designed to be opened usually don’t have types named identically to the module name, they have types with different names.
I try to avoid open when possible. The exception is Jane Street modules, since Base or Core are meant to be used instead of Stdlib. I could also make a local exception for libraries that provide operators (>>= for example). But I guess that libraries that declare a type t are not meant to be opened: opening two of them causes a conflict, and we can no longer easily refer to the first one.
One thing that the other languages that you cite do not have is the mechanism of functors. E.g. if you look at Map.Make, it takes as argument a module containing a type and a comparison function. For this to work, the type has to be named and, by convention, it seems everyone settled on type t = ..., which allows you to simply write Set.Make(Foobar) instead of
Set.Make(struct
  include Foobar
  type t = foobar
  let compare = compare_foobar
end)
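For comparison, here is a sketch of the short form. The Foobar module and its fields are hypothetical; the only assumption is that it exposes a type t and a compare function, which is what Set.Make expects:

```ocaml
(* Hypothetical Foobar module: the main type is named t and a compare
   function is provided, so the module matches Set.OrderedType as is. *)
module Foobar = struct
  type t = { id : int; name : string }
  let compare a b = Int.compare a.id b.id
end

(* No wrapper struct is needed. *)
module FoobarSet = Set.Make (Foobar)

let s =
  FoobarSet.empty
  |> FoobarSet.add { Foobar.id = 1; name = "a" }
  |> FoobarSet.add { Foobar.id = 2; name = "b" }
  |> FoobarSet.add { Foobar.id = 1; name = "same id as the first" }
```

Since compare here only looks at id, the third element is treated as equal to the first and the set ends up with two elements.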
With respect to grepping through code, I usually look for value names (especially functions) rather than types, since types don’t often occur in code (because of type inference). If I remember correctly, ocp-grep, part of the ocp-index opam package, is able to find qualified identifiers even in contexts where they are written unqualified. For instance, from the root of the OCaml compiler sources you can just do:
$ ocp-grep Format.fprintf .
And find all uses of fprintf (from the Format module) in the sources of the compiler, both qualified and unqualified.
Just for the interest of the discussion, I’ll note that I’ve come across this naming convention with Elixir’s “quite imperfect” type specification system (you type your dynamic program after the fact)
defmodule Dictionary.Impl.WordList do
  @type t :: list(String.t())

  @spec word_list() :: t()
  def word_list() do
    "../../assets/words.txt"
    |> Path.expand(__DIR__)
    |> File.read!()
    |> String.split("\n", trim: true)
  end

  @spec random_word(t()) :: String.t()
  def random_word(word_list) do
    word_list
    |> Enum.random()
  end
end
I would take exception to calling it a result of some Jane Street convention: I certainly didn’t learn it from Jane Street code, since I rarely, if at all, use their libraries. Instead, I started doing it because in complex code, you cannot be sure that a datatype will not have constructor names that conflict with those of other datatypes. And this is something specific to OCaml, since in Standard ML you can just use casts to get around this problem. Combine this with heavy use of functors that produce many (seeming) “copies” of the same type, and using module paths to disambiguate constructors and type names becomes instinctive.
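A small sketch of what I mean. Token and Value are hypothetical modules whose types happen to share a constructor name:

```ocaml
(* Hypothetical modules: each wraps one type t, and both reuse the
   constructor name Int. *)
module Token = struct
  type t = Ident of string | Int of int
end

module Value = struct
  type t = Int of int | Str of string
end

(* The module path says which Int is meant. *)
let tok : Token.t = Token.Int 1
let v : Value.t = Value.Int 1

let describe_token (t : Token.t) =
  match t with
  | Token.Ident s -> "ident " ^ s
  | Token.Int n -> "int " ^ string_of_int n
```

Had both types been defined in the same module, the later Int would shadow the earlier one and you would be relying on type annotations to pick the right one.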
For the OCaml-specific pieces I’d set up an opam switch and use merlin/ocaml-lsp-server. The more general advice on reading an unfamiliar codebase deserves a long response. The OCaml navigation via the shapes support added in 4.14 is solid. I do want a usable OCaml-aware search tool; grep is the worst choice for that, but sometimes it is the only one you have.
I don’t see the OCaml convention of FooBar.t any different to looking for type class instances in Haskell or traits in Rust and Scala. You need to know the language specific pieces.
This could just be a fault of my OCaml / Neovim setup. The difference here, compared to Rust / Scala, is that the Rust / Scala IntelliJ plugins are rock solid; they have never failed [1] me on goto-def or find-all-refs. In contrast, my Neovim/LSP/Merlin setup is quite finicky for me.
[1] The one exception to this: sometimes Rust struggles with macro rules that take too long to expand.
EDIT: In particular, in Rust/Scala, I almost never reach for grep. In OCaml, I have a terminal dedicated to rg.
As an Emacs user it pains me to say this, but the VSCode integration for OCaml is the best-maintained editor integration. Whether or not something works there tells you whether it is possible at all.
Using Emacs, I often find the lsp-mode bindings for OCaml lack some polish, and I need to go look for the right Lisp function or defcustom to get the equivalent behaviour. For example, the VSCode integration gives you the mli documentation when you hover over the corresponding value in the ml file.