I’m reading source code of OCaml projects and sometimes I see some interesting syntax overloading features.
For example, from the ocaml-non-empty-list package, I learned that you can overload the list literal syntax by naming your constructor ( :: ):
(** The non-empty list type. The use of the [( :: )] infix constructor
allows the usage of the built-in list syntactic sugar provided by OCaml.
For example, a singleton is given by [ [1] ]. A list containing 2 elements is given by
[ [1; 2] ]. *)
type 'a t = ( :: ) of 'a * 'a list [@@deriving eq, ord, show]
But for the love of my life, I can’t find any documentation about this syntax. My Google skills fail me on this one.
Could you help me with some links?
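For what it's worth, here is a minimal sketch of how the type quoted above can be used with list literals. The module name Non_empty and the helpers head, length and to_list are made up for illustration and are not taken from the package.

```ocaml
(* Hypothetical module wrapping the quoted type; the names are illustrative. *)
module Non_empty = struct
  (* The only constructor is ( :: ), so a value always has a head. *)
  type 'a t = ( :: ) of 'a * 'a list

  (* Inside this module, the pattern ( :: ) refers to the constructor above. *)
  let head (x :: _) = x
  let length (_ :: xs) = 1 + List.length xs
  let to_list (x :: xs) : 'a list = List.cons x xs
end

(* The type annotation tells the compiler which ( :: ) the literal uses. *)
let one : int Non_empty.t = [ 1 ]
let three : int Non_empty.t = [ 1; 2; 3 ]

let () = assert (Non_empty.head one = 1)
let () = assert (Non_empty.length three = 3)
```

The tail of the literal is still an ordinary list, because the second argument of the constructor has type 'a list and the inner ( :: ) is resolved from that expected type.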
Other syntax overloading features I found:
- and* (and more general
- Overloading array indexing operators
Would love to gather more links!
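As a rough illustration of the two features listed above, here is a small sketch of binding operators (let* and and*) for the option type and of a user-defined indexing operator. It assumes OCaml 4.08 or later; the function add_opt and the choice of .%[] on hash tables are invented for the example.

```ocaml
(* Binding operators for option: let* sequences, and* pairs two options. *)
let ( let* ) o f = Option.bind o f

let ( and* ) a b =
  match a, b with
  | Some x, Some y -> Some (x, y)
  | _ -> None

(* Hypothetical helper: adds two optional integers. *)
let add_opt a b =
  let* x = a
  and* y = b in
  Some (x + y)

(* User-defined indexing operators: tbl.%[key] and tbl.%[key] <- v. *)
let ( .%[] ) tbl key = Hashtbl.find_opt tbl key
let ( .%[]<- ) tbl key v = Hashtbl.replace tbl key v

let () =
  let tbl = Hashtbl.create 8 in
  tbl.%["answer"] <- 42;
  assert (tbl.%["answer"] = Some 42);
  assert (tbl.%["missing"] = None);
  assert (add_opt (Some 1) (Some 2) = Some 3)
```

Both are plain definitions of specially named identifiers, so they follow the usual scoping rules.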
I believe the place (which you already found) to look for such details is the OCaml manual. See here.
I wouldn’t say that the example demonstrates overloading. It simply redefines the (::) constructor. Although with type based disambiguation things are perhaps a bit more nuanced.
I also often use this table from the manual to check the relative precedence of operators. It also helps when I want to (re)define an operator: the leading characters of an operator determine its precedence and associativity.
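A quick sketch of that last point, with deliberately misleading operator names chosen only for the demonstration: the operator that starts with * binds tighter than the one that starts with +, whatever the functions actually compute.

```ocaml
(* Starts with '+', so it has the precedence of +, although it multiplies. *)
let ( +* ) a b = a * b

(* Starts with '*', so it has the precedence of *, although it adds. *)
let ( *+ ) a b = a + b

(* Parsed as (1 *+ 2) +* 3, that is (1 + 2) * 3. *)
let () = assert (1 *+ 2 +* 3 = 9)

(* Parsed as 1 +* (2 *+ 3), that is 1 * (2 + 3). *)
let () = assert (1 +* 2 *+ 3 = 5)
```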
As @polytypic said, this is not overloading, but a mere (re)binding of identifiers. Operator-like identifiers just happen to be treated differently by the parser.
You can also give operator-like names to type constructors, (edit: not really, see below) in which case there will be some form of type-based disambiguation, true, but I wouldn’t call it overloading. I find it confusing as it brings to mind C++ where you can overload any function, which is definitely not the case in OCaml.
I see. It makes sense.
I called it “syntax overloading” because in my understanding, the meaning of a list literal like [1; 2; 3] depends on the (::) in scope and type, so in some sense you get an overloaded list literal (which is part of the syntax).
Still, I wonder: is there only (::) for lists, or is there something similar for tuples and other built-in syntax? And where is this particular thing documented?
It’s also fine if the documentation is not complete, but I wanted to ask first if anyone knows of any available resources.
I have to backtrack on what I said:
Actually, you can only redefine a few existing constructors. I have not found an official list, but the type constructors that you can shadow seem to be: [], (), (::), true and false. Yes, you can redefine all those, so apart from some arguably legitimate uses for (::), there is potential for obfuscation.
# type void = | (* Empty type *);;
# type _ tuple = () : void tuple | (::) : 'a * 'b -> ('a * 'b) tuple;;
type _ tuple = () : void tuple | (::) : 'a * 'b -> ('a * 'b) tuple
# let a :: b :: () = if true then 1 :: 2 :: () else assert false;;
val a : int = 1
val b : int = 2
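One use of this that is often considered legitimate is a heterogeneous list, where [] and (::) are redefined as GADT constructors so that the usual list literal syntax can be reused. The following is only a sketch; the module name Hlist and the function length are chosen for illustration.

```ocaml
(* Hypothetical heterogeneous list: the type index records each element type. *)
module Hlist = struct
  type _ t =
    | [] : unit t
    | ( :: ) : 'a * 'b t -> ('a * 'b) t

  (* The explicit polymorphic annotation is needed for recursion over a GADT. *)
  let rec length : type a. a t -> int = function
    | [] -> 0
    | _ :: rest -> 1 + length rest
end

(* The local open makes the literal use Hlist's [] and ( :: ). *)
let mixed = Hlist.[ 1; "two"; 3.0 ]
let () = assert (Hlist.length mixed = 3)

(* Patterns work too, and each variable gets its own type. *)
let Hlist.(n :: s :: _) = mixed
let () = assert (n = 1 && s = "two")
```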
All other operator-like identifiers cannot be used as constructors, so although you can use them to construct values, you can’t use them in patterns:
# let ( $-$ ) x y = x :: y;;
val ( $-$ ) : 'a -> 'b -> ('a * 'b) tuple = <fun>
# 42 $-$ 12 $-$ ();;
- : ((int * int) tuple * void tuple) tuple = (::) ((::) (42, 12), ())
The one not yet mentioned in the discussion is that you can redefine .<- as long as you do it inside a module named String, AFAIR. There’s a thread here on Discuss on how such valuable lexical estate is wasted on strings, but I can’t find it.
.<- is not available anymore: Syntaxic sugar: String.set → Bytes.set? - Learning - OCaml
The list, unit, and other built-in types are documented in the OCaml Manual chapter on the core library (not Jane Street Core, OCaml core): OCaml - The core library
The documentation update we’ve started releasing includes a new tutorial on operators:
I’ve tried summarizing and grouping what’s in the reference manual and a couple of blog posts, in a readable form. Alas, it doesn’t cover the case of reusing the list syntactic sugar yet. I’d be super happy if somebody could contribute some text. As far as I could find, documentation is available on unary operators, binary operators, indexing operators and custom binders. But I couldn’t find anything about what you mentioned. Interestingly, if you ask ChatGPT about that, it goes on a bad trip, which suggests it wasn’t fed anything on that matter, probably because there isn’t anything available.
Also, I searched https://sherlocode.com/ with something like this:
[^a-z_]type .*=.*( [^A-Z]
It seems to indicate the list syntax is the only one we can play with (although this regexp needs to be improved)
P.S. The tutorial is brand new; any feedback would be appreciated
This is not overloading; it is shadowing.
With overloading, it would be possible to use both versions of ( :: ) in the same scope, and the “right” one would be picked based on type. But once you have defined your own operator-constructor, it masks the previously available one, unless you use a type annotation:
# type ('a, 'b) foo = ( :: ) of 'a * 'b;;
type ('a, 'b) foo = (::) of 'a * 'b
# let bar = function x :: y -> Some (x, y);;
val bar : ('a, 'b) foo -> ('a * 'b) option = <fun>
# let tail = function ((_ :: u) : 'a list) -> Some u | _ -> None;;
val tail : 'a list -> 'a list option = <fun>
@cuihtlauac Thanks a lot for this follow-up!
Indeed, writing a blog post with some info on operators is nice. But improving the official documentation would be even more awesome!
A clarification: what we’ve published is not a blog post; it is a tutorial that is part of the official documentation.
The manual provides authoritative reference information. Tutorials are for learners. That’s two different use cases. I agree something seems to be missing in the manual. But something else was also missing for learners.
Type based disambiguation makes things a bit more interesting:
OCaml version 4.14.0
# module Foo = struct type t = (::) of int * int end ;;
module Foo : sig type t = (::) of int * int end
# let eat_list xs = List.length xs ;;
val eat_list : 'a list -> int = <fun>
# let eat_foo ((x :: y): Foo.t) = x + y ;;
val eat_foo : Foo.t -> int = <fun>
# eat_foo (1::2) ;;
- : int = 3
# eat_list (1::2::3::[]) ;;
- : int = 3