I wondered whether [] is a type, since it would be a natural notation for an empty polymorphic variant type. At first it doesn’t seem to be:
# let f (x : []) : [] = x;;
Error: Syntax error: type expected.
But this works:
# type foo = [];;
type foo = []
# let f (x : foo) : foo = x;;
val f : foo -> foo = <fun>
However, it doesn’t seem to be an empty type:
# let f (x : foo) = match x with _ -> .;;
Error: This match case could not be refuted.
Here is an example of a value that would reach it: []
This error message suggests that [] is an element of the type [], and indeed it seems to be, at least as long as the type was given a name:
# let y : foo = [];;
val y : foo = []
# let y : [] = [];;
Error: Syntax error
# let f (x : foo) = match x with [] -> 3;;
val f : foo -> int = <fun>
Normally [] denotes the empty list, so I thought maybe [] was “the type of empty lists”. But if so, these “lists” don’t seem to have anything to do with ordinary ones:
# 3 :: y;;
Error: This expression has type foo but an expression was expected of type
int list
# let z : 'a list = [];;
val z : 'a list = []
# y = z;;
Error: This expression has type 'a list but an expression was expected of type
foo
I am very confused. Is [] a type, as it seems to sort of be? If so, what type is it, and why does it only work in some places and not others? If not, what is going on?
It’s not exactly a type, it’s a type constructor. If I remember correctly, there are five special “keywords” that are parsed and typed as type constructors: [] and (::), for the list type, (), for the unit type, and true and false for the bool type.
Apart from this special parsing rules, there is nothing particular about these constructors and you can reuse them for custom types:
# type t = () | [] | (::) | true | false | Something;;
type t = () | [] | (::) | true | false | Something
# [| (); []; (::); true; false; Something|];;
- : t array = [|(); []; (::); true; false; Something|]
[] is a data constructor, not a type. Replace it with Nil (just my choice of name, nothing special about it) everywhere in your code sample and you get exactly the same results.
I was defining a new variant type foo with one constructor [] having no arguments. I have to say I think it’s super confusing to have just a few special cases of tokens that are interpreted as data constructors. At least in the case of
type foo = ()
I get a
Warning 65 [redefining-unit]: This type declaration is defining a new '()' constructor
which shadows the existing one.
Hint: Did you mean 'type foo = unit'?
Wouldn’t it make sense to give similar warnings for the other special cases?
That’s neat; so even if you redefine [] and (::), the builtin syntax [ _ ; _ ; ... ; _] still expands in terms of the new ones. Why do you use 'a -> 'b rather than, say, 'a * 'b?
@CraigFe, do you disagree that redefining [] and (::) should issue a warning, because of that use case? Or would it be reasonable to issue a warning to help people like me, and let someone who really wants to use them for heterogeneous lists ignore or silence the warning?
How about redefining (), true, and false – are there any use cases for those?
The reason for the warning about () is because it is a common mistake and not very useful to redefine it. On the other hand, redefining other predefined constructors is less common and also more obviously useful. So it would seem OK not to warn about redefining them.
utop # type bool = true | false;;
utop # let x = true;;
utop # if x then 1 else 2;;
Error: This expression has type bool/1 but an expression was expected of type
bool/2
because it is in the condition of an if-statement
File "_none_", line 1:
Definition of type bool/2
Hint: The type bool has been defined multiple times in this toplevel
session. Some toplevel values still refer to old versions of this
type. Did you try to redefine them?
How about a warning if [] is redefined without simultaneously redefining (::)? That seems more likely to indicate someone, like me, not knowing what they’re doing. I expect most if not all real use cases would involve redefining both together?
I was curious about how prevalent redefining (::) is in the opam repository, and it seems to happen quite a bit:
List of packages
$ opam grep "( *:: *)"
accessor
amqp-client
amqp-client-async
amqp-client-lwt
archi
archi-async
archi-lwt
b0
batsat
batteries
bitwuzla
bitwuzla-bin
bitwuzla-c
blake3
brr
bsbnative
camlon
camlp5
coccinelle
codept
colombe
conformist
containers
containers-data
containers-thread
coq
coq-of-ocaml
coqide
crowbar
doc-ock
dot-merlin-reader
earley
elpi
embedded_ocaml_templates
extlib
farfadet
GT
git-cohttp-mirage
gopcaml-mode-merlin
graphql
graphql-async
graphql-cohttp
graphql-lwt
graphql_parser
immutable
iocaml
jsonrpc
ketrew
learn-ocaml
learn-ocaml-client
lem
linksem
liquidsoap
…
(At this point, opam grep ran out of disk space on my machine… .)
My suspicion is that adding a naive warning for redefinition of (::) would overall cause more inconvenience than it would save, although it might make sense to warn on redefinition of (::) but not [] in the same type. (This still warns against some valid use-cases, but I would guess far fewer of these occur in the wild.)
I agree that the construct type t = [] in particular is very surprising with regard to polymorphic variants. I would propose adding a subsection to the OCaml manual on polymorphic variants explaining:
the (lack of a) constructor-less polymorphic variant type, and the reasoning for this;
the gotcha mentioned here w.r.t. [];
a pointer to the section on empty variant types.
Ultimately, I think just making type foo = [] invalid wouldn’t be enough to fix the gotcha: there needs to be a longer-form explanation somewhere (although this could be linked from a hint, as is done for some of the other warnings).
Relatedly, there was some discussion back in 2013 about banning redefinition of (::) altogether, but this doesn’t seem to have led anywhere.
Speaking of footguns, it appears that if you redefine () to take an argument, it completely fubars the parser or something:
# type foo = () of int;;
Warning 65 [redefining-unit]: This type declaration is defining a new '()' constructor
which shadows the existing one.
Hint: Did you mean 'type foo = unit'?
type foo = () of int
# 2 * (1 + 1);;
Error: The constructor () expects 1 argument(s),
but is applied here to 0 argument(s)
# let f x = x;;
Error: The constructor () expects 1 argument(s),
but is applied here to 0 argument(s)
I see that it works fine in a source file rather than at toplevel. But let () = 1 + 2 doesn’t typecheck even ordinarily – 1 + 2 is not of type unit – so that can’t be exactly what utop is doing, can it?