Is [] a type or not?

viritrilbia · September 10, 2021, 3:08pm

I wondered whether [] is a type, since it would be a natural notation for an empty polymorphic variant type. At first it doesn’t seem to be:

# let f (x : []) : [] = x;;
Error: Syntax error: type expected.

But this works:

# type foo = [];;
type foo = []
# let f (x : foo) : foo = x;;
val f : foo -> foo = <fun>

However, it doesn’t seem to be an empty type:

# let f (x : foo) = match x with _ -> .;;
Error: This match case could not be refuted.
       Here is an example of a value that would reach it: []

This error message suggests that [] is an element of the type [], and indeed it seems to be, at least as long as the type was given a name:

# let y : foo = [];;
val y : foo = []
# let y : [] = [];;
Error: Syntax error
# let f (x : foo) = match x with [] -> 3;;
val f : foo -> int = <fun>

Normally [] denotes the empty list, so I thought maybe [] was “the type of empty lists”. But if so, these “lists” don’t seem to have anything to do with ordinary ones:

# 3 :: y;;
Error: This expression has type foo but an expression was expected of type
         int list
# let z : 'a list = [];;
val z : 'a list = []
# y = z;;
Error: This expression has type 'a list but an expression was expected of type
         foo

I am very confused. Is [] a type, as it seems to sort of be? If so, what type is it, and why does it only work in some places and not others? If not, what is going on?

vlaviron · September 10, 2021, 3:16pm

It’s not exactly a type, it’s a type constructor. If I remember correctly, there are five special “keywords” that are parsed and typed as type constructors: [] and (::), for the list type, (), for the unit type, and true and false for the bool type.
Apart from this special parsing rules, there is nothing particular about these constructors and you can reuse them for custom types:

# type t = () | [] | (::) | true | false | Something;;
type t = () | [] | (::) | true | false | Something
# [| (); []; (::); true; false; Something|];;
- : t array = [|(); []; (::); true; false; Something|]

yawaramin · September 10, 2021, 3:17pm

[] is a data constructor, not a type. Replace it with Nil (just my choice of name, nothing special about it) everywhere in your code sample and you get exactly the same results.

viritrilbia · September 10, 2021, 3:50pm

Oh! I get it. So when I wrote

type foo = []

I was defining a new variant type foo with one constructor [] having no arguments. I have to say I think it’s super confusing to have just a few special cases of tokens that are interpreted as data constructors. At least in the case of

type foo = ()

I get a

Warning 65 [redefining-unit]: This type declaration is defining a new '()' constructor
which shadows the existing one.
Hint: Did you mean 'type foo = unit'?

Wouldn’t it make sense to give similar warnings for the other special cases?

yawaramin · September 10, 2021, 3:53pm

Yeah, exactly. And you’re totally right, redefining [] should issue warning 65 too. Looks like a gap. Maybe file an issue on the compiler repo?

CraigFe · September 10, 2021, 3:54pm

Re-using the list data constructor names is occasionally useful, in particular when working with heterogeneous lists:

# type _ t =
  | [] : unit t
  | ( :: ) : 'a * 'b t -> ('a -> 'b) t ;;

# [ 1; "foo"; true ] ;;
- : (int -> string -> bool -> unit) t

viritrilbia · September 10, 2021, 5:09pm

That’s neat; so even if you redefine [] and (::), the builtin syntax [ _ ; _ ; ... ; _] still expands in terms of the new ones. Why do you use 'a -> 'b rather than, say, 'a * 'b?

CraigFe · September 10, 2021, 5:12pm

'a * 'b also works, and is functionally equivalent

Using -> has the minor convenience of not generating lots of nested parentheses in printed types, since -> is already right-associative.

# type _ t =
  | [] : unit t
  | ( :: ) : 'a * 'b t -> ('a * 'b) t ;;

# [ 1; 2; 3; 4; 5 ] ;;
- : (int * (int * (int * (int * (int * unit))))) t

viritrilbia · September 10, 2021, 5:14pm

@CraigFe, do you disagree that redefining [] and (::) should issue a warning, because of that use case? Or would it be reasonable to issue a warning to help people like me, and let someone who really wants to use them for heterogeneous lists ignore or silence the warning?

How about redefining (), true, and false – are there any use cases for those?

nojb · September 10, 2021, 5:20pm

The reason for the warning about () is because it is a common mistake and not very useful to redefine it. On the other hand, redefining other predefined constructors is less common and also more obviously useful. So it would seem OK not to warn about redefining them.

Cheers,
Nicolas

viritrilbia · September 10, 2021, 5:22pm

Is redefining true and false useful?

yawaramin · September 10, 2021, 5:27pm

IMHO, it’s more a footgun than useful.

utop # type bool = true | false;;
utop # let x = true;;
utop # if x then 1 else 2;;
Error: This expression has type bool/1 but an expression was expected of type
         bool/2
       because it is in the condition of an if-statement
       File "_none_", line 1:
         Definition of type bool/2
       Hint: The type bool has been defined multiple times in this toplevel
         session. Some toplevel values still refer to old versions of this
         type. Did you try to redefine them?

viritrilbia · September 10, 2021, 5:30pm

How about a warning if [] is redefined without simultaneously redefining (::)? That seems more likely to indicate someone, like me, not knowing what they’re doing. I expect most if not all real use cases would involve redefining both together?

CraigFe · September 10, 2021, 5:38pm

I was curious about how prevalent redefining (::) is in the opam repository, and it seems to happen quite a bit:

List of packages

$ opam grep "( *:: *)"

accessor
amqp-client
amqp-client-async
amqp-client-lwt
archi
archi-async
archi-lwt
b0
batsat
batteries
bitwuzla
bitwuzla-bin
bitwuzla-c
blake3
brr
bsbnative
camlon
camlp5
coccinelle
codept
colombe
conformist
containers
containers-data
containers-thread
coq
coq-of-ocaml
coqide
crowbar
doc-ock
dot-merlin-reader
earley
elpi
embedded_ocaml_templates
extlib
farfadet
GT
git-cohttp-mirage
gopcaml-mode-merlin
graphql
graphql-async
graphql-cohttp
graphql-lwt
graphql_parser
immutable
iocaml
jsonrpc
ketrew
learn-ocaml
learn-ocaml-client
lem
linksem
liquidsoap
…
(At this point, opam grep ran out of disk space on my machine… .)

My suspicion is that adding a naive warning for redefinition of (::) would overall cause more inconvenience than it would save, although it might make sense to warn on redefinition of (::) but not [] in the same type. (This still warns against some valid use-cases, but I would guess far fewer of these occur in the wild.)

I agree that the construct type t = [] in particular is very surprising with regard to polymorphic variants. I would propose adding a subsection to the OCaml manual on polymorphic variants explaining:

the (lack of a) constructor-less polymorphic variant type, and the reasoning for this;
the gotcha mentioned here w.r.t. [];
a pointer to the section on empty variant types.

Ultimately, I think just making type foo = [] invalid wouldn’t be enough to fix the gotcha: there needs to be a longer-form explanation somewhere (although this could be linked from a hint, as is done for some of the other warnings).

Relatedly, there was some discussion back in 2013 about banning redefinition of (::) altogether, but this doesn’t seem to have led anywhere.

SkySkimmer · September 10, 2021, 6:19pm

Those are not all redefinitions, for instance the (::) in batsat are in rust code.

viritrilbia · September 10, 2021, 8:08pm

Speaking of footguns, it appears that if you redefine () to take an argument, it completely fubars the parser or something:

# type foo = () of int;;
Warning 65 [redefining-unit]: This type declaration is defining a new '()' constructor
which shadows the existing one.
Hint: Did you mean 'type foo = unit'?
type foo = () of int

# 2 * (1 + 1);;
Error: The constructor () expects 1 argument(s),
       but is applied here to 0 argument(s)

# let f x = x;;
Error: The constructor () expects 1 argument(s),
       but is applied here to 0 argument(s)

yawaramin · September 10, 2021, 8:32pm

Nice find! I wonder how long this has been lurking in the shadows

octachron · September 10, 2021, 9:54pm

Redefining () as

type t = () of int

does not mess up with the parser. (and you can even construct value of this type with

let x= () 0

The responsible is utop which is rewritting

1 + 2;;

as

let () = 1 + 2

which does not work when () is not the expected unit.

viritrilbia · September 10, 2021, 11:44pm

I see that it works fine in a source file rather than at toplevel. But let () = 1 + 2 doesn’t typecheck even ordinarily – 1 + 2 is not of type unit – so that can’t be exactly what utop is doing, can it?

EduardoRFS · September 11, 2021, 12:02am

Because that’s the wrong syntax for pattern matching variants, essentially you’re doing, notice that all let is a pattern matching.

type t = Variant of int
let Variant = 1 + 2

Topic		Replies	Views
About Ocaml list type Learning	5	604	April 5, 2020
The empty list literal and the empty list of list literal Learning	4	695	May 2, 2019
Empty polymorphic variant type? Learning	5	1248	October 10, 2021
Higher kinded polymorphism Learning	13	6792	July 1, 2018
Sum type constructor declaration subtlety (and documentation wanted!) Ecosystem	25	3075	May 8, 2018

Is [] a type or not?

Related topics