I hope this is the kind of question that is simple yet subtle, rather than just simple:
Is there an idiomatic way in OCaml to distinguish between a “data value” and a “function value”?
This distinction feels quite natural in object-oriented languages, where data and functions/methods are treated differently. I understand that in functional languages like OCaml the distinction is less rigid (since functions are first-class values), but in practice I sometimes find myself wanting to express or enforce a distinction between data and behavior.
A specific case that brings this to mind is variant constructors and polymorphic variant tags. These occupy a kind of middle ground: they’re not functions per se, yet they produce values when used. Because they live in a kind of feature space that isn’t strictly data or behavior, I’m wondering if there’s a conventional way in the OCaml community to talk about or distinguish such constructs — or if the distinction is deliberately blurred in functional design.
Ignoring limited visibility via signatures and such, one answer might be that data-values can be pattern-matched, and functional values can only be applied to arguments. The actual -constructors- are second-class citizens, but sure, we can think of them as functional values.
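To make the "pattern-matched versus applied" distinction concrete, here is a minimal sketch. The names `shape`, `area`, and `scale` are made up for illustration and do not come from any library.

```ocaml
(* A data value: built with constructors, taken apart by pattern matching. *)
type shape =
  | Circle of float
  | Rect of float * float

let area (s : shape) : float =
  match s with
  | Circle r -> Float.pi *. r *. r
  | Rect (w, h) -> w *. h

(* A function value: there is nothing to match on; all we can do is apply it. *)
let scale (k : float) : float -> float = fun x -> k *. x

(* Application on the outside, pattern matching inside [area]. *)
let doubled_area = scale 2.0 (area (Rect (3.0, 4.0)))
```

Trying to write `match scale 2.0 with ...` gets you nowhere useful, since a function has no constructors to match against; that asymmetry is the practical distinction.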
ETA: It’s interesting to think about the Church encoding of a datatype, e.g. Church numerals. There is a very real sense in which a Church numeral is its own destructor: it carries within it the code for its match operation. And this is true of all Church-encoded data structures.
Random aside: do you know if there’s a historical reason that we don’t have first-class variant constructors? It’s always bothered me that you can’t pass around constructors as functions.
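For anyone who wants to see it in OCaml, here is a rough sketch of Church numerals. The record type `church` and the helper names are my own; the polymorphic field is only there so that one numeral can be used at several result types.

```ocaml
(* A Church numeral n is "apply s to z exactly n times".
   The record wrapper gives the field a fully polymorphic type. *)
type church = { fold : 'a. ('a -> 'a) -> 'a -> 'a }

let zero = { fold = (fun _s z -> z) }

let church_succ n = { fold = (fun s z -> s (n.fold s z)) }

let three = church_succ (church_succ (church_succ zero))

(* "Destructing" a numeral is just applying it: the numeral itself
   plays the role of the match/fold. *)
let to_int n = n.fold (fun k -> k + 1) 0
let to_unary n = n.fold (fun acc -> "S" ^ acc) "Z"
```

Note that there is no `match` anywhere: consuming the numeral means handing it the "successor case" and the "zero case" as arguments.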
@bsidhom I’m curious if you have an example where that would be useful.
Variant constructors in OCaml are like functions in many cases but not uniformly so. I tend to think of them as distinct language constructs that exist in a space between functions and values: they’re designed to conveniently produce values of a fixed type, but they aren’t functions in the strict sense. As @octachron points out in this post, it’s probably best not to conflate variant constructors with functions in OCaml.
@Chet_Murthy That’s a really helpful way to make a practical distinction between “function-values” and “data values”. Focusing on pattern matching versus application as a practical distinction makes a lot of sense, especially since it gets at how values are used rather than how they’re typed.
I’m currently writing a lesson on variants and polymorphic variants and do not want to make lambda calculus or Church encoding prerequisites while still respecting functional idioms, but I appreciate how you introduced the idea that data can be viewed as functions, not just functions viewed as data.
I’m sure it’s somewhat involved to do so, but I found the way in which combining let-rec, match, and a data-type produces exactly the induction combinator as generated by Coq (or some other proof system) to be both beautiful and indicative of the proof structure of induction built into data types.
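As a small illustration of that point, here is a sketch for natural numbers. The name `nat_rec` is borrowed from Coq’s generated combinator, but this OCaml version is only the non-dependent analogue (the result type cannot depend on the number), so treat it as an approximation rather than the real thing.

```ocaml
type nat = Zero | Succ of nat

(* let rec + match + the data type give the recursor:
   one argument per constructor, plus the value being analysed. *)
let rec nat_rec (base : 'a) (step : nat -> 'a -> 'a) (n : nat) : 'a =
  match n with
  | Zero -> base
  | Succ m -> step m (nat_rec base step m)

(* Two functions defined purely through the recursor. *)
let to_int n = nat_rec 0 (fun _ k -> k + 1) n
let add m n = nat_rec n (fun _ acc -> Succ acc) m
```

The `base` and `step` arguments line up with the base case and the inductive step of a proof by induction on `nat`.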
If I’m not mistaken, in SML constructors can be turned automatically into functions (for instance, in some contexts “Some” would be a function from 'a to 'a option), which would already be quite handy. I find myself explicitly defining such functions from time to time, and there is actually an Option.some function in the stdlib, so I might not be the only one.
This is true of both SML and F#, which makes OCaml seem the odd one out in this respect.
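For readers who have not run into this, a short sketch of the usual workarounds in OCaml. `Option.some` is in the standard library (OCaml 4.08 and later); the `expr` type and the `lit` wrapper are hypothetical names for illustration.

```ocaml
(* [List.map Some [1; 2; 3]] is rejected by the type checker, because
   a bare constructor is not a function value. Two common workarounds: *)

(* 1. Eta-expand by hand. *)
let wrapped_by_hand = List.map (fun x -> Some x) [1; 2; 3]

(* 2. Use the ready-made stdlib wrapper. *)
let wrapped_by_stdlib = List.map Option.some [1; 2; 3]

(* For your own variants you write the wrapper yourself. *)
type expr = Lit of int | Neg of expr

let lit n = Lit n
let literals = List.map lit [1; 2]
```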
I think SML has a slightly stronger syntactic distinction between data value and function values too, because SML function declarations tend to start with the fun keyword while value declarations start with val, but that’s probably only a minor thing since you can still bind functions/closures in a val.
Well, the only surefire way I can think of is to compare the value to itself using the default (structural) equality operator, =. If that raises an exception, the value either is or contains a function. If it doesn’t raise, it’s a non-function value.
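A minimal sketch of that trick, with `contains_function` as a made-up name. It relies on structural equality raising `Invalid_argument` when it reaches a closure, so it is a runtime probe and not something the type system knows about.

```ocaml
(* Returns true if comparing [x] with itself hits a functional value. *)
let contains_function (x : 'a) : bool =
  match x = x with
  | (_ : bool) -> false
  | exception Invalid_argument _ -> true

let plain = contains_function [1; 2; 3]
let bare_closure = contains_function (fun x -> x + 1)
let nested_closure = contains_function [Some (fun x -> x + 1)]
```

Some caveats, as far as I know: the comparison can also raise on certain abstract values, it may not terminate on cyclic data, and a `nan` encountered before the function stops the comparison early without raising. So I would only treat it as a debugging aid.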
Formally, we assume a collection of constants c ∈ C that are partitioned into constructors C ∈ C+ and primitives f ∈ C−. Constants also come with an arity, that is, we assume a mapping arity from C to ℕ. For instance, integers and booleans are constructors of arity 0, pair is a constructor of arity 2, arithmetic operations, such as + or × are primitives of arity 2, and not is a primitive of arity 1. Intuitively, constructors are passive: they may take arguments, but should ignore their shape and simply build up larger values with their arguments embedded. On the opposite, primitives are active: they may examine the shape of their arguments, operate on inner embedded values, and transform them.
I’m not sure this translates 100% to the way “constructor” is used with variants, but the distinction between “constructors” and “functions” (primitives) goes back to the foundations of OCaml, and it maps onto the distinction I’ve seen drawn elsewhere between variant constructors and functions.
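One way to read that passage is as a small table of constants. The sketch below is my own encoding of the examples in the quote, not something taken from the paper; the names `constant`, `kind`, and `arity` are assumptions for illustration.

```ocaml
(* The example constants from the quoted passage. *)
type constant =
  | Int of int
  | Bool of bool
  | Pair
  | Plus
  | Times
  | Not

type kind = Constructor | Primitive

(* Constructors are passive (they only build), primitives are active. *)
let kind = function
  | Int _ | Bool _ | Pair -> Constructor
  | Plus | Times | Not -> Primitive

(* The arity mapping from constants to natural numbers. *)
let arity = function
  | Int _ | Bool _ -> 0
  | Pair | Plus | Times -> 2
  | Not -> 1
```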
I don’t know if this directly answers your question about why OCaml doesn’t treat constructors as first-class functions, but I was pleased to find this formalization in the paper.
Interesting! I assumed at first that “constants” were normal functions, but that should imply arity 1 (universal currying). As far as I can tell, that’s the main practical distinction between constructors and functions.
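The arity point shows up in everyday OCaml too. Here is a small sketch with a hypothetical `point` type: the constructor wants both arguments at once, while a curried wrapper function can be partially applied.

```ocaml
type point = Point of int * int

(* The constructor has arity 2 and must receive both arguments together;
   writing [Point 1] on its own is a type error. *)
let origin = Point (0, 0)

(* A curried wrapper is an ordinary function, so partial application works. *)
let point x y = Point (x, y)

let column = List.map (point 0) [1; 2; 3]
```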
I’d really like more context, but the link appears to be void.
I’m only a chapter in, but this book looks like a must-read, like the ZINC paper. Simply written, approachable, and dense with information.
Not sure why neither text is featured in the books section of the site. I ran into UUU in the source code! Perhaps they are kryptonite for most developers, but for me they distinguish OCaml from other languages I’ve worked with. Perhaps I’ll submit a pull request to the site to include them when I get a chance.