I’ve always thought the most confusing thing for newcomers was the top-level double semicolon.
This thread confirms this.
To all future newcomers who are not already familiar with a REPL: do not use the toplevel until you have grasped the basic syntax of OCaml (whose core is really simple) by writing actual programs.
@rene_sax The double semicolon (`;;`) is not really OCaml syntax. Let’s consider it toplevel syntax.
OCaml syntax is all about building values out of expressions (an expression is a combination of values). For instance, `let negate s = "not " ^ s and x = "glop" in if happy then x else negate x` is an expression that evaluates to a string. In OCaml, as in a few other languages (but neither C nor Python), almost everything is an expression that has a value and so can be combined freely.
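To make "can be combined freely" concrete, here is a minimal sketch. The names `happy`, `mood` and `length` are made up for this illustration; the point is only that an `if` or a `let ... in` can appear wherever a value is expected.

```ocaml
(* Hypothetical names, chosen only for illustration. *)
let happy = true

(* An if/then/else is an expression, so it can sit inside a concatenation. *)
let mood = "I am " ^ (if happy then "glop" else "not glop")

(* A let ... in is an expression too, so it can be a function argument. *)
let length = String.length (let negate s = "not " ^ s in negate "glop")

let () = Printf.printf "%s (%d)\n" mood length
```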
To build useful programs you need only one more thing: a way to bind a value to a global identifier (so that you can define functions and variables that can be called or used by other programs, and so that you can write programs with an entry point). To do this, you need a special `let` statement that you write alone, not as part of any other expression. For instance:
    let main =
      let happy = Random.int 2 = 1 in
      let message =
        let negate s = "not " ^ s and x = "glop" in
        if happy then x else negate x in
      print_string message
Notice there is not a single semicolon in there.
Notice also that the `let main =` is different from the two other `let`s: because it is written at the top level of your file, outside any other expression, it is a special `let` statement with no value, but with the effect of binding some value to a global variable named `main`.
You can write this in a file `program.ml`, compile it with `ocamlc program.ml -o program`, and then run the program with `./program`.
Now, OCaml is a language that accepts side effects. A few special constructs of the language have a side effect. For instance, the expression `record.field <- 42` (whose value is `()`) sets the value of some mutable record field to 42.
Calling functions written in C can also result in side effects (in addition to those functions returning a value).
In the program above this happened twice: the `Random.int` function, as a side effect, updates the internal state of its random generator, and the `print_string` function, in addition to returning the `()` value, also prints something on the terminal.
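The record assignment mentioned above can be sketched as follows. The type name `counter` and the binding `result` are hypothetical, added only so that the fragment compiles on its own; the type annotation on `result` is there to show that the assignment really has type `unit`.

```ocaml
(* A hypothetical record type with one mutable field. *)
type counter = { mutable field : int }

let record = { field = 0 }

(* The assignment is itself an expression; its value is (), of type unit. *)
let result : unit = record.field <- 42

let () = Printf.printf "field is now %d\n" record.field
```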
In a language with side effects, it is useful to sequence the side effects, so that one takes place before another. For instance, we would like the program above to print a newline character after it has printed the message.
OCaml’s syntax for sequencing two expressions is the semicolon (`;`). `expression_1 ; expression_2` means: evaluate `expression_1` (and perform its side effects), discard its value, then evaluate `expression_2` (and perform its side effects), and let the value of the whole sequence be that of `expression_2`.
Since the value of `expression_1` is discarded, the compiler will warn if that value has any type other than unit, as it means you have computed a value that is not used, which is suspicious.
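A small sketch of these sequencing rules, under the assumption that a `Buffer` is an acceptable stand-in for "some side effect" (the names `log` and `value` are invented for the example):

```ocaml
(* Buffer used to record the order of side effects; a hypothetical helper. *)
let log = Buffer.create 16

let value =
  Buffer.add_string log "first;";   (* evaluated first, its () is discarded *)
  Buffer.add_string log "second;";  (* evaluated second *)
  42                                (* value of the whole sequence *)

(* A non-unit value on the left of ; would trigger a compiler warning;
   ignore makes the discard explicit. *)
let () =
  ignore (String.length "glop");
  print_endline (Buffer.contents log)
```

The `ignore` function takes any value and returns `()`, which is the usual way to tell the compiler that discarding a result is intentional.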
We can thus change the program into:
    let main =
      let happy = (... etc ...) in
      print_string message ; print_newline ()
And that’s all for OCaml syntax. Notice, no `;;` anywhere.
So, why is it so frequent to encounter OCaml code fragments that are littered with those `;;`?
For two reasons:
- There is a toplevel, aka a REPL. It’s a tool that evaluates OCaml expressions and prints their value. Quite useful. But when it reads user-entered expressions (and `let` statements, which are also understood and also printed), the toplevel accepts multi-line entries. Then, how does it know when a user expression (or `let` statement) is complete? You guessed it: it waits for a double semicolon. This is a completely different syntactic device from the semicolon used as the sequencing operator in the language, see?
- It might not be factually correct, but my impression is that OCaml’s inventors had an idea that in hindsight might not have been so great: to help beginners learn OCaml, they decided to accept (and ignore) double semicolons after `let` statements, so that a beginner copy-pasting code from the toplevel into a real source file would not be punished with a parse error. Following suit, a lot of OCaml programmers and tutorial writers wrote their programs terminating every `let` statement with a double semicolon so that beginners could copy-paste the other way around, from the program to the toplevel.
That’s why many OCaml code fragments feature those confusing double semicolons, which are nowhere to be seen in actual programs.
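To illustrate the two styles side by side, here is a small source file that mixes them. The function names are invented for the example; the point is that the compiler accepts the trailing `;;` after a `let` statement and that the definitions behave the same with or without it.

```ocaml
(* Style often seen in tutorials: every definition ends with ;;
   The compiler accepts these and they change nothing. *)
let double x = 2 * x;;
let four = double 2;;

(* The same kind of definitions without ;; as in most real programs. *)
let triple x = 3 * x
let six = triple 2

(* Effects at the top level go in a let binding rather than after a ;; *)
let () = Printf.printf "%d %d\n" four six
```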
Hope that helped.