Why is declaring two variables with the sequence operator ; illegal?

What does this mean? I guess the distinction from other languages is what confused me. In MATLAB I would use that all the time to separate statements, but somehow it’s supposed to be different in OCaml, and I still don’t get what the difference is.

Though I’ve received a lot of comments so far that are definitely very helpful!

I need to re-read https://ocaml.org/learn/tutorials/structure_of_ocaml_programs.html multiple times, but I’ve not found it very useful at all. Idk why though.

So in OCaml there are two valid syntaxes for let:

  • As a top-level binding: let x = 1. This is a binding and does not evaluate to a value. Arguably this is what you would call a statement in other languages.
  • As an expression: let x = 1 in x + 1. This returns a value (in this case 2)

Of course these are specific examples for the sake of simplicity, but as you have seen elsewhere, actual OCaml expressions are composable and so can become fairly complex.
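To make the two forms concrete, here is a minimal sketch of a file that uses both. The names result and nested are made up for illustration.

```ocaml
(* Top-level binding: introduces the name x for the rest of the file. *)
let x = 1

(* let ... in is an expression: y is only visible in the body after "in",
   and the whole expression evaluates to the value of that body. *)
let result = let y = 1 in y + 1

(* Expressions compose, so one let ... in can sit inside another. *)
let nested =
  let a = 2 in
  let b = a * 3 in
  a + b

let () = Printf.printf "%d %d %d\n" x result nested
```

Note that the right-hand side of each top-level binding is itself an ordinary expression.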

Are you not missing ;;? As far as I understand, let x = stuff is illegal unless we have ;; or an in somewhere.

;; is strictly speaking only required to tell the OCaml toplevel (the REPL, like ocaml or utop) that user input has ended and it should start evaluating the input now. See https://ocaml.org/learn/tutorials/structure_of_ocaml_programs.html#The-disappearance-of for more details.

What does this mean? I guess the distinction from other languages is what confused me. In MATLAB I would use that all the time to separate statements, but somehow it’s supposed to be different in OCaml, and I still don’t get what the difference is.

It means that the ; in OCaml is more similar to , in JavaScript/C than it is to ; in those languages. Basically (in simplified terms and with exceptions to the rule) you can think of OCaml programs as being one big expression rather than a big set of statements. Of course it’s not exactly the same, but here is an example:

(You can skip the C section if that doesn’t make sense to you)

Here’s a small C program that compiles just fine:

// cc -Wall example.c -o example.exe
#include <stdio.h>
int main()
{
        NULL;
}  

In OCaml this can be written like this:

let () = ()

Here’s another one with an effect-free (pure) expression that is not being observed by anything:

// cc -Wall exp.c -o cool
#include <stdio.h>
int main()
{
        1;
}

With the GCC compiler that will generate this warning:

exp.c: In function ‘main’:
exp.c:5:2: warning: statement with no effect [-Wunused-value]
    5 |  1;
      |  ^

That is similar (although not 1:1 equivalent) to the OCaml program below.
The difference is that the C compiler is warning about the lack of side effects, not about the type of the expression.

# let () = 1 ;;
Error: This expression has type int but an expression was expected of type
         unit

Here’s one using comma to separate two expressions (in the same statement):

// cc -Wall comma.c -o cool
#include <stdio.h>
int main()
{
        1, NULL;
}

Which gives us:

comma.c: In function ‘main’:
comma.c:5:3: warning: left-hand operand of comma expression has no effect [-Wunused-value]
    5 |  1, NULL;
      |   ^
comma.c:5:3: warning: statement with no effect [-Wunused-value]

In OCaml you could write:

# let () = 1 ; () ;;
Warning 10: this expression should have type unit.

“Fixing” it by ignoring the type of the first sub-expression (1 : int):

# let () = (ignore 1) ; () ;;
# 

It’s not easy to come up with an exact analog in C, but in JavaScript you can try:

var a = (1 , 2);
console.log(a);
// JavaScript doesn't warn about uncaptured expressions:
// it will just evaluate 1; throw that away; evaluate 2,
// which is then returned as the value of the overall
// expression, and captured in `var a`:
// > 2 === a;
// true

By contrast, in OCaml , is used to separate members of a tuple (returning a tuple with all of the elements rather than just the last element).
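A small sketch of that contrast, with made-up names pair and last: the comma keeps both values, while the semicolon keeps only the last one.

```ocaml
(* The comma builds a tuple: both values are kept. *)
let pair : int * int = (1, 2)

(* The semicolon sequences: the left side is evaluated and discarded,
   and the whole expression takes the value of the right side.
   ignore turns the int into unit so that no warning is raised. *)
let last : int = (ignore 1; 2)

let () = Printf.printf "(%d, %d) and %d\n" (fst pair) (snd pair) last
```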

In Python you see kind of the same behavior (as OCaml’s ;) with the and and or operators:

>>> a = 1 and 'whatever' and 2
>>> a
2
# of course since 'and' and 'or' short-circuit, the value of the expression
# DOES depend on these values.
# This would be a better example:
>>> actual = 2
>>> b = (1 and actual) or actual
>>> b
2

In OCaml:

# let x : int = (ignore 1) ; 2 ;;
val x : int = 2

Note that the ;; would not be in your file; it is only used in the toplevel (a.k.a. REPL) to tell it that you’re done typing.

I’ve always thought the most confusing thing for newcomers was the top-level double semicolon.
This thread confirms this.

To all future newcomers who are not already familiar with a REPL: do not use the toplevel until you have grasped the basic syntax of OCaml (whose core is really simple) by writing actual programs.

@rene_sax The double semicolon (;;) is not really OCaml syntax. Let’s consider it toplevel syntax.

OCaml syntax is all about building values out of expressions (an expression is a combination of values). For instance let negate s = "not " ^ s and x = "glop" in if happy then x else negate x is an expression returning some string value. In OCaml as in a few other languages (but neither C nor Python), almost anything is an expression that has a value and so can be combined freely.
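As a rough illustration, the expression above can be tried out by wrapping it in a function. The names describe and shout are hypothetical, introduced here only so that happy is defined and the result can be inspected.

```ocaml
(* describe is a hypothetical wrapper that supplies a value for happy. *)
let describe happy =
  let negate s = "not " ^ s and x = "glop" in
  if happy then x else negate x

(* if ... then ... else is an expression with a value, so it can be
   passed directly as a function argument. *)
let shout = String.uppercase_ascii (if true then "glop" else "not glop")

let () = print_endline (describe false)
let () = print_endline shout
```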

To build useful programs you need only one more thing: a way to bind a value to a global identifier (so that you can define functions and variables that can be called/used by other programs, and so that you can write programs with an entry point). To do this, you need a special let statement that you write alone, not part of any other expression. For instance:

let main =
  let happy = Random.int 2 = 1 in
  let message =
    let negate s = "not " ^ s and x = "glop" in
      if happy then x else negate x in
  print_string message

Notice there is not a single semicolon in there.
Notice also that the let main = is different from the other lets: because it is written at the top level of your file, outside any other expression, it’s a special let statement with no value, but with the effect of binding some value to a global variable named main.

You can write this in a file program.ml and then compile it with ocamlc program.ml -o program and then run the program with ./program.

Now, OCaml is a language that accepts side effects. A few special constructs of the language have a side effect. For instance, the expression record.field <- 42 (whose value is ()) sets the value of some mutable record field to 42.

Calling functions written in C can also result in side effects (in addition to those functions returning a value).

In the program above this happened twice: the Random.int function, as a side effect, updates the state of its internal random generator, and the print_string function, in addition to returning the () value, also prints something on the terminal.

In a language with side effects, it is useful to sequence the side effects, so that one takes place before another one. For instance, we would like that, after having printed the message, the above program also prints a newline character.

OCaml syntax for sequencing two expressions is the ; operator. expression_1 ; expression_2 means: evaluate expression_1 (and perform its side effects), discard its value, then evaluate expression_2 (and perform its side effects), and let the value of the whole sequence be that of expression_2.

Since the value of expression_1 is discarded, the compiler will warn if that value had any other type than unit, as it means you have computed a value that’s not used, which is suspicious.

We can thus change the program into:

let main =
  let happy = (... etc ...) in
  print_string message ; print_newline ()

And that’s all for OCaml syntax. Notice, no ;; anywhere.
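For reference, here is a sketch of the whole program with the two fragments put together. print_newline is a function, so it has to be applied to () to actually print the newline. The second part, with the made-up names trace and value, is only there to make the evaluation order of a sequence observable.

```ocaml
(* The two fragments above, assembled into one file. *)
let main =
  let happy = Random.int 2 = 1 in
  let message =
    let negate s = "not " ^ s and x = "glop" in
    if happy then x else negate x in
  (* print_string returns (), so nothing of interest is discarded here. *)
  print_string message ; print_newline ()

(* trace is a hypothetical buffer used only to record evaluation order. *)
let trace = Buffer.create 16

(* Each expression before a ; runs for its side effect, left to right;
   the sequence as a whole takes the value of the last expression. *)
let value =
  Buffer.add_string trace "first;" ;
  Buffer.add_string trace "second;" ;
  42
```

After this runs, main is bound to (), value is 42, and trace contains the two strings in the order they were written.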

So, why is it so frequent to encounter OCaml code fragments that are littered with those ;; ?

For two reasons:

  1. There is a toplevel, aka a REPL. It’s a tool that evaluates OCaml expressions and prints their value. Quite useful. But when it reads user-entered expressions (and let statements, which are also understood and also printed), the toplevel accepts multi-line entries. Then how does it know when a user expression (or let statement) is complete? You guessed it: it waits for a double semicolon. This is a completely different syntactic device from the semicolon used as the sequencing operator in the language, see?

  2. It might not be factually correct, but my impression is that the OCaml inventors had an idea that in hindsight might not have been so great: to help beginners learn OCaml, they decided to accept (and ignore) double semicolons after let statements, so that a beginner copy-pasting code from the toplevel into a real source file would not be punished with a parse error. Following suit, a lot of OCaml programmers and tutorial writers wrote their programs terminating every let statement with a double semicolon so that beginners could copy-paste the other way around, from the program to the toplevel.

That’s why many OCaml code fragments feature those confusing double semicolons, that are nowhere to be seen in actual programs.

Hope that helped.

The double semicolon is useful when you have “raw values” (I just made this name up, don’t bother googling) meaning just some value in your file without binding it. So for instance the following is a valid file which prints HELO:

print_string "HELO"

the double semicolon comes in for instance when there is a preceding let binding:

let s = "HELO";;

print_string s

Without the ;; it parses as the literal string “HELO” applied to print_string and s, which is of course nonsense.

Raw values are mostly used in the toplevel to print a value (i.e. when you enter 2 + 2;; and ocaml says - : int = 4). In actual code, AFAICT, it’s more idiomatic to do a let () = ... binding, which would look like

let s = "HELO"

let () = print_string s

and needs no ;;.
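A slightly longer sketch of the same idiom, assuming nothing beyond the standard library. One side benefit of the () pattern is that the right-hand side must have type unit, so a forgotten argument shows up as a type error rather than being silently accepted.

```ocaml
let s = "HELO"

(* The pattern () only matches a value of type unit, so writing
   "let () = print_string" (without the argument) would not type-check. *)
let () = print_string s

(* Several effects can be sequenced with single semicolons inside one
   let () = binding; no double semicolon is needed between bindings. *)
let () =
  print_newline ();
  print_endline "done"
```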

This suggests that the evolution could have happened the other way around compared to what I described above: maybe initially OCaml modules were composed of several global statements separated with double semicolons, then the only useful global statement turned out to be the let statement, and since the double semicolon is not required between let statements, it became optional.

Maybe some old-timer will drop by and clarify.

A common justification to keep the double-semicolon nowadays is that it ensures that syntax errors don’t get reported far away from where they actually occur. For example:

let foo bar =
  some code ~f:(fun x ->
    bar x);

let bar = 3

let baz = 4

The trailing semicolon in the definition of foo is incorrect, but the “location” of the syntax error is actually the line defining baz.

Adding a double-semicolon after the foo definition would actually cause there to be no syntax error in this case. Another example:

let foo bar =
  let () =
    some code ~f:(fun x ->
      bar x)
  in

let bar = 3

let baz = 4

This syntax error is reported on the same line as before, starting with let baz. Adding the double-semicolon after the foo definition places the syntax error on the double-semicolon, so it’s obvious that the definition isn’t complete.