Why is declaring two variables with the sequence operator ; illegal?

I guess your explanation makes sense. After that, to my surprise, the following code gave me a warning:

# let do_this x = ( add2 5; fun y -> y + 4 );;
Warning 10: this expression should have type unit.
val do_this : 'a -> int -> int = <fun>

although, I guess the code DID work since:

# do_this 2 4;;
- : int = 8

worked.
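For what it's worth, here is a minimal sketch of how the warning could be silenced while keeping the same behavior. The definition of add2 is an assumption (the original is not shown in this thread); ignore is the standard library function of type 'a -> unit.

```ocaml
(* Hypothetical definition: the original add2 is not shown in the thread. *)
let add2 x = x + 2

(* ignore discards the int result, so the left side of ; has type unit
   and Warning 10 no longer applies. *)
let do_this _x =
  ignore (add2 5);
  fun y -> y + 4
```

The value of the whole sequence is still the function on the right of the ;, which is why do_this 2 4 evaluates to 8.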

This is making a lot more sense. But I still don’t understand the difference between expressions and statements.

[quote=“atavener, post:2, topic:4507”]
Think of building an expression rather than a sequence of statements.
[/quote]

what is the difference between a statement and an expression?

Statements don’t ‘return’ a value. E.g. in a language like say Python:

if x == 1: print("One")

This is a statement. It doesn’t ‘result in’ a final value. In OCaml:

if x = 1 then print_endline "One"

This is an expression. It ‘results in’ a value, in this case () of type unit. OCaml is designed so that almost all syntax is an expression: the standard if ... then ... else ..., the try ... with ..., even for ... to ... do ... done. This is useful because it’s more expressive. Different parts of the syntax compose together because everything results in a value. While other languages had to add special syntax for this (e.g. Python’s X if COND else Y and the ternary ?: operator in C, C++, etc.), ML (and Lisp) languages get it from day one.
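To make the composition point concrete, here is a small sketch. The names x, describe, total and loop_result are made up for illustration.

```ocaml
let x = 1

(* An if expression used directly as an operand of string concatenation. *)
let describe n = "x is " ^ (if n = 1 then "One" else "not One")

(* A try ... with expression used as an operand of +.
   int_of_string raises Failure on bad input, so the handler yields 0. *)
let total = 10 + (try int_of_string "oops" with Failure _ -> 0)

(* Even a for loop is an expression; its value is () of type unit. *)
let loop_result : unit = for i = 1 to 3 do ignore i done
```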

IMO, a good way to start is to stop using the top level, to write all your code in a file and to wrap all your code in a single “main” function (let _ = ...). Then you can stop using ;; completely; you’re only left with let ... in and ; sequences.

let _ =
  let rec fib x =
    if x < 0 then failwith "fib";
    if x < 2 then x else fib (x - 1) + fib (x - 2)
  in
  let print_res x =
    Printf.printf "fib %d = %d\n" x (fib x)
  in
  let x1 = 5 in
  print_res x1;
  let x2 = 6 in
  print_res x2;
  ()

Once you’re at ease with this, it’ll be easier to understand. You would just rewrite it like that:

let rec fib x =
  if x < 0 then failwith "fib";
  if x < 2 then x else fib (x - 1) + fib (x - 2)

let print_res x =
  Printf.printf "fib %d = %d\n" x (fib x)

let _ =
  let x1 = 5 in
  print_res x1;
  let x2 = 6 in
  print_res x2;
  ()

Not sure if you are familiar with Python, but I will try translating your questions to that.

let x = 1 ; let y = 2
(* this fails because let .. = .. must be followed by `in`.

  In the top level / outer scope of a module (such as your file.ml)
  you don't need the "in," but for all the nested let .. you do.
*)

is essentially equivalent to:

x = ( 1; y = 2

and

let x = 1 let y = 2;;
(* with less misleading formatting: *)
let x = 1
let y = 2
;;

is

x = 1 ; y = 2
# aka
x = 1
y = 2

Basically, ; in OCaml is an operator that joins two expressions into one bigger expression, as opposed to being a statement separator or terminator as in most other languages.
When you see a ; in OCaml you should read it as "put () around the previous expression and this one, and make sure that the previous expression returned type unit".
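A minimal sketch of that reading, using the standard Buffer module to make the side effects observable (the names log and v are only for illustration):

```ocaml
let log = Buffer.create 16

(* Read this as (effect1; (effect2; 42)): each expression to the left of a ;
   has type unit, and the value of the whole thing is the last expression. *)
let v =
  Buffer.add_string log "first ";
  Buffer.add_string log "second";
  42
```

Here v is bound to 42, and the buffer records that both effects ran in order.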

so revisiting your original:

let x = 1; let y = 2
;;

first we need to close the let .. scope:

let x = 1; let y = 2 in y
;;
(* with the implicit grouping/binding order spelled out: *)
let x = ( 1; (let y = 2 in y) )

(* x is now 2 *)

OCaml will still complain, because the type of 1 is int, and you’re using ; to denote you want a side-effect and don’t care about the value of the expression (since nothing is recording it).

There’s a function in the standard library called ignore which has signature 'a -> unit, it is basically implemented like this:

let ignore _whatever = ()

We can use it like this to satisfy the type checker:

let x = ignore 1; let y = 2 in y
(* more readable formatting: *)
let x =
    ignore 1;
    let y = 2 in
    y

Hope that helps :slight_smile:

What does this mean? I think I’m starting to understand better that everything in OCaml results in a value (even though the distinction between expressions and statements is still not clear to me). I guess what I am confused about is that I ONLY really know Python, so I’m having a hard time understanding what OCaml does. I thought Python always returned values too. The way I thought about Python is that I have a bunch of statements (usually one per line) and they get executed and return a value. So for me everything is a statement in the programming I am used to. What I am trying to understand is how things work in OCaml. It seems that once the distinction between statements and expressions is clear, everything would make sense to me.

me asking for clarification on the precise difference between statements and expressions: What is the difference between statements and expressions in OCAML?

I don’t think this is right. Correct me if I’m wrong. Check this:

Python 3.7.3 (default, Mar 27 2019, 16:54:48)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print("One")
One
>>> x = print("One")
One
>>> x
>>> print(x)
None
>>> y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'y' is not defined

it seems that x DOES have a value. The print statement is a function and it returns None afaik. I tried printing y to emphasize that x had to be assigned.

print("One") is an expression. if x == 1: print("One") isn’t; it is a statement.

>>> x = 1
>>> y = (if x == 1: print("One"))
  File "<stdin>", line 1
    y = (if x == 1: print("One"))
          ^
SyntaxError: invalid syntax
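For comparison, a sketch of the OCaml counterpart of that rejected Python line. It is accepted, because an if without an else is an expression of type unit (the names x and y mirror the Python session):

```ocaml
let x = 1

(* Prints "One" while y is being defined; y is then bound to (). *)
let y : unit = if x = 1 then print_endline "One"
```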


I’m confused about this. I can easily do:

# let x = 3;;
val x : int = 3

not only in the top level but in my .ml file too. I can define a variable and then use it later if I need to. My question makes me feel there is something fundamental about OCaml, or perhaps functional programming, that I don’t understand.

What does this mean? I guess the distinction from other languages is what confused me. In MATLAB I would use that all the time to separate statements, but somehow it’s supposed to be different in OCaml and I still don’t get what the difference is.

Though I’ve received a lot of comments that have definitely been very helpful so far!

I need to re-read this: https://ocaml.org/learn/tutorials/structure_of_ocaml_programs.html

multiple times but I’ve not found it very useful at all. Idk why though.

So in OCaml there are two valid syntaxes for let:

  • As a top-level binding: let x = 1. This is a binding and does not evaluate to a value. Arguably this is what you would call a statement in other languages.
  • As an expression: let x = 1 in x + 1. This returns a value (in this case 2)

Of course these are specific examples for the sake of simplicity, but as you have seen elsewhere, actual OCaml expressions are composable and so can become fairly complex.
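A short sketch of both forms side by side, as they would appear in a .ml file (the names z and w are only for illustration):

```ocaml
(* Top-level binding: introduces x for the rest of the file. *)
let x = 1

(* let ... in expression: the local x exists only inside it,
   and the whole expression evaluates to 2. *)
let z = let x = 1 in x + 1

(* let ... in expressions nest, which is how larger expressions are built. *)
let w =
  let a = 2 in
  let b = a * 3 in
  a + b
```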

are you not missing ;;? As far as I understand let x = stuff is illegal unless we have ;; or an in somewhere

;; is strictly speaking only required to tell the OCaml toplevel (the REPL, like ocaml or utop) that user input has ended and it should start evaluating the input now. See https://ocaml.org/learn/tutorials/structure_of_ocaml_programs.html#The-disappearance-of for more details.


What does this mean? I guess the distinction from other languages is what confused me. In MATLAB I would use that all the time to separate statements, but somehow it’s supposed to be different in OCaml and I still don’t get what the difference is.

It means that the ; in OCaml is more similar to , in JavaScript/C than it is to ; in those languages. Basically (in simplified terms and with exceptions to the rule) you can think of OCaml programs as being one big expression rather than a big set of statements. Of course it’s not exactly the same, but here is an example:

(You can skip the C section if that doesn’t make sense to you)

Here’s a small C program that compiles just fine:

// cc -Wall example.c -o example.exe
#include <stdio.h>
int main()
{
        NULL;
}  

In OCaml this can be written like this:

let () = ()

Here’s another one with an effect-free (pure) expression that is not being observed by anything:

// cc -Wall exp.c -o cool
#include <stdio.h>
int main()
{
        1;
}

With the GCC compiler that will generate this warning:

exp.c: In function ‘main’:
exp.c:5:2: warning: statement with no effect [-Wunused-value]
    5 |  1;
      |  ^

That is similar (although not 1:1 equivalent) to the OCaml program below.
The difference is that the C compiler is warning about the lack of side effects, not about the type of the expression.

# let () = 1 ;;
Error: This expression has type int but an expression was expected of type
         unit

Here’s one using comma to separate two expressions (in the same statement):

// cc -Wall comma.c -o cool
#include <stdio.h>
int main()
{
        1, NULL;
}

Which gives us:

comma.c: In function ‘main’:
comma.c:5:3: warning: left-hand operand of comma expression has no effect [-Wunused-value]
    5 |  1, NULL;
      |   ^
comma.c:5:3: warning: statement with no effect [-Wunused-value]

In OCaml you could write:

# let () = 1 ; () ;;
Warning 10: this expression should have type unit.

“Fixing” it by ignoring the type of the first sub-expression (1 : int):

# let () = (ignore 1) ; () ;;
# 

It’s not easy to come up with an exact analog in C, but in JavaScript you can try:

var a = (1 , 2);
console.log(a);
// Javascript doesn't warn about uncaptured expressions:
// it will just evaluate 1; throw that away; evaluate 2,
// which is then returned as the value of the overall
// expression, and captured in `var a`:
// > 2 === a;
// true

By contrast, in OCaml , is used to separate members of a tuple (returning a tuple with all of the elements rather than just the last element).
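A tiny sketch of the contrast in OCaml itself (the names pair and last are made up):

```ocaml
(* , builds a tuple: both values are kept. *)
let pair = (1, 2)

(* ; sequences: the left value is discarded (here explicitly, via ignore)
   and only the right value remains. *)
let last = (ignore 1; 2)
```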

In Python you see kind of the same behavior (as OCaml’s ;) with the and and or operators, with a caveat explained in the comments:

>>> a = 1 and 'whatever' and 2
>>> a
2
# of course since 'and' and 'or' short-circuit, the value of the expression
# DOES depend on these values.
# This would be a better example:
>>> actual = 2
>>> b = (1 and actual) or actual
>>> b
2

in OCaml:

# let x : int = (ignore 1) ; 2 ;;
val x : int = 2

note that the ;; would not be in your file, it is only used in the toplevel (a.k.a. REPL) to tell it that you’re done typing.


I’ve always thought the most confusing thing for newcomers was the top-level double semicolon.
This thread confirms this.

To all future newcomers who are not already familiar with a REPL: do not use the toplevel until you have grasped the basic syntax of OCaml (whose kernel is really simple) by writing actual programs.

@rene_sax The double semicolon (;;) is not really OCaml syntax. Let’s consider it toplevel syntax.

OCaml syntax is all about building values out of expressions (an expression is a combination of values). For instance let negate s = "not " ^ s and x = "glop" in if happy then x else negate x is an expression returning some string value. In OCaml as in a few other languages (but neither C nor Python), almost anything is an expression that has a value and so can be combined freely.

To build useful programs you need only one more thing: a way to bind a value to a global identifier (so that you can define functions and variables that can be called/used by other programs, and so that you can write programs with an entry point). To do this, you need a special let statement that you write alone, not as part of any other expression. For instance:

let main =
  let happy = Random.int 2 = 1 in
  let message =
    let negate s = "not " ^ s and x = "glop" in
      if happy then x else negate x in
  print_string message

Notice there is not a single semicolon in there.
Notice also that the let main = is different than the two other lets: because it is written at the top level of your file, outside any other expression, then it’s a special let statement with no value, but with the effect of binding some value to a global variable named main.

You can write this in a file program.ml and then compile it with ocamlc program.ml -o program and then run the program with ./program.

Now, OCaml is a language that accepts side effects. A few special constructs of the language have a side effect. For instance, the expression record.field <- 42 (whose value is ()) sets the value of some mutable record field to 42.
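As a small illustration, assuming a hypothetical record type counter with one mutable field (neither name comes from the thread):

```ocaml
(* Hypothetical record type, for illustration only. *)
type counter = { mutable count : int }

let c = { count = 0 }

(* The assignment is itself an expression of type unit;
   its interesting part is the side effect on c. *)
let result : unit = (c.count <- 42)
```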

Calling functions written in C can also result in side effects (in addition to those functions returning a value).

In the program above this happened twice: the Random.int function, as a side effect, updates the state of its internal random generator, and the print_string function, in addition to returning the () value, also prints something on the terminal.

In a language with side effects, it is useful to sequence the side effects, so that one takes place before another one. For instance, we would like that, after having printed the message, the above program also prints a newline character.

OCaml syntax for sequencing two expressions is the ;. expression_1 ; expression_2 means: evaluate expression_1 (and perform its side effects), discard its value, then evaluate expression_2 (and perform its side effects), and let the value of the whole sequence be that of expression_2.

Since the value of expression_1 is discarded, the compiler will warn if that value had any other type than unit, as it means you have computed a value that’s not used, which is suspicious.

We can thus change the program into:

let main =
  let happy = (... etc ...) in
  print_string message ; print_newline ()

And that’s all for OCaml syntax. Notice, no ;; anywhere.
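Putting the two fragments above together, a complete version of the program might read as follows. This is only a sketch that fills the elided part with the definitions from the earlier listing; note that print_newline has to be applied to () to actually print anything.

```ocaml
let main =
  let happy = Random.int 2 = 1 in
  let message =
    let negate s = "not " ^ s and x = "glop" in
    if happy then x else negate x
  in
  (* Two effects in sequence; the value of the whole body is (). *)
  print_string message;
  print_newline ()
```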

So, why is it so frequent to encounter OCaml code fragments that are littered with those ;; ?

For two reasons:

  1. there is a toplevel, aka a REPL. It’s a tool that evaluates OCaml expressions and prints their value. Quite useful. But when it reads user entered expressions (and let statements, that are also understood and also printed), the top-level accepts multi-line entries. Then, how does it know when a user expression (or let statement) is complete ? You guessed it: it waits for a double semicolon. This is a completely different syntactic device than the semicolon used as the sequencing operator in the language, see ?

  2. It might not be factually correct, but my impression is that the OCaml inventors had an idea that in hindsight might not have been so great: to help beginners learn OCaml, they decided to accept (and ignore) double semicolons after let statements, so that a beginner copy-pasting code from the toplevel into a real source file would not be punished with a parse error. Following suit, a lot of OCaml programmers and tutorial writers wrote their programs terminating every let statement with a double semicolon so that beginners could copy-paste the other way around, from the program to the toplevel.

That’s why many OCaml code fragments feature those confusing double semicolons, that are nowhere to be seen in actual programs.

Hope that helped.


The double semicolon is useful when you have “raw values” (I just made this name up, don’t bother googling) meaning just some value in your file without binding it. So for instance the following is a valid file which prints HELO:

print_string "HELO"

the double semicolon comes in for instance when there is a preceding let binding:

let s = "HELO";;

print_string s

Without the ;; it parses as the literal string “HELO” applied to print_string and s, which is of course nonsense.

Raw values are mostly used in the toplevel to print a value (i.e. when you enter 2 + 2;; and ocaml says - : int = 4). In actual code, AFAICT, it’s more idiomatic to use a let () = ... binding, which would look like

let s = "HELO"

let () = print_string s

and needs no ;;.

This suggests that evolution could have happened the other way around compared to what I described above: maybe OCaml modules were initially composed of several global statements separated by double semicolons, and then the only useful global statement turned out to be the let statement, but since the double semicolon is not required between let statements, it became optional.

Maybe some old-timer will drop by and clarify.

A common justification to keep the double-semicolon nowadays is that it ensures that syntax errors don’t get reported far away from where they actually occur. For example:

let foo bar =
  some code ~f:(fun x ->
    bar x);

let bar = 3

let baz = 4

The trailing semicolon in the definition of foo is incorrect, but the “location” of the syntax error is actually the line defining baz.

Adding a double-semicolon after the foo definition would actually cause there to be no syntax error in this case. Another example:

let foo bar =
  let () =
    some code ~f:(fun x ->
      bar x)
  in

let bar = 3

let baz = 4

This syntax error is reported on the same line as before, starting with let baz. Adding the double-semicolon after the foo definition places the syntax error on the double-semicolon, so it’s obvious that the definition isn’t complete.