Has "let" syntax changed recently? And why are the REPL and compiler behaving differently?

I ran into two related problems starting ocaml today. Maybe I’ve made some bizarre error, but I can’t see how. The first is this

let hello = "hello world" in print_endline hello

It doesn’t work. And according to lots of material around the internet, it should - it’s copied straight from Values, expressions, and bindings — OCaml From the Ground Up. The error message was generic…

File "two.ml", line 22, characters 26-28:
22 | let hello = "hello world" in print_endline hello
                               ^^
Error: Syntax error

But I took a guess that this would fix it -

let () = let hello = "hello world" in print_endline hello

And it did. Has there been a change to ocaml syntax in the past few years to explain why the first version fails??? If so, I would suggest warning people. And even making some effort to contact people to have the old version removed. And if there hasn’t been a change, what is happening and can someone point me to a clear explanation of the relevant rules???

(Added: but that code DOES work in the REPL. Which seems crazy - the point of a REPL is to be able to experiment and see what will work in compiled version, surely?)

The second problem is this…

let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  let hello = "foo to you" in print_endline hello;
  List.map2 ( * ) a b

That works. But this doesn’t

let hello = "foo to you" in print_endline hello;

…And the explanations I’ve seen of the use of the semicolon wouldn’t have led me to predict this.

My best guess at the moment is that Unit return types could once be ignored but this is no longer the case and they have to be absorbed by a () or _. Unless you are in the body of an expression, in which case you can ignore them by ending with a ;

..If so, this needs explaining clearly - and I would suggest early on, because it is going to come up as soon as people start writing code snippets to test their understanding, which is what any good programmer will do in the first few minutes of learning a language. Or if I’m wrong, then whatever IS right needs explaining. And what is happening with the REPL???

The compiler and the REPL parse the exact same syntax and nothing has changed in the syntax of lets. In particular, the line

(* start of file *)
let hello = "hello world" in print_endline hello

works in both, at least at the start of a file.

However, this is a top-level expression since it is using let ... in ... without a top-level let. Moreover, top-level expressions must be separated from other top-level expressions by ;;:

1 ;; (* one top-level expression*)
2 ;; (* another *)
let x = 3 (* this is a definition *)
let y = 4 (* another definition *)
;;
x + y (* a final top-level expression *)

Since your error is reported at line 22, it is likely that your error was trying to mix toplevel expressions and definitions, something like

let x = 2
let y = 4 in x + y

which is a syntax error on the in.
This is why it is often advised to only use definitions to avoid being confused at first.

let x = 2
let z =
  let y = 4 in
  x + y

If I am not mistaken, your second issue is that you missed that

let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  let hello = "foo to you" in print_endline hello;
  List.map2 ( * ) a b

is equivalent to

let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  let hello = "foo to you" in
  print_endline hello; List.map2 ( * ) a b

Notably, newlines are always the same as spaces in OCaml.
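
To make the grouping explicit, here is a small sketch of the same definition with parentheses added by hand around everything that follows the last in. The parentheses are only my illustration; as far as I can tell the parser groups the unparenthesised version the same way.

```ocaml
let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  let hello = "foo to you" in
  (* the body of the last let is the whole sequence e1; e2 *)
  (print_endline hello;
   List.map2 ( * ) a b)
```

So the value bound to d is the result of List.map2, and the print_endline is just a side effect that happens on the way.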

No, the change is between what is shown in the book and the example you ran. In the book, the example shown is:

$ cat ./test.ml 
let hello = "hello world" in print_endline hello

$ ocamlopt -o test ./test.ml 

$ ./test 
hello world

Notice that the file test.ml is a single line of code.

In your error message, the line number given is 22, meaning you put other code before that line. That’s why you’re getting the error. When you have a file of code and then you put a let expression by itself in the file, it’s ambiguous and can’t be parsed. A let expression is one of the form let ... = ... in ....

If you had put a let statement by itself in the file, it would also have worked:

let () =
  let hello = "hello world" in
  print_endline hello

Because it would be unambiguous to the parser.

Imho, the example given in the book can be improved by using this let statement, because it would continue to work even if the source file contained other statements.
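
As a rough sketch of what I mean, here is the book's one-liner sitting in the middle of a file. The names greeting, target and answer are hypothetical, they only stand in for whatever other code the file contains.

```ocaml
(* hypothetical definitions standing in for the "other code" in the file *)
let greeting = "hello"
let target = "world"

(* the book's one-liner, wrapped so that it is a definition *)
let () =
  let hello = greeting ^ " " ^ target in
  print_endline hello

(* more definitions can follow without any ;; *)
let answer = 42
```

Every item in the file starts with let and none of them is a bare let ... in, so the position of the printing code in the file no longer matters.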

Umm, no. That is not “showing” the error. That is causing the error through bad documentation. I can’t think of any other language where a single line of code is going to behave differently to the same line of code in a file. “Showing” would be saying “Hey, unlike any other language you have used, this code works as a single liner or at the top of a file, but not anywhere else. Because -” And even if the because is “We will explain why later” it is still better than nothing.

Yes. This is indeed true: I couldn’t get an error message for line 22 if there weren’t other lines before it. But, no, it doesn’t tell me anything useful.

I’m reasonably certain that the in-less

let hello = "hello world";;

is a let expression too…

And I appreciate the answer, but this is missing the point. Which isn’t writing optimal code but explaining key language rules to people when they need to know them.

Then people should be told this - early - and why using a let()= fixes the problem.

It’s probably too late, but if the ocaml community want their language to be widely adopted, it needs to start doing a much better job of communicating the basic rules of the language.

FWIW, it’s not - let foo = bar is a definition which is a module item, not an expression.

(granted, pointing to the grammar part of the manual isn’t exactly beginner’s documentation…)
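
A minimal sketch of the difference, with made-up names (foo, bar, tmp): a definition binds a name for the rest of the file, while a let ... in expression has a value and can therefore be nested inside a larger expression.

```ocaml
(* a definition (module item): binds foo for the rest of the file *)
let foo = 1

(* a let ... in expression: it has a value, so it can sit inside
   a larger expression, which a definition cannot *)
let bar = (let tmp = 2 in tmp + foo) * 10

let () = Printf.printf "%d\n" bar
```

Here tmp is only visible inside the parentheses, whereas foo and bar stay visible in everything that follows.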

Well, no. If the REPL and only the first freaking line work the same way, then most of the file doesn’t work the same way - not from the perspective of someone trying the language out. This is MORE insane than the possibilities I considered.

Surely - and this isn’t a small point, although I suspect it may be tricky for a non-native speaker, that However should be Because? If it is a However, then what are the consequences?

And also - just WHY? for that entire sentence. This seems to be key but nothing I read indicated it.

So tell people that. Early. The only references I had seen to ;; were in the context of the REPL and they were phrased to imply it was just something the REPL needed. This seems incredibly important to know.

I haven’t seen this advice anywhere - and I speedread a stack of introductory material - but I agree that it is good.

My feeling is that this is a trivial issue that could easily be fixed with good communication. If people stay on the rails of a tutorial it might not come up, and in academia it will get fixed by a TA. But in the real world no decent programmer will stay on those rails - and they’ll probably be playing with other rival languages at the same time and judge you harshly. I shudder to think what this has done to the ocaml adoption rate.

Sorry: no, that wasn’t the issue at all. Again, my best guess was

..No mention of newlines or spaces. And actually, you haven’t said anything that makes me feel I really understand even now - unless that second sentence is more or less right.

Again, my overall impression is that there are probably two or three key things people need to know about ocaml early, that the community isn’t explaining them, and that this is partly because you take them for granted, and partly because they make explaining harder. But that messing this up is why ocaml isn’t widely used.

I’ll play around some more and I’m sure I will get it - but I’ve used more languages than most programmers and I have time on my hands. I really think this is one of the reasons your adoption rate is so low compared to eg Elixir - not the language feature, but the lack of communication.

Of course, it is not only the first line that works in this way. But the advantage of the first line is that the context is always the same. And your issue is that you are trying the OCaml equivalent of copy-pasting a line of python with a wrong indentation in the context (or an expression in a statement context in C). Starting with the first line avoids this source of confusion.

People mostly only use toplevel expressions and ;; in the REPL. If you look at ocaml.org’s tutorial (Modules · OCaml Documentation) or the reference manual, the double semicolon disappears outside of the REPL.

In particular, a module (or a ml file if you prefer) can always be written as a sequence of toplevel definitions

let name_1 = expression1
let name_2 = expression2
let name_3 = expression3
let _ = expression4
let () = expression5

A point to note is that there are no ;; and that the toplevel definitions are not followed by in.
The last point is sometimes a source of confusion: the construct

let name_1 = expr_1 in expr_2

is an expression itself and not a definition. Thus when you write

let hello = "hello world" in print_endline hello

you are writing an expression that is not valid in a context where a definition is expected.
But in order to support a REPL-oriented syntax, expressions are allowed at the toplevel if they are separated from definitions or other toplevel expressions by ;;. Thus you can write

let def1 = ...
let def2 = ...
;;
Format.printf "Hello";;
Format.printf "world";;
let def3 = ...
let def4 = ...
let def5 = ...
;;
Format.printf " !@.";;

Most people purposefully avoid mixing definitions and toplevel expressions in such a way. However, if by accident someone transforms the fourth definition into

let def3 = 0
let def4 = 2 in def4 + 3
let def5 = ...

then they have transformed let def4 = 2 in def4 + 3 into an expression and we are back to mixing expressions and definitions.
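
For illustration, here are two ways out of that situation, sketched with concrete values in place of the ... above (tmp is a name I made up, and def5 is given an arbitrary body):

```ocaml
let def3 = 0

(* Fix 1: keep it a definition by nesting the let ... in on the right of = *)
let def4 = let tmp = 2 in tmp + 3

(* Fix 2: keep it a toplevel expression, fenced off with ;; on both sides *)
;;
let tmp = 2 in print_int (tmp + 3)
;;

let def5 = def3 + def4
```

The first form is the one most code bases use; the second only shows that ;; is what lets an expression sit between definitions.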

Concerning single semicolon ;, it is best to think of it as an operator between expressions. Thus the line

let hello = "foo to you" in print_endline hello;

is as incomplete as

let one = 1 in one +
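
A completed version might look like the sketch below, where the ; has an expression on both sides. The name length_after_greeting is only illustrative.

```ocaml
(* e1; e2 evaluates e1, throws away its (unit) result, then yields e2 *)
let length_after_greeting =
  let hello = "foo to you" in
  print_endline hello;      (* left operand of ; *)
  String.length hello       (* right operand: the value of the whole sequence *)
```

Read this way, a trailing ; with nothing after it is missing its right operand, just like the trailing + above.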

Pretty much any language with compile-time syntax checking will treat an expression as a syntax error if it is placed outside of expression context. Eg if you put the number literal 1 in a Java file by itself on a line, outside of a method definition, the Java compiler will treat that as a syntax error.

the in-less…let hello = "hello world";;…is a let expression too…

It’s not. It’s a module item, or as I typically explain it to beginners, a ‘statement’.

people should be told this - early - and why using a let()= fixes the problem.

I agree. This is touched on briefly in the tutorial: A Tour of OCaml · OCaml Documentation

But it does confuse a good number of people and should be made clearer. Note that the OCaml tutorials are all editable via PR and anyone can send PRs to improve them.

Yes: all languages do indeed have syntax. The point is that no other language community has docs explaining a key element of syntax so badly. None. And I’ve written elisp, so I don’t say this lightly.

Also… You seem to use top level/toplevel - and gawd help us, top-level - to mean both the REPL and the outermost scope of definitions. Hasn’t anyone ever suspected that this would lead to confusion???

Just take a look at results thrown up googling “single vs double semicolon”:

..a stack load of links removed because they’re not allowed..

A lot of people in these discussions say that ;; is REPL only - then others prove that it isn’t. Even people writing ocaml don’t seem to understand some of the basic rules. This is not a dignified way to program: it’s coding through ritual, not understanding. In a language that is supposed to be about clarity.

Eg

the toplevel reads in lines until it sees a ;; , then it ignores everything after that and tries to interpret everything before it as valid OCaml code. The ‘comments’ that you had after ;; were just ignored.

That value isn’t seen interactively – the examples given immediately after that part of the text are all of files. I doubt the value myself; the effect for someone learning OCaml is that inserting ;; makes syntax errors go away:

This is a mess. Much worse, it’s a poorly explained mess that will blow up in the face of anyone doing early experiments with the language. The closest thing I’ve seen to a decent explanation is what the OP in that thread says he concluded from experiment:

I came up with these rules to follow that fixed my wanting to use ;; in source files:

  1. don’t use “let … in” in toplevel code
  2. write “let () =\n do_this ();\n do_that ()” instead of
    “do_this (); do_that ()” in toplevel code
  3. treat ; like an infix operator rather than a terminator or an
    optional separator. Type it in expectation of a right-hand-side.

‘toplevel’ meaning, not as part of the body of some other expression.
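
A sketch of rule 2 in a source file. do_this and do_that are hypothetical helpers named after the rules above, and the Buffer is only there so that the order of the two calls can be observed.

```ocaml
(* do_this and do_that are hypothetical helpers, named after the rules above *)
let log = Buffer.create 16
let do_this () = Buffer.add_string log "this;"
let do_that () = Buffer.add_string log "that;"

(* rule 2: wrap the sequence in let () = instead of writing it bare *)
let () =
  do_this ();
  do_that ()
```

The single ; sits between the two calls (rule 3) and nothing in the file needs ;;.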

…But people shouldn’t have to code that way. Key language rules should be clearly stated. And from that same discussion

lpw25 > ;; is a valid part of the syntax – but is an unnecessary one and it is not supported in the REPL. ;; should be read as a prefix meaning “the following is a item-level construct”.

But

Double semi-colon isn’t needed in the language proper. It’s used to “submit” a batch of lines to the REPL

..So the people coding in ocaml can’t agree on something this basic. And worse, I just tested whether ;; works in the REPL and it does - and the person saying it doesn’t is listed as a “Maintainer”..!?

Again - this is a mess. In fact it’s a mess like nothing I’ve ever seen before - and google shows that this has been a source of confusion for years. Ocaml should be a trivial language for me to pick up - I’ve written lisp, Scala, Erlang and even some Haskell. Instead I’ve had to detour to investigate this incredibly trivial point and discovered a world of confusion among people using the language.

No, we are all morons here.

To anyone finding this thread after googling, read

The article is called “What I wish I knew when learning OCaml.” And everything there is basic material that should be in introductory tutorials, but “This is a poorly structured list of common questions and answers about OCaml I had to ask myself or have been asked often.”

Also, looking over those links I couldn’t post, I’m fairly certain that people in the same discussion would use variants of toplevel to mean the REPL or the outer scope without realising that the other person was using the alternate meaning…

This is my best explanation of what people need to know. I probably have something wrong: please correct me if I do -

(* newline doesn’t end an expression *)
let hello =
  "hello world";;
print_endline hello

(* So what does??? *)
(* compiler recognises ;; as end of expression *)
(* without ;; would try to assign the print_endline to hello *)
let hello = "hello world";;
print_endline hello

(* BUT compiler recognises let () = as start of new expression, therefore old finished *)
let yallo = "yallo woild"
let () = print_endline yallo

(* And most of the time, ocaml is written as a series of let’s. So the ;; isn’t needed *)
(* The outer scope let d defines an expression that sets d to be the product of two lists; the compiler knows the expression is finished when it reaches code that gives d a value *)
let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  List.map2 ( * ) a b

(* BUT!!! The REPL still uses ;; to terminate entry of an expression!!! *)
(* also a ;; wouldn’t cause an error after the last line above - but it would be pointless if the next line is a let, which it really should be. lots of ;; is a sign of imperative code *)

(* single ; doesn’t end expression, it just tells the compiler that the value from that line should be ignored - only works if it is “unit” - sort of like C++ void *)

let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  List.iter (Printf.printf "%d ") a;
  List.map2 ( * ) a b

(* Do NOT use ; to terminate an expression! it means, >Ignore this (unit) value and CONTINUE with the expression.< So will do opposite of ending! *)

(* Could also have *)
let d =
  let a = [ 1; 2; 3; 5; 7 ] in
  let b = [ 1; 2; 4; 8; 16 ] in
  let () = List.iter (Printf.printf "%d ") a in
  List.map2 ( * ) a b

(* ..The let() = has matched the unit return value - that’s what () means - and the in does what it normally does. A let _ would also work, but removes the type checking, so why do it? *)
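
To make that last comment concrete, here is my own sketch of the difference between let _ and let (). compute is a hypothetical function that returns an int rather than unit.

```ocaml
(* compute is a hypothetical function that returns an int, not unit *)
let compute () = 6 * 7

(* let _ accepts a value of any type, so a forgotten result goes unnoticed *)
let _ = compute ()

(* let () only matches unit, so the result has to be used explicitly;
   writing let () = compute () here would be rejected by the type checker *)
let () = Printf.printf "%d\n" (compute ())
```

If I have this right, that is the type checking I meant: () is a pattern of type unit, so the compiler complains when the right-hand side is anything else.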

Assuming you are interacting in good faith, let expressions gives a reasonable explanation. I would recommend all of chapter 2 to get a better understanding.

You are nearly correct, except for a question of vocabulary: what you call an expression is called a sentence (and a sentence is either an expression or a definition).
In particular, with this replacement, the two next blocks are correct.

(* newline doesn’t end a sentence *)
let hello =
  print_endline 
  "hello"
(* So what does???
  compiler recognises ;; as end of a sentence
  without ;; would try to assign the print_endline to hello *)
let hello = "hello world";;
print_endline hello

The part about let is nearly correct, but all let are recognised (and type, class, module, any keyword that starts a definition) not only let ()

(* BUT compiler recognises let ... = as start of new sentence, 
  therefore old finished *)
let yallo = "yallo woild"
let punct = "!"
type pair = { left:int; right:int }
let () = print_endline (yallo ^ punct)

For the next part, it would be more precise to state that the body of an outer scope let is an expression, but code that gives a value works.

The part about ; not being a terminator but an operator on expressions is also correct. Note however, that

let one = ("Hello" ^ " here") ; 1

is valid but often considered bad style because the ; is discarding a non-unit value.
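
If the value really has to be thrown away, one common way to say so is the standard ignore function, which has type 'a -> unit. A small sketch:

```ocaml
(* ignore : 'a -> unit turns the discarded value into unit explicitly *)
let one = ignore ("Hello" ^ " here"); 1
```

With ignore the left-hand side of ; has type unit, so the discard is visible in the code. As far as I know the compiler reports a warning about a non-unit statement for the version without ignore, but that depends on which warnings are enabled.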

Could I ask how you settled on using https://ocamlbook.org/ as one of your beginner materials?

I think that the particular choice for introducing the structure of programs there is problematic, because it will lead readers to stumble in exactly the way you have. In defense of that resource it introduces itself as unfinished:

This is a work in progress introductory book on the OCaml programming language.

Right now the book is obviously incomplete, but together we can complete it faster than I can do it alone.

I think we can and should definitely do better in our on-boarding experience. That said, if you had come through ocaml.org > learn > to arrive at A Tour of OCaml, you’d find that the first section introduces Expressions and Definitions. That should help a bit already, and PRs to improve would surely be accepted.

If you had found the language manual, you might have found OCaml - The core language even more helpful.

So I’m wondering if part of the challenge you are facing may be from having stumbled into a less happy introductory documentation path. And so, to repeat, I am curious how you ended up orientating around https://ocamlbook.org/. Perhaps this is something we could correct (by directing learners starting out to more polished sources)?

OCaml syntax is straightforward.
OCaml syntax has no ‘;;’.

The confusion is coming from the REPL. ‘;;’ is just a “command terminator” for the REPL. And let’s not even mention the other additional bits of syntax it introduces.

Just don’t use the REPL, don’t tell students about it, and everything will become simpler. Consider it an advanced feature, not a gateway for beginners.

Hard disagree on this one. OCaml has ;; in source files for the unambiguous termination of module-level expressions and definitions. It’s actually required to use it after multi-line function definitions in the Jane Street style guide. (I first wrote that it is also used heavily in code examples in RWO, but I did not recall correctly.)

I’m not a fan of this style, but it can make it easier to catch bugs.

When teaching students, my experience is that showing ;; in source files is confusing for the vast majority, because then they try to use it in place of ; and mayhem ensues. This is a discussion we often have amongst colleagues teaching OCaml. Our consensus, especially for beginners, is to tell them that ;; is only for the REPL (we use whenever possible the ocaml repl with down preloaded, which gives them a lean experience, pretty close to the default Python REPL; many thanks to @dbuenzli for down).
Otherwise, we tell them that an OCaml file is a sequence of definitions, either type t = ..., let ... = (and in time let rec and let rec ... and ...) or exception ..., which covers everything that is needed for our introductory courses.
For the more advanced course, we of course enrich the set of possible definitions with module and module type but at that point, the syntax is much less confusing for them already.
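
A small file in that style might look like the sketch below. The type, the exception and the function names are all made up for illustration; the point is only that every item starts with type, exception, let or let rec and that no ;; appears.

```ocaml
type shape = Square of float | Rect of float * float

exception Negative_size of float

let area = function
  | Square s -> if s < 0.0 then raise (Negative_size s) else s *. s
  | Rect (w, h) -> w *. h

let rec total = function
  | [] -> 0.0
  | s :: rest -> area s +. total rest

(* the only code that "runs" is wrapped in let () = *)
let () = Printf.printf "%.1f\n" (total [ Square 2.0; Rect (3.0, 1.5) ])
```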

And I tell them that whenever they see material with ;; in files, they can either ignore it if it separates two definitions or, if one of them is an expression e, replace it by let () = e.

Lastly, after a few practical labs, they tend to not use the REPL directly but rather select text in VSCode and send it to the OCaml REPL.

And as always (imho), the official OCaml manual is already pretty good. The very first section at least clearly says that ;; is used to delimit phrases in the REPL (which it calls interactive system). It only uses the term toplevel to denote a position in the program and not the REPL.
It also mentions that ;; is optional in program files. A nice addition could re-use the last example of standalone program and say that the first and third ;; are unneeded but that the second one is mandatory, since the program would otherwise be parsed as exit 0 main (), and that this ;; can be removed if one writes let () = main ().
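
A sketch of what the rewritten program could look like. This is not copied from the manual: fib and main are stand-ins, and I left out the exit 0 call so that the snippet stays easy to try.

```ocaml
(* no ;; needed: the next let ends this definition *)
let rec fib n = if n < 2 then 1 else fib (n - 1) + fib (n - 2)

let main () =
  print_int (fib 10);
  print_newline ()

(* a bare main () here would need a ;; before it, otherwise it would be
   read as more arguments to the last expression of main *)
let () = main ()
```

Written like this, the call to main is itself a definition, which is why the separating ;; can go.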
