Order of evaluation in application/tuple/...?

StrongerXi · March 27, 2021, 3:17am

The following program

let f x y = ()
;;

let print_int n =
  Printf.printf "%d\n" n;
  n
;;

let print_int_list n =
  Printf.printf "%d\n" n;
  [n]
;;

(print_int 0)::(print_int_list 42)
;;

Some (print_int 0, print_int_list 42)
;;

(print_int 0, print_int_list 42)
;;

f (print_int 0) (print_int_list 42)
;;

Gives the this output:

I’m genuinely surprised by this, although I did find resources to justify it:
https://caml.inria.fr/pub/docs/manual-ocaml/expr.html

Still, why? OCaml is not Haskell, why didn’t the language designer reinforce these evaluations to a “more natural” order (one that resembles C, Java, Python, etc.)? What’s the design rationale behind this?

cemerick · March 27, 2021, 3:25am

The right-to-left ordering when consing should be intuitive (otherwise every additional element after the head results in a concat), but I understand the confusion re: tuples and thus arguments to functions.

This goes back to the original construction of the ZINC interpreter. The topic is a recurring one:

StrongerXi · March 27, 2021, 3:27am

Thanks, this is pretty cool. TIL

yawaramin · March 27, 2021, 4:54am

Also discussed in this book Order of Evaluation of Arguments

Maelan · March 27, 2021, 2:27pm

Two comments, in addition to what have already been said in earlier threads.

First, it is not obvious that left-to-right evaluation inside an expression is natural. People commonly consider that, if your program depends on the order in which things are evaluated, then you should sequence it explicitly. You might call this a circular argument though: if the order of evaluation of arguments was specified, then using it would arguably qualify as “explicit sequencing”. However:

it is debatable how explicit or unintentional it actually is, compared to e.g. let x = e1 in e2;
using sub-expressions that both compute a value and perform a side effect is often regarded as unnatural and bad style: functional or imperative, you should pick one.

Second, enforcing an order has a run-time cost, and since the language already features a way to sequence evaluation, there is no absolute need for that additional cost. As a matter of fact, while Python and Java indeed guarantee a strict left-to-right evaluation (and pay the cost for it), C leaves is unspecified, and right-to-left does happen in practice. I do not know how systematic it is but, for one, this C code compiled with gcc:

#include <stdio.h>

int main (void)
{
	printf(
	  "1st = %i, 2nd = %i\n",
	  (printf("evaluating 1st argument\n"), 1) ,
	  (printf("evaluating 2nd argument\n"), 2)
	);
}

gives me:

evaluating 2nd argument
evaluating 1st argument
1st = 1, 2nd = 2

As I understand it, the reason is the same as in OCaml (which inherits from the ZINC machine, as mentioned): due to the way arguments are passed to functions through the stack, right-to-left evaluation is much easier and more efficient to compile.

Python and Java, being higher-level languages, focus a bit less on performance and a bit more on having a straightforward, tightly-specified semantics.

gasche · March 27, 2021, 2:44pm

For the record, I believe that life would be simpler for everyone if OCaml specified a left-to-right evaluation order and enforced it. Yes, in theory it’s better to encourage people to write code that does not rely on execution order, and it can give some optimization opportunities, but in practice we don’t have effective tools to catch this mistake, and this ends up causing subtle bugs that waste people time.

If this had to be done again, or in a future world where we don’t keep the bytecode compiler as part of the toolchain, I think we would go for left-to-right evaluation.

Another solution would be to actually have a type system that reliably catches evaluation-order-dependent program and asks them to be rewritten with explicit sequencing. This would be very nice, it is an ambitious goal, but it’s worth working torwards. (Arguably Infer solves a harder problem, so it should be possible.)

StrongerXi · March 27, 2021, 2:45pm

Thank you so much for your explanation. I definitely agree with you, especially on

using sub-expressions that both compute a value and perform a side effect is often regarded as unnatural and bad style

I wonder if this is the fault of the standard calling convention from C, where args that can’t fit in register are located in the caller’s stack frame, and thus we must push the last argument value first… This also limit tail-call if it has more arguments than the current call. I don’t know how ZINC machine works though.

StrongerXi · March 27, 2021, 2:47pm

this ends up causing subtle bugs that waste people time.

Oh man I can relate to that so much. Hours and hours of debugging plus a hard lesson.

a type system that reliably catches evaluation-order-dependent program and asks them to be rewritten with explicit sequencing

That sounds really cool.

Maelan · March 27, 2021, 4:44pm

Which order would you pick for labelled arguments, though?

Indeed. It feels dissatisfying that we promote so much programming safety when selling the language to newcomers, while at the same we have such unspecified behaviors (also, unchecked exception catching).

Would pure arrows be of any help here? I got this idea while writing my previous message, I have not thought about it too hard.

gasche · March 27, 2021, 4:53pm

I would expect left-to-right in the order of application at the call-site.

gasche · March 27, 2021, 4:55pm

Note: when we say “left-to-right”, we mean “in code writing order” (from the beginning to the end). If OCaml had an input syntax in a right-to-left-writing language (for example an all-arabic syntax), then it would naturally be right-to-left.

Maelan · March 27, 2021, 4:56pm

Yes, exactly that. It is a matter of pushing the arguments to the stack in the order expected by the calling convention. For fixed-arity functions, you could probably say that it would be enough to reverse the convention, however with variable-arity functions (which are pervasive in OCaml because of currying, but already exist in C, e.g. printf) I think it is not as simple.

I believe that you haven’t quoted the intended part of my message, but it does happen that the bullet you quoted (mixing performing side effects and computing a value) is also attributable to C, to some extent. Because of the limitation on which values you can (efficiently) pass to functions, C programmers have got into the habit of using a “return argument”, i.e. passing as argument a pointer to where the result has to be written. Then the function both performs a side effect (writing its result) and returns a value (an error status, typically).

silene · March 27, 2021, 8:44pm

Is the bytecode compiler really such an issue? I mean, just put the input code into A-normal form and that is it, you get left-to-right evaluation for free. You do not even need to modify the bytecode compiler (except for the A-normalization pass, but that is a straightforward modification). Sure, the generated bytecode will be a bit less efficient, but if people are really concerned about the efficiency of the bytecode interpreter, that can be fixed by adding a few specialized opcodes and a peephole optimization.

LdBeth · March 28, 2021, 2:53am

Probably at the same time get call/cc also for free if a modified ANF is used.

I guess the evaluation order surprises people because there’s no widely available language report for OCaml that states this clearly. Scheme is famous for unspecified evaluation order and that doesn’t seem to stop it from been a good algorithmic language. Implementers have even seriously considered if a random generator should be used to ensure the evaluation order different every time.

jeffsco · March 28, 2021, 5:07am

The OCaml manual says specifically that evaluation order is unspecified.

Section 7.7.1.
Function application
The order in which the expressions expr, argument1, … argumentn are evaluated is not specified.

cvine · March 28, 2021, 11:23am

With multicore on the horizon you might want to leave open the possibility of evaluating arguments in parallel.

gasche · March 28, 2021, 12:41pm

… no! You just cannot expect parallelism at this fine-grained level to perform well (there have been decades of research on automatic parallelisation, and it basically never works). And because we are talking about effectful code (in general), if the programmer does not explicitly express their intent with respect to ordering/synchronization, then running the code “implicitly” in parallel would be a correctness disaster.

cvine · March 28, 2021, 4:28pm

I understand your point, but if evaluation of some or all of the arguments has side effects which may affect the evaluation of the others, you are already in trouble with ocaml’s currently unspecified order of evaluation. In such cases, the programmer should already be taking care of evaluation order explicitly, say with appropriate let bindings. The thing about allowing parallelism is that, where side effects affect evaluation of arguments, you add possible data races to the mix. The method of mitigation, or at any rate one of the methods of mitigation (forcing the evaluation order by hand), is similar.

Maelan · March 28, 2021, 5:26pm

Fun fact: in C, not only the order of evaluation is not specified, but it is in fact non-existent. The norm explicitly says that a compiler is allowed to interleave evaluation of several arguments (~~and I think I remember something about interleaving them with the evaluation of the called function, too, but I would have to check~~ that was wrong, see below). And there you are, race conditions without even multithreading.

For OCaml, the texts that I know of (i.e. the aforementioned manual) remains vague.

Edit: My comment is also relevant to @silene’s message below. We were typing simultaneously, a race condition again.

silene · March 28, 2021, 5:27pm

Not quite. There are a lot of code that have side effects that are observable, but for which the order does not matter. The simplest example is presumably

let fresh = let x = ref 0 in fun () -> let y = !x in x := y + 1; y

More often than not, the callers of this function do not care which integer they get. They only care that the one they get is different from all the other ones. So, the order of the calls does not matter.

But, if you now invoke this function twice in parallel (rather than sequentially), then all hell breaks loose. Parallelization brings a whole new class of bugs (data races, as you explained) that simply do not exist with unspecified order of argument evaluation.

Another example is a memoizing function. It has side effects, but the order of evaluation does not matter. However, data races certainly do.

Topic		Replies	Views
Evaluation order Learning	6	1594	May 30, 2023
Tuple and function execution order Learning tuple	9	2372	February 11, 2020
Evaluation order considered important for newbs to learn early Learning	4	929	November 3, 2017
About the order of definitions Learning	33	2117	April 27, 2023
Why are OCaml function arguments evaluated right-to-left? Learning language-design	2	6143	November 30, 2017

Order of evaluation in application/tuple/...?

Related topics