Anyone know the answer to this? I was just reminded in another thread that this order prevents the definition of the (>>) operator, which sequences 2 monadic computations. Instead, we have to use the ugly >>= fun () -> paradigm.
The reason we can’t have (>>) is that its right-hand side (the latter part of the computation) would be evaluated before its left-hand side (the former part of the computation). Of course, we could have (<<), but given the way monadic computations are constructed, I don’t think it would be very useful.
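Here's a small sketch of the problem, in case it helps. It assumes a deliberately impure toy "monad" where constructing a value performs the effect immediately; the names `say`, `log`, `run_seq` and `run_bind` are made up for illustration and don't come from any real library. Note that OCaml leaves the evaluation order of arguments unspecified; the current compilers happen to go right to left.

```ocaml
(* Toy impure "monad": a value of type 'a m is already finished,
   so any effect happens when the value is constructed. *)
type 'a m = Done of 'a

(* Hypothetical log recording the order in which effects happen. *)
let log : string list ref = ref []

(* Hypothetical helper: records [s] right away, then returns. *)
let say (s : string) : unit m =
  log := s :: !log;
  Done ()

let ( >>= ) (Done x) f = f x

(* The tempting definition of (>>). *)
let ( >> ) (a : unit m) (b : 'a m) : 'a m = a >>= fun () -> b

(* Both operands are ordinary arguments here, so the order in which their
   effects fire is whatever order the compiler evaluates arguments in. *)
let run_seq () =
  log := [];
  ignore (say "first" >> say "second");
  List.rev !log

(* The second effect sits under a lambda, so it cannot fire before the first. *)
let run_bind () =
  log := [];
  ignore (say "first" >>= fun () -> say "second");
  List.rev !log
```

With the usual right-to-left argument evaluation, `run_seq ()` should record "second" before "first", whereas `run_bind ()` records them in program order whatever the evaluation order is.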
I think there are people more knowledgeable about the history, but I vaguely remember that this comes from the design of the ZINC abstract machine, which is the basis of Caml Light (a predecessor of OCaml).
So why does the ZINC abstract machine choose this order? Again, I vaguely remember that ZINC was inspired by the Krivine machine. The idea is that you first push all the arguments onto the stack, then pop each argument and evaluate it.
My memory is faulty and I am by no means an expert. Please correct me if I’m wrong.
Yes indeed, @yoriyuki’s explanation above is correct. The right-to-left evaluation order is key to making the bytecode interpreter (which is still based on the ZINC design) efficient, and more precisely to making curried functions almost as efficient as n-ary functions. Evaluating an expression pushes its value onto the stack, and reducing a function pops an argument from the stack, so evaluating arguments from right to left puts the values on the stack in the correct order for the following applications (several in the case of curried functions, fun x -> fun y -> ...) to consume them from left to right. (Left-to-right evaluation would require pushing arguments to a FIFO structure rather than a stack, but a stack makes access to recently defined values faster, so it’s typically a better choice for expressions.)
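To make the stack argument concrete, here is a toy model. It is only a sketch of the idea, not how the actual ZINC machine or the OCaml bytecode interpreter is implemented; `push_args` and `apply_curried` are hypothetical names.

```ocaml
(* Toy model: a stack holding already-evaluated argument values. *)
let stack : int Stack.t = Stack.create ()

(* Evaluate the arguments right to left: the last argument is pushed
   first, so the first argument ends up on top of the stack. *)
let push_args (args : int list) : unit =
  List.iter (fun v -> Stack.push v stack) (List.rev args)

(* A curried function consumes one argument per application.
   Each pop yields the next argument in left-to-right order. *)
let apply_curried (f : int -> int -> int -> int) : int =
  let x = Stack.pop stack in
  let y = Stack.pop stack in
  let z = Stack.pop stack in
  f x y z

(* The digits show which argument landed in which position. *)
let result =
  push_args [1; 2; 3];
  apply_curried (fun x y z -> (100 * x) + (10 * y) + z)
```

Here `result` is 123: pushing 3, then 2, then 1 leaves 1 on top, which is exactly what the outermost fun x -> ... wants to pop first.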
That said, there is no problem with evaluation order if you use monads with pure terms, that is, if 'a m is really a pure term whose value describes an effectful computation that returns an 'a; this is what monads were invented for. The reason you reject an operator (>>) : unit m -> 'a m -> 'a m is not that the monadic effects in the 'a m would be executed before those of the unit m (this does not depend on the language's evaluation order), but that you use a library (such as Lwt or Async) that has a monadic interface but still performs impure side effects when evaluating values at monadic type.
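A minimal sketch of the pure case, assuming we represent 'a m as a thunk that does nothing until it is explicitly run. This representation and the names `say`, `run` and `log` are illustrative assumptions, not how Lwt or Async work.

```ocaml
(* A value of type 'a m only describes a computation;
   nothing happens until [run] is called on it. *)
type 'a m = unit -> 'a

let ( >>= ) (a : 'a m) (f : 'a -> 'b m) : 'b m =
 fun () ->
  let x = a () in
  (* the effects of [a] happen here, first *)
  f x ()

(* Safe here: evaluating either operand only builds a closure. *)
let ( >> ) (a : unit m) (b : 'a m) : 'a m = a >>= fun () -> b

let run (a : 'a m) : 'a = a ()

(* Hypothetical log recording the order in which effects happen. *)
let log : string list ref = ref []

(* Hypothetical helper: describes the effect of recording [s]. *)
let say (s : string) : unit m = fun () -> log := s :: !log

let order =
  log := [];
  run (say "first" >> say "second");
  List.rev !log
```

Whichever operand of (>>) the compiler evaluates first, the effects only fire inside `run`, in the order fixed by (>>=), so `order` is ["first"; "second"].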