Why don't we have a composition operator in Pervasives?


#41

Ok, I better understand what you want to do with the Staged module but I don’t find this very natural.

Maybe could I try an hypothesis: the pipe |> is a primitive, but the reverse composition %> and the composition % combinators are not. Here their definitions:

let (%>) f g x = g (f x)
let (%) f g x = f (g x)
external (@@) : ('a -> 'b) -> 'a -> 'b = "%apply"
external (|>) : 'a -> ('a -> 'b) -> 'b = "%revapply"

According to batteries source code, they were introduced into the language for version 4 in July 2012.

Now, consider this little benchmark (I also added Staged in the test):

open! Core
open Core_bench
open Base
open Re2

let (%>) f g x = g (f x)
let (%) f g x = f (g x)
(*let (|>) x f = f x*)

let f : int -> string -> int = fun x ->
  ignore (Regex.(matches (of_string "[0-9]+") "foo"));
  fun y -> x + String.length y

let h : int -> (string -> int) Staged.t = fun x ->
  ignore (Regex.(matches (of_string "[0-9]+") "foo"));
  Staged.stage (fun y -> x + String.length y)
  
let g : int -> int -> int = fun x y -> x * y

(* (fun f g -> fun x -> x |> g |> f) (g 2) (f 1) *)
let pipeline_composed : string -> int =
  g 2 % f 1
  
let pipeline_revcomposed : string -> int =
  f 1 %> g 2

let pipeline_with_pipe : string -> int =
  fun x -> x |> f 1 |> g 2

(* a quasi-inline version of pipeline_composed *)
let pipeline_with_pipe_bis : string -> int =
  let f_applied = f 1 in
  let g_applied = g 2 in
  fun x -> x |> f_applied |> g_applied

let pipeline_composed_unstage : string -> int =
  g 2 % Staged.unstage (h 1)
  
let pipeline_revcomposed_unstage : string -> int =
  Staged.unstage (h 1) %> g 2

let pipeline_with_pipe_unstage : string -> int = 
  let h_unstage = Staged.unstage (h 1) in
  (fun x -> x |> h_unstage |> g 2)

let () =
  let composed = Test.create
    ~name:"composed"
    (fun () -> pipeline_composed "foo")
  in
  let revcomposed = Test.create
    ~name:"revcomposed"
    (fun () -> pipeline_revcomposed "foo")
  in
  let with_pipe = Test.create
    ~name:"with pipe"
    (fun () -> pipeline_with_pipe "foo")
  in
  let with_pipe_bis = Test.create
    ~name:"with pipe bis"
    (fun () -> pipeline_with_pipe_bis"foo")
  in
  let composed_unstage = Test.create
    ~name:"composed unstage"
    (fun () -> pipeline_composed_unstage "foo")
  in
  let revcomposed_unstage = Test.create
    ~name:"revcomposed unstage"
    (fun () -> pipeline_revcomposed_unstage "foo")
  in
  let with_pipe_unstage = Test.create
    ~name:"with pipe unstage"
    (fun () -> pipeline_with_pipe_unstage "foo")
  in
  Bench.make_command [
    composed;
    revcomposed;
    with_pipe;
    with_pipe_bis;
    composed_unstage;
    revcomposed_unstage;
    with_pipe_unstage;
    ]
  |> Command.run

and its result:

Estimated testing time 1.75s (7 benchmarks x 250ms). Change using -quota SECS.
                                                            
  Name                     Time/Run   mWd/Run   Percentage  
 --------------------- ------------- --------- ------------ 
  composed                   8.44ns                  0.08%  
  revcomposed                8.44ns                  0.08%  
  with pipe             10_455.23ns     7.00w      100.00%  
  with pipe bis              4.75ns                  0.05%  
  composed unstage           8.44ns                  0.08%  
  revcomposed unstage        8.44ns                  0.08%  
  with pipe unstage          6.52ns                  0.06%

the naive use of the pipe is clearly inneficient, so I remove it from the bench.

Estimated testing time 1.5s (6 benchmarks x 250ms). Change using -quota SECS.
                                               
  Name                  Time/Run   Percentage  
 --------------------- ---------- ------------ 
  composed                7.91ns       99.27%  
  revcomposed             7.97ns      100.00%  
  with pipe bis           4.48ns       56.29%  
  composed unstage        7.93ns       99.59%  
  revcomposed unstage     7.91ns       99.25%  
  with pipe unstage       6.37ns       79.97%

Here the pipe is clearly more efficient, even with the staged version. But now, I redefine the pipe |> to not use the primitive version:

let (|>) x f = f x

and here the result:

Estimated testing time 1.5s (6 benchmarks x 250ms). Change using -quota SECS.
                                                         
  Name                  Time/Run   mWd/Run   Percentage  
 --------------------- ---------- --------- ------------ 
  composed                8.43ns                 83.26%  
  revcomposed             8.43ns                 83.23%  
  with pipe bis           8.45ns                 83.43%  
  composed unstage        9.03ns                 89.15%  
  revcomposed unstage     8.96ns                 88.45%  
  with pipe unstage      10.13ns     5.00w      100.00%  

the gain form the use of pipe |> disappear. :wink:

Maybe if the composition combinators were primitives they would be as performant as the pipe.


#42

@kantian speaks rightly. In code that exploits higher order functions, composition is indispensable, which is why it is so prevalent in Haskell code. 'duh :smile:


#43

Interesting results! I think defining a primitive version of the composition operator would still have to allocate a closure (although maybe flambda takes care of that—dunno).

In any case, I agree there should still be some “blessed” conventional name for the composition operator if only for the purpose of uniformity and predictability, whether it’s more or less efficient.


#44

My experience speaks differently. We have tons of higher order code, (including lots of applicatives and monads), and composition comes up only rarely in our codebase.

It’s worth remembering that there are different styles available, even within the same language. It’s easy to imagine that the way you’re used to doing things is the only good way, but it’s rarely true.

For what it’s worth, I don’t think it’s crazy to have a composition operator. I just tend to think it isn’t worth the cost.

Anyway, maybe we should move on from the topic, since the issue has been settled recently at least as far as the standard library is concerned, and Core and Base are unlikely to move on this in the short term.

y


#45

Be careful that you are comparing apples and oranges here. When you define let pipeline_composed = g 2 % f 1, g 1 and f 1 are arguments to the function % and so are evaluated before performing % — thus g 1 and f 1 are evaluated only once, when you define pipeline_composed. On the other hand, let pipeline_with_pipe = fun x -> x |> f 1 |> g 2 runs f 1 and g 2 for each value of x provided. The same “inefficiency” would happen if you declared let default = fun x -> g 2 (f 1 x) — would you then conclude that the naive use of function evaluation is inefficient? :wink: The function pipeline_with_pipe_bis is comparable to pipeline_composed and you clearly see that it is about twice as efficient (basically as efficient as the usual evaluation g_applied(f_applied x)).


#46

I know, that’s exactly what I already explain in this comment and that’s what I illustrate more precisely with the benchmark, and that is also why I “inline” pipeline_composed in the function pipeline_with_pipe_bis (if you read the previous exchanges in the discussion, you’ll understand why I wrote this function, this is the same @bcc32 wrote in his benchmark but I change the function f).

My point is mostly to say that not having an infix version of the composition combinators (both forward and backward) is not a good solution because replacing f x % g y by fun x -> x |> g y |> f x (which is what @Yaron_Minsky suggest) can lead to a great lose of efficiency. And if you don’t want to lose efficiency you have to evaluate f x and g y then create the pipeline. I don’t find this very natural. :wink:

It seems the only reason (but I’m not sure and don’t know if that will solve the problem of efficiency) that this solution is more efficient is that the composition combinators are not defined like this:

external (%) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = "%compose"
external (%>) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c = "%revcompose"

Moreover the type of (%>) is the double modus ponens in its direct form, which is beautiful! :slight_smile:


#47

But it may also be the other way around because creating too many closures (performing no intermediate work) will slow down your program. Having a composition operator will not be a magic bullet for this: if it evaluates its arguments, it may save some work (in rare cases I’m afraid) but then it will also create closures; if it does not, then you will gain nothing from your example. In the end, you need to know the performance of your functions and combine them accordingly.


#48

I think you miss my point. I rewrite @bcc32 benchmark:

open! Core
open Core_bench

let (%>) f g x = g (f x)
let (|>) x f = f x

let f x y = x + String.length y
let g x y = x * y

let pipeline_composed = f 1 %> g 2

let pipeline_eta_expanded x = x |> f 1 |> g 2

let () =
  let composed     = Test.create ~name:"composed"     (fun () -> pipeline_composed "hello")     in
  let eta_expanded = Test.create ~name:"eta expanded" (fun () -> pipeline_eta_expanded "hello") in
  Bench.make_command [ composed; eta_expanded ]
  |> Command.run

but this time, as you can notice, the pipe doesn’t benefit from the compiler optimization (it is not a primitive), and here the result:

Estimated testing time 1s (2 benchmarks x 500ms). Change using -quota SECS.
                                                  
  Name           Time/Run   mWd/Run   Percentage  
 -------------- ---------- --------- ------------ 
  composed         8.96ns                 75.36%  
  eta expanded    11.89ns    10.00w      100.00%

So prior to version 4.0, the mathematician point of view was equaly efficient, but now the mathematicians are second class citizen of the language. :confused:

I know, the pipe and the composition point of view are complementary, not exclusive. But I’m a bit confuse that we don’t have as efficient infix composition as pipe infix combinator.

And to illustrate differently my thought, my HIFI installation is reader %> preamp %> amp %> speaker. But when someone tell me to use the pipe |> it’s like he’s telling me: “Hey, you forgot to mention your DVD”, you should say fun dvd -> dvd |> reader |> preamp |> amp |> speaker. :roll_eyes:


#49

I’m going to sum up a few of the remarks above with a little bit of extra commentary.

There’s a tradition in OCaml to avoid composition operators. Not everyone follows this tradition, I gather, but it seems to be the traditional mainstream view. I understand that.

However, a composition operator is normal, common, and natural in functional programming. Haskell has one, SML has one, Clojure has one (prefix, of course, and with a four-character name, but still)–and last, but not least, mathematics has one. (As others pointed out OCaml has too many; it doesn’t have a single standard operator that everyone recognizes.)

OCaml is not only for traditional OCaml programmers! It can and should have a much larger community, and that can be facilitated by being slightly less idiosyncratic. Of course any sensible newbie should want to learn idiomatic OCaml, and experienced programmers will want to help them learn idiomatic OCaml, but adding a composition operator is a minor change. Why keep out a trivial syntactic enhancement that’s natural in functional programming in general, and that’s relatively easy to implement and understand? It harms no one. No one who dislikes it has to use it.

If a composition operator is used in ways that make code unreadable, that’s bad, but there are lots of ways to write unreadable code, and they all should all be avoided, other things being equal. There are clearly many uses of a composition operator that would be quite intelligible as long as it’s familiar. And I know I’m not alone in thinking that some uses of a composition operator are more readable than alternatives.


#50

What you’ve added here is an appeal to popularity: OCaml should gain users by growing in the direction of functional programmers. And it can do that with a change that “harms no one”. I’d just like to add

  1. appeals like this are invariably taken as offensive to established communities. It’s like joking about a stranger’s name, right after meeting him. Isn’t it obvious that hundreds to thousands of people have come before you and had the same impulse? “You should do as I personally desire because it will also suddenly make you popular and cool.” just comes across as a cynical attempt at manipulation - to someone who’s seen it hundreds of times already, attached to all kinds of random proposals.

  2. this particular appeal is especially hard to see under any other light, because it’s the tiny group of FP people who like pointfree code that the move is supposed to appeal to. OCaml should be “less idiosyncratic” by adding a feature that appeals to an extremely small and unusual group of programmers? Also, the argument that adding the feature will not burden other programmers relies on those programmers never having to see code that uses it, which is at least an odd argument to make about a popularity-increasing feature. Shouldn’t the massive influx of pointfree programmers result in massive amounts of new OCaml code that’s written in that style?

I really like readability arguments that make use of code examples, though. What uses of the composition operator do you have in mind that are more readable than alternatives? Do you have some real code that you’ve taken and rewritten in another way, that’s markedly more or less readable as a result of the rewrite? It’d be nice if threads like this would tend towards looking like a series of code reviews. Somehow that never happens.


#51

I really did not mean to offend. I didn’t see what I wrote that way, and I gather that it at least might have offended you, @techate. So I apologize for offending you or anyone else.

I love OCaml and I want to see it grow. (I will admit that there is a selfish element to that, because for me, as a programmer, a bigger community means more resources. However, I assume that others have similar desires.)

Yes, I would prefer that there would be a standard composition operator, and I personally find its absence surprising.

However, I absolutely do not think that OCaml, or any other well-designed language should just add whatever is popular. Uck. That leads to everything-including-the-kitchen-sink monstrosities like, … well I won’t get started seeming to insult another language now, but I have a couple in mind. fwiw there have been repeated, well-known, dust-ups in the Clojure community in which someone argued loudly that such and such feature was lacking, and I respect and am grateful to Rich Hickey (Clojure BDFL) for resisting these calls and maintaining Clojure’s (imperfect, but still) elegant utility of its own kind. One of the things that I like about OCaml is that it also keeps things simple and elegant, but like Clojure (though in a different way), with pragmatic goals in mind.

Manipulation? Well, OK, you can call it that. I just call it argument. I assume that there can be a value in discussing change, and trying to give reasons for them. There shouldn’t be anything intrinsically offensive about that. So I’m a fan of OCaml, and yeah, I am giving an argument. Yes, I think that more popularity is better, other things being equal, and yes, I did feel that in this case, things are equal (you seem to disagree, OK) and yes, I was raising the possibility a composition operator could be one of a number of factors that just rubs potential users the wrong way. It is a small thing, but small things can add up. (I love programming in OCaml, but I am an outsider in that I only started using it seriously over the last year or so. It’s possible that that means that I notice things about how something would strike a new user more clearly than more experienced OCaml programmers, but that’s speculation. I’m an unusual user in many respects, too, so that could distort my intuitions as well.)

If it’s true there is only a tiny group of functional programmers who like to use a composition operator sometimes, then that would undermine a premise of my argument, and I would accept that. I’m not sure if that’s exactly what you meant, though. I’m not talking about point-free code as a general practice, just simple examples like mapping the composition of two functions over a list without adding a new definition or wrapping it in a (fun ... -> ...). This kind of code has already been illustrated and discussed in this thread, so there’s no point in me reproducing it. I don’t think anyone would argue that such uses are particularly difficult to read.

Again, if the tone of my post was offensive, I apologize for that. I fear that I was misunderstood, but I take responsibility for it. Sorry about that.


#52

On second thought, I am not sure that I should be so apologetic, or that all of the misunderstanding was my fault.


#53

I think that most of the discussion in this thread is somewhat missing the point. There are two separate questions:

  1. Should the compose function be part of Pervasives?
  2. Should it be an ordinary function or an operator?

I think there is a reasonable argument for including a compose function in Pervasives since it is a fundamental operation and for some backends it would benefit from being implemented as an optimised primitive. Of course the same is true of the identity function, which is also not in Pervasives, but it is still an obviously reasonable proposal and I don’t think it has been discussed recently by the OCaml developers.

So assuming that we include a compose function in Pervasives, should it be an operator? That is the real question at issue. It is not about comparing:

List.map (f <*> g) l

with

List.map (fun x -> f @@ g @@ x) l

but with

List.map (compose f g) l

For this we need to consider why something should be made an operator. There are a few common reasons given for making something an operator:

  1. It is used so often and pervasively that the saving a few characters is worth the cost of having an unreadable name.
  2. It is represented by the same operator in a domain that people using it are likely to be familiar with.
  3. It would benefit from the different precedence and associativity rules associated with operators, enough to outway the cost of having an unreadable name. This is why @@ and |> are operators: they would be pretty useless without the change in precedence.

Note that “it is a fundamental operation” is not a good reason: there is nothing more fundamental about an operator than an ordinary function.

Do these reasons apply for composition? The first reason seems unlikely since most people agree that straight-forward composition does not actually occur that often. Given that we can’t use anything close enough to the standard mathematical symbol for composition, and there isn’t a standard operator for composition in programming generally, reason 2 doesn’t seem to apply either. That leaves us with reason 3: does the removal of the parentheses going from:

List.map (compose (add 1) (mul 2)) l

to

List.map (add 1 <*> mul 2) l

justify everyone having to learn that <*> means compose. This is something that reasonable people can easily disagree on. The OCaml developers have decided that it doesn’t. This is consistent with other similar decisions – they’ve shown a pretty clear preference for ordinary functions over operators in borderline cases like this. Other languages have a much lower bar for using an operator – Haskell uses operators pretty liberally – it is mostly a matter of taste.

Finally, I would like to point out that “it should be an operator in OCaml because it is an operator in mathematics” is a pretty poor argument, for a few reasons:

  • Mathematics uses operators for pretty much everything
  • Mathematics frequently deals with first-order situations where composition cannot itself be represented as a function.
  • Mathematical syntax is often badly designed, much of it coming from a time before people put serious study into the concept of syntax: it should by no means be considered the gold standard.

#54

Note that the discussion also touched on a sub-question of 1: how much optimized can such a primitive be. Clearly it should avoid creating unnecessary closures; are there other things? Given that avoiding closures whenever possible is not specific to composition, one may wonder whether it is necessary to have a primitive operation or whether flambda will be able to optimize usual functions.


#55

This is a perfect summary of the situation. I note that for the first question you agree it would be useful to have compose as an optimised primitive.

For the second one, the point is not to compare

List.map (f <*> g) l

with

List.map (compose f g) l

but to compare

let my_pipeline = f <*> g <*> h <*> i

with

let my_pipeline = compose f (compose g (compose h i))

It is not unusual to have more than two morphisms in a pipeline. Consider my example of an hi-fi pipeline:

hifi_pipeline = cd_reader %> preamp %> amp %> speaker

compared with

hifi_pipeline cd = cd |> cd_reader |> preamp |> amp |> speaker

#56

compare

let my_pipeline = f <> g <> h <*> i

with

let my_pipeline = compose f (compose g (compose h i))

Sure, and it is a perfectly sensible position to say that it is not worth making everyone learn <*> to remove those parentheses.

It is not unusual to have more than two morphisms in a pipeline

The thing is this is actually very unusual. It is not unusual to have multiple stages in a pipeline, but it is unusual for each stage to be a function of a single argument that is the previous stage in the pipeline. It is more common for the stages to be something more abstract than just a function or for stages to produce multiple outputs some of which are used by later stages. To get the exact situation you are proposing normally requires you to deliberately design the library you are using to construct these pipelines so that they can be used this way – in which case you should include a composition operator in that library.

It is also quite uncommon for each stage to be so obvious that you don’t want to name the intermediate products.


#57

I can understand and the decision belongs to maintainer of the different libraries (OCaml team for Standard Lib, and Jane Street for Core and Base). Nevertheless I do not claim that everyone should learn an infix notation, but only that those who want to use one should be able to do so (without efficiency penalty).

I consider this is mainly a matter of taste (and it’s mine) and if we could have a compose function as efficient as the pipe (as it was prior to 4.0) it’s a good compromise. Finally I agree with @Yaron_Minsky on this point:

:wink:

For the rest I do not really want to continue the debate.


#58

Right, I had no problem using the new standard operators. They’re fine, and maybe even easier wrt reading order as others said. I’d still like a compose op. in the classical math syntax, but that’s ok if it won’t be added.


#59

Oh, no, it’s not unusual at all. That’s how you write really good monadic code, too, it’s full of liftM’s and mapM’s. Anyway, the more purely functional you write, the more you miss a compose operator in my experience. :wink:


#60

Thanks very much @lpw25 for this very clear discussion (and for others’ comments on it). I think it’s helpful to think in terms of tradeoffs, as your post suggests.

Some additional thoughts.

First, I don’t think it’s a question of how often a pattern has been used, but how often it would be used if it were more convenient. The use of composition operators in other languages and in math seems relevant to this question. (As I’ve suggested earlier, I also think that it could be useful to consider future OCaml users in this calculation as well as existing ones. I’ve made it clear that this should not mean adding any feature that seems to be popular outside of OCaml, but careful consideration of existing styles of functional programming in other languages might be relevant.)

Second, I don’t feel that an operator must come to be be used pervasively for its use to be justified. If it’s used occasionally but regularly, and many of those uses make code more succinct and easier to understand, that can be enough. I’m in effect suggesting that there is a sort of relative expected value of a pattern–something roughly like the number of uses times their relative benefit compared to alternative patterns. (Even in math the composition operator is uncommon, at least in domains with which I’m familiar, but when it’s used, it often aids understanding of certain ideas, I believe.)

About cost, I gather that the main cost that we’re considering would be the confusion that results from having to learn an arbitrary operator character or characters. (I hope it would not be three characters!) I absolutely agree that that’s a reasonable concern. (Here’s an illustration of such confusion: I was confused about some of the discussion in this thread until I looked up some of the operators.)

In this case I really think that adding a new operator for composition is not much of a burden, because (a) it will be used–regularly, even if not in all code or even a lot in some code–and (b) the function represented by the operator is one that is familiar to any experienced functional programmer. It’s difficult to remember a new operator that represents an unfamiliar function, but this one is not unfamiliar. A composition operator, it seems to me, would quickly become something that anyone would assume should be memorized–even those people who never wanted to use it. Here again the presence of composition operators in other languages and in math is relevant. (As I said earlier, I’ve been using OCaml quite a bit over the last year or so, and I’ve never used any operators other than math operators, or even looked for them, except for a composition operator, which I sought out very soon after I started using OCaml, because I wanted to use it in my code to simplify a map application without defining a new function name.)

Btw, I completely agree with the points you make about math as a model, @lpw25, but the role of math as an examplar in my reasoning above is different. The argument isn’t “Math has this operator, so OCaml should.” It’s that the familiarity of the composition operator in math helps to make the concept familiar, so that it’s readily understandable and memorable, and might suggest useful applications in code. (My first comments about the significance of math didn’t make this clear, and I don’t think I’d made it completely clear for myself, either.)

All of my points do, indeed, depend on judgements about which disagreement can be reasonable. For my part, I still feel that the potential benefits of a composition operator to (some) users and to the language community in the long term outweigh the costs of adding it, which seem small to me.

(About precedence, for me personally that’s not the main issue. I wouldn’t mind e.g.

List.map ((add 1) % (mul 2)) mylist

as well as simpler cases in which the functions have only one argument. If a composition operator allowed removing those inner pairs of parentheses, that might be good for some users. I might just add the parentheses sometimes, anyway, so I don’t have to remember the precedence rule when the syntax doesn’t make it obvious. In this particular example it would be easy to guess what’s going on without parentheses, though.)