Why don't we have a composition operator in Pervasives?

And what is the higher-order composition operator for you ? That’s what it is intended for. :wink:

Don’t you see the pipeline from space (or type) A to space D ?

(* pipeline from left to right *)
let (>>) f g x = x |> f |> g

(* pipeline from right to left, the composition operator in mathematic *)
let (<<) f g x = x |> g |> f

let double x = 2 * x
let square x = x * x

(double >> succ >> square) 3;;
- : int = 49

(square << succ << double) 3;;
- : int = 49

List.map (double >> succ >> square) [1; 2; 3];;
- : int list = [9; 25; 49]

(* we can easily define our own pipelines *)
let my_pipeline = double >> succ >> square
let show_pipeline = string_of_int >> print_endline

(* and chain them *)
List.iter (my_pipeline >> show_pipeline) [1; 2; 3];;
9
25
49
- : unit = ()

@UnixJunkie: i do not understand what you say, |> is evaluated left to right like unix pipelines.

3 Likes

Absolutely! They serve much of the same purpose, which is why I don’t think we need both as infix operators.

y

If I understand correctly, your position is that since we can write my last examples this way:

let my_pipeline x = x |> double |> succ |> square
let show_pipeline x = x |> string_of_int |> print_endline

List.iter (fun x -> x |> my_pipeline |> show_pipeline) [1; 2; 3];;

we don’t need other infix operators. Am I right?

That’s mostly a matter of taste. Since I’m used to mathematical notations, I prefer infix operators that operate on functions. The interest I find in >> and << is that we “see” the pipeline’s flow.

I don’t understand why people who prefer this idiom should define their own operators and can not rely on the standard lib to have a uniform notation or, otherwise, use an idiom they dislike to stay standard.

1 Like

OCaml doesn’t have the function composition operator mainly because of the value restriction. Here is an example, that worthwhile hundreds of words:

# let (++) f g x = f (g x);;
val ( ++ ) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>
# ident ++ ident;;
- : '_a -> '_a = <fun>

Thus, no matter, how useful the composition operator looks to you, it just will not work in OCaml, because functions are not considered as first-class values by the type system. Of course, it still can be used for ground terms, and you can define this operator, as I did before and use it if you believe that it will make your code more readable. My personal opinion is that the application operator, as well as the reverse application (@@ and |>) are much more intuitive, at least for those of us, who prefer to read from left to right. But your mileage may vary.

And for the historical reference, this is a link to, probably, the first discussion of the composition operator in OCaml, that is available on the Internet.

12 Likes

The OCaml developers have given us a powerful rewriting tool for when we think they are a bit too conservative about a new feature:

# #require "ppx_compose";;
# let ident x = x;;
val ident : 'a -> 'a = <fun>
# ident % ident;;
- : 'a -> 'a = <fun>

At least that solves the issue with the value restriction. About consensus for the operator, the best I could think of was to reuse the operators from an existing library (Batteries Included).

5 Likes

Thanks for the reference. I’ve read it few years ago and I still disagree with Pierre Weis.

It’s frequent that we have to use composition in a monomorphic context, so the value restriction is rarely a problem. But eta expand the pipeline leads to a lose of efficiency:

let make_add () =
  let r = ref 1 in
  (fun () -> r := 1),
  (fun i -> 
    Printf.printf "Add has been evaluated %d times\n" !r;
    incr r; (+) i)

let reset, add = make_add()

map (fun x -> x |> add 1 |> add 2) (1 -- 5);;
Add has been evaluated 1 times
Add has been evaluated 2 times
Add has been evaluated 3 times
Add has been evaluated 4 times
Add has been evaluated 5 times
Add has been evaluated 6 times
Add has been evaluated 7 times
Add has been evaluated 8 times
Add has been evaluated 9 times
Add has been evaluated 10 times
- : int BatEnum.t = 4 5 6 7 8

reset (); map (add 1 %> add 2) (1 -- 5);;
Add has been evaluated 1 times
Add has been evaluated 2 times
- : int BatEnum.t = 4 5 6 7 8

With infix composition your pipeline is evaluated only once, but evaluated each time we apply it when it is eta expanded. That’s inefficient !!!

Concerning the notation, both batteries and containers use %> and % instead of >> and << which is better for precedence level with @@:

double %> succ @@ 3;;
- : int = 7

succ % double @@ 3;;
- : int = 7

Concerning the value restriction, Base.Fn has a const function which has the same problem and writing const 1 instead of fun x -> 1 is not really a gain.

3 Likes

One strategy is to make it evident using the type system when you have “costly” partial applications (i.e., partially-applied functions that are expensive to construct because they do something with arguments as they are passed one-by-one).

For example, Base does this with a module called Staged:

type +'a t
val stage : 'a -> 'a t
val unstage : 'a t -> 'a

So for example you might have a regexp-matching function that first compiles the regular expression:

(** [does_match pattern input] *)
val does_match : string -> string -> bool

You might be tempted to use it like so, which would be expensive because it would compile the regular expression over and over again:

let nums = List.filter (fun str -> does_match "\\d+" str) many_strings in ...

But if does_match is changed to have the following type:

val does_match : string -> (string -> bool) Staged.t

Then it becomes obvious that partially applying the function to only its first argument is “expensive”, so then it is clear that the preceding code should be written like so, where we save the work of compilation:

let is_number = unstage (does_match "\\d+") in
let nums = List.filter is_number many_strings in
...

The different type makes it much more obvious when the result of partial application should not be evaluated more than once for performance (or other) reasons.

Are you serious? Have you read the discussion?

I have, for example, two functions f : int -> int -> int and g : string -> int -> int, and I want to create a pipeline with them. @Yaron_Minsky told me that I just have to write:

let my_pipeline x = x |> f 1 |> g "I dislike composition"`

Then I answer this is less efficient than composition because the pipeline is evaluated each time I use it, which is not the case if I simply write:

let my_pipeline = f 1 %> g "I like composition"

And now, you tell me that I should use poor man staged programming with the identity monad! :roll_eyes:

1 Like

Right, and I was addressing the concern of performance by pointing out that in cases that you do have non-trivial partially-applied functions you can simply give a name to the partially-applied version. (The identity monad helps you mark where you have performance implications for partial application.) Most of the time, I wager, the function you are working with does not behave that way.

If you want to do the performance comparison for non-staged partial application, the result is contrary to what you are suggesting (|> is faster than %>):

open! Core
open Core_bench

let (%>) f g x = g (f x)

let f x y = x + String.length y
let g x y = x * y

let pipeline_composed = f 1 %> g 2

let pipeline_eta_expanded x = x |> f 1 |> g 2

let () =
  let composed     = Test.create ~name:"composed"     (fun () -> pipeline_composed "hello")     in
  let eta_expanded = Test.create ~name:"eta expanded" (fun () -> pipeline_eta_expanded "hello") in
  Bench.make_command [ composed; eta_expanded ]
  |> Command.run
;;

With the output:

Estimated testing time 20s (2 benchmarks x 10s). Change using -quota SECS.
                                        
  Name           Time/Run   Percentage  
 -------------- ---------- ------------ 
  composed         6.79ns      100.00%  
  eta expanded     2.55ns       37.55%  

(at least on my machine, a 1.4GHz Macbook Air running OCaml 4.05.0, no flambda)

I’m not going to hazard any speculation about why the performance of the two pipelines are so different in the first place, since I’m certainly not qualified to, but that’s the result I got, FWIW. YMMV.

Sorry my initial post wasn’t clear about what point I was addressing. I should have included an example or something.

3 Likes

Ok, I better understand what you want to do with the Staged module but I don’t find this very natural.

Maybe could I try an hypothesis: the pipe |> is a primitive, but the reverse composition %> and the composition % combinators are not. Here their definitions:

let (%>) f g x = g (f x)
let (%) f g x = f (g x)
external (@@) : ('a -> 'b) -> 'a -> 'b = "%apply"
external (|>) : 'a -> ('a -> 'b) -> 'b = "%revapply"

According to batteries source code, they were introduced into the language for version 4 in July 2012.

Now, consider this little benchmark (I also added Staged in the test):

open! Core
open Core_bench
open Base
open Re2

let (%>) f g x = g (f x)
let (%) f g x = f (g x)
(*let (|>) x f = f x*)

let f : int -> string -> int = fun x ->
  ignore (Regex.(matches (of_string "[0-9]+") "foo"));
  fun y -> x + String.length y

let h : int -> (string -> int) Staged.t = fun x ->
  ignore (Regex.(matches (of_string "[0-9]+") "foo"));
  Staged.stage (fun y -> x + String.length y)
  
let g : int -> int -> int = fun x y -> x * y

(* (fun f g -> fun x -> x |> g |> f) (g 2) (f 1) *)
let pipeline_composed : string -> int =
  g 2 % f 1
  
let pipeline_revcomposed : string -> int =
  f 1 %> g 2

let pipeline_with_pipe : string -> int =
  fun x -> x |> f 1 |> g 2

(* a quasi-inline version of pipeline_composed *)
let pipeline_with_pipe_bis : string -> int =
  let f_applied = f 1 in
  let g_applied = g 2 in
  fun x -> x |> f_applied |> g_applied

let pipeline_composed_unstage : string -> int =
  g 2 % Staged.unstage (h 1)
  
let pipeline_revcomposed_unstage : string -> int =
  Staged.unstage (h 1) %> g 2

let pipeline_with_pipe_unstage : string -> int = 
  let h_unstage = Staged.unstage (h 1) in
  (fun x -> x |> h_unstage |> g 2)

let () =
  let composed = Test.create
    ~name:"composed"
    (fun () -> pipeline_composed "foo")
  in
  let revcomposed = Test.create
    ~name:"revcomposed"
    (fun () -> pipeline_revcomposed "foo")
  in
  let with_pipe = Test.create
    ~name:"with pipe"
    (fun () -> pipeline_with_pipe "foo")
  in
  let with_pipe_bis = Test.create
    ~name:"with pipe bis"
    (fun () -> pipeline_with_pipe_bis"foo")
  in
  let composed_unstage = Test.create
    ~name:"composed unstage"
    (fun () -> pipeline_composed_unstage "foo")
  in
  let revcomposed_unstage = Test.create
    ~name:"revcomposed unstage"
    (fun () -> pipeline_revcomposed_unstage "foo")
  in
  let with_pipe_unstage = Test.create
    ~name:"with pipe unstage"
    (fun () -> pipeline_with_pipe_unstage "foo")
  in
  Bench.make_command [
    composed;
    revcomposed;
    with_pipe;
    with_pipe_bis;
    composed_unstage;
    revcomposed_unstage;
    with_pipe_unstage;
    ]
  |> Command.run

and its result:

Estimated testing time 1.75s (7 benchmarks x 250ms). Change using -quota SECS.
                                                            
  Name                     Time/Run   mWd/Run   Percentage  
 --------------------- ------------- --------- ------------ 
  composed                   8.44ns                  0.08%  
  revcomposed                8.44ns                  0.08%  
  with pipe             10_455.23ns     7.00w      100.00%  
  with pipe bis              4.75ns                  0.05%  
  composed unstage           8.44ns                  0.08%  
  revcomposed unstage        8.44ns                  0.08%  
  with pipe unstage          6.52ns                  0.06%

the naive use of the pipe is clearly inneficient, so I remove it from the bench.

Estimated testing time 1.5s (6 benchmarks x 250ms). Change using -quota SECS.
                                               
  Name                  Time/Run   Percentage  
 --------------------- ---------- ------------ 
  composed                7.91ns       99.27%  
  revcomposed             7.97ns      100.00%  
  with pipe bis           4.48ns       56.29%  
  composed unstage        7.93ns       99.59%  
  revcomposed unstage     7.91ns       99.25%  
  with pipe unstage       6.37ns       79.97%

Here the pipe is clearly more efficient, even with the staged version. But now, I redefine the pipe |> to not use the primitive version:

let (|>) x f = f x

and here the result:

Estimated testing time 1.5s (6 benchmarks x 250ms). Change using -quota SECS.
                                                         
  Name                  Time/Run   mWd/Run   Percentage  
 --------------------- ---------- --------- ------------ 
  composed                8.43ns                 83.26%  
  revcomposed             8.43ns                 83.23%  
  with pipe bis           8.45ns                 83.43%  
  composed unstage        9.03ns                 89.15%  
  revcomposed unstage     8.96ns                 88.45%  
  with pipe unstage      10.13ns     5.00w      100.00%  

the gain form the use of pipe |> disappear. :wink:

Maybe if the composition combinators were primitives they would be as performant as the pipe.

5 Likes

@kantian speaks rightly. In code that exploits higher order functions, composition is indispensable, which is why it is so prevalent in Haskell code. 'duh :smile:

1 Like

Interesting results! I think defining a primitive version of the composition operator would still have to allocate a closure (although maybe flambda takes care of that—dunno).

In any case, I agree there should still be some “blessed” conventional name for the composition operator if only for the purpose of uniformity and predictability, whether it’s more or less efficient.

1 Like

My experience speaks differently. We have tons of higher order code, (including lots of applicatives and monads), and composition comes up only rarely in our codebase.

It’s worth remembering that there are different styles available, even within the same language. It’s easy to imagine that the way you’re used to doing things is the only good way, but it’s rarely true.

For what it’s worth, I don’t think it’s crazy to have a composition operator. I just tend to think it isn’t worth the cost.

Anyway, maybe we should move on from the topic, since the issue has been settled recently at least as far as the standard library is concerned, and Core and Base are unlikely to move on this in the short term.

y

5 Likes

Be careful that you are comparing apples and oranges here. When you define let pipeline_composed = g 2 % f 1, g 1 and f 1 are arguments to the function % and so are evaluated before performing % — thus g 1 and f 1 are evaluated only once, when you define pipeline_composed. On the other hand, let pipeline_with_pipe = fun x -> x |> f 1 |> g 2 runs f 1 and g 2 for each value of x provided. The same “inefficiency” would happen if you declared let default = fun x -> g 2 (f 1 x) — would you then conclude that the naive use of function evaluation is inefficient? :wink: The function pipeline_with_pipe_bis is comparable to pipeline_composed and you clearly see that it is about twice as efficient (basically as efficient as the usual evaluation g_applied(f_applied x)).

1 Like

I know, that’s exactly what I already explain in this comment and that’s what I illustrate more precisely with the benchmark, and that is also why I “inline” pipeline_composed in the function pipeline_with_pipe_bis (if you read the previous exchanges in the discussion, you’ll understand why I wrote this function, this is the same @bcc32 wrote in his benchmark but I change the function f).

My point is mostly to say that not having an infix version of the composition combinators (both forward and backward) is not a good solution because replacing f x % g y by fun x -> x |> g y |> f x (which is what @Yaron_Minsky suggest) can lead to a great lose of efficiency. And if you don’t want to lose efficiency you have to evaluate f x and g y then create the pipeline. I don’t find this very natural. :wink:

It seems the only reason (but I’m not sure and don’t know if that will solve the problem of efficiency) that this solution is more efficient is that the composition combinators are not defined like this:

external (%) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = "%compose"
external (%>) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c = "%revcompose"

Moreover the type of (%>) is the double modus ponens in its direct form, which is beautiful! :slight_smile:

But it may also be the other way around because creating too many closures (performing no intermediate work) will slow down your program. Having a composition operator will not be a magic bullet for this: if it evaluates its arguments, it may save some work (in rare cases I’m afraid) but then it will also create closures; if it does not, then you will gain nothing from your example. In the end, you need to know the performance of your functions and combine them accordingly.

I think you miss my point. I rewrite @bcc32 benchmark:

open! Core
open Core_bench

let (%>) f g x = g (f x)
let (|>) x f = f x

let f x y = x + String.length y
let g x y = x * y

let pipeline_composed = f 1 %> g 2

let pipeline_eta_expanded x = x |> f 1 |> g 2

let () =
  let composed     = Test.create ~name:"composed"     (fun () -> pipeline_composed "hello")     in
  let eta_expanded = Test.create ~name:"eta expanded" (fun () -> pipeline_eta_expanded "hello") in
  Bench.make_command [ composed; eta_expanded ]
  |> Command.run

but this time, as you can notice, the pipe doesn’t benefit from the compiler optimization (it is not a primitive), and here the result:

Estimated testing time 1s (2 benchmarks x 500ms). Change using -quota SECS.
                                                  
  Name           Time/Run   mWd/Run   Percentage  
 -------------- ---------- --------- ------------ 
  composed         8.96ns                 75.36%  
  eta expanded    11.89ns    10.00w      100.00%

So prior to version 4.0, the mathematician point of view was equaly efficient, but now the mathematicians are second class citizen of the language. :confused:

I know, the pipe and the composition point of view are complementary, not exclusive. But I’m a bit confuse that we don’t have as efficient infix composition as pipe infix combinator.

And to illustrate differently my thought, my HIFI installation is reader %> preamp %> amp %> speaker. But when someone tell me to use the pipe |> it’s like he’s telling me: “Hey, you forgot to mention your DVD”, you should say fun dvd -> dvd |> reader |> preamp |> amp |> speaker. :roll_eyes:

1 Like

I’m going to sum up a few of the remarks above with a little bit of extra commentary.

There’s a tradition in OCaml to avoid composition operators. Not everyone follows this tradition, I gather, but it seems to be the traditional mainstream view. I understand that.

However, a composition operator is normal, common, and natural in functional programming. Haskell has one, SML has one, Clojure has one (prefix, of course, and with a four-character name, but still)–and last, but not least, mathematics has one. (As others pointed out OCaml has too many; it doesn’t have a single standard operator that everyone recognizes.)

OCaml is not only for traditional OCaml programmers! It can and should have a much larger community, and that can be facilitated by being slightly less idiosyncratic. Of course any sensible newbie should want to learn idiomatic OCaml, and experienced programmers will want to help them learn idiomatic OCaml, but adding a composition operator is a minor change. Why keep out a trivial syntactic enhancement that’s natural in functional programming in general, and that’s relatively easy to implement and understand? It harms no one. No one who dislikes it has to use it.

If a composition operator is used in ways that make code unreadable, that’s bad, but there are lots of ways to write unreadable code, and they all should all be avoided, other things being equal. There are clearly many uses of a composition operator that would be quite intelligible as long as it’s familiar. And I know I’m not alone in thinking that some uses of a composition operator are more readable than alternatives.

7 Likes

What you’ve added here is an appeal to popularity: OCaml should gain users by growing in the direction of functional programmers. And it can do that with a change that “harms no one”. I’d just like to add

  1. appeals like this are invariably taken as offensive to established communities. It’s like joking about a stranger’s name, right after meeting him. Isn’t it obvious that hundreds to thousands of people have come before you and had the same impulse? “You should do as I personally desire because it will also suddenly make you popular and cool.” just comes across as a cynical attempt at manipulation - to someone who’s seen it hundreds of times already, attached to all kinds of random proposals.

  2. this particular appeal is especially hard to see under any other light, because it’s the tiny group of FP people who like pointfree code that the move is supposed to appeal to. OCaml should be “less idiosyncratic” by adding a feature that appeals to an extremely small and unusual group of programmers? Also, the argument that adding the feature will not burden other programmers relies on those programmers never having to see code that uses it, which is at least an odd argument to make about a popularity-increasing feature. Shouldn’t the massive influx of pointfree programmers result in massive amounts of new OCaml code that’s written in that style?

I really like readability arguments that make use of code examples, though. What uses of the composition operator do you have in mind that are more readable than alternatives? Do you have some real code that you’ve taken and rewritten in another way, that’s markedly more or less readable as a result of the rewrite? It’d be nice if threads like this would tend towards looking like a series of code reviews. Somehow that never happens.

8 Likes

I really did not mean to offend. I didn’t see what I wrote that way, and I gather that it at least might have offended you, @techate. So I apologize for offending you or anyone else.

I love OCaml and I want to see it grow. (I will admit that there is a selfish element to that, because for me, as a programmer, a bigger community means more resources. However, I assume that others have similar desires.)

Yes, I would prefer that there would be a standard composition operator, and I personally find its absence surprising.

However, I absolutely do not think that OCaml, or any other well-designed language should just add whatever is popular. Uck. That leads to everything-including-the-kitchen-sink monstrosities like, … well I won’t get started seeming to insult another language now, but I have a couple in mind. fwiw there have been repeated, well-known, dust-ups in the Clojure community in which someone argued loudly that such and such feature was lacking, and I respect and am grateful to Rich Hickey (Clojure BDFL) for resisting these calls and maintaining Clojure’s (imperfect, but still) elegant utility of its own kind. One of the things that I like about OCaml is that it also keeps things simple and elegant, but like Clojure (though in a different way), with pragmatic goals in mind.

Manipulation? Well, OK, you can call it that. I just call it argument. I assume that there can be a value in discussing change, and trying to give reasons for them. There shouldn’t be anything intrinsically offensive about that. So I’m a fan of OCaml, and yeah, I am giving an argument. Yes, I think that more popularity is better, other things being equal, and yes, I did feel that in this case, things are equal (you seem to disagree, OK) and yes, I was raising the possibility a composition operator could be one of a number of factors that just rubs potential users the wrong way. It is a small thing, but small things can add up. (I love programming in OCaml, but I am an outsider in that I only started using it seriously over the last year or so. It’s possible that that means that I notice things about how something would strike a new user more clearly than more experienced OCaml programmers, but that’s speculation. I’m an unusual user in many respects, too, so that could distort my intuitions as well.)

If it’s true there is only a tiny group of functional programmers who like to use a composition operator sometimes, then that would undermine a premise of my argument, and I would accept that. I’m not sure if that’s exactly what you meant, though. I’m not talking about point-free code as a general practice, just simple examples like mapping the composition of two functions over a list without adding a new definition or wrapping it in a (fun ... -> ...). This kind of code has already been illustrated and discussed in this thread, so there’s no point in me reproducing it. I don’t think anyone would argue that such uses are particularly difficult to read.

Again, if the tone of my post was offensive, I apologize for that. I fear that I was misunderstood, but I take responsibility for it. Sorry about that.

1 Like