OCaml disadvantages


#1

Hello! In this post I want to ask you to share your thoughts about OCaml’s disadvantages (or anything you consider to be a disadvantage). Why? Because it would be useful for the whole community, and especially for the OCaml core team, as feedback on the awesome work they’ve done. It would also be interesting for authors of other programming languages and for those interested in the theory behind the categorical abstract machine, OCaml, and language design in general.

Also it’s possible to organize some working groups that could work on language or standard library improvements. For example, if I don’t like “A” and somebody else doesn’t like “A” then we can together fix the “A”.


#2

My view may not really be an “expert” one, as I’ve only been writing OCaml extensively for a month or so. But IMO the biggest disadvantage is that OCaml is really “advanced”, which limits the availability of developers quite a lot. Most people (even really smart ones) won’t go to the trouble of learning it, because they can do what they need (and also put food on the table) with much less trouble.

I am trying to start a business right now and I am betting on OCaml/Bucklescript, but I think this might be a serious challenge (finding employees) in the future.

Of course, once you know it, it feels really powerful and saves you lots of trouble, but it’s a really hard sell. My plan for offsetting that problem is not to hire coders until I can afford to pay really well.

On the technical side I can find many disadvantages if I compare it to something like Erlang (I come from that background), but it also excels in the parts where Erlang falls short, so the two are a very good combination. E.g., if you want to use websockets extensively in your web service, I’d choose Erlang any day. In general, for anything that has lots of parallel IO, better go with Erlang: it’s the best platform for that kind of thing.

Indeed, for my project I plan on using Erlang to orchestrate and route communication between OCaml executables, so it will be something like Bucklescript <--> Internet <--> Erlang <--> OCaml.
Of course, one could probably go without the Erlang part and use some OCaml web framework, but in my case it’s just easier and faster (to build) this way.


#3

I think this can work out in some beneficial ways:

  1. While it’s true it’ll be harder to find workers, the ones you do find are likely to be on average ‘higher quality’ than your generic workers.
  2. It seems to me that the supply of workers interested in functional programming jobs may still be higher than the demand, which will work in your favor. I could be wrong about this, but that’s the impression I’m getting.
  3. Once you learn a functional programming language, others are fairly easy to pick up. OCaml is easier to learn than Haskell, so a Haskell programmer can write OCaml code. Scala programmers are also fairly compatible, as are Erlangers. And of course F# is essentially a clone of OCaml with interfaces to the CLR.

#4

Performance and job market status are not language properties, but thank you for your response anyway. Could you please give more detail on how Erlang outperforms OCaml when it comes to websockets? As far as I can tell, OCaml has no serious performance issues, so the equivalent code in Erlang should not be a lot faster (that is only my guess, since I have no serious experience with either OCaml or Erlang).


#5

I’ll list what I see as a more interesting disadvantage: as with all languages that use garbage collection, very very hard real time or tight memory environments are unlikely to be good targets for the code. Note that both of these are a very small subset of the world’s applications, but it’s reasonable to know that. This disadvantage is also present in a huge number of other languages of course. It is also the case that languages without garbage collection should be avoided unless absolutely necessary because of their own disadvantages (though Rust seems to give one a lot of the best of both worlds).


#6

It’s worth noting that OCaml is possibly the best of the GC languages in this regard. The GC is very tight and is regarded as just about the fastest out there (the GC, not the language).
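If you want to check this on your own workload rather than take it on faith, the standard library’s Gc module exposes the collector’s counters. A minimal sketch, assuming a toy allocation loop (the function name minor_collections_during and the loop are made up for illustration, and the numbers will differ by machine and compiler version):

```ocaml
(* Hypothetical micro-experiment; only the Gc and Sys calls are real stdlib API. *)

(* Allocate [n] short-lived lists and return how many minor collections
   the runtime performed while doing so. *)
let minor_collections_during (n : int) : int =
  let before = (Gc.quick_stat ()).Gc.minor_collections in
  for i = 1 to n do
    (* Sys.opaque_identity stops the compiler from discarding the allocation. *)
    ignore (Sys.opaque_identity (List.init 100 (fun j -> i + j)))
  done;
  let after = (Gc.quick_stat ()).Gc.minor_collections in
  after - before

let () =
  let t0 = Sys.time () in
  let collections = minor_collections_during 100_000 in
  let elapsed = Sys.time () -. t0 in
  Printf.printf "minor collections: %d, CPU time: %.3fs\n" collections elapsed
```

This only counts collections and CPU time for one artificial pattern (lots of short-lived data), so treat it as a starting point for measuring, not as a benchmark.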


#7

It’s not necessarily “faster”, but it’s certainly “leaner” and “easier” to do this kind of thing with Erlang. Erlang has a “one-of-a-kind” preemptive scheduler in its VM, which allows you to have thousands or even millions of actors running simultaneously in the same VM while most of them get “reasonable” latency. WhatsApp is quite famous for having achieved this, and I have personally (although not in the range of millions) written performant websocket servers that can effortlessly serve many thousands of simultaneous users.
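To make the scheduling idea concrete, here is a toy simulation, written in OCaml rather than Erlang. It is not how the Erlang VM is implemented; it only mimics the idea of giving each process a fixed budget of work (Erlang calls the units “reductions”) before moving it to the back of the run queue. The type proc, the function run, and the budget value are my own assumptions for illustration:

```ocaml
(* A toy "process": a name plus the amount of work it still has to do.
   All names here are hypothetical. *)
type proc = {
  name : string;
  mutable remaining : int;  (* units of work left *)
}

(* Run every process for at most [budget] units, then put it at the back
   of the run queue. Returns process names in the order they finished. *)
let run ~budget (procs : proc list) : string list =
  let q = Queue.create () in
  List.iter (fun p -> Queue.add p q) procs;
  let finished = ref [] in
  while not (Queue.is_empty q) do
    let p = Queue.pop q in
    let steps = min budget p.remaining in
    p.remaining <- p.remaining - steps;
    if p.remaining = 0 then finished := p.name :: !finished
    else Queue.add p q  (* budget used up: preempt and requeue *)
  done;
  List.rev !finished

let () =
  let order =
    run ~budget:10
      [ { name = "long"; remaining = 1000 }; { name = "short"; remaining = 5 } ]
  in
  print_endline (String.concat ", " order)
```

Even though “long” is queued first, “short” finishes first, because no process may hold the scheduler for more than its budget. That is the property that keeps latency reasonable when one actor is busy.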

WhatsApp: http://highscalability.com/blog/2014/2/26/the-whatsapp-architecture-facebook-bought-for-19-billion.html

A nice description of the Erlang VM scheduler: http://jlouisramblings.blogspot.bg/2013/01/how-erlang-does-scheduling.html


#8

Depending on the definition of “best”, Erlang is probably (much) better in this regard, as it only runs the GC in the scope of each individual actor, thus achieving much better “soft-realtime” guarantees for timers and messages.


#9

Fascinating. Sounds like Erlang sacrifices some performance (by interpreting) for the sake of decreasing latency (ability to preemptively context switch at the user level). They also give up on the notion of shared memory to simplify parallelism, which is a trade-off Go also makes.


#10

I think that’s a very good tradeoff. It works very well in the systems that have tried it, and there’s a big win in that if you share nothing you have no memory access hazards to deal with. Furthermore, in the long run, we may end up with systems with very large numbers of processors but relatively low performance for shared RAM.
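A minimal sketch of what “share nothing” looks like in code, assuming a single-threaded toy model in OCaml (the types msg and actor and the functions make, send, and drain are hypothetical names, not any library’s API). Each actor owns its state, and the only way to influence it is to put an immutable message in its mailbox:

```ocaml
(* Messages are immutable values, so sender and receiver never alias
   mutable data. *)
type msg = Add of int | Stop

type actor = {
  mailbox : msg Queue.t;
  mutable total : int;  (* private state: only [drain] modifies it *)
}

let make () : actor = { mailbox = Queue.create (); total = 0 }

(* The outside world can only enqueue messages. *)
let send (a : actor) (m : msg) : unit = Queue.add m a.mailbox

(* Process messages in arrival order until the mailbox is empty or a
   Stop message is seen, then return the actor's total. *)
let rec drain (a : actor) : int =
  match Queue.take_opt a.mailbox with
  | None | Some Stop -> a.total
  | Some (Add n) ->
      a.total <- a.total + n;
      drain a

let () =
  let a = make () in
  List.iter (send a) [ Add 2; Add 3; Stop; Add 100 ];
  Printf.printf "total = %d\n" (drain a)
```

In a real system the actors would run concurrently, but the hazard-free property comes from the same discipline shown here: all updates to total go through one mailbox, in order, so there is nothing to lock.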


#11

Probably depends on the application. I agree it’s a unique approach to cater to a particular niche. As an aside, I’m theorizing here, but it may be possible to do something similar with the OCaml multicore design. The key point being that the scheduler in the multicore branch is written in OCaml and can thus be swapped out at will, or modified to your heart’s content. If you wanted to run a low-latency application, you could potentially specialize a scheduler for this task.

As for ending up with very large numbers of processors but relatively slow shared RAM: this is an ongoing debate, but my opinion is that it’s unlikely to happen. OCaml has paid a heavy price for betting on this being true, and so far it has not been borne out. The shared memory bus is significantly faster than the cost of sending messages, and languages that can handle shared memory perform better in general. Since Moore’s law is effectively dead with respect to performance, and the only way we proceed from here on out is to add cores, it’s in the interest of CPU companies to make shared memory as fast as it can be, and that’s precisely what they’ve been doing. The catch is that for many applications nowadays, aside from very intensive applications such as AAA games, video players, virtual reality, browsers, operating systems and such, shared memory is irrelevant.


#12

So, just a couple of quick notes:

(1) technically, Moore’s law is only about the number of transistors per device, and not about clock rates and such. I mention it only because it’s a pet peeve.

(2) as the number of cores scales, the complexity of managing cache coherency hardware goes up markedly. I don’t think we’ll be seeing systems with thousands of cores and high speed shared memory, but we already have systems with thousands of cores (like GPUs) that don’t conform to the CPU memory model.

All that said, you’re clearly correct that betting on a particular hardware future that might not come to pass is a mistake. Being able to operate in either model may be a win.


#13

I literally had that in parentheses in my post before I decided to delete it. :smile: That’s why I said ‘as it relates to performance’.


#14

In Erlang we have very good (and performant) ways of doing shared memory tasks; it would be impossible to write real software otherwise. Of course it’s nothing close to the performance of OCaml. You can think of it as a “native database”, still much faster than what most languages have or what they can achieve by using redis/memcached:

ETS - http://erlang.org/doc/man/ets.html (seems to be down at the moment)

And we also have Mnesia built on top of ETS, which provides distributed (between machines) transactions and optional disk storage:


http://erlang.org/doc/man/mnesia.html


#15

Erlangers who like OCaml (or vice-versa) could keep an eye on Alpaca.


#16

I’ve found that I am using Irmin a lot like I would Mnesia.