Interesting OCaml Articles

The cross-platform story for F# should get better now that .NET Core is the primary .NET version that’s used everywhere. After reading this post I tried setting up F# on a Mac and it was quick and easy to get up and running (with decent editor setup via LSP + neovim).

1 Like

butt up against the OO system

This exactly… The least bad option for me for using other ecosystems’ backend libraries is probably OCaml talking to Golang libraries. I haven’t looked up whether this is actually possible currently.

I can’t compare OCaml to F# here–I’ve never used F#–but when I was learning OCaml, I read through the description of the Jane Street maps/hashes etc. and kind of ran screaming. The online docs for the JS maps were even scarier. Yes, partly it’s that I was coming from Clojure, where you can just throw anything into a map, because it’s dynamically typed and maps are designed that way. But still … I thought, I know it’s harder with OCaml because it needs to be type safe, but still there has to be a better way. So it wouldn’t surprise me if F# maps were easier to understand.

Biggar’s description of the F# ecosystem reminds me of Clojure. Maybe in some nearby possible world OCaml-Java could have played a similar role. Or maybe, per @mbacarella:

For me, the article highlights the need for an informal way for people to signal that they have private source code that they will share with others directly, upon request. I am sure someone has Google Cloud bindings; it’s just that no one wants to put such a library on GitHub, because they don’t have the time to answer issues or support the library. Generally, if people put more of their libraries on GitHub (or a more open system like Codeberg), but marked their repo as “archived”, that would be a good way to signal that other people can copy it, while telling others not to bother contacting them for support.

2 Likes

GitHub allows switching off the wiki, issues, and PRs per repo.

3 Likes

[full disclosure: I lived thru the first ten years of Java at IBM, and watched as the “JDK” grew to engulf library-after-library. I will always prefer the OCaml or Perl model of user-contributed libraries, b/c monoculture led to bugs and lack of flexibility. And I can always write my own version of whatever missing package. I mean, it’s just not that damn hard.]

I wonder if what he’s really complaining about is lack of a large commercial backer (Java has many, F# has MSFT, Rust has … ? (maybe none, which would disprove this thesis)).

Repeatedly in both the linked articles, as well as his comparison of OCaml to Rust, it seems like he’s complaining that there aren’t enough “libraries that he needs” available. And to be sure, for a commercial organization, this could be a dealbreaker. [A quick Google search shows what appear to be Perl APIs for Spanner, so there’s at least an existence proof for a community-backed language having such support.]

I do wonder how Rust does it. Are there more commercial users? Are there big commercial backers?

1 Like

I don’t think Rust is much closer to OCaml when it comes to ecosystem than Java, F# or Go for that matter. And actually the commercial backer in Go’s case seems to prove your point:

This is a chart showing the number of pull requests to repos in a given language on GitHub. It’s not a good measurement of popularity but maybe it is a good measurement of the ecosystem health.

1 Like

Well he also complains about the syntax and that “I saw a team struggle with OCaml, and for good reason. Language tutorials are extremely poor in OCaml compared to other languages; they’re mostly lecture notes from academic courses”. I have some sympathy: for example, writing code to an interface is a fairly basic starter for a programmer but not intuitive in ocaml: the most direct equivalent requires module signatures with abstract types and sharing constraints and possibly functors (the last three of which are unavailable in F#), and there is nothing in the ocaml website’s “Learn” section explaining how to achieve this kind of equivalence.
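For readers who have not seen the pattern described above, here is a minimal sketch of "coding to an interface" with a module signature, an abstract type, a sharing constraint and a functor. The names `STORE`, `IntStore` and `Cache` are hypothetical and invented for this illustration; they do not come from the article or from any library.

```ocaml
(* Hypothetical interface: a key/value store whose representation is hidden. *)
module type STORE = sig
  type key
  type t                                  (* abstract: callers cannot inspect it *)
  val empty : t
  val put : key -> string -> t -> t
  val get : key -> t -> string option
end

(* "with type key = int" is the sharing constraint: it reveals that keys are
   ints while the store type t stays abstract. *)
module IntStore : STORE with type key = int = struct
  type key = int
  type t = (int * string) list
  let empty = []
  let put k v s = (k, v) :: List.remove_assoc k s
  let get k s = List.assoc_opt k s
end

(* A functor: code written only against the STORE interface. *)
module Cache (S : STORE) = struct
  let put_all bindings store =
    List.fold_left (fun acc (k, v) -> S.put k v acc) store bindings
end

module IntCache = Cache (IntStore)

let store = IntCache.put_all [ (1, "one"); (2, "two") ] IntStore.empty

let () =
  match IntStore.get 2 store with
  | Some v -> print_endline v
  | None -> print_endline "missing"
```

Without the sharing constraint, `IntStore.key` would be abstract too, and the integer literals passed to `put_all` would not type-check.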

That is probably for good reason. In ocaml it is generally more idiomatic to think of such things in terms of higher-order functions and combinators, or pattern matching on variants. There is a cognitive jump involved.

Heh. I remember when I started learning Golang (at Google). I was shocked and appalled at how terrible the language was. When I learned that an interface with N implementing concrete types, had N+1 distinct nil values, I (figuratively) puked. I wasn’t very impressed by the documentation and tutorials about Golang, either: they’re great for gently ascending the learning curve for complete newbies, but terrible for actually learning Golang to the point where you can actually, actually, ACTUALLY know what a piece of code does by reading it.

Maybe that’s where I’m coming down: I’m sure he’s a smart guy, but he’s channelling the experiences of his team of newbies. Which, sure, that’s what the commercial world is like. And for sure, for OCaml to be successful, which we should all wish for, that world has to be addressed, and pandered-to. It’s reality, as I learned (again, at the tip of the spear, as it cut into my innards) working with Java, the JVM, and large Java-based products, as it was initially commercialized.

[I remember the day that it became clear that all strings and character-oriented I/O would be via immutable Strings, and not using some UTF8-encoded collection of types. Hey, gotta pander to the idiots, it’s how we get paid. I’m cool with it.]

I’m just thankful that OCaml’s core doesn’t mandate all of that rubbish, and I can build things properly.

3 Likes

I recently bumped into a paper called The Case for Writing Network Drivers in High-Level Programming Languages. The authors concluded that the OCaml implementation is not very good in terms of latency (Figure 3a). Did anybody look into their implementation?

That case was pretty conclusively proven in 1997 by Mark Hayden’s Ensemble (PhD thesis, Cornell). He achieved 10us overhead on 75us round-trip Myrinet running full virtual synchrony (which is a lot more than TCP).

If you’re interested in learning more, this paper was also discussed in this thread, where @Reperator, one of the authors, provided some additional context.

1 Like

Not specifically an OCaml article, but this paper Quantifying the Performance of Garbage Collection vs. Explicit Memory Management has people talking about Why Automatic Garbage Collection Is Expensive again.

It sparked an interesting Twitter exchange between Emery Berger, Chris Lattner and our own @Yaron_Minsky.

1 Like

Always surprising what gets published, I guess. Yaron is being generous and kind. It would seem, uh, blisteringly obvious that the reason I use a GCed language is that I’m willing to trade memory for programmer-hours. And this is a real cost: those programmer-hours can be used in other optimization work that can yield great rewards.

There was a (at the time) famous example of a Taligent application that drew four rectangles, and a trace revealed that (inexplicably) it did some massive number of printf-operations. These guys were funded as a startup to redo what NeXT did with Objective C, only with better resource utilization, b/c Objective C was, y’know refcounted, and we can do better with explicit memory-management.

The entire resurgence of O-O business application servers was made possible by the rise of GCed language runtimes: the Object Management Group and armies of programmers at big enterprise software companies invested billions of dollars in trying to build what became J2EE, doing it in C/C++, and failing miserably. I. Mean. Miserably. Bloated systems that were unstable as heck, with programmer-induced memory-leaks, memory-errors, and all manner of failure. When Java came out, programmers RAN from C++ to Java, because it was so difficult to write decent business-logic server systems in C++. That is to say, sure, with sufficiently skilled programmers, and a sufficiently limited and unchanging requirement-set, you can do it. But if you’re actually trying to run a business, you can’t afford that quality of programmer (even GMail is famously written in Java, not C++), can’t keep your requirements simple, and can’t keep them from changing.

4 Likes

Always surprising what gets published, I guess.

I disagree and find it unfair. What you describe is of course right, but at the time the article was published, explicit memory management that was safe and automatic had not emerged yet but was an active area of research. The debate on the thread was about tracing GC vs. prompt deallocation, not necessarily manual. I find the paper interesting, like Yaron’s/Stephen Dolan’s fine point, which is important to keep in mind when seeing the paper being quoted as “GC requires 4x as much memory”.

2 Likes

Looking even closer, I see that this is done on JikesRVM. That’s a research toy, and nobody puts their best GC algorithms into research toys (I was at IBM Research, worked for the same folks). Furthermore, they didn’t compare their work against Bacon et al’s RC-GC (implemented in JikesRVM), AFAICT. I mean, if you’re going to put in explicit memory-management, I’d think you’d want to compare against RC-GC also, no? [Looks like they’re comparing against an Appel generational stop-and-copy collector]

Look: I get that researchers have to do “research”. But I know from experience that the amount of improvement that can be eked-out of even really awful starting-point GCs is quite substantial: I watched as guys improved the IBM [ETA: edited] product JVM’s GC considerably, using all sorts of low-level tricks.

ETA: I mean, what did we learn that we didn’t know already from work going back three decades? That explicit memory-management can use less memory than generational stop-and-copy GC? Check. That copying GC can be faster than explicit memory management? Check. That there’s a space-time tradeoff there? Check.

2 Likes

Let me share my new blog post on understanding format6 with examples.
https://blog.tail.moe/2021/01/13/format6.html

It’s mostly my reading notes on the paper Format Unraveled (on module Format) plus experiments in utop. I tried not to be too verbose though.

5 Likes

Well, I made a sequel to the format6 post:
Understanding format6 in OCaml by diagrams
https://blog.tail.moe/2021/01/15/format6-diagram.html

This time I use just four examples with four diagrams, e.g. one for Scanf.sscanf.

p.s. It’s a pity that I missed Gabriel’s post The 6 parameters of (’a, ’b, ’c, ’d, ’e, ’f) format6 after writing that one.
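For anyone who wants to try this in utop before reading the posts, here is a small sketch that is not taken from either blog post. It spells out the six parameters for a format used with `Printf.sprintf`, then uses the same kind of literal with `Scanf.sscanf`. The names `fmt`, `message` and `parsed` are just illustrative.

```ocaml
(* With Printf.sprintf the six parameters are:
   'a = int -> string -> string   (the arguments, then the final result)
   'b = unit, and 'c, 'd, 'e, 'f = string.
   This is the expansion of (int -> string -> string, unit, string) format. *)
let fmt : (int -> string -> string, unit, string, string, string, string) format6 =
  "%d items in %s"

let message = Printf.sprintf fmt 3 "cart"
(* message = "3 items in cart" *)

(* With Scanf.sscanf the literal is typed differently: the values that were
   read are passed to the receiver function given as the last argument. *)
let parsed = Scanf.sscanf "3 cart" "%d %s" (fun n s -> (n, s))
(* parsed = (3, "cart") *)

let () = print_endline message
```

The annotation on `fmt` is only there to make the parameters visible; in everyday code the compiler infers them from the literal and the function it is passed to.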

7 Likes

Not primarily a programming article, but I thought this is an interesting exception because it may be the first time OCaml has been mentioned in the Financial Times: Jane Street: the top Wall Street firm ‘no one’s heard of’ | Financial Times

6 Likes

However, OCaml is not given a very honorific qualification: a recondite programming language!

1 Like