Major OCaml pain points

A typical use of shadowing is

let (* maybe rec *) f x = ...

let (* no rec *) f x = assert stuff; f x

or with values

let x = bla in 
let x = Some x in (* with -rectypes using rec would typecheck *)
...
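
A minimal, runnable sketch of the first pattern above. The function name fact and the particular assertion are illustrative assumptions, not taken from the original post:

```ocaml
(* Recursive definition: the [fact] on the right refers to itself. *)
let rec fact n = if n <= 1 then 1 else n * fact (n - 1)

(* No [rec]: the [fact] on the right is the previous definition,
   so this wraps it with a precondition check instead of looping. *)
let fact n =
  assert (n >= 0);
  fact n

let () = Printf.printf "%d\n" (fact 5)
```

If the second definition were recursive, the call on its right-hand side would refer to the wrapper itself and never terminate.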

Thanks for your reply. I’m curious: is there also a GitHub request for “more subtyping and polymorphism”? As an OCaml user, I feel the need for more subtyping and polymorphism now and then, but at the same time I find myself unable to specify precisely the feature I want without breaking all the nice type inference and structure. It is not obvious how to arrange all the new polymorphic types we would like into one coherent whole so that every value still has a unique “principal” type.

Shadowing is ambiguous in general. For example, consider these two perfectly valid definitions:

let just_one =
  let x = [] in
  let x = 1 :: x in
  x

and

let many_one =
  let rec x = [] in
  let rec x = 1 :: x in
  x
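
To make the difference concrete, here is a small sketch of the same contrast. The names finite and cyclic are illustrative, and the checks avoid traversing the cyclic list, which would not terminate:

```ocaml
(* Without [rec], the inner [x] refers to the earlier binding: a one-element list. *)
let finite =
  let x = [] in
  let x = 1 :: x in
  x

(* With [rec], the [x] on the right is the list being defined: a cyclic list. *)
let cyclic =
  let rec x = 1 :: x in
  x

let () =
  assert (List.length finite = 1);
  (* The tail of the cyclic list is physically the list itself. *)
  assert (List.tl cyclic == cyclic);
  assert (List.hd (List.tl cyclic) = 1)
```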

Personally, I am not fond of the idea of pervasive implicit recursion everywhere, because I like code to have beginnings and endings when I read it. It might make some code more awkward to write, but the readability gain seems worth it.

People are talking about shadowing here; note that shadowing would still be possible if the default for function recursion were switched. You would just have to create a binding with let nonrec ... instead of let .... [EDIT: I mean, presumably let nonrec ... would also be added as valid syntax at that point :-)]
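
For comparison, type definitions in OCaml already behave this way: they are recursive by default, and nonrec opts out. A small sketch, with illustrative module and type names:

```ocaml
type t = int

module Wrapped = struct
  (* Without [nonrec], [type t = t list] would be rejected as a cyclic
     abbreviation. With [nonrec], the [t] on the right is the outer [t] (int). *)
  type nonrec t = t list
end

let xs : Wrapped.t = [ 1; 2; 3 ]

let () = assert (List.fold_left ( + ) 0 xs = 6)
```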

I doubt this would work well in OCaml. It doesn’t in Haskell. In SML this works great since records are structurally typed, so #field is roughly equivalent to OCaml’s field : < field : 'a; .. > -> 'a
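
The OCaml type mentioned above is what the compiler infers for a method call on an open object type. A minimal sketch, in which the objects a and b are made up for illustration:

```ocaml
(* Inferred type: < field : 'a; .. > -> 'a *)
let field o = o#field

(* Two structurally different objects that both provide [field]. *)
let a = object
  method field = 1
  method other = "unused"
end

let b = object
  method field = "hello"
end

let () =
  assert (field a = 1);
  assert (field b = "hello")
```

This works for objects because they are structurally typed; OCaml records are nominal, so the same trick does not apply to them directly.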

Namespaces. We have a flat namespace for compilation units, which requires us to prefix each module with a unique name. Common Lisp style namespaces (aka packages) are what we need in modern Reason/OCaml, which is used with package managers such as opam or esy.

  • let nonrec ... would be markedly more cumbersome than the current let rec ....: there are far fewer recursive functions in an Ocaml corpus than there are nonrecursive ones

  • making let rec the default would mean that either (a) the semantics of repeated definitions (as in the shadowing examples up-thread) would be DIFFERENT at top-level, from inside expressions, or (b) they’d be the SAME. Both of these choices have real problems. #a is obvious, and #b means that you can’t even shadow inside expressions

  • There are good reasons for let rec to be required for recursion: recursion is intrinsically different from nonrecursive code, and historically many languages have required it to be marked as such by the programmer.

    -> This is a really important point. Is Ocaml supposed to be a language that is only about supporting the “unreflective (as in unthinking) programmer”? Or is it supposed to support the programmer who wants to carefully write code and reason about its correctness? Making non-recursive-ness a property that requires checking an entire file is a good way to make proving one’s code correct (in the desk-checking, Dijkstra/Hoare/Gries manner, not the Coq manner) harder. let rec is a way of ensuring that when you’re doing your analysis, you know precisely where the code is that needs to be considered.

To my mind, the most important property of let rec is the last: It is a way for a programmer to mark that they intend that recursion occur. And that is important in the same sense that strong typing is important.

@jonathandoyle see Function argument polymorphism and ocaml #8513 for the discussion and further references.

For your record problem, you can use lenses in OCaml today. The lenses are essentially encoding the position of the field in the records. Indeed, in OCaml,

type t = { x: int; y:int}
let f r = r.x

and

type revt = {y:int; x:int}
let f r = r.x

are very different functions. The first compiles to an access of the record's first field, and the second to an access of its second field. So the problem is larger than just a type system problem.
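
A minimal sketch of the lens idea for these two record types. The lens type and the names t_x, revt_x and incr_via are hypothetical and written by hand here; they are not taken from any particular lens library:

```ocaml
(* A lens packages a getter and a functional setter for one field. *)
type ('s, 'a) lens = { get : 's -> 'a; set : 'a -> 's -> 's }

type t = { x : int; y : int }

let t_x : (t, int) lens =
  { get = (fun r -> r.x); set = (fun v r -> { r with x = v }) }

type revt = { y : int; x : int }

let revt_x : (revt, int) lens =
  { get = (fun r -> r.x); set = (fun v r -> { r with x = v }) }

(* Generic over the record type: the lens carries the field position. *)
let incr_via l s = l.set (l.get s + 1) s

let () =
  let a : t = incr_via t_x ({ x = 1; y = 2 } : t) in
  let b : revt = incr_via revt_x ({ y = 2; x = 1 } : revt) in
  assert (t_x.get a = 2 && revt_x.get b = 2)
```

The function incr_via never mentions a field name, so it works for both layouts; the cost is that each lens has to be written (or generated) per field.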

Not a major pain point here (as there’s a working solution) but I’d like to see first class support for “deriving framework” from compiler toolchain (w/o ppx).

I think the deriving construct is pretty well understood and useful to be available out of the box. Now I don’t like that I have to use ppx (and then ppxlib, base, stdio, … long chain of dependencies) just to derive some comparator or sexp encoder/decoder. Even more than that — I have to impose those dependencies on the consumers of my software.
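
For a sense of what a deriver saves, here is a hand-written, stdlib-only sketch of the kind of boilerplate involved. The type point and the function names are illustrative, and this is not the output of any actual deriver:

```ocaml
type point = { x : int; y : int }

(* Lexicographic comparison, field by field. *)
let compare_point a b =
  match compare a.x b.x with
  | 0 -> compare a.y b.y
  | c -> c

(* A simple printer for debugging output. *)
let show_point p = Printf.sprintf "{ x = %d; y = %d }" p.x p.y

let () = print_endline (show_point { x = 1; y = 2 })
```

Every added field means touching both functions by hand, which is the maintenance cost that deriving removes.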

I worry that this is in effect a request for a less modular compiler ecosystem, and as a result a request for a slower development process, for both extensions (like deriving) and for the compiler proper.

The current PPX architecture could use strengthening, but I don’t see why we’d be better off by stuffing more things into the compiler proper that can effectively be built outside of it.

Why is this an onerous list of dependencies? ppx_fields_conv, for example, requires base and ppxlib…and that’s it? (well, there’s the runtime library, fieldslib, but that’s to be expected. And there are no further dependencies of that.) And Base is portable and builds quickly.

y

I wholeheartedly agree. Something stable and future-proof, with a well-defined scope (as in Rust or Haskell) and a handful of built-in derivers, would go a very long way.

I would really enjoy curried type constructors like Haskell

Yaron, you’re absolutely right here. For a while camlp4 (the predecessor) was the way to go for syntax-extensions and such. It was possible to develop PPX with very little interaction with the core Ocaml system, and then later to switch over to it. If camlp4 had been more-deeply-embedded in Ocaml, it would have been much more involved to swap it out.

Another example: first it was ocamlbuild, then oasis, and now dune, that is the “more-or-less official build system for ocaml projects.” But I continue to prefer using Makefiles, because my projects invariably involve lots of code in C/C++ and other languages (and with GNU autoconf, this is easy to handle).

If dune had been baked-in as Ocaml’s build-system, my preferences would have been … more fraught.

Ocaml’s -openness- (real openness) and modularity are both important values, and it would be awful if they were sacrificed for an illusory “unitary system”.

One last thing: I’ve worked a TON on the inside of the JVM and Java. Java is a great example of a system that incorporates many, many functionalities that in Ocaml are found as add-ons. And the way that Java did this was just to incorporate those functionalities by fiat. The result was that when better versions of those functions became available, they were difficult-to-impossible to incorporate well. It also means that changes/improvements to libraries end up being chained to JVM/JDK update cycles. So a bugfix (that breaks previous clients – viz StringTokenizer) for a class in JDK 1.1 ends up in JDK 1.2, and the previous behaviour is not available in JDK 1.2.

Whereas the somewhat-more-loose coupling between libraries and Ocaml versions makes it more likely that when you upgrade from Ocaml version N to N+2, your libraries can keep working – because they were tested with both versions.

If what’s desired is a “packaged set of Ocaml + critical libraries” then really, anybody who wants to, can offer that, and make sure it builds and deploys on the desired platforms.

Arguably, if camlp4 had been part of OCaml, there would have been no need to swap it out. It would have worked well with every new release and been part of merlin.

The problem with ppx is precisely that it’s tightly coupled to OCaml versions, unlike normal libraries. It’s probably going to be fixed eventually, but a deriving mechanism would by definition work on new versions without hassle.

Arguably, but I’d argue against! Camlp4’s basic design made creating good dev tools harder. The fact that camlp4 provides for an extensible grammar means that something as simple as a syntax highlighter or indenter can’t be written in a way that works across different camlp4 extensions.

PPX was an example of learning a lesson from Lisp: metaprogramming should for the most part be done within the single AST of the language.

I think another lesson to be learned from Lisp and its descendants is that metaprogramming is an extremely useful and powerful tool, and it should be treated as a first-class member in the ecosystem. I understand the real problems and limitations of our current PPX tooling; but I think the enlightened approach is to push towards making them better, not push for avoiding metaprogramming.

y

Well, except that that mechanism would need to be upgraded as the AST changed[1], and hence

  • would slow down releases
  • the people who work on deriving would have to be involved with the release process
  • late-appearing bugs in deriving would hold up the release
  • ocaml users who didn’t rely on deriving would wait longer for new releases

and on and on. I’ve worked on commercial software and seen all these forms of dysfunction and more. And I really think you don’t want it for Ocaml.

[1] to clarify: one way or another, when the AST changes, somebody’s gotta update the deriving mechanism – whether it’s the “deriving team” or the “core team who also own deriving” doesn’t change things. If deriving were absolutely essential to using Ocaml, it might make sense to incorporate it into the core system, but I don’t think that that’s the case.

I might be naive but I’m assuming that deriving is pretty much “done”. I could be wrong though.

And then ppxlib requires stdio, ocaml-migrate-parsetree… It’s not a big deal, you are right. I’m just expecting such functionality to be provided out of the box.

I can understand your point and I think I agree. Maybe my frustration is just about expectations, and it could be solved instead by a separate OCaml distribution that would encompass an OCaml compiler toolchain, some base ppx infra and a sensible stdlib (base or containers or …). That distribution’s release cycle could be based on the OCaml compiler’s release cycle but shifted later so it can adapt to changes in the latter.

At the same time 4.08 now has let operators which effectively replace ppx_let — another well known and well understood construct.
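
A short sketch of the let operators mentioned above, using a user-defined let* for option. The helper names safe_div and calc are made up for illustration:

```ocaml
(* Binding operator for option: short-circuits on [None]. *)
let ( let* ) o f =
  match o with
  | None -> None
  | Some x -> f x

let safe_div a b = if b = 0 then None else Some (a / b)

(* Reads like sequential code, with no ppx involved. *)
let calc a b c =
  let* q = safe_div a b in
  let* r = safe_div q c in
  Some (q + r)

let () =
  assert (calc 100 5 2 = Some 30);
  assert (calc 1 0 2 = None)
```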

It is said that no piece of software is ever “done”; there is always another bug, or incompatibility, or feature request.

There is so much to be done in deriving: