What are the biggest reasons newcomers give up on OCaml?

Note that even though it wasn’t designed for that, the omod command line tool that comes with omod (a toplevel helper that loads what you want by module name rather than by library name, which I find more natural) can help answer these questions – though it reports them at a lower level. For example:

> omod cobj Misc
Misc 9d0e49da1703d17dc3043e52bca13fe5 /Users/dbuenzli/.opam/4.14.0/lib/ocaml/compiler-libs/misc.cmi
Misc 9d0e49da1703d17dc3043e52bca13fe5 /Users/dbuenzli/.opam/4.14.0/lib/ocaml/compiler-libs/misc.cmx
Misc 9d0e49da1703d17dc3043e52bca13fe5 /Users/dbuenzli/.opam/4.14.0/lib/ocaml/compiler-libs/ocamlcommon.cma
Misc 9d0e49da1703d17dc3043e52bca13fe5 /Users/dbuenzli/.opam/4.14.0/lib/ocaml/compiler-libs/ocamlcommon.cmxa

Since nowadays a library foo defined in an opam pkg is most of the time installed as $(opam var lib)/pkg/foo/foo.cm[x]a, you should be able to infer the library name and opam package of a given module name (more precisely, of a given compilation unit name).

But, as the example above shows, that’s not always the case.
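To make the inference concrete, here is a minimal shell sketch, assuming the $(opam var lib)/pkg/foo/foo.cm[x]a layout described above. The helper name `guess_lib` and the `/opt/lib` root are hypothetical; this is not part of omod or opam. It only splits the path and reports when the directory and archive names disagree, which is what happens with the compiler-libs example.

```shell
# Hypothetical helper, not part of omod or opam.
# Assumes archives are installed as <lib_root>/<pkg>/<lib>/<lib>.cm[x]a.
guess_lib() {
  lib_root=$1                # e.g. the output of `opam var lib`
  path=$2                    # path to a .cma or .cmxa file
  rel=${path#"$lib_root"/}   # pkg/foo/foo.cmxa
  pkg=${rel%%/*}             # first path component: pkg
  rest=${rel#*/}             # foo/foo.cmxa
  libdir=${rest%/*}          # foo
  file=${rest##*/}           # foo.cmxa
  archive=${file%.*}         # foo
  if [ "$libdir" = "$archive" ]; then
    echo "package=$pkg library=$archive"
  else
    # The convention does not hold, so the library name cannot be inferred.
    echo "package=$pkg library=? (directory '$libdir' vs archive '$archive')"
  fi
}

guess_lib /opt/lib /opt/lib/pkg/foo/foo.cmxa
guess_lib /opt/lib /opt/lib/ocaml/compiler-libs/ocamlcommon.cma
```

The first call follows the convention; the second mirrors the compiler-libs output and falls into the "cannot infer" branch.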

3 Likes

Wow. That is a very useful tool.

I agree that calling opam init by hand is quite annoying for newcomers. Unless I missed the ticket, I guess this is not a thing because nobody ever thought of opening a feature request for it. I opened one here: https://github.com/ocaml/opam/issues/5395. Maybe there are good reasons for not doing it, I don’t know, but if there are they’ll be discussed there at least.

That shouldn’t be the case. My guess as to why you think that is that you didn’t accept that opam modifies your shell config files during your first opam init. No program can automagically modify the environment without modifying your shell config, so that’s why it is needed in the first place.

$ opam init

<><> Fetching repository information ><><><><><><><><><><><><><><><><><><><><><>
[default] Initialised

<><> Required setup - please read <><><><><><><><><><><><><><><><><><><><><><><>

  In normal operation, opam only alters files within ~/.opam.

  However, to best integrate with your system, some environment variables
  should be set. If you allow it to, this initialisation step will update
  your bash configuration by adding the following line to ~/.profile:

    test -r /home/opam/.opam/opam-init/init.sh && . /home/opam/.opam/opam-init/init.sh > /dev/null 2> /dev/null || true

  You can always re-run this setup with 'opam init' later.

Do you want opam to configure bash?
  1. Yes, update ~/.profile
  2. Yes, but don't setup any hooks. You'll have to run eval $(opam env) whenever you change your current 'opam switch'
  3. Select a different shell
  4. Specify another config file to update instead
> 5. No, I'll remember to run eval $(opam env) when I need opam

[1/2/3/4/5] 5

[...]
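For anyone wondering why the hook or the eval is needed at all, here is a small sketch in plain sh. The variable name DEMO_VAR is made up for the illustration and has nothing to do with opam. A child process can only change its own environment, so the parent shell has to evaluate the child's output itself, which is the role `eval $(opam env)` plays.

```shell
# DEMO_VAR is a hypothetical variable used only for this illustration.
unset DEMO_VAR

# A child process sets and exports the variable, then exits.
sh -c 'DEMO_VAR=from-child; export DEMO_VAR'
echo "after child: ${DEMO_VAR:-unset}"   # the parent shell is unaffected

# The child prints shell code instead, and the parent evaluates it.
eval "$(sh -c 'echo "DEMO_VAR=from-eval; export DEMO_VAR"')"
echo "after eval: ${DEMO_VAR:-unset}"    # now set in the parent shell
```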

Now there could be improvements to the way we display this information. If anyone has ideas, you’re most welcome to open a feature request on our bugtracker.

2 Likes

First, before I write anything else, I don’t find this to be problematic. I have a bash function:

function refresh_switch {
  # Quote the substitution so the output of opam env is evaluated as one string.
  eval "$(opam env)"
}

so I can do

opam switch 5.0.0
refresh_switch

and everything is straight and clean. I could combine the two … which leads now to an observation.

There’s this thing called “Anaconda” (aka “conda”). Maybe some readers know of it. It’s an environment that started off as “better than virtualenv+pip for python” but evolved into a sort of crazy-on-steroids package-manager for data science. It’s … well, it’s better than building packages by hand, but pretty wild-west and lots of things break. But for your average chemist/physicist, it’s a lifesaver, b/c really, building stuff from source is death. Hot flaming death.

Where am I going with this …

Anaconda supports multiple “environments” (a lot like switches) and you can switch really effortlessly. To switch to an environment named “Goo” you do

conda activate Goo

and all the environment variables get set, etc, etc, etc.

How does it accomplish this? Well, obviously “conda” is a bash function. And there are versions of it for other shells. So once you see that, it’s obvious that you can put all the smarts into a single command, and no need to do any eval argle bargle trickery.
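Applied to opam, the same trick could look roughly like the sketch below. The name `opam_switch` is hypothetical and not something opam ships; it only chains the two steps shown earlier (`opam switch` followed by evaluating `opam env`) inside a shell function, so the environment change lands in the calling shell.

```shell
# Hypothetical wrapper, not provided by opam.
# Because it is a function, it runs in the current shell, so the eval
# below can update this shell's environment.
opam_switch() {
  # Change the switch first; only reload the environment if that succeeded.
  opam switch "$@" && eval "$(opam env)"
}

# Usage (requires opam to be installed and the switch to exist):
#   opam_switch 5.0.0
```

A real version would need one definition per supported shell, which is the extra work mentioned here.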

OK. So to sum up, there’s a way to avoid the (to me minor) extra steps. It involves some extra work by the opam developers. Me, I think I’d rather they spend their time on other things, and this particular thing is something that a reasonably experienced BASH developer could knock out as a contribution.

Anyway, my 2c

3 Likes

It was written to provide a constructive proof that one didn’t need additional metadata (i.e. ocamlfind META files) in order to link libraries or use a given compilation unit since the whole dependency graph is already written on disk in the compilation artefacts.

ocamlfind’s META files are more powerful but 99% of the time this power is not used and omod will do. IIRC I found a single, rather rare, pattern on which it fails (it was an odd thing made by the digestif package, which entails an implicit dependency on a C stub that cannot be discovered).

As a side effect it shows that there is a lot of untapped user experience improvements available upstream – e.g. autoloading modules on first use in ocaml. But it would need more user experience thinking upstream, rather than the current feature-based mindset. Workflows, not features.

Ultimately I think this is the biggest hurdle facing OCaml: it will be difficult to compete with languages that take user experience seriously. And that doesn’t mean you need cargo upstream as people seem to think nowadays; a simple notion of library would already go a long way. It’s rather a constellation of details that support established development and software life-cycle practices (deprecation, debugging, profiling, testing, etc.) in which the compiler should be on the front line to help but is more often missing in action at the moment.

16 Likes

For what it’s worth, I think of the point that Jon Harrop was making a few times over the years:
OCaml’s place in the language design space eschews the various forms of overloading: overloaded functions, overloaded operators, all-computation-is-methods OOP languages, type classes. This makes OCaml code more understandable, which is a downside because it makes it harder to write code that one doesn’t understand.

4 Likes

No Unix program 🙂 As it happens, the Windows version of opam 2.2 does this automatically without needing any shell configuration both in Cmd and PowerShell (unless you explicitly configure opam not to do it).

1 Like

Yes and no. I’d say that modern OCaml is now trying to eschew overloading, but it has the legacy of type-unsafe polymorphic equality, comparison and hashing, and handy things like List.assoc and Hashtbl built atop them. I’m guessing those aren’t going away for practical reasons?

The problem with [%derive.show: t] and friends is that they accept types and not values. That’s a problem because it requires the type to be written somewhere in a language where not having to write out types is a core feature. The place where that chafes the most is tuples because they are almost always inferred. In that case the resulting code was worse than using combinators:

Print.(list (list int)) data

As you say, having them accept values would go against the spirit of modern OCaml.

FWIW, I’m curious about the opposite language design. What if a language had ints, floats, chars, tuples, strings, arrays, sets and maps with polymorphic equality, comparison, hashing, pretty printing and serialization built-in? Not academically interesting but perhaps very pragmatic.

2 Likes

L̶a̶c̶k̶ ̶o̶f̶ ̶m̶u̶l̶t̶i̶c̶o̶r̶e̶ ̶s̶u̶p̶p̶o̶r̶t̶

3 Likes

Built-in derivation that is delightful to use, yes! ppx_deriving baked in, plus additional goodies for making printing/formatting simpler. The second for me: debugging.

I have influence in technology selections periodically at my company. I cannot sensibly offer ocaml as a contender in good faith without these. To offer this toolkit would reveal my bias, not my good faith argument that it’s a sensible engineering decision, knowing the hurdles to productivity.

8 Likes

The issue is that ppx doesn’t have access to the typechecker’s output. This is one of the major things that distinguish a macro-based system like Rust’s from ppx.

2 Likes

I don’t think that’s right. Or, more precisely, I don’t think that that’s correct in a surface reading. Rust’s macro system is actually much weaker than PPX. Much, much weaker. What makes it work, is that you combine it with traits/modular-implicits. That’s where the type-system access comes from.

The two combined, effectively, yes, give macros access to the type-checker.

4 Likes

In my experience programming in several languages, I usually look for particular words in the error messages, looking for the signal in the noise. Here’s a typical type error:

File "./test.ml", line 1, characters 12-16:
1 | let x = 1 + true
                ^^^^
Error: This expression has type bool but an expression was expected of type
         int

Note: the only colouring used in this error message is for the ^^^^ and the word Error. Even leaving that aside, the layout of the message is not great. I need to scan through the entire message to figure out which types are mismatching. Imho if we printed a simple diff style it would be much more easily scannable:

Error: type mismatch
File: ./test.ml:1:12-16
1 | let x = 1 + true
                ^^^^
Expected:
  int
Actual:
  bool

From there we can actually talk about more sophisticated diffing capabilities for complex types.

some projects have gone overboard with these features in a way that hurts usability

Perhaps, but even a complex type mismatch error may be improved by a careful layout and diffing to make it easier for the human reader.

11 Likes

I hope this isn’t off topic.

I’ve been using OCaml in an Honors CS1 course since 2014. My dept. is unlikely to continue using OCaml in that course now that I’ve finished teaching it, even though OCaml is truly great for teaching once you overcome the barriers-to-entry. The stronger students in the Honors course love OCaml but it doesn’t resonate as well for the average student who is wondering why we aren’t using resume-friendly Python like their friends. Some grudgingly accept learning OCaml in part because they see that it’s used in prestigious schools like Cornell & Harvard. If Cornell & Harvard stopped using OCaml enrollment in my Honors course would drop.

Installation of OCaml and graphics libraries is a very significant problem. Wealthier students have Macs so they have less trouble but many students have inexpensive Windows systems and installation issues for Windows students have been a show-stopper for many of them.
The situation has been improving but not enough to win over enough students and I’ve made no headway in recruiting colleagues to continue using OCaml in Honors CS1. It’s a bummer. If OCaml had a one-click install teaching environment things might be different.

14 Likes

TypeChecking MLton has great type errors that put brackets around the parts of the type that are different. (I’m not sure if I agree with its approach of replacing the parts of the type that are the same with underscores, because seeing the full type helps me figure out where I went wrong if the type or expression is complicated.)

1 Like

For this particular error, I think it would be nice (maybe with additional compiler flag) to see a (truncated?) chain of how both the “actual” and “expected” were inferred, because sometimes the mistake isn’t at that particular location… i.e. actual type is “x” because it was inferred from usage on line ### to be “y”, which in turn was inferred from line ### to be “z”… (and similarly for the expected type)

Well, have you considered proposing your preferred error format as a PR to the OCaml compiler?

4 Likes

I know this thread is about improvements, but I would like to give thanks to the compiler team for the improvements in type errors over the years. I notice it especially in terms of polymorphic variants. It used to be that you’d just be told the two variants don’t match, and now it tells you what’s missing. Other messages have improved a lot over the years too. There is always room for improvement, but error messages are really a lot better now.

15 Likes

Looking at some Rust macro code, you’re absolutely right. I thought their macros did more stuff under the hood, but in fact many of them serve as efficient convenience functions, and Traits do most of the heavy lifting.

I’ve been playing around with it locally but haven’t had time to dig deeper:

diff --git a/typing/printtyp.ml b/typing/printtyp.ml
index b0bf36ceb..48e428d3a 100644
--- a/typing/printtyp.ml
+++ b/typing/printtyp.ml
@@ -2346,7 +2346,7 @@ let head_error_printer mode txt_got txt_but = function
   | None -> ignore
   | Some d ->
       let d = Errortrace.map_diff (trees_of_type_expansion mode) d in
-      dprintf "%t@;<1 2>%a@ %t@;<1 2>%a"
+      dprintf "@[%t:@]\n       @[%a@]\n@[%t:@]@[%a@]"
         txt_got type_expansion d.Errortrace.got
         txt_but type_expansion d.Errortrace.expected

Will try to do so over the holidays.

2 Likes