What are the biggest reasons newcomers give up on OCaml?

Yeah they’re definitely serviceable APIs, but I wouldn’t call them particularly friendly. They expose a lot of detail such as permissions, modes, etc., which is great for power users but maybe not ideal if you just want to read from a file and print some data. Like if I were a novice programmer and trying to read from a file, that In_channel docs page would be a little intimidating and confusing. Naming-wise, it’s also not what I’d have in mind for an IO module.

This docs page almost immediately points to the Examples section which shows how to read from a file…

These examples fail to avert the footguns. If you want to read a file into a string, or write a string to a file, regardless of stdin/stdout, the runes end up a bit more complicated than that. See for example here for sample code (using the CLI convention that - denotes standard channels).
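To give an idea of the extra care involved, here is a minimal sketch (not the sample code linked above) of reading either a file or standard input into a string with the Stdlib In_channel module, assuming OCaml 4.14 or later. The name `read_all` and the choice to return a `result` instead of raising are assumptions made for this example.

```ocaml
(* Read the whole contents of [file] into a string.
   By CLI convention, "-" denotes standard input.
   I/O failures are returned as [Error msg] instead of raising. *)
let read_all file =
  try
    if file = "-" then begin
      (* Avoid newline translation on platforms that do it. *)
      In_channel.set_binary_mode In_channel.stdin true;
      Ok (In_channel.input_all In_channel.stdin)
    end
    else Ok (In_channel.with_open_bin file In_channel.input_all)
  with Sys_error msg -> Error msg
```

Writing is symmetric with Out_channel, with the extra wrinkle that stdout should be flushed but not closed.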

And of course the day you want to add a compression/encryption layer you are busted; that’s why bytesrw exists.

This is one area where package managers like Nix and Guix shine. They allow you to instantiate “shells” that pull in a set of packages defined either in a config file or as an ad-hoc list on the command line. The best part is that they’re not limited to any single language’s ecosystem.

Note that technically neither is opam.

Ah, I didn’t see that, thanks. Is there any reason it’s not the first item on the page? Having it buried under a link to the bottom of the page isn’t ideal imo.

Also, I’d argue that an API for a common use case such as reading/writing to a file should be a little simpler and easier to recall. Like I can recall fs::write and fs::read_to_string a lot more easily than In_channel.with_open_bin file In_channel.input_all. Like Rust has the OpenOptions API for complicated use cases, but in practice most people just use the fs or File API.

All good feedback, and I am sure a PR to improve the docs and add a convenience function would be seriously considered. I don’t know if read_to_string file really buys you all that much over with_open_bin file input_all, but it’s worth starting the discussion.
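For the sake of discussion, such convenience functions could be as small as the sketch below. The module name `Fs` and both function names are hypothetical (borrowed from the Rust names mentioned in this thread) and are not part of the OCaml standard library; the bodies only use In_channel and Out_channel functions available since OCaml 4.14.

```ocaml
(* Hypothetical convenience wrappers; NOT part of Stdlib. *)
module Fs = struct
  (* Read the whole file at [path] into a string. Raises [Sys_error]. *)
  let read_to_string path =
    In_channel.with_open_bin path In_channel.input_all

  (* Create or truncate the file at [path] and write [contents] to it. *)
  let write path contents =
    Out_channel.with_open_bin path (fun oc ->
        Out_channel.output_string oc contents)
end
```

With this in scope, `Fs.write "out.txt" "hi"` followed by `Fs.read_to_string "out.txt"` round-trips the string.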

EDIT: after reading Alice’s reply, I feel it’s worth mentioning that the OCaml standard library is generally not targeted towards zero-knowledge newcomers. You do need to study it a little bit to be able to use it (hopefully not too much). To be honest, I feel Rust has its obscure bits of the standard library too, where you would need to hunt through different pages to get what you need, so singling out OCaml for this doesn’t make sense.

A read_to_string function like this is frankly a bad idea. In general, you’ll want to avoid reading the entire file into memory if possible.
Making the path of least resistance do the wrong thing without an obvious way to improve it (like switching from In_channel.input_all to incremental reads) seems pretty counterproductive.
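As a rough illustration of what "incremental reads" can look like, the sketch below folds over the lines of a file one at a time, so only the current line is held in memory. The helper names `fold_file_lines` and `count_lines` are made up for this example; `In_channel.input_line` and `In_channel.with_open_bin` are from the standard library (OCaml 4.14 or later).

```ocaml
(* Fold [f] over the lines of the file at [path], one line at a time.
   Only the current line is kept in memory. *)
let fold_file_lines f init path =
  In_channel.with_open_bin path (fun ic ->
      let rec loop acc =
        match In_channel.input_line ic with
        | Some line -> loop (f acc line)
        | None -> acc
      in
      loop init)

(* Example use: count lines without loading the whole file. *)
let count_lines path = fold_file_lines (fun n _line -> n + 1) 0 path
```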

I’m a longtime OCaml user (been using caml-light since 1992) so for me, these niceties about which API is presented in which library just don’t matter. But I take @NicholasLYang's point, that the library that a new user sees -first- should be a natural one, and that’s not the same as “the lowest-level one” (which is kind of what the Stdlib is, eh?)

I read his description of file I/O, and thought “oh, that’s what bos is for!” but how is he supposed to know that?

It feels like (haha, I bet this has been discussed N times already) what’s needed is something like Batteries, or Core, but specifically constructed of only easy-to-use APIs, perhaps not achieving full coverage of all uses, but really easy to use. I don’t know how one gets there from here, when the only people who could/would construct such a thing are long-time users. After all, @welltypedwitch has a point that “read to string” isn’t always a good idea. Nevertheless, it -is- the lowest-effort API for reading from a file, and maybe that’s why it ends up in that “simplest-to-use API”.

Dunno, just riffing here.

It is possible to provide both, like containers has done for years:

and so on, and the easy, read-whole file API:

I wouldn’t say containers covers the most advanced use cases (mostly
because this is old code and I didn’t care as much about, say, reusing
buffers) but it’s good enough for a lot of use cases.

As an aside, I think providing a std::fs::read equivalent is good
because it’s perfectly fine to read the whole file in memory for many
use cases, notably config files. That includes a lot of modern yaml
stuff.

Is reading a file into memory such a huge issue? If it causes performance issues, the user will notice and optimize accordingly. It’s not like this will cause a security bug. Likewise, if the API is named accordingly, i.e. read_to_string, I think the drawbacks will be quite obvious. To be frank, I think calling this a “bad idea” is a bit of an overgeneralization. There are many situations where reading a file into memory is warranted, whether that’s prototyping or reading small config files.

I don’t think any of these suggestions regarding the standard library are a bad idea, and they could be suggested upstream for inclusion. The problem is that everyone has a different notion of what should be included in the standard library, and this makes finding consensus elusive.

Cheers,
Nicolas

This resonates with me (long-time user). Syntax errors are often a head-scratcher when they provide approximately zero information. I’d argue there’s no need to recover from syntax errors, and the added complexity is probably not worth it, but maybe we could integrate a fallback analysis into Merlin [or ocamlformat, see below] in case of syntax errors.

It would also be nice if ocamlformat didn’t refuse to do anything on syntax errors, since it could otherwise help highlight where the root cause of the error lies.

The lack of advanced learning resources is indeed a stopper.

GADT documentation for people coming from C, C++, Python, or JavaScript is nearly nonexistent. When browsing OCaml’s ecosystem, I often felt like I was doing research. People in the industry are not starting a PhD and they generally do not have the theoretical background. That being said, I really appreciated answers from the maintainers on Discord or here in this forum.

The lack of documentation also affects the entire ecosystem where, often, the best documentation you can find is a readme and a few examples. Better documentation attracts more people, who write more packages in return. The ecosystem is big for research projects but small from an industrial point of view.

Last but not least, it would be a very good thing to merge all the standard libraries. I do not think a community of that size can afford a split.

OCaml still has a lot of strengths and could easily provide the perfect platform to create Web applications. Keep on going!

These are common pain points, often brought up by users wanting to start using OCaml in earnest. But there is no magic solution to be had. It is hard to compare the situation with languages that have massive industrial backing like C/C++/JavaScript/etc. It is true that the current state of affairs makes OCaml a difficult pick in industrial settings. Having said that, I believe that those who are willing (and able) to “take the plunge” generally find out that the ecosystem, even in its current state, is quite capable of sustaining industrial development of high quality.

While I agree that more resources to learn advanced type-theoretic techniques are always nice, I think the “need” to understand GADTs for everyday programming is overblown and something of a red herring.

I would say that 95% of programming needs are satisfied by the core language: functions and recursion, (ordinary) variant and record types, pattern matching, etc. Having a solid understanding of these basics will take you a long, long way, and you can very well make a living out of it without ever using “fancy” techniques like GADTs.

In my experience (in an industrial setting), instances where advanced techniques like GADTs actually solved a problem that could not be solved more simply using the core language were (very) few and far between. Most of the time, the “solution” using GADTs was less flexible, more complicated, and harder to understand than the alternative.

Just my 2c.

Cheers,
Nicolas

Does Merlin use the OCaml compiler to do semantic analysis? If so, then I think error recovery is important. It’s also quite nice to have multiple error messages rather than a single syntax error with no recovery. But agreed that it’s maybe not a top priority.

Thank you for taking the time to reply.

I guess I was unlucky then. I tried to implement an Entity Component System to make a game running on the Web, and I needed GADTs to handle queries over different component types.

Maybe my design was wrong, but I found a similar pattern in caqti too. Printf also uses GADTs. While it is true that the vast majority of the codebase does not use GADTs, I do not want my project to get stuck because of that.
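To make the kind of pattern concrete, here is a minimal sketch of a typed component lookup using a GADT. It is not the actual design from my project nor anything taken from caqti; the component names `Position` and `Health`, the `binding` wrapper, and the `get` function are all invented for illustration.

```ocaml
(* Each component key carries the type of the value it stores. *)
type _ component =
  | Position : (float * float) component
  | Health : int component

(* Existential wrapper: a key paired with a value of the matching type. *)
type binding = B : 'a component * 'a -> binding

type entity = binding list

(* Type equality witness. *)
type (_, _) eq = Refl : ('a, 'a) eq

(* Compare two keys; on success, also prove their value types are equal. *)
let same : type a b. a component -> b component -> (a, b) eq option =
 fun k1 k2 ->
  match (k1, k2) with
  | Position, Position -> Some Refl
  | Health, Health -> Some Refl
  | _ -> None

(* Look up a component; the result type follows from the key. *)
let rec get : type a. a component -> entity -> a option =
 fun key entity ->
  match entity with
  | [] -> None
  | B (k, v) :: rest -> (
      match same k key with
      | Some Refl -> Some v
      | None -> get key rest)
```

Here `get Health e` has type `int option` while `get Position e` has type `(float * float) option`, without any casts at the call site.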

GADTs have been around since 2012; I find it sad that no good documentation has emerged in over a decade. People like to learn good tricks, especially if they are beautiful, even if they do not often use them (I find building an hlist using currying beautiful).
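One possible reading of "hlist using currying" is a heterogeneous list whose type index is the curried function type that consumes its elements. The sketch below is written under that assumption; the module name `Hlist` and the function `apply` are illustrative, not taken from any particular library.

```ocaml
module Hlist = struct
  (* [('f, 'r) t] is a list of arguments for a curried function of
     type ['f] whose final result is ['r]. *)
  type (_, _) t =
    | [] : ('r, 'r) t
    | ( :: ) : 'a * ('f, 'r) t -> ('a -> 'f, 'r) t

  (* Feed the elements of the list to [f], one argument at a time. *)
  let rec apply : type f r. f -> (f, r) t -> r =
   fun f args ->
    match args with
    | [] -> f
    | x :: rest -> apply (f x) rest
end

(* The list holds an int, a string and a float; the function must match. *)
let total =
  Hlist.(
    apply
      (fun a b c -> a + String.length b + int_of_float c)
      [ 1; "ab"; 3.0 ])
```

Because the module redefines `[]` and `(::)`, the usual list literal syntax builds the heterogeneous list once `Hlist` is opened locally.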

Another example is that the accumulator trick is not explained in the official manual.
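Assuming "the accumulator trick" means the usual way of making a recursive function tail-recursive by threading the partial result through an extra argument, a minimal sketch looks like this (the function names are made up for the example):

```ocaml
(* Direct recursion: the addition happens after the recursive call
   returns, so each element needs its own stack frame. *)
let rec sum_naive = function
  | [] -> 0
  | x :: rest -> x + sum_naive rest

(* Accumulator version: the running total is passed along, the
   recursive call is the last thing done, and the stack stays flat. *)
let sum l =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + x) rest
  in
  go 0 l
```

Both functions compute the same result; the second one runs in constant stack space regardless of the list length.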

I don’t think I will give up on OCaml, but I hope I will easily be able to use JS/TS libraries, and that the initial cost will pay off when I have to refactor my code.

I agree that there’s a difference in investment (see the creator of Elm’s talk on programming language economics), but I do want to bring up the importance of culture. If there’s a culture of documentation and clear examples, then maintainers will consider that as a necessary requirement for package adoption. Even if they can’t provide the labor themselves, they may make issues asking for help or direct new contributors to work on docs. If there’s a culture of “the type signature is the documentation”, then there is less interest in writing docs.

Also a culture of documentation results in corresponding tooling. Rust has mdBook and rustdoc/docs.rs. OCaml has the lovely docs site which is a great equivalent to docs.rs, but I’m not sure there’s a mdBook equivalent. Tbh I’m not sure there even needs to be an “equivalent”. mdBook is language agnostic and can be used for OCaml.

Some examples of well documented crates in Rust land: miette, clap, thiserror, serde. Are these perfect in their docs? Certainly not. I regularly find holes or have to ask questions. But they are quite decent in having good examples and links to guides.

It’s documented in the manual: OCaml - Generalized algebraic datatypes

The OCaml manual is formatted as a book and is also available as PDF and plain text. The odoc tool can also produce high-quality documentation.

Some examples of well documented crates in Rust land

There are similar examples in OCaml land:

Every ecosystem has similar uneven experiences with documentation. I can easily find sparsely-documented Rust packages. While ‘culture’ feels important, it can be very hard to pin down and I don’t know if we can really draw a conclusion that OCaml doesn’t have a culture of documentation. Earlier I pointed to the standard library page which mentioned examples near the very beginning of the page. If people don’t read the first few lines of the page, what can be done?

You are missing my point because you removed the context. I could have cited other sources, starting with the chapter in Real World OCaml, which gives more examples. My point is that, IMO, the documentation does not target an audience of newcomers, i.e., people coming from other mainstream languages like C/C++/Python/JS/TS.

Just in the paragraph below, I gave another example: the accumulator trick is not explained in the official manual.

You might have a different opinion, because how steep a book’s learning curve feels depends on the reader’s background, but from my point of view this book is far more accessible to newcomers because it explains the concept, not only the OCaml way of implementing the concept. I wish there were something similar for GADTs covering hlist, hmap, printf, type erasure, unifying variables, when not to use GADTs, etc.

Just my 2c.
