Suggestions from the OCaml Survey result

In the OCaml User Survey topic, @xavierleroy wrote:

The OCaml survey 2020 is now closed. It attracted 745 replies. Everyone should be able to see the summary of results and to download the raw data.

I and a few others will try to write a summary of the replies, especially of the many free-form replies. Everyone is welcome to help sifting through the results! Short comments can be posted in this discussion thread, and longer analyses can be put somewhere else and mentioned here.

Thanks for putting together this great resource!

I think it’s quite important to take the results from this survey and find actionable items (or highlight ongoing efforts) which aim to improve different problems for different users. As a result I have put together some graphs trying to categorise the answers based on proficiency (beginner, intermediate, advanced and expert). These groupings are quite subjective but I thought it might offer a slightly more nuanced look into what problems users had, where they come from, what they are doing and what commonalities they share. The code is here and the HTML version of the notebook here (I’m no data-scientist nor python developer :snake: if you see glaring mistakes do raise an issue!).
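
To illustrate the kind of breakdown described above, here is a minimal sketch, not the notebook's actual code. It assumes a hypothetical list of responses (a self-rated proficiency plus the pain points each respondent selected) and computes, for each proficiency group, the share of respondents who chose each answer. The field names and sample rows are invented for illustration; the real survey export is structured differently.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (self-rated proficiency, pain points the respondent selected).
# These are invented sample data, not values from the real survey export.
responses = [
    ("beginner", ["documentation", "tooling"]),
    ("beginner", ["documentation"]),
    ("intermediate", ["missing libraries"]),
    ("expert", ["missing libraries", "hiring"]),
    ("expert", ["documentation", "hiring"]),
]


def share_by_group(rows):
    """Return {group: {answer: fraction of that group's respondents who chose it}}."""
    counts = defaultdict(Counter)
    sizes = Counter()
    for group, answers in rows:
        sizes[group] += 1
        # Use a set so that a respondent counts at most once per answer.
        counts[group].update(set(answers))
    return {
        group: {answer: n / sizes[group] for answer, n in counter.items()}
        for group, counter in counts.items()
    }


shares = share_by_group(responses)
for group in sorted(shares):
    # Most frequently chosen answers first, ties broken alphabetically.
    ranked = sorted(shares[group].items(), key=lambda kv: (-kv[1], kv[0]))
    print(group, ranked)
```

Normalising by group size matters here because the groups are unlikely to be the same size, so raw counts would overstate the concerns of the largest group.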

Here are some possible inferences (warning: opinions ahead) that can be made from the data along with possible actionable steps (or links to ongoing projects):

  • Everybody wants better documentation for user libraries. I think people want more long-form style documentation. However, excellent work is currently taking place with odoc (see also OCaml workshop video) and perhaps these concerns can be wrapped into this ongoing work.
  • Everybody wants multicore but “experts” want implicits a little more – multicore is coming along amazingly and implicits are in an earlier but active state AFAICT.
  • A wide variety of application domains: less proficient OCaml users tend to be doing more web-related tasks or data processing, while advanced/expert users are implementing programming languages and building developer tools, systems and formal methods. This is reflected in other breakdowns; for example, in “other fluent languages” JS is more popular with beginners and C with experts, and Coq and friends move up as proficiency increases. This is quite interesting, and I think it could point to an actionable item: working more on the web and formal methods components of the OCaml ecosystem, i.e. working with Coq devs and JS devs to help them.
  • From a JavaScript point of view, looking at the implementations section, it’s interesting to note that js_of_ocaml is more popular than Reason at every proficiency level except beginners. Both projects are actively worked on, but perhaps more js_of_ocaml documentation would be good, given that many tools already exist, like gen_js_api, Jane Street’s bonsai etc. In my opinion, this survey motivates working on unifying the JS world of OCaml as (at least to me) right now it feels a little fragmented and hard for someone new to know where to go.
  • Opam with the public repository is consistently the most used installation method across all users – this justifies the large amount of work that takes place on opam, for which I’m sure most are very grateful. Tying this back to the previous point of “cross community” work, the opam team have community meetings which seem to be very useful.
  • Pain points – the top two (although their order swaps between groups) are “lack of critical libraries” and “too hard to find and hire OCaml developers”. Some effort has been made to find these libraries, including in the survey. I’m not sure what else can be done. As for hiring, looking at the different communication channels there is quite a spread, with different users using different channels. This is great, but may also cause some fragmentation and make the “hiring” process worse.

These are just some thoughts from a relatively new OCaml community member, I would love to know what other actionable steps people think we can take from this data and hopefully we can produce a more concise and specific set of steps to put this survey to great use :))

We should consider announcing this survey on the ReScript forums next year, to get as many data points as possible.

I believe a common theme among the comments in the survey is a lack of structure found in documentation, libraries, and critical build tools. I’d argue that this is inevitable to some extent but developers coming from other languages seem to feel it acutely. For the realm of libraries, I could imagine that curated Opam repos could help. If somebody feels strongly about a domain, they could create an Opam repo that assembles tried and tested libraries that work well together. After several attempts I have little hope that a standard library, which is basically a code repository, will emerge. So maybe moving one level up to Opam repos could help. Such Opam repos can be still mixed and matched but could provide an easier entry into specific domains.

  • Use sourcegraph and other code search mechanisms to surface all the work in progress (WIP) libraries that people have put together on github/gitlab/etc which aren’t on opam, so that no one starts from scratch. People can at least start by copying someone else’s WIP library
  • Have some way for companies to announce that they have libraries that they don’t have time to formally open source, but can provide on demand. The hardest part about this problem is there is no positive incentive to duplicate effort and publish to a public and private repo… Maybe some match making could incentivize people to participate as both users and producers. So a website where people list their missing libraries, as well as people note their WIP libraries available upon request.

Thanks for this great analysis. Re:

(And this is not a question for you specifically, but) Is GSoC still a thing? Or Outreachy internships or other funded dev work at less than senior level? There are libraries out there in other languages that would be tremendously beneficial to port over to OCaml, and could legitimately be done over the course of a summer.

Thanks for the great suggestions so quickly! I’ll also set aside some time to summarise everything once some more ideas are posted and there is general consensus.

Completely agree – documentation is fundamental, and I think the lack of it is felt by most users. The limitations of build tools are perhaps felt more in advanced or complicated workflows.

This sounds really interesting – from experience I think quite a few beginners are unfamiliar with the ability to use multiple repos. I certainly was until I started cross-compiling to RISC-V :)) I think a good example of a curated repo would help drive this home more. Do you know if one exists beyond the cross-compiling ones, which aren’t exactly what we’re looking for?

This sounds good. I do wonder if more documentation/tutorials/awareness of dune-release and opam-publish might also help. I, for one, am sitting on a few libraries I should really publish but the process can be a little intimidating, so I still agree a search would be great but ultimately it would be good for libraries to make it upstream.

Thanks for the kind words :))

I’m not in a position to start making GSoC proposals, but they are definitely still a thing! For example, two projects I think are great still do them. Even outside of GSoC, perhaps a curated list of OCaml community-approved projects/internships that would benefit the whole community would be useful, so that different businesses/organisations/foundations could draw on it when setting up their own internships? At least everyone would be united on that front.

It might help for the people who have libraries that have reached that level of maturity. I think most people and companies are too busy to get their libraries to that level of maturity, or to commit to properly supporting a package put on opam. Direct, point-to-point sharing of source code can lower the fear/hesitation of “I won’t have time to support, answer questions, or clean up this code for someone else’s use”.

I was doing absolutely the same thing (a jupyter notebook in google colab so users can play with the data) but you got there first :smiley: I was also interested to see correlation between different “random variables” such as most wished feature and OCaml experience :smiley:

The abundance of tools required to install, develop, compile, and release packages is a complaint in the survey. Adding more is not going to reduce complexity. I do think the publishing process is too complex, though.

Do you know if and how one can use ocaml on colab? In principle there is support for the jupyter protocol…

I was using python :snake:

This is perhaps better in (yet another) split thread, but if you’re looking for an easy solution to spin up OCaml-kernel Jupyter notebooks – without hosting the infrastructure – then it’s very easy to do this with Binder and the right Dockerfile.

I recently used this workflow for OCaml teaching, and was very happy with it. H/t to @jonludlam for the suggestion.

Thanks Patrick for the fantastic distillation of results. It’s extremely useful to see our user responses segmented by their experience. (In particular, our self-identified expert userbase runs Coq a lot more than our self-identified newcomer userbase, which tends to use JavaScript – we want to make sure we continue to help all of these segments!)

At a high level, this survey has already influenced the Platform tool developers. Documentation has been identified as a priority item for next year, and so a couple of next steps are happening this week already among the various groups:

  • The regular odoc developer meeting this week will also feature the opam team as part of our community dev meetings, and we are getting together to finalise the plan for a docs.ocaml.org site. The results of this will be on the dev wiki as usual, so anyone interested can track progress and suggest ideas (the video meetings are not exactly closed, but fully open participation isn’t practical given the constraints of current technology – please ping the odoc maintainer @jonludlam directly if interested in attending).

  • The second part of a docs site is to ensure we have really reliable and solid workflows for opam-repository contributions (including bulk builds, health checks and having a more automated contribution process, ideally without having to run a CLI tool). The next opam dev meeting later this week will feature us planning a switch to a cluster-based next-generation CI for opam-repository. We’ve also invited the maintainers of the Coq opam repository as well as Tezos and Jane Street (who contribute large package sets regularly) so we can ensure we work well with those ecosystems too. Our intention here is to really reduce the burden of contributing to opam-repository by mechanising as much of the grunt work as possible, thereby helping both beginners and expert users. Our new CI will also feature macOS and Windows testing as we bring those cluster workers online, and be much more easily extensible to custom workflows.

  • Having all the fancy package cluster builds in the world doesn’t help if no one is actually writing any documentation in their libraries. We’re hoping that new tools (such as mdx from Real World OCaml) will reduce the friction of entry to writing ocamldoc tutorials and sites. The mdx tool is easy to use but its implementation is quite complex (due to the interlock with the internal compiler-libs), so there is an mdx team working away on it, including hopefully speeding it up with native code compilation and continuing to improve the integration with dune and other build tools. @yminsky and I use mdx to write the whole of Real World OCaml (v2 of which is coming out soon in print), and we are most eager for other people to fork our tools and write their own books (like the Owl scientific computing book).

  • Finally, @ashish @gemmag and I have been putting our heads together to get funding sorted to reboot the ocaml.org site and make it easier to maintain, in recognition of the fact that the “old guard” (Christophe, Ashish, Phillippe, myself) just don’t have the day-to-day time anymore to keep things up. @patricoferris, @kanishka @sanette Bella and @JohnWhitington have all been contributing content to ocaml.org, and we are doing both incremental changes and also overhauling the internals of how it is built to use the latest and greatest innovations. I’m excited to see what all these new contributors will come up with.

This is of course not a closed list of action items – simply what I am tracking as the coordinator of the OCaml Platform efforts – so please keep suggestions and analysis flowing.

Thanks to everyone for the input and comments so far. If anyone has a burning desire to be in any of the dev meetings, please get in touch with me directly (anil@recoil.org) or with the maintainers of the individual tool. I’ve never seen this much activity happening in all my time working on OCaml, so it warms my heart on this crisp winter’s day to see all the constructive positivity and effort going on. Keep it up and keep the suggestions coming :slight_smile:

Thanks for the update.

For mdx, I think this RFC if implemented would be very helpful as it would help with both documentation and testing. Sometimes an API example really helps sharpen the understanding.

I was surprised to see that relatively few people seem to have talked about OCaml teaching. I would argue that students account for a lot of newcomers, but they don’t answer polls :slight_smile: As far as we are concerned, the modern OCaml experience is much better for us as researchers but is more difficult when we wear our teaching hat. Our students are not complete beginners (they already know several languages, and an interpreter/notebook-based approach would not fit the size of programs they have to write), but most of them are still not mature developers. So we’re a bit stuck between an interpreter-based approach that doesn’t scale well and is not the current standard way, and OTOH an opam/dune-based approach that’s still a bit rough around the edges. Installing an all-in-one OCaml teaching platform still requires many steps students don’t understand, and we can’t ensure stability from one year to the next. Also, the source-based approach is a problem for some students with old hardware because installation is looong. My 2 cents.

I freshened up the links for courses on ocaml.org into its own table here (https://gitlab.com/kanishka-azimi/ocaml-courses). It might help a little to be able to see what the larger classes are using each year for install instructions, while waiting for improvements upstream in the core tools.

Talking about teaching.

I learnt OCaml at school 3 years ago. I remember two kinds of setup: practical sessions writing all the code in one file only and using the interpreter, or projects with a Makefile already given by the teacher, and mainly using pre-dune tools like ocamlbuild or OCaml Makefile in a black box way, which did not encourage students to try to understand how an OCaml project was supposed to be managed. I remember having to learn on my own about ocamlbuild, to be able to manage my own side projects. So in a way, I totally understand what you feel, I can only like your message.

But I also feel like dune makes project management a lot simpler. I mean, in OCaml courses we usually build small compilers or that sort of stuff, which needs handling of ocamllex / menhir files and not much more. I guess we can still provide a rather simple dune file, but the difference is that now we can explain it quite clearly to the students, instead of handing them obscure build files. We can even use things such as utop for live-testing.
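
As a rough illustration of how small such a build description can be, here is a sketch assuming hypothetical files `lexer.mll`, `parser.mly` and `main.ml` in one directory. The file names are placeholders, not taken from any actual course material, and the version numbers are only one plausible choice.

```dune
; dune-project (at the root of the project)
(lang dune 3.0)
(using menhir 2.1)
```

```dune
; dune (next to the sources)
(ocamllex lexer)          ; generates lexer.ml from lexer.mll
(menhir
 (modules parser))        ; generates parser.ml and parser.mli from parser.mly
(executable
 (name main))             ; builds main.exe from main.ml and the generated modules
```

With these two files in place, `dune exec ./main.exe` should build and run the program, and each stanza maps onto one step that can be explained to students.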

Another issue (the main issue preventing students from easily adopting OCaml in my opinion) was that we had to learn emacs (with tuareg mode) along with OCaml. The difficulty was not only to learn a new language or even a new paradigm, but also learning a less beginner-friendly development environment, and not being able to use what we were used to. This problem has been solved too thanks to the development of environments such as VSCode. (edit: I’m not anti-emacs, I recently started learning vim, but learning a text editor during a course was not the right time)

So, while I understand your concerns and do not want to question anything about what you said, I think the situation has improved a lot (even in the last 3 years). I believe we are close to having something practical. I’m only a new PhD student so I have the newcomer’s point of view, but I just wanted to bring some optimism to the topic.

I’ve been working on some side projects in Haskell recently (well, today I’m that guy bringing Haskell into OCaml conversations). We already have merlin and great IDE extensions, but I think we just lack a stack-like tool so that we can set up a project, compile the sources, and publish the executables quickly. An OCaml version of Hoogle (which I find great) could be nice too.

Would a Docker image set up for the course not take away most of these problems? And skeleton projects that can be cloned from a git repository?

Opam and Dune are great for development, and I convinced my colleagues to use them for teaching, but I think they sometimes cursed me :wink: I greatly appreciate the work done on the OCaml platform and thank all those who contributed to it. But from a pure teaching point of view, I think the experience has sometimes been difficult in our school these past years, compared to Makefile-based lab sessions. This holds even for some colleagues of mine who are excellent teachers but don’t necessarily keep up with the pace of OCaml changes; they do know how to debug a failing ocamlc compilation called from a Makefile. I don’t think this is the place to list difficulties, so (in the spirit of this poll) I’d rather make the wish that we can get a teaching-friendly variant of the OCaml platform: stable for 3 school years (for instance), easy to scaffold new projects, well isolated from the rest of the filesystem (dune workspace, I’m looking at you), promoting testing (we try to teach our students to write contracts and do heavy testing)… For a very long time, I think universities have played a good role in spreading the OCaml word and I fear that, as the focus shifts to developer-friendliness (which is good), we forget teaching-friendliness.

Our students work on university machines as well as their own (in particular in these times of lockdown), running any of Linux/Mac/Windows. A lot of them have low-grade machines. I think it’s still easier to tell them something like:

  1. install opam in one command (found on the opam website)
  2. issue only one opam special command to install a stable teaching environment (hopefully avoiding local compilation)
  3. install the OCaml plugin for your favorite editor among VS Code/Emacs/Vim

For scaffolding, I think dune init is actually a first step, but we would once again need a special configuration as the dev and release dune modes need some tweaking (which we currently do in our skeletons) to fit our requirements (e.g. we want non-fatal warnings in dev mode, but also allow running inline tests: we spent a long time figuring out why our inline tests were not failing, until we realized we were in release mode to avoid fatal warnings but then inline tests were inhibited).
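
For reference, the non-fatal-warnings half of that tweak can be sketched with an `env` stanza that stays in the default dev profile rather than switching to release. This is a generic sketch under that assumption, not the skeleton mentioned above.

```dune
; dune file at the project root (the env stanza also applies to subdirectories)
(env
 (dev
  (flags
   (:standard -warn-error -a))))   ; keep the default flags, but make no warning fatal
```

Here `-warn-error -a` tells the compiler not to treat any warning as an error, while the warnings themselves are still printed, so the build can remain in dev mode.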
