[ANN] OCaml User Survey 2020

This might be a good addition, but note that that list contains projects that are open to help from new contributors. They are not necessarily the most important projects for the community. For example, multicore is probably high on most people’s list of projects that are important, but it’s being implemented by a small group of experts, and probably would not benefit from “too many cooks in the kitchen”, so it wouldn’t be in a list of projects looking for help.

I would like to thank everyone for your feedback, in this thread and in the responses. We will incorporate it for next year’s survey run.

I wonder if some people out there would volunteer to help with processing the survey results, in preparation for a post about them. (Right now there are about 650 responses, which I think is already enough to draw conclusions; I am planning to resend it for another round in a couple of weeks.) We are planning to publish the full results in any case, so a volunteer would not have access to more information (anyone will be free to look at the results and post their own analysis/deductions/conclusions), but they could help by providing the initial analysis, writing the analysis post, and (tedious but useful) clustering free-form answers into pertinent new options. (Otherwise, most likely, I will do this work myself. I’m not looking for crowd-sourcing; I think one or two people work best.)

@cjr do you think that you would have some methodological advice on how to analyze the survey results? One thing I don’t know about, for example, is what the good practices are for analyzing “feelings” questions with answers of the form “I (strongly) {agree, disagree}” or “Neutral”: in the analysis, how should we handle the “neutral” responses?

Regarding the Likert-type scales, what I frequently do is collapse response ranges together and drop the middle/neutral responses. Collapsing response ranges is useful because one person’s “strongly agree” might be another’s “agree.” How many categories to keep is a judgment call, but with a 1-5 range (which I think is what the survey used), I’ll often simply collapse all positive responses together and all negative responses together.

I usually drop the neutral category entirely because it’s rarely clear what it means. Does it mean that the respondent is truly ambivalent? Or that they don’t know or that they don’t care? Or that the question is irrelevant to them? Something to look for: if you’re getting a higher-than-average number of “neutral” responses for a particular question, it might be a sign that the question is bad in some way. (Incidentally, I rarely include middle categories with my Likert-type scales precisely because I don’t know what they mean. I force a positive or negative response and then provide options for “Don’t know,” “Inapplicable,” etc.)
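
For anyone who wants to try this on the raw data, here is a minimal sketch of the collapsing approach described above. It is only an illustration: the answer labels, the function name `collapse_likert`, and the sample answers are all assumptions of mine, not taken from the actual survey export, so adjust them to whatever the real data contains. It merges the two positive and the two negative answers, drops “Neutral” from the tally, and reports the share of neutral answers separately so that questions with an unusually high share can be spotted.

```python
from collections import Counter

# Hypothetical label set; the survey's exact wording may differ.
COLLAPSE = {
    "Strongly agree": "agree",
    "Agree": "agree",
    "Neutral": None,  # dropped from the collapsed tally
    "Disagree": "disagree",
    "Strongly disagree": "disagree",
}


def collapse_likert(responses):
    """Return (collapsed counts, share of neutral answers) for one question."""
    # Ignore anything that is not one of the five expected labels.
    known = [r for r in responses if r in COLLAPSE]
    collapsed = Counter(COLLAPSE[r] for r in known if COLLAPSE[r] is not None)
    neutral = sum(1 for r in known if COLLAPSE[r] is None)
    neutral_share = neutral / len(known) if known else 0.0
    return collapsed, neutral_share


# Made-up answers to a single question, for illustration only.
answers = ["Strongly agree", "Agree", "Neutral", "Disagree", "Agree", "Neutral"]
counts, neutral_share = collapse_likert(answers)
print(dict(counts), round(neutral_share, 2))
```

Running the same function over every question and comparing the neutral shares would be one way to look for the “higher-than-average” pattern mentioned above.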

I wouldn’t mind helping you out :slightly_smiling_face:

Re “publishing all results”: does that mean you plan to make all responses openly accessible? Should some privacy-preserving techniques be applied first, particularly for free-form questions?

For next year, force people to also vote amongst “housekeeping” buckets, where one has to choose amongst various less interesting tasks that need to be done but aren’t as exciting as new language features or new libraries.

Thanks to whoever wrote the questions. This is really well put together.

One other minor comment. Someone mentioned Open Collective as a way for individual community members to vote on community projects, in a related OCaml community thread. Maybe this survey can eventually feed into an OCaml Open Collective project with modest funding goals, as a continuous way to get feedback on which projects people want to see realized sooner. The projects that reach their funding goal can trigger getting corporate matching funds.

The OCaml survey 2020 is now closed. It attracted 745 replies. Everyone should be able to see the summary of results and to download the raw data.

A few others and I will try to write a summary of the replies, especially of the many free-form replies. Everyone is welcome to help sift through the results! Short comments can be posted in this discussion thread, and longer analyses can be put somewhere else and mentioned here.


Thanks a lot for this!
Some of the bar graphs look a bit weird (for example on “Which types of software do you develop with OCaml?”), with some of the labels on the left not being there (though one can still hover the bar itself to get it). I wonder if there’s something that could be done about it?

Thank you for this insight!

If I may make a suggestion: to me, histograms illustrate results more clearly than pie charts, especially with arbitrary colors like that. If colors were a gradient from red to green for those [disagree..agree] range questions, it’d already be easier to interpret at a glance.


Right from Vg’s test image database here is a demonstration that pie charts are a bad visualisation device (the eye is not good at comparing angles).


Keep in mind that some people have no difficulty sorting pie slices by size. To us, the pie chart is more readable than the equivalent bar chart.

Not to go off-topic, but how could a bar chart be less readable than a pie chart? The heights are side by side, directly comparable without any processing, and directly connected to their labels instead of having to cross-reference a legend of colors.

I don’t know. I have a better perception of the size of blobs than of the length of sticks. Perception is a complex topic.

And yes, we’re clearly off-topic :grimacing:

Pie charts are fine when the number of response categories is small and there’s good disparity in response rates; many (but not all) of these meet those criteria. Pie charts also don’t impose an ordering on the response categories, which can be useful.

It sounds like @mjambon has better spatial perception than most; in general, people find it very difficult to compare the sizes of pie slices. Another problem with pie charts is when they rely upon color coding (as these do). This can make them difficult or impossible to decode for people with color blindness such as myself.

I don’t think that I know of any studies finding that pie charts are easier/faster to decode than bar charts/line graphs/histograms/etc; there are studies showing the opposite. (I published a paper on set-theoretic visualization techniques last year: https://journals.sagepub.com/doi/full/10.1177/2059799119862110)


I never said pie charts should be preferred over bar charts. They clearly should not. My only intent was to point out the ineffectiveness of arguments like “this looks good to me and you’re wrong if it looks bad to you because just look at it”.

I extracted the replies to the free-form questions and made them available as a Gist. There are so many replies that I have no idea how to exploit and summarize them!


7 posts were split to a new topic: Suggestions from the OCaml Survey result

Note: I “split” the excellent discussion by @patricoferris into a separate topic, as it was going in the (very useful) direction of discussing the ecosystem broadly, rather than the survey results specifically. I would encourage people to post here for specific details on the survey process and results, and create new topics for discussions inspired by the survey.

Let me quote below the part of @patricoferris’ post that would be most useful to anyone interested in processing the results:

I think it’s quite important to take the results from this survey and find actionable items (or highlight ongoing efforts) which aim to improve different problems for different users. As a result I have put together some graphs trying to categorise the answers based on proficiency (beginner, intermediate, advanced and expert). These groupings are quite subjective but I thought it might offer a slightly more nuanced look into what problems users had, where they come from, what they are doing and what commonalities they share. The code is here and the HTML version of the notebook here (I’m neither a data scientist nor a Python developer :snake: if you see glaring mistakes do raise an issue!).

The new topic has excellent discussion on Patrick’s findings from the survey data.


Here is a summary and analysis of the survey results I wrote on behalf of the OCaml Software Foundation: https://www.dropbox.com/s/omba1d8vhljnrcn/OCaml-user-survey-2020.pdf?dl=0