New Tutorials on Basics of OCaml

Dear OCamlers

The OCaml.org team is happy to announce the publication of two new tutorials:

  1. Values and Functions
  2. Basic Data Types and Pattern Matching

These pages teach the basics of OCaml, starting from what a value is and continuing to basic types. Before them, the documentation on OCaml.org assumed a lot more understanding, which made it hard for beginners to learn OCaml using OCaml.org. There are more gaps to fill, but alongside the recently published “Get Started” docs (Installing OCaml, A Tour of OCaml and Your First OCaml Program), this will allow people to use OCaml.org to start learning the language.

Unlike “Get Started” documents, these two have a narrower focus:

  • The former is new. It covers values, definitions, functions, environments, scopes, closures, and shadowing.
  • The latter is mostly new. It reuses some of the “Data Types And Matching” material, which it also replaces. It covers predefined types, variants, records, and pattern matching.

They form a series. “Values and Functions” goes first, and “Basic Data Types And Pattern Matching” goes after. The only prerequisite is the completion of the “Get Started” series.

We received and included feedback on them while they were pull requests. However, as with any fresh release, there’s room for fixes and improvements. Don’t hesitate to share your opinion, comments or suggestions. Since this is beginner-oriented material, we’d very much appreciate it if you could find people brave enough to learn OCaml using them and report on how it went.

Hope it helps

EDIT - What’s the target audience?

This targets developers who already know some programming, not people who are just discovering it. Functional programming knowledge is not a prerequisite but rather a by-product. Self-learners and outside-university learners are in the target audience.

25 Likes

What kind of beginner did you have in mind? Any “normal” programmer will leave the site after reading this first paragraph (maybe uttering something like “at least no mention of monads and/or endofunctors”):

This tutorial teaches the skills needed to handle expressions, values, and names. You’ll learn the ability to write expressions, name values or leave them anonymous, appropriately scope names, handle multiple definitions of the same name, create and use closures, and produce or avoid side effects.

If they are brave enough and kept reading on, they’d leave after reading the tree’s definition.

I’m sorry if that sounds harsh, but if this is really aimed at your average programmer, it should be removed in total and “Basic Data Types And Pattern Matching” made the first chapter, that is actually a good start.

3 Likes

What kind of beginner did you have in mind?

The target audience is described in the Tour of OCaml tutorial, which is a prerequisite of Values and Functions. We mostly assume knowledge of another programming language.

Any “normal” programmer will leave the site after reading this first paragraph (maybe uttering something like “at least no mention of monads and/or endofunctors”):

This tutorial teaches the skills needed to handle expressions, values, and names. You’ll learn the ability to write expressions, name values or leave them anonymous, appropriately scope names, handle multiple definitions of the same name, create and use closures, and produce or avoid side effects.

Can you elaborate a bit on this? Is the list too long? Which terms are dreadful? I fail to understand why this would deter some from reading further. But stepping back into beginners’ shoes is hard.

If they are brave enough and kept reading on, they’d leave after reading the tree’s definition.
I’m sorry if that sounds harsh, but if this is really aimed at your average programmer, it should be removed in total

Here, the learning goal is “pattern matching between let and = is available for user-defined variants”. Do you have a better example in mind?
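To make that learning goal concrete, here is a minimal sketch of a pattern placed between let and =. It reuses the tree type quoted later in this thread; the value t and the names label and children are made up for illustration.

```ocaml
(* The tree type from the tutorial, as quoted in this thread. *)
type 'a tree = Node of 'a * 'a tree list

(* A hypothetical value: a root labelled 1 with two leaf children. *)
let t = Node (1, [ Node (2, []); Node (3, []) ])

(* A pattern sits between [let] and [=]. It binds [label] and [children]
   in one step. The match is exhaustive because [tree] has a single
   constructor. *)
let (Node (label, children)) = t

let () =
  Printf.printf "label = %d, children = %d\n" label (List.length children)
```

A simpler, non-recursive variant (for example a single-constructor wrapper around a pair) would show the same feature with fewer new concepts.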

and “Basic Data Types And Pattern Matching” made the first chapter, that is actually a good start.

It would be interesting to hear about attempts at starting with Basic Data Types And Pattern Matching. I think it is possible.

1 Like

Let me start by saying that I know that it is really hard to write documentation for less experienced persons, and that I myself am bad at that (even after writing documentation for 20 years and actually teaching in schools for 3 years) - as you see, my sentences tend to meander on forever, without reaching a real conclusion. So please don’t take that as an , I do know, that the tutorials I would write wouldn’t be any better, they would be worse in other ways :wink: and I seriously appreciate your time spent writing them. That’s why I think they should be as good as possible, so that your time hasn’t been wasted.

The most important thing to always remember: it doesn’t matter what you write, most people will never (no, really, never, ever) read that but just copy everything that is in code boxes. So it is of the utmost importance that they are as self-explanatory as possible, at least in the order they are presented (nobody can expect to skip 10 examples and understand the 11th one).

First off, this reads like an academic paper, which is the cliche of functional programming - not being usable “in the real world”. If it were Haskell, you could play with that cliche by mentioning monads and endofunctors, but this is OCaml, so either no “fancy words” or in a humorous way to differentiate the language. When trying to write for “non functional programmers” avoid sounding like an academic paper or a C or C++ standard at all costs. This does not mean avoiding the proper terms, but always define them with a concrete example and never ever more than one at a time. As soon as an example needs two “new words”, it is the wrong example at this time (see later).

As it is now, the whole page is a tutorial for functional programmers, not non-functional programmers.

But now to the actual phrase: what you have written, is an elaborate paragraph telling me I’m going to learn “the basics”. Which is nice, but expected, it is the first page of a tutorial for beginners, after all. And for people who don’t understand the used terms - closure, that is something about programming in a hammock, right? - they are not that “interesting sounding” that they would be motivated. How many people have ever approached you, demanding that you teach them “the ability to write expressions, name values or leave them anonymous, appropriately scope names, handle multiple definitions of the same name, create and use closures, and produce or avoid side effects”? If you declare a goal, either make that sound (at least) somewhat interesting and sexy, or use a “known” term (the number of which hopefully grows throughout the tutorial), or don’t write that at all - again, the goal of a tutorial is obviously to teach things.

Then the next paragraph (brace yourself!).

In OCaml, functions are values.

Don’t get me wrong, this is a great and important thing to say, and most people will understand that, if the next phrase is something like “So you can use functions as arguments to functions and return them from functions, for example”. But

In comparison to other mainstream languages, this creates a richer picture between expressions, values, and names. The approach in this tutorial is to acquire the related capabilities and understanding by interacting with OCaml in UTop. This hands-on experience is intended to build understanding by experimentation rather than starting with the definition of the concepts.

I don’t know how to write about this without needlessly offending you, but this is “some slimy kind of insurance salesperson marketing speech” kind of cringeworthy. Don’t do that, except if trying to be funny. This is something Oracle, SAP or IBM (actually, it’s more like Accenture, McKinsey and their ilk) would have written and we don’t want to program in COBOL, do we?

First: don’t do that on the first page. You are way too fast.

The problem is that you are showing a declaration which uses 6 new concepts in the first line, of which 4 are special and/or ugly OCaml/ML syntax. And the main problem to learning functional programming is understanding recursion and recursive data structures, the 6th.

type 'a tree = Node of 'a * 'a tree list;;
  • type, which declares a type. So far, so normal and understandable
    But now:
  • 'a for a type variable. The concept of a type variable already confuses people, having that strange ' makes it worse and the backward syntax doesn’t help either. And then there is Rust which uses that for lifetimes.
  • the of: is that a keyword, another strange type of metavariable, a special type?
  • the second worst of all of ML syntax, using * in the type of a tuple (whoever came up with that: you don’t use + for sum types, do you?)
  • the worst of the syntax, the “backwardness” of type definitions 'a tree instead of tree 'a and tree list instead of list tree
    And last: this is a recursive data structure, a tree. Most programmers have never used a tree in their whole life, much less the concise, recursive form. They have more practical knowledge of monads than of trees or recursive data structures or recursion in general.

Then afterwards the mentioning of alpha. By mentioning only the correct “pronunciation” of 'a you make it seem as if the only part of the whole expression worth talking about is the fact that 'a is actually an alpha. By that, you make it (subconsciously or consciously) clear to the reader that you expect them to understand the whole gibberish in the box and they give up.
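One possible way to follow the “one new concept at a time” advice is to build up to the tree definition in steps. The sketch below is only an illustration: the types box and labelled, and the helpers leaf, example and size, are hypothetical and do not come from the tutorial.

```ocaml
(* Step 1: a type variable. ['a] stands for "any element type", and the
   parameter is written before the type name. *)
type 'a box = Box of 'a

(* Step 2: [of] attaches data to a constructor, and [*] builds a tuple type. *)
type 'a labelled = Labelled of string * 'a

(* Step 3: the recursive definition quoted above. A node holds a value
   and a list of subtrees of the same type. *)
type 'a tree = Node of 'a * 'a tree list

let _boxed = Box 42
let _pair = Labelled ("answer", 42)

(* A leaf is a node without children. *)
let leaf x = Node (x, [])
let example = Node ("root", [ leaf "left"; leaf "right" ])

(* Count nodes: one for this node plus the sizes of all children. *)
let rec size (Node (_, children)) =
  List.fold_left (fun acc child -> acc + size child) 1 children

let () = Printf.printf "size = %d\n" (size example)
```

Each step introduces a single piece of syntax, so the recursive type arrives with only recursion itself left to explain.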

12 Likes

Thanks Roland,

I’ve tried to turn your feedback into GH issues:

Please let us know if you feel these issues aren’t faithfully capturing your feedback. Also, tell us if you think other shortcomings or gaps can be turned into issues.

P.S.

In comparison to other mainstream languages, this creates a richer picture between expressions, values, and names.

I agree this sounds like corporate fluff. Definitely not a great sentence. I’d love to replace it with anything better ASAP.

4 Likes

Of course I’m going to, thank you very much. I’ll just take some time after work, as I’ve just realised that my post above is missing some words because of my work interfering with the formulation of the answer ;).

1 Like

I would highly recommend looking at the Tour of Go and trying to absorb how they are introducing the language. It’s very matter-of-fact, no fluff, divided into bite-sized pieces, and each page deals with a single concept. The reader feels like they are making real progress at each step.

This is not an easy feat, but by looking at how they did it, I believe we already have a head start. They even have a page on type inference: A Tour of Go

5 Likes

The question is what the goal of the tutorial is. The goal of “Tour of Go” is to show the language itself and its usage, but not a tutorial on how to actually program in Go. Whereas the goal of this tutorial is to also be an introduction of functional programming. A “Tour of OCaml” would have functional programmers as intended audience.

Learn You a Haskell for Great Good! is maybe the best known example of a tutorial that teaches “functional programming”.

For OCaml there is now OCaml Programming: Correct + Efficient + Beautiful

4 Likes

I wouldn’t be too sure about that. It depends entirely on the quality of the writing. If the prose is clear and to the point, I’ll keep reading. Furthermore, I think it’s very common to skip around - I might start with the code boxes, but later skip back to the explanatory prose.

4 Likes

I think “Values and Functions” needs to be rethunk. The title is very misleading - it should not even mention functions, imho. The real topic of the article is the relationships between names, expressions and their denotations (“values”) within scopes (environments).

Section “What is a value?” does not answer its own question. It should be renamed “What is an expression?”. But it does not say what an expression is. Personally I think a nod to the philosophical heritage of programming languages can add some clarity. Russell famously made a distinction between names and “definite descriptions”. For example, “George Washington” and “The first president of the US” both refer to the same historical individual (“value”). The former is a name, the latter a definite description (marked by “the”). We see the exact same thing in programming languages: “5” and “2 + 3” refer to the same thing. “5” is a name, “2 + 3” is a definite description (“the sum of 2 and 3”). We just use the term “expression” instead of “definite description”. A lambda expression is a definite description of a function. The takeaway is that names and expressions are just different ways of referring to values.

In “Global Definitions” you have “Every expression can be named”. No, every value can be named. # let the_answer = 2 * 3 * 7;; does not name the expression 2 * 3 * 7, it names 42. It says the name and the expression (i.e. the definite description “the product of 2, 3, and 7”) refer to the same value.
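A small sketch of that distinction, assuming nothing beyond the standard library. The names also_the_answer and noisy are invented for the example; noisy has a visible side effect to show that the expression is evaluated once and only its value is kept.

```ocaml
(* The definition from the post: the right-hand side is evaluated and the
   name is bound to the resulting value, 42. *)
let the_answer = 2 * 3 * 7

(* A different expression that denotes the same value. *)
let also_the_answer = 6 * 7

(* The message is printed once, when the definition is evaluated.
   Afterwards [noisy] is just the value 2. *)
let noisy = (print_endline "evaluating"; 1 + 1)

let () =
  (* Both names refer to the same value, although the expressions differ. *)
  assert (the_answer = also_the_answer);
  (* Using [noisy] twice does not print "evaluating" again. *)
  Printf.printf "%d %d %d\n" the_answer noisy noisy
```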

At the beginning of the article we have “…multiple definitions of the same name…”. Not possible. Names are always scoped so strictly speaking names like d in let d = ... are name parts. If environments were explicitly named then all names would be unique. So you cannot define the same name multiple times, but you can use the same name part (local name) in multiple environments.
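What the tutorial calls “multiple definitions of the same name” can be sketched as follows. The helper get_d is hypothetical; it is there only to show that the earlier binding is hidden rather than overwritten.

```ocaml
(* First definition of [d]. *)
let d = 1

(* [get_d] closes over the first [d]. *)
let get_d () = d

(* A new binding that reuses the same name. It shadows the first one and
   may even have a different type. *)
let d = "one"

let () =
  (* The first binding still exists: [get_d] keeps returning 1. *)
  Printf.printf "get_d () = %d, d = %s\n" (get_d ()) d
```

Whether one describes this as two names, or as one name part used in two scopes, the observable behaviour is the same: the two bindings coexist.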

In “Local definitions” we have “Local definitions are like global definitions, except the name is only bound inside an expression. They are introduced by the let … = … in … expression.” First point: let ... is not an expression. Maybe “declaration” or just “definition”. Point 2: names are bound in environments, not expressions. I’d say something like: “except the scope of the definition is limited to the local environment established by the let … in … declaration.”

Suggest: declare explicitly that let ... in ... introduces a new environment.

I think it would also be useful to introduce the concept of lexical scoping. It’s simple and concise.
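As a rough illustration of local scope and lexical scoping, here is a short sketch. All names (x, inner, add_x, result) are made up for the example.

```ocaml
let x = 10

(* The inner [x] is visible only in the body after [in]. *)
let inner = let x = 2 in x * x

(* Lexical scoping: [add_x] refers to the [x] that is visible where the
   function is defined, which is the top-level [x] bound to 10. *)
let add_x n = n + x

let result =
  (* A different, local [x]. [add_x] does not see it. *)
  let x = 1000 in
  add_x 1 + x

let () = Printf.printf "inner = %d, result = %d, x = %d\n" inner result x
```

The call add_x 1 yields 11 even though a local x equal to 1000 is in scope at the call site, which is the essence of lexical scoping.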

HTH,

Gregg

3 Likes

er, the nature of lets should be made more explicit.

1 Like

Thanks, Gregg, this is great feedback.

I will answer here on what appears to me as general comments. I’ve opened an issue with the rest, which I believe will be easier answered in a pull request.

I think “Values and Functions” needs to be rethunk. The title is very misleading - it should not even mention functions, imho. The real topic of the article is the relationships between names, expressions and their denotations (“values”) within scopes (environments).

In essence, you are perfectly correct. However, I think it is useful to use the word function. I personally believe the document could be titled “The Environment”. But it may be a bit opaque to newcomers (I think this is implied in @Release-Candidate’s comment about the introduction). We’re really open to suggestions here. A good title has to be meaningful and appealing. If there’s something better than the existing one, let’s change it.

Section “What is a value?” does not answer its own question. It should be renamed “What is an expression?”. But it does not say what an expression is. Personally I think a nod to the philosophical heritage of programming languages can add some clarity. Russell famously made a distinction between names and “definite descriptions”. For example, “George Washington” and “The first president of the US” both refer to the same historical individual (“value”). The former is a name, the latter a definite description (marked by “the”). We see the exact same thing in programming languages: “5” and “2 + 3” refer to the same thing. “5” is a name, “2 + 3” is a definite description (“the sum of 2 and 3”). We just use the term “expression” instead of “definite description”. A lambda expression is a definite description of a function. The takeaway is that names and expressions are just different ways of referring to values.

I love putting things into context, either philosophical, historical or even wider. Unfortunately, it seems to give the creeps to some readers. This is why we’ve tried to put examples first, definitions second, and relationships with context as minimized notes or remarks (ergo, last).

I believe there is a risk of being too abstract in the case of names, expressions and values. Maybe an analogy could be drawn between defining a function in a non-dynamic programming language (i.e. where name rebinding is not allowed) and value naming in OCaml; both create an immutable relationship. From that, as a remark, we could say that the denotation of “definition” (pun intended) in OCaml is close to its philosophic or linguistic counterpart and link to Definition - Wikipedia (which contains a link to RTD).

Now the question again is about the goal and the intended audience of this tutorial and their goals.

It is perfectly fine to avoid certain terms or use slightly incorrect or even wrong terminology to avoid overcomplicating the task at hand. You all have experienced that when learning mathematics in school and at university, even if you have been in a “real” maths degree. There is no need to talk about Peano axioms and (abelian) groups when teaching children how to add 2 to 8. You can formulate for example the “successor axiom” in a way the children understand after they can add natural numbers when e.g. talking about “the smallest and biggest natural number”, same as commutativity. But you hopefully :wink: don’t think that reading the actual (well, not the actual, but the modern “version”) definition is a valid way of doing that.

Or imagine saying that a cube is a cylinder, which is mathematically (geometrically) correct, but doesn’t really help to better understand what a cube actually “is”.

So in short: “Scope”: OK. “Environment”: not without an explicit definition. Practically, it helps to know about environments when you want to implement a functional language, but for using one the concept of “lexical scope” is way more important and warrants an elaborate explanation.

What I’m wary of is an introductory tutorial with a lot of jargon. Here’s an example from that page:

That means programs are expressions. Actually, almost everything is an expression.

This is maybe true in some sense, but beginners will actually come across many things that are not expressions, and then be puzzled by this discrepancy, because at least in the beginning they will not be defining complex functions, so most of their code will be made up of statements.
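To show where the line falls in practice, here is a hedged sketch contrasting constructs that are expressions with top-level definitions, which are not. The names parity, three, colour and colour_name are invented for the example.

```ocaml
(* [if ... then ... else ...] is an expression: it produces a value and can
   be used directly as a function's result or argument. *)
let parity n = if n mod 2 = 0 then "even" else "odd"

(* A sequence [e1; e2] is also an expression. Its value is that of [e2]. *)
let three = (print_endline "side effect first"; 3)

(* Top-level [type] and [let] are definitions, not expressions: they cannot
   be passed as arguments or returned from functions. *)
type colour = Red | Green

(* [match] is an expression too. *)
let colour_name c = match c with Red -> "red" | Green -> "green"

let () =
  Printf.printf "%s %d %s %s\n"
    (parity 7) three (colour_name Red) (colour_name Green)
```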

A “Tour of OCaml” would have functional programmers as intended audience.

I disagree, the intended audience should be any programmer who is interested in using OCaml for some practical application. There are many guides, tutorials, and books for more academic or theoretical foundations already.

2 Likes

Yes, these issues describe my intentions well.

Sorry to not respond earlier, but I did some “field study” :wink: by watching the OCaml videos of “The Primeagen/The Vimeagen”. He (who is a senior engineer at Netflix, btw. :wink:) recorded his learning of OCaml.
The Vimeagen - Playlist.
The oldest ones, where he basically just reads some introductions to OCaml, almost no coding:
Learning OCaml, Pt 1
Learning OCaml, Pt 2
From now on he’s solving problems in OCaml:
Advent of Code, Day 1
More Advent of Code
99 OCaml Problems
Dream
HTMX with Dream

Beginning at about here: Running the wrong executable there is an interesting situation, he is running another executable than would be built from the source he is editing. Took him quite some time to realise that. That’s actually something that is true for Dune (and some other build systems), running the built executable is way harder and less straightforward than it should be. Actually most build systems (for a single language) are optimised for, let’s say, smallish projects, but not for “just” generating an executable out of a single source file, which is what most beginners start with. So a better documentation on how to generate and run just a single file using Dune would be of benefit to beginners.

But now, the difficulties I have noticed that need more detailed explanation:

  • ;; vs ; sometimes he just randomly inserts ;; in his source code.
  • the need for a unit return value in the “main” function of a file. So a mention of ignore and what it does.
  • reading error messages
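The first two points can be sketched in a few lines, assuming a plain source file compiled as an executable. The function log_and_count is hypothetical.

```ocaml
(* A hypothetical function that returns a value the caller may not need. *)
let log_and_count msg =
  (* A single [;] sequences two expressions. *)
  print_endline msg;
  String.length msg

(* The usual "main" idiom: the pattern [()] requires the body to have
   type unit. *)
let () =
  (* [ignore] discards the int result, so this step has type unit. *)
  ignore (log_and_count "starting");
  print_endline "done"

(* No [;;] appears anywhere: in a source file every top-level phrase here
   starts with [let], so the toplevel terminator is not needed. *)
```

Without ignore, the compiler reports that the expression before the ; has type int where unit was expected (a warning by default, typically an error in Dune's development profile).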

And then the “big” two problems:

  • functors, mainly Map: I guess a “slow” step-by-step introduction of how to use e.g. a map is needed, especially mentioning that the type of the values is “inferred”. His struggle with functors and a module’s documentation can be seen in the second “Advent of Code” video, starting here
  • and the biggest one, also the biggest hurdle for him in using “advanced” documentation and error messages: function parameters and how to read them in documentation and error messages. This is the single biggest problem he faces in the later stages of his (recorded) learning. So I would say a chapter on how to read function (and later module) signatures and the application of the knowledge in reading “advanced” documentation and error messages would be highly valuable.
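For the Map point, a step-by-step introduction could start from something like the sketch below. It uses the standard library's Map.Make functor; the module name StringMap and the sample data are assumptions made for the example.

```ocaml
(* Applying the functor fixes the KEY type (string). The value type is
   left open. *)
module StringMap = Map.Make (String)

(* No annotation is needed: because the values are ints, the type is
   inferred as [int StringMap.t]. *)
let ages =
  StringMap.empty
  |> StringMap.add "alice" 30
  |> StringMap.add "bob" 25

let () =
  (* [find_opt] returns an option instead of raising [Not_found]. *)
  match StringMap.find_opt "alice" ages with
  | Some age -> Printf.printf "alice is %d\n" age
  | None -> print_endline "alice not found"
```

Maps are immutable: each StringMap.add returns a new map and leaves StringMap.empty unchanged.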

As he is already used to Rust, pattern matching and variants aren’t new to him. So there is no information about how successful the OCaml documentation would be at teaching these, and he also obviously has quite some experience in using recursion; it’s not totally new to him.

I would never have thought how interesting it is to watch people trying to write some more or less “easy” code: I’ve just split a string in half.

7 Likes

Hi Roland. This is excellent feedback, thanks a lot.

We watched a couple of the same videos this summer when we started working on the tutorials. What he’s done is incredibly useful. You’ve watched more than us. We’ve only watched the first two.

So a better documentation on how to generate and run just a single file using Dune would be of benefit to beginners.

There is a text intended to explain precisely this at the end of the recently released Your First OCaml Program.

  • ;; vs ; sometimes he just randomly inserts ;; in his source code.

We have this: Introduction to the OCaml Toplevel. Also a fresh release. There were a couple of sentences explaining that ;; is part of the OCaml syntax but is a no-op. It seems we’ve lost them at some point. Hopefully, a git revert may be sufficient to resurrect them. We could add more text or make it better. This document also needs to be linked more.

functors, mainly Map: I guess a “slow” step-by-step introduction of how to use e.g. a map is needed, especially mentioning that the type of the values is “inferred”.

This is high on our to-do list. Earlier this afternoon, we considered making this our next document to write.

and the biggest one, also the biggest hurdle for him in using “advanced” documentation and error messages: function parameters and how to read them in documentation and error messages.

That’s a tougher one.

We hadn’t thought of writing something on how to read function and module signatures, but it does make a lot of sense. Also, it is consistent with our overall approach: learn how to use first and define your own second. I will add this to our board.
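As a rough idea of what such a document might cover, here is a sketch that reads one standard library signature and one made-up signature. List.map and its type come from the standard library; repeat, strings and stringify_all are hypothetical names.

```ocaml
(* The standard library documents [List.map] as:
     val map : ('a -> 'b) -> 'a list -> 'b list
   Read it left to right: each [->] separates one argument from the next,
   and the last type is the result. Parentheses mark an argument that is
   itself a function. *)

(* Passing both arguments fixes 'a = int and 'b = string. *)
let strings : string list = List.map string_of_int [ 1; 2; 3 ]

(* Passing only the first argument gives back a function. *)
let stringify_all : int list -> string list = List.map string_of_int

(* A hypothetical function whose signature is
     val repeat : ?sep:string -> times:int -> string -> string
   [?sep] is optional, [~times] is labelled, and the string is positional. *)
let repeat ?(sep = "") ~times s =
  String.concat sep (List.init times (fun _ -> s))

let () =
  print_endline (String.concat "," strings);
  print_endline (repeat ~times:3 "ab");
  print_endline (repeat ~sep:"-" ~times:2 "x")
```

The same reading habit carries over to error messages, which print the expected and actual types in this notation.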

Regarding the documentation, the “packages” part of ocaml.org has undergone many changes. The standard library should be available there too; it should help.

Regarding error messages, we also have a card on this in our board. However, this is a bit daunting and we’d love to get input on how to approach this.

2 Likes