"OCaml -- first impressions"

Richard-Degenne · July 10, 2017, 3:46pm

I disagree. When you first come across a language, your first question is never “Does it support multi-threading?” or “Can I overload functions?”.

On the other hand, when you first need to look up the docs, and you land on a page like this, I understand it is very disappointing, even more so if you’re used to documentation that looks like Ruby’s.

The iOS REPL is kinda extreme, though.

Alex · July 10, 2017, 4:10pm

Yeah, that’s never a question, because almost all users expect math operators to work on every numeric type, which isn’t the case in OCaml.

Again, no one (except maybe Python programmers) expects MT to be broken. And when a user discovers this fact, it’s a huge turn-off.

Also, I don’t see why you think the List module documentation is disappointing. It’s a reference, not a tutorial. And unlike Ruby, OCaml has pretty good types, which also serve as documentation.

bluddy · July 10, 2017, 4:13pm

It really depends on your target audience. Python and Ruby programmers make up a huge chunk of the programming world nowadays. Add JavaScript devs, and you’ve got the great majority of programmers not expecting multithreading. Of course, anyone from Haskell, C++, Java, C#, etc. who’s looking to build some systems software would be greatly disappointed at the current lack of MT, but green threading is a thing even in those languages, and there are good reasons for it.

Richard-Degenne · July 10, 2017, 4:23pm

I’m afraid we won’t be able to reach an agreement here, @Alex. Our opinions on the topic diverge way too much, so let’s not start a dialogue of the deaf. I hear your points, and I hope you hear mine. (Deaf, hear… Got it?)

Anyway, being beginner-friendly and being powerful aren’t incompatible, but doing both requires a lot of work. And recent feedback suggests that we aren’t putting enough effort into the “beginner-friendly” side.

boxed · July 10, 2017, 8:57pm

Hi there. I saw a link to this thread in the referrers list on Medium. I’m the original author, so I’d like to clear up some questions and misunderstandings:

I do know about utop, but the point is more like “there exists something that is good, so why ship something bad by default?” Python does something similar with its default REPL, which might or might not work with arrows and such, while ptpython pretty much always works. If you don’t want to lock the nice one into the standard library, just ask to install it on first launch.
Multithreading: well, this is a complicated subject. MT is really quite badly broken, in my opinion, in C++, Java, etc. Clojure is the only language I’ve used that doesn’t have badly broken MT. So OCaml not having an MT solution seems to me to be an advantage, if it’s good like Clojure (or Erlang maybe?) when it arrives. The C/C++ way of just blaming the user when things go badly is not “having MT”, in my opinion.
Significant whitespace: I’m talking about the impression one might get. I understand that OCaml doesn’t have significant whitespace (except space separating tokens!), but code examples are laid out in a way that looks like there is, if that makes sense?
The manual: glad to see people looking at it. This forum is pretty impressive in design and UX btw!
Unicode: I would certainly bash Go for not doing Unicode, if it wasn’t so low on my list of serious problems with Go. Punting on the issue by saying “it’s UTF-8” is not a serious solution, I’m afraid. Some issues: a) how do you iterate over code points? b) what about normalization? c) how do I know if something is a series of bytes or a piece of text if there is no type difference?
iPhone app: You guys should really check out Pythonista, it’s absolutely fantastic. Being able to learn a new language on the commute is very nice and if only Python can do it (which is my experience having tried many apps, including Swift that only does iPad!) that’s a competitive advantage when trying to get new users. And it really shouldn’t be that hard to create something passable.
Overloading: sure, that’s weird. Seems just as bad as Elm or Haskell from what I can tell. But these impressions are way before getting to any serious code.

Phew, I think that was all. Hopefully this clears things up! Thanks for taking this “article” seriously!

dbuenzli · July 11, 2017, 2:08am

It’s not punting on the issue. We are very well aware of the limitations of the current approach. There are reasonably well-functioning libraries for doing what you want; they just happen not to be part of the standard library, and for some of them, for good reasons.

Over time, I have come to find it amusing that people forming a first impression of OCaml almost always mention this “unforgivable” sin.

But the reality is that among the languages out there that do have a type for Unicode strings in their standard library, very few have a non-broken one (the only ones I know for sure are not broken are Swift’s and Rust’s).

I personally don’t see much difference and even prefer almost no (and hence sound) support rather than broken support — e.g. JavaScript, Python, Java, etc. in which you may have a type for Unicode strings but can’t do a) for Unicode scalar values, i.e. the actual textual content, nor answer c).

boxed · July 11, 2017, 7:46am

I believe Python 3.3 fixed iterating over Unicode code points. That was in 2012.

dbuenzli · July 11, 2017, 8:58am

Indeed, you can now iterate over code points, but your indices should iterate over scalar values. Since, as of 2017, they apparently still haven’t fixed c), this is the mess you get:

> python3
Python 3.6.1 (default, Apr  4 2017, 09:40:21) 
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> '\uD800' # unpaired surrogate
'\ud800'
>>> '\uD800'[0]
'\ud800'
>>> '\uD800'.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed
>>> '\uD83D\uDC2B' # paired surrogates representing U+1F42B
'\ud83d\udc2b'
>>> '\uD83D\uDC2B'[0]
'\ud83d'
>>> '\uD83D\uDC2B'[1]
'\udc2b'
>>> '\uD83D\uDC2B'.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 0-1: surrogates not allowed
>>> '\U0001F42B'
'🐫'
>>> '\U0001F42B'[0]
'🐫'

So your python3 string doesn’t represent a sequence of Unicode scalar values, which, as a programmer, is the minimal model you’d like to be able to work with (Swift made a bolder, more programmer-friendly move for certain scripts, as you index by grapheme clusters).

boxed · July 11, 2017, 9:37am

So the way to handle ‘\uD83D\uDC2B’ is to normalize it? I haven’t come across this case, so I’m not really sure what you’re talking about, quite frankly.

dbuenzli · July 11, 2017, 9:57am

No, this has nothing to do with Unicode normalization. This has to do with the fact that Unicode code points (integers 0x0000 to 0x10FFFF) do not always represent text, whereas Unicode scalar values do (integers in the ranges 0x0000…0xD7FF and 0xE000…0x10FFFF).
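
To make that distinction concrete, here is a minimal sketch in Python. It is only an illustration of the ranges quoted above; the names `is_scalar_value` and `is_scalar_text` are made up for this example and are not part of any standard library.

```python
def is_scalar_value(cp: int) -> bool:
    """Return True if the code point cp is a Unicode scalar value.

    Scalar values are all code points except the surrogate range
    0xD800..0xDFFF.
    """
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def is_scalar_text(s: str) -> bool:
    """Return True if every element of s is a scalar value (hypothetical helper)."""
    return all(is_scalar_value(ord(c)) for c in s)


print(is_scalar_text('\U0001F42B'))    # True: a single scalar value
print(is_scalar_text('\ud83d\udc2b'))  # False: two surrogate code points
```

A string for which this check returns False is exactly the kind of value that fails to encode to UTF-8 in the session above.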

Ways of handling this would be to either completely disallow surrogate specification in escapes, or to translate that literal string on the fly to ‘\U0001F42B’ and error on unpaired surrogate escapes. These would be steps towards ensuring you do actually have c) in python3 (I don’t know if there are other means by which such bogus strings can be created), which would then give you a) on scalar values.
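
The second option (translate paired surrogates, reject unpaired ones) can be sketched as a plain function over an existing string. This is an assumption-laden illustration, not how Python's parser works or how it would be changed; `combine_surrogates` is a hypothetical name, and the arithmetic is the standard UTF-16 surrogate pair decoding.

```python
def combine_surrogates(s: str) -> str:
    """Replace each high/low surrogate pair by the scalar value it encodes.

    Raises ValueError on any unpaired surrogate.
    """
    out = []
    i = 0
    while i < len(s):
        cp = ord(s[i])
        if 0xD800 <= cp <= 0xDBFF:  # high (leading) surrogate
            if i + 1 < len(s) and 0xDC00 <= ord(s[i + 1]) <= 0xDFFF:
                low = ord(s[i + 1])
                # Standard UTF-16 decoding of a surrogate pair.
                out.append(chr(0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)))
                i += 2
                continue
            raise ValueError(f"unpaired high surrogate at index {i}")
        if 0xDC00 <= cp <= 0xDFFF:  # low surrogate without a preceding high one
            raise ValueError(f"unpaired low surrogate at index {i}")
        out.append(s[i])
        i += 1
    return "".join(out)


print(combine_surrogates('\ud83d\udc2b') == '\U0001F42B')  # True
```

Calling `combine_surrogates('\ud800')` raises `ValueError`, which corresponds to the "error on unpaired surrogate escapes" part of the suggestion.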

You may then be interested in reading my minimal Unicode introduction.

th3rac25 · July 12, 2017, 4:07am

I am happy to help!

didier-wenzek · July 12, 2017, 12:58pm

The so-called minimal Unicode introduction seems in fact well above the minimum!
It shows great attention to the details that make the difference between sloppy and true support for Unicode.

This level of quality might better shine in the REPL and the libraries.

Beyond the first impressions a newcomer might have, my concern is rather how I can pursue this quest for quality along the whole chain (encoding, printing, editing, …) without having to know all the internal details of Unicode.

For instance, I find it too bad that the lambda-term example for a REPL (https://github.com/diml/lambda-term/blob/master/examples/repl.ml) can manage simple Unicode but gets lost when we try to edit Japanese such as “私は瀧です”.

michipili · July 12, 2017, 1:18pm

I wrote a meant-to-be-friendly answer to Anders and invited him here.

PS: Ah, I should have read more, as the author made it here. Under a different name (yes, I’m desperately looking for excuses).

boxed · July 12, 2017, 1:41pm

Same nick and full name as on GitHub and Medium. It’s unclear why this forum rendered my name as my nick and not my real name, though.

michipili · July 12, 2017, 2:05pm

I also realised afterwards that you are known here as boxed (Anders Hovmöller) and on the other site as Anders Hovmöller (boxed). That was enough to fool me.

gasche · July 12, 2017, 2:25pm

Thanks, @boxed, by the way, for stopping by and giving more information. As you have seen, your post hit on some things that were already being worked on, and set some others in motion. The immediate effect may not look like big changes of course, but I think we’re improving piece by piece.

octachron · July 12, 2017, 10:14pm

Thanks! I will try to think of a good process, and then probably start a new topic soonish on how to update the manual’s CSS.

mars0i · July 25, 2017, 1:51pm

Just one additional comment about the earlier discussion of whether new programmers come to a language looking for multithreading: Maybe not, but programmers newly interested in functional programming will sometimes come looking for multithreading (on multiple cores). Since FP still seems to be a weird stretch for many programmers, they need strong motivation to try any functional language at all, obviously. From what I’ve seen, with Clojure, one of the main selling points of FP for those new to it is the way that FP can simplify taking advantage of multiple cores (e.g. here and here, but I’ve seen this point made repeatedly). I think that for someone who’s sold on FP for the sake of its ability to simplify exploiting multiple cores, but who wants static typing or native code compilation, they might look at OCaml and then be surprised that a well-established FP language doesn’t support easy multicore multithreading. (In the end this issue will just go away with the new multicore revision, I believe.)

Chris00 · August 3, 2017, 11:55am

Should we change the name toplevel to REPL? While interactive toplevel is an accepted name, REPL seems much more common, and the change should remove a (minor) pain point for people new to the language. If there is agreement, I’ll update the ocaml.org website.

dbuenzli · August 3, 2017, 12:22pm

No. The rest of the ecosystem’s documentation and tooling would become inconsistent.

Please let’s focus on the real pain points.
