Capitalized_underscore vs CamelCase


What is your preferred style for module names and data constructors?

Jane Street libraries seem to prefer Capitalized_underscore (e.g. Core_async) - while the OCaml standard library seems to mix both styles (!) - e.g. ListLabels vs Ast_mapper.

Personally I like CamelCase, as the underscore looks a little weird when used with capitalized names - I’m not aware of any other language where it’s used like this. So I’m a little unsure what I should use for my libraries.


I also prefer CamelCase to Camel_snake. :camel::snake:


Underscored module names won out for me after many years, when I decided I prefer filenames like to or If it wasn’t for the coupling between filenames and module names, I might have preferred capitalized camelcase :dromedary_camel:


Follow the programming guidelines.


I’ve wound up with a mixture of snake case and camel case module names with some others that are just all lowercase. I’m a JS dev who enjoys OCaml creating a ReasonML project so my preference is a bit all over the place at the moment!

I’m leaning towards snake case, but ReasonML in general tends to be camel case to appeal to JS devs.


Reason is all over the place at the moment, at least for function calls… e.g. from their docs:

/* receive & destructure the unit argument */
let logSomething () => {
  print_endline "hello";
  print_endline "world";
};

/* call the function with the value of type unit */
logSomething ();

I think the current standard in OCaml is Camel_snake for top-level modules (inferred from file name) and CamelCase for modules defined inside a file?
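To make the file-name coupling concrete: the compiler derives a top-level module's name from its source file by dropping the extension and capitalizing the first letter. Here is a minimal sketch of that rule; `module_name_of_file` is a hypothetical helper written for illustration, not something the toolchain exposes.

```ocaml
(* Hypothetical helper mirroring the naming rule: drop the directory and the
   extension, then capitalize the first letter of what remains. *)
let module_name_of_file path =
  path
  |> Filename.basename
  |> Filename.remove_extension
  |> String.capitalize_ascii

(* A snake_case file name yields a Capitalized_underscore module name. *)
let from_snake = module_name_of_file "src/ast_mapper.ml" (* "Ast_mapper" *)

(* A camelCase file name yields a CamelCase module name. *)
let from_camel = module_name_of_file "listLabels.ml" (* "ListLabels" *)
```

So whichever style you pick for file names carries over directly to the top-level module names.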


I’m moving toward Capitalized_underscore more and more. I find it more readable.


Capitalized_underscore for me.


I am also firmly on the Separated_words side. After all, explicitly separated words were one of the best innovations of medieval typography. And with CamelCase, the rule for separating words becomes context-dependent once acronyms or qualifiers such as 2D, 3D appear. In A1DArray, A1_D_Array is just an improbable interpretation, not an impossible one.
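A rough sketch of that ambiguity, assuming two simple splitting rules one might plausibly apply to a CamelCase name. The helper names (`split_with`, `every_upper`, `after_non_upper`) are made up for illustration.

```ocaml
(* Hypothetical helper: insert '_' wherever [is_boundary prev cur] says a new
   word starts, given the previous and the current character. *)
let split_with is_boundary s =
  let buf = Buffer.create (String.length s + 4) in
  String.iteri
    (fun i c ->
      if i > 0 && is_boundary s.[i - 1] c then Buffer.add_char buf '_';
      Buffer.add_char buf c)
    s;
  Buffer.contents buf

let is_upper c = 'A' <= c && c <= 'Z'

(* Rule 1: every uppercase letter starts a new word. *)
let every_upper = split_with (fun _ cur -> is_upper cur)

(* Rule 2: an uppercase letter starts a new word only when the previous
   character is not uppercase, so runs of capitals stay together. *)
let after_non_upper =
  split_with (fun prev cur -> is_upper cur && not (is_upper prev))

let reading_1 = every_upper "A1DArray"     (* "A1_D_Array" *)
let reading_2 = after_non_upper "A1DArray" (* "A1_DArray" *)
```

Neither rule produces A_1D_Array, which is presumably the reading the author of such a name had in mind; recovering it requires knowing that 1D is a qualifier, and that is exactly the context-dependence.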


Wasn’t the naming used in core libraries chosen because of module packaging (i.e., to indicate that all the ‘Core_*’ modules were to be packaged in the ‘Core’ module)?


Capitalise_underscore for me. And I also do not have uppercase letters anywhere else. So for example in containers, CCList would be Cc_list or Cclist for me with Cclist_labels. Probably the latter due to the underscore being annoying for such a commonly used module.


We chose Capitalize_underscore for Core and company many years ago, both to match the programming guidelines, and because we believed it was clearly better from a readability perspective. I think Core-style libraries differ mostly in that they apply the rule very consistently, but even outside of our libraries, Capitalized_underscore is the more common naming practice for identifiers.

Consistency in API design is valuable for a variety of reasons, but a big one is that it avoids having silly arguments about things that are of relatively little importance, like the style of identifiers. Capitalized_underscore is the thing that’s closest to an accepted standard, and I think we’d be a bit better off if we accepted it more uniformly.



The only problem with Cap_underscore is that it’s completely foreign to the outside world. It looks strange to newcomers, and that carries with it a certain cost, both in learning, and in the lack of consistency of contributions from newcomers.

Also, the usage of Cap_underscore in OCaml leads to Cap_underscore in associated C code, which looks really weird, and is inconsistent with C conventions.


We must be living in a different world.

Newcomers should simply consult the programming guidelines.

Don’t know which conventions you looked at (there’s not a single one) but most C conventions I know of (gnu, linux kernel, K&R) do use underscores rather than Caml case.


Please show me an instance of Cap_underscore (really Camel_snake) elsewhere. I have programmed in C, C++, Java, Haskell, Python, and Ruby, and I haven’t found it anywhere else. It’s either snake_case, CamelCase, or ALL_CAPS. That’s what the outside world has. This makes OCaml conventions even more exotic to outsiders, especially in C code.


Maybe you should try to see that from another perspective.

OCaml uses snake casing, as the programming guidelines will tell you. However, as the manual will explain to you, the language does make distinctions in certain contexts between capital and lowercased identifiers. Namely constructors and module names do start with a capital letter.
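A small illustration of that distinction, with made-up names (`Shape_utils`, `point`, and so on) chosen only for the example:

```ocaml
(* Types, values, functions and record fields start with a lowercase letter. *)
type point = { x : float; y : float }

(* Module names must start with a capital letter. *)
module Shape_utils = struct
  (* Constructors must start with a capital letter too. *)
  type shape = Circle of float | Square of float

  let origin = { x = 0.0; y = 0.0 }

  let area = function
    | Circle r -> Float.pi *. r *. r
    | Square s -> s *. s
end

(* Capitalized name before the dot: a module access. *)
let p = Shape_utils.origin

(* Lowercase name before the dot: a record field access. *)
let x_coord = p.x

let square_area = Shape_utils.area (Shape_utils.Square 2.0)
```

The case of the first letter is enough to tell `Shape_utils.origin` (a module path) apart from `p.x` (a field projection) at a glance.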

TBH I don’t think that this is what a newcomer will struggle with the most when she tries to learn the language. I actually find that these conventions are quite nice to orient yourself in the syntactic constructs of the code (e.g. am I accessing a record or a module ?).

I find the obsession some people have that OCaml’s syntax and conventions should absolutely be like everything else out there a bit ridiculous (if such an everything else even exists…). The reason why we have different programming languages is precisely because they are… different. For example I would never give up OCaml’s terse ML-like syntax for a noisy JavaScript-like notation.


I’m not really disagreeing – I’m ok with Camel_snake. But I remember how strange it was initially, and I’m a little uncomfortable with using it in C code. Every time we diverge from what’s well known to users, we incur a cost. I’m not saying in this particular case the overhead is huge – only that it adds to the sum total. Every little bit of cost makes it harder to bring new members into the language.

Of course, if we have no cost whatsoever, we also provide no value to the users – they might as well use an existing language. But there’s a balance to be sought. Look at how ReasonML is taking off precisely because they’re removing the cost of the particular OCaml syntax (whether you like what they’re doing or not). They’re keeping the value of the semantics (which also incur their own cost) while minimizing the overhead of the syntax for the average programmer.


Why do you do this ? AFAIK nobody advocates (or even does) that, even in bindings. If you program in C use one of the C conventions.

This is OT, but by providing too much familiarity and favoring the expression of certain constructs, I think they are also removing the ability for newcomers to try another take on computing, which is to program with values rather than statements. This may have a longer term cost…


The OCaml compiler’s C files use this convention. That’s the only reason I mention it.


Do you have a link ? I never saw that.