OCaml 4.14.0 is released

The OCaml team has the pleasure of celebrating the birthday of Alexander Grothendieck by
announcing the release of OCaml version 4.14.0.

Some of the highlights in the 4.14.0 release are:

  • Integrated support for “go to definitions” in Merlin.
  • Standard library: new modules In_channel and Out_channel,
    many new functions in Seq module, UTF decoding and validation support
    for strings and bytes.
  • Runtime optimisation: GC prefetching. Benchmarks show a speedup of around 20%
    in GC-heavy programs.
  • Improved error messages, in particular for module-level errors.
  • Deprecated functions and modules in preparation for OCaml 5.
    In particular, the Stream and Genlex modules are now deprecated.
  • Type variables can be explicitly introduced in value and variant constructor
    declarations. For instance,

      val fold: ('acc -> 'elt -> 'acc) -> 'acc -> 'elt list -> 'acc
      type showable = Show: 'a * ('a -> string) -> showable

    can now be written as

      val fold: 'acc 'elt. ('acc -> 'elt -> 'acc) -> 'acc -> 'elt list -> 'acc
      type showable = Show: 'a. 'a * ('a -> string) -> showable
  • Tail calls with up to 64 arguments are now guaranteed to be optimized
    for all architectures.
  • Experimental tail modulo cons (TMC) transformation

The full list of changes can be found in the changelog below.

This release is available as an OPAM switch, and as a source download here:

https://github.com/ocaml/ocaml/archive/4.14.0.tar.gz
https://caml.inria.fr/pub/distrib/ocaml-4.14/ocaml-4.14.0.tar.gz

What is new in 4.14.0?

OCaml 4.14 is the last release of the OCaml 4 branch. Although there is much anticipation
for OCaml 5, OCaml 4.14 still brings many quality-of-life improvements to OCaml 4.
In this not-so-short-any-longer post, I will try to describe a few personal highlights in this new release:

  • Integrated support for “go to definitions”
  • Standard library improvements
  • Improved error messages
  • Runtime optimisation for the Garbage Collector
  • More tail-call guarantees
  • Syntax fine-tuning
  • Experimental TMC transformation

Integrated support for “go to definitions”

Finding the original definition of a type or expression is quite a common feature
in code editors for many languages. However, most languages don’t have a
powerful module system that makes it easy to transport definitions from one
module to another, as OCaml does.

Consider for instance a file const.ml

module type S = sig val euler_mascheroni: float end
module Origin = struct
  let euler_mascheroni = 0.577215664901532866
end

module F(X:S)(Y:S) = struct
  module A = Y
  module B = X
end

module G(X:sig module A:S end)=X.A

include G(F(Origin)(Origin)) 

At some point in the future, I might want to modify the definition of the
Euler-Mascheroni constant in const.ml, for instance to switch to a
hexadecimal representation to avoid rounding errors.

When querying for the original definition of the constant, I don’t want to be
sent to the functor application:

include G(F(Origin)(Origin)) 

by my editor: the original definition is not here.
Similarly, the original definition of the constant does not belong to the body of the functor G

module G(X:sig module A:S end)=X.A

nor to the body of the functor F


module F(X:S)(Y:S) = struct
  module A = Y
  ...
end

And it is only after traversing two functor applications and two functor bodies
that I can at last reach the original definition in the Origin module.
This is possibly quite an involved process, which I don’t want to do manually
for every definition.

Before 4.14, Merlin was doing this tracking of original definitions on its own,
with some difficulties and some bugs. To avoid those issues, the calculus needed
to track the origin of such definitions has been upstreamed into the compiler.
Moving this computation upstream means that the compiler itself can record the
information needed to track original definitions through functor applications,
module projections, and inclusions, and save this information in the cmt files
as a new field that can be used directly by Merlin. This is done by computing
a shape for each module, which records new definitions, but also functor
applications and module projections (M.X). This shape information is also
optimised to reduce future query time. This optimisation might increase
compilation time for the heaviest users of modules, but that was deemed a
reasonable compromise.

Standard library

Seq

The previously slightly bare-bones API of the Seq module has been significantly
expanded with more than 40 functions that should make it easier to produce (6 new functions),
consume (15 new functions), transform (11 new functions), combine (6 new functions),
split (6 new functions), or convert sequences (2 new functions).
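
As a small illustration of a few of these additions, here is a sketch that uses Seq.ints (produce), Seq.take (split) and Seq.zip (combine), together with the older Seq.map, Seq.filter and Seq.fold_left. The names naturals, even_squares, pairs and total are only chosen for this example.

```ocaml
(* Produce: the infinite sequence 0, 1, 2, ... *)
let naturals = Seq.ints 0

(* Transform, then keep a finite prefix of the infinite result. *)
let even_squares =
  naturals
  |> Seq.map (fun n -> n * n)
  |> Seq.filter (fun n -> n mod 2 = 0)
  |> Seq.take 5
  |> List.of_seq

(* Combine: zip stops as soon as the shorter sequence is exhausted. *)
let pairs =
  Seq.zip (Seq.ints 1) (List.to_seq [ "a"; "b"; "c" ]) |> List.of_seq

(* Consume: sum of 1, 2, 3, 4. *)
let total = Seq.fold_left ( + ) 0 (Seq.take 4 (Seq.ints 1))
```

Since sequences are lazy, only the elements that are actually demanded by Seq.take are ever computed.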

In_channel and Out_channel

Two new modules have been added to the standard library: In_channel and Out_channel.
All functions operating on in_channel and out_channel in the Stdlib module are now also available
in those modules. But those modules also provide new functions like In_channel.input_all, which reads
the full content of an input channel, or Out_channel.with_open_bin, which guarantees that the opened
channel is closed after its use.
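
A minimal sketch of how these functions fit together: the hypothetical helper roundtrip below writes a string to a temporary file and reads it back, without any explicit close_in or close_out.

```ocaml
(* Write [text] to a fresh temporary file, then read the whole file back.
   Both channels are closed by the with_open_bin wrappers, even if the
   callback raises. *)
let roundtrip text =
  let path = Filename.temp_file "ocaml414_" ".txt" in
  Out_channel.with_open_bin path (fun oc ->
      Out_channel.output_string oc text);
  let contents = In_channel.with_open_bin path In_channel.input_all in
  Sys.remove path;
  contents
```

The binary variants are used here so that the content is read back byte for byte on every platform; with_open_text is the counterpart for text mode.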

UTF decoding and validation

Continuing with the spirit of defining strings and bytes as arrays of chars
that might contain valid Unicode data, new functions have been added to the modules Uchar, String,
and Bytes to help validate and decode Unicode content stored in strings or bytes.
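
For instance, the following sketch decodes a UTF-8 string into its list of code points with String.get_utf_8_uchar, and validates it with String.is_valid_utf_8. The function name code_points is an assumption made for this example.

```ocaml
(* Decode a UTF-8 encoded string into the list of its code points.
   Each decode reports how many bytes it consumed, which is used to
   advance to the next character. *)
let code_points s =
  let rec loop i acc =
    if i >= String.length s then List.rev acc
    else
      let d = String.get_utf_8_uchar s i in
      let u = Uchar.to_int (Uchar.utf_decode_uchar d) in
      loop (i + Uchar.utf_decode_length d) (u :: acc)
  in
  loop 0 []

(* "café" with the é encoded on two bytes. *)
let cafe = "caf\xC3\xA9"
let cafe_points = code_points cafe
let cafe_is_valid = String.is_valid_utf_8 cafe
let garbage_is_valid = String.is_valid_utf_8 "\xFF"
```

Here the five bytes of cafe decode to four code points, and the lone byte 0xFF is rejected by the validation function.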

Deprecations

In preparation for OCaml 5, the standard library has added many deprecation warnings on functions
or modules that were “well-known” to be deprecated. In particular, the Stream and Genlex modules
have been deprecated in OCaml 4.14 and will not be available in OCaml 5 as part of the standard library.
Those modules are now available in a new separate library, camlp-streams, for packages that depend
on them.

Improved error messages

In 4.14.0, the ongoing work on better error messages has been particularly
fruitful for error messages at the module level, but there was also some
interesting progress for type error messages.

Full trace for module errors

Module errors are possibly among the hardest error messages to read in OCaml.
This phenomenon was worsened by the fact that module-level errors were often
laconic about the reason why two types were mismatched. For instance, reading the
error for

module M: sig
  type ('a,'b) arrow
  val app : ('a,'b) arrow -> 'a -> 'b
end = struct
  type 'a pr = Format.formatter -> 'a -> unit
  type ('a,'b) arrow = <name:string; a:'a pr; b:'b pr; f:'a -> 'b>
  let app arrow x =
    let y = arrow#f x in
    Format.printf "%t(%a)=%a" arrow#name (arrow#a: _ pr) x (arrow#b: _ pr) y;
    y
end
Error: Signature mismatch:
       ...
       Values do not match:
         val app :
           < a : 'a pr; b : 'b pr; f : 'a -> 'b;
             name : Format.formatter -> unit; .. > ->
           'a -> 'b
       is not included in
         val app : ('a, 'b) arrow -> 'a -> 'b

only tells us that there is some incompatibility between the interface and
implementation types.

And that’s it.

With OCaml 4.14, the compiler is much more loquacious, and the type incompatibility is fully explained:

       The type
         < a : 'a pr; b : 'b pr; f : 'a -> 'b;
           name : Format.formatter -> unit > ->
         'a -> 'b
       is not compatible with the type ('a, 'b) arrow -> 'a -> 'b
       Type
         < a : 'a pr; b : 'b pr; f : 'a -> 'b;
           name : Format.formatter -> unit >
       is not compatible with type
         ('a, 'b) arrow = < a : 'a pr; b : 'b pr; f : 'a -> 'b; name : string > 
       Types for method name are incompatible

This result was achieved by making sure that error messages at the module level are chained with
the type errors that led to the module-level error message.

Mismatched definitions

Have you ever swapped a handful of constructors in a type definition and forgotten
to update the corresponding interface file?

module A: sig
  type t = A | B | C | D | E | F | G | H
end = struct
  type t = A | B | E | D | C | F | H | G
end

This situation is certainly caught by the typechecker. However, the error
message was often very indirectly related to the issue at hand:

 Type declarations do not match:
  type t = A | B | E | D | C | F | H | G
is not included in
  type t = A | B | C | D | E | F | G | H
Constructors number 3 have different names, E and C.

It is certainly true that the third constructor has a different name in the signature and in the implementation.
However, if I follow this error message blindly, I stumble into another error message:

module A: sig
  type t = A | B | C | D | E | F | G | H
end = struct
  type t = A | B | C | D | C | F | H | G
end
Error: Two constructors are named C

And if I keep applying fixes by blindly following the error messages step by step,
I will need 10 iterations to reach the correct definition. To avoid this long
chain of error messages, OCaml 4.14 has extended the use of the new diffing
algorithm for functors to type definitions.

The idea is to compute a minimal patch that can transform the implementation
definition into the interface declaration by using semantic actions
(like adding a constructor, or swapping two constructors) and use this
minimal patch to explain the error. The previous example was for instance
constructed by swapping two pairs of constructors, and the diffing algorithm
successfully decomposes the error into two swaps:

module A: sig
  type t = A | B | C | D | E | F | G | H
end = struct
  type t = A | B | E | D | C | F | H | G
end
Error: Signature mismatch:
       ...
       Type declarations do not match:
         type t = A | B | E | D | C | F | H | G
       is not included in
         type t = A | B | C | D | E | F | G | H
       5<->3. Constructors C and E have been swapped.
       8<->7. Constructors G and H have been swapped.

This new error message format also works quite well when either the implementation
or the interface is missing constructors or record fields:

 module M: sig
  type t = { a:int; b:int; c:int; d:int}
end = struct
  type t = { a: int; b : int }
end
Error: Signature mismatch:
       Modules do not match:
         sig type t = { a : int; b : int; } end
       is not included in
         sig type t = { a : int; b : int; c : int; d : int; } end
       Type declarations do not match:
         type t = { a : int; b : int; }
       is not included in
         type t = { a : int; b : int; c : int; d : int; }
       3. A field, c, is missing in the first declaration.
       4. A field, d, is missing in the first declaration.

Type-directed disambiguation should fail early

One of the disadvantages of type-directed disambiguation is that it may delay
a type error and lead to surprising error messages.
For instance, before OCaml 4.14, the following code

let () = match Seq.return 10 with Nil | Cons _ -> ()

raised an error about an unbound Nil constructor:

Error: Unbound constructor Nil

At first glance, one might be tempted to fix this issue by using
the fully qualified name

let () = match Seq.return 10 with Seq.Nil | Seq.Cons _ -> ()

But this updated code is rejected with

Error: This pattern matches values of type 'a Seq.node
      but a pattern was expected which matches values of type
        int Seq.t = unit -> int Seq.node

In other words, we were sent on a wild goose chase, and the true error lay in a
missing argument to the Seq.return function:

let () = match Seq.return 10 () with Nil | Cons _ -> ()

At this point, one may wonder why the compiler complained initially about an unbound Nil constructor.
The true reason is that the type constructor Seq.t does not define a constructor Nil;
or more precisely, it does not define any constructors at all since it is a type abbreviation for
unit -> 'a Seq.node.

The error logic was definitely backward here. Fortunately, this point is fixed in 4.14, and now

let () = match Seq.return 10 with Nil | Cons _ -> ()

is rejected with an error message that does mention that there is something
wrong with this pattern matching:

Error: This pattern should not be a constructor, the expected type is int Seq.t

Hopefully, this should decrease the time spent debugging this kind of issue in the future.

Improved printing

When printing types in error messages, OCaml is now much more parsimonious in its use of as bindings.
For instance, the error message for

type 'a t
type a
let f : < .. > t -> unit = fun _ -> ()
let _ = fun (x : a t) -> f x
Error: This expression has type a t but an expression was expected of type
         (< .. > as 'a) t
       Type a is not compatible with type < .. > as 'a 

used to add an as 'a because the printer for type expressions anticipated that it might need
a name for linking recursive occurrences. With 4.14, such names are only printed when they are really necessary:

Error: This expression has type a t but an expression was expected of type
         < .. > t
       Type a is not compatible with type < .. > 

Runtime optimisation: GC prefetching

The code of the major GC has been fine tuned to better use the processor cache.

One of the phases of the major GC consists in marking live objects, starting
from a set of roots that are known to be alive.

In the previous versions of OCaml, this marking was a depth-first traversal of
reachable objects: newly discovered objects were added to the top of the marking
stack. Since those newly discovered objects were unlikely to be in the cache,
this phase spent a significant amount of time waiting for loads
after cache misses.

This lack of cache locality has been fixed in OCaml 4.14 by adding
a prefetching circular buffer in front of the marking stack.
The next element to scan is then:

  • drawn from the front of the prefetching buffer if it contains more than pq_min
    elements
  • drawn from the marking stack otherwise

Newly discovered objects are added to the back of the prefetching queue except
in case of overflow, in which case they are added to the stack.
This new design ensures that discovered elements spend enough time in the
prefetching buffer to have likely been loaded in cache once we try to scan
them.

For programs that were spending a significant amount of time in the GC, this
can result in a 20% speedup.
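
To make the scheduling policy described above more concrete, here is a toy model written in OCaml. It is only a sketch and not the runtime's C implementation: objects are plain integers, children.(o) lists the objects that o points to, no actual prefetch instruction is issued, and the parameter pq_capacity as well as the default values are assumptions made for the example.

```ocaml
(* Toy model of marking with a prefetch queue in front of the mark stack.
   Returns the mark bits and the order in which objects were scanned. *)
let mark ?(pq_capacity = 4) ?(pq_min = 2) children roots =
  let marked = Array.make (Array.length children) false in
  let stack = Stack.create () in
  let pq = Queue.create () in
  (* Newly discovered objects go to the back of the queue, or to the
     stack when the queue is full. *)
  let discover o =
    if not marked.(o) then begin
      marked.(o) <- true;
      if Queue.length pq < pq_capacity then Queue.add o pq
      else Stack.push o stack
    end
  in
  (* Scan from the queue when it holds more than pq_min elements, from
     the stack otherwise; drain the queue once the stack is empty. *)
  let next () =
    if Queue.length pq > pq_min || Stack.is_empty stack then
      Queue.take_opt pq
    else Stack.pop_opt stack
  in
  List.iter discover roots;
  let order = ref [] in
  let rec loop () =
    match next () with
    | None -> ()
    | Some o ->
        order := o :: !order;
        List.iter discover children.(o);
        loop ()
  in
  loop ();
  (marked, List.rev !order)

(* Object 4 only points to itself and is not reachable from the root 0. *)
let children = [| [ 1; 2 ]; [ 3 ]; [ 3 ]; []; [ 4 ] |]
let marked, order = mark children [ 0 ]
```

In this model, an object waits in the queue while other objects are scanned, which is the delay that gives a real prefetch the time to complete.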

Tailcall optimisation for functions with many arguments

Before OCaml 4.14, mutually recursive functions with too many arguments were not subject to
tail-call optimisation. For instance,

let rec h x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 =
  if x1 = 0 then
    x11
  else
    i (x1 - 1) x2 (x1 + x2) (x2+x3) (x4+x5) (x5+x6) (x6+x7) (x7 + x8) (x8 + x9) (x9+x10) (x10+x11)
and i x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 =
  h x1 x2 (x1 + x2) (x2+x3) (x4+x5) (x5+x6) (x6+x7) (x7 + x8) (x8 + x9) (x9+x10) (x10+x11)

The call to h might not be tail-call optimised by ocamlopt in OCaml 4.13.
Moreover, the precise number hidden behind the words “too many” depended on the
processor architecture.

The OCaml function call convention has been updated to ensure that
tail calls to functions with up to 64 arguments are always optimised and do not
consume stack frames.
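
A quick way to observe this guarantee is to run a long chain of mutually recursive tail calls with more arguments than fit in registers on common architectures. The functions ping and pong below are hypothetical and only serve this experiment; with OCaml 4.14 the ten million calls should run in constant stack space.

```ocaml
(* Two mutually recursive functions with 12 arguments each. Every
   recursive call is in tail position. *)
let rec ping n a b c d e f g h i j k =
  if n = 0 then a + b + c + d + e + f + g + h + i + j + k
  else pong (n - 1) a b c d e f g h i j k

and pong n a b c d e f g h i j k = ping n a b c d e f g h i j k

(* Ten million round trips between ping and pong. *)
let result = ping 10_000_000 1 1 1 1 1 1 1 1 1 1 1
```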

Syntax fine-tuning

Explicitly quantified type variables in declarations

Type variables can be explicitly introduced in the declarations of values and variant constructors.
For instance,

    val fold: ('acc -> 'elt -> 'acc) -> 'acc -> 'elt list -> 'acc
    type showable = Show: 'a * ('a -> string) -> showable

can now be written as

    val fold: 'acc 'elt. ('acc -> 'elt -> 'acc) -> 'acc -> 'elt list -> 'acc
    type showable = Show: 'a. 'a * ('a -> string) -> showable
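
Here is a small self-contained sketch that uses both forms. The module type FOLD, the module Fold and the function show are names invented for this example.

```ocaml
(* Explicit binders in a value declaration. *)
module type FOLD = sig
  val fold : 'acc 'elt. ('acc -> 'elt -> 'acc) -> 'acc -> 'elt list -> 'acc
end

module Fold : FOLD = struct
  let fold = List.fold_left
end

(* Explicit binder in a constructor declaration: 'a is existential. *)
type showable = Show : 'a. 'a * ('a -> string) -> showable

(* The packed value can only be used through the packed function. *)
let show = function Show (x, to_string) -> to_string x

let rendered =
  List.map show
    [ Show (1, string_of_int); Show ("two", Fun.id); Show (true, string_of_bool) ]

let sum = Fold.fold ( + ) 0 [ 1; 2; 3 ]
```

The explicit binders do not change the meaning of these declarations; they only make visible where each type variable is introduced.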

Fewer parentheses for immediate objects

Immediate objects no longer require parentheses:

let print x = x#name
let name = print object method name = "Named" end

Experimental TMC transformation

It is well known that there are many algorithms on recursive data types where there is a
compromise between readability and speed on large values. For instance, when comparing

let rec map f = function
| [] -> []
| x :: xs -> f x :: map f xs 

and

let rec map f acc = function
| [] -> List.rev acc
| x :: xs -> map f (f x::acc) xs
let map f l = map f [] l

the first implementation is easier on the eyes and faster on smaller lists.
However, this first function is not tail-recursive: it uses stack space and
might trigger a stack overflow on larger lists. The second function is thus
better behaved (and in fact generally faster) on large inputs.

The experimental tail-modulo-cons program transformation offers a solution to this compromise
between speed and readability when every recursive call happens under a constructor.

By annotating a function with the [@tail_mod_cons] attribute and each tail call with the [@tailcall] attribute

let[@tail_mod_cons] rec map f = function
| [] -> []
| x :: xs -> f x :: (map[@tailcall]) f xs 

the compiler will generate an optimised version of the function that does not use stack space
by using an initializable hole.

Informally, the TMC transformation is rewriting

f x :: (map[@tailcall]) f xs 

as

let $hole in
let result = f x :: $hole in
previous_result.$hole <- result;
map_dps result f xs 

using a new destination passing style function map_dps that rewrites the content
of the hole in result once it is available.
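
As a sketch of the feature in action, assuming a compiler with TMC support (OCaml 4.14 or later), the annotated map below can be applied to a list of one million elements. The let binding makes it explicit that f x is evaluated before the recursive call; the names big and doubled are only used for this example.

```ocaml
(* The attribute is attached to the let keyword, as in the changelog entry. *)
let[@tail_mod_cons] rec map f = function
  | [] -> []
  | x :: xs ->
      let y = f x in
      y :: (map [@tailcall]) f xs

(* A list that is large enough to make a naive recursive map risky. *)
let big = List.init 1_000_000 Fun.id
let doubled = map (fun x -> 2 * x) big
```

The source code keeps the shape of the readable version of map, while the generated code behaves like the accumulator-based one with respect to stack usage.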

For a better explanation, there is a whole new manual chapter explaining this
transformation at OCaml - The “Tail Modulo Constructor” program transformation
(once the updated manual is online). Note that the feature is still slightly in
flux in terms of user interface.

Changelog - OCaml 4.14.0 (28 March 2022)

Language features (highlights):

  • #10437: Allow explicit binders for type variables.
    (Stephen Dolan, review by Leo White)

  • #181, #9760, #10740: opt-in tail-modulo-cons (TMC) transformation
    let[@tail_mod_cons] rec map f li = …
    (Frédéric Bour, Gabriel Scherer, Basile Clément,
    review by Basile Clément and Pierre Chambart,
    tested by Konstantin Romanov)

Runtime system (highlights):

  • #10195, #10680: Speed up GC by prefetching during marking
    (Stephen Dolan, review by Xavier Leroy, Guillaume Munch-Maccagnoni,
    Jacques-Henri Jourdan, Damien Doligez and Leo White)

Code generation and optimizations (highlights):

  • #10595: Tail calls with up to 64 arguments are guaranteed to be compiled
    as tail calls. To this end, memory locations in the domain state
    are used for passing arguments that do not fit in registers.
    (Xavier Leroy, review by Vincent Laviron)

Standard library (highlights):

  • (breaking change) #10710: Add UTF tools, codecs and validations to the Uchar, Bytes and
    String modules.
    (Daniel Bünzli, review by Florian Angeletti, Nicolás Ojeda Bär, Alain
    Frisch and Gabriel Scherer)

  • (breaking change) #10482: mark the Stream and Genlex modules as deprecated, in preparation
    for a future removal. These modules (without deprecation alert)
    are now provided by the camlp-streams library.
    (Xavier Leroy, review by Nicolás Ojeda Bär)

  • #10545: Add In_channel and Out_channel modules.
    (Nicolás Ojeda Bär, review by Daniel Bünzli, Simon Cruanes, Gabriel Scherer,
    Guillaume Munch-Maccagnoni, Alain Frisch and Xavier Leroy)

Compiler user-interface and warnings (highlights)

  • #10328, #10780: Give more precise error when disambiguation could not
    possibly work.
    (Leo White, review by Gabriel Scherer and Florian Angeletti)

  • #10361: Improve error messages for mismatched record and variant
    definitions.
    (Florian Angeletti, review by Gabriel Radanne and Gabriel Scherer)

  • #10407: Produce more detailed error messages that contain full error traces
    when module inclusion fails.
    (Antal Spector-Zabusky, review by Florian Angeletti)

Internal/compiler-libs changes (highlights):

  • #10718, #11012: Add “Shape” information to the cmt files. Shapes are an
    abstraction of modules that can be used by external tooling to perform
    definition-aware operations.
    (Ulysse Gérard, Thomas Refis and Leo White, review by Florian Angeletti)

Language features:

  • #10462: Add attribute to produce a compiler error for polls.
    (Sadiq Jaffer, review by Mark Shinwell, Stephen Dolan
    and Guillaume Munch-Maccagnoni)

  • #10441: Remove unnecessary parentheses surrounding immediate objects.
    Allow ‘object … end # f’, ‘f object … end’, etc.
    (Yan Dong, review by Nicolás Ojeda Bär, Florian Angeletti and Gabriel Scherer)

Runtime system:

  • (breaking change) #9391, #9424: Fix failed assertion in runtime due to ephemerons set_ and
    blit_ function during Mark phase
    (François Bobot, reported by Stephen Dolan, reviewed by Damien Doligez)
  • #10549: Stack overflow detection and naked pointers checking for ARM64
    (Xavier Leroy, review by Stephen Dolan)
  • (breaking change) #10675, #10937: Emit deprecation warnings when old C runtime function names
    are used. This will break C stub code that uses these old names and
    treats warnings as errors. The workaround is to use the new names.
    (Xavier Leroy and David Allsopp, review by Sébastien Hinderer and
    Damien Doligez)
  • #10698, #10726, #10891: Free the alternate signal stack when the main OCaml
    code or an OCaml thread stops
    (Xavier Leroy, review by David Allsopp and Damien Doligez)

  • #10730, #10731: Fix bug in Obj.reachable_words causing a slowdown when called
    multiple times (Alain Frisch, report by ygrek, review by Xavier Leroy)

Code generation and optimizations:

  • #10578: Increase the number of integer registers used for
    parameter passing on PowerPC (16 registers) and on s390x (8 registers).
    (Xavier Leroy, review by Mark Shinwell)

  • #10591, #10615: Tune the heuristic for CSE of integer constants
    so as to avoid excessive CSE on compiler-generated constants
    and long register allocation times.
    (Xavier Leroy, report by Edwin Török, review by Nicolás Ojeda Bär)

  • #10681: Enforce boolean conditions for the native backend
    (Vincent Laviron, review by Gabriel Scherer)

  • #10719: Ensure that build_apply respects Lambda.max_arity
    (Stephen Dolan, review by Xavier Leroy)

  • #10728: Ensure that functions are evaluated after their arguments
    (Stephen Dolan, review by Mark Shinwell)

  • #10732: Ensure right-to-left evaluation of arguments in cmm_helpers
    (Greta Yorsh, review by Xavier Leroy)

Standard library:

  • (breaking change) #10622: Annotate Uchar.t with immediate attribute
    (Hongbo Zhang, review by Gabriel Scherer and Nicolás Ojeda Bär)

  • (breaking change) #7812, #10475: Filename.chop_suffix name suff now checks that suff
    is actually a suffix of name and raises Invalid_argument otherwise.
    (Xavier Leroy, report by whitequark, review by David Allsopp)

  • #10526: add Random.bits32, Random.bits64, Random.nativebits
    (Xavier Leroy, review by Gabriel Scherer and François Bobot)
  • (breaking change) #10568: remove Obj.marshal and Obj.unmarshal
    (these functions have been deprecated for a while and are superseded
    by the functions from module Marshal)
    (François Pottier, review by Gabriel Scherer and Kate Deplaix)
  • #10538: add Out_channel.set_buffered and Out_channel.is_buffered to control
    the buffering mode of output channels.
    (Nicolás Ojeda Bär, review by John Whitington, Daniel Bünzli, David Allsopp
    and Xavier Leroy)
  • (breaking change) #10583, #10998: Add over 40 new functions in Seq.
    (François Pottier and Simon Cruanes, review by Nicolás Ojeda Bär,
    Daniel Bünzli, Naëla Courant, Craig Ferguson, Wiktor Kuchta,
    Xavier Leroy, Guillaume Munch-Maccagnoni, Raphaël Proust, Gabriel Scherer
    and Thierry Martinez)
  • #10596, #10978: Add with_open_bin, with_open_text and with_open_gen to
    In_channel and Out_channel. Also, add In_channel.input_all.
    (Nicolás Ojeda Bär, review by Daniel Bünzli, Jérémie Dimino, Damien Doligez
    and Xavier Leroy)

  • #10658: add detailed information about the current version of OCaml
    to the Sys module of the standard library.
    (Sébastien Hinderer, review by Damien Doligez, Gabriel Scherer, David
    Allsopp, Nicolás Ojeda Bär, Vincent Laviron)

  • #10642: On Windows, Sys.remove and Unix.unlink now remove symlinks
    to directories instead of raising EACCES. Introduce
    caml/winsupport.h to hold more common code between the runtime,
    lib-sys, and win32unix.
    (Antonin Décimo, review by David Allsopp and Xavier Leroy)

  • #10737: add new ephemeron API for forward compatibility with Multicore
    OCaml.
    (Damien Doligez, review by Stephen Dolan)

  • (breaking change) #10922: Add deprecation warnings on {Int32,Int64,Nativeint}.format.
    (Nicolás Ojeda Bär, review by Xavier Leroy and Florian Angeletti)

Other libraries:

  • #10192: Add support for Unix domain sockets on Windows and use them
    to emulate Unix.socketpair (only available on Windows 1803+)
    (Antonin Décimo, review by David Allsopp)

  • #10469: Add Thread.set_uncaught_exception_handler and
    Thread.default_uncaught_exception_handler.
    (Enguerrand Decorne, review by David Allsopp)

  • #10697: Bindings of dup and dup2 in win32unix now correctly call
    WSADuplicateSocket on sockets instead of DuplicateHandle.
    (Antonin Décimo, review by Xavier Leroy and Nicolás Ojeda Bär)

  • #10951: Introduce the Thread.Exit exception as an alternative way to
    terminate threads prematurely. This alternative way will become
    the standard way in 5.00.
    (Xavier Leroy, review by Florian Angeletti)

Tools:

  • #10839: Fix regression of #show when printing class type
    (Élie Brami, review by Florian Angeletti)

  • #3959, #7202, #10476: ocaml, in script mode, directive errors
    (#use "missing_file";;) use stderr and exit with an error.
    (Florian Angeletti, review by Gabriel Scherer)

  • #10438: add a new toplevel cli argument -e <script> to
    run script passed to the toplevel.
    (Pavlo Khrystenko, review by Gabriel Scherer)

  • #10524: Directive argument type error now shows expected and received type.
    (Wiktor Kuchta, review by Gabriel Scherer)

  • #10560: Disable colors if the env variable NO_COLOR is set. If
    OCAML_COLOR is set, its setting takes precedence over NO_COLOR.
    (Nicolás Ojeda Bär, report by Gabriel Scherer, review by Daniel Bünzli,
    Gabriel Scherer and David Allsopp)

  • #10565: Toplevel value printing: truncate strings only after 8 bytes.
    (Wiktor Kuchta, review by Xavier Leroy)

  • #10527: Show “#help;; for help” at toplevel startup
    (Wiktor Kuchta, review by David Allsopp and Florian Angeletti)

  • #10846: add the -shape command-line option to ocamlobjinfo. When reading a
    cmt file, shape information will only be shown if that option is used.
    (Ulysse Gérard, review by Florian Angeletti)

Debugging:

  • #10517, #10594: when running ocamldebug on a program linked with the
    threads library, don’t fail immediately; instead, allow debugging
    until the program creates a thread for the first time, then fail cleanly.
    (Xavier Leroy, report by @anentropic, review by Gabriel Scherer)

  • #9621: Pack the ocamldebug modules to minimize clashes
    (Raphael Sousa Santos, review by Vincent Laviron and Gabriel Scherer)

Manual and documentation:

  • #7812, #10475: reworded the description of the behaviors of
    float->int conversions in case of overflow, and of iterators
    in case of concurrent modifications.
    (Xavier Leroy, report by whitequark, review by David Allsopp)

  • #8697, #10666: add M, m, n options of the OCAMLRUNPARAM to manual and man page
    for ocamlrun command line options
    (Dong An and Anukriti Kumar, review by David Allsopp, Gabriel Scherer
    and Damien Doligez)

  • #10281, #10685: Add description of C compiler on macOS and Windows platforms.
    (Dong An, review by Xavier Leroy and David Allsopp)

  • #10397: Document exceptions raised by Unix module functions on Windows
    (Martin Jambon, review by Daniel Bünzli, David Allsopp, Damien Doligez,
    Xavier Leroy, and Florian Angeletti)

  • #10589: Fix many typos (excess/inconsistent spaces) in the HTML manual.
    (Wiktor Kuchta, review by Florian Angeletti)

  • #10605: manual, name few css classes to ease styling and maintainability.
    (Florian Angeletti, review by Wiktor Kuchta and Gabriel Scherer)

  • #10668, #10669: the changelog (this file), LICENSE and README files are now
    installed as part of the distribution. The destination directory can be
    customized using the --docdir argument to ./configure.
    (Nicolás Ojeda Bär, report by Daniel Bünzli, review by David Allsopp,
    Sébastien Hinderer, and Daniel Bünzli)

  • #10671, #10672: webman: Fix misalignments in unordered lists by changing the
    CSS for coloring bullets
    (Wiktor Kuchta, review by Florian Angeletti)

  • #11107: Lifted comments in the Parsetree module into actual documentation.
    (Paul-Elliot Anglès d’Auriac, review by Florian Angeletti)

  • #11120, #11133: man pages, add missing warning entries and add mnemonic names
    to the list of warnings.
    (Florian Angeletti, report by Kate Deplaix, review by Gabriel Scherer)

Compiler user-interface and warnings:

  • #10531: add naked_pointers to ocamlc -config exporting NAKED_POINTERS from
    Makefile.config.
    (Damien Doligez, review by Mark Shinwell and Gabriel Scherer)

  • #9116, #9118, #10582: Fix single-line source highlighting in the
    presence of tabs
    (Armaël Guéneau, review by Gabriel Scherer,
    split off from #9118 by Kate Deplaix, report by Ricardo M. Correia)

  • #10488: Improve type variable name generation and recursive type detection
    when printing type errors; this ensures that the names given to type variables
    are always reused in the following portion of the trace and also removes
    spurious “as 'a” bindings in types.
    (Antal Spector-Zabusky, review by Florian Angeletti)

  • #10794: Clarify warning 57 (Ambiguous or-pattern variables under guard)
    (Wiktor Kuchta, review by Gabriel Scherer)

Internal/compiler-libs changes:

  • #1599: add unset directive to ocamltest to clear environment variables before
    running tests.
    (David Allsopp, review by Damien Doligez and Sébastien Hinderer)

  • #8516: Change representation of class signatures
    (Leo White, review by Thomas Refis)

  • #9444: -dtypedtree, print more explicitly extra nodes in pattern ast.
    (Frédéric Bour, review by Gabriel Scherer)

  • #10337: Normalize type_expr nodes on access
    One should now use accessors such as get_desc and get_level to access fields
    of type_expr, rather than calling manually Btype.repr (which is now hidden
    in Types.Transient_expr).
    (Jacques Garrigue and Takafumi Saikawa,
    review by Florian Angeletti and Gabriel Radanne)

  • #10474: Force normalization on access to row_desc
    Similar to #10337. Make row_desc an abstract type, with constructor
    create_row and accessors defined in Types rather than Btype.
    A normalized view row_desc_repr is provided for convenience.
    (Jacques Garrigue and Takafumi Saikawa,
    review by Leo White and Florian Angeletti)

  • #10541: Make field_kind and commutable abstract, enforcing correct access
    (Jacques Garrigue and Takafumi Saikawa,
    review by Thomas Refis and Florian Angeletti)

  • #10575: add a -dump-dir flag, which redirects all debugging printer
    (-dprofile, -dlambda, …) to the target directory
    (Florian Angeletti, review by Thomas Refis and Gabriel Scherer)

  • (breaking change) #10627: Make row_field abstract
    Completes #10474 by making row_field abstract too.
    An immutable view row_field_view is provided, and one converts between it
    and row_field via inj_row_field and row_field_repr.
    (Jacques Garrigue and Takafumi Saikawa, review by Florian Angeletti)
  • #10433: Remove the distinction between 32-bit aligned and 64-bit aligned
    64-bit floats in Cmm.memory_chunk.
    (Greta Yorsh, review by Xavier Leroy)

  • #10434: Pun labelled arguments with type constraint in function applications.
    (Greta Yorsh, review by Nicolas Chataing and Nicolás Ojeda Bär)

  • #10470: Remove unused cstr_normal field from the constructor_description
    type
    (Nicolas Chataing, review by Gabriel Scherer)

  • #10382: Don’t repeat environment entries in Typemod.check_type_decl
    (Leo White, review by Gabriel Scherer and Florian Angeletti)

  • #10472: refactor caml_sys_random_seed to ease future Multicore changes
    (Gabriel Scherer, review by Xavier Leroy)

  • #10487: Move logic to get the type path from a constructor return type in
    Types
    (Nicolas Chataing, review by Jacques Garrigue)

  • #10555: Do not use ghost locations for type constraints
    (Nicolás Ojeda Bär, report by Anton Bachin, review by Thomas Refis)

  • #10598, #10616: fix an exponential blow-up when typechecking nested module
    types
    (Florian Angeletti, report and review by Stephen Dolan)

  • #10559: Evaluate signature substitutions lazily
    (Stephen Dolan, review by Leo White)

  • #8776, #10624: Fix compilation time regression introduced in 4.08
    (Nicolás Ojeda Bär, fix by Leo White, report by Alain Frisch, review by Thomas
    Refis)

  • #10618: Expose more Pprintast functions
    (Guillaume Petiot, review by Gabriel Scherer)

  • #10637: Outcometree: introduce a record type for constructors
    (Gabriel Scherer, review by Thomas Refis)

  • #10516: refactor the compilation of the ‘switch’ construct
    (Gabriel Scherer, review by Wiktor Kuchta and Luc Maranget)

  • #10670: avoid global C state in the RE engine for the “str” library
    (Xavier Leroy, review by Gabriel Scherer)

  • #10678: Expose descriptions in Warnings module
    (Leo White, review by Gabriel Scherer and Alain Frisch)

  • #10690: Always build ocamltoplevel.cmxa
    (David Allsopp, review by Gabriel Scherer)

  • #10692: Expose Parse.module_type and Parse.module_expr
    (Guillaume Petiot, review by Gabriel Scherer)

  • #10714: Add X86_proc.with_internal_assembler for temporarily changing the
    assembler used by the backend.
    (David Allsopp, review by Gabriel Scherer)

  • #10715: Allow the assembler and loader to be substituted in ocamlnat, for
    example to be replaced with a binary emitter.
    (David Allsopp and Nathan Rebours, review by Louis Gesbert,
    Nicolás Ojeda Bär and Gabriel Scherer)

  • #10742: strong call-by-need reduction for shapes
    (Gabriel Scherer and Nathanaëlle Courant,
    review by Florian Angeletti, Ulysse Gérard and Thomas Refis)

Build system:

  • #10828: Build native-code compilers on OpenBSD/aarch64
    (Christopher Zimmermann)

  • #10835: Disable DT_TEXTREL warnings on x86 32 bit architecture by passing
    -Wl,-z,notext in mksharedlib and mkmaindll. Fixes relocation issues, reported
    in #9800, making local patches in Debian, Alpine, and FreeBSD superfluous.
    (Hannes Mehnert with Kate Deplaix and Stéphane Glondu, review by Xavier Leroy)

  • #10717: Simplify the installation of man pages
    (Sébastien Hinderer, review by David Allsopp)

  • #10739: Stop installing extract_crc
    (Sébastien Hinderer, review by David Allsopp, Daniel Bünzli, Xavier Leroy
    and Gabriel Scherer)

  • #10797: Compile with -d2VolatileMetadata- on supporting versions of Visual
    Studio. This suppresses the addition of .voltbl sections and eliminates
    linking errors in systhreads.
    (David Allsopp, review by Jonah Beckford and Sébastien Hinderer)

Bug fixes:

  • #9214, #10709: Wrong unmarshaling of function pointers in debugger mode.
    This was causing ocamldebug to crash when running some user-defined printers.
    (Xavier Leroy, report by Rehan Malak, review by Gabriel Scherer and
    Vincent Laviron)

  • #10473: Add CFI directives to RISC-V runtime and asmcomp.
    This allows stacktraces to work in gdb through C and OCaml calls.
    (Edwin Török, review by Nicolás Ojeda Bär and Xavier Leroy)

  • #10539: Field kinds should be kept when copying types
    Losing the sharing meant that one could desynchronize them between several
    occurrences of self, allowing a method to be both public and hidden,
    which broke type soundness.
    (Jacques Garrigue, review by Leo White)

  • #10542: Fix detection of immediate64 types through unboxed types.
    (Leo White, review by Stephen Dolan and Gabriel Scherer)

  • #10590: Some typechecker optimisations
    (Stephen Dolan, review by Gabriel Scherer and Leo White)

  • #10633: Stack overflow recovery in ocamlopt for AMD64/Linux and ARM/Linux
    was not restoring the minor heap pointer correctly
    (Stephen Dolan, review by Xavier Leroy)

  • #10659: Fix freshening substitutions on imported modules
    (Leo White and Stephen Dolan, review by Matthew Ryan)

  • #10677, #10679: Fix detection of CC as gcc in configure (allow for
    triplet-prefixed GCC) and fix all C compiler detection when CC is a path
    rather than a basename.
    (David Allsopp, report by Fabian @copy, review by Gabriel Scherer)

  • #10690: Add --enable-native-toplevel to configure to enable installing
    ocamlnat as part of the main build (default is not to install it)
    (David Allsopp, review by Gabriel Scherer)

  • #10693: Fix ident collision in includemod
    (Leo White, review by Matthew Ryan)

  • #10702: Fix cast of more strictly aligned pointer in win32unix
    implementation of stat
    (Antonin Décimo, review by David Allsopp)

  • #10712: Type-check toplevel terms in the native toplevel in the same way as
    the bytecode toplevel. In particular, this fixes the loss of type variable
    names in the native toplevel.
    (Leo White, review by David Allsopp and Gabriel Scherer)

  • #10735: Uncaught unify exception from build_as_type
    (Jacques Garrigue, report and review by Leo White)

  • #10763, #10764: fix miscompilation of method delegation
    (Alain Frisch, review by Vincent Laviron and Jacques Garrigue)

  • #10822, #10823: Bad interaction between ambivalent types and subtyping
    coercions (Jacques Garrigue, report and review by Frédéric Bour)

  • #10836, #10952: avoid internal typechecker errors when checking signature
    inclusion in presence of incompatible types.
    (Florian Angeletti, report by Craig Ferguson, review by Gabriel Scherer)

  • #10849: Display the result of let _ : <type> = <expr> in the native
    toplevel, as in the bytecode toplevel.
    (David Allsopp, report by Nathan Rebours, review by Gabriel Scherer)

  • #10853: Obj.reachable_words could crash if called after a marshaling
    operation in NO_SHARING mode.
    (Xavier Leroy, report by Anil Madhavapeddy, review by Alain Frisch)

  • #10907, #10959: Wrong type inferred from existential types
    (Jacques Garrigue and Gabriel Scherer, report by @dyzsr, review by Leo White)

  • #10688: Move frame descriptor table from rodata to data section on
    RISC-V. Improves support for building DLLs and PIEs. In particular, this
    applies to all binaries in distributions that build PIEs by default (eg
    Gentoo and Alpine).
    (Alex Fan, review by Gabriel Scherer)

  • #11031: Exception handlers restore the rbp register when using frame-pointers
    on amd64.
    (Fabrice Buoro, with help from Stephen Dolan, Tom Kelly and Mark Shinwell,
    review by Xavier Leroy)

  • #11025, #11036: Do not pass -no-pie to the C compiler on musl/arm64
    (omni, Kate Deplaix and Antonio Nuno Monteiro, review by Xavier Leroy)

  • #11101, #11109: A recursive type constraint fails on 4.14
    (Jacques Garrigue, report and review by Florian Angeletti)

  • #11118: Fix integer overflow on 64-bit Windows when indexing bigarrays (which
    could lead to a segmentation fault).
    (Roven Gabriel, review by Nicolás Ojeda Bär and Xavier Leroy)

12 Likes

Sometimes I do want to know, not the original point at which a name was defined, but the point at which it was added to the current namespace. Will it remain possible to do this? I guess as far as the compiler is concerned, the question is: does the compiler provide resources for merlin to do this?

2 Likes

Thank you everyone involved! As a happy OCaml user I appreciate your hard work.

8 Likes

Everything that was possible before should remain possible: we are only adding new metadata information. At the same time, I am not sure that Merlin can completely track the introduction of names in scope, since, for instance, the identifiers don’t carry the location of the open that introduced them.

1 Like

Thanks to everyone involved! I’m really excited for this release overall, and especially for the allocation-free UTF decoding, the new channel functions and TMC.

I just want to point out that the right annotation for TMC, as of this release, is actually:

let[@tail_mod_cons] rec map f = function
| [] -> []
| x :: xs -> f x :: (map[@tailcall]) f xs 
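
As a rough sanity check of that annotation, here is a minimal sketch assuming OCaml 4.14 or later. The list size and the name `big` are arbitrary choices for illustration, not something from the release notes. With the TMC transformation applied, the recursive call runs in constant stack space, so a list this long should not exhaust the stack the way a naive non-tail-recursive map may.

```ocaml
(* Same definition as above: [@tail_mod_cons] asks the compiler to apply
   the tail-modulo-cons transformation, and [@tailcall] asserts that the
   recursive call really ends up in tail position after it. *)
let[@tail_mod_cons] rec map f = function
  | [] -> []
  | x :: xs -> f x :: (map[@tailcall]) f xs

(* Illustrative input only: one million elements. *)
let big = map succ (List.init 1_000_000 Fun.id)

let () = assert (List.length big = 1_000_000)
```

The `[@tailcall]` annotation is optional, but it makes the compiler warn if the call turns out not to be a tail call modulo cons.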

4 Likes

I have fixed the TMC attribute in my example, thanks for noticing this mistake!

Successfully installed on WSL2 + Ubuntu-20.04 from the zip source archive.
My favorite new feature so far: immediate objects no longer require parentheses 🙂

This is a useful set of additions, but to save users having to provide their own aliases I suggest that the Seq module should provide an alias from Seq.flat_map to the let* operator, from Seq.map to the let+ operator and from Seq.zip to the and+/and* operator. These operators can be useful as they make it straightforward to manipulate multiple sequences whilst retaining their laziness.
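
To make the suggestion concrete, here is a minimal sketch of the aliases a user would currently write themselves. The module name `Seq_syntax` and the sample sequences are hypothetical; the Seq module in 4.14 does not define these operators, and only `Seq.map`, `Seq.flat_map`, `Seq.zip` and `Seq.ints` are taken from the standard library.

```ocaml
(* Hypothetical user-side module: these operators are not part of Seq. *)
module Seq_syntax = struct
  let ( let* ) s f = Seq.flat_map f s
  let ( let+ ) s f = Seq.map f s
  let ( and+ ) = Seq.zip
  let ( and* ) = Seq.zip
end

(* Zip an infinite sequence with a finite one. Nothing is forced until
   the result is consumed, and Seq.zip stops at the shorter input. *)
let pairs =
  let open Seq_syntax in
  let+ i = Seq.ints 0
  and+ s = List.to_seq [ "a"; "b"; "c" ] in
  (i, s)

(* Nested iteration over two sequences through let* (flat_map). *)
let products =
  let open Seq_syntax in
  let* x = List.to_seq [ 1; 2 ] in
  let+ y = List.to_seq [ 10; 20 ] in
  x * y

let () =
  assert (List.of_seq pairs = [ (0, "a"); (1, "b"); (2, "c") ]);
  assert (List.of_seq products = [ 10; 20; 20; 40 ])
```

Since `and+` is defined as `Seq.zip`, it pairs elements positionally rather than taking a Cartesian product; the nested `let*` form is the one to use for the latter.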

(One could make similar proposals about Result.bind and so on, and Option.bind and so on: I have never fully understood why aliases for these have not been provided.)

See “Provide let operators in the standard library” by lpw25 (ocaml/ocaml pull request #2170 on GitHub).

1 Like

Q: Should this be announced on https://ocaml.org/ ? Or are you waiting a few days intentionally?

Thanks for the explanation.

Now that this shape information is available in the cmt file, just curious – is the 4.14 version of merlin actually using it? (A 4.14 version of merlin is currently available as a preview from opam.)

The update on https://ocaml.org is ongoing. However, the release page (OCaml Releases) has already been updated with the 4.14 release. (There is some latency on the site update, and I made some mistakes with the initial update that slowed the process.)

As far as I remember, the preview version of merlin for 4.14 is not using shapes currently, partly because we only settled on the right optimization algorithm for computing shapes at the end of the last alpha release of 4.14. However, if I am not mistaken, the current plan for Merlin is to have shape support in the first non-preview version for 4.14.

3 Likes