Jane Street, compiler development, and open-source

@reisenberg wrote a summary of our recent efforts on the compiler, and @sid asked a question about how this impacts Jane Street’s publicly released software:

How would the newer versions of the libraries that progressively use more and more of custom language be made available to the OCaml community?

This is a good question, given that it may well take years to upstream some of our compiler changes, and some of them might not be accepted by upstream at all.

In some sense, we’re in this world already. We’ve already made changes to Base, our standard library, to use the local mode in order to better support stack allocation. And this has already made it into our public release, as you can see if you look for the [@local] annotations in the code.

Our strategy here was to use the new feature in a way that doesn’t break the syntax, so it could be smoothly included in our public code. This works for modes because modes can be erased without changing the semantics of the code.

This approach isn’t always possible. Consider the include functor syntax that was mentioned in @reisenberg’s post:

module List = struct
  type 'a t =
  | Nil
  | Cons of 'a * 'a t

  let mapi t ~f = ...

  include functor Make_map
end

This can’t just be desugared away, as it happens, and so we simply block ourselves from using this feature within publicly released code.
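For comparison, here is a minimal sketch of the hand-written pattern that stock OCaml code commonly uses to get a similar effect without include functor: the definitions are wrapped in an inner module, which is then included and passed to the functor explicitly. The body of Make_map, its input signature, and the implementation of mapi are assumptions made up for illustration; the original post elides them.

```ocaml
(* Hypothetical stand-in for Make_map: derives [map] from [mapi].
   Its input signature is an assumption, not taken from the post. *)
module Make_map (M : sig
    type 'a t

    val mapi : 'a t -> f:(int -> 'a -> 'b) -> 'b t
  end) =
struct
  let map t ~f = M.mapi t ~f:(fun _ x -> f x)
end

module List = struct
  (* Inner module holding everything defined "so far". *)
  module T = struct
    type 'a t =
      | Nil
      | Cons of 'a * 'a t

    (* Illustrative implementation; the post only shows [...]. *)
    let mapi t ~f =
      let rec go i = function
        | Nil -> Nil
        | Cons (x, rest) ->
          let y = f i x in
          Cons (y, go (i + 1) rest)
      in
      go 0 t
  end

  include T

  (* Explicit application, playing the role of [include functor Make_map]. *)
  include Make_map (T)
end

let () =
  let doubled =
    List.map (List.Cons (1, List.Cons (2, List.Nil))) ~f:(fun x -> x * 2)
  in
  assert (doubled = List.Cons (2, List.Cons (4, List.Nil)))
```

The extra T module and the repeated include are the boilerplate that include functor is meant to remove, which is why avoiding the feature in public code has a real cost.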

There’s an obvious tension here, since blocking ourselves from using new features makes it more painful to open source things. Part of our intent here is to prioritize the upstreaming of things that cause these problems. include functor is an example of a feature that’s pretty small and that we think is generally quite useful, so we intend to propose it upstream soon.

I suspect unboxed types will be more like include functor than like modes, in that it’s going to be hard to use them in a way that’s compatible with the public release. We haven’t yet really worked through what the tradeoffs will be there.

In any case, our public release code is important to us, and we’re going to continue to release new versions. We value the fact that it’s helpful to the community, but it’s also more directly valuable to us. In particular, it makes it easier for us to reuse work done by other people, since people spend the time and effort to build libraries that interoperate well with our code (notably, with Async). Our public release also helps people take our usage patterns into account when working on the compiler or other important community tools like Merlin or OCamlformat.


12 Likes

Thanks for your reply!

Another reason to keep releasing these libraries is that you end up having a wider base of users for your code. Your code is run under conditions and in ways that differ from what would typically happen within Jane Street.

These users, I’m sure, occasionally find and report bugs, which leads to a general improvement in the robustness of the Jane Street codebase. This bug-finding community might get lost if your libraries don’t remain compatible with the vanilla OCaml distribution.

I read your reply carefully – it seems that things will generally be on a “best effort” basis and you are still figuring out your long term strategy here.

Generally speaking, the enhanced emphasis on the internal custom compiler toolchain at Jane Street has come as a bit of a surprise to me personally. While I never used to think twice about a Jane Street dependency in a package, I may have to spare it some thought now.

I hope that you can see this is a logical and reasonable concern. Will Jane Street be committed to having compatible libraries in 2, 3, or 4 years’ time, when the “delta” between the internal Jane Street compiler and the vanilla OCaml distribution has increased substantially?

Finally, I would like to end by thanking you for spearheading Jane Street’s open source contributions – these contributions are much appreciated. I am just a bit concerned that open sourcing its work is going to become more inconvenient for Jane Street in the future, given an ever-increasing “delta”. We all know how fast paced software development can be in an organization – and if the pain to open source is too high, it’s likely open sourcing will not happen…

1 Like

I think you read me wrong. We don’t know all the trade-offs we’ll make, but we’re not going to yank back the public release. I don’t think you should view our packages as less reliably available than they have been in the past.

Indeed, we just added a person to focus specifically on the public release process, so I expect this to get better and more reliable, not worse.

I think you’re assuming that the delta is just going to grow without limit, and we’re working hard to avoid that. We’re basically only making changes we think can make it upstream, and we’re starting to spend more and more time communicating about our proposals and soliciting feedback. All in, I’m cautiously optimistic.

An important point here is that, while we’re committed to fixing the limitations we currently see in OCaml, we’re not committed to the precise way we’re solving them. The fact that we have control over our internal codebase means we’re well positioned to make significant changes to our designs in response to critiques from upstream. That means we won’t be trying to jam through changes that we feel desperately committed to. I think that increases the chance that this will all go well.

Your point about things changing fast is true, but the public-release process is deeply embedded in our development process; people more or less can’t release features that break the public release, so we’re not going to just trip into breaking it.

All in, my sense is that our new approach reduces pressure on the relationship with upstream, rather than increasing it. The way we’re operating now, we can solve our most critical problems quickly, and when we submit changes upstream, they’ll be better worked out, and come with better evidence of their value.

Overall, this makes me more optimistic about our ability to make progress in a way that works smoothly with the broader OCaml world. In particular, the probability of a messy break between Jane Street’s version of OCaml and upstream seems to me lower than it was.


9 Likes

For whatever it’s worth, I just want to echo @Yaron_Minsky’s points here. As we’re designing new features, we’re very mindful of the fact that they will be deployed far beyond our walls. This is really a consideration from the very beginning: anything that we can’t incorporate into Base/Core, or that is backward-incompatible, is just a complete no-go.

6 Likes