A next-generation IDE for OCaml

Thanks for the additional feedback @Pavel.Mikhailovskii

I do understand you’re only at the research stage, but the prospect of having an OCaml-centric next-gen IDE sounds very appealing to me.

I don’t have a clear picture of what you have in mind, but it sounds exciting nonetheless :slight_smile:

I’ll send you an email so we can talk more.

I am puzzled… the state of the art in collaborative programming is to use some Git-based platform (GitHub, GitLab…). How will a structured set of data be managed by Git?

However, I would welcome an IDE that deals with text files but keeps a structured representation of them, displays that structure (a tree of modules, with values at the leaves…), and lets you act on it as such (right click, add function, declare the function public, i.e. add its signature to the .mli file, etc.).

In any case, I guess that an OCaml IDE from a major brand will make OCaml more visible… that would be good news. (I note that F# is already supported.)

I’ve been a JetBrains user for most of my career (IntelliJ, RubyMine and CLion). While we have a quite decent OCaml LSP today, it sits within a text editor (VS Code) that has an independently developed set of plugins. Frankly, those plugins don’t work well together. In contrast, I have found JetBrains IDEs to be coherent, even when mixing multiple languages, which is a natural outcome only when the design comes from a single vendor.

My two cents: I think you’ll end up leveraging OCaml’s strengths if the AI and UI are generating/editing declarative documents, and plain OCaml is used for the logic + universal code translation. The OCaml toplevel is the code you’ll want to extend for the declarative pieces (reify the documents as modules, add some include module expressions to mixin your own behaviors, and implement a save/restore of the in-memory toplevel environment to fill out the Smalltalk experience). A concrete example of the code translation is something along the lines of my post OCaml as the Universal Translator; very difficult to pull off in other languages. Anyway, I’m biased but I think you should end up in a very good spot if you use OCaml.

Check out 🗺 A tour of Unison · Unison programming language

Remember: Unison code is not saved as text-based file content. That’s why we need a tool that lets us change and run Unison programs.

I’m interested in the design principles of https://radicle.xyz/. I don’t know how much importance JetBrains is giving to the cloud. In principle, if you have radicle-style distributed protocols in place for exchanging repository metadata, there’s nothing stopping you from using top-notch local-first software that makes the experience seamless, including commercial solutions I suppose.

Unison is a very interesting project, very well thought through. I’m a bit afraid that they have too many ideas and target too narrow a niche (a very specific model of distributed computation). Also, the Haskell-like syntax might be a factor hindering adoption. Their chat-like editing mode could suddenly become very useful in AI-assisted coding.

@Pavel.Mikhailovskii That post is very exciting!

I’d be very interested if such an IDE included integrated functionality related to collaboration and code review. I am curious whether you’re familiar with Jane Street’s Iron explorer IDE? (blog post). A quote from that post:

The truth is that after writing code at work this way for a while, you start to wish that you had something similar at home. But there’s nothing quite like it for Git and Github and editors like Vim or Textmate or VS Code.

I’ve had first-hand experience with that Jane Street system in the past, and I agree with the author on that.

I’m doing some experiments on the review part (I posted in this forum about it here). I don’t have a UI yet, and even though I prefer its current very basic terminal UI over web-based workflows, I personally feel a big drop in my efficiency compared to what I remember from using Jane Street’s IDE. If a futuristic IDE targeting the OCaml ecosystem were to emerge, I’d want to see if it can be made to fill that gap.

We’re still deciding between two options:

  • using our own solution for versioning, a versioned graph/linked documents database supporting Git-style branching
  • using a structured data format and versioning it using a traditional VCS in combination with custom-tailored diff and merge tools.

The most important difference from the traditional textual representation of code is that a structured format allows us to store symbolic references in resolved form and also to store all kinds of additional information, such as inferred types.

This is very interesting feedback. I would love to discuss your ideas in more detail.
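To make "references in resolved form" a bit more concrete, here is a minimal sketch of what such a structured representation might look like. Everything in it (`def_id`, `expr`, `definition`, `rename`) is a hypothetical illustration, not the actual format under consideration. The point it shows is that when references point at stable identifiers rather than names, a rename only touches the definition itself.

```ocaml
(* Hypothetical: every definition carries a stable identifier. *)
type def_id = Def of int

type expr =
  | Int of int
  | Ref of def_id          (* resolved reference: an id, not a name *)
  | Apply of expr * expr

type definition = {
  id : def_id;
  name : string;           (* display name only, never used for lookup *)
  inferred_type : string;  (* cached alongside the code *)
  body : expr;
}

(* Renaming changes one record; bodies that refer to it stay untouched. *)
let rename id new_name defs =
  List.map (fun d -> if d.id = id then { d with name = new_name } else d) defs

let defs =
  [ { id = Def 1; name = "one"; inferred_type = "int"; body = Int 1 };
    { id = Def 2; name = "also_one"; inferred_type = "int"; body = Ref (Def 1) } ]

let renamed = rename (Def 1) "uno" defs
```

With a textual representation, the same rename would have to rewrite every occurrence of the name and re-resolve scoping.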

Presumably you wouldn’t get much value out of storing information that can be derived from the traditional textual representation, would you? (e.g. assuming incremental builds and caching techniques). I’m curious what you envision could be stored that is not expressible in the source.

custom-tailored diff and merge tools

If you have the luxury of a fresh start, you’d probably want to have first-class support for 4-way diffs (diff4s). Tools are usually good up to 3-way diffs, but when you are reviewing a conflict after the fact, the resolution that your peer is proposing constitutes a fourth node that you need to take into account. If you consider the result of comparing each pair in the diff4 diamond according to some code-equivalence relation, the number of diff4 classes that you can get that way is B_4 (the fourth Bell number, i.e. 15), each conveying a slightly different story as to what happened during resolution.

I worked on textual diff4 algorithms in the first quarter of this year and find it quite an interesting topic! I would imagine the generalization of this problem to structural diffs to be quite fascinating.
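As a small illustration of where B_4 comes from: an equivalence relation on four nodes is the same thing as a way of grouping them into classes, so counting diff4 classes amounts to enumerating set partitions of a 4-element set. The sketch below does that by brute force. The node names are placeholders I picked for the example, not terminology from any particular tool.

```ocaml
(* Enumerate all set partitions of a list: every way of grouping its
   elements into non-empty equivalence classes (blocks). *)
let rec partitions = function
  | [] -> [ [] ]
  | x :: rest ->
    List.concat_map
      (fun p ->
        (* either x forms a new block on its own ... *)
        let alone = [ x ] :: p in
        (* ... or x joins each existing block in turn *)
        let joined =
          List.mapi
            (fun i _ ->
              List.mapi (fun j block -> if i = j then x :: block else block) p)
            p
        in
        alone :: joined)
      (partitions rest)

(* Placeholder names for the four nodes of the diamond. *)
let nodes = [ "base"; "ours"; "theirs"; "resolution" ]
let classes = partitions nodes

(* Prints 15, the Bell number B_4. *)
let () = Printf.printf "%d\n" (List.length classes)
```

For instance, the partition where all four nodes share one block is the "nothing changed" case, while the one where "ours" and "resolution" share a block means the peer simply kept one side.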

Very interesting! I’ve never heard about 4-way diffs before. Could you recommend any article explaining this idea?

I remember searching the internet a while ago for diff viewers (some graphical) with support for four-panel views, without great success. I would need to do another search to determine if the landscape has changed. I think the git-range-diff documentation probably has relevant content too. Iron (Apache-2.0) is a repo I was familiar with in the past. It contains code related to computing diff4 classes.

Not a very useful reply, but more of an encouragement: I, for one, am very excited about a JetBrains product for OCaml and would definitely pay for a good experience.

I love JetBrains products, but because my main language is OCaml, I never quite found the need to pay for one of them.

Make it ReasonML compatible, please.

Thank you.

Not super enthused by the AI stuff, so it would be nice to be able to turn that off if possible.

The Smalltalk-style IDE, on the other hand, definitely makes me excited – I’ve always been hopeful to see that integrated with a typed programming language! I’m really curious to see how you approach it. Glad to see you’ve also taken note of Glamorous Toolkit… I’ve found their work fascinating, but unfortunately haven’t delved too deep into it (I find it hard to stray from the comfort of type systems).

Great news! Could you please elaborate on the role of OCaml in the new AI-oriented stack?

I do not have any experience with the interesting world of Smalltalk IDEs like Pharo, and I have almost no experience with JetBrains (a bit of PyCharm, a bit of CLion), but I have been using LLMs for development quite a bit recently.

The strength of OCaml here, in my opinion, is the strong abstraction provided by modules and signatures. More importantly for LLMs, interfaces and implementations are split into two separate textual representations, which makes it easier for an LLM to work on each in isolation.

Here are some things that tend to work well:

  • not specific to OCaml, but LLMs are killing it at writing documentation. With OCaml and a good prompt, an LLM can usually do a good enough job of documenting an interface without seeing the implementation.
  • writing a user-friendly interface from an implementation
  • writing tests
  • writing an implementation from an interface. This is not magic, but with enough documentation it can provide a good skeleton.

The nice thing about modules with signatures is that, even with a long context, they help restrict the number of tokens the LLM has to work with in the codebase, since it can look only at interfaces instead of the details of the implementation, which I suspect is why you mention Smalltalk?

So all of this works relatively OK with, say, VS Code. But the UX is not specialized for this workflow. If the IDE could internally link interface “objects” to implementation “objects”, offer higher-level manipulation of code objects like functions, and integrate that with the LLM UX, it could provide a very nice workflow.

All of this is already roughly possible with selections of text objects in VS Code plus some LSP magic, but this still operates mostly on text and involves manual actions. I suspect the UX could be a lot more polished.
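For readers less familiar with OCaml, here is a minimal sketch of that interface/implementation split. The `LIFO` signature and `Lifo` module are made-up examples. In a real project the signature would live in `lifo.mli` and the implementation in `lifo.ml`; they are combined here only so the snippet can be pasted into a toplevel.

```ocaml
(* Interface: all that a caller (or an LLM) needs to read. *)
module type LIFO = sig
  type 'a t
  (** An immutable stack of ['a] values. *)

  val empty : 'a t
  val push : 'a -> 'a t -> 'a t

  val pop : 'a t -> ('a * 'a t) option
  (** [pop s] returns the top element and the remaining stack,
      or [None] if [s] is empty. *)
end

(* Implementation: hidden behind the signature, free to change. *)
module Lifo : LIFO = struct
  type 'a t = 'a list
  let empty = []
  let push x s = x :: s
  let pop = function [] -> None | x :: s -> Some (x, s)
end
```

Because `'a t` is abstract in the signature, code written against `LIFO` cannot depend on the list representation, so either side can be regenerated or rewritten without looking at the other.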

You get exactly to the point!

The OCaml stack seems to be a great basis for AI-assisted programming because of its simplicity, its clear separation between abstractions and implementations, and its explicit context.

As I said before, we are at a preliminary research stage. To make this vision come true, we need to convince our management that such a product would have sufficient market potential. Ideally, we would like to collect a number of business cases in the form “company X would love to use this product and they’re particularly interested in features Y and Z”.

I would really appreciate it if you (or anyone who’s going to read this) would help us with collecting such business cases.

OCaml has a number of unique features that make it particularly fitting for AI-assisted coding: simplicity, strict semantics, a powerful but pragmatic type system, and modularity. To be more precise, we are trying to create a “Meta-OCaml”, a new high-level interface to the language infrastructure that simplifies working with modules and signatures and utilizes the full power of modern IDEs and AI assistants. For example, we’re going to store inferred types together with code, display them when needed, and notify the user if changes in the implementation change the inferred signature.
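A rough sketch of what "notify the user if the inferred signature changes" could boil down to, assuming the stored types are kept as a simple name-to-printed-type table. The `snapshot` and `change` types and `diff_snapshots` are hypothetical names for illustration; a real implementation would compare types structurally rather than as strings.

```ocaml
(* Hypothetical stored form: value name -> printed inferred type. *)
type snapshot = (string * string) list

type change =
  | Added of string * string             (* name, new type *)
  | Removed of string * string           (* name, old type *)
  | Changed of string * string * string  (* name, old type, new type *)

(* Compare the signature stored with the code against a freshly inferred one. *)
let diff_snapshots (old_sig : snapshot) (new_sig : snapshot) : change list =
  let removed_or_changed =
    List.filter_map
      (fun (name, old_ty) ->
        match List.assoc_opt name new_sig with
        | None -> Some (Removed (name, old_ty))
        | Some new_ty when new_ty <> old_ty ->
          Some (Changed (name, old_ty, new_ty))
        | Some _ -> None)
      old_sig
  in
  let added =
    List.filter_map
      (fun (name, ty) ->
        if List.mem_assoc name old_sig then None else Some (Added (name, ty)))
      new_sig
  in
  removed_or_changed @ added

let before = [ ("length", "'a list -> int"); ("first", "'a list -> 'a") ]
let after = [ ("length", "'a list -> int"); ("first", "'a list -> 'a option") ]

(* Only "first" is reported, since its inferred type changed. *)
let changes = diff_snapshots before after
```

An IDE could surface each `change` as a notification next to the edited definition.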

To make this vision possible, we first need to convince our management that it could be turned into a successful product. Ideally, we want to collect a number of use-cases in the form “company X would be very interested in using such a product”.

I would be very thankful if you could help us collect such evidence.

What exactly are you mostly interested in?
Targeting JavaScript, compatibility with NPM, ReasonML syntax, first-class support for front-end development?