How to create a compiler from some language to OCaml?

1/ Mainly, I would like to compile some expression in language A to an expression in OCaml.

How should I go about it, and which components of the existing OCaml infrastructure should I use to benefit fully from what already exists? Presumably I need the grammar of the language expressed in OCaml, plus some pattern-matching “landing structures” that I would have to fill in to translate from A to OCaml.
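
As a rough illustration of the “grammar expressed in OCaml plus pattern matching” idea, here is a minimal sketch. The type `expr_a` and its constructors are hypothetical stand-ins for language A, which is not described in this thread, and the translation simply emits OCaml source text as a string.

```ocaml
(* Hypothetical AST for a tiny fragment of "language A". *)
type expr_a =
  | Num of int
  | Var of string
  | Add of expr_a * expr_a
  | Mul of expr_a * expr_a
  | Let of string * expr_a * expr_a

(* Translate an A expression to OCaml source text. Every compound
   expression is parenthesised, so operator precedence never matters. *)
let rec to_ocaml (e : expr_a) : string =
  match e with
  | Num n -> if n < 0 then Printf.sprintf "(%d)" n else string_of_int n
  | Var x -> x
  | Add (a, b) -> Printf.sprintf "(%s + %s)" (to_ocaml a) (to_ocaml b)
  | Mul (a, b) -> Printf.sprintf "(%s * %s)" (to_ocaml a) (to_ocaml b)
  | Let (x, v, body) ->
    Printf.sprintf "(let %s = %s in %s)" x (to_ocaml v) (to_ocaml body)

let example = Let ("x", Num 2, Add (Var "x", Mul (Num 3, Var "x")))

(* Prints: (let x = 2 in (x + (3 * x))) *)
let () = print_endline (to_ocaml example)
```

Each constructor of the variant plays the role of a production rule, and each `match` arm is one of the “landing structures” to fill in.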

2/ Optionally, I would like to compile some expression in OCaml to an expression in language B.
Same question.

I know what a compiler and an interpreter are, but I don’t know OCaml’s internals. I really want to do this in a straightforward way, without reinventing the wheel.
Thanks.

For the first part, you could try compiling your language into an OCaml-like internal representation of programs and use malfunction (GitHub: stedolan/malfunction, “Malfunctional Programming”) to get a binary.

Starting with 2/ (OCaml to your language), I’d suggest forgetting it for now. If you really want to try, making your own OCaml-like parser with only the features you care about is an option; if full compatibility is required, you can use compiler-libs to get a Parsetree (for example a Parsetree.structure) from a source file and translate from that. But the OCaml language is quite complex, so you’re looking at a lot of work.

The other way is more interesting. If you’re only interested in benefitting from the OCaml compiler, then you might want to look at the malfunction opam package; it provides a language that is a bit simpler than the source language, and that will be fed back to the OCaml compiler as one of its internal representations. If what you want is to generate source OCaml code, then I’m less familiar with the tools available but the ppx ecosystem is full of tools for generating and printing valid OCaml code, so you should probably investigate ppxlib and related packages.
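
A minimal sketch of the compiler-libs route, assuming the program is linked against the `compiler-libs.common` library (for example through ocamlfind or a dune `libraries` field). `Parse.implementation` returns a `Parsetree.structure`, and `Pprintast` prints a Parsetree back as source text. The compiler-libs API is not guaranteed to stay stable between OCaml versions, so treat this as illustrative only.

```ocaml
(* Build with compiler-libs.common, e.g.
   ocamlfind ocamlopt -package compiler-libs.common -linkpkg demo.ml *)
open Parsetree

let source = "let double x = x * 2\nlet four = double 2\n"

(* Source text -> Parsetree.structure (a list of structure items). *)
let ast : structure = Parse.implementation (Lexing.from_string source)

(* Collect the names bound by simple top-level [let name = ...] items. *)
let toplevel_names : string list =
  List.concat_map
    (fun (item : structure_item) ->
      match item.pstr_desc with
      | Pstr_value (_, bindings) ->
        List.filter_map
          (fun (vb : value_binding) ->
            match vb.pvb_pat.ppat_desc with
            | Ppat_var name -> Some name.Location.txt
            | _ -> None)
          bindings
      | _ -> [])
    ast

let () =
  List.iter print_endline toplevel_names;
  (* Parsetree -> source text again. *)
  print_endline (Pprintast.string_of_structure ast)
```

A translator to another language would replace the name-collecting function with a much larger `match` over the Parsetree constructors, which is where most of the work lies.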

I don’t want to produce a binary from my expression/program in language A.
I need to produce OCaml (source code) from it.

There is some stuff that I could manually redefine in OCaml (in terms of modules, etc.).
But most of the stuff is defined as classes and relationships with attached code or pseudo code.
There should be some mapping between these classes and OCaml modules (or classes), and between the relationships and some product types (tuples, records, etc.).
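
One way to picture such a mapping is sketched below: a class description becomes an OCaml record type, and a relationship becomes a tuple type. The `class_desc` and `relation` types, and the choice of records and tuples, are assumptions made for illustration; nothing here reflects how language A actually describes its classes.

```ocaml
(* Hypothetical description of a class in language A:
   a name plus a list of (attribute name, OCaml type name) pairs. *)
type class_desc = { cname : string; attrs : (string * string) list }

(* Hypothetical binary relationship between two classes. *)
type relation = { rname : string; source : string; target : string }

(* OCaml type names must start with a lowercase letter. *)
let type_name = String.uncapitalize_ascii

(* A class becomes a record type declaration
   (assumes the class has at least one attribute). *)
let record_of_class (c : class_desc) : string =
  let field (n, t) = Printf.sprintf "%s : %s" n t in
  Printf.sprintf "type %s = { %s }" (type_name c.cname)
    (String.concat "; " (List.map field c.attrs))

(* A relationship becomes a product (tuple) type. *)
let tuple_of_relation (r : relation) : string =
  Printf.sprintf "type %s = %s * %s" (type_name r.rname)
    (type_name r.source) (type_name r.target)

let person = { cname = "Person"; attrs = [ ("name", "string"); ("age", "int") ] }
let company = { cname = "Company"; attrs = [ ("title", "string") ] }
let works_for = { rname = "WorksFor"; source = "Person"; target = "Company" }

let () =
  List.iter print_endline
    [ record_of_class person; record_of_class company;
      tuple_of_relation works_for ]
```

The code or pseudocode attached to the classes is not covered by this sketch and would need its own translation step.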

The issue is that I don’t have the infrastructure in language A to build an AST from an expression, so I need to use the OCaml infrastructure I’m familiar with to do that.
I believe that I should parse expressions in language A (which can be described in XML) to produce OCaml source code. And I think that this first requires a description of the language (a grammar → production rules).
Does that make sense to you?
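
The pipeline described here (XML description in, AST in the middle, OCaml source out) might look like the following sketch. It assumes some XML library has already turned the document into a generic tree. The `xml` type, the element names `num`, `var` and `add`, and the `value` and `name` attributes are all invented for the example.

```ocaml
(* Generic XML tree, assumed to be produced by an XML parsing library. *)
type xml =
  | Element of string * (string * string) list * xml list
  | Text of string

(* Hypothetical AST for language A. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr

(* Step 1: XML tree -> AST. Unknown shapes are reported, not guessed. *)
let rec expr_of_xml (x : xml) : (expr, string) result =
  match x with
  | Element ("num", attrs, []) -> (
    match Option.bind (List.assoc_opt "value" attrs) int_of_string_opt with
    | Some n -> Ok (Num n)
    | None -> Error "num: missing or invalid value attribute")
  | Element ("var", attrs, []) -> (
    match List.assoc_opt "name" attrs with
    | Some v -> Ok (Var v)
    | None -> Error "var: missing name attribute")
  | Element ("add", _, [ l; r ]) -> (
    match (expr_of_xml l, expr_of_xml r) with
    | Ok a, Ok b -> Ok (Add (a, b))
    | (Error _ as e), _ | _, (Error _ as e) -> e)
  | Element (tag, _, _) -> Error ("unexpected element: " ^ tag)
  | Text _ -> Error "unexpected text node"

(* Step 2: AST -> OCaml source text. *)
let rec to_ocaml = function
  | Num n -> if n < 0 then Printf.sprintf "(%d)" n else string_of_int n
  | Var v -> v
  | Add (a, b) -> Printf.sprintf "(%s + %s)" (to_ocaml a) (to_ocaml b)

(* Tree for: <add><var name="x"/><num value="1"/></add> *)
let doc =
  Element ("add", [],
    [ Element ("var", [ ("name", "x") ], []);
      Element ("num", [ ("value", "1") ], []) ])

let () =
  match expr_of_xml doc with
  | Ok e -> print_endline (to_ocaml e)   (* prints: (x + 1) *)
  | Error msg -> prerr_endline msg
```

Keeping the AST as a separate step means the XML encoding can change without touching the code that emits OCaml.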

Yes, but sadly I can’t be very helpful. It looks like you are trying to compile/transpile an external DSL to OCaml code. Camlp5 does something like this, but its input language is rather similar to OCaml. Coq does something like that during extraction, but I doubt that will be helpful either. Most likely you will need to build your own infrastructure for constructing the OCaml code correctly.

If you’ve got an XML document you need to transform, you probably can’t do better than XSL. It is vastly easier than writing low-level code against an XML API, in any language.

About 2/: I am interested in that too. My objective was to compile OCaml (or a subset of it) to Lua. All the answers were helpful, but the conclusion is that it is very complex.
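
To give a sense of scale, here is a sketch that translates a hand-written AST for a very small OCaml subset into Lua source text. The `expr` type is hypothetical and is not the compiler's Parsetree. Types, pattern matching, modules and the differences in integer semantics are all ignored, and those are where the real complexity lies.

```ocaml
(* Hypothetical AST for a very small OCaml subset. *)
type expr =
  | Int of int
  | Var of string                    (* assumed to be a valid Lua name *)
  | Binop of string * expr * expr    (* "+", "-" or "*" only *)
  | Fun of string * expr             (* fun x -> e *)
  | App of expr * expr               (* f e *)
  | Let of string * expr * expr      (* let x = e1 in e2 *)

let rec to_lua (e : expr) : string =
  match e with
  | Int n -> if n < 0 then Printf.sprintf "(%d)" n else string_of_int n
  | Var x -> x
  | Binop (op, a, b) -> Printf.sprintf "(%s %s %s)" (to_lua a) op (to_lua b)
  | Fun (x, body) ->
    Printf.sprintf "(function(%s) return %s end)" x (to_lua body)
  | App (f, a) -> Printf.sprintf "%s(%s)" (to_lua f) (to_lua a)
  | Let (x, v, body) ->
    (* Lua has no let-expression; an immediately called function
       is one possible encoding. *)
    Printf.sprintf "(function(%s) return %s end)(%s)" x (to_lua body)
      (to_lua v)

(* let double = fun x -> x * 2 in double 21 *)
let example =
  Let ("double", Fun ("x", Binop ("*", Var "x", Int 2)),
       App (Var "double", Int 21))

let () = print_endline (to_lua example)
```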