Why is OCaml bad at deep learning on the GPU?

There is a really cool library, ocaml-torch (GitHub - LaurentMazare/ocaml-torch: OCaml bindings for PyTorch), but if we are honest, it is mostly an OCaml wrapper around PyTorch.

For deep learning on the GPU, it seems the core components are:

  1. automatic differentiation of some math DSL (a symbolic task)
  2. compiling said math DSL to CUDA (a compiler task)
  3. running forward / backward passes

1 & 2 are things OCaml is known to excel at; 3 is mainly FFI. Why are there no popular full-stack OCaml GPU DL toolchains?

(If the answer is ‘Python wins by momentum’, the counter question is: why does Julia have a more ‘complete’ GPU DL stack than OCaml?)
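For concreteness, tasks 1 and 2 are exactly the kind of symbolic work OCaml's algebraic data types make pleasant. A toy sketch (single variable, scalars only, all names invented for illustration):

```ocaml
(* Task 1, sketched: symbolic differentiation over a tiny math DSL. *)
type expr =
  | Const of float
  | Var
  | Add of expr * expr
  | Mul of expr * expr

let rec diff = function
  | Const _ -> Const 0.
  | Var -> Const 1.
  | Add (a, b) -> Add (diff a, diff b)
  | Mul (a, b) -> Add (Mul (diff a, b), Mul (a, diff b))  (* product rule *)

let rec eval x = function
  | Const c -> c
  | Var -> x
  | Add (a, b) -> eval x a +. eval x b
  | Mul (a, b) -> eval x a *. eval x b

(* Task 2, sketched: print the DSL as a C/CUDA-style scalar expression. *)
let rec emit = function
  | Const c -> Printf.sprintf "%gf" c
  | Var -> "x[i]"
  | Add (a, b) -> Printf.sprintf "(%s + %s)" (emit a) (emit b)
  | Mul (a, b) -> Printf.sprintf "(%s * %s)" (emit a) (emit b)
```

For example, `diff (Mul (Var, Var))` is the derivative of x·x, which evaluates to 6 at x = 3. A real framework would of course need multi-dimensional tensors, a far richer operator set, and an actual code generator, but the pattern-matching core looks like this.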


For the same reason OCaml is unpopular in general (IIRC it is not even among the 100 most popular languages), despite clearly being the best programming language…

But, a more serious answer: Signals and Threads | Python, OCaml, and Machine Learning


Note that OWL: OCaml Scientific Computing supports deep learning, but its GPU support is limited to emitting ONNX. ETA: I was wrong! OWL also supports compiling to OpenCL.


I would say Julia still wins by momentum in the numerical-computing space, by a factor exceeding 20. Owl is a heroic effort trying to change that, but to become viable it will require gradual improvement based on feedback from a critical mass of regular users. It will not have the manpower to tackle a full GPU autodiff implementation, AFAICS.


That’s an easy question to answer: how about you implement this “full stack OCaml GPU DL toolchain” you would like to have? See, there is your answer!

Implementing a comprehensive AD framework is a hell of a lot of work. And then what? There is hardly any commercial demand for OCaml, especially for machine learning. If there are no professional prospects for such an undertaking, why bother?


One part that is not obvious to me, after looking at onnx.ai, is how training with ONNX is supposed to work.

Is there some service where we specify an ONNX template + inputs, and the service outputs ONNX weights?

(I am working on one but it is way too early to share.)


Check this chapter under the ONNX Engine section. In a nutshell, you save your computation in the ONNX format, then load and run it via the Python ONNX runtime. I believe this can be automated and done from within an OCaml environment via the pyml library.
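As a rough sketch of that last step (assuming pyml is installed and the Python environment has onnxruntime available; "model.onnx" and the snippet are placeholders, not a tested pipeline), driving the Python ONNX runtime from OCaml could look like:

```ocaml
(* Hedged sketch: invoking the Python onnxruntime from OCaml via pyml.
   Requires pyml plus a Python installation with onnxruntime. *)
let () =
  Py.initialize ();
  let ok =
    Py.Run.simple_string
      {|
import onnxruntime as ort
sess = ort.InferenceSession("model.onnx")
print([i.name for i in sess.get_inputs()])
|}
  in
  if not ok then prerr_endline "Python snippet failed"
```

A more integrated setup would pass arrays back and forth through pyml's conversion functions instead of a raw embedded script, but the string-based form shows the shape of the automation.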


I think another way to make Owl work with GPUs is this: since Owl already has the functionality to produce symbolic graphs of computations, it should be possible to write another engine that targets MLIR. Another possibility is to write an implementation of the Ndarrays based on the ArrayFire library. Yet another possibility is to utilize spoc somehow.


Now that I am reading the OWL book, I see I was wrong: OWL also supports JIT-compiling to OpenCL.

  1. Where is OpenCL mentioned? I don’t see it in the Compiler Backends chapter: Compiler Backends | SpringerLink

  2. Is there significant difficulty in an OpenCL backend vs. a CUDA backend?

  3. IIRC, for Nvidia GPUs, there were CUDA vs OpenCL performance differences – is this still true, and does it apply to AMD/ATI GPUs too?

If OWL can, via OpenCL, get 50% of the performance of PyTorch/CUDA, I’d be okay with just using OWL instead.

It is near the end of the preceding chapter: pages 181–182 (190–191 of the PDF). I did not find OpenCL referenced in the API when I skimmed it a while back.

OpenCL is woefully unused in the industry and lags far behind CUDA. As in other domains, one needs many specialists to develop this stuff.

Additionally, OCaml doesn’t have that much of an advantage here unfortunately. The biggest source of bugs and errors in this domain has to do with tensors and their dimension sizes, and there are just no good type systems that handle this stuff outside of dependent types AFAIK.
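To illustrate the limit (a toy sketch with invented names, not a proposal): phantom types can track some dimension information, but anything like real shape arithmetic quickly outgrows what OCaml's type system handles comfortably:

```ocaml
(* Toy sketch: vector length tracked by a phantom Peano-style type
   parameter. Mismatched lengths built via cons/empty fail to typecheck,
   but this does not scale to full tensor shape arithmetic. *)
module Vec : sig
  type z
  type 'n s
  type ('n, 'a) t
  val empty : (z, 'a) t
  val cons : 'a -> ('n, 'a) t -> ('n s, 'a) t
  val dot : ('n, float) t -> ('n, float) t -> float
end = struct
  type z
  type 'n s
  type ('n, 'a) t = 'a array
  let empty = [||]
  let cons x v = Array.append [| x |] v
  let dot a b = Array.fold_left (+.) 0. (Array.map2 ( *. ) a b)
end

let d =
  Vec.(dot (cons 1. (cons 2. empty)) (cons 3. (cons 4. empty)))
(* d = 1.*.3. +. 2.*.4. = 11.; dot on vectors of different phantom
   lengths is rejected at compile time *)
```

This catches one class of mismatch, but operations like reshape, broadcast, or concatenation need type-level arithmetic on dimensions, which is where dependent types come in.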

Additionally, the industry advances extremely rapidly. Bindings to PyTorch and TensorFlow are probably the best we can do.


Seconded. What’s more, if you really want to use OCaml, the bindings to PyTorch bind to the C++ library rather than to the Python API. This sounds like a really decent solution if you want to use OCaml in an ML context.


I wasn’t cognizant of that, worth underlining! A point in favor of PyTorch vs. JAX.


The main opportunity I see here for OCaml is in managing the processing of structured data. Python is superb for deep learning with arrays and/or strings going in and out but grim for all other data types. OCaml could really excel here.
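As a toy illustration of the point (invented types, not a library): sum types and options make heterogeneous, partially missing records explicit in a way dataframes do not:

```ocaml
(* Toy sketch: a typed preprocessing step over structured records.
   Missing values and open-ended categories are explicit in the types. *)
type label = Cat | Dog | Other of string

type sample = {
  id : int;
  text : string option;  (* missingness is visible to the compiler *)
  label : label;
}

let to_class_index = function
  | Cat -> 0
  | Dog -> 1
  | Other _ -> 2

(* Keep only samples with text, paired with their class index. *)
let prepare samples =
  List.filter_map
    (fun s -> Option.map (fun t -> (t, to_class_index s.label)) s.text)
    samples
```

Here forgetting to handle `None` or a new `label` constructor is a compile error rather than a silent NaN downstream, which is exactly the kind of guarantee that is hard to get from a Python dataframe.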


I’m a bit worried that with PyTorch 2.0, the PyTorch team is stating that they are moving away from C++ and rewriting more of the core components in Python. Apparently there are ways to do that while preserving performance. That would potentially make the C++/C bindings to PyTorch incomplete or obsolescent. I hope I’m wrong?


Are there promising projects for, e.g., dataframes in OCaml? I’m aware only of the dataframe in Owl, which seemed to offer little in terms of useful type information when I skimmed it.
