Why is OCaml bad at deep learning on the GPU?

zeroexcuses · February 12, 2023, 2:44pm

There is a really cool library, ocaml-torch (GitHub: LaurentMazare/ocaml-torch, OCaml bindings for PyTorch), but if we are honest, that is mostly an OCaml wrapper around PyTorch.

For deep learning on the GPU, it seems the core components are:

1. auto differentiation of some Math DSL (symbolic task)
2. compiling said Math DSL to CUDA (compiler task)
3. running forward / backward passes

1 & 2 are things OCaml is known to excel at. 3 is mainly FFI. Why is it that there are no popular full-stack OCaml GPU DL toolchains?

(If the answer is ‘Python wins by momentum’, the counter question is: why does Julia have a more ‘complete’ GPU DL stack than OCaml?)
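
To make item 1 in the list above concrete, here is a minimal sketch of symbolic differentiation over a toy math DSL in plain OCaml. The `expr` type and the `diff` and `eval` functions are illustrative assumptions, not taken from ocaml-torch, Owl, or any other library. A real framework would also need expression simplification, tensor-valued nodes, and reverse-mode accumulation.

```ocaml
(* A toy expression DSL. All names here are hypothetical, for illustration only. *)
type expr =
  | Const of float
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* Symbolic derivative of an expression with respect to the variable [x]. *)
let rec diff x = function
  | Const _ -> Const 0.0
  | Var y -> Const (if String.equal x y then 1.0 else 0.0)
  | Add (a, b) -> Add (diff x a, diff x b)
  (* Product rule: (a * b)' = a' * b + a * b' *)
  | Mul (a, b) -> Add (Mul (diff x a, b), Mul (a, diff x b))

(* Evaluate an expression under an association list of variable bindings. *)
let rec eval env = function
  | Const c -> c
  | Var y -> List.assoc y env
  | Add (a, b) -> eval env a +. eval env b
  | Mul (a, b) -> eval env a *. eval env b

(* f(x, w) = w * x + x * x *)
let f = Add (Mul (Var "w", Var "x"), Mul (Var "x", Var "x"))

(* df/dx, built symbolically; mathematically this is w + 2x. *)
let df_dx = diff "x" f
```

Evaluating `df_dx` with `eval [("x", 3.0); ("w", 2.0)]` gives `8.0`, matching w + 2x. The pattern matching is what makes the symbolic part pleasant in OCaml; the hard part the thread goes on to discuss is everything after this step.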

lukstafi · February 12, 2023, 3:17pm

For the same reason OCaml is unpopular (it is not among the 100 most popular languages IIRC), despite clearly being the best programming language…

But, a more serious answer: see the Signals and Threads episode "Python, OCaml, and Machine Learning".

lukstafi · February 12, 2023, 3:24pm

Note that OWL: OCaml Scientific Computing supports deep learning, but its support for GPU is limited to emitting ONNX. ETA: I was wrong! OWL also supports compiling to OpenCL.

n4323 · February 14, 2023, 8:59am

I would say Julia still wins by momentum in the numerical computing space, by a factor exceeding 20. Owl is a heroic effort trying to change that, but becoming viable will require gradual improvement based on feedback from a critical mass of regular users. It will not have the manpower to tackle a full GPU autodiff implementation, afaics.

mmottl · February 14, 2023, 2:49pm

That’s an easy question to answer: how about you implement this “full stack OCaml GPU DL toolchain” you would like to have? See, there is your answer!

Implementing a comprehensive AD framework is a hell of a lot of work. And then what? There is hardly any commercial demand for OCaml, especially for machine learning. If there is no professional perspective for such an undertaking, why bother?

zeroexcuses · February 14, 2023, 5:51pm

One part that is not obvious to me, after looking at onnx.ai, is how training with ONNX is supposed to work.

Is there some service where we specify an ONNX template + inputs, and the service outputs ONNX weights?

lukstafi · February 14, 2023, 6:57pm

(I am working on one but it is way too early to share.)

qubit · February 14, 2023, 8:02pm

Check this chapter under the ONNX Engine section. In a nutshell, you need to save your computation in the ONNX format; then you can load and run it via the Python ONNX runtime. I believe this can be automated and done from within an OCaml environment via the pyml library.

qubit · February 14, 2023, 9:36pm

I think another way to make Owl work with GPUs follows from the fact that Owl can already produce symbolic graphs of computations: it should be possible to write another engine that targets MLIR. Another possibility is to write an implementation of the Ndarrays based on the arrayfire library. Yet another possibility is to utilize spoc somehow.

lukstafi · February 23, 2023, 6:17pm

Now reading the OWL book, I was wrong: OWL also supports JIT-compiling to OpenCL.

zeroexcuses · February 24, 2023, 12:56am

Where is OpenCL mentioned? I don’t see it in the Compiler Backends chapter: Compiler Backends | SpringerLink
Is there significant difficulty in an OpenCL backend vs. a CUDA backend?
IIRC, for Nvidia GPUs, there were CUDA vs. OpenCL performance differences. Is this still true, and does it apply to AMD/ATI GPUs too?

If OWL can, via OpenCL, get 50% of the performance of PyTorch / CUDA, I’d be okay with just using OWL instead.

lukstafi · February 24, 2023, 6:56am

It is near the end of the preceding chapter: pages 181-182 (190-191 of the pdf). I did not find OpenCL referenced in the API when I was skimming it a while back.

bluddy · February 24, 2023, 12:01pm

OpenCL is woefully unused in the industry and lags far behind CUDA. As in other domains, one needs many specialists to develop this stuff.

Additionally, OCaml doesn’t have that much of an advantage here unfortunately. The biggest source of bugs and errors in this domain has to do with tensors and their dimension sizes, and there are just no good type systems that handle this stuff outside of dependent types AFAIK.

Additionally, the industry advances extremely rapidly. Bindings to pytorch and tensorflow are probably the best we can do.
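
On the tensor-dimension point above, here is a minimal sketch of how far plain phantom types get in OCaml. The `Mat` module and the `batch`, `features`, and `classes` tags are hypothetical names made up for this example, not part of any existing library. The tags let the compiler reject a matrix product whose inner dimensions carry different labels, but nothing ties a tag to an actual runtime size, which is exactly the gap that dependent types would close.

```ocaml
(* Hypothetical module: dimension labels are phantom type parameters,
   so they exist only at compile time. *)
module Mat : sig
  type ('rows, 'cols) t
  val create : rows:int -> cols:int -> float -> ('r, 'c) t
  val dims : ('r, 'c) t -> int * int
  val get : ('r, 'c) t -> int -> int -> float
  (* The inner label 'k must agree between the two arguments. *)
  val matmul : ('m, 'k) t -> ('k, 'n) t -> ('m, 'n) t
end = struct
  type ('rows, 'cols) t = float array array

  let create ~rows ~cols v = Array.make_matrix rows cols v

  let dims a =
    (Array.length a, if Array.length a = 0 then 0 else Array.length a.(0))

  let get a i j = a.(i).(j)

  (* Naive triple-loop product; sizes are NOT checked against the labels. *)
  let matmul a b =
    let m, k = dims a and _, n = dims b in
    Array.init m (fun i ->
        Array.init n (fun j ->
            let s = ref 0.0 in
            for l = 0 to k - 1 do
              s := !s +. (a.(i).(l) *. b.(l).(j))
            done;
            !s))
end

(* Abstract types used purely as dimension labels. *)
type batch
type features
type classes

let x : (batch, features) Mat.t = Mat.create ~rows:2 ~cols:3 1.0
let w : (features, classes) Mat.t = Mat.create ~rows:3 ~cols:4 0.5

(* Type-checks because the inner labels agree (both are [features]). *)
let y : (batch, classes) Mat.t = Mat.matmul x w

(* [Mat.matmul w x] would be rejected at compile time, since [classes]
   and [batch] are different types. *)
```

Note that `Mat.create ~rows:5 ~cols:3 1.0` could also be annotated as `(batch, features) Mat.t`, so the labels document intent rather than prove sizes.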

wokalski · February 24, 2023, 1:06pm

Seconded. What’s more, if you really want to use OCaml, the bindings to PyTorch bind directly to the C++ library. This sounds like a really decent solution if you want to use OCaml in the ML context.

lukstafi · February 24, 2023, 1:23pm

I wasn’t cognizant of that, worth underlining! A point in favor of PyTorch vs. JAX.

Jon_Harrop · February 24, 2023, 5:33pm

The main opportunity I see here for OCaml is in managing the processing of structured data. Python is superb for deep learning with arrays and/or strings going in and out but grim for all other data types. OCaml could really excel here.

n4323 · February 27, 2023, 8:23pm

I’m a bit worried that with pytorch 2.0, pytorch is stating that they are moving away from C++ and rewriting more of the core components in python. Apparently there are ways to do that that preserve performance. That would potentially make the C++ / C bindings to pytorch incomplete / obsolescent. I hope I’m wrong?

n4323 · February 27, 2023, 8:26pm

Are there promising projects for e.g. dataframes in ocaml? I’m aware of only the dataframe in Owl which seemed to offer little in terms of useful type information when I skimmed it.
