GPU accelerated numerical ops on OCaml arrays via pyml + jax?

The way I would approach it is to start a new framework outright, inspired by JAX, and in the process of writing it, refactor my current framework to reuse the computation layer. I might then discard my current design in favor of a lessons-learned better one… right now it seems unlikely I’ll pursue that. My current design is very imperative. I’m evolving it, but incrementally in the direction of simplicity; it will stay imperative and will not turn into a JAX-like symbolic interpretation framework…

@ghennequin certainly happy if these bindings can be helpful; binding them to owl sounds like a nice thing to do, and I would be keen to adapt the bindings, add the missing bits if necessary, etc.
My main goal for writing these was to build a JAX-like library on top of them, kept as close as possible to the Python version so that it’s easy to port models and interoperate between the Python and OCaml worlds. I’ll start toying with this in a separate library that will use the ocaml-xla package (in the same way JAX uses jaxlib). The goal would be to keep ocaml-xla as low-level and unopinionated as possible so that it can serve as a backend for multiple libraries; it’s actually already the case that some of the examples include various helper functions that wouldn’t belong in the xla bindings but rather in a higher-level NN library.
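To make the jaxlib analogy a bit more concrete, here is roughly what driving the low-level bindings looks like. This is a minimal sketch adapted from memory of the ocaml-xla examples; treat the exact module and function names as illustrative rather than a stable API:

```ocaml
(* Sketch only: names follow the ocaml-xla examples and may not match
   the published API exactly. *)
let () =
  (* Pick a device; a GPU or TPU client would slot in here instead. *)
  let client = Xla.Client.cpu () in
  (* Build the computation symbolically, XLA-style. *)
  let builder = Xla.Builder.create ~name:"add" in
  let r0_f32 v = Xla.Op.r0_f32 v ~builder in
  let sum = Xla.Op.add (r0_f32 39.) (r0_f32 3.) in
  let computation = Xla.Computation.build ~root:sum in
  (* Compile once, then execute on the chosen device. *)
  let exe = Xla.Executable.compile client computation in
  let buffers = Xla.Executable.execute exe [] in
  (* Copy the result back to the host as a bigarray. *)
  let literal = Xla.Buffer.to_literal_sync buffers.(0).(0) in
  let ba = Xla.Literal.to_bigarray literal ~kind:Bigarray.float32 in
  Printf.printf "%f\n" (Bigarray.Genarray.get ba [||])
```

A higher-level JAX-like library would then hide the builder/compile/execute plumbing behind function transformations, the same way jax wraps jaxlib.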
Also, just to mention: on top of GPU support, using XLA will bring TPU support to owl :slight_smile:


A new JAX-like library on top of ocaml-xla would be amazing; I would be very happy to contribute. What is the best place to start a discussion on this? Also, just curious: does JS have a vested interest in such a library, or are you just doing this in your spare time?


Just a spare-time project for me. Not sure what the best place is either, and it may be a bit on the early side, at least for me, as I haven’t started hacking much on this, but Discord or Slack would probably work.
Another thing that I might have forgotten to mention is that ocaml-xla now features the LLaMA text model as an example. For the smallest variant, with 7B parameters, inference requires a GPU with 16 GB of memory, or 32 GB of RAM if running on CPU. I plan on polishing this for efficiency, e.g. supporting quantization and flash attention, and maybe also plugging in Alpaca to get a conversational model.
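For a back-of-the-envelope check on those figures (my own rough arithmetic, assuming the weights dominate memory and ignoring activations and the KV cache): 7 × 10⁹ parameters × 2 bytes (fp16) ≈ 14 GB, consistent with the 16 GB GPU requirement; at 4 bytes (fp32) that becomes ≈ 28 GB, consistent with the 32 GB RAM figure on CPU. Quantization helps precisely because it shrinks the bytes-per-parameter factor.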


Gotcha, thanks for the reply!

I’m looking forward to trying this out! A few small comments:

1c: This is good for interoperability. However, in the context of owl, people have made a good argument that a better basic data type would be a view into a Genarray, i.e. a description that indexes into a Genarray with prescribed slicing and range restrictions and makes it appear as a contiguous array from the outside (see the sketch after these comments). Is that accurate, @bluddy?

  1. is what I’m excited about, because when trying to use Owl.Algodiff I was really held back by the fact that all data (tensors, matrices, scalars) just shows up as an opaque Algodiff.t. This defeats many of the advantages of OCaml’s type system, IMO.

  2. is a pity but maybe not too bad, as AFAIR PyTorch has the same restriction and is useful nonetheless. (Although that may be outdated info?)
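To illustrate the view idea from 1c, here is a rough sketch of the kind of type I mean; the names and representation are purely illustrative, not Owl’s actual internals:

```ocaml
(* Illustrative only: a "view" pairs a Genarray with an offset and
   strides, so slices share the backing buffer but still look like
   contiguous arrays from the outside. *)
type ('a, 'b) view = {
  data    : ('a, 'b, Bigarray.c_layout) Bigarray.Genarray.t; (* backing buffer *)
  offset  : int;        (* flat index of the view's first element *)
  dims    : int array;  (* logical shape presented to the user *)
  strides : int array;  (* step per dimension, in elements of [data] *)
}

(* Index the view as if it were contiguous: map logical indices to a
   flat position via offset + sum over i of idx.(i) * strides.(i). *)
let get v idx =
  let flat = ref v.offset in
  Array.iteri (fun i j -> flat := !flat + (j * v.strides.(i))) idx;
  let n = Array.fold_left ( * ) 1 (Bigarray.Genarray.dims v.data) in
  Bigarray.Genarray.get (Bigarray.reshape v.data [| n |]) [| !flat |]
```

The point is that slicing would only rewrite offset/dims/strides while sharing the underlying buffer, yet users could still index the result as a plain contiguous array.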

Apologies if this is too off topic for OCaml.

I was just reading The tiny corp raised $5.1M | Hacker News

In particular, quoting the article: the tiny corp raised $5.1M | the singularity is nearer

There’s a great chip already on the market. For $999, you get a 123 TFLOP card with 24 GB of 960 GB/s RAM. This is the best FLOPS per dollar today, and yet…nobody in ML uses it.

Now I’m curious. Does anyone have experience programming this card (an XFX Radeon RX 7900 XTX)? https://www.newegg.com/xfx-radeon-rx-7900-xtx-rx-79xmercb9/p/N82E16814150878v

I’m assuming it’s not CUDA. So is it OpenCL, or some other instruction set?

In theory, this looks like an opportunity for OCaml to shine: if there is hardware that is superior on a FLOPS-per-dollar basis but that nobody uses because it is not CUDA, then one could unlock that power ‘merely’ by writing an Owl backend for it.
