How hard would jsoo -> pyoo be?

I am aware of the excellent ocaml-torch (GitHub: LaurentMazare/ocaml-torch, OCaml bindings for PyTorch), but I am interested in a solution that gives me all of Python’s libraries from OCaml.

Having become somewhat familiar with jsoo and its FFI design, I am curious: how ingrained is “js” in “jsoo”? Would it be easy to refactor into a pyoo?

I think pyoo + pyoo-top-level would be a very interesting alternative to pure python.


PS: ping @hhugo

See Compiling subset of OCaml to GoLang / Julia? - #10 by yawaramin

I forgot that thread. Thank you for refreshing my memory. That project looks very close to what I want (especially now that I’ve switched from ReScript to JSOO).

Various top-level folders look untouched for 2, 3, or 5 years. What are the main features on which rehp lags behind jsoo?



In addition to the one yawaramin linked, here are a couple of other useful related posts for interested readers, including yours from the Rescript forum as well (I think it fits here too):

Edit: it would be cool to see rehp revived…Compile to all the things!

A priori, it seems difficult to make “pyoo” work. IIUC, “jsoo” translates OCaml bytecode to JS, and then lets the JS JIT run amok to optimize the code. CPython, the standard Python interpreter, doesn’t JIT-compile by default, and its bytecode is, I would guess, unlike OCaml’s. So you’d get two layers of interpretation, and that would be pretty awful for performance.

This is a valid criticism given I wrote “all python libs” above.

However, what I wrote was not accurate. I only care about “all Python DL libs”, where I think 99% of the time is spent either (1) pre-processing the data or (2) waiting on CUDA calls.

Without having hard numbers and merely waving my hands, I think that in this case it is okay even if the generated Python code is 10x-100x slower, because the bulk of the computation is waiting on CUDA / FFI routines anyway.
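
To make the hand-waving slightly more concrete, here is a rough back-of-the-envelope sketch. It assumes the runtime splits into a "glue" fraction that gets k times slower and a remainder (CUDA / FFI calls) that is unchanged. The 1% and 10% glue fractions are illustrative assumptions, not measurements, and `overall_slowdown` is just a name made up for this sketch.

```ocaml
(* If a fraction [glue] of the runtime is interpreted glue code that becomes
   [k] times slower, and the rest (CUDA / FFI calls) is unchanged, the whole
   program slows down by a factor of (1 - glue) + glue * k. *)
let overall_slowdown ~glue ~k = (1.0 -. glue) +. (glue *. k)

let () =
  List.iter
    (fun (glue, k) ->
      Printf.printf "glue = %4.1f%%, k = %3.0fx -> overall %.2fx\n"
        (glue *. 100.0) k (overall_slowdown ~glue ~k))
    (* Assumed scenarios: 1% or 10% of time in glue code, slowed 10x or 100x. *)
    [ (0.01, 10.0); (0.01, 100.0); (0.10, 10.0); (0.10, 100.0) ]
```

Under these assumptions, a 100x slowdown on 1% of the runtime roughly doubles the total time, while the same slowdown on 10% of the runtime makes the whole program about 11x slower. So the argument holds only as long as the glue fraction really is tiny.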

In that case, rather than a general approach, you could try writing bindings just for the subset of the libraries that you need. In other words, use the FFI approach (e.g., ctypes, pyml, etc.). This is similar in spirit to how the ReScript folks suggest writing JS binding code only for the stuff you’re actually using in your app, rather than trying to bind whole giant libraries in one go. (Pretty sure gasche mentioned something similar in one of these threads as well, i.e., rather than compiling one to the other, use FFI.)
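
As a minimal sketch of what "bind only what you need" can look like with pyml: the example below binds a single function, using Python's standard `math` module as a stand-in for a real DL library. The name `py_sqrt` is hypothetical, and the pyml calls are written from my understanding of its API, so check them against the pyml documentation before relying on them.

```ocaml
(* Assumes the pyml package is installed (opam install pyml), the program is
   linked against it (e.g. (libraries pyml) in dune), and a Python
   installation is discoverable at runtime. *)

(* Start the embedded Python interpreter before touching any Python object. *)
let () = Py.initialize ()

(* Import only the one module we need. *)
let math = Py.Import.import_module "math"

(* Hand-written binding for a single function: convert the OCaml float to a
   Python float, call math.sqrt, and convert the result back. *)
let py_sqrt (x : float) : float =
  Py.Module.get_function math "sqrt" [| Py.Float.of_float x |]
  |> Py.Float.to_float

let () = Printf.printf "math.sqrt 2.0 = %f\n" (py_sqrt 2.0)
```

The same pattern would apply to a handful of calls into a DL package: one typed OCaml wrapper per Python function you actually use, and nothing more.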

Upon further reflection, I think your idea makes more sense than mine. I should probably first get proficient with pyml_bindgen and learn what works and what does not, instead of trying to revive rehp with a Python backend. The pyml_bindgen approach is: just write FFI bindings; whereas the pyoo approach is: write a new transpiler and also write FFI bindings. So the pyml_bindgen approach is strictly less work.


Aren’t Python DL libs mostly wrappers for C/C++/Fortran? It seems it would be better to cut the middleman?

Just want to mention that it would probably be a good idea to check out pyml itself to see how the python bindings are working. pyml_bindgen is nice (imo, but I’m biased), but under the hood it’s “just” generating bindings using pyml.

Edit: the reason I mention it is that whatever limitations pyml_bindgen has, you can drop down to pyml and do it there (i.e., pyml_bindgen can’t do everything that you could do manually with pyml).

But also, yawaramin has a good point…I would be also interested in seeing how cutting out the python entirely and binding the lower-level libs would go.

I think this would be terrible for flow.

You do a Google search and find some sample code online that uses some Python package.

With the bind-Python approach: you wrap 2-3 calls to the Python package as black boxes and move on.

With the bind-C/C++ approach: now suddenly you have to re-implement all the functions you need from the Python package.

No. For example, JAX (+Flax) emphatically isn’t.

Look into pyml_bindgen; this is great stuff made by @mooreryan.