Js_of_ocaml: OCaml's values and the structured clone algorithm

Does anyone know if js_of_ocaml’s encoding of OCaml values for reasonable types (ground types, records, arrays, lists, variants, polyvars, not including functions or objects) survives a trip through the JavaScript structured clone algorithm?

That is, can I safely ping-pong OCaml values with web workers without having to serialize them to strings?

I would be inclined to say yes, and I have a few encouraging tests, but I bet someone like @hhugo has a better idea about problems I may be missing.


This will not work with bytes, int64 and bigarrays, which are implemented as objects with a constructor. There is also an issue with exceptions, which rely on physical equality.
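A quick JavaScript sketch of the constructor issue; the `Wrapper` class below is a made-up stand-in, not the runtime’s actual class:

```javascript
// Hypothetical stand-in for a runtime class such as the ones jsoo
// uses for bytes, int64 and bigarrays (not the actual class names).
class Wrapper {
  constructor(v) { this.v = v; }
}

const x = new Wrapper(42);
const y = structuredClone(x);

// The data survives, but the clone is a plain object: the prototype
// chain is lost, and physical identity is of course not preserved.
console.log(y.v);                  // 42
console.log(y instanceof Wrapper); // false
console.log(y === x);              // false
```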
But other values are basically translated to arrays, strings and numbers, so they should indeed safely survive a trip through the structured clone algorithm.
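To make the “basically arrays, strings and numbers” point concrete, here is a sketch; the literal below only illustrates the kind of tagged-array encoding jsoo uses, not its exact output:

```javascript
// Illustrative only: jsoo encodes OCaml blocks roughly as tagged JS
// arrays, so a value like Some (1, 2.5) could come out as a nested
// array carrying numeric tags and fields.
const value = [0, [0, 1, 2.5]]; // hypothetical encoding

// Plain arrays, numbers and strings round-trip through the
// structured clone algorithm without losing any information.
const copy = structuredClone(value);
console.log(JSON.stringify(copy)); // [0,[0,1,2.5]]
```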


Thanks for your answer @vouillon.

Just to make things clear: when you said bytes, that does not include string values, right? (They did seem to round-trip in my tests.)

Right, immutable string values are now mapped to JavaScript strings.


Well, actually you need to use --enable use-js-string for that.
Otherwise, the structured clone algorithm should break string comparison and hashing.


I see. Is there any reason why this is an option? This kind of choice always brings more complexity from a systems perspective (feature interactions in bug reports, expectations in FFI code, etc.).

Also, I’m thinking maybe it would be good to have a best-effort zero-copy primitive, say caml_js_{to,of}_structurally_clonable, that handles the bytes, int64 and bigarray cases and that one can call when sending or receiving such values.
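Purely as an illustration of the idea (none of these names exist in jsoo; `Int64ish` and the `$int64` marker are made up), such a primitive could walk the value and rewrite only the problematic constructor-backed nodes into plain data, then undo the rewrite on the other side:

```javascript
// Made-up stand-in for a constructor-backed runtime value.
class Int64ish {
  constructor(lo, hi) { this.lo = lo; this.hi = hi; }
}

// Replace constructor-backed nodes with plain tagged records so the
// whole value survives a structured clone.
function toClonable(v) {
  if (v instanceof Int64ish) return { $int64: [v.lo, v.hi] };
  if (Array.isArray(v)) return v.map(toClonable);
  return v;
}

// Rebuild the constructor-backed nodes on the receiving side.
function ofClonable(v) {
  if (v && v.$int64) return new Int64ish(v.$int64[0], v.$int64[1]);
  if (Array.isArray(v)) return v.map(ofClonable);
  return v;
}
```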

The reason --enable use-js-string is not the default is that it’s a potentially breaking change. I’ve tried to fix various JS stubs over the years but was never confident enough to make it happen. See Use javascript string by default by hhugo · Pull Request #976 · ocsigen/js_of_ocaml · GitHub


Js_of_ocaml.Json.{output,unsafe_input} does some of this (string and int64), I believe.

Do you have a reasonable idea of who could be affected? I guess most people should be insulated by the various FFI bits, no?

It’s a bit of a chicken-and-egg situation. I suspect that these kinds of options do not get tweaked by most people. If string as JS string is the future, I think js_of_ocaml should start by inverting the default (major release), and a few releases later drop the option.

Indeed, it seems to do funny stuff with the reviver, but of course you don’t want to serialize to a string here; you want to keep the value as is, except when you can’t.

This is related but a bit OT for both the topic and the OCaml forum; still, I’d rather have this in this topic.

I’m just wondering if anyone has experience with web workers. I’m getting miserable performance (70s) using them in contrast to blocking the page UI thread (3s). Even if I subtract the data transfer overhead of 2-3s, that still seems a bit egregious.

Not sure if I’m doing something wrong or if that’s kind of expected; does anyone have a hint?

I’ve used Web Workers in a couple of projects.

One project implemented a simple ray-tracer for DRR computation. The ray-tracer was implemented in C++ and compiled to Asm.js using Emscripten (we also briefly tested compiling to WebAssembly but at least back then we got better performance from Asm.js in Chrome). The input 3D image and output 2D images were stored in memory shared by all web workers (one per CPU thread) and the main JS context. A single broadcast message was sent to the workers to render an image. Performance was comparable to a similar ray-tracer written in C++ and running natively.

Another project where I’m using web workers is my Fωμ sandbox. The program is sent to a web worker for compilation and the compiled program is also run in a web worker. (You can find all the JS code under docs in the project.) Aside from improving responsiveness this makes it easy to kill them. (Then there are a couple of other workers, but they are not that essential.)

In both of these, the web workers are created on startup and then used for computation. IOW, we are not starting new web workers or stopping them on each use. I would expect the overhead of starting and stopping web workers to be significant, as they would have to load and execute all code on every run.
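The long-lived-worker pattern is essentially a job queue in front of a single `Worker` created once. A minimal sketch, where `worker` is anything with `postMessage`/`onmessage` (in a page it would be `new Worker(url)`):

```javascript
// Minimal job queue in front of one long-lived worker. Replies are
// assumed to come back in submission order (one job at a time).
function makeQueue(worker) {
  const pending = [];
  worker.onmessage = (e) => {
    const onResult = pending.shift();
    if (onResult) onResult(e.data);
  };
  return function submit(payload, onResult) {
    pending.push(onResult);
    worker.postMessage(payload);
  };
}
```

In a page, `makeQueue(new Worker("worker.js"))` would be created once at startup and reused for every computation, so the worker’s code is loaded and compiled only once.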

The 2-3s transfer overhead sounds very large. Either the messages are massive or there are lots of them. In the ray-tracer project we noticed that message passing had a fairly large overhead/latency, which is why we used broadcasting. The DRR computation was used both for displaying an interactive real-time projection and as a part of automatic 2D-3D registration, and I recall estimating that sending messages per worker would already have cost a significant amount of time (but I don’t recall the estimated numbers).

So, in summary, assuming that workers are not unnecessarily started/stopped and message-passing overheads are avoided, I would expect good performance.


Thanks for your feedback. I’m not starting a worker on each computation; I have a work queue (gist here) at which I throw work.

Regarding the transfer, it’s one large OCaml value with a lot of geometrical data time series, so I could cope with that. What I find really strange is the massive difference in execution time. I thought maybe workers were throttled in some way, but according to what you write that doesn’t seem to be the case.

However, the way I go about these things is likely not very orthodox, so it could be related to that (?):

  1. I use the same script as the page script; the main function checks for Brr_webworker.Worker.ami and dispatches on the page code (which starts the web worker) or the worker code. Basically a fork mechanism.
  2. I have a dirty hack in place to make that work over the file:// protocol. Stupidly, you can’t invoke your own script over file://; you get a cross-origin error. So I embed my page script in the page itself. This allows me to get the source, make a data URL of it and invoke the worker with that :grimacing: (the nice thing, though, is that if you do so with all resources you get a single double-clickable file “application”).
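For what it’s worth, the data-URL part of that hack boils down to something like this (the `script#main` selector in the comment is made up; adapt to however the source is embedded):

```javascript
// Turn script source text into a data: URL a Worker can be started
// from, sidestepping the file:// same-origin restriction.
function dataUrlOfScript(src) {
  // btoa only handles Latin-1; real code would need to UTF-8 encode
  // the source first (e.g. via TextEncoder) for non-ASCII scripts.
  return "data:text/javascript;base64," + btoa(src);
}

// In a page (not runnable outside a browser):
//   const src = document.querySelector("script#main").textContent;
//   const worker = new Worker(dataUrlOfScript(src));
```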

I found it a bit difficult to investigate what’s really happening, and I lack the time to go deep into this, so for now I’ll likely continue doing it on the UI thread with a bit of setTimeout relaxation.

That doesn’t seem to be the case. In fact, I tried to simply round-trip my data with the worker and then perform the computation on the UI thread again, and I get the same timings and, in fact, wrong results.

It seems that going through the structured clone algorithm breaks the OCaml values in some subtle way, which in turn means that my computation constantly hits slow paths; that explains the slowdown.

It is still unclear for now what breaks; testing for equality between the data and the round-tripped data returns true.

So I had some kind of stub value against which I was physically testing. Of course, these kinds of values get a different address when you cross boundaries, and this had catastrophic effects on the computation in this case.
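A minimal illustration of the failure mode:

```javascript
// A sentinel value tested with physical equality (==) on the OCaml
// side corresponds to === on a JS object; a structured clone is a
// fresh object, so the sentinel test fails after the round trip.
const stub = [0];
const clone = structuredClone(stub);
console.log(clone === stub);       // false: identity is not preserved
console.log(clone[0] === stub[0]); // true: the contents are equal
```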

Conclusion: value your option types :–)

I now get my results in the same amount of time; I’m just a bit disappointed by the high transfer-time overhead.

In case it’s of any help, ArrayBuffer objects can be sent from and to workers without copying, using the transferList argument to postMessage (this is unrelated to SharedArrayBuffer, which is not yet widely supported in browsers).

ArrayBuffers can also be viewed from Bigarrays without copying, and most web APIs support ArrayBuffers (including XHR, WebSocket and FileReader)
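The transfer semantics can be sketched with `structuredClone`’s `transfer` option (available in recent browsers and Node), which mirrors the `transferList` argument of `postMessage`:

```javascript
// Transferring (rather than copying) an ArrayBuffer: the receiver
// gets the buffer and the sender's copy is detached (zero length).
const buf = new ArrayBuffer(8);
const moved = structuredClone(buf, { transfer: [buf] });
console.log(moved.byteLength); // 8
console.log(buf.byteLength);   // 0: the original is detached
```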


Thanks for the tip. Unfortunately, the code to execute works with lists of gg points, not bigarrays, so that won’t really help.

Still, I managed to halve the transfer times by stripping from the arguments I was sending large amounts of data that were not being used by the actual computation in the worker.

For fun, I wanted to compare with what a trip via a single marshalled string would give, but alas you can’t marshal floating points with jsoo.

Actually, you can put your floats in a bigarray and marshalling will work; I had to do it recently. I was also expecting that putting them in a record with only floats would work, but it doesn’t. I believe jsoo could be patched to make it work in that case, as there’s a tag saying it’s a float record.

On a side note, we also wanted to send the marshalled string via a websocket (using Brr client-side and Dream server-side). I thought that using Websocket.send_blob would be enough, but for some reason it wasn’t working (we’ll open an issue soon, after making sure we didn’t miss something else).

The dirty hack to make it work was something like Format.sprintf "%S" marshalled_string on the client side and some Scanf on the server… :sweat_smile:

Bigarrays are backed by typed arrays, so I don’t think you’d need any marshalling here; it should structurally clone cleanly.

I suspect you stumbled over binary vs text issues.

Between strings as byte sequences or as UTF-8, bytes as byte sequences or as UTF-8, bigarrays, JavaScript strings, JavaScript binary strings, JavaScript typed arrays, JavaScript array buffers and JavaScript blobs, it’s easy to confuse oneself (see e.g. Base64 decoding · Issue #18 · dbuenzli/brr · GitHub; maybe a few easier conversion paths are still missing in brr to clarify all that).

If you do, do it rather early than late, I’ll soon make a release.