Using ocaml-ctypes with WebAssembly?

Hi there,

I’ve been using js_of_ocaml to make semgrep run from inside a web browser, and am hitting an interesting issue with one of our dependencies. We use ocaml-yaml for YAML parsing, which uses ocaml-ctypes under the hood to interface with libyaml (a C library). I’ve managed to compile libyaml to WebAssembly and write all the JS glue code to wire everything up.

However, the memory offsets that ocaml-ctypes calculates for libyaml struct fields are all off, because ocaml-ctypes was built on a 64-bit machine (size_t and void* are 8 bytes) but WebAssembly is 32-bit (size_t and void* are 4 bytes). It looks like sizes and alignments are hardcoded at build time: ocaml-ctypes/Makefile at 46cc3ffc47119264bf530905fce4a2c3750f4c8c · yallop/ocaml-ctypes · GitHub
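
To illustrate the mismatch, here is a minimal, self-contained sketch of how field offsets shift with pointer width. It does not use ocaml-ctypes; the `field` record, the `offsets` helper, and the example struct are all hypothetical and not taken from libyaml. It assumes a 4-byte `int` and that `size_t` has the same size and alignment as a pointer.

```ocaml
(* Hypothetical field description: name, size and alignment in bytes. *)
type field = { name : string; size : int; align : int }

(* Round [off] up to the next multiple of [align]. *)
let align_up off align = (off + align - 1) / align * align

(* Compute (name, offset) pairs with the usual C layout rule: each field
   starts at the next offset that is a multiple of its alignment. *)
let offsets fields =
  let _, acc =
    List.fold_left
      (fun (off, acc) f ->
        let start = align_up off f.align in
        (start + f.size, (f.name, start) :: acc))
      (0, []) fields
  in
  List.rev acc

(* Illustrative struct, not a real libyaml one:
   struct { int kind; void *data; size_t length; } *)
let example ~ptr =
  [ { name = "kind"; size = 4; align = 4 };
    { name = "data"; size = ptr; align = ptr };
    { name = "length"; size = ptr; align = ptr } ]

let () =
  List.iter
    (fun ptr ->
      Printf.printf "pointer size %d:" ptr;
      List.iter
        (fun (n, o) -> Printf.printf " %s@%d" n o)
        (offsets (example ~ptr));
      print_newline ())
    [ 8; 4 ]
```

With 8-byte pointers the second field lands at offset 8 after padding; with 4-byte pointers it lands at offset 4, so offsets computed for one target read the wrong bytes on the other.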

I did some ugly hacks to work around this with jsoo (see js: fix libyaml bindings by tpetr · Pull Request #7918 · returntocorp/semgrep · GitHub) because I didn’t think it was feasible to cross-compile or drastically alter ocaml-ctypes, but my hope is that there is a simpler solution that I’m not aware of.

Has anyone hit this issue before? Any advice?

Thanks!


I am not sure if this will be helpful, since I think you are going for a much deeper OCaml / FFI interaction than I did.

I have a web app where the following two parts talk to each other:

  • OCaml compiled to JS via jsoo
  • Rust compiled to wasm32 via cargo build --target=wasm32-unknown-unknown

They communicate by just passing JavaScript objects back and forth. On the Rust side, I use web-sys (The `wasm-bindgen` Guide). I can console.log values from either side in Chrome.

Two ideas come to mind:

  1. Are you compiling libyaml (the C lib) via emscripten? If so, is there some emscripten header file you can include that will let you read/create JavaScript objects from libyaml?

  2. I don’t know how your Rust is, but there appears to be a pure Rust YAML parser at https://crates.io/crates/yaml-rust (I have never used it).

Thanks for the reply and the ideas! You’re correct, this is unfortunately a very deep interaction. Semgrep is tightly integrated with the shape of ocaml-yaml, so I don’t think I can adopt the approach you suggested (we need to be able to support jsoo and native executables at the same time).

I just had another idea that I’m investigating: what if we patch ocaml-ctypes’ sizeof and alignment functions? For example:

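(* Sketch: report 32-bit sizes for pointers and size_t, defer to ctypes otherwise. *)
let patched_sizeof : type a. a Ctypes_static.typ -> int = function
  | Ctypes_static.Pointer _ -> 4
  | Ctypes_static.Primitive Ctypes_primitive_types.Size_t -> 4
  | other -> Ctypes_static.sizeof other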

If I can manage to override the JS method for Ctypes_static.sizeof with patched_sizeof, then I think things should Just Work.

I agree; for things that are deeply intertwined, it probably makes more sense to have ocaml-ctypes detect whether it is interacting with js/wasm and act accordingly.

PS, I wonder, for a temporary hack, if you could modify it to … read the ptr size as an Environment variable? :slight_smile:

Yeah that could be a good approach too. :+1:

Newbie question: Let’s say that I have a branch of ocaml-ctypes with this functionality. I’m assuming that getting my PR merged and an official ocaml-ctypes release cut could take a while. Do you know how hard it would be to get my opam project to depend on this branch? I know I can run opam pin on the command line, but it’s unclear if I can do the same in a .opam file.

ctypes supports cross-compilation. Details here: Don't run compiled binaries while cross-compiling by whitequark · Pull Request #383 · yallop/ocaml-ctypes · GitHub

Interesting, thanks. I’ve been doing some more digging (and learning) and I realize now that the memory offsets are baked into libyaml at build time (because it’s using the stubgen functionality), so it really does look like cross-compilation is needed here.

Is there a good primer on how to set up 32-bit cross-compilation in dune? I’ve been googling and it seems like the latest 32-bit opam variant is fairly old (4.11.2; semgrep wants 4.13, but I’m not sure how hard a requirement that is).

You can install 32-bit versions of recent compilers using opam’s ocaml-option-* machinery. For example, the following command installs a 32-bit OCaml 4.13 switch:

opam switch create 4.13+32-bit --packages=ocaml-variants.4.13.0+options,ocaml-option-32bit --repos=default,beta
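
As a quick sanity check after creating such a switch, a small program along these lines could confirm the word size the compiler targets. This is only a sketch: `pointer_bytes_of_word_size` is a hypothetical helper, and it assumes the C pointer size matches the OCaml runtime's word size.

```ocaml
(* Pointer width in bytes implied by a word size in bits.
   Hypothetical helper, for illustration only. *)
let pointer_bytes_of_word_size bits = bits / 8

let () =
  (* Sys.word_size is the size in bits of one word on the machine
     running the program; a 32-bit switch should report 32 here. *)
  Printf.printf "word size: %d bits, pointer: %d bytes\n"
    Sys.word_size
    (pointer_bytes_of_word_size Sys.word_size)
```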

Excellent, thank you for the pointer! It looks like the 32-bit compilers are only built for x86 (I’m on an M1 MBP), so I just need to figure out how to get a compiler that works for me and then I think we should be golden.

I’m going to mark @yallop’s suggestion to cross-compile as the answer. Thanks, all!
