Relocatable compiler with binary patching?

Following up on my earlier question, “Relocatable compiler work”:

Could opam or dune make the compiler relocatable by downloading pre-built binaries and doing a search-and-replace of the hard-coded paths with the user’s opam switch path?

Suppose we are talking about Windows, which by default limits paths to roughly 256 characters (MAX_PATH is 260, counting the drive prefix and the terminating NUL). Let’s say we arrange things so that the prebuilt binaries have paths like:

C:/U/.opam/_________________________________________________________________________________________________________________________________________________________/.opam-switch/build/ocaml-base-compiler.4.14.1/stdlib/libasmrun.a(main.n.o)._main.fail_nat.c

I.e., we choose a name for the switch such that the longest path in the binary is 256 characters long. Then on the user’s device we could replace the prefix with their actual opam switch prefix and pad the difference (if any).
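
As a rough illustration of that search-and-replace step (a sketch only, not how opam, dune, or esy actually implement it): the function name `patch_prefix` is hypothetical, and padding with extra `/` separators is an assumption that relies on the OS treating repeated separators as one.

```python
def patch_prefix(data: bytes, placeholder: bytes, actual: bytes,
                 pad: bytes = b"/") -> bytes:
    """Replace every occurrence of `placeholder` with `actual`, padded so
    the replacement has exactly the same length and no offsets shift."""
    if len(pad) != 1:
        raise ValueError("pad must be a single byte")
    if len(actual) > len(placeholder):
        # A longer prefix cannot fit without moving the rest of the file.
        raise ValueError("actual prefix is longer than the placeholder")
    padded = actual + pad * (len(placeholder) - len(actual))
    return data.replace(placeholder, padded)


# Hypothetical placeholder prefix baked into the prebuilt binary.
placeholder = b"C:/U/.opam/" + b"_" * 20
blob = b"\x00hdr" + placeholder + b"/lib/ocaml\x00tail"
patched = patch_prefix(blob, placeholder, b"C:/U/.opam/dev")
```

The patched blob has the same size as the original, which is the property the padding is there to preserve.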

Assuming prebuilt binaries are built and available from, e.g., a cloud storage bucket for the user’s OS and architecture, would this strategy work?

Yes, that’s how esy does things.

Any interest from the opam or dune people in doing this? It would massively speed up installing a new switch with a prebuilt compiler, especially on Windows. @dra27 @rgrinberg

No interest from me. There’s already an adequate workaround in using the system compiler, and as far as I know, David is making good progress on relocating the compiler. There are already plenty of relocation issues outside the compiler that would be good to address in the meantime.

I am wracking my brain to remember what the ‘system compiler’ workaround is, could you give me a pointer? opam switch --help does not seem to mention anything about it as far as I can tell. [EDIT: found it in “opam - Usage”: ‘Creating a new switch requires re-compiling OCaml, unless you use the ocaml-system package, that relies on the global OCaml installation.’ It doesn’t explain how to use it though…]

So just to clarify, the relocation work being done now would enable an opam switch (ie compiler) to be installed with a prebuilt compiler without needing any binary patching?

With relocation, opam could choose to reuse compiler artifacts between switches.

For dune’s own package management, no further action will be necessary and caching should be automatic provided the dune cache is enabled.

So just to clarify, the relocation work being done now would enable an opam switch (ie compiler) to be installed with a prebuilt compiler without needing any binary patching?

@dra27 gave a nice talk about this effort which has more details (spoiler alert: yes, it is without binary patching): “Copying opam switches – it should Just Work™” on Watch OCaml.

That’s cool. I really meant installing prebuilt switches and compiler artifacts from a cloud hosted location though, so that new users wouldn’t even need to have a full-fledged C toolchain to build it on their machines. Plus it would dramatically cut down the install time.

Three problems with that approach when I looked at it a long time ago:

  1. Pretty big security hole. Modifying an executable with arbitrary input, without validating that the input is safe, is similar to a SQL injection attack.
  2. Taking an existing executable and modifying its bits is what viruses do. It is hard enough getting through Windows Defender and the zillion virus checkers … now add that your program is acting exactly like a virus and watch your false positive rate increase.
  3. It invalidates the signature of the executable.

Granted, some people may not care.
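
To make point 3 concrete, here is a minimal sketch. It uses a bare SHA-256 digest as a stand-in for a real signature; actual schemes (Authenticode, macOS code signing) are more involved, and the helper names and sample bytes are made up for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 of the file contents, standing in for a signed hash."""
    return hashlib.sha256(data).hexdigest()


def digest_unchanged(original: bytes, patched: bytes) -> bool:
    """True only if patching left the contents byte-for-byte identical."""
    return digest(original) == digest(patched)


# Made-up executable contents with an embedded build path.
original = b"MZ\x90\x00C:/build/prefix/lib/ocaml\x00"
# A same-length path rewrite still changes the bytes being hashed.
patched = original.replace(b"build", b"users")
still_valid = digest_unchanged(original, patched)
```

Even though the file size is unchanged, the digest differs, so a signature computed over the original contents would no longer verify.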

To add a couple of items to @jbeckford’s list:

  4. Breaks with executable compressors (which are quite common on Windows…)
  5. That search-and-replace needs to cope with both UTF-8 and UCS-2 while also assuming that there are no embedded ANSI strings managing to look like UCS-2. It works, but it’s brittle, and as and when someone hits one of the problems, hello segfault city and there’s no easy solution…
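
A sketch of what point 5 implies, under stated assumptions: the function name is hypothetical, UTF-16LE is used to approximate UCS-2, and `/` padding again assumes repeated separators are harmless. It does nothing about the false-match problem described above.

```python
def patch_both_encodings(data: bytes, placeholder: str, actual: str) -> bytes:
    """Rewrite a path prefix stored as UTF-8 and as UTF-16LE, keeping the
    byte length of every occurrence unchanged."""
    for enc in ("utf-8", "utf-16-le"):
        old = placeholder.encode(enc)
        new = actual.encode(enc)
        pad = "/".encode(enc)  # 1 byte in UTF-8, 2 bytes in UTF-16LE
        if len(new) > len(old):
            raise ValueError(f"prefix too long once encoded as {enc}")
        # Pad in whole code units so the encoded string stays well formed.
        new += pad * ((len(old) - len(new)) // len(pad))
        if len(new) != len(old):
            raise ValueError(f"cannot pad to an exact length in {enc}")
        data = data.replace(old, new)
    return data


# Hypothetical placeholder, embedded once per encoding.
ph = "C:/P/" + "_" * 10
blob = (b"A" + ph.encode("utf-8") + b"\x00"
        + ph.encode("utf-16-le") + b"\x00\x00Z")
out = patch_both_encodings(blob, ph, "C:/P/sw")
```

Lengths are checked per encoding because a non-ASCII user name takes a different number of bytes in UTF-8 than in UTF-16LE.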

It is already correctly noted that making the compiler relocatable allows trivial duplication of switches. The core idea here is that opam can instantly just “copy” but that proper caching systems such as Dune would link, reducing space wastage. In terms of priority, my hypothesis is that users will spend more time working on projects (with local switches / Dune package management) and thus duplicating the compiler than waiting for the first compiler to build…

… but the first time experience is obviously very important, or they’ll never start that first project! I don’t think it’s valuable to have a version of OCaml distributed which doesn’t require a C compiler… it’s basically equivalent to a full binary distribution (you can’t even build Dune). However, during a period of ill-health last year, I did some investigations on top of @SebHash’s work on the compiler’s build system, improving the parallelism and on Windows switching from using a Cygwin/MSYS2 make+sh to using a single native Windows make.exe on top of the standard Windows Command Processor. The speed-up on Linux is impressive - roughly 20% and bringing the 8-core build time down to a minute on my benchmarking system. On Windows, the speed-up was double, bringing the -j8 build time down to 2 minutes on the same system.

To expand on these points, one bit that’s specific to malware is modifying executable code at runtime, once the executable has been loaded in memory. I think that the original question is about patching on disk, which isn’t great but not as worrying.
Actually, dune does binary rewriting in some cases, in particular to support dune-build-info. But this is causing us trouble, on macOS in particular, because of the code signing that’s enabled by default. So I’m moving that to a solution that doesn’t require binary patching (this is unrelated to compiler relocation, but it gives you an idea of where the ecosystem is heading).

Have you considered using opam-bin (see the opam-bin documentation)? It provides binary packages on top of opam, with the ability to re-use these binary packages between switches and users. Using it is transparent for opam, as long as packages are relocatable. For that, we provide some patches to relocate the most commonly used packages in the OCamlPro/relocation-patches repository on GitHub (“A repository of patches to make OPAM packages relocatable”). opam-bin automatically applies these patches to packages when they are built for the first time.

That’s an interesting point. Wouldn’t we be able to change how distribution is done post relocatable compiler being shipped? For example, setting up the initial opam switch could become just downloading binary artifacts for the switch–the compiler itself and dune. Similar to what opam-bin is doing now but of course without needing source patches. Then once we have those three pieces–opam, the compiler itself, and dune–we could call that a fairly complete OCaml toolchain as it can be used to build any pure OCaml package.

Of course, that won’t work for any C bindings packages, but people who want those should be willing to spend a little more effort to get them.

I think this is correct. I don’t remember us getting complaints about esy on Windows in relation to antivirus software. cc @prometheansacrifice

Though IIRC the limit for the path prefix was not the Windows path length but a shebang line limit (128 on macOS) which is in use by bytecode executables (to refer to ocamlrun).
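
A minimal sketch of that constraint: the function name is hypothetical, the 128 default is just the figure quoted above, and the exact limit (and whether it counts the trailing newline) varies by OS.

```python
def shebang_fits(interpreter_path: str, limit: int = 128) -> bool:
    """Check whether '#!' plus the interpreter path stays within `limit`
    bytes. The default limit is the figure quoted in this thread."""
    line = "#!" + interpreter_path
    return len(line.encode("utf-8")) <= limit


short_ok = shebang_fits("/usr/local/bin/ocamlrun")
# A heavily padded prefix pushes the shebang line past the limit.
padded_ok = shebang_fits("/" + "_" * 200 + "/bin/ocamlrun")
```

This is why the padded prefix has to be sized against the shortest limit that applies, not only against the path length limit.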

By the way, the nice property of this technique is that not only the compiler can be made relocatable but also any OCaml or even C/C++ lib.

True. And I have used esy on Windows a lot and never seen anything getting flagged by Defender or any AV.

By the way, the nice property of this technique is that not only compiler can be made relocatable but also any OCaml or even C/C++ lib.

I find this to be a very important point. I also made the same observation for the long paths issue: the fix made inside esy works for tools not written in OCaml too. Ideally, the OCaml toolchain needs to consider issues like path rewriting and long path manifests for tools not written in OCaml too.

I haven’t seen anything on my PC for years because I excluded my dev folders from anti-virus a long time ago (doing that improves performance significantly, and I expect the majority of Windows devs to get around to doing that at some point). But all that says is the denominator for AV false positives is the number of users, not how long a single person went without seeing anything.

The symptoms are that an .exe is missing (it was quarantined) or that an .exe cannot be executed. A cursory scan of the esy issue queue shows:

  1. Windows - Avast Anti-virus falsely flags binaries as being infected with “IDP.Generic”. That issue makes it kinda obvious that esy is not immune to AVs. But perhaps the AVs have learned to trust esy.exe / npm bundled executables since then.
  2. windows: Unix_error -123. That mixed path (UNC + normal squished together?) makes it a better bet it is a simple bug in the path, but there is a not-insignificant chance with the CreateProcess error that it is an AV false positive. Since it is still open, it wouldn’t hurt to ask some of them to disable Windows Defender + AVs and try again.

What exactly is the proposal? Download and patch binaries? Treat one report from one proprietary AV product as proof of security? Just wondering.

Not a proposal. Just my annual thread where I try to ask about how we could get relocation to work and people patiently explain to me that it’s almost there :wink:

I’m with you on that (relocatabilitificaldiousness)