Improving Windows install time

Continuing the discussion from [ANN] Diskuv OCaml 1.1.0 - Windows, winget, setup-dkml:

@jbeckford I didn’t want to go off-topic on your announcement thread so I detached this discussion into a new topic.

Does it become faster depending on the speed of your internet connection and the number of cores?
Even so, 90 minutes sounds a lot longer than what I’m used to on Linux; e.g. building the OCaml 5.0 compiler itself from scratch takes ~1m2s for me on Linux (with 24 cores).

There are a few techniques that might be useful to speed up install time (perhaps as an optional alternative download):

  • ship a precompiled version of the compiler (this might need relocation support in opam, which is probably not there yet, but would using a hardcoded path, e.g. ‘C:\Program Files\Diskuv’, help in the meantime?)

  • ship a dune cache of everything you’d normally install after the compiler itself. On Linux, to turn on the dune cache with opam I have to set an environment variable: export DUNE_CACHE=enabled (because the sandboxed dune won’t see my ~/.config/dune/config, where I otherwise turn on caching). Caching can be tricky: if you’re not careful you will eventually run out of disk space when the dune cache grows to tens of gigabytes, so you have to run dune cache trim periodically (a scheduled task could take care of that). It probably also depends on the path you use for the install.
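
A minimal sketch of the dune cache setup described in the second bullet, assuming a POSIX shell (bash on Linux, or the bash that ships with MSYS2) and a dune version that has the `dune cache trim` subcommand. The 10GB cap and the `DUNE_CACHE_CAP` variable are made up for illustration, and `--size` is, as far as I know, the option that sets the size to trim the cache down to; check `dune cache trim --help` on your version before relying on it.

```shell
# Enable dune's shared cache for every build started from this shell,
# including the builds that opam runs in its sandbox.
export DUNE_CACHE=enabled

# Hypothetical cap (not a value from this thread): keep the cache near 10GB.
DUNE_CACHE_CAP="10GB"

# Trim only when dune is on PATH, so the snippet is harmless elsewhere.
if command -v dune >/dev/null 2>&1; then
  dune cache trim --size="$DUNE_CACHE_CAP" || echo "dune cache trim failed" >&2
fi
```

Running the last part from cron (or a Windows scheduled task) would cover the periodic trimming.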

There is probably not much you can do about speeding up the install of Visual Studio itself; however, it may already be present on the user’s system, so speeding up the installation when it is already present might be nice.

There is also a portable version of MSYS2 here that might be useful: GitHub - 2wayne/MSYS2Portable: MSYS2 in PortableApps.com Format (from a quick search I was not able to find the source code of portableapps.com to see how it actually works, although it claims to be open source).
I’m not sure whether it does anything more than move directories and search-and-replace paths inside config files with the install dir (e.g. if the install path gets hardcoded in a binary you build, I doubt it could help fix that), and from what I can see the OCaml compiler binaries (at least on Linux) do end up with absolute paths inside the executables.
I see that there are experimental relocation patches for the compiler at GitHub - OCamlPro/relocation-patches: A repository of patches to make OPAM packages relocatable. Could these help in making a pre-compiled OCaml distribution for Windows?

I am far more concerned about correctness and parity than a one-time performance cost at this point for Windows. Even with a fictional 90-second install, who cares if OCaml packages don’t work on Windows? I sure don’t.

But if you are heavily invested and concerned about performance today, you can contribute today. dkml-base-compiler is an opam package that works on Windows/macOS/Linux so you don’t need to be a Windows user to work on it. It could be sped up tremendously by not compiling from scratch but by re-using existing ocaml*.opt binaries. opam supports the extra-source xyz.tar.gz { ... } clause which can be used for binary assets and should be a good starting point. Private message me and I can shepherd you along.

Summary: If you want to improve the Windows UX, there are a lot of already-known areas where you can help. This is a resource (people power) problem, so these conversations go nowhere unless people are willing to actually contribute. So far I haven’t seen people willing to contribute. Prove me wrong this New Year!

Good choice. It’d be difficult for me to contribute to something Windows-specific, but if it is cross-platform then I can!

I’ve opened a few small PRs with some small improvements:

  • -j wasn’t used consistently, and fixing that yielded a ~3x speedup
  • the instructions from the README.md didn’t work, but I found instructions in DEVELOPING.md that did (maybe after following the instructions in DEVELOPING.md the ones from the README would work? I haven’t tried, but anyway I think adding a pointer to DEVELOPING.md from README.md would be useful)

There are definitely more improvements that can be made, like the one you suggested, but tweaking make flags was an easy first fix.
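
To illustrate what using -j consistently can look like, here is a small sketch, assuming a POSIX shell. `detect_jobs` and `JOBS` are hypothetical names for this example, not taken from the dkml scripts, and the detection order is just one reasonable choice.

```shell
# Print the number of parallel jobs to use; falls back to 1.
detect_jobs() {
  if [ -n "${NUMBER_OF_PROCESSORS:-}" ]; then
    # Set by Windows and normally inherited by an MSYS2 shell.
    echo "$NUMBER_OF_PROCESSORS"
  elif command -v nproc >/dev/null 2>&1; then
    nproc
  else
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  fi
}

JOBS="$(detect_jobs)"

# Pass the same value to every make invocation, for example:
#   make -j"$JOBS"
echo "building with -j$JOBS"
```

Computing the value once and reusing it avoids the situation where some build steps run in parallel and others silently run with a single job.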

I like short edit-compile-test feedback cycles; improving that might lower a barrier to contribution, so I’ve made some small steps towards that.

Also, as a bonus, I noticed that diskuv has some patches/support for cross-compilation, which might become useful for me. Currently I have to compile for Linux RISC-V either in QEMU or on native hardware (both of which are slow). Cross-compiling would be ideal, but it is not supported by the upstream compiler distribution. It probably won’t be a trivial amount of work to get that working (although I see some ARM cross-compilation scripts there already), but I’ll give it a go next time I have some time (on 4.14, since 5.0 doesn’t have RISC-V yet).

Amazing! Thanks for the PRs and the feedback. I’ll do a point release next week and see the overall effects on the Vagrant-based installation testing.

I would like to understand the current architecture and whether MSVC can be made an option, having an Msys2 based subset as the minimum viable product.

As far as I understand, the mingw port works but is affected by the relocation issues mentioned above. If this were solved, what else would be left? Couldn’t all other packages then be built from there? Or could it lift Msys2’s binary distribution architecture?

Msys2 is already agnostic to its location and can be placed anywhere (its packages too), or even have multiple copies. I have for instance used this property to build Emacs using a script that downloads Msys2, the relevant packages and compilers, and builds a self-contained package: GitHub - juanjosegarciaripoll/emacs-build: Scripts to build a distribution of Emacs from sources, using MSYS2 and Mingw64(32). That said, a build process with Msys2 is slower because shell scripts and executable invocation are both slower on Windows than on Linux.

Is this any different from the discussion you started on Installing on Windows? The one where you said that “Ocaml’s inability to “relocate” properly has broken the (MSYS2) port”?

I’ll mention again what I said in that thread (“There is no obstacle except people power”) and in almost all threads related to Windows. In particular, an MSYS2 port is something I would almost never prioritize in Diskuv OCaml, since that is focused on native development (i.e. MSVC, not GCC, on Windows). Diskuv OCaml makes a lot of opinionated choices that may not reflect what you need. I do believe that I’ve been very upfront about what those choices are; see the first paragraph of GitHub - diskuv/dkml-installer-ocaml: The Windows-friendly distribution of OCaml. And I must pick and choose because the Windows distribution is a part-time, tangential endeavor for my company.

That doesn’t mean you or anyone else couldn’t make a MSYS2 port their priority. It’s open source. And I think it would be a good idea. Hack away!

BTW, this same topic (lack of contributors) has come up in the recent OCaml compiler thread, recent dune thread and by myself for recent Windows threads. It may be time for a reset of expectations for discuss.ocaml.org members who are capable of being contributors.

To your first question, my message above is different from the one I posted before. This thread started with questions about the time it takes to build this distribution. Along these lines I was asking why Diskuv cannot have optional dependencies on Visual Studio when it is clearly already installing MSYS2 and using it to build the compiler. You point me to a paragraph that talks about compatibility with Visual Studio, but the fact is that libraries built using mingw/ucrt64 runtimes are compatible with Visual Studio.

Having led my own Common-Lisp compiler project for 10 years, I understand this (the Windows port) is not a full time effort for you, and it must be exhausting to reply to questions, but rephrasing sincere curiosity as criticism of your choices or plain demands is a bit cynical, IMHO.

Nevertheless, I understand from your message that maybe this forum is for members who are capable of being contributors and the rest of users without enough ML mojo should lurk away and wait for this to be finished.

Not my intent. I did not see how those questions were different so I asked. And it is related to an assumption (“clearly already installing MSYS2 and using it to build the compiler”) that is at best partially right. Confusion is to be expected. Let me clarify and hopefully satisfy your curiosity:

  • Using MSYS2 does not imply using the GCC compiler. Git for Windows and vcpkg both use MSYS2 for the bash shell, tar and several other common UNIX tools, but don’t bring along GCC. In the same way, Diskuv OCaml is using bash, GNU tar, etc. but not GCC.
  • It is a stretch (and confusing) to say that Diskuv OCaml is using MSYS2 to build the compiler. It is using bash from MSYS2 and also using MSVC, but the critical part is MSVC. At some point the bash part will be dropped (more on this below).
  • Since my distribution is for native users of Windows, I do not want to expose Unix to Windows users. I do my best to hide all uses of MSYS2 from the user.
  • More pointedly, if the OCaml compiler did not require a bash shell (today it relies on GNU autotools, but that is changing), and if several opam packages (and opam itself) did not assume Unix tools were present, then I would drop MSYS2 in a heartbeat.
  • I do think the phrasing “clearly already installing MSYS2 and using it to build the compiler” greatly minimizes what it takes to properly support MSYS2. That person would have to take on the duties of the packages.msys2.org maintainer, and would have to upstream or maintain patches for MSYS2 compatibility and depexts in the opam repository.

One extra assumption seems incorrect (“libraries built using mingw/ucrt64 runtimes are compatible with Visual Studio”). It is a common misunderstanding, but it is not strictly true. I’ll refer to Rust, which does the right thing and separates the MSVC ABI from the GNU ABI. Mixing those ABIs leads to very difficult-to-debug problems in production where no one will be able to help you, and is one of the primary reasons why … as a distributor who has to respond to tickets … I avoid GCC on Windows. (To minimize those problems I am using CLANG64 in MSYS2, but even that isn’t a full solution; see conan’s very good explanation of the differences in the clang target ABIs.)

Hopefully that answers why I have less than zero interest in taking up an additional mantle as the MSYS2 package maintainer and the additional mantle of MSYS2 opam repository maintainer. I did not mean that as a slam on someone who wants to adopt a seamless OCaml + MSYS2 solution; it simply is counterproductive to my goals. I do think a seamless OCaml + MSYS2 solution is a great idea though for some users who have different goals, and I believe Diskuv OCaml would be the easiest starting point to add in seamless OCaml+MSYS2. So I’m going to encourage people who are capable and in that target user group to lead that effort because … who else is going to do it?

Obviously you were not being cynical so there must be a language issue here. What he in fact said was “It may be time for a reset of expectations for discuss.ocaml.org members who are capable of being contributors” (my emphasis).

If I’m reading this correctly, then Clang has a mode that would be more compatible with MSVC than GCC is. What is missing from clang to be fully compatible with MSVC? (IIUC msvc defines an incorrect value for __cplusplus but would otherwise be compatible with clang or clangcl - not msys clang though)

My only experience with trying to support MSVC for another open source project was about a decade ago, so it is probably outdated, but at the time the main difficulty was that MSVC didn’t fully implement C99, and the only way to compile C99 code was to claim the code was C++. But that only got you 90% of the way, because of course C++ and C99 aren’t exactly the same language even if you don’t use C++ features. So you either had to avoid using C99 in the code, even on other more capable compilers like GCC, or try to work around the various C++/C99 and other compatibility issues.
Although looking at Microsoft C/C++ language conformance | Microsoft Learn, it looks like VS2017 has finally caught up and implemented nearly all of C99, and some C11 features too. That is good to see, because my impression was that they were stagnating on C standard support and focusing only on C++.

C In Visual Studio 2019 version 16.6 and later versions, the compiler fully implements the standard C99 preprocessor via the /Zc:preprocessor option. (In Visual Studio 2017 versions 15.8 through 16.5, the compiler supports the standard C99 preprocessor via the /experimental:preprocessor compiler option.)
This option is on by default when the compiler option /std:c11 or /std:c17 is specified.

(obviously there is a lot more to C code compatibility than the language standard itself, but that is a minimum the OS has to support to make porting possible)

What is the minimum version of MSVC required by dkml? Would requiring MSVC >= 2019 version 16.6 help in reducing incompatibility with OCaml packages?

The exact list is at MSVC compatibility — Clang 16.0.0git documentation . To be honest, none of that stuff concerns me because those outstanding issues are about C++ ABI rather than C ABI. It is actually that MSYS2 (and Cygwin) bundle MinGW with a bunch of GNU specific attributes rather than the real Windows SDK. And that is not MSYS2/Cygwin’s fault because of the licensing for Windows SDK.

Oh, I think they are still stagnating on C support. It is annoying that they don’t seem to care about C that much.

The only versions supported are 16.5 and 16.6. Not surprisingly, those are the versions available in GitHub Actions and GitLab CI/CD for MSVC 2019. All other versions I tried ran into problems;
dkml-runtime-distribution/Machine.psm1 at main · diskuv/dkml-runtime-distribution · GitHub has some of those problems. But that was using OCaml 4.11 and 4.12. I believe there is some CI floating around using MSVC 2022 (it may be the opam CI), so OCaml 4.14.x probably works with a wider range of versions and that list can be expanded.

Thanks. Indeed, the linker bug you pointed out in that file claims to have been fixed by Use /EMITVOLATILEMETADATA:NO with Microsoft Linker by dra27 · Pull Request #96 · ocaml/flexdll · GitHub.

Discourse only allows me one :heart: reaction, so I’ll add some more here: :heart::heart::heart::heart::heart: :slightly_smiling_face:

Note that C:\Program Files is localised for many locales and should never be hard-coded in anything. The environment variable ProgramFiles is provided to point to it, and applications should use SHGetKnownFolderPath to retrieve it - but that of course doesn’t work without relocation. When they existed, the Windows binary distributions of OCaml “cheated” by setting OCAMLLIB, but that’s a serious stability problem in a multi-compiler opam world. Binary rewriting tricks are the least unstable thing in the meantime before relocation lands (returns to keyboard to rebase relocatable patches :dash:)

I’m just a decades-long fan-boy, rather than affiliated, but this isn’t the official line. VS 2019 is C11 and C17 compliant (since 2020 - see announcement) and support for C11 optional features is in VS 2022 17.5 preview 2 which means OCaml 5.1/5.2 will hopefully restore MSVC support. Once merged, it will have the nice effect that OCaml 5.x will require a C11-compliant compiler without any exceptions needed for MSVC (OCaml 4.14 is C99-requiring with MSVC as an exception).

Off-topic, but note that RISC-V support has landed and will be available in 5.1 (or today if you use the 5.1.0+trunk opam switch).

Cheers,
Nicolas
