While implementing a small CLI tool, I ran into a seemingly undocumented feature of the OCaml runtime: on Windows, it automatically expands wildcards in command-line arguments before doing anything else, which proved to be a problem.
This post serves three different goals:
- give some visibility, in case someone else runs into this issue in the future
- present a possible workaround
- ask the community if there is a better way™ to solve this
Context
My tool uses Cmdliner for CLI argument processing and needs to handle basic wildcards for one of its options, e.g. it should handle mytool.exe -x *.ml. A shell would expand this to mytool.exe -x a.ml b.ml c.ml, which Cmdliner cannot handle. Under any common Unix shell, this is not a problem: we just have to escape the star character, e.g. mytool.exe -x \*.ml, have mytool handle the expansion itself, and we're all set. So far, so good.
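To illustrate what "handle the expansion itself" could look like, here is a minimal sketch in OCaml. It is not the code mytool actually uses: the names matches and expand are hypothetical, only * and ? are supported, and matching is case-sensitive (which may not be what you want on Windows).

```ocaml
(* [matches pattern name] is true when [name] matches [pattern], where
   '*' stands for any run of characters (possibly empty) and '?' for
   exactly one character. Hypothetical helper, for illustration only. *)
let matches pattern name =
  let plen = String.length pattern and nlen = String.length name in
  let rec go p n =
    if p = plen then n = nlen
    else
      match pattern.[p] with
      | '*' ->
        (* Either '*' matches nothing, or it swallows one more character. *)
        go (p + 1) n || (n < nlen && go p (n + 1))
      | '?' -> n < nlen && go (p + 1) (n + 1)
      | c -> n < nlen && name.[n] = c && go (p + 1) (n + 1)
  in
  go 0 0

(* [expand ~dir pattern] lists the entries of [dir] whose names match
   [pattern], sorted so that the result is deterministic. *)
let expand ~dir pattern =
  Sys.readdir dir
  |> Array.to_list
  |> List.filter (matches pattern)
  |> List.sort String.compare
```

With this in place, the value received for -x can be passed to something like expand ~dir:"." "*.ml" once Cmdliner has parsed the command line.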
Then came Windows. Whatever I did, there seemed to be no way of preventing the wildcard from being expanded. I learned that on Windows, the called program is responsible for dealing with wildcards, not the shell. After some digging, I found the root cause of this behaviour in the OCaml runtime itself, in runtime/main.c:
int main_os(int argc, char_os **argv)
{
#ifdef _WIN32
  /* Expand wildcards and diversions in command line */
  caml_expand_command_line(&argc, &argv);
#endif
  /* [...] */
}
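A quick way to observe this is a throwaway program that prints Sys.argv exactly as the OCaml code receives it. This is only a sketch; describe_argv is a hypothetical helper name, not part of mytool.

```ocaml
(* [describe_argv argv] renders each argument with its index, quoting it
   with %S so that spaces and special characters stay visible. *)
let describe_argv argv =
  Array.to_list
    (Array.mapi (fun i arg -> Printf.sprintf "argv[%d] = %S" i arg) argv)

(* Print the arguments as seen by the program, one per line. *)
let () = List.iter print_endline (describe_argv Sys.argv)
```

If the behaviour described above applies, running the resulting executable on Windows with *.ml as an argument, in a directory containing .ml files, should list the individual file names rather than the literal pattern.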
After a bit of history digging, it turns out this behaviour dates back to the very early stages of the OCaml compiler, see this commit by Xavier Leroy from… 1996!
Workaround
The runtime/main.c file gives a hint on how to work around this:
/* Main entry point (can be overridden by a user-provided main()
   function that calls caml_main() later). */
So the most elegant workaround I could find was to create a copy of the main.c file inside the source tree of mytool and comment out the call to caml_expand_command_line. Then it was a matter of compiling and linking everything together. I use dune to compile mytool.exe, and after a lot of trial and error, I found out it could handle this very easily with the foreign_stubs field of the executable stanza:
(executable
 (name mytool)
 (foreign_stubs (language c) (names main))
 ; ...
)
Minimal working example
I created a GitHub repository containing a minimal project with a custom entry point, so that command-line argument expansion does not happen on Windows.
See: GitHub - benji-sb/ocaml-windows-argv
Open Questions
- The root cause of this issue was introduced almost 30 years ago. How come no one on the Internets seems to have run into a similar issue?
- Why was this behaviour introduced in the first place? I suspect it may have made it easier to set up a Windows toolchain back then, but that’s just wild speculation.
- Is this behaviour still needed, or could we get rid of it?
- Should this be more widely documented, and if so, where? The OCaml compiler docs and the dune docs could probably benefit from a small paragraph on how to override the default entry point.