Large repository of real dune files, to grep for examples

Is there a place in which a large number of real dune files are easily accessible, so that one can grep them for examples? Namely for details which are not necessarily written out in the docs.

For instance, opam-repository is ideal for grepping opam files and finding out how real people solve some issues, which syntax they use, etc. I often feel the need for a similar thing for dune files, so that I can quickly check if a given expression is meaningful (or better: finding a similar, but actually valid, expression).

I’ve used GitHub search for this in the past: Code search results · GitHub (but I’m not sure how stable or reliable it is anymore).

https://sherlocode.com is a good alternative for OCaml, but it does not index dune files, as far as I’m aware.



This is impressive because it is extremely fast. But how are people using it? It implements string search and that is often a bit too granular for conceptual problems.

You could also use opam-grep. It downloads every package in OPAM and searches within that set.


opam-grep is quite limited (it only says “package foo matches” or “package foo doesn’t match”), and slow, at least assuming it hasn’t changed recently. I had to write another tool instead. E.g. to check how people use the preprocess field in the opam packages of the current switch:

$ time _build/default/bin/opam_search.exe -x 1 $(opam list --short | while read p; do echo -p $p; done) -- grep -r --include 'dune' 'preprocess' -A 1 | head
alcotest:src/alcotest-stdlib-ext/dune: (preprocess future_syntax))
alcotest:--
alcotest:src/alcotest-engine/dune: (preprocess future_syntax))
alcotest:--
alcotest:src/alcotest/dune: (preprocess future_syntax))
angstrom:lib_test/dune: (preprocess
angstrom:lib_test/dune-  (per_module
angstrom:--
angstrom:lib/dune: (preprocess future_syntax))
base:generate/dune: (preprocess no_preprocessing))

real	0m0,849s
user	0m0,937s
sys	0m0,655s

Thanks! I didn’t know about opam grep.

I tried a bit, and I must agree with @v-gb that it is somewhat slow, and the matching is not very informative, but just to clarify, it does search every file in the package, right? And there’s no option to filter by file name (e.g. rg --glob)? Because then I could include only dune and dune-project files and it would be perfect!

I just cloned opam-search and ran dune build. After 3 minutes of compiling (I thought it was looping, since there was no output during this time), it produced some warnings:

File "time_float_unix/time_unix/time_unix.ml", line 1, characters 4-14:
1 | [@@@deprecated "[since 2022-04] Use [Time_float_unix] instead"]
        ^^^^^^^^^^
Warning 53 [misplaced-attribute]: the "deprecated" attribute cannot appear in this context
File "bigstring_unix/src/dune", line 4, characters 9-29:
4 |   (names bigstring_unix_stubs recvmmsg))
             ^^^^^^^^^^^^^^^^^^^^

and some errors:

/home/user/.cache/dune/toolchains/ocaml-compiler.5.3.0-ca92397a8f2c8e38e16694f64da85045/target/lib/ocaml/caml/platform.h: In function ‘caml_plat_latch_is_released’:
/home/user/.cache/dune/toolchains/ocaml-compiler.5.3.0-ca92397a8f2c8e38e16694f64da85045/target/lib/ocaml/caml/platform.h:222:10: error: implicit declaration of function ‘atomic_load_acquire’ [-Wimplicit-function-declaration]
  222 |   return atomic_load_acquire(&latch->value) == Latch_released;
...
error: implicit declaration of function ‘atomic_store_release’
...

Should I report these as issues on GitHub?

Otherwise it seems very useful for what I want, thanks!

It downloads and unpacks the whole repository, so you can take that location and use it with e.g. ripgrep.
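For illustration, a minimal sketch of restricting a search to dune and dune-project files in a directory of unpacked sources. The helper name search_dune and the SRC_DIR placeholder are hypothetical; where the sources actually end up on disk depends on the tool and is not assumed here.

```shell
# search_dune PATTERN DIR
# Recursively search DIR, but only in files named exactly "dune" or
# "dune-project". Prints matches as path:line-number:content.
# (Hypothetical helper; DIR is wherever the package sources were unpacked.)
search_dune() {
  pattern=$1
  dir=$2
  grep -rn --include='dune' --include='dune-project' -e "$pattern" "$dir"
}

# Example use, with SRC_DIR standing in for the unpacked location:
#   search_dune 'preprocess' "$SRC_DIR"
# A roughly equivalent ripgrep call would be:
#   rg --glob dune --glob dune-project 'preprocess' "$SRC_DIR"
```

The file-name filter does the real work here: it keeps matches in .ml files and other sources out of the results.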


Just the dune files from everything in Opam is probably not a lot of data. So a script that looks just at code hosted on GitHub (for simplicity) and downloads all dune files into a hierarchy in the local file system is probably not out of reach. Sounds like a job for an AI.
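As a rough sketch of the "hierarchy in the local file system" part only (the download step is left out), the following copies every dune and dune-project file from an already-fetched source tree into a separate directory, preserving relative paths. The function name collect_dune and both directory arguments are hypothetical.

```shell
# collect_dune SRC DEST
# Copy every file named "dune" or "dune-project" found under SRC into DEST,
# keeping the same relative directory layout. All other files are ignored.
# (Hypothetical helper; assumes SRC already contains the fetched sources.)
collect_dune() {
  src=$1
  dest=$2
  # List matching files relative to SRC, one per line.
  (cd "$src" && find . -type f \( -name dune -o -name dune-project \) -print) |
  while IFS= read -r f; do
    mkdir -p "$dest/$(dirname "$f")"
    cp "$src/$f" "$dest/$f"
  done
}

# Example use:
#   collect_dune "$SRC_DIR" ./dune-files
```

The resulting tree contains only build descriptions, so it stays small and can be searched with any grep-like tool.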

The first dune build would build dependencies, so yeah, it’d take a bit of time. But dune provides progress, so not sure why you weren’t seeing any.
For the build errors, that’s surprising considering the lock file. Let’s discuss this in an issue, yes. Thanks!


The regular releases only do so when started with --display short (starting with 3.21). The preview used to display this by default, the nightly preview does not.


In the near future, Software Heritage should hopefully allow this kind of task to be performed without the need for an AI, and with the advantage of covering many more repositories than GitHub (e.g. self-hosted GitLab instances).


Indeed, and it has a copy of everything on the opam repository.