My first experience with OCaml

Hey,

I’ve been playing with OCaml for the past few days and decided to share my first impressions while the memory is still fresh.

You can find the full source code of the program I wrote here.

I’d be curious to hear your thoughts and any feedback.

Best,
Alex

14 Likes

Thanks for this impression.

I find it curious that Lwt and Domain don’t fit together and result in a segmentation fault.

There is also GitHub - ocaml-community/awesome-ocaml: A curated collection of awesome OCaml tools, frameworks, libraries and articles. which can be helpful when searching for a library.

1 Like

I am not sure what exactly caused this segmentation fault. I spawned threads using domainslib and called cohttp-lwt inside those threads, which triggered the crash. When I switched to ezcurl, the issue disappeared. Thanks for the link!

Ezcurl starts a “curl” subprocess, so it doesn’t depend on any C code outside the OCaml standard libraries (except for the curl program, which runs isolated in its own process). (And we can expect these libraries to be reentrant since 5.0.) Cohttp, on the other hand, depends on several pieces of C code that were developed before OCaml 5, when reentrancy was not necessary (Lwt, OpenSSL…).

(Normally OCaml code doesn’t cause a segmentation fault. Non-reentrant C code is probably the culprit.)

1 Like

As far as I know Ezcurl is a thin wrapper for Ocurl which is a binding for libcurl, so there is no subprocess and the request runs in the same process as the OCaml program. Someone please correct me if I’m wrong. Cc @c-cube

2 Likes

Yes, my bad. I hadn’t considered that curl also comes as a library (libcurl). I guess reentrancy is better handled in ocurl.

I feel your pain with Dune.

1 Like

Correct, ezcurl is a wrapper around libcurl bindings. The subprocess one is curly.

1 Like

Thanks for playing with OCaml and sharing your experience. One gets to learn something for the first time exactly once, so your documenting the experience is invaluable for us. Thanks.

Regarding domainslib, what you want here is concurrency and not parallelism. If you expect to speed up CPU-intensive computation, then go for domainslib. If you expect to do a lot of IO (as is the case here), go for a concurrent programming library. Generally, a library that offers concurrency but not parallelism, such as Lwt or Async, is good enough for the latter. Note that neither is safe yet for parallelism. If you use Lwt or Async in a parallel setting (with multiple domains), expect crashes.
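To make the parallelism side concrete, here is a minimal sketch that uses the standard library's `Domain` module directly rather than domainslib. The names `sum_range` and `parallel_sum` are made up for illustration, and the loop is only a stand-in for real CPU-bound work.

```ocaml
(* Sum the integers in [lo, hi): a stand-in for CPU-bound work. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi - 1 do
    acc := !acc + i
  done;
  !acc

(* Split the range in two and compute each half in its own domain.
   Domain.spawn starts a parallel computation; Domain.join waits for it. *)
let parallel_sum n =
  let mid = n / 2 in
  let left = Domain.spawn (fun () -> sum_range 0 mid) in
  let right = Domain.spawn (fun () -> sum_range mid n) in
  Domain.join left + Domain.join right

let () = Printf.printf "%d\n" (parallel_sum 10)
```

This kind of split only pays off when each half is expensive to compute. For work that mostly waits on the network, extra domains add little.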

If you want to try a new and shiny concurrent programming library, give Eio a spin. This builds on top of the latest concurrent programming features available in OCaml 5 (effect handlers) and is parallelism safe (unlike Lwt and Async).

6 Likes

Yeah, I saw Eio was mentioned somewhere for concurrency. Yes, I need something lightweight for this program, not a separate thread per website. Something similar to a goroutine from Go.

With Domainslib, you limit the number of threads, and therefore the number of simultaneous web fetches.
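A rough sketch of that limit, written against the standard library's `Domain` and `Atomic` modules instead of Domainslib itself: a fixed number of worker domains pull items from a shared counter, so at most `workers` calls run at once. `map_with_workers` and `fake_fetch` are hypothetical names, and `fake_fetch` does no network I/O.

```ocaml
(* Apply [f] to every input using at most [workers] domains at a time. *)
let map_with_workers ~workers f inputs =
  let workers = max 1 workers in
  let inputs = Array.of_list inputs in
  let results = Array.make (Array.length inputs) None in
  (* Shared counter: each worker atomically claims the next index. *)
  let next = Atomic.make 0 in
  let rec worker () =
    let i = Atomic.fetch_and_add next 1 in
    if i < Array.length inputs then begin
      results.(i) <- Some (f inputs.(i));
      worker ()
    end
  in
  let domains = List.init workers (fun _ -> Domain.spawn worker) in
  List.iter Domain.join domains;
  (* Every slot was filled by exactly one worker before the joins returned. *)
  Array.to_list (Array.map Option.get results)

(* Hypothetical stand-in for a real HTTP request. *)
let fake_fetch url = String.length url

let () =
  map_with_workers ~workers:2 fake_fetch
    [ "a.example"; "bb.example"; "ccc.example" ]
  |> List.iter (Printf.printf "%d\n")
```

With `~workers:2`, the third URL is only picked up once one of the two domains has finished its current item.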

With a concurrent library, a single thread can fetch all web pages simultaneously. Each I/O operation triggers a coroutine yield, which lets the fetch of another page proceed.
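Here is a toy illustration of that yield-and-switch behaviour using OCaml 5 effect handlers from the standard library. It is not how Eio or Lwt are implemented: the `Yield` effect, `fetch`, and `run_all` are hypothetical names, and the "fetch" is simulated with no network access.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect, performed where a real program would wait on I/O. *)
type _ Effect.t += Yield : unit Effect.t

(* Records the order in which the simulated steps ran. *)
let log : string list ref = ref []

(* Simulated fetch: two steps separated by a yield. *)
let fetch url () =
  log := ("start " ^ url) :: !log;
  perform Yield; (* pretend we are waiting for the response *)
  log := ("done " ^ url) :: !log

(* Round-robin scheduler: runs every task on the current thread. *)
let run_all tasks =
  let ready : (unit -> unit) Queue.t = Queue.create () in
  let run_next () =
    match Queue.take_opt ready with
    | Some job -> job ()
    | None -> ()
  in
  let spawn task =
    match_with task ()
      { retc = (fun () -> run_next ());
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
            Some (fun (k : (a, unit) continuation) ->
              (* Park this task and let another one make progress. *)
              Queue.add (fun () -> continue k ()) ready;
              run_next ())
          | _ -> None) }
  in
  List.iter (fun t -> Queue.add (fun () -> spawn t) ready) tasks;
  run_next ()

let () =
  run_all [ fetch "a.example"; fetch "b.example" ];
  List.iter print_endline (List.rev !log)
```

Both fetches start before either finishes, even though only one thread is involved. A real library resumes a task when its I/O is ready rather than in simple round-robin order.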

In eio/README.md at main · ocaml-multicore/eio · GitHub you have a worker pool pattern which can be used with Eio to limit simultaneous downloads.

2 Likes

If your needs fit somewhere in the middle (lots of I/O but also the occasional CPU intensive task) there’s Lwt_domains (by @sudha ) which allows you to spawn some parallel computations from within Lwt, and view it as a promise.

1 Like

Hi @plutov, thanks for writing up your experience.

There is a concurrent HTTP/1 and HTTP/2 client which you may be interested in: httpcats 0.0.1 (latest) · OCaml Package (opam install httpcats).

3 Likes