OPAM Packaging for large project

I’m after some help and advice on packaging a large OCaml project up for OPAM.

I’m working on a native OCaml project for AWS: inhabitedtype/ocaml-aws. The project takes the service definitions published via
botocore and runs them through a generator to produce OCaml bindings. As it currently stands, the project is split into the following
parts:

  • aws - core http / serialisation / error handling code
  • aws-lwt - thin wrapper to use lwt
  • aws-async - another thin wrapper to use async
  • aws-* - a library per service eg RDS, EC2

Each part has its own *.opam file, so I would expect to push each to opam as a separate package.
Does this split make logical sense for an OCaml library? I’m following the experience I had with
a similar Haskell library (amazonka) which worked nicely.

How can I package up each library for OPAM, with as automated a process as possible?
The last time I tried this, none of the available tools like opam-publish or dune-release worked well with multiple libraries in a single GitHub repo.

Right now I have working interactions with a reasonable selection of services:
autoscaling, cloudformation, cloudtrail, cloudwatch
ec2, elasticache, elasticloadbalancing, rds
route53, sdb, sqs, ssm, sts

The challenge I have is testing each service properly to ensure I don’t break something when adding new services or hacking on the code generation. Right now I have a bunch of unit tests that hit individual services and do some basic things. This approach isn’t going to scale particularly well: the tests are somewhat slow once you wait for real resources to spin up, and the monetary cost of AWS resources will get too much for an individual to cover. I am thinking I will write a selection of basic scenario tests that use free-tier resources, and also provide a CLI client that replicates the official CLI client so that other tests can be done ad hoc.

Thoughts? Are there other options I haven’t considered?

Questions:

  • How can I package up each library for OPAM?
  • How can I best test the library without incurring significant overhead or cost?

First, this is great! I mean, great! When I was working with AWS 13 years ago, I had to hack together some Perl scripts to talk to it, b/c the Perl APIs were insufficient for my needs, and no way was I gonna use Java APIs. Great to see progress.

Second: for testing. A possible way to approach this is to write mock AWS backends that respond correctly. You can test -those- by using actual distributed AWS client code (which you hence can assume is correct) and pointing it at your mock backends, checking that they get the right results, etc.
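To make the mock-backend idea a bit more concrete, here is a minimal sketch in plain OCaml. Everything in it is hypothetical: the `request` and `response` types and `make_mock_queue` are invented for illustration and are not part of ocaml-aws. A real mock would speak the service's HTTP wire protocol; this one only shows the shape of an in-memory stand-in for a queue-like service that tests can call without touching AWS.

```ocaml
(* Hypothetical request/response types, invented for this sketch.
   A real mock would accept and return HTTP messages instead. *)
type request =
  | Send_message of string
  | Receive_message

type response =
  | Sent
  | Message of string
  | Empty

(* Each call to [make_mock_queue] returns an independent in-memory
   backend, so tests do not share state. *)
let make_mock_queue () : request -> response =
  let messages = Queue.create () in
  function
  | Send_message body ->
    Queue.add body messages;
    Sent
  | Receive_message ->
    (match Queue.take_opt messages with
     | Some body -> Message body
     | None -> Empty)

(* Example scenario: send one message, read it back, then find the
   queue empty. *)
let () =
  let backend = make_mock_queue () in
  assert (backend (Send_message "hello") = Sent);
  assert (backend Receive_message = Message "hello");
  assert (backend Receive_message = Empty)
```

The same scenario could then be replayed against the real service occasionally, to check that the mock and AWS still agree.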

On testing, I’ve used https://github.com/localstack/localstack as a local mock for many core AWS services. Our code was JS/F# rather than OCaml, but it worked well enough. Other suggestions are welcome. I specifically don’t want to take on the maintenance burden for that code; I’ve enough code to write as is :)

What was the problem with dune-release? Assuming you want to release all of the packages in the repository at the same time (with the same version number), it should be very easy.

For packaging, just generating all the opam files from dune-project and then using dune-release should work fine. See mirage/ocaml-github (GitHub APIv3 OCaml bindings) for a project with multiple files.
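As a rough illustration of generating the opam files from dune-project, a sketch for this repository might look like the following. The package names follow the aws / aws-lwt / aws-* split described earlier in the thread; the synopses, the OCaml version bound, and the exact dependency lists are assumptions made for the example, not taken from the actual project.

```dune
(lang dune 2.0)
(name aws)

; Ask dune to write one <package>.opam file per package stanza below.
(generate_opam_files true)

; Shared metadata, copied into every generated opam file.
(source (github inhabitedtype/ocaml-aws))

(package
 (name aws)
 (synopsis "Core HTTP, serialisation and error handling")
 (depends
  (ocaml (>= 4.08)))) ; version bound is an assumption

(package
 (name aws-lwt)
 (synopsis "Thin wrapper to use the bindings with Lwt")
 (depends
  (aws (= :version)) ; keep the wrapper in lockstep with the core
  lwt))

(package
 (name aws-ec2) ; one stanza like this per generated service library
 (synopsis "EC2 bindings")
 (depends
  (aws (= :version))))
```

With this in place, running `dune build` regenerates the `*.opam` files, so the per-package metadata lives in one file instead of being edited by hand in each opam file.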

For AWS testing, we do have a small grant from AWS (that we use to provision a Graviton2 ARM64 machine for the opam-repo-ci). Given the importance of having working AWS bindings to the OCaml ecosystem, I’d be happy to give you access to that account so that you can run the full test suite before every release. Drop me an email (anil@recoil.org) and we can figure out the details directly.


Is there any issue with packaging everything up as a single tar?

I expected there to be a single tbz file for each opam project, so you would only need to download the bits you actually depend on. However, I noticed using dune-release creates a single aws-1.1-71-g48534408.tbz file.

“Assuming you want to release all of the packages in the repository at the same time (with the same version number)”

I can if that is the easiest path, but I really don’t need to. The core pieces change infrequently while the bindings to individual services like RDS change more often, being the place where the serialisation happens. Taking a data point from amazonka they just release everything with the same version number.

Thanks for this effort!
We have support pending for opam files that specify a single subdirectory for a subpackage, which would then allow generating smaller tarballs for your libraries; but it’s been decided to keep that for opam v2.2 (v2.1 is now in beta). So at the moment, indeed, the idea is to have a single tarball; it will be cached, so it’s not that big a deal.

Also, opam-publish now supports submitting multiple packages in a single command, so if you already have opam files ready, just running opam publish from the root of your project should do the trick.