OCaml carbon footprint

patricoferris · October 6, 2021, 4:42pm

Thanks for the questions. A quick disclaimer: (a) I can only talk about the work that I know about, i.e. from an OCaml Labs perspective, and (b) the new site has lost its banner saying that it is very much a WIP, especially the copy used on the pages, so that text was probably added as a placeholder.

But we do have strong intentions of being more cognisant of the environmental impact of the OCaml infrastructure, in particular the large cluster of machines that do things such as:

Build the ocaml/opam OCaml docker base images
Run CI checks on opam-repository
Provide individual CI per project
Build and deploy ocaml.org and v3.ocaml.org

See here for a more complete list of available services. Running these services doesn’t come for free, both in terms of the cost of manufacturing and maintaining those machines and in terms of the energy used to power them.

In terms of actions already taken, I believe (not 100% sure) that some deliberation has gone into choosing infrastructure providers whose environmental goals align with our own; @avsm might be able to provide more insight there. I’ll also refer you to the relevant section in the original roadmap, v3.OCaml.org: A roadmap for OCaml's online presence.

Of course, swapping the cluster to use only renewable resources is not a complete solution on its own: when one thing uses renewables, it means something else cannot. So the question becomes: is job X worth running at all? That brings me on to the future goals (as I see them).

Future Goals

Taking decisive actions to reduce the environmental impact means having a much better understanding of the current environmental impact. This should actually be fairly achievable using the cluster management tooling (see OCluster) and knowledge of the types of machines all the jobs are scheduled on. With better reporting we can make decisions with measurable impact.
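As a rough illustration of the kind of reporting this could enable, here is a minimal sketch that estimates energy use and emissions from job durations and the machines they ran on. It is not how OCluster works or reports anything: the `machine` and `job` types, the field names, and every power or carbon-intensity figure passed in are hypothetical placeholders that would need to come from real measurements.

```ocaml
(* Hypothetical sketch: none of these types come from OCluster. *)
type machine = {
  name : string;
  avg_power_watts : float; (* assumed average draw under load *)
}

type job = {
  label : string;
  machine : machine;
  duration_s : float; (* wall-clock seconds the job ran for *)
}

(* Energy in kWh: watts * seconds / 3_600_000. *)
let energy_kwh job =
  job.machine.avg_power_watts *. job.duration_s /. 3_600_000.

(* Emissions in grams of CO2e, given the grid's carbon intensity
   in gCO2e per kWh (an input the caller has to supply). *)
let emissions_g ~intensity_g_per_kwh job =
  energy_kwh job *. intensity_g_per_kwh

(* Sum the estimated emissions over a list of jobs. *)
let total_emissions_g ~intensity_g_per_kwh jobs =
  List.fold_left
    (fun acc job -> acc +. emissions_g ~intensity_g_per_kwh job)
    0. jobs
```

Even a crude model like this gives a per-job figure that can be compared before and after a change, which is the point of having the metrics in the first place.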

Some potential wins we could probably implement are:

Better caching and sharing of artefacts (there’s a lot of overlap in jobs like the health-check and the docs generation, what could they share?). Already the infrastructure tries to re-run jobs on the same machine to make hitting caches more likely which is a step in the right direction.
Sometimes jobs are re-run when perhaps they shouldn’t be. Maybe reruns should be more opt-in, rather than automatically rebuilding lots of things.
Surface the reporting: there are a lot of jobs running constantly that many people are quite unaware of (e.g. the health-check). The more use we can get out of the vast amounts of data the infrastructure produces, the better.
Surfacing the environmental metrics to users.
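To make the caching and opt-in rerun ideas a bit more concrete, here is a small sketch of skipping a job when its inputs have not changed, with an explicit flag to force a rerun. This is purely illustrative and not the existing infrastructure's behaviour: the in-memory `cache`, the `job_key` function, and the choice of commit plus platform as the inputs are all assumptions made for the example.

```ocaml
(* Hypothetical in-memory cache from an input digest to a job result. *)
let cache : (Digest.t, string) Hashtbl.t = Hashtbl.create 16

(* Key a job by a digest of everything assumed to determine its output. *)
let job_key ~commit ~platform = Digest.string (commit ^ "\000" ^ platform)

(* Run [build] only on a cache miss, unless [force] asks for a rerun. *)
let run_cached ?(force = false) ~commit ~platform build =
  let key = job_key ~commit ~platform in
  match Hashtbl.find_opt cache key with
  | Some result when not force -> result
  | _ ->
    let result = build () in
    Hashtbl.replace cache key result;
    result
```

With something along these lines, a rerun becomes a deliberate choice (`~force:true`) rather than the default, and the number of avoided builds is itself something that could be reported.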

Without the metrics system giving us good ball-park figures, it would be hard to know whether any changes make a difference, so that in my mind is crucial. These are just some thoughts; happy to hear any more suggestions and ideas.
