[ANN] First opam release of easy_xlsx

I’m pleased to announce the first release of easy_xlsx and some related lower-level libraries (spreadsheetml to parse the contents of an XLSX file, and open_packaging to parse Microsoft’s “Open Packaging Specification”, which is used by all new Microsoft Office formats).

See documentation here

The short version is that if you need to read a .xlsx file, you can do:

Easy_xlsx.read_file "example.xlsx"
|> List.iter (fun sheet ->
  Printf.printf "Got sheet: %s\n" (Easy_xlsx.name sheet);
  Easy_xlsx.rows sheet
  |> List.iter (fun row ->
    List.iter (function
      | Easy_xlsx.Value.Date d -> ignore d (* Ptime.date *)
      | Datetime dt -> ignore dt (* Ptime.t *)
      | Number n -> ignore n (* float *)
      | String s -> ignore s (* a string, obviously *)
      | Time t -> ignore t (* Ptime.time *))
      row))

(There’s also an Easy_xlsx.Value.to_string if you just want to convert XLSX to CSV)
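For the CSV case, the part that takes a little care is quoting fields. Below is a minimal sketch of that step in plain OCaml. The helper names csv_field and csv_line are made up for illustration and are not part of easy_xlsx, and the wiring shown in the trailing comment assumes that Easy_xlsx.Value.to_string takes a single value and returns a string.

```ocaml
(* Quote a field if it contains a comma, a double quote, or a line break,
   doubling any embedded double quotes (RFC 4180 style). *)
let csv_field s =
  let needs_quoting = List.exists (String.contains s) [ ','; '"'; '\n'; '\r' ] in
  if needs_quoting then
    "\"" ^ String.concat "\"\"" (String.split_on_char '"' s) ^ "\""
  else s

(* Join already-stringified cells into one CSV line. *)
let csv_line fields = String.concat "," (List.map csv_field fields)

(* Assumed wiring, using only names mentioned in the post:

   Easy_xlsx.rows sheet
   |> List.iter (fun row ->
     print_endline (csv_line (List.map Easy_xlsx.Value.to_string row)))
*)
```

Keeping the quoting separate from the spreadsheet code means it can be checked without an .xlsx file at hand.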

If you have any XLSX files that don’t work right and can make them public, please open a pull request to add them to the test suite. Detecting datetimes is surprisingly complicated since it depends on the formatting applied to the column (datetimes are stored identically to numbers), so if you’re dealing with unusual formatting I may not be able to detect it, but having more examples in the test suite would help.
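To illustrate why the formatting matters, here is a rough sketch of the general idea in plain OCaml. It is not the logic spreadsheetml actually uses. Both function names are hypothetical, the heuristic only looks for date/time letters outside quoted text and bracketed sections of a number format code, and the conversion assumes the 1900 date system (serial 25569 is 1970-01-01), ignoring workbooks that use the 1904 system.

```ocaml
(* Guess whether a number format code describes a date or time by looking
   for d/m/y/h/s outside "quoted literals", [bracketed] sections, and
   backslash-escaped characters. *)
let looks_like_date_format fmt =
  let n = String.length fmt in
  let rec scan i in_quote in_bracket =
    if i >= n then false
    else
      match fmt.[i] with
      | '"' when not in_bracket -> scan (i + 1) (not in_quote) in_bracket
      | _ when in_quote -> scan (i + 1) in_quote in_bracket
      | '[' -> scan (i + 1) in_quote true
      | ']' -> scan (i + 1) in_quote false
      | _ when in_bracket -> scan (i + 1) in_quote in_bracket
      | '\\' -> scan (i + 2) in_quote in_bracket (* skip the escaped char *)
      | 'd' | 'm' | 'y' | 'h' | 's' | 'D' | 'M' | 'Y' | 'H' | 'S' -> true
      | _ -> scan (i + 1) in_quote in_bracket
  in
  scan 0 false false

(* Convert a serial day number to seconds since the Unix epoch,
   assuming the 1900 date system. *)
let unix_seconds_of_serial serial = (serial -. 25569.) *. 86400.
```

With this sketch, the same stored number 25569 would be read as a plain number under the format "0.00" and as a date under "yyyy-mm-dd". Real files contain many formats that a heuristic this simple gets wrong, which is why extra test files are useful.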


That is very nice!

I must admit that my main use for generating Excel files in the past has been that the CSV format’s method to specify background colors for cells was pretty buggy and not portable between Excel/LibreOffice, so I’m not a heavy user, but I think this is good to have in the ecosystem!

At the moment I don’t really offer an interface to write XLSX files back out, but I’d be happy to have it.

spreadsheetml is able to expose cell styles, but that isn’t exposed at the easy_xlsx layer (since that library is meant more for data processing). I’d like to have a more mid-level interface at some point, where we expose the full content of the cells instead of just the data.
