Migrating to floatarray (blog post)

nojb · April 26, 2021, 4:27pm

Dear all,

At LexiFi we recently migrated our codebase to use floatarray in place of float array in order to disable the “flat float array” mode in the compiler. If you are interested in finding out more about how we did it, we wrote a blog post about it: “Migrating to floatarray: experience report” (LexiFi). Enjoy!
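
For readers unfamiliar with the two types, here is a minimal sketch of the difference at the source level. It is not taken from the blog post; the helper names (`sum_array`, `sum_floatarray`, `to_floatarray`) are made up for illustration, and only standard library functions from `Array` and `Float.Array` are used.

```ocaml
(* [float array] is an ordinary polymorphic array instantiated at [float].
   In the default "flat float array" mode it is stored unboxed. *)
let sum_array (a : float array) : float =
  Array.fold_left ( +. ) 0.0 a

(* [floatarray] is a separate monomorphic type that always holds unboxed
   floats; its operations live in [Float.Array]. *)
let sum_floatarray (a : floatarray) : float =
  Float.Array.fold_left ( +. ) 0.0 a

(* Hypothetical conversion helper: copies every element into a fresh
   [floatarray]. *)
let to_floatarray (a : float array) : floatarray =
  Float.Array.init (Array.length a) (fun i -> a.(i))

let () =
  let a = [| 1.0; 2.0; 3.5 |] in
  let fa = to_floatarray a in
  Printf.printf "%g %g\n" (sum_array a) (sum_floatarray fa)
```

Because the two types are distinct, a migration of this kind mostly consists of changing type annotations and replacing `Array.*` calls with their `Float.Array.*` counterparts.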

Cheers,
Nicolás

hannes · April 26, 2021, 6:55pm

Thanks for that excellent article!

EduardoRFS · April 29, 2021, 4:46pm

That was a really cool read, thank you.

dbuenzli · June 15, 2023, 10:00pm

I still don’t understand what floatarray brought to the party – or rather – to the OCaml array orgy.

Why didn’t you simply migrate to 1D bigarrays of floats ?

IIRC @xavierleroy’s idea with the float array optimization is that one could take any textbook numerical algorithm and implement it with OCaml arrays right away in a natural way with reasonable performance (i.e. no boxing).

If you disable the “flat float array” mode you lose that property but maybe get gains elsewhere. Fine.

But why the heck was yet another specialized array datastructure introduced in the stdlib ? Didn’t we have enough with bigarrays of floats ?

All these different choices of arrays (string, bytes, array, floatarray, bigarray) are extremely annoying for writing libraries.

UnixJunkie · June 16, 2023, 1:13am

Why not go to float bigarrays?
The cool thing is that you can select the size of the floats you are using.
I do this in one program, where I use 32-bit floats for large arrays so that they occupy less space.

PS: I do as Daniel suggested; always 1D even if the array is multidimensional
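
A rough sketch of what this can look like, assuming a row-major layout: a two-dimensional matrix stored in a 1D bigarray of 32-bit floats. The `matrix32` type and the `create`, `get` and `set` helpers are hypothetical names for illustration; the `Bigarray.Array1` functions are from the standard library.

```ocaml
open Bigarray

(* Hypothetical wrapper: a rows x cols matrix backed by a 1D float32
   bigarray, stored row by row. *)
type matrix32 = {
  rows : int;
  cols : int;
  data : (float, float32_elt, c_layout) Array1.t;
}

let create rows cols =
  let data = Array1.create float32 c_layout (rows * cols) in
  (* Freshly created bigarrays are not initialized, so zero them. *)
  Array1.fill data 0.0;
  { rows; cols; data }

(* Element (i, j) lives at offset i * cols + j. *)
let get m i j = m.data.{(i * m.cols) + j}
let set m i j v = m.data.{(i * m.cols) + j} <- v

let () =
  let m = create 2 3 in
  set m 1 2 0.5;
  Printf.printf "%g stored, %d bytes used\n" (get m 1 2)
    (Array1.size_in_bytes m.data)
```

Each element takes 4 bytes instead of 8, at the cost of reduced precision: values are rounded to single precision when written.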

nojb · June 16, 2023, 3:30am

Hello!

We already use 1D bigarrays in a number of places, mostly to interface with C. However, I understand that bigarrays are less efficient than floatarrays: bigarrays are allocated in the C heap, while floatarrays are allocated in the OCaml heap (the latter has faster allocation, is much better at avoiding fragmentation, etc). Also, reads and writes to bigarrays require an extra memory indirection. All this may not matter 99% of the time, but apparently makes a difference for those writing high-performance numerical code that uses a lot of short-lived float arrays.

Of course, these differences are observable in small experiments and benchmarks, but we had no way to test the two against each other in real-life code and at scale (as we had no easy way to rewrite the codebase to do so), so I cannot say for sure how much of a regression switching to 1D bigarrays would have represented “globally”.

I agree!

Cheers,
Nicolas

dbuenzli · June 16, 2023, 8:19am

Do you have an idea of which kind of code did that ?

I suspect a lot of the short-lived float data is in the realm of small vector data (points, matrices etc.) for which you can use fixed size records of float fields – that’s even better, you don’t get bounds checks.
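
As an illustration of that approach (a sketch with made-up names, not code from anyone's codebase): a record whose fields are all of type `float` is stored flat by OCaml, and field access is resolved statically, so there is no index to check.

```ocaml
(* All fields are floats, so the three values live directly in the
   record block rather than behind per-field boxes. *)
type vec3 = { x : float; y : float; z : float }

let add a b = { x = a.x +. b.x; y = a.y +. b.y; z = a.z +. b.z }

let dot a b = (a.x *. b.x) +. (a.y *. b.y) +. (a.z *. b.z)

let () =
  let v = add { x = 1.0; y = 2.0; z = 3.0 } { x = 0.5; y = 0.5; z = 0.5 } in
  Printf.printf "%g\n" (dot v v)
```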

Large stuff whether scalar, raster or vector data you will need bigarrays anyways as you are unlikely to skip the chance to use your GPU or blas/lapack.

Basically my impression is that the introduction of floatarray was a “cover your ass” move for those willing to use --disable-flat-float-array.

And we are not done, we’ll soon get twice boxed dynarrays – the more the better. I look forward to the introduction of floatdynarray and uchardynarray.

I think more thought should be given to these things and maybe less in the name of performance but of usability. JavaScript may have a horrible array representation but in the end it’s a versatile data structure which gets a nicer ecosystem to work with (note though that they also do have bigarrays nowadays).

In the end I wonder if @xavierleroy’s first call, which as far as I remember mostly seemed to annoy compiler devs, was not a much better idea.

What is the long term plan here ? To default to --disable-flat-float-array ?

nojb · June 16, 2023, 8:57am

I don’t have anything concrete to show. I will see if I can convince one of our quantitative developers to explain this point a bit more and if there is anything of interest I will post back here.

This question actually came up during the last dev meeting, and my recollection of the discussion is that no, there is no long term plan to default to --disable-flat-float-array. In other words, the long term plan seems to be to stick to the status quo.

Cheers,
Nicolas

c-cube · June 16, 2023, 1:25pm

True, but in JS, “arrays” are actually dynarrays. So maybe it’s not a bad thing to add these?

dbuenzli · June 16, 2023, 2:02pm

That’s beside the point. You can always add more types to a system, that doesn’t necessarily make it better.

The problem is which type APIs agree on to interchange data so that one doesn’t have to constantly handle representation mismatches.

In this particular case it’s unlikely to be that new Dynarray proposal since it boxes even more than arrays do. For most of my dynarray uses a few convenience combinators to easily extend regular arrays would have gone a long way.
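
To make the idea concrete, here is one guess at what such combinators could look like. The names `snoc` and `grow` are hypothetical and not part of the standard library; they are built only on `Array.append`, `Array.make` and `Array.blit`.

```ocaml
(* Hypothetical helper: return a new array with [x] added at the end.
   This copies the whole array, so it is O(n) per call. *)
let snoc (a : 'a array) (x : 'a) : 'a array = Array.append a [| x |]

(* Hypothetical helper: return an array of at least [n] slots, keeping
   the existing elements and filling any new slots with [fill]. *)
let grow (a : 'a array) (n : int) (fill : 'a) : 'a array =
  let len = Array.length a in
  if n <= len then a
  else begin
    let b = Array.make n fill in
    Array.blit a 0 b 0 len;
    b
  end

let () =
  let a = snoc [| 1.0; 2.0 |] 3.0 in
  let b = grow a 5 0.0 in
  Printf.printf "%d %d\n" (Array.length a) (Array.length b)
```

The result is still a plain `array`, so it can be passed to any API that expects one without a representation change.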
