Cast Bigarray kind

crackcomm · September 10, 2021, 7:25pm

Hey there,

I am trying to cast a float Array2.t into a bigstring (char Array1.t). I started with reshaping and Obj.magic, then tried writing a C extension, but had no success with either.

Is there any way I can do this? It’s pretty important for me not to make a copy.

hcarty · September 10, 2021, 8:09pm

I have some code/stubs here which I wrote a long while ago to cast the underlying bytes between different bigarray types:

I haven’t tested or used this code in several years at this point, so please proceed with caution if you give it a try!

crackcomm · September 10, 2021, 8:44pm

Thanks! That is very helpful. I had to make a few changes (e.g. the elems counter should use * not +). I created a fork and committed the changes if you’re interested.

It seems to work fine, yet the question remains whether it is possible without making a copy.

nojb · September 10, 2021, 9:54pm

I believe this is doable without much trouble from inside the runtime, but I don’t think all the required functionality is accessible from outside the runtime (you would need to duplicate the caml_ba_update_proxy function).

Cheers,
Nicolas

crackcomm · September 10, 2021, 10:13pm

Here is what I have (gist):

CAMLprim value ba_cast_to_char(value vb)
{
     CAMLparam1(vb);
     CAMLlocal1(res);
#define b (Caml_ba_array_val(vb))
     intnat dim[CAML_BA_MAX_NUM_DIMS];
     for (int i = 0; i < b->num_dims; i++)
     {
          dim[i] = b->dim[i] * caml_ba_element_size[b->flags & CAML_BA_KIND_MASK];
     }

     int flags = CAML_BA_CHAR | (b->flags & (CAML_BA_LAYOUT_MASK | CAML_BA_MANAGED_MASK));

     res = caml_ba_alloc(flags, b->num_dims, b->data, dim);
     /* Copy the finalization function from the original array (PR#8568) */
     Custom_ops_val(res) = Custom_ops_val(vb);
     /* Create or update proxy in case of managed bigarray */
     caml_ba_update_proxy(b, Caml_ba_array_val(res));
     /* Return result */
     CAMLreturn(res);

#undef b
}

open Bigarray

external c_cast
  :  ('a, 'b, 'c) Bigarray.Genarray.t
  -> (char, int8_unsigned_elt, 'c) Bigarray.Genarray.t
  = "ba_cast_to_char"

let elems ba = Array.reduce_exn Int.( * ) (Bigarray.Genarray.dims ba)

let cast ba =
  let ba = Bigarray.reshape ba [| elems ba |] in
  c_cast ba |> array1_of_genarray
;;

It again seems to work, but I’m not sure the code is correct. In particular: should it replicate CAML_BA_MANAGED_MASK, or should it be set to EXTERNAL? And should data be passed to caml_ba_alloc, or only to caml_ba_update_proxy?
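For reference, the flattening step of the wrapper above can be sketched with the standard library alone. This is a minimal sketch, not the code from the gist: it assumes `Array.fold_left` in place of Core's `Array.reduce_exn`, and `flatten` is a hypothetical helper name. It only reshapes to one dimension (which shares the underlying data); it does not change the element kind.

```ocaml
open Bigarray

(* Total number of elements: the product (not the sum) of all dimensions. *)
let elems ba = Array.fold_left ( * ) 1 (Genarray.dims ba)

(* Flatten a bigarray to one dimension. reshape_1 returns a view that
   shares its data with the original array, so nothing is copied. *)
let flatten ba = reshape_1 ba (elems ba)

let () =
  let arr = Array2.create float64 c_layout 10 20 in
  Array2.fill arr 0.0;
  let flat = flatten (genarray_of_array2 arr) in
  (* Write through the 2D array after creating the flat view. *)
  arr.{1, 1} <- 4.5;
  (* In C layout, element (1, 1) of a 10x20 array sits at index 1 * 20 + 1. *)
  assert (Array1.dim flat = 200);
  assert (flat.{21} = 4.5)
```

The write through `arr` is visible through `flat`, which is what sharing (rather than copying) implies.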

nojb · September 11, 2021, 6:18am

I left some comments in your gist.

Cheers,
Nicolas

xavierleroy · September 11, 2021, 5:28pm

I think there’s a problem in your code: you’re multiplying every dimension by the byte size of elements. Only one dimension should be multiplied. Consider a 2x2 array of 64-bit floats, with size 2x2x8 = 32 bytes. You’re turning it into a 16x16 array of bytes, with size 16x16x1 = 256 bytes.

More generally: can you write a specification for what you’re trying to achieve?

crackcomm · September 11, 2021, 8:30pm

Yes, you are completely right. I was reshaping the array into a single dimension and I didn’t catch the mistake. Only the last dimension should be multiplied by the byte size.
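The corrected dimension computation can be sketched in OCaml as follows. This is an illustration under stated assumptions, not the code from the gist: it assumes a C-layout array (where the last dimension varies fastest), and `byte_dims` is a hypothetical helper name.

```ocaml
open Bigarray

(* Dimensions of a byte view over a C-layout bigarray: keep every
   dimension as-is and scale only the last one by the element size. *)
let byte_dims (ba : ('a, 'b, c_layout) Genarray.t) : int array =
  let dims = Genarray.dims ba in
  let last = Array.length dims - 1 in
  let size = kind_size_in_bytes (Genarray.kind ba) in
  Array.mapi (fun i d -> if i = last then d * size else d) dims

let () =
  (* The 2x2 float64 example: 2 x (2 * 8) = 32 bytes, not 16 x 16 = 256. *)
  let ba = Genarray.create float64 c_layout [| 2; 2 |] in
  let total = Array.fold_left ( * ) 1 (byte_dims ba) in
  (* The byte view must cover exactly the original storage. *)
  assert (total = Genarray.size_in_bytes ba);
  Printf.printf "%d bytes\n" total
```

For the 2x2 array of 64-bit floats this prints `32 bytes`, matching the size of the original storage.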

The main use of this function is to cast a multidimensional array into a bigstring for hashing and sending it across the network. Hopefully that can be achieved without any C extensions. I think it’s a pretty common operation.

yallop · September 13, 2021, 2:10pm

In case it’s of interest, Ctypes can do this. For example, here’s a function that provides a bigstring view onto a float Array2.t value:

open Ctypes

let bigstring_of_float64_array2 arr =
  (* Reinterpret the address of the first element as a char pointer. *)
  let start = coerce (ptr double) (ptr char) (bigarray_start array2 arr) in
  Bigarray.(bigarray_of_ptr array1 (Array2.dim1 arr * Array2.dim2 arr * sizeof double) char start)

Here’s the function in action, showing that the size of the bigstring is calculated appropriately:

# let arr = Bigarray.(Array2.(create float64) c_layout 10 20);;
val arr : [...]
# let bs = bigstring_of_float64_array2 arr;;
val bs : [...]
# Bigarray.Array1.dim bs;;
- : int = 1600

Here’s a demonstration that there’s no copying going on:

# arr.{0,0} <- 10.5;;
- : unit = ()
# Int64.float_of_bits (Core_kernel.Bigstring.get_int64_t_le bs 0);;
- : float = 10.5
# arr.{1,1} <- -4.1;;
- : unit = ()
# Int64.float_of_bits (Core_kernel.Bigstring.get_int64_t_le bs 168);;
- : float = -4.1
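The byte offsets 0 and 168 used above follow from the row-major (C layout) indexing of a 10x20 array of 8-byte floats. A minimal sketch of that arithmetic, using only the standard library (`byte_offset` is a hypothetical helper name, not part of Ctypes or Bigarray):

```ocaml
open Bigarray

(* Byte offset of element (i, j) in a C-layout Array2 viewed as raw bytes:
   the row-major element index times the element size. *)
let byte_offset arr i j =
  ((i * Array2.dim2 arr) + j) * kind_size_in_bytes (Array2.kind arr)

let () =
  let arr = Array2.create float64 c_layout 10 20 in
  (* (0, 0) -> 0 and (1, 1) -> (1 * 20 + 1) * 8 = 168 *)
  Printf.printf "%d %d\n" (byte_offset arr 0 0) (byte_offset arr 1 1)
```

This prints `0 168`, the two offsets read back in the session above.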

crackcomm · September 13, 2021, 4:12pm

That’s pretty much what I was looking for, brilliant.

UnixJunkie · February 20, 2023, 7:09am

In the same vein, could Ctypes give a bytes view of a char bigarray (a bigstring)?

yallop · February 20, 2023, 11:18am

I don’t think there’s a way to do that, because the memory layouts are too different.
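If a copy is acceptable, the contents can still be moved into a `bytes` value element by element. The following is only a sketch of that fallback, assuming a C-layout bigstring; `bytes_of_bigstring` is a hypothetical helper name, and this is explicitly a copy, not a shared view.

```ocaml
open Bigarray

(* Copy a C-layout bigstring into a freshly allocated bytes value.
   The result does not share memory with the bigarray. *)
let bytes_of_bigstring (bs : (char, int8_unsigned_elt, c_layout) Array1.t) : bytes =
  Bytes.init (Array1.dim bs) (fun i -> Array1.get bs i)

let () =
  let bs = Array1.create char c_layout 3 in
  bs.{0} <- 'a';
  bs.{1} <- 'b';
  bs.{2} <- 'c';
  let b = bytes_of_bigstring bs in
  (* Later writes to the bigstring do not affect the copy. *)
  bs.{0} <- 'z';
  assert (Bytes.to_string b = "abc")
```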
