Cast Bigarray kind

Hey there,

I am trying to cast a float Array2.t into a bigstring (char Array1.t). I started with reshaping and Obj.magic, then tried writing a C extension, but nothing worked.

Is there any way I can do this? It’s pretty important for me not to make a copy.

I have some code/stubs here which I wrote a long while ago to cast the underlying bytes between different bigarray types:

I haven’t tested or used this code in several years at this point, so please proceed with caution if you give it a try!

Thanks! That is very helpful. I had to make a few changes (e.g. the elems counter should use * not +). I created a fork and committed the changes if you’re interested.

It seems to work fine, yet the question remains if it is possible without making a copy.

I believe this should be doable without much trouble from inside the runtime, but I don’t think all the required functionality is accessible from outside the runtime (you would need to duplicate the caml_ba_update_proxy function).


Here is what I have (gist):

#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/custom.h>
#include <caml/bigarray.h>

CAMLprim value ba_cast_to_char(value vb)
{
     CAMLparam1(vb);
     CAMLlocal1(res);
#define b (Caml_ba_array_val(vb))
     intnat dim[CAML_BA_MAX_NUM_DIMS];
     for (int i = 0; i < b->num_dims; i++)
          dim[i] = b->dim[i] * caml_ba_element_size[b->flags & CAML_BA_KIND_MASK];

     int flags = CAML_BA_CHAR | (b->flags & (CAML_BA_LAYOUT_MASK | CAML_BA_MANAGED_MASK));

     res = caml_ba_alloc(flags, b->num_dims, b->data, dim);
     /* Copy the finalization function from the original array (PR#8568) */
     Custom_ops_val(res) = Custom_ops_val(vb);
     /* Create or update proxy in case of managed bigarray */
     caml_ba_update_proxy(b, Caml_ba_array_val(res));
     /* Return result */
     CAMLreturn(res);
#undef b
}

open Bigarray

external c_cast
  :  ('a, 'b, 'c) Bigarray.Genarray.t
  -> (char, int8_unsigned_elt, 'c) Bigarray.Genarray.t
  = "ba_cast_to_char"

let elems ba = Array.fold_left ( * ) 1 (Bigarray.Genarray.dims ba)

let cast ba =
  let ba = Bigarray.reshape ba [| elems ba |] in
  c_cast ba |> array1_of_genarray

It again seems to work, but I’m not sure the code is correct. In particular, should it copy CAML_BA_MANAGED_MASK from the original array or should the flag be set to EXTERNAL? And should data be passed to caml_ba_alloc, or only to caml_ba_update_proxy?

I left some comments in your gist.


I think there’s a problem in your code: you’re multiplying every dimension by the byte size of elements. Only one dimension should be multiplied. Consider a 2x2 array of 64-bit floats, with size 2x2x8 = 32 bytes. You’re turning it into a 16x16 array of bytes, with size 16x16x1 = 256 bytes.

More generally: can you write a specification for what you’re trying to achieve?

Yes, you are completely right. I was reshaping the array into a single dimension and I didn’t catch the mistake. Only the last dimension should be multiplied by the byte size.
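
To make the arithmetic concrete, here is a minimal sketch of the dimension calculation for a C-layout array. The helper names byte_dims and total are hypothetical and not part of the gist; a Fortran-layout array would presumably need its first dimension scaled instead.

```ocaml
(* Hypothetical helper: dimensions of a byte view over a C-layout bigarray.
   Only the last (fastest-varying) dimension is scaled by the element size. *)
let byte_dims ~elt_size dims =
  let n = Array.length dims in
  Array.mapi (fun i d -> if i = n - 1 then d * elt_size else d) dims

(* Total number of elements described by a dimension array. *)
let total dims = Array.fold_left ( * ) 1 dims

let () =
  let elt_size = Bigarray.kind_size_in_bytes Bigarray.float64 in
  let dims = [| 2; 2 |] in
  (* Scaling every dimension gives 16 x 16 = 256 bytes, which is wrong. *)
  let wrong = Array.map (fun d -> d * elt_size) dims in
  (* Scaling only the last dimension gives 2 x 16 = 32 bytes. *)
  let right = byte_dims ~elt_size dims in
  Printf.printf "wrong: %d bytes, right: %d bytes\n" (total wrong) (total right)
```

For the 2x2 float64 case this prints 256 for the per-dimension scaling and 32 for the last-dimension scaling, matching the figures quoted above.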

The main use of this function is to cast a multidimensional array into a bigstring for hashing and for sending it across the network. Hopefully that could be achieved without any C extensions. I think it’s a pretty common operation.

In case it’s of interest, Ctypes can do this. For example, here’s a function that provides a bigstring view onto a float Array2.t value:

open Ctypes

let bigstring_of_float64_array2 arr =
  (* Reinterpret the address of the first float as a char pointer, then wrap it. *)
  let start = coerce (ptr double) (ptr char) (bigarray_start array2 arr) in
  Bigarray.(bigarray_of_ptr array1 (Array2.dim1 arr * Array2.dim2 arr * sizeof double) char start)

Here’s the function in action, showing that the size of the bigstring is calculated appropriately:

# let arr = Bigarray.(Array2.(create float64) c_layout 10 20);;
val arr : [...]
# let bs = bigstring_of_float64_array2 arr;;
val bs : [...]
# Bigarray.Array1.dim bs;;
- : int = 1600
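
The 1600 can be cross-checked with the standard library alone. This is only a sketch: byte_length is a hypothetical helper, while Array2.size_in_bytes and kind_size_in_bytes are the Bigarray functions I believe have been available since OCaml 4.03.

```ocaml
open Bigarray

(* Hypothetical helper: byte length of a 2-D bigarray, computed by hand
   as rows * columns * bytes per element. *)
let byte_length arr =
  Array2.dim1 arr * Array2.dim2 arr * kind_size_in_bytes (Array2.kind arr)

let () =
  let arr = Array2.create float64 c_layout 10 20 in
  (* Both values should agree: 10 * 20 * 8 bytes. *)
  Printf.printf "%d %d\n" (byte_length arr) (Array2.size_in_bytes arr)
```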

Here’s a demonstration that there’s no copying going on:

# arr.{0,0} <- 10.5;;
- : unit = ()
# Int64.float_of_bits (Core_kernel.Bigstring.get_int64_t_le bs 0);;
- : float = 10.5
# arr.{1,1} <- -4.1;;
- : unit = ()
# Int64.float_of_bits (Core_kernel.Bigstring.get_int64_t_le bs 168);;
- : float = -4.1
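
For anyone wondering where the offset 168 comes from: in C layout, element (i, j) sits at flat index i * dim2 + j, and each float64 takes 8 bytes. A small sketch of that calculation, using only the standard library (byte_offset is a hypothetical name, and reshape_1 gives a 1-D float view that shares the data rather than a bigstring):

```ocaml
open Bigarray

(* Hypothetical helper: byte offset of element (i, j) in a C-layout
   2-D array whose rows hold dim2 elements of elt_size bytes each. *)
let byte_offset ~dim2 ~elt_size i j = ((i * dim2) + j) * elt_size

let () =
  let arr = Array2.create float64 c_layout 10 20 in
  Array2.fill arr 0.0;
  arr.{1, 1} <- (-4.1);
  (* reshape_1 shares the underlying data, so no copy is made here. *)
  let flat = reshape_1 (genarray_of_array2 arr) (10 * 20) in
  let elt_size = kind_size_in_bytes float64 in
  let off = byte_offset ~dim2:(Array2.dim2 arr) ~elt_size 1 1 in
  Printf.printf "offset %d holds %g\n" off flat.{off / elt_size}
```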

That’s pretty much what I was looking for, brilliant.