GADTs for tracking dimensions of multidimensional arrays?

Chet_Murthy · August 30, 2022, 5:59pm

I was musing on GADTs today, and wondered if anybody had used them to track the dimensions of multi-dimensional arrays. Something like maybe using a list of integers for the dimensions, and then matrix-multiply would have matching-constraints.

Or maybe restricted to vectors and matrices, using an integer for the dimension of a vector, and a pair of integers for the dimensions of a matrix.

I’ve never done such a thing (never used GADTs), but it seems like maybe this might work, viewed from 10,000 feet up. Does anybody have experience trying this?

octachron · August 30, 2022, 6:32pm

You can have a look at my tensority toy library for an example of an implementation of statically verified tensor algebra (which has the advantage of giving a common foundation for multiplying matrices or vectors).

In brief, tensority implements tensors with statically-verified ranks and sizes, by mixing GADTs at the rank level (the dimension of the tensor indices) and phantom types for the size of the arrays:

(* Multidimensional array literals *)
let array4 = [%array 4
      [
        [1, 2; 3, 4], [5, 6; 7, 8], [9, 10; 11, 12];
        [1, 2; 8, 9], [9, 1; 2, 3], [19, 12; 14, 17]
      ]
];;

(* Definition using function *)
let array = init_sh [502103s] (function [k] -> Nat.to_int k)

(* accessing an element *)
let one = array4.( 1i, 0i, 0i, 0i )

(* accessing an element out-of-bound is a type error *)
let one = array4.( 1i, 5i, 0i, 1i )

(* element assignment *)
array4.(0i,0i,0i,0i) <- 0

(* slicing *)
let matrix = array4.!( All, All, 1j, 1j )
let matrix_squared = matrix * matrix

The implementation and signatures are probably too complex for their own sake, and I hope to find someday the time to distill them to a more modest and usable set.

mooreryan · August 30, 2022, 6:35pm

You might also have fun looking at the code for slap, which adds type-based static sized checking to BLAS. Pretty interesting library.

Chet_Murthy · August 30, 2022, 6:46pm

Oh, nice! I’ll look at both of these soon! I do agree that this sort of dimension-checking is a lot for the kind of user who is barely willing to keep track of types (I’ve worked with code where somebody was using real-value entry matrices, and didn’t realize that they might end up needing complex-value entries but were just lucky that they didn’t). But OTOH, for some uses, this might be really nifty, and as a lapsed priest of the One True Church of Curry-Howard, it’s certainly something I’d find lovely.

n4323 · August 31, 2022, 8:25am

I find these type safe Linear Algebra interfaces very interesting. But I’m asking myself if the overhead of parsing the types would not get in the way too much. For instance in slap, a 5-dim vector would have a type (z s s s s s, 'a) vec . If add a 572-dim vector to a 500-dim one, the type error will fill the screen. Just saying that when representing dimensions by natural number types, one would need a type formatter to make that usable.

bluddy · August 31, 2022, 9:00am

Which is precisely why nobody does this in any non-dependent type language AFAIK.

octachron · August 31, 2022, 9:09am

It is indeed a problem with unary representation of integers at the type-level. Even worse, the typechecker itself only supports types of “reasonable” size. Typically, that meant that slap could not handle vector of size ≈35000 last time I checked. The increase of the stack limit in OCaml 5 probably means that this is no longer the case however. This is one of the reason why tensority uses a decimal representation of integers at the type-level and the type of 1000s (where s stands for size in contrast to indices)

let n1000 = 1000s

is

([ `_1 of [ `_0 of [ `_0 of [ `_0 of [ `T ] ] ] ] ], [ `Eq ]) Nat.t

which is at least partially readable.

Chet_Murthy · August 31, 2022, 10:16am

Ehh, there are a ton of interesting codes out there in numerical linear algebra. It would already be a great thing, to have a dimension-aware type system that allowed to write those codes and get correct dimensionality-checking. A system that was good enough to do that, even if it was ugly and cumbersome to write the types, could be improved, I suspect, so that it was easy to write the types.

mseri · August 31, 2022, 11:44am

There was NumLin that tried something like that recently, and could compile to C or Owl.

Chet_Murthy · August 31, 2022, 6:24pm

Marcello, do you have any pointers to writeups about NumLin? I clicked-around the examples, and … it’s difficult to figure out what they’re doing (that is, what the intent of the work is). Sorry, I might be dense, or just getting old (grey matter exfiltrating thru ear canals …)

mseri · August 31, 2022, 9:27pm

The only think I have, is their paper.

I have only skimmed through it at the time and to be honest I have forgotten most of it. Their aim is to improve memory safety guarantees for linear algebra algorithms. If my memory is not failing me, they implemented a small language with a special linear type system with encoding of the dimensions of the tensors. After type checking your linear algebra code, they can compile to C (calling BLAS/LAPACK) and Owl.

mmottl · September 2, 2022, 4:31pm

Lennart Augustsson has recently developed a Haskell library called Orthotope for dealing with multidimensional arrays. It can reflect shape and dimension information in types. You can also watch a video presentation about Orthotope.

There is likely no way to translate this to OCaml without significant type system extensions, but some of the concepts and ideas might inspire similar work.

Topic		Replies	Views
Implementing map for a size-indexed vector using GADTs Learning gadt	3	1055	April 1, 2019
Beginner trying a simple GADT needs some guidance Learning	8	682	May 27, 2022
"Parsing" terms into a well-typed representation: a GADT puzzle Learning gadt	4	2597	October 30, 2021
A bestiary of GADT examples? Ecosystem gadt	11	1779	May 18, 2023
Filtering constructors by type variable (GADTs + polymorphic variants) Learning gadt , polymorphism	6	1137	August 7, 2020

GADTs for tracking dimensions of multidimensional arrays?

Related topics