'Binding' to SQL queries

After coming across projects like GitHub - ygrek/sqlgg: SQL Guided (code) Generator , I’ve started thinking of SQL mapping in a more functional way: mapping to SQL queries should look like binding to a function. E.g., conceptually,

external add_user : string -> int -> unit = "insert into user (name, ranking) values (?, ?)"

After playing around with this idea for a bit I came up with this style:

let add_user db name ranking =
  query db "insert into user (name, ranking) values (?, ?)" (text name) (int ranking) ret_unit
(* val add_user : db -> string -> int -> (unit, string) result *)

The query, text, int, ret_unit values above are designed to work together so that effectively query is a variadic function. E.g. a query with no arguments and one integer output:

let count_users db =
  only @@ query db "select count(*) from user" ret_int64
(* val count_users : db -> int64 *)

(only is a convenience to assert that there is only one element in the result Seq.t, and get it.)

I’ve implemented this for SQLite, repo link in case of interest: GitHub - yawaramin/ocaml_sql_query: PoC of functional-style SQL query

8 Likes

Looks quite similar to this :–)

9 Likes

Also looks similar to my sqlite3_utils library:

(With the :: and [] constructors, one can re-use the list syntax!).

4 Likes

Wow, we even have the same data model (person with name, age) for our examples!

That’s a cool example! It just helped me fix a design flaw. Translated to my library:

# let fib n = only @@ query
  db
  "with recursive fib(a,b,c) as
    ( values (1,1,1),(2,1,2) UNION select a+1, c, b+c from fib where a<=?)
    select c from fib where a = ?;"
  (int n)
  (int n)
  ret_int64
val fib : int -> int64 = <fun>

# fib 10;;
- : int64 = 89L

# fib 100;;
Exception: Invalid_argument "Expected int".

For large output numbers, SQLite actually switches the return type to text! I updated my ret_text handler to account for that:

# let fib n = only @@ query
  db
  "with recursive fib(a,b,c) as
    ( values (1,1,1),(2,1,2) UNION select a+1, c, b+c from fib where a<=?)
    select c from fib where a = ?;"
  (int n)
  (int n)
  ret_text;;
val fib : int -> string = <fun>

# fib 10;;
- : string = "89"

# fib 100;;
- : string = "5.73147844014e+20"

I have my own version of this this that I use for my own postgresql library and sqlite3 library, but took a slightly different approach. I, too, chose a person name and age :slight_smile:

let insert_sql =
          Pgsql_io.Typed_sql.(
            sql
            /^ "insert into foo (name, age) values($name, $age)"
            /% Var.text "name"
            /% Var.smallint "age")
in

And to use this:

Pgsql_io.Prepared_stmt.execute conn insert_sql "Test" 26

And here is a sqlite3 one that returns something (I think this is a bit clunky but I don’t have a great solution for it:

          Sql.(
            sql
            // Ret.string
            // Ret.int
            /^ "SELECT name, age FROM foo WHERE name = $name order by age asc"
            /% Var.string "name")

I chose to use named variables instead of just ? because it gives me some extra flexibility. In particular, let’s say I want to translate a complex external query into a SQL query, as in in the UI for my application someone can do (name = foo or name = bar) and age >18 and age < 65 I can pass in an array of names and an array of ages and then construct a query that accesses those indices in the passed in array in the constructed query, this way I avoid having to create a new type on the fly based on user input. But to accomplish that I need to refer to the variable by name.

Additionally, using names decouples the order of variables in the query from the function. That is really helpful when refactoring a large query to add a new variable, where the variables end up in the query doesn’t change the type of the function.

2 Likes

Caqti by @paurkedal is similar:

Caqti_request.create Caqti_type.(tup2 string string) 
  Caqti_type.int Caqti_mult.one 
  "insert into migrations (name, hash) values ($1, $2) returning id"

When I used it last time, I crated a helper SQL module that simply includes Caqti_{request,type,mult}, so the same code can look like that:

SQL.(create (tup2 string string) int one)
  "insert into migrations (name, hash) values ($1, $2) returning id"
2 Likes

FYI, in the Haskell postgresql-simple library (and other similar libs), typeclasses are used to bridge the gap between the type system and SQL, enforcing the use of types that can be converted to SQL arguments and types that can be built from what a query returns.

query :: (ToRow args, FromRow result) => Connection -> Query -> args -> IO [result]

However, as the SQL query is not strongly typed, they have to annotate the polymorphic function to guide the typeclass inference, which is a similar overhead to your use of functions to build return types explicitly. Therefore, your proposition looks like a reasonable equivalent to their solution, IMHO.

1 Like

Speaking about bindings, that’s very close to the ctypes library.

1 Like

Some new operators are scheduled for the next Caqti release, making the syntax a bit more airy:

let bounds_upto_req =
  let open Caqti_type.Std in
  let open Caqti_request.Infix in
  tup2 int32 float --> option (tup2 float float) @:-
  "SELECT min(y), max(y) FROM samples WHERE series_id = ? AND x < ?"

with alternative operators -->. (exec), -->? (zero or one now), -->* (multi-row) for other multiplicities and alternatives to @:- for different ways of constructing the query.

(I might do something about the double open, though I also expect real applications will assemble their own module with additional definitions.)

1 Like

Is there a reason to keep tup2 instead of the list overload syntax? :slight_smile:

Very good question. I have tried to think in list operator overloading in the interface, but have ended up with some reservations each time. Here is my thinking as far as I recall. If we do this only on the type side, we will be left with passing/matching deeply nested parenthesis, so we also need an alternative to tuples on the value side. So we’ll also make a list-overloaded tuple values, i.e. Caqti_type.[...] consuming/producing say Tuple.[...]. We will also want to eliminate the nested parentheses in type expressions, using types like ('a1 -> ... -> 'aN -> unit) Tuple.t instead of ('a1 * (... * 'aN)...) Tuple.t. In some cases a free variable instead of unit will allow certain additional operations AFAIR, but the extra parameter will cause generalization issues here since we are defining global values. So, if I recall correctly, this would work with an additional Tuple module definition, but one thing I don’t like, is that the user will need to use the type constructors directly, while the preferred API is the Caqti_type.Std which I think is nicer apart from the tuples. I didn’t completely shelf the idea, but I didn’t think it was an obvious improvement, considering at adds some complexity on its own to the API.

For the next release, instead consider adding higher arity tuples. This has just become more feasible after refactoring the tuple from the driver code into a common library.

I found a way of factoring out tuple-cases into a generic Tuplet library. I didn’t try to use it for the SQL type representation in Caqti yet, but the test should cover what is needed.