Caqti 0.9.0 release - Compatibility layer for database clients

I am pleased to finally announce the first release of Caqti to the main opam repository. Caqti is an RDBMS client library for OCaml, currently shipped with drivers for MariaDB, PostgreSQL, and Sqlite3. In addition to serving as a compatibility layer, it provides monadic concurrency with support for Lwt and Async.

TL;DR (yes it is; hopefully some of you will read anyway): There is an example linked from the GitHub page.

Design

Caqti accepts a URI which provides the location, authentication, and configuration for a particular database, and returns a pool of first-class modules implementing a common signature used to execute queries. The pooled modules include the connection object, so they are meant to be used only in local contexts. Drivers are loaded automatically if you link against caqti-dynload; otherwise, link against the drivers you need.

The client can define fixed requests (queries) at the module level. When executed, Caqti prepares and caches the prepared statements within the module corresponding to the connection. Caching of prepared statements can be disabled in case the query is generated on the fly.
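The caching idea can be sketched with the standard library alone. Everything here is hypothetical and is not Caqti's implementation: the `request` and `connection` records, `make_request`, `prepare`, and `statement_for` are illustration names, and "preparing" just hands out a statement name instead of talking to a database.

```ocaml
(* Hypothetical stand-ins; a real driver would hold a database handle. *)
type request = { id : int option; sql : string }  (* id = None: one-shot *)

let next_id = ref 0

(* Requests defined once at module level get a stable identity. *)
let make_request ?(oneshot = false) sql =
  if oneshot then { id = None; sql }
  else begin
    incr next_id;
    { id = Some !next_id; sql }
  end

type connection = {
  cache : (int, string) Hashtbl.t;  (* request id -> statement name *)
  mutable prepare_count : int;      (* how often we "prepared" *)
}

let connect () = { cache = Hashtbl.create 16; prepare_count = 0 }

(* Pretend to prepare a statement on this connection. *)
let prepare conn =
  conn.prepare_count <- conn.prepare_count + 1;
  Printf.sprintf "stmt_%d" conn.prepare_count

(* Reuse the cached statement unless the request is one-shot. *)
let statement_for conn req =
  match req.id with
  | None -> prepare conn
  | Some id ->
    (match Hashtbl.find_opt conn.cache id with
     | Some stmt -> stmt
     | None ->
       let stmt = prepare conn in
       Hashtbl.add conn.cache id stmt;
       stmt)
```

Because the cache lives in the connection record, two connections never share prepared statements, and a one-shot request never occupies a cache slot.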

Caqti handles encoding and decoding of results based on types specified along with the queries. Most common datatypes are supported except for the time-of-day (time) type. The elementary types are represented as an open variant, which can be extended by providing conversion functions between the types implemented by drivers. On top of the elementary types, one can form tuples and custom types converted to and from tuples, though these custom types should be considered experimental for now.

On the other hand, Caqti does not provide a DSL for defining queries. Queries are basically strings, though in the form of templates in order to ease generation of query strings and to support both ?-style and $n-style parameters independent of which database driver is used. The primitive request constructor actually takes a function which receives information about the driver and returns a template, in order to allow the client code to dispatch between different RDBMS where needed.
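To make the template idea concrete, here is a minimal standard-library sketch. The `dialect`, `template`, and `driver` types and the `render` and `current_time` functions are hypothetical names for illustration, not Caqti's actual API.

```ocaml
(* Hypothetical types for illustration; not Caqti's actual API. *)
type dialect = Question_mark | Dollar_numbered

type template =
  | L of string         (* literal SQL text *)
  | P of int            (* parameter, 0-based index *)
  | S of template list  (* sequence of fragments *)

(* Render a template in the parameter style a driver expects.
   The ?-style is positional, so this sketch assumes parameters
   occur in increasing index order, each exactly once. *)
let render dialect template =
  let buf = Buffer.create 64 in
  let rec go = function
    | L s -> Buffer.add_string buf s
    | P i ->
      (match dialect with
       | Question_mark -> Buffer.add_char buf '?'
       | Dollar_numbered ->
         Buffer.add_string buf (Printf.sprintf "$%d" (i + 1)))
    | S ts -> List.iter go ts
  in
  go template;
  Buffer.contents buf

let find_user =
  S [L "SELECT name FROM users WHERE id = "; P 0; L " AND active = "; P 1]

(* A request as a function of the driver, so SQL can differ per RDBMS. *)
type driver = Pgsql | Sqlite

let current_time = function
  | Pgsql -> L "SELECT now()"
  | Sqlite -> L "SELECT datetime('now')"
```

With this, `render Question_mark find_user` and `render Dollar_numbered find_user` produce the same query in the two parameter styles, and `current_time` shows how a template can be chosen per driver.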

In other words, the focus of Caqti is to provide the most basic compatibility layer to allow writing code which works across RDBMS. It is okay to use it directly, but there is also room for higher level interfaces like code generators or DSLs implemented in plain OCaml or with the help of PPXes. I have seen several good approaches, but I’ll leave that for the discussion and future work.

Status

The API underwent some redesign this fall, but is largely based on code which I have used myself for a few years. Lwt and PostgreSQL are the best-tested components, since I’ve used them in all my production code, though with help from @andrenth, I believe the MariaDB component is fairly solid now too. Both the PostgreSQL and MariaDB drivers are based on asynchronous calls provided by the respective C bindings. The Sqlite3 driver is a fairly straightforward preemptive wrapper around the already stable C bindings.

There may still be issues with conversion of certain values depending on the database, especially when the type cannot be inferred. Handling conversion as well as possible is one of the goals of Caqti, so please report any such issue in the issue tracker or submit a PR with a test case.

This being the first properly announced release, I think it is also a good time for you to report back if we need adjustments to the API. Consider that also a warning, though I hope we can avoid substantial redesign.

The API documentation is not on-line at the moment, but can be generated with odig odoc.

Please ignore the Caqti1_* modules and *.v1 findlib libraries, they will be deprecated (as soon as I rewrite the epiSQL code generator, which is not released in opam-repository yet).

Thanks

Thanks to Markus Mottl, Alain Frisch, Christian Szegedy, and Andre Nathan for a great job on providing bindings to the C libraries, which requires special expertise in memory management, debugging, and packaging.

Congrats on the release, Petter, and thanks for all your help making the MariaDB bindings stable.

Looks very good, thanks for releasing this!

I released Caqti 0.10.0 with the following changes:

  • Added -linkall flags to driver libraries to fix direct linking (thanks to @bobbypriambodo).
  • Added convenience functions collect_list and rev_collect_list.
  • Renamed template to query and related function, leaving deprecated
    aliases.
  • Added ptime_span field type mapping to SQL intervals.
  • Be more permissive about types of data returned from MariaDB when
    expecting numerical results.

How could I create a custom type of my own for a piece of data which requires more than 4 columns (i.e. I couldn’t describe it using tup*).

The tuple constructors are recursive, so only tup2 is really needed:

utop # Caqti_type.(let (&) = tup2 in int & bool & string & int);;
- : (int * (bool * (string * int))) Caqti_type.t =
Caqti_type.Tup2 (Caqti_type.Field Caqti_type.Int,
 Caqti_type.Tup2 (Caqti_type.Field Caqti_type.Bool,
  Caqti_type.Tup2 (Caqti_type.Field Caqti_type.String,
   Caqti_type.Field Caqti_type.Int)))

which will produce values like (1, (true, ("blah", 99))). The main reason I added tup3 and tup4 is to take the edge off when dealing with wide tuples. This gives a bit of flexibility in grouping parts of a long tuple. A second reason is to allow code generators to optimise extraction of result rows by delivering more values per allocation.
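To see why a single binary constructor is enough, here is a self-contained sketch using a hypothetical mini descriptor type (`typ`, `width`, and `decode` are illustration names, not `Caqti_type` itself). It decodes a flat row of string columns into the nested tuple that the descriptor describes.

```ocaml
(* Hypothetical mini descriptor type; not Caqti_type itself. *)
type _ typ =
  | Int : int typ
  | Bool : bool typ
  | String : string typ
  | Tup2 : 'a typ * 'b typ -> ('a * 'b) typ

(* [&] is right-associative, so chains nest to the right. *)
let ( & ) a b = Tup2 (a, b)

(* Number of columns a descriptor spans. *)
let rec width : type a. a typ -> int = function
  | Int -> 1
  | Bool -> 1
  | String -> 1
  | Tup2 (a, b) -> width a + width b

(* Decode a flat row of string columns.
   Returns the decoded value and the columns left over. *)
let rec decode : type a. a typ -> string list -> a * string list =
  fun t row ->
    match t, row with
    | Tup2 (a, b), _ ->
      let x, rest = decode a row in
      let y, rest = decode b rest in
      ((x, y), rest)
    | Int, x :: rest -> (int_of_string x, rest)
    | Bool, x :: rest -> (bool_of_string x, rest)
    | String, x :: rest -> (x, rest)
    | _, [] -> failwith "row too short"

let row_type = Int & Bool & String & Int
```

Here `decode row_type ["1"; "true"; "blah"; "99"]` yields the nested value `(1, (true, ("blah", 99)))` together with an empty list of leftover columns.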

Ooh this is a really neat trick. Is this already documented somewhere?

I haven’t documented it in Caqti, but Eliom at least uses a binary operator to construct tuple types, and I might have seen it elsewhere too. I might add the operator to Caqti_type, though the flip side is that, while the type expression is nice, the value expressions can only be matched with deeply nested parentheses. Picking a right-associative operator helps a bit when editing values and patterns.

Another trick I considered was to define a polymorphic list by locally overloading the [] and (::) list constructors to create syntactically flat expressions for both tuple-like values and corresponding type descriptors:

module Tuple = struct
  type _ t =
    | [] : unit t
    | (::) : 'a * 'b t -> ('a * 'b) t
end

module Type = struct
  type _ t =
    | ...
    | [] : unit Tuple.t t
    | (::) : 'a t * 'b Tuple.t t -> ('a * 'b) Tuple.t t
end

I can’t recall where I saw that; I think it was mentioned on here. In the end, though, I opted to stick with regular tuples.

The big advantage of overloading lists for that purpose is that you can pattern match with them as well:

# let f Tuple.[x; y; z] = x + y + int_of_string z ;;
val f : (int * (int * (string * unit))) Tuple.t -> int = <fun>
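For anyone who wants to try this outside the toplevel, the following self-contained sketch repeats the `Tuple` module from above and adds a `length` function as an illustration (the `length` and `row` names are additions for this example). Redefining `[]` and `(::)` requires OCaml 4.03 or later.

```ocaml
module Tuple = struct
  type _ t =
    | [] : unit t
    | (::) : 'a * 'b t -> ('a * 'b) t

  (* Structural recursion works for any mix of element types;
     the explicit annotation enables polymorphic recursion. *)
  let rec length : type a. a t -> int = function
    | [] -> 0
    | _ :: rest -> 1 + length rest
end

(* Flat list syntax builds a nested, heterogeneous value. *)
let row = Tuple.[1; 2; "39"]

(* The same flat syntax works in patterns. *)
let f Tuple.[x; y; z] = x + y + int_of_string z
```

The pattern in `f` is exhaustive because the type index fixes the number of elements, so a value of any other length is rejected at compile time.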

I would recommend using arrows instead of tuples as the type level though, as in my blog post: it’s significantly more readable. :wink:

One of my earlier sketches for the API rewrite was based on a double format-like type. Along the lines of

type ('f, 'r, 'g, 'q) request =
  | Query : string -> ('r, 'r, 'q, 'q) request
  | Param : 'a typ * (var -> ('f, 'r, 'g, 'q) request) -> ('f, 'r, 'a -> 'g, 'q) request
  | Result : 'a typ * (var -> ('f, 'r, 'g, 'q) request) -> ('a -> 'f, 'r, 'g, 'q) request
...
val find : ('f, 'a, 'g, 'a option future) request -> 'f -> 'g
val fold : ('f, 'a -> 'a, 'g, 'a -> 'a future) request -> 'f -> 'g
val fold_s : ('f, 'a -> 'a future, 'g, 'a -> 'a future) request -> 'f -> 'g

if we skip over the less interesting details. This completely removes the need for tuples as basic building blocks, since values are passed curried to the result handler of type 'f and parameters are passed curried to the execution functions. However, the value restriction blocked my attempts to define convenient combinators for constructing queries and made it difficult to obtain a fixed identifier to use for recording identities of prepared queries.

I agree that encoding tuple types in terms of function types is more readable, but I think the free type parameter it needs can easily get in the way, and in particular it may be easier to build a higher-level interface on top of Caqti if there are no “free” type parameters.

The paper you cite seems like the best solution (well, the only one I’ve seen), though I wonder whether it would not be sufficient to introduce a pseudo-variance which only testifies to the fact that the type parameter does not occur in a mutable position in the type.

Update: Forgot about the talk about typed effects mentioned in About Multicore, which could also resolve the value restriction.

Having or not a type variable at the end is independent from whether you use tuples or functions:

module Tuple = struct
  type _ t =
    | [] : unit t
    | (::) : 'a * 'b t -> ('a -> 'b) t
end

# let a = Tuple.[1 ; "2" ; 3.] ;;
val a : (int -> string -> float -> unit) Tuple.t

Adding the type variables at the end gives you the ability to typecheck append (among other things). The choice between product or arrow types is purely syntactic.
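As a sketch of what the trailing type variable buys, here is a two-parameter variant of the type above, written for illustration only (`append`, `apply`, and the example values are names introduced here). The second parameter is the open tail, and `append` typechecks by plugging one list's tail into the other's index.

```ocaml
module Tuple = struct
  (* ('f, 'r) t: 'f is a curried function type over the elements,
     ending in the tail variable 'r. *)
  type (_, _) t =
    | [] : ('r, 'r) t
    | (::) : 'a * ('f, 'r) t -> ('a -> 'f, 'r) t

  (* Appending composes the indices through the shared middle type. *)
  let rec append : type f m r. (f, m) t -> (m, r) t -> (f, r) t =
    fun xs ys ->
      match xs with
      | [] -> ys
      | x :: rest -> x :: append rest ys

  (* Feed the elements to a curried function, one argument at a time. *)
  let rec apply : type f r. (f, r) t -> f -> r =
    fun xs k ->
      match xs with
      | [] -> k
      | x :: rest -> apply rest (k x)
end

let a = Tuple.[1; "2"]
let b = Tuple.[3.]

(* The result of an application is not generalised (value restriction);
   its tail variable gets fixed to string by the use below. *)
let abc = Tuple.append a b

let s = Tuple.apply abc (fun i str f -> Printf.sprintf "%d %s %.1f" i str f)
```

Note that `a` and `b` are built from constructors only, so their tail variable stays polymorphic, while `abc` is the result of a function call and gets a weak type variable. That is a small instance of the value restriction getting in the way.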


That being said, I agree. The value restriction really gets in the way when doing these kinds of things and makes them quite inconvenient.

Indeed, I’ll keep that in mind in case I introduce this.

I made another bugfix release (0.11.1). It contains only one fix, but I think it was important enough, since the bug could lead to a deadlock if there were problems with the database connection.

(On a minor note, I also have a PR to disable testing via opam due to issues with internal dependencies. I expect to re-enable it for the next release.)