Caqti: confused about passing around a Db module as a function parameter

Hello!

I’m having issues trying to structure my project better with Caqti.

Firstly, I define a customer_repo.ml file like this:

(* ./lib/customer_repo.ml *)
module Q = struct
  open Caqti_request.Infix

  let create_tbl =
    Caqti_type.(unit ->. unit)
    @@ {|
   CREATE TABLE IF NOT EXISTS customers
     ( id INTEGER PRIMARY KEY AUTOINCREMENT
     , first_name VARCHAR(255)
     , last_name VARCHAR(255)
     )
   |}
  ;;

  let insert =
    Caqti_type.(tup2 string string ->. unit)
    @@ {|
  INSERT INTO customers (first_name, last_name)
  VALUES (?, ?)
    |}
  ;;
end

let create_tbl (module Db : Caqti_lwt.CONNECTION) = Db.exec Q.create_tbl

let insert (module Db : Caqti_lwt.CONNECTION) first_name last_name =
  Db.exec Q.insert (first_name, last_name)
;;

Then I create a binary that will only be concerned about setting up the database:

(* ./bin/setup.ml *)
module Db : Caqti_lwt.CONNECTION =
  (val let cwd = Sys.getcwd () in
       let path = Printf.sprintf "sqlite3://%s/db.sqlite3" cwd in
       let connect = Caqti_lwt.connect (Uri.of_string path) in

       Lwt_main.run (Lwt.bind connect Caqti_lwt.or_fail))

let info_log fmt = Printf.printf ("[INFO] " ^^ fmt ^^ "\n%!")
let err_log fmt = Printf.printf ("[ERROR] " ^^ fmt ^^ "\n%!")

let () =
  let open Lwt_result.Syntax in
  let all_promises : (unit, 'error) result Lwt.t =
    let* () = Lib.Customer_repo.create_tbl Db () in
    let* () = Lib.Customer_repo.insert Db "John" "Doe" in
    Lwt.return_ok ()
  in

  Lwt_main.run all_promises |> function
  | Ok () -> info_log "Setup OK!"
  | Error e -> err_log "%s" (Caqti_error.show e)
;;

The issue comes from (module Db : Caqti_lwt.CONNECTION), which looks like a function parameter that could be passed around.

However, I get this type error which I don’t understand:

Error: This expression should not be a constructor, the expected type is (module Caqti_lwt.CONNECTION)

It seems to me that I’m passing specifically a module that has the type Caqti_lwt.CONNECTION.

In fact, I don’t really understand this definition:

module Db : Caqti_lwt.CONNECTION =
  (val let cwd = Sys.getcwd () in
       let path = Printf.sprintf "sqlite3://%s/db.sqlite3" cwd in
       let connect = Caqti_lwt.connect (Uri.of_string path) in

       Lwt_main.run (Lwt.bind connect Caqti_lwt.or_fail))

I understand I’m initializing a module correctly (I can see the many functions it provides), but I’m confused about the “val syntax”, what it represents, what this type of module is exactly, and what I can do/not do with it.

If I define this module in customer_repo.ml directly, and stop trying to pass it around as a function parameter, then it works!

But I can also see that receiving the module as a function parameter is valid syntax too, if I look at the bikereg example. So I’m obviously confused.

I looked at various module docs, but nothing seems to fit. I’ve learnt about modules as a way to namespace things and to control the private/public status of their functions via mli files, and I have only a very basic understanding of functors (I haven’t learnt them properly yet).

But this module seems to be something else. What am I missing?

OCaml has a value-level language and a module-level language. The syntax (val ...) allows using a value expression in the module language and (module ...) allows using a module expression in the value language. A type or signature can be added with ... : ... if needed. To pass Db as a function argument, write (module Db : Caqti_lwt.CONNECTION) where the signature judgement can probably be omitted in this case.

To elaborate on my previous answer, module Db : C = (val ...) unpacks the first-class module into a global module, which makes it directly usable. However, since you pass it around before using it, you might as well keep it as a value (let db = ... without val) until it’s unpacked by create_tbl and insert.

(Note that (module ...) in pattern or formal argument position unpacks a module for the scope, while (module ...) as an actual argument packs the module as a first-class module value.)
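
As a minimal sketch of the packing and unpacking described above, using only the standard library: the CONNECTION signature, Fake_db and run below are hypothetical stand-ins for illustration, not Caqti's real API.

```ocaml
(* Hypothetical stand-in for Caqti_lwt.CONNECTION; not the real signature. *)
module type CONNECTION = sig
  val name : string
  val exec : string -> (int, string) result
end

(* An ordinary module implementing the signature. *)
module Fake_db : CONNECTION = struct
  let name = "fake"

  let exec query =
    if query = "" then Error "empty query" else Ok (String.length query)
end

(* Pattern position: (module Db : ...) unpacks the argument, so Db is a
   regular module inside the function body. *)
let run (module Db : CONNECTION) query = Db.exec query

(* Expression position: (module ...) packs a module into a first-class value. *)
let db : (module CONNECTION) = (module Fake_db : CONNECTION)

(* (val ...) goes the other way and turns the value back into a module. *)
module Unpacked = (val db : CONNECTION)

(* Both calls pass a packed value. Writing [run Fake_db "SELECT 1"] is
   rejected, because a bare capitalized name in expression position is read
   as a constructor, which is what the error in the question reports. *)
let from_value = run db "SELECT 1"
let from_module = run (module Fake_db) "SELECT 1"
```

In the second call the signature is left off, since the compiler already knows the expected type from run.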

Thank you so much.

You actually gave me just enough to find out more. I was in the process of posting this updated snippet as you replied:

module Customer_repo = Lib.Customer_repo

(*
 * UTILS
 *)
let info_log fmt = Printf.printf ("[INFO] " ^^ fmt ^^ "\n%!")
let err_log fmt = Printf.printf ("[ERROR] " ^^ fmt ^^ "\n%!")

(*
 * DATABASE
 *)

let init_conn () =
  let cwd = Sys.getcwd () in
  let path = Printf.sprintf "sqlite3://%s/db.sqlite3" cwd in
  let connect = Caqti_lwt.connect (Uri.of_string path) in

  Lwt_main.run (Lwt.bind connect Caqti_lwt.or_fail)
;;

(*
   "Unpack" the `Caqti_lwt.CONNECTION` module into `db`,
   by creating an anonymous `first-class module` with the `val` keyword (on the left).

   See: https://dev.realworldocaml.org/first-class-modules.html
  *)
let db = (module (val init_conn ()) : Caqti_lwt.CONNECTION)

(*
 * BOOTSTRAP
 *)

let () =
  let open Lwt_result.Syntax in
  let all_promises : (unit, 'error) result Lwt.t =
    let* () = Customer_repo.create_tbl db () in
    let* () = Customer_repo.insert db "John" "Doe" in
    let* () = Customer_repo.insert db "Jane" "Doe" in
    Lwt.return_ok ()
  in

  Lwt_main.run all_promises |> function
  | Ok () -> info_log "Setup OK!"
  | Error e -> err_log "%s" (Caqti_error.show e)
;;

This feature is pretty advanced, so I think it’ll take a while before I’m comfortable with it. But as-is, I think I have just enough knowledge to put this on the side, to get back to it at a later point.

I’m also learning about lwt and wondering: does it make sense to have 2 schedulers running in one process?

  • one for Caqti
  • one for the app

I understood that it’s better to have one Lwt_main.run call on app init.

Any thoughts on that?

Yes, I think it’s better to have one Lwt_main.run. Note that if you call it multiple times in a single-threaded program, it will just run one instance of the Lwt scheduler after the other. So, my answer if I understand your question correctly is that there’ll be one scheduler.

My recommendation for real applications would be to create one main function which handles any command-line arguments, invokes Lwt_main.run, reads in any configuration, allocates a connection up-front or creates a connection pool, then proceeds with its task.

Note that let db = (module (val init_conn ()) : Caqti_lwt.CONNECTION) is just let db = init_conn (), since (val ...) unpacks the module and (module ...) re-packs it.
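
A small stdlib-only sketch of that equivalence, assuming a hypothetical CONNECTION signature and a hypothetical init_conn that returns a packed module (neither is Caqti's real definition):

```ocaml
(* Hypothetical stand-in signature; not Caqti's. *)
module type CONNECTION = sig
  val driver : string
end

(* Hypothetical connector returning a first-class module value. *)
let init_conn () : (module CONNECTION) =
  (module struct
    let driver = "sqlite3"
  end : CONNECTION)

(* Helper that unpacks its argument to read a field. *)
let driver_of (module Db : CONNECTION) = Db.driver

(* Keep the packed value exactly as returned. *)
let direct = init_conn ()

(* Unpack with (val ...) and immediately re-pack with (module ...):
   the result has the same type and exposes the same contents. *)
let round_trip = (module (val init_conn ()) : CONNECTION)
```

Both direct and round_trip have type (module CONNECTION), so either can be handed to driver_of.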

You’re right that first-class modules are an advanced feature, maybe one which one shouldn’t need to know about to use a database in an application. Some idiomatic usage might be acceptable, but can spill into confusion, as this thread indicates.

First-class modules fit the internal design perfectly, but it would be possible to hide them from the end user in an alternative API. Your above functions might then look like:

let create_tbl db = Caqti_lwt.exec db Q.create_tbl

let insert db first_name last_name =
  Caqti_lwt.exec db Q.insert (first_name, last_name)

Though, I’m a bit wary of adding more ways to do the same thing, as we might then see a split in both code and documentation depending on preference.
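
To illustrate how such a wrapper could hide the first-class module from callers, here is a stdlib-only sketch. Everything in it is hypothetical: the CONNECTION signature, the connection type, the exec wrapper and the in-memory fake are assumptions for illustration and do not reflect an existing Caqti API.

```ocaml
(* Hypothetical stand-in signature; the real one is much richer. *)
module type CONNECTION = sig
  val exec : string -> string list -> (unit, string) result
end

(* Callers only ever see this ordinary-looking type. *)
type connection = (module CONNECTION)

(* The wrapper unpacks the module once, so repository code never has to
   write (module ...) or (val ...) itself. *)
let exec (db : connection) query params =
  let module Db = (val db : CONNECTION) in
  Db.exec query params

(* Repository functions now take db as a plain value. *)
let insert db first_name last_name =
  exec db
    "INSERT INTO customers (first_name, last_name) VALUES (?, ?)"
    [ first_name; last_name ]

(* In-memory fake that records calls instead of talking to a database. *)
let log : (string * string list) list ref = ref []

let fake : connection =
  (module struct
    let exec query params =
      log := (query, params) :: !log;
      Ok ()
  end : CONNECTION)
```

With this shape, insert fake "John" "Doe" reads like a call on a normal value, and the packing stays confined to the place where the connection is created.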

Thanks for your valuable feedback @paurkedal

I think it’s fine to expect a user to come up to speed and use an advanced feature if it makes sense for the library’s architecture. To help beginners, maybe adding a more featureful example would help? Would you be interested in such a contribution? Otherwise, maybe adding a simple comment would be sufficient.

Overall, I agree that having only one way to do things would be better.

Another example will be appreciated.

Great! I’ll get back to you then :)