Errors with Base/Core and an imported PPX or deriving

I’m very new to OCaml (~2 weeks) and I keep running into issues when adding Base/Core and modules that implement either PPX or deriving.

How are you supposed to mitigate these issues?

Both took quite a bit of work to track down. Is there an easier way to understand where the problems are coming from?

Two examples follow:

The first issue is when I pull in Base with [@@deriving yaml]

open Base

type book = { 
  title: string;
  authors: string list
} [@@deriving yaml]

Compile error:

File "bin/main.ml", line 9, characters 11-22:
9 |   authors: string list
               ^^^^^^^^^^^
Error (warning 6 [labels-omitted]): label f was omitted in the application of this function.
File "bin/main.ml", line 9, characters 11-17:
9 |   authors: string list
               ^^^^^^
Error: This expression should not be a function, the expected type is
       'a Base.List.t

The second issue came up when I used Core and pgocaml's PPX in the same module: compilation failed. I reduced the issue to the following, where String.concat "" ["a"] appears in the code generated by the pgocaml PPX.

Starting with:

open Core
open PGOCaml
let () =
  let dbh = PGOCaml.connect () in
  let insert name salary =
    [%pgsql dbh "insert into employees (name, salary) VALUES ($name, $salary)"]
  in
  ignore(insert "Chris" 1_000.0);

Running ppxfind then generates the following. Below is a simplified version of the result, which conflicts with Core.

(* 
PGHOST=localhost PGUSER=postgres PGDATABASE=postgres PGPASSWORD=example ppxfind -legacy pgocaml_ppx ./bin/main.ml
PGHOST=localhost PGUSER=postgres PGDATABASE=postgres PGPASSWORD=example dune exec bin/main.exe -- 
*)
open Core
open PGOCaml
let () =
  let dbh = PGOCaml.connect () in
  let insert name salary =
    PGOCaml.bind
      (let dbh = dbh in
       let params : string option list list =
         [[Some (((let open PGOCaml in string_of_string)) name)];
         [Some (((let open PGOCaml in string_of_float)) salary)]] in
       let split =
         [`Text "insert into employees (name, salary) VALUES (";
         `Var ("name", false, false);
         `Text ", ";
         `Var ("salary", false, false);
         `Text ")"] in
       let i = ref 0 in
       let j = ref 0 in
       let query =
         String.concat ""
           (List.map
              (function
               | `Text text -> text
               | `Var (_varname, false, _) ->
                   let () = incr i in
                   let () = incr j in "$" ^ (string_of_int j.contents)
               | `Var (_varname, true, _) ->
                   let param = List.nth params i.contents in
                   let () = incr i in
                   "(" ^
                     ((String.concat ","
                         (List.map
                            (fun _ ->
                               let () = incr j in
                               "$" ^ (string_of_int j.contents)) param))
                        ^ ")")) split) in
       let params = List.flatten params in
       let name = "ppx_pgsql." ^ (Digest.to_hex (Digest.string query)) in
       let hash =
         try PGOCaml.private_data dbh
         with
         | Not_found ->
             let hash = Hashtbl.create 17 in
             (PGOCaml.set_private_data dbh hash; hash) in
       let is_prepared = Hashtbl.mem hash name in
       PGOCaml.bind
         (if not is_prepared
          then
            PGOCaml.bind (PGOCaml.prepare dbh ~name ~query ())
              (fun () -> Hashtbl.add hash name true; PGOCaml.return ())
          else PGOCaml.return ())
         (fun () -> PGOCaml.execute_rev dbh ~name ~params ()))
      (fun _rows -> PGOCaml.return ()) in
  ignore (insert "Chris" 1_000.0)

Compiling that results in this error:

File "bin/main.ml", lines 44-59, characters 11-38:
44 | ...........(List.map
45 |               (function
46 |                | `Text text -> text
47 |                | `Var (_varname, false, _) ->
48 |                    let () = incr i in
...
56 |                             (fun _ ->
57 |                                let () = incr j in
58 |                                "$" ^ (string_of_int j.contents)) param))
59 |                         ^ ")")) split)...
Error: The function applied to this argument has type ?sep:string -> string
This argument cannot be applied without label

Simplified down: String.concat "" ["a"] does not compile once Core is opened.

open Core
let () =
  let query = String.concat "" ["a"] in
  print_endline query; 
  
Compile Error:  
  The function applied to this argument has type ?sep:string -> string
  This argument cannot be applied without label 

Hi @gregberns, welcome to OCaml! I’m sorry to hear about the issues you ran into. In this case, the ppx will write code that references whatever List or String module is in scope at that point in the code. I’m not familiar with either of these ppxes, but one workaround you could use is to delay opening Base like so:

type book =
  { title : string
  ; authors : string list
  }
[@@deriving yaml]

open Base

Another option, which is quite a bit more convoluted, leaves open Base at the top of the module, as is conventional:

open Base

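module List = Stdlib.List
module String = Stdlib.String

type book =
  { title : string
  ; authors : string list
  }
[@@deriving yaml]

This would replace the List and String modules from Base with the ones from the standard library for the rest of the enclosing module. The compile error above points at Base.List.t, so List likely needs the same treatment as String here.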

Perhaps the authors of the ppxes might accept a patch to directly use Stdlib.String instead of unqualified String. That would also solve any issues that would arise if you were to open the StdLabels module instead of Base.
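To illustrate why a qualified Stdlib.String is more robust, here is a minimal sketch that needs only the standard library. The Replacement module is a hypothetical stand-in for Base/Core (it is not their real implementation); it just shadows String.concat with a ?sep signature like the one in the error message above.

```ocaml
(* Hypothetical stand-in for a stdlib replacement such as Base/Core:
   it shadows String.concat with an optional ~sep argument. *)
module Replacement = struct
  module String = struct
    include Stdlib.String

    let concat ?(sep = "") parts = Stdlib.String.concat sep parts
  end
end

open Replacement

(* Unqualified: resolves to the shadowing module, so the label is required.
   Writing [String.concat ", " ["a"; "b"]] here would not type-check. *)
let with_label = String.concat ~sep:", " [ "a"; "b" ]

(* Qualified: always the standard library, whatever has been opened. *)
let qualified = Stdlib.String.concat ", " [ "a"; "b" ]

let () = Printf.printf "%s | %s\n" with_label qualified
```

Generated code written in the second style keeps compiling regardless of which modules the user has opened.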


Hi, just a note that you don’t actually need to open modules. In fact it should be done quite carefully, as it brings every module and value from the opened module into scope, overriding (shadowing) existing things that were already in scope with the same name.

In both of your code examples I see no real reason to open Base or open Core; they don’t seem to be using any functionality from there. Also there’s no reason to open PGOCaml either; in the PG’OCaml README the example given does not do so and instead uses the module without opening it.

Just to clear up a possible misunderstanding: in OCaml you don’t need to open modules before using them. Modules are always available in a global namespace. So e.g. if you’re using PGOCaml.connect () in your code, you don’t need to open PGOCaml beforehand. The module is just available by having been linked in to your project’s compilation.
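As a small sketch of that point, using only standard library modules (the function names shout and total are made up for illustration), nothing below is opened and every module is still reachable by its full path:

```ocaml
(* No [open] anywhere: modules are referenced by their qualified names. *)
let shout s = String.uppercase_ascii s

let total = List.fold_left ( + ) 0 [ 1; 2; 3 ]

let () = Printf.printf "%s %d\n" (shout "hi") total
```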

Finally, you mention running ppxfind. As far as I know, that is not a standard workflow for using PPXs. The standard build tool for OCaml is dune, and it will take care of running PPXs for you. E.g. check this dune file, which shows an example of configuring the PG’OCaml PPX: pgocaml/dune at v4.0 · darioteixeira/pgocaml · GitHub


@bcc32 Thanks, I’ll play around with that. As of now, I’ve been putting most everything in one file, so it looks like there are conflicts. I’ll try what you suggest and maybe separate things into other files.

@yawaramin

If I leave out open PGOCaml I get this:

21 |     [%pgsql dbh "insert into employees (name, salary) VALUES ($name, $salary)"]
           ^^^^^
Error: Uninterpreted extension 'pgsql'.

I need to use elements from them for other things - these are minimal examples for clarity.
BUT - I may have just been using sub components (maybe Result or something) - and what you’re saying is to not bring in Base, and instead only bring in Base.Result. That may solve some of these issues!

Ok, so if you don’t open the module it’s still available - makes sense. But let’s say I need read_all from Stdio.In_channel: is the convention to specify the whole path (Stdio.In_channel.read_all) every time you use it? That seems annoying and verbose.

Also what about >>= from Base.Result. If I don’t include open Base.Result I get Unbound value >>=.

I’m using dune. I couldn’t find any documentation on how to get the PPX to work. Finally I found a snippet that led me to using this in my dune file, which allowed me to compile:

(libraries core pgocaml pgocaml_ppx)
  (preprocess
   (action
    (run ppxfind -legacy pgocaml_ppx ./bin/main.ml)))

This is probably antiquated, but I see no documentation in the repo on how to get the PPX configured - that worked anyways.

If I:

  • use (preprocess (pps pgocaml.ppx)), I get: Library "pgocaml.ppx" not found.
  • use (preprocess (pps pgocaml_ppx)), I get: Package pgocaml has no version ppx

Sometimes, when people use stdlib replacements, they prefer to open them everywhere — even where they aren’t presently being used — so they don’t accidentally start using stdlib stuff that they expected to be shadowed.


This means the project is configured incorrectly. You don’t need open PGOCaml. I’ll show the details later.

So ‘bring in’ is not really the way I would think about it. There’s no concept of importing or ‘bringing in’ modules in OCaml. They become available when their libraries are configured to be part of the build, in dune. I’ll show this later. You might also want to examine what exactly you need Base.Result for, since OCaml already ships with the result type and the Result module in its standard library.

If you are doing it often enough then you can alias a short name for the module, e.g. module INC = Stdio.In_channel, and do INC.read_all. Or you can alias that specific function if you’re using it all over the place: let read_all = Stdio.In_channel.read_all. Or you can use a ‘local open’, i.e. open the module for a small scope of a function body or even shorter, and use it there. The idea is to avoid globally opening modules all over the place and potentially messing up the default environment of the program.
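Here is a rough sketch of those three options. It uses standard library modules (Buffer, String, List) as stand-ins, since Stdio.In_channel would need an extra library; the names B, uppercase, greeting and lengths are made up for illustration.

```ocaml
(* 1. Module alias: a short name for a long module path. *)
module B = Stdlib.Buffer

let greeting =
  let buf = B.create 16 in
  B.add_string buf "hello";
  B.add_string buf ", world";
  B.contents buf

(* 2. Alias one specific function that is used all over the place. *)
let uppercase = String.uppercase_ascii

(* 3. Local open: List is only opened inside the parentheses. *)
let lengths words =
  List.(map String.length (filter (fun w -> w <> "") words))

let () = print_endline (uppercase greeting)
```

None of these change what String or List mean in the rest of the file, which is the point of avoiding a global open.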

You can use the methods I mentioned above to bring it into scope.
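For the operator case specifically, a minimal sketch with the standard library’s Result module looks like this. Stdlib’s Result has bind but no >>= operator, so the operator is defined here; with Base, I would expect aliasing Base.Result.( >>= ) or a local open of Base.Result to work the same way, but treat that as an assumption. parse_int and sum are hypothetical helpers.

```ocaml
(* Bring only the operator into scope instead of opening a whole module. *)
let ( >>= ) = Result.bind

(* Hypothetical helper: parse a string, reporting failure as an Error. *)
let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not an int: " ^ s)

(* Chain two fallible steps; the first Error short-circuits the rest. *)
let sum a b =
  parse_int a >>= fun x ->
  parse_int b >>= fun y ->
  Ok (x + y)
```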

Yeah, unfortunately, it’s not very obvious. But you can figure out the correct way by checking the repo, if you know a few things:

  1. The repo lists two opam files at the root of the project: pgocaml.opam and pgocaml_ppx.opam. opam files represent the names of the packages published by this repo. You will want to install them to use it: opam install pgocaml pgocaml_ppx (you might have done this already, check if installed by running opam list).
  2. You can use the dune init command to start a new project with a set of libraries and PPXs scaffolded for you automatically. Check dune init --help for details, but the command I used is: dune init proj pgo --libs pgocaml --ppx pgocaml_ppx. This creates a new directory called pgo with the project set up with idiomatic lib, bin, and test subdirectories. The lib directory is meant to contain the functionality of the project, and its dune file has the correct PG’OCaml dependency and PPX set up. The bin directory is meant to contain a thin executable wrapper (optional).
  3. The lib directory needs to contain a module with the same name as the value in the name field of the lib/dune file. In this case the name is pgo so create a file lib/pgo.ml to put your functionality in. Of course this depends on what you call your project.
  4. The bin directory comes with a file main.ml where you can write the main wrapper logic. This is meant to be executed.
  5. Run dune exec bin/main.exe from the project root to run the project.

As long as your PostgreSQL server is running and has the required user and database, and your SQL commands are correct, it will compile and run. I just tested it out.

Strongly agree with Levi here. It’s de rigueur in Jane Street’s codebase to open Base or Core at the top of a file, and for good reason. Base and Core change defaults in lots of important ways, hiding problematic functionality like polymorphic compare, and changing default behavior from what you might expect from the stdlib. For someone who is expecting Base/Core conventions, running into the Stdlib alternatives is confusing and can easily be a source of errors.


@Yaron_Minsky and @Levi_Roth Thanks. I’ve opened an issue in the repo.

@yawaramin Thanks, that helped clarify quite a bit.

Regarding the Standard Lib vs Core - I’d like to use Core if possible, but maybe we can save that discussion for another thread. I would like to understand the tradeoffs at some point; most of the discussions I’ve seen revolve around opinions rather than tradeoffs and are probably out of date at this point (since they’re several years old now).

By “bring in” I was just trying to say the functions in the opened module are now directly available (without the namespace) in the ‘file’/module. It is now definitely more clear that there can be challenges when top level modules like Base are opened.

That’s very cool. I was wondering how you could just pull in single functions, without polluting the space.

PGOCaml: Thanks for taking the time to do that. I had about 90% of that, but the preprocess was totally mis-configured. I’ve opened an issue in PGOCaml to help others get over the hurdle more easily.


If a ppx uses a runtime function (such as List.map) it is best practice to sanitize it to make sure it won’t refer to whatever List module is in scope. It looks like pgocaml does not do that, so fixing that would solve the issue.
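One way to do that, sketched under assumptions: the ppx ships a small runtime module that pins the stdlib modules it relies on, and wraps the code it generates in a local open of that module. Both Hypothetical_ppx_runtime and Replacement below are invented for illustration; this is not what pgocaml currently generates.

```ocaml
(* Hypothetical runtime module a ppx could ship: it pins the stdlib
   modules that the generated code depends on. *)
module Hypothetical_ppx_runtime = struct
  module List = Stdlib.List
  module String = Stdlib.String
end

(* Stand-in for a stdlib replacement whose List.map takes a labelled ~f. *)
module Replacement = struct
  module List = struct
    include Stdlib.List

    let map l ~f = Stdlib.List.map f l
  end
end

open Replacement

(* User code sees the labelled version. *)
let doubled = List.map [ 1; 2; 3 ] ~f:(fun x -> x * 2)

(* "Generated" code reopens the runtime first, so its positional
   List.map call type-checks no matter what the user opened. *)
let rendered =
  let open! Hypothetical_ppx_runtime in
  String.concat "," (List.map string_of_int [ 1; 2; 3 ])
```

The simpler alternative is for the ppx to emit fully qualified names such as Stdlib.List.map everywhere.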
