[ANN] First release of PGX (Pure-OCaml PostgreSQL client)

I’m happy to announce the first release of PGX on opam. PGX is a pure-OCaml PostgreSQL client based on PG’Ocaml.

Since the fork, we’ve made the following improvements:

We’re still looking for feedback on the API and pull requests are welcome!

29 Likes

I’ve just discovered this, and looks like a really good piece of work. I hope I’ll be able to try pgx soon.

Are you using any ppx on top of this or any user-level conveniences?

2 Likes

There’s currently not any ppx to go with it. We’ve talked about doing something similar to Diesel in Rust, but none of us really have experience with ppx and the convenience factor hasn’t been worth it for us yet.

(We’re definitely open to new contributions though)

1 Like

Could you elaborate on the rationale for forking PGOcaml instead of contributing to its development? Is there a fundamental difference in philosophy?

The big draw of PGOcaml over PGX for me is the PPX, which provides compile-time checking of SQL queries and results against the database. It doesn’t seem like this and the mentioned features of PGX are mutually exclusive though.

1 Like

Continuing the comparison with Rust libraries, PGOcaml seems more similar to SQLx, so perhaps the difference is in the vision as Diesel is more like an ORM.

1 Like

Could you elaborate on the rationale for forking PGOcaml instead of contributing to its development?

I’m not really sure since the fork happened before I started.

I do know that we were focused on making the library usable without ppx and had no interest in the ppx interface.

1 Like

Ah, that’s unfortunate. Thanks for the quick answer though!

I’m interested in the PPX for the same reason I’m interested in strong statically typed programming languages. It seems like a very natural extension to also check the syntax and types of SQL queries at compile-time, and to have the same fast feedback loop when writing them. I’m hoping you wouldn’t be opposed to adding such a PPX if I find there’s compelling reasons to move to PGX instead of incrementally improving PGOcaml.

1 Like

I’m not the maintainer of PGX anymore, so you’d have to ask the current Arena team.

Personally, I think it would be better to make a separate ppx for generating SQL, and then use it with any PostgreSQL client library though (although there may be some complexity with handling the different types used by different client libraries). I don’t think this really needs to be part of the client library itself though.

1 Like

Generating SQL would be rather limited, like most ORMs are, or a massive undertaking considering all the possible postgres extensions. The benefit of the approach taken by PG’OCaml and SQLx is that the postgres backend itself checks the syntax and types, taking into account which extensions it has installed and therefore supports the whole syntax with very little effort.

It should be possible to make an independent PPX with adapters for different client libraries, but the PG’OCaml PPX seems to have somewhat tighter integration that might make generalizing it a bit more awkward. Definitely worth looking into though!

1 Like

I’m currently using an Async thread-pool based implementation that can send queries in parallel on top of postgresql-ocaml (works quite nicely and can use nearly all cores on a 40 core postgresql server): rage/postgresql_async.ml at update2 · edwintorok/rage · GitHub
I’ve tried wrapping libpq with Async more directly (i.e. avoid using multiple threads), but it is quite difficult (libpq likes to close and reopen file descriptors internally when it reconnects to the server, which causes issues if that FD was registered with epoll in Async, since the newly opened FD won’t be registered with epoll).
A pure OCaml implementation is welcome and would avoid the difficulties in using libpq in an asynchronous way.

Although the README for pgx says “Trying to run multiple queries at the same time will work properly (although there’s no performance benefit, since we currently don’t send queries in parallel).”, so it is not quite a replacement for what I use currently. Do you think this limitation could be removed in the future?

Or can I use a similar solution as I used before (open N connections to the server, one/core), I assume queries on different connections could be sent in parallel? (which should still be an improvement over what I use currently since it wouldn’t necessarily need to use a separate OS thread to do that, just an Lwt or Async lightweight thread).

1 Like

I’m not sure if the Postgres protocol can support multiple queries on a single connection (so Pgx can’t either), but Pgx should work fine if you use multiple connections in parallel inside of a connection pool. Maybe the README should be updated to make that more clear.

If there is a way to run parallel queries on the same connection, I think Pgx upstream would accept patches to support it.

1 Like

While in theory I appreciate this, in practice the problem is that to do this kind of check you need a running Postgres server with your schema loaded, which makes builds require network and quite heavy and unusual dependencies (think of how many packages there exist that require a running SQL server to produce a binary). I would much rather have a way to cache the results of that check somehow in a way that could be committed to the repo to avoid complex dependencies.

2 Likes

SQLx has an offline mode which does exactly this.

Alternatively you could require type annotations, which you would otherwise need in some form anyway, and just run the checks in development mode to make sure you got it right.

But in practice I haven’t found that running postgres on CI and committing a dump of the database schema to be all that problematic. Definitely takes more fiddling to get up and running though, and a caveat that it’s good to be aware of before deciding to go this route.

2 Likes

I wondered if Caqti supports PGX; there is an issue open, but this has not been done yet.

(I think Caqti is the most usable “general library” for database usage in OCaml, so it’s nice when it can benefit from more specialized developments such as PGX.)

1 Like

I have greatly enjoyed using ppx_rapper, which provides compile-time checks of both query syntax and relevant in/out query parameters. It does not verify coherence with any given schema (either live or e.g. as represented in a DDL file), but given the complexities involved there, I’ve been happy enough.

2 Likes

The problem with this kind of inline type annotations, while nice and readable and easy to parse, is that it makes it hard to run the query manually. SQLx instead (ab)uses the aliasing syntax, making it valid SQL, e.g. select stuff as "stuff!: Vec<i32>" from whatever.

For its syntax checking, ppx_rapper seems to be using bindings to libpg_query, a C library extracted from the postgres server. Interesting approach, but with the downside that it requires using and for someone to quite actively maintain such a library for each database it should support, it won’t be aware of any extensions, and of course it won’t check against the schema.

1 Like

It is possible since postgres 14

2 Likes

You could do this with dune’s promotion mechanism. You’d need to write a script to fetch the schema from the running database and save into some serializable form.

1 Like

Granted that ppx_rapper’s type annotations need scrubbing if you’re aiming to copy/paste a query into a different tool, but doing so always requires some degree of editing regardless of the library one uses, at the very least to swap out parameter placeholders for concrete values…so I don’t know that there’s any large difference in difficulty there.

That’s honestly a feature for me, and a reason why I didn’t choose PGOcaml. I appreciate static analysis, but not being able to run a build without having an accessible database running in good order felt like a bridge too far.

2 Likes

If you use postegres’s own placeholders ($1, $2 etc.), you can copy it verbatim into a´preparestatement, thenexecute` it without doing anything other than supplying the parameter values and types.

I also think there’s a pretty big difference between just replacing parameters, and also having to rewrite the list of returned columns though. But my point is more that it’s an unnecessary extra burden as ppx_rapper could do away with it by just changing the syntax a little bit.

1 Like