Best way to structure an algebraic data type for ASTs?

I’d like to have the type system give stronger assurances that if a parse tree can be expressed, it is valid.

This is really tough to do in the general case. It is possible to abuse GADTs to express properties such as well-typedness and well-scopedness at the type level, but for real languages this becomes rather ugly and hard to understand.

Unless you have a burning desire for type hackery, I would stick with simple approaches as in gasche’s suggestion. Where you rely on properties that are not convenient to express in OCaml’s type system, consider simply writing a predicate and asserting it.