Tools or pointers for control flow analysis of OCaml code?

I’m working on a PPX rewriter that needs to reason about the possible return values of a function. Roughly, I’d like my PPX to generate a control flow graph from the parsetree representation of a function, then traverse the paths through the CFG to gather information about how that function’s return values are constructed.

OCaml seems to be widely used for static analysis of other languages, but I have not yet found a tool that generates a control flow graph for OCaml code itself. Has anyone tried to implement something like this, or does anyone know of a tool that does? Any ideas about where to start if I implement this from scratch? Is there perhaps an intermediate representation to which I could convert an OCaml parsetree, and for which CFG generation is already implemented somewhere?

The compiler does some kind of control flow analysis, but I think it happens much deeper than the parsetree generation phase, and I’d like to avoid relying on compiler internals that could break with a new compiler release.
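
To make the goal concrete, here is a minimal sketch of the kind of traversal I have in mind. It runs on a toy expression type rather than the real `Parsetree.expression`, and every type, constructor, and function name in it is hypothetical. Because OCaml is expression-based, a function's possible results are the expressions in tail position, so the sketch follows `if`, `match`, `let`, and sequence nodes down to their leaves and records which branches were taken along the way.

```ocaml
(* Toy stand-in for Parsetree.expression; all names here are hypothetical. *)
type expr =
  | Const of int
  | Var of string
  | Construct of string * expr list       (* e.g. Some x, Error 1 *)
  | Apply of string * expr list           (* function call *)
  | If of expr * expr * expr
  | Match of expr * (string * expr) list  (* scrutinee, (pattern text, body) *)
  | Let of string * expr * expr
  | Seq of expr * expr

(* Collect every expression in tail position, paired with the list of
   branch decisions (outermost first) that lead to it. *)
let returns (e : expr) : (string list * expr) list =
  let rec go path e =
    match e with
    | If (_, t, f) -> go ("then" :: path) t @ go ("else" :: path) f
    | Match (_, cases) ->
      List.concat
        (List.map (fun (pat, body) -> go (("case " ^ pat) :: path) body) cases)
    | Let (_, _, body) -> go path body   (* only the body is in tail position *)
    | Seq (_, last) -> go path last      (* only the last expression is returned *)
    | Const _ | Var _ | Construct _ | Apply _ -> [ (List.rev path, e) ]
  in
  go [] e

(* let r = lookup key in
   match r with
   | Some v -> if valid v then Ok v else Error 1
   | None -> Error 2 *)
let example =
  Let ("r", Apply ("lookup", [ Var "key" ]),
    Match (Var "r",
      [ ("Some v",
         If (Apply ("valid", [ Var "v" ]),
             Construct ("Ok", [ Var "v" ]),
             Construct ("Error", [ Const 1 ])));
        ("None", Construct ("Error", [ Const 2 ])) ]))
```

For `example`, `returns` yields three results: `Ok v` and `Error 1` under the `Some v` case, and `Error 2` under the `None` case. A real PPX would presumably match on the corresponding parsetree nodes (`Pexp_ifthenelse`, `Pexp_match`, `Pexp_let`, `Pexp_sequence`) instead. The sketch ignores exceptions, and it treats a call in tail position as an opaque result, which is where a proper CFG or an interprocedural analysis would be needed.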

This might be of some help to you.

Another option would be to add OCaml support to GitHub’s Semantic. Since a tree-sitter parser for the language already exists, the developers claim this will become much easier soon. Until then, you can follow the documentation on adding a new language. Beware of Haskell, though!

BTW, note that the pfff mentioned above is now being developed here, not in the archived Facebook repository!

Just a heads up: work is under way on adding OCaml to GitHub Semantic; see the corresponding PRs:

Once these are merged, it will be possible to start adding Semantic support itself.

@amirshukayev and @XVilka, belated thanks for the responses. pfff looks like it may do what I need out of the box. I’ll post back if I get results with either tool.

FYI, the active pfff repo appears to have moved again: https://github.com/returntocorp/pfff