Can OCaml "abstract above the code level"?

Sorry if this is a bit too philosophical.

Leslie Lamport says (in his TLA+ course) that programming languages don’t abstract above the code level.
Is this true? True about all programming languages, including OCaml?

Maybe “don’t abstract above the code level” needs a precise definition, but I don’t think I personally could give one.

2 Likes

Yes, it is )

But recently there was a discussion about a data structure, and one of its functions was annotated with a comment saying that its execution time is O(1).

This can probably be thought of as an example of abstraction which is “above the code level”.
Maybe that’s not what you had in mind.

I used to tell young’uns that there were only two languages that effectively abstracted above the level of code: SQL and spreadsheets. In both you tell the computer what you want, not how to compute it.

OCaml expresses what you want (except if you use ← or :=) … but unlike a spreadsheet, you have to order the declarations yourself, whereas the spreadsheet orders them for you and detects circular references.

Metafont/Metapost can be seen as being at the same level as a spreadsheet, if not higher. You don’t have to order the equality declarations. You can even type:
a=b/2
b=a+1
And it will solve this linear system of equations. (I am not sure of the syntax, but you get the idea)

I think this is a great question! I’m interested in it, and have given it some thought as a result of working on stuff adjacent to TLA+ for some time. (But let me also add the caveat that I am not an expert in this domain and don’t have formal training to draw on).

My general POV is that Lamport’s takes don’t apply nearly as well to modern FP languages as they did to, e.g., Java or C.

I agree that this is vague. What is “the code level” exactly? Once we start considering higher-order computations and meta-programming, is there a single “code level”? If “the code level” just means binary data, then all programming languages abstract over this.

To a certain degree, I think Lamport’s view here comes near to the tautology that the programming languages in which we implement programs cannot avoid being concerned with the implementation of the program. This is in strong contrast to TLA+, which is not intended to have any operational semantics and expressly not meant to be a programming language for implementing programs.

I think an initially plausible interpretation of his claim in that video is that programming languages cannot abstract over implementation details. But this is just false: modern programming languages have many forms of abstraction over implementation details. So I think he must rather mean something like this: non-executable specification languages, like TLA+, can abstract away all concerns with implementation details, because they are not held responsible for producing an implementation. The flip side of this lack of responsibility is that it is very easy to specify programs in TLA+ that are correct (according to TLC or Apalache) but impossible to implement.

In contrast, languages like OCaml (or Haskell, or SML, or Coq, or Agda, or F*, etc.) have the responsibility of producing working implementations. They provide all kinds of methods of abstracting over implementation details, but they cannot, in general, be entirely free from concerns with implementation. (Another way of describing this difference is that TLA+ is based on a non-constructive logic, whereas the logic used for reasoning in the languages mentioned is constructive, so when you prove properties of the systems using the tools those languages provide, you do so by showing how to construct an instance of the property, and that proof is the program.)

To be a bit more concrete, maybe we can consider some of the forms of abstraction that OCaml supports:

  • Functions abstract over the values (sub-programs) in their body
  • Algebraic data types abstract over the particularities of data representation
  • Type constructors abstract over families of types
  • Abstract types abstract over any particularities of type representation
  • The module system abstracts over (parametric) contexts of types and values
  • The effect system abstracts over concrete effects
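
To make a couple of these concrete, here is a minimal sketch (the module and function names are mine, purely for illustration) of the abstract-type and module-system points: the counter’s representation is hidden behind its signature, so no client code can depend on it.

module type COUNTER = sig
  type t                       (* abstract: the representation is not exposed *)
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end

module Counter : COUNTER = struct
  type t = int                 (* could be swapped for any other representation *)
  let zero = 0
  let incr n = n + 1
  let to_int n = n
end

let three = Counter.(to_int (incr (incr (incr zero))))   (* = 3 *)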

We use these (and other) forms of abstraction to create additional forms of abstraction. I’ll mention two:

Lamport follows his claim in the video by explaining how disjunction and conjunction in TLA+ are commutative, and not indicative of any execution order. It is true that OCaml has an operational semantics and yields working programs, so expressions in the language are sensitive to evaluation order. But we also use the previously listed forms of abstraction to create new abstractions over these details! One of the nicest ways of doing this is by abstracting over algebraic structures. This lets us reason about entire classes of programs, defined only up to very general algebraic properties, such as monoids (including commutative monoids, which would encompass conjunction as a special case), functors, groups, monads, etc. We can leverage rigorous mathematical reasoning to design the composition of these classes of programs and prove their properties, and then we can go and implement instances of them and plug them together to get real working programs. This is all inherited from the tradition of algebraic specification.
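
As a rough sketch of what abstracting over an algebraic structure can look like in OCaml (again, the names are mine, not from any particular library): conjunction is just one commutative-monoid instance, and code written against the structure works, and can be reasoned about, for all instances at once.

module type MONOID = sig
  type t
  val empty : t
  val combine : t -> t -> t    (* assumed associative; here also commutative *)
end

module Conj : MONOID with type t = bool = struct
  type t = bool
  let empty = true
  let combine = ( && )
end

(* Written once, for every monoid: fold a list of values together.
   For Conj, reordering the list cannot change the result. *)
module Fold (M : MONOID) = struct
  let concat xs = List.fold_left M.combine M.empty xs
end

module ConjFold = Fold (Conj)
let all_true = ConjFold.concat [ true; true; true ]   (* = true *)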

OCaml’s type system (taken all together including the core and module languages) is the nearest obvious analog in OCaml for the kind of specification that TLA+ provides. Using types, especially module types, we can specify the kinds of properties that a program should exhibit (and the checker will ensure that programs we write to inhabit those types really do preserve the stated properties). TLA+ is a much more expressive logic than what we have in OCaml types, but languages like Coq and F*, or tools like gospel can help us here.
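
For a small, concrete taste of module types as specifications, using only the standard library: Map.Make demands an argument matching Map.OrderedType, and the checker rejects any argument that doesn’t satisfy it.

module IntMap = Map.Make (struct
  type t = int
  let compare = Int.compare    (* the piece Map.OrderedType requires *)
end)

let numbers = IntMap.(empty |> add 1 "one" |> add 2 "two")
let two = IntMap.find 2 numbers   (* = "two" *)

Of course, the checker only verifies the types here, not, say, that compare really is a total order; for richer properties you need the more expressive systems or tools mentioned above.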

To put a cap on all that: IMO, Lamport’s claim from the video is “too vague to know whether or not it is correct” (however, IIRC, Lamport has a lot of interesting and much more nuanced explanations of his POV in various papers and in Specifying Systems). I think that abstraction is much more of a spectrum, and that modern (esp. functional) languages like OCaml give us a lot of tools to push the boundaries towards ever more abstract expressions of the specification and implementation of our programs.

That all said, a specification language which is not required to ensure the specified program can be constructed doesn’t need to take any account of implementation constraints. This can allow very edifying specification of systems at a very high-level of abstraction. I believe that constructive methods can generally catch up, but it takes time.

I’d be very curious to know whether my take resonates with you at all, and/or if you think I may be missing some important aspect of Lamport’s point (given that you have his course fresh in mind). I also look forward to reading what others think on this interesting topic :slight_smile:

3 Likes

Given other answers here, I suspect “code-level” may have a specific technical meaning I am overlooking, and so maybe my long ramble is off-topic? If so, I’d be grateful for pointers or an explanation that helps clarify :slight_smile:

+1

The distinction between programs and math is a bit in the eye of the beholder. But, indeed TLA+ goes “above” the code in the sense that, in languages such as TLA+, you don’t really write a program (that is, a recipe telling the computer what to do, or even how to do it) but rather a specification (that is, what is true of a family of programs).

To do so, you can in particular rely on unbounded nondeterminism but you can also assume strong properties of the system or environment (e.g. about the relative speed of different processes, about scheduling or network features); you can also constrain whole executions using temporal logic.

I think it’s fair to say that it is more abstract than programming languages abstractions in the sense that the latter hide details (e.g. my program is parameterized by a scheduler that will ultimately be provided) while the former ignore details (e.g. some scheduling is present, with certain properties, but I don’t even know if it could be implemented).

1 Like

Wouldn’t this be true about any declarative-style language?
A couple of other candidates that come to mind are Prolog, CSS, HTML and its relatives (XAML), and the Network Monitor Parsing Language :slight_smile:

1 Like
  1. Many “declarative” languages actually have an implicit algorithm, and talented coders know they’re coding to that algorithm. E.g. Prolog: I remember a friend of mine who did Prolog consulting back in the late 80s telling me that you coded very closely to the SLD-resolution algorithm.
  2. CSS/HTML are both not full programming languages (setting aside Javascript), right?
  3. I don’t know XAML.

The thing that I believe makes SQL so special is that there’s a compiler that selects the algorithm, and it can use runtime statistics and data-set statistics to do so. And it’s wildly widespread.

ETA: Re: Prolog, I should have noted that Prolog only has one execution method: SLD resolution. So it makes sense that people code directly to that algorithm.

Yes, CSS/HTML… and XAML are designed to describe user interfaces. We can see them as data structures which are mapped to some graphical toolkit functions (Tk/Gtk…).

They are similar to TyXML constructs (using only constants, hence a very narrow subset of OCaml).
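
For example, here is a rough sketch using a few of Tyxml.Html’s combinators; a fragment built only from constant applications reads very much like the markup it describes.

(* A purely static description, mapped later to real markup or widgets. *)
let fragment =
  let open Tyxml.Html in
  div
    [ h1 [ txt "Hello" ]
    ; p [ txt "Nothing here is computed; it is just declared." ]
    ]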

PGF/Tikz as well. Deserves a Nobel prize for programming imho.

@Chet_Murthy : to your list of declarative programming languages, you could add

  • reactive languages of the Lustre / Scade / Simulink kind, which are basically spreadsheets for time-indexed sequences;
  • Datalog, the “database” subset of Prolog, whose semantics is defined independently of the resolution strategy.

But I’m not sure Lamport meant “declarative programming” when he talked about “abstracting over the code level”. A declarative program describes what should be computed in preference to how to compute it, but it’s still executable, and so lives “at the code level” in some sense.

However, what about Coq, Lean, Agda, and other “Curry-Howard-based” systems? These languages can express both executable code and mathematical specifications in the same framework, and I think they give Lamport’s TLA+ a run for its money.

3 Likes

I don’t think that the possibility of implementation is a question of constructive vs non-constructive logic (after all, as soon as you have exceptions, as in OCaml, you get some non-constructive principles). You already get this as soon as you have first-class functions. For instance, the general modus ponens principle (which is constructive) is expressible in OCaml:

let modus_ponens major minor = major minor

It’s just function application. The general form of the major is 'a -> 'b, i.e. that of a hypothetical judgement: you have an antecedent 'a, a consequent 'b, and the major claims the necessary consequence of 'b after 'a. But in the major, the modality of the antecedent 'a is still problematic: nobody knows if it is possible or not. The modus ponens rule only says that if 'a is possible (you can implement it, so 'a is asserted and no longer problematic) then you can apply the major to it and get the conclusion, i.e. a 'b. In France, we have an expression to state this property of hypothetical judgements: « Avec des si, on mettrait Paris en bouteille » (“with a lot of ifs, we could put Paris in a bottle”).
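
For reference, the compiler infers the fully general type for that one-liner, and applying it just requires an actual witness of the antecedent (the values below are arbitrary examples).

(* val modus_ponens : ('a -> 'b) -> 'a -> 'b *)
let four = modus_ponens (fun n -> n + 1) 3   (* = 4 *)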

By the way, you can already study a lot of meta-properties of OCaml programs in OCaml itself; see for instance the first-class subtypes paper by @yallop and @stedolan.

For the meaning of “abstracting above the code level”, as far as I understand it, I guess @xavierleroy is right in his answer above, and that dependently typed, “Curry-Howard-based” languages do this.

1 Like

TLA+ is a mixture of set theory and linear temporal logic, and accompanying proof rules. These make TLA+ very expressive to model and prove properties of concurrent or distributed algorithms. You can certainly implement (a variant of) TLA+ in proof assistants (see Merz’ work on TLA+ in Isabelle) but, as of now, I think none of them is on par with TLA+ as far as ease of modelling and proving are concerned (compare e.g. TLA+ with Verdi and Ironfleet). And vice versa, TLA+ is not a logical framework.

1 Like

I don’t know anything about TLA+, I was just commenting on the meaning of “abstracting above the code level”.

By the way,

I’m not a great fan of set theory (if by this you mean ZFC), and I prefer modern type theories to it (it’s not so much ZFC as first-order logic and its Tarski semantics: try to formalize linear algebra in it, and you will laugh). I will conjecture that, with some axioms, you can add linear temporal logic to it. See, for instance, the part of @xavierleroy’s lecture last week at the Collège de France about Lustre and reactive languages.

My view on this largely comes from (what I understand of) e.g., Computational type theory - Scholarpedia. In my understanding, they are indeed related insofar as providing a constructive proof of some proposition P should amount to an effective procedure for producing a witness a : P, and I think this should amount to an implementation of it. Of course, I grant that OCaml is neither a total language nor a theorem prover, and I understand that not every proposition is provable. But if you have a constructive proof of some proposition, I believe I should be able to derive an implementation from that proof (in theory). I’d be interested to study a counterexample tho, if you know of something off hand. I find this area very interesting, and always find I am pleasantly surprised that there is lots more to learn!

Granting that TLC and Apalache are great for model checking TLA+ specs, it’s probably worth noting that that usage generally doesn’t amount to a proof (unless you have a quite small state space and count a successful, exhaustive model check as a proof. Tho you can do some neat things with inductive invariants with Apalache too, from what I understand). Do you have a sense for how TLAPS fares against the theorem provers? I’ve never used it, but I had the impression it was quite unwieldy.

I’d suspect this is true in the specific domain of distributed systems! But I don’t think this actually holds for some other domains, e.g. cryptography? As you also suggest, any comparison should bear in mind the striking differences between the two approaches. The fact that tools like Coq and Lean and F* generally support program extraction means that the extra work that goes into a proof of your spec also produces a proven-correct implementation thereof. I have seen first-hand that TLA+ specs can be valuable, but I have also seen how difficult it is to translate those specs into any sort of guarantees about the system you actually implement, especially as that system changes over time. I think both approaches can be valuable, but I think we need to bear in mind the differences between the apples and oranges we are comparing: apples don’t need to be peeled, and they are great and healthy, but it’s very hard to use them to make orange juice!

I have often read TLA+ enthusiasts making the point that the language is able to be much more concise and abstract than any theorem provers or programming languages because it is just a logical formalism and is not bound by any operational semantics. Lamport is very clear on the same point you make, that TLA+ is just set theory and LTL (the Actions part being basically a macro system). And it is common to draw a very firm distinction between, e.g., TLC and TLA+. A nice thing about this view is that, iiuc, it also means OCaml is able to support the pure, uninterpreted form of TLA+ (embedded in comments) :wink: . Jokes aside, TLC and Apalache are great (but I am biased here, as I have contributed to the latter) and model checking is really nice for exploring distributed algorithms. In fact, I am pretty optimistic about finding ways to combine theorem proving and model checking.

But, to come back to the main point, it’s not at all clear to me how we can look at languages like Coq, Lean, or F* and conclude that they “cannot abstract over the code level”. I’d say the same for OCaml too, albeit to a more limited extent: what is the type system, if not an integrated part of the language for abstracting over the code level? (This is a genuine, and not a rhetorical question!)

Your understanding is right, but in OCaml you can still write some code where you suppose a value of some type, without knowing if that type is empty or not (the question of the possibility of its implementation).

The simplest example is this one:

type t = |

let ex_falso_quodlibet (x : t) = match x with _ -> .

Now, try to use modus_ponens ex_falso_quodlibet? :wink:

We cannot produce a value to inhabit the bottom type, that is true! I take this to show that in the very special case where we can prove the proposition p -> q for a p which is fundamentally unprovable, it gives us a useless program. I bet we can agree that this helps demonstrate the perks of sticking to stuff that is (constructively) provable! :smiley: But the implementation of a useless program is an implementation all the same. What we cannot do, afaik, is implement x. But that is also to say that we cannot prove it.

Yes, and that’s the whole point of “avec des si, on mettrait Paris en bouteille”. Be careful when you make unrealistic hypotheses. My point was just to say that this property (the problem of emptiness can be hidden in more subtle ways) is not sufficient to distinguish OCaml and TLA+ (as far as I understand it).

1 Like

Ah, yes. I think I see now and concede the point: as you’ve just shown, in OCaml we can indeed specify at least one program that we cannot implement! I guess the missing piece of my point (if I have one) is that you can use TLC to verify a spec that cannot be implemented, but we cannot do the equivalent for languages that use constructivist methods (and so too with OCaml, to the limited and imperfect extent it can be said to do so), since to prove is to implement (or at least to construct something that should be implementable in principle).