A Brief Introduction to Normalization-By-Evaluation

I wrote a brief introduction to the normalization-by-evaluation algorithm in OCaml, which is the standard way to efficiently normalize lambda calculus terms to their beta normal forms. I hope you find this introduction helpful!
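In a nutshell, the code in the post has roughly the following shape (a simplified sketch rather than a verbatim excerpt): terms with De Bruijn indices, values built from first-order closures and neutrals, an evaluator denote, and a read-back normalize.

type term = Var of int | Lambda of term | Apply of term * term

type value =
  | VClosure of value list * term   (* captured environment and body *)
  | VNeutral of neutral             (* stuck term: a variable applied to arguments *)
and neutral = NVar of int | NApply of neutral * value

(* Evaluation: environments are lists of values, indexed by De Bruijn indices. *)
let rec denote env = function
  | Var i -> List.nth env i
  | Lambda body -> VClosure (env, body)
  | Apply (m, n) ->
      let m_val = denote env m in
      let n_val = denote env n in
      (match m_val with
       | VClosure (env', body) -> denote (n_val :: env') body
       | VNeutral neut -> VNeutral (NApply (neut, n_val)))

(* Read-back: turn a value into a beta-normal term, using De Bruijn levels
   for the fresh variables introduced under binders. *)
let rec normalize lvl = function
  | VClosure (env, body) ->
      Lambda (normalize (lvl + 1) (denote (VNeutral (NVar lvl) :: env) body))
  | VNeutral neut -> quote_neutral lvl neut

and quote_neutral lvl = function
  | NVar var -> Var (lvl - var - 1)   (* convert the level back to an index *)
  | NApply (neut, n_val) -> Apply (quote_neutral lvl neut, normalize lvl n_val)

(* The normal form of a closed term: evaluate, then read back. *)
let normal_form t = normalize 0 (denote [] t)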


Normalization-by-evaluation is a standard technique for computing β-normal forms of arbitrary lambda calculus terms.

The text is missing an explanation of why the default implementation of an evaluator (as in Sestoft’s paper) can’t do the same.

Here are two comments. First, the part about going back to the original terms is a bit obscure, due to the use of De Bruijn indices and the way they are computed. Using a different representation, e.g., locally nameless, would make the whole thing much more readable (though a bit slower).

Second, a very important message is left hidden: the whole evaluation process does not need to know about the lambda-calculus! It can be whatever you want. (Some machine code compiled with an intricate compiler and running on an exotic processor? Check!) In particular, as far as the normalization is concerned, the payload of VClosure will never be explored; it will only ever be the target of an application. In other words, you can give it the type VClosure of value -> value. Conversely, as far as the evaluation is concerned, the payload of VNeutral will never be explored; it too will only ever be the target of an application. In other words, you can give it the type VNeutral of value -> value. At this point, the code of denote is just

      match m_val with
      | VClosure f -> f n_val
      | VNeutral f -> f n_val

As the two branches are now indistinguishable, the difference between VClosure and VNeutral is no longer needed. You can simply use type value = value -> value and perform the application m_val n_val directly in denote. (As this definition of value is recursive, it is a bit tedious to use in OCaml; at this point, you might just as well opt for an untyped language.) Notice how denote has now become completely unaware of the existence of neutral values. It is just a pure, off-the-shelf evaluator for a programming language with closures.
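To make the tedium concrete: the cyclic abbreviation type value = value -> value is rejected by OCaml unless the -rectypes flag is enabled. A common workaround, sketched here only for illustration (it is not code from the post), is to wrap the function in a single constructor, at the cost of some wrapping and unwrapping noise:

(* One constructor is enough to break the cycle without -rectypes. *)
type value = V of (value -> value)

(* The application case of denote then degenerates to plain application. *)
let apply (V f) n_val = f n_val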

I think this is the most striking feature of normalization-by-evaluation. Contrary to the way you are presenting it, the evaluation process does not need to know that it is part of a normalization procedure for the lambda-calculus. For example, you can make a trivial translation of a lambda-term to a JavaScript term (resp. Lisp, resp. whatever), feed it to your favorite JavaScript compiler or interpreter (resp. Lisp, resp. whatever), and still be able to recover a normal form from it.

(I am not sure where this idea of agnostic evaluation originated, but I became aware of it after reading Maxime Denes’ PhD thesis.)


Thanks for the feedback. With regard to the higher-order representation of terms, I believe your presentation is just a refunctionalized version of my first-order NbE style. Since first-order and higher-order settings are essentially interchangeable in languages such as OCaml and Haskell, I don’t believe there is much of a difference worth noting specially. Also, since in a real-world setting value will be a sum type and not just a function value -> value, we now hit the problem of “exotic terms” that pattern-match on the input value when they shouldn’t. This problem is not present in the first-order setting, and solving it formally in the higher-order setting requires PHOAS (Parametric Higher-Order Abstract Syntax).
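To illustrate what goes wrong: with plain higher-order abstract syntax over a concrete value type, nothing prevents a “term” from inspecting its argument, whereas PHOAS keeps the variable representation abstract. Here is a rough sketch (the type names are hypothetical, not taken from my post):

(* An "exotic term": its body pattern-matches on the value it receives and
   behaves differently on neutrals, so it is not the image of any lambda-term. *)
type hvalue = HNeutral of int | HFun of (hvalue -> hvalue)
let exotic = HFun (fun v -> match v with HNeutral _ -> HNeutral 0 | HFun _ -> v)

(* PHOAS: the variable type 'v stays abstract, so a body can only pass its
   variables along, never inspect them. Closed terms are polymorphic in 'v. *)
type 'v pterm =
  | PVar of 'v
  | PLam of ('v -> 'v pterm)
  | PApp of 'v pterm * 'v pterm

type closed = { t : 'v. 'v pterm }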

You missed my point. This is not about defunctionalization or higher order. This is about the fact that, in “normalization-by-evaluation”, evaluation is nothing fancy; it is just some code running somewhere. As long as it runs, you can normalize it (assuming it can be represented by a normal form).

Let me rewrite your example in this setting. Notice how the three combinators S, K, I are no longer algebraic values. There is no need for a denote function. The combinators are now plain OCaml functions written in the most natural way. And yet, we can normalize such functions:

# let ic x = x
  let kc x y = x
  let sc x y z = (x z) (y z);;
val ic : 'a -> 'a = <fun>
val kc : 'a -> 'b -> 'a = <fun>
val sc : ('a -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'c = <fun>

# normalize 0 (sc kc kc);;
- : term = Lambda (Var 0)

Here is the definition of normalize. It is basically the same as yours, but shorter and simpler because evaluation is now plain OCaml evaluation rather than a call to denote:

(* This cyclic type abbreviation requires the -rectypes compiler flag. *)
type value = value -> value
(* A neutral: a variable applied to a stack of arguments. *)
type neutral = NVar of int | NApply of neutral * value

(* The "();" prevents OCaml from compiling accumulate as a two-argument
   function: accumulate n then returns a fresh closure capturing n, whose
   representation get_neutral recognizes below. *)
let rec accumulate n = (); fun v -> accumulate (NApply (n, v))
let vvar n = accumulate (NVar n)

type term = Lambda of term | Apply of term * term | Var of int

(* Read a value back into a term, using De Bruijn levels for fresh variables. *)
let rec normalize lvl v =
  match get_neutral v with
  (* Not a neutral: v is an ordinary function, so apply it to a fresh variable. *)
  | None -> Lambda (normalize (lvl + 1) (v (vvar lvl)))
  | Some neutral -> quote_neutral lvl neutral

and quote_neutral lvl = function
  (* Turn the De Bruijn level back into a De Bruijn index. *)
  | NVar var -> Var (lvl - var - 1)
  | NApply (neut, n_val) ->
      let m_nf = quote_neutral lvl neut in
      let n_nf = normalize lvl n_val in
      Apply (m_nf, n_nf)

Finally, the ugly part, i.e., the definition of get_neutral. That is the part where I would have preferred to use a programming language where introspection looks a bit less like Obj.black_magic.

let get_neutral : value -> neutral option =
  (* A reference closure built by accumulate, used to recognize the others. *)
  let acc = Obj.repr (vvar 0) in
  let () = assert (Obj.size acc = 4 && Obj.size (Obj.field acc 3) = 1) in
  fun n ->
  let n = Obj.repr n in
  assert (Obj.tag n = Obj.closure_tag);
  (* Same size and same code pointer as the reference closure: n was built by
     accumulate, and the neutral it captured lives in field 3. *)
  if Obj.size n = 4 && Obj.field n 0 == Obj.field acc 0 then
    Some (Obj.magic (Obj.field n 3))
  else None

@silene, yes, I do not understand your point. The set of OCaml definitions you have demonstrated is called a shallow embedding. In contrast, the approach demonstrated in my blog post is called a deep embedding. As Oleg Kiselyov demonstrated in this paper, the two approaches are formally interchangeable: each shallow combinator corresponds to a constructor variant, and a deep term can be interpreted by recursively invoking the shallow functions, as sketched below. This approach doesn’t involve defunctionalization or similar techniques.
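To make this correspondence concrete, here is a small sketch (with hypothetical names; this is not Kiselyov’s code): a shallow interpretation packaged as a record of functions, a deep embedding with one constructor per operation, and an interpreter that folds a deep term by recursively invoking the shallow functions:

(* A shallow interpretation: one function per syntactic construct. *)
type 'r sym = { lam : ('r -> 'r) -> 'r; app : 'r -> 'r -> 'r }

(* The corresponding deep embedding, with De Bruijn indices for variables. *)
type deep = DVar of int | DLam of deep | DApp of deep * deep

(* Interpreting a deep term in any shallow interpretation, carrying an
   environment of already-interpreted variables. *)
let rec denote_deep (s : 'r sym) env = function
  | DVar i -> List.nth env i
  | DLam body -> s.lam (fun x -> denote_deep s (x :: env) body)
  | DApp (m, n) -> s.app (denote_deep s env m) (denote_deep s env n)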

Nevertheless, if you want to compute the normal form of a shallowly embedded term, there is a much cleaner approach that doesn’t require the Obj black magic:

The link includes a complete implementation of normalization-by-evaluation using the tagless-final style, without compromising type safety or requiring language-level introspection. However, as you might already guess, the implementation is far from trivial, and it would be very poor pedagogical practice to expose both your technique and mine to someone who is just trying to understand the idea of computing beta normal forms.

No, it is not. For it to be a shallow embedding, it needs to be an embedding in the first place. That is indeed the case in the paper you quote:

type lambda = {e: 'a. (module Lambda with type repr = 'a) -> 'a}
let lam e = {e = fun (type a) ((module L : Lambda with type repr = a) as l) -> L.lam (e.e l)}
let eval : lambda -> NativeEval.v = fun {e} -> e (module NativeEval) []
let ci = lam vz

Notice how the definition of ci above is forced to use a definition lam that was provided by the author of the normalizer. And similarly, to evaluate ci, one would be forced to use the function eval. If you try to type ci ci, you will get a type error, since you cannot apply a record (which is what shallowly embeds lambda-terms here) to anything.

Now compare those to my definition of ci:

let ci x = x

There is no embedding here; ci is just a plain OCaml function. At no point did I need to depend on any prior definition; I just took an OCaml function off the shelf. And if I want to apply something to ci, I just pass it as an argument; no need for the Lambda.(/) operator to make an application. And no need for an eval function either. There is no embedding in the first place!
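For instance, the self-application that the embedding rejects is an ordinary application here, and its result can itself be normalized. (This is what I would expect from the definitions above, not a recorded session.)

(* ci ci evaluates to the identity closure, which normalize should read
   back as Lambda (Var 0). *)
let _ = normalize 0 (ci ci)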

So, let me point to the elephant in the room again. You do not need an embedding (be it shallow or deep) to use normalization by evaluation. You can normalize arbitrary functions (as long as they have a normal form).

Thank you for the explanation; this clarified the situation. My use case for normalization-by-evaluation was mainly the implementation of dependent type theories (where one already has some sort of deep embedding after the parsing phase), although I’m sure there are practical scenarios where normalization without an embedding can be beneficial.