Type checking is a variant of program analysis that automatically proves that your program behaves correctly. As any other static analysis tools it is worthwhile as it helps you to rule out many programming errors.
Type inference and type checking look very close in that sense, as they fit the same niche. In fact, they are quite different and even live in two different complexity classes, as type checking only verifies that the answer to an equation is correct, while type inference is (a) searching for the solution for the system of equations and (b), in case of the ML type inference, provides the most general answer.
Let’s do some concrete examples, what classical type checking is doing is comparing two expressions for syntactic equality, e.g., a = a
, or a -> b = a -> b
. From the logic perspective, it checks tautologies. What ML style type inference is doing, is treating the comparison as an equation and is trying to find a solution for it, e.g., a = a
, a tautology, is easy solvable. But the ML typechecker can do much more, for example a -> b = x -> y -> z
has a solution a = x, b = y -> z
. And we have only scratched the surface of the OCaml type inferenence, that is a unification modulo theory engine. The modulo theory part here, is especially interesting, as OCaml is capable of going beyond the syntactic equality, with subtyping theory the syntactic equality is generalized to the equivalence relation over a partial ordering on the lattice-like structure of subtypes. In other words, the type checker may find the solution, that is the least upper bound, for example, the following equation
a -> (<quacks : unit> as b) = x -> (<walks : unit> as b)
has the solution
a = x
b = <quacks : unit; walks : unit>
Let’s do a real-world but simplified example of this. Suppose we have an expression language, and like in C, some expressions can occur on the left-hand side (aka lvalues) and some are not. Like we can’t assign to a+b
but we can assign to a
or &(a+b)
. Since we can build arbitrary expressions, we don’t want to manually ascribe each possible tree with whether it is a lvalue or rvalue. However, we can ascribe only our base constructors, and allow the OCaml typechecker to compute for us to which class an arbitrary expression belong. So, let’s define a simplified expression language, that has two constants and one binary operation:
type _ exp =
| Int :[> `rhs] exp
| Var :[> `lhs] exp
| Cat : ('a exp * 'a exp) -> 'a exp;;
Here, the type of the Cat
constructor should not be misread. By using the same type variable for all three parts, we’re not postulating that all three should have the same type, we’re saying that all three should be unfied, and let the typechecker find the least common denominator. For two equal type, this is simple
# Cat (Var,Var);;
- : [> `lhs ] exp = Cat (Var, Var)
but the typechecker can also handle different types,
# Cat (Int,Var);;
- : [> `lhs | `rhs ] exp = Cat (Int, Var)
So, given this, we can define our type safe assignment operator,
# let (:=) : [`lhs] exp -> [> `lhs | `rhs] exp -> string = fun x y -> "ok";;
val ( := ) : [ `lhs ] exp -> [> `lhs | `rhs ] exp -> string = <fun>
that will statically disallow non well-formed programs:
# Cat(Var,Int) := Int;;
Characters 8-11:
Cat(Var,Int) := Int;;
^^^
Error: This expression has type [> `rhs ] exp
but an expression was expected of type [ `lhs ] exp
The second variant type does not allow tag(s) `rhs
# Cat(Var,Var) := Cat(Var,Int);;
- : string = "ok"
So again, we only scratched the surface. The general idea is the Curry-Howard Isomorphism between logic and programming, where our types are theorems, our values are proofs, and the typechecker is an automatic theorem prover. So, when you program in OCaml, you write the mli file, that states your theory, you write the implementation, that is a particular model of you theory, and typechecker verifies that your implementation actually models your theory. And OCaml typesystem is expressive enough to postulate quite complex theories.
So why do we need this? Typechecker is an easy to use and fast static analysis tool that finds bugs for us.