There is more to Hindley-Milner type inference than the Algorithm W.
In 1988, Didier Rémy was looking to speed up the type inference in
Caml and discovered an elegant method of type generalization.
Not only it is fast, avoiding scanning the type environment. It
smoothly extends to catching of locally-declared types about to escape,
to type-checking of universals and existentials, and even to MLF.

Alas, both the algorithm and its implementation in the OCaml type checker are little known and little documented. This page is to explain and popularize Rémy's algorithm, and to decipher a part of the OCaml type checker. The page also aims to preserve the history of Rémy's algorithm.

The attraction of the algorithm is its insight into type generalization as dependency tracking -- the same sort of tracking used in automated memory management such as regions and generational garbage collection. Generalization can be viewed as finding dominators in the type-annotated abstract syntax tree with edges for shared types. Fluet and Morrisett's type system for regions use the generalization of a type variable as a criterion of region containment. Uncannily, Rémy's algorithm views the region containment as a test if a type variable is generalizable.

- Introduction
- Generalization
- Unsound generalization as memory mismanagement
- Efficient generalization with levels
- Even more efficient level-based generalization
- Type Regions
- Discovery of levels
- Inside the OCaml type checker
- Generalization with levels in OCaml
- Type Regions
- Creating fresh type variables
- True complexity of generalization

- This page started as notes taken to understand the OCaml type checking
code, which is extensive, complex and hardly documented. Digging
through the code unearthed real gems. One of them -- an efficient and
elegant method of type generalization -- is spotlight here.
OCaml generalization is based on tracking of so-called

*levels*of a type. The very same levels also ensure that types defined within a module do not escape into a wider scope. Levels hence enforce the region discipline for locally introduced type constructors. It is intriguing how generalization and regions are handled so uniformly. There are even more applications of levels in the OCaml type checker, for records with polymorphic fields and existentials. MetaOCaml indirectly relied on levels to track the scope of future-stage bindings. There is a common refrain in all these applications: tracking dependencies, computing region containment or dominators in data-dependency graphs. One is immediately reminded of the region-based memory management by Tofte and Talpin. As Fluet and Morrisett showed, Tofte and Talpin type system for regions can be encoded in System F, relying on universal quantification to statically prevent allocated data from escaping their region. Dually, the level-based generalization relies on detecting escapes of a type variable to determine its region and hence the place for its universal quantification.OCaml's generalization is a (partial) implementation of the algorithm discovered by Didier Rémy back in 1988. The idea is to explicitly represent the sharing of types in the type-annotated abstract syntax tree. A type variable can only be quantified at a node that dominates all occurrences of that variable. Generalization amounts to the incremental computation of graph dominators. Rémy's MLF is the natural outgrowth of this idea.

Unfortunately, Rémy's generalization algorithm and the underlying ideas are little known. The implementations, such as the one in OCaml, do not seem to be documented at all, aside from a couple of brief puzzling comments in the OCaml source code. They ought to be widely known. Towards this goal, the present page sets to (i) motivate and explain the algorithm, expose its intuitions and sketch implementations; (ii) help decipher the OCaml type checker.

The second part of this page aims to be a commentary on a portion of the OCaml type-checker, and is, therefore, quite technical. It refers to OCaml 4.00.1 type checking code, located in the directory

`typing/`

of the OCaml distribution. The file`typecore.ml`

is the core type checker: it annotates nodes of the abstract syntax tree with types and the typing environment. To be precise, it transforms`Parsetree`

(defined in`parsing/parsetree.mli`

) into`Typedtree`

. The file`ctype.ml`

implements unification and level manipulation functions.I am indebted to Didier Rémy for his comments, explanations, insights and recollections of the discovery of the algorithm. I thank Jacques Garrigue for helpful comments and explanations of more applications of levels within the OCaml type checker. Additional references provided by Matthew Fluet and Baris Aktemur are gratefully acknowledged.

**Version**- The current version is February 2013
**References**- Didier Rémy:
Extension of ML Type System with a Sorted Equational Theory on Types

Research Report 1766, Institut National de Recherche en Informatique et Automatique, Rocquencourt, BP 105, 78 153 Le Chesnay Cedex, France, 1992

<http://gallium.inria.fr/~remy/ftp/eq-theory-on-types.pdf>Matthew Fluet and J. Gregory Morrisett: Monadic Regions

J. Functional Programming, 2006, v16, N4-5, pp. 485-545

The paper shows that parametric polymorphism is all that needed for a sound type system of memory regions.

- This background section reminds the type generalization in the
Hindley-Milner type system, stressing subtle points and inefficiencies of
the naive implementation. These inefficiencies motivated Rémy's discovery
of the level-based generalization algorithm.
Recall that

*generalization*`GEN(G,t)`

of the type`t`

with respect to the type environment`G`

is quantifying free type variables of`t`

that do not occur as free in`G`

. In Greek:`GEN(G,t) = ∀ α1 ... αn. t`

where`{α1 ... αn} = FV(t) - FV(G)`

. In the Hindley-Milner terminology, this quantification converts a type to a so-called type schema. Generalization is used in type checking`let`

-expressions:G |- e : t G, (x:GEN(G,t)) |- e2 : t2 ---------------------------------------- G |- let x = e in e2 : t2

That is, the type inferred for the let-bound variable is generalized when type checking the body of the let-expression. ML adds a condition for generalization, so-called value restriction: the let-bound expression`e`

, by the look of it, must have no visible side-effects -- technically,`e`

must pass the syntactic test of being*nonexpansive*. OCaml relaxes the value restriction, see later on this page.Here is a trivial example of generalization:

fun x -> let y = fun z -> z in y (* 'a -> ('b -> 'b) *)

The type checker infers for`fun z -> z`

the type`β->β`

with the fresh, and hence unique, type variable`β`

. The expression`fun z -> z`

is syntactically a value, the generalization proceeds, and`y`

gets the type`∀β.β->β`

. Because of the polymorphic type,`y`

may occur in differently typed contexts -- may be applied to arguments of different types, -- as infun x -> let y = fun z -> z in (y 1, y true) (* 'a -> int * bool *)

Generalization

`Gen(G,t)`

quantifies over only those free type variables of`t`

that do not occur in`G`

. This condition is subtle but crucial: without it, the unsound type`α->β`

is inferred for the functionfun x -> let y = x in y

To wit: to infer the function's type, we infer the type of its body`let y = x in y`

in the environment in which`x:α`

where`α`

is a fresh type variable. According to the`let`

-rule above the type inferred for`y`

, and hence the result type is`Gen(x:α,α)`

. Clearly`α`

does occur in the environment`x:α`

. If we quantify over it nevertheless,`y`

receives the polymorphic type`∀α.α`

, which can then be instantiated to any type. The result is the function that ostensibly converts its argument to the value of any type whatsoever.Thus, for each type variable to quantify we must make sure that it does not occur in the type environment. Naively, we could scan the type environment looking through the type of each binding -- in fact, the original Caml did exactly that. The type environment however can get very large. Typically ML functions contain long sequences of

`let`

-expressions. A non-recursive`let`

has in its type environment the bindings of all previous`let`

s; the environment of a recursive`let`

has the bindings of all`let`

siblings. Scanning the environment as part of the generalization for a single`let`

takes time linear in the function size; type checking of the whole program will be quadratic then. (Except for pathological cases, Hindley-Milner type inference scales nearly linearly with the program size.) The inefficient generalization was one of the main reasons for the slow speed of Caml compilation, Didier Rémy recalls. Bootstrapping the compiler and type checking two mutually recursive functions for compiling patterns and expressions took 20 minutes.There has to be a way to avoid scanning the environment. The next section gives the idea.

- This section begins to introduce the ideas behind Rémy's algorithm,
relating them to region-based memory management. For concreteness we
will be using a toy Hindley-Milner type inferencer. In this section,
the inferencer has the
*unsound*generalization function that quantifies free type variables in a type with no regard for the environment. We type check in detail three simple examples, and relate the inferring of unsound types with the common problems of manual memory management: releasing memory still in use. The unsound generalization will be fixed in the next section, drawing the inspiration from the standard methods of preventing premature deallocation of resources.Although our Hindley-Milner type inferencer is toy, it shares many implementation decisions (and even some function names) with the real OCaml type checker. Understanding it will help when we turn to OCaml internals later on this page.

Our toy language is the standard pure lambda-calculus with

`let`

. Its expressions are:type exp = | Var of varname (* variable *) | App of exp * exp (* application: e1 e2 *) | Lam of varname * exp (* abstraction: fun x -> e *) | Let of varname * exp * exp (* let x = e in e2 *)

Types are comprised of (free or bound) type variables, quantified type variables and function types:type qname = string type typ = | TVar of tv ref (* type (schematic) variable *) | QVar of qname (* quantified type variable *) | TArrow of typ * typ and tv = Unbound of string | Link of typ

Types with`QVar`

are type schemas; without -- simple types. Type schemas, i.e. quantified types, in the Hindley-Milner system are in the prenex form (that is, universal quantifiers are all outside), and so the quantifiers need not be represented explicitly.In the Prolog tradition, type variables are represented as reference cells. An unbound variable contains the null or the self pointer -- or, in our case, the name of the variable for easy printing. When a free type variable is unified with some type

`t'`

, the reference cell is overwritten with the pointer to`t'`

. To prevent cyclical (and, for us, unsound) types, the `occurs check' is performed first:`occurs tv t'`

traverses`t'`

raising an exception if it comes across the type variable`tv`

:let rec unify : typ -> typ -> unit = fun t1 t2 -> if t1 == t2 then () (* t1 and t2 are physically the same *) else match (t1,t2) with | (TVar {contents = Link t1},t2) | (t1,TVar {contents = Link t2}) -> unify t1 t2 | (TVar ({contents = Unbound _} as tv),t') | (t',TVar ({contents = Unbound _} as tv)) -> occurs tv t'; tv := Link t' | (TArrow (tyl1,tyl2), TArrow (tyr1,tyr2)) -> unify tyl1 tyr1; unify tyl2 tyr2 (* everything else is error *)

The type checker is completely standard. It infers the type for the expression

`exp`

in the type environment`env`

:type env = (varname * typ) list let rec typeof : env -> exp -> typ = fun env -> function | Var x -> inst (List.assoc x env) | Lam (x,e) -> let ty_x = newvar () in let ty_e = typeof ((x,ty_x)::env) e in TArrow(ty_x,ty_e) | App (e1,e2) -> let ty_fun = typeof env e1 in let ty_arg = typeof env e2 in let ty_res = newvar () in unify ty_fun (TArrow (ty_arg,ty_res)); ty_res | Let (x,e,e2) -> let ty_e = typeof env e in typeof ((x,gen ty_e)::env) e2

The function`newvar`

allocates a new`TVar`

, with a unique name. The function`inst`

instantiates a type schema, that is, replaces each`QVar`

with a fresh`TVar`

. It is also standard. The generalization function is unsound: it quantifies all free variables in the type regardless of the environment:let rec gen : typ -> typ = function (* unsound! *) | TVar {contents = Unbound name} -> QVar name | TVar {contents = Link ty} -> gen ty | TArrow (ty1,ty2) -> TArrow (gen ty1, gen ty2) | ty -> ty

The quantification replaces a`TVar`

with the corresponding`QVar`

. The original`TVar`

is hence implicitly deallocated: When a free variable is bound, it `disappears', being replaced by the `pointer' to the binder.With respect to type variables,

`typeof`

allocates free variables, unifies them, and deallocates, after quantification. Let us type check simple examples observing the sequence of these three main operations that affect free type variables. The first example is the one where nothing should go wrong:fun x -> let y = fun z -> z in y

The trace of type-checking, showing only type-variable related operations, is as follows:1 ty_x = newvar () (* fun x -> ... *) 2 ty_e = (* let y = fun z -> z in y *) 3 ty_z = newvar (); (* fun z -> ... *) 3 TArrow(ty_z,ty_z) (* inferred for: fun z -> z *) 2 ty_y = gen ty_e (* ty_z remains free, and so *) 2 deallocate ty_z (* quantified and disposed of *) 1 TArrow(ty_x, inst ty_y) (* inferred for: fun x -> ... *)

The number in the first column is the depth for the recursive invocations of`typeof`

. Since`typeof`

recurs on each non-leaf node of the abstract syntax tree (AST), this recursive invocation depth is the depth in the AST of the node being type checked. The inferred type is`'a -> 'b -> 'b`

, as expected. Nothing went wrong.The second example, also seen earlier, is the one for which the unsound generalization gives the unsound type

`'a->'b`

:fun x -> let y = x in y

Diagramming the`TVar`

operations as before reveals the problem:1 ty_x = newvar () (* fun x -> ... *) 2 ty_e = (* let y = x in y *) 3 inst ty_x (* inferred for x, same as ty_x *) 2 ty_y = gen ty_e (* ty_x remains free, and is *) 2 deallocate ty_x (* quantified, and disposed of *) 1 TArrow(ty_x, inst ty_y) (* inferred for: fun x -> ... *)

The type variable`ty_x`

is part of the return type, used at depth 1 -- and yet it is quantified and disposed of at depth 2. We disposed of the value still in use.The third example is also problematic. The unsound generalization again gives the unsound type

`('a->'b) -> ('c ->'d)`

:fun x -> let y = fun z -> x z in y

The diagram shows a memory management problem again:1 ty_x = newvar () (* fun x -> ... *) 2 ty_e = (* let y = ... *) 3 ty_z = newvar () (* fun z -> ... *) 4 ty_res = newvar () (* typechecking: x z *) 4 ty_x := (* as the result of unify *) 4 TArrow (ty_z,ty_res) 4 ty_res (* inferred for: x z *) 3 TArrow(ty_z,ty_res) (* inferred for: fun z -> x z *) 2 ty_y = gen ty_e (* ty_z, ty_res remain free *) 2 deallocate ty_z (* quantified and disposed of *) 2 deallocate ty_res (* quantified and disposed of *) 1 TArrow(ty_x, inst ty_y) (* inferred for: fun x -> ... *)

The type variables`ty_z`

and`ty_res`

are quantified over and hence disposed of at depth 2, and yet they are part of`TArrow (ty_z,ty_res)`

that is assigned to`ty_x`

, which, in turn, is part of the result.All unsound examples had a `memory management problem', deallocating memory (

`TVar`

) still being used. This is no accident. When a type variable is quantified over, later on it can be instantiated with any type whatsoever. However, a type variable that appears in the type environment cannot be replaced with any type without affecting the rest of the type checking. Likewise, when we free a piece of memory, we give the run-time the permission to reallocate it and overwrite with arbitrary data. The rest of our program should not depend on what happens later with the deallocated memory -- provided it was really free, not needed further in the program. In fact, one may define `memory not in use' as arbitrary changes to that memory not affecting the rest of the program. Deallocating memory still in use will affect the rest of the program -- often, crash it. Incidentally, unsound types inferred for our examples often lead to the same result. **References**- unsound.ml [11K]

Complete code for the toy type inferencer with the unsound generalization, with many more examples of unsound inference

- This section continues the exposition of the ideas behind Rémy's
algorithm. Now that we have seen how the unsound generalization
relates to releasing memory still in use, we apply the standard
remedy for premature deallocation -- ownership tracking, or
regions -- and cure the unsound generalization without much
overhead. We develop two algorithms. The simpler one,
`sound_eager`

, is motivated and explained in this section. The optimal`sound_lazy`

, which captures the main features of the Rémy algorithm, is presented next.Clearly, before deallocating memory we must check if it is still in use. Naively, we could scan all memory known to be in use looking for references to the deallocation candidate -- in other words, do the full garbage-collection marking pass and see if our candidate got marked. Put this way, the check seems awfully expensive. At least we should wait until garbage accumulates, to collect en masse. Alas, in the Hindley-Milner type system we cannot delay quantification arbitrarily, since the generalized type may be used right away.

More promising is ownership tracking: associating an allocated resource with an owner, an object or a function activation. Only the owner may deallocate its resources. A similar strategy is regions, which are areas of heap memory created by a lexically-scoped so-called

`letregion`

primitive. When`letregion`

goes out of scope, its whole whole region is summarily deallocated. This idea matches the generalization well. In the Hindley-Milner system, generalization is always a part of`let`

. A`let`

-expression`let x = e in e2`

is the natural owner of all type variables allocated when inferring the type of`e`

. When the type of`e`

is found, all free type variables still owned by the`let`

-expression can be disposed of, that is, quantified.These intuitions underlie the sound and efficient generalization algorithms. The first is

`sound_eager`

, described in the rest of the section. Its code differs only in small, but significant, details from the toy Hindley-Milner inferencer from the previous section. We will explain only these differences; the complete code is available below. The main difference is that free type variables, albeit unbound, are now owned, and refer to their owner. The owner, always a`let`

expression, is identified by a positive integer called*level*. It is the De Bruijn level, or the nesting depth, of the owing`let`

-expression. Level 1 corresponds to the (implicit) top-level`let`

. (Incidentally, although both`let`

s in`(let x = e1 in eb1, let y = e2 in eb2)`

have level 2, no confusion can arise as neither`let`

is in each other scope and hence their regions are disjoint.) The`let`

-nesting depth is equal to the`let`

-expression's type checking recursion depth, which is is simple to determine, with the help of one reference cell.type level = int let current_level = ref 1 let enter_level () = incr current_level let leave_level () = decr current_level

The type inferencer maintains the`let`

type-checking depth:let rec typeof : env -> exp -> typ = fun env -> function ... (* the other cases are the same as before *) | Let (x,e,e2) -> enter_level (); let ty_e = typeof env e in leave_level (); typeof ((x,gen ty_e)::env) e2

The only change to the main type-inference function was adding`enter_level`

and`leave_level`

to track the level. The rest of`typeof`

is literally the same as in the original toy version.Free type variables now carry the level identifying their owner. A freshly allocated type variable receives the

`current_level`

, meaning that its owner is the latest`let`

being type-checked. (In region-based memory management, all new memory is allocated in the innermost alive region.)type typ = | TVar of tv ref (* type (schematic) variable *) | QVar of qname (* quantified type variable *) | TArrow of typ * typ and tv = Unbound of string * level | Link of typ let newvar : unit -> typ = fun () -> TVar (ref (Unbound (gensym (),!current_level)))

Just as an assignment may change the owner of an allocated piece of memory, unification may change the level of a free type variable. For example, if

`ty_x`

(level 1) and`ty_y`

(level 2) are both free and`ty_x`

is unified with the type`TArrow(ty_y,ty_y)`

, the arrow type and its components are exported into region 1, and so the level of`ty_y`

is changed to 1. One may view the above unification as replacing all occurrences of`ty_x`

with`TArrow(ty_y,ty_y)`

. Since`t_x`

has a smaller level and may hence occur outside the inner, level-2`let`

, after the bound-expression of that inner`let`

is type-checked`ty_y`

should not be deallocated. With the updated`ty_y`

level, it won't be. All in all, unifying a free type variable`ty_x`

with`t`

has to update the level of each free type variable`ty_y`

in`t`

to the smallest of`ty_y`

and`ty_x`

levels. Unifying a free type variable with`t`

also has to do the occurs check, which too traverses the type. The two traversals can be merged. The new`occurs`

does the occurs check and updates the levels:let rec occurs : tv ref -> typ -> unit = fun tvr -> function | TVar tvr' when tvr == tvr' -> failwith "occurs check" | TVar ({contents = Unbound (name,l')} as tv) -> let min_level = (match !tvr with Unbound (_,l) -> min l l' | _ -> l') in tv := Unbound (name,min_level) | TVar {contents = Link ty} -> occurs tvr ty | TArrow (t1,t2) -> occurs tvr t1; occurs tvr t2 | _ -> ()

The only difference from the original

`occurs`

code is the second clause in the pattern-match. The unification code does not have to be modified at all. Finally, we fix the generalization function, to make it sound:let rec gen : typ -> typ = function | TVar {contents = Unbound (name,l)} when l > !current_level -> QVar name | TVar {contents = Link ty} -> gen ty | TArrow (ty1,ty2) -> TArrow (gen ty1, gen ty2) | ty -> ty

The change is minimal: the condition

`when l > !current_level`

. Recall the new`typeof`

code:let rec typeof : env -> exp -> typ = fun env -> function ... (* the other cases are the same as before *) | Let (x,e,e2) -> enter_level (); let ty_e = typeof env e in leave_level (); typeof ((x,gen ty_e)::env) e2

It invokes`gen`

after the region established for type checking`e`

exits. A free type variable still owned by that region will have the level greater than the current. Since the region is now dead, any such type variable may be deallocated, that is, quantified.These are all the changes of

`sound_eager`

from the unsound toy algorithm, which fix the type inference. Here is the old problematic examplefun x -> let y = x in y

Diagramming the`TVar`

operations shows no problems now:1 1 ty_x/1 = newvar () (* fun x -> ... *) 2 2 ty_e = (* let y = x in y *) 3 2 inst ty_x/1 (* inferred for x, same as ty_x *) 2 1 ty_y = gen ty_e (* ty_x/1 remains free, but is *) (* level = current, can't *) (* quantify, can't dispose *) 1 1 TArrow(ty_x/1, inst ty_y) (* inferred for: fun x -> ... *)

The first column of numbers shows the`typeof`

recursion depth, or the depth of the AST node being type-checked. The number in the second column is the`current_level`

, the`let`

-nesting depth. We write the level of a free type variable after the slash, as in`ty_x/1`

. That variable is no longer quantified by`gen`

at depth 2 (level 1) since`ty_x/1`

belongs to to the current, still active region 1. Therefore, the inferred type is`'a->'a`

, as expected.In a slightly more complex example,

fun x -> let y = fun z -> x in y

the type variable`ty_x`

for the type of`x`

is allocated at level 1, whereas`ty_z`

is allocated at level 2. After the inner`let`

, region 2, is finished,`ty_z/2`

will be quantified and disposed of, but`ty_x/1`

will not. The inferred type therefore is`'a->'b->'a`

. The reader is encouraged to diagram other examples, to check that the inferred types are sound.Level tracking may look like reference counting. However, rather than counting the number of users for a free type variable, we keep track of only one user, the one with the widest scope. Level tracking does look a lot like generational garbage collection: Memory is allocated in the young generation, and summarily disposed of at minor (youngest) collection, unless it is `re-parented' or referenced from the stack. The old generation does not have to be scanned for references to the new generation, since no such references are expected -- unless there was an assignment of a (pointer to a) young value to a field of an old data structure. A generational garbage collector (such OCaml GC) keeps track of young-to-old assignments. At minor collection, young data referred from the old are promoted to the old generation. Type generalization indeed looks very similar to the minor GC collection.

**References**- sound_eager.ml [13K]

The complete code for the toy type inferencer with the`sound_eager`

generalization, with many more examples of now sound inference

- This section continues the exposition of the ideas behind Rémy's
algorithm and presents
`sound_lazy`

: an optimized version of`sound_eager`

from the previous section. The`sound_lazy`

algorithm eschews repeated, unnecessary traversals of a type during unification, generalization and instantiation, and avoids copying the parts that do not contain variables to generalize or instantiate, thus improving sharing. The algorithm delays the occurs check and the level updates, so that the unification with a free type variable takes constant time. Levels are updated incrementally and on demand. All in all,`sound_lazy`

embodies the main ideas of Rémy's algorithm. Some of these ideas are implemented in the OCaml type checker.To carry on the optimizations, we change the syntax of types. Recall that in

`sound_eager`

, types were comprised of free or bound type variables`TVar`

, (implicitly universally) quantified type variables`QVar`

and function types`TArrow`

. The first, seemingly unprincipled change, is to eliminate`QVar`

as a distinct alternative and dedicate a very large positive integer -- which should be treated as the inaccessible ordinal ω -- as a`generic_level`

. A free type variable`TVar`

at`generic_level`

is taken to be a quantified type variable. More substantially, all types, not only free type variables, have levels now. The level of a composite type (`TArrow`

in our case) is an upper, not necessarily exact, bound on the levels of its components. In other words, if a type belongs to an alive region, all its components should be alive. It immediately follows that if a (composite) type is at`generic_level`

, it may contain quantified type variables. Contrapositively, if a type is not at`generic_level`

, it does not contain any quantified variable. Therefore, instantiating such a type should return the type as it is without traversing it. Likewise, if the level of a type is greater than the current level, it may contain free type variables to generalize. On the other hand, the generalization function should not even bother traversing a type whose level is equal or less than the current. This is the first example of how levels help eliminate excessive traversals and rebuildings of a type, improving sharing.Unifying a type with a free type variable should update the type's level to the level of the type variable if the latter level is smaller. For a composite type, such an update means recursively updating the levels of all components of the type. To postpone costly traversals, we give composite types two levels:

`level_old`

is an upper bound on the levels of type's components;`level_new`

, which is less or equal to`level_old`

, is the level the type should have after the update. If`level_new < level_old`

, the type has pending level updates. The syntax of types in`sound_lazy`

is thustype level = int let generic_level = 100000000 (* as in OCaml typing/btype.ml *) let marked_level = -1 (* for marking a node, to check*) (* for cycles *) type typ = | TVar of tv ref | TArrow of typ * typ * levels and tv = Unbound of string * level | Link of typ and levels = {mutable level_old : level; mutable level_new : level}

We have not explained

`marked_level`

. The occurs check on each unification with a free type variable is expensive, raising the algorithmic complexity of the unification and type checking. We now postpone this check, until the whole expression is type checked. In the meanwhile, unification may create cycles in types. Type traversals have to check for cycles, or risk divergence. The`marked_level`

is assigned temporarily to`level_new`

of a composite type to indicate the type is being traversed. Encountering`marked_level`

during a traversal means detecting a cycle, which raises the occurs check error. Incidentally, in OCaml types are generally cyclic: (equi-)recursive types arise when type checking objects and polymorphic variants, and when the`-rectypes`

compiler option is set. The OCaml type checker uses a similar marked-level trick to detect cycles and avoid divergence.The

`sound_lazy`

unification has several important differences from`sound_eager`

:let rec unify : typ -> typ -> unit = fun t1 t2 -> if t1 == t2 then () (* t1 and t2 are physically the same *) else match (repr t1,repr t2) with | (TVar ({contents = Unbound (_,l1)} as tv1) as t1, (* unify two free vars *) (TVar ({contents = Unbound (_,l2)} as tv2) as t2)) -> if tv1 == tv2 then () (* the same variable *) else if l1 > l2 then tv1 := Link t2 else tv2 := Link t1 (* bind the higher-level var *) | (TVar ({contents = Unbound (_,l)} as tv),t') | (t',TVar ({contents = Unbound (_,l)} as tv)) -> update_level l t'; tv := Link t' | (TArrow (tyl1,tyl2,ll), TArrow (tyr1,tyr2,lr)) -> if ll.level_new = marked_level || lr.level_new = marked_level then failwith "cycle: occurs check"; let min_level = min ll.level_new lr.level_new in ll.level_new <- marked_level; lr.level_new <- marked_level; unify_lev min_level tyl1 tyr1; unify_lev min_level tyl2 tyr2; ll.level_new <- min_level; lr.level_new <- min_level (* everything else is the unification error *) and unify_lev l ty1 ty2 = let ty1 = repr ty1 in update_level l ty1; unify ty1 ty2

where the auxiliary`repr`

, like OCaml's`Btype.repr`

, chases links of bound variables returning a free variable or a constructed type. Unlike OCaml, we do path compression. The unification function no longer does the occurs check; therefore, it has to make an effort to detect accidentally created cycles. Unifying with a free variable now takes constant time, to bind the variable after a shallow`update_level`

.The function

`update_level`

is one of the key parts of the optimized algorithm. Often, it merely promises to update the level of a type to the given level. It works in constant time and maintains the invariant that a type level may only decrease. The level of a type variable is updated immediately. For a composite type,`level_new`

is set to the desired new level if the latter is smaller. In addition, if previously`level_new`

and`level_old`

were the same, the type is put into the`to_be_level_adjusted`

queue for later update of the levels of the components. This work queue is akin to the list of assignments into the old generation from the young maintained by a generational garbage collector (such as the one in OCaml).let to_be_level_adjusted = ref [] let update_level : level -> typ -> unit = fun l -> function | TVar ({contents = Unbound (n,l')} as tvr) -> assert (not (l' = generic_level)); if l < l' then tvr := Unbound (n,l) | TArrow (_,_,ls) as ty -> assert (not (ls.level_new = generic_level)); if ls.level_new = marked_level then failwith "occurs check"; if l < ls.level_new then begin if ls.level_new = ls.level_old then to_be_level_adjusted := ty :: !to_be_level_adjusted; ls.level_new <- l end | _ -> assert false

The pending level updates must be performed before generalization: After all, a pending update may decrease the level of a type variable, promoting it to a wider region and hence saving it from quantification. Not all pending updates have to be forced however -- only of those types whose

`level_old > current_level`

. Otherwise, a type contains no variables generalizable at the present point, and the level update may be delayed further. The described forcing algorithm is implemented by`force_delayed_adjustments`

, see the source code. Incidentally, if a level update of a composite type (`TArrow`

) has to be really performed, the type has to be traversed. Unification of two`TArrow`

types also has to traverse them. Therefore, unification could, in principle, also update the levels along the way. That optimization is not currently implemented, however.The generalization function searches for free

`TVar`

s that belong to a dead region (that is, whose level is greater than the current) and sets their level to`generic_level`

, hence quantifying the variables. The function traverses only those parts of the type that may contain type variables to generalize. If a type has the (new) level of`current_level`

or smaller, all its components belong to live regions and hence the type has nothing to generalize. After the generalization, a composite type receives`generic_level`

if it contains a quantified type variable. Later on, the instantiation function will, therefore, only look through those types whose level is`generic_level`

.let gen : typ -> unit = fun ty -> force_delayed_adjustments (); let rec loop ty = match repr ty with | TVar ({contents = Unbound (name,l)} as tvr) when l > !current_level -> tvr := Unbound (name,generic_level) | TArrow (ty1,ty2,ls) when ls.level_new > !current_level -> let ty1 = repr ty1 and ty2 = repr ty2 in loop ty1; loop ty2; let l = max (get_level ty1) (get_level ty2) in ls.level_old <- l; ls.level_new <- l (* set the exact level upper bound *) | _ -> () in loop ty

The type checker

`typeof`

remains the same, entering a new region when type checking a`let`

expression. Please see the source code for details.We have presented the optimized

`sound_lazy`

type generalization algorithm that avoids not only scanning the whole type environment on each generalization, but also the occurs check on each unification with a free type variable. In the result, unification takes constant time. The algorithm eliminates unnecessary type traversals and copying, saving time and memory. Two ideas underlie the optimizations, besides the type levels for free type variables. First is the assigning of levels to composite types, to give us an idea what a type may contain without looking though it. The second principle is delaying expensive actions (type traversals) with the hope they will get done in the future alongside of something else. In other words, if dealing with a problem is postponed long enough, it may go away: procrastination sometimes helps. **References**- sound_lazy.ml [20K]

The complete code for the optimized toy type inferencer, again with many examples

- This OCaml internals section describes the implementation of
the type levels in the OCaml type checker and their application for
efficient generalization. The next section shows how the levels
help prevent escapes of local types and type check existentials.
The ideas behind the type generalization in OCaml have been presented in the previous sections, in the form of the toy algorithms

`sound_eager`

and`sound_lazy`

. Their code has been intentionally written to resemble the OCaml type checker, often using the same function names. The OCaml type checker implements the`sound_eager`

algorithm with a few optimizations from`sound_lazy`

. OCaml is far more complicated: whereas unification in the toy code takes just a few lines, the OCaml unification code, in`ctype.ml`

, takes 1634 lines. Nevertheless, understanding the toy algorithms should help in deciphering the OCaml type checker.Like the

`sound_eager`

algorithm, the OCaml type checker does the occurs check and the levels update on each unification with a free variable; one can clearly see that from the code of`Ctype.unify_var`

. On the other hand, like in`sound_lazy`

, the OCaml type checker assigns levels to all types, not only to type variables -- see`type_expr`

in`types.mli`

. One reason is to detect escaping local type constructors (described in the next section). Also like in`sound_lazy`

,`generic_level`

distinguishes quantified type variables and the types that may contain quantified variables (so-called `generic types'). Therefore, the schema instantiation function`Ctype.instance`

and`Ctype.copy`

will not traverse and copy non-generic parts of a type, returning them as they are, which improves sharing. Type variables at`generic_level`

are printed like`'a`

; with other levels, as`'_a`

. As in our toy algorithms, a mutable global`Ctype.current_level`

tracks the current level, which is assigned to newly created types or type variables (see`Ctype.newty`

and`Ctype.newvar`

). The`current_level`

is increased by`enter_def()`

and decreased by`end_def()`

. Besides the`current_level`

, there is also`nongen_level`

, used when type checking a class definition, and`global_level`

used for type variables in type declarations.A very simplified code for type-checking

`let x = e in body`

is as follows.let e_typed = enter_def (); let r = type_check env e_source in end_def (); r in generalize e_typed.exp_type; let new_env = bind env x e_typed.exp_type in type_check new_env body_source

Here,`e_source`

is the abstract syntax tree, or`Parsetree.expression`

for the expression`e`

and`e_typed`

is the`Typedtree.expression`

, the abstract syntax tree in which each node is annotated with its inferred type, the field`exp_type`

.Thus the overall type generalization pattern, often seen in the OCaml type checker, is

let ty = enter_def (); let r = ... let tv = newvar() in ... (... tv ...) end_def (); r in generalize ty

If

`tv`

was not unified with something that existed in the environment before`enter_def()`

, the variable will be generalized. The code looks quite like our toy code.Interestingly, levels have another use, enforcing the region discipline for local type declarations.

- The OCaml type checker relies on type levels also to check that
types are not used before being declared and that locally introduced
types do not escape into a wider scope. Unification, akin to assignment,
facilitates both mischiefs. We have seen how type levels are
related to region-based memory management. It is not surprising then
that the levels help rein in the unification, preventing resource
mismanagement -- this time, not with type variables but with type constants.
OCaml, unlike SML, supports local modules, or modules defined in local scope, via the

`let module`

form. A local module may declare a type, and may even let this type escape, as inlet y = let module M = struct type t = Foo let x = Foo end in M.x ^^^ Error: This expression has type M.t but an expression was expected of type 'a The type constructor M.t would escape its scope

Such an escape must be flagged as an error. Otherwise,`y`

will receive the type`M.t`

where`M.t`

and even`M`

are not in scope where`y`

is. This problem is akin to returning the address of an automatic local variable from a C function:char * esc_res(void) { char str [] = "local string"; return str; }

A locally declared type can escape not only through the result type but also by unification with an existing type variable:fun y -> let module M = struct type t = Foo let r = y Foo end in () ^^^ Error: This expression has type t but an expression was expected of type 'a The type constructor t would escape its scope

This sort of error is also familiar to C programmers:char * y = (char*)0; void esc_ext(void) { char str [] = "local string"; y = str; }

Even top-level modules have type escaping problems. Here is the example taken from a comment in the OCaml type checker:

let x = ref [] module M = struct type t let _ = (x : t list ref) end

The variable`x`

has the non-generic type`'_a list ref`

. The module`M`

defines the local type`t`

. The type attribution causes`x`

, defined prior to`t`

, to have the type`x : t list ref`

. It looks like`t`

is used before defined. Such type escaping may occur even without modules, as pointed by Jacques Garrigue:let r = ref [] type t = Foo let () = r := [Foo] ^^^ Error: This expression has type t but an expression was expected of type 'weak1 The type constructor t would escape its scope

OCaml cannot let such escapes go uncaught. Under no circumstances a type constructor may be used outside the scope of its declaration. Type levels enforce this region-like discipline for type constructors.The OCaml type checker already supports regions for the sake of type generalization, providing operations

`begin_def`

for entering and`end_def`

for exiting (destroying) a new region, associating types to their owner region, and tracking ownership changes during unification. What remains is to make a type declaration enter a new region and to associate the declared type constructor with this region. Any type in which this type constructor appears must belong to a region within the type declaration region: the declaration of a type constructor must dominate all its uses.As explained earlier, type regions are identified by a positive integer, type level: the nesting depth of the region. Each type has the field

`level`

with the level of its owner region. Type constructors would need a similar level annotation. It turns out, a different facility of OCaml serves exactly this purpose. Type constructors, data constructors, term variables may be re-defined within an OCaml program: a type can be re-declared, a variable can be rebound several times. OCaml relies on*identifiers*(see`ident.ml`

) to distinguish among differently declared or bound occurrences of the same name. An identifier has the name and the timestamp, a positive number. The global mutable`Ident.currentstamp`

keeps the `current time' and advances it when a new identifier is created, by a declaration or a binding. The timestamp of the identifier is thus its binding time. The binding time is the natural way to relate an identifier to a type region. If the current time is set to the current level, new identifiers will have their binding time not smaller than the current level: they will be regarded as owned by the current type region. Non-escaping then means that the level of a type is no less than the binding time of each type constructor within the type.Unification, specifically, unification with a free type variable -- akin to assignment -- may change the ownership of a type, and so has to update the type level accordingly. It can also check, at the same time, that the non-escaping property still holds: see

`Ctype.update_level`

.We can now understand the OCaml code for type checking a local module, the expression

`let module name = modl in body`

, excerpted below from`typecore.ml`

.| Pexp_letmodule(name, smodl, sbody) -> let ty = newvar() in (* remember the original level *) begin_def (); Ident.set_current_time ty.level; let context = Typetexp.narrow () in let modl = !type_module env smodl in let (id, new_env) = Env.enter_module name.txt modl.mod_type env in Ctype.init_def(Ident.current_time()); Typetexp.widen context; let body = type_expect new_env sbody ty_expected in (* go back to original level *) end_def (); (* Check that the local types declared in modl don't escape through the return type of body *) begin try Ctype.unify_var new_env ty body.exp_type with Unify _ -> raise(Error(loc, Scoping_let_module(name.txt, body.exp_type))) end; re { exp_desc = Texp_letmodule(id, name, modl, body); exp_loc = loc; exp_extra = []; exp_type = ty; exp_env = env }

The type variable

`ty`

is created to receive the inferred type of the expression. The variable is created in the current region. After that, a new type region is entered, by`begin_def()`

, and the identifier timestamp clock is set to correspond to the new`current_level`

. (The timestamp clock is advanced right before a new identifier is created. That's why`Ident.set_current_time`

receives`ty.level`

rather than the incremented`current_level`

as the argument.) Any type constructor declared within the the local module will hence have the binding time of`current_level`

or higher.`Ctype.init_def(Ident.current_time())`

sets the type level to be the binding time of the last identifier of the local module. Therefore, all fresh types created afterwards, when type checking the`body`

, will have the level greater or equal than the binding time of any local module's type constructor. The unification will watch that any level update preserve the invariant. Finally, the unification with`ty`

at the very end (whose region, recall, is outside the`let module`

's region) will make sure than none of the local type constructors escape through the return type.Incidentally,

`Typetexp.narrow ()`

and`Typetexp.widen context`

in the above code establish a new context for type variables within the local module. That's whyfun (x:'a) -> let module M = struct let g (x:'a) = x end in M.g

has the inferred type`'a -> 'b -> 'b`

rather than`'a -> 'a -> 'a`

. The two occurrences of`'a`

in the above code are the distinct type variables. A local module shares none of its type variables with the surrounding.Existential types are quite like the types declared in local modules: in fact, existentials can be implemented with first-class local modules. Therefore, checking that types created by pattern-matching on (or, opening of) an existential do not escape the pattern-matching clause uses the same technique: see

`Typecore.type_cases`

.

- Didier Rémy has discovered the type generalization algorithm based on
levels when working on his Ph.D. on type inference of records and
variants. (Incidentally, he calls 'levels' ranks -- alas, 'levels' is
the term now used in the OCaml type checker.) He prototyped his record
inference in the original Caml (Categorical Abstract Machine
Language), which was written in Caml itself and ran on the top of Le
Lisp. That was before Caml Light let alone OCaml. He had to recompile
Caml frequently, which took a long time. As he says, the type
inference of Caml was the bottleneck: ``The heart of the compiler code
were two mutually recursive functions for compiling expressions and
patterns, a few hundred lines of code together, but taking around 20
minutes to type check! This file alone was taking an abnormal
proportion of the bootstrap cycle. This was at the time when
recompiling fonts in LaTeX would also take forever, so I think we were
used to scheduling such heavy tasks before coffee breaks -- or the
other way round.'' The type inference in Caml was slow for several
reasons. First, the instantiation of a type schema would create a new
copy of the entire type -- even of the parts without quantified
variables, which can be shared instead. Doing the occurs check on
every unification of a free type variable (as in our eager toy
algorithm), and scanning the whole type environment on each
generalization increase the time complexity of inference.
Didier Rémy resolved to speed up the process. He says:

``So, when I wrote my prototype for type checking records and variants (which, being structural, tend to be much larger then usual ML types), I was very careful to stay close to the theory in terms of complexity.- I implemented unification on graphs in
`O(n log n)`

---doing path compression and postponing the occurs-check; - I kept the sharing introduced in types all the way down without breaking it during generalization/instantiation;
- finally, I introduced the rank-based type generalization.''

`sound_lazy`

algorithm explained earlier was a very simple model of Rémy's algorithm, representing its main features. Xavier Leroy implemented the type levels and the level-based generalization in Caml-Light. However, for various reasons he implemented the version akin to`sound_eager`

, with the occurs check on each binding of a free type variable.Didier Rémy prefers to view ranks, or levels, in terms of graphs. If we add to the abstract syntax tree type annotations on each node, edges for shared types and edges from a quantified variable to its quantifier, we obtain a graph. The level of a free type variable can be thought of as the De Bruijn level -- the pointer to the AST node that will quantify the type variable. That AST node must be a

`let`

node, in the Hindley-Milner system. Unifying two free variables adds a sharing edge between them, which requires the adjustment of levels to maintain the invariant that a quantifier node dominates all uses of its bound variables. (Recall, a dominator in a graph for a set of nodes`V`

is a node`d`

such that all paths from the root to each node in`V`

pass through`d`

.) Adding the sharing edge may create a path that no longer passes through the old dominator, letting the variable escape, so to speak, and become dominated by a`let`

node with a wider scope.The graphical view of the ranks proved fruitful. Rank-based generalization easily extends to type checking of records with polymorphic fields. Eventually this graphical view has led to MLF. Didier Rémy remarks that ``the main operation in MLF -- raising binders -- is analogous to the computation of minimal rank between two nodes.'' Rémy's two MLF talks below describe the system and show several animations of rank adjustments during type checking. He also points out how ranks fit with the constraint-based presentation of ML type inference, explained in ``The Essence of ML Type Inference''.

- I implemented unification on graphs in
**References**- A History of Caml

<http://caml.inria.fr/about/history.en.html>

Section ``The first implementation'' describes the original Caml.François Pottier and Didier Rémy. The Essence of ML Type Inference

In Advanced Topics in Types and Programming Languages (Benjamin C. Pierce, editor)

Chapter 10, pages 389-489. MIT Press, 2005.Didier Rémy: Extension of ML Type System with a Sorted Equational Theory on Types

Research Report 1766, Institut National de Recherche en Informatique et Automatique, Rocquencourt, BP 105, 78 153 Le Chesnay Cedex, France, 1992

<http://gallium.inria.fr/~remy/ftp/eq-theory-on-types.pdf>Didier Rémy: A new look on MLF

<http://cristal.inria.fr/~remy/mlf/portland.pdf>Didier Rémy: MLF for Everyone (Users, Implementers, and Designers)

<http://cristal.inria.fr/~remy/mlf/mlf-for-everyone.pdf>David McAllester: A logical algorithm for ML type inference Proc. RTA'03, pp. 436-451

David McAllester has much later re-discovered the efficient generalization. He also showed that the ML type inference is nearly linear in program size for most practical programs.George Kuan and David MacQueen: Efficient ML Type Inference Using Ranked Type Variables

ML Workshop 2007

<http://people.cs.uchicago.edu/~gkuan/pubs/ml07-km.pdf>

The paper compares two level-based Hindley-Milner inference algorithms: one uses`let`

-levels, as explained on this page, while the other relies on lambda-levels. The paper develops abstract machines for both algorithms and describes their several interesting formal properties. The lambda-level approach was used in SML/NJ.Peter Sestoft: Programming Language Concepts

Springer Undergraduate Texts in Computer Science. xiv + 278 pages. July 2012

<http://www.itu.dk/people/sestoft/plc/>

Chapter 6 (see lecture slides and examples on the above page) describes a simpler version of Rémy's algorithm -- essentially,`sound_eager`

.

- The OCaml type checker provides two functions to create a fresh type
variable. This section illustrates the difference between them. The
functions are defined in
`ctype.ml`

, with the following signatures:newvar : ?name:string -> unit -> type_exp newgenvar : ?name:string -> unit -> type_exp

Both take the optional argument`?name`

to give the name to the variable. The name will be chosen automatically otherwise.The function

`newvar`

creates a variable at the`current_level`

whereas`newgenvar`

creates at the`generic_level`

. In the codelet ty1 = newvar () in unify env ty1 some_type let ty2 = newgenvar () in unify env ty2 some_type

both`ty1`

and`ty2`

behave the same: the type variable will be bound to`some_type`

. Since the`current_level`

corresponds to the innermost alive region,`some_type`

's level is the current level or smaller, and so remains unchanged in either case.The difference emerges in the following two snippets (the second often occurs in

`typecore.ml`

)let ty1 = newvar () in let list_type = newgenty (Tconstr(p_list, [ty1])) in let texp = instance env list_type in unify env texp some_type let ty2 = newgenvar () in let list_type = newgenty (Tconstr(p_list, [ty2])) in let texp = instance env list_type in unify env texp some_type

The function`instance`

copies the type -- creates a`Tsubst`

node, to be precise -- only if the type is generic. That is, inlet ty = newvar () in instance env ty

`instance`

acts as the identity function. However, inlet ty = newgenvar () in instance env ty

`instance`

copies the variable. Therefore, in the first snippet above,`unify`

at the end may affect the`list_type`

, by instantiating`ty1`

. The`list_type`

cannot possibly be affected in the second snippet since`unify`

will act on the copy of`ty2`

.

- The
`let`

-generalization in OCaml is far more complex than what we have sketched earlier. This section is to help appreciate the true complexity of generalization.The

`let`

-expression in OCaml has the general formlet [rec] pattern = exp and pattern = exp ... in body

The`let`

type checker`type_let`

-- 160 lines of code in`typecore.ml`

, not counting the type checking of patterns -- receives the list of pattern-expression pairs, and the recursion-flag. Here is the end of its codebegin_def (); ... let exp_list = List.map2 (fun (spat, sexp) (pat, slot) -> .... (* type checking of expressions *) type_expect exp_env sexp pat.pat_type) spat_sexp_list pat_slot_list in ... end_def(); List.iter2 (fun pat exp -> if not (is_nonexpansive exp) then iter_pattern (fun pat -> generalize_expansive env pat.pat_type) pat) pat_list exp_list; List.iter (fun pat -> iter_pattern (fun pat -> generalize pat.pat_type) pat) pat_list; (List.combine pat_list exp_list, new_env, unpacks)

We see the familiar pattern:begin_def(); ... newvar () ... end_def(); generalize

But there is another traversal of the type, with`generalize_expansive`

. That function is invoked only if the expression is expansive, that is, may have a visible effect -- for example, it is an application. The function`Ctype.generalize_expansive`

traverses its argument`type_expression`

; when it comes across a constructed type`Tconstr(p,args)`

(such as the list type, etc), and is about to traverse an`arg`

,`generalize_expansive`

checks the declaration of the type`p`

for the variance of that argument. If`arg`

is covariant,`generalize_expansive`

traverses`arg`

and sets the levels of the components above the`current_level`

to the`generic_level`

. If`arg`

is not covariant (e.g., the argument of`ref`

and`array`

type constructors),`arg`

's components with the levels above the current are set to the`current_level`

. The subsequent`generalize`

will leave those levels as they are. This is how a so-called relaxed value restriction is implemented, which is responsible for inferring the polymorphic type for# let x = (fun y -> print_string "ok"; y) [];; ok val x : 'a list = []

Here,`x`

is bound to an application, which is not a syntactically value and which is expansive. Its evaluation certainly has a visible effect. And yet the type of`x`

is generalized because the`list`

type is covariant in its argument. SML would not have. **References**- Jacques Garrigue: Relaxing the Value Restriction

FLOPS 2004, pp. 196-213