There is more to Hindley-Milner type inference than the Algorithm W. In 1988, Didier Rémy was looking to speed up the type inference in Caml and discovered an elegant method of type generalization. Not only it is fast, avoiding the scan of the type environment. It smoothly extends to catching of locally-declared types about to escape, to type-checking of universals and existentials, and to implementing MLF.
Alas, both the algorithm and its implementation in the OCaml type checker are little known and little documented. This page is to explain and popularize Rémy's algorithm, and to decipher a part of the OCaml type checker. The page also aims to preserve the history of Rémy's algorithm.
The attraction of the algorithm is its insight into type generalization as dependency tracking -- the same sort of tracking used in automated memory management such as regions and generational garbage collection. Generalization can be viewed as finding dominators in the type-annotated abstract syntax tree with edges for shared types. Fluet and Morrisett's type system for regions and MetaOCaml environment classifiers use the generalization of a type variable as a criterion of region containment. Uncannily, Rémy's algorithm views the region containment as a test if a type variable is generalizable.
Inside OCaml type checker
OCaml generalization is based on tracking of so-called levels of a type. The very same levels also ensure that types defined within a module do not escape into a wider scope. Levels hence enforce the region discipline for locally introduced type constructors. It is intriguing how generalization and regions are handled so uniformly. There are even more applications of levels in the OCaml type checker, for records with polymorphic fields and existentials. MetaOCaml indirectly relies on levels to track the scope of future-stage bindings. There is a common refrain in all these applications: tracking dependencies, computing region containment or dominators in data-dependency graphs. One is immediately reminded of the region-based memory management by Tofte and Talpin. As Fluet and Morrisett showed, Tofte and Talpin type system for regions can be encoded in System F, relying on universal quantification to statically prevent allocated data from escaping their region. Dually, the level-based generalization relies on detecting escapes of a type variable to determine its region and hence the place for its universal quantification.
OCaml's generalization is a (partial) implementation of the algorithm discovered by Didier Rémy back in 1988. The idea is to explicitly represent the sharing of types in the type-annotated abstract syntax tree. A type variable can only be quantified at a node that dominates all occurrences of that variable. Generalization amounts to the incremental computation of graph dominators. Rémy's MLF is the natural outgrowth of this idea.
Unfortunately, Rémy's generalization algorithm and the underlying ideas are little known. The implementations, such as the one in OCaml, do not seem to be documented at all, aside from a couple of brief puzzling comments in the OCaml source code. They ought to be widely known. Towards this goal, the present page sets to (i) motivate and explain the algorithm, expose its intuitions and sketch implementations; (ii) help decipher OCaml type checker.
The second part of this page aims to be a commentary on a portion of the OCaml
type-checker, and is, therefore, quite technical. It refers
to OCaml 4.00.1 type checking code, located in the directory typing/
of the OCaml distribution. The file typecore.ml
is the
core type checker: it annotates nodes of the abstract syntax tree with
types and the typing environment. To be precise, it transforms Parsetree
(defined in parsing/parsetree.mli
) into Typedtree
.
The file ctype.ml
implements unification and level manipulation
functions.
I am indebted to Didier Rémy for his comments, explanations, intuitions and recollections of the discovery of the algorithm. I thank Jacques Garrigue for helpful comments and explanations of more applications of levels within the OCaml type checker. Additional references provided by Matthew Fluet and Baris Aktemur are gratefully acknowledged.
Matthew Fluet and J. Gregory Morrisett: Monadic Regions
J. Functional Programming, 2006, v16, N4-5, pp. 485-545.
The paper shows that parametric polymorphism is all that needed for a sound type system of memory regions.
Recall that generalization GEN(G,t)
of the type t
with respect
to the type environment G
is quantifying free type variables of t
that do not occur as free in G
. In Greek: GEN(G,t) = ∀ α1
... αn. t
where {α1 ... αn} = FV(t) - FV(G)
. A Hindley-Milner
system applies the generalization, to convert a type to a type schema,
when type checking let
:
G |- e : t G, (x:GEN(G,t)) |- e2 : t2 ---------------------------------------- G |- let x = e in e2 : t2
That is, when inferring the type for let x = e in e2
, a free type
variable in the type of e
is quantified if the type variable does
not occur in the type environment. ML adds another condition
for generalization, so-called value restriction: the type of e
is
generalized only if e
, by the look of it,
has no visible side-effect -- technically, e
passes the syntactic
test of being nonexpansive.
OCaml relaxes the value restriction, see later on this page.
Here is a trivial example of generalization:
fun x -> let y = fun z -> z in y (* 'a -> ('b -> 'b) *)The type checker infers for
fun z ->z
the type β->β
with the fresh,
and hence unique, type variable β
. The expression fun z ->z
is syntactically
a value, generalization proceeds, and y
gets
the type ∀β.β->β
. Because of the polymorphic type, y
may occur in
differently typed contexts -- may be applied to arguments of different
types, -- as infun x -> let y = fun z -> z in (y 1, y true) (* 'a -> int * bool *)
Generalization Gen(G,t)
quantifies over only those free type variables of t
that do not occur in G
. This condition is subtle
but crucial: without it, the unsound type α->β
is inferred for
the function
fun x -> let y = x in y
To elaborate: to infer the type of a function, we infer the type of
its body in the environment in which x:α
where α
is a fresh type
variable. The form let y=x in ...
adds to the environment y:
Gen(x:α,α)
. Clearly α
does occur in the environment x:α
. If we
quantify over it nevertheless, y
receives the polymorphic type ∀α.α
, which can then be instantiated to any type. The result is
the function that ostensibly converts its argument to the value of
any type whatsoever.
Therefore, for the soundness of the type system, we must make sure
that the type variable to quantify does not occur in the type
environment. Naively, we could scan the type environment,
looking through the type of each binding -- in fact, the original
Caml did exactly that. The type environment
however can be very large. Typically ML functions contain long
sequences of let
s. A non-recursive let
has in its type environment
the bindings of all previous let
s; the environment of a recursive let
has the bindings of all let
siblings. Searching the
environment as part of the generalization for a single let
takes time
linear in the function size; the type checking of the whole program will be
quadratic then. (Except for pathological cases, Hindley-Milner type
inference scales nearly linearly with the program size.)
The inefficient generalization was one of the main reasons for
the slow speed of Caml compilation, Didier Rémy recalls.
Bootstrapping the compiler and type checking two mutually recursive
functions for compiling patterns and expressions took 20 minutes.
There has to be a way to avoid scanning the environment. The next section gives the idea.
Although our Hindley-Milner type inferencer is toy, it shares many implementation decisions (and even some function names) with the real OCaml type checker. Understanding it will help when we turn to OCaml internals later on this page.
Our toy language is the standard pure lambda-calculus with let
.
Its expressions are
type exp = | Var of varname (* variable *) | App of exp * exp (* application: e1 e2 *) | Lam of varname * exp (* abstraction: fun x -> e *) | Let of varname * exp * exp (* let x = e in e2 *)Types are comprised of type variables, quantified type variables and function types:
type qname = string type typ = | TVar of tv ref (* type (schematic) variable *) | QVar of qname (* quantified type variable *) | TArrow of typ * typ and tv = Unbound of string | Link of typType schemas in the Hindley-Milner system are in the prenex form (that is, universal quantifiers are all outside), and so the quantifiers need not be represented explicitly. Types with
QVar
are type schemas;
without -- simple types.
In the Prolog tradition, type variables are represented as reference cells.
An unbound variable contains the null or the self pointer -- or, in our case,
the name of the variable for easy printing. When a free type variable is
unified with some type t'
, the reference cell is overwritten with the
pointer to t'
. To prevent cyclical (and, for us, unsound) types,
the `occurs check' is performed first: occurs tv t'
traverses t'
raising
an exception if it comes across the type variable tv
:
let rec unify : typ -> typ -> unit = fun t1 t2 -> if t1 == t2 then () (* t1 and t2 are physically the same *) else match (t1,t2) with | (TVar ({contents = Unbound _} as tv),t') | (t',TVar ({contents = Unbound _} as tv)) -> occurs tv t'; tv := Link t' | (TVar {contents = Link t1},t2) | (t1,TVar {contents = Link t2}) -> unify t1 t2 | (TArrow (tyl1,tyl2), TArrow (tyr1,tyr2)) -> unify tyl1 tyr1; unify tyl2 tyr2 (* everything else is error *)
The type checker is totally standard. It infers the type for the
expression exp
in the type environment env
:
type env = (varname * typ) list let rec typeof : env -> exp -> typ = fun env -> function | Var x -> inst (List.assoc x env) | Lam (x,e) -> let ty_x = newvar () in let ty_e = typeof ((x,ty_x)::env) e in TArrow(ty_x,ty_e) | App (e1,e2) -> let ty_fun = typeof env e1 in let ty_arg = typeof env e2 in let ty_res = newvar () in unify ty_fun (TArrow (ty_arg,ty_res)); ty_res | Let (x,e,e2) -> let ty_e = typeof env e in typeof ((x,gen ty_e)::env) e2The function
newvar
allocates a new TVar
, with a unique name.
The function inst
instantiates a type schema, that is,
replaces each QVar
with a fresh TVar
. It is also standard.
The generalization function is unsound: it quantifies all free variables
in the type regardless of the environment:let rec gen : typ -> typ = function (* unsound! *) | TVar {contents = Unbound name} -> QVar name | TVar {contents = Link ty} -> gen ty | TArrow (ty1,ty2) -> TArrow (gen ty1, gen ty2) | ty -> tyThe quantification replaces a
TVar
with the corresponding QVar
.
The original TVar
is hence implicitly deallocated: When a free variable is
bound, it `disappears', being replaced by the `pointer' to the binder.
With respect to type variables, typeof
allocates free variables,
unifies them, and deallocates, after quantification.
Let us type check simple examples observing the sequence of these
three main operations that affect free type variables. The first example
is the one where nothing should go wrong:
fun x -> let y = fun z -> z in yThe trace of type-checking, showing only type-variable related operations, is as follows.
1 ty_x = newvar () (* fun x -> ... *) 2 ty_e = (* let y = fun z -> z in y *) 3 ty_z = newvar (); (* fun z -> ... *) 3 TArrow(ty_z,ty_z) (* inferred for: fun z -> z *) 2 ty_y = gen ty_e (* ty_z remains free, and so *) 2 deallocate ty_z (* quantified and disposed of *) 1 TArrow(ty_x, inst ty_y) (* inferred for: fun x -> ... *)The number in the first column is the depth for the recursive invocations of
typeof
. Since typeof
recurs on each non-leaf node of the abstract
syntax tree (AST), this recursive invocation depth is the depth in the AST
of the node being type checked. The inferred type is 'a -> 'b -> 'b
,
as expected. Nothing went wrong.
The second example is the one we saw before, for which the unsound
generalization gives the unsound type 'a->'b
:
fun x -> let y = x in yDiagramming the
TVar
operations as before reveals the problem:1 ty_x = newvar () (* fun x -> ... *) 2 ty_e = (* let y = x in y *) 3 inst ty_x (* inferred for x, same as ty_x *) 2 ty_y = gen ty_e (* ty_x remains free, and is *) 2 deallocate ty_x (* quantified, and disposed of *) 1 TArrow(ty_x, inst ty_y) (* inferred for: fun x -> ... *)The type variable
ty_x
is part of the return type, used at depth 1 --
and yet it is quantified and disposed of at depth 2. We disposed
of the value still in use.
The third example is also problematic. The unsound generalization again
gives the unsound type ('a->'b) -> ('c ->'d)
:
fun x -> let y = fun z -> x z in yThe diagram shows a memory management problem again:
1 ty_x = newvar () (* fun x -> ... *) 2 ty_e = (* let y = ... *) 3 ty_z = newvar () (* fun z -> ... *) 4 ty_res = newvar () (* typechecking: x z *) 4 ty_x := (* as the result of unify *) 4 TArrow (ty_z,ty_res) 4 ty_res (* inferred for: x z *) 3 TArrow(ty_z,ty_res) (* inferred for: fun z -> x z *) 2 ty_y = gen ty_e (* ty_z, ty_res remain free *) 2 deallocate ty_z (* quantified and disposed of *) 2 deallocate ty_res (* quantified and disposed of *) 1 TArrow(ty_x, inst ty_y) (* inferred for: fun x -> ... *)The type variables
ty_z
and ty_res
are quantified over and
hence disposed of at depth 2, and yet they are part of TArrow (ty_z,ty_res)
that is assigned to ty_x
, which, in turn,
is part of the result.
All unsound examples had a `memory management problem', deallocating
memory (TVar
) still being used. This is no accident. When a type
variable is quantified over, later on it can be instantiated with any
type whatsoever, with no limitations and in any context. However, if
a type variable also appears in the type environment G
, no longer can we replace the type variable with any type without
affecting the rest of the type checking.
Likewise, when we free a piece of memory, we give the run-time
the permission to reallocate it and overwrite with arbitrary data.
The rest of our program should not depend on what happens later
with the deallocated memory -- provided it was really free, not
needed further in the program. In fact, one may define `memory not in use'
as arbitrary non-deterministic changes to that memory not affecting the
rest of the program. Deallocating memory still in use will affect
the rest of the program -- often, crash it. Incidentally, unsound
types inferred for our examples often lead to the same result.
sound_eager
,
is motivated and explained in this section. The optimal sound_lazy
, which captures the main features of the Rémy algorithm,
is presented next.Clearly, before deallocating memory we must check if it is still in use. Naively, we could scan all memory known to be in use looking for references to the deallocation candidate -- in other words, do the full garbage-collection marking pass and see if our piece of memory got marked. Put it this way, the check seems awfully expensive. At least we should wait until garbage accumulates, to collect it en masse. Alas, in the Hindley-Milner type system we cannot delay quantification arbitrarily, since the generalized type can be used right away.
More promising is ownership tracking, associating an allocated resource
with an owner, an object or a function activation. Only the owner may
deallocate its resources. A similar strategy is regions, which are
areas of memory for heap-allocated data created by a lexically-scoped letregion
primitive. When letregion
goes out of scope, its whole
whole region is summarily deallocated. This idea matches the
generalization well. In the Hindley-Milner system, generalization is
always a part of let
. A let
-expression let x = e in e2
is the
natural owner of all type variables allocated when inferring the type
of e
. When the type of e
is found, all free type variables still
owned by the let
-expression can be disposed of, that is, quantified.
These intuitions underlie the sound and efficient generalization
algorithms. The first is sound_eager
, described in the rest of the
section. Its code differs only in small, but significant, details from
the toy Hindley-Milner inferencer from the previous section. We will
only explain these small differences; the complete code is available
below. The main difference is that free type variables, albeit
unbound, are now owned, and contain the `pointer' to their owner. The
owner, always a let
expression, is identified by a positive integer
called level. This level is essentially the de Bruijn level, only
referring to the owner rather than the binder. That is, the level is
the nesting depth of the owning let
expression in the abstract
syntax tree (AST); level 1 corresponds to the (implicit) top-level let
. (Incidentally, although both let
s in (let x = e1 in eb1,
let y = e2 in eb2)
have level 2, no confusion can arise as neither let
is in each other scope and hence their regions are disjoint.)
Since the depth of typeof
recursion equals the AST depth of the
expression it type checks, the level is equal to the let
-expression's
type checking recursion depth. That depth
is simple to determine, with the help of one reference cell.
type level = int let current_level = ref 1 let enter_level () = incr current_level let leave_level () = decr current_levelThe type inferencer maintains the
let
type-checking depth:let rec typeof : env -> exp -> typ = fun env -> function ... (* the other cases are the same as before *) | Let (x,e,e2) -> enter_level (); let ty_e = typeof env e in leave_level (); typeof ((x,gen ty_e)::env) e2The only change to the main type-inference function was adding
enter_level
and leave_level
to track the level. The rest of typeof
is
literally the same as in the original toy version.
Free type variables now carry the level identifying their owner.
A freshly allocated type variable receives the current_level
, meaning that
its owner is the latest let
being type-checked. (In region-based
memory management, all new memory is allocated in the innermost alive
region.)
type typ = | TVar of tv ref (* type (schematic) variable *) | QVar of qname (* quantified type variable *) | TArrow of typ * typ and tv = Unbound of string * level | Link of typ let newvar : unit -> typ = fun () -> TVar (ref (Unbound (gensym (),!current_level)))
Just as an assignment may change the owner of an allocated memory,
unification may change the level of a free type variable. For example,
if ty_x
and ty_y
are both free, ty_y
has level 2, ty_x
has
level 1, and ty_x
is unified with the type TArrow(ty_y,ty_y)
, the
arrow type and its components are exported into region 1, and so the
level of ty_y
is changed to 1. One may view the above
unification as replacing all occurrences of ty_x
with TArrow(ty_y,ty_y)
. Some occurrences may be outside the scope of the
inner, level-2 let
. Therefore, when the type checking of that inner let
finishes, ty_y
should not be deallocated. With the updated ty_y
level, it won't be. Thus unifying a free type variable ty_x
with t
has to update the levels of each free type variable ty_y
in t
to
the smallest of ty_y
and ty_x
levels. Unifying a free type
variable with t
also has to do the occurs check, which too traverses
the type. The two traversals can be merged. The new occurs
does the
occurs check and updates the levels:
let rec occurs : tv ref -> typ -> unit = fun tvr -> function | TVar tvr' when tvr == tvr' -> failwith "occurs check" | TVar ({contents = Unbound (name,l')} as tv) -> let min_level = (match !tvr with Unbound (_,l) -> min l l' | _ -> l') in tv := Unbound (name,min_level) | TVar {contents = Link ty} -> occurs tvr ty | TArrow (t1,t2) -> occurs tvr t1; occurs tvr t2 | _ -> ()
The only difference from the original occurs
code is the second clause
in the pattern-match. The unification code does not have to be modified
at all. Finally, we fix the generalization function, to make it sound:
let rec gen : typ -> typ = function | TVar {contents = Unbound (name,l)} when l > !current_level -> QVar name | TVar {contents = Link ty} -> gen ty | TArrow (ty1,ty2) -> TArrow (gen ty1, gen ty2) | ty -> ty
The change is minimal: the condition when l > !current_level
. Recall
the new typeof
code:
let rec typeof : env -> exp -> typ = fun env -> function ... (* the other cases are the same as before *) | Let (x,e,e2) -> enter_level (); let ty_e = typeof env e in leave_level (); typeof ((x,gen ty_e)::env) e2It invokes
gen
after the region established for type checking e
exits. A free type variable still owned by that region will have
the level greater than the current. Since the region is now dead,
any such type variable may be deallocated, that is, quantified.
These are all the changes of sound_eager
from the unsound toy algorithm,
which fix the type inference. Here is the old problematic example
fun x -> let y = x in yDiagramming the
TVar
operations shows no problems now:1 1 ty_x/1 = newvar () (* fun x -> ... *) 2 2 ty_e = (* let y = x in y *) 3 2 inst ty_x/1 (* inferred for x, same as ty_x *) 2 1 ty_y = gen ty_e (* ty_x/1 remains free, but is *) (* level = current, can't *) (* quantify, can't dispose *) 1 1 TArrow(ty_x/1, inst ty_y) (* inferred for: fun x -> ... *)The first column of numbers shows the
typeof
recursion depth, or the depth
of the AST node being type-checked. The number in the second column is
the current_level
, the let
nesting depth. We write the level
of a free type variable after the slash, as in ty_x/1
.
That variable is no longer quantified by gen
at depth 2 (level 1)
since ty_x/1
belongs to to the current, still active region 1.
Therefore, the inferred type is 'a->'a
, as expected.In a slightly more complex example,
fun x -> let y = fun z -> x in ythe type variable
ty_x
for the type of x
is allocated at level 1,
whereas ty_z
is allocated at level 2. After the inner let
, region
2, is finished, ty_z/2
will be quantified and disposed of, but ty_x/1
will not. The inferred type therefore is 'a->'b->'a
. The
reader is encouraged to diagram other examples, to check that
the inferred types are sound.Level tracking may look like reference counting. However, rather than counting the number of users for a free type variable, we keep track of only one user, the one with the widest scope. Level tracking does look a lot like generational garbage collection: Memory is allocated in the young generation, and summarily disposed of at the minor (youngest) collection, unless it is `re-parented' or referenced from the stack. The old generation does not have to be scanned for references to the new generation, since no such references are expected -- unless there was an assignment of a young value to the field of an old data structure. A generational garbage collector (such OCaml GC) keeps track of inter-generational assignments. At the minor collection, young data referred from the old are promoted to the old generation. Type generalization indeed looks very similar to the minor GC collection.
sound_eager
generalization, with many more examples of now sound inferencesound_lazy
, an optimized version of sound_eager
from the previous section. The sound_lazy
algorithm
eschews repeated, unnecessary traversals of the type during unification,
generalization and instantiation, and avoids copying the parts
that do not contain variables to generalize or instantiate, thus
improving sharing. The algorithm delays the occurs check and the level
updates, so that the unification with a free type variable takes
constant time. Levels are updated incrementally and on demand. All in
all, sound_lazy
embodies the main ideas of Rémy's algorithm. Some of
these ideas are implemented in the OCaml type checker.
To carry on the optimizations, we change the syntax of types. Recall that
in sound_eager
, types were comprised of free or bound type
variables TVar
, (implicitly universally) quantified
type variables QVar
and function types TArrow
. The first,
seemingly unprincipled change, is to eliminate QVar
as a
distinct alternative and dedicate a very large positive integer -- which should
be treated as the inaccessible ordinal ω -- as a generic_level
.
A free type variable TVar
at generic_level
is taken to be a
quantified type variable. More substantially, all types, not only free
type variables, have levels now. The level of a composite type (TArrow
in our case) is an upper, not necessarily exact, bound on the levels
of its components. In other words, if a type belongs to an alive region,
all its components should be alive.
It immediately follows that if a (composite) type
is at generic_level
, it may contain quantified type variables.
Contrapositively, if a type is not at generic_level
, it does not
contain any quantified variable. Therefore, instantiating such a
type should return the type as it is without traversing it. Likewise,
if the level of a type is greater than the current level, it may
contain free type variables to generalize. On the other hand,
the generalization function should not even bother traversing a type
whose level is equal or less than the current. This is the first example of
how levels help eliminate traversals and rebuildings of a type, improving
sharing.
Unifying a type t
with a free type variable should update t
's level
to the level of the type variable if the latter level is smaller. For a
composite type, such an update means recursively updating the
levels of all components of the type. To postpone costly traversals, we
give composite types two levels: level_old
is an upper
bound on the levels of type's components; level_new
, which is less or
equal to level_old
, is the level the type should have after the update.
If level_new < level_old
, the type has pending level updates. The
syntax of types in sound_eager
is thus
type level = int let generic_level = 100000000 (* as in OCaml typing/btype.ml *) let marked_level = -1 (* for marking a node, to check*) (* for cycles *) type typ = | TVar of tv ref | TArrow of typ * typ * levels and tv = Unbound of string * level | Link of typ and levels = {mutable level_old : level; mutable level_new : level}
We have not explained marked_level
. The occurs check on each
unification with a free type variable is expensive, raising the
algorithmic complexity of the unification and type checking. We now
postpone this check, until the whole expression is type checked. In
the meanwhile, unification may create cycles in types. Type traversals
have to check for cycles, or risk divergence. The marked_level
is
assigned temporarily to level_new
of a composite type to indicate
the type is being traversed. Encountering marked_level
during a
traversal means detecting a cycle, which raises the occurs check
error. Incidentally, in OCaml types are generally cyclic:
(equi-)recursive types arise when type checking objects and
polymorphic variants, and when the -rectypes
compiler option is set. The
OCaml type checker uses a similar marked-level trick to detect cycles
and avoid divergence.
The sound_lazy
unification has several important differences
from sound_eager
:
let rec unify : typ -> typ -> unit = fun t1 t2 -> if t1 == t2 then () (* t1 and t2 are physically the same *) else match (repr t1,repr t2) with | (TVar ({contents = Unbound (_,l1)} as tv1) as t1, (* unify two free vars *) (TVar ({contents = Unbound (_,l2)} as tv2) as t2)) -> if l1 > l2 then tv1 := Link t2 else tv2 := Link t1 (* bind the higher-level var *) | (TVar ({contents = Unbound (_,l)} as tv),t') | (t',TVar ({contents = Unbound (_,l)} as tv)) -> update_level l t'; tv := Link t' | (TArrow (tyl1,tyl2,ll), TArrow (tyr1,tyr2,lr)) -> if ll.level_new = marked_level || lr.level_new = marked_level then failwith "cycle: occurs check"; let min_level = min ll.level_new lr.level_new in ll.level_new <- marked_level; lr.level_new <- marked_level; unify_lev min_level tyl1 tyr1; unify_lev min_level tyl2 tyr2; ll.level_new <- min_level; lr.level_new <- min_level (* everything else is the unification error *) and unify_lev l ty1 ty2 = let ty1 = repr ty1 in update_level l ty1; unify ty1 ty2where the auxiliary
repr
, like OCaml's Btype.repr
, chases links of
bound variables returning a free variable or a constructed type.
Unlike OCaml, we do path compression. The unification function no
longer does the occurs check; therefore, it has to make an effort to
detect accidentally created cycles. Unifying with a free variable now
takes constant time, to bind the variable after a shallow update_level
.
The function update_level
is one of the key parts of the optimized
algorithm. Often, it merely promises to update the level of a type to
the given level. It works in
constant time and maintains the invariant that a type level may
only decrease. The level of a type variable is updated immediately.
For a composite type, level_new
is
set to the desired new level if the latter is smaller. In
addition, if previously level_new
and level_old
were the same, the
type is put into the to_be_level_adjusted
queue for later
update of the levels of the components. This work queue is akin to
the list of assignments from the old generation to the young
maintained by a generational garbage collector (such as the one in
OCaml). Incidentally, a unification of two TArrow
types has to traverse
the types anyway, and so it does pending level updates along the way.
let to_be_level_adjusted = ref [] let update_level : level -> typ -> unit = fun l -> function | TVar ({contents = Unbound (n,l')} as tvr) -> assert (not (l' = generic_level)); if l < l' then tvr := Unbound (n,l) | TArrow (_,_,ls) as ty -> assert (not (ls.level_new = generic_level)); if ls.level_new = marked_level then failwith "occurs check"; if l < ls.level_new then begin if ls.level_new = ls.level_old then to_be_level_adjusted := ty :: !to_be_level_adjusted; ls.level_new <- l end | _ -> assert false
Pending level updates must be performed before generalization:
After all, a pending update may decrease the level of a type
variable, promoting it to a wider region, and hence save the variable
from quantification. Not all pending updates have to be forced however --
only of those types whose level_old > current_level
. Otherwise,
a type contains no variables generalizable at the present
point, and the level update may be delayed further.
The generalization function searches for free TVar
s that belong to a
dead region (that is, whose level is greater than the current) and
sets their level to generic_level
, hence quantifying the
variables. The function traverses only those parts of the type that
may contain type variables to generalize. If a type has the (new)
level of current_level
or smaller, all its components belong to live
regions and hence the type has nothing to generalize. After the
generalization, a composite type receives generic_level
if it
contains a quantified type variable. Later on, the instantiation
function will, therefore, only look through those parts of the type whose
level is generic_level
.
let gen : typ -> unit = fun ty -> force_delayed_adjustments (); let rec loop ty = match repr ty with | TVar ({contents = Unbound (name,l)} as tvr) when l > !current_level -> tvr := Unbound (name,generic_level) | TArrow (ty1,ty2,ls) when ls.level_new > !current_level -> let ty1 = repr ty1 and ty2 = repr ty2 in loop ty1; loop ty2; let l = max (get_level ty1) (get_level ty2) in ls.level_old <- l; ls.level_new <- l (* set the exact level upper bound *) | _ -> () in loop ty
The type checker typeof
remains the same, entering a new region when
type checking a let
expression. Please see the source code for details.
We have presented the optimized sound_lazy
type generalization algorithm
that not only avoids scanning the whole type environment on each
generalization. The algorithm also avoids the occurs check on each
unification with a free type variable, so that this unification takes
constant time. The algorithm eliminates unnecessary type traversals
and copying, saving time and memory. Two ideas underlie the
optimizations, besides the type levels for free type variables. First
is the assigning of levels to composite types, to give us an idea what
a type may contain without looking though it. The second principle is
delaying expensive actions (type traversals) with the hope they will
get done in the future alongside of something else. In other words, if
dealing with a problem is postponed long enough, it may go away:
procrastination sometimes helps.
The ideas behind the type generalization in OCaml have been presented in the
previous sections, in the form of the toy algorithms sound_eager
and sound_lazy
. Their code has been intentionally written to
resemble the OCaml type checker, often using the same function names.
The OCaml type checker implements the sound_eager
algorithm with a few optimizations from sound_lazy
. OCaml is far
more complicated: whereas unification in the toy code takes just
a few lines, the OCaml unification code, in ctype.ml
, takes
1634 lines. Nevertheless, understanding the toy algorithms should help
in deciphering the OCaml type checker.
Like the sound_eager
algorithm, the OCaml type checker does the
occurs check and the levels update on each unification with a free
variable; one can clearly see that from the code of Ctype.unify_var
.
On the other hand, like in sound_lazy
, the OCaml type checker
assigns levels to all types, not only to type variables -- see type_expr
in types.mli
. One reason is to detect escaping local
type constructors (described in the next section). Also like in sound_lazy
, generic_level
distinguishes quantified type variables
and the types that may contain quantified variables (so-called
`generic types'). Therefore, the schema instantiation function
(Ctype.instance
and Ctype.copy
) won't traverse and copy
non-generic parts of a type, returning them as they are, which
improves sharing. Type variables at generic_level
are printed like 'a
; with other levels, as '_a
. As in
our toy algorithms, a mutable global Ctype.current_level
tracks the
current level, which is assigned to newly created types or type
variables (see Ctype.newty
and Ctype.newvar
). The current_level
is increased by enter_def()
and decreased by end_def()
. Besides
the current_level
, there is also nongen_level
, used when type
checking a class definition, and global_level
used for type
variables in type declarations.
A very simplified code for type-checking let x = e in body
is
as follows.
let e_typed = enter_def (); let r = type_check env e_source in end_def (); r in generalize e_typed.exp_type; let new_env = bind env x e_typed.exp_type in type_check new_env body_sourceHere,
e_source
is the abstract syntax tree, or Parsetree.expression
for the expression e
and e_typed
is the Typedtree.expression
, the abstract syntax tree in which each node is
annotated with its inferred type, the field exp_type
.Thus the overall type generalization pattern, often seen in the OCaml type checker, is
let ty = enter_def (); let r = ... let tv = newvar() in ... (... tv ...) end_def (); r in generalize ty
If tv
was not unified with something that existed in the environment
before enter_def()
, the variable will be generalized. The code looks
quite like our toy code.
Interestingly, levels have another use, enforcing the region discipline for local type declarations.
OCaml, unlike SML, supports local modules, or modules defined in
local scope, via the let module
form. A local module may declare a type,
and may even let this type escape, as in
let module M = struct type t = Foo let x = Foo end in M.x;; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Error: This `let module' expression has type M.t In this type, the locally bound module name M escapes its scopeSuch an escape must be flagged as an error. Otherwise, the expression will receive the type
M.t
where M
and M.t
come into scope only inside the expression. This problem is
akin to returning the address of an automatic local variable from
a C function:char * esc_res(void) { char str [] = "local string"; return str; }A locally declared type can escape not only through the result type but also by unification with an existing type variable:
fun y -> let module M = struct type t = Foo let r = y Foo end in ();; ^^^ Error: This expression has type t but an expression was expected of type 'a The type constructor t would escape its scopeThis sort of error is also familiar to C programmers:
char * y = (char*)0; void esc_ext(void) { char str [] = "local string"; y = str; }
Even with ordinary, top-level rather than local, modules may have type escaping problems. Here is the example taken from a comment in OCaml type checker:
let x = ref [] module M = struct type t let _ = (x : t list ref) endThe module
M
defines a local type t
. The variable x
has the
non-generic type '_a list ref
. The type attribution causes x
,
defined prior to t
, to have the type x : t list ref
. OCaml cannot
let such escapes go uncaught. Under no circumstances the scope of a
type constructor may wider that the scope of its declaration. Type
levels enforce this region-like discipline for type constructors.
The OCaml type checker already supports regions for the sake of type
generalization, providing tools for entering a new region and
destroying it (begin_def
and end_def
), associating types to their
owner region, and tracking ownership changes during unification. What
remains is to associate a type constructor to the region
created at the point of its declaration. A type constructor can escape if
used in a type whose region is wider than the type
constructor's region -- in other words, if a type may outlive its type
constructors. Affirmatively, we aim to enforce the property
that the region of a type constructor declaration be wider than the
region of any type with that type constructor: the declaration of
a type constructor must dominate all its uses.
Type regions are identified by a positive number, type level, which
is just the nesting depth of the region. Each type has the field level
with the level of its owner region. Luckily, there is an unexpected way to
attribute the type level also to a type constructor. Type constructors, data
constructors, term variables may be re-defined within an OCaml program:
a type can be re-declared, a variable can be rebound several times.
OCaml relies on identifiers (see ident.ml
) to distinguish among
differently declared or bound occurrences of the same name. An identifier has
the name and the timestamp, a positive number. The global mutable Ident.currentstamp
keeps the `current time' and advances it when
a new identifier is created, by a declaration or a binding. The timestamp
of the identifier is thus its binding time. The binding time is the natural
way to relate an identifier to a type region. If the current time
is set to the current level, new identifiers will have their binding time
not smaller than the current level: they will be regarded as owned
by the current type region. In type-level terms, the non-escaping property
states that the level of a type is no less than the binding time of
each type constructor within the type.
Unification, specifically, unification with a free type
variable -- akin to assignment -- may change the ownership of a type,
and so has to update the type level accordingly. The very same Ctype.update_level
function can hence check that the non-escaping property
still holds.
We can now understand the OCaml code for type checking a local module,
the expression let module name = modl in body
, excerpted below
from typecore.ml
.
| Pexp_letmodule(name, smodl, sbody) -> let ty = newvar() in (* remember original level *) begin_def (); Ident.set_current_time ty.level; let context = Typetexp.narrow () in let modl = !type_module env smodl in let (id, new_env) = Env.enter_module name.txt modl.mod_type env in Ctype.init_def(Ident.current_time()); Typetexp.widen context; let body = type_expect new_env sbody ty_expected in (* go back to original level *) end_def (); (* Check that the local types declared in modl don't escape through the return type of body *) begin try Ctype.unify_var new_env ty body.exp_type with Unify _ -> raise(Error(loc, Scoping_let_module(name.txt, body.exp_type))) end; re { exp_desc = Texp_letmodule(id, name, modl, body); exp_loc = loc; exp_extra = []; exp_type = ty; exp_env = env }
The type variable ty
is created to later receive the inferred type
of the expression. The variable is created in the current region. After
that, a new type region is entered, by begin_def()
, and the
identifier timestamp clock is set to correspond to the new current_level
.
(The timestamp clock is advanced right before a new identifier is
created. That's why Ident.set_current_time
receives ty.level
rather
than the incremented current_level
as the argument.)
Any type constructor declared within the the local module
will hence have the binding time of current_level
or higher. Ctype.init_def(Ident.current_time())
sets the type level to be
the binding time of the last identifier of the local module. Therefore, all
fresh types created afterwards, when type checking the body
,
will have the level greater or equal than the binding time of
any local module's type constructor. The unification will watch
that any level update preserve the invariant. Finally, the
unification with ty
at the very end (whose region, recall, is outside
the let module
's region) will make sure than none of the local
type constructors escape through the return type.
Incidentally, Typetexp.narrow ()
and Typetexp.widen context
in the above code establish a new context for type variables within
the local module. That's why
fun (x:'a) -> let module M = struct let g (x:'a) = x end in M.ghas the inferred type
'a -> 'b -> 'b
rather than 'a -> 'a -> 'a
. The two
occurrences of 'a
in the above code are the distinct type variables. A local
module shares none of its type variables with the surrounding.
Existential types are quite like the types declared in local modules:
in fact, existentials can be implemented with first-class local modules.
Therefore, checking that types created by pattern-matching on
(or, opening of) an existential do not escape the pattern-matching
clause uses the same technique: see Typecore.type_cases
.
Didier Rémy resolved to speed up the process. He says:
So, when I wrote my prototype for type checking records and variants (which, being structural, tend to be much larger then usual ML types), I was very careful to stay close to the theory in terms of complexity.This efficient type inference algorithm was described in Rémy's PhD dissertation (in French) and in the 1992 technical report. The
- I implemented unification on graphs in O(n log n)---doing path compression and postponing the occurs-check;
- I kept the sharing introduced in types all the way down without breaking it during generalization/instantiation;
- finally, I introduced the rank-based type generalization.
sound_lazy
algorithm explained earlier was a very simple model of
Rémy's algorithm, representing its main features. Xavier Leroy
implemented the type levels and the level-based generalization in
Caml-Light. However, for various reasons he implemented the version
akin to sound_eager
, with the occurs check on each binding of a free
type variable.
Didier Rémy prefers to view ranks, or levels, in terms of graphs. If
we add to the abstract syntax tree type annotations on each node,
edges for shared types and edges from a quantified variable to its
quantifier, we obtain a graph. The level of a free type variable can
be thought of as a de Bruijn level -- a pointer to the AST node that
will quantify the type variable. That AST node must be a let
node,
in the Hindley-Milner system. Unifying two free variables adds a
sharing edge between them, which requires the adjustment of levels to
maintain the invariant that a quantifier node dominates all uses of
its bound variables. (Recall, a dominator in a graph for a set of
nodes V
is a node d
such that all paths from the root to each node
in V
pass through d
.) Adding the sharing edge may create a path that
no longer passes through the old dominator, letting the variable
escape, so to speak, and become dominated by the let
node with the
wider scope.
The graphical view of the ranks proved fruitful. Rank-based generalization easily extends to type checking of records with polymorphic fields. Eventually this graphical view has led to MLF. Didier Rémy remarks that ``the main operation in MLF -- raising binders --is analogous to the computation of minimal rank between two nodes.'' Rémy's two MLF talks below describe the system and show several animations of rank adjustments during the type checking. He also points to the view of ranks from the point of constraint-based presentation of ML type inference, explained in ``The Essence of ML Type Inference''.
François Pottier and Didier Rémy. The Essence of ML Type Inference
In Advanced Topics in Types and Programming Languages (Benjamin C. Pierce, editor)
Chapter 10, pages 389-489. MIT Press, 2005.
Didier Rémy: Extension of ML Type System with a Sorted Equational Theory on Types
Research Report 1766, Institut National de Recherche en Informatique et Automatique, Rocquencourt, BP 105, 78 153 Le Chesnay Cedex, France, 1992.
< ftp://ftp.inria.fr/INRIA/Projects/cristal/Didier.Remy/eq-theory-on-types.ps.gz >
Didier Rémy: A new look on MLF
< http://cristal.inria.fr/~remy/mlf/portland.pdf >
Didier Rémy: MLF for Everyone (Users, Implementers, and Designers)
< http://cristal.inria.fr/~remy/mlf/mlf-for-everyone.pdf >
David McAllester: A logical algorithm for ML type inference Proc. RTA'03, pp. 436-451
< http://dl.acm.org/citation.cfm?id=1759182 >
David McAllester has much later re-discovered the efficient generalization. He also showed that the ML type inference is nearly linear in program size for most practical programs.
George Kuan and David MacQueen: Efficient ML Type Inference Using Ranked Type Variables
ML Workshop 2007
< http://people.cs.uchicago.edu/~gkuan/pubs/ml07-km.pdf >
The paper compares two level-based Hindley-Milner inference algorithms:
one uses let
-levels, as explained on this page, while
the other relies on lambda-levels. The paper develops abstract
machines for both algorithms and describes their several interesting
formal properties. The lambda-level approach was used in SML/NJ.
Peter Sestoft: Programming Language Concepts
Springer Undergraduate Texts in Computer Science. xiv + 278 pages. July 2012
< http://www.itu.dk/people/sestoft/plc/ >
Chapter 6 (see lecture slides and examples on the above page) describes a simpler version of Rémy's algorithm -- essentially, sound_eager
.
ctype.ml
, with the following signatures:newvar : ?name:string -> unit -> type_exp newgenvar : ?name:string -> unit -> type_exp
Both take the optional argument ?name
to give the name to the
variable. The name will be chosen automatically otherwise.
The function newvar
creates a variable at the current_level whereas newgenvar
creates at the generic_level
. In the code
let ty1 = newvar () in unify env ty1 some_type let ty2 = newgenvar () in unify env ty2 some_typeboth
ty1
and ty2
behave the same: the type variable will be bound
to some_type
. Since the current_level
corresponds to the innermost
alive region, some_type
's level is the current level or smaller,
and so remains unchanged in either case.
The difference emerges in the following two snippets
(the second often occurs in typecore.ml
)
let ty1 = newvar () in let list_type = newgenty (Tconstr(p_list, [ty1])) in let texp = instance env list_type in unify env texp some_type let ty2 = newgenvar () in let list_type = newgenty (Tconstr(p_list, [ty2])) in let texp = instance env list_type in unify env texp some_typeThe function
instance
copies the type -- creates a Tsubst
node, to be
precise, only if the type is generic. That is, inlet ty = newvar () in instance env ty
instance
acts as the identity function. However, inlet ty = newgenvar () in instance env ty
instance
copies the variable. Therefore, in the first snippet above, unify
at the end may affect the list_type
, by instantiating ty1
.
The list_type
cannot possibly be affected in the second snippet since unify
will act on the copy of ty2
.
let
-generalization in OCaml is far more complex than what we
have sketched earlier. This section is to help appreciate the true
complexity of generalization.
The let
-expression in OCaml has the general form
let [rec] pattern = exp and pattern = exp ... in bodyThe
let
type checker type_let
-- 160 lines of code in typecore.ml
, not counting the type checking of patterns -- receives
the list of pattern-expression pairs, and the recursion-flag. Here is
the end of its codebegin_def (); ... let exp_list = List.map2 (fun (spat, sexp) (pat, slot) -> .... (* type checking of expressions *) type_expect exp_env sexp pat.pat_type) spat_sexp_list pat_slot_list in ... end_def(); List.iter2 (fun pat exp -> if not (is_nonexpansive exp) then iter_pattern (fun pat -> generalize_expansive env pat.pat_type) pat) pat_list exp_list; List.iter (fun pat -> iter_pattern (fun pat -> generalize pat.pat_type) pat) pat_list; (List.combine pat_list exp_list, new_env, unpacks)We see the familiar pattern:
begin_def(); ... newvar () ... end_def(); generalizeBut there is another traversal of the type, with
generalize_expansive
. That function is invoked only if the expression
is expansive, that is, may have a visible effect -- for example, it is
an application. The function Ctype.generalize_expansive
traverses its argument type_expression
; when it comes across a
constructed type Tconstr(p,args)
(such as the list type, etc),
and is about to traverse an arg
, generalize_expansive
checks
the declaration of the type p
for the variance of that argument.
If arg
is covariant, generalize_expansive
traverses arg
and
sets the levels of the components above the current_level
to
the generic_level
. If arg
is not covariant (e.g., the argument of ref
and array
type constructors), arg
's components with
the levels above the current are set to the current_level
.
The subsequent generalize
will leave those levels as they
are. This is how a so-called relaxed value restriction is
implemented, which is responsible for inferring the polymorphic type for# let x = (fun y -> print_string "ok"; y) [];; ok val x : 'a list = []Here,
x
is bound to an application, which is not a syntactically
value and which is expansive. Its evaluation certainly has a visible
effect. And yet the type of x
is generalized because the list
type is covariant in its argument. SML would not have. Without the
relaxed value restriction, MetaOCaml will be hardly usable. For
example,let x = .<1 + .~(...)>.is an expression, so the classifier cannot be generalized and hence
x
cannot be run.oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML