eval
function), the environment is
one of its arguments. Compiled functions receive the environment as
an extra argument. Terms with variables are commonly thought to mean
functions from the environment to the value domain.
This article demonstrates a different, unconventional approach to variable references, coming from the observation that a (let-)bound variable can only be used while the control remains within the corresponding let-expression body. In interpreter terms, a variable may be accessed only while the interpreter is still handling the let-form, which remains on the interpreter stack. Therefore, to find the value associated with the bound variable we search the stack -- or, better, just point to the right place on the stack that stores that value.
That is a very attractive idea, which seems first to come to mind when implementing higher-order languages. It was indeed first that came to mind to J. McCarthy and A. Turing. Alas, the idea -- now called dynamic binding -- is flawed (as Turing soon realized). To regard functions as first-class, to be able to pass them as arguments and return as results, lexical binding is indispensable.
This article demonstrates that, perhaps surprisingly, lexical binding can be implemented via dynamic binding, in strict and lazy settings.
The main advantage of this alternative approach is making variables a modular feature: variables and (first-class) functions can be introduced to a first-order language without any changes to the latter, without any need to re-write the interpretation of the first-order fragment (numbers, strings, etc.) to pass the environment. We may write extensible interpreters and compilers. Another application is the surprisingly simple implementation of staging.
Dynamic binding can be implemented in many ways. One particular implementation, in terms of delimited control, turns out particularly insightful. Incidentally, it demonstrates the need for multi-shot (non-affine) delimited continuations -- perhaps the first non-trivial use of such delimited continuations aside from non-determinism.
Another insight is that a let-expression -- often considered a mere syntax sugar in lambda calculus -- turns out more fundamental than the lambda abstraction.
We implement the languages by writing their interpreters -- several, in fact, for various evaluation strategies -- in OCaml. It helps therefore to embed the languages in OCaml as DSLs. We will be using tagless-final embedding, as more illuminating and easier to extend. The latter is particularly important: the extensibility of implementations is one of the main points.
The base language is the simplest first-order language with integers and integer operations, as well as applications (of built-in functions, since there is no way to define functions yet.) As a tagless-final DSL, the language (its operations -- syntax -- and the types) are described by the following OCaml signature.
module type fo = sig type 'a exp (* expressions *) val int : int -> int exp (* integer literals *) val ( + ) : int exp -> int exp -> int exp val ( - ) : int exp -> int exp -> int exp val (//) : ('a -> 'b) exp -> 'a exp -> 'b exp (* application, left-assoc *) end
A DSL expression of type t
is represented by an OCaml expression of
(an abstract) type t exp
. For example, an OCaml expression int 1
represents
DSL's integer literal 1
. Here is a (OCaml representation) of
a more elaborate DSL expression:
let exfo = int 1 + int 2 - int 3 - int 4Applications are denoted by the (left-associative) operator
//
. Built-in
functions are easy to add but they will not be needed for our examples.
The second language, fol
, extends fo
with definitions:
let-expressions and variables:
module type fol = sig include fo type 'a var val newvar : string -> 'a var val var : 'a var -> 'a exp val let_ : 'a var -> 'a exp -> (unit -> 'b exp) -> 'b exp endThe type
'a var
describes variables, created by newvar
from an
unlimited supply. The first argument of var
is the name, used only
for tracing, which is quite useful especially for call-by-need later
on. Variables per se are not expressions: use
var
to reference them (which may, and actually will be, an effectful
operation). The body of a let-expression is a thunk:
the timing of its evaluation is controlled by let_
.
A sample fol
term may look as follows:
let x = newvar "x" let y = newvar "y" let exfol = let_ x (int 10 - int 5) @@ fun () -> let_ y (int 1 + var x) @@ fun () -> let_ x (var x + var y) @@ fun () -> var x + var yThe first two lines create the variables for use in
exfol
. (Hereafter we assume that all variables appearing in examples
have been created similarly).
Finally, we make the language higher-order by adding first-class functions, built by lambda-expressions:
module type ho = sig include fol val lam : 'a var -> (unit -> 'b exp) -> ('a -> 'b) exp endFor example, an anonymous function of adding two to its argument is represented as
lam x (fun () -> x + int 2)
.
Here are a few sample expressions using all ho
facilities, including
those inherited from fo
and fol
:
let ex0 = (lam x @@ fun () -> let_ y (var x + var x) @@ fun () -> var y + var y) // int 10 let ex1 = (lam x @@ fun () -> let_ y (var x + var x) @@ fun () -> lam z @@ fun () -> var z + (var z + (var y + var y))) // (int 10 - int 5) // (int 20 - int 10) let exd = let_ h (lam f @@ fun () -> let_ x (int 10) @@ fun () -> var f // var x) @@ fun () -> let_ x (int 1) @@ fun () -> var h // (lam y (fun () -> var x + var y))The last example,
exd
, deliberately uses the same variable x
in different
contexts, to highlight the difference between lexical and dynamic bindings.
fo
in this section. For a tagless-final DSL, an interpreter is not a
single (eval
) function; rather, it is a realization of the fo
signature, showing the implementation (interpretation, or, denotation)
of each expression form of the DSL.
We chose the simplest, meta-circular interpreter, which interprets the
fo
DSL as a subset of OCaml. Therefore, the type t exp
is realized as t
,
DSL integer literals are OCaml integers, DSL addition and subtraction are
OCaml addition and subtraction (modulo printing the trace), DSL
application is the OCaml application. Other interpreters are possible:
pretty-printer or code generator.
module FO = struct type 'a exp = 'a let int : int -> int exp = Fun.id let ( + ) : int exp -> int exp -> int exp = fun x y -> Printf.printf "Adding %d and %d\n" x y; x + y let ( - ) : int exp -> int exp -> int exp = fun x y -> Printf.printf "Subtracting %d and %d\n" x y; x - y let (//) : ('a -> 'b) exp -> 'a exp -> 'b exp = (@@) end
Thus defined FO
can be immediately used to evaluate the sample term
exfo
, producing -4
and printing the trace of additions and
subtractions.
module type dynbind = sig type 'a dynvar val dnew : ?init:'a -> unit -> 'a dynvar val dlet : 'a dynvar -> 'a -> (unit -> 'w) -> 'w val dref : 'a dynvar -> 'a endThe operation
dnew
creates a new dynvar: dynamically-bound
variable, possibly with the default value. The operation dlet
, having received
a dynvar, a value and a thunk, evaluates the thunk in the dynamic
environment extended with the binding of the given value to the
given dynvar. Finally, dref
obtains the value associated with the dynvar
by the closest dynamically-enclosing dlet (or the default value, if it
was specified and the variable is not bound).
There are several implementations of this signature. The simplest uses reference cells: so-called `shallow binding'.
module DBShallow = struct type 'a dynvar = 'a option ref let dnew : ?init:'a -> unit -> 'a dynvar = fun ?init () -> ref init let dlet : 'a dynvar -> 'a -> (unit -> 'w) -> 'w = fun dv e body -> let vold = !dv in dv := Some e; match body () with | r -> dv := vold; r | exception e -> dv := vold; raise e let dref : 'a dynvar -> 'a = function | {contents = Some x} -> x | _ -> failwith "unbound dynamic variable" end
One may also implement dynamic binding via delimited control --
so-called delimited dynamic binding (DDB
). In this implementation,
dereferencing a dynvar is literally reaching back to the place
on the evaluation stack that stores the bound value (as part of the
dlet
stack frame). Finally, dynamic binding
is what's called a `reader effect', and may be implemented with any
algebraic effect facility (e.g., `one-shot' algebraic effects of OCaml
5).
fol
, which extends fo
with variables
and let-expressions. A variable may only be accessed while in scope of
the let-expression that bound it. In fol
, the scope of a
let-expression -- when control remains within its body -- cannot be
interrupted, captured or resumed. Therefore, lexical scope and dynamic
scope coincide, and we may use dynamic binding to implement
let-expressions and variables. The advantage of dynamic binding is
that it may added modularly to an existing language without any
changes to the latter -- as we now demonstrate.
The interpreter of fol
is a proper extension of FO
, including it
as is and adding the interpretation of the new facilities: variables
and binding:
module FOL(D:dynbind) = struct include FO type 'a var = string * 'a D.dynvar (* name for tracing *) let newvar : string -> 'a var = fun name -> (name, D.dnew ()) let var : 'a var -> 'a exp = fun (name,dv) -> Printf.printf "Needing %s...\n" name; D.dref dv let let_ : 'a var -> 'a exp -> (unit -> 'b exp) -> 'b exp = fun (name,dv) e body -> Printf.printf "Binding %s...\n" name; D.dlet dv e body endEssentially, DSL variables are realized as dynvar,
var
is dref
and let_
is dlet
. In particular, accessing a
variable is a (reader) effect.
The FOL
interpreter is parameterized by the implementation of
dynamic binding. See the accompanying code for tests, using DBShallow
and DDB
implementations of dynamic binding.
We must stress again the absence of any explicit environment, and that
the first-order fragment FO
is used as is, without any modification.
The first-order fragment remains unaware of any variables.
let
:
module HODB(D:dynbind) = struct include FOL(D) let lam : 'a var -> (unit -> 'b exp) -> ('a -> 'b) exp = fun v body -> fun e -> let_ v e body end
which is what the original Lisp did. It is indeed very intuitive and
elegant -- in fact, too intuitive. Trying to interpret our examples
shows the problem: whereas ex0
runs and gives the expected result, interpreting ex1
, repeated below
let ex1 = (lam x @@ fun () -> let_ y (var x + var x) @@ fun () -> lam z @@ fun () -> var z + (var z + (var y + var y))) // (int 10 - int 5) // (int 20 - int 10)ends in a ``unbound dynamic variable'' when accessing
y
, according
to the trace. Looking closely, we see that although lam z
is defined
within the scope of let_ y
, it is applied outside the dynamic
scope of that let-expression. The example exd
shows a more insidious error:
let exd = let_ h (lam f @@ fun () -> let_ x (int 10) @@ fun () -> var f // var x) @@ fun () -> let_ x (int 1) @@ fun () -> var h // (lam y (fun () -> var x + var y))Interpreting it with
HODB
finishes without problems; the result,
20
, is probably not what one may have expected.
In the presence of first-class functions, lexical
binding no longer coincides with dynamic binding.
The solution is also old, proposed by Landin back in 1960s: closures, which carry the variable environment at the point of function definition. When the function is later applied, its body is evaluated within the environment of its definition rather than the environment of the application. We can implement this standard, textbook solution literally, if we use an extended dynamic binding facility that permits capturing and restoring of the dynamic environment:
module type db_capture = sig include dynbind type env val current_env : unit -> env val with_env : env -> (unit -> 'w) -> 'w endThe operation
current_env
captures the current environment and
returns it as an opaque value (of an abstract type env
). The
operation with_env
receives an env
and a thunk
and evaluates
the thunk in the given dynamic environment env
rather than the
current one.
Given an implementation of db_capture
, lexical first-class functions
are implemented literally as the textbooks say:
module HO(D:db_capture) = struct include FOL(D) let lam : 'a var -> (unit -> 'b exp) -> ('a -> 'b) exp = fun v body -> let env = D.current_env () in (* env of the lam definition *) fun e -> D.with_env env @@ fun () -> let_ v e body endThe body of the lambda-expression is evaluated in the environment of its definition.
db_capture
-- extended with the operations for
capturing and restoring of the dynamic environment. We now demonstrate
that these operations are realizable in terms of the ordinary dynamic
binding (plus the facility to build sequences, or at least something
like function compositions).
The proof is constructive: the following functor delivers an
implementation of db_capture
given an implementation of dynbind
:
module DBCapture(D:dynbind) = struct include D type env = {k: 'w. (unit -> 'w) -> 'w} let with_env : env -> (unit -> 'w) -> 'w = fun {k} body -> k body let env_acc = dnew ~init:{k=fun b -> b ()} () let current_env : unit -> env = fun () -> dref env_acc let dlet : 'a dynvar -> 'a -> (unit -> 'w) -> 'w = fun dv e body -> let env = current_env () in D.dlet env_acc {k = fun b -> env.k (fun () -> D.dlet dv e b)} @@ fun () -> D.dlet dv e body endThe trick is the dynamic variable
env_acc
, to accumulate the
current dynamic environment. A dlet
expression records in env_acc
the binding it is about to make.
The functor DBCapture
builds the implementation of db_capture
generically, given any implementation of dynbind
. For a particular
implementation of dynbind
, however, there may be a more efficient
realization of db_capture
. For example, the accompanying code
shows such efficient implementation of db_capture
it terms of
DDB
, using multiprompt delimited control (supported in OCaml
as the library delimcc
). Incidentally, capturing of the environment
requires multi-shot delimited control, which is not supported by
OCaml 5 algebraic effects. It turns out that multi-shot delimited
control is needed not just to implement non-determinism.
fol
interpreter: in
call-by-name, lexical and dynamic scope differ even without
first-class functions.
Call-by-need, or lazy, evaluation is call-by-name with the
memoization of the result. A lazy expression is not evaluated until
its result is needed. At that point, the expression is evaluated and
the result is memoized. Thus a lazy argument expression is
evaluated at most once. Dynamic binding often includes the operation
dset
to change the binding of a dynamic variable in scope (in
particular, DDB
supports such an operation; it is easy to add to
DBShallow
). Using dset
in the implementation of the var
operation is all that is needed to make call-by-name into call-by-need.
The goal at that time was to related call-by-need with delimited control. I have eventually realized that it is not about delimited control but about dynamic binding (which may be implemented via delimited control, among other things).
eval
) is the fold over that data type.
lazy-eval.scm [6K]
The complete implementation and several illustrative examples,
including the recursively-defined Fibonacci stream. The code
was originally written on Feb 21, 2005. The version as of Dec 28, 2008 adds
a convenient automatic identification of lazy expressions and more
examples.
Ronald Garcia, Andrew Lumsdaine and Amr Sabry: Lazy Evaluation and Delimited Control. Logical Methods in Computer Science, July 11, 2010, Volume 6, Issue 3