We implement the type-directed memoization from the paper in OCaml. Quite a few adjustments had to be made. First of all, OCaml does not have type families, or even type classes. We use type-indexed--value approach pioneered by Zhe Yang. Second, whereas Haskell relied on laziness, we explicitly use reference cells for recording the computed results. Finally, recursive types require an additional level of indirection via the reference cell. The trick is not unlike the eta-expansion that converts the ordinary fixpoint combinator to the applicative one: We have to delay the computation of the fixpoint until we receive the argument for the fixpointed function. Still, the computed fixpoint should be shared among all applications of the memoized function.
Here is a simple example: memoizing functions on boolean lists. A boolean list has a recursive type (recursive sum of product), which can be written as:
BList = 1 + Bool * BListor, with the explicit fixpoint,
blist = mu self.(unit + bool * self)In OCaml, we write it as follows:
module BLST = FIX(struct type 'self t = (unit,bool*'self) either let mdt self = md_sum md_unit (md_prod md_bool self) end) let nil = BLST.Fix (Left ()) let cons h t = BLST.Fix (Right (h,t))The type expression for the type of
(unit,bool*'self) eithermatches the mathematical notation. Mainly, the expression
md_sum md_unit (md_prod md_bool self)that computes the memoizer for the functions on boolean lists from the memoizers for unit, booleans and memoizer combinators, too, matches the mathematical notation for the recursive type of boolean lists. Our memo tables are indeed type-directed.
The complete OCaml code and the tests
The cleverest copyist is the one whose music is performed with the most ease without the performer guessing why. -- J.J.Rousseau, Dictionary of Music.
The appeal to music is not just a flourish or a veiled reference. It is the point: how pleasant is a strict language to play the classical themes from analysis -- infinite power series.
As the instrument to to deal with infinite structures, lazy evaluation -- evaluating an expression only when needed and to the needed extent, memoizing the result -- has the undeniable appeal. Just look at the example, chosen to show off Haskell at haskell.org:
primes = filterPrime [2..] where filterPrime (p:xs) = p : filterPrime [x | x <- xs, x `mod` p /= 0]Haskell does not hold the monopoly on lazy evaluation: almost any strict language supports it, just not by default. For example, the straightforward implementation of
primesin OCaml is as follows:
type 'a ll = 'a lv Lazy.t and 'a lv = Cons of 'a * 'a ll | Nil let rec filter p : 'a ll -> 'a ll = function | lazy Nil -> lazy Nil | lazy (Cons (x,t)) -> let t = lazy (Lazy.force @@ filter p t) in if p x then lazy (Cons (x,t)) else t let rec iota x : int ll = lazy (Cons (x,iota (x + 1))) let primes = let rec filter_prime = function | lazy (Cons (p,xs)) -> lazy (Cons (p,filter_prime @@ filter (fun x -> x mod p <> 0) xs)) in filter_prime @@ iota 2Haskell however looks quite more attractive in comparison. It is not just the clutter of
forcethat spoils. It is the line
lazy (Lazy.force @@ filter p t). The simple
filter p tin its place would have type-checked as well. Guess how that code would have worked though. Thus not only the programmers have to use
force; they have to know where to use them.
One may wonder if the burden of correctly placing
is the inherent defect of strict languages, or one can do better. And
what a better example to test this than the most elegant and perhaps
the most sophisticated use of lazy evaluation: power streams. Such was
the challenge posed by Kim-Ee Yeoh: write Doug McIlroy's power serious
one-liners without thunks and lazy annotations.
Doug McIlroy started his ``Music of streams'' paper with the
J.J.Rousseau's quote. Uncannily, the quote talks about our challenge:
play power streams without the performer guessing their implementation
-- without even being aware if he is working in a
strict or lazy language.
Our subject is hence the infinite power series
a0 + a1*x + a2*x^2 + ... + an*x^n + ..., or, in Horner rule,
a0 + x*(a1 + x*(a2 + ...)). It is represented as the non-ending
stream of its coefficients
[a0, a1, ... an, ...].
The series is always infinite, although it may
have a zero tail. McIlroy later describes the extension
to represent polynomials finitely, which is out of scope for this article.
The table below contrasts McIlroy's famous one-liners for power
stream manipulation, written in Haskell and OCaml. The Haskell code
comes verbatim from McIlroy's web page, which also explains the underlying
mathematics. In his convention, a series variable has
t when it is a tail. The OCaml code is written with
the library to be introduced later; its functions are
qualified with the module name
I. We show the inferred signatures in
the comments. When comparing code one has to keep in mind that
Haskell and OCaml differ in many respects, not just the evaluation
strategy. OCaml is in general a little bit more verbose and ungainly
at places: its syntax had to be compatible with the original (heavy)
Caml, which in turn inherited the syntax from ML of late 70's, limited
by the parsing technology of the day. Recursive definitions have to be
marked explicitly. OCaml also does not have overloading: That is why
we see not just
+ but also
+. (for float addition) and
addition of infinite series).
|Coerce scalar to series||
series f = f : repeat 0
let series f = I.cons f I.repeat 0. (* val series : float -> float I.t *)
|Promote integer constants||
fromInteger c = series(fromInteger c)
let int x = series @@ float_of_int x (* val int : int -> float I.t *)
negate (f:ft) = -f : -ft
let rec negate fs = I.decon fs @@ fun f ft -> I.cons (-. f) negate ft
negate = map negate
let negate = I.map (fun x -> -. x)
(f:ft) + (g:gt) = f+g : ft+gt
let rec (+%) fs gs = I.decon2 fs gs @@ fun f ft g gt -> I.cons (f +. g) (fun (ft,gt) -> ft +% gt) (ft,gt)
(+) = zipWith (+)
let (+%) = I.zip_with (+.)
(f:ft) * gs@(g:gt) = f*g : ft*gs + series(f)*gt
let rec ( *% ) fs gs = I.decon2 fs gs @@ fun f ft g gt -> I.cons (f *. g) (fun gt -> ft *% gs +% series f *% gt) gt
-- Tying-the-knot trick (f:ft) / (g:gt) = qs where qs = f/g : series(1/g)*(ft-qs*gt)
let ( /% ) fs gs = I.decon2 fs gs @@ fun f ft g gt -> I.fix @@ I.cons (f /. g) (fun qs -> series (1. /. g) *% (ft -% qs *% gt))
(f:ft) # gs@(0:gt) = f : gt*(ft#gs)
(* Since # is reserved in OCaml, we use %% *) let rec ( %% ) fs gs = I.decon2 fs gs @@ fun f ft 0. -> I.cons f (fun gt -> gt *% (ft %% gs))
|Reversion (compositional inverse)||
-- Tying-the-knot trick revert (0:ft) = rs where rs = 0 : 1/(ft#rs)
let revert fs = I.decon fs @@ fun 0. ft -> I.fix @@ I.cons 0. @@ fun rs -> int 1 /% (ft %% rs) (* val revert : float I.t -> float I.t *)
-- integral from 0 to x int fs = 0 : zipWith (/) fs [1..]
let integ = I.cons 0. (fun fs -> I.zip_with (/.) fs (iota 1.)) (* val integ : float I.t -> float I.t *)
-- type (Num a,Enum a)=>[a]->[a] diff (_:ft) = zipWith (*) ft [1..]
let diff fs = I.decon fs @@ fun _ ft -> I.zip_with ( *. ) ft (iota 1.) (* val diff : float I.t -> float I.t *)
tans = revert(int(1/(1:0:1)))
let tans = revert @@ integ (int 1 /% from_list [1.;0.;1.]) (* val tans : float I.t *)
|Sine ands cosine||
-- (Implicit) Mutual recursion sins = int coss coss = 1 - int sins
let (sins,coss) = I.fix2 (fun (s,c) -> (integ c, int 1 -% integ s))
test4 = tans - sins/coss
let test4 = tans -% sins /% coss
The series for
tan x is in terms of the series for its functional
arctan x = ∫dx/(1+x^2). The final test (Music of Streams
paper, Example 4) exercises all facilities, and is demanding. It seems
to run a bit faster in the OCaml byte-code interpreter than in GHCi.
As promised, there are no thunks or lazy annotations.
The code relies on the library of power series combinators of the following interface:
module type INFSER = sig type 'a t val decon : 'a t -> ('a -> 'a t -> 'w t) -> 'w t (* deconstructor *) (* Another deconstructor, although technically unnecessary, but convenient *) val decon2 : 'a t -> 'b t -> ('a -> 'a t -> 'b -> 'b t -> 'w t) -> 'w t val cons : 'a -> ('b -> 'a t) -> 'b -> 'a t (* constructor *) val repeat : 'a -> 'a t (* same elems *) val map : ('a -> 'b) -> ('a t -> 'b t) val zip_with : ('a -> 'b -> 'c) -> ('a t -> 'b t -> 'c t) val take : int -> 'a t -> 'a list val fix : ('a t -> 'a t) -> 'a t val fix2 : ('a t * 'b t -> 'a t * 'b t) -> 'a t * 'b t end
Using the interface is straightforward, although its design is not.
The case in point is
cons, with an unexpected type. Another
surprise is the deconstructor
decon, whose result must be a power
series. A function that deconstructs and consumes an infinite series
must itself be producing an infinite series. INFSER lets us re-implement
let rec filter : ('a -> bool) -> 'a I.t -> 'b I.t = fun p l -> I.decon l (fun x t -> let t = filter p t in if p x then I.cons x (fun x -> x) t else t)Now we can write
filter p tfor the recursive invocation, without ill-effects.
Again, INFSER lets us write power serious without the clutter of
force, thunks and other suspensions that had to be correctly
placed by the user. It almost seems that the programmer does not have
to be aware of the default evaluation strategy. Yet the recursion
gives the strictness away. Although we can write recursive
definitions just like in Haskell (see, e.g., the code for addition and
multiplication), and although the explicit fixpoint for
tying the knot may occur with either strategy (cf.
mfix in Haskell)
and should probably be
encouraged given its subtlety, recursive series definitions seem to
be possible only in a lazy language. As Doug McIlroy pointed out in
the discussion thread, the exponential series can be defined in Haskell
exps = 1 + integral expsThe similar definition in OCaml
let rec exps = int 1 +% integ expsis not accepted: ``This kind of expression is not allowed as right-hand side of let rec''. One is forced to use the explicit fixpoint:
let exps = I.fix @@ fun exps -> int 1 +% integ exps
Again one wonders how deep is this problem and if something can be done,
in a hypothetical strict language -- or even in OCaml. Recall,
a recursive definition
let rec f = e may be treated as a syntax sugar for
let f = fix (fun f -> e). Such a `macro-expansion' is valid in any language,
strict or lazy. If only we could sneak-in our own
fix -- based
on the inferred type of the expression or in some other way... OCaml attributes
do give us `the other way'. With the appropriate PPX extension, one can write
let[@stream I] rec exps = int 1 +% integ expsand having it expand to the earlier
I.fixcode. Again, such an expansion is valid regardless of the evaluation strategy. Thus elegant recursive power-series definitions can in principle be written in a strict language, or even in OCaml with some extensions.
Speaking of elegance, I can't help but quote the paragraph from Doug McIlroy's web page:
``Extensions to handle polynomials make a practical package, doubled in size, not as pretty, but much faster and capable of feats like pascal. To see the dramatic speedup, try a bigger test like take 20 tans.''
Indeed, ``not as pretty, but much faster''.
In the upshot, well-chosen combinators are far more important than strict/lazy evaluation differences. A language expressive enough to support mini-languages (embedded DSLs) lets us deal with infinite series without much trouble or clutter. Strict evaluation is not an obstacle: if we can define a DSL, we can make it follow any evaluation strategy we want. Tom Ellis summarized it well: ``we can have access to laziness within strict languages with no less elegance than we have access to IO in pure languages.''
I thank Tom Ellis for the productive discussion and Kim-Ee Yeoh for posing such an interesting and well-defined problem.
M.D. McIlroy: The music of streams.
Information Processing Letters 77 (2001) 189-195.
< http://www.cs.dartmouth.edu/~doug/music.ps.gz >
The complete OCaml code for the article, with more examples and tests
The discussion thread on Haskell-Cafe, 25 December 2015 and January 2016
The Eff language is an OCaml-like language centered on algebraic effects, designed to try out algebraic effects on a larger scale and gain practical experience using them. It is currently implemented as an interpreter, with a compiler to OCaml in the works.
Rather than compile Eff to OCaml, we embed it. After all, save for algebraic effects, Eff truly is a subset of OCaml. The effect-specific parts are translated to the OCaml code that uses the library of delimited control delimcc or the new effects of the OCaml-effects branch. The translation is local and straightforward. We may almost cut-and-paste Eff code into OCaml, with simple adjustments. We thus present a set of OCaml idioms for effectful programming with the almost exact look-and-feel of Eff.
We idiomatically support even `dynamic effects' of Eff 3.1 with their attendant polymorphism, offering a more general view: on our translation, the dynamic creation of effects is but another effect, with no special syntax or semantics.
Joint work with KC Sivaramakrishnan.
OCaml code accompanying the paper
The code relies on the delimcc library of delimited control.
The code for the quuens benchmark
The code relies on the delimcc library of delimited control.
The Eff language
< http://www.eff-lang.org/ >
One may think that making captured continuations persistent is
trivial: after all, OCaml already supports marshaling of values
including closures. If one actually tries to marshal a captured
delimited continuation, one quickly discovers that the naive
marshaling fails with the error on attempting to serialize an abstract
data type. One may even discover that the troublesome abstract data
_chan. The captured delimited continuation (a
piece of stack along with administrative data) refers to heap data
structures created by delimcc and other OCaml libraries; some of these
data structures are closures, which contain module's environment and
may refer to standard OCaml functions like
function is a closure over the channel
stderr, which is
non-serializable. This points out the first problem: if we serialize
all the data reachable from the captured continuation, we may end up
marshaling a large part of the heap and the global environment. This
is not only inefficient but also lethal, as we are liable to
encounter channels and other non-serializable data structures.
There is a more serious problem however. If we serialize all data reachable from the captured delimited continuation, we also serialize two pieces of global state used by the delimcc library itself. When the stored continuation is deserialized, a fresh copy of these global data is created, referenced from within the restored continuation. Thus the whole program will have two copies of delimcc global data: one for use in the main program and one for use by the deserialized continuation. Although such an isolation may be desirable in many cases, it is precisely wrong in our case: the captured and the host continuations do not have the common view of the system and cannot work together. It may be instructive to contemplate process checkpointing offered by some operating systems (see also `undump' typically used by Emacs and TeX). When checkpointing a process, we wish to save the continuation of the process only (rather than the continuation of the scheduler that created the process, and the rest of the OS continuation). We also wish to save data associated with the process, for example, the process control block and the description of allocated memory and other resources. Control blocks of all processes are typically linked in; when saving the control block of one process, we definitely do not wish to save everything that is reachable from it. When saving the state of a process in a checkpoint, we do not usually save the state of the file system -- or even of all files used by the process. First of all, that is impractical. Mainly, it is sometimes wrong. For example, a process might write to a log file, e.g., syslog. We specifically do not wish to save the contents of the syslog along with the process image. We want the restored process append to the system log rather than replace it!
Of course resuming a suspended process after modifying its input files may also be wrong. It is a hard question of what should be saved by value and what should be saved by reference only. It is clear however that both mechanisms are needed. The serialization code of the delimcc library does offer both mechanisms. The inspiration comes from the fact that OCaml's own marshaling function, when handling closures, serializes OCaml code by reference. The delimcc library extends this approach to data. The library supports the registration of data (which currently must be closures in the old heap) in a global array. When serializing a continuation, the library traverses it and replaces all references to registered closures with indices in the global array; we then invoke OCaml's own serialization routine to marshal the result. After that, we undo the replacement of closures with indices. Such value mangling is not without precedent: to detect sharing, OCaml's own marshaling routine too mangles the input value. The use of the global array is akin to the implementation of cross-staged persistence in MetaOCaml.
Persistent delimited continuations for CGI programming with nested transactions
The salient application of persistent delimited continuations is the library for writing CGI scripts as if they were interactive console applications using
printf. The above library implements the minimal CGI programming framework with form validation. The library also supports nested transactions. The captured continuations are relatively compact: the essentially empty captured continuation takes 491 bytes when serialized. Serialized continuations of the unoptimized blog application have the typical size of 10K (depending on the size of the posts); bzip can compress them to one third of the original size.
Dynamic binding is implemented as a regular library, dependent
on the delimited continuations library. No changes to the OCaml system
and no code transformations are required; (parts of the) code that do
not use dynamic variables incur no overhead and run at the same speed
as before. Our dynamic variables are mutable; mutations however are
visible only within the scope of the
dlet statement where
they occurred. It is also possible to obtain not only the latest
binding to the dynamic variable, but also any of the shadowed
Because dynamic binding is implemented in terms of delimited continuations, the two features harmoniously interact. We can use dynamic variables in shift-based, cooperative threads, and support partial inheritance of the dynamic environment, with both shared and thread-private (mutable) dynamic variables.
caml-liston Mon, 10 Apr 2006 18:25:29 -0700
The library interface
The library implementation
The test code. The example at the end of that file demonstrates the partial inheritance of the dynamic environment among the parent and two cooperatively run threads.
Delimited Dynamic Binding
The ICFP 2006 paper justifying the implementation
Luc Moreau: A Syntactic Theory of Dynamic Binding. Higher-Order and Symbolic Computation, 11, 233-279 (1998)
Printing the outline of a pruned tree, using the extension to obtain shadowed dynamic bindings.
Delimited continuations in OCaml: required dependency
We show a conservative and elementary implementation of resumable exceptions in OCaml: the implementation is a self-contained 100% pure OCaml library; makes no changes to the OCaml system; supports the existing style of defining exceptions; is compatible with the ordinary exceptions; works in byte- or natively-compiled code; uses the most basic facilities of ML and so can easily be translated to SML.
We impose no extra restrictions on the resumable exception
raising and handling forms. Like with ordinary exceptions, resumable
ones may encapsulate values of arbitrary types; the same exception
handler may process exceptions of many types -- and send resumption
replies of many types. The raise form may appear within
the guarded code at many places; different raise forms
may resume with the values of different types. Furthermore, resumable
exceptions are declared just like the ordinary ones,
exception keyword. If
the resumable exception handler never resumes, resumable exceptions
act and feel precisely as the regular ones.
caml-liston Wed, 14 Jun 2006 15:54:03 -0700. This file also includes follow-ups, discussing syntactic sugar and the ways to implement resumable exceptions in a multi-threaded system.
Complete implementation, interface documentation, explanation, and an illustrative example.
Rainer Joswig's message with several examples of usefulness of resumable exceptions. It is posted on `Lambda the Ultimate' on June 15, 2006.
< http://lambda-the-ultimate.org/node/1544#comment-18632 >
After the preparation step or the alternative preparation step
described below, we enter or
#use, in the toplevel, the
print-toplevel-bindings.ml. We define a few sample
# let x = 1;; # let x = 2;; # let y = 10;;After that, evaluating
# print_bindings Format.std_formatter (get_value_bindings (!Toploop.toplevel_env));;will print all the top-level bindings defined since the start of the toplevel session:
binding: get_value_bindings/79 val get_value_bindings : Env.t -> (Ident.t * Types.value_description) list binding: print_bindings/107 val print_bindings : Format.formatter -> (Ident.t * Types.value_description) list -> unit binding: type_to_str/177 val type_to_str : Types.type_expr -> string binding: print_int_toplevel/179 val print_int_toplevel : Format.formatter -> (Ident.t * Types.value_description) list -> unit binding: x/186 val x : int binding: x/187 val x : int binding: y/188 val y : int DoneFor each binding, its name and type are printed. We see that the type environment keeps track of all the previous definitions of a name. Because
xwas defined twice, there are two entries in the type environment:
x/187. The counters are the timestamp.
We can also print the values associated with the bindings of
one particular type, for example,
# print_int_toplevel Format.std_formatter (get_value_bindings (!Toploop.toplevel_env));;which gives the following output:
binding: x/186 value: 2 binding: x/187 value: 2 binding: y/188 value: 10 DoneIf a variable is defined several times, the top-level value environment keeps the last associated value, however. The function
print_int_toplevelcannot, generally, be polymorphic -- unless we are willing to assume responsibility that our type representation string matches the desired type -- or we are willing to use MetaOCaml.
This step requires the OCaml installation directory with the object files left after the building of the toplevel. First, we need to retrieve
and adjust the paths in the
#directory directives to point to our OCaml installation directory. We start the OCaml toplevel and execute all
#directory and the
#load directives in that file up to, but not including the loading of
genprintval.cmo . Please do not change the order of the load directives! It took about half an hour to find the right order....
Alternative preparation step
On the discussion thread, Jonathan Roewen suggested an alternative. It requires rebuilding of the toplevel; the OCaml distribution is no longer needed then. We need to skip the expunge step after the toplevel is built: grep for
expunge in the base
Makefile. That step erases the mentioning of many internal components from toplevel's module dictionary. Therefore, these OCaml system modules act as if they are not loaded. We need these module for the present application however.
The implementation file. It was posted on the above discussion thread on Tue, 26 Sep 2006 01:01:20 -0700 (PDT)
amboperator, first introduced by John McCarthy and well described by Dorai Sitaram in the context of Scheme, takes zero or more expressions (thunks) and nondeterministically returns the value of one of them. This implies that at least one of
amb's expressions must yield a value, that is, does not fail. If
ambhas no expressions to evaluate or all of them fail,
ambitself fails. One may think that
ambis easily implementable by taking a list of thunks and evaluating the thunks in some order within the
tryblock. The value of the thunk finishing without raising an exception is returned. However, that simple implementation is wrong. It is not enough that
amb's chosen expression itself evaluates successfully. The chosen expression must be such that its value causes the whole program finish without errors, if at all possible. The
amboperator must `anticipate' how the value of the chosen expression will be used in the rest of the computation. Therefore, amb is called an angelic nondeterministic choice operator.
Andrej Bauer gave the following example on the discussion thread:
if (amb [(fun _ -> false); (fun _ -> true)]) then 7 else failwith "failure"This program, he explained, should return 7: ``the
ambinside the conditional should "know" (be told by an angel) that the right choice is the second element of the list because it leads to 7, whereas choosing the first one leads to failure.''
Therefore, we need the ability to examine (or speculatively
execute) the rest of the computation. In Scheme,
implementable in terms of
call/cc, as well explained by
Dorai Sitaram. OCaml has more appropriate delimited control operators,
amb in two lines of code. We also need a `toplevel
function', to tell us if the overall computation succeeded. One may
think of it as St. Peter at the gate. For now, we take a computation
that raises no exception as successful. In general, even
non-termination within a branch can be dealt with
intelligently (cf. `cooperative' threading which must yield from time to time).
Andrej Bauer's test now looks in full as
let test1 () = let v = if (amb [(fun _ -> false); (fun _ -> true)]) then 7 else failwith "Sinner!" in Printf.printf "the result: %d\n" v;; let test1r = toplevel test1;; (* "the result: 7" *)Speculatively evaluating amb's expressions or the rest of the computation may incur effects, such as mutation or IO. We can deal with them using one of the standard transaction implementation techniques: prohibit effects, log the updates, log the state at the beginning to roll back to, use zipper for functional `mutations'.
Here is a more advanced test, requiring a three-step-ahead clairvoyance
let numbers = List.map (fun n -> (fun () -> n)) [1;2;3;4;5];; let pyth () = let (v1,v2,v3) = let i = amb numbers in let j = amb numbers in let k = amb numbers in if i*i + j*j = k*k then (i,j,k) else failwith "too bad" in Printf.printf "the result: (%d,%d,%d)\n" v1 v2 v3;; let pythr = toplevel pyth;; (* the result: (3,4,5) *)In monadic terms,
ambis equivalent to
MonadPlus. Even though we are interested in the first result of the entire
MonadPluscomputation, along the way we have to keep track of many possible worlds. That is, we need something like a
Listmonad rather than a
Maybemonad (the latter should not even be regarded as
Dorai Sitaram: Teach Yourself Scheme in Fixnum Days. 1998-2004. Chapter 14. Nondeterminism.
< http://www.ccs.neu.edu/home/dorai/t-y-scheme/t-y-scheme-Z-H-16.html#node_chap_14 >
Commented OCaml implementation
It was posted on the above discussion thread on Sat, 10 Feb 2007 03:15:56 -0800 (PST)
Zipper File system and transactional storage
fun 'a embed () = let exception E of 'a fun project (e: t): 'a option = ...At first sight, that SML feature seems impossible in OCaml (prior to OCaml 3.12). Although we can declare local exceptions in OCaml via local
structures, before OCaml 3.12 we could not use type variables quantified outside the structure: a structure limits the scope of all of its type variables. We show that local globally-quantified exceptions are macro-expressible in all versions of OCaml and demonstrate two translations. The first one relies on multi-prompt delimited continuations, whose implementation leads to the second translation. The latter represents a polymorphic exception mere by a parameter-less exception and one reference cell.
The caml-list thread referenced below gave a good motivation
for locally polymorphic exceptions: writing an efficient library
fold_file of the following interface:
module type FoldFile = sig val fold_file : in_channel -> (* file *) (in_channel -> 'a) -> (* read_func *) ('a -> 'b -> 'b) -> (* combiner *) 'b -> (* seed *) 'b end;;We can use this general folding over a file to, for example, count the number of lines in a file:
module TestFold(F:FoldFile) = struct let line_count filename = (* string->int *) let f = open_in filename in let counter _ count = count + 1 in F.fold_file f input_line counter 0 let test = line_count "/etc/motd" end;;The following tentative implementation has been outlined on ocaml-list:
module Attempt0 = struct exception Done of 'a let fold_file file read_func elem_func seed = let rec loop prev_val = let input = try read_func file with End_of_file -> raise (Done prev_val) in let combined_val = elem_func input prev_val in loop combined_val in try loop seed with Done x -> x end;;The
loopis properly tail-recursive (NB: the body of a
tryblock is not in a tail position) and avoids any administrative data structures. Alas, the typechecker does not accept the exception declaration, which says that
Doneshould carry a value of all types. There is no such value in OCaml, and if it were, it wouldn't be useful. That was not our intention anyway: we want the value of
Doneto have the same type as the result of the polymorphic function
fold_file. We should have declared the exception inside
fold_file. Surprisingly, that can be done:
Delimcc.promptis precisely this type of `local exceptions'. We need only a slight and local adjustment to the above code to make it compile. This is our first translation.
open Delimcc let abort p v = take_subcont p (fun sk () -> v);; module AttemptA : FoldFile = struct let fold_file file read_func elem_func seed = let result = new_prompt () in (* here is our local exn declaration *) let rec loop prev_val = let input = try read_func file with End_of_file -> abort result prev_val in let combined_val = elem_func input prev_val in loop combined_val in push_prompt result (fun () -> loop seed) end;; let module TestA = TestFold(AttemptA) in TestA.test;; (* - : int = 24 *)
The analogy between exceptions and delimited continuations is profound: local exceptions are commonly used to implement multi-prompt delimited continuations in SML. We see the converse is also true. Furthermore, delimited continuations in OCaml are implemented in terms of exceptions. Abort is essentially raise. If we `inline' the gist of the delimited continuation library we arrive at our second translation. The result requires no libraries and works with both byte-code and native compiler.
module AttemptR : FoldFile = struct exception Done let fold_file file read_func elem_func seed = let result = ref None in (* here is our local exn declaration *) let rec loop prev_val = let input = try read_func file with End_of_file -> result := Some prev_val; raise Done in let combined_val = elem_func input prev_val in loop combined_val in try loop seed with Done -> (match !result with Some x -> x | _ -> failwith "impossible!") end;; let module TestR = TestFold(AttemptR) in TestR.test;; (* - : int = 24 *)The code is still properly tail-recursive and deforested. In contrast to other imperative implementations of fold_file, ours is almost pure: the reference cell
resultis written to and immediately after read from only once during the whole folding -- namely, at the very end. The bulk of the iteration is functional. A mutable cell is the trick behind typing of multi-prompt delimited continuations. One may even say that if a type system supports reference cells, it shall support multi-prompt delimited continuations -- and vice versa.
Caml-list discussion thread Locally-polymorphic exceptions October 3-4, 2007.
Of special note is the PML implementation posted by Christophe Raffalli.
The MLton team. UniversalType Source for the SML example of local exceptions
< http://mlton.org/UniversalType >
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML