
Staging, Program Generation, and Meta-Programming



Tagless-Staged: Combinators for Impure yet Hygienic Code Generation

We present a combinator library to write (Haskell) programs that generate code in a subset of Haskell, OCaml, or other target language. The library embeds an extensible typed higher-order domain-specific language. It statically assures not only that all generated code is well-formed and well-typed but also that all generated variables are bound as expected. Such assurances are crucial for code generators to be written by domain experts rather than compiler writers; after all, the most profitable optimizations are domain-specific ones. These static guarantees are maintained in the presence of arbitrary effects, including IO. Effects are indispensable in code generators, to report failures, to generate alternatives to choose the fastest from, to introduce auxiliary variables, to interchange loops, etc.

Our framework is not an experimental language with a new syntax and an uncertain future. Rather, it is a regular, not bleeding-edge, library in mature Haskell. It is modular and extensible: one can add more features and constants to the target language, or change it to be typed or untyped, first- or higher-order. Polymorphism over generated environments makes generators composable. The higher-order abstract syntax used in the library makes target functions and variables human-readable. Our library may be regarded as `staged Haskell'.

There are many code generation frameworks with one or another of these features. Ours has all of them. Template Haskell, among other systems, falls short, assuring -- dynamically rather than statically -- the well-typedness of only the final code. Furthermore, although the generated code, after a type-check, is well-typed, it may still be ill-scoped.

The static types of our generator expressions not only ensure that a well-typed generator produces well-typed and well-scoped code. They also express the lexical scopes of generated binders and prevent mixing up variables with different scopes. For the first time we demonstrate, in an embedded domain-specific language, statically safe and well-scoped loop interchange and constant factoring from arbitrarily nested loops.

Joint work with Yukiyoshi Kameyama and Chung-chieh Shan.

The current version is October 2015.
beyond.pdf [412K]
Combinators for Impure yet Hygienic Code Generation
Science of Computer Programming, Volume 112, Part 2, 15 November 2015, pp. 120–144. doi:10.1016/j.scico.2015.08.007

beyond-talk.pdf [193K]
The annotated slides of the talk presented at PEPM 14 on January 20, 2014 in San Diego, CA, USA.

Region Memory Management for Free Variables: Type- and Scope-Safe Code Generation with Mutable Cells
The formalization of one part of the combinator library, as the calculus <NJ>

TSCore.hs [29K]
The core of the code generation library

TSCPST.hs [17K]
Code generation with control effects reaching beyond the closest binder. Far-reaching let-insertion. The implementation of the CPSA applicative hierarchy.

Examples and Sample code


MetaScheme, or untyped MetaOCaml

We implement the four MetaOCaml special forms -- bracket, escape, cross-stage persistence (aka `lift'), and run -- in R5RS Scheme. A Scheme system thus becomes untyped MetaOCaml, in which we can write and run code with an arbitrary number of stages. We point out how distinct the staging annotations are from Scheme's quasiquotation, despite superficial similarity. Our implementation of cross-stage persistent (CSP) values closely corresponds to the one in MetaOCaml and so helps explain the latter.

Here is a sample of MetaOCaml code and its Scheme translation:

     # let eta = fun f -> .<fun x -> .~(f .<x>.)>.
       in .<fun x -> .~(eta (fun y -> .<x + .~y>.))>.
     - : ('a, int -> int -> int) code = .<fun x_1 -> fun x_2 -> (x_1 + x_2)>.
      (let ((eta (lambda (f) (bracket (lambda (x) (escape (f (bracket x))))))))
       (bracket (lambda (x) (escape (eta (lambda (y) (bracket (+ x (escape y)))))))))
Translating MetaOCaml code into Scheme seems trivial: code values are like S-expressions, MetaOCaml's bracket is like quasiquote, escape is like unquote, and `run' is eval. If we indeed replace bracket with the quasiquote and escape with the unquote in the above Scheme code and then evaluate it, we get the S-expression (lambda (x) (lambda (x) (+ x x))) , which is a wrong result, quite different from the code expression .<fun x_1 -> fun x_2 -> (x_1 + x_2)>. produced by MetaOCaml. The latter is the code of a curried function that adds two integers. The S-expression produced by the naive Scheme translation represents a curried function that disregards the first argument and doubles the second. This demonstrates that the often mentioned `similarity' between the bracket and the quasiquote is flawed. MetaOCaml's bracket respects alpha-equivalence; in contrast, Scheme's quasiquotation, being a general form for constructing arbitrary S-expressions (not necessarily representing any code), is oblivious to the binding structure.

Our implementation still uses S-expressions for code values (so we can print them and eval-uate them) and treats escape as unquote. To maintain the hygiene, we need to make sure that every run-time evaluation of a bracket form such as (bracket (lambda (x) x)) gives '(lambda (x_new) x_new) with the unique bound variable x_new . Two examples in the source code demonstrate why static renaming of manifestly bound identifiers is not sufficient. We implement the very clever suggestion by Chung-chieh Shan and represent a staged expression such as .<(fun x -> x + 1) 3>. by the sexp-generating expression `(,(let ((x (gensym))) `(lambda (,x) (+ ,x 1))) 3) which evaluates to the S-expression ((lambda (x_1) (+ x_1 1)) 3) . Thus bracket is a complex macro that transforms its body to the sexp-generating expression, keeping track of the levels of brackets and escapes. The macro bracket is implemented as a CEK machine with the defunctionalized continuation. In our implementation, the Scheme translation of the eta-example yields the S-expression (lambda (g6) (lambda (g8) (+ g6 g8))) . Just as in MetaOCaml, the result represents the curried addition.
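The contrast between quasiquote-style splicing and the gensym-based, sexp-generating technique can be modeled outside Scheme. The following sketch is in Python rather than Scheme, with hypothetical names (gensym, naive_lam, hygienic_lam) standing in for the machinery described above; it is an illustration, not the actual implementation.

```python
import itertools

_counter = itertools.count(1)

def gensym(base):
    # fresh variable name, playing the role of Scheme's gensym
    return f"{base}_{next(_counter)}"

# Naive quasiquote-style splicing: the body code is pasted in verbatim,
# so an inner binder can capture an outer variable of the same name.
def naive_lam(var, body):
    return f"(lambda ({var}) {body})"

# Hygienic bracket-style binder: rename the bound variable to a fresh
# name before generating the body, mimicking the sexp-generating
# expression `(,(let ((x (gensym))) `(lambda (,x) ...))).
def hygienic_lam(var, body_fn):
    fresh = gensym(var)
    return f"(lambda ({fresh}) {body_fn(fresh)})"

# The eta example: the inner lambda also binds x, but its body is meant
# to refer to the *outer* x.
naive = naive_lam("x", naive_lam("x", "(+ x x)"))
# capture: both occurrences now refer to the inner binder

hygienic = hygienic_lam("x", lambda outer:
             hygienic_lam("x", lambda inner: f"(+ {outer} {inner})"))
# distinct fresh names: no capture, as in MetaOCaml's output
```

The naive version yields (lambda (x) (lambda (x) (+ x x))), the doubling function; the hygienic version yields the curried addition, just as the macro-based bracket does.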

CSP poses another implementation problem. In MetaScheme we can write the MetaOCaml expression .<fun x -> x + 1>. as (bracket (lambda (x) (+ x 1))) , which yields an S-expression (lambda (g1) (+ g1 1)) . When we pass this code to eval, the identifier + will be looked up in the environment of eval, which is generally different from the environment that was in effect when the original bracket form was evaluated. That might not look like much of a difference since + is probably bound to the same procedure for adding numbers in either environment. This is no longer the case if we take the following MetaOCaml code and its putative MetaScheme translation:

     let test = let id u = u in
                   .<fun x -> id x>.
     (define test (let ((id (lambda (u) u)))
                    (bracket (lambda (x) (id x)))))
The latter definition binds test to the S-expression (lambda (x) (id x)) that contains a free identifier id , unlikely to be bound in the environment of eval. Our code values therefore should be `closures', being able to include values that are (possibly locally) bound in the environment when the code value was created. Incidentally, the problem of code closures closed over the environment of the generator also appears in syntax-case macros. R6RS editors made a choice to prohibit the occurrence of locally-bound identifiers in the output of a syntax-case transformer.

MetaOCaml, among other staged systems, does permit the inclusion of values from the generator stage in the generated code. Such values are called CSP; they are evident in the output of the MetaOCaml interpreter for the above test:

     val test : ('a, 'b -> 'b) code =
     .<fun x_1 -> (((* cross-stage persistent value (as id: id) *)) x_1)>.
In MetaScheme, we have to write CSP explicitly: (% e) , which is often called `lift'.
     (define test
       (let ((id (lambda (u) u)))
         (bracket (lambda (x) ((% id) x)))))
One may think that such a lifting is already available in Scheme: (define (lift x) (list 'quote x)) . Although this works in the simple cases of x being a number, a string, etc., it is neither universal nor portable: attempting this way to lift a closure or an output-port could be an error. According to R5RS, the argument of a quotation is an external representation of a Scheme datum. Closures, for example, are not guaranteed to have an external representation. For portability, we implement CSP via a reference in a global array, taking advantage of the fact that both the index (a number) and the name of the array (an identifier) have external representations and hence are trivially liftable by quotation. This is precisely the mechanism used by the current version of MetaOCaml.
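The registry idea can be sketched in a few lines of Python (names csp_registry and lift are hypothetical, standing in for the global array and the (% e) form): the generated code refers to a generator-stage value by its index in a global array, and the index, being a plain number, is trivially liftable.

```python
# Global registry of cross-stage persistent values.
csp_registry = []

def lift(v):
    # the (% e) form: store v and emit code that fetches it back by index
    csp_registry.append(v)
    return f"csp_registry[{len(csp_registry) - 1}]"

ident = lambda u: u                     # a closure: not quotable as a datum
code = f"lambda x: ({lift(ident)})(x)"  # the translation of 'test'
f = eval(code)                          # 'run': eval finds ident via the registry
```

Evaluating the code value succeeds even though the closure ident has no external representation: only its index appears in the code.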

The source code contains many more examples of the translation of MetaOCaml code into MetaScheme -- including the examples with many stages and CSP.

Joint work with Chung-chieh Shan.

The current version is 1.4, Aug 20, 2008.
meta-scheme.scm [15K]
The complete implementation with many comments. It should work with any R5RS Scheme system. The code has been tested with Scheme 48 1.3, SCM 5e1, and Petite Chez Scheme 6.0a.


Let-insertion without pain or fear or guilt

We describe the so-called let-insertion and its safe and surprisingly convenient implementation as a BER MetaOCaml library. Implementors of domain-specific languages (DSL) embedded in MetaOCaml no longer have to meticulously specify where to place let bindings. Rather, they can program an automatic insertion at the `right' and assuredly safe place. The end-users of the DSL do not have to worry about let-bindings or even know about them. This safe and modular let-insertion has become possible in MetaOCaml for the first time.

Program generators, as programs in general, greatly benefit from compositionality: the meaning of a complex expression is determined by its structure and the meanings of its constituents. Building code fragments using MetaOCaml's brackets and escapes is compositional. For example, in

     let sqr: int code -> int code = fun e -> .<.~e * .~e>.
the meaning of sqr e -- that is, the code it generates -- is the multiplication of two copies of the expression generated by e. We are sure of that even if we know nothing of e except that it is pure. Likewise, in
     let make_incr_fun : (int code -> int code) -> (int -> int) code = fun body ->
     .<fun x -> x + .~(body .<x>.)>.
     let test1 = make_incr_fun (fun x -> sqr .<2+3>.)
      (* .<fun x_24  -> x_24 + ((2 + 3) * (2 + 3))>. *)
make_incr_fun body should produce the code for an OCaml function. The result, shown in the comments, confirms our expectations.

Compositionality lets us think about programs in a modular way, helping make sure the program result is the one we had in mind. We ought to strive for compositional programs and libraries. The trick however is selecting the `best' meaning for an expression. Earlier we took as the meaning of a code generator the exact form of the expression it produces. Under this meaning, compositionality requires the absence of side-effects. Our test1 is pure and so its result is easy to predict: it has to be the code of a function whose body contains two copies of the expression 2+3. Although the confidence in the result is commendable, the result itself is not. Duplicate expressions make the code large and inefficient: imagine something complex in 2+3's place. Furthermore, 2+3 in test1 does not depend on x and can be lifted out of the function's body -- so that it can be computed once rather than on each application of the function. In short, we would like the result of test1 to look like

     let test_desired = .<let t = 2+3 in fun x_24  -> x_24 + (t * t)>.

One may remark that a sufficiently smart compiler should be able to transform the code produced by test1 into test_desired, performing common-subexpression elimination and invariant code motion. Although these optimizations may be expected for simple code like test1, an expression more complex than 2+3 will thwart them. First of all, it is harder to see the commonality of two large expressions. Second, the two optimizations are sound only if the expression to move and share is pure. If it calls external functions, the compiler is unlikely to be able to check that those functions are pure. Finally, if 2+3 were replaced with read_config "foo" where read_config does IO, the compiler must not de-duplicate or move such an expression. The DSL designer, however, may know that read_config, albeit effectful, is safe to move and eliminate: its side effect, reading a file, is irrelevant to the end users of the program. Therefore, the DSL designer needs a way to perform optimizations like duplicate elimination and invariant code motion explicitly, without relying on the compiler. After all, the aspiration of staging is to let DSL designers express expert knowledge and use it for domain-specific optimizations -- in effect turning a general-purpose compiler into a domain-specific optimization toolkit.

We change the meaning of code generators: sqr e now means the multiplication of two copies of e, or of the result of evaluating e. The meaning is the generated code with possible let-expressions. Compositionality now permits effectful generators, whose side effect is let-insertion. MetaOCaml could offer the simplest such generators as built-ins. However, adding more built-ins to MetaOCaml -- just like adding more system calls to an OS -- makes the system harder to maintain and its correctness harder to ensure. Generally, what can be efficiently implemented at the `user-level' ought to be done so. Let-insertion can be written as an ordinary library, relying on the delimited control library delimcc.

The first attempt at let-insertion takes only a few lines:

     open Delimcc
     let genlet : 'w code prompt -> 'a code -> 'a code = fun p cde -> 
        shift p (fun k -> .<let t = .~cde in .~(k .<t>.)>.)
     let with_prompt : ('w prompt -> 'w) -> 'w = fun thunk ->
       let p = new_prompt () in
       push_prompt p (fun () -> thunk p)
The function genlet takes a code expression and let-binds it -- at the place marked by with_prompt. The two functions communicate via the so-called prompt. The evaluation sequence for a simple example below demonstrates the process:
     with_prompt (fun p -> sqr (genlet p .<2+3>.))
     --> (* evaluating genlet *)
     with_prompt (fun p -> sqr (
        shift p (fun k -> .<let t = .~(.<2+3>.) in .~(k .<t>.)>.)))
     --> (* evaluating shift, capturing the continuation up to the prompt, binding it to k *)
     with_prompt (fun p ->
      let k = fun hole -> push_prompt p (fun () -> sqr hole) in
        .<let t = .~(.<2+3>.) in .~(k .<t>.)>.)
     --> (* splicing the code .<2+3>. into the bracket *)
     with_prompt (fun p ->
      let k = fun hole -> push_prompt p (fun () -> sqr hole) in
        .<let t = 2+3 in .~(k .<t>.)>.)
     --> (* applying the captured continuation k *)  
     with_prompt (fun p -> 
        .<let t = 2+3 in .~(push_prompt p (fun () -> sqr .<t>.))>.)
     -->* (* evaluating (sqr .<t>.) *)
     with_prompt (fun p -> .<let t = 2+3 in .~(.<t * t>.)>.)
     .<let t = 2+3 in t * t>.

Indeed genlet took .<2+3>. and let-bound it where with_prompt was. The let-binding of 2+3 effectively eliminated the expression duplication, saving us, or the compiler, the trouble of searching for common subexpressions. The binding place can be arbitrarily far away from genlet:

     with_prompt (fun p ->
       make_incr_fun (fun x -> sqr (genlet p .<2+3>.)))
      (* .<let t_17 = 2 + 3 in fun x_16  -> x_16 + (t_17 * t_17)>. *)
thus realizing invariant code motion, moving 2+3 out of the function body. Code motion is desirable, but can also be dangerous:
     with_prompt (fun p ->
       make_incr_fun (fun x -> sqr (genlet p .<.~x+3>.)))
      (* BEFORE N100: .<let t_17 = x_16 + 3 in fun x_16  -> x_16 + (t_17 * t_17)>. *)
      (* BER N101: exception pointing out that in .<.~x+3>. the variable x
         escapes its scope *)
attempting to move the expression that contains x outside of x's binding! Prior to BER N100, this attempt succeeded, generating the shown code that exhibits the so-called scope extrusion. In the current version of BER MetaOCaml, the example type-checks as before. However, its evaluation no longer succeeds. Rather, running the generator throws an exception with an informative message.

We have just seen code generation with control effects: let-insertion is highly desirable and highly dangerous, and in the present MetaOCaml it is finally safe. It is safe in the following sense: if the generator successfully finishes producing the code, the result is well-typed and well-scoped.

Although the naive attempt to program let-insertion works and is now safe, it is not convenient. One has to explicitly mark the let-insertion place. When several places are marked, we or the user have to choose. We want to automate such choices. We would like the result of test1 to be the desired code test_desired, with let-bindings. The operations sqr and make_incr_fun -- which are programmed by the DSL designer -- may be re-defined. However, test1 should be left as it was. It is written by the end user, who should not care or know about let-insertion's taking place, let alone pass prompts around.

Recall that trying to insert let at a wrong place raises an exception. This dynamic exception lets us program genlet to try the insertion at various places -- farther and farther up the call chain -- until we get an exception. The best place to insert let is the one that is farthest from genlet while causing no exceptions. We arrive at the following simplified interface for let-insertion:

     val genlet    : 'a code -> 'a code
     val let_locus : (unit -> 'w code) -> 'w code
The interface no longer mentions any prompts. The function let_locus marks possible places for inserting let; genlet chooses the best place among the marks and let-binds its argument expression there. This let-insertion is safe, and the safety is easy to prove: since the interface is implemented as an ordinary library with no compiler magic, by elaborating the naive implementation, the static guarantees of MetaOCaml continue to hold. In particular, if an attempt is made to generate ill-scoped code, an exception is thrown. By contraposition, if no exception is thrown, the generated code has no scope extrusion. As an example, we re-define the sqr and make_incr_fun primitives for let-insertion. We fully reuse the earlier versions without re-implementing them: sqr let-binds its argument, and make_incr_fun declares that the places before and after the function binder are good places for let-insertion.
     let sqr e = sqr (genlet e)
     let make_incr_fun body = 
       let_locus @@ fun () ->
         make_incr_fun @@ fun x ->
           let_locus @@ fun () -> 
             body x
With these new sqr and make_incr_fun, the very same test1 produces the desired code (shown in comments below):
     let test1 = make_incr_fun (fun x -> sqr .<2+3>.)
        (* .<let t_17 = 2 + 3 in fun x_16  -> x_16 + (t_17 * t_17)>. *)
     let test2 = make_incr_fun (fun x -> sqr .<.~x + 3>.)
        (* .<fun x_18  -> x_18 + (let t_19 = x_18 + 3 in t_19 * t_19)>. *)
The slightly modified test2 also inserts let, at a different, and safe, place.
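A drastically simplified model of the genlet/let_locus interface can be written in a few lines of Python (the names let_locus, genlet, and sqr are reused for illustration only). This toy always binds at the outermost locus and omits the essential part of the real library, the exception-driven search for the farthest safe place; it only shows the bookkeeping of loci and bindings.

```python
import itertools

_c = itertools.count(1)
_loci = []   # stack of open let_locus marks, outermost first

def let_locus(thunk):
    # mark a place where let-bindings may be inserted
    bindings = []
    _loci.append(bindings)
    body = thunk()
    _loci.pop()
    # wrap the produced code in the accumulated let-bindings
    for name, rhs in reversed(bindings):
        body = f"let {name} = {rhs} in {body}"
    return body

def genlet(code):
    # simplification: always bind at the outermost locus; the real
    # library probes for the farthest locus that raises no
    # scope-extrusion exception
    name = f"t_{next(_c)}"
    _loci[0].append((name, code))
    return name

def sqr(e):
    t = genlet(e)            # let-bind the argument, then square it
    return f"{t} * {t}"

out = let_locus(lambda: f"fun x -> x + ({sqr('2+3')})")
```

Here out is "let t_1 = 2+3 in fun x -> x + (t_1 * t_1)": the binding floats out of the function body, as in test1 above.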

We have demonstrated for the first time the self-adjusting, safe and convenient let-insertion with static guarantees. The scope extrusion check meant to crash bad generators surprisingly helps implement good ones -- more convenient than those possible otherwise.

The current version is June 2014.
References [2K]
A simple illustration of let-insertion, its convenience and dangers (the example from the FLOPS 2014 talk)
The example uses the `traditional' implementation of let-insertion, which is not convenient. [6K]
The safe and convenient let-insertion

talk-FLOPS.pdf [196K]
The annotated slides of the talk presented at FLOPS 2014 in Kanazawa, Japan on June 4, 2014. [7K]
Loop-invariant code motion with the convenient let-insertion (the main example in the FLOPS 2014 talk)


Why a program in CPS specializes better

It has become folk knowledge that writing a program in continuation-passing style (CPS) helps a partial evaluator do more operations at specialization time and hence produce a better specialized program. The reason why has not spread as widely. The following example, distilled from the papers by Bondorf and Lawall & Danvy, might help.

The example is the following simple expression.

     	\i -> (if i then 2 else 3) + 4
To partially evaluate it, we first annotate all sub-expressions as static S (known at compile time) or dynamic D (known only at run-time). That is, we perform the `binding-time analysis' (BTA) and add binding-time annotations. For clarity, we show all the steps of the analysis. First, we see no applications to our function and hence do not statically know i:
     	\i_D -> (if i_D then 2 else 3) + 4
The if-test with the dynamic condition is dynamic:
     	\i_D -> (if i_D then 2 else 3)_D + 4
and the addition expression with a dynamic operand is dynamic:
     	\i_D -> ((if i_D then 2 else 3)_D + 4)_D
All sub-expressions are dynamic and hence there are no opportunities for partial evaluation.

Let's re-write the original expression in CPS:

     \i k -> let k' = \v -> k (v+4) in
             if i then k' 2 else k' 3
and do the binding-time analysis again. As before, function arguments, i and k, must be dynamic:
     \i_D k_D -> let k' = \v -> k_D (v+4) in
                 if i_D then k' 2 else k' 3
The variable k' is static since its value, the \v-> ... abstraction, is statically known. Constants are obviously static:
     	\i_D k_D -> let k'_S = \v -> k_D (v+4) in
     	            if i_D then k'_S 2_S else k'_S 3_S
There are now two applications of a statically known argument to a statically known function. We can do them statically, at the specialization, or `compile'-time, obtaining:
     	\i k -> if i then k 6 else k 7
which is certainly an improvement: there are no longer any additions left for the run-time.
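The binding-time improvement can be replayed as a tiny code generator, here sketched in Python with residual code represented as strings (the names direct and cps are invented for this illustration). In direct style the addition is stuck on the dynamic conditional; in CPS the continuation reaches each statically known branch value, so the addition happens at specialization time.

```python
# Direct style: the dynamic 'if' blocks the addition, which must
# therefore remain in the residual program.
def direct(i):
    return f"((2 if {i} else 3) + 4)"

# CPS: the continuation k is applied to each static branch value,
# so k can fold the addition away at specialization time.
def cps(i, k):
    return f"({k(2)} if {i} else {k(3)})"

residual_direct = direct("i")                    # addition left for run-time
residual_cps = cps("i", lambda v: str(v + 4))    # addition done statically
```

The CPS residual program contains only the pre-computed constants 6 and 7, mirroring \i k -> if i then k 6 else k 7 above.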

Bondorf (1992) showed the alternative to writing the source program in CPS. We should instead write the specializer (partial evaluator) in CPS, and skillfully massage the continuations to achieve binding-time improvements. A partial evaluator becomes harder to write, but the source programs may remain in the original, direct style, and still benefit from improved specialization. Lawall and Danvy (1994) showed that delimited control helps keep even the partial evaluator in direct style. Thus a specializer capable of improvements shown in this message becomes only a little more difficult to write.

The current version is January 2012.
Anders Bondorf: Improving Binding Times Without Explicit CPS-Conversion
Conf. Lisp and Functional Programming, 1992
The seminal paper on binding-time improvements.

Julia L. Lawall and Olivier Danvy: Continuation-Based Partial Evaluation
Conf. Lisp and Functional Programming, 1994


Sound and Efficient Language-Integrated Query -- Maintaining the ORDER

[The Abstract of the paper]
As SQL moved from an English-like language for ad hoc queries by business users to its present status as the universal means of relational database access, the lack of abstractions and compositionality in its original design is felt more and more acutely. Recently added subqueries and common table expressions compensate, albeit generally inefficiently. The inadequacies of SQL motivated language-integrated query systems such as (T-)LINQ, which offer an applicative, programming-like query language compiled to efficient SQL.

However, the seemingly straightforward ranking operations ORDER BY and LIMIT are not supported efficiently, consistently or at all in subqueries. The SQL standard defines their behavior only when applied to the whole query. Language-integrated query systems do not support them either: naively extending ranking to subexpressions breaks the distributivity laws of UNION ALL underlying optimizations and compilation.

We present the first compositional semantics of ORDER BY and LIMIT, which reproduces in the limit the standard-prescribed SQL behavior but also applies to arbitrarily composed query expressions and preserves the distributivity laws. We introduce the relational calculus SQUR that includes ordering and subranging and whose normal forms correspond to efficient, portable, subquery-free SQL. Treating these operations as effects, we describe a type-and-effect system for SQUR and prove its soundness. Our denotational semantics leads to the provably correctness-preserving normalization-by-evaluation. An implementation of SQUR thus becomes a sound and efficient language-integrated query system maintaining the ORDER.

The current version is November 2017.
sqr.pdf [372K]
The paper (together with Tatsuya Katsushima) published in the Proc. APLAS 2017, November 27-29, 2017, Suzhou, China. Lecture Notes in Computer Science, v10695, 2017, pp. 364--383

sqr-talk.pdf [334K]
Slides of the talk presented at APLAS 2017, November 29, 2017. Suzhou, China


Finally, Safely-Extensible and Efficient Language-Integrated Query

[The Abstract of the paper]
Language-integrated query is an embedding of database queries into a host language to code queries at a higher level than the all-too-common concatenation of strings of SQL fragments. The eventually produced SQL is ensured to be well-formed and well-typed, and hence free from the embarrassing (security) problems. Language-integrated query takes advantage of the host language's functional and modular abstractions to compose and reuse queries and build query libraries. Furthermore, language-integrated query systems like T-LINQ generate efficient SQL, by applying a number of program transformations to the embedded query. Alas, the set of transformation rules is not designed to be extensible.

We demonstrate a new technique of integrating database queries into a typed functional programming language, so as to write well-typed, composable queries and execute them efficiently on any SQL back-end as well as on an in-memory noSQL store. A distinct feature of our framework is that both the query language and the transformation rules needed to generate efficient SQL are safely user-extensible, to account for many variations in the SQL back-ends, as well as for domain-specific knowledge. The transformation rules are guaranteed to be type-preserving and hygienic by their very construction. They can be built from separately developed and reusable parts and arbitrarily composed into optimization pipelines.

With this technique we have embedded into OCaml a relational query language that supports a very large subset of SQL including grouping and aggregation. Its types cover the complete set of intricate SQL behaviors.

Joint work with Kenichi Suzuki and Yukiyoshi Kameyama.

The current version is January 2016.
Sound and Efficient Language-Integrated Query -- Maintaining the ORDER is the next version, which accomplishes all that T-LINQ does, plus adds ORDER and LIMIT. Whereas this paper relies on the operational semantics and does syntactic normalization-by-rewriting, the new SQUR relies on the denotational semantics and does normalization-by-evaluation.

quel.pdf [425K]
The paper published in the Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM), 2016, St. Petersburg, FL, USA, January 18-19, 2016, pp. 37-48

< >
The source code of the system. It is distributed under the MIT License.


Typed Template Haskell is still not quite sound

Typed Template Haskell was introduced to guarantee that the generated code is well-typed by construction. In particular, splicing the generated code into the program being compiled should never give any type error. Alas, to make Typed Template Haskell more capable and expressive, side effects were allowed during the code generation. As a consequence, one can produce code with unbound variables, as we demonstrate below. Hence Typed Template Haskell is ultimately unsound.

The original Template Haskell (Haskell Workshop, 2002) is essentially untyped, with no guarantees that the generated code is any good, or even well-typed. Writing code generators was akin to programming in `dynamically typed' languages: although we can easily tell that the fully generated program is untypable (because it cannot be compiled), we cannot easily point out which part of the generator is to blame. The dissonance with the type discipline of Haskell is jarring. On the other hand, Template Haskell is very expressive: it generates not only expressions but also definitions, type declarations, classes and instances, etc. -- all sorts of things that may appear in Haskell code. Designing a type system for such broad code generation was, and still is, a great challenge.

Against this background, Typed Template Haskell was introduced. It produces code only for expressions (and not for declarations, etc.). On the other hand, we can finally write typed generators. Typed Template Haskell is quite similar to the lambda-circle calculus or MetaOCaml, but restricted to two levels. Here is a sample typed generator.

     t1 :: Q (TExp Int)
     t1 = do
       c1 <- [|| (1::Int) + 2 ||]
       c2 <- [|| 3 + $$(return c1) ||]
       return c2
Like in MetaOCaml, the type of code values TExp is indexed by the type of the expression to be generated. [|| ... ||] is the typed quotation, corresponding to the brackets .< ... >. of MetaOCaml. The brackets enclose, or quote, the code to generate, which must be a well-typed expression. The unquotation, or typed splice, is $$( ... ). The splice is evaluated and must produce (a possibly open) code, which is then spliced into the enclosing typed quotation. If the splice appears outside a quotation, it is called top-level. In that case, the splice must produce a closed expression, which is inserted into the code being compiled. The example generator produces the following code:
     3 GHC.Num.+ ((1 :: GHC.Types.Int) GHC.Num.+ 2)
or, hiding the module qualifications, 3 + (1 + 2).

The problematic example is very short:

     t2 :: Q (TExp Int)
     t2 = do
       r <- runIO $ newIORef undefined
       c1 <- [|| \x -> (1::Int) +
                          $$(do
                               xv <- [||x||]
                               runIO $ writeIORef r xv
                               return xv) ||]
       runIO $ readIORef r
It generates x_0: just the single, unbound variable. It is clearly not typable. Such code cannot be compiled or spliced-in at the top level. The culprit is runIO, which runs IO computations from within typed splices. It is highly desirable, for error handling and many other reasons. It is also responsible for breaking type soundness.
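The essence of the leak can be modeled without Template Haskell at all; the following Python sketch (with invented names leaked and gen_function) plays the role of the IORef: a mutable cell smuggles the code of a bound variable out of the binder that introduces it, and the escaped fragment is ill-scoped on its own.

```python
leaked = []   # plays the role of the IORef r

def gen_function():
    var = "x_0"                        # the generated binder
    leaked.append(var)                 # the writeIORef step: var escapes
    return f"lambda {var}: 1 + {var}"  # the well-scoped whole function

gen_function()
orphan = leaked[0]       # just "x_0": an open code value past its binder
try:
    eval(orphan)         # 'splicing' it at top level fails
    scope_error = False
except NameError:        # unbound variable, as with the x_0 above
    scope_error = True
```

Within the binder the code is fine; extracted through the cell, it is a free variable that no context can type.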

Once again we see the tension between safety and expressiveness. Without effects, we cannot generate all the code we want and cannot handle errors well. With side effects, maintaining type safety is a challenge. Fortunately, the challenge has recently been met, outside GHC and Typed Template Haskell.

The problem sprang to attention during the lecture by Simon Peyton Jones at the First Metaprogramming Summer School in Cambridge, UK in August 2016 -- when runIO was mentioned.

The current version is August 2016.
THProblem.hs [<1K]
The complete Haskell code illustrating the problem

Tagless-Staged: Combinators for Impure yet Hygienic Code Generation
Type sound code generation with arbitrary side-effects

MetaOCaml resolves the tension between expressiveness and safety in a different, easy to implement way


A monadic approach for avoiding code duplication when staging memoized functions

[The Abstract of the paper]
Building program generators that do not duplicate generated code can be challenging. At the same time, code duplication can easily increase both generation time and runtime of generated programs by an exponential factor. We identify an instance of this problem that can arise when memoized functions are staged. Without addressing this problem, it would be impossible to effectively stage dynamic programming algorithms. Intuitively, direct staging undoes the effect of memoization. To solve this problem once and for all, and for any function that uses memoization, we propose a staged monadic combinator library. Experimental results confirm that the library works as expected. Preliminary results also indicate that the library is useful even when memoization is not used.

The paper introduces the continuation+state monad for code generation with let-insertion, and briefly describes abstract interpretation for generating efficient code without intensional analysis of the generated code. These are the foundational techniques for typeful multi-stage (MetaOCaml) programming.
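The let-insertion idea can be sketched with a plain continuation monad over a toy code type: strings rather than MetaOCaml code values, with the state (fresh-name supply) omitted. All names below are made up for illustration.

```haskell
-- Toy model of let-insertion via the continuation monad.
-- (The paper uses a continuation+state monad over typed code values;
-- here Code is a string and a single fixed name "t" stands in for
-- a fresh-name supply.)
newtype Gen a = Gen { runGen :: (a -> Code) -> Code }
type Code = String

instance Functor Gen where
  fmap f (Gen g) = Gen (\k -> g (k . f))
instance Applicative Gen where
  pure x = Gen ($ x)
  Gen f <*> Gen g = Gen (\k -> f (\h -> g (k . h)))
instance Monad Gen where
  Gen g >>= f = Gen (\k -> g (\x -> runGen (f x) k))

-- Bind the expression e to a variable at the point where the
-- continuation is captured, and return the variable in its place.
genlet :: Code -> Gen Code
genlet e = Gen (\k -> "let t = " ++ e ++ " in " ++ k "t")

-- The shared computation x * x is emitted once, before its two uses.
example :: Code
example = "\\x -> " ++ runGen body id
  where body = do v <- genlet "x * x"
                  return (v ++ " + " ++ v)
-- example == "\\x -> let t = x * x in t + t"
```

The binding lands where the continuation was delimited, here the body of the lambda, so the generated code shares x * x instead of duplicating it.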

Joint work with Walid Taha, Kedar N. Swadi, and Emir Pasalic.

The current version is January 2006.
Proceedings of the 2006 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-based Program Manipulation (PEPM), 2006, Charleston, South Carolina, USA, January 9-10, 2006, pp. 160-169.

Dynamic Programming Benchmark: The corresponding MetaOCaml source code

Shifting the Stage: Staging with Delimited Control
An alternative, notationally more convenient approach relying on delimited control rather than monads


The Next Stage of Staging

[The Abstract of the paper]
Staging is a program generation paradigm with a clean, well-investigated semantics which statically ensures that the generated code is always well-typed and well-scoped. Staging is often used for specializing programs to the known properties or parts of data to improve efficiency, but so far it has been limited to generating terms. This short paper describes our ongoing work on extending staging, with its strong safety guarantees, to generation of non-terms, focusing on ML-style modules. The purpose is to map out the promises and challenges, then to pose a question to solicit the community's expertise in evaluating how essential our extensions are for the purpose of applying staging beyond the realm of terms.

We demonstrate our extensions' use in specializing functor applications to eliminate its (currently large) overhead in OCaml. We explain the challenges that those extensions bring in and identify a promising line of attack. Unexpectedly, however, it turns out that we can avoid module generation altogether by representing modules, possibly containing abstract types, as polymorphic records. With the help of first-class modules, module specialization reduces to ordinary term specialization, which can be done with conventional staging. The extent to which this hack generalizes is unclear. Thus we have a question to the community: is there a compelling use case for module generation? With these insights and questions, we offer a starting point for a long-term program in the next stage of staging research.

Joint work with Jun Inoue and Yukiyoshi Kameyama.

The current version is January 2016.
StagingNG.pdf [169K]
Staging Beyond Terms: Prospects and Challenges
The paper published in the Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM), 2016, St. Petersburg, FL, USA, January 18-19, 2016, pp. 103-108


Type(class)-directed symbolic differentiation

We show how to take an ordinary numeric Haskell function, e.g., a term of the type D a => a -> a where D is a Floating-like class, and produce a function of the same type that is the symbolic derivative of the former. The original function can be given to us in a separately compiled module with no available source code. The derived function is an ordinary numeric Haskell function and can be applied to numeric arguments -- or differentiated further. The Floating-like class D currently supports arithmetic and a bit of trigonometry. We also support partial derivatives.
     test1f x = x * x + fromInteger 1
     test1 = test1f (2.0::Float)      -- 5.0
     test1f' x = diff_fn test1f x
     test1' = test1f' (3.0::Float)    -- 6.0
     test1f'' x = diff_fn test1f' x
     test1'' = test1f'' (10.0::Float) -- 2.0
The original function test1f can be evaluated numerically (test1) or differentiated symbolically (test1f'). The result is again an ordinary numeric function (namely, 2x), which can be applied to a numeric argument, see test1'. Alternatively, test1f' can be differentiated further.

The original function is emphatically not represented as an algebraic data type -- it is a numeric function like sin or tan. Still, we are able to differentiate it symbolically (rather than numerically or automatically). The key insight is that Haskell98 supports a sort of reflection -- or, to be precise, type-directed partial evaluation and hence term reconstruction.
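The insight can be sketched with a tiny expression type: because test1f is polymorphic over Num-like types, applying it at the expression type reconstructs its term, which can then be differentiated and interpreted back into a numeric function. The types and the name diff_fn below are an illustrative reconstruction, not the code of differentiation.lhs.

```haskell
-- Minimal sketch of term reconstruction and symbolic differentiation.
-- A polymorphic numeric function, applied at type Expr, reifies itself.
data Expr = Var | Lit Integer | Add Expr Expr | Mul Expr Expr
  deriving Show

instance Num Expr where
  (+) = Add
  (*) = Mul
  fromInteger = Lit
  negate = Mul (Lit (-1))
  abs    = error "not needed for the sketch"
  signum = error "not needed for the sketch"

-- Symbolic derivative with respect to the single variable
diff :: Expr -> Expr
diff Var       = Lit 1
diff (Lit _)   = Lit 0
diff (Add a b) = Add (diff a) (diff b)
diff (Mul a b) = Add (Mul (diff a) b) (Mul a (diff b))

-- Interpret an Expr back as an ordinary numeric function
eval :: Num a => Expr -> a -> a
eval Var x       = x
eval (Lit n) _   = fromInteger n
eval (Add a b) x = eval a x + eval b x
eval (Mul a b) x = eval a x * eval b x

diff_fn :: Num a => (Expr -> Expr) -> a -> a
diff_fn f = eval (diff (f Var))

test1f :: Num a => a -> a
test1f x = x * x + fromInteger 1
-- diff_fn test1f 3.0  evaluates to 6.0: the derivative of x*x+1 at 3
```

Reifying f as f Var is the type-directed part: the function's own polymorphism does the term reconstruction, with no source code or data-type representation required from the caller.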

Our approach also shows off the specification of the differentiation rules via type classes (which makes the rules extensible) and the emulation of GADT via type classes. In 2006, we improved the approach by developing an algebraic simplifier and by avoiding any interpretative overhead.

The current version is 1.1, Nov 24, 2004.
differentiation.lhs [11K]
The complete literate Haskell code with more examples, including those of partial differentiation. It was originally posted as Typeful symbolic differentiation of compiled functions on the Haskell mailing list on Wed Nov 24 04:05:12 EST 2004.

Most optimal symbolic differentiation of compiled numeric functions


Most optimal symbolic differentiation of compiled numeric functions

We show symbolic differentiation of a wide class of numeric functions without imposing any interpretative overhead. The functions to differentiate can be given to us in a compiled form (in .hi files); their source code is not needed. We produce a (compiled, if needed) function that is an exact, algebraically simplified analytic derivative of the given function.

Our approach combines reifying code into its `dictionary view', intensional analysis of typed code expressions, and staging so as to evaluate under lambda. We improve on the earlier, 2004 approach by algebraically simplifying the result of symbolic differentiation, and by removing the interpretative overhead with the help of Template Haskell (TH). The computed derivative can be compiled down to machine code and so runs at full speed, as if it had been written by hand to start with.

In the process, we develop a simple type system for a subset of TH code expressions (TH is, sadly, completely untyped) -- so that accidental errors can be detected early. We introduce a few combinators for the intensional analysis of such typed code expressions. We also show how to reify an identifier like (+) to a TH.Name -- by applying TH to itself. Effectively we obtain more than one stage of computation.
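The phantom-type discipline can be sketched as follows. This is a simplified reconstruction of the idea, not the actual interface of TypedCode.hs; the combinator names are illustrative.

```haskell
{-# LANGUAGE TemplateHaskell #-}
-- Sketch: a phantom type parameter over untyped TH expressions.
-- Ill-typed combinations of code fragments are rejected by GHC when
-- the generator is compiled, before any splicing takes place.
import Language.Haskell.TH

newtype Code a = Code ExpQ  -- 'a' is phantom: the expression's type

lam :: (Code a -> Code b) -> Code (a -> b)
lam f = Code $ do
  x <- newName "x"
  let Code body = f (Code (varE x))
  lamE [varP x] body

app :: Code (a -> b) -> Code a -> Code b
app (Code f) (Code x) = Code (appE f x)

int :: Integer -> Code Integer
int n = Code (litE (integerL n))

-- app (int 1) (int 2) -- rejected statically: Integer is not a function
```

The underlying TH expressions stay untyped; the phantom parameter merely tracks the intended type, which suffices to catch accidental errors in the generator.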

Our technique can be considered the inverse of the TH splicing operation: given a (compiled) numeric expression of a host program, we obtain its source view as a TH representation. The latter can be spliced back into the host program and compiled -- after, perhaps, simplification, partial evaluation, or symbolic differentiation. As an example, given the definition of the ordinary numeric, Floating a => a->a function test1f x = let y = x * x in y + 1 (which can be located in a separately compiled file), we reify it into a TH code expression, print it, differentiate it symbolically, and algebraically simplify the result:

     *Diff> test1cp
     \dx_0 -> GHC.Num.+ (GHC.Num.* dx_0 dx_0) 1
     *Diff> test1dp
     \dx_0 -> GHC.Num.+ (GHC.Num.+ (GHC.Num.* 1 dx_0) (GHC.Num.* dx_0 1)) 0
     *Diff> test1dsp
     \dx_0 -> GHC.Num.+ dx_0 dx_0
The output is produced by TH's pretty-printer. We can splice the result into a Haskell program as $(reflectQC test1ds) 2.0 and use it as an ordinary, hand-written function \x -> x+x .
The current version is 1.1, Dec 3, 2006.
Diff.txt [8K]
The message with the explanation of optimality and more examples. It was originally posted as Jones-optimal, typed, symbolic differentiation of (compiled) functions on the Haskell mailing list on Sun, 3 Dec 2006 17:56:27 -0800 (PST).

Diff.hs [9K]
The main implementation file, which defines the reification of code into TH representation, differentiation rules and algebraic simplification rules, all via the intensional analysis of the typed code. The file also includes many examples, including those of partial differentiation.

DiffTest.hs [<1K]
Running the splicing tests from Diff.hs. Due to the TH requirement, this code must be in a separate module.

TypedCode.hs [5K]
This file introduces the type Code a of typed TH code expressions. The (phantom) type parameter is the expression's type. The file defines combinators for building and analyzing these typed expressions.

TypedCodeAux.hs [2K]
Obtain the Name that corresponds to a top-level (Prelude-level) Haskell identifier by applying TH to itself.


Editorial: PEPM2012 Special Issue of Higher-Order and Symbolic Computation

The five articles in this special issue are the extended, journal versions of papers first presented at the ACM SIGPLAN 2012 Workshop on Partial Evaluation and Program Manipulation (PEPM'12), which took place in January 2012 in Philadelphia, Pennsylvania, USA. Out of the 19 papers presented at PEPM 2012, the program committee selected six and invited their authors to submit extended and improved versions to the special issue. Each submission was reviewed by three reviewers according to rigorous journal standards; five were recommended for publication.

The PEPM Symposium/Workshop series is about the theory and practice of program transformation understood broadly, ranging from program manipulation such as partial evaluation, to program analyses in support of program manipulation, to treating programs as data objects (metaprogramming). PEPM focuses on techniques, supporting theory, tools, and applications of the analysis and manipulation of programs. PEPM specifically stresses that each technique or tool of program manipulation should have a clear, although perhaps informal, statement of desired properties, along with an argument how these properties could be achieved. The papers included in this special issue reflect the entire scope of PEPM, its interplay of theory and practice, and its stress on rigor and clarity.

The current version is July 2013.
PEPM2012-Special-issue-editorial.pdf [43K]
The full Editorial describing the five papers
Higher-Order and Symbolic Computation. August-September 2013. Special issue with selected papers from PEPM 2012. Edited together with Julia Lawall and Simon Thompson

Last updated March 9, 2019
