previous   next   up   top

MetaOCaml -- an OCaml dialect for multi-stage programming

 

BER MetaOCaml is a conservative extension of OCaml for ``writing programs that generate programs''. BER MetaOCaml adds to OCaml the type of code values (which denote ``program code'', or future-stage computations), and two basic constructs to build code values: quoting and splicing. The generated code can be printed, stored in a file -- or compiled and linked-back to the running program, thus implementing run-time code optimization. A well-typed BER MetaOCaml program generates only well-typed programs: The generated code shall compile without type errors. The generated code may run in the future but it is typechecked now. BER MetaOCaml is a complete re-implementation of the original MetaOCaml by Walid Taha, Cristiano Calcagno and collaborators.


 

Summary

BER MetaOCaml is a conservative extension of OCaml with staging annotations to construct and run typed code values. See below for a brief introduction to these features. MetaOCaml code without staging annotations, or with annotations erased, is regular OCaml4.

BER MetaOCaml is in the spirit of the original MetaOCaml by Taha, Calcagno and contributors but has been completely re-written using different algorithms and techniques. The old MetaOCaml code still runs with little or no changes. Overall, BER MetaOCaml aims at the most harmonious way of integrating staging with OCaml, minimizing the differences. It becomes easier to keep MetaOCaml up-to-date and ensure its long-term viability.

BER MetaOCaml has been re-structured to minimize the amount of changes to the OCaml type-checker and to separate the `kernel' from the `user-level'. The kernel is a set of patches and additions to OCaml, responsible for producing and type-checking code values. Processing of built code values -- so-called `running' -- is user-level. Currently the user-level metalib supports printing, and byte-compiling and linking of code values. Users may add other ways of running the code, for example, compiling it to machine code, C or LLVM -- without any need to hack into (Meta)OCaml or even recompile it.

BER MetaOCaml is almost entirely backwards compatible with the original MetaOCaml. To the user, the two major differences of BER MetaOCaml are as follows (see below on this page for the detailed explanation):

Smaller visible differences are better printing of cross-stage persistent values and the full support for labeled arguments. A long-standing problem with records with polymorphic fields has fixed itself. The BER N100 code is now extensively commented, and has a regression test suite. MetaOCaml has never supported generating code with local or first-class modules and objects. However one may use modules and objects in the code generator. Camlp4 is excluded for now and probably forever. Instead of the syntactic extension pa_monad, BER MetaOCaml adds Nicolas Pouillard's lightweight let! notation for monadic programming. Offshoring is temporarily disabled, to be resurrected as a user-level library.

BER N100 is much less invasive into OCaml: compare the size of the patch to the main OCaml type-checker typing/typecore.ml, which contains the bulk of the changes for type checking the staging constructs. In the previous version BER N004, the patch had 564 lines of additions, deletions and context; now, only 328 lines. (The core MetaOCaml kernel is trx.ml, with 1800 lines.)

The file NOTES.txt in the BER MetaOCaml distribution describes the features of BER MetaOCaml in more detail and outlines directions for further development. The R&D program is extensive, ranging from the staging theory to the developing of translators of the generated code to C, LLVM, and other targets. There is fun and tasks for everyone. Hopefully the release of BER MetaOCaml would incite using and researching typed meta-programming.

Version
The current version is N100, January 2013.
References
ber-metaocaml-ChangeLog [2K]

< http://www.metaocaml.org >
The web site of the original MetaOCaml. All examples and tutorials on that web sites apply.

The MetaOCaml files: Status report and research proposal

 

Installation and availability

BER MetaOCaml N100 corresponds to the current version of OCaml, 4.00.1. Befitting the separation between the kernel and the user-level, BER MetaOCaml is distributed as a set of patches to OCaml plus a separate library metalib. The installation involves patching the OCaml distribution, bootstrapping the system, and compiling metalib and the BER MetaOCaml toplevel. The patched OCaml compiler is fully source-compatible with OCaml and can process OCaml code, with staging constructs or without. BER MetaOCaml is available as:

 

Introduction to staging and MetaOCaml

The standard example of meta-programming -- the running example of A.P.Ershov's 1977 paper that begat partial evaluation -- is the power function, computing x^n. In OCaml:
     let square x = x * x
     let rec power n x =
       if n = 0 then 1
       else if n mod 2 = 0 then square (power (n/2) x)
       else x * (power (n-1) x)
     (* val power : int -> int -> int = <fun> *)
(The comment below the code shows the reply of the MetaOCaml toplevel.)

Suppose our program has to compute x^7 many times. We may define

     let power7 x = power 7 x
to name and share the partial application.

In MetaOCaml, we may also specialize the power function to a particular value n, obtaining the code which will later receive x and compute x^n. We re-write power n x annotating expressions as computed `now' (when n is known) or `later' (when x is given).

     let rec spower n x =
       if n = 0 then .<1>.
       else if n mod 2 = 0 then .<square .~(spower (n/2) x)>.
       else .<.~x * .~(spower (n-1) x)>.
     (* val spower : int -> ('cl, int) code -> ('cl, int) code = <fun> *)
The two annotations, or staging constructs, are brackets .< e >. and escape .~e . Brackets .< e >. `quasi-quote' the expression e, annotating it as computed later. Escape .~e, which must be used within brackets, tells that e is computed now but produces the result for later. That result, the code, is spliced-in into the containing bracket. The inferred type of spower is different. The result is no longer an int, but ('cl, int) code -- the code of expressions that compute an int. The first type argument to code, often named 'cl, is a so-called environment classifier and can be skipped on first reading. The type of spower spells out which argument is received now, and which later: the future-stage argument has the code type. To specialize spower to 7, we define
     let spower7_code = .<fun x -> .~(spower 7 .<x>.)>.;;
      (*
     val spower7_code : ('cl, int -> int) code = .<
       fun x_1 ->
        (x_1 *
          (((* cross-stage persistent value (id: square) *))
            (x_1 * (((* cross-stage persistent value (id: square) *)) (x_1 * 1)))))>.
      *)
and obtain the code of a function that will receive x and return x^7. Code, even of functions, can be printed, which is what MetaOCaml toplevel just did. The print-out contains so-called cross-stage persistent values, or CSP, which `quote' present-stage values such as square to be used later. One may think of CSP as references to `external libraries' -- only in our case the program acts as a library for the code it generates. If square were defined in a separate file, say, sq.ml, then its occurrence in spower7_code would have been printed simply as Sq.square.

If we want to use thus specialized x^7 now, in our code, we should compile spower7_code and link it back to our program. This is called `running the code', and uses .!, pronounced `run':

     let spower7 = .! spower7_code
     (* val spower7 : int -> int = <fun> *)

The specialized spower7 has the same type as the partially applied power7 above. They behave identically. However, power7 x will do the recursion on n, checking n's parity, etc. In contrast, the specialized spower7 has no recursion (as can be seen from spower7_code). All operations on n have been done when the spower7_code was computed, producing the straight-line code spower7 that only multiplies x.

MetaOCaml supports arbitrary number of later stages, letting us write not only code generators but also generators of code generators, etc.

References
Walid Taha: A Gentle Introduction to Multi-stage Programming
< http://www.cs.rice.edu/~taha/publications/journal/dspg04a.pdf >

A.P. Ershov: On the partial computation principle
Information Processing Letters, 1977, v6, N2, pp. 38-41.
This is the English version of the paper that started partial evaluation as a research area. The power example, in Fig. 1 of the paper, can be seen on the following manuscript scan
< http://ershov.iis.nsk.su/files/rukopisi/fold241/241_75.gif >
The Information Processing Letters paper is the English version of
A. P. Ershov: A Theoretical Principle of System Programming
Doklady AN SSSR (Soviet Mathematics Doklady), 1977, v18,N2, pp.312--315.
For space reasons, the power example was omitted from the Doklady AN SSSR paper, but it can be seen in the notes to the November 1976 draft
< http://ershov.iis.nsk.su/files/rukopisi/fold241/241_17.gif >
This is perhaps the first occurrence of the power function specialization.
For other drafts of this paper, see ``The Ershov Archive for the history of computing''
< http://ershov.iis.nsk.su/archive/eaindex.asp?lang=2&gid=1480 >

 

Data constructor restriction

As a stop-gap measure, BER MetaOCaml N100 imposes the restriction that all data constructors and record labels used within brackets must come from the types that are declared in separately compiled modules. For example, the following expressions are all acceptable:
     .<true>.
     .<raise Not_found>.
     .<Some [1]>.
     .<{Complex.re = 1.0; im = 2.0}>.
     let open Complex in .<{re = 1.0; im = 2.0}>.
     .<let open Complex in {re = 1.0; im = 2.0}>.

because the data types bool, option, list, Complex.t are either Pervasive or defined in the (separately compiled) standard library. However, the following are not allowed and flagged as compile-time error:

     type foo = Bar;;
     .<Bar>.
     
     module Foo = struct exception E end;;
           .<raise Foo.E>.

The type declaration foo or the module declaration Foo must be moved into a separate file. The corresponding .cmi file must also be available at run-time: either placed into the same directory as the executable, or somewhere within the OCaml library search path.

The restriction came about because the OCaml abstract syntax tree of expressions Parsetree.expression (which realizes code values) at first blush has no space for type declarations. External type declarations like those of Complex.t are found in complex.cmi, which can be looked up when the generated code is compiled. Local type declarations are not generally reflected in .cmi; it is not clear therefore how to refer to them in the generated code.

However there is a way to include type declarations within the Parsetree.expression tree. Therefore, the restriction will be lifted in some future release. On the other hand, from the early experience the restriction does not seem to be very limiting, or hard to satisfy.

 

Detecting scope extrusion

Although MetaOCaml permits manipulation and splicing of open code, its type system statically ensures that only closed code can be printed or run: We can't run the code we haven't finished constructing. For example,
     .<fun x -> .~(let y = .! .<x>. in .<y>.)>.
gives a type error since .<x>. is obviously open. The environment classifiers were introduced specifically to catch such errors.

This static guarantee holds only for pure code. Effects such as storing code values in mutable cells void the guarantee. Here is the example using the old MetaOCaml from 2006 (version 3.09.1 alpha 030). (The problem can be illustrated simpler, but the following example is more realistic and devious.)

     let c =
      let r = ref .<fun z->z>. in 
      let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
      .<fun x -> .~f (.~(!r) 1)>. ;;
     (*
     val c : ('a, '_b -> int) code =
       .<fun x_4 -> ((fun x_2 -> 0) ((fun y_3 -> x_2) 1))>.
     *)

One must look hard to see that x_2 is actually unbound in the resulting code (shown in comments, as the output from the interpreter). The problem becomes apparent when we attempt to run that code:

     .! c;;
     (*
     Characters 77-78:
       Unbound value x_2
     
     Exception: Trx.TypeCheckingError.
     *)

Since we get the error anyway (without much diagnostics though), one may discount the severity of the problem. Alas, sometimes scope extrusion results in no error -- just in wrong results. A simple variation plasters over the problem:

     let c1 =
      let r = ref .<fun z->z>. in 
      let _ = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
      !r;;
     (*
     val c1 : ('a, '_b -> '_b) code = .<fun y_3 -> x_2>.
     *)
We then use c1 to construct the code c2:
     let c2 = .<fun y -> fun x -> .~c1>.;;
     (*
     val c2 : ('a, 'b -> 'c -> '_d -> '_d) code =
       .<fun y_1 -> fun x_2 -> fun y_3 -> x_2>.
     *)
which contains no unbound variables and can be run without problems.
     (.! c2) 1 2 3;;
     (* - : int = 2 *)

It is most likely that the user did not intend for fun x -> in c2 to bind x of c1. This is the blatant violation of lexical scope. And yet we get no error or other overt indication that something went wrong.

BER MetaOCaml N100 has none of this. Although the type system still permits code values with escaped variables, attempting to use such code in any way -- splice, print or run -- immediately raises an exception. For example, entering the expression c above in the top-level BER MetaOCaml N100 gives:

     Exception: Failure
     Scope extrusion at Characters 89-99:
           let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
                                        ^^^^^^^^^^
     for the identifier x_7 bound at Characters 74-75:
           let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in
                         ^

The exception message tells which variable got away, where it was bound and where it eloped.

Thus it is now guaranteed that if a code is successfully generated, it shall compile without type errors -- whether the generator was pure or not.

 

Many ways to run the code

BER MetaOCaml is structured as the `kernel' and the `user-level'. The former type-checks and compiles staging annotations. Handling the generated code is up to user-level libraries. For example, metalib in the BER MetaOCaml distribution prints and byte-compiles code values. User-level MetaOCaml code can be changed and extended without the need to recompile MetaOCaml. Further, changes in OCaml will have less effect on the user-level code.

Currently the operator .! is still part of the MetaOCaml kernel, but it is on its way out. Running the code .! e can be expressed at user-level (even now) as:

     Runcode.run {Runcode.cde = e}
where the data type type 'a cde = {cde : 'cl. ('cl,'a) code} represents closed code, which may be run. The operator .! might still be retained, as syntax: the parser will expand .! e as {Runcode.cde = e}.

By removing .! from the kernel, MetaOCaml type-checking becomes simpler and better integrated with OCaml. Mainly, it becomes easier to add run-like functions, which are regular functions to be linked with or loaded at the toplevel as any other library functions. There is no need to recompile MetaOCaml any more. The old MetaOCaml had so-called `off-shoring' to translate the generated code to C and Fortran. For example, it was used to generate FFT kernels. The generated C code was good enough to plug as it was into the FFTW benchmarking and testing framework. During the overhaul the offshoring has been left for clean-up and re-writing. It can now be brought back, as another run-like function.

 

What about the native MetaOCaml

Currently, BER MetaOCaml supports byte-compilation of the generated code and linking it back into the running program -- that is, run-time code specialization. Executing byte-complied code is relatively slow, especially for numeric code. Performance-sensitive OCaml applications are compiled with the ocamlopt, into machine code. It is frequently asked therefore when MetaOCaml would support machine-code compilation of the generated code.

Let us first clear a misconception that byte-code MetaOCaml limits the compilation of the generated code to be byte-code only. In two largish projects MetaOCaml was used to generate the optimal code for FFT kernels and for Gaussian Elimination. Gaussian Elimination is a large family of the algorithms, parameterized by the domain (|int|, float, polynomial), by the choice of pivoting, by the need to compute the determinant, rank, the permutation matrix, by the layout of matrices and by a half a dozen more aspects. Because Gaussian Elimination is often used in inner loops, the performance matters a great deal. Therefore, all this parameterization has to be done statically. With Jacques Carette we generated many GE procedures for particular parameter choices. The generated code was stored in files, to be compiled in a library, by ocamlopt. Although the code generator was byte-code OCaml, the generated code could be stored and compiled by the optimizing, native OCaml. We used the byte-code run, but only for testing. The FFT project was similar: the generator was byte-code, the byte-code run was used for testing, the generated code was C and compiled with the Intel C compiler.

To stress, the byte-code generator produces the code to be compiled by any suitable compiler. The mode (byte-code or native) of the generator and of the generated code are not related. Hence the current BER MetaOCaml is already useful in practice -- but more could and should be done.

Thus in principle one can already use ocamlopt to compile the generated code. If the code includes CSP only as literals or external references (and any code can be arranged that way), the code can be stored into a file and then passed to ocamlopt. The result can be linked back using OCaml's own dynlink. What is needed is to make this process automatic and convenient. A future version of MetaOCaml will certainly include a more convenient tools for native compilation. Such a tool is `user-level'; developing it could be a good student project.

References
Generating optimal FFT code and relating FFTW and Split-Radix

Multi-stage programming with functors and monads: eliminating abstraction overhead from generic code
Joint work with Jacques Carette on generating many variations of Gaussian Elimination

 

Credits

BER MetaOCaml is a complete re-implementation of the original MetaOCaml.

The original MetaOCaml was funded primarily by the NSF project "ITR/SY(CISE): Putting Multi-Stage Annotations to Work" led by Walid Taha. Most of the original development and implementation of staging was done by Cristiano Calcagno. Edward Pizzi has implemented the pretty-printing of code -- which since then has been extensively modified and maintained by Jacques Carette. Xavier Leroy, INRIA, helped with the compiler specifics.

I am very grateful to Jacques Carette, Walid Taha, and Chung-chieh Shan for many helpful discussions and encouragement. Jacques Carette has extensively re-written the printing of code values, and is currently maintaining this part. Reports and testing by Tran Minh Quang and Mustafa Elsheik are appreciated.



Last updated February 5, 2013

This site's top page is http://okmij.org/ftp/

oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML