A short tutorial on embedding effectful DSLs, or:
In truth, effects are not married to monads and approachable directly. The profound insight behind monads is the structuring, the separation of `pure' (context-independent) and effectful computations. The structuring can be done without explicating mathematical monads, and especially without resorting to vernacular monads such as State, etc. This article gives an example: a simple, effectful, domain-specific sublanguage embedded into an expressive `macro' metalanguage. Fancy, or even function types, and abstraction facilities such higher-order functions and modules of the metalanguage help keep the DSL to the bare minimum, often to the first order, making it easier to reason about and implement.
We pick a rather complex effect -- non-determinism -- and use it in OCaml, which at first blush seems unsuitable since it is call-by-value and has no monadic sugar. And yet, we can write non-deterministic programs just as naturally and elegantly as in Haskell or Curry.
The running tutorial example is computing all permutations of a given
list of integers. The reader may want to try doing that in their
favorite language or plain OCaml. Albeit a simple exercise, the code
is often rather messy and not obviously correct.
In the functional-logic language Curry, it is
strikingly elegant: mere
foldr insert . It is the re-statement
of the specification:
a permutation is moving the elements of the
source list one-by-one into some position in the
initially empty list. The code immediately tells that the number of
possible permutations of
n elements is
n!. From its very
conception in the 1959 Rabin and Scott's paper, non-determinism was
to write clear specifications -- and then to make them executable.
That is what will shall do.
Eugenio Moggi and Sonia Fagorzi: A Monadic Multi-stage Metalanguage.
FoSSaCS 2003: Foundations of Software Science and Computational Structures
Moggi and Fagorzi described monads as a tool for structuring -- staging -- of effectful computations. There are other ways to introduce sublanguages, such as the tagless-final style shown off in this article.
The origins of non-determinism
ilist_tbelow, the semantic domains of integer and integer list expressions). The meaning of a complex expression is computed by combining the meaning of immediate sub-expressions, that is, compositionally. A language is thus defined by specifying the semantic domain types and the meaning computations for its syntactic forms. These definitions are typically collected into a signature, such as:
module type NDet = sigwhich we will be filling in. Since we will be talking about integer lists, we need the integer type and at least the integer literals:
type int_t val int: int -> int_tWe can add the standard operations on integers, but they are not needed for the problem at hand. They can always be added later. After all, the extensibility is the strong suit of the tagless-final embedding.
We also need integer lists, with the familiar constructors.
type ilist_t val nil: ilist_t val cons: int_t -> ilist_t -> ilist_t val list: int list -> ilist_tThe
listprimitive makes an OCaml list to be the list in our DSL: it is the notation for literal DSL lists. It is not strictly necessary yet convenient. Indeed, Haskell, OCaml and other languages have a special notation for lists, e.g.,
[1;2;3], in addition to the cons
1::2::3::notation. We also need pattern-matching on lists, or the deconstructor. The syntax is admittedly ungainly: we are trying to represent
match ... withas an applicative expression:
val decon: ilist_t -> (unit -> ilist_t) -> (* if nil *) (int_t -> ilist_t -> ilist_t) -> (* if cons h t *) ilist_t
We also need
foldr. Strictly speaking, we do not need it: it is expressible
through the features already defined. But
foldr is so fundamental,
it is convenient to have as a primitive.
val foldr: (int_t -> ilist_t -> ilist_t) -> ilist_t -> ilist_t -> ilist_t
Finally, we define the operations for non-determinism: failure and the binary choice. The latter takes two non-deterministic expressions and non-deterministically executes one of them.
val fail: ilist_t val (|||): ilist_t -> ilist_t -> ilist_t
And we are done.
end (* of the signature NDet *)
An attentive reader may get the feeling that something is amiss: where are functions? We have defined neither the DSL function type nor operations to create and apply functions. Our DSL is not a lambda-calculus; it is essentially first order. Please hold your wonder.
NDetDSL may be, it is enough for the task at hand. We now use it to write the list permutation as elegantly as in Curry.
First, we need the non-deterministic list insertion:
insert x lst is
to insert the element
x somewhere in
lst, returning the extended list.
That is, it inserts
x at the front of
lst, or after the first element of
lst, or after the second element of
lst, etc. The algorithm can be
formulated, and hence implemented, inductively:
insert x lst either
x at the front of
lst or within
somewhere in its tail.
Computing the list permutation is now accomplished. The following is the complete code, which also includes a simple test.
module Perm(S:NDet) = struct open S let rec insert x l = cons x l ||| decon l (fun () -> fail) (* l is empty, no tail *) (fun h t -> cons h (insert x t)) let perm = foldr insert nil let test1 = perm (list [1;2;3]) endThe DSL primitives such as
niletc. are all defined in the implementation
Sof the signature
NDet. The code does not depend on any particular implementation, which is hence abstracted over as an argument
S. DSL code is hence typically represented as an OCaml functor, parameterized by the DSL implementation.
Although our code looks like the Curry code and is exceedingly simple,
there is something odd about it. We have said that
NDet has no functions:
no function types, no way to create or apply functions,
let alone recursive functions. What is
foldr a higher-order function? They are functions --
in the metalanguage but not in
NDet. From the DSL point of
insert is a very fancy `macro'. Our code then is a combination
of a trivial, non-deterministic DSL with a very expressive,
higher-order `macro' system (*).
Moreover, the DSL evaluation and the `macro-expansion' run like coroutines.
It is not unheard of: after all, coroutines were invented
as a communication mechanism among phases of a Cobol compiler
(hat tip to Aggelos Biboudis). This coroutine-like evaluation is the
essence of Moggi's computational calculus.
Melvin E. Conway. Design of a Separable Transition-diagram Compiler.
Commun. ACM 6.7 (July 1963), pp. 396-408.
Our motivation hence is to make the preprocessor less human.
Much of the power of C comes from having a powerful preprocessor. The preprocessor is called a programmer.
P.J. Moylan: The Case against C. Technical Report EE9240. July 1992. The University of Newcastle
This is the source for the above C joke. The section ``Modularity'' of Moylan's paper reads like the compelling motivation for the tagless-final approach.
Writing the DSL code as functors is ungainly. Lighter, functorless notation: blending DSL into OCaml shows what we can do to make the DSL programming more pleasant.
Permcode we need an implementation of the
NDetsignature, which we develop in this section. Since we are interested in the list of all permutations, it seems natural to take
ilist_tto be lists of all choices an integer or a list DSL expression may produce.
One can hear the shouts: this is the List monad! Yes, it is -- after all,
it is one of the implementations of non-determinism, envisioned already
by Rabin and Scott in the the 1950s. But it is not the only one.
The accompanying code shows another implementation of
NDet, in terms
of delimited continuations, the
delimcc library. One may easily think of
others, e.g., using the operating system threads.
The list implementation is the easiest to explain -- and also the most
performant for our simple task. It is has to be stressed: our
DSL has no
return operations. The user does not know if
ilist_t is the `list monad' or other monad or no monad at all. Nor
should the user care: they want non-deterministic operations on lists,
and that is what
Ndet provides, directly.
Thus we will be implementing
module NDetL = structfilling in the definitions one-by-one. As already said, we take a non-deterministic expression to mean the list of all values it may produce. For now we assume finitely non-deterministic expressions, which is enough for our task:
type int_t = int list type ilist_t = int list listLiteral expressions are deterministic: have exactly one value:
let int x = [x] let list x = [x] let nil = []
Two auxiliary functions will make further definitions easier:
let concatmap: ('a -> 'b list) -> 'a list -> 'b list = fun f l -> List.concat @@ List.map f l let liftm2 : ('a -> 'b -> 'c) -> 'a list -> 'b list -> 'c list = fun f l1 l2 -> concatmap (fun x -> List.map (f x) l2) l1The types and names should hint at their purpose, which becomes clear below. Since we are not concerned with monads, we do not need to prove any monad laws. (Stating equational laws is generally a good idea. But these laws should be formulated in terms of
NDetrather than a particular implementation. Also, the standard monad laws are the least interesting.) Now we can implement the constructor and deconstructor for the
let cons: int_t -> ilist_t -> ilist_t = liftm2 (fun h t -> h::t) let decon: ilist_t -> (unit -> ilist_t) -> (int_t -> ilist_t -> ilist_t) -> ilist_t = fun l onnil oncons -> concatmap (function |  -> onnil () | h::t -> oncons (int h) [t]) l let rec foldr: (int_t -> ilist_t -> ilist_t) -> ilist_t -> ilist_t -> ilist_t = (* exercise for the reader *)
In the list implementation of non-determinism, the two non-deterministic operations have a particularly simple expression:
let fail =  let (|||) = (@) end
We are finished, and ready to run our
Perm code. The result is in
let _ = let module M = Perm(NDetL) in M.test1 (* - : NDetL.ilist_t = [[1; 2; 3]; [1; 3; 2]; [2; 1; 3]; [2; 3; 1]; [3; 1; 2]; [3; 2; 1]] *)
sortjust like that -- and execute it, too. It is called `slow sort' -- one of the benchmarks of functional-logic programming. Although not usually fast, it is correct by definition. The actual performance depends on the implementation and could be quite good.
To express sorting we need two more non-deterministic primitives. Extending a language defined in the tagless-final style is easy: we just add new definitions reusing the old ones:
module type NDetComm = sig include NDet val rId : (int list -> bool) -> ilist_t -> ilist_t val once : ilist_t -> ilist_t endThe operation
rIdis a form of a logical conditional: it imposes a guard (a predicate constraint) on a non-deterministic expression. The name is chosen to match the Curry standard library. The primitive
headin Curry) expresses the so-called `don't care non-determinism': if an expression has several latent choices,
oncepicks one of them.
The sorting is written literally as ``a sorted permutation'':
module Sort(Nd:NDetComm) = struct open Nd include Perm(Nd) let rec sorted = function |  -> true | [_] -> true | h1 :: h2 :: t -> h1 >= h2 && sorted (h2::t) let sort l = once @@ rId sorted @@ perm l let tests = sort (list [3;1;4;1;5;9;2]) endAn attentive reader must have noticed that the sortedness is expressed `meta-theoretically' as one might say (why?).
Extending a DSL implementation is just as easy as extending the language definition: we just add the code for the new primitives, which are indeed primitive:
module NDetLComm = struct include NDetL let rId = List.filter let once = function  ->  | h::_ -> [h] end
And we can really sort:
let module M = Sort(NDetLComm) in M.tests
As a warm-up, let us take one particular DSL implementation, such as
described earlier. Let us write
perm without any functors this time,
as an ordinary OCaml function:
let perm : int list -> int list list = fun l -> let open NDetL in let rec insert x l = cons x l ||| decon l (fun () -> fail) (* l is empty, no tail *) (fun h t -> cons h (insert x t)) in foldr insert nil (list l)
It is the ordinary OCaml function, no kidding: we can apply it just like
We now abstract over the DSL implementation. First, we add to
observation function, which should have been there from the very beginning.
But it is never too late.
module type NDetO = sig include NDet val run : ilist_t -> int list list end
The permutation function will receive the DSL implementation as the (first-class) module argument:
let perm : (module NDetO) -> int list -> int list list = fun (module S:NDetO) l -> let open S in let rec insert x l = cons x l ||| decon l (fun () -> fail) (* l is empty, no tail *) (fun h t -> cons h (insert x t)) in run @@ foldr insert nil (list l)
Granted, explicitly supplying this implementation argument is a bit of a bother. At least we are dealing with an ordinary function rather than a functor. Luckily, modular implicits, currently available as an OCaml branch, dispense with the explicit argument passing. Using DSLs in OCaml becomes convenient: DSL primitives look like the ordinary OCaml operations, but can be distinguished by their types.
An interested reader might want to ponder:
foldris not actually necessary: it can be written using the other features of
Ndet. Write it.
'a repr, a set of OCaml values that represent DSL expressions of the type
'a. We have managed to do without
'a repr. What have we lost?
'a reprand implement this language.
NDet: e.g., using the free(r) monad or threads. Besides the depth-first search (underlying the list implementation), try to implement complete search strategies such as breadth-first search or iterative deepening.
NDet. Why? How to speed it up? Hint: this web site points to an article on this very topic.
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML