A short tutorial on embedding effectful DSLs, or: An effectful computation as a small, first-order DSL
In truth, effects are not married to monads and approachable directly. The profound insight behind monads is the structuring, the separation of `pure' (context-independent) and effectful computations. The structuring can be done without explicating mathematical monads, and especially without resorting to vernacular monads such as State, etc. This article gives an example: a simple, effectful, domain-specific sublanguage embedded into an expressive `macro' metalanguage. Abstraction facilities of the metalanguage such higher-order functions and modules help keep the DSL to the bare minimum, often to the first order, easier to reason about and implement.
The key insight predates monads and goes all the way back to the origins of ML, as a scripting language for the Edinburgh LCF theorem prover. What has not been clear is how simple an effectful DSL may be while remaining useful. How convenient it is, especially compared to the monadic encodings. How viable it is to forsake the generality of first-class functions and monads and what benefits it may bring. We report on an experiment set out to explore these questions.
We pick a rather complex effect -- non-determinism -- and use it in OCaml, which at first blush seems unsuitable since it is call-by-value and has no monadic sugar. And yet, we can write non-deterministic programs just as naturally and elegantly as in Haskell or Curry.
The running tutorial example is computing all permutations of a given
list of integers. The reader may want to try doing that in their
favorite language or plain OCaml. Albeit a simple exercise, the code
is often rather messy and not obviously correct.
In the functional-logic language Curry, it is
strikingly elegant: mere
foldr insert . It is the re-statement
of the specification:
a permutation is moving the elements of the
source list one-by-one into some position in the
initially empty list. The code immediately tells that the number of
possible permutations of
n elements is
n!. From its very
conception in the 1959 Rabin and Scott's paper, non-determinism was
to write clear specifications -- and then to make them executable.
That is what will shall do.
The extended abstract, submitted to the ML Family Workshop 2017
Eugenio Moggi and Sonia Fagorzi: A Monadic Multi-stage Metalanguage.
FoSSaCS 2003: Foundations of Software Science and Computational Structures
Moggi and Fagorzi described monads as a tool for structuring -- staging -- of effectful computations. There are other ways to introduce sublanguages, such as the tagless-final style shown off in this article.
M. Gordon, R. Milner, L. Morris, M. Newey and C. Wadsworth: A Metalanguage for Interactive Proof in LCF
Conference Record of the Fifth Annual ACM Symposium on Principles of Programming Languages, 1978
The origins of non-determinism
ilist_tbelow, the semantic domains of integer and integer list expressions). The meaning of a complex expression is computed by combining the meaning of immediate sub-expressions, that is, compositionally. A language is thus defined by specifying the semantic domain types and the meaning computations for its syntactic forms. These definitions are typically collected into a signature, such as:
module type NDet = sigwhich we will be filling in. Since we will be talking about integer lists, we need the integer type and at least the integer literals:
type int_t val int: int -> int_tWe can add the standard operations on integers, but they are not needed for the problem at hand. They can always be added later. After all, the extensibility is the strong suit of the tagless-final embedding.
We also need integer lists, with the familiar constructors.
type ilist_t val nil: ilist_t val cons: int_t -> ilist_t -> ilist_t val list: int list -> ilist_tThe
listprimitive makes an OCaml list to be the list in our DSL: although every DSL list can be expressed through
cons, the special notation for literal DSL lists is convenient. We also need pattern-matching on lists, or the deconstructor. The syntax is admittedly ungainly: we are trying to represent
match ... withas an applicative expression:
val decon: ilist_t -> (unit -> ilist_t) -> (* if nil *) (int_t -> ilist_t -> ilist_t) -> (* if cons h t *) ilist_t
We also need
foldr. Strictly speaking, we do not need it: it is expressible
through the features already defined. But
foldr is so fundamental,
it is convenient to have as a primitive.
val foldr: (int_t -> ilist_t -> ilist_t) -> ilist_t -> ilist_t -> ilist_t
Finally, we define the operations for non-determinism: failure and the binary choice. The latter takes two non-deterministic expressions and non-deterministically executes one of them.
val fail: ilist_t val (|||): ilist_t -> ilist_t -> ilist_t
And we are done.
end (* of the signature NDet *)
An attentive reader may get the feeling that something is amiss: where are functions? We have defined neither the DSL function type nor operations to create and apply functions. Our DSL is not a lambda-calculus; it is essentially first order. Please hold your wonder.
NDetDSL may be, it is enough for the task at hand. We now use it to write the list permutation as elegantly as in Curry.
First, we need the non-deterministic list insertion:
insert x lst is
to insert the element
x somewhere in
lst, returning the extended list.
That is, it inserts
x at the front of
lst, or after the first element of
lst, or after the second element of
lst, etc. The algorithm can be
formulated, and hence implemented, inductively:
insert x lst either
x at the front of
lst or within
somewhere in its tail.
Computing the list permutation is now accomplished. The following is the complete code, which also includes a simple test.
module Perm(S:NDet) = struct open S let rec insert x l = cons x l ||| decon l (fun () -> fail) (* l is empty, no tail *) (fun h t -> cons h (insert x t)) let perm = foldr insert nil let test1 = perm (list [1;2;3]) endThe DSL primitives such as
niletc. are all defined in the implementation
Sof the signature
NDet. The code does not depend on any particular implementation, which is hence abstracted over as an argument
S. DSL code is hence typically represented as an OCaml functor, parameterized by the DSL implementation.
Although our code looks like the Curry code and is exceedingly simple,
there is something odd about it. We have said that
NDet has no functions:
no function types, no way to create or apply functions,
let alone recursive functions. What is
foldr a higher-order function? They are functions --
in the metalanguage but not in
NDet. From the DSL point of
insert is a `macro'. Our code then is a combination
of a trivial, non-deterministic DSL with a very expressive,
higher-order `macro' system (*).
Moreover, the DSL evaluation and the `macro-expansion' run like coroutines.
It is not unheard of: after all, coroutines were invented
as a communication mechanism among phases of a Cobol compiler
(hat tip to Aggelos Biboudis). This coroutine-like evaluation is the
essence of Moggi's computational calculus.
Melvin E. Conway. Design of a Separable Transition-diagram Compiler.
Commun. ACM 6.7 (July 1963), pp. 396-408.
Our motivation hence is to make the preprocessor less human.
Much of the power of C comes from having a powerful preprocessor. The preprocessor is called a programmer.
P.J. Moylan: The Case against C. Technical Report EE9240. July 1992. The University of Newcastle
This is the source for the above C joke. The section ``Modularity'' of Moylan's paper reads like the compelling motivation for the tagless-final approach.
Writing the DSL code as functors is ungainly. Lighter, functorless notation: blending DSL into OCaml shows what we can do to make the DSL programming more pleasant.
Permcode we need an implementation of the
NDetsignature, which we develop in this section. Since we are interested in the list of all permutations, it seems natural to take
ilist_tto be lists of all choices an integer or a list DSL expression may produce.
One can hear the shouts: this is the List monad! Yes, it is -- after all,
it is one of the implementations of non-determinism, envisioned already
by Rabin and Scott in the the 1950s. But it is not the only one.
The accompanying code shows another implementation of
NDet, in terms
of delimited continuations, the
delimcc library. One may easily think of
others, e.g., using the operating system threads. See also a definitely
non-monadic code-generation implementation below.
The list implementation is the easiest to explain -- and also the most
performant for our simple task. It is has to be stressed: our
DSL has no
return operations. The user does not know if
ilist_t is the `list monad' or other monad or no monad at all. Nor
should the user care: they want non-deterministic operations on lists,
and that is what
NDet provides, directly.
Thus we will be implementing
module NDetL = structfilling in the definitions one-by-one. As already said, we take a non-deterministic expression to mean the list of all values it may produce. For now we assume finitely non-deterministic expressions, which is enough for our task:
type int_t = int list type ilist_t = int list listLiteral expressions are deterministic: have exactly one value:
let int x = [x] let list x = [x] let nil = []
Two auxiliary functions will make further definitions easier:
let concatmap: ('a -> 'b list) -> 'a list -> 'b list = fun f l -> List.concat @@ List.map f l let liftm2 : ('a -> 'b -> 'c) -> 'a list -> 'b list -> 'c list = fun f l1 l2 -> concatmap (fun x -> List.map (f x) l2) l1The types and names should hint at their purpose, which becomes clear below. Since we are not concerned with monads, we do not need to prove any monad laws. (Stating equational laws is generally a good idea. But these laws should be formulated in terms of
NDetrather than a particular implementation. Also, the standard monad laws are the least interesting.) Now we can implement the constructor and deconstructor for the
let cons: int_t -> ilist_t -> ilist_t = liftm2 (fun h t -> h::t) let decon: ilist_t -> (unit -> ilist_t) -> (int_t -> ilist_t -> ilist_t) -> ilist_t = fun l onnil oncons -> concatmap (function |  -> onnil () | h::t -> oncons (int h) [t]) l let rec foldr: (int_t -> ilist_t -> ilist_t) -> ilist_t -> ilist_t -> ilist_t = (* exercise for the reader *)
In the list implementation of non-determinism, the two non-deterministic operations have a particularly simple expression:
let fail =  let (|||) = (@) end
We are finished, and ready to run our
Perm code. The result is in
let _ = let module M = Perm(NDetL) in M.test1 (* - : NDetL.ilist_t = [[1; 2; 3]; [1; 3; 2]; [2; 1; 3]; [2; 3; 1]; [3; 1; 2]; [3; 2; 1]] *)
sortjust like that -- and execute it, too. It is called `slow sort' -- one of the benchmarks of functional-logic programming. Although not usually fast, it is correct by definition. The actual performance depends on the implementation and could be quite good.
To express sorting we need two more non-deterministic primitives. Extending a language defined in the tagless-final style is easy: we just add new definitions reusing the old ones:
module type NDetComm = sig include NDet val rId : (int list -> bool) -> ilist_t -> ilist_t val once : ilist_t -> ilist_t endThe operation
rIdis a form of a logical conditional: it imposes a guard (a predicate constraint) on a non-deterministic expression. It is hence akin to
List.filter. The name is chosen to match the Curry standard library. The primitive
headin Curry) expresses the so-called `don't care non-determinism': if an expression has several latent choices,
oncepicks one of them.
The sorting is written literally as ``a sorted permutation'':
module Sort(Nd:NDetComm) = struct open Nd include Perm(Nd) let rec sorted = function |  -> true | [_] -> true | h1 :: h2 :: t -> h1 >= h2 && sorted (h2::t) let sort l = once @@ rId sorted @@ perm l let tests = sort (list [3;1;4;1;5;9;2]) endAn attentive reader must have noticed that the sortedness is expressed `meta-theoretically' as one might say (why?).
Extending a DSL implementation is just as easy as extending the language definition: we just add the code for the new primitives, which are indeed primitive:
module NDetLComm = struct include NDetL let rId = List.filter let once = function  ->  | h::_ -> [h] end
And we can really sort:
let module M = Sort(NDetLComm) in M.tests
As a warm-up, let us take one particular DSL implementation, such as
described earlier. Let us write
perm without any functors this time,
as an ordinary OCaml function:
let perm : int list -> int list list = fun l -> let open NDetL in let rec insert x l = cons x l ||| decon l (fun () -> fail) (* l is empty, no tail *) (fun h t -> cons h (insert x t)) in foldr insert nil (list l)
It is the ordinary OCaml function, no kidding: we can apply it just like
We now abstract over the DSL implementation. First, we add to
observation function, which should have been there from the very beginning.
But it is never too late.
module type NDetO = sig include NDet val run : ilist_t -> int list list end
The permutation function will receive the DSL implementation as the (first-class) module argument:
let perm : (module NDetO) -> int list -> int list list = fun (module S:NDetO) l -> let open S in let rec insert x l = cons x l ||| decon l (fun () -> fail) (* l is empty, no tail *) (fun h t -> cons h (insert x t)) in run @@ foldr insert nil (list l)
Modular implicits, currently an OCaml branch, can even save us the trouble of
NDet implementation explicitly.
DSLs become convenient:
DSL primitives look like the ordinary OCaml operations,
but can be distinguished by their types. Instead of first-class
modules we could have used plain records. Our
approach hence easily applies to other ML(-like) languages.
NDetDSL is not monadic. We program in it just like in ML, directly operating on effectful expressions, without any binds and returns. To be sure, monads are not without benefits: clear separation of pure and effectful computations in types and syntax; uniformity; easy extension to higher-order functions. Let us see how much if any we have lost without monads.
Our DSL approach just as clearly separates pure and effectful:
anything of the type
potentially non-deterministic; everything else is deterministic.
Thus from the type of
insert : int_t -> ilist_t -> ilist_t
we immediately tell that
insert deterministically combines
Thanks to the richness of the metalanguage,
NDet did not need
higher-order (or even first-order) functions.
Hence we do not
need to worry about dealing with them non-deterministically. There are
certainly problems requiring higher-order effectful computations.
The experience with Lightweight Modular Staging in Scala shows,
however, that for rather many
practical problems, a first-order DSL is sufficient.
Finally, forsaking monads permits more implementations of our
NDet. Suppose we wish to generate code for list
permutations and realize the types as
type int_t = int list code type ilist_t = int list list code(see the accompanying code for the full implementation). Had
NDetbeen monadic, such an implementation is impossible:
'a codeis not a monad. First,
return : 'a -> 'a codeis problematic since not every value can be lifted to code (think of reference cells and I/O channels). Second, we cannot generally execute the code until we have finished generating it (because intermediate code may be open). Therefore,
bindis not expressible either.
An interested reader might want to ponder:
foldris not actually necessary: it can be written using the other features of
NDet. Write it.
'a repr, a set of OCaml values that represent DSL expressions of the type
'a. We have managed to do without
'a repr. What have we lost?
'a reprand implement this language.
NDet: e.g., using the free(r) monad or threads. Besides the depth-first search (underlying the list implementation), try to implement complete search strategies such as breadth-first search or iterative deepening.
NDet. Why? How to speed it up? Hint: this web site points to an article on this very topic.
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML