previous   next   up   top

Non-determinism: a sublanguage rather than a monad

 

A short tutorial on embedding effectful DSLs, or: An effectful computation as a small, first-order DSL

 

Introduction

A puzzlingly named, exceedingly technical device introduced to structure the denotational semantics has by now achieved cult status. It has been married to effects -- more than once. It is compulsively looked for in all manner of things, including burritos. At least two ICFP papers brought it up without a rhyme or reason (or understanding), as the authors later admitted. I am talking about monads.

In truth, effects are not married to monads and approachable directly. The profound insight behind monads is the structuring, the separation of `pure' (context-independent) and effectful computations. The structuring can be done without explicating mathematical monads, and especially without resorting to vernacular monads such as State, etc. This article gives an example: a simple, effectful, domain-specific sublanguage embedded into an expressive `macro' metalanguage. Abstraction facilities of the metalanguage such higher-order functions and modules help keep the DSL to the bare minimum, often to the first order, easier to reason about and implement.

The key insight predates monads and goes all the way back to the origins of ML, as a scripting language for the Edinburgh LCF theorem prover. What has not been clear is how simple an effectful DSL may be while remaining useful. How convenient it is, especially compared to the monadic encodings. How viable it is to forsake the generality of first-class functions and monads and what benefits it may bring. We report on an experiment set out to explore these questions.

We pick a rather complex effect -- non-determinism -- and use it in OCaml, which at first blush seems unsuitable since it is call-by-value and has no monadic sugar. And yet, we can write non-deterministic programs just as naturally and elegantly as in Haskell or Curry.

The running tutorial example is computing all permutations of a given list of integers. The reader may want to try doing that in their favorite language or plain OCaml. Albeit a simple exercise, the code is often rather messy and not obviously correct. In the functional-logic language Curry, it is strikingly elegant: mere foldr insert []. It is the re-statement of the specification: a permutation is moving the elements of the source list one-by-one into some position in the initially empty list. The code immediately tells that the number of possible permutations of n elements is n!. From its very conception in the 1959 Rabin and Scott's paper, non-determinism was called for to write clear specifications -- and then to make them executable. That is what will shall do.

References
nondet.ml [12K]
The complete OCaml code for the article

nondet-paper.pdf [161K]
The extended abstract, submitted to the ML Family Workshop 2017

Eugenio Moggi and Sonia Fagorzi: A Monadic Multi-stage Metalanguage.
FoSSaCS 2003: Foundations of Software Science and Computational Structures
Moggi and Fagorzi described monads as a tool for structuring -- staging -- of effectful computations. There are other ways to introduce sublanguages, such as the tagless-final style shown off in this article.

M. Gordon, R. Milner, L. Morris, M. Newey and C. Wadsworth: A Metalanguage for Interactive Proof in LCF
Conference Record of the Fifth Annual ACM Symposium on Principles of Programming Languages, 1978

The origins of non-determinism

 

Sub-language of non-deterministic computations on lists

We start by designing a language just expressive enough for our problem of computing a list permutation using non-determinism. We embed this ``domain-specific'' language (DSL) into OCaml in the tagless-final style. In this style, a language is defined by specifying how to compute the meaning of its expressions. The meaning is an OCaml value of some abstract type (such as the types int_t and ilist_t below, the semantic domains of integer and integer list expressions). The meaning of a complex expression is computed by combining the meaning of immediate sub-expressions, that is, compositionally. A language is thus defined by specifying the semantic domain types and the meaning computations for its syntactic forms. These definitions are typically collected into a signature, such as:
     module type NDet = sig
which we will be filling in. Since we will be talking about integer lists, we need the integer type and at least the integer literals:
     type int_t
     val int: int -> int_t
We can add the standard operations on integers, but they are not needed for the problem at hand. They can always be added later. After all, the extensibility is the strong suit of the tagless-final embedding.

We also need integer lists, with the familiar constructors.

     type ilist_t
     val nil:  ilist_t
     val cons: int_t -> ilist_t -> ilist_t
     val list: int list -> ilist_t
The list primitive makes an OCaml list to be the list in our DSL: although every DSL list can be expressed through nil and cons, the special notation for literal DSL lists is convenient. We also need pattern-matching on lists, or the deconstructor. The syntax is admittedly ungainly: we are trying to represent match ... with as an applicative expression:
     val decon: ilist_t -> 
       (unit -> ilist_t) ->             (* if nil *)
       (int_t -> ilist_t -> ilist_t) -> (* if cons h t *)
       ilist_t

We also need foldr. Strictly speaking, we do not need it: it is expressible through the features already defined. But foldr is so fundamental, it is convenient to have as a primitive.

     val foldr: (int_t -> ilist_t -> ilist_t) -> ilist_t -> ilist_t -> ilist_t

Finally, we define the operations for non-determinism: failure and the binary choice. The latter takes two non-deterministic expressions and non-deterministically executes one of them.

     val fail: ilist_t
     val (|||): ilist_t -> ilist_t -> ilist_t

And we are done.

     end (* of the signature NDet *)

An attentive reader may get the feeling that something is amiss: where are functions? We have defined neither the DSL function type nor operations to create and apply functions. Our DSL is not a lambda-calculus; it is essentially first order. Please hold your wonder.

References
Typed final (tagless-final) style

 

List permutation, non-deterministically

However feeble our NDet DSL may be, it is enough for the task at hand. We now use it to write the list permutation as elegantly as in Curry.

First, we need the non-deterministic list insertion: insert x lst is to insert the element x somewhere in lst, returning the extended list. That is, it inserts x at the front of lst, or after the first element of lst, or after the second element of lst, etc. The algorithm can be formulated, and hence implemented, inductively: insert x lst either inserts x at the front of lst or within lst, i.e., somewhere in its tail.

Computing the list permutation is now accomplished. The following is the complete code, which also includes a simple test.

     module Perm(S:NDet) = struct
       open S
     
       let rec insert x l = 
         cons x l ||| decon l
                       (fun ()  -> fail)                 (* l is empty, no tail *)
                       (fun h t -> cons h (insert x t))
     
      let perm = foldr insert nil
     
      let test1 = perm (list [1;2;3])
     end
The DSL primitives such as foldr, fail, nil etc. are all defined in the implementation S of the signature NDet. The code does not depend on any particular implementation, which is hence abstracted over as an argument S. DSL code is hence typically represented as an OCaml functor, parameterized by the DSL implementation.

Although our code looks like the Curry code and is exceedingly simple, there is something odd about it. We have said that NDet has no functions: no function types, no way to create or apply functions, let alone recursive functions. What is insert then? Isn't foldr a higher-order function? They are functions -- in the metalanguage but not in NDet. From the DSL point of view, insert is a `macro'. Our code then is a combination of a trivial, non-deterministic DSL with a very expressive, higher-order `macro' system (*). Moreover, the DSL evaluation and the `macro-expansion' run like coroutines. It is not unheard of: after all, coroutines were invented as a communication mechanism among phases of a Cobol compiler (hat tip to Aggelos Biboudis). This coroutine-like evaluation is the essence of Moggi's computational calculus.

References

Melvin E. Conway. Design of a Separable Transition-diagram Compiler.
Commun. ACM 6.7 (July 1963), pp. 396-408.

(*) An old joke comes to mind:

Much of the power of C comes from having a powerful preprocessor. The preprocessor is called a programmer.

Our motivation hence is to make the preprocessor less human.

P.J. Moylan: The Case against C. Technical Report EE9240. July 1992. The University of Newcastle
This is the source for the above C joke. The section ``Modularity'' of Moylan's paper reads like the compelling motivation for the tagless-final approach.

Writing the DSL code as functors is ungainly. Lighter, functorless notation: blending DSL into OCaml shows what we can do to make the DSL programming more pleasant.

 

One implementation of non-determinism

To run the Perm code we need an implementation of the NDet signature, which we develop in this section. Since we are interested in the list of all permutations, it seems natural to take int_t and ilist_t to be lists of all choices an integer or a list DSL expression may produce.

One can hear the shouts: this is the List monad! Yes, it is -- after all, it is one of the implementations of non-determinism, envisioned already by Rabin and Scott in the the 1950s. But it is not the only one. The accompanying code shows another implementation of NDet, in terms of delimited continuations, the delimcc library. One may easily think of others, e.g., using the operating system threads. See also a definitely non-monadic code-generation implementation below.

The list implementation is the easiest to explain -- and also the most performant for our simple task. It is has to be stressed: our NDet DSL has no bind or return operations. The user does not know if ilist_t is the `list monad' or other monad or no monad at all. Nor should the user care: they want non-deterministic operations on lists, and that is what NDet provides, directly.

Thus we will be implementing

     module NDetL = struct
filling in the definitions one-by-one. As already said, we take a non-deterministic expression to mean the list of all values it may produce. For now we assume finitely non-deterministic expressions, which is enough for our task:
     type int_t   = int list
     type ilist_t = int list list
Literal expressions are deterministic: have exactly one value:
     let int x  = [x]
     let list x = [x]
     let nil    = [[]]

Two auxiliary functions will make further definitions easier:

     let concatmap: ('a -> 'b list) -> 'a list -> 'b list =
       fun f l -> List.concat @@ List.map f l
     
     let liftm2 : ('a -> 'b -> 'c) -> 'a list -> 'b list -> 'c list =
       fun f l1 l2 -> concatmap (fun x -> List.map (f x) l2) l1
The types and names should hint at their purpose, which becomes clear below. Since we are not concerned with monads, we do not need to prove any monad laws. (Stating equational laws is generally a good idea. But these laws should be formulated in terms of NDet rather than a particular implementation. Also, the standard monad laws are the least interesting.) Now we can implement the constructor and deconstructor for the NDet lists:
     let cons: int_t -> ilist_t -> ilist_t = liftm2 (fun h t -> h::t)
     let decon: ilist_t -> (unit -> ilist_t) -> (int_t -> ilist_t -> ilist_t) -> ilist_t =
         fun l onnil oncons ->
           concatmap (function
           | []   -> onnil ()
           | h::t -> oncons (int h) [t]) l
     
     let rec foldr: (int_t -> ilist_t -> ilist_t) -> 
                      ilist_t -> ilist_t -> ilist_t = (* exercise for the reader *)

In the list implementation of non-determinism, the two non-deterministic operations have a particularly simple expression:

     let fail   = []
     let (|||)  = (@)
     end

We are finished, and ready to run our Perm code. The result is in the comments:

     let _ = let module M = Perm(NDetL) in  M.test1
     (*
      - : NDetL.ilist_t = [[1; 2; 3]; [1; 3; 2]; [2; 1; 3]; [2; 3; 1]; [3; 1; 2]; [3; 2; 1]]
     *)
References
When Monads will not do

 

Advanced non-determinism: sorting

An immediate application of list permutation is sorting: indeed, sorting, by definition, is a sorted permutation. Our DSL can truly specify sort just like that -- and execute it, too. It is called `slow sort' -- one of the benchmarks of functional-logic programming. Although not usually fast, it is correct by definition. The actual performance depends on the implementation and could be quite good.

To express sorting we need two more non-deterministic primitives. Extending a language defined in the tagless-final style is easy: we just add new definitions reusing the old ones:

     module type NDetComm = sig
       include NDet
       val rId  : (int list -> bool) -> ilist_t -> ilist_t 
       val once : ilist_t -> ilist_t
     end
The operation rId is a form of a logical conditional: it imposes a guard (a predicate constraint) on a non-deterministic expression. It is hence akin to List.filter. The name is chosen to match the Curry standard library. The primitive once (called head in Curry) expresses the so-called `don't care non-determinism': if an expression has several latent choices, once picks one of them.

The sorting is written literally as ``a sorted permutation'':

     module Sort(Nd:NDetComm) = struct
       open Nd
       include Perm(Nd)
     
       let rec sorted = function
         | []  -> true
         | [_] -> true
         | h1 :: h2 :: t -> h1 >= h2 && sorted (h2::t)
     
       let sort l = once @@ rId sorted @@ perm l
     
       let tests = sort (list [3;1;4;1;5;9;2])
     end
An attentive reader must have noticed that the sortedness is expressed `meta-theoretically' as one might say (why?).

Extending a DSL implementation is just as easy as extending the language definition: we just add the code for the new primitives, which are indeed primitive:

     module NDetLComm = struct
       include NDetL
       let rId  = List.filter
       let once = function [] -> [] | h::_ -> [h]
     end

And we can really sort: let module M = Sort(NDetLComm) in M.tests

 

Lighter, functorless notation: blending DSL into OCaml

One often hears (from the reviewers) the complaint that writing DSL expressions as OCaml functors is cumbersome. After all, OCaml functors are not first-class. The way tagless-final is often presented in OCaml indeed looks heavy. But there are other ways, blending the DSL code into the regular OCaml. The result looks quite like the Lightweight Modular Staging (LMS) in Scala, which has been used for serious DSLs.

As a warm-up, let us take one particular DSL implementation, such as NDetL described earlier. Let us write perm without any functors this time, as an ordinary OCaml function:

     let perm : int list -> int list list = fun l ->
       let open NDetL in
       let rec insert x l = 
         cons x l ||| decon l
                       (fun ()  -> fail)                 (* l is empty, no tail *)
                       (fun h t -> cons h (insert x t))
       in foldr insert nil (list l)

It is the ordinary OCaml function, no kidding: we can apply it just like perm [1;2;3].

We now abstract over the DSL implementation. First, we add to NDet the observation function, which should have been there from the very beginning. But it is never too late.

     module type NDetO = sig
       include NDet
       val run : ilist_t -> int list list
     end

The permutation function will receive the DSL implementation as the (first-class) module argument:

     let perm : (module NDetO) -> int list -> int list list = fun (module S:NDetO) l ->
       let open S in
       let rec insert x l = 
         cons x l ||| decon l
                       (fun ()  -> fail)                 (* l is empty, no tail *)
                       (fun h t -> cons h (insert x t))
       in run @@ foldr insert nil (list l)

Modular implicits, currently an OCaml branch, can even save us the trouble of passing the NDet implementation explicitly. DSLs become convenient: DSL primitives look like the ordinary OCaml operations, but can be distinguished by their types. Instead of first-class modules we could have used plain records. Our approach hence easily applies to other ML(-like) languages.

 

When Monads will not do

Our NDet DSL is not monadic. We program in it just like in ML, directly operating on effectful expressions, without any binds and returns. To be sure, monads are not without benefits: clear separation of pure and effectful computations in types and syntax; uniformity; easy extension to higher-order functions. Let us see how much if any we have lost without monads.

Our DSL approach just as clearly separates pure and effectful: anything of the type int_t or ilist_t is potentially non-deterministic; everything else is deterministic. Thus from the type of insert : int_t -> ilist_t -> ilist_t we immediately tell that insert deterministically combines non-deterministic computations.

Thanks to the richness of the metalanguage, NDet did not need higher-order (or even first-order) functions. Hence we do not need to worry about dealing with them non-deterministically. There are certainly problems requiring higher-order effectful computations. The experience with Lightweight Modular Staging in Scala shows, however, that for rather many practical problems, a first-order DSL is sufficient.

Finally, forsaking monads permits more implementations of our NDet. Suppose we wish to generate code for list permutations and realize the types as

     type int_t   = int list code
     type ilist_t = int list list code
(see the accompanying code for the full implementation). Had NDet been monadic, such an implementation is impossible: 'a code is not a monad. First, return : 'a -> 'a code is problematic since not every value can be lifted to code (think of reference cells and I/O channels). Second, we cannot generally execute the code until we have finished generating it (because intermediate code may be open). Therefore, bind is not expressible either.

 

Questions and exercises

All in all, we have described a direct alternative to the monadic encoding of effects: defining a small domain-specific language with the necessary effectful operations. The DSL will be blended into OCaml or other ML-like language; therefore, it can be kept tiny, with no abstraction facilities of its own, or even functions. OCaml, serving as an inordinarily expressive macro language, compensates.

An interested reader might want to ponder:



Last updated August 10, 2017

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML