An example of a monadic programming in Scheme that juxtaposes Haskell code for a particular state monad with the corresponding Scheme code.
This article sprang from a question posted on comp.lang.functional about building of trees whose nodes are tagged with unique integers. "Ideally every term/node has to have its id associated with it at construction time," the poster said. "An obvious way to implement it in this [referentially-transparent] way is to have a collection of state which is passed along all the time to every function. But this seems rather tedious."
An imperative solution is obvious: introduce a global variable -- current_id
-- and use its value to tag every allocated tree node,
incrementing the variable afterwards. A functional solution is
relatively straightforward too: every function that creates or
operates on trees must take an extra argument -- the current value of the
tagging counter. The function should return multiple values: the
result of its computation and (perhaps incremented) value of the
counter. The latter solution does seem to be tedious, requiring
multiple values, and ugly. Is there a way to solve the problem purely
functionally and elegantly?
A proposed Haskell solution [SN-monad] relied on monads to hide the counter holding the tag for a new node. The solution did seem to reach the goal. Building a tagged tree looked just as simple as building an untagged one. The counter variable was well-hidden. And yet it was there, invisibly threading through all computations. Rendering this solution in Scheme can be fun.
We will cite the Haskell code in our Scheme code, for comparison
and documentation. We will identify Haskell code by ;--
comments.
We start by defining a datatype for integer-tagged values. We could use records or associative lists; However, a simple pair will suffice:
;-- type Numbered a = (Int,a) (define (make-numbered-value tag val) (cons tag val)) ; accessors of the components of the value (define (nvalue-tag tv) (car tv)) (define (nvalue-val tv) (cdr tv))We need to define a counting monad and its two fundamental operations:
bind
(aka >>=
) and return
. In Haskell, we write;-- newtype NumberedM a = NumberedM (Int -> Numbered a) ;-- instance Monad NumberedM where ;-- NumberedM m >>= f = NumberedM $ \n -> let (n1,v) = (m n) ;-- NumberedM m' = f v ;-- (n1',v') = (m' n1) ;-- in (n1',v') ;-- return x = NumberedM $ \n -> (n,x)
There are two ways of rendering that in Scheme. In this section, we will be discussing an easy way, which suffices as long we are working with a
single monad, or do not mix objects of different monads. In that case,
all the type tagging apparent in the Haskell solution is
unnecessary. Our monad is simply a function Int -> Numbered a
. A
monad is a delayed computation: a lambda-expression in our case. It
takes the value of the counter, performs whatever computation it is
meant to do, and returns the result tagged with the (perhaps
incremented) counter.
The two fundamental operations are introduced as:
;-- return:: a -> NumberedM a (define (return val) (lambda (curr_counter) (make-numbered-value curr_counter val)))and
;-- (>>=):: NumberedM a -> (a -> NumberedM b) -> NumberedM b (define (>>= m f) (lambda (curr_counter) (let* ((m_result (m curr_counter)) (n1 (nvalue-tag m_result)) ; result of the delayed computation (v (nvalue-val m_result)) ; represented by m (m1 (f v)) ; feed the result to f, get another m1 ) (m1 n1)))) ; The result of the bigger monad
The bind operator, >>=
, makes a "bigger" monad out
of a monad m
and a function f
(which yields
a monad when applied to a value). That bigger monad is a delayed
computation that incorporates both computations: those represented by
m
and by f
. The job of the bind operator is to
merge these computations and to take care of passing the state -- the
counter in our case -- from one computation to the other. The code of the bind operator first applies the monad m
to the curr_counter
. This executes the delayed
computation and gives us its result and the new counter. We apply f
to the result of the delayed computation m
and get another monad, m1
. The latter is a function,
which we apply to the counter returned by m
and get a
tagged value -- the result of the combined monad.
It is easy to verify that monad axioms are satisfied. Indeed, return
is the left and the right unit of bind
,
that is,
(>>= (return v) f) ==is-the-same-as==> (f v) (>>= m (lambda (v) (return v))) ==is-the-same-as==> m
Indeed, the direct inspection shows that (>>= m (lambda (v) (return
v)))
takes the result of (m counter)
into two parts and puts them back together again.
The bind operation is clearly associative,
(>>= m (lambda (v) (>>= (f v) g))) <==is-the-same-as==> (>>= (>>= m f) g) ; v is not free in g nor in f
Thus we have indeed built a monad. Besides the canonical return
and bind
operations, we need one more
operation (so-called morphism):
;-- get the current id and increment it ;-- incr:: NumberedM Int ;-- incr = NumberedM $ \n -> (n+1,n) (define incr (lambda (n) (make-numbered-value (+ 1 n) n)))
We also need an operation to run the monad, that is, to take the delayed computation represented by the monad and execute it. In our case, a delayed computation needs a value -- the initial value of the tagging counter.
;-- run_numberedM:: NumberedM a -> Int -> Numbered a ;-- run_numberedM (NumberedM m) init_count = m init_count (define (runM m init-counter) (m init-counter))The result of
runM
is a numbered value, a pair of
the final counter and the final value.Let us now see what this painstakingly defined monad can do for us. Let us consider trees
;-- data Tree a = Nd a (Forest a) deriving Show ;-- type Forest a = [Tree a]In Scheme terms, a node is
(value . forest)
where
forest is (node ...)
. A node is a list whose head is the
node's value and the tail is the list of its children. A forest is a
list of zero or more nodes.
Let's define a function to make a node -- a node that is tagged with a unique integer:
(define (make-node val kids) (>>= incr (lambda (counter) (return (cons (make-numbered-value counter val) kids)))))
That's what the original posted wanted: associate with each node a unique integer at node-construction time.
The code is hardly surprising: we obtain the value of the "global" counter, increment the counter and build a node out of the tagged value and the list of kids. Only there is no global counter and no mutation.
Now let's try to build a full binary tree, where the value of each node is that node's height:
;-- make_btree 0 = make_node 0 [] ;-- make_btree depth = do { ;-- left <- make_btree (depth -1); ;-- right <- make_btree (depth -1); ;-- make_node depth [left, right] ;-- }
This function that takes the desired depth and returns a monad, which -- when run -- produces a tagged binary tree.
The first attempt In Scheme is:(define (build-btree-r depth) (if (zero? depth) (make-node depth '()) (>>= (build-btree-r (- depth 1)) (lambda (left-branch) (>>= (build-btree-r (- depth 1)) (lambda (right-branch) (make-node depth (list left-branch right-branch))))))))
A syntactic sugar is direly needed. First, we introduce letM
:
(letM ((name initializer)) expression) ==> (>>= initializer (lambda (name) expression))Compare with a regular
let
:(let ((name initializer)) body) ==> (apply (lambda (name) body) (list initializer))
There are some differences, however: for one thing, letM
takes only one binding, while the regular let
may take several. The body of letM
is a single expression
that has to evaluate to a monad. The let
's body can have several
expressions. There is a deep reason for these differences. If a
let-form has several bindings, their initializing expressions are
evaluated in an undefined order. Such non-determinism does not exits
in the monadic world: since the evaluation of our monad involves threading of a state through all computations, the computations
must execute in the precisely defined order, one after another. Monads
introduce sequencing (single-threading) into the functional world. In
that sense, monadic computation emulates imperative
computation. However, in the imperative world, statements execute in
the order they are written because such is the semantics of an
imperative language. We trust the system honoring our statement order,
because the order of executing mutations is crucial to knowing the
global state of the system at any point in time. In contrast, in
monadic world there is no really global state. A state is explicitly
passed from one computation to another. Two computations are executed
in order because the second needs the result of the first.
(define-macro letM (lambda (binding expr) (apply (lambda (name-val) (apply (lambda (name initializer) `(>>= ,initializer (lambda (,name) ,expr))) name-val)) binding)))We can also introduce an even sweeter form
letM*
:(letM* (binding binding ...) expr) ==> (letM (binding) (letM* (binding ...) expr))which relates to
letM
exactly as let*
relates to let
.(define-macro letM* (lambda (bindings expr) (if (and (pair? bindings) (pair? (cdr bindings))) `(letM ,(list (car bindings)) (letM* ,(cdr bindings) ,expr)) `(letM ,bindings ,expr))))With these sweets, we can re-write our build-btree as
(define (build-btree depth) (if (zero? depth) (make-node depth '()) (letM* ((left-branch (build-btree (- depth 1))) (right-branch (build-btree (- depth 1)))) (make-node depth (list left-branch right-branch)))))Note the code does not explicitly mention any counter at all! Let's see how it runs:
> (pp (runM (build-btree 3) 100)) (115 (114 . 3) ((106 . 2) ((102 . 1) ((100 . 0)) ((101 . 0))) ((105 . 1) ((103 . 0)) ((104 . 0)))) ((113 . 2) ((109 . 1) ((107 . 0)) ((108 . 0))) ((112 . 1) ((110 . 0)) ((111 . 0)))))
Each node of the tree is uniquely tagged indeed. The counter starts at 100 and counts forward. The value of the node (the second component of a pair) is that node's height. The node tagged with 114 is the root node; its value is 3. Number 115 is the final value of the counter.
It is interesting to compare the build-btree code with a code that constructs a regular, non-tagged full binary tree:
(define (build-btree-c depth) (if (zero? depth) (cons depth '()) (let ((left-branch (build-btree-c (- depth 1))) (right-branch (build-btree-c (- depth 1)))) (cons depth (list left-branch right-branch))))) > (pp (build-btree-c 3)) (3 (2 (1 (0) (0)) (1 (0) (0))) (2 (1 (0) (0)) (1 (0) (0))))
The similarity is staggering. It seems we have achieved our goal:
building of a tagged tree looks almost identical to that of an
un-tagged tree. The tagging counter is well-hidden from view. The
counter does not get in the way and does not clutter the look and feel
of the computation. The letM*
form reminds us however
that the counter does exist. The form emphasizes the important
difference between the two functions. The function build-btree-c
constructs left and right branches in an
indeterminate order. The branches could even be built in parallel,
hardware permitting. The order does not matter -- the result is the
same regardless of the sequence. The form letM*
however
makes the computation in the function build-btree
strictly sequential: there, the right branch is constructed strictly
after the left branch, with their parent node following.
Thanks to type classes, Haskell monads are more general than the previous sections showed. We can mix several monads in the same expression:
f = do { putStrLn "Enter a number: "; n <- readLn; all_n <- return [1..n]; evens <- return $ all_n >>= (\i -> if (i `rem` 2) == 0 then return i else fail "odd"); return evens } main = fThis silly code "prints" all even numbers up to the one entered by the user. There are two monads in this fragment:
IO a
monad and [a]
monad: a list is also a monad.
Expression return [1..n]
returns an object of a type IO [Int]
whereas return i
yields a monad [Int]
. The outer return
in the same expression
returns IO [Int]
. It is remarkable that a Haskell
compiler is able to figure out which monad each return
function yields. The compiler bases its decision on the type of the result expected in each case. To render the above code in
Scheme, we have to explicitly identify monad operations of
different monads: (define f (IO::>> (put-line "Enter a number: ") (IO::>>= (read-int) (lambda (n) (IO::>>= (IO::return (iota 1 n)) (lambda (all-n) (IO::>>= (IO::return (List:>>= all-n (lambda (i) (if (even? i) (List::return i) (List::fail "odd"))))) (lambda (evens) (IO::return evens)))))))))However, even in this pathologically mixed case we note that some sub-expressions use only the IO monad, and one subexpression uses only the List monad. Therefore, we can write:
(define f (let ((>>= IO::>>=) (return IO::return)) ; define the "current" monad (beginM (put-line "Enter a number: ") (letM* ((n (read-int)) (all-n (return (iota 1 n))) (evens (return ; re-define the current monad (let ((>>= List::>>=) (return List::return) (fail List::fail)) (letM ((i all-n)) (if (even? i) (return i) (fail "odd")))))) ) (return evens)))))
This is not as elegant as in Haskell, yet is readable. With a bit of syntactic sugar and a module system similar to that of DrScheme, we could even replace
(let ((>>= IO::>>=) (return IO::return))with
(using IO)
.[SN-monad] A serially-numbering monad.
<../Computation/monads.html#serially-numbering-monad>
[monads] Monads in Haskell and other languages
<../Computation/monads.html>
[MShell] UNIX pipes as IO monads. Monadic i/o and UNIX shell programming.
<../Computation/monadic-shell.html>
[MScheme-IO] Monadic scheme of I/O
<misc.html#monadic-io>
[MP-Scheme] Monadic programming in Scheme: an example.
An article posted on a newsgroup comp.lang.scheme on Wed, 10 Oct 2001 20:27:21 -0700.
[MV-Reif] Multiple values as an effect. Monadic reflection and reification.
<misc.html#multiple-value-effect>
This site's top page is http://okmij.org/ftp/
Converted from SXML by SXML->HTML