The generators in our story come from the formula
yield = exceptions + non-determinism. We will mostly use the Haskell
implementation because of more revealing types;
porting to OCaml or other languages is straightforward. The basic
generator interface defines one (abstract) type
GenT e m a
and two functions, for yielding and iterating -- along with the observation
-- Basic interface type GenT e m a -- abstract instance Monad m => Monad (GenT e m) instance MonadTrans (GenT e) type LoopBody e m = e -> m () yield :: Monad m => e -> GenT e m () for_loop :: (MonadTrans t, Monad m, Monad (t m)) => GenT e (t m) () -> LoopBody e (t m) -> t m () observe :: (MonadTrans t, Monad m) => t m a -> m a
Gen e m a is the type of computations that yield
values of the type
e in the monad
m, eventually returning the
value of the type
a. The particular implementation of generators used here
requires the monad
MonadPlus. Likewise the monad transformer
that appears in the signatures of
observe must be
LogicT. Other implementations of the basic generator interface
lift these requirements.
The primitive generator
yield (along with the fact that
Gen e m is a monad)
lets us write other generators. In this article, we will use two.
One is the generator that enumerates values within a given range.
Compared to the familiar
Enum a => enumFromTo :: a -> a -> [a],
the generator does not create any list, even lazily.
Rather, it yields elements as they are being enumerated:
enumFromToG :: (MonadPlus m, Enum a) => a -> a -> GenT a m () enumFromToG xbeg xend | fromEnum xbeg <= fromEnum xend = yield xbeg >> enumFromToG (succ xbeg) xend enumFromToG xbeg xend = return ()The second sample generator yields labels from an in-order traversal of a tree with integer-labeled leaves. We have seen this generator many times in an earlier page on generators.
type Label = Int in_order2 :: (MonadIO m, MonadPlus m) => Tree -> GenT Label m ()
To perform the iteration, we connect a generator with the consumer of the generated values, that is, with a loop body. The sample loop:
sl1 = for_loop (enumFromToG 'a' 'e') $ \i -> lift . print $ icorresponds to the for-loop in Pascal
for i:='a' to 'e' do writeln(i)The second sample loop
for_loop (in_order2 (make_full_tree 3)) $ \i -> lift . print $ icannot be idiomatically expressed in Pascal. It can be written however, roughly in the same form, in CLU, which had generators. To execute the overall iterative computation we have to
observeit, in a base monad such as
The basic interface is an abstraction for simple generators, such as those
in CLU, which suffice for many cases. Parallel loops however need
more powerful generators, described by the advanced
interface below. The basic interface was the abstraction of the advanced one.
The advanced interface exposes the implementation of generators as
yield = exceptions + non-determinism.
-- Advanced interface type GenT e m a = MonadPlus m => ErrorT e m a -- now the concrete type yield :: MonadPlus m => e -> GenT e m () yield x = raise x `mplus` return () for_loop :: (LogicT t, MonadPlus (t m), Monad m) => GenT e (t m) () -> LoopBody e (t m) -> t m () for_loop gen body = iter Nothing $ catchError' gen >>= body catchError' :: MonadPlus m => GenT e m () -> m e iter:: (LogicT t, MonadPlus (t m), Monad m) => Maybe Int -> t m () -> t m () msplit :: (LogicT t, MonadPlus (t m), Monad m) => t m a -> t m (Maybe (a, t m a))Since generators combine exceptions (the
ErrorTmonad) and non-determinism (the
LogicTmonad), they inherit the operations of those monads. Specifically,
catchError', an instance of
catch, obtains the value yielded by a generator. The combinator
iter, a particular case of
bagOfN, takes a non-deterministic computation
t m ()and executes all of its choices, one-by-one. If the first argument of
Just n, only the first
nchoices are executed and the rest are discarded. Effectively,
iter`runs' a non-deterministic computation, actualizing all its non-deterministic choices. For example, desugaring the first sample loop gives the following advanced code:
iter Nothing $ do i <- catchError' $ enumFromToG 'a' 'e' lift . print $ iThe loop body non-deterministically obtains the value yielded by the generator, and prints it. The
iter Nothingline -- the loop header, so to speak -- makes the body run through all choices, that is, all generated values.
Generators: yield = exceptions + non-determinism
The detailed description of the generators used on this page
The complete code for this article
nl1 = observe $ for_loop (in_order2 (make_full_tree 3)) $ \i -> for_loop (enumFromToG 'a' 'e') $ \j -> lift . print $ (i,j)The print-out confirms -- although we have never doubted -- that for each value of
jranges over all its possible values.
Since the inner
j-loop is within the
i-loop body, the variable
scopes over the entire inner loop, including its generator. That is,
the inner loop generator may refer to the outer-loop variable, making
loop dependence obvious:
nl2 = observe $ for_loop (in_order2 (make_full_tree 3)) $ \i -> for_loop (enumFromToG i 3) $ \j -> lift . print $ (i,j)The binding scope of a loop variable matches the scope of iteration of the corresponding loop.
The story picks up as we desugar the nested loop
nl1 and rearrange the code
using equational reasoning. We obtain a different way of writing
nested loops, now using the advanced interface:
nl1' = observe $ iter Nothing $ do -- loop header i <- catchError' (in_order2 (make_full_tree 3)) j <- catchError' (enumFromToG 'a' 'e') lift . print $ (i,j) -- loop bodywhich we will write in the abbreviated form as
nl1' = observe $ iter Nothing $ do i <- e1 j <- e2 bodyThe binding and control scopes are a bit harder to see now. Recall that the Haskell notation
do i <- e1 j <- e2 bodyis a syntactic sugar for
e1 >>= (\i -> e2 >>= (\j -> body))exposing that
e2-- as was the case in the original
nl1. The repeated
do-bindings corresponds to what other languages call `nested let' or `repeated let':
let i = e1 in let j = e2 in body
The control scope also becomes clear if we look at the evaluation context
of the first generator
do i <-  j <- e2 bodyThe context includes the generator
e2. The non-deterministic expression
e1replicates its context; it plugs the first yielded value into the first context replica, the second yielded value into the second replica, etc. Each replica has the copy of
e2, the generator for the
j-loop, which runs anew in each replicated copy. The
bodyis thus executed at each point in the Cartesian product of the domain of
iand the domain of
j. The loop
nl1'indeed has the same behavior as the original
nl1. The binding scope of a loop variable and the control scope -- the evaluation context of the generator -- still match.
nl2 re-iterates the point that the second generator
is within the binding scope of the loop variable
with the first generator.
nl2' = observe $ iter Nothing $ do -- loop header i <- catchError' (in_order2 (make_full_tree 3)) j <- catchError' (enumFromToG i 3) lift . print $ (i,j) -- loop bodyThe binding scope is lexical and hence is easier to see than the control scope. Since the two scopes match, we can tell which generator holds control over which by examining the binding scopes of their loop variables.
Although the desugared form of
for_loop is less pretty to look at,
it is more insightful. There is only one binding form for loop
do-binding, which is just the lambda-binding. Lambda
provides binding and generators provide iteration. The binding scope
of a loop variable matches the control scope of the iteration.
do-bindings with generators as binding expressions correspond to nested loops. Many programming languages offer so-called parallel bindings, such as the following OCaml form:
let i = e1 and j = e2 in bodyAlthough
istill scopes over
bodyit no longer scopes over
e2contains an occurrence of
i, it would remain free. The following two expressions should help see the difference in binding scope:
let i = 1 in let i = i + 1 in let j = i + i in (i,j) let i = 1 in let i = i + 1 and j = i + i in (i,j)The former, with all nested
lets, evaluates to
(2,4); the latter evaluates to
The bindings are called parallel because
e2 no longer have
any data dependency and may be evaluated in any order:
e2 can be
evaluated without waiting for the value to be bound to
e2 may even be evaluated in parallel, and the
program should not -- ought not to -- notice.
Our story comes to a decisive point. Can we realize parallel bindings,
first in Haskell, to maintain not only data independence of
e2 but also their control independence? Can we use parallel bindings
to implement parallel loops?
let such as
let i = e1 and j = e2 in body
may be regarded, in the first approximation, as a syntactic sugar for
(fun (i,j) -> body) (e1,e2). The Haskell implementation starts off easy:
pair e1 e2 >>= \ (i,j) -> body. What remains is to determine the
zip-like combinator to pair-up two generators. The naive implementation:
seq2G :: Monad m => m t -> m t1 -> m (t, t1) seq2G e1 e2 = do i <- e1 j <- e2 return (i,j)alas is not satisfactory:
e2is in the evaluation context of
e1, and hence control-dependent on
e1. Somehow we have to exclude
e2from the evaluation contexts of each other, we have to isolate non-deterministic effects of
e2and prevent them from affecting each other.
LogicT lets us run a non-deterministic computation
in an isolated `sand-box'.
msplit :: (LogicT t, MonadPlus (t m), Monad m) => t m a -> t m (Maybe (a, t m a))That is,
e1in the sandbox and reports the outcome:
x; the computation
e1'represents the remaining non-deterministic choices. Technically speaking,
msplitreifies the non-determinism effect as a
Maybe (a, t m a)value. This sandbox isolation is exactly what is needed for the parallel combinator:
par2G :: (LogicT t, MonadPlus (t m), Monad m) => t m t1 -> t m t2 -> t m (t1, t2) par2G e1 e2 = do iR <- msplit e1 jR <- msplit e2 case (iR,jR) of (Just (i,e1'), Just (j,e2')) -> return (i,j) `mplus` par2G e1' e2' _ -> mzeroWe determine the first choice of
e1and, separately, the first choice of
e2, and pair them up as the first choice of the
par2G e1 e2.
The story concludes with the parallel loop
pl1' = observe $ iter Nothing $ do -- loop header (i,j) <- par2G (catchError' (in_order2 (make_full_tree 3))) (catchError' (enumFromToG 'a' 'e')) lift . print $ (i,j) -- loop bodyLexically, the two generator expressions are not in scope of either loop variable binding. The print-out shows the two generators indeed run in parallel. Neither of the two generators is in the control scope of the other. Binding and control scopes match again.
Our pairing combinator
par2G does not report which of the two generators
finished first. We describe a more general combinator below.
The Haskell and OCaml generators derived on this page effectively
run on the spaghetti stack, one may say, which they inherit from
the implementation of non-determinism. The Haskell and OCaml
generators hence can run side-by-side. Recall that
our generators can be represented as -- reified into -- a lazy stream
msplit). Writing a
zipWith-like function to process two
streams in parallel is straightforward then.
We demonstrate the parallel composition of Haskell and OCaml generators on the famous ``same-fringe problem'', determining if two binary trees have the same fringe, that is, yield the same sequence of labels when traversed in a particular order. Crucially, we must stop the traversal as soon as the mismatch is found, returning it. To make clear the early traversal termination, we print the labels as we traverse the tree. The printing operation makes the traversal effectful and hence precludes lazy lists.
We solve the problem by defining the parallel loop -- the
zipWith-like combinator for streams. We show the Haskell code first.
To let the zipping function deal with the streams of unequal size, we
make its argument types to be
Maybe a, with
signifying stream's end:
zipWithL :: (Monad m, LogicT t, MonadPlus (t m)) => (Maybe a -> Maybe b -> c) -> t m a -> t m b -> t m c zipWithL f m1 m2 = do r1 <- msplit m1 r2 <- msplit m2 case (r1,r2) of (Nothing, Nothing) -> return $ f Nothing Nothing (Just (v1,_),Nothing) -> return $ f (Just v1) Nothing (Nothing,Just (v2,_)) -> return $ f Nothing (Just v2) (Just (v1,t1),Just (v2,t2)) -> return (f (Just v1) (Just v2)) `mplus` zipWithL f t1 t2The function
same_fringebelow takes two generator expressions and runs them side-by-side, generating the stream of mismatches. The one-element stream of
(Nothing,Nothing)denotes the complete match.
same_fringe :: (Monad m, LogicT t, MonadPlus (t m), Eq a) => ErrorT a (t m) () -> ErrorT a (t m) () -> t m (Maybe a, Maybe a) same_fringe m1 m2 = zipWithL (,) (catchError' m1) (catchError' m2) >>= check where check r@(Nothing,_) = return r check r@(_,Nothing) = return r check r@(Just x,Just y) | x /= y = return r check _ = mzeroThe following function compares the in-order sequences of labels of two trees, returning the first mismatch.
sfringe_test :: MonadIO m => Tree -> Tree -> m (Maybe Label, Maybe Label) sfringe_test t1 t2 = observe $ same_fringe (in_order2 t1) (in_order2 t2)The generator
in_order2, described earlier on this page, prints the trace of the traversal. The trace reveals that the two trees are indeed traversed side-by-side; the traversal stops as soon as the first mismatch is found. We may also print, say, the first three mismatches.
The OCaml implementation is the same. We define a parallel loop over two generators:
let for_loop_2 p gen1 gen2 body : unit = let rec loop = function | (Nil,Nil) -> body None None | (Cons (v1,_),Nil) -> body (Some v1) None | (Nil,Cons (v2,_)) -> body None (Some v2) | (Cons (v1,t1), Cons (v2,t2)) -> body (Some v1) (Some v2); loop (t1(),t2()) in loop (msplit p gen1, msplit p gen2)employing it to write the generator for mismatches of two other generators
let pmismatch : ('a option * 'a option) stream Delimcc.prompt = new_prompt () let same_fringe p gen1 gen2 = for_loop_2 p gen1 gen2 (fun x y -> match (x,y) with | (None,_) | (_,None) as r -> yield pmismatch r | (Some x,Some y) as r when not (x = y) -> yield pmismatch r | _ -> ())We may run
same_fringeas any other generator, to obtain the first mismatch or the first
The complete code for the OCaml examples.
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML