By eliminating duplicate possible worlds, a set monad can dramatically accelerate non-deterministic computations. But such an efficient monad does not exist -- or so it has been believed. We describe the challenges and two ways to overcome them, taking hints from partial evaluation and monadic reflection.
The possible worlds can be tracked in `parallel' (in the breadth-first fashion) or one after another (backtracking). To detect duplicate intermediate states, one has to store the ones already tracked. The Set monad hence may be regarded as a memoization, trading space for time. When the intermediate states are very numerous, the opposite trade-off may be advisable. In other cases, with many duplicates, the Set monad is exponentially worthwhile.
Perhaps for these theoretical and natural reasons, the question of Set monads comes up regularly on blogs and mailing lists. The answers have been disappointing. We recount the challenges after the background section. First, there is a misconception that the Set monad is impossible. Rather, Set has to be a restricted monad. The restrictions are unpalatable however. It turns out, a genuine, unrestricted Set monad exists -- but in its naive formulation is a joke. Its CPS version is more efficient, to a limit. It does make a point that a restricted monad can be realized as a regular, restriction-free continuation monad. Yet efficiency may suffer. Some computations that should take linear time in a Set monad take exponential time in the naive CPS version. Furthermore, it seems this inefficiency is inherent and there is nothing we can do.
The last two sections present approaches to attain the expected
efficiency. The first insight comes from partial evaluation: take
advantage and propagate the statically known information. For example,
the choice choose [1,2]
signals that all intermediate values at that
point in the computation are integers and hence may be collected in a
set. The second insight is that the composition of monadic reflection
and reification is often an optimization. Efficient and
restriction-free Set monads exist after all.
Michal Armoni and Mordechai Ben-Ari: The concept of nondeterminism: its development and implications for teaching
ACM SIGCSE Bulletin, 41(2), June 2009, pp. 141-160.
10.1145/1595453.1595495
MonadPlus
abstraction for
non-deterministic computations in Haskell and its most known
implementation, the List monad. The running examples will show off the
duplication of possible worlds, which clearly blows off in one
example. The efficient Set monads described later on the page speed up
such computations exponentially.
As other effects, nondeterminism is represented
in Haskell as a Monad. Non-deterministic computations are built
with the standard monadic operations return
and (>>=)
and
two primitives: mplus m1 m2
for the non-deterministic choice between computations m1
and m2
; mzero
for failure, the choice among zero alternatives.
The type class MonadPlus
specifies the operations as
class Monad m => MonadPlus m where mzero :: m a mplus :: m a -> m a -> m athus defining the `monad for non-determinism'. All our Set monads (except the restricted version) are the instances of
MonadPlus
. The operation
to non-deterministically choose
a value among several is generically
defined aschoose :: MonadPlus m => [a] -> m a choose = foldr (\x c -> return x `mplus` c) mzero(One could have picked
choose
to be the primitive non-deterministic
operation.) The first running example computes the indices, among the first
30, of those triangular numbers that are sums of two triangles:ptriang :: MonadPlus m => m Int ptriang = do let triang n = n * (n+1) `div` 2 k <- choose [1..30] i <- choose [1..k] j <- choose [1..i] if triang i + triang j == triang k then return k else mzeroThe example is defined generically for any
MonadPlus
implementation.
The code plainly matches the description of the Pythagorean triangle
problem: pick two triangles that sum to a triangle and return the
index of the latter. The unlucky choice is a failure. Such clarity of
expression was the chief motivation for non-determinism: ``The main
advantage of these [non-deterministic] machines is the small number of
internal states that they require in many cases and the ease in which
specific machines can be described'' (Rabin and Scott, 1959).One implementation of non-determinism is to collect all `states' -- all values that could possibly be produced at some point in the computation -- in a List. Hence the List monad:
instance Monad [] where return x = [x] m >>= f = concatMap f m instance MonadPlus [] where mzero = [] mplus = (++)
There is much to like about the List monad, not the least is simplicity. Thanks to laziness, the possible worlds are tracked one at a time; hence the List monad realizes non-determinism as backtracking. In this implementation, our example gives
ptriang_list :: [Int] ptriang_list = ptriang -- [3,6,8,10,11,13,15,16,18,20,21,21,23,23,23,26,27,28,28,28]The result, shown in the comments, has many duplicates: Some triangular numbers can be decomposed into two triangles in more than one way. Since we are interested only in the existence of the decomposition, computing the duplicates is a waste. This waste, tracking duplicate possible worlds, can blow-up, as the following example by Petr Pudlak makes clear.
step :: MonadPlus m => Int -> m Int step i = choose [i, i+1] stepN :: MonadPlus m => Int -> m Int stepN 0 = return 0 stepN n = stepN (n-1) >>= stepIn this Pascal-triangle--like example,
stepN n
is essentially choose [0..n]
, obtained by iterating step
n
times:stepN5_list :: [Int] stepN5_list = stepN 5 -- [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5]
There are many, many ways to choose 2
, for instance. Tracking these
many redundant choices is harmful, leading to the exponential explosion:
stepN18len_list = length $ stepN 18 -- 262144 -- (0.39 secs, 284107216 bytes) stepN19len_list = length $ stepN 19 -- 524288 -- (0.78 secs, 567120392 bytes)
It is natural to collect the possible states (values reached at some point) in a set. Rabin and Scott did exactly that in their 1959 paper. They studied finite automata, whose states are easy to compare, by their labels. The comparison may be difficult or impossible in general.
The list monad tracks all values that a non-deterministic computation
may produce at some point -- from the initial point through the
completion. Duplicate values mean redundant tracking. To eliminate
the duplicates, the intermediate values should be collected in a set
rather than a list. By analogy with the List monad, we need the Set
monad. Alas, we cannot write instance Monad Set
because we cannot
define its required method return :: a -> Set a
. Only comparable
values can be put into a set; therefore Set a
requires the Ord a
constraint. However, in a monad, return
must be polymorphic, without
any constraints (after all, return
is meant to be a natural
transformation).
Therefore, one often hears of restricted monads -- whose return
and (>>=)
operations are not fully polymorphic; the types of
their arguments are constrained, e.g., to be members of certain classes.
A Set a
data type can be made an instance of a restricted
monad. Incidentally, Set is the main (and seems, the only) motivation
for restricted monads. There have been suggested several
implementations of restricted monads. Without committing to any
in particular, we define the return and bind operations as mere functions
rather than methods.
retS :: Ord a => a -> Set a retS = S.singleton bindS :: (Ord a, Ord b) => Set a -> (a -> Set b) -> Set b bindS m f = S.foldr (\x r -> f x `union` r) S.empty m
Set
here and throughout the web page means the one in Data.Set
;
the module name will be abbreviated to S
. The step
example takes the form:stepS :: Int -> Set Int stepS i = fromAscList [i, i+1] stepNS :: Int -> Set Int stepNS 0 = retS 0 stepNS n = stepNS (n-1) `bindS` stepS stepNS5 :: Set Int stepNS5 = stepNS 5 -- fromList [0,1,2,3,4,5]
The result of stepNS 5
has no longer any duplicates. Compared
to the List-monad implementation from the previous section, the example
runs much faster:
stepNS18len = S.size $ stepNS 18 -- 19 -- (0.00 secs, 1102168 bytes) stepNS19len = S.size $ stepNS 19 -- 20 -- (0.00 secs, 1641352 bytes)As expected, collecting the possible choices for the intermediate results in a set has eliminated redundant computations and improved efficiency.
However, the restricted Set monad, as any restricted monad, is quite
restricted. It is not a monad at all. For example, the monad
law return x >>= f == f x
does not generally hold: while the right-hand-size is well-typed
for any x::a
and f :: a -> m b
, the left-hand-side is well-typed
only for x
of certain restricted types. Many monadic idioms and
functions become impossible -- such as the function ap :: Monad m => m (a -> b) -> m a -> m b
used in a pattern like return f `ap` x1 `ap` ... `ap` xn
. Since functions are not
generally comparable, we cannot return
them.
We do not have to sacrifice the monad laws however. Below, the efficiency of the restricted Set monad is attained without imposing restrictions. We can have the restriction-free monad for non-determinism, and eliminate the duplicates too.
Data.Set.Set
to be an instance of the Monad
type
class, as we have just seen. Yet a set monad -- a non-determinism
monad that produces the set of possible results -- does exist. We show
two. One of them is a joke; the other is more efficient. Both can be
significantly accelerated while still remaining the ordinary,
unrestricted monads. The acceleration is the subject of the next two
sections. This section is the starting point. It shows that a
restricted monad, a monad with restrictions on the types of
produced values, can be implemented as an ordinary, continuation
monad. Restricted monads are avoidable.We start with the List monad, which collects the currently possible values in a list and produces the list of possible results. Converting the latter list to a set gives a non-determinism monad in which the results are observed as a set.
runSnaive :: Ord a => [a] -> Set a runSnaive = fromList ptriang_naive :: Set Int ptriang_naive = runSnaive ptriang -- fromList [3,6,8,10,11,13,15,16,18,20,21,23,26,27,28]
Compared to ptriang_list
earlier, there are no longer duplicates
in the result. This `solution' must be a joke: the duplicates are
still present during the computation and are removed only at the very end.
(That is why the Ord
constraint, requiring values to be comparable,
appears just in the run
function.)
Therefore, stepN n
still runs in time exponential in n
. Here is
an even simpler
example, with the exponential number of identical choices:
-- repeated choices from [1,1] dups :: MonadPlus m => Int -> m Int dups 0 = return 1 dups n = choose [1,1] >> dups (n-1)
runSnaive (dups n)
runs in O(2^n)
time, albeit returning the
singular set.The naivete is less noticeable if we turn the monad into the continuation-passing style:
-- This is just Ord r => Cont (Set r) newtype SetC a = SetC{unSetC :: forall r. Ord r => (a -> Set r) -> Set r}
SetC
is the familiar, ordinary Cont
monad, with a set as the
answer-type. The Ord
constrains the answer-type but not
the type a
of the produced values.
The run function builds the answer-set, supplying the initial continuation singleton
and the Ord
constraint:runSetC :: Ord r => SetC r -> Set r runSetC m = unSetC m singletonThe failing computation
mzero
produces the empty set, and
the non-deterministic choice mplus
pursues both alternatives and
unions the results.instance MonadPlus SetC where mzero = SetC $ \k -> empty mplus m1 m2 = SetC $ \k -> unSetC m1 k `union` unSetC m2 kWe have thus converted the restricted Set monad into an ordinary, unrestricted monad.
The SetC
monad may look like the same joke as the un-CPS, naive
version. Although the output of our examples has no duplicates, the examples
themselves do not run any faster. Actually, this is not quite true.
Some examples in the SetC
monad can run exponentially faster. The example dups
does run faster in GHC, in some circumstances.
In the naive Set monad, we obtain with a bit of inlining and substitution:
runSnaive (dups (n+1)) === fromList (choose [1,1] >> dups n) === fromList (dups n ++ dups n) {common sub-expression elimination by a clever optimizer} === let x = dups n in fromList (x ++ x)The last step relied on a clever optimizer noticing the two occurrences of
dups n :: [Int]
and lifting that
common sub-expression.
Alas, it does not help the overall time complexity. Although the 2^(n+1)
-element list x ++ x
will be computed faster, fromList
still
needs to traverse it completely when converting to a set. The overall complexity
remains O(2^n)
. What would have helped are the properties of S.fromList
and of the set union, which give fromList (x++x) === fromList x
. Alas,
this domain-specific knowledge is not available to a general-purpose
optimizer.
Running the same example in the CPS version SetC
brings an interesting
twist:
runSetC (dups (n+1)) === unSetC (choose [1,1] >> dups n) singleton === let k _ = unSetC (dups n) singleton in unSetC (choose [1,1]) >>= k () === let k _ = runSetC (dups n) in k () `union` k () {common sub-expression elimination by a clever optimizer} === let k _ = runSetC (dups n) x = k () in x `union` xSince
runSetC (dups n)
is the singleton for any n
, x `union` x
takes constant time. Therefore, runSetC (dups (n+1))
runs in linear time. The elimination of the common sub-expression k ()
makes a huge difference in time complexity now. The GHC optimizer
indeed does such a common-subexpression-elimination in certain
cases (when the answer-type is specialized).
Unfortunately, the lucky cases like the above are isolated. The stepN
example exponentially explodes with SetC
. The cause is not
hard to see: runSetC (stepN n)
will have to compute runSetC (choose [0,1] >>= k)
, which is runSetC (k 0) `union` runSetC (k 1)
. There are no longer
common sub-expressions; runSetC (k i)
, which is
essentially runSetC (stepN (n-1))
,
produces different, although mostly overlapping results, for different i
.
We have seen two proper Set monads, the naive one and its CPS version. The latter is less of a joke and takes some advantage of sets to speed-up some computations. The claim that a restricted Set monad can be converted to a proper, non-restricted monad gets a bit of credence -- only a small bit so far. The principal flaw of both Set monads is the independent tracking of the computations forked from a non-deterministic choice. The forked computations may go through many common states, but they are not shared. Hence the performance remains sub-optimal. The following two sections fix the problem.
Petr Pudlak: Constructing efficient monad instances on `Set` (and other containers with constraints) using the continuation monad
< http://stackoverflow.com/q/12183656/1333025 >
Purely Functional Lazy Non-deterministic Programming (JFP 2011)
The naive Set monad is a joke because the set is produced only at the end of the computation. The possible intermediate results are being collected in a list, which lets duplicates accumulate. To remove the duplicates, we should be able to compare the possible intermediate results. They could be functions, however, or other incomparable data. Hence we have to deal with the values that are known to be comparable, and the ones that might not be. The collection for the possible intermediate results has to be `a blend', a disjoint union of a set and a list:
data SetM a where SMOrd :: Ord a => S.Set a -> SetM a SMAny :: [a] -> SetM a
SetM
is a GADT: Ord a
constrains the type of the collected values,
but only within the SMOrd
alternative. Just like List, SetM
is an ordinary, restriction-free monad:instance Monad SetM where return x = SMAny [x] m >>= f = collect . Prelude.map f $ setMtoList m setMtoList :: SetM a -> [a] setMtoList (SMOrd x) = S.toList x setMtoList (SMAny x) = xThe operation
(>>=)
is quite like the one of the List monad, computing
the results for each possible world of m
and collecting
them. The collection is not quite the concatenation:collect :: [SetM a] -> SetM a collect [] = SMAny [] collect [x] = x collect ((SMOrd x):t) = case collect t of SMOrd y -> SMOrd (S.union x y) SMAny y -> SMOrd (union x (fromList y)) collect ((SMAny x):t) = case collect t of SMOrd y -> SMOrd (union y (fromList x)) SMAny y -> SMAny (x ++ y)If all
SetM a
in the list [SetM a]
are SMAny
, then the collection is
indeed concatenation. If at least one of them is SMOrd
, we know that all type a
values are comparable, even those contained in SMAny
. In other words, if we somehow know that some of the
currently possible intermediate values are comparable, then all of
them are. This is the principle of propagating knowledge. The MonadPlus
instances uses that principle too:instance MonadPlus SetM where mzero = SMAny [] mplus (SMAny x) (SMAny y) = SMAny (x ++ y) mplus (SMAny x) (SMOrd y) = SMOrd (union y (fromList x)) mplus (SMOrd x) (SMAny y) = SMOrd (union x (fromList y)) mplus (SMOrd x) (SMOrd y) = SMOrd (union x y)If at least one choice yields comparable results, the results of the other choice are comparable too, even if we did not know that (see the
SMAny
alternative) until now.
We get to know which intermediate results
are comparable from the programmer's telling us. We ask the programmer
to use the optimized version to choose
among comparable
alternatives:
chooseOrd :: Ord a => [a] -> SetM a chooseOrd x = SMOrd (fromList x)
One may argue that choose
must always have the Ord
constraint. (We could instead ask the programmer
to use the specialized mplusOrd :: Ord a -> m a -> m a -> m a
where appropriate.) All remains is to extract the
set as the final result:
runSetM :: Ord a => SetM a -> Set a runSetM (SMOrd x) = x runSetM (SMAny x) = S.fromList x
If we re-write our examples with the efficient choose
, we find that
they indeed run much faster; stepN n
runs in linear time, as expected.
Thus from program annotations -- by asking the programmer to tell us of some comparable values -- we gain static knowledge. We then arrange to propagate that static knowledge throughout the computation at run-time. As in partial evaluation, the speedup can be exponentially impressive.
SetC
monad, -- illustrating the principle
of monadic reflection and reification.Reification represents an effectful computation as a value.
For example, a non-deterministic computation may be
represented by the (lazy) tree of its choices -- or by the set
of possible results. The SetC
monad already has a function
to reify a SetC
computation into a set:
reifySet :: Ord r => SetC r -> Set r reifySet = runSetC
Monadic reflection turns the representation of a computation
to the computation itself. For example, the reflection takes
a set of values and non-deterministically selects one.
Our choose
was essentially such a function:
reflectSet :: Ord r => Set r -> SetC r reflectSet s = SetC $ \k -> S.foldr (\x r -> k x `union` r) S.empty s
From the types, reifySet
and reflectSet
look like inverses.
They indeed are, as one can easily check.
reifySet . reflectSet $ set == set reflectSet . reifySet $ m ~ mReflecting and then reifying a set gives the equal set. Reifying and then reflecting a computation gives an observationally equivalent computation. Any reification-reflection pair must satisfy these laws.
The second law says that the composition reflectSet . reifySet
is a sound program transformation that preserves equivalence.
It is often an optimization transformation. Indeed, the
reified computation probably has done calculations between the choices.
The reification result, a set, records only the results of the choices.
Reflecting the set gives a computation that chooses among the
already computed results.
We apply this optimization recipe to SetC
. Rather than changing the monad,
we ask the programmer to add reflectSet . reifySet
`annotations'
at appropriate places, for instance, around recursive calls. The stepNR
computation will now read:
stepNR :: Int -> SetC Int stepNR 0 = return 0 stepNR n = (reflectSet . reifySet $ stepNR (n-1)) >>= stepIt runs in linear time, giving the exponential improvement.
We have thus presented the second proper, efficient Set monad. The
bare SetC
monad is inefficient because the results of
non-deterministic choices are unioned only at the very end. The
reflection-reification `annotation' splits the computation in
chunks. The sets of results are unioned sooner, at chunk boundaries,
hence eliminating the duplicates among the intermediate results. The
performance improves as expected. Although the reflection-reification
looks quite different from the first optimization principle of
knowledge tracking, it, too, takes and follows through
on the hints left by the programmer. The appearance of reflectSet . reifySet
in the code is such a hint, telling us that the
intermediate values at that point in the program are comparable and
hence the duplicates can be detected and removed.
Andrzej Filinski: Representing Monads
POPL 1994
The paper introduced monadic reflection and reification and used it to represent monads in `direct style'.
Embedded probabilistic programming
Section 3 of the paper tells how reification and reflection accelerate the probabilistic inference
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML