# How to restrict a monad without breaking it

By eliminating duplicate possible worlds, a set monad can dramatically accelerate non-deterministic computations. But such an efficient monad does not exist -- or so it has been believed. We describe the challenges and two ways to overcome them, taking hints from partial evaluation and monadic reflection.

## Introduction

Non-determinism was first introduced into computing in 1959, in the Turing-award paper by Rabin and Scott: ``A nondeterministic automaton has, at each stage of its operation, several choices of possible actions... A nondeterministic automaton is not a probabilistic machine but rather a machine with many choices in its moves. At each stage of its motion across the tape it will be at liberty to choose one of several new internal states''. Rabin and Scott also introduced a method to deterministically reason about (and, eventually, implement) such non-deterministic machines: keep track of all currently possible states -- of all `possible worlds'. Each internal state may transition to zero, one or more new internal states. The collection of currently possible states is updated as the computation progresses. Such a collection could be a list -- hence the famous List monad. However, if two new states of the machine turn out the same, surely it is redundant to track both. The order among possible states may have no significance either. A better choice for the collection of currently possible states is a set. That is exactly what Rabin and Scott chose, in their now-textbook conversion of the non-deterministic finite automaton to a deterministic one.

The possible worlds can be tracked in `parallel' (in the breadth-first fashion) or one after another (backtracking). To detect duplicate intermediate states, one has to store the ones already tracked. The Set monad hence may be regarded as a memoization, trading space for time. When the intermediate states are very numerous, the opposite trade-off may be advisable. In other cases, with many duplicates, the Set monad is exponentially worthwhile.

Perhaps for these theoretical and practical reasons, the question of Set monads comes up regularly on blogs and mailing lists. The answers have been disappointing. We recount the challenges after the background section. First, there is a misconception that the Set monad is impossible. Rather, Set has to be a restricted monad. The restrictions are unpalatable, however. It turns out that a genuine, unrestricted Set monad exists -- but in its naive formulation it is a joke. Its CPS version is more efficient, to a limit. It does make the point that a restricted monad can be realized as a regular, restriction-free continuation monad. Yet efficiency may suffer. Some computations that should take linear time in a Set monad take exponential time in the naive CPS version. Furthermore, it seems this inefficiency is inherent and there is nothing we can do.

The last two sections present approaches to attain the expected efficiency. The first insight comes from partial evaluation: take advantage and propagate the statically known information. For example, the choice `choose [1,2]` signals that all intermediate values at that point in the computation are integers and hence may be collected in a set. The second insight is that the composition of monadic reflection and reification is often an optimization. Efficient and restriction-free Set monads exist after all.

References
Michael O. Rabin and Dana Scott: Finite Automata and Their Decision Problems
IBM Journal of Research and Development 3 (2), April 1959, pp. 114-125.
doi:10.1147/rd.32.0114

Michal Armoni and Mordechai Ben-Ari: The concept of nondeterminism: its development and implications for teaching
ACM SIGCSE Bulletin, 41(2), June 2009, pp. 141-160.
doi:10.1145/1595453.1595495

For the sake of reference, we recall the `MonadPlus` abstraction for non-deterministic computations in Haskell and its best-known implementation, the List monad. The running examples will show off the duplication of possible worlds, which clearly blows up in one example. The efficient Set monads described later on the page speed up such computations exponentially.

Like other effects, non-determinism is represented in Haskell as a monad. Non-deterministic computations are built with the standard monadic operations `return` and `(>>=)` and two primitives: `mplus m1 m2` for the non-deterministic choice between computations `m1` and `m2`, and `mzero` for failure, the choice among zero alternatives. The type class `MonadPlus` specifies the operations as

```
class Monad m => MonadPlus m where
  mzero :: m a
  mplus :: m a -> m a -> m a
```
thus defining the `monad for non-determinism'. All our Set monads (except the restricted version) are the instances of `MonadPlus`. The operation to non-deterministically `choose` a value among several is generically defined as
```
choose :: MonadPlus m => [a] -> m a
choose = foldr (\x c -> return x `mplus` c) mzero
```
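As a quick illustration (a sketch; the name `chooseList` is ours, not from the original code), instantiating `choose` at the List monad merely reproduces the argument list:

```haskell
import Control.Monad (MonadPlus(..))

choose :: MonadPlus m => [a] -> m a
choose = foldr (\x c -> return x `mplus` c) mzero

-- At the List instance, return x = [x] and mplus = (++),
-- so choose rebuilds exactly its input list.
chooseList :: [Int]
chooseList = choose [1,2,3]
-- chooseList == [1,2,3]
```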
(One could have picked `choose` to be the primitive non-deterministic operation.) The first running example computes the indices, among the first 30, of those triangular numbers that are sums of two triangles:
```
ptriang :: MonadPlus m => m Int
ptriang = do
  let triang n = n * (n+1) `div` 2
  k <- choose [1..30]
  i <- choose [1..k]
  j <- choose [1..i]
  if triang i + triang j == triang k then return k else mzero
```
The example is defined generically for any `MonadPlus` implementation. The code plainly matches the description of the problem: pick two triangular numbers that sum to a triangular number and return the index of the latter. The unlucky choice is a failure. Such clarity of expression was the chief motivation for non-determinism: ``The main advantage of these [non-deterministic] machines is the small number of internal states that they require in many cases and the ease in which specific machines can be described'' (Rabin and Scott, 1959).

One implementation of non-determinism is to collect all `states' -- all values that could possibly be produced at some point in the computation -- in a List. Hence the List monad:

```
instance Monad [] where
  return x = [x]
  m >>= f  = concatMap f m

instance MonadPlus [] where
  mzero = []
  mplus = (++)
```

There is much to like about the List monad, not least its simplicity. Thanks to laziness, the possible worlds are tracked one at a time; hence the List monad realizes non-determinism as backtracking. In this implementation, our example gives

```
ptriang_list :: [Int]
ptriang_list = ptriang
-- [3,6,8,10,11,13,15,16,18,20,21,21,23,23,23,26,27,28,28,28]
```
The result, shown in the comments, has many duplicates: some triangular numbers can be decomposed into two triangles in more than one way. Since we are interested only in the existence of the decomposition, computing the duplicates is a waste. This waste, tracking duplicate possible worlds, can blow up, as the following example by Petr Pudlak makes clear.
```
step :: MonadPlus m => Int -> m Int
step i = choose [i, i+1]

stepN :: MonadPlus m => Int -> m Int
stepN 0 = return 0
stepN n = stepN (n-1) >>= step
```
In this Pascal-triangle--like example, `stepN n` is essentially `choose [0..n]`, obtained by iterating `step` `n` times:
```
stepN5_list :: [Int]
stepN5_list = stepN 5
-- [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5]
```

There are many, many ways to choose `2`, for instance. Tracking these many redundant choices is harmful, leading to the exponential explosion:

```
stepN18len_list = length $ stepN 18
-- 262144
-- (0.39 secs, 284107216 bytes)

stepN19len_list = length $ stepN 19
-- 524288
-- (0.78 secs, 567120392 bytes)
```

It is natural to collect the possible states (values reached at some point) in a set. Rabin and Scott did exactly that in their 1959 paper. They studied finite automata, whose states are easy to compare, by their labels. The comparison may be difficult or impossible in general.

A Set monad analogous to the List monad is allegedly impossible. The best we can do, it would seem, is to define Set as a restricted monad for non-determinism. The result is efficient, as expected -- but it is not a monad at all. Later sections will reveal that the Set monad is not as impossible as it appears.

The list monad tracks all values that a non-deterministic computation may produce at some point -- from the initial point through the completion. Duplicate values mean redundant tracking. To eliminate the duplicates, the intermediate values should be collected in a set rather than a list. By analogy with the List monad, we need the Set monad. Alas, we cannot write `instance Monad Set` because we cannot define its required method `return :: a -> Set a`. Only comparable values can be put into a set; therefore `Set a` requires the `Ord a` constraint. However, in a monad, `return` must be polymorphic, without any constraints (after all, `return` is meant to be a natural transformation).

Therefore, one often hears of restricted monads -- monads whose `return` and `(>>=)` operations are not fully polymorphic: the types of their arguments are constrained, e.g., to be members of certain type classes. A `Set a` data type can be made an instance of a restricted monad. Incidentally, Set is the main (and, it seems, the only) motivation for restricted monads. Several implementations of restricted monads have been suggested. Without committing to any in particular, we define the return and bind operations as mere functions rather than methods.

```
retS :: Ord a => a -> Set a
retS = S.singleton

bindS :: (Ord a, Ord b) => Set a -> (a -> Set b) -> Set b
bindS m f = S.foldr (\x r -> f x `union` r) S.empty m
```
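A quick check (a sketch; `demoS` is our own name) shows that duplicates are merged at every `bindS`, not only at the end:

```haskell
import Data.Set (Set)
import qualified Data.Set as S

retS :: Ord a => a -> Set a
retS = S.singleton

bindS :: (Ord a, Ord b) => Set a -> (a -> Set b) -> Set b
bindS m f = S.foldr (\x r -> f x `S.union` r) S.empty m

-- The worlds {0,1} step to {0,1} and {1,2}; the duplicate 1
-- is eliminated by the union inside bindS.
demoS :: Set Int
demoS = S.fromList [0,1] `bindS` \i -> S.fromList [i, i+1]
-- S.toList demoS == [0,1,2]
```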
`Set` here and throughout the web page means the one in `Data.Set`; the module name will be abbreviated to `S`. The `step` example takes the form:
```
stepS :: Int -> Set Int
stepS i = fromAscList [i, i+1]

stepNS :: Int -> Set Int
stepNS 0 = retS 0
stepNS n = stepNS (n-1) `bindS` stepS

stepNS5 :: Set Int
stepNS5 = stepNS 5
-- fromList [0,1,2,3,4,5]
```

The result of `stepNS 5` no longer has any duplicates. Compared to the List-monad implementation from the previous section, the example runs much faster:

```
stepNS18len = S.size $ stepNS 18
-- 19
-- (0.00 secs, 1102168 bytes)

stepNS19len = S.size $ stepNS 19
-- 20
-- (0.00 secs, 1641352 bytes)
```
As expected, collecting the possible choices for the intermediate results in a set has eliminated redundant computations and improved efficiency.

However, the restricted Set monad, like any restricted monad, is quite restricted. It is not a monad at all. For example, the monad law `return x >>= f == f x` does not generally hold: while the right-hand side is well-typed for any `x :: a` and `f :: a -> m b`, the left-hand side is well-typed only for `x` of certain restricted types. Many monadic idioms and functions become impossible -- such as the function `ap :: Monad m => m (a -> b) -> m a -> m b` used in a pattern like `return f `ap` x1 `ap` ... `ap` xn`. Since functions are not generally comparable, we cannot `return` them.

We do not have to sacrifice the monad laws however. Below, the efficiency of the restricted Set monad is attained without imposing restrictions. We can have the restriction-free monad for non-determinism, and eliminate the duplicates too.

## Unrestricted but inefficient Set monads

We cannot make `Data.Set.Set` an instance of the `Monad` type class, as we have just seen. Yet a set monad -- a non-determinism monad that produces the set of possible results -- does exist. We show two. One of them is a joke; the other is more efficient. Both can be significantly accelerated while still remaining ordinary, unrestricted monads. The acceleration is the subject of the next two sections. This section is the starting point. It shows that a restricted monad, a monad with restrictions on the types of produced values, can be implemented as an ordinary continuation monad. Restricted monads are avoidable.

We start with the List monad, which collects the currently possible values in a list and produces the list of possible results. Converting the latter list to a set gives a non-determinism monad in which the results are observed as a set.

```
runSnaive :: Ord a => [a] -> Set a
runSnaive = fromList

ptriang_naive :: Set Int
ptriang_naive = runSnaive ptriang
-- fromList [3,6,8,10,11,13,15,16,18,20,21,23,26,27,28]
```

Compared to `ptriang_list` earlier, there are no longer duplicates in the result. This `solution' must be a joke: the duplicates are still present during the computation and are removed only at the very end. (That is why the `Ord` constraint, requiring values to be comparable, appears just in the `run` function.) Therefore, `stepN n` still runs in time exponential in `n`. Here is an even simpler example, with the exponential number of identical choices:

```
-- repeated choices from [1,1]
dups :: MonadPlus m => Int -> m Int
dups 0 = return 1
dups n = choose [1,1] >> dups (n-1)
```
`runSnaive (dups n)` runs in `O(2^n)` time, albeit returning a singleton set.
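To see the joke concretely, here is a self-contained sketch (`demoDups` is our own name). The answer is a one-element set, but it is computed by first building, and then fully traversing, a list of 2^10 ones:

```haskell
import Control.Monad (MonadPlus(..))
import qualified Data.Set as S

choose :: MonadPlus m => [a] -> m a
choose = foldr (\x c -> return x `mplus` c) mzero

dups :: MonadPlus m => Int -> m Int
dups 0 = return 1
dups n = choose [1,1] >> dups (n-1)

runSnaive :: Ord a => [a] -> S.Set a
runSnaive = S.fromList

-- fromList must walk all 2^10 duplicate worlds to produce {1}
demoDups :: S.Set Int
demoDups = runSnaive (dups 10)
-- S.toList demoDups == [1]
```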

The naivete is less noticeable if we convert the monad to continuation-passing style:

```
-- This is just Ord r => Cont (Set r)
newtype SetC a =
SetC{unSetC :: forall r. Ord r => (a -> Set r) -> Set r}
```
`SetC` is the familiar, ordinary `Cont` monad, with a set as the answer-type. The `Ord` constraint applies to the answer-type but not to the type `a` of the produced values. The run function builds the answer-set, supplying the initial continuation `singleton` and the `Ord` constraint:
```
runSetC :: Ord r => SetC r -> Set r
runSetC m = unSetC m singleton
```
The failing computation `mzero` produces the empty set, and the non-deterministic choice `mplus` pursues both alternatives and unions the results.
```
instance MonadPlus SetC where
  mzero = SetC $ \_ -> empty
  mplus m1 m2 = SetC $ \k -> unSetC m1 k `union` unSetC m2 k
```
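For completeness, here is a self-contained sketch of `SetC`, including the `Monad` instance the text leaves implicit (it is the standard continuation-monad instance; the `Functor`/`Applicative`/`Alternative` boilerplate is required by modern GHC, and `demoC` is our own example name). Note that the intermediate values, here functions, need not be comparable; only the final answers must be:

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Applicative (Alternative(..))
import Control.Monad (MonadPlus(..), ap)
import Data.Set (Set)
import qualified Data.Set as S

newtype SetC a =
  SetC{unSetC :: forall r. Ord r => (a -> Set r) -> Set r}

-- the standard Cont-monad instances, with a Set answer-type
instance Functor SetC where
  fmap f m = SetC $ \k -> unSetC m (k . f)
instance Applicative SetC where
  pure x = SetC $ \k -> k x
  (<*>)  = ap
instance Monad SetC where
  m >>= f = SetC $ \k -> unSetC m (\x -> unSetC (f x) k)

instance Alternative SetC where
  empty     = SetC $ \_ -> S.empty
  m1 <|> m2 = SetC $ \k -> unSetC m1 k `S.union` unSetC m2 k
instance MonadPlus SetC   -- mzero = empty, mplus = (<|>)

runSetC :: Ord r => SetC r -> Set r
runSetC m = unSetC m S.singleton

-- Functions are not members of Ord, yet they flow through freely.
demoC :: Set Int
demoC = runSetC $ do
  f <- pure (+1) `mplus` pure (*2)
  x <- pure 3 `mplus` pure 4
  pure (f x)
-- S.toList demoC == [4,5,6,8]
```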
We have thus converted the restricted Set monad into an ordinary, unrestricted monad.

The `SetC` monad may look like the same joke as the un-CPS, naive version: although the output of our examples has no duplicates, the examples themselves do not run any faster. Actually, that is not quite true. Some examples in the `SetC` monad can run exponentially faster. The example `dups` does run faster in GHC, in some circumstances.

In the naive Set monad, a bit of inlining and substitution gives:

```
runSnaive (dups (n+1))
=== fromList (choose [1,1] >> dups n)
=== fromList (dups n ++ dups n)
{common sub-expression elimination by a clever optimizer}
=== let x = dups n
    in fromList (x ++ x)
```
The last step relied on a clever optimizer noticing the two occurrences of `dups n :: [Int]` and lifting that common sub-expression. Alas, it does not help the overall time complexity. Although the `2^(n+1)`-element list `x ++ x` will be computed faster, `fromList` still needs to traverse it completely when converting to a set. The overall complexity remains `O(2^n)`. What would have helped are the properties of `S.fromList` and of the set union, which give `fromList (x++x) === fromList x`. Alas, this domain-specific knowledge is not available to a general-purpose optimizer.

Running the same example in the CPS version `SetC` brings an interesting twist:

```
runSetC (dups (n+1))
=== unSetC (choose [1,1] >> dups n) singleton
=== let k _ = unSetC (dups n) singleton
    in unSetC (choose [1,1]) k
=== let k _ = runSetC (dups n)
    in k () `union` k ()
{common sub-expression elimination by a clever optimizer}
=== let k _ = runSetC (dups n)
        x   = k ()
    in x `union` x
```
Since `runSetC (dups n)` is a singleton set for any `n`, `x `union` x` takes constant time. Therefore, `runSetC (dups (n+1))` runs in linear time. The elimination of the common sub-expression `k ()` now makes a huge difference in time complexity. The GHC optimizer indeed performs such a common-subexpression elimination in certain cases (when the answer-type is specialized).

Unfortunately, lucky cases like the above are isolated. The `stepN` example still explodes exponentially with `SetC`. The cause is not hard to see: `runSetC (stepN n)` will have to compute `runSetC (choose [0,1] >>= k)`, which is `runSetC (k 0) `union` runSetC (k 1)`. There are no longer common sub-expressions; `runSetC (k i)`, which is essentially `runSetC (stepN (n-1))`, produces different, although mostly overlapping, results for different `i`.

We have seen two proper Set monads: the naive one and its CPS version. The latter is less of a joke, taking some advantage of sets to speed up some computations. The claim that a restricted Set monad can be converted to a proper, non-restricted monad gains a bit of credence -- only a small bit so far. The principal flaw of both Set monads is the independent tracking of the computations forked from a non-deterministic choice. The forked computations may go through many common states, but those states are not shared. Hence the performance remains sub-optimal. The following two sections fix the problem.

References
The complete code for the article

Petr Pudlak: Constructing efficient monad instances on `Set` (and other containers with constraints) using the continuation monad
<http://stackoverflow.com/q/12183656/1333025>

## Efficient Set monad through knowledge propagation

The first approach to improve the naive Set monad takes a clue from partial evaluation: use the statically available knowledge and propagate it. This section develops the first efficient and restriction-free Set monad.

The naive Set monad is a joke because the set is produced only at the end of the computation. The possible intermediate results are being collected in a list, which lets duplicates accumulate. To remove the duplicates, we should be able to compare the possible intermediate results. They could be functions, however, or other incomparable data. Hence we have to deal with the values that are known to be comparable, and the ones that might not be. The collection for the possible intermediate results has to be `a blend', a disjoint union of a set and a list:

```
data SetM a where
  SMOrd :: Ord a => S.Set a -> SetM a
  SMAny :: [a] -> SetM a
```
`SetM` is a GADT: `Ord a` constrains the type of the collected values, but only within the `SMOrd` alternative. Just like List, `SetM` is an ordinary, restriction-free monad:
```
instance Monad SetM where
  return x = SMAny [x]
  m >>= f  = collect . Prelude.map f $ setMtoList m

setMtoList :: SetM a -> [a]
setMtoList (SMOrd x) = S.toList x
setMtoList (SMAny x) = x
```
The operation `(>>=)` is quite like the one of the List monad, computing the results for each possible world of `m` and collecting them. The collection is not quite the concatenation:
```
collect :: [SetM a] -> SetM a
collect []  = SMAny []
collect [x] = x
collect ((SMOrd x):t) = case collect t of
  SMOrd y -> SMOrd (S.union x y)
  SMAny y -> SMOrd (union x (fromList y))
collect ((SMAny x):t) = case collect t of
  SMOrd y -> SMOrd (union y (fromList x))
  SMAny y -> SMAny (x ++ y)
```
If all `SetM a` in the list `[SetM a]` are `SMAny`, then the collection is indeed concatenation. If at least one of them is `SMOrd`, we know that all values of type `a` are comparable, even those contained in `SMAny`. In other words, if we somehow know that some of the currently possible intermediate values are comparable, then all of them are. This is the principle of propagating knowledge. The `MonadPlus` instance uses that principle too:
```
instance MonadPlus SetM where
  mzero = SMAny []
  mplus (SMAny x) (SMAny y) = SMAny (x ++ y)
  mplus (SMAny x) (SMOrd y) = SMOrd (union y (fromList x))
  mplus (SMOrd x) (SMAny y) = SMOrd (union x (fromList y))
  mplus (SMOrd x) (SMOrd y) = SMOrd (union x y)
```
If at least one choice yields comparable results, the results of the other choice are comparable too, even if we did not know that (see the `SMAny` alternative) until now.

We get to know which intermediate results are comparable because the programmer tells us: we ask the programmer to use an optimized version of `choose` when choosing among comparable alternatives:

```
chooseOrd :: Ord a => [a] -> SetM a
chooseOrd x = SMOrd (fromList x)
```

One may argue that `choose` should always have the `Ord` constraint. (We could instead ask the programmer to use the specialized `mplusOrd :: Ord a => m a -> m a -> m a` where appropriate.) All that remains is to extract the set as the final result:

```
runSetM :: Ord a => SetM a -> Set a
runSetM (SMOrd x) = x
runSetM (SMAny x) = S.fromList x
```

If we re-write our examples with the efficient `choose`, we find that they indeed run much faster; `stepN n` runs in linear time, as expected.
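Concretely, here is the development above assembled into a self-contained sketch (the `Functor`/`Applicative` instances are added for modern GHC; `stepNO` is our name for `stepN` rewritten with the annotated `chooseOrd`). The intermediate collections stay sets at every bind, so `stepNO n` is linear in `n`:

```haskell
{-# LANGUAGE GADTs #-}
import Control.Monad (ap)
import qualified Data.Set as S

data SetM a where
  SMOrd :: Ord a => S.Set a -> SetM a
  SMAny :: [a] -> SetM a

setMtoList :: SetM a -> [a]
setMtoList (SMOrd x) = S.toList x
setMtoList (SMAny x) = x

collect :: [SetM a] -> SetM a
collect []  = SMAny []
collect [x] = x
collect (SMOrd x : t) = case collect t of
  SMOrd y -> SMOrd (S.union x y)
  SMAny y -> SMOrd (S.union x (S.fromList y))
collect (SMAny x : t) = case collect t of
  SMOrd y -> SMOrd (S.union y (S.fromList x))
  SMAny y -> SMAny (x ++ y)

instance Functor SetM where
  fmap f = SMAny . map f . setMtoList
instance Applicative SetM where
  pure x = SMAny [x]
  (<*>)  = ap
instance Monad SetM where
  m >>= f = collect (map f (setMtoList m))

chooseOrd :: Ord a => [a] -> SetM a
chooseOrd = SMOrd . S.fromList

runSetM :: Ord a => SetM a -> S.Set a
runSetM (SMOrd x) = x
runSetM (SMAny x) = S.fromList x

-- stepN with the annotated choose: sets all the way through
stepNO :: Int -> SetM Int
stepNO 0 = pure 0
stepNO n = stepNO (n-1) >>= \i -> chooseOrd [i, i+1]
-- S.toList (runSetM (stepNO 18)) == [0..18]
```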

Thus from program annotations -- by asking the programmer to tell us of some comparable values -- we gain static knowledge. We then arrange to propagate that static knowledge throughout the computation at run-time. As in partial evaluation, the speedup can be exponentially impressive.

References
The complete code for the article

## Efficient Set monad through reflection-reification

We have developed one serious, efficient, restriction-free Set monad by improving the naive version. This section improves the CPS variant, the `SetC` monad, illustrating the principle of monadic reflection and reification.

Reification represents an effectful computation as a value. For example, a non-deterministic computation may be represented by the (lazy) tree of its choices -- or by the set of possible results. The `SetC` monad already has a function to reify a `SetC` computation into a set:

```
reifySet :: Ord r => SetC r -> Set r
reifySet = runSetC
```

Monadic reflection turns the representation of a computation to the computation itself. For example, the reflection takes a set of values and non-deterministically selects one. Our `choose` was essentially such a function:

```
reflectSet :: Ord r => Set r -> SetC r
reflectSet s = SetC $ \k -> S.foldr (\x r -> k x `union` r) S.empty s
```

From the types, `reifySet` and `reflectSet` look like inverses. They indeed are, as one easily checks:

```
reifySet (reflectSet set) == set
reflectSet (reifySet m)   ~  m
```
Reflecting and then reifying a set gives back an equal set. Reifying and then reflecting a computation gives an observationally equivalent computation. Any reification-reflection pair must satisfy these laws.

The second law says that the composition `reflectSet . reifySet` is a sound program transformation that preserves equivalence. It is often an optimization transformation. Indeed, the reified computation probably has done calculations between the choices. The reification result, a set, records only the results of the choices. Reflecting the set gives a computation that chooses among the already computed results.

We apply this optimization recipe to `SetC`. Rather than changing the monad, we ask the programmer to add `reflectSet . reifySet` `annotations' at appropriate places, for instance, around recursive calls. The `stepNR` computation will now read:

```
stepNR :: Int -> SetC Int
stepNR 0 = return 0
stepNR n = (reflectSet . reifySet $ stepNR (n-1)) >>= step
```
It runs in linear time, giving the exponential improvement.
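For reference, the reflection-reification development assembled into one self-contained sketch (the instance boilerplate for modern GHC is ours; everything else follows the text):

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Applicative (Alternative(..))
import Control.Monad (MonadPlus(..), ap)
import Data.Set (Set)
import qualified Data.Set as S

newtype SetC a =
  SetC{unSetC :: forall r. Ord r => (a -> Set r) -> Set r}

instance Functor SetC where
  fmap f m = SetC $ \k -> unSetC m (k . f)
instance Applicative SetC where
  pure x = SetC $ \k -> k x
  (<*>)  = ap
instance Monad SetC where
  m >>= f = SetC $ \k -> unSetC m (\x -> unSetC (f x) k)
instance Alternative SetC where
  empty     = SetC $ \_ -> S.empty
  m1 <|> m2 = SetC $ \k -> unSetC m1 k `S.union` unSetC m2 k
instance MonadPlus SetC

choose :: MonadPlus m => [a] -> m a
choose = foldr (\x c -> return x `mplus` c) mzero

reifySet :: Ord r => SetC r -> Set r
reifySet m = unSetC m S.singleton

reflectSet :: Ord r => Set r -> SetC r
reflectSet s = SetC $ \k -> S.foldr (\x r -> k x `S.union` r) S.empty s

step :: MonadPlus m => Int -> m Int
step i = choose [i, i+1]

-- the annotation merges the set of results at every recursion level
stepNR :: Int -> SetC Int
stepNR 0 = return 0
stepNR n = (reflectSet . reifySet $ stepNR (n-1)) >>= step
-- S.toList (reifySet (stepNR 18)) == [0..18]
```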

We have thus presented the second proper, efficient Set monad. The bare `SetC` monad is inefficient because the results of non-deterministic choices are unioned only at the very end. The reflection-reification `annotation' splits the computation into chunks. The sets of results are unioned sooner, at chunk boundaries, eliminating the duplicates among the intermediate results. The performance improves as expected. Although reflection-reification looks quite different from the first optimization principle of knowledge propagation, it, too, takes and follows through on the hints left by the programmer. The appearance of `reflectSet . reifySet` in the code is such a hint, telling us that the intermediate values at that point in the program are comparable and hence the duplicates can be detected and removed.

References
The complete code for the article

Andrzej Filinski: Representing Monads. POPL 1994
The paper introduced monadic reflection and reification and used it to represent monads in `direct style'.

Embedded probabilistic programming
Section 3 of the paper tells how reification and reflection accelerate the probabilistic inference

### Last updated July 4, 2013

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org