do-notation for OCaml
(Joint work with Jacques Carette and Lydia E. van Dijk)
MonadIO
The library provides, in addition to the MonadPlus interface, constructs for fair disjunctions, fair conjunctions, conditionals, pruning, and an expressive top-level interface. Implementing these additional constructs is well-known in models of backtracking based on streams, but was not known to be possible in continuation-based models. We show that all these additional constructs can be generically and monadically realized using a single primitive, which we call msplit. We present two implementations of the library: one using success and failure continuations, and the other using control operators for manipulating delimited continuations.
It seems that something similar (but not identical) to msplit has been discovered at least twice before: under the names unbuild (Andrew Gill, 1996) and destroy (Josef Svenningsson, ICFP02). Our function reflect is also related to unfoldr. One of the earlier versions of Kanren did indeed implement something like destroy, internally using regular list constructors to split a stream.
However, the present function msplit is significantly more general: it has nothing to do with Haskell lists and never uses any list constructors, even internally. The functions msplit and reflect operate over any backtracking computation over any base monad. The presence of the base monad makes things trickier, as we now must ensure that we do not repeat effects. Therefore, the equivalent of the destroy/unfoldr rule of Josef Svenningsson's

    destroy . unfoldr === id

is, in ``small-step semantics'':
    rr (lift m >> mzero)    === lift m >> mzero
    rr (lift m `mplus` tma) === lift m `mplus` (rr tma)
      where rr tm = msplit tm >>= reflect

One can read mzero and mplus as generalized list constructors [] and (:), and read (>>=) as a flipped application. The more complex laws ensure that effects are not duplicated. It is fair to say that whereas the foldr/build and destroy/unfoldr fusion techniques work for Haskell lists, msplit/reflect in the LogicT paper is a fusion law for a general backtracking computation over an arbitrary, perhaps even strict, base monad. That computation may look nothing like a list; in fact, the LogicT paper gives an example with delimited continuations, with abort playing the role of nil and shift the role of cons.
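For concreteness, here is a minimal sketch of the interface in question, following the types in the LogicT paper (the class name LogicM is purely illustrative; the logict package on Hackage calls a similar class MonadLogic):

    import Control.Monad (MonadPlus(..))

    class MonadPlus m => LogicM m where
      msplit :: m a -> m (Maybe (a, m a))

    -- reflect undoes msplit: it rebuilds a computation from its first
    -- answer, if any, and the computation producing the remaining answers
    reflect :: LogicM m => Maybe (a, m a) -> m a
    reflect Nothing       = mzero
    reflect (Just (a, m)) = return a `mplus` m

    -- Fair disjunction, derived generically from msplit: the answers of
    -- the two arguments are interleaved rather than concatenated
    interleave :: LogicM m => m a -> m a -> m a
    interleave m1 m2 = do
      r <- msplit m1
      case r of
        Nothing       -> m2
        Just (a, m1') -> return a `mplus` interleave m2 m1'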
Joint work with Chung-chieh Shan, Dan Friedman and Amr Sabry.
LogicT as an extensible effect
The Haskell Symposium 2015 paper describes in Sec 5 the
extensible-effect implementation of LogicT.
LogicT.tar.gz [12K]
The Haskell code accompanying the paper
The code includes two implementations of LogicT: one based on the two-continuation model of streams, and a direct-style implementation with first-class delimited continuations, using the CCCxe or CCExc libraries. The code has two larger examples: the Missionaries-and-cannibals problem, and playing Tic-Tac-Toe with minimax search and two heuristics. The latter code is rather heavily based on code by Andrew Bromage.
How to take a tail of a functional stream
A brief and de-sugared explanation of splitting a functional stream
Simple fair and terminating backtracking Monad Transformer
A more efficient implementation of LogicT
Josef Svenningsson:
Shortcut Fusion for Accumulating Parameters and Zip-like Functions.
Proc. ICFP'02, 2002, pp. 124-132
A simple MonadPlus and the MonadPlus transformer with fair conjunctions and disjunctions and with notably good termination properties. This implementation terminates where more conventional MonadPlus realizations (e.g., the List monad) may diverge. Here's an example:
    pythagorean_triples :: MonadPlus m => m (Int,Int,Int)
    pythagorean_triples = do
      let number = (return 0) `mplus` (number >>= (return . (+1)))
      i <- number
      guard $ i > 0
      j <- number
      guard $ j > 0
      k <- number
      guard $ k > 0
      guard $ i*i + j*j == k*k
      return (i,j,k)

    test = take 7 $ runM Nothing pythagorean_triples

We stress that
i, j, and k are all infinite streams of numbers. If we run this example with the standard List monad or with list comprehensions, divergence is imminent, because the backtracking ``gets stuck'' on k. Our fair stream implementation terminates. Our implementation also terminates and gives a stream of non-trivial answers if we remove the i>0 guards and make number left-recursive. See the example at the end of the source code.
The function runM :: Maybe Int -> Stream a -> [a] of our monad takes an extra argument: runM (Just n) m runs the computation m for at most n ``backtracking'' steps. This limit is another way to improve termination.
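For instance (a hypothetical usage sketch, assuming pythagorean_triples is instantiated at the Stream monad of FBackTrackT), one can collect only the answers reachable within the first 1000 backtracking steps:

    bounded_triples :: [(Int,Int,Int)]
    bounded_triples = runM (Just 1000) pythagorean_triples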
In the present monad the laws are satisfied modulo the order of the
answers: we should treat the result of runM
as a multi-set rather
than a list. This issue is discussed in the LogicT paper.
I used to think that this monad is less capable than those described in the LogicT paper. It turns out that FBackTrackT implements the LogicT interface -- rather trivially. Thus FBackTrackT is an implementation of LogicT, with quite a good performance.
FBackTrackT.hs [6K]
The corresponding monadic transformer (in Haskell98),
with examples
LogicT - backtracking monad transformer with fair operations and pruning
We present a practical way to write purely functional lazy non-deterministic programs that are efficient and perspicuous. We achieve this goal by embedding the programs into existing languages (such as Haskell, SML, and OCaml) with high-quality implementations, by making choices lazily and representing data with non-deterministic components, by working with custom monadic data types and search strategies, and by providing equational laws for the programmer to reason about their code.
This is joint work with Sebastian Fischer and Chung-chieh Shan.
The laws of sharing with non-determinism were stated `modulo observation', but that notion was not clearly defined. The proof that the laws are congruences was also left to future work. That work has been completed: the paper Reasoning with Extensible Effects formally defines the contextual equivalence modulo handler (`observer') and introduces the framework to prove equational laws to be congruences.
Deriving1.hs [17K]
Executable small-step semantics based on an alternative set of laws
The laws of non-determinism with explicit sharing described in the paper are concise but not constructive because of pattern-matching on bottom. The present code introduces an alternative set of laws. The code implements small-step semantics of a small monadic language with non-determinism and sharing. The semantics is based on the oriented version of the alternative equational laws. The code is hence the proof that the laws are constructive.
The datatype MP a represents the syntax of the language. The clauses of the function step define the reduction relation, in declarative form. The semantics is executable; there are functions to evaluate a term, observing intermediate results if desired. The code includes an extensive test suite.
HANSEI: Embedded Probabilistic Programming
Memoization of non-deterministic functions in HANSEI is implemented
using thread-local storage, according to the principles explained
in the paper.
Efficient Set monads
A different way to share possible intermediate results of
non-deterministic computations
    class Monadish m where
      gret  :: a -> m p p a
      gbind :: m p q a -> (a -> m q r b) -> m p r b

Unlike ordinary monads (where m is a type constructor of one argument), here m is a type constructor of three type arguments, p, q and a. The argument a is the type of values produced by the monadic computation. The two other arguments describe the type of the computation state before and after the computation is executed. As in an ordinary monad, gbind is required to be associative and gret to be the left and the right unit of gbind. It is clear that all ordinary monads are a particular case of Monadish. Like the ordinary Monad class, Monadish is implementable in Haskell98.
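To illustrate the last point, here is one way to regard any ordinary monad as a Monadish whose state indices never change (a sketch; the wrapper name MW is ours):

    newtype MW m p q a = MW { unMW :: m a }

    instance Monad m => Monadish (MW m) where
      gret x         = MW (return x)
      gbind (MW m) f = MW (m >>= unMW . f)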
The need for such a type-state indexed `monad' has been observed on many occasions. For example, we can introduce a monadish LIO p q a where p and q are either Locked or Unlocked, representing the state of a particular lock. Just by looking at the type of a computation, we can see if the computation acquires the lock (its (inferred) type is LIO Unlocked Locked a), acquires and releases the lock (its type is LIO Unlocked Unlocked a), or does not use the lock (its type is LIO p p a, polymorphic over p). It becomes easy to statically ensure that the lock can be released only if it is being held. The locking problem can also be solved with an ordinary monad (by defining a function like with_lock, which uses the monad in a contra-variant position). The latter solution is more cumbersome, and becomes increasingly cumbersome as we wish to represent more complex effects, such as control effects due to delimited control operators with answer-type modification.
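A minimal sketch of such a lock-tracking monadish, assuming the underlying computations live in IO (the names LIO, Locked, Unlocked, acquire and release are illustrative, not taken from any particular library):

    data Locked   = Locked
    data Unlocked = Unlocked

    newtype LIO p q a = LIO { runLIO :: IO a }

    instance Monadish LIO where
      gret x          = LIO (return x)
      gbind (LIO m) f = LIO (m >>= runLIO . f)

    acquire :: LIO Unlocked Locked ()
    acquire = LIO (putStrLn "taking the lock")     -- take the real lock here

    release :: LIO Locked Unlocked ()
    release = LIO (putStrLn "releasing the lock")  -- release the real lock here

A computation such as acquire `gbind` (\_ -> release) has the type LIO Unlocked Unlocked (); forgetting the release leaves the tell-tale Locked in the inferred type.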
The variable-state, parameterized `monad' has been thoroughly investigated by Robert Atkey, MSFP2006. The message below is an independent discovery.
Robert Atkey: Parameterised Notions of Computation. MSFP2006
Lightweight static resources, for safe embedded and
systems programming
RealTime.hs [8K]
Using Monadish to statically track ticks so to enforce
timing and protocol restrictions when writing device drivers and
their generators
Delimited continuations with effect typing, full soundness,
answer-type modification and polymorphism
The genuine shift/reset control operators are implemented in
terms of the parameterized monad.
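As a rough illustration of how such an implementation might look, here is a much-simplified sketch (not the library's actual code: the captured continuation below is a mere pure function) of the answer-type-modifying continuation monadish:

    -- assumes the Monadish class shown earlier
    newtype CC p q a = CC { runCC :: (a -> q) -> p }

    instance Monadish CC where
      gret x         = CC (\k -> k x)
      gbind (CC m) f = CC (\k -> m (\x -> runCC (f x) k))

    -- reset delimits a computation whose inner and final answer types agree
    reset :: CC s a a -> CC p p s
    reset e = gret (runCC e id)

    -- shift captures the continuation up to the nearest enclosing reset
    shift :: ((a -> q) -> CC p b b) -> CC p q a
    shift f = CC (\k -> runCC (f k) id)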
Matthieu Sozeau: Type-safe printf using delimited continuations,
in Coq (Jan 2008)
<http://lambda-the-ultimate.org/node/2626>
The implementation of the parameterized `monad' (called GMonad) in Coq.
The Coq code not only defines the operations gret and gbind but also states their laws, which Coq will verify for each GMonad instance.
Lightweight monadic regions
Section 6 of the paper describes a Haskell library for manual resource
management, where deallocation is explicit and safety is assured by a
form of linear types. We implement the linear typing in Haskell with
the help of phantom types and a parameterized monad to statically track
the type-state of resources.
Linear and affine lambda-calculi
The tagless-final encoding uses the variable-state `monad'
to track resources associated with bound variables, to enforce
the linearity of variable references.
Besides the Monad type class

    class Monad m where
      return :: a -> m a
      (>>=)  :: m a -> (a -> m b) -> m b

the Haskell Report also introduces

    class Monad m => MonadPlus m where
      mzero :: m a
      mplus :: m a -> m a -> m a

If Monad expressions are to represent general effectful computations, MonadPlus is meant specifically for expressing non-deterministic choice (mplus) and failure (mzero). The Haskell Report states the equational laws for (>>=) and return that any monad implementation has to satisfy. The laws are meticulously described in numerous monad tutorials and inscribed on many T-shirts. What are the laws of MonadPlus?
It is often claimed that mplus must be associative and mzero must be its left and right unit. That is, supposedly mplus and mzero should satisfy the same laws as list concatenation and the empty list -- that is, the monoid laws. It should be stressed and stressed again that the Haskell Report says nothing about the laws of MonadPlus. Absent any other authority, the status of the MonadPlus laws is undetermined.
Do we want the monoid laws for MonadPlus? The answer is surprisingly complicated. In a short and confusing form: it may be reasonable to use the commutative monoid laws to contemplate non-determinism and to approximate the result of a program; yet if one objects to the commutativity of mplus, the associativity should not be a requirement either. This article shows simple examples of this complex issue. Along the way, we discuss why we may want equational laws to start with.
An equational law is a statement of equality: for example, the left-unit monoid law for MonadPlus states that mzero `mplus` m and m must be `the same'. This may mean that, given the particular definitions of mzero and mplus, the expression mzero `mplus` m reduces to m for any m. Such a fine-grained, intensional view of equality is most helpful when writing implementations and proofs. Let's take the following simple model of non-deterministic computation:

    data NDet a = Fail | One a | Choice (NDet a) (NDet a)

with One a standing for the deterministic computation whose result is a, Fail denoting failure and Choice representing non-deterministic choice. If we implement mzero as Fail, then the left-unit monoid law, on the fine-grained reading, immediately gives us one clause in the implementation of mplus:
    instance MonadPlus NDet where
      mzero = Fail
      Fail `mplus` m = m
      -- ... continued below

The associativity of mplus gives the second clause:

      (Choice m1 m2) `mplus` m = Choice m1 (m2 `mplus` m)

The remainder is trivial:

      (One v) `mplus` m = Choice (One v) m

Thus the monoid laws gave us an implementation of MonadPlus. Building a prototype, an `executable specification', is one very good reason to specify equational laws. Hinze's ICFP 2000 Functional Pearl is the most elegant and inspiring example of using equational laws to derive and improve an implementation -- of, as it happens, backtracking monad transformers.
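Collecting the clauses (and assuming the evident Monad instance for NDet and, in modern GHC, an Alternative instance), the derived instance reads in full:

    instance MonadPlus NDet where
      mzero = Fail
      Fail           `mplus` m = m
      (Choice m1 m2) `mplus` m = Choice m1 (m2 `mplus` m)
      (One v)        `mplus` m = Choice (One v) m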
One does not have to take equational laws too literally. For example, with the following, even more lucid instance
    instance MonadPlus NDet where
      mzero = Fail
      mplus = Choice

the expression mzero `mplus` m reduces to Choice Fail m, which is clearly different from m. One may still insist that these expressions should be treated as `morally' the same. That is, replacing mzero `mplus` m with m, in any context and for any m, should be observationally equivalent: it should not change the results of a program. We now have to define `the result' of our program: how to observe an NDet computation. Commonly (see Hinze (2000) again for an example), a program of the form return v `mplus` m is regarded as resulting in v; mzero is to be observed as failure. (To generalize, one may take return v1 `mplus` (return v2 `mplus` m) as having two results: first v1 and then v2.) Therefore, the observation function for NDet, typically called run, is to be defined as follows:
    run :: NDet a -> a
    run Fail               = error "failure"
    run (One x)            = x    -- the intended meaning of One
    run (Choice (One x) m) = x
    -- ... continued

If we insist that replacing mzero `mplus` m with just m should preserve the observations, we have to add the following clause to run:

    run (Choice Fail m) = run m

The associativity of mplus lets us complete the definition:

    run (Choice (Choice m1 m2) m) = run (Choice m1 (Choice m2 m))
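Put together, the observation function reads in full:

    run :: NDet a -> a
    run Fail                      = error "failure"
    run (One x)                   = x
    run (Choice (One x) _)        = x
    run (Choice Fail m)           = run m
    run (Choice (Choice m1 m2) m) = run (Choice m1 (Choice m2 m))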
To recap, in this implementation, mzero `mplus` m does not reduce to m. Likewise, m1 `mplus` (m2 `mplus` m3), which is Choice m1 (Choice m2 m3), is a structurally different NDet value than (m1 `mplus` m2) `mplus` m3. Still, as programs, mzero `mplus` m and m produce the same results. Furthermore, replacing the former expression with the latter, as part of any larger expression, preserves the result of the overall program. (That claim, while intuitively true for our implementation, is very hard to prove formally: see JFP2021 for the proof and a rigorous discussion of equivalence modulo observation.)
The same can be said for m1 `mplus` (m2 `mplus` m3) and (m1 `mplus` m2) `mplus` m3. This preservation of the results, the observational equivalence, lets a compiler, a programmer or a tool optimize mzero `mplus` m to m. Deriving and justifying optimizations is another compelling application of equational laws.
Finally, the equational laws can be used to predict and understand the behavior of a program without regard to any particular implementation. Let us consider the programs ones and main:

    ones = return 1 `mplus` ones
    main = ones >>= \x -> if x == 1 then mzero else return x

Recall that in our notion of observation, return v `mplus` m is regarded as resulting in v. Thus ones has 1 as the result -- as the first and all the following results. As for main,

    main = ones >>= \x -> if x == 1 then mzero else return x
        === {- inline ones -}
            (return 1 `mplus` ones) >>= \x -> if x == 1 then mzero else return x
        === {- distributivity of mplus over bind -}
            (return 1 >>= \x -> if x == 1 then mzero else return x)
              `mplus` (ones >>= \x -> if x == 1 then mzero else return x)
        === {- the second operand of mplus is just main -}
            (return 1 >>= \x -> if x == 1 then mzero else return x) `mplus` main
        === {- monad law -}
            mzero `mplus` main
        === {- monoid law -}
            main
That chain of equational re-writes got us nowhere. Clearly, no other chain does any better: there is no way to convert main either to the form return v `mplus` something or to mzero. Our intuition agrees: ones keeps offering 1, which main rejects.
Now consider onestoo and the program maintoo:

    onestoo = ones `mplus` return 2
    maintoo = onestoo >>= \x -> if x == 1 then mzero else return x

If mplus is associative, onestoo can be re-written to

    onestoo = ones `mplus` return 2
        === {- inlining ones -}
            (return 1 `mplus` ones) `mplus` return 2
        === {- associativity -}
            return 1 `mplus` (ones `mplus` return 2)
        === return 1 `mplus` onestoo
Absent any other laws, we must regard maintoo as just as divergent as main. This time the conclusion goes against our intuitions. We would like to interpret m `mplus` return 2 as a non-deterministic choice that includes 2. There is therefore a chance that maintoo may produce that result. If MonadPlus is meant to represent non-determinism and to implement non-deterministic search, a search procedure ought to be complete: if it is possible that maintoo produces 2, that result should be found sooner or later. A MonadPlus instance with an associative mplus cannot be an implementation of a complete search procedure, unless we assume some sort of commutativity.
To derive the result 2 in our maintoo example, we have to posit additional laws for mplus: at the very least, m `mplus` return v === return v `mplus` m. One can even take mplus to be commutative for arbitrary arguments. A commutative and associative mplus agrees with our intuitions of non-deterministic choice. But it violently disagrees with the notion of observation used so far: return v `mplus` something is to be observed as resulting in v. With a commutative mplus, return 1 `mplus` return 2 is to be equal to return 2 `mplus` return 1, which are observed as 1 and 2 respectively. It is not reasonable to treat 1 and 2 as the same observation. We have to re-define observation and treat return v `mplus` m as a program that might produce the result v in one particular run. The program may finish with some other result. Intuitive as it is, such a notion of observation is weak, feeling like a typical disclaimer clause: the program we sold you may do nothing at all. It feels, well, non-deterministic.
It is still reasonable to assume the commutative monoid laws as an over-broad specification. If a MonadPlus implementation produces one of the possibly many answers predicted by the specification, the implementation should be deemed correct. We have to be, however, explicit about such an interpretation. To avoid confusion with the common notion of observation -- what the program actually returns -- we should not be saying that a MonadPlus is required to be associative and commutative. A program m1 `mplus` (m2 `mplus` m3) may still print different results from (m1 `mplus` m2) `mplus` m3.
So which laws is MonadPlus to obey? It depends on what we want to use the laws for, and how precise we want to be about the program behavior. As a specification, mplus may be regarded as associative. It should then also be commutative, otherwise it does not represent non-deterministic choice. If it is commutative, we have to significantly weaken the notion of observation. Generally, a specification is supposed to be an approximation of the behavior of a real program. What is special about MonadPlus is just how imprecise the commutative monoid specification is. A reasonable alternative -- a non-associative mplus -- is described in the ICFP05 and JFP2011 papers below, with many examples of equational reasoning with such monads.
The JFP2021 paper `Not by Equations Alone' discusses laws of non-determinism in great detail, including proving the laws being congruences. The paper takes the approach opposite to the one shown in the present article. Rather than postulate the laws and then derive an observation function to uphold them, the paper starts with the observation function as given, and checks which equational laws hold up to that observation. The paper hence shows which commonly used observations for non-determinism give rise to which equational laws, and what one has to sacrifice for a richer equational theory.
LogicT - backtracking monad transformer with fair operations and pruning
Purely Functional Lazy Non-deterministic Programming
Not by Equations Alone: Reasoning with Extensible Effects, JFP2021
The problem is more complex than it appears. An apparent
solution is to represent fuzzy numbers as distribution
functions. Operator +
should be generalized to take two
distribution functions as arguments and yield the distribution function
of the sum. Other arithmetical operators and functions should be
lifted as well. As Jerzy Karczmarczuk noted, this naive approach is
fraught with problems. For example, fuzzy numbers cannot be an
instance of the Num class as the latter requires its members to belong to
the Eq class; comparing functions is in general undecidable. A
more immediate problem is that the arithmetic of distributions is far
more complex. Distributions are not linear: the distribution of a sum
is a convolution of the distributions of its terms. Division is especially troublesome if the denominator has a non-zero probability of being zero.
The following article develops a different approach. The key insight is that we do not need to ``symbolically'' manipulate distributions. We represent a random variable by a computation, which, when executed, generates one number. If executed many times, the computation generates many numbers -- a sample of its distribution. We can lift the standard arithmetical operators and functions (including comparisons) to such computations, and write ordinarily-looking arithmetical expressions over these random computations. We can then execute the resulting expression many times and obtain a sample distribution of the resulting quantity. The sample lets us estimate the mathematical expectation, the variance and other moments of the resulting quantity. Our method is thus a Monte Carlo estimation of a complex distribution.
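Here is a minimal sketch of the idea in Haskell (illustrative only, not the article's actual code; the names RVar, uniform, liftR2 and mean are ours):

    import System.Random (randomRIO)
    import Control.Monad (replicateM)

    -- A random variable is a computation that yields one sample per run
    type RVar = IO Double

    uniform :: Double -> Double -> RVar
    uniform lo hi = randomRIO (lo, hi)

    -- Lift an ordinary binary operation pointwise to random computations
    liftR2 :: (Double -> Double -> Double) -> RVar -> RVar -> RVar
    liftR2 f x y = f <$> x <*> y

    -- Monte Carlo estimate of the expectation of a derived quantity
    mean :: Int -> RVar -> IO Double
    mean n x = do
      samples <- replicateM n x
      return (sum samples / fromIntegral n)

For example, mean 10000 (liftR2 (+) (uniform 10 50) (uniform 10 20)) estimates the expectation of the sum of two uniformly distributed quantities; higher moments are estimated the same way.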
As an example, we compute the center and the variance of

    sqrt( (Lw - 1/(Cw))^2 + R^2 )

which is the impedance of an electric circuit, with the inductance being a normal variable (center 100, variance 1) and the frequency being uniformly distributed from 10 through 20 kHz. The resistance is also uniformly distributed, from 10 through 50 kOhm, and the capacitance has the square root of the normal distribution. (The latter is just to make the example more interesting.)
In the follow-up, summarizing article, Jerzy Karczmarczuk designed a complementary approach, based on approximating a distribution by linear combinations of normal distributions. The arithmetic of normal distributions is well-understood.
HANSEI: Embedded domain-specific language for probabilistic models and (nested) inference
A monad is defined by its bind and return operations. One has to prove the three monad laws: return is the left and right unit of bind, and bind is associative. Proving the associativity law is less straightforward.
The following article demonstrates proving associativity for a particular Error monad. Filinski's `star' notation and a variable-free formulation of the associativity law turned out quite helpful.
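For reference, the laws in question, stated both with bind and in the variable-free form via Kleisli composition (a sketch of the statements only, not of the Error-monad proof):

    import Control.Monad ((>=>))

    -- For f :: a -> m b, g :: b -> m c, h :: c -> m d and ma :: m a:
    --
    --   left unit:      return a >>= f    ===  f a
    --   right unit:     ma >>= return     ===  ma
    --   associativity:  (ma >>= f) >>= g  ===  ma >>= (\x -> f x >>= g)
    --
    -- Variable-free, using Kleisli composition (>=>):
    --
    --   (f >=> g) >=> h  ===  f >=> (g >=> h)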
The definition of the bind operator in terms of reflection and reification of monadic effects.

Scheme's multiple values as an effect
Another discussion of monadic reflection and reification
Is there a way to solve the problem purely functionally and elegantly? The article proposes one solution. It relies on monads to effectively hide the counter that uniquely tags every node. Building a tagged tree now looks just as simple as building an untagged one. The counter variable is well-hidden. And yet it is there, invisibly threading through all computations.
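Here is a minimal sketch of the idea (illustrative, not the article's code; it assumes the State monad from the mtl library): the counter lives inside the monad, and building the tagged tree reads almost like building an untagged one:

    import Control.Monad.State

    data Tree a = Leaf | Node a (Tree a) (Tree a)

    -- Tag every node with a unique integer, without threading a counter by hand
    label :: Tree a -> Tree (Int, a)
    label t = evalState (go t) 0
      where
        go Leaf         = return Leaf
        go (Node x l r) = do
          n <- get
          put (n + 1)
          Node (n, x) <$> go l <*> go r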