Haskell Programming


 

Provably perfect random shuffling and its pure functional implementations

We show two purely functional programs that perfectly, randomly and uniformly shuffle a sequence of arbitrary elements. We prove that the algorithms are correct. We also show why a commonly used sort-based shuffling algorithm is not perfect.
The article first formally defines the perfect random shuffle, outlines the algorithm, and discusses the well-known imperative shuffler that swaps elements in the array. We present and analyze two purely functional implementations. The naive one runs in O(n^2) time. A more sophisticated code has O(n*logn) time complexity. The article contains the full code to arrange a sequence in a complete binary tree and to extract an arbitrary element from the tree maintaining the completeness of the remaining tree.
Version
The current version is 1.2, Jul 14, 2002.
References

perfect-shuffle.txt [plain text file]
The article with the full source code for the two implementations, including the code for building and deconstructing a complete binary tree. It was originally posted as Provably perfect shuffle algorithms on a newsgroup comp.lang.functional on Mon, 3 Sep 2001 13:59:46 -0700

Dave Bayer, Persi Diaconis: Trailing the dovetail shuffle to its lair
Ann. Appl. Probab. 2 (1992), no. 2, pp. 294-313.
In the context of card shuffling, `perfect shuffle' means a deterministic interlacing of the two halves of the deck. The paper studies the perfect and other card shuffles in detail.

 

Pure-functional transformations of cyclic graphs and the Credit Card Transform

Cycles certainly make it difficult to transform graphs in a pure non-strict language. Cycles in a source graph require us to devise a way to mark traversed nodes -- however we cannot mutate nodes and cannot even compare nodes with a generic ('derived') equality operator. Cycles in a destination graph call for keeping track of already constructed nodes so we can make back-references. An obvious solution is to resort to a state monad or IORefs. There is also a monad-less solution, which involves maintaining a dictionary of already constructed nodes. This solution is less obvious: seemingly we cannot add a node to the dictionary until we have built the node. This fact means that we cannot use the updated dictionary when building the descendants of the node -- which need the updated dictionary to link back. The problem can be overcome however with a credit card transform (a.k.a. "buy now, pay later" transform). To avoid hitting the bottom, we just have to "pay" by the "due date".

For illustration, the article below considers the problem of printing out a non-deterministic finite automaton (NFA) and transforming it into a deterministic finite automaton (DFA). Both NFA and DFA are represented as cyclic graphs.

Version
The current version is 1.3, Aug 18, 2002.
References

CCard-transform-DFA.lhs [10K]
A literate Haskell code expounding the technique. The code was first posted in a message Transformations of cyclic graphs [Was: Efficiency of list indices in definitions] on Thu, 08 Aug 2002 16:11:30 -0700 on the Haskell-Cafe mailing list.

Tying the knot: How to build a cyclic data structure
<http://haskell.org/wiki/wiki?TyingTheKnot>
A CommonHaskellIdioms Haskell Wiki page.

Trying to tie the knot
<http://www.mail-archive.com/haskell@haskell.org/msg10687.html>
A message on the Haskell mailing list about a fixed-point combinator helping to tie a knot with forward links. Posted on Fri, 12 Apr 2002 18:44:07 -0700.

 

De-biforestation

Deforestation is usually defined as the elimination of an intermediate data structure, often a collection of items, sent from a single producer to a single consumer. Jorge Adriano showed an interesting example of one producer feeding two independent consumers. The two streams of data are mutually dependent. Furthermore, the rate of production is non-uniform: Generating an element in one stream may require generating trillion items in the other stream. If the first consumer is being evaluated, that consumer causes the evaluation of the producer until the latter yields an item. The trillion of items incidentally produced in the other stream have to be stored somewhere. In more OS-centric terms, the evaluator can't generally reduce two expressions in parallel. If one stream consumer is running, the other consumer sits in the ready queue. Therefore, data sent to the second consumer must be buffered, sometimes at great expense.

We will consider the problem of one producer and two consumers in the general setting. We derive a deforested version, which no longer needs to buffer any produced items. When we run that version, the memory retaining profile shows no space leaks. We also discuss a ``parallel'' writing of several streams into two distinct files. Our solution is safe and yet the i/o is effectively interleaved. The deforested solution is derived, via a sequence of equivalent transformations. In fact, the derived code worked on the first try.

Version
The current version is 1.2, Dec 5, 2002.
References

de-biforestation.lhs [12K]
The article as a well-commented literate Haskell code. An earlier version was posted on the Haskell-Cafe mailing list on Tue, 3 Dec 2002 22:05:05 -0800 (PST).

Jorge Adriano: producing and consuming lists
A message posted on the Haskell-Cafe mailing list on Tue, 5 Nov 2002 16:04:58 +0000. The message describes the problem of consuming two inter-dependent streams while avoiding space leaks.

 

Similarity between instruction scheduling and imperative functional programming

The article makes the case that a transformation of a typical imperative code into functional code is similar to the data and control dependency analyses in compilers for modern CPUs. By typical imperative code we mean a "typical numerical" code that involves nested loops and updateable arrays.

We consider three examples of such transformations, including two pure functional implementations of a bubble-sort. The imperative code is actually quite complex: nested loops, data dependencies across iterations, reading from and writing into the array, conditional branches that depend on data and the iteration history, control dependencies across nested loops. We show two solutions in Haskell. Both use fast, imperative STArrays; both try to separate the imperative part from a functional part. The imperative part is performed "in a monad" while the functional part is written in ordinary Haskell. The two solutions are closely related, and written to emphasize their similarity. One of the solution is an interpreter for a trivial "virtual machine".

The analysis of data and control dependencies across loop iterations, scheduling of instructions and interlocking of their results, pre-fetching of data -- is the job of an advanced compiler or the CPU. On the other hand, we have to do the same kind of analyses to separate the functional part of a computation from its imperative part. It appears functional programming is closer to "the bare metal" as one would have thought.

The assertion that a dependency analysis of imperative array code is related to extracting a functional program from an imperative one has been stated by Andrew W. Appel, in his paper "SSA is Functional Programming". The present article has shown several illustrations of the thesis. We have done manual dependency analyses of several code snippets that update arrays in place. The analyses let us extract pure-functional parts of imperative programs. The rest requires serialization and can be handled by an ST monad.

Version
The current version is 1.2, Feb 10, 2002.
References

The article with the full source code [plain text file]
The article itself. It was originally posted as FP vs assembly programming for super-scalar CPUs [Was: Nested Loops in Clean] on a newsgroup comp.lang.functionalon Thu, 16 Aug 2001 18:24:24 -0700

Andrew W. Appel SSA is Functional Programming
ACM SIGPLAN Notices v. 33, no. 4, pp. 17-20, April 1998.

 

Takusen: a DBMS access library built around a left-fold enumerator

Takusen is a joint project with Alistair Bayley. Takusen is a DBMS access library. It has a backend for Oracle on Unix, Linux or Windows via OCI, and a stub to test the library without any database. The infrastructure and the stub let one interface natively with other databases.

The distinguished feature of Takusen is processing query results using a left-fold enumerator. The user supplies an iteratee function, which receives rows one-at-a-time from the result-set. The number of the arguments to the iteratee is the number of the columns in the result-set, plus the seed. Each column in the result-set has its own Haskell type. The latter could be a Maybe type if the particular iteratee wishes to process NULLs.

The benefits are: more convenient and intuitive enumeration, iteration, and accumulation (see tests for examples); the retrieved data are not merely strings but have Haskell types: Int, Float, Date, etc.; buffer preallocation; pre-fetching; support for both enumerators and cursors, proper handling of all errors including various IO errors. No unsafe operations are used.

Version
The current version is 1.5, May 10, 2004.
References
From enumerators to cursors: turning the left fold inside outbrThe invertible non-recursive enumerator designed in that article is the foundation of the Takusen interface.
<http://code.haskell.org/takusen/>
 

Preventing memoization in (AI) search problems

In lazy evaluation, an expression is not evaluated unless its result is needed. Once an expression is evaluated, its result is remembered, or memoized, in case the value of the same expression is needed again. This on-demand evaluation and memoization happens transparently and `for free', sparing us from explicitly programming these features. Another great attraction of lazy evaluation is the potential to improve algorithmic complexity: As Bird et al. wrote, ``for some problems at least -- lazy evaluators are strictly more powerful than eager ones.''

The implicit memoization however is a trade-off. We trade space to store expression's result for time gained in avoiding recomputations. The trade-off is particularly worthy if the result takes much less memory than needed for the closure (thunk) of the expression. This is often the case with numeric code. Non-deterministic, probabilistic programming and general AI search problems are the opposite case. A non-deterministic expression is typically represented as a lazy search tree, which is often huge even for small expressions. It becomes a better trade-off to re-evaluate an expression rather than to fill all memory with results.

Alas, GHC is designed for the opposite trade-off. Therefore, using Haskell even for simple search problems is quite a challenge since memoization gets in the way. Preventing the memoization is surprisingly hard, since GHC is very good at finding the opportunities for it, even within thunks. This article uses a typical example of non-deterministic search to illustrate the problem posed by lazy evaluation and to describe a few tricks to prevent memoization. Some of them are unexpected.

Our running example computes and prints the first n elements of the infinite stream of Pythagorean triples pyth, using three infinite streams of integers from 1. As typical for non-deterministic programs, the example generates candidate solutions and rejects most of them.

     from :: MonadPlus m => Int -> m Int
     from i = return i `mplus` from (i+1)
     
     pyth :: MonadPlus m => m (Int,Int,Int)
     pyth = do
       x <- from 1
       y <- from 1
       z <- from 1
       if x*x + y*y == z*z then return (x,y,z) else mzero
The interleaving of three infinite streams precludes the List monad (depth-first search), see below. Instead of a lazy list, we use a search tree to represent the result of a non-deterministic computation:
     data Tree1 a = Fail1 | Val1 a | Node1 (Tree1 a) (Tree1 a)
We rely on non-strictness of Haskell to prevent evaluation of tree nodes until we traverse to them: after all, the tree may be infinite -- as is the case in our example. Tree1 is an instance of Monad and MonadPlus; here are the most complex parts of these instances (see the accompanying code for the rest):
     Node1 e1 e2 >>= f = Node1 (e1 >>= f) (e2 >>= f)
     mplus = Node1

To `run' the non-deterministic computation and produce the the stream of triples, we traverse the Tree, extract the successfully produced results from the Val leaves and return them as a lazy list. Different tree traversals correspond to different non-deterministic search strategies. Depth-first traversal (DFS) is the most efficient, needing only O(d) space to examine a node at depth d. Alas, an infinite branch in the tree traps DFS. In our pyth tree, DFS will get stuck chasing an infinite chain of Fail. Breadth-first traversal (BFS) in contrast shall visit any node in a tree, given time. BFS is a complete strategy: if a solution (leaf Val) exists, BFS will find it. Alas, BFS needs a lot of space to maintain the job queue, the frontier of the search. At search depth d the frontier may take O(2^d) space. Iterative deepening is a hybrid method, complete as BFS yet needing as little of working space as DFS. Iterative deepening explores the progressively long `prefix' of tree with DFS. Each new exploration phase repeats all the work of the previous explorations of shallower prefixes. Iterative deepening clearly trades time for space. Despite its gross wastefulness, the method is quite popular, for example, in automated theorem proving. Its trade-off has proved worthwhile.

Here are the results of computing and printing the first n Pythagorean triples. The code was compiled by GHC 7.0.4 with optimization -O2.

   Mutator time, sec   CG time, sec   Memory in use, MB   Average residency, KB 
BFS, n=30 13.0 5.0 3 465
Iter Deep, n=30 0.15 0.06 5 1506
Iter Deep, n=100 4.8 1.3 56 20832
We expected iterative deepening to take more time than BFS but significantly less memory. We observed just the opposite.

Recall that iterative deepening keeps re-traversing the tree. Each exploration cycle redoes all the previous explorations. Lazy evaluation helps, it seems. When we first reach Node e1 e2, we evaluate e1 and e2 that were stored in the node unevaluated (otherwise, we would have diverged constructing the tree, which is infinite). Lazy evaluation replaces e1 and e2 with their results. When iterative deepening comes across the same node in a new cycle, it gets the results of e1 and e2 right away. That seems like a good thing, until we look at the space. As iterative deepening explores the Tree, it needs more and more memory to store the explored prefix, which is about twice the size of the BFS frontier. Lazy evaluation thus defeats the purpose of iterative deepening, of recomputing the revisited tree nodes to avoid storing them. Lazy evaluation does exactly the wrong thing.

In a strict language, we would have used thunks to represent infinite trees. If tree nodes store thunks, lazy evaluation would memoize thunks -- which evaluate to themselves rather than to trees. It seems therefore the following modification should stop lazy evaluation's meddling in iterative deepening.

     data Tree2 a = Fail2 | Val2 a | Node2 (() -> Tree2 a) (() -> Tree2 a)
     Node2 e1 e2 >>= f = Node2 (\() -> e1 () >>= f) (\() -> e2 () >>= f)
     mplus e1 e2 = Node2 (\() -> e1) (\() -> e2)
Every time we need traverse through a Node, we have to force the thunks and re-compute the branches. At least, in theory. Here is the practice.
   Mutator time, sec   CG time, sec   Memory in use, MB   Average residency, KB 
BFS, n=30 13.0 5.0 3 509
Iter Deep, n=30 0.3 0.1 8 2964
Iter Deep, n=100 10.6 1.7 96 39244
The fix that was supposed to help iterative deepening has made it worse.

Such an unexpected result was quite a puzzle. It seems GHC is just too smart. Apparently it notices that a thunk (\() -> e) can only be applied to the same argument. Therefore, the first time the thunk is forced by applying it to (), the result can justifiably be memoized: the next time around the thunk will be applied to the same (), and hence, will give the same result anyway.

The new fix is to deliberately confuse GHC. We obfuscate the tree-construction operations (>>=) and mplus with auxiliary functions app and app1.

     Node3 e1 e2 >>= f = Node3 (app1 e1 f) (app1 e2 f)
     mplus e1 e2 = Node3 (app e1) (app e2)
     
     {-# NOINLINE app #-}
     app e () = e
     
     {-# NOINLINE app1 #-}
     app1 e f () = e () >>= f
That does the trick. Here are the results.
   Mutator time, sec   CG time, sec   Memory in use, MB   Average residency, KB 
BFS, n=30 13.2 4.7 3 413
Iter Deep, n=30 0.4 0.03 2 78
Iter Deep, n=100 13.4 0.9 2 413
Finally we see the agreement with our expectations. Iterative deepening now needs only 2MB (down from 56MB in the first try) to compute the first 100 Pythagorean triples. We may now try bigger searches.

We have seen that lazy evaluation is a trade-off, which may be hurtful in some cases, in particular, in search problems over huge data structures, where it is often beneficial to recompute the result than to store it. Preventing lazy evaluation is possible but surprisingly tricky.

Version
The current version is May 2012.
References

Richard S. Bird, Geraint Jones and Oege de Moor: More Haste, Less Speed: Lazy versus Eager Evaluation
J. Functional Programming, 1997, v7(5), pp. 541--547

STrees.hs [11K]
Complete code for our example and the benchmark.

Delimited control and breadth-first, depth-first, and iterative deepening search with more details on BFS and iterative deepening

 

Controlled memoization

The problem is adding memoization to a recursive function Int -> Int. That seems simple, until we realize that finding the value of that particular function at the argument x requires evaluating the function for smaller arguments and may require evaluating the function for larger arguments. Thus straight-forward memoization leads to a large and sparsely populated memoization table. As the original poster observed, the memoized function begins thrashing sooner than the original code.

The solution is to balance the cost of recomputing the value versus the cost of maintaining the memoization table and doing the lookup. It is worth memoize the values of the function for some small x -- which are being used frequently. For larger x, re-computation is better. The parameter sc_upb in the code determines the balance, the largest argument value for which we still memoize. The simple benchmark shows that this controlled memoization does improve both the space and time performances.

A notable feature of the code is the transparency of adding the controlled memoization. The body of the algorithm remains essentially the same.

Version
The current version is 1.1, Mar 16, 2005.
References
memo-array.lhs [2K]
The literate Haskell code with explanations and the benchmark
The code was originally posted as Re: (Newbie) Dynamic Programming, Memoizing Etc. on Haskell-Cafe mailing list on Wed, 16 Mar 2005 18:57:48 -0800 (PST)
 

How to make a function strict without changing its body

This is a simple trick of making a function strict in some or all of its arguments, without affecting the body of the function.

     iterate2 f x n | seq f $ seq x $ seq n $ False = undefined
     iterate2 f x n = --- as before
That is, we add one simple line to the definition of the function, without changing the proper body of the function at all. We prepend an extra clause with the guard that always yields False. Before failing, however, the guard forces the evaluation of the selected arguments of the function. If needed, seq can be replaced with deepSeq.

Making (some of the) arguments of a function strict can make the function run in constant space, and so an iteration (Runge-Kutta iteration in the above example) can proceed without running out of (stack) space. The function iterate2 (and a similar function rk4Next) were applied to arguments like (n-1) and y + a1/6 -- which is a red flag. These are exactly the kinds of expressions that lead to memory exhaustion. Perhaps it is because the size of an unevaluated thunk for (n-1) is so much bigger than the size of the evaluated thunk. It seems that arithmetic expressions are the best candidates for some opportunistic evaluation...

Version
The current version is 1.1, Oct 2004.
References

Re: How do I get a long iteration to run in constant space
<http://www.haskell.org/pipermail/haskell-cafe/2004-October/006983.html>
Discussion thread on the Haskell-Cafe mailing list, where the message was originally posted on Tue, 5 Oct 2004 00:44:40 -0700 (PDT)

Partial signatures
Another application of the trick of adding a clause with an always failing guard

 

Logic programming in Haskell optimized for reasoning

We demonstrate an executable model of the evaluation of definite logic programs, i.e., of resolving Horn clauses presented in the form of definitional trees. Our implementation, DefinitionTree, is yet another embedding of Prolog in Haskell. It is distinguished not by speed or convenience. Rather, it is explicitly designed to formalize evaluation strategies such as SLD and SLD-interleaving, to be easier to reason about and so help prove termination and other properties of the evaluation strategies. The main difference of DefinitionTree from other embeddings of Prolog in Haskell is the absence of name-generation effects. We need neither gensym nor the state monad to ensure the unique naming of logic variables. Since naming and evaluation are fully separated, the evaluation strategy is no longer concerned with fresh name generation and so is easier to reason about equationally.

Version
The current version is January 2008.
References

DefinitionTree.hs [14K]
The commented source code

Pure, declarative, and constructive arithmetic relations
DefinitionTree has been used to describe SLD and SLD-interleave evaluation strategies and to prove the basic properties of solution sets.

 

Newer FastCGI and the memory-efficient IO interface

We present an efficient library for writing CGI and FastCGI programs. Our goal is the least latency and the least memory consumption. The library underlies a production web application server Metcast. The server has been running since February 2007, sending large amounts of text and binary data in response to a continuous stream of requests. The server processes a request in constant and small memory imposing little latency, encoding and sending to the client chunks of data as soon as the database delivers them. The server handles hundreds of such requests without allocating memory; in fact, it uses only one 16KB buffer for all of its I/O. With the exception of an occasional rank-2 type and extended pattern guards, all of the code is in Haskell98. Not a single unsafeHaskell function occurs in the code.

The library is originally based on the code written by Jesse Tov, who in turn used NewCGI by Bjorn Bringert et al. Most of the code has been re-written. The biggest change is minimizing the amount of state. The library is centered around generalized input and output ports: a simple IO interface to read, write and copy via a single, large, once allocated buffer. The buffer and its dimensions are hidden, to discourage aliasing. We use this interface for the IO to and from the client, to and from files or the PostgreSQL database, and for reading from external servers.

A generalized input port is a procedure to read at most the specified number of bytes into a given buffer. The procedure should return the number of bytes actually read, or 0 on EOF. It may throw various exceptions if reading didn't go well. The procedure is like hGetBuf partially applied to a handle.

     newtype Input = Input (forall m. EMonadIO m => Ptr Word8 -> Int -> m Int)
     inputFd       :: Fd -> Input
     inputStr      :: EMonadIO m => String -> m Input
     inputCombined :: EMonadIO m => Input -> Input -> m Input
One can construct an Input from a Posix file descriptor or a String. One can combine two Inputs into one, which reads from the first generalized port until EOF and then reads from the second. The frequently mentioned EMonadIO is a class of monads permitting IO along with throwing and catching of arbitrary errors. The IO monad and most of its transformations are in that class. EMonadIO lets us write gthrow, gcatch, gbracket, etc. without even thinking of the current monad.

Dually, a generalized output port is a procedure to write the specified number of bytes from the given buffer. It may throw various exceptions if writing didn't go well.

     newtype Output = Output (forall m. EMonadIO m => Ptr Word8 -> Int -> m ())
     outputFd :: Fd -> Output
     newtype BCopy = BCopy (forall m. EMonadIO m => 
                             Input -> Output -> Maybe Int -> m (Maybe Int))
The CGI monad exports (Fast)CGI output and error streams as Output, and lets us access request content as Input. We can use the generalized ports for reading and writing strings. The ports are intended though to be connected via BCopy, which copies from Input to Output the desired number of bytes or till EOF.
Version
The current version is February 2007.
References

NewerCGI.hs [29K]
The commented source code of the library for writing CGI and FastCGI programs
The code was first mentioned in a message Takusen and large PostgreSQL blobs [was: Handling custom types in Takusen] on the Haskell-Cafe mailing list on Fri, 27 Jul 2007 20:34:29 -0700 (PDT)

FastCGI.hsc [3K]
Bindings to the FastCGI C client library

 

Catching exceptions in a general MonadIO

Handling exceptions via catch, bracket, catchDyn, etc. in a MonadIO other than the IO itself has been a fairly frequently requested feature. The requests tend to recur, probably because these functions cannot be defined for a general MonadIO. However, we can define generic exception handling for a large and interesting subset of MonadIO, which includes various (repeated) transformations of IO by ReaderT, WriterT, ErrorT, StateT, and newtype wrapping.

The generic catch has been successfully used since 2006 in a database library Takusen, where many operations work in a monad ReaderT Session IO; the database session data are always available as the environment. Many other foreign libraries are structured around sessions, connections, library environments, which can be encapsulated in a monad. We should nevertheless be able to handle IO errors and user exceptions that arise in such computations.

Back in 2006 Jesse Tov has done an admirably thorough job of implementing exception handling in general monads. His code defines two classes: EMonad and EMonadIO -- which contain most of the interesting monads. The latter is the subclass of the former, permitting arbitrary IO via liftIO. In either case, we use gthrow, gbracket, gcatch, ghandle, gfinally, etc. -- without even thinking which Monad we are in and how error handling is actually implemented, via ErrorT or via IO exceptions. It works universally for most of monads of interest. The experience of using EMonadIO since 2006 in large production code has been most positive.

Version
The current version is 1.1, February 2006.
References

CaughtMonadIO.lhs [6K]
The complete literate Haskell code
The code includes two tests, illustrating throwing and catching of (dynamic) exceptions in monads obtained from IO by several applications of ReaderT, WriterT and ErrorT.
The code was originally posted as generic catch in a MonadIO on the Haskell mailing list on Tue Feb 7 22:48:24 EST 2006

Takusen's MonadIO, with gtry, gtryJust, gbracket, gfinally.The code supports both the new- and the old-style exceptions. Joint work with Alistair Bayley.
<http://code.haskell.org/takusen/Control/Exception/MonadIO.hs>

Discussion threads:

Haskell' ticket, introduced as the result of the above Haskell-Cafe discussions:
<http://hackage.haskell.org/trac/haskell-prime/ticket/110>

 

Simple and reliable uni- and bi-directional pipes

MySysOpen module offers a reliable, proven way of interacting with another local or remote process via a unidirectional or bidirectional channel. It supports pipes and Unix and TCP sockets. MySysOpen is a simple and explicit alternative to the multi-threaded IO processing of the GHC run-time system. The module is the Haskell binding to sys_open -- the extended, user-level file opening interface.

The second half of MySysOpen.hs contains several bi-directional channel interaction tests. One checks repeated sending and receiving of data; the amount of received data is intentionally large, about 510K. Two other tests interact with programs that are not specifically written for interactive use, such as sort. The latter cannot produce any output before it has read all of the input, accepting no input terminator other than the EOF condition. One test uses shutdown to set the EOF condition. The other test programs the handler for a custom EOF indicator, literally in the file name of the communication pipe.

Version
The current version is December 2008.
References

MySysOpen.hs [7K]
The code for the sys_open interface and tests

sys_open.c [20K]
The complete, well-commented code for sys_open

Applications of sys_open

 

A character-efficient Mastermind game code

The following code is a space-efficient implementation of the game Mastermind. It is efficient not in terms of the heap or stack space but in the source text space. Jan Rochel wrote the original code and put it in his e-mail signature. He asked for a shorter version. The present code relies on a different algorithm to compute exact and inexact matches in two strings. The new algorithm made the code a bit faster, but more importantly, a bit (7%) shorter. The new code is also more obscure and exhibits some interesting patterns, in particular f x=(x\\).(x\\).

A version to put in a .signature (285 chars):

     import Random;import List;import Char;p=putStrLn;u=uncurry;f x=(x\\).(x\\)
     main=mapM(\x->randomRIO(49,54))[1..4]>>=n 0.map chr>>=p.("Tries: "++).show
     e=((partition$u(==)).).zip;h(p,q)=['*'|x<-p]++['+'|x<-(u f)$unzip q]
     n a s=getLine>>=m where{m i|i==s=return a;m i=p(h$e i s)>>n(a+1)s}

An expanded version that is easier to read:

     import Random;import List;import Char
     
     main = mapM(\x->randomRIO(49,54))[1..4]
            >>= n 0 . map chr >>= p . ("Tries: "++) . show
     p=putStrLn
     u=uncurry
     e=((partition$u(==)).) . zip
     f x=(x\\).(x\\)
     h(p,q)=['*'|x<-p] ++ ['+'|x<-(u f) $ unzip q]
     n a s=getLine>>=m a s
     m a s i | i==s = return a
     m a s i = p(h$e i s) >> n(a+1)s
The code runs on GHC(i).
Version
The current version is 1.1, May 24, 2003.
References
The code was originally posted as an article Haskell: Mastermind code for a .sig on a newsgroup comp.lang.functional on Sat, 24 May 2003 00:04:40 -0700
 

How to implement AND without pattern-matching

How to implement a function AND, i.e., (&&), without explicit or even implicit pattern-matching? Note that the functions not and (||) are defined in the Prelude with the use of pattern-matching. The if function may rely on pattern-matching as well. The article below gives several answers.
One of the suggested solutions relies on pointer arithmetics, in Haskell.
Version
The current version is 1.1, Jul 13, 2001.
References
The article with the source code for three solutions. [plain text file]
The article itself. It was originally posted as Re: bothTrue on a newsgroup comp.lang.functional on Fri, 13 Jul 2001 17:45:22 -0700


Last updated May 5, 2012

This site's top page is http://okmij.org/ftp/

oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from SXML by SXML->HTML