From oleg-at-okmij.org Wed Apr 27 16:17:04 2005
To: haskell@haskell.org
Subject: Zipper as a delimited continuation
X-comment: Aug 26, 2009: implemented a different convention for the
  direction of traversal steps. It is less convenient but more
  intuitive since it takes one step to reach a parent of a subterm. The
  addendum was prompted by a conversation with Ron de Bruijn, who
  pointed out that the traversal directions in the original `traverse'
  are unintuitive. In particular, it takes more than one Up step to
  really reach a parent of a subterm.
X-comment: Feb 2011: Switching to the delimcc library
Message-ID: <20050427231704.DE8F9ABCC@Adric.metnet.navy.mil>
Date: Wed, 27 Apr 2005 16:17:04 -0700 (PDT)
Status: OR


This is the first part of a reply to a query about a zipper with two
foci, posted on this list by Oktaviandi Hadi Nugraha on Apr 13. In
this part we introduce the framework to answer the question.

Our treatment of zipper is quite different from that of Huet (JFP,
1997) and Hinze and Jeuring (JFP 2001). Our zipper is polymorphic over
the data structure to traverse, and the zipper creation procedure is
generic and does not depend on the data structure at all.  Different
data structures or different realizations of the same abstract data
structure can use the same zipper and the same zipper creation and
manipulation functions. Our zipper type depends only on the interface
(but not the implementation!) of a traversal function. Our zipper is a
derivative of a traversal function rather than that of a data
structure itself.

Zipper is a construction that lets us replace an item deep in a
complex data structure, e.g., a tree or a term, without any
mutation. The resulting data structure will share as much of its
components with the old structure as possible. The old data structure
is still available (which can be handy if we wish to 'undo' the
operation later on). Zipper is essentially an `updateable' and yet
pure functional cursor into a data structure.

Zipper is also a delimited continuation reified as a data
structure. In this message, we use delimited continuations directly to
derive the zipper. We will be relying on the delimited continuation
library delimcc, which permits single and multiple prompts.

> -- Haskell98!
> module Zipper1 where
>
> -- Import delimcc, see
> -- http://okmij.org/ftp/continuations/implementations.html#CC-monads
> import CCExc
> -- import CCCxce  -- would work as well

> import Control.Monad.Identity
> import Control.Monad.Trans


The derivation of the zipper starts with a term traversal function. The
zipper will be as good and powerful as the traversal function. Let us
adopt as a running example the familiar and dear data structure:

> data Term = Var String | A Term Term | L String Term 
>             deriving (Eq)
> instance Show Term where
>   show term = showt 0 term 
>    where showt _ (Var s)   = s
>          showt p (A e1 e2) = paren (p > 10) 
>                               (showt 10 e1 ++ " " ++ showt 11 e2)
>          showt p (L v e)   = paren (p > 0) ("L" ++ v ++ ". " ++ showt 0 e)
>          paren True  s = "(" ++ s ++ ")"
>          paren False s = s


In this message, we chose the following function to traverse the lambda-term:

> data Direction = Down | DownRight | Up | Next deriving (Eq, Show)
> traverse :: Monad m => (Term -> m (Maybe Term, Direction)) -> Term -> m Term
> traverse tf term = do
>   (term', direction) <- tf term
>   let new_term = maybe term id term'
>   let select Up t = return t
>       select Next t@(Var _) = return t
>       select dir t@(L v t1) | dir == Next || dir == Down = do
>   	  t1' <- traverse tf t1
>   	  return $ L v t1'
>       select DownRight t@(A t1 t2) = do
>   	  t2' <- traverse tf t2
>   	  return $ A t1 t2'
>       select dir t@(A t1 t2) | dir == Next || dir == Down = do
>   	  t1' <- traverse tf t1
>   	  t2' <- traverse tf t2
>   	  return $ A t1' t2'
>   select direction new_term

The function `traverse' receives the traversal function `tf' and the
lambda-term, and walks and _updates_ the term, as guided by `tf'.  The
traversal function receives the current subterm and should return a
pair. If the first component of a pair is (Just t'), then t' replaces
the current subterm. The second component of tf's result is the
direction to proceed. The direction Next means proceed in the
depth-first order. Other directions may be used to skip some parts of
the term and so avoid walking the whole term. The function is written
in a monadic style (for an arbitrary monad m). We shall need that
later.

[Addendum] 
I guess I should have used better names for Up, Down, etc. For
example, one can see from the code for 'traverse' that Up means we
are done with the term, and Next and Down are almost synonymous. To
move to the parent of a subterm, we almost always have to make two
steps Up. I guess I visualized the direction not by the destination
point but by the source point: where to go from the current term:
traverse its children or not. At the end of the file, we show
a different choice of step directions.
[/Addendum]


We will be using the following term as a running example:

> -- P2 numeral
> term1 = L "f" (L "x" (A (A f (L "f" (A f (L "f" (L "x" x)))))
> 			  (A (A f (L "f" (L "x" x))) x)))
>  where [f,x] = map Var ["f","x"]

The first test simply traverses the whole term, makes no alterations
and returns the (copy of the) term:

> testt1 = runIdentity (traverse (\term -> return (Nothing,Next)) term1)

-- *Zipper1> testt1 == term1
-- True

To make sure that we really traverse the term, we can print out all
the encountered subterms:

> testt2 = traverse tf term1
>     where tf term = print term >> return (Nothing,Next)


We instantiate `traverse' for the IO monad this time. We can skip some
parts of the term during the traversal:

> testt3 = traverse tf term1
>     where 
>     tf term@(A (Var "f") _)  = do
> 			         print "cutting" >> print term
> 			         return (Nothing,Up)
>     tf term = print term >> return (Nothing,Next)

and we can modify the term

> testt4 = runIdentity (traverse tf term1)
>     where tf (L "x" (Var "x")) = return (Just (L "y" (Var "y")),Next)
> 	    tf _ = return (Nothing,Next)

which indeed returns term1 with all occurrences of (L "x" (Var "x"))
replaced with (L "y" (Var "y")).


Now we an introduce the zipper:

> data Zipper m term dir = 
>     Zipper term ((Maybe term, dir) -> CCW m (Zipper m term dir)) 
>   | ZipDone term

The traversal monad is the CC delimcc transformer applied to the base
monad "m".  We will be using the prompt flavor PS. There is only one
prompt of this flavor, bound to the global variable |ps|, for the
fixed answer-type Zipper m term dir (the answer-type is recursive).

> type CCM m term dir a = CC (PS (Zipper m term dir)) m a 
> type CCW m w = CC (PS w) m w

Zipper is indeed polymorphic over the term to traverse (as well over the
source monad 'm'). We can use this Zipper, as it is, for _any_ data
structure that can be traversed by a function that looks like
`traverse'. 

Creating the zipper equally generic:

> zip'term :: (Monad m) =>
>   ((term -> CCM m term dir (Maybe term, dir)) -> term -> CCM m term dir term)
>   -> term -> CCW m (Zipper m term dir)
> zip'term trav term = pushPrompt ps (trav tf term >>= return . ZipDone)
>     where tf term = shift0P ps (\k -> return (Zipper term k))

Both Zipper and its creation function depend neither on the
representation of the term nor on the traversal strategy.  All the
information about the data structure and its traversal is
encapsulated in one single function `trav'.

We can now examine term1, subterm by subterm, using the cursor,
zipper, rather than the enumerator, traverse:

> zip'through (ZipDone term) = liftIO (print "Done" >> print term)
> zip'through (Zipper term k) = do
>   liftIO $ print term
>   k (Nothing,Next) >>= zip'through
> tz1 :: IO ()
> tz1 = runCC $ zip'term traverse term1 >>= zip'through

We have effectively `inverted' the operation of term traversal. With
the cursor, we can keep the arbitrary state from one traversal step to
another.

The cursor provided by zipper is updateable. We can now descend (or
walk) to the desired node, replace it, and zip up the result to yield
the updated term:

> zip'move dir (Zipper term k) = 
>   liftIO (print dir >> print term) >> k (Nothing,dir)

> zip'upr (Zipper term k) nt = do 
>   liftIO (print term >> print "replacing with" >> print nt)
>   k (Just nt,Up)
> zip'all'the'way'up (ZipDone term) = return term
> zip'all'the'way'up (Zipper term k) = 
>   k (Nothing,Up) >>= zip'all'the'way'up
>
> tz2 :: IO () 
> tz2 = runCC $ do
>   z <- zip'term traverse term1
>   z1 <- zip'move Next z
>   z1 <- zip'move Next z1
>   z2@(Zipper (A _ _) k) <- zip'move DownRight z1
>   res <-  zip'upr z2 (A (Var "x") (Var "x")) >>= zip'all'the'way'up
>   liftIO $ (print "Result" >> print res)
>   --zip'through z1

*Zipper1> tz2
Next
Lf. Lx. f (Lf. f (Lf. Lx. x)) (f (Lf. Lx. x) x)
Next
Lx. f (Lf. f (Lf. Lx. x)) (f (Lf. Lx. x) x)
DownRight
f (Lf. f (Lf. Lx. x)) (f (Lf. Lx. x) x)
f (Lf. Lx. x) x
"replacing with"
x x
"Result"
Lf. Lx. f (Lf. f (Lf. Lx. x)) (x x)

We obtain an arbitrary number of cursors over the same data structure:
for example, in `tz2', the cursor 'z1' is still valid at the end, and
still points out to the `third' subterm. The cursor 'z2' is valid as
well. All these cursors are isolated: the updates done with the cursor
'z2' are invisible to the cursor 'z1' (as we can see if we uncomment
the last statement in tz2). Essentially, each cursor runs in its own
transaction. In the second part we will discuss how to make cursors
that see the updates of each other. Contrary to the traditional
database wisdom, for zippers, it is far harder to implement lower
isolation modes than the higher ones.


[Addendum]
We now implement a different convention for traversal steps:

> data Direction1 = FirstKid | RightKid | Parent deriving (Eq, Show)

Now it takes one step, Parent, to reach a parent of a subterm. We also
clarified the names of the steps to reach the first child or the last
child of a composite term.


> traverse1 :: (Monad m) => 
>              (Term -> m (Maybe Term, Direction1)) -> Term -> m Term
> traverse1 tf term = loop term
>  where 
>  loop term = do
> 	(term', direction) <- tf term
> 	let new_term = maybe term id term'
> 	let select Parent t = return t
> 	    select dir t@(L v t1) | dir == FirstKid || dir == RightKid = do
> 	       t1' <- loop t1
> 	       loop $ L v t1'
> 	    select RightKid t@(A t1 t2) = do
> 	       t2' <- loop t2
> 	       loop $ A t1 t2'
> 	    select FirstKid t@(A t1 t2) = do
> 	       t1' <- loop t1
> 	       loop $ A t1' t2
> 	select direction new_term

The zipper functions remain exactly as they are. The following example
illustrates the full traversal of a sample term
    (L "x" (A (Var "a") (Var "b")))

> tz3 :: IO () 
> tz3 = runCC $ do
>   z <- zip'term traverse1 (L "x" (A (Var "a") (Var "b")))
>   z <- zip'move FirstKid z
>   z <- zip'move FirstKid z
>   z <- zip'move Parent z
>   z <- zip'move RightKid z
>   z <- zip'move Parent z
>   z <- zip'move Parent z
>   ZipDone term <- zip'move Parent z
>   liftIO $ print "Done:"
>   liftIO $ print term

FirstKid
Lx. a b
FirstKid
a b
Parent
a
RightKid
a b
Parent
b
Parent
a b
Parent
Lx. a b
"Done:"
Lx. a b

The new traverse1 may be more intuitive but less convenient: there are
no Next step.