Breadth-first labeling

Although the main problem is about trees, it can be described and easily understood on lists. Let us remind ourselves of a few common (and less common) operations on lists, to be used throughout.

One of the most fundamental and familiar list operations is mapping, defined in the OCaml standard List library as

    val map : ('a -> 'b) -> 'a list -> 'b list

It transforms a list preserving its shape (i.e., length), using the given mapping function 'a->'b to transform elements. The element transformation depends only on the element's value: the mapping function is supposed to be pure. In any event, the order of applying the mapping function to list elements is explicitly left unspecified.

The next important operation is what some may call `visiting'; the OCaml standard List library calls it

    val iter : ('a -> unit) -> 'a list -> unit

It visits each element of the argument list in turn, applying the given (effectful) function 'a -> unit. The order of its applications is now specified: strictly in the list order.

The List library also defines left folding, or accumulating traversal (also called `reduce'):

    val fold_left : ('z -> 'a -> 'z) -> 'z -> 'a list -> 'z

It may be regarded as a `pure functional' interface to visiting. Indeed, the two operations are implementable in terms of each other, demonstrating they are two guises of the same algorithm:

    let iter : ('a -> unit) -> 'a list -> unit = fun f l ->
      fold_left (fun _ x -> f x) () l
    
    let fold_left : ('z -> 'a -> 'z) -> 'z -> 'a list -> 'z = fun f z l ->
      let zr = ref z in iter (fun x -> zr := f !zr x) l; !zr

The state (or, `heap') passing implicit in iter is made explicit in left folding interface.

What Okasaki calls `numbering' is the operation that takes a collection and builds the new one of the same shape with each element replaced by its index in a traversal (visiting) order. For lists, this operation has the signature:

    val list_num : int -> 'a list -> int list

where the first argument sets the index for the first element. One might also want to preserve the original elements, pairing them with the indices

    val list_lab : int -> 'a list -> (int * 'a) list

We will call such an operation `labeling'. For example, list_lab 101 [2;3;5;7;11] should return [(101,2);(102,3);(103,5);(104,7);(105,11)]. One may want to generalize labeling even further, using as labels not just consecutive integers but any stream or list of sufficient length.

All of the above are the instances of the general operation: accumulating, or `stateful' map. In its effectful guise, it applies an (effectful) transformer to each element of the list, strictly in order, collecting the results in the output list -- what in Haskell is called mapM:

    let rec map_accum' : ('a -> 'b) -> 'a list -> 'b list = fun f -> function
        | []   -> []
        | h::t -> let h' = f h in h'::map_accum' f t

The pure functional version explicates the state passing. The mapping function receives the state (the accumulator) as the first argument, returning the final accumulator along with the mapping result:

    let rec map_accum : ('z -> 'a -> ('z*'b)) -> 'z -> 'a list -> ('z*'b list) = fun f z -> function
        | []   -> (z,[])
        | h::t -> let (z,h') = f z h in let (z,t') = map_accum f z t in (z,h'::t')

Numbering and labeling are the particular instances of the accumulating mapping. For example:

    let ilabeling : int -> 'a -> int * (int*'a) = fun z x -> (z+1,(z,x))
    
    let list_lab : int -> 'a list -> (int * 'a) list = fun z l ->
      map_accum ilabeling z l |> snd

(this particular ilabeling accumulating transformer will appear in many examples below). Other operations can also be expressed in terms of map_accum, for example, zipping with a list of sufficient length. It is a pity that the accumulating map is under-appreciated and not often provided in standard libraries.

Just like iter and fold_left, map_accum and map_accum' are implementable in terms of each other. For instance:

    let map_accum : ('z -> 'a -> ('z*'b)) -> 'z -> 'a list -> ('z*'b list) = fun f z l ->
       let zr = ref z in
       let l' = map_accum' (fun x -> let (z,b) = f !zr x in zr := z; b) l in
       (!zr,l')

One may hence view the earlier pure map_accum code as the result of inlining map_accum' in the above and `working out' the zr passing, eventually dissolving the mutation.

Implementing the `pure' interface map_accum in terms of the effectful map_accum' is the pattern we shall follow. We do not shy away from stateful code so long as the state can be properly encapsulated in the end. A `pure functional' implementation with the explicit threading of state (such as the original map_accum) does not eliminate state from the algorithm but clutters the code with explicit state passing. Using mutable state in internal functions lets us focus on interesting parts of the algorithm.

The problem

The overall goal is to implement numbering, labeling -- in general, map-accumulating -- on trees, in breadth-first traversal order.

Consider a binary tree with values of the type 'a in leaves and nodes -- for example, as represented by the following OCaml data type:

    type 'a tree = Leaf of 'a | Node of 'a * 'a tree * 'a tree

We shall refer to both Leaf and Node as tree nodes. Here is a sample tree, with the integer payload:

    let tree1 = 
      Node (1, Node (2, Leaf  3, 
                        Node (4,  Leaf 5, 
                                  Leaf 6)),
               Node (12,Node (14, Leaf 15, 
                                  Leaf 16), 
                        Leaf 13))

The problem is to write the accumulating, or effectful map on trees:

    val bfa' : ('a -> 'b) -> 'a tree -> 'b tree
    val bfa  : ('z -> 'a -> 'z*'b) -> 'z -> 'a tree -> ('z*'b tree)

which traverses the tree breadth-first applying the payload transformer to each tree node value in turn, returning the transformed tree of the same shape in the end. Furthermore, the running time of the entire transformation should be linear in the size of the tree. This is the most general statement of the problem, encompassing all simpler variations considered so far.

Since the bfa' and bfa (like map_accum' and map_accum earlier) are expressible in terms of each other, it is enough to implement one of them. (One can transform the code of the effectful bfa' to pass the accumulator explicitly, obtaining the mutation-free bfa. One might also be content with encapsulating state inside the outwardly pure bfa.)

The reader may wish to pause and try writing bfa or bfa' on their own. At the very least, one ought to try implementing the simpler depth-first accumulating mapping.

An example of accumulating mapping is labeling: for our sample tree1, bfa ilabeling 101 tree1 should produce the following tree

    Node ((101, 1), Node ((102, 2), Leaf  (104, 3),
                                    Node ((105, 4),  Leaf (108, 5), 
                                                     Leaf (109, 6))),
                    Node ((103, 12),Node ((106, 14), Leaf (110, 15), 
                                                     Leaf (111, 16)),
                                    Leaf  (107, 13)))

This article derives and compares several solutions: with and without mutation, strict and lazy, some described in the Okasaki paper, and some are not.

References

Chris Okasaki: Breadth-first numbering: lessons from a small exercise in algorithm design
ICFP 2000, doi:10.1145/351240.351253
This is the original paper that presented the (simpler) numbering problem and several solutions. The tree data type used in that paper had only Nodes bearing payload values. Leaves had no values.

bfn.ml [17K]
The complete OCaml code accompanying the article

Why it is interesting

Albeit toy, the problem is not obvious. Accumulating map is an amalgamation of visiting and rebuilding. Whereas visiting (traversing) a tree is easy either depth-first or breadth-first, rebuilding is much easier depth-first. Indeed, consider the following tree mapper: it traverses a tree depth-first, rebuilding it. Thanks to structural recursion, the code can't be clearer.

    let rec tree_map : ('a -> 'b) -> 'a tree -> 'b tree = fun f -> function
      | Leaf x       -> Leaf (f x)
      | Node (x,l,r) -> Node (f x, tree_map f l, tree_map f r)

Although breadth-first traversal is also straightforward, it is not structurally recursive. (As Gibbons and Jones put it, ``depth-first traversal runs with the grain of the tree, but breadth-first traversal runs against the grain.'') Let's look into it, since it will be useful later.

The analogue of List.iter is traversing a tree in breadth-first order, performing an effectful operation on each visited payload value:

    val bfi : ('a -> unit) -> 'a tree -> unit

When breadth-first traversing the tree Node (x,l,r), we first visit x, then the root of l, then the root of r, then the roots of the trees in the sequence built from from the children of l and r, etc. Clearly, we need a more general operation: breadth-first traversing a sequence of trees (or, a forest):

    val bfi_forest : ('a -> unit) -> 'a tree list -> unit

Here is the first implementation of these operations, which we call the `level approach':

    let rec bfi_forest : ('a -> unit) -> 'a tree list -> unit = fun f -> function
       | []     -> ()
       | forest -> 
          let () = List.iter (function Leaf x | Node (x,_,_) -> f x) forest in
          map_concat (function Leaf _ -> [] | Node (_,l,r) -> [l;r]) forest |>
          bfi_forest f
    
    let bfi : ('a -> unit) -> 'a tree -> unit = fun f t -> bfi_forest f [t]

That is, to breadth-first traverse trees in the forest, we first visit the roots of the trees, then `cut the roots' and form a new forest from the remaining branches, in order, and traverse it. For our sample tree, bfi (Printf.printf "%d ") tree1 prints 1 2 12 3 4 14 13 5 6 15 16.

The code is written for ease of understanding rather than efficiency. Still, it is asymptotically optimal: the traversal takes linear time in the size of the tree. One can clearly speed it up, twice, by fusing multiple traversals of the forest.

In contemplating such fusion one notices that bfi_forest deals with two forests at a time, representing the current level (the level of roots) and the next level (the level of children). We take a tree from the forest, visit its root and add its children, if any, to the second forest. If we use a list (last-in, first-out data structure) to represent a forest, it is more convenient to accumulate the second forest in the reverse order. We put that forest in the correct order once we turn to visiting it. All in all, we arrive at the following, also linear, algorithm:

    let rec bfi_forests : 
        ('a -> unit) -> 'a tree list * 'a tree list -> unit = fun f -> function
      | ([],[])   -> ()
      | ([],next) -> bfi_forests f (List.rev next,[])
      | (Leaf x::t,next)       -> f x; bfi_forests f (t,next)
      | (Node (x,l,r)::t,next) -> f x; bfi_forests f (t,r::l::next)
    
    let bfi : ('a -> unit) -> 'a tree -> unit = fun f t -> bfi_forests f ([t],[])

A pair of lists, in opposite orders, is the well-known pure functional implementation of queues. No wonder: one easily sees in bfi_forests the textbook breadth-first traversal algorithm using a queue of trees: we visit the root of the tree at the head of the queue, pushing its children branches to the other end of the queue. One could have written bfi_forests (which we call `the queue approach') first and derived the earlier, `level approach', by visiting all roots `in bulk'.

I, however, prefer to think of a pair of lists, in opposite orders, as a Huet zipper rather than a queue. The procedure bfi_forests zips through the source forest, accumulating the new one.

The pure functional guise of visiting is (left) folding, trivially implementable in terms of the earlier bfi (again highlighting that visiting and folding are just two interfaces of the same traversal process):

    let bf_fold_left : ('z -> 'a -> 'z) -> 'z -> 'a tree -> 'z = fun f z t ->
      let zr = ref z in
      bfi (fun x -> zr := f !zr x) t;
      !zr

Using either level or queue implementation of bfi, one can eliminate the mutation by working out the `heap passing' into bfi. We leave this mutation-free implementation of bf_fold_left as an exercise to the reader (but see the accompanying code).

References

Jeremy Gibbons and Geraint Jones: The under-appreciated unfold. IFCP 1998, doi:10.1145/291251.289455
That paper presents derivations of level-based traversal in terms of folds, unfolds, and long zips. Our level approach bfi_forest, in functional setting, also makes an appearance. It was communicated to the authors by Bernhard Moeller around 1993.

Naive solution

The first solution is clearly dumb, and (in the eyes of some people) damned, because of its use of mutation. It is also clearly correct and has the desired time complexity.

    let bfa'_naive : ('a -> 'b) -> 'a tree -> 'b tree = fun f t ->
      let t' = tree_map (fun x -> (ref None,x)) t in
      bfi (fun (rf,x) -> rf := Some (f x)) t';
      tree_map (fun ({contents = Some b},_) -> b) t'

Since breadth-first rebuilding of a tree is complicated (in fact, we have not even attempted it yet), let's build the tree depth-first -- that's the general idea. Or, in more detail:

number the tree nodes depth-first. The addresses of newly allocated reference cells are such unique node identifiers;
traverse the tree breadth-first and compute the map that associates a node identifier with the payload in the (yet to be built) result tree;
finally, traverse the tree depth-first one more time, building the result of the same shape and using the above map to determine the payload for each node.

Put this way, bfa'_naive can be written without mutations -- maintaining the time complexity given an appropriate data structure to represent the map.

Incidentally, bfa'_naive makes essentially three passes over the tree -- which we will often see in other solutions.

Although the naive solution looks inferior, in a twisted way it will come back to us. For now, let's consider what one may call `classical' solutions.

Breadth-first labeling via queue/zipper

A different set of solutions emerges when we bite the bullet and contemplate rebuilding trees breadth-first. We have written the code for breadth-first visiting: bfi and bf_fold_left -- in fact, two versions of the code, using the level and queue approaches. Can we somehow adjust that code to not just visit a tree branch but return a new one of the same shape?

Let's look at the queue implementation of bfi first. Recall, the key data structure was the tree forest zipper (a pair of tree forests)

    type 'a forest_zipper = 'a tree list * 'a tree list

and the key operation was

    val bfi_forests : ('a -> unit) -> 'a forest_zipper -> unit

that traversed the zipper using the given function to visit every node in turn. To turn the visiting into rebuilding (mapping), we need the operation of the following signature:

    val bfa_forests : ('a -> 'b) -> 'a forest_zipper -> 'b forest_zipper

It should take a (possibly effectful) mapping function and apply it to the forest represented by the zipper breadth-first, in order, returning the zipper for the rebuilt forest. The key invariant to maintain is the returned zipper having the same shape as the input one (that is, the two zippers must be equal modulo the differences in the payload of tree nodes).

Let us see if we can adjust bfi_forests code to obtain bfa_forests that preserves the desired invariant. It turns out rather easy:

    let rec bfa_forests : ('a -> 'b) -> 'a forest_zipper -> 'b forest_zipper = fun f -> function
      | ([],[])   -> ([],[])
      | ([],next) -> let (next',[]) = bfa_forests f (List.rev next,[]) in
                     ([],List.rev next')
      | (Leaf x::t,next) -> 
          let v = f x in
          let (t',next') = bfa_forests f (t,next) in
          (Leaf v::t',next')
      | (Node (x,l,r)::t,next) -> 
          let v = f x in
          let (t',r'::l'::next') = bfa_forests f (t,r::l::next) in
          (Node (v,l',r')::t',next')

The preservation of the invariant should be clear (by simple induction). That the mapping function f:'a->'b is applied in breadth-first order is also clear (after all, bfa_forests is just a variation of bfi_forests). Linear time complexity (in the size of all trees in the input zipper) is also easy to see. In fact, the code effectively traverses the nodes of all zipper trees three times, counting the two List.rev. However, each constructor in the tree data structure is examined only once.

The desired bfa' is the simple wrapper

    let bfa' : ('a -> 'b) -> 'a tree -> 'b tree = fun f t ->
      let ([t'],[]) = bfa_forests f ([t],[]) in t'

Its correctness follows from the invariant of bfa_forests.

This is essentially the queue-based solution presented in Okasaki's ICFP 2000 paper in Fig. 3 (with the queue implemented as a pair of lists). The mutable state in bfa' can be easily eliminated by threading the state of the stream, which is what Okasaki's code essentially does (Okasaki considers a simpler breadth-first numbering problem, where the state threading can be replaced by the adjustment of the initial number).

Okasaki stresses that this queue-based solution is surprisingly uncommon: he posed this problem to many people, and only one other person gave that solution. It is indeed surprising that this solution is so uncommon: for us, it emerged naturally, as a simple variation of the standard breadth-first traversal.

References

Chris Okasaki: Breadth-First Numbering: An Algorithm in Pictures
<http://okasaki.blogspot.com/2008/07/breadth-first-numbering-algorithm-in.html>
A pictorial explanation of the queue-based algorithm

Level-based solution

Let's turn to the level-based breadth-first visiting and modify it for the breadth-first tree rebuilding/mapping.

Recall, the key data structure was the tree forest (the list of trees) and the key operation was

    val bfi_forest : ('a -> unit) -> 'a tree list -> unit

To turn the visiting into rebuilding (mapping), we need the operation of the following signature:

    val bfa_forest : ('a -> 'b) -> 'a tree list -> 'b tree list

It should take a (possibly effectful) mapping function and apply it to the forest breadth-first, in order, returning the rebuilt forest of the same shape.

Modifications to bfi_forest to obtain bfa_forest are just as straightforward as in the queue/zipper approach:

    let rec bfa_forest : ('a -> 'b) -> 'a tree list -> 'b tree list = fun f -> function
      | [] -> []
      | forest -> 
          let payload' = map_accum' (function Leaf x | Node (x,_,_) -> f x) forest in
          let next     = map_concat (function Leaf _ -> [] | Node (_,l,r) -> [l;r]) forest in
          let next'    = bfa_forest f next in
          rebuild (forest,payload',next')
    
    and rebuild : 'a tree list * 'b list * 'b tree list -> 'b tree list = function
      | ([],[],[]) -> []
      | (Leaf _::t, v::payload, children) -> 
          Leaf v :: rebuild (t,payload,children)
      | (Node _::t, v::payload, l::r::children) ->
          Node (v,l,r) :: rebuild (t,payload,children)

First we map the roots of the forest trees computing the payload for the tree roots in the result forest. We then compute the forest next of the children branches and map it, obtaining next'. Finally, we rebuild the trees, using the mapped payload for the roots, the mapped children, and the original forest (which tells the top-level shape of the source forest, and hence of the result forest). That the forest shape is preserved is hence clear. The order of applying the mapping function, inherited from bfi_forest, is breadth-first, as desired. The time complexity is also clearly linear; each constructor in the tree data structure is examined three times (although two of the repeated tree traversal may easily be fused).

Again, the desired bfa' is a simpler wrapper over bfa_forest.

This is essentially the levels solution in Okasaki's ICFP 2000 paper (Fig. 5), only a bit more principled, without the hack of computing lengths and dividing by two, etc. Our code hence is easy to extend to non-binary trees.

Breadth-first by depth-first

Whatever one may say about the naive solution, using simple structural recursion to traverse and rebuild trees is too good an idea to pass on. The key, recall, was a correspondence between tree nodes and indices in breadth-first traversal order -- maintained in a round-about way through mutable cells. Let's try to be more thoughtful about this correspondence.

The `natural' identifier of each tree node is its path from the root. The breadth-first order p < q on paths p and q is defined as

length p < length q
length p = length q and p is lexicographically less than q

The definition naturally leads to a `breadth-first view' of a tree as an ordered sequence of levels

    type 'a bfview = 'a list list

A level contains payload values that come from the nodes of the same path length. Levels within the view are ordered by path length: the first level contains the payload of the tree root (path length 0); the second level contains the payloads of its two children, etc. Within a level the values are in the lexicographic order of the paths of the corresponding nodes.

Interestingly, we can build a bfview and rebuild the tree from a bfview using structural recursion and in linear time:

    let df_to_bfview : 'a tree -> 'a bfview = fun t ->
      let rec loop : 'a bfview -> 'a tree -> 'a bfview = 
      fun (h::lv) -> function 
        | Leaf x -> (x::h)::lv
        | Node (x,l,r) -> 
            let lvl = loop (if lv = [] then [[]] else lv) l in
            let lvr = loop lvl r in
            (x::h)::lvr
      in 
      loop [[]] t |> List.map List.rev

If one look closely, the internal loop is almost the depth-first tree left fold. For the sample tree1, its breadth-first view is [[1]; [2; 12]; [3; 4; 14; 13]; [5; 6; 15; 16]].

The rebuilding function is also structurally recursive and taking linear time. It needs the source tree solely to provide the shape for the result:

    let df_of_bfview : 'a tree -> 'b bfview -> 'b tree = fun t lv ->
      let rec loop : 'b bfview -> 'a tree -> 'b tree * 'b bfview =
      fun ((x::h)::lv) -> function 
        | Leaf _ -> (Leaf x, h::lv)
        | Node (_,l,r) -> 
            let (l',lvl) = loop lv  l in
            let (r',lvr) = loop lvl r in
            (Node (x,l',r'), h::lvr)
      in 
      fst @@ loop lv t

The internal loop is essentially the depth-first tree accumulating map. It is easy to see that df_of_bfview tree (df_to_bfview tree) is tree for all trees.

Finally, we notice that the breadth-first accumulating map can be performed on a breadth-first view, trivially:

    let bfview_map_accum : ('z -> 'a -> 'z*'b) -> 'z -> 'a bfview -> ('z*'b bfview) = fun f -> 
      map_accum (map_accum f)

Composing the three above functions gives the desired bfa:

    let bfa : ('z -> 'a -> 'z*'b) -> 'z -> 'a tree -> ('z*'b tree) = fun f z t ->
      df_to_bfview t |> bfview_map_accum f z |> (fun (z,lv) -> (z, df_of_bfview t lv))

This is the essence of the naive bfa'_naive; the three parts of the composition correspond to the three lines in the bfa'_naive code. It was not dumb after all.

References

BFN.hs [5K]
A similar in spirit level-based solution, in Haskell (July 20, 2009)

Jeremy Gibbons and Geraint Jones: The under-appreciated unfold. IFCP 1998, doi:10.1145/291251.289455
The paper gives alternative ways to compute 'a bfview (called `levels' in that paper) in terms of right folds and unfolds. Although these pleasing characterizations are less efficient, they can be transformed (using accumulator-passing and deforestation) into what looks quite like our df_to_bfview.

Lazy solution

Contemplating the bfa in the previous section one begins to wonder if bfview_map_accum and df_of_bfview may be fused. Can we map-accumulate levels as we rebuild a tree? We can -- provided we somehow know the initial accumulator values for each level in a bfview. In the code below, these initial values are collected in a 'z list: the first element is the accumulator value to visit and rebuild the root; the second element is the starting accumulator to use when visiting and rebuilding the root's children, etc.

    let rec bf_tree_accum : ('z -> 'a -> 'z*'b) -> 'z list -> 'a tree -> ('z list * 'b tree) = 
     fun f (z::zs) -> function
       | Leaf x -> let (z',b) = f z x in (z'::zs, Leaf b)
       | Node (x,l,r) -> 
           let (z',b) = f z x in
           let (zsl,l') = bf_tree_accum f zs  l in
           let (zsr,r') = bf_tree_accum f zsl r in
           (z'::zsr, Node (b,l',r'))

The function does look like a particular kind of an accumulating map on a tree. Accumulating maps do keep coming up, don't they.

To actually use bf_tree_accum we need the list of initial accumulator values for each bfview level. The reader may wish to pause and think how to compute it.

Let us do a small experiment, on the running example bfa ilabeling 101 of labeling tree1. The tree has four levels; the initial labels for each level have to be [101; 102; 104; 108]. Using this list as the argument to bf_tree_accum, we obtain as the result of bf_tree_accum ilabeling [101; 102; 104; 108] tree1 the desired labeled tree and the list of final accumulator values per level. That list is [102; 104; 108; 112]. Look closer: the resulting list, without the last element, is the tail of the argument list. According to Jeremy Gibbons, that was Geraint Jones' original discovery. In hindsight, it is obvious: the final accumulator value of traversing a level is the initial accumulator value for the next level. Hence the 'z list argument to pass to bf_tree_accum is essentially that function's result. Tying the knot comes to mind.

The rest is technicality. We need deferred computations provided by OCaml's Lazy module and the syntax form lazy exp to return a value of the type 'a Lazy.t without evaluating the expression exp. The evaluation of thus suspended expression is forced by Lazy.force : 'a Lazy.t -> 'a. Deferred computations can be transformed (without forcing them) by

    let lfmap : ('a -> 'b) -> 'a Lazy.t -> 'b Lazy.t = 
      fun f x -> lazy (f (Lazy.force x))

The sequence of accumulator values that feeds back into bf_tree_accum must be a `lazy' stream, whose tail might not be computed yet:

    type 'a lstreamv = Cons of 'a * 'a lstream 
     and 'a lstream  = 'a lstreamv Lazy.t

A few combinators help dealing with lazy values and streams:

    val lfst  : ('a * 'b) Lazy.t -> 'a Lazy.t
    val llfst : ('a Lazy.t * 'b) Lazy.t -> 'a Lazy.t
    val lsnd  : ('a * 'b) Lazy.t -> 'b Lazy.t
    val lhd   : 'a lstream -> 'a Lazy.t
    val ltl   : 'a lstream -> 'a lstream
    val lcons : 'a Lazy.t -> 'a lstream -> 'a lstream

The types tell what they are supposed to do.

The following is the version of bf_tree_accum adjusted for the lazy stream of accumulator values:

    let rec bfa_lazy' : ('z -> 'a -> 'z*'b) -> 'z lstream -> 'a tree -> 'z lstream * 'b Lazy.t tree =
      fun f zs -> function
        | Leaf x -> let zb = lfmap (fun z -> f z x) (lhd zs) in
                    (lcons (lfst zb) (ltl zs), Leaf (lsnd zb))
        | Node (x,l,r) -> 
            let zb = lfmap (fun z -> f z x) (lhd zs) in
            let (zsl,l') = bfa_lazy' f (ltl zs) l in
            let (zsr,r') = bfa_lazy' f zsl r in
            (lcons (lfst zb) zsr, Node (lsnd zb,l',r'))

Using OCaml (which, unlike Haskell, distinguishes deferred computations in types) makes it clear that the spine of the result tree is computed eagerly, but the payload values lazily.

The following ties the knot

    let bfa_lazy : ('z -> 'a -> 'z*'b) -> 'z -> 'a tree -> 'b tree = fun f z t ->
      let rec zst = lazy (bfa_lazy' f (lcons (lazy z) (llfst zst)) t) in
      Lazy.force zst |> snd |> tree_map Lazy.force

The signature is a bit off: returning the final accumulator value is a bit messy and left as an exercise.

This is the (general form of the) lazy solution to breadth-first labeling, discovered by Jones and Gibbons back in early 1990s. Nowadays that solution is often presented in Haskell. Albeit elegant, the Haskell version obscures the data flow and hides memory costs (our 'a tree data type is fully strict, without hidden thunks, and can be compactly laid out in memory).

References

Geraint Jones and Jeremy Gibbons. Linear-time breadth-first tree algorithms: An exercise in the arithmetic of folds and zips. Technical Report No. 71, University of Auckland, 1993. (aka IFIP Working Group 2.1 working paper 705 WIN-2)
<http://www.cs.ox.ac.uk/publications/publication2363-abstract.html>

BFNLazy.hs [2K]
My rediscovery of the lazy algorithm, in a slightly different setting (September 8, 2009)

Breadth-first labeling

Preliminaries: numbering, labeling and map-accumulating -- on lists

The problem

References

Why it is interesting

References

Naive solution