Algebraic Data Types and Pattern-Matching

We present a systematic embedding of algebraic data types and their (recursive) processing using pattern-matching, and illustrate it on examples of sums and recursive sums of products (strict and lazy trees). The method preserves all advantages of the tagless-final style, in particular, multiple interpretations -- such as evaluating the same DSL term strictly and non-strictly, and pretty-printing it as OCaml and Lua code. In effect, we may write Lua code with pattern-matching and type safety. As another application, we investigate the efficiency of emulating left-fold via right-fold, in call-by-value, call-by-name and call-by-need.



Many examples of tagless-final DSL embedding center on languages with numbers and functions. Occasionally, arrays and lists make an ad hoc appearance. One may, however, want to embed a language with tuples, sums (such as option a.k.a Maybe), as well as various trees. Rather than designing the embedding for each particular data type from scratch, one would ideally apply a general method, almost mechanically. Such a general and convenient embedding recipe has been lacking so far.

Examples of ADT embedding in tagless-final style are rare because ADTs invariably evoke pattern-matching -- their biggest attraction -- and pattern-matching in tagless-final style is known to be awkward. Or so I thought.

This article demonstrates a systematic embedding of a language with ADTs, and their recursive processing (constructing, transforming, consuming) using pattern-matching. The main idea is representing pattern-matching in the object language (DSL) as pattern-matching in the metalanguage. With OCaml as the metalanguage, pattern-matching on object terms, using function (or match), becomes quite convenient.

Representing data types in tagless-final style is, in principle, not difficult. The principles go back to Goedel System T, which embedded the data type of natural numbers (Peano numerals), with the constructors 0 and S -- and, importantly, the deconstructor: the recursor R. The next milestone is the work of Boehm and Berarducci, which showed that every strictly-positive data type can be faithfully represented in the pure typed polymorphic lambda calculus (System F), including both construction and deconstruction.

The problem is the convenience of the embedding -- of representing the construction and deconstruction of object-language (DSL) data types in the metalanguage. (In the case of the Boehm-Berarducci encoding, another big problem is performance.) A metalanguage may have convenient syntax sugar for pattern-matching (such as case or match) -- but it will not work for pattern-matching of embedded DSL terms, which may be encoded as functions or strings. Generally, the encoding of embedded DSL terms is abstracted; we may use only the provided constructor and deconstructor functions. Deconstructing DSL data types in the bare `functional' (applicative) style quickly turns rather ugly. As obvious in hindsight, it all depends on how the data type manipulation operations, the deconstructor in particular, are designed. One may, after all, use the pattern-matching sugar of the metalanguage. The key idea, somewhat ironically, comes from Boehm and Berarducci's paper (which is the source of many insights).

William W. Tait: Intensional Interpretations of Functionals of Finite Type I. Journal of Symbolic Logic, 1967, v32, N2, pp. 198--212
The modern presentation of Goedel System T

Beyond Church encoding: Boehm-Berarducci isomorphism of algebraic data types and polymorphic lambda-terms


Basic DSL

We start with the basic DSL, to be extended later with data types. It is spartan, for ease of explanation, with only integers and recursive functions. DSLs with first-class functions were presented many times before. This time, for variety and simplicity, we embed second-class functions. The language is defined by the following signature:
    module type basic = sig
      type 'a repr
      val int : int -> int repr
      val (+) : int repr -> int repr -> int repr
      val recfun : (('a repr -> 'b repr) -> 'a repr -> 'b repr) -> 
        ('a repr ->'b repr)
      type 'a obs
      val observe : 'a repr -> 'a obs

The signature is hopefully understandable, from other tagless-final tutorials. The following sample basic terms -- two functions and one function application -- illustrate the language further.

    module ExBasic(S:basic) = struct
      open S
      let tf1 = recfun @@ fun self x -> x + int 1
      let tf2 = recfun @@ fun self x -> self (x + int 1)
      let r1 = tf1 (int 3)
The function tf2 does not terminate: we do not have conditionals yet.

The most straightforward interpreter of the basic language is the meta-circular evaluator (to be called ES): mapping 'a repr DSL terms to OCaml terms of type 'a, and DSL operations (addition) to the OCaml addition. The evaluator is too trivial to even mention: see the accompanying source code if needed. It evaluates the sample term r1 to 4.
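For concreteness, here is a minimal sketch of what such a meta-circular evaluator may look like; it is a reconstruction from the signature in the text, not the accompanying code itself.

```ocaml
(* A sketch of a meta-circular evaluator in the spirit of ES:
   'a repr is 'a itself, and DSL operations map to OCaml's own. *)
module ES = struct
  type 'a repr = 'a
  let int (x : int) : int repr = x
  let (+) : int repr -> int repr -> int repr = Stdlib.(+)
  let recfun : (('a repr -> 'b repr) -> 'a repr -> 'b repr) ->
               ('a repr -> 'b repr) =
    fun f -> let rec self x = f self x in self
  type 'a obs = 'a
  let observe : 'a repr -> 'a obs = fun x -> x
end
```

With this module, the sample term tf1 applied to int 3 indeed observes to 4.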

A characteristic of tagless-final style is multiple interpretations: interpreting the same DSL in (many) different ways. For example, we may also define a lazy evaluator:

    module EL = struct
      type 'a repr = unit -> 'a
      let (+) = fun x y -> fun () -> Stdlib.(+) (x ()) (y ())
      type 'a obs = 'a
      let observe : 'a repr -> 'a obs = fun x -> x ()
For now, it is equivalent to the strict ES. A different sort of evaluator is a (pretty-)printer. For example, we may print DSL terms as OCaml code:
    module PPO = struct
      type 'a repr = string
      let int : int -> int repr = string_of_int
      let (+) = Printf.sprintf "(%s + %s)"
      type 'a obs = string
      let observe : 'a repr -> 'a obs = fun x -> x
With this interpreter, the same sample term r1 now evaluates to the string
    let rec fn_1 x_2 = (x_2 + 1) in fn_1 (3)
which an OCaml interpreter evaluates to 4 (the same result the ES interpreter gave directly.)

There are many ways to display DSL terms. For example, we may pretty-print them as Lua code -- obtaining in effect a DSL compiler into Lua. This is rather less straightforward: Lua is a statement-oriented language, like C or Pascal, distinguishing expressions (which yield values) from statements (which do not).

    module PPL = struct
      type 'a repr = (string -> string) -> string
      type 'a obs = string
With the PPL interpreter, the same sample term r1 is now observed as
    local function fn_1 (x_2) return (x_2 + 1) 
When executed, this Lua code prints 4, as expected.

The following sections extend the basic language with data types, showing examples of their construction and processing. The four interpreters (two evaluators ES and EL and two pretty-printers PPO and PPL) are extended correspondingly.

The current version is February 2024
References [21K]
The complete code accompanying the article


Simple sums: trileans

We now present the method to systematically embed languages with data types, by examples. The first example is a simple sum data type. To make it more interesting, we try the ternary sum: trileans, or the sign data type. We hence define the trilean language, extending the basic with the (abstract) type for trileans and their constructors and the deconstructor.
    module type trilean = sig
      include basic
      type trilean                         (* abstract *)
      type trilean_shape = Neg | Zero | Pos
      val con : trilean_shape -> trilean repr (* general constructor *)
      val compare : int repr -> trilean repr  (* specific constructor *)
      val decon : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr
      val failure : string -> 'w repr
The signature also introduces the shape of a trilean. Since trilean is a simple sum, its shape is likewise a sum naming the alternatives. (In the next section we shall see that the data type and its shape type differ substantially, in general.) We have to stress that the DSL implementation of trilean is hidden -- trilean may be an ADT or it may be realized as something else (we shall see examples of both). The trilean_shape, however, gives us a metalanguage view of this object-language data type. The general constructor con reflects the concrete, manifest trilean_shape into an opaque DSL term: in other words, it constructs DSL data type values, all three of them. A deconstructor, literally, is the inverse: it reifies an (abstract) DSL term into the concrete shape, letting us analyze it. Boehm and Berarducci's general approach would derive the following pure lambda term for the deconstructor:
    val decon' : trilean repr -> 
       on_neg:(unit -> 'w) -> on_zero:(unit -> 'w) -> on_pos:(unit -> 'w) -> 'w
That is, decon' takes three functions (`handlers') and invokes one of them depending on the trilean alternative of the deconstructed term. Our decon is the De Morgan dual, representing a product as a `negated' sum:
    (not A) * (not B) * (not C) = not (A + B + C)
where A, B, and C are types and not X = X -> F.
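To see the duality concretely, here is a sketch in plain OCaml, outside the DSL, with a trilean represented by its shape for illustration: either deconstructor is definable from the other.

```ocaml
type trilean_shape = Neg | Zero | Pos

(* decon in the article's style: pass the reified shape to a single
   continuation (the three handlers merged into one function) *)
let decon (x : trilean_shape) (k : trilean_shape -> 'w) : 'w = k x

(* the Boehm-Berarducci-style decon', defined via decon: three
   handlers, one of which fires depending on the alternative *)
let decon' x ~on_neg ~on_zero ~on_pos =
  decon x (function
    | Neg -> on_neg () | Zero -> on_zero () | Pos -> on_pos ())
```

Going the other way, decon x k is decon' x with the three handlers all calling k on the corresponding shape.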

Let us see how convenient it is to deal with trileans. Trileans, as sums, give us conditional processing, letting us write useful recursive functions:

    module Ex3(S:trilean) = struct
       open S
       let sumtorial = recfun @@ fun self n ->
       decon (compare n) @@ function
       | Zero -> int 0
       | Pos  -> n + self (n + int (-1))
       | Neg  -> failure "neg"
       (* This is actually a DSL macro *)
       let not x = decon x @@ function
       | Pos  -> con Neg
       | Neg  -> con Pos
       | Zero -> con Zero
       let r1 = sumtorial (int 5)
       let r2 = not (compare ((sumtorial (int 1)) + int (-2)))
The trick becomes clear: the deconstructor and the manifest OCaml data type trilean_shape let us use the OCaml pattern-matching syntax on essentially DSL data type terms. To be precise, we use the OCaml pattern-matching on reified DSL terms, which are OCaml data type values.

To actually run the example we need interpreters of the trilean language. For the metacircular interpreter ES3 (the extension of the basic evaluator ES) we chose the likewise metacircular realization of trilean: the trilean_shape itself.

    module ES3 = struct
      include ES
      type trilean_shape = Neg | Zero | Pos
      type trilean = trilean_shape
      let con : trilean_shape -> trilean repr = fun x -> x
      let decon = fun x k -> k x
      let compare : int repr -> trilean repr = fun x ->
        if x > 0 then Pos else if x = 0 then Zero else Neg
The sample r1 of Ex3 evaluates to 15.

For the lazy interpreter, we implement trileans differently, as integers: just because we can.

    module EL3 = struct
      include EL
      type trilean_shape = Neg | Zero | Pos
      type trilean = int
      let con : trilean_shape -> trilean repr = function
        | Neg  -> int (-1)
        | Zero -> int 0
        | Pos  -> int 1
      let decon : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr = fun x k -> fun () ->
          match x () with
          | 0 -> k Zero ()
          | n when n > 0 -> k Pos ()
          | _  -> k Neg ()
      let compare : int repr -> trilean repr = fun x -> x (* an int represents its own sign *)
Although trileans are represented differently, the DSL code (the example Ex3) cannot know it, since trilean is abstract. Therefore, EL3 may interpret the Ex3 example as well, with the same result as ES3.

We use the same representation of trileans as signed integers for the pretty-printers: PPO3 for the OCaml pretty-printer and PPL3 for the Lua pretty-printer. The latter is rather less trivial: if ... then ... else ... end in Lua is a statement, which may not yield values. See the accompanying source code for detail. Here is how the sample r2 of Ex3 is evaluated by PPO3:

    begin match (let rec fn_9 x_10 = 
    begin match x_10 with
    | 0 -> 0
    | n when n > 0 -> (x_10 + (fn_9 ((x_10 + -1))))
    | _  -> failwith "neg"
    end in fn_9 (1) + -2) with
    | 0 -> 0
    | n when n > 0 -> -1
    | _  -> 1
and by PPL:
    local function fn_7 (x_8) local temp_9 = x_8
    if temp_9 > 0 then return (x_8 + fn_7((x_8 + -1))) elseif temp_9 == 0 then return 0 else error("neg") end 
    local temp_6 = (fn_7(1) + -2)
    if temp_6 > 0 then print(-1) elseif temp_6 == 0 then print(0) else print(1) end



Logically the next step is to add to our DSL simple products, or tuples. Tuples will let us write recursive functions with several arguments. Tuples, however, are a very simple extension, simpler than sums, and are left as an exercise to the reader.

We jump to the typical and interesting case of ADT: recursive sums of products, or trees. The embedding goes along the same lines as trileans and at first glance seems straightforward:

    module type tree = sig
      include trilean
      type 'a tree
      type 'a tree_shape =
        | Leaf of 'a repr
        | Node of 'a tree repr * int repr * 'a tree repr
      val con3   : trilean_shape -> trilean repr
      val decon3 : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr
      val con   : 'a tree_shape -> 'a tree repr
      val decon : 'a tree repr -> ('a tree_shape -> 'w repr) -> 'w repr
We add to the trilean language the abstract data type of trees (taking the liberty to parameterize by the type of leaf values). The shape data type tells exactly which tree it is: a binary tree with integers at the nodes and leaves of type 'a. The general constructor con and deconstructor decon convert between the tree and the shape. (The constructor and deconstructor of trileans are renamed con3 and decon3.)

A closer look at 'a tree_shape shows that it is non-recursive. Although the tree is certainly a recursive (inductive) data type, its shape view is not. The tree_shape reveals only the `surface' shape of the tree (one unrolling, so to speak), not the full shape. This single unrolling is the insight of Boehm and Berarducci's paper.
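The same one-unrolling view can be sketched in plain OCaml, without any DSL (the names shape, In and out below are ours, for illustration): the recursive tree is the fixpoint of a non-recursive shape functor, with the constructor rolling one layer and the deconstructor unrolling one.

```ocaml
(* A non-recursive shape functor, whose fixpoint is the tree itself *)
type ('a, 't) shape = Leaf of 'a | Node of 't * int * 't
type 'a tree = In of ('a, 'a tree) shape

let con (s : ('a, 'a tree) shape) : 'a tree = In s            (* roll *)
let decon (In s : 'a tree) (k : ('a, 'a tree) shape -> 'w) : 'w = k s

(* a consumer written with one-step unrolling, as in the article *)
let rec sum t = decon t @@ function
  | Leaf _ -> 0
  | Node (l, n, r) -> sum l + n + sum r
```

Each call to decon peels off exactly one layer; recursion in sum, not in the shape type, accounts for the rest of the tree.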

Having defined the tree DSL, let us see how to write DSL programs with trees.

    module ExT(S:tree) = struct
      open S
      let leaf x     = Leaf x |> con
      let node l x r = Node (l,x,r) |> con
      let t1 = node (leaf (con3 Zero)) (int 3) (leaf (con3 Pos))
      let t2 = node t1 (int 4) t1
      let complete = recfun @@ fun complete n ->
        decon3 (compare n) @@ function
          | Zero -> con (Leaf (con3 Pos)) 
          | Pos  -> node (complete (n + int (-1))) n (complete (n + int (-1)))
          | Neg  -> node (complete n) n (complete n)
      let not = ... (* borrow not from Ex3 *)
      let incr = recfun @@ fun incr t ->
        decon t @@ function
          | Leaf n -> leaf (not n)
          | Node (l,n,r) -> node (incr l) (n + (int 1)) (incr r)
      let sum = recfun @@ fun sum t ->
        decon t @@ function
        | Leaf n -> begin decon3 n @@ function
            | Zero | Pos -> int 0 | _ -> int (-1) end
        | Node (l,n,r) -> sum l + n + sum r
      let r1 = sum t2
      let r2 = t2 |> incr |> sum
      let r3 = complete (int 2) |> incr |> sum

The convenience `macros' leaf and node let us construct sample trees such as t1 and t2 with ease. We can also construct trees programmatically: complete n builds a complete binary tree of depth n with trilean leaves. The depth n may be negative, which results in an infinite tree (which could actually be constructed and processed, in a lazy interpreter). The function incr is a tree transformer: it rebuilds a tree into one with increased node values and negated leaf values. Finally, sum is a tree consumer, summing up the node and leaf values. The sample term r3 shows off a pipeline of a producer, a transformer and a consumer. The DSL tree code relies on pattern-matching throughout. Thanks to tree_shape, the DSL pattern-matching is almost as convenient as OCaml's.

The metacircular interpreter ES3T of the tree DSL (the extension of ES3)

    module ES3T = struct
      include ES3
      let con3   = con
      let decon3 = decon
      type 'a tree = 
        | L of 'a repr
        | N of 'a tree repr * int repr * 'a tree repr
      type 'a tree_shape =
        | Leaf of 'a repr
        | Node of 'a tree repr * int repr * 'a tree repr
      let con   : 'a tree_shape -> 'a tree repr = function
        | Leaf n -> L n
        | Node (l,n,r) -> N (l,n,r)
      let decon : 'a tree repr -> ('a tree_shape -> 'w repr) -> 'w repr = function
        | L n -> fun k -> k (Leaf n)
        | N (l,n,r) -> fun k -> k (Node (l,n,r))
clearly shows the difference between the tree and its shape: the former is a recursive (inductive) data type but the latter is not. The deconstructor `unrolls' a tree, but only once; the constructor con `rolls' back. We can now run our DSL code ExT; for example, r3 evaluates to 3.

The lazy evaluator is just as straightforward:

    module EL3T = struct
      include EL3
      type 'a tree = 
        | L of 'a repr
        | N of 'a tree repr * int repr * 'a tree repr
      type 'a tree_shape =
        | Leaf of 'a repr
        | Node of 'a tree repr * int repr * 'a tree repr
Here we made the tree itself, as well as its node and leaf values, non-strict. We could have made the tree strict in leaves (by replacing L of 'a repr with L of 'a) or in the node values, or both. The EL3T evaluator interprets the sample ExT code just as well, and with the same result. It can also do more: it lets us construct a tree with infinite branches, transform it and meaningfully process it (find the length of the shortest path from the root to a leaf). The accompanying code shows the complete example. If the tree contains at least one finite path, the program terminates and returns its length. We have used the complete constructor from ExT as is: complete (int (-1)) does not diverge but really produces an infinite tree. Strictness is hence not a property of the DSL -- it is a property of an interpreter. The same DSL code can be interpreted in different ways indeed -- the characteristic of the tagless-final approach.
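The shortest-path computation can be sketched in plain OCaml with Lazy (an illustration of the idea only, not the EL3T code): a breadth-first search returns even when some branches are infinite, provided one finite path exists.

```ocaml
(* Lazy trees in plain OCaml, and a breadth-first search for the
   depth of the shallowest leaf *)
type 'a ltree = Leaf of 'a | Node of 'a ltree Lazy.t * int * 'a ltree Lazy.t

let shortest (t : 'a ltree) : int =
  let rec bfs depth frontier =
    if List.exists (function Leaf _ -> true | Node _ -> false) frontier
    then depth
    else bfs (depth + 1)
      (List.concat_map
         (function Leaf _ -> []
                 | Node (l, _, r) -> [Lazy.force l; Lazy.force r])
         frontier)
  in bfs 0 [t]

(* an infinite tree: forcing any branch yields another Node *)
let rec inf = lazy (Node (inf, 0, inf))
```

A depth-first search forcing whole subtrees would diverge on inf; the breadth-first frontier forces only one layer at a time.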

The DSL code may also be interpreted to produce a printed representation of the code, in OCaml (using PPO3T) or Lua (using PPL3T): see the source code. As an example, here is how the sample term r3 of ExT looks when printed as Lua:

    local function fn_41 (x_42) local temp_50 = x_42
    if temp_50.left then return ((fn_41(temp_50.left) + temp_50.nv) + fn_41(temp_50.right)) else local temp_51 = temp_50.leaf
    if temp_51 > 0 then return 0 elseif temp_51 == 0 then return 0 else return -1 end end 
    local function fn_43 (x_44) local temp_48 = x_44
    if temp_48.left then return {left=fn_43(temp_48.left); nv=(temp_48.nv + 1); right=fn_43(temp_48.right)} else local temp_49 = temp_48.leaf
    if temp_49 > 0 then return {leaf=-1} elseif temp_49 == 0 then return {leaf=0} else return {leaf=1} end end 
    local function fn_45 (x_46) local temp_47 = x_46
    if temp_47 > 0 then return {left=fn_45((x_46 + -1)); nv=x_46; right=fn_45((x_46 + -1))} elseif temp_47 == 0 then return {leaf=1} else return {left=fn_45(x_46); nv=x_46; right=fn_45(x_46)} end 

Comparing it with the r3 term itself, one may say we have obtained a way to write Lua code with pattern-matching -- and type safety.



Algebraic data types and measuring strictness

Many DSLs deal not only with integers, floats or strings but also with composite data types like tuples, options, records, arrays, lists. They too can be represented in tagless-final style. Arrays are straightforward: we only need to introduce, similarly to OCaml's Array module, the operations make, length, get and set. Algebraic data types are more interesting: we want to be able not only to construct them but also to deconstruct, or pattern-match. Programming languages with algebraic data types typically have a dedicated syntax for pattern-matching. Reproducing such syntax sugar in an embedded DSL is difficult -- but not impossible if the host language is expressive enough. This article shows an example, introducing an eDSL with lists and using it for standard pattern-match--based list processing. Lists are a prototypical example of algebraic data types, encompassing tuples (or, products), variants (or, sums), and recursion.

As an illustration of multiple interpretations, afforded by the tagless-final style, we provide several interpretations of lists: strict, non-strict and lazy (memoized). We also define `operation counting' interpreters, which count cons/closure construction and deconstruction. We then investigate how efficiently left-fold can be emulated via right-fold, answering the question of whether left-fold is superfluous. Does the answer depend on the strictness of the list and the evaluation strategy -- call-by-name, call-by-value or call-by-need? Analysis of call-by-need is particularly tricky, so any firm counts are welcome. In the next article, we extend the counting to report not only the total amount of cons/closure allocations but also the peak/average allocation rate: i.e., the space complexity of the algorithm. We use the extended counting to verify fusion in different strategies, and check if zip can be time- and space-efficiently expressed via a fold.

As the basis we take the simplest EDSL with functions and integer arithmetic, described a decade earlier. It was embedded in OCaml and defined by the signature below. The EDSL was meant to illustrate various evaluation strategies: call-by-value, call-by-name and call-by-need. For the sake of such generality, we introduced the abstract ('a,'b) arrow type of DSL functions (normally, the arrow type constructor of the host language suffices).

    module type EDSL = sig
      type 'a exp				(* representation of terms *)
      type ('a,'b) arrow			(* The arrow type *)
      val int : int -> int exp
      val (+) : int exp -> int exp -> int exp
      val (-) : int exp -> int exp -> int exp
      val lam : ('a exp -> 'b exp) -> ('a,'b) arrow exp
      val (/) : ('a,'b) arrow exp -> ('a exp -> 'b exp) (* application *)
As syntax sugar, we introduce let-expressions for the embedded language (not to be confused with OCaml lets) and the often occurring two-argument eDSL functions (which should remind one of zip_with):
    let (let-) : 'a exp -> ('a exp -> 'b exp) -> 'b exp = fun e body ->
      lam body / e
    let lam2 : ('a exp -> 'b exp -> 'c exp) -> ('a,('b,'c)arrow)arrow exp =
      fun f -> lam @@ fun x -> lam @@ fun y -> f x y
They are defined generically for any implementation of EDSL. One may wonder if let- is the mere reverse application: fun e body -> body e. Not quite: it depends on the interpreter/evaluation strategy, as we shall see later.

We now extend the EDSL, adding the type of embedded language lists 'a list and the familiar list constructors nil and cons:

    type +'a list                          (* List type of the EDSL *)
    val nil  : 'a list exp
    val cons : 'a exp -> 'a list exp -> 'a list exp
That is, eDSL lists are constructed by either nil or cons; the latter requires, as the second argument, an already constructed list. This is only one part of the standard inductive definition of lists, however. The second part says that lists are constructed only with nil and cons. The operation decon -- total, as we aim and assume -- expresses that fact constructively: for any list l it tells exactly which constructor, nil or cons, was used to build it; in the case of cons, we also obtain the arguments used with that constructor.
    val decon : 'a list exp -> 'w exp -> ('a exp -> 'a list exp -> 'w exp) -> 'w exp
It is supposed to satisfy the equations
    decon nil onnil oncons === onnil           decon (cons x l) onnil oncons === oncons x l
We call this deconstructor `lazy'. We may even emphasize laziness by defining the deconstructor as
    val decon : 'a list exp -> (unit -> 'w exp) -> ('a exp -> 'a list exp -> 'w exp) -> 'w exp
but this is not needed. Another way to write the deconstructor is to rely on option and tuples of the host language, if any:
    val decon : 'a list exp -> ('a exp * 'a list exp) option
We call it strict, for the reasons that become clear later, when we come to the implementation of the signature. Its defining equations are
    decon nil === None                         decon (cons x l) === Some (x,l)
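For comparison, the strict deconstructor on ordinary OCaml lists is immediate (a sketch in the metalanguage; the eDSL version wraps the same idea in exp):

```ocaml
(* The strict deconstructor on OCaml lists: it satisfies the two
   defining equations from the text *)
let decon = function
  | [] -> None
  | h :: t -> Some (h, t)
```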

Finally, we need the induction principle: if we can handle nil, and can handle cons x l assuming the handling of l, we can process any list. In a word, we need fold. Rather than introducing it as an eDSL primitive, we provide tools to write it, as we will be investigating different folds below. Thus, we introduce recursion as a primitive -- which, together with decon, lets us write folds, and all other list processing operations.

    val fix : (('a,'b)arrow exp -> ('a,'b)arrow exp) -> ('a,'b)arrow exp

The EDSL is thus defined, and we may try writing list-processing functions. The first is map; as a reminder, it can be defined in OCaml as

    let map : ('a -> 'b) -> 'a list -> 'b list = fun f lst ->
      let rec loop l = match l with
      | []   -> []
      | h::t -> f h :: loop t
      in loop lst
In our EDSL, it is written as follows (using the lazy decon):
    let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst ->
      let- f = fexp in
      let loop = fix @@ fun loop -> lam @@ fun l -> 
        decon l nil
          (fun h t -> cons (f / h) (loop / t))
      in loop / lst
With the strict decon, the similarity with the OCaml map code is closer:
    let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst ->
      let- f = fexp in
      let loop = fix @@ fun loop -> lam @@ fun l -> 
        match decon l with
        | None       -> nil
        | Some (h,t) -> cons (f / h) (loop / t)
      in loop / lst
We can even use pattern-matching. One may notice that our EDSL map is actually an OCaml function with two arguments: that is, to the EDSL it is a macro. This macro-form of map is more convenient to use (and, if needed, can be converted to an EDSL function by lam2). Still, we have to be aware that its first argument fexp is an EDSL expression, which gets used many times in the loop. Therefore, it had better be shared, using let- (if an interpreter of the EDSL supports sharing: some don't, as we shall see).

A few other examples: append, left and right fold, and zip_with (called map2 in OCaml):

    let append : 'a list exp -> 'a list exp -> 'a list exp = fun l1 l2 ->
      let loop = fix @@ fun loop -> lam @@ fun l1 -> 
        match decon l1 with
        | None       -> l2
        | Some (h,t) -> cons h (loop / t)
      in loop / l1
    let fold_right : ('a,('z,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp = fun fexp z lst ->
        let- f = fexp in
        let loop = fix @@ fun self -> lam @@ fun lst ->
          match decon lst with
          | None -> z
          | Some (h,t) -> f / h / (self / t)
        in loop / lst
    let fold_left : ('z,('a,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp = fun fexp z lst ->
        let- f = fexp in
        let loop = fix @@ fun self -> lam2 @@ fun z lst ->
          match decon lst with
          | None -> z
          | Some (h,t) -> self / (f / z / h) / t
        in loop / z / lst
    let zip_with : ('a,('b,'c)arrow)arrow exp -> 'a list exp -> 'b list exp -> 'c list exp
        = fun fexp l1 l2 ->
          let- f = fexp in
          let loop = fix @@ fun loop -> lam2 @@ fun l1 l2 ->
            match decon l1 with
            | None -> nil
            | Some (h1,t1) -> match decon l2 with
              | None -> nil
              | Some (h2,t2) -> cons (f / h1 / h2) (loop / t1 / t2)
          in loop / l1 / l2
All in all, the EDSL code is quite close to how we would write it in OCaml. The accompanying code shows more examples of defining and using list-processing functions.

To run the examples, we need an implementation of the EDSL. The first that comes to mind is the meta-circular one: EDSL functions are OCaml functions and EDSL lists are normal OCaml lists. It is indeed straightforward and left as an exercise to the reader.

We will be mostly interested in the call-by-name, call-by-value and call-by-need family of interpreters, called CBName, CBValue and CBLazy (see the earlier article: Parameterizing expressions by the evaluation order). They all share the same representation of EDSL expressions and functions:

    type 'a exp       = unit -> 'a 
    type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b)
In fact, the only difference between the different strategies is the implementation of lam. The extension to lists is likewise almost uniform. Here are non-strict lists:
    module ListName(S:EDSL with type 'a exp = unit -> 'a 
                           and type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b))
     = struct
      include S
      type 'a list = Nil | Cons of 'a exp * 'a list exp
      let nil  : 'a list exp = fun () -> Nil
      let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l ->
        fun () -> Cons (x,l)
      let decon : 'a list exp -> ('a exp * 'a list exp) option = fun lst ->
        match lst () with
        | Nil        -> None
        | Cons (h,t) -> Some (h,t)
      let fix : (('a,'b)arrow exp -> ('a,'b)arrow exp) -> ('a,'b)arrow exp =
        fun f -> fun () ->
          let rec fp () = f fp () in fp ()
Indeed, here cons does not evaluate its arguments. It should also become clear why decon was called `strict'.

Strict lists differ only in the interpretation of cons: now it does evaluate its arguments.

    module ListValue(S:EDSL with type 'a exp = unit -> 'a 
                           and type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b))
     = struct
      include ListName(S)
      let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l ->
        fun () -> let xv = x () and lv = l () in 
                  Cons ((fun () -> xv),(fun () -> lv))
In lazy lists, cons arguments are evaluated on demand and shared:
    module ListLazy(S:EDSL with type 'a exp = unit -> 'a 
                           and type ('a,'b)arrow = (unit -> 'a) -> (unit ->'b))
     = struct
      include ListName(S)
      let share : 'a exp -> 'a exp = fun e -> 
        let r = ref e in
        fun () -> let v = !r () in r := (fun () -> v); v
      let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l ->
        fun () -> Cons (share x, share l)
We may run the list examples with ListName(CBName), ListValue(CBValue) and ListLazy(CBLazy) (or even ListLazy(CBValue)) interpreters and confirm they all produce the same results.
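The share memoizer can be checked on its own (a standalone sketch with a hypothetical effect counter): the underlying computation runs at most once, however many times the shared thunk is forced.

```ocaml
(* share, as in ListLazy: after the first force, the stored thunk is
   replaced by one that returns the cached value *)
let share : (unit -> 'a) -> (unit -> 'a) = fun e ->
  let r = ref e in
  fun () -> let v = !r () in r := (fun () -> v); v

let count = ref 0                     (* counts evaluations of the body *)
let shared = share (fun () -> incr count; 42)
```

Forcing shared twice yields 42 both times, yet count remains 1. This is the essence of call-by-need, here applied to list cells.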

But efficiency probably differs. How can we check? By defining a counting interpreter (actually, a transformer) that modifies the definitions of cons, decon and lam to count cons-tructions, decon-structions and building and applying of (selected) closures. See the accompanying source code for detail (the functor ListCnt).

To get the feel for how the counters work, consider the following program.

    let- dummy = int 0 in
    let- l = cons (int 0) (cons (int 1) nil) in
    let- unused = append l l in
    let- l2 = append l l in
    fold_right (lam2 (+)) (int 0) l2
The call-by-value interpreter with strict lists reports: 6 cons constructed, 8 deconstructed, 9 closures created and 12 applied. Indeed, l contributes 2 cons constructions, the append in unused contributes 2 constructions and 2 deconstructions, the append in l2 contributes the same. Finally, fold_right walks the whole list and hence does 4 deconstructions. The total matches the report by the counting interpreter. As to closures, each let- contributes one closure construction and application. lam2 contributes one (outer) closure construction, whose application to two arguments contributes 1 closure construction and 2 applications. Again, the totals agree with the measurements.

For lazy lists with the lazy interpreter, we obtain for the same program 4 cons constructed and 6 deconstructed; the closure statistics are the same. The lazy interpreter does not evaluate unused, and so the append l l contribution (2 cons and 2 decons) does not happen. For non-strict lists with the call-by-name interpreter, we obtain 6 cons constructed and deconstructed and 12 closures created and applied. The unused is again not evaluated; however, due to the lack of sharing, the construction of l and lam2 (+) happens repeatedly.

We now apply the counting framework to the question considered a few years ago: whether fold-left is worth it, given that it is easily expressible via right-fold:

    let fold_left_via_right : ('z,('a,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp
      = fun fexp z lst ->
        let- f = fexp in
        fold_right (lam2 @@ fun e a -> lam @@ fun z -> a / (f / z / e)) 
          (lam @@ fun z -> z) lst / z
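The plumbing is easier to see in plain OCaml (our sketch, using the standard List module): the right fold builds a closure that, applied to the initial accumulator, performs the left fold.

```ocaml
(* fold_left via fold_right in the metalanguage: each element e extends
   the continuation a with one more left-to-right accumulation step *)
let fold_left_via_right f z lst =
  List.fold_right (fun e a -> fun z -> a (f z e)) lst (fun z -> z) z
```

Each list element allocates one extra intermediate closure, which foreshadows the overhead measured in this section.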
All other list operations are expressible via right fold. For example, map:
    let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst ->
      let- f = fexp in
      fold_right (lam2 @@ fun e l -> cons (f / e) l) nil lst

We conduct an experiment: define the following three benchmarks and apply to two lists, of sizes 4 and 16, resp.

    let tmr l  = fold_right (lam2 (Fun.flip (-))) (int 0) l
    let tml l  = fold_left (lam2 (-)) (int 0) l
    let tmlr l = fold_left_via_right (lam2 (-)) (int 0) l
The count of constructed and deconstructed cons-cells is predictable: exactly the size of the input list -- regardless of the strictness of lists or the interpreter. Left- and right-fold processing (benchmarks tml and tmr) happen to have the same complexity: the same cons and closure use, again for all interpreters. The following table reports (closure creation, closure application) counts, for different interpreters:
    ~               left- or right-          left-via-right
    Value, size 4     ( 5,  8)                ( 22,  28)
    Value, size 16    (17, 32)                ( 70, 100)
    Lazy, size 4      ( 5,  8)                ( 22,  28)
    Lazy, size 16     (17, 32)                ( 70, 100)
    Name, size 4      ( 8,  8)                ( 28,  28)
    Name, size 16     (32, 32)                (100, 100)
Since there is no `dead' code, call-by-value with strict lists and call-by-need with lazy lists have the same complexity. Call-by-name is predictably worse: more closure construction due to lack of sharing. Expressing left-fold via right-fold is about 3 times worse, regardless of the evaluation strategy. We'd better keep left-fold as a primitive.
The current version is August 2022
References [7K]
The basic EDSL, which appeared in Parameterizing expressions by the evaluation order [19K]
The complete code accompanying the article, with more examples

Left fold vs. right fold: theory vs. practice
An old article with the analysis of left- and right- folds