# Algebraic Data Types and Pattern-Matching

We present a systematic embedding of algebraic data types and their (recursive) processing using pattern-matching, and illustrate it on examples of sums and recursive sums of products (strict and lazy trees). The method preserves all advantages of the tagless-final style, in particular, multiple interpretations -- such as evaluating the same DSL term strictly and non-strictly, and pretty-printing it as OCaml and Lua code. In effect, we may write Lua code with pattern-matching and type safety. As another application, we investigate the efficiency of emulating left-fold via right-fold, in call-by-value, call-by-name and call-by-need.

## Introduction

Many examples of tagless-final DSL embedding center on languages with numbers and functions. Occasionally, arrays and lists make an ad hoc appearance. One may, however, want to embed a language with tuples, sums (such as `option`, a.k.a. `Maybe`), as well as various trees. Rather than designing the embedding for each particular data type from scratch, one would ideally apply a general method, almost mechanically. Such a general and convenient embedding recipe has been lacking so far.

Examples of ADT embedding in tagless-final style are rare because ADTs invariably evoke pattern-matching -- the biggest attraction of ADTs -- and pattern-matching in tagless-final style is known to be awkward. Or so I thought.

This article demonstrates a systematic embedding of a language with ADTs, and their recursive processing (constructing, transforming, consuming) using pattern-matching. The main idea is representing pattern-matching in the object language (DSL) as pattern-matching in the metalanguage. With OCaml as the metalanguage, pattern-matching on object terms, using `function` (or `match`), becomes quite convenient.

Representing data types in tagless-final style is, in principle, not difficult. The principles go back to Goedel System T, which embedded the data type of natural numbers (Peano numerals), with the constructors `0` and `S` -- and, importantly, the deconstructor: the recursor `R`. The next milestone is the work of Boehm and Berarducci, which showed that every strictly-positive data type can be faithfully represented in the pure typed polymorphic lambda calculus (System F), including both construction and deconstruction.

The problem is the convenience of the embedding -- of representing the construction and deconstruction of object-language (DSL) data types in the metalanguage. (In the case of the Boehm-Berarducci encoding, another big problem is performance.) A metalanguage may have convenient syntax sugar for pattern-matching (such as `case` or `match`) -- but it will not work for pattern-matching of embedded DSL terms, which may be encoded as functions or strings. Generally, the encoding of embedded DSL terms is abstracted; we have to use only the provided constructor and deconstructor functions. Deconstructing DSL data types in the bare `functional' (applicative) style quickly turns rather ugly. As is obvious in hindsight, it all depends on how the data type manipulation operations, the deconstructor in particular, are designed. One may, after all, use the pattern-matching sugar of the metalanguage. The key idea, somewhat ironically, comes from Boehm and Berarducci's paper (which is the source of many insights).

References
William W. Tait: Intensional Interpretations of Functionals of Finite Type I. Journal of Symbolic Logic, 1967, v32, N2, pp. 198--212
The modern presentation of Goedel System T

Beyond Church encoding: Boehm-Berarducci isomorphism of algebraic data types and polymorphic lambda-terms

## Basic DSL

We start with the basic DSL, to be extended later with data types. It is spartan, for ease of explanation, with only integers and recursive functions. DSLs with first-class functions were presented many times before. This time, for variety and simplicity, we embed second-class functions. The language is defined by the following signature:
```
module type basic = sig
  type 'a repr
  val int : int -> int repr
  val (+) : int repr -> int repr -> int repr
  val recfun : (('a repr -> 'b repr) -> 'a repr -> 'b repr) ->
               ('a repr -> 'b repr)

  type 'a obs
  val observe : 'a repr -> 'a obs
end
```

The signature is hopefully understandable from other tagless-final tutorials. The following sample basic terms -- two functions and one function application -- illustrate the language further.

```
module ExBasic(S:basic) = struct
  open S
  let tf1 = recfun @@ fun self x -> x + int 1
  let tf2 = recfun @@ fun self x -> self (x + int 1)
  let r1 = tf1 (int 3)
end
```
The function `tf2` does not terminate: we do not have conditionals yet.

The most straightforward interpreter of the `basic` language is the meta-circular evaluator (to be called `ES`): it maps `'a repr` DSL terms to OCaml terms of type `'a`, and DSL operations (addition) to the corresponding OCaml operations. The evaluator is almost too trivial to mention: see the accompanying source code if needed. It evaluates the sample term `r1` to `4`.
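For concreteness, here is a minimal sketch of what `ES` amounts to; the signature is repeated to keep the snippet self-contained, and the accompanying code remains the authoritative version.

```ocaml
(* A sketch of the meta-circular evaluator ES: 'a repr is 'a itself,
   DSL addition is OCaml addition, recfun is OCaml's let rec. *)
module type basic = sig
  type 'a repr
  val int : int -> int repr
  val (+) : int repr -> int repr -> int repr
  val recfun : (('a repr -> 'b repr) -> 'a repr -> 'b repr) ->
               ('a repr -> 'b repr)
  type 'a obs
  val observe : 'a repr -> 'a obs
end

module ES = struct
  type 'a repr = 'a
  let int = Fun.id
  let (+) = Stdlib.(+)
  let recfun f = let rec self x = f self x in self
  type 'a obs = 'a
  let observe = Fun.id
end

(* The sample term r1 of ExBasic, interpreted by ES *)
module ExBasic(S:basic) = struct
  open S
  let tf1 = recfun @@ fun _self x -> x + int 1
  let r1 = tf1 (int 3)
end

let () = assert ((let module M = ExBasic(ES) in ES.observe M.r1) = 4)
```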

A characteristic of tagless-final style is multiple interpretations: interpreting the same DSL in (many) different ways. For example, we may also define a lazy evaluator:

```
module EL = struct
  type 'a repr = unit -> 'a
  let (+) = fun x y -> fun () -> Stdlib.(+) (x ()) (y ())
  ...
  type 'a obs = 'a
  let observe : 'a repr -> 'a obs = fun x -> x ()
end
```
For now, it is equivalent to the strict `ES`. A different sort of evaluator is a (pretty-)printer. For example, we may print DSL terms as OCaml code:
```
module PPO = struct
  type 'a repr = string
  let int : int -> int repr = string_of_int

  let (+) = Printf.sprintf "(%s + %s)"
  ...
  type 'a obs = string
  let observe : 'a repr -> 'a obs = Fun.id
end
```
With this interpreter, the same sample term `r1` now evaluates to the string
```
let rec fn_1 x_2 = (x_2 + 1) in fn_1 (3)
```
which an OCaml interpreter evaluates to `4` (the same result the `ES` interpreter gave directly.)

There are many ways to display DSL terms. For example, we may pretty-print them as Lua code -- obtaining, in effect, a DSL compiler into Lua. This is rather less straightforward: Lua is a statement-oriented language, like C or Pascal, distinguishing expressions (which yield values) from statements (which do not).

```
module PPL = struct
  type 'a repr = (string -> string) -> string
  type 'a obs = string
  ...
end
```
With the `PPL` interpreter, the same sample term `r1` is now observed as
```
local function fn_1 (x_2) return (x_2 + 1)
end
print(fn_1(3))
```
When executed, this Lua code prints 4, as expected.

The following sections extend the `basic` language with data types, showing examples of their construction and processing. The four interpreters (two evaluators `ES` and `EL` and two pretty-printers `PPO` and `PPL`) are extended correspondingly.

Version
The current version is February 2024
References
The complete code accompanying the article

## Simple sums: trileans

We now present the method to systematically embed languages with data types, by examples. The first example is a simple sum data type. To make it more interesting, we try the ternary sum: trileans, or the sign data type. We hence define the `trilean` language, extending the `basic` with the (abstract) type for trileans and their constructors and the deconstructor.
```
module type trilean = sig
  include basic

  type trilean                            (* abstract *)
  type trilean_shape = Neg | Zero | Pos

  val con : trilean_shape -> trilean repr (* general constructor *)
  val compare : int repr -> trilean repr  (* specific constructor *)
  val decon : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr
  val failure : string -> 'w repr
end
```
The signature also introduces the shape of a trilean. Since trilean is a simple sum, its shape is likewise a sum naming the alternatives. (In the next section we shall see that the data type and its shape type differ substantially, in general.) We have to stress that the DSL implementation of `trilean` is hidden -- `trilean` may be an ADT or it may be realized as something else (we shall see examples of both). The `trilean_shape`, however, gives us a metalanguage view of this object-language data type. The general constructor `con` reflects the concrete, manifest `trilean_shape` into an opaque DSL term: in other words, it constructs DSL data type values, all three of them. A deconstructor, literally, is the inverse: it reifies an (abstract) DSL term into the concrete shape, letting us analyze it. Boehm and Berarducci's general approach would derive the following pure lambda term for the deconstructor:
```
val decon' : trilean repr ->
  on_neg:(unit -> 'w) -> on_zero:(unit -> 'w) -> on_pos:(unit -> 'w) -> 'w
```
That is, `decon'` takes three functions (`handlers') and invokes one of them depending on the trilean alternative of the deconstructed term. Our `decon` is the De Morgan dual, representing a product as a `negated' sum:
```
(not A) * (not B) * (not C) = not (A + B + C)
```
where `A`, `B`, and `C` are types and `not X = X -> F`.
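To see the duality concretely, here is a small sketch (with `'a repr` erased to `'a` for simplicity) showing how the product-style `decon'` is expressible via the sum-style `decon`:

```ocaml
(* Trilean deconstructors with 'a repr erased to 'a, for illustration *)
type trilean_shape = Neg | Zero | Pos

(* sum-style deconstructor: one handler receiving the reified shape *)
let decon : trilean_shape -> (trilean_shape -> 'w) -> 'w = fun t k -> k t

(* product-style deconstructor, in the Boehm-Berarducci style:
   three handlers, one per alternative *)
let decon' t ~on_neg ~on_zero ~on_pos =
  decon t @@ function
  | Neg -> on_neg () | Zero -> on_zero () | Pos -> on_pos ()

let r = decon' Pos
          ~on_neg:(fun () -> -1) ~on_zero:(fun () -> 0) ~on_pos:(fun () -> 1)
let () = assert (r = 1)
```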

Let us see how convenient it is to deal with trileans. Trileans, as sums, give us conditional processing, letting us write useful recursive functions:

```
module Ex3(S:trilean) = struct
  open S

  let sumtorial = recfun @@ fun self n ->
    decon (compare n) @@ function
    | Zero -> int 0
    | Pos  -> n + self (n + int (-1))
    | Neg  -> failure "neg"

  (* This is actually a DSL macro *)
  let not x = decon x @@ function
    | Pos  -> con Neg
    | Neg  -> con Pos
    | Zero -> con Zero

  let r1 = sumtorial (int 5)
  let r2 = not (compare ((sumtorial (int 1)) + int (-2)))
end
```
The trick becomes clear: the deconstructor and the manifest OCaml data type `trilean_shape` let us use the OCaml pattern-matching syntax on essentially DSL data type terms. To be precise, we use the OCaml pattern-matching on reified DSL terms, which are OCaml data type values.

To actually run the example we need interpreters of the `trilean` language. For the metacircular interpreter `ES3` (the extension of the `basic` evaluator `ES`) we chose the likewise metacircular realization of `trilean`: the `trilean_shape` itself.

```
module ES3 = struct
  include ES
  type trilean_shape = Neg | Zero | Pos
  type trilean = trilean_shape

  let con = Fun.id
  let decon = fun x k -> k x
  let compare : int repr -> trilean repr = fun x ->
    if x > 0 then Pos else if x = 0 then Zero else Neg
  ...
end
```
The sample `r1` of `Ex3` evaluates to 15.
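Putting the pieces together, here is a self-contained sketch that checks the `sumtorial` example; the `basic` part of `ES3` is inlined, and the elided pieces are filled in by assumption.

```ocaml
(* A runnable sketch of the trilean language and its metacircular
   interpretation, checking sumtorial (int 5) = 15 *)
module type trilean = sig
  type 'a repr
  val int : int -> int repr
  val (+) : int repr -> int repr -> int repr
  val recfun : (('a repr -> 'b repr) -> 'a repr -> 'b repr) ->
               ('a repr -> 'b repr)
  type trilean
  type trilean_shape = Neg | Zero | Pos
  val con : trilean_shape -> trilean repr
  val compare : int repr -> trilean repr
  val decon : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr
  val failure : string -> 'w repr
end

module ES3 = struct
  type 'a repr = 'a
  let int = Fun.id
  let (+) = Stdlib.(+)
  let recfun f = let rec self x = f self x in self
  type trilean_shape = Neg | Zero | Pos
  type trilean = trilean_shape
  let con = Fun.id
  let decon x k = k x
  let compare x = if x > 0 then Pos else if x = 0 then Zero else Neg
  let failure = failwith
end

module Ex3(S:trilean) = struct
  open S
  let sumtorial = recfun @@ fun self n ->
    decon (compare n) @@ function
    | Zero -> int 0
    | Pos  -> n + self (n + int (-1))
    | Neg  -> failure "neg"
  let r1 = sumtorial (int 5)
end

let () = assert ((let module M = Ex3(ES3) in M.r1) = 15)
```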

For the lazy interpreter, we implement trileans differently, as integers: just because we can.

```
module EL3 = struct
  include EL
  type trilean_shape = Neg | Zero | Pos
  type trilean = int

  let con : trilean_shape -> trilean repr = function
    | Neg  -> int (-1)
    | Zero -> int 0
    | Pos  -> int 1

  let decon : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr =
    fun x k -> fun () ->
      match x () with
      | 0 -> k Zero ()
      | n when n > 0 -> k Pos ()
      | _  -> k Neg ()

  let compare : int repr -> trilean repr = Fun.id
  ...
end
```
Although trileans are represented differently, the DSL code (the example `Ex3`) cannot know it, since `trilean` is abstract. Therefore, `EL3` may interpret the `Ex3` example just as well, with the same result as `ES3`.

We use the same representation of trileans as signed integers for the pretty-printers: `PPO3` for the OCaml pretty-printer and `PPL3` for the Lua pretty-printer. The latter is rather less trivial: `if ... then ... else ... end` in Lua is a statement, which may not yield values. See the accompanying source code for detail. Here is how the sample `r2` of `Ex3` is evaluated by `PPO3`:

```
begin match (let rec fn_9 x_10 =
  begin match x_10 with
  | 0 -> 0
  | n when n > 0 -> (x_10 + (fn_9 ((x_10 + -1))))
  | _  -> failwith "neg"
  end in fn_9 (1) + -2) with
| 0 -> 0
| n when n > 0 -> -1
| _  -> 1
end
```
and by `PPL3`:
```
local function fn_7 (x_8) local temp_9 = x_8
if temp_9 > 0 then return (x_8 + fn_7((x_8 + -1))) elseif temp_9 == 0 then return 0 else error("neg") end
end
local temp_6 = (fn_7(1) + -2)
if temp_6 > 0 then print(-1) elseif temp_6 == 0 then print(0) else print(1) end
```

## Trees

Logically, the next step is to add to our DSL simple products, or tuples. Tuples will let us write recursive functions with several arguments. Tuples, however, are a very simple extension, simpler than sums, and are left as an exercise to the reader.

We jump to the typical and interesting case of ADTs: recursive sums of products, or trees. The embedding goes along the same lines as trileans and at first glance seems straightforward:

```
module type tree = sig
  include trilean

  type 'a tree

  type 'a tree_shape =
    | Leaf of 'a repr
    | Node of 'a tree repr * int repr * 'a tree repr

  val con3   : trilean_shape -> trilean repr
  val decon3 : trilean repr -> (trilean_shape -> 'w repr) -> 'w repr

  val con   : 'a tree_shape -> 'a tree repr
  val decon : 'a tree repr -> ('a tree_shape -> 'w repr) -> 'w repr
end
```
We add to the `trilean` language the abstract data type of trees (taking the liberty to parameterize it by the type of leaf values). The shape data type tells exactly which tree it is: a binary tree with integers at the nodes and leaves of type `'a`. The general constructor `con` and deconstructor `decon` convert between the tree and the shape. (The constructor and deconstructor of trileans are renamed to `con3` and `decon3`.)

A closer look at `'a tree_shape` shows that it is non-recursive. Although the tree is certainly a recursive (inductive) data type, its shape view is not. The `tree_shape` reveals only the `surface' (one unrolling, so to speak) shape of the tree, not the full shape. This single unrolling is the insight of Boehm and Berarducci's paper.

Having defined the `tree` DSL, let us see how to write DSL programs with trees.

```
module ExT(S:tree) = struct
  open S
  let leaf x     = Leaf x |> con
  let node l x r = Node (l,x,r) |> con

  let t1 = node (leaf (con3 Zero)) (int 3) (leaf (con3 Pos))
  let t2 = node t1 (int 4) t1

  let complete = recfun @@ fun complete n ->
    decon3 (compare n) @@ function
    | Zero -> con (Leaf (con3 Pos))
    | Pos  -> node (complete (n + int (-1))) n (complete (n + int (-1)))
    | Neg  -> node (complete n) n (complete n)

  let not = ... (* borrow not from Ex3 *)

  let incr = recfun @@ fun incr t ->
    decon t @@ function
    | Leaf n -> leaf (not n)
    | Node (l,n,r) -> node (incr l) (n + (int 1)) (incr r)

  let sum = recfun @@ fun sum t ->
    decon t @@ function
    | Leaf n -> begin decon3 n @@ function
                | Zero | Pos -> int 0 | _ -> int (-1)
                end
    | Node (l,n,r) -> sum l + n + sum r

  let r1 = sum t2
  let r2 = t2 |> incr |> sum
  let r3 = complete (int 2) |> incr |> sum
end
```

The convenience `macros' `leaf` and `node` let us construct sample trees such as `t1` and `t2` with ease. We can also construct trees programmatically: `complete n` builds a complete binary tree of depth `n` with trilean leaves. The depth `n` may be negative, which results in an infinite tree (which could actually be constructed and processed, in a lazy interpreter). The function `incr` is a tree transformer: it rebuilds a tree into the one with increased node values and negated leaf values. Finally, `sum` is a tree consumer, summing up the node and leaf values. The sample term `r3` shows off a pipeline of a producer, a transformer and a consumer. The DSL tree code relies on pattern-matching all throughout. Thanks to `tree_shape`, the DSL pattern-matching is almost as convenient as OCaml's.

The metacircular interpreter `ES3T` of the tree DSL (the extension of ES3)

```
module ES3T = struct
  include ES3
  let con3   = con
  let decon3 = decon

  type 'a tree =
    | L of 'a repr
    | N of 'a tree repr * int repr * 'a tree repr

  type 'a tree_shape =
    | Leaf of 'a repr
    | Node of 'a tree repr * int repr * 'a tree repr

  let con   : 'a tree_shape -> 'a tree repr = function
    | Leaf n -> L n
    | Node (l,n,r) -> N (l,n,r)

  let decon : 'a tree repr -> ('a tree_shape -> 'w repr) -> 'w repr = function
    | L n -> fun k -> k (Leaf n)
    | N (l,n,r) -> fun k -> k (Node (l,n,r))
end
```
clearly shows the difference between the tree and its shape: the former is a recursive (inductive) data type but the latter is not. The deconstructor `unrolls' a tree, but only once; the constructor `con` `rolls' back. We can now run our DSL code `ExT`; for example, `r3` evaluates to `3`.
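For reference, here is a self-contained rendition of the metacircular tree interpreter and the `r3` pipeline. The `basic` parts are inlined, and the elided `not` is spelled out (as `not3`, to avoid shadowing `Stdlib.not`); it indeed evaluates `r3` to 3.

```ocaml
(* The metacircular tree interpreter, inlined and runnable *)
type trilean_shape = Neg | Zero | Pos

module ES3T = struct
  type 'a repr = 'a
  let int = Fun.id
  let (+) = Stdlib.(+)
  let recfun f = let rec self x = f self x in self
  type trilean = trilean_shape
  let con3 = Fun.id
  let decon3 x k = k x
  let compare x = if x > 0 then Pos else if x = 0 then Zero else Neg

  type 'a tree =
    | L of 'a repr
    | N of 'a tree repr * int repr * 'a tree repr
  type 'a tree_shape =
    | Leaf of 'a repr
    | Node of 'a tree repr * int repr * 'a tree repr
  let con = function
    | Leaf n -> L n
    | Node (l,n,r) -> N (l,n,r)
  let decon t k = match t with
    | L n -> k (Leaf n)
    | N (l,n,r) -> k (Node (l,n,r))
end

open ES3T
let leaf x     = Leaf x |> con
let node l x r = Node (l,x,r) |> con

let complete = recfun @@ fun complete n ->
  decon3 (compare n) @@ function
  | Zero -> con (Leaf (con3 Pos))
  | Pos  -> node (complete (n + int (-1))) n (complete (n + int (-1)))
  | Neg  -> node (complete n) n (complete n)

let not3 x = decon3 x @@ function
  | Pos -> con3 Neg | Neg -> con3 Pos | Zero -> con3 Zero

let incr = recfun @@ fun incr t ->
  decon t @@ function
  | Leaf n -> leaf (not3 n)
  | Node (l,n,r) -> node (incr l) (n + int 1) (incr r)

let sum = recfun @@ fun sum t ->
  decon t @@ function
  | Leaf n -> (decon3 n @@ function Zero | Pos -> int 0 | _ -> int (-1))
  | Node (l,n,r) -> sum l + n + sum r

let r3 = complete (int 2) |> incr |> sum
let () = assert (r3 = 3)
```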

The lazy evaluator is just as straightforward:

```
module EL3T = struct
  include EL3

  type 'a tree =
    | L of 'a repr
    | N of 'a tree repr * int repr * 'a tree repr

  type 'a tree_shape =
    | Leaf of 'a repr
    | Node of 'a tree repr * int repr * 'a tree repr
  ...
end
```
Here we made the `tree` itself, as well as its node and leaf values, non-strict. We could have made the tree strict in leaves (by replacing `L of 'a repr` with `L of 'a`) or in the node values, or both. The `EL3T` evaluator interprets the sample `ExT` code just as well, and with the same result. It can also do more: it lets us construct a tree with infinite branches, transform it and meaningfully process it (find the length of the shortest path from the root to a leaf). The accompanying code shows the complete example. If the tree contains at least one finite path, the program terminates and returns its length. We have used the `complete` constructor from `ExT` as is: `complete (int (-1))` does not diverge but really produces an infinite tree. Strictness is hence not a property of the DSL -- it is a property of an interpreter. The same DSL code can indeed be interpreted in different ways -- the characteristic of the tagless-final approach.

The DSL code may also be interpreted to produce a printed representation of the code, in OCaml (using `PPO3T`) or Lua (using `PPL3T`): see the source code. As an example, here is how the sample term `r3` of `ExT` looks when printed as Lua:

```
local function fn_41 (x_42) local temp_50 = x_42
if temp_50.left then return ((fn_41(temp_50.left) + temp_50.nv) + fn_41(temp_50.right)) else local temp_51 = temp_50.leaf
if temp_51 > 0 then return 0 elseif temp_51 == 0 then return 0 else return -1 end end
end

local function fn_43 (x_44) local temp_48 = x_44
if temp_48.left then return {left=fn_43(temp_48.left); nv=(temp_48.nv + 1); right=fn_43(temp_48.right)} else local temp_49 = temp_48.leaf
if temp_49 > 0 then return {leaf=-1} elseif temp_49 == 0 then return {leaf=0} else return {leaf=1} end end
end

local function fn_45 (x_46) local temp_47 = x_46
if temp_47 > 0 then return {left=fn_45((x_46 + -1)); nv=x_46; right=fn_45((x_46 + -1))} elseif temp_47 == 0 then return {leaf=1} else return {left=fn_45(x_46); nv=x_46; right=fn_45(x_46)} end
end

print(fn_41(fn_43(fn_45(2))))
```

Comparing it with the `r3` term itself, one can say we have obtained a way to write Lua code with pattern-matching -- and type safety.


## Algebraic data types and measuring strictness

Many DSLs deal not only with integers, floats or strings but also with composite data types like tuples, options, records, arrays, lists. They too can be represented in tagless-final style. Arrays are straightforward: we only need to introduce, similarly to OCaml's Array module, the operations `make`, `length`, `get` and `set`. Algebraic data types are more interesting: we want to be able not only to construct them but also to deconstruct, or pattern-match, them. Programming languages with algebraic data types typically have a dedicated syntax for pattern-matching. Reproducing such syntax sugar in an embedded DSL is difficult -- but not impossible, if the host language is expressive enough. This article shows an example, introducing an eDSL with lists and using it for standard pattern-match--based list processing. Lists are a prototypical example of algebraic data types, encompassing tuples (or, products), variants (or, sums), and recursion.

As an illustration of multiple interpretations, afforded by the tagless-final style, we provide several interpretations of lists: strict, non-strict and lazy (memoized). We also define `operation counting' interpreters, which count cons/closure construction and deconstruction. We then investigate how efficiently left-fold can be emulated via right-fold, answering the question of whether left-fold is superfluous. Does the answer depend on the strictness of the list and the evaluation strategy -- call-by-name, call-by-value or call-by-need? Analysis of call-by-need is particularly tricky, so any firm counts are welcome. In the next article, we extend the counting to report not only the total amount of cons/closure allocations but also the peak/average allocation rate: i.e., the space complexity of the algorithm. We use the extended counting to verify fusion under different strategies, and to check whether zip can be time- and space-efficiently expressed via a fold.

As the basis we take the simplest EDSL with functions and integer arithmetic, described a decade earlier. It was embedded in OCaml and defined by the signature below. The EDSL was meant to illustrate various evaluation strategies: call-by-value, call-by-name and call-by-need. For the sake of such generality, we introduced the abstract `('a,'b) arrow` type of DSL functions (normally, the arrow type constructor of the host language suffices).

```
module type EDSL = sig
  type 'a exp                           (* representation of terms *)
  type ('a,'b) arrow                    (* The arrow type *)

  val int : int -> int exp
  val (+) : int exp -> int exp -> int exp
  val (-) : int exp -> int exp -> int exp

  val lam : ('a exp -> 'b exp) -> ('a,'b) arrow exp
  val (/) : ('a,'b) arrow exp -> ('a exp -> 'b exp) (* application *)
end
```
As syntax sugar, we introduce let-expressions for the embedded language (not to be confused with OCaml `let`s) and the often occurring two-argument eDSL functions (which should remind one of `zip_with`):
```
let (let-) : 'a exp -> ('a exp -> 'b exp) -> 'b exp = fun e body ->
  lam body / e

let lam2 : ('a exp -> 'b exp -> 'c exp) -> ('a,('b,'c)arrow)arrow exp =
  fun f -> lam @@ fun x -> lam @@ fun y -> f x y
```
They are defined generically, for any implementation of `EDSL`. One may wonder if `let-` is mere reverse application: `fun e body -> body e`. Not quite: it depends on the interpreter/evaluation strategy, as we shall see later.

We now extend the EDSL, adding the type of embedded language lists `'a list` and the familiar list constructors `nil` and `cons`:

```
type +'a list                          (* List type of the EDSL *)

val nil  : 'a list exp
val cons : 'a exp -> 'a list exp -> 'a list exp
```
That is, eDSL lists are constructed by either `nil` or `cons`; the latter requires, as the second argument, an already constructed list. This is only one part of the standard inductive definition of lists, however. The second part says that lists are constructed only with `nil` and `cons`. The total (as we aim and assume) operation `decon` expresses that fact constructively: for any list `l` it tells exactly which constructor, `nil` or `cons`, was used to build it; in case of `cons`, we also obtain the arguments used with that constructor.
```
val decon : 'a list exp -> 'w exp -> ('a exp -> 'a list exp -> 'w exp) -> 'w exp
```
It is supposed to satisfy the equations
```
decon nil onnil oncons === onnil           decon (cons x l) onnil oncons === oncons x l
```
We call this deconstructor `lazy'. We may even emphasize laziness by defining the deconstructor as
```
val decon : 'a list exp -> (unit -> 'w exp) -> ('a exp -> 'a list exp -> 'w exp) -> 'w exp
```
but this is not needed. Another way to write the deconstructor is to rely on option and tuples of the host language, if any:
```
val decon : 'a list exp -> ('a exp * 'a list exp) option
```
We call it strict, for the reasons that become clear later, when we come to the implementation of the signature. Its defining equations are
```
decon nil === None                         decon (cons x l) === Some (x,l)
```

Finally, we need the induction principle: if we can handle `nil`, and can handle `cons x l` assuming the handling of `l`, we can process any list. In a word, we need `fold`. Rather than introducing it as an eDSL primitive, we provide the tools to write it, as we will be investigating different folds below. Thus, we introduce recursion as a primitive -- which, together with `decon`, lets us write folds, and all other list-processing operations.

```
val fix : (('a,'b)arrow exp -> ('a,'b)arrow exp) -> ('a,'b)arrow exp
```

The EDSL is thus defined, and we may try writing list-processing functions. The first is `map`; as a reminder, it can be defined in OCaml as

```
let map : ('a -> 'b) -> 'a list -> 'b list = fun f lst ->
  let rec loop l = match l with
    | []   -> []
    | h::t -> f h :: loop t
  in loop lst
```
In our EDSL, it is written as follows (using the lazy `decon`):
```
let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst ->
  let- f = fexp in
  let loop = fix @@ fun loop -> lam @@ fun l ->
    decon l
      nil
      (fun h t -> cons (f / h) (loop / t))
  in loop / lst
```
With the strict `decon`, the similarity with the OCaml `map` code is closer:
```
let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst ->
  let- f = fexp in
  let loop = fix @@ fun loop -> lam @@ fun l ->
    match decon l with
    | None       -> nil
    | Some (h,t) -> cons (f / h) (loop / t)
  in loop / lst
```
We can even use pattern-matching. One may notice that our EDSL `map` is actually an OCaml function with two arguments: that is, to the EDSL it is a macro. This macro-form of `map` is more convenient to use (and, if needed, can be converted to an EDSL function by `lam2`). Still, we have to be aware that its first argument `fexp` is an EDSL expression, which gets used many times in the loop. Therefore, it had better be shared, using `let-` (if an interpreter of the EDSL supports sharing: some don't, as we shall see).

A few other examples: `append`, left and right fold, and `zip_with` (called `map2` in OCaml):

```
let append : 'a list exp -> 'a list exp -> 'a list exp = fun l1 l2 ->
  let loop = fix @@ fun loop -> lam @@ fun l1 ->
    match decon l1 with
    | None       -> l2
    | Some (h,t) -> cons h (loop / t)
  in loop / l1

let fold_right : ('a,('z,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp =
  fun fexp z lst ->
  let- f = fexp in
  let loop = fix @@ fun self -> lam @@ fun lst ->
    match decon lst with
    | None -> z
    | Some (h,t) -> f / h / (self / t)
  in
  loop / lst

let fold_left : ('z,('a,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp =
  fun fexp z lst ->
  let- f = fexp in
  let loop = fix @@ fun self -> lam2 @@ fun z lst ->
    match decon lst with
    | None -> z
    | Some (h,t) -> self / (f / z / h) / t
  in
  loop / z / lst

let zip_with : ('a,('b,'c)arrow)arrow exp -> 'a list exp -> 'b list exp -> 'c list exp
  = fun fexp l1 l2 ->
  let- f = fexp in
  let loop = fix @@ fun loop -> lam2 @@ fun l1 l2 ->
    match decon l1 with
    | None -> nil
    | Some (h1,t1) -> match decon l2 with
      | None -> nil
      | Some (h2,t2) -> cons (f / h1 / h2) (loop / t1 / t2)
  in loop / l1 / l2
```
All in all, the EDSL code is quite close to how we would write it in OCaml. The accompanying code shows more examples of defining and using list-processing functions.

To run the examples, we need an implementation of the EDSL. The first that comes to mind is the meta-circular one: EDSL functions are OCaml functions and EDSL lists are ordinary OCaml lists. It is indeed straightforward and left as an exercise to the reader.
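For completeness, here is a sketch of that exercise: a meta-circular implementation (here called `MC`, a name of our choosing) in which EDSL expressions, functions and lists are the corresponding OCaml values, exercised on `map` and `fold_right`:

```ocaml
(* The meta-circular EDSL implementation: exp and arrow are identity *)
module MC = struct
  type 'a exp = 'a
  type ('a,'b) arrow = 'a -> 'b
  let int = Fun.id
  let (+) = Stdlib.(+)
  let (-) = Stdlib.(-)
  let lam f = f                      (* EDSL functions are OCaml functions *)
  let (/) f x = f x                  (* application *)
  type 'a list = 'a Stdlib.List.t    (* EDSL lists are OCaml lists *)
  let nil = []
  let cons h t = h :: t
  let decon = function [] -> None | h :: t -> Some (h,t)
  let fix f = let rec fp x = f fp x in fp
end

open MC
let ( let- ) e body = lam body / e
let lam2 f = lam @@ fun x -> lam @@ fun y -> f x y

(* fold_right and map, exactly as in the article's EDSL code *)
let fold_right fexp z lst =
  let- f = fexp in
  let loop = fix @@ fun self -> lam @@ fun lst ->
    match decon lst with
    | None -> z
    | Some (h,t) -> f / h / (self / t)
  in loop / lst

let map fexp lst =
  let- f = fexp in
  fold_right (lam2 @@ fun e l -> cons (f / e) l) nil lst

let () =
  assert (map (lam (fun x -> x + int 1)) (cons (int 1) (cons (int 2) nil))
          = [2; 3])
```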

We will mostly be interested in the call-by-name, call-by-value and call-by-need family of interpreters, called `CBName`, `CBValue` and `CBLazy` (see the earlier article: Parameterizing expressions by the evaluation order). They all share the same representation of EDSL expressions and functions:

```
type 'a exp       = unit -> 'a
type ('a,'b)arrow = (unit -> 'a) -> (unit -> 'b)
```
In fact, the only difference between the different strategies is the implementation of `lam`. The extension to lists is likewise almost uniform. Here are non-strict lists:
```
module ListName(S:EDSL with type 'a exp = unit -> 'a
                        and type ('a,'b)arrow = (unit -> 'a) -> (unit -> 'b))
= struct
  include S
  type 'a list = Nil | Cons of 'a exp * 'a list exp

  let nil  : 'a list exp = fun () -> Nil
  let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l ->
    fun () -> Cons (x,l)

  let decon : 'a list exp -> ('a exp * 'a list exp) option = fun lst ->
    match lst () with
    | Nil        -> None
    | Cons (h,t) -> Some (h,t)

  let fix : (('a,'b)arrow exp -> ('a,'b)arrow exp) -> ('a,'b)arrow exp =
    fun f -> fun () ->
      let rec fp () = f fp () in fp ()
end
```
Indeed, here `cons` does not evaluate its arguments. It should also become clear why `decon` was called `strict'.

Strict lists differ only in the interpretation of `cons`: now it does evaluate its arguments.

```
module ListValue(S:EDSL with type 'a exp = unit -> 'a
                         and type ('a,'b)arrow = (unit -> 'a) -> (unit -> 'b))
= struct
  include ListName(S)

  let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l ->
    fun () -> let xv = x () and lv = l () in
      Cons ((fun () -> xv),(fun () -> lv))
end
```
In lazy lists, `cons` arguments are evaluated on demand and shared:
```
module ListLazy(S:EDSL with type 'a exp = unit -> 'a
                        and type ('a,'b)arrow = (unit -> 'a) -> (unit -> 'b))
= struct
  include ListName(S)

  let share : 'a exp -> 'a exp = fun e ->
    let r = ref e in
    fun () -> let v = !r () in r := (fun () -> v); v

  let cons : 'a exp -> 'a list exp -> 'a list exp = fun x l ->
    fun () -> Cons (share x, share l)
end
```
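The effect of `share` is easy to check in isolation: the underlying thunk is forced at most once, no matter how many times the shared expression is demanded.

```ocaml
(* share, as in ListLazy: memoize a thunk by overwriting the reference
   with a constant thunk after the first force *)
let share : (unit -> 'a) -> (unit -> 'a) = fun e ->
  let r = ref e in
  fun () -> let v = !r () in r := (fun () -> v); v

let runs = ref 0
let s  = share (fun () -> incr runs; 42)
let v1 = s ()                 (* forces the underlying thunk *)
let v2 = s ()                 (* served from the memo: no second force *)
let () = assert (v1 = 42 && v2 = 42 && !runs = 1)
```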
We may run the list examples with `ListName(CBName)`, `ListValue(CBValue)` and `ListLazy(CBLazy)` (or even `ListLazy(CBValue)`) interpreters and confirm they all produce the same results.

But the efficiency probably differs. How can we check? By defining a counting interpreter (actually, a transformer) that modifies the definitions of `cons`, `decon` and `lam` to count cons-tructions, decon-structions, and the building and applying of (selected) closures. See the accompanying source code for detail (the functor `ListCnt`).

To get the feel for how the counters work, consider the following program.

```
let- dummy = int 0 in
let- l = cons (int 0) (cons (int 1) nil) in
let- unused = append l l in
let- l2 = append l l in
fold_right (lam2 (+)) (int 0) l2
```
The call-by-value interpreter with strict lists reports: 6 cons constructed, 8 deconstructed, 9 closures created and 12 applied. Indeed, `l` contributes 2 cons constructions, the `append` in `unused` contributes 2 constructions and 2 deconstructions, the append in `l2` contributes the same. Finally, `fold_right` walks the whole list and hence does 4 deconstructions. The total matches the report by the counting interpreter. As to closures, each `let-` contributes one closure construction and application. `lam2` contributes one (outer) closure construction, whose application to two arguments contributes 1 closure construction and 2 applications. Again, the totals agree with the measurements.

For lazy lists with the lazy interpreter, we obtain for the same program 4 cons constructed and 6 deconstructed; the closure statistics are the same. The lazy interpreter does not evaluate `unused`, and so the `append l l` contribution (2 cons and 2 decons) does not happen. For non-strict lists with the call-by-name interpreter, we obtain 6 cons constructed and deconstructed and 12 closures created and applied. The `unused` is again not evaluated; however, due to the lack of sharing, the construction of `l` and `lam2 (+)` happens repeatedly.

We now apply the counting framework to the question considered a few years ago: whether left-fold is worth it, given that it is easily expressible via right-fold:

```
let fold_left_via_right : ('z,('a,'z)arrow)arrow exp -> 'z exp -> 'a list exp -> 'z exp
  = fun fexp z lst ->
  let- f = fexp in
  fold_right (lam2 @@ fun e a -> lam @@ fun z -> a / (f / z / e))
    (lam Fun.id) lst / z
```
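The trick is easier to see with the EDSL machinery stripped away: in plain OCaml, the right fold builds up a closure of type `'z -> 'z`, which is finally applied to the seed.

```ocaml
(* Left fold via right fold, in plain OCaml: the accumulator-transformer
   closure corresponds to the lam under lam2 in the EDSL version *)
let fold_left_via_right : ('z -> 'a -> 'z) -> 'z -> 'a list -> 'z =
  fun f z lst ->
  List.fold_right (fun e a -> fun z -> a (f z e)) lst Fun.id z

let () =
  assert (fold_left_via_right (-) 0 [1;2;3] = List.fold_left (-) 0 [1;2;3])
```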
All other list operations are expressible via right fold. For example, `map`:
```
let map : ('a,'b)arrow exp -> 'a list exp -> 'b list exp = fun fexp lst ->
  let- f = fexp in
  fold_right (lam2 @@ fun e l -> cons (f / e) l) nil lst
```

We conduct an experiment: define the following three benchmarks and apply them to two lists, of sizes 4 and 16, respectively.

```
let tmr l  = fold_right (lam2 (Fun.flip (-))) (int 0) l
let tml l  = fold_left (lam2 (-)) (int 0) l
let tmlr l = fold_left_via_right (lam2 (-)) (int 0) l
```
The count of constructed and deconstructed cons-cells is predictable: exactly the size of the input list -- regardless of the strictness of lists or the interpreter. Left- and right-fold processing (benchmarks `tml` and `tmr`) happen to have the same complexity: the same cons and closure use, again for all interpreters. The following table reports (closure creation, closure application) counts for the different interpreters:
```
                  left- or right-          left-via-right
Value, size 4     ( 5,  8)                ( 22,  28)
Value, size 16    (17, 32)                ( 70, 100)

Lazy, size 4      ( 5,  8)                ( 22,  28)
Lazy, size 16     (17, 32)                ( 70, 100)

Name, size 4      ( 8,  8)                ( 28,  28)
Name, size 16     (32, 32)                (100, 100)
```
Since there is no `dead' code, call-by-value with strict lists and call-by-need with lazy lists have the same complexity. Call-by-name is predictably worse: more closure construction due to lack of sharing. Expressing left-fold via right-fold is about 3 times worse, regardless of the evaluation strategy. We'd better keep left-fold as a primitive.
Version
The current version is August 2022
References
call_by_any.ml [7K]
The basic EDSL, which appeared in Parameterizing expressions by the evaluation order

lists.ml [19K]
The complete code accompanying the article, with more examples

Left fold vs. right fold: theory vs. practice
An old article with the analysis of left- and right- folds