A Time-Travel Story

Introduction
List.map's little secret
Time travel and multiple worlds
Time-travel-safe and Lock-free
Conclusion

Introduction

This is the real story. It is documented. I was involved in it. It is about going back in time to make a different choice. Unpredictably, a private matter comes to light, a secret, believed to be perfectly safe.

Time travel stories are often about stupefying outcomes of changing the past. This story is no exception. It accidentally exposes a dirty trick, a dubious operation that noone can see. And yet... Let's go to the beginning.

Once I received a message from a Hansei user: His probabilistic program gave clearly wrong results. (Hansei is a library for probabilistic programming in OCaml. It works by altering past choices and counting alternative histories.) He eventually traced the problem to the library function List.map: when he wrote the list map himself, the problem disappeared. I was puzzled. Mapping over a list is so simple, with no place for bugs to hide. My correspondent mentioned using Batteries, the alternative, almost standard OCaml library. I went to the Batteries repository to look at the source code and found this:

Version

The current version is September 2014.

References

Why List.map does not be implemented
The thread on the Caml-List, September 29-October 1, 2014

HANSEI: Embedded Probabilistic Programming

Hayo Thielecke: Using a Continuation Twice and Its Implications for the Expressive Power of call/cc
Higher-Order and Symbolic Computation, volume 12, April 1999,pp. 47-73.
More stories of time-travel with unexpected consequences. For example, as it turns out, time travel is noticeable even if one does not change anything in the past (see ``the refutation of the idempotency hypothesis'').

List.map's little secret

List.map in Batteries is not what one would have expected...

The standard, familiar List.map function:

     let rec map : ('a -> 'b) -> 'a list -> 'b list = fun f -> function
     | []     -> []
     | h :: t -> f h :: map f t

is non-tail-recursive. It hence consumes stack space proportional to the size of the input list. When mapping over long lists, the program may run out of stack (or cause time-consuming stack reallocations) since stack is typically much smaller than the heap.

OCaml Batteries is a well-optimized library. Its List.map is tail-recursive and runs in constant stack space. Strictly speaking, this is impossible to do efficiently. That's where a secret comes in. Let's look at the source code. After desugaring, it is:

     type 'a mut_list = {
         hd: 'a;
         mutable tl: 'a list
       }
     
     let map_mut : ('a -> 'b) -> 'a list -> 'b list = fun f -> function
       | []     -> []
       | h :: t ->
           let rec loop dst = function
             | []     -> ()
             | h :: t ->
                 let cell = {hd = f h; tl = []} in
                 dst.tl <- Obj.magic cell;
                 loop cell t
           in
           let r = {hd = f h; tl = []} in
           loop r t;
           Obj.magic r

The function loop is clearly tail-recursive. Just as clearly, there is magic. Let's try to understand the code -- by deriving it ourselves.

List.map maps one element of the input into one element of the output. It needs no buffers and can, in principle, work in constant space. Still, it is expected to maintain the list order. Therefore, if it takes an element from the head of the input list it should add the transformed result to the end of the output list.

     module V1 = struct
       let snoc : 'a -> 'a list -> 'a list = fun x l -> l @ [x]
     
       let map : ('a -> 'b) -> 'a list -> 'b list = fun f l ->
         let rec loop acc = function
           | []     -> acc
           | h :: t -> loop (snoc (f h) acc) t
         in loop [] l
     end

This version V1 of list mapping is tail recursive, relying on snoc to add an element to the tail of a list. Alas, snoc needs space proportional to the input list size. Our goal is to make snoc run in constant time.

In preparation for an optimization, we split out the empty list case, so that snoc deals only with non-empty lists:

     module V2 = struct
       let snoc : 'a -> 'a list -> 'a list = fun x l -> l @ [x]
     
       let map : ('a -> 'b) -> 'a list -> 'b list = fun f -> function
         | []     -> []
         | h :: t -> 
             let rec loop acc = function
               | []     -> acc
               | h :: t -> loop (snoc (f h) acc) t
             in loop [f h] t
     end

Next we look very carefully at map to see which list operations it really needs -- and abstract them away in NList, a module of non-empty lists:

     module type NList = sig
       type 'a t
       val singleton : 'a -> 'a t
       val to_list : 'a t -> 'a list
       val snoc : 'a -> 'a t -> 'a t
     end

In terms of NList, the list mapping takes the following general form -- which we will use for the rest of this web page:

     module ListMap(NL:NList) = struct
       let map : ('a -> 'b) -> 'a list -> 'b list = fun f -> function
         | []     -> []
         | h :: t -> 
             let rec loop acc = function
               | []     -> NL.to_list acc
               | h :: t -> loop (NL.snoc (f h) acc) t
             in loop (NL.singleton (f h)) t
     end

Ordinary lists are one, naive, realization of NList:

     module NL_list : NList = struct
       type 'a t = 'a list
       let singleton x = [x]
       let to_list x = x
       let snoc x l = l @ [x]
     end

ListMap(NL_list) is the inefficient version V2 shown earlier. Let's find a better implementation of NList. The modular abstraction ensures the representation independence: the map code cannot know or depend on a particular NList implementation. We are free to change the representation of non-empty lists to make snoc simple.

Let's try the familiar, from tech interviews, linked list: a chain of links with a mutable tail (called cons-cells below, for historical reasons). A list is represented by a pair of the first and the last link in the chain.

     module NL_twocons : NList = struct
       type 'a cons = {hd : 'a; mutable tl : 'a cons option}
       type 'a t = 'a cons * 'a cons
     
       let new_cell : 'a -> 'a cons = fun x -> {hd=x; tl=None}
       let singleton x = let l = new_cell x in (l,l)
     
       let to_list (first,_) = 
         let rec go = function 
           | {hd;tl=Some t} -> hd :: go t
           | {hd;tl=None}   -> [hd] 
         in go first
     
       let snoc x (first, last) = 
         let cell = new_cell x in
         last.tl <- Some cell;
         (first,cell)
     end

Adding to the tail becomes the trivial mutation of the tl pointer in the last cons-cell, to point to the new cell. Alas, the benefits of the now optimal snoc are negated by to_list.

It takes arcane knowledge to make to_list also constant-time:

     module NL_magic : NList = struct
       type 'a cons = {hd : 'a; mutable tl : 'a list}
       type 'a t = 'a cons * 'a cons
     
       let from_cons : 'a cons -> 'a list = Obj.magic
     
       let new_cell : 'a -> 'a cons = fun x -> {hd=x; tl=[]}
     
       let singleton x = let l = new_cell x in (l,l)
       let to_list (first,_) = from_cons first
     
       let snoc x (first, last) = 
         let cell = new_cell x in
         last.tl <- from_cons cell;
         (first,cell)
     end

The structure 'a NL_magic.cons has the same memory layout as the cons-cell h :: t of the ordinary list. Therefore, we can `cast' one from the other. (Strictly speaking, casting from an immutable h :: t to 'a cons is not safe: the OCaml compiler does take immutability into account in optimizing. But we only need the other direction, which is safe.) With all NList operations taking constant time, we have achieved our goal. The map_mut code at the beginning of this section is ListMap(NL_magic).map, after inlining and simple optimizations (which could have been done by the compiler).

The code has the frequently seen `initialization pattern': an initially allocated structure has some uninitialized (set to [], in our case) fields, later filled-in by mutation. The structure is then `frozen' and becomes immutable. The operation from_cons is such `freezing'. So long as the not-yet-filled-in structure is not revealed to others, so long as the unique ownership is preserved, such delayed in-place initialization is safe. In our case, 'a cons is internal to NL_magic; no pointers to it are visible outside ListMap(NL_magic). The mutation to the private data is really a secret. And yet it came to light. How?

References

maps.ml [6K]
The complete OCaml code for the derivation, with tests.

Time travel and multiple worlds

Time travel and multiple worlds is not science fiction: it is the reality of non-deterministic programming. Altering a past choice, or backtracking, lets us count alternative histories. That is how Hansei works, the probabilistic programming library of our story. It is there where the List.map problem was first detected. Hansei was also used in the discussions on the Caml-list. In this article, however, we explain the problem using a quite simpler library, with only two operations:

     val choose : 'a -> 'a -> 'a
     val top    : (unit -> 'a) -> 'a list

Here, choose x y non-deterministically chooses x or y; top thunk gives the list of thunk results, in all possible histories. For example,

     top (fun () -> not (choose false true))

produces [true;false], for the two possible choices.

Backtracking has to capture the program state, or context, to reinstate it later. Time travel invariably involves continuations. That is how choose and top can be implemented, in terms of the delimited control operators of the delimcc library, in a couple of lines of code:

     open Delimcc
     let p = new_prompt ()
     
     let choose : 'a -> 'a -> 'a = fun x y -> 
       shift p (fun k -> k x @ k y)
     
     let top : (unit -> 'a) -> 'a list = fun th ->
       push_prompt p (fun () -> [th ()])

The coin flip choose false true first captures the context in k and makes the false choice: the entire choose false true expression is replaced with false and the program continues as usual, eventually producing [true]. We then backtrack and make the different choice, re-running the program, which now gives [false]. The results are collected at the end.

Let's take a more complex example, of two independent choices:

     top (fun () -> map (fun _ -> choose false true) [1;2])
     (* [[false; false]; [true; false]; [false; true]; [true; true]] *)

The result, put in comments, shows four possible outcomes of two independent coin flips. We have used List.map from the standard library. Any inefficient -- V1, V2, ListMap(NL_list) -- version of map gives the same result. However, the efficient tail-recursive version, ListMap(NL_magic) or map_mut, differs:

     top (fun () -> let module M = ListMap(NL_magic) in 
                    M.map (fun _ -> choose false true) [1;2])
     (* [[false; false]; [false; false]; [true; false]; [true; false]] *)

Two coin flips no longer come up both true! In the nutshell, that is the problem that was reported by the Hansei user.

What went wrong? Let us look at ListMap again:

     module ListMap(NL:NList) = struct
       let map : ('a -> 'b) -> 'a list -> 'b list = fun f -> function
         | []     -> []
         | h :: t -> 
             let rec loop acc = function
               | []     -> NL.to_list acc
               | h :: t -> loop (NL.snoc (f h) acc) t
             in loop (NL.singleton (f h)) t
     end

If the function in f h makes a choice, backtracks and makes another choice, NL.snoc (f h) acc is executed twice: with different f h but the same acc. If snoc does mutation, as in the optimized versions:

     let snoc x (first, last) = 
       let cell = new_cell x in
       last.tl <- from_cons cell;
       (first,cell)

the second execution of the assignment overrides and destroys the result of the first one. That is how the true choices became lost.

It is not wrong to think that choose false true `splits the world' (pretty much like an experiment in many-worlds interpretation of Quantum Mechanics). In one world, choose false true returns false and in the other true. However, the split worlds share the same heap. A mutation in one world is hence visible in all the others. In programming language terms, a delimited continuation captures the stack, but not the heap. If the captured continuation modifies the heap, its reinstatement brings a race condition. From yet another angle, map in the split worlds share the same partially constructed output list. The mutation is only safe when the list is uniquely owned. When the worlds split, private remains private but no longer unique.

What makes the problem elusive and fascinating is its arising not because of an external tempering with data. Tempering is impossible: map's private data are truly inaccessible and temper-proof. Rather, the map function is tricked to temper with its data itself.

References

maps.ml [6K]
The complete OCaml code for the derivation, with tests.

Time-travel-safe and Lock-free

Non-deterministic programming is indeed quite like shared memory parallelism. Our problem came about because the optimized List.map is not thread safe. The fix below is hence a thread-safe and lock-free version. It is lock-free because the implementation of non-determinism provides no locks. The idea for the fix is taken from a message by Gabriel Scherer on the Caml-list. However, it is implemented and explained simpler, without a canary value.

Luckily, non-determinism is not as bad as the general shared memory parallelism. First, `thread preemption' can only occur at specific points, when we call a callback such as the mapping function. Mainly, we still have the guarantee that private data are modified only by their owners. Therefore, we need to examine only the map code; we do not need to know what a callback or a consumer of the resulting list may do.

Looking carefully at the NL_magic code

     let snoc x (first, last) = 
       let cell = new_cell x in
       last.tl <- from_cons cell;
       (first,cell)

shows that snoc has a pre-condition that we neglected to check: last should be the last cons-cell of the list, as the name implies. That is, last.tl must be []. In the single-threaded version, the precondition was guaranteed by the fact that map never calls snoc twice on the same list.

If snoc is called twice on the same list argument, it will find the the list already extended by someone else. Hence snoc is no longer the sole owner of the list -- rather, it holds the shared fragment, from first through the last cons-cell. We ought to take this fragment private, by deep copying, thus restoring the unique ownership. That code below implements this idea:

     let snoc_simple : 'a -> 'a cons -> 'a cons = fun x dst ->
       let cell = new_cell x in
       dst.tl <- from_cons cell;
       cell
     
         (* Copy from first to last inclusive *)
     let deep_copy : 'a t -> 'a t = fun (first,last) ->
       let rec loop first last dst =
         if first == last then dst 
         else match first with
         | _ :: ((h :: _) as t) -> loop t last (snoc_simple h dst)
         | _      -> assert false
       in
       let res = new_cell first.hd in
       let last = loop (from_cons first) (from_cons last) res in
       (res,last)
     
     let snoc x ((first, last) as l) =
       if last.tl = [] then (first,snoc_simple x last)
       else let (first,last) = deep_copy l in
            (first, snoc_simple x last)

We have only shown the changed parts of NL_magic. The new version is still just as fast in the `single-threaded' case (only last.tl = [] check is added); it has no race conditions in the non-deterministic case. Our deep-copying of the list is what the world splitting should have done, to keep the two new worlds independent.

References

Gabriel Scherer: Re: Why List.map does not be implemented
Message on the Caml-List posted on Wed, 29 Oct 2014

maps.ml [6K]
The complete OCaml code for the derivation, with tests.

Conclusion

Preconditions, invariants, representational independence, isolation do indeed help write better programs -- at least by helping understand and mitigate problems. In fact, the running problem is about an `obvious' (and hence not explicated) precondition that unexpectedly turned out less obvious. We have also confirmed that continuations often break obvious assumptions. As to the complexities of time travel -- just watch Shane Carruth's ``Primer''.
References: < https://en.wikipedia.org/wiki/Primer_(film) >

Last updated July 11, 2017

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML