Grasping `all-the-apples-at-once'

This is a light talk with the analysis of an impromptu programming contest: solving a simple but realistic problem in a language of participants' choice. It turns out that the solutions that were obviously right, expressed in just the right words were written not in the languages I have expected. This talk notes the unsettling observations and a few unexpected encouragements, along with a reflection on the common way of teaching and arguing about programming languages.

The contest happened at a computer-focused community Lobsters. Somewhat like HN, Lobsters is a forum for its members to submit and discuss articles: from journal papers to blog posts, announcements and videos. The community is what one may call the `mainstream IT': developers, from embedded systems to the Web, system administrators as well as students, computer hobbyists, and a few academics.

One day an article about a common job interview question was posted for discussion. Spontaneously, members started submitting their own solutions, in their language of choice. They also commented on their own and others solutions.

It has struck me how the the solutions neatly fall into two classes -- irrespective of the language they are written in, be it a functional (such as Haskell, Erlang or Clojure) or imperative (Go, Ruby, Raku). Only now I realized that recursion is dual to iteration not just in theory. Like loops, it is low-level: selecting one apple at a time. Likewise is the `pure-functional' and imperative handling of state.

How do we grasp all the apples at once? As in learning and teaching of natural languages, should we focus less on grammar and more on conversation and storytelling?

The talk was given at an online meeting of IFIP WG 2.1 on July 1, 2021.

References

The Lobsters thread with the contest. February 2021.
<https://lobste.rs/s/zpqrpc/random_job_interview_challenge_clojure>

Most candidates cannot solve this interview problem:
Input: "aaaabbbcca"
Output: [("a",4), ("b", 3), ("c", 2), ("a", 1)]
Write a function that converts the input to the output
I ask it in the screening interview and give it 25 minutes
How would you solve it?

Alexey Grigorev (@Al_Grigor) February 3, 2021 (Twitter)

References

The statement of the problem is quoted from the web page posted on Lobsters:
A Random Job Interview Challenge In Clojure. Feb 4 2021 by Savo Djuric
<https://savo.rocks/posts/a-random-job-interview-challenge-in-clojure/>

The blog post submitted to Lobsters for discussion described one solution, in Clojure. Somehow it prompted the members to solve the problem in their language of choice, and post the results. Hence there developed an impromptu contest.

Below is a sample of posted solutions. They are written in various languages by various people -- yet somehow fall into two groups, which I call for now Group I and Group II. I'll tell what they mean only later -- I hope you'd allow me a bit of suspense.

Clojure (Group I)

    (->> "aaaabbbcca"
         (partition-by identity)
         (map (juxt (comp str first) count))))

The submitter also gave another solution, ``a bit chattier but also more legible:''

    (for [p (partition-by identity "aaaabbbcca" )]
      [(-> p first str) (count p)])

It is worth noting that comp is the functional composition, and juxt, with the rough type (a->b * a->c) -> a -> (b * c), is the combinator that takes a tuple of functions and a value, applies each function to it and returns the results as a tuple.

Ruby (Group I)

There were three solutions in Ruby -- all one-liners, and rather similar. The first is probably the easiest to understand.

    "aaaabbbcca".split("").chunk_while {|a,b| a==b}.map {|a| [a.first, a.size]}
    
    
    "aaaabbbcca".chars.slice_when {|c,n| c != n }.map {|s| [s[0], s.size] }
    
    
    'aaaabbbcca'.chars.chunk(&:itself).map{|x,y| [x, y.size]}

APL (Group I)

Two APL solutions:

    (((1∘+ +\) ⊢ ≠ ¯1∘⌽) (((⊃,≢) ⊢) ⌸) ⊢) 'aaaabbbcca'
    
    f←(⊃,≢)¨⊢⊆⍨1+(+\0,2≠/⊢)
    ⊢x←f'aaaabbbcca'

Go (Group II)

``I know this is a Clojure post, but out of curiosity I wrote it in Go (what I've been working in lately) just to see if I could solve it quickly. Took about 10 minutes, subtract 3-4 for fighting with runes. It's not particularly elegant, but it works.''

    type result struct {
            letter string
            count  int
            }
    
    func main() {
       const input = "aaaabbbcca"
       var ret []result
    
       currentLetter := string(input[0])
       countCurrentLetter := 1
    
       for _, elem := range input[1:] {
               elemAsString := string(elem)
               if currentLetter == elemAsString {
                       countCurrentLetter++
               } else {
                       ret = append(ret, result{currentLetter, countCurrentLetter})
                       currentLetter = elemAsString
                       countCurrentLetter = 1
               }
       }
       ret = append(ret, result{currentLetter, countCurrentLetter})
       fmt.Printf("%+v", ret)
    }

The salient feature of this and similar solutions is explicit iteration: the for loop.

Racket (Group II)

    (for/fold ([p #f]
               [cnt 0]
               [counts null]
               #:result (cdr (reverse (cons `(,p ,cnt) counts))))
              ([c (in-string "aaaabbbcca")])
      (if (equal? p c)
          (values c (add1 cnt) counts)
          (values c 1 (cons `(,p ,cnt) counts))))

The solution is pure functional -- yet closely resembling the Go solution above.

Raku (Group I)

    "aaaabbbcca".subst( /(.) $0*/ , {"({$0},{$/.chars})" }, :g )

Raku is the artist formerly known as Perl 6. Perl has always been known for its (ir)regular expressions, whose power is on display.

Erlang (Group II)

    L = "aaaabbbcca",
    lists:reverse(lists:foldl(
        fun
            (Elt, [ {[Elt], N} | R ]) -> [ {[Elt], N + 1} | R];
            (Elt, Acc) -> [{[Elt], 1} | Acc]
        end,
        [],
        L
    )).

It is rather similar to the Racket solution. We see the characteristic for Erlang non-linear pattern matching: the variable Elt appears twice in the first pattern.

Python (Group I)

    [(k, len(list(g))) for k, g in itertools.groupby("aaaabbbcca")]

Rust (Group II)

    fn run_length_encode(ins: &str) -> Vec<(char, usize)> {
        let mut out = vec![];
        let mut i = ins.chars();
        if let Some(mut c) = i.next() {
            let mut count = 1;
            for new_c in i {
                if new_c == c {
                    count += 1;
                } else {
                    out.push((c, count));
                    count = 1;
                    c = new_c;
                }
            }
            out.push((c, count));
        }
        out}

It is quite like the solution in Go -- but cleaner and with a better handling of edge cases. The author did grasp the nature of the problem: run-length encoding.

Julia (Group II)

    function rle(itr)
       acc = Tuple{eltype(itr), Int}[]
       x = iterate(itr)
       isnothing(x) && return acc # iterable is empty
       last, state = x
       n = 1
       for v in Iterators.rest(itr, state)
               if last == v
                       n += 1
               else
                       push!(acc, (last, n))
                       last = v
                       n = 1
               end
       end
       push!(acc, (last, n))
       return acc
    end

It mirrors the Rust solution above.

Haskell (Group II)

    main = interact soln
    
    soln = show . reverse . go []
    
    go counts [] = counts
    go counts (x:xs) = go (inc x counts) xs
    
    inc c ((x,ct):xs) | c == x = (x,ct+1):xs
    inc c xs = (c, 1):xs

Although this solution relies on the explicit recursion rather than fold, it is rather similar to the Erlang solution. (A different person later mentioned another solution, using group).

A particular feature of the Lobsters' contest is comments of the participants, about their and others' solutions. Below are several notable comments, which go on to make my point in this talk.

One commenter wondered:

``Reading the Clojure solution involves following the thought process of how the author ended up there. Reading the equivalent Go solution has very little cognitive overhead.
How does Clojure programmers write code that is not just brief and elegant in the eyes of the author, but can also be read quickly by others without time consuming mental gymnastics?''

and another replied:

``I guess it's just a matter of what you're used to. I find both Go solutions above arduous to understand because you have to follow the algorithm and construct what it functionally does in your head. Whereas in most of the clojure solutions the code directly tells you what's happening.
Eg. the top post with comments:
    ; threading macro (take input, then do stuff to it in steps)
    (->> "aaaabbbcca"
         ; split into lists of consecutive same character
         (partition-by identity)
         ; map each item (list of chars) with a function
         (map
           ; return list of outputs of functions
           (juxt
             ; first character as string
             (comp str first)
             ; count of characters
             count)))
The information is much more dense. If you are good at reading it, you understand it at a glance.
I believe the same is true for the APL solutions. I can't read them, but someone who does sees the algorithm clearly.''

Yet another commenter concurred:

``Same here; they [Go solutions] just seem super tedious and low-level; like you're spoon-feeding the compiler on basic things instead of just describing the algorithm naturally.
If you don't have the vocabulary to describe an algorithm, of course the low-level version is going to be easier to read, much like someone who is only just learning English will find the `Simplified English' wikipedia more accessible than the regular one that often uses specialized jargon. But if you work with algorithms every day, you owe it to yourself to invest in learning the jargon!''

I have mentioned apples several times already. The following quote explains what is it all about.

``Suppose we have a box of apples from which we wish to select all of the good apples. Not wishing to do the task ourselves, we write a list of instructions to be carried out by a helper.
The instructions corresponding to a conventional language might be expressed something like the following: Select an apple from the box. If it is good, set it aside in some place reserved for the good apples; if it is not good, discard it. Select a second apple; if it is good put it in the reserved place, and if it is not good discard it.... Continue in this manner examining each apple in turn until all of the good apples have been selected. In summary, then, examine the apples one at a time starting with the first one we pick up and finishing with the last one in the box until we have selected and set aside all of the good apples.
On the other hand the instructions corresponding to an array language could be stated simply as ``Select all of the good apples from the box.'' Of course, the apples would still have to be examined individually, but the apple-by-apple details could be left to the helper.
Conventional programming languages may be considered then ``one-apple-at-a-time'' languages with machine languages being a very primitive form, whereas array languages may be considered ``all-the-apples-at-once'' languages.''

Keith Smillie: ``My Life with Array Languages''. October 2005. This analogy is credited to Frederick P. Brooks: that Fred Brooks, of the Mythical Man-Month fame.

A few solutions to a single problem in an uncontrolled contest do not let us draw any statistical conclusions. Still there is enough data for observations and reflections. The first observation is that quite a variety of languages were used to solve the problem, successfully, and it took the submitters, by their self-reports, about 5-10 minutes. The submitted solutions also seem idiomatic, reflecting the `community standards', the way these languages are (self-) taught and used in the industry.

The second, surprising observation is that the submitted solutions fall into two classes:

scan the input string char by char, building the result: Group II
see the problem as groupBy followed by mapping: Group I

Here are the languages used to write the solutions in each group:

Group I: Clojure, Ruby, Raku, Python, Haskell, APL
Group II: Go, Racket, Erlang, Rust, Julia, Haskell

Whichever the classification of programming languages one may choose -- functional vs. imperative, old vs. new, created by `amateurs' vs. `professionals' -- there is a representative in both groups.

Comparing the pure functional Racket solution with the imperative Go (or, better, Rust or Julia) solutions shows them surprisingly similar. Recursion is indeed dual to iteration, and fold is just a for-loop in disguise. Neither recursion nor fold help us grasp `all the apples at once'.

Looking at the Racket and Rust (or Go, Julia) solutions leads to another conclusion, about state. Although the Racket solution is `pure functional', it is just as stateful as Rust and Go solutions. There is the same amount of state. The state is updated in the Racket solution at once, with the values clause; the update is matched to its target by position rather by name, which is quite error prone. (As a side observation, the handling of edge cases is clearer and obviously correct in Rust and Julia than in the Racket or Go solutions.)

Thus being `pure functional' does not eradicate state; it merely changes the way state is handled -- which could be either more or less clearer and convenient. What unequivocally improves the handling of the state is partitioning and encapsulating it into `higher-level', composable operations like `groupBy': see Clojure, Ruby, Python solutions. The groupBy may itself be implemented with mutable state or without -- no matter, the state is abstracted away and the user of groupBy does not have to care about it.

The first pleasant surprise was the notable number of Group I, that is, all-the-apples-at-once solutions -- and the number of languages in use today with the adequate facilities to express these solutions.

The blog post by Savo Djuric that prompted the contest has also described a solution, which is worth quoting.

Since I recently started learning Clojure, and I regularly do 4clojure exercises, this problem seemed intriguing, so I decided to try to solve it immediately. I fired up my REPL, and within several minutes i came up with:
    (interleave
     (map str (map first (partition-by identity "aaaabbbcca")))
     (map count (partition-by identity "aaaabbbcca")))
    => ("a" 4 "b" 3 "c" 2 "a" 1)
Yaay! It works! But it is not exactly what was asked for. Nor is it very ‘clean’, so to say. What bothers me the most in this solution is that i have repetitive code, so I decided to solve that with let:...
    (defn chop-chop [coll]
      (let [x (partition-by identity coll)]
        (map list (map (comp str first) x)
                    (map count x))))
I must say that it felt great to solve a problem within several minutes (albeit it was not that hard), since I'm only at the beginning of my journey with Clojure.

The blog post shows the ideal, to me, way to solve the problem: grasping its nature -- groupBy, or partition-by in Clojure, -- prototyping the solution, understanding its imperfections and wanting to do better, taking steps to improve. I think the author has a bright future as a programmer.

From his blog I learned that the author never went to college, got into programming by accident (but prompted by his talented brother), and has been studying programming languages (now, Clojure) entirely on his own. May be that's the secret.

Another pleasant surprise was seeing the solution that in my eyes is ideal:

    [(k, len(list(g))) for k, g in itertools.groupby("aaaabbbcca")]

This Python solution is expressed in what is essentially the conventional mathematical notation, readable by everyone (unlike APL). More importantly, it is linguistically economical -- again, unlike APL. It uses just the two directly relevant facilities: groupBy and comprehension. Notation is clearly here the tool of thought.

A comment cited earlier hits the nail on the head: to describe an algorithm, one needs vocabulary. A large part of programming is building -- learning or constructing -- vocabulary to talk about common (sub)problems, e.g., run-length-encoding, groupBy, mapping, accumulating, filtering, etc. Learning, finding and naming the `right' abstractions is the core mathematical activity -- as explicated in the `Notation as a tool of thought'.

A natural language is also a tool of thought. Building vocabulary and understanding when and how to use a particular word is the hardest part of learning a language. That natural and formal languages have a lot in common is the idea with a long pedigree -- famously and forcefully articulated by Montague in 1970s. Keith Smillie, in his paper about array languages cited earlier further argues that teaching programming ought to adapt the best ideas of teaching foreign languages.

Both in programming and natural languages, one can acquire basics relatively quickly. Fluency takes decades. There doesn't seem to be anything better than constant practice, speaking and listening, writing and rewriting. The ending of the famous interview between Ernest Hemingway and George Plimpton (the `Interviewer', the magazine's founder and editor) (Paris Review, issue 18, Spring 1958) comes to mind:

INTERVIEWER

Do you do any rewriting as you read up to the place you left off the day before? Or does that come later, when the whole is finished?

HEMINGWAY

I always rewrite each day up to the point where I stopped. When it is all finished, naturally you go over it. You get another chance to correct and rewrite when someone else types it, and you see it clean in type. The last chance is in the proofs. You’re grateful for these different chances.

INTERVIEWER

How much rewriting do you do?

HEMINGWAY

It depends. I rewrote the ending to Farewell to Arms, the last page of it, thirty-nine times before I was satisfied.

INTERVIEWER

Was there some technical problem there? What was it that had stumped you?

HEMINGWAY

Getting the words right.

How would I solve the problem? When I first saw it, the very first thought that came to mind was: groupBy, which is the common name for the operation to group consecutive elements of a sequence. We then need to present the groups in the form of (group-element, counter) pairs.

I will use OCaml for concreteness (and for other reasons that should be clear soon). Suppose we have the operation

    group : 'a list -> 'a list list

which groups equal (according to some criterion) consecutive elements of a list in their own list, and returns the list of such groups. (A group cannot be empty). Then the first approximation to the solution is as follows, quite close to the first Clojure solution above:

    let rll (group : char list -> char list list) : char list -> (char * int) list =
      group >> List.(map (fun g -> (hd g, length g)))

Type annotations are optional, but I prefer to write them. (One reason to use OCaml is types: not fancy, but sufficient to visualize data transformations, without too many details.) Since group is the assumption, it appears as the parameter (argument). Here, (>>) is the left-to-right functional composition (as in F#). It is convenient to think of input as a sequence, such as list. But a string is not a list in OCaml -- but can be easily turned into one, with the help of a couple standard library functions:

    let rll (group : char list -> char list list) : string -> (char * int) list =
      String.to_seq >> List.of_seq >> group >> List.(map (fun g -> (hd g, length g)))

With the fork combinator, so frequent in APL (but a left-to-write version):

    let fork : ('a -> 'b) -> ('a -> 'c) -> ('b -> 'c -> 'd) -> 'a -> 'd = fun f g h -> fun x -> h (f x) (g x)

the solution can be written even cleaner:

    let rll (group : char list -> char list list) : string -> (char * int) list =
      String.to_seq >> List.of_seq >> group >> List.(map (fork hd length pair))

We only need the implementation of group itself to run this code. Alas, the OCaml standard library at present does not provide a grouping operation. I can of course write it. But let me rephrase this approach in more detail.

Let us start from scratch and develop the solution in smaller steps, emphasizing tighter and tighter `grasping' of the data flow. The stress is on understanding data flow, rather than on writing loops or recursive functions and struggling with termination conditions and edge cases. We won't obsess about state either: it comes naturally when needed. Grouping also comes naturally, and simpler. When we reach the solution, one look at it will tell that it must be right. It is. It is also reasonably efficient. It can even be later mechanically recast into C for better efficiency -- not the sort of C one typically writes by hand, but correct and problem-free.

First, from the problem statement one quickly understands that we are facing a sequential problem: transforming a sequence of characters into some other sequence (of pairs), locally. Indeed, appending more characters to the sample input "aaaabbbcca" will not affect the prefix ("a",4), ("b", 3), ("c", 2) of the sample output. Once this sequential character is understood, the rest follows almost automatically.

To think about and develop the solution, we need a vocabulary, which we build up as we go along. First we need a word for a sequence. Let's call it 'a seq: a type of a sequence of elements of type 'a. Let us not worry for now how it is actually implemented and treat it as an abstract type. Since the input is given as string, we obviously need the operation

    of_string : string -> char seq

to transform (or, render) a string as a sequence of characters. Assume it is given. Again, what is important for now is the type, which represents the data flow.

Next we have to deal with grouping of neighboring characters, which is inherently stateful: which group the character belongs to depends on the context, that is, on the character that came before, or follows next. Let's go with the latter and assume the operation

    look_ahead : 'a seq -> ('a * 'a option) seq

that pairs the current element of sequence with the element that comes next. The next element may be absent (if the sequence is about to end), hence the 'a option type, allowing for the missing value.

The grouping itself is then inserting group breaks, determined by peeking at the next element:

    type 'a annot = Plain of 'a | Break of 'a
    let group : ('a * 'a option) seq -> 'a annot seq = 
      map @@ function
            | (x,Some next) when x = next -> Plain x
            | (x,_)                       -> Break x

That is, we transform a looked-ahead sequence into a sequence of elements annotated with group boundaries: Plain x means the current group continues, Break x means x is the last element in its group. The next element (if any) begins a different group. (The 'a annot type could have been realized differently, as a pair 'a * bool.) We have also assumed the existence of the mapping operation

    map : ('a -> 'b) -> 'a seq -> 'b seq

Finally, we have to count, which again requires state (the count of elements in the current group) and emit the tuples at the group boundary:

    let count : 'a annot seq -> ('a * int) seq =
      map_accum_filter 0 @@ fun cnt -> function
        | Plain x -> (None, cnt+1)
        | Break x -> (Some (x,cnt+1), 0)

The group break forces the output (describing the group that has just ended) and resets the counter. We assumed the existence of accumulating map-filter, a rather popular operation in stream processing:

    map_accum_filter : 'state -> ('state -> 'a -> 'b option * 'state) -> 'a seq -> 'b seq

The mapping ('state ->'a -> 'b option * 'state) takes the current state and the current element and returns a tuple of a possibly transformed element and the new state. If the possibly transformed element is None, nothing is emitted in the output stream; the state advances nevertheless. From the type of it, map_accum_filter looks `pure', with functional-state passing. Hold that thought.

That is it. The solution to the problem, the run-length-encoded sequence of characters, is then the composition of what we have developed so far:

    let rll = of_string >> look_ahead >> group >> count

which expresses the problem in the concise and obviously correct way. We only need to add to our chain an observation operation, to transform the resulting (char * int) seq to a list, or to print it out.

One advantage of our solution is the ease of modification. For example, if the source string is UTF-8 encoded, we only need to add to the pipeline one more operation: UTF-8 decoding and grapheme clustering:

    utf_decode : char seq -> uchar seq

We leave this as an exercise to the reader.

To actually run the rll program, we have to fulfill our promises: to pick a concrete representation for 'a seq and implement the operations that we assumed for it. In OCaml terms, we have to implement the following signature:

    module type simple_seq = sig
      type 'a seq
      val of_string : string -> char seq
      val map : ('a -> 'b) -> 'a seq -> 'b seq
      val look_ahead : 'a seq -> ('a * 'a option) seq
      val map_accum_filter : 'state -> ('state ->'a -> 'b option * 'state) -> 'a seq -> 'b seq
      val iter : ('a -> unit) -> 'a seq -> unit
      val to_list : 'a seq -> 'a list
    end

It turns out that the OCaml standard library has almost everything we need: the Seq.t data type and the mapping, string/list conversion, etc. operations on it. What is missing is map_accum_filter and look_ahead. However, the standard library provides filter_map on sequences. We merely need to add state, realized as mutable state:

    let map_accum_filter : 'state -> ('state -> 'a -> 'b option * 'state) -> 'a seq -> 'b seq =
      fun init f seq ->
        let stat = ref init in
        seq |> Seq.filter_map (fun a -> let (bo,st) = f !stat a in stat := st; bo)

The overall map_accum_filter does have a `functional' interface with the explicit state-passing; the implementation is imperative however. There is no inherent conflict between a clear specification and a fast implementation. The remaining look_ahead turns out easily expressed via map_accum_filter (we leave the implementation to the reader -- or see the accompanying code).

Looking back at our solution, we notice that all state ends up encapsulated. The state for look-ahead is encapsulated in look_ahead. The grouping per se turned out stateless. Counting requires one more piece of state. Each operation keeps the state it needs, without knowing or interfering with the state of others.

We also notice that there are no loops or recursive functions. Neither did we have to deal with termination conditions. That is the feature of all Group I solutions, clearly articulated back in APL: specify how to change the content of a collection or its shape, but don't waste time thinking how to get the elements or where to put them.

Our 'a seq can also be realized as iterator/generator, to be composed via for-loops -- with good performance. Even OCaml Seq.t fuses by design (albeit incompletely: there are still intermediate data structures, but of constant size). Perfect fusion is achievable, if one uses, say, strymonas, which also permits the generation of C code. Here it is.

    void rll(const int64_t * aa_1,const int64_t al_2)
    {
       int64_t v_3 = 0;
       int64_t v_4 = 0;
       bool v_5 = 0;
       int64_t v_6 = 0;
       while (v_6 <= al_2)
       {
          int64_t t_7;
          t_7 = v_6;
          v_6++;
          if (t_7 < al_2)
          {
             int64_t t_9;
             t_9 = aa_1[t_7];
             if (v_5)
             {
                int64_t t_10;
                t_10 = v_4;
                v_4 = t_9;
                if (!(t_10 == t_9))
                {
                   int64_t t_11;
                   t_11 = v_3;
                   v_3 = 0;
                   printf("%ld\n",(long)t_10);
                   printf("%ld\n",(long)(t_11 + 1));
                }
                else {
                        v_3++;
                }
             }
             else {
                     v_4 = t_9;
                     v_5 = 1;
             }
          }
          else {
                  if (v_5)
                  {
                     int64_t t_8;
                     t_8 = v_3;
                     v_3 = 0;
                     printf("%ld\n",(long)v_4);
                     printf("%ld\n",(long)(t_8 + 1));
                  }
          }
       }}

One more thing: it all worked on the first try.

References

apples.ml [4K]
The complete source code

apples.c [2K]
The strymonas solution: generated C code. For the generator, see the directory examples/run-length-encoding in the strymonas distribution.

Incremental, Linear Pretty-printing
The stepwise pipeline development can be taken quite far. The above reference develops a rather non-trivial algorithm with the optimal, difficult to achieve performance.

Grasping `all-the-apples-at-once'

Introduction

References

Contest problem

References

Sample solutions

Comments

The apples analogy

Observations

Pleasant surprises

Conclusion: Programming and natural languages, software and literature

Epilogue: my `stream of apples' solution

References

Last updated June 3, 2024