Free Variables and Free Effects: an elementary introduction to algebraic effects and handlers

Whenever one writes an interpreter or a compiler, or studies logic, model theory, programming language theory -- one soon has to face variables. The well-known way to deal with them is by introducing environment, or variable assignment in logic. There is, however, another, more general approach, related to algebraic effects. One should not be too surprised. Algebraic effects originate from studying terms with variables, and equations on them: free algebras.

This article explains both approaches on an exasperatingly simple example. The example scales up, to something rigorous, realistic and interesting: see Not by Equations Alone: Reasoning with Extensible Effects.


Simple reminder of denotational semantics

Consider an expression "(1+(3-2))". It is a well-parenthesized character string: a sequence of characters with no particular meaning. Let's give them a meaning -- the one that springs to mind when seeing this string. (That is not the only possible meaning, as we see soon.) That is, digits mean integers, parentheses mean what is inside them, and "+" and "-" mean integer addition and subtraction. To state this formally (and hence concisely and unambiguously), it is common to use the so-called semantic brackets 〖 〗 as the abbreviation for ``the meaning of''. For example, what we said earlier can be re-stated formally as:
"1"〗 = 1     〖"2"〗 = 2     〖"3"〗 = 3     〖"+"〗 = +     〖"-"〗 = -
On the right-hand-side of the equal sign are the assigned meanings: the numbers 1, etc. and the integer operations. Something is missing however: how to find the meaning of complex (composite) expressions. This is schematically specified by:
"(e1+e2)"〗 = 〖e1〗 + 〖e2〗         〖"(e1-e2)"〗 = 〖e1〗 - 〖e2
where e, e1, e2 stand for arbitrary well-parenthesized sequences of characters. These equations express compositionality: the meaning of a composite expression is computed from the meaning of its parts -- but not the shape or other information about those parts. The meaning of the original expression then works out as:
"(1+(3-2))"〗 = 〖"1"〗 + 〖"(3-2)"〗 = 1 + (3 - 2) = 2
Compositionality means that the meaning of 〖"(e1+e2)"〗 depends solely on the meaning of e1 and e2 -- but not on the structure of these subexpressions. It follows then, for example, that 〖"(1+(3-2))"〗 must mean the same as 〖"(1+1)"〗, because 〖"(3-2)"〗 = 〖"1"〗. In other words, substituting a sub-expression with another (perhaps, simpler) sub-expression of the same meaning does not change the overall meaning.

In our example, the meaning of the expression is an integer. Said loftier, the semantic domain is the set of integers. We call it N semantics. This is not the only possible way to assign meaning to parenthesized expressions of digits and plus/minus characters. One may, for example, chose as the semantic domain the two-element set {odd, even}. We leave the meaning assignment (semantic equations) as an exercise, and consider yet another semantic domain: sequences of WASM instructions. The semantic equations are

"1"〗 = i32.const 1           〖"2"〗 = i32.const 2           〖"3"〗 = i32.const 3
"(e1+e2")〗 = 〖e1〗 〖e2i32.add            〖"(e1-e2")〗 = 〖e1〗 〖e2i32.sub
The meaning of the same original expression "(1+(3-2))" in the WASM domain is the instruction sequence
    i32.const 1   i32.const 3   i32.const 2   i32.sub   i32.add
We have obtained a simple compiler to WASM. (The approach does scale to much more complicated expressions.)


Free variables and Free semantics

Let us add variables to our expressions. Our language of parenthesized strings of digits and plus/minus operators will now include identifiers: sequences of letters and digits starting with a letter. Here is a sample expression in the extended language: "(y-(x+1))". The earlier semantics giving expressions the meaning as integers has the obvious trouble: there is no clear way to tell which integer should be the meaning of "x" or "y".

One may give meaning in a different way: as a collection of algebraic identities such as

    (1)  "(x+1)" === "(1+x)"
that is, assertions that some expressions are ``equal''. We use the sign === to signify such equivalence of expressions (in contrast to the = sign used for the semantic equality, e.g., equality of integers). Here, variables pose no problems: this is just an elementary school algebra. The above algebraic identity specifies one aspect of the meaning. Along with the congruence laws
if     e1 === e2     then    "e1+e" === "e2+e"
and the laws of equivalence it lets us equate complicated terms containing addition to one.

Variables are problematic only for the most naive denotational approach. Giving a compositional semantics to a language with variables is straightforward: we merely need a more sophisticated semantic domain than mere integers. Since we do not know the meaning of variables, we ask for it. A domain that accommodates questions can be described by the following (OCaml) inductive data type declaration

    type d = V of int | E of var * (int -> d)
where var is the type of identifiers. This is the set of well-founded infinitely-branching trees of finite height with integer leaves. We call this D semantics. For convenience, we introduce
    let an n   = V n
    let qu v k = E(v,k)
In the D domain, V n is an answer: the meaning as an integer n. On the other hand, E(v,k) is a question, about the meaning of the variable v, where k of the type int->d is an infinite tuple, indexed by integers. For each possible answer to the question -- for each possible integer value n the variable v may take -- k n tells the meaning of the expression for that particular variable value assignment.

The denotational semantics is then given as (we attach the superscript for easy reference):

"1"d  = an 1       〖"2"d  = an 2    etc.
vd      = qu v an
where v stands for an arbitrary identifier. The meaning of a digit is, hence, an answer: the number the digit stands for. The meaning of a variable is a question about that variable, immediately returning the answer. The meaning of a composite expression becomes:
"(e1+e2")d  = lift2 (+) 〖e1d  〖e2d
("e1-e2")d  = lift2 (-) 〖e1d  〖e2d
The only change is the appearance of lift2 : (int->int->int) -> (d->d->d). It transforms/adjusts the meaning of binary operations: extends them from N to D. It is defined generically for any binary operation on integers, by induction and case analysis:
    lift2 op (V n1) (V n2)  = an (op n1 n2)
    lift2 op (E (v,k)) d2   = qu v (fun n -> lift2 op (k n) d2)
    lift2 op d1 (E (v,k))   = qu v (fun n -> lift2 op d1 (k n))
If we know the answer to both operands, we can compute the answer to the operation. If an operand is a question, the operation is also a question. When it gets answered, the operation is tried again. The function lift2 hence propagates questions; the first operand gets to ask questions first.

All in all, the above semantic equations map an expression to a question-answer tree with integer leaves, which tells the meaning of the expression for each possible answer about the meaning of its variables. For example, for our sample term "(y-(x+1))" we compute its meaning as

    E("y", fun n -> E("x", fun m -> V(n-(m+1))))

The just defined semantics lets us verify the earlier equational law "1+x===x+1", by computing the denotations of the both sides and checking they are the same -- relying on the fact that the (semantic) addition on integers is commutative. The semantics lets us pose and verify many more identities, from the properties of integer addition. For example:

(2)    "(e+i)" === "(i+e)"       "((e1+e2)+e3)" === "(e1+(e2+e3))"
for any integer string i and well-parenthesized expressions e, e1, etc. For example, we obtain "(y-(x+1))" === "(y-(1+x))", because both sides have the same meaning (shown above).


Environment semantics

The just described denotational semantics for the toy language with variables is not the common one. The most familiar semantics uses environment, also called variable assignment in logic and typically denoted by ρ. The environment is a var->int mapping: ρ(v) tells the meaning (i.e., integer) associated with a variable v in this environment. Since environment is a finite map (any program has only a finite number of variables), it is often implemented as an associative list. The meaning of an expression hence is a function env -> int, which tells which integer the expression stands for given a particular variable assignment. We call this semantic domain E. The semantic equations are:
"1"e  = fun ρ -> 1         〖ve  = fun ρ -> ρ(v)
"(e1+e2)"e  = fun ρ -> 〖e1e ρ + 〖e2eρ        〖"(e1-e2)"e  = fun ρ -> 〖e1e ρ - 〖e2eρ

The E semantics above is built over the N domain. We may just as well use the WASM domain as the basis. The environment in this case maps variable names to WASM instructions. The meaning of the sample expression "(y-(x+1))" is then

    fun ρ -> ρ("y")   ρ("x")   i32.const 1   i32.add   i32.sub
In the environment that maps "x" to local.get 0 and "y" to local.get 1, our running example means
    local.get 1   local.get 0   i32.const 1   i32.add   i32.sub


Environment, Free, and Initial semantics

The semantics E looks rather different from the earlier D. But only on first sight. Let's define
    let ane n   = fun ρ -> n
    let que v k = fun ρ -> k (ρ(v)) ρ
    let lift2e op e1 e2 = fun ρ -> op (e1 ρ) (e2 ρ)
then the semantic equations
"1"e  = ane 1        〖ve  = que v ane
"(e1+e2)"e  = lift2e (+) 〖e1ee2e        〖"(e1-e2)"e  = lift2e (-) 〖e1ee2e
take exactly the same form as 〖 〗d. Only the choice of the three functions an, qu, lift2 differs, reflecting the different semantic domain (E rather that D). One may therefore say that D and E semantics, and whatever else one may come up with, are all instances of the single general semantics of variables (call it G):
"1"g  = ang 1        〖vg  = qug v ang
"(e1+e2)"g  = lift2g (+) 〖e1ge2g        〖"(e1-e2)"g  = lift2g (-) 〖e1ge2g
parameterized by the functions ang, qug and lift2g (and lift1g if the language has unary operations). This general semantics is an initial algebra semantics of variables. All other semantics are obtained by appropriately choosing the parameters ang, qug and lift2g.

Semantics E and D are thus two different instances of G. They can also be related in another way:

ee = hnd 〖ed
where hnd (which we call the handler) is inductively defined by the two cases:
    hnd V(n)   = ane n
    hnd E(v,k) = que v (hnd k)
In other words, the handler takes a question-answer tree and a variable assignment ρ and uses the latter to answer all the questions in the former. Another way to understand the handler is the wholesale replacement of V and E nodes in the question-answer tree with ane and que. (One may also note that hnd is what is called fold in initial algebra.)

Since hnd is fold, and ane and que could just as well be parameters ang and qug rather than fixed functions, the semantics D is also the initial algebra semantics (and, hence, isomorphic to G).

The question-answering strategy provided by hnd (and hence, by the environment semantics) can verify more identities, for example

    "x+y" === "y+x"
which cannot be verified in the D semantics because nothing guarantees that the question about the meaning of, say, x is answered with the same integer no matter when this question is asked relative to other questions. The handler hnd makes this guarantee but other handlers may not. On the other hand, (1) and (2) are valid for any question-answering strategy.

This is the taste of things to come in the paper Not by Equations Alone: Reasoning with Extensible Effects, where one encounters two sorts of semantics. One is the initial algebra, question-answer--tree semantics like D that presupposes no question-answering strategy. We often call such semantics `free'; the identities (equational laws) verified it are to be called `free' as well. The other semantics, such as E, rely on a particular question-answering strategy, and hence can verify more identities.


Simple State

Let us turn to effects. Let's go back to the toy language with digits and addition/subtraction, and instead of variables, introduce "Get" and "(Put i)" (where i is a string of digits). We may tell the story of the language with free variables again, with hardly any changes. The effect operations, like the variables x and y earlier, are just names, with no definite interpretation. Therefore, the expression "(Get+1)", like "(x+1)" earlier, is open, and poses problem for the naive denotational semantics. Just as before, we may specify the semantics via identities, such as
    "((Put 1)+Get)" === "((Put 1)+1)"
(in which one may recognize the so-called Put-Get law).

As in the case of variables, effects call for the extension of the semantic domain. The question-answer trees work here as well:

    type p = V of int | E of ops * (int -> p)
where ops is the type of effect operations: in our case,
    type ops = Get | Put of int
Again we introduce the convenient functions
    let anp n    = V n
    let qup op k = E(op,k)

The semantics (call it P)

"1"p  = anp 1                        〖"2"p  = anp 2    etc.
"Get"p  = qup Get anp       〖"(Put i)"p  = qup (Put i) anp
"(e1+e2")p  = lift2p (+) 〖e1p  〖e2p           〖("e1-e2")p  = lift2p (-) 〖e1p  〖e2p
and lift2p is lift2 with anp/qup instead of an/qu. The semantics P is hence almost literally the same as D. It is also an instance of G.

Since the question-answer semantics of state P turns out almost literally D, one may wonder what an environment-like semantics for state may look like, and how it may relate to E. In our simple state language, there is only one piece of state. Therefore, the environment ρ is the function unit -> int, which is isomorphic to an integer. The environment-like semantics of state (call it S) uses the following question operation:

    let qus Get k     = fun ρ -> k ρ ρ
    let qus (Put i) k = fun ρ -> k i i
(ans is the same as ane). The semantics equations then
es = hnd 〖ep
where hnd is essentially the same as before:
    hnd V(n)    = ans n
    hnd E(op,k) = qus op (hnd k)
Just like E is an instance of the initial algebra semantics D, S is an instance of P. Clearly, S is the familiar state-passing semantics of state.

The free semantics of state P lets us verify "(Get + 1)" === "(1 + Get)" and, in general, the identities similar to (2). On the other hand, it does not verify the Put-Get law. But the more specific semantics S does, as one can easily work out.

One may now ask, what should an/qu operations be to verify the Put-Get law. This is the central question for the algebraic effects and handlers approach. One may also ask: given a particular way of answering Get-Put questions (that is, particular an/qu operations) what are the corresponding additional identities that may be verified? This is the question of the `Not by Equations Alone' paper (in particular, for the Get/Put effect, it is answered in Sec 5 of the paper).

We have thus confirmed the analogy between free variables and effect operations. It can be traced to Reynolds' Idealized Algol and explored in more detail in Having an Effect. The free, question-answer--tree semantics makes the analogy particularly easy to see. A variable access may indeed be an effect   when the variable resides far off core (maybe in another host in a distributed system) or in a memory-mapped i/o. The question-answer--tree semantics could be particularly useful there.