[Haskell] Applicative translucent functors in Haskell
Chung-chieh Shan
Mon Sep 13 15:20:33 EDT 2004
http://www.haskell.org/pipermail/haskell/2004-September/014515.html
Comment: added LANGUAGE pragmas.
On 2004-09-08T19:46:55+0200, Tomasz Zielonka wrote:
] On Wed, Sep 08, 2004 at 04:27:23PM +0100, Simon Peyton-Jones wrote:
]] The ML orthodoxy says that it's essential to give sharing constraints by
]] name, not by position. If every module must be parameterised by every
]] type it may wish to share, modules might get a tremendous number of type
]] parameters, and matching them by position isn't robust. I think that
]] would be the primary criticism from a programming point of view. I have
]] no experience of how difficult this would turn out to be in practice.
] How about named fields in type constructors? Something like Haskell's
] records but at type level. Seems like a fun extension ;)
Proponents of ML-style module systems emphasize the advantage
of `sharing by specification' (or `fibration') over `sharing by
construction' (or `parameterization') (MacQueen 1986; Pierce 2000;
Harper and Pierce 2003). As Simon Peyton-Jones noted, in the context
of our translations of ML-style modules into System F-omega and
Haskell, sharing by specification gives type-equality constraints by
name, whereas sharing by construction gives type-equality constraints
by position. Harper and Pierce (2003; Pierce 2000) give examples of
modular programming where the latter approach can lead to an exponential
number of parameters, which are clumsy to deal with at best. It has
been often suggested that records at the type level be introduced to
address this issue (Jones 1995, 1996; Shao 1999a,b; Shan 2004; Tomasz
Zielonka in this discussion thread).
In this message, we (Oleg Kiselyov and Chung-chieh Shan) translate
Harper and Pierce's example into Haskell, using only the most common
Haskell extensions to give type-equality constraints by name and avoid
an exponential blowup. This exercise suggests that, while type-level
records may be convenient to have in Haskell, they may not be strictly
necessary to express sharing by specification. As shown below, we
can indeed refer to type parameters `by name', taking advantage of
the ability of a Haskell compiler to unify type expressions and bind
type variables. Our technique may be generalizable to encode all
sharing by specification. We hope this message helps clarify the
difference between the two sharing styles, and relate the ML and Haskell
orthodoxies.
First, let us demonstrate the exponential explosion of type variables.
We again will be using OCaml and Haskell in parallel, to make our
Haskell translation of module expression clearer. Later we shall
show how we prevent the exponential explosion in Ocaml -- and how
to translate that solution to Haskell. Again this message is a
doubly-literal code: both in OCaml and Haskell. It can be loaded in
GHCi or Hugs -98 as it is. To get the OCaml code, please filter the
text of the message with "sed -n -e '/^}/ s/^} //p'"
> {-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
> {-# LANGUAGE ScopedTypeVariables #-}
> {-# LANGUAGE FunctionalDependencies #-}
> {-# LANGUAGE UndecidableInstances #-}
> module Fibration where
(The final solution arrived at by the end of this message does not
require -fallow-undecidable-instances above.)
Let us consider a module of the following interface (a signature, in
ML speak):
} module type FN = sig
} type a
} type b
} val app : a -> b
} end
This is the interface of a regular function. It can be thought of
as a compiler stage or network protocol stack that translates one
intermediate language or representation (type a) into another (type b).
Here are two sample modules of that signature:
} module TIF = struct
} type a = int
} type b = float
} let app x = float_of_int x
} end
}
} module TFI = struct
} type a = float
} type b = int
} let app x = truncate x + 1
} end
In our Haskell translation, a signature corresponds to a type class,
and an implementation (a structure, aka module) to an instance:
> class NFN a b where
> napp:: a -> b
>
> instance NFN Integer Float where
> napp x = fromInteger x
>
> instance NFN Float Integer where
> napp x = truncate x + 1
Let us write a module that represents a composition appr . appl of
two FN-functions: appl: al->bl and appr: ar->br. In order for the
composition to be well-formed, the result type of appl must be the
argument type of appr: bl = ar (which we will call the intermediate
type t). Let us further suppose that we wish to make this intermediate
type explicit (e.g., for inspection, to resolve overloading, to invoke
the two intermediate functions separately, etc). Thus we arrive at the
following interface:
} module type NFN1 = sig
} type a1
} type t
} type b1
} val app1 : a1 -> b1
} end
Or, in Haskell
> class NFN1 a t b where
> napp1:: t -> a -> b
Note that the type of the intermediate result is really needed in
Haskell, to resolve the overloading and properly select the instance.
The composition of two modules of the signature FN is computed by the
following transparent functor:
} module NFn1(L: FN)(R: (FN with type a = L.b)) = struct
} type a1 = L.a
} type t = L.b
} type b1 = R.b
} let app1 x = R.app (L.app x)
} end
It takes two modules of the signature FN, labeled L and R. We should
note a _sharing constraint_: the type a of module R must be the same
as the type b of module L. The result of the NFn1 is a module of the
signature NFN1.
Here is an example of using the module
} module TIFI = NFn1(TIF)(TFI)
} let test_tifi = TIFI.app1 7;; (* 8 *)
In Haskell, the functor corresponds to an instance with constraints
that correspond to the argument signatures. The sharing is expressed
by sharing of the names of type variables:
> instance (NFN a t, NFN t b) => NFN1 a t b where
> napp1 t x = napp (napp x `asTypeOf` t)
>
> test_nfn1::Integer
> test_nfn1 = napp1 (undefined::Float) (7::Integer) -- 8
Suppose we wish to compose two modules NFN1 again. Again, we wish to
expose all intermediate types
} module type NFN2 = sig
} type a2
} type tl type t type tr
} type b2
} val app2 : a2 -> b2
} end
}
} module NFn2(L: NFN1)(R: (NFN1 with type a1 = L.b1)) = struct
} type a2 = L.a1
} type tl = L.t
} type t = L.b1
} type tr = R.t
} type b2 = R.b1
} let app2 x = R.app1 (L.app1 x)
} end
We should note again that the functor NFn2 imports two modules of the
signature NFN1 and re-exports their types, after relabeling them to avoid
ambiguity and applying the sharing constraint R.a1 = L.b1.
In Haskell:
> class NFN2 a tl t tr b where
> napp2:: (tl,t,tr) -> a -> b
>
> instance (NFN1 a tl t, NFN1 t tr b) => NFN2 a tl t tr b where
> napp2 (tl,t,tr) x = napp1 tr $ ((napp1 tl x) `asTypeOf` t)
We can do that again:
} module type NFN3 = sig
} type a3
} type tl type t type tr
} type b3
} val app3 : a3 -> b3
} end
}
} module NFn3(L: NFN2)(R: (NFN2 with type a2 = L.b2)) = struct
} type a3 = L.a2
} type tll = L.tl
} type tl = L.t
} type trl = L.tr
} type t = L.b2
} type tlr = R.tl
} type tr = R.t
} type trr = R.tr
} type b3 = R.b2
} let app3 x = R.app2 (L.app2 x)
} end
In Haskell:
> class NFN3 a tll tl tlr t trl tr trr b where
> napp3:: ((tll,tl,tlr),t,(trl,tr,trr)) -> a -> b
>
> instance (NFN2 a tll tl tlr t, NFN2 t trl tr trr b)
> => NFN3 a tll tl tlr t trl tr trr b where
> napp3 (tl,t,tr) x = napp2 tr $ ((napp2 tl x) `asTypeOf` t)
Here is a usage example
} module TII = struct
} type a = int
} type b = int
} let app x = x + 2
} end
}
} module NM1 = NFn1(TII)(TII)
} module NM2 = NFn2(NM1)(NM1)
} module NM3 = NFn3(NM2)(NM2)
} let test_nm3 = NM3.app3 5;; (* 21 *)
> instance NFN Integer Integer where
> napp x = x + 2
>
> test_nfn3:: Integer
> test_nfn3 = let i = undefined::Integer
> i3 = (i,i,i)
> in napp3 (i3,i,i3) (5::Integer) -- 21
The exponential explosion of the type variables is apparent. The term
expressions, the module expressions, and the sharing constraints are all
`linear'. That is, if we wish to define another level of composition,
NFN4, we write an expression similar to NFn3, which, if we disregard the
type variables, has roughly the same size, in characters. It's only
when we look at the type variables we see the explosion. The explosion
can be overcome if could magically say: import module NFNn as L; import
module NFNn as R; make sure that L.bn = R.an; and re-export the rest.
Alas, we can't deal with the type variables of a structure 'in bulk'.
If we wish to re-export them, we have to enumerate them all.
The explosion is particularly apparent in Haskell, where we refer to
type parameters of a class by their position rather than by their name.
If we wish to write another level of composition, say, NFN4, we merely
need the first type variable NFN3 and the last type variable of NFN3.
Alas, we have to enumerate all the type variables in-between.
It turns out that we _can_ refer to type variables of a module `in bulk',
both in OCaml and in Haskell. To do that, we introduce a more
structural representation:
} module type FN1 = sig
} type a type b type t
} module ML : (FN with type a = a and type b = t)
} module MR : (FN with type b = b and type a = t)
} val app : a -> b
} end
}
} module Fn1(L: FN)(R: (FN with type a = L.b)) = struct
} type a = L.a type b = R.b type t = L.b
} module ML = L module MR = R
} let app x = R.app (L.app x)
} end
}
} module type FN2 = sig
} type a type b type t
} module ML : (FN1 with type a = a and type b = t)
} module MR : (FN1 with type b = b and type a = t)
} val app : a -> b
} end
}
} module Fn2(L: FN1)(R: (FN1 with type a = L.b)) = struct
} type a = L.a type b = R.b type t = L.b
} module ML = L module MR = R
} let app x = R.app (L.app x)
} end
The details of the two halves of the composition are stowed away in the
submodules ML and MR. We avoid the explosion in Ocaml because we can
mention, for example, the type tl in NFN2 above as ML.t in FN2 instead.
We can build a chain of functions using source code of size logarithmic
in the length of the chain.
Let us extend the chain one more time for illustration, and show an
example:
} module type FN3 = sig
} type a type b type t
} module ML : (FN2 with type a = a and type b = t)
} module MR : (FN2 with type b = b and type a = t)
} val app : a -> b
} end
}
} module Fn3(L: FN2)(R: (FN2 with type a = L.b)) = struct
} type a = L.a type b = R.b type t = L.b
} module ML = L module MR = R
} let app x = R.app (L.app x)
} end
}
} module M1 = Fn1(TIF)(TFI)
} module M2 = Fn2(M1)(M1)
} module M3 = Fn3(M2)(M2)
} let test_m3 = M3.app 5;; (* 9 *)
With the help of Haskell type classes, we can also stow away the
detailed type information of a module.
First, we re-define our class representing the signature FN to take an
extra type parameter:
> class FN s a b | s -> a, s -> b where
> app:: s -> a -> b
The parameter 's' is a `label' that uniquely identifies an instance of
the class FN: in other words, the label 's' represents a module of a
signature FN. The label is a `proxy' for the module. Here are a few
examples of such modules:
> instance FN (Integer->Float) Integer Float where
> app _ x = fromInteger x
>
> instance FN (Float->Integer) Float Integer where
> app _ x = truncate x + 1
>
> instance FN (Integer->Integer) Integer Integer where
> app _ x = x + 2
>
> data L = L
> instance FN L Integer Integer where
> app _ x = x + 2
In the last example, we used an `artificial' label `L'. Now we can
write the signature and the functor FN1 that `composes' FN once, the
signature FN2 and the corresponding functor that compose FN twice,
etc. However, because in Haskell an instance can refer to itself, we
can create a recursive functor and a signature:
> instance (FN u a t, FN v t b) => FN (u,v) a b where
> app s = app (snd s) . app (fst s)
The FN instance above subsumes the old classes NFN1, NFN2, NFN3, etc.,
all under the same FN class:
> fn'1 = undefined :: (Integer->Float, Float->Integer)
> fn'2 = (fn'1, fn'1)
> fn'3 = (fn'2, fn'2)
> test_fn' = app fn'3 5 -- 9
If fn'1 is a 2-stage compiler, then fn'2 is a 4-stage compiler and
fn'3 is an 8-stage one. The types of fn'1, fn'2, fn'3 above grow
exponentially, just as the Ocaml signature FN2 above is twice the size
of FN1 when expanded out. But signature definitions in Ocaml and type
synonyms in Haskell allow us to avoid the explosion in the source code.
Even though the details of these composed modules are stowed away,
they are not hidden. Indeed, the label uniquely determines the the
module. We can inspect the label or its type to find sub-labels,
which uniquely describe the intermediate modules and their internal
types. For example, here is a Haskell function that runs the first 3
stages of an 8-stage compiler like fn'3:
> stages123of8 ~((s12,(s3,s4)),s5678)
> = app s3 . app s12
The type of stages123of8 is inferred to be
*Fibration> :t stages123of8
stages123of8 :: forall b a s5678 s4 s3 s12 b1.
(FN s12 a b1, FN s3 b1 b) =>
((s12, (s3, s4)), s5678) -> a -> b
Here ((s12,(s3,s4)),s5678) is a dummy type argument that identifies the
module (in other words, resolves the overloading). Note that the term
and type above only mentions the parts of the module that are actually
used, not the exponentially-sized details of say s5678. We thus avoid
exponential blowup and achieve sharing by specification.
> test_stagem3 = stages123of8 fn'3 5
*Fibration> :t test_stagem3
test_stagem3 :: Float
*Fibration> test_stagem3
6.0
We should point out that we have indeed accessed an intermediate type in
fn'3: although the whole compiler "app fn'3" maps Integer to Integer,
the first three stages map Integer to an intermediate type Float. We
have taken a great advantage of the pattern-matching ability of the
Haskell compiler: the ability to unify one type expression with the
other and bind type variables.
REFERENCES
Robert Harper, and Benjamin C. Pierce. 2003. Design issues in advanced
module systems. In Advanced topics in types and programming languages,
ed. Benjamin C. Pierce. Cambridge: MIT Press. Draft manuscript.
Mark P. Jones. 1995. From Hindley-Milner types to first-class structures.
In Proceedings of the Haskell workshop, ed. Paul Hudak. Tech. Rep. YALEU/
DCS/RR-1075, New Haven: Department of Computer Science, Yale University.
http://www.cse.ogi.edu/~mpj/pubs/haskwork95.pdf
Mark P. Jones. 1996. Using parameterized signatures to express modular
structure. In POPL '96: Conference record of the annual ACM symposium
on principles of programming languages, 68-78. New York: ACM Press.
http://www.cse.ogi.edu/~mpj/pubs/paramsig.html
http://www.cse.ogi.edu/~mpj/pubs/paramsig.pdf
David B. MacQueen. 1986. Using dependent types to express modular
structure. In POPL '86: Conference record of the annual ACM symposium
on principles of programming languages, 277-286. New York: ACM Press.
http://www.cs.bell-labs.com/who/dbm/papers/popl86/paper.ps
Benjamin C. Pierce. 2000. Advanced module systems: A guide for the
perplexed. ICFP invited talk.
http://www.cis.upenn.edu/~bcpierce/papers/modules-icfp.ps
Chung-chieh Shan. 2004. Higher-order modules in System F-omega and
Haskell. Draft manuscript.
http://www.cs.rutgers.edu/~ccshan/xlate/xlate.pdf
Zhong Shao. 1999a. Transparent modules with fully syntactic signatures.
In ICFP '99: Proceedings of the ACM international conference on functional
programming, vol. 34(9) of ACM SIGPLAN Notices, 220-232. New York:
ACM Press.
http://flint.cs.yale.edu/flint/publications/fullsig.pdf
Zhong Shao. 1999b. Transparent modules with fully syntactic signatures.
Tech. Rep. YALEU/ DCS/ TR-1181, Department of Computer Science, Yale
University, New Haven.
http://flint.cs.yale.edu/flint/publications/fullsig-tr.pdf