Our polymorphic variants are literally open co-products: dual of extensible polymorphic records of HList. The duality is negation:
NOT (A | B) -> (NOT A & NOT B)This implication is a part of the deMorgan law that holds both in classical, and, more importantly, in intuitionistic logics. Our encoding of sums is the straightforward Curry-Howard image of that law.
Our implementation of polymorphic variants in terms of HList records uses no type classes, no type-level programming or any other type hacking. In fact, there are no type annotations, type declarations, or any other mentioning of types, except in the comments.
open-coproduct.txt [4K]
Polymorphic variants (extensible, recursive sums) with HList
The message posted on the Haskell mailing list on Sun, 2 Jul 2006 18:35:12 -0700 (PDT)
< http://guppy.eng.kagawa-u.ac.jp/~kagawa/PVH >
Koji Kagawa: Polymorphic Variants in Haskell
Presented at the Haskell Workshop 2006
This paper proposes a quite different solution to the expression problem, encoding constructors of open `data types' as class methods. Our example is patterned after the running example of this paper.
The code files below closely follow the layout of the running example in Laemmel and Peyton-Jones' ICFP2005 talk. The comments describe which part of the code is the `Library', which part of the code is the (data-structure--independent) `generic function', and which part of the code overrides the general processing of the generic function. The code has two examples; one of them is gsize of the original SYB3 paper.
Our variation of SYB3 takes advantage of the fact that type-class-bounded and higher-rank polymorphisms can be partially traded for each other. To be more precise, we use type codes to pass around functions. The type code must then be interpreted by a type class similar to Apply of HList (cf. HMap). This extra interpretation step is the main conceptual deviation from SYB3, largely taking the important idea of function combinators out of it.
dat2.hs [4K]
With a little bit of CPS, we now eliminate recursive instance dependency even for recursive datatypes. This is our final solution.
Re: Scrap your boilerplate and Variants
The discussion thread started from a message posted on the Haskell-Generics mailing list on Tue Oct 24 23:28:24 EDT 2006
< http://www.cs.vu.nl/boilerplate/ >
Scrap your boilerplate ... in Haskell
That web page includes the pointer to our inspiration, SYB3: Scrap your boilerplate with class: extensible generic functions by Ralf Laemmel and Simon Peyton-Jones. In: Proceedings of ICFP 2005.
SYB1, when traversing terms and invoking user functions for subterms of a particular type, relies on a run-time typecase. The latter requires run-time type representation, which is provided by the typeclass Typeable. The typeclass implements the method `cast' for a safe cast from the `generic type' to the specific type. We observe that the typecase, at the type level, has always existed in Haskell: it is the type equality predicate TypeEq of HList.
Our generic functions, like those of SYB1, are ordinary functions. Since the transformation for each subterm is chosen at compile-time, however, we do not need any casts, run-time type representations, or Typeable. We do not need higher-ranked types either. Also unlike SYB1, the set of traversal strategies is extensible: To the pre-defined strategies for gmap, gshow, gfold and geq, the programmer may at any time add a new class of traversals.
Our most general generic function is gapp, which applies a generic function to a term. A generic function is, quite literally, made of two parts. First is a term traversal strategy, identified by a label. One strategy may be to `reduce' a term using a supplied reducing function (cf. fold over a tree). Another strategy rebuilds a term. The second component of a generic function is spec, the HList of `exceptions', or ad hoc transformations. Each element of spec is an ordinary function that tells how to transform a term of a specific type. Exceptions override the generic traversal: If the type of the input term matches the argument type of one of the functions in spec we apply that function. If no specific function applies, we do the generic action implicit in the traversal strategy.
The algorithm that selects for each input subterm the appropriate transformer is tantamount to overloading resolution algorithm. Smash thus implements typeclass instance selection algorithm in the typechecker itself. The messages below describe the implementation of typeclasses in terms of Smash, with an additional flexibility of reordering, inspecting and deleting `instances'.
smash-first.txt [8K]
Smash your boiler-plate without class and Typeable
The message describing the original Smash approach, posted on the Haskell mailing list on 10 Aug 2006 21:16:34 -0000
smash-along.txt [12K]
Smash along your boilerplate; how to traverse a non-existent term
The improved and generalized approach, described in the message posted on the Haskell mailing list on Tue, 5 Jun 2007 00:09:02 -0700 (PDT)
< http://www.haskell.org/haskellwiki/Applications_and_libraries/Generic_programming/Smash >
`Smash' on Haskell Wiki
undefined and produce the fully-defined smallest (resp. largest) term of the desired type:
gminimum () :: (Maybe Int,Either Bool (Maybe Char))
-- ==> (Nothing,Left False)
gmaximum () :: (Maybe Int,Either Bool (Maybe Char))
-- ==> (Just 2147483647,Right (Just '\1114111'))
The result of traversing (undefined::(Maybe Int,Either Bool (Maybe Char))) is (Nothing,Left False). We take advantage of the fact that undefined stands for a term of any structure and is deconstructible and traversable. Haskell's non-strictness lets us meaningfully traverse non-existing terms.
Traversing non-existing sums such as undefined::Either Int Char requires special attention. Sums give us a choice. We should report the available alternatives, and let the users make the choice of a component of the sum. The implementation of gminimum always chooses the first available alternative; gmaximum chooses the last available.
The function gminimum is akin to the de-typechecker since both convert `undefined' to `defined'. De-typechecker produces only polymorphic functions, whereas gminimum yields data, including monomorphic functions.
smash-along.txt [12K]
Smash along your boilerplate; how to traverse a non-existent term
The explanation, described in the second half of the message posted on the Haskell mailing list on Tue, 5 Jun 2007 00:09:02 -0700 (PDT)
Serialization can be written generically, with generic fold. The following Smash code says the result of serializing an object is the list of all its primitive fields.
serialize xs = gapp (TL_red concat) primitive_fields_show xs
primitive_fields_show =
(\ (s::Int) -> [show s]) :+:
(\ (s::Float) -> [show s]) :+:
(\ (s::String) -> [s]) :+:
HNil -- more can be added in the future
For example, for the following data type of a company and a sample term (from the famous example of SYB papers)
data Company = C [Dept]; data Dept = D Name Manager [Unit]
data Unit = PU Employee | DU Dept
data Employee = E Person Salary; data Person = P Name Address
data Salary = S Float; type Manager = Employee
type Name = String; type Address = String
genCom = C [D "Research" (E (P "Laemmel" "Amsterdam") (S 8000.0))
[PU (E (P "Joost" "Amsterdam") (S 1000.0)),
PU (E (P "Marlow" "Cambridge") (S 2000.0))],
D "Strategy" (E (P "Blair" "London") (S 100000.0)) []]
serialize genCom produces the list
["Research","Laemmel","Amsterdam","8000.0","Joost","Amsterdam",
"1000.0","Marlow","Cambridge","2000.0","Strategy","Blair",
"London","100000.0"]
Hugh Perkins has posed the problem of writing a generic deserializer, which should work for terms of any structure in the domain of a generic programming library. His particular application is complex terms representing XML documents. The deserializer can indeed be written generically: in the first approximation, it is a generic monadic map over a term. It takes a term, the prototype, and the list of serialized primitive fields as a monadic state. The deserializer traverses the prototype replacing each primitive field with the corresponding value from the monadic state. For example, assuming the following list of serialized fields:
retro = ["Metaphysics", "Kant","Koeningsberg","800.0",
"Hume","Edinburgh","100.0",
"Marlowe","Cambridge","200.0",
"Ruling","Thatcher","London","50000.0"]
and genCom as the prototype, deserialize genCom retro `upgrades' the company giving us the term
C [D "Metaphysics" (E (P "Kant" "Koeningsberg") (S 800.0))
[PU (E (P "Hume" "Edinburgh") (S 100.0)),
PU (E (P "Marlowe" "Cambridge") (S 200.0))],
D "Ruling" (E (P "Thatcher" "London") (S 50000.0)) []]
In the discussion thread, Jeremy Gibbons has pointed out that McBride and Paterson's idiomatic traverse subsumes monadic gmap. See his and Oliveira's paper below for the detailed discussion.
Hugh Perkins et al. Thread on the Hs-Generics mailing list, June 26-July 3, 2007.
Jeremy Gibbons and Bruno C. d. S. Oliveira. The essence of the Iterator pattern.
< http://web.comlab.ox.ac.uk/jeremy.gibbons/publications/#iterator >
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML