From oleg at okmij.org Tue Sep 2 01:57:18 2008 To: haskell@haskell.org Subject: The final view on typed sprintf and sscanf Message-ID: <20080902075718.17EB1AE54@Adric.metnet.fnmoc.navy.mil> Date: Tue, 2 Sep 2008 00:57:18 -0700 (PDT) Status: OR It would be remiss not to mention the dual solution to the problem of typed sprintf and sscanf sharing the same formatting specification. The previous message defined the embedded domain-specific language of formatting specifications in the initial style, as a data type. The language can also be defined in the final style. To the end user, the difference is hardly noticeable: all the tests of the previous message work as they are (modulo a few adjustments caused by the monomorphism restriction). However, whereas the initial style required GADT, the final solution is entirely in Haskell98. One often hears that hardly anything interesting can be written in Haskell98. I submit that implementing type-indexed terms, thought to require GADTs or similar dependent-type-like extensions, ought to count as interesting. Again, the formulation of the problem and the end-user interface remain exactly the same as described in the previous message. Here are a few examples: > tp1 = sprintf $ lit "Hello world" > -- "Hello world" > ts1 = sscanf "Hello world" (lit "Hello world") () > -- Just () > tp2 = sprintf (lit "Hello " ^ lit "world" ^ char) '!' > -- "Hello world!" > ts2 = sscanf "Hello world!" (lit "Hello " ^ lit "world" ^ char) id > -- Just '!' > fmt3 () = lit "The value of " ^ char ^ lit " is " ^ int > tp3 = sprintf (fmt3 ()) 'x' 3 > -- "The value of x is 3" > ts3 = sscanf "The value of x is 3" (fmt3 ()) (\c i -> (c,i)) > -- Just ('x',3) The only difference is the dummy unit argument to the fmt3 term, to keep it polymorphic and away from the monomorphism restrictions. Otherwise, the examples look the same and work the same. The complete code is available at PrintScanF.hs It is Haskell98 and should work on any Haskell98 system (tested on GHC 6.8.2 and Hugs September 2006). Whereas the initial version defined the formatting language with the help of GADT, the final version uses a simple, single-parameter type class > class FormattingSpec repr where > lit :: String -> repr a a > int :: repr a (Int -> a) > char :: repr a (Char -> a) > (^) :: repr b c -> repr a b -> repr a c > infixl 5 ^ The printer and the scanner are two interpreters of the language > newtype FPr a b = FPr ((String -> a) -> b) > instance FormattingSpec FPr where > lit str = FPr $ \k -> k str > char = FPr $ \k -> \x -> k [x] > (FPr a) ^ (FPr b) = FPr $ \k -> a (\sa -> b (\sb -> k (sa ++ sb))) > newtype FSc a b = FSc (String -> b -> Maybe (a,String)) > instance FormattingSpec FSc where > lit str = FSc $ \inp x -> > maybe Nothing (\inp' -> Just (x,inp')) $ prefix str inp > char = FSc $ \inp f -> case inp of > (c:inp) -> Just (f c,inp) > "" -> Nothing > (FSc a) ^ (FSc b) = FSc $ \inp f -> > maybe Nothing (\(vb,inp') -> b inp' vb) $ a inp f > sprintf :: FPr String b -> b > sprintf (FPr fmt) = fmt id > sscanf :: String -> FSc a b -> b -> Maybe a > sscanf inp (FSc fmt) f = maybe Nothing (Just . fst) $ fmt inp f The transformation from the initial to the final style follows the correspondence described in http://okmij.org/ftp/tagless-final/tagless-typed.html#in-fin In fact, PrintScanF.hs was `derived' from PrintScan.hs by Emacs `code movements'. Once the syntax errors have been fixed, the code worked on the first try. One can easily define a yet another interpreter, to convert the formatting specification to a C-like format string.