{-# OPTIONS -fglasgow-exts #-} -- Haskell98 plus higher-ranked types only -- Eliminating Array bound checking with non-dependent types -- An attempt to give as literal as possible translation of -- Hongwei Xi's code (Xi and Pfenning, PLDI98) into Haskell -- (Haskell98 + higher-rank types). -- The present translation has a lot of common with KMP-deptype.hs -- (translation of another Hongwei Xi's code, from the same paper). -- The file eliminating-array-bound-check.lhs gives a slightly -- less literal (but perhaps more elegant) translation. module Dep where import Data.Array {- Our example is `bsearch', taken from the famous paper "Eliminating Array Bound Checking Through Dependent Types" by Hongwei Xi and Frank Pfenning (PLDI'98). Hongwei Xi's code was written in SML extended with a restricted form of dependent types. Here is the original code of the example (taken from Figure 3 of that paper, see also http://www-2.cs.cmu.edu/~hwxi/DML/examples/) datatype 'a answer = NONE | SOME of int * 'a assert sub <| {n:nat, i:nat | i < n } 'a array(n) * int(i) -> 'a assert length <| {n:nat} 'a array(n) -> int(n) fun('a){size:nat} bsearch cmp (key, arr) = let fun look(lo, hi) = if hi >= lo then let val m = (hi + lo) div 2 val x = sub(arr, m) in case cmp(key, x) of LESS => look(lo, m-1) | EQUAL => (SOME(m, x)) | GREATER => look(m+1, hi) end else NONE where look <| {l:nat, h:int | 0 <= l <= size /\ 0 <= h+1 <= size } int(l) * int(h) -> 'a answer in look (0, length arr - 1) end where bsearch <| ('a * 'a -> order) -> 'a * 'a array(size) -> 'a answer The text after `<|' are dependent type annotations. They _must_ be specified by the programmer -- even for internal functions such as `look'. -} -- Our bsearch function. bsearch cmp (key, arr) = brand arr (\arrb -> bsearch' cmp (key, arrb)) bsearch' cmp (key,(arr,lo,hi)) = look lo hi where look lo hi = index_cmp lo hi Nothing $ \lo' hi' -> let m = bmiddle lo' hi' x = arr !. m in case cmp (key,x) of LT -> look lo (bpred m) EQ -> Just (unbi m, x) GT -> look (bsucc m) hi {- This code is just as algorithmically efficient as the Dependent SML code: one middle index computation, one element comparison, one index comparison, one index increment or decrement per iteration. There are no type annotations as none are needed. Operator (!.) is a statically safe array indexing operator. The type system and the trust properties of the kernel below guarantee that in the expression "arr !. m" the index `m' is positively in range of the array `arr' bounds. -} -- Tests barr1 = listArray (5,5 + (length s)-1) s where s = "abcdefgh" btest1 = bsearch (uncurry compare) ('c',barr1) btest2 = bsearch (uncurry compare) ('a',barr1) btest3 = bsearch (uncurry compare) ('h',barr1) btest4 = bsearch (uncurry compare) ('x',barr1) btest5 = bsearch (uncurry compare) ('x',listArray (0,-10) []) -- ---------------------------------------------------------------------- -- The trusted kernel -- -- The code relies on a compact general-purpose trusted kernel below. -- That code should be put into a separate module. In particular, -- the data constructors BArray, BIndex, should not be exported and -- should not be available outside of the kernel. -- -- The phantom type 's' is a type proxy for the pair of numbers -- (low,high) -- the boundaries of array indices. In Haskell, low can -- be arbitrarily larger than high. We don't know what array indices -- are until the run time. Yet we know (due to the way BArray, etc, types -- are constructed) that the same value of s means the the same -- interval of indices. The safety depends on the trusted way of -- creating branded types: the constructors BIndex and BArray should be -- used in the trusted kernel only, and should not be available anywhere -- else. The uniqueness of 's' afforded by the explicit universal -- quantification prevents mixing up of different brands. -- Branded arrays and indices -- These are `newtype's and so impose no run-time overhead. newtype BArray s i a = BArray (Array i a) -- the value of the type |BIndex s i| denotes an index value that -- is assuredly { low <= i <= high } where low and high are the bounds -- represented by the type parameter |s| newtype Integral i => BIndex s i = BIndex i unbi (BIndex i) = i -- the value of the type |BIndexL s i| denotes an index value that -- is assuredly { low <= i } where low is the lower bound -- represented by the type parameter |s| newtype Integral i => BIndexL s i = BIndexL i -- the value of the type |BIndexH s i| denotes an index value that -- is assuredly { i <= high } where high is the upper bound -- represented by the type parameter |s| newtype Integral i => BIndexH s i = BIndexH i {- We must re-iterate that safety depends on assurances of the code that constructs BIndex values. Because of the high assurance, we must formulate the safety properties as propositions, and prove them. Fortunately, the code below is compact and straightforward, as well as general purpose. -} {- We start with an introduction rule for BArray, which determines the run-time bounds of an array and passes the branded array and the low and upper bounds to its continuation. The function brand has a higher-rank type. It is the quantification of 's' as well as the absence of the BArray constructor elsewhere guarantee that the same brand entails the same bounds. -} brand:: (Ix i, Integral i) => Array i e -> (forall s. (BArray s i e, BIndexL s i, BIndexH s i) -> w) -> w brand (a::Array i e) k = let (l,h) = bounds a in k ((BArray a)::BArray () i e, BIndexL l, BIndexH h) {- Proposition: For the argument (a,l,h) of the continuation k, l >= low bound of the array a, h <= upper bound of the array a. Proof: from the semantics of the function `bounds', taken here as an axiom, and the code itself. -} bmiddle:: (Integral i) => BIndex s i -> BIndex s i -> BIndex s i bmiddle (BIndex i1) (BIndex i2) = BIndex ((i1 + i2) `div` 2) {- Proposition: l <= i1 <= h, l <= i2 <= h |- l <= (i1 + i2) `div` 2 <= h Proof: plain arithmetics. We should stress that the type of bmiddle assures that all indices involved have the same brand -- that is, the same lower and upper boundaries. A brand 's' is a (type-level) representation of index bounds. At compile time, we don't know what they are, but the unforgeability of 's' statically guarantees that the same 's' represents the same bounds. -} -- Index comparison: -- Compare BIndexL with BIndexH. If the first is less or equal -- than the second, invoke the onle continuation, passing the indices -- converted to BIndex. -- Safety proposition: -- {i:int | low <= i}, {j:int | j <= high}, i <= j -- ===> {i,j:int | low <= i,j <= high}. See the definition of BIndex then. index_cmp :: Integral i => BIndexL s i -> BIndexH s i -> w -> (BIndex s i -> BIndex s i -> w) -> w index_cmp (BIndexL i) (BIndexH j) onother onle = if i <= j then onle (BIndex i) (BIndex j) else onother -- Index arithmetics -- Safety proposition: -- {i:int | low <= i <= high} -- ===> {i:int | low <= i+1} . See the definition of BIndexL then. -- The safety proposition justifies our use of the data constructor -- BIndexL. bsucc:: Integral i => BIndex s i -> BIndexL s i bsucc (BIndex i) = (BIndexL (succ i)) -- Safety proposition: -- {i:int | low <= i <= high} -- ===> {i:int | i-1 <= high} . See the definition of BIndexH then. -- The safety proposition justifies our use of the data constructor -- BIndexH. bpred:: Integral i => BIndex s i -> BIndexH s i bpred (BIndex i) = (BIndexH (pred i)) -- Because a branded index is assuredly within the bounds of the array of -- the same brand, we can write the following. Actually, we may _safely_ -- replace a ! i with `unsafeAt a i' infixl 5 !. (!.):: (Ix i) => BArray s i e -> (BIndex s i) -> e (BArray a) !. (BIndex i) = a ! i