previous   next   up   top

Region-based resource management

 


 

Introduction

Monadic Regions is a technique for managing resources such as memory areas, file handles, database connections, etc. It was introduced by Tofte and Talpin as a memory allocation technique, and is implemented in ML-Kit and Cyclone. A region is an area of memory holding heap allocated data (reference cells). Regions may nest and so more than one region may be active at any given point. A new reference cell may only be allocated in an active region, and may then only be used while that region is active. The system statically guarantees that no cell can be accessed when its region is closed. Therefore, all cells in a region can be immediately garbage collected when the region is closed.

Regions offer an attractive alternative to both manual allocation of resources and garbage collection. Unlike the latter, region-based resource management makes resource disposal and finalization predictable. We can precisely identify the program points where allocated resources are freed and finalization actions are run. Like other automatic resource managements schemes, regions statically assure us that no resource is used after it is disposed of, no resource is freed twice, and all resources are eventually deallocated.

Monadic Regions can manage resources other than memory, as explained by Brandon Moore: ``I'm assuming you understand how the type on runST and the STRef operations ensure that, even though you can smuggle out an STRef in the result from runST, you will never be able to use it again. The idea was to do the equivalent thing with databases: use fancy types to ensure that handle can only be used inside to origination withDB or withCursor or whatever, and the bracketing function can release the resource on the way out, without worrying about it being used again.'' In a follow-up, Benjamin Franksen concurred: ``I think this is an extremely good idea. I have been very frustrated with finalizers because of their limitations (can't rely on them being called at all), so have (reluctantly) been using the unsafe bracket version. Making it safe via a type system trick is really the way to go.''

The paper by Fluet and Morrisett describes monadic regions, their calculus, and the type- and meaning-preserving translation to a variant of System F. The authors mention as future work providing a RGN monad for Haskell programmers. We show exactly this monad in Section Heavy-weight implementation of region calculus , which however relies on heavy type-class trickery. A simpler, and still complete, solution exists, especially suitable for IO regions and ensuring the validity of file handles: Lightweight monadic regions .

With some limitations, File and Database IO regions can be implemented already in Haskell98 with a common extension for rank-2 types (existentials). This technique has indeed been used in Takusen, to provide precisely the assurances that Brandon Moore wanted. One may hope this assured approach to File IO would be more widely used.

References
Matthew Fluet, J. Gregory Morrisett: Monadic regions. ICFP 2004, pp. 103-114

Brandon Moore, Benjamin Franksen: Db and STRef. Messages posted on the Haskell-Cafe mailing list, Jan 15-16 2006.

Takusen: a DBMS access library built around a left-fold enumerator

 

Simple IO regions, in Haskell98 + existentials

We show a very simple implementation of Monadic Regions for the particular case of File IO. The technique statically guarantees that neither a file handle nor any computation involving the handle can leak outside of handle's region. Therefore, the handle can be safely closed, and its resources disposed of, whenever control leaves the corresponding withFile block. Many handles can be open simultaneously; the type system enforces the proper nesting of their regions. The technique has no run-time overhead and induces no run-time errors. Unlike the other implementation of monadic regions, only the basic extension (rank-2 types, aka existentials) is needed. No new type classes are introduced. The technique underlies safe database interfaces of Takusen.

The simplicity of the implementation comes with a limitation: although an inner region can use resources allocated in outer regions, an inner region cannot allocate such outer-region resources. That is, an inner region can read from a file handle opened in an outer region, but cannot itself create such file handles. An inner region therefore is not allowed to return any file handles. We also cannot store file handles in reference cells. These limitations have been overcome in another, also simple implementation of IO regions. The latter does require type classes, however.

     test0 = withFile "/etc/motd" (const $ return True)
     reader q = do
                c1 <- qGetChar q
                c2 <- qGetChar q
                return [c1,c2]
     
     test1  = withFile "/etc/motd" reader
     test1r = runIOM test1 >>= print
Above is the first example of the region-based file IO. Instead of handles, we have hQs -- marked handles. They are created by the function withFile and used similarly to regular handles. A special mIOM monad is a newtype away from the regular IO. The phantom type parameter of the IOM monad, like the s parameter of the ST monad, marks the region.
     *IORegions98Test> :t reader
     reader :: Q mark -> IOM mark [Char]
According to its inferred type, the reader takes a marked handle and yields an IO computation with the same mark. If we try to leak the handle, we get a type error:
     test2 = runIOM (withFile "/etc/motd" (\q -> return q))
       Inferred type is less polymorphic than expected
         Quantified type variable `mark' escapes
       In the first argument of `runIOM', namely
         `(withFile "/etc/motd" (\ q -> return q))'
We certainly can perform an mIOM computation and return its result: test3 below works. An attempt to leak mIOM out of withFile may seem successful: test4' is accepted by the type checker. It cannot be run however, see test4'r; the error message betrays the implementation, which relies on the fact that only monomorphic types may be instances of Typeable and so the required instance Typeable mark cannot be created. The source code contains a number of tests, many of which have been suggested by Brandon Moore.
     test3 = withFile "/etc/motd" (\q ->  (qGetChar q)) -- OK
     
     test4' () = join $ withFile "/etc/motd" (\q ->  return (qGetChar q))
     test4'r = runIOM (test4' ())
         No instance for (Typeable mark)
           arising from use of `test4''

We may open several file handles. Their regions may nest, the handles of the parent regions can be used in any descendant region:

     -- Now we add the type signature, just to show that we can do that
     reader2 :: Q mark -> Q mark -> IOM mark String
     reader2 q1 q2 = do
                c1 <- qGetChar q1
                c2 <- qGetChar q2
                return [c1,c2]
     
     test5  = withFile "/etc/motd" (\q1 ->
                                  withFile "/etc/motd" (\q2 -> reader2 q1 q2))
     test5r = runIOM test5 >>= print

The module IORegions98E with a limited export is the kernel of trust of our implementation. The module defines

     newtype IOM marks a = IOM (IO a) -- data constructor is not exported
     newtype Q mark      = Q Handle   -- data constructor is not exported
     withFile :: Typeable a => FilePath -> (Q mark -> IOM mark a) -> IOM mark a
     qGetChar :: Q mark -> IOM mark Char
     runIOM   :: (forall mark. IOM mark a) -> IO a
We wish to assure that the handle hQ never escapes from its region. The handle can escape if it is incorporated into the result of withFile. The type of that result must then mention the type of hQ. The only computation that eliminates the type hQ is qGetChar, yet the latter introduces the type mIOM. If we make sure that neither mIOM nor hQ may appear in the type of the values produced by withFile, our goal is achieved. Constraining the type of the result of withFile to be an instance of Typeable does the trick: since IOM mark a and Q mark must be fully polymorphic with respect to mark to be usable for runIOM, these types cannot be instances of Typeable, because of the polymorphism. The handle may also escape its region if it is stored in a reference cell. The module IORegions98E does not however provide any mIOM computation for reference cells. And if it had, the type of the storable values would have been similarly constrained to be Typeable. Since we are assured that no computation with the marked handle can occur after we leave withFile, the implementation of withFile may safely close the handle at that point. The module IORegions98E should not normally be touched by the end user. Still, there may be a need for computations with hQ other than qGetChar. Since the functions within the module have unrestricted access to unmarked handles, special care is required when adding more definitions to the trusted kernel. The programmer must not leak unmarked handles and must not close handles. To force closing a handle, one should throw an exception.
Version
The current version is 1.3, Jan 26, 2006.
References
IORegions98E.hs [3K]
The (very simple) implementation, with many comments.

IORegions98Test.hs [3K]
IORegions98Test1.hs [2K]
extensive test code, including all the tests by Brandon Moore.

 

Lightweight monadic regions

[The Abstract of the paper]
We present Haskell libraries that statically ensure the safe use of resources such as file handles. We statically prevent accessing an already closed handle or forgetting to close it. The libraries can be trivially extended to other resources such as database connections and graphic contexts.

Because file handles and similar resources are scarce, we want to not just assure their safe use but further deallocate them soon after they are no longer needed. Relying on Fluet and Morrisett's calculus of nested regions, we contribute a novel, improved, and extended implementation of the calculus in Haskell, with file handles as resources.

Our library supports region polymorphism and implicit region subtyping, along with higher-order functions, mutable state, recursion, and run-time exceptions. A program may allocate arbitrarily many resources and dispose of them in any order, not necessarily LIFO. Region annotations are part of an expression's inferred type. Our new Haskell encoding of monadic regions as monad transformers needs no witness terms. It assures timely deallocation even when resources have markedly different lifetimes and the identity of the longest-living resource is determined only dynamically.

For contrast, we also implement a Haskell library for manual resource management, where deallocation is explicit and safety is assured by a form of linear types. We implement the linear typing in Haskell with the help of phantom types and a parameterized monad to statically track the type-state of resources.

Joint work with Chung-chieh Shan.

Version
The current version is 1.2, June 2008.
References
region-io.pdf [179K]
The paper published in the Proceedings of 2008 Haskell Symposium, pp. 1-12.

Chung-chieh Shan. Slides of the talk at the Haskell Symposium, September 25, 2008. Victoria, Canada.
< http://www.cs.rutgers.edu/~ccshan/capability/region-io-talk.pdf >

SafeHandles1.hs [6K]
SafeHandles1Test.hs [3K]
Safe file IO in a single region: the library and test code for Section 2 of the paper, the warm-up. We satisfy all requirements but the timeliness: all open handles remain open until the whole single region is finished; the longest-living safe handle prevents closing of all the others. This code essentially implements the ST s monad, only with safe handles instead of references. We introduce naive regions, which cannot share handles. We improve upon them later.

SafeHandlesFM.hs [7K]
SafeHandlesFMTest.hs [8K]
Nested regions using explicit witness terms: the library and test code for Section 3 of the paper. This is the implementation of FRGN by Fluet and Morrisett (ICFP04), adjusted for the application domain: IO and Handles rather than memory allocation and pointers. The implementation relies on term-level witnesses of region inclusion (subtyping).

SafeHandles.hs [12K]
SafeHandlesTest.hs [10K]
Nested regions as monad transformers: the library and test code for Sections 4 and 5 of the paper. This is the final solution: lightweight monadic regions with type-level--only enforcement of the region discipline.

multi-handle-io0.hs [12K]
multi-handle-io.hs [13K]
Alternative: tracking type-state in a parameterized monad, Section 6 of the paper. The more general solution in multi-handle-io.hs supports recursive computations in the parameterized monad. Surprisingly, that requires regions.

 

Heavy-weight implementation of region calculus

We show a heavy-weight typeclass-based implementation of Monadic Regions. Fluet and Morrisett mention as future work providing a RGN monad for Haskell programmers. The code below does exactly that.

The code gives the first example of the total type comparison predicate, which can handle even non-ground types and quantified type variables.

Version
The current version is 1.1, Nov 9, 2004.
References
MonadicRegions.hs [6K]
The complete implementation of the RGN monad and the tests

Philippa Cowderoy: Monad.Reader IssueFour 2005-06-27. The ImpureThoughts column. Section ``Policing the State''

 

Tracking dynamic values in types

How to statically track resources which are acquired and released depending on a dynamic condition? Taking a lock as a sample resource, we wish to statically ensure lock safety: no held lock may be acquired, no free lock may be freed, all locks must be free by the end of the computation. The simplest case, of one unconditionally acquired and released lock, is easily solved with a parameterized (generalized) `monad', tracking the so-called type-state. The extension to multiple locks is straightforward. What if the acquisition of a lock depends on the value read from the standard input? How to statically ensure that the held, and only the held lock is released if we cannot statically know which lock will be held? The generalized monad helps here as well, combined with the `branding' of lightweight static capabilities.

The question of statically tracking locks whose acquisition and release depends on a statically unknown value was posed by Naoki Kobayashi. The present code was written as a response. For clarity, the code uses two locks and a toy version of HList. The introductory locking program looks as follows:

     tlock2 = do
       l <- lget
       lock lock1
       x <- return (read l)
       lput (show (x+1))
We read a line from the standard input, lock lock1, parse the line and print something. The inferred type LIO q (Insert (Lock N1) q) () tells that tlock2 finishes with the lock N1 still held. To be precise, the state of locks at the end of tlock2 differs from the initial state q by the addition of Lock N1. Here LIO p q a is a parameterized monad, carrying the initial p and the final q type states, along with the result type a of the computation. The RebindableSyntax extension of GHC lets us use the convenient do-notation. Attempting to run the computation, runLIO tlock2, gives a type error: the type checker complaints that Lock N1 is not released.

The second example shows the conditional lock manipulations. Depending on user input, one of two locks is acquired; one of two locks is later released. We statically ensure that the acquired lock is released -- although we cannot statically tell which lock that will be.

     rtcl3 = runLIO $ do
       l <- lget
       brand_bool ((read l) > (0::Int)) (\b bnot -> do
         clock b lock1
         clock bnot lock2
         lput l
         cunlock b lock1
         cunlock bnot lock2)
We read a number and check if it is is positive. We acquire lock1 if it is, and lock2 if it is not. The locks are released on the opposite condition, bnot vs. b. The function brand_bool produces the unique set of two complementary `keys'. A lock locked with one key must be unlocked with the other. We may lock with both keys, making the lock held unconditionally. When we run the code, we see that depending on the input value, either lock N1 is acquired and released, or lock N2. The lock safety holds in either case. If we forget a cunlock statement, predicate the release on a wrong condition, or try to release a lock unconditionally, the type checker will complain that it is unable to prove lock safety. The safety indeed would be violated in some executions of the code.

The example illustrates enforcing sophisticated resource usage constraints in a mature, well-supported language, relying on the existing type system and type inferencing. All types have been inferred. The code used a particular form of constraint-based typing (type families) and a particular way of solving them. Type function applications are reduced when their arguments become sufficiently instantiated so that a definitional case may be selected.

One is reminded of the adage that the dependent-type checker cannot test for the value of dynamic input. The type checker can assure that the programmer does not forget such a test.

Version
The current version is (original) April 2008; (type families) August 2011.
References
Locks.hs [17K]
The complete, commented code, with the toy version of HList and many tests. The first, warm-up example illustrates acquiring several locks at once and releasing them in any order, so long as lock safety holds.

Fun with type functions
In ``Reflections on the work of C. A. R. Hoare.'' ed. A. W. Roscoe, Cliff B. Jones, and Ken Wood. Springer, 2010. pp. 303-333.
Joint work with Simon Peyton Jones and Chung-chieh Shan.
< http://research.microsoft.com/en-us/um/people/simonpj/papers/assoc-types/#fun >
The paper explains a simple version of the lock safety problem.

Variable (type)state `monad'

Lightweight static capabilities



Last updated June 17, 2013

This site's top page is http://okmij.org/ftp/

oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML