previous   next   up   top




Inverse typechecking and theorem proving in intuitionistic and classical logics

A logic programming system with first-class relations and the complete decision procedure (e.g., Kanren) can define the pure Hindley-Milner typechecking relation for a language with polymorphic let, sums and products. The typechecking relation relates a term and its type: given a term we obtain its type. It can also work in reverse: given a type, we can obtain terms that have this type. Or, we can give a term with blanks and a type with blanks, and ask the relation to fill in the blanks.

The code below implements this approach. We use Scheme notation for the source language (we could just as well supported ML or Haskell-like notations). The notation for type terms is infix, with the right-associative arrow. As an example, the end of the file type-inference.scm shows the derivation for call/cc, shift and reset from their types in the continuation monad. Given the type:

     (define (cont a r) `((,a -> . ,r) -> . ,r))
     (((a -> . ,(cont 'b 'r)) -> . ,(cont 'b 'b)) -> . ,(cont 'a 'b))
within 2 milli-seconds, we obtain the term for shift:
     (lambda (_.0) (lambda (_.1)
                    ((_.0 (lambda (_.2) (lambda (_.3) (_.3 (_.1 _.2)))))
                      (lambda (_.4) _.4))))

From Curry-Howard correspondence, determining a term for a type is tantamount to proving a theorem -- in intuitionistic logic as far as our language is concerned. We formulate the proposition in types, for example:

     (define (neg x) `(,x -> . F))
     (,(neg '(a * b)) -> . ,(neg (neg `(,(neg 'a) + ,(neg 'b)))))
This is one direction of the deMorgan law. In intuitionistic logic, deMorgan law is more involved:
     NOT (A & B) == NOTNOT (NOT A | NOT B)
The system gives us the corresponding term, the proof:
     (lambda (_.0)
        (lambda (_.1)
          (_.1 (inl (lambda (_.2)
                      (_.1 (inr (lambda (_.3) (_.0 (cons _.2 _.3))))))))))

The de-typechecker can also prove theorems in classical logic, via the double-negation (aka CPS) translation. We formulate a proposition:

     (neg (neg `(,(neg 'a) + ,(neg (neg 'a)))))

and, within 403 ms, obtain its proof:

     (lambda (_.0) (_.0 (inr (lambda (_.1) (_.0 (inl _.1))))))
The proposition is the statement of the Law of Excluded Middle, in the double-negation translation.

Programming languages can help in the study of logic.

The current version is 1.2, Jan 26, 2006.
type-inference.scm [19K]
Kanren code for the pure Hindley-Milner type inference relation relating a term in the lambda-calculus with fixpoint, polymorphic let, sums and products -- and its type. The code contains many comments that explain the notation, the incremental composition of the typechecking relation and taking the advantage of the first-class status of Kanren relations for typechecking of polymorphic let.

logic.scm [9K]
Kanren code that illustrates the application of the inverse typechecker to proving theorems in intuitionistic and classical logics.

The Kanren project < >

Reversing Haskell typechecker

Deriving a program from its incomplete specification such as type using constraints, background knowledge, heuristics or training set is the subject of inductive programming
< >


Type checking as small-step abstract evaluation

We expound a view of type checking as evaluation with `abstract values'. Whereas dynamic semantics, evaluation, deals with (dynamic) values like 0, 1, etc., static semantics, type-checking, deals with approximations like int. A type system is sound if it correctly approximates the dynamic behavior and predicts its outcome: if the static semantics predicts that a term has the type int, the dynamic evaluation of the term, if it terminates, will yield an integer.

Conventional type-checking is big-step evaluation in the abstract: to find a type of an expression, we fully `evaluate' its immediate subterms to their types. We propose a different kind of type checking that is small-step evaluation in the abstract: it unzips (pun intended: cf. Huet's zipper) an expression into a context and a redex. The type-checking algorithms in the paper are implemented in Twelf; the complete code is available. In particular, the well-commented file lfix-calc.elf implements, by way of introduction and comparison, small-step type-checking for simply typed lambda calculus.

The benefit of small-step typechecking is that, by `following' the control flow, it is more suitable for effectful computations. A more interesting result comes from the fact that static evaluation, unlike the dynamic one, goes `under lambda'. Thus the small-step static evaluation context is a binding context.

Colon becomes the new turnstile! Since the evaluation context is neither commutative or associative but is subject to focusing rules, we obtain the framework for expressing various substructural and modal logics.

Joint work with Chung-chieh Shan.

The current version is 1.1, June 2007.
A Substructural Type System for Delimited Continuations
This paper introduced small-step typechecking when designing the first sound type system for dynamic delimited continuations.

small-step-typechecking.tar.gz [10K]
Commented Twelf code with small small-step typechecking algorithms and many sample derivations.

Abstract interpretation in the denotational context:
Patrick Cousot, Types as abstract interpretations. POPL 1997, 316-331.

Abstract interpretation as an aid during program development of logical programs:
Manuel Hermenegildo, Germa'n Puebla, and Francisco Bueno: Using global analysis, partial specifications, and an extensible assertion language for program validation and debugging.
In: The logic programming paradigm: A 25-year perspective, ed. Krzysztof R. Apt, Victor W. Marek, Mirek Truszczynski, and David S. Warren, 161-192, 1999. Berlin: Springer-Verlag.

Interpreting types as abstract values: A tutorial on Hindley-Milner type inference


The challenge of first-class memory

First-class memory (or, first-class store) is a program value that represents a region of mutable memory. In other words, it is a collection of mutable cells -- with the interface to allocate new cells, get and update the contents of a cell. First-class memory helps implement versioned arrays, replay debugging and nested transactions (Morrisett, 1993), among others. It is particularly useful in non-deterministic and probabilistic programming. A non-deterministic choice is in many ways akin to Unix fork(2), splitting the current computational process into several, one per choice alternative. Each process has its own memory, whose contents is inherited from the parent but can be mutated without affecting the other processes -- thus ensuring that independent non-deterministic choices are indeed evaluated independently. As in Unix, making copies of process memory is best done lazily, as `copy-on-write'. To manage process memory and copy-on-write, the Unix kernel maintains a special data structure. First-class memory is like such data structure, letting an ordinary programmer do the kernel-like memory management.

First-class memory has at least the following interface (described below in Haskell):

     type Memory       -- the abstract type of first-class memory
     empty :: Memory        -- allocate new, initially empty memory
     size  :: Memory -> Int -- get the current memory size (the number of allocated cells)
     type Ref a        -- the abstract reference to a memory cell holding a value of type a
     new  :: a -> Memory -> (Ref a, Memory)
     get  :: Ref a -> Memory -> a
     set  :: Ref a -> a -> Memory -> Memory
The interface is similar to the familiar Data.STRef interface, with newSTRef, readSTRef, writeSTRef operations. However, our new, get and set explicitly use the Memory in which the cell is (to be) located.

The interface comes with the performance requirement: the execution time of get and set (or, more precisely, the number of accesses to the real computer memory) should not exceed a small constant. In other words, the implementation of Memory should at least be as efficient as the Data.IntMap trie.

The challenge is implementing the interface efficiently -- and, at the same time, statically guaranteeing some degree of correctness. The correct implementation should satisfy the familiar laws:

     let (r,m') = new x m in get r m'  === x           for all x.
     let m' = set r' x m in get r m'   === get r m     if r /= r', r' exists
     let m' = set r x m  in get r m'   === x           if r exists
     let m' = set r x m in set r y m'  === set r y m
That is: we get what we store; storing in a memory cell overrides its previous value without affecting the other cells. Ensuring these laws in types seems to require too complicated type system to be practical. Still, we would like the types at the very least guarantee an approximation of these laws, where === is understood as equating all values of the same type. In other words, we want to ensure that get-ting and set-ting of existing references are total and type-preserving. We want to use types so that the correctness guarantees are clear even to the compiler, which would then optimize out redundant checks. Implementing practical first-class memory with such typing guarantees is still an open challenge.

Let's consider a straightforward non-solution. Recall, Memory should store values of arbitrary types. The first impulse is to represent memory cells as Dynamic, `dynamically-typed' values. Hence the implementation

     type Memory = IntMap Int Dynamic
     type Ref a  = Int
a finite map from locations (Int) to values of some type. Alas, the operation fromDynamic :: Dynamic -> Maybe a to read from this Dynamic reference cell is partial. Dynamic stores a value and the representation of its type; fromDynamic checks that the (representation of the) requested result type a matches the type (representation) of the stored value. Trying to extract from the Dynamic a value of a different type than was stored returns Nothing. In the correctly implemented Memory, get for an existing reference is total and type-preserving. Therefore, the type-matching check of fromDynamic should always succeed and hence redundant. The Dynamic representation is unsatisfactory because it imposes a run-time check that should never fail. Again, the challenge is to ensure correctness statically and avoid run-time assertions.

Besides the performance, Dynamic is unsatisfactory theoretically. A Dynamic stores a value and a representation of its type. On the other hand, the memory address -- the location of a memory cell -- uniquely identifying a cell also uniquely describes the type of the stored value. That is, within given Memory, the location is itself a type representation: location equality implies type equality. Therefore, the type representation used in Dynamic is redundant.

The pragmatic way to avoid redundant checks is to use unsafeCoerce. Although helping performance, it forces us to rely on pen-and-paper correctness proofs. We better make no mistakes in the proof, or else risk a segmentation fault.

Let's recall the outline of the pen-and-paper correctness argument, to see why it is so hard to express in types. First, the type of reference cells Ref a includes the type of the stored value. The type signatures of new, get and set show that once a reference r is created to hold the value of some type a, get r and set r can only read and write values of that type a. To complete the proof we have to check that the reference r created by new x m is `fresh' -- it names a cell that is different from any cell in the parent memory m -- and get r m' and set r y m' really access that very cell r. The correctness of get and set is hence tied up to the third function, new: that new really creates fresh references. Expressing and using this freshness property in types is the real challenge.

J. Gregory Morrisett: Refining First-Class Stores
ACM SIGPLAN Workshop on State in Programming Languages, 1993

Memory.hs [4K]
First-class memory module used in the probabilistic programming language Hakaru10.
In this implementation the reference Ref acts as a cache for the stored value.

Probabilistic programming using first-class stores and first-class continuations


How to reify fresh type variables?

To type-check a use of rank-2 polymorphism, a type checker generates (gensyms) a fresh type variable and watches for its scope, following the proof rule for universal introduction. Due to this implementation, rank-2 polymorphism has been used (some would say abused) to express static capabilities and assure a wide variety of safety properties statically, including:

In each of these applications, the fresh types generated by the type checker seems to reflect fresh values generated by the program at run time. For example, when generating code as a first-order data structure, we need gensym not only at the type level to ensure lexical scope but also at the value level to make unique concrete variable names. Can dependent types expose and eliminate this reflection for the superfluous encoding that it is? In other words, can our code for generating fresh values be reused during type checking to assure their safety properties more directly? In particular, whereas generated types are only implicitly fresh and are equipped with no operations, generated values are explicitly fresh and are equipped with operations such as comparison. Can we use generated variable names to automate weakening?

Joint work with Chung-chieh Shan.

Chung-chieh Shan: talk at the Shonan meeting on dependently typed programming. Shonan Village Center, Japan. September 15, 2011.
< >

Lightweight Dependent-type Programming
Lightweight monadic regions
A few applications of fresh type variables, not counting the ST monad

Last updated November 4, 2017

This site's top page is
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML