Type Arithmetic

Computation based on the theory of types

In an article "Re: What type is map map and how do I prove it?" posted on comp.lang.functional on July 21, 2000, Phil Quirk wrote:
Or better yet, has there been any work in programming languages which involve mostly performing algebra on types? It seems like you could probably form a complete theory of computation based on type theory, and from there create a language which uses these ideas. I'm just curious if anything like this has been done.
This web page shows a simple type system that affirmatively answers this question. This type system implements Peano arithmetics, slightly generalized to negative numbers. Certain types in this system denote numerals. Arithmetic operations on such types-numerals -- addition, subtraction, and even division -- are expressed as type reduction rules executed by a compiler. A remarkable trait is that division by zero becomes a type error -- and reported as such by a compiler.

A different implementation of unary type arithmetic, with addition, multiplication and exponentiation being invertible 3-place relations where every two operands determine the third -- and which supports even the logarithm -- is available as a Haskell type (class) program.

A related web page on Number-parameterized types demonstrates decimal and binary type arithmetics and their application to static typechecking of an array concatenation and ensuring safety of low-level, device-driver-like code.

This paper was originally submitted as a small observation to the eternal discussion on relative merits of static and dynamic type systems. The paper aimed to show what an ultimate type system looks like and how it converges to a pure dynamic type system. The type system in question is indeed ultimate as every type (set) contains only one value. In fact, a type is an unary encoding of the corresponding value.

The implementation of this idea required a powerful type system: every type system that is capable of expressing arithmetics on types must be undecidable. Usually that is not what language designers want. Nevertheless, there are languages with such powerful type systems; one of them is rather well-known: C++. It is the standard C++ compiler itself rather than a custom C++ code that manipulates terms representing arithmetic types and performs type derivation. The C++ compiler acts as a term comprehension and rewriting engine -- an interpreter of untyped lambda-calculus over type identifiers. The present paper may therefore serve as a curious example, of an ability of a C++ compiler to act as a term re-writing engine and to support powerful dependent type systems.


 

Implementation

A lambda-interp.cc code implements a type system which assigns a distinct type to every number. Inference rules let the compiler determine types of the results of arithmetic operations and expressions. The ultimate specialization of types helps the compiler catch even the most subtle errors. In particular, an expression whose evaluation eventually includes a division by zero will be flagged as mistyped -- before an executable is built, let alone run. We rely extensively on a partial template instantiation. It is a standard C++ feature that offers term comprehension and resolution of polymorphic types. The code in this article compiled with egcs-2.91.66 and g++ 2.95.2 compilers, on FreeBSD and Linux.  

Type Addition

Excerpted below are type inference rules for addition. They are specified as a set of partially-specialized template classes. These classes have no members -- they merely denote type term rewriting rules. Please refer to the source code for the complete set of rules.
     class Zero { ... };
     template<class T> class Succ {...};
     template<class T> class Neg  {...};

     template<typename T1, typename T2>
     struct Add {                    // Traits class
     };

                                // (a+1) + (b+1) ==> a + (b + 1 + 1)
     template<typename T1, typename T2>
     struct Add<Succ<T1>, Succ<T2> > {
       typedef typename Add<T1, Succ<Succ<T2> > >::sum sum;
     };

     template<typename T>
     struct Add<Zero,T> {            // 0 + a ==> a
       typedef T sum;
     };
 

Type Division

The following classes define type inference rules for division. They evaluate a type expression Div<T1,T2>, by repeatedly subtracting a Divisor from a Dividend until the result becomes zero or a negative type.
     template<typename T1, typename T2>
     struct Div {                         // Traits class
     };

     template<typename Dividend, typename Divisor, typename Ratio>
     struct DivEval {                     // Traits class
     };

     template<typename Divisor, typename Ratio>
     struct DivEval<Zero, Divisor, Ratio>  {
       typedef Ratio value;
     };

     template<typename T, typename Divisor, typename Ratio>
     struct DivEval<Neg<T>, Divisor, Ratio> {
       typedef typename Add<Ratio, Neg<Succ<Zero> > >::sum value;
     };

     template<typename Dividend, typename Divisor, typename Ratio>
     struct DivEval<Succ<Dividend>, Divisor, Ratio> {
       typedef typename Add< Succ<Dividend>, Neg<Divisor> >::sum
         new_dividend;
       typedef typename DivEval< new_dividend, Divisor, Succ<Ratio> >::value
         value;
     };

     template<typename T1, typename T2>
     struct Div<T1, Succ<T2> > {
       typedef typename DivEval<T1, Succ<T2>, Zero>::value ratio;
     };
There are no data members in these Add<>, Div<>, DivEval<> template classes. There is no conventional number arithmetics either. There is only term rewriting, which amounts to arithmetics on types. C++ compiler acts as a term decomposition/composition engine.  

Examples of evaluation

Here is the main function

     #define SHOWVAL(X) printf("The value of " #X " is: %d\n", X .eval())
     main()
     {
       Succ<Succ<Zero> > two;
       Succ<Zero> one;

       SHOWVAL( two );
       SHOWVAL( add(neg(one),neg(two)) );
       SHOWVAL( add(two,neg(two)) );
       SHOWVAL( div(one,two) );
       SHOWVAL( div(two,one) );
       SHOWVAL( div(add(one,add(two,two)),add(div(two,two),one)) );
     }
and the result of its execution:
     The value of two is: 2
     The value of add(neg(one),neg(two)) is: -3
     The value of add(two,neg(two)) is: 0
     The value of div(one,two) is: 0
     The value of div(two,one) is: 2
     The value of div(add(one,add(two,two)),add(div(two,two),one)) is: 2
It is instructive to take a look at a gas assembly code generated, for example, for the last expression:
     pushl $2                ; the value to print
     pushl $.LC10            ; the format string
     call printf
The result of the expression -- number 2 -- has been determined during compilation. The executable code is only to print out that result.

When you add a statement:

     SHOWVAL( div(add(two,two),add(div(two,two),neg(one))) );
to the main() function above, the C++ compiler says:
     lambda-interp.cc: In function `int main()':
     lambda-interp.cc:217: no type named `ratio' in 
        `struct Div<Succ<Succ<Succ<Succ<Zero> > > >,Zero>'
Look at the types of the operands! The compiler has figured out that a divisor in one of the div() operations has the type of Zero, which makes the expression mistyped. Division by zero is a type error!  

User-defined "functions"

A user can define his own functions as he wishes. For example, let's compute (3*x+1)/(x-1)

	// The following is a type constructor that computes (3*x+1)
	// It is set to emulate an expression:
	//	let y = x + x in let z = y + x in z + 1
     template<typename x> struct Foo {
       template<typename y> struct body {
         template<typename z> struct body2 {
           typedef typename Add<z,Succ<Zero> >::sum result;
         };
         typedef typename body2<typename Add<y,x>::sum>::result result;
       };
       typedef typename body<typename Add<x,x>::sum>::result result;
     };

	// The following type constructor computes Foo(x)/(x-1)
     template<typename x> struct Bar {
       typedef typename Foo<x>::result y;
       typedef typename Add<x,Neg<Succ<Zero> > >::sum d;
       typedef typename Div<y,d>::ratio result;
     };
 
        // The following is a meaningless definition that pleases
        // the compiler. Well, one can say it converts a "value" into the
        // corresponding "type"...
     template<typename TX>
     typename Bar<TX>::result bar(TX)
     {
       Bar<TX>::result type;
       return type;
     }
Finally, the main() function below
     int main (void) {
       Succ<Succ<Zero> > two;
       Succ<Zero> one;

       SHOWVAL( bar(two) );
       SHOWVAL( bar(add(two,one)) );
     }
prints out the following result:
 
     The value of bar(two) is: 7
     The value of bar(add(two,one)) is: 5
 

Input/output

This is somewhat of a problem: user input and interactions with an external (i.e., OS) environment is a tricky subject. Definitions of many programming languages often assume that all necessary initial values just magically "appear" when operations commence.

Still, there is something we can do about that:

        // Convert a natural number to a "unary" notation as a sequence of
        // Succ _type constructors_ applied to a Zero type constructor.
        // This function converts a value to a "type".
        // In a sense it is equivalent to a getInt function
        // that is present in some languages...
     template<int N> struct Peanoizer {
       typedef Succ<typename Peanoizer<N-1>::result> result;
       result type;
     };
     struct Peanoizer<0> {
       typedef Zero result;
       result type;
     };

     int main(void) {
       SHOWVAL( Peanoizer<ARG1>().type );
       SHOWVAL( bar( Peanoizer<ARG1>().type ) );
     }
Here's the transcript of compiling (evaluating) of the above code:
 
     g++ -DARG1=4 lambda-interp.cc -o lambda-interp
     ./lambda-interp
     The value of Peanoizer<ARG1>().type is: 4
     The value of bar( Peanoizer<ARG1>().type ) is: 4
 
     g++ -DARG1=1 lambda-interp.cc -o lambda-interp

     lambda-interp.cc: In instantiation of `Bar<Succ<Zero> >':
     lambda-interp.cc:316:   instantiated from here
     lambda-interp.cc:283: no type named `ratio' in 
       `struct Div<Succ<Succ<Succ<Succ<Zero> > > >,Zero>'
     lambda-interp.cc: In function `int main()':
     lambda-interp.cc:316: no type named `result' in `struct Bar<Succ<Zero> >'
     lambda-interp.cc:355: confused by earlier errors, bailing out
Which basically says the result of a division of 4 by 0 has an unknown type -- division by zero is a type error. Therefore the compiler refused to compile the code.

Delivering input terms for an interpreter to evaluate is a problem in any lambda-calculator: one has to step out of the box and introduce some sort of OS interface. In our case, the g++ compiler itself acts like a lambda-interpreter: Template (i.e., polymorphic) classes and functions are playing the role of abstractions, template instantiations are applications, and type inference is tantamount to evaluation. Therefore all the input data have to be delivered to the compiler; a -D compiler option is one way to do that.  

Type arithmetics vs. Value arithmetics

The C++ formulas for type addition in a previous section appear similar to the following Haskell code:

   data Integr = Zero | Succ Integr
        deriving Show
   add:: Integr -> Integr -> Integr
   add Zero y = y
   add (Succ x) y = Succ (add x y)
   add (Succ (Succ (Succ Zero))) (Succ (Succ Zero))
     ==> Succ (Succ (Succ (Succ (Succ Zero))))
However, in the Haskell code, Zero and Succ are data constructors. The arguments and the result of add are all of the same type, Integr. In the C++ code, Zero is a type and a data constructor. Succ however is a type, rather than a data constructor! Add is a type operator: it takes two types and yields another type. So C++ compiler operates on types just as the Haskell code above operates on values.
Arithmetics of values
the Haskell code
Arithmetics of types
the C++ code
Zero is a nullary data constructor of type Intgr Zero is a nullary data constructor of type Zero, of kind *
Succ is a data constructor of type Intgr -> Intgr Succ is a type constructor of kind * -> *
add is a function of a type Integr -> Integr -> Integr Add is a type function of kind * x * -> *

It is interesting to note how C++ arithmetic types are similar to Church numerals. The latter represent integers as repeated applications of a function. C++ arithmetic types represent integers as repeated applications of a type constructor.
   

Ultimate static type system

One of the paradoxical issues in the recurring debate about types is Robert Harper's thesis that dynamic typing is a particular case of a static typing. This article has attempted to show the other extreme -- the ultimately precise static typing. Curiously, at this extreme static and dynamic typing converge again. This article presents an example of a type system which lets a compiler detect a latent division by zero in an expression, and flag it as a type error.

On one end are dynamically typed languages. Their (byte) compiler is simple and fast; it cheerfully accepts almost every garbage we can throw at it. The run-time of these languages is inherently slower as every operation must first run a battery of tests on its operands: that the referenced variables are indeed defined, the operands of a div operation are indeed numbers, and furthermore, the divisor is not zero. If a branch of code was not executed in a particular run, the errors in this branch, however grave or obvious, remain unreported.

The ultimately statically-typed "language" outlined above achieves the other extreme. The run-time is now fast and trivial as it only prints the results that have already been determined by the compiler. Compilation on the other hand is highly involved. For example, to make sure that no divisor in all division operations is zero, the compiler basically has to evaluate the divisor expressions -- in effect compute them at compile time. In general, such an evaluation may be involved, and may even fail to terminate.

Thus the extreme static typing looks identically to dynamic typing, only with compile- and run-time phases reversed. The optimal approach seems to be to do some computation at compile time (e.g., making sure all referenced variables are declared, or that operands of any numerical operation are indeed numbers) and leave more tenuous checks and the rest of the computation to run-time. As we know, the middle ground is the most controversial position.
   

References

lambda-interp.cc [.cc, 10K]
A self-contained source code which contains inference rules of the type-rewriting system, and shows how they work.

Functional Style in C++: Closures, Late Binding, and Lambda Abstractions

Number-parameterized types and decimal type arithmetics

Class-parameterized classes, and the type-level logarithm

Binary type arithmetic
 

From other archives

comp.lang.functional newsgroup
an article Type arithmetics [was: What type is map map...] posted on Fri, 21 Jul 2000 23:00:25 GMT
comp.lang.c++.moderated newsgroup
an article C++ compiler as a term-rewriting engine posted on 04 Oct 1999 00:00:00 GMT
comp.lang.functional newsgroup
an article x/0 as a type error [Extremely static typing] posted on Sun, 03 Oct 1999 01:28:43 GMT

The present page is a compilation of two articles posted on comp.lang.functional and comp.lang.c++.moderated on Oct 3, 1999 and Jul 21, 2000.


Last updated May 4, 2007

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!