Or better yet, has there been any work in programming languages which involves mostly performing algebra on types? It seems like you could probably form a complete theory of computation based on type theory, and from there create a language which uses these ideas. I'm just curious if anything like this has been done.

This web page shows a simple type system that affirmatively answers this question. The type system implements Peano arithmetic, slightly generalized to negative numbers. Certain types in this system denote numerals. Arithmetic operations on such type-numerals -- addition, subtraction, and even division -- are expressed as type reduction rules executed by the compiler. A remarkable trait is that division by zero becomes a type error -- and is reported as such by the compiler.
A different implementation of unary type arithmetic, with addition, multiplication and exponentiation being invertible 3-place relations where any two operands determine the third -- and which supports even the logarithm -- is available as a Haskell type-class program.
A related web page on Number-parameterized types demonstrates decimal and binary type arithmetic and their application to the static typechecking of array concatenation and to ensuring the safety of low-level, device-driver-like code.
This paper was originally submitted as a small observation to the eternal discussion on the relative merits of static and dynamic type systems. The paper aimed to show what an ultimate type system looks like and how it converges to a pure dynamic type system. The type system in question is indeed ultimate as every type (set) contains only one value. In fact, a type is a unary encoding of the corresponding value.
The implementation of this idea required a powerful type system: every type system that is capable of expressing arithmetic on types must be undecidable. Usually that is not what language designers want. Nevertheless, there are languages with such powerful type systems; one of them is rather well-known: C++. It is the standard C++ compiler itself, rather than custom C++ code, that manipulates terms representing arithmetic types and performs type derivation. The C++ compiler acts as a term comprehension and rewriting engine -- an interpreter of untyped lambda-calculus over type identifiers. The present paper may therefore serve as a curious example of the ability of a C++ compiler to act as a term-rewriting engine and to support powerful dependent type systems.
The lambda-interp.cc code implements a type system which assigns a distinct type to every number. Inference rules let the compiler determine the types of the results of arithmetic operations and expressions. The ultimate specialization of types helps the compiler catch even the most subtle errors. In particular, an expression whose evaluation eventually includes a division by zero will be flagged as mistyped -- before an executable is built, let alone run. We rely extensively on partial template specialization, a standard C++ feature that offers term comprehension and resolution of polymorphic types. The code in this article was compiled with the egcs-2.91.66 and g++ 2.95.2 compilers, on FreeBSD and Linux.
class Zero { ... };
template<class T> class Succ { ... };
template<class T> class Neg  { ... };

template<typename T1, typename T2> struct Add {        // Traits class
};

// (a+1) + (b+1) ==> a + (b + 1 + 1)
template<typename T1, typename T2>
struct Add<Succ<T1>, Succ<T2> > {
  typedef typename Add<T1, Succ<Succ<T2> > >::sum sum;
};

// 0 + a ==> a
template<typename T>
struct Add<Zero, T> {
  typedef T sum;
};
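Other operations can be defined in exactly the same style. The following is a minimal sketch -- an illustration assumed here, not a fragment of lambda-interp.cc -- of a multiplication operator Mul<> for non-negative type-numerals, built solely on the two Add<> rules above:

// (Hypothetical sketch, not from lambda-interp.cc.)
// Multiplication of non-negative type-numerals by the usual recursion:
//   0 * b ==> 0,   (a+1) * b ==> a*b + b
template<typename T1, typename T2> struct Mul {         // Traits class
};

template<typename T> struct Mul<Zero, T> {              // 0 * b ==> 0
  typedef Zero product;
};

template<typename T1, typename T2> struct Mul<Succ<T1>, T2> {
  // (a+1) * b ==> a*b + b; the sum is reduced by the Add<> rules above
  typedef typename Add<typename Mul<T1, T2>::product, T2>::sum product;
};

For instance, Mul<Succ<Succ<Zero> >, Succ<Succ<Succ<Zero> > > >::product reduces to the type denoting 6.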
We can even define division of types, Div<T1,T2>, by repeatedly subtracting the Divisor from the Dividend until the result becomes zero or a negative type.
template<typename T1, typename T2> struct Div {         // Traits class
};

template<typename Dividend, typename Divisor, typename Ratio>
struct DivEval {                                        // Traits class
};

template<typename Divisor, typename Ratio>
struct DivEval<Zero, Divisor, Ratio> {
  typedef Ratio value;
};

template<typename T, typename Divisor, typename Ratio>
struct DivEval<Neg<T>, Divisor, Ratio> {
  typedef typename Add<Ratio, Neg<Succ<Zero> > >::sum value;
};

template<typename Dividend, typename Divisor, typename Ratio>
struct DivEval<Succ<Dividend>, Divisor, Ratio> {
  typedef typename Add<Succ<Dividend>, Neg<Divisor> >::sum new_dividend;
  typedef typename DivEval<new_dividend, Divisor, Succ<Ratio> >::value value;
};

template<typename T1, typename T2>
struct Div<T1, Succ<T2> > {
  typedef typename DivEval<T1, Succ<T2>, Zero>::value ratio;
};

There are no data members in these Add<>, Div<>, and DivEval<> template classes. There is no conventional number arithmetic either; there is only term rewriting, which amounts to arithmetic on types. The C++ compiler acts as a term decomposition/composition engine.
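The main function below uses helper functions add(), div(), neg() and a member function eval(), whose definitions are elided in the listings above (the complete code is in lambda-interp.cc). A minimal sketch of how they might look -- an assumption for illustration, not the exact code:

// (Sketch; the actual definitions are in lambda-interp.cc.)
// eval() reconstructs the ordinary integer denoted by a type-numeral.
class Zero {
public:
  int eval() const { return 0; }
};
template<class T> class Succ {
public:
  int eval() const { return T().eval() + 1; }
};
template<class T> class Neg {
public:
  int eval() const { return -T().eval(); }
};

// add(), div() and neg() do no run-time work: they merely map the types
// of their arguments to the type of the result.
template<typename T1, typename T2>
typename Add<T1, T2>::sum add(T1, T2) { return typename Add<T1, T2>::sum(); }

template<typename T1, typename T2>
typename Div<T1, T2>::ratio div(T1, T2) { return typename Div<T1, T2>::ratio(); }

template<typename T>
Neg<T> neg(T) { return Neg<T>(); }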
Here is the main function
#define SHOWVAL(X) printf("The value of " #X " is: %d\n", X .eval())

int main(void)
{
  Succ<Succ<Zero> > two;
  Succ<Zero> one;
  SHOWVAL( two );
  SHOWVAL( add(neg(one),neg(two)) );
  SHOWVAL( add(two,neg(two)) );
  SHOWVAL( div(one,two) );
  SHOWVAL( div(two,one) );
  SHOWVAL( div(add(one,add(two,two)),add(div(two,two),one)) );
}

and the result of its execution:
The value of two is: 2
The value of add(neg(one),neg(two)) is: -3
The value of add(two,neg(two)) is: 0
The value of div(one,two) is: 0
The value of div(two,one) is: 2
The value of div(add(one,add(two,two)),add(div(two,two),one)) is: 2

It is instructive to take a look at the gas assembly code generated, for example, for the last expression:
pushl $2        ; the value to print
pushl $.LC10    ; the format string
call printf

The result of the expression -- the number 2 -- has been determined during compilation. The executable code merely prints out that result.
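In fact, no executable is needed at all to observe a computed result. A common metaprogramming trick -- an aside assumed here, not part of the article's code -- is to provoke a compile error whose message spells out the fully reduced type:

// (Hypothetical aside, not from lambda-interp.cc.)
// Show<> is declared but never defined; trying to define a variable of
// that type fails, and the compiler's error message displays the fully
// reduced type argument.
template<typename T> struct Show;

Show<Div<Succ<Succ<Zero> >, Succ<Zero> >::ratio> force_error;
// g++ rejects this declaration with a message mentioning
// Show<Succ<Succ<Zero> > >, i.e. the numeral 2 = 2/1.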
When you add a statement:
SHOWVAL( div(add(two,two),add(div(two,two),neg(one))) );

to the main() function above, the C++ compiler says:
lambda-interp.cc: In function `int main()':
lambda-interp.cc:217: no type named `ratio' in
   `struct Div<Succ<Succ<Succ<Succ<Zero> > > >,Zero>'

Look at the types of the operands! The compiler has figured out that the divisor in one of the div() operations has the type Zero, which makes the expression mistyped: div(two,two) reduces to the type denoting 1, so add(div(two,two),neg(one)) reduces to Zero. Division by zero is a type error!
Users can define their own functions as they wish. For example, let us compute (3*x+1)/(x-1):
// The following is a type constructor that computes (3*x+1).
// It is set up to emulate the expression:
//   let y = x + x in let z = y + x in z + 1
template<typename x> struct Foo {
  template<typename y> struct body {
    template<typename z> struct body2 {
      typedef typename Add<z, Succ<Zero> >::sum result;
    };
    typedef typename body2<typename Add<y, x>::sum>::result result;
  };
  typedef typename body<typename Add<x, x>::sum>::result result;
};

// The following type constructor computes Foo(x)/(x-1)
template<typename x> struct Bar {
  typedef typename Foo<x>::result y;
  typedef typename Add<x, Neg<Succ<Zero> > >::sum d;
  typedef typename Div<y, d>::ratio result;
};

// The following is a meaningless definition that pleases the compiler.
// Well, one can say it converts a "value" into the corresponding "type"...
template<typename TX>
typename Bar<TX>::result bar(TX)
{
  typename Bar<TX>::result type;
  return type;
}

Finally, the main() function below
int main(void)
{
  Succ<Succ<Zero> > two;
  Succ<Zero> one;
  SHOWVAL( bar(two) );
  SHOWVAL( bar(add(two,one)) );
}

prints out the following result:
The value of bar(two) is: 7
The value of bar(add(two,one)) is: 5

Indeed, (3*2+1)/(2-1) = 7 and (3*3+1)/(3-1) = 10/2 = 5.
User input is somewhat of a problem: interactions with the external (i.e., OS) environment are a tricky subject. Definitions of many programming languages often assume that all necessary initial values just magically "appear" when operations commence.
Still, there is something we can do about that:
// Convert a natural number to a "unary" notation: a sequence of
// Succ _type constructors_ applied to the Zero type constructor.
// This function converts a value to a "type".
// In a sense it is equivalent to a getInt function
// that is present in some languages...
template<int N> struct Peanoizer {
  typedef Succ<typename Peanoizer<N-1>::result> result;
  result type;
};

template<> struct Peanoizer<0> {
  typedef Zero result;
  result type;
};

int main(void)
{
  SHOWVAL( Peanoizer<ARG1>().type );
  SHOWVAL( bar( Peanoizer<ARG1>().type ) );
}

Here is the transcript of compiling (evaluating) the above code:
g++ -DARG1=4 lambda-interp.cc -o lambda-interp
./lambda-interp
The value of Peanoizer<ARG1>().type is: 4
The value of bar( Peanoizer<ARG1>().type ) is: 4

g++ -DARG1=1 lambda-interp.cc -o lambda-interp
lambda-interp.cc: In instantiation of `Bar<Succ<Zero> >':
lambda-interp.cc:316:   instantiated from here
lambda-interp.cc:283: no type named `ratio' in
   `struct Div<Succ<Succ<Succ<Succ<Zero> > > >,Zero>'
lambda-interp.cc: In function `int main()':
lambda-interp.cc:316: no type named `result' in `struct Bar<Succ<Zero> >'
lambda-interp.cc:355: confused by earlier errors, bailing out

This basically says that the result of dividing 4 by 0 has an unknown type -- division by zero is a type error. Therefore the compiler refused to compile the code.
Delivering input terms for an interpreter to evaluate is a problem in any lambda-calculator: one has to step out of the box and introduce some sort of OS interface. In our case, the g++ compiler itself acts as the lambda-interpreter: template (i.e., polymorphic) classes and functions play the role of abstractions, template instantiations are applications, and type inference is tantamount to evaluation. Therefore all the input data have to be delivered to the compiler; the -D compiler option is one way to do that.
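To make the correspondence concrete, here is a small illustration -- assumed here for exposition, not a fragment of lambda-interp.cc:

// lambda calculus                 C++ counterpart
// ------------------------        ----------------------------------------
// abstraction: \x. succ x         a template:
template<typename X> struct F { typedef Succ<X> result; };

// application: (\x. succ x) 0     an instantiation: F<Zero>
// evaluation:  succ 0             type inference of the member type:
typedef F<Zero>::result One;       // One is the type Succ<Zero>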
The C++ formulas for type addition in a previous section appear similar to the following Haskell code:
data Integr = Zero | Succ Integr deriving Show

add:: Integr -> Integr -> Integr
add Zero y = y
add (Succ x) y = Succ (add x y)

add (Succ (Succ (Succ Zero))) (Succ (Succ Zero))
  ==> Succ (Succ (Succ (Succ (Succ Zero))))

However, in the Haskell code, Zero and Succ are data constructors. The arguments and the result of add are all of the same type, Integr. In the C++ code, Zero is both a type and a data constructor. Succ, however, is a type constructor rather than a data constructor! Add is a type operator: it takes two types and yields another type. So the C++ compiler operates on types just as the Haskell code above operates on values.
| Arithmetic of values (the Haskell code) | Arithmetic of types (the C++ code) |
|---|---|
| Zero is a nullary data constructor of type Integr | Zero is a nullary data constructor of type Zero, of kind * |
| Succ is a data constructor of type Integr -> Integr | Succ is a type constructor of kind * -> * |
| add is a function of type Integr -> Integr -> Integr | Add is a type function of kind * x * -> * |
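The kinship can also be checked directly at the type level, without any run-time eval(). The following sketch assumes a C++11-or-later compiler and <type_traits>; it is not part of the original lambda-interp.cc:

// (Assumes C++11 and <type_traits>; a hypothetical check, not from the article.)
#include <type_traits>

typedef Succ<Succ<Zero> > Two;
typedef Succ<Succ<Succ<Zero> > > Three;
typedef Succ<Succ<Succ<Succ<Succ<Zero> > > > > Five;

// The type function Add is "evaluated" by the compiler alone:
static_assert(std::is_same<Add<Two, Three>::sum, Five>::value,
              "the type-level sum 2 + 3 must reduce to the type denoting 5");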
It is interesting to note how C++ arithmetic types are similar to Church
numerals. The latter represent integers as repeated applications of a
function. C++ arithmetic types represent integers as repeated
applications of a type constructor.
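A small sketch of the analogy -- assumed here for illustration, not part of lambda-interp.cc: where a Church numeral applies a function n times, a type-numeral can drive the n-fold application of an arbitrary unary type constructor:

// (Hypothetical illustration, not from lambda-interp.cc.)
// ApplyN<N, F, X> applies the unary type constructor F to X as many times
// as the type-numeral N prescribes -- the type-level analogue of a Church
// numeral \f. \x. f (f ... (f x)).
template<typename N, template<class> class F, typename X> struct ApplyN;

template<template<class> class F, typename X>
struct ApplyN<Zero, F, X> {
  typedef X result;                                    // zero applications
};

template<typename N, template<class> class F, typename X>
struct ApplyN<Succ<N>, F, X> {
  typedef F<typename ApplyN<N, F, X>::result> result;  // one more application
};

// For example, ApplyN<Succ<Succ<Zero> >, Succ, Zero>::result is
// Succ<Succ<Zero> >: applying Succ "two" times to Zero yields 2.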
One of the paradoxical issues in the recurring debate about types is Robert Harper's thesis that dynamic typing is a particular case of static typing. This article has attempted to show the other extreme -- the ultimately precise static typing. Curiously, at this extreme static and dynamic typing converge again. This article presents an example of a type system which lets a compiler detect a latent division by zero in an expression and flag it as a type error.
On one end are dynamically typed languages. Their (byte) compiler is simple and fast; it cheerfully accepts almost any garbage we throw at it. The run-time of these languages is inherently slower, as every operation must first run a battery of tests on its operands: that the referenced variables are indeed defined, that the operands of a div operation are indeed numbers, and, furthermore, that the divisor is not zero. If a branch of code was not executed in a particular run, the errors in that branch, however grave or obvious, remain unreported.
The ultimately statically-typed "language" outlined above achieves the other extreme. The run-time is now fast and trivial, as it only prints the results that have already been determined by the compiler. Compilation, on the other hand, is highly involved. For example, to make sure that no divisor in any division operation is zero, the compiler basically has to evaluate the divisor expressions -- in effect, compute them at compile time. In general, such an evaluation may be arbitrarily complex and may even fail to terminate.
Thus extreme static typing looks identical to dynamic typing, only with the compile-time and run-time phases reversed. The optimal approach seems to be to do some computation at compile time (e.g., making sure that all referenced variables are declared, or that the operands of any numerical operation are indeed numbers) and to leave the more tenuous checks and the rest of the computation to run time. As we know, the middle ground is the most controversial position.
lambda-interp.cc [.cc, 10K]
The self-contained source code, which contains the inference rules of the type-rewriting system and shows how they work.
Functional Style in C++: Closures, Late Binding, and Lambda Abstractions
Number-parameterized types and decimal type arithmetics
Class-parameterized classes, and the type-level logarithm
The present page is a compilation of two articles posted on comp.lang.functional and comp.lang.c++.moderated on Oct 3, 1999 and Jul 21, 2000.
oleg-at-okmij.org