symbol?
with syntax-rulesset!
quote
-- as a macrosetf
-- a polymorphic, generic setter -- as a simple Scheme macro??!lambda
and ??!apply
lambda
as a library syntaxassert
macro with informative failure reports, as a syntax-rule or a defmacro.cond-expand
macro: an implementation of SRFI-0 as a low-level macrodefine-opt
:
A concise definition form with optional arguments and default valuesand-let*
special form: a guarded let*
form
The syntax-rule macro system of R5RS does not seem to scale beyond simple macros. It is very difficult to write macros that compose, to assemble complex macros from already written and tested components. The previous approaches to composable syntax-rules are heavy, notationally and computationally. This article presents an alternative, lightweight style of writing composable syntax-rules, based on the CK abstract machine.
We demonstrate recursive, higher-order applicative macros defined in the style that looks like that of ML or (strict) Haskell. We write composable, call-by-value--like macros without resorting to the continuation-passing-style and thus requiring no macro-level lambda. The syntax remains direct-style, with nested applications.
Syntax-rules are difficult to compose because of their evaluation
order: the arguments of a macro-invocation are not expanded. That per
se does not preclude functional composition since the normal-order
lambda-calculus or non-strict languages like Haskell do not evaluate
arguments of a function application either. However, lambda-calculus
has first-class anonymous abstractions; Haskell also has
the case
form that forces evaluation of an
expression, to the extent needed to choose among the pattern-match
clauses. Syntax-rules have none of these compensating features.
Generally, a syntax-rule cannot obtain the result of the expansion of
its argument expression. The article on Systematic Macro Programming
on this page explains the composability problem in detail. So
far, the only way out has been to effectively change the evaluation
order by writing macros in the continuation-passing style (CPS).
However, CPS code is hard to read. Furthermore, building
continuations requires the notation for first-class, preferably
anonymous syntax-rule abstractions. Although the latter are possible
to emulate, the result is stylistically ugly and computationally
expensive. Some macro expanders take the shocking amount of time and memory to
expand CPS macros with anonymous abstractions.
This project was inspired by the question posed by Dan Friedman in March 2009:
Write the macroOur answer is the transliteration of the standard Haskell code implementing the straightforward algorithm for all permutations:permute
that takes any number of arguments and returns the list of their permutations:(permute a b c) ==> ((a b c) (b a c) (b c a) (a c b) (c a b) (c b a))
The order of the entries in the list is immaterial. One should write permute without resorting to CPS.
perm :: [a] -> [[a]] perm [] = [[]] perm (h:t) = concatMap (ins h) (perm t) ins :: a -> [a] -> [[a]] ins x [] = [[x]] ins x (h:t) = (x:h:t) : map (h:) (ins x t)We shall see that our code looks pretty much like the above, only with more parentheses.
Our macros are written in a CK style.
We distinguish values, which are always quoted like
'(1 2 3)
, from general applicative expressions such
as (c-append '(1) (c-cons '2 '(3 4)))
, which may
contain nested applications. Values are regarded as results of the
CK evaluation. Here is the first example, of the
CK-style macro cons
:
(define-syntax c-cons (syntax-rules (quote) ((c-cons s 'h 't) (ck s '(h . t)))))A CK-macro always receives the extra, first argument, typically called
s
. The macro should never look at s
,
passing it to the macro ck
described below. All arguments
of a CK-macro except for s
are always values; the
macro is free to pattern-match on them. A
CK-macro always expands into the call to the macro ck
,
passing it the s
argument followed by the produced value
or by the expression that will produce the resulting value. The macro
c-cons
produces a value, which is therefore quoted.
We now demonstrate recursion and functional
composition, or nested application. We define a macro c-append
, using the just defined c-cons
.
(define-syntax c-append (syntax-rules (quote) ((c-append s '() 'l2) (ck s 'l2)) ((c-append s '(h . t) 'l2) (ck s (c-cons 'h (c-append 't 'l2)))) ))The first clause yields a value, which is quoted. The second clause yields an expression, with the nested application. The code does look like the idiomatic Haskell code
append [] l2 = l2 append (h:t) l2 = h : (append t l2)
At the heart of the CK style is the CK abstract machine, which
focuses and re-focuses an applicative expression,
relying on user-defined CK-macros for (primitive) reductions.
The machine is implemented as a syntax-rule ck
. It
operates on the stack (the first argument to all CK macros) built
out of the frames (op va ... [] ea ...)
, represented
as ((op va ...) ea ...)
in the code below.
Here op
is the name of a CK-macro to do the reduction;
zero or more va
must all be values;
zero or more ea
are arbitrary expressions.
(define-syntax ck (syntax-rules (quote) ((ck () 'v) v) ; yield the value on empty stack ((ck (((op ...) ea ...) . s) 'v) ; re-focus on the other argument, ea (ck s "arg" (op ... 'v) ea ...)) ((ck s "arg" (op va ...)) ; all arguments are evaluated, (op s va ...)) ; do the redex ((ck s "arg" (op ...) 'v ea1 ...) ; optimization when the first ea (ck s "arg" (op ... 'v) ea1 ...)) ; was already a value ((ck s "arg" (op ...) ea ea1 ...) ; focus on ea, to evaluate it (ck (((op ...) ea1 ...) . s) ea)) ((ck s (op ea ...)) ; Focus: handle an application; (ck s "arg" (op) ea ...)) ; check if args are values ))
We get the ball rolling by expanding (ck () exp)
to evaluate the CK-expression exp
on the empty initial stack.
The Scheme expression (ck () (c-append '(1 2 3) '(4 5)))
will hopefully macro-expands to (1 2 3 4 5)
,
which the Scheme expression evaluator will try to evaluate,
reporting the error since 1 is not a procedure.
To see the result of just the macro-expansion, without
any further evaluations, we should quote it:
(define-syntax c-quote (syntax-rules (quote) ((c-quote s 'x) (ck s ''x))))We run our first applicative macro obtaining the result shown in the comment underneath.
(ck () (c-quote (c-append '(1 2 3) '(4 5)))) ; ==> (1 2 3 4 5)
A higher-order map
macro takes as the
first, non-s
, argument a closure to apply to each element of
the list given in the second argument.
(define-syntax c-map (syntax-rules (quote) ((c-map s 'f '()) (ck s '())) ((c-map s '(f ...) '(h . t)) (ck s (c-cons (f ... 'h) (c-map '(f ...) 't)))) ))The test below prepends 10 to each element of the argument list.
(ck () (c-quote (c-map '(c-cons '10) '((1) (2) (3) (4))))) ; ==> ((10 1) (10 2) (10 3) (10 4))The similar
concatMap
concatenates the result of the mapping:(define-syntax c-concatMap (syntax-rules (quote) ((c-concatMap s 'f '()) (ck s '())) ((c-concatMap s '(f ...) '(h . t)) (ck s (c-append (f ... 'h) (c-concatMap '(f ...) 't)))) )) (ck () (c-quote (c-concatMap '(c-cons '10) '((1) (2) (3) (4))))) ; ==> (10 1 10 2 10 3 10 4)
We have defined enough of the syntax-rule list processing library to solve Dan Friedman's problem:
(define-syntax c-perm (syntax-rules (quote) ((c-perm s '()) (ck s '(()))) ((c-perm s '(h . t)) (ck s (c-concatMap '(c-ins 'h) (c-perm 't)))))) (define-syntax c-ins (syntax-rules (quote) ((c-ins s 'x '()) (ck s '((x)))) ((c-ins s 'x '(h . t)) (ck s (c-cons '(x h . t) (c-map '(c-cons 'h) (c-ins 'x 't))))))) ; The following macro is a syntactic sugar to invoke c-perm (define-syntax perm (syntax-rules () ((perm . args) (ck () (c-quote (c-perm 'args))))))We indeed compute all permutations of a given list:
(perm 1 2 3) ; ==> ((1 2 3) (2 1 3) (2 3 1) (1 3 2) (3 1 2) (3 2 1))
CK.scm [9K]
The complete R5RS code for the example.
The code includes further examples, the indispensable factorial
and a more practical example of deleting an element from an associative
list. The latter two examples are from the CPS Macros-talk.
It is instructive to compare the direct-style applicative macros here
with the CPS versions in that 2002 talk.
Matthias Felleisen and Daniel P. Friedman: Control operators, the SECD machine, and the lambda-calculus.
In Martin Wirsing, editor, Formal Description of Programming
Concepts III, pages 193-217. Elsevier, Amsterdam, 1986.
That paper introduced the CK machine.
The macro match-case-simple
is a simple pattern-matcher for
linear patterns. It is small and efficient.
It was developed back in 2005 for the miniKanren translation of the
leanTAP theorem prover, to transform a first-order logic formula to the
Negation Normal Form. It should work on any R5RS and R6RS Scheme system.
The macro is a much simpler version of Bigloo's very
advanced match-case
special form, which is a built-in rather
than a macro.
To give the flavor of the pattern-matcher, we write a function to describe values:
(define (test-match x) (match-case-simple x (,x (number? x) "number") (() () "nil") (#t () "bool") (#f () "bool") ((,x . ,y) () (string-append "pair of " (test-match x) " and " (test-match y))) (__ () "something else")))
Formally, match-case-simple
has the following
syntax and semantics.
(match-case-simple exp <clause> ...[<else-clause>]) <clause> ::= (<pattern> <guard> exp ...) <else-clause> ::= (else exp ...) <guard> ::= boolean exp | () <pattern> :: = ,var -- matches always and binds the var pattern must be linear! No check is done __ -- matches always 'exp -- comparison with exp (using equal?) exp -- comparison with exp (using equal?) (<pattern1> <pattern2> ...) -- matches the list of patterns (<pattern1> . <pattern2>) -- ditto () -- matches the empty list
As a more interesting example, we show a meta-circular interpreter for a small subset of Scheme:
(define (int code env) (match-case-simple code (('quote ,x) () x) ((let ((,x ,e)) ,body) (symbol? x) (let ((xv (int e env))) (int body (cons (cons x xv) env)))) ((lambda () ,body) () ; thunk (lambda () (int body env))) ; closed over the env ((lambda ,argl ,body) (symbol? argl) ; arglist (lambda arglv (int body (cons (cons argl arglv) env)))) ((lambda (,x) ,body) (symbol? x) ; 1-arg function (lambda (xv) (int body (cons (cons x xv) env)))) ; the general case of lambda is skipped to keep the example small ((,op . ,args) () (let* ((opv (int op env)) (argvs (map (lambda (c) (int c env)) args))) (apply opv argvs))) (,x (symbol? x) (lookup x env)) (,x () x) ; probably number, string, etc. ))
match-case-simple.scm [5K]
The complete R5RS/R6RS code for the example.
The file contains the complete code for the meta-circular interpreter
and several tests.
In its present form the macro appeared in the miniKanren
translation of the leanTAP theorem prover, written in August 2005.
It was based on a match-tree
, used in
SSAX/examples/pull-punct-sxml.scm
example in the SSAX SourceForge
repository, whose first version was committed in 2002.
This article is a follow-up to Al Petrofsky's detailed guide on writing advanced syntax-rule macros. This present article discusses two subtle pitfalls that may arise in macros with nested syntax-rules, i.e., macros whose expansion contains definitions of other macros. The first pitfall is about lexical scoping and shadowing: pattern variables do not shadow. The second pitfall deals with passing parameters to a submacro.
If you are mindful of these pitfalls and use CPS/tail-calls, you will find that -- surprisingly -- writing R5RS macros looks a lot like writing regular tail-recursive Scheme procedures.
The USENET article with more examples and implementation [plain text file]
The article was posted as Re: An Advanced Syntax-Rules Primer for the Mildly Insane on a newsgroup comp.lang.scheme
on Tue, 5 Mar 2002 13:46:54 -0800
Al Petrofsky. An Advanced Syntax-Rules Primer for the Mildly Insane
A remarkably detailed and insightful explanation of R5RS macro expansions. The article was posted on a newsgroup comp.lang.scheme
on 2002-03-03 17:03:50 PST
??!lambda
and ??!apply
A R5RS macro-lambda is a notation for a first-class parameterized future macro-expansion action
(??!lambda (bound-var ...) body)Here
bound-var
is an identifier, body
is
an expression that may contain forms (??! bound-var)
, and
??!lambda
is just a symbol. It must be stressed that
??!lambda
is not a syntactic or bound variable. Although
the ??!lambda
form may look like a macro-application, it
is not. This is a critical distinction: Hilsdale and Friedman looked
for a macro-level lambda as a macro, but did not succeed. Our
macro-level abstractions are not macros themselves. The ??!lambda
form is first class: it can be passed to and returned
from R5RS macros, and can be nested.
The ??!lambda
-form is interpreted by a macro
??!apply
. To be more precise, we specify that the
following macro-application
(??!apply (??!lambda (bound-var ...) body) arg ...)expands to
body
, with all non-shadowed instances
of (??! bound-var)
hygienically replaced by arg
. In Scheme, question and exclamation marks are considered
ordinary characters. Hence ??!lambda
, ??!apply
and ??!
are ordinary symbols -- albeit oddly looking,
on purpose. A ??!lambda
form can have one or several
bound variables; the corresponding ??!apply
form should
have just as many arguments.macro-lambda.scm [6K]
An implementation of the R5RS macro ??!apply
that satisfies the
above specification.
Erik Hilsdale and Daniel P. Friedman. Writing macros in continuation-passing style
Scheme and Functional Programming 2000. September 2000.
Embedding a domain-specific notation or otherwise syntactically extending a core language often require sophisticated macros. A complex artifact is easier to develop if it can be composed of smaller, separately designed and tested parts. Macro systems such as R5RS Scheme macros and Camlp4 quotations are head-first, call-by-name or normal order re-writing systems. Such systems are in general non-compositional. The familiar idioms of a function call and a functional composition -- let alone higher-level combinators such as fold -- do not easily apply. Therefore, macros tend to be monolithic, highly recursive, ad hoc, and requiring a revelation to write, and the equal revelation to understand. As Section 7.3 of R5RS demonstrates, we have to resort to contortions and hacks even in simple cases.
However, macros written in a continuation-passing style (CPS) always compose. Making CPS macros practical still requires an encoding for macro-continuations. We have found this missing piece, with an insight for anonymous macro-level abstractions. The abstraction-passing technique is general and will apply to Camlp4, Tcl and other head-first code-generation systems.
In the specific case of R5RS macros, we present a stronger result: systematic developing of R5RS macros by a translation from the corresponding Scheme procedures. This is an unexpected result as typically R5RS macros look nothing like regular Scheme procedures.
A number of examples below demonstrate how to construct macros that are always compositional. In some circumstances, we can compose with "legacy" macros written without following our methodology. As a side effect, the technique makes R5RS macros functional, restores the modularity principle, and even makes possible higher-order combinators.
Several other examples elaborate the translation from Scheme procedures to R5RS macros, via the CPS transformation. A Scheme-to-syntax-rules compiler does such a translation mechanically.
Macros that Compose: Systematic Macro Programming
<http://link.springer.de/link/service/series/0558/tocs/t2487.htm>
CPS-Macros.ps.gz [58K]
Macros-talk.pdf [102K]
The paper is published in LNCS volume 2487 with the Copyright
held by Springer-Verlag. The file CPS-Macros.ps.gz
is a
local copy of that copyrighted paper. The file Macros-talk.pdf
is a transcript of the talk presented
on October 7, 2002 at GPCE 2002. The talk has a better running
example. Full paper reference: In: Don Batory, Charles Consel, Walid
Taha (Eds.), Generative Programming and Component Engineering: ACM
SIGPLAN/SIGSOFT Conference, GPCE 2002. Pittsburgh, PA, USA, October
6-8, 2002. Lecture Notes in Computer Science, volume 2487,
pp. 202-217. Springer-Verlag, October 2002.
The USENET article Transparent macro-CPS programming [plain text file]
posted on a newsgroup comp.lang.scheme
on Fri, 30 Nov 2001 17:24:21 -0800
The article demonstrates several examples of developing
complex macros by composition. One of the examples is a
compile-time implementation of the factorial over Peano-Church
numerals. The resulting ?plc-fact
macro is truly modular:
it is built from separately defined and tested arithmetic (?plc-pred
, ?plc-mul
) and comparison
macros.
The USENET article Scheme procedure -> R5RS macro, by example of quotify [plain text file]
posted on a newsgroup comp.lang.scheme
on Wed, 6 Feb 2002 16:21:50 -0800
Another example of the systematical construction of R5RS
macros. The article uses the systematic approach in solving a practical problem, posed on comp.lang.scheme. We start by implementing the
algorithm of the required S-expression transformation as a regular
Scheme procedure. When the algorithm is tested, we
mechanically convert the Scheme procedure into a R5RS macro. In the article, we do the conversion by hand for clarity. The mechanical translation is one of the test cases of the Scheme-to-syntax-rules compiler.
quotify.scm [10K]
The code for the above article. It is tested under Bigloo 2.4b and Scheme48.
A Scheme -to- syntax-rules compiler
Lambda-calculator as a R5RS macro
A more challenging R5RS macro: the normal-order evaluator
for untyped lambda calculus. The challenge comes from capture-free
substitutions and repeated identifications of left-most redexes. The
evaluator is truly modular: it relies on a separately designed and
tested one-step beta-substitutor, and on a higher-order
macro-combinator map
. Our technique of CPS and
macro-level abstractions guarantees composability of all pieces.
Syntax-rule-level ??!lambda
and ??!apply
Erik Hilsdale and Daniel P. Friedman. Writing macros in continuation-passing style
Scheme and Functional Programming 2000. September 2000. The present article suggests an improvement: a design for an anonymous macro abstraction.
This article presents compile-time (meta-) lambda-calculators: the normal-order evaluators for untyped lambda calculus. The calculators normalize a term in a pure untyped lambda calculus. The result can be compiled and evaluated in a call-by-value lambda-calculus with constants (i.e., Scheme). The lambda-calculators act as partial evaluators.
Therefore, the code has three kinds of lambdas: of the source language, of the transformer meta-language, and of the target language. Each is handled by its own evaluator, yet they look similar. In fact, the former and the latter lambda forms are identical. In all the cases, the lambda forms play the same role: a denotation for a parameterized future evaluation. The evaluation occurs in different ways, yet the idea of delaying an action and passing its promise as a first-class object is universal.
This calculator is implemented in CPS. It is an example of a challenging R5RS macro, which was designed modularly and systematically with the help of CPS, macro-lambda and higher-order combinators. The challenge comes from capture-free substitutions and repeated identifications of left-most redexes.
The December 2004 version implements the normal-order
beta-normalizer in direct style. This version is a single, short, fast, stand-alone macro. It does no alpha-renaming
directly and no substitutions directly. Everything is integrated with
normalization, and everything is delayed until the latest possible
moment. The only recursion here is via NORM
itself.
This version of the calculator follows the calculus of explicit substitutions, and is very similar to bottom-up parsing. The calculator normalizes a term given a term stack and an environment stack, both of which are initially empty. With two stacks, we have two shifts and two reductions. Applications shift onto the term stack. Redexes reduce the term stack and shift onto the environment stack. Variables reduce from the env stack.
That short, stand-alone, syntax-rule macro clearly shows that lambda-calculus is indeed very simple, on the surface.
Normal-order direct-style beta-evaluator with syntax-rules, and the repeated applications of call/cc
callcc-fixpoint.txt [plain text file]
The appendix of that article gives the complete code of
the normalizer and shows its many tests and examples.
The USENET article posted as Re-writing abstractions, or Lambda: the ultimate pattern macro [plain text file]
on newsgroups comp.lang.functional
and comp.lang.scheme
on Sun, 16 Dec 2001 13:14:02 -0800
The article, besides the extensive discussion, contains the source code for the meta-language lambda-evaluator along with regression tests.
A more systematic way of writing syntax-rule macros is a mechanical transformation of the corresponding Scheme procedures into macros. Scheme functions are far easier to develop and debug. After we are sure our Scheme code works, we submit our procedures to a Scheme--to--syntax-rules compiler -- and use but don't look at the result.
The Scheme -to- syntax-rules compiler mechanically converts (desugared) pure-functional Scheme procedures to the corresponding CPS syntax-rule macros. Currently the compilation to syntax-rules is hardly optimized. The result is very slow. In general, the implementation of syntax-rules on Bigloo and many other systems takes a lot of time and memory. For that reason, in Bigloo it is off by default: We need to specify a special flag -hygien to activate syntax-rule macros on Bigloo. It seems no one has bothered to optimize the macro expander, because no one asked for that.
We must emphasize however the speed of writing macros in the systematic way. We do all the development and testing with regular Scheme procedures. Once we have written the correct procedures, the translated syntax-rule macros just work.
cps-macro-conv.scm [25K]
The source code for the compiler and several sample
programs. The code extensively relies on a match-case
form provided by a Bigloo system.
Performance of the syntax-rule macro-expander across
different Scheme systems can vary by as much as three orders of
magnitude. This conclusion was the result of timing a syntax-rule
stress test below. When evaluated, the test prints a symbol
'prime
, which means that the number 5 is prime. The
primality check itself is done at the macro-expand time. The algorithm
is a pure-functional implementation of the sieve of Eratosthenes (see
the macro ?sieve
). The test is a "side effect" of an
exercise of translating Scheme procedures into syntax-rules by the
Scheme -to- syntax-rule compiler.
Try loading the file below in your favorite Scheme system. The results may vary by a few orders of magnitude: from 0.9 seconds (the latest Chez Scheme) to 15 minutes, on a comparable capable hardware.
R. Kent Dybvig has very kindly sent me the results for Chez Scheme running on a 1.2MHz Athlon CPU, 512MB RAM, Linux. The latest version of Chez Scheme, version 6.9a, with a built-in macro-expander finished the stress test in 0.920 seconds (user time). Both a commercial Chez Scheme and a free Petite Chez Scheme, version 6.9, finished the test in 5.6 seconds user time incurring 400-500 page faults. Both tests used a built-in macro-expander, compiled at optimize-level 3. With a portable macro-expander, Chez Scheme v6.9 finishes the test in 9.7 user seconds whereas the Petite Chez Scheme needs 62.6 user seconds. The slowdown in the last case is due to the fact the expander itself is interpreted, while in the other cases it is compiled. Local macros, i.e., those introduced by let-syntax and letrec-syntax, are always interpreted. Top-level macros are compiled in Chez Scheme and interpreted in Petite Chez Scheme. Even an old version of Petite Chez Scheme, 6.0, with a built-in expander, finished the test in 30 seconds (albeit taking around 70 MB of memory). For comparison, on a twice as capable platform (2GHz Pentium 4, 1GB RAM, FreeBSD), SCM 5d6 finishes the test in 5.2 seconds user time with the resident size of 2MB, Bigloo 2.4b takes 36.3 seconds with the resident size of 34 MB, and Scheme 48 v0.57 takes 652.9 seconds given the heap size of 100,000,000 words.
We must caution that the speedups in the latest version of Chez Scheme are of course specific to the stress test. The differences in performance for less stressful uses of macros are less notable. It is safe however to make one conclusion: the mechanical derivation of syntax-rule macros from the corresponding Scheme functions is viable. Although the resulting macros may be slow to expand on some platforms, they are not inherently slow. Carefully designed and optimized macro-expanders have little trouble with the syntax-rule-compiled macros.
syntax-rule-stress-test.scm [7K]
The source code for the stress test, to be loaded or fed
into a Scheme system.
The USENET article Scheme->syntax-rules [Was: hygienic macro primality
tester] [plain text file]
posted on a newsgroup comp.lang.scheme
on Mon, 15 Jul 2002 20:25:53 -0700
The article explains a non-pure and pure functional
implementations of the sieve of Eratosthenes in Scheme. The article
converts the pure one to syntax-rules, using the Scheme -to-
syntax-rules compiler.
A Scheme -to- syntax-rules compiler
www.scheme.comThe home page of Cadence Research Systems and Chez Scheme. The free version of Chez Scheme (Petite Chez Scheme) is available from that site.
This paper details how folklore notions of hygiene and referential transparency of R5RS macros are defeated by a systematic attack. We demonstrate syntax-rules that seem to capture user identifiers and allow their own identifiers to be captured by the closest lexical bindings. In other words, we have written R5RS macros that accomplish what commonly believed to be impossible. We build on the fundamental technique by Petrofsky of extracting variables from arguments of a macro. The present paper generalizes Petrofsky's idea to attack referential transparency.
This paper also shows how to shadow the lambda
form. The
shadowed lambda
acts as if it was infected by a virus,
which propagates through the lambda
's body infecting other
lambda
s in turn. The virus re-defines the macro being
camouflaged after each binding. This redefinition is the key insight
in achieving the overall referential opaqueness. Although we
eventually subvert all binding forms, we preserve the semantics of
Scheme as given in R5RS.
The novel result of this paper is a demonstration that although R5RS macros are deliberately restricted in expressiveness, they still wield surprising power. We have exposed faults and the lack of precision in commonly held informal assertions about syntax-rule macros, and pointed out the need for proper formalization. For a practical programmer this paper offers an encouragement: more and more powerful R5RS macros turn out to be possible.
Dirty-Macros.pdf [223K]
An extended version of a paper presented at the
Third Workshop on Scheme and Functional Programming, sponsored by ACM
SIGPLAN. October 3, 2002, Pittsburgh, Pennsylvania, USA.
Dirty-Macros-talk.pdf [104K]
A presentation at the Workshop. October 3, 2002. Pittsburgh, PA.
setf
-- a polymorphic, generic setter -- as a simple Scheme macrosetf
special form can be used and implemented in Scheme. Examples of setf!
:
> (define al '((1 "one") (2 "two"))) > (assv 1 al) (1 "one") > (setf! (cadr (assv 1 al)) "-one-") > (assv 1 al) (1 "-one-") |
> (tree-ref tree #f #f) a > (setf! (tree-ref tree #f #f) 'x) > (tree-ref tree #f #f) x |
setf!
as a simple Scheme macro.The USENET article with more examples and implementation [plain text file]
The article was posted as setf -- a polymorphic, generic setter -- as a simple Scheme macro on a newsgroup comp.lang.scheme
on Tue, 07 Jul 1998 16:18:18 GMT
The behavior of syntax-rules -- in particular, the renaming of identifiers that will appear at the top level -- is not completely specified in R5RS. Different implementations of macro-expanders vary widely in this respect.
The following is an illustrative test. It is a syntax-rule
for letrec
(taken verbatim from Section 7.3 of R5RS) with
only one change: the fragment
(let ((var1 <undefined>) ...) (let ((temp1 init1) ...) (set! var1 temp1) ... body ...)is replaced by the "equivalent" code:
(begin (define var1 #f) ... (define temp1 init1) ... (set! var1 temp1) ... body ...))We also changed the name to
letrec1
to avoid
confusion with the standard letrec
. If we use letrec1
at the top level and overlook the pollution of the global
namespace, we may expect that letrec1
is semantically
equivalent to letrec
.
Is it? To test this claim, we evaluate the following expression:
(letrec1 ((x 1) (y 2)) (display (+ x y)) (newline))It turns out, this expression may print 3, print 4, or raise an unbound variable error. You might wish to load the file referenced below into your Scheme system. The existing Scheme systems almost evenly fall into three categories regarding this test.
3
is printed by:4
is printed by:Furthermore, Richard Kelsey suggested on the discussion thread that such a 'non-deterministic' behavior is innate, and a consequence of an R5RS-blessed uncertainty whether top-level definitions are binding occurrences. We should point out however that SCM gives two different answers depending on the macro-expander. It appears what matters is not how a Scheme system treats top-level defines but how a macro-expander thinks the underlying system treats top-level defines.
The top level of Scheme is indeed under-specified and treacherous. On the bright side, we can now trace the lineage of macro-expanders in various implementations of Scheme.
syntax-rule-dark-corner.scm [3K]
The source code for the test.
Google discussion thread
The thread shows where the puzzle came from.
quote
-- as a macroHow fundamental is quote
? Can
quote
truly be implemented as a macro, with no cheating?
The following article gives the affirmative answer. We define a
low-level macro my-quote
that acts exactly like quote
but does not itself use quote
. Nor any other
special form sans the raw or sugared lambda
(i.e.,
let
, cond
). The macro relies on a R5RS
function string->symbol
, which is an ordinary Scheme
function.
The article thus demonstrates that quote
is
not a primitive construct. An implementor may
therefore leave out quote
from the core of his Scheme
system.
my-quote
acts precisely as the native Scheme
quote:(append (my-quote (1 2 3)) (my-quote ()) (my-quote (4))) ;=> (1 2 3 4) (symbol? (my-quote abcdef)) ; => #t
comp.lang.scheme
on Thu, 10 May 2001 01:00:27 +0000set!
set!
so we can do(set! (if (< x y) x y) 3.1415)We can even evaluate
(map (lambda (var val) (set! var val)) (list x y z) (list 1 2 3))to change the values associated with the variables
x
, y
and z
. We can still use set!
to mutate ordinary bindings as in (set! u (+ x
y))
.comp.lang.scheme
on Sun Jun 23 19:03:25 2002Labeled (aka keyword) arguments are present in many languages, e.g., Tcl/Tk, Ocaml, Perl -- and arguably contribute to their expressiveness. Some Scheme systems implement DSSSL extensions, which let us pass arguments to procedures as keyword-value pairs in arbitrary order. It becomes clear which argument receives which value. The programmer does not have to memorize the precise order of arguments to a procedure. He can even omit some arguments if the defaults are appropriate.
Alas we still have to use positional arguments for macros. Even if the DSSSL extensions are available, they do not apply to special forms. This article introduces portable, R5RS-compliant keyword (labeled) arguments for macros and procedures. The order of label-value pairs is arbitrary; arguments for which defaults exist may be omitted; required labeled arguments must be given. We implement labeled arguments via a meta macro. It takes the name of an ordinary (positional-argument) macro and the description of its arguments and defines a new macro, which accepts labeled arguments. A label is anything that can be used for identification: in the simplest case, a label is an identifier. However, strings, numbers, characters, and DSSSL keywords (if available), can all be used as labels. The article shows many examples. If we label an argument by a string, that string becomes that argument's documentation. Our approach can also be used for regular procedures. So we can enjoy keyword arguments for (first-order) functions even on those systems that don't support DSSSL.
Here are a few more interesting examples. The first one makes an ordinary procedure accept keyword arguments, with keywords being strings. The keyword resolution is done at macro-expand time.
(gen:define-labeled-arg-macro llookup (lookup ; the following are the descriptors for lookup's three arguments ("the predicate" eq?) ; default value is eq? ("key") ; no defaults: the argument is required ("the associative list") ; ditto ))
Applications become self-documenting:
(llookup "the associative list" '((a 1) (b 2)) "key" 'b) ; ==> (b 2)
Any datum may be a label, even characters and booleans. It
seems especially fitting to use booleans to mark branches on the
standard if
special form:
(gen:define-labeled-arg-macro lif ; keyword-arg variant of `if' (if ; the positional special form: the ordinary `if' (#\?) ; Character '? marks the test expression (#t) ; then-branch, required (#f 17) ; else-branch, optional )) (let ((x 1)) (lif #t "OK #\? (positive? x) " #f (/ x 0))) ; ==> "OK"
The USENET article with more examples and the self-contained
implementation [plain text file]
All the code is R5RS compliant and has been tested on
Petite Chez Scheme 6.0a, scheme48, and SCM.
The article was posted as Macros and procedures with
labeled arguments on a newsgroup comp.lang.scheme
on Sun, 2 May 2004 16:14:04 -0700
Functional XML parsing framework
The keyword meta-macro ssax:define-labeled-arg-macro
appears in the main module of the SSAX XML parser: SSAX.scm, version
5.0. That file shows an ``industrial'' example of a macro with labeled
arguments, ssax:make-parser
. The parser is R5RS compliant
and runs on many Scheme platforms. An interested reader might find in
SSAX.scm a few other curious R5RS macros.
Implementing keyword and default arguments, function's resource database
It has been claimed that Turing-completeness of syntax-rules
makes the hygiene of macros unenforceable. One may think that the
power of Turing-completeness would let us implement a non-hygienic
macro-expander atop syntax-rules. Therefore, when we see a macro
invocation (foo expr)
we allegedly cannot be sure if the
expansion satisfies the hygiene condition.
We formally prove that the above conclusion is unconsequential: even if we allow our code to be processed by an unknown macro expressed in a Turing-complete system, not all bets are off. It depends on the macro system.
Indeed, let us consider a strict subset of the R5RS macro system, such that define-syntax, let-syntax, and letrec-syntax may not appear in the template of any syntax-rule. The set of all syntactic transformers is therefore known before the macro-expansion starts. Such a macro system preserves the hygiene condition for macro expansions, defined in Kohlbecker, E.E., Friedman, D.P., Felleisen, M., Duba, B. 'Hygienic macro expansion' (1986) as: "Generated identifiers that become binding instances in the completely expanded program must only bind variables that are generated at the same transcription step." That paper formally proves the condition for the macro system that is equivalent to ours. And yet, such a macro system is Turing complete: we can write a Turing machine with the restricted R5RS macros. The code below demonstrates this constructive proof.
Thus even if we wrap our code in an unknown macro expressed in a Turing-complete system, we may still be certain that the hygiene condition for macro expansions is preserved.
turing-completeness-limited-macros.scm [5K]
The source code and the tests.
The finite control unit of the Turing machine is a set of
simple syntax-rules. The tape is realized as a pair of lists: one list
describes the portion of the tape from the beginning till the current
point. The other list represents the tape from the current point
onward. See the comments in the code for more details.
As an example, the code implements and tests two particular
Turing machines: the addition of natural numbers (in unary) and a
decision machine. The latter decides if the number of tokens on the
tape between two delimiting blanks is even or odd.
The code has been described in a USENET article Turing-completeness of simple syntax-rules posted on a newsgroup comp.lang.scheme
on Mon, 15 Sep 2003 22:14:40 -0700.
symbol?
with syntax-rulesWe show a syntax-rule macro-predicate that can be applied to
any expression and can soundly and reliably determine if that
expression is a symbol/literal -- rather than a pair, a vector, a
literal string, a literal number, etc. The mere existence of such a
predicate is surprising and counter-intuitive. For completeness and
reference, we also show two macro-identity predicates on identifiers:
id-eq??
and id-eqv??
.
Syntax-rules can determine the literal type of some arguments: a pair, a vector, an empty list, a boolean. We can also write a syntax-rule that tests if its argument is the specific number, the specific character, the specific string. A syntax-rule cannot determine if its argument is a string or an integer.
For a long time I used to think that we cannot write a
syntax-rule that tests if its argument is a symbol. In
particular, I wanted a macro-expand-time equivalent of the library
function symbol?
. We will call that macro symbol??
to avoid the confusion with the library function. That
macro should take three arguments: any syntactic form plus two
continuations, kt
and kf
. The macro symbol??
will expand to kt
if and only if its first
argument is a symbol. The macro would expand to kf
if its
argument is anything other than a symbol. I believed such a
macro symbol??
is impossible with syntax-rules -- until I
wrote it in August of 2003.
(define-syntax symbol?? (syntax-rules () ((symbol?? (x . y) kt kf) kf) ; It's a pair, not a symbol ((symbol?? #(x ...) kt kf) kf) ; It's a vector, not a symbol ((symbol?? maybe-symbol kt kf) (let-syntax ((test (syntax-rules () ((test maybe-symbol t f) t) ((test x t f) f)))) (test abracadabra kt kf)))))
The macro is based on the observation that if a form F
is an identifier and a syntax-rule pattern P
is
anything but a literal identifier, then P
can match
F
if and only if P
is an
identifier (symbol). The final form of this macro incorporates an
improvement suggested by Al Petrofsky in a USENET discussion
thread.
How to write symbol? with syntax-rules [plain text file]
The article with a detailed explanation, comparisons with
defmacro and syntax-case, the source code, and test cases.
The article was posted on a newsgroup comp.lang.scheme
on Tue, 30 Sep 2003 12:03:44 -0700.
An assert
macro with informative reports on failure
Thanks to the syntax-rule symbol??
, the good
assert
form can, after all, be written in R5RS Scheme.
A good assert macro, now as a syntax-rule [plain text file]
The article illustrates the assert macro on several examples
and gives the complete implementation. The code includes several
general-purpose syntax-rules, such as id-memv??
,
symbol??
, and k!reverse
-- as well as
syntax-rules to compute a list of all identifiers (without duplicates)
that occur in an arbitrary Scheme expression. We rely on light-weight
CPS syntax-rule macros throughout.
The article was posted on a newsgroup comp.lang.scheme
on Tue, 14 Oct 2003 11:53:09 -0700.
call/cc
. The proof relied
on a rather tedious CPS transformation, which was almost at the limit
of human patience. When complex CPS transformations are done by hand,
mistakes are easy to make and difficult to find. In order to simplify
the proof and increase our confidence in it, we wrote a syntax-rule
macro to CPS transform terms. The macro is simple so its correctness
is apparent. Alas, the simplicity of the CPS transform leads to many
administrative redexes. The resulting terms are difficult to
analyze. Therefore, we have implemented another syntax-rule macro to
normalize terms and reduce away administrative redexes. The latter
macro acts as a lambda-calculator.call/cc
lambda
as a library syntaxIs lambda
a fundamental syntactic form? Can we re-write
lambda
in terms of something else? Can we re-implement
the top-level lambda
with syntax-rules? This article
shows that we can, which gives us an ability to overload the top-level
lambda
, for tracing or for mischief. Once again, the R5RS
macro system has exceeded our expectations.
In the article, we overload lambda
to trace all
explicit and implicit lambda-applications. The implicit ones arise
from the use of let
forms. We overload lambda
at the top level as follows:
(define-syntax lambda (syntax-rules () ((lambda bindings body1 body2 ...) (make-lambda bindings (display "Entering lambda: ") (display 'bindings) (display " of values ") (display (list . bindings)) (display "Happy Holidays!") (newline) (begin body1 body2 ...)))))
The article falls within the gray area of R5RS. R5RS does not
promise that our lambda-overloading will work on every Scheme system.
On the other hand, R5RS does not declare our method to be in
error. Following Niels Bohr, the lambda-overloading technique may then
be called `non-trivial'. The technique has been tested on Scheme48
and Bigloo 2.4b. With some changes the examples work on Petite
Chez Scheme as well. In a follow-up message, Matthias Radestock
explained the reasons for the behavior of Chez Scheme -- treating
top-level let-syntax
and letrec-syntax
as
splicing constructs, similar to the top-level begin
. He
pointed out that derived-lambda is probably the first example showing
that the splicing behavior of let-syntax
leads to a loss
of expressiveness.
comp.lang.scheme
on Sun, 28 Dec 2003 16:05:43 -0800This article shows that macros do exist at run-time. Macros -- that is, syntax transformers and syntax objects -- can be stored in variables, passed to ordinary procedures and applied at run-time. We show three examples. The last one uses exclusively syntax-rules and is specifically written not to contradict R5RS in any way. The example runs on Scheme48, Petite Chez Scheme, SCM, and Gauche and demonstrates the pattern-matching machinery of syntax-rules at run time.
For example, evaluating the following expression compiles a
transformer and associates the result with the name foo
:
(compile-pattern 'foo (list '(_ (x ...) (y ...)) ''((x y) ...)))we can later apply
foo
to an S-expression to obtain
the transformed S-expression:(display (apply-pattern 'foo (list (list 10 20 30) (list 1 2 3))))prints the desired result
((10 1) (20 2) (30 3))
.comp.lang.scheme
on Wed, 28 Jan 2004 20:30:54 -0800expand1
. The following shows
a trivial and not fully general, but nevertheless helpful trick that
lets us see the intermediate macro-expansion results. The trick should
work in any R5RS Scheme:(define-syntax mtrace (syntax-rules () ((mtrace x) (begin (display "Trace: ") (write 'x) (newline) x))))One may wish to enclose a macro expression or the template part of a syntax-rule into
mtrace
. The article below shows
an example of using mtrace
to trace the expansion
of a complex recursive macro -- letrec
.comp.lang.scheme
on Tue, 2 Nov 2004 17:15:22 -0800We introduce the macro make-env
that maintains a
macro-expand-time environment. We can bind values in the
environment (associate them with keys) -- and then look them up by their keys.
The environment is lexically-scoped. We can shadow one binding with
another. For example:
(make-env (bind ((a 1)) (bind ((b 2)) (list (lookup a) (lookup b) (bind ((a 3)) (list (lookup a) (lookup b)))))))The example looks and feels as if
bind
were just
let
. However, bindings and look-ups occur at macro-expand
time. The whole code above expands to this:; ==> (list 1 2 (list 3 2))
We used macro-expand-time environments for an unusual purpose:
writing code where identifiers are S-expressions rather than
symbols. A program is a closed term. The meaning of a program does not
change if all identifiers are alpha-renamed. Therefore, an identifier
does not have to be a symbol. It can be anything that can be compared
in equality. Al Petrofsky has pointed out that key insight to me. The
macro make-env1
below extends this idea to
identifiers-as-lists: the macro maintains a syntactic environment in
which S-expressions are the keys, and identifiers are the
corresponding values. For example, the following is the correct Scheme code
(make-env1 (lambda ((id a b c)) (lambda ((id a b)) (list (id a b c) (id a b) '(id a b x) '(id a b) 'x))))Its macro-expansion is equivalent to the following:
(lambda (temp~1) (lambda (temp~2) (list temp~1 temp~2 'abx 'ab 'x)))A form
(lambda ((id a b ...)) body)
binds the key
(a b c ...)
to the value temp~xxx
. The value
is a temporary, colored identifier generated by syntax-rules
macros. The hygiene property makes such identifiers unique. The macro
make-env1
re-writes (lambda ((id a b ...)) body)
into
(lambda (temp~xxx) re-written-body)
. We may indeed use
composite identifiers such as (id a b ...)
in the body of
lambda. Here id
is a synonym for 'look-up'. The expansion
of id
in the current environment (maintained
implicitly by make-env1
) replaces (id a b ...)
with the corresponding value -- which is the identifier temp~xxx
-- the same identifier that was bound by lambda. That
was the critical condition that makes everything work.
The goal of representing identifiers by complex S-expressions is that the latter may be manipulated by syntax-rules. We can truly write syntax-rule macros that concatenate identifiers. The code referenced below does exactly that.
Dirty-Macros-talk.pdf [104K]
How to Write Seemingly Unhygienic and Referentially Opaque Macros
with Syntax-rules.
A presentation at the Third Workshop on Scheme and Functional
Programming. October 3, 2002. Pittsburgh, PA.
The explanation of the macro-expand-time environments starts
on page 45.
dirtier-macros.scm [8K]
The source code and the tests.
How to build `identifiers' with syntax-rules
Portable case-sensitive symbols, and the meaning of identifiers
These two links show practical applications of the idea of
``extended'' identifiers: portable case-sensitive symbols and the
implementation of the define-structure
macro with
syntax-rules.
Writing interpreters is the textbook application of Scheme, as a way to implement domain-specific languages. We may want to write an interpreter even if the source language is essentially (a subset of) Scheme itself: because the evaluation order of the source language differs from that of the host Scheme system, because we want to add tracing, or because we want to interpret a Scheme expression in non-standard ways. Partial evaluation and many program analyses such as data- and control flow can be expressed as non-standard, abstract interpretations.
Interpreters fold over the source language expression
represented as a data structure. Therefore, they must, at run time,
repeatedly traverse the expression, detect primitive expression forms
and dispatch to their processors: `dispatch on syntax'. The dispatch
is a source of a so-called `interpretive overhead'. If the only difference
between the source language and the host system is the semantics of
standard library functions, we can turn a source language expression
into a thunk, compile it, and evaluate after re-defining the standard
library functions with set!
. The source language
expression can now run at `full-speed' -- almost. When compiling the
source expression, we must tell the compiler not to inline the
standard functions; otherwise their re-definition has no effect. Thus
the overhead, of indirect standard function calls, remains.
This method of overhead reduction does not apply if we want to re-define
special forms such as if
, lambda
or
application -- for example, to fix a particular evaluation order.
We describe how to avoid the run-time dispatch on syntax,
how to abstract over special forms and interpret an expression using
various, non-standard meanings of lambda
, application,
etc. The code is in R5RS Scheme. The key is performing the dispatch on
syntax at macro-expand time. We present the R5RS macro xlate
, which takes an expression and the environment. The environment is an
associative list with interpretations for built-ins,
library functions and special forms, including application. The macro
xlate
traverses the given source language expression, detects
primitive forms and replaces them using the environment as a substitution. The
resulting Scheme expression (with re-defined built-ins and special
forms) can be evaluated or compiled by the host Scheme system as
usual. To avoid the overhead of searching variable environment, we use
higher-order abstract syntax.
The first example is the standard interpretation: we interpret a Scheme expression as the host Scheme system would evaluate it. The translation below is the identity.
(define-syntax standard (syntax-rules () ((standard exp) (let-syntax ((id (syntax-rules () ((id x) x))) (app (syntax-rules () ((app x . xs) (x . xs))))) (xlate (("inj" id) ("*" *) ("sub1" (lambda (x) (- x 1))) ("zero?" zero?) ("lambda" id) ("app" app) ("seq" begin)) exp))))) (standard ((lambda (x) (* (x 2) 3)) (lambda (n) (* n n)))) ; 12A trivial non-standard interpretation is counting the number of `constructors' in an expression. Every expression, including lambda-expression, now evaluate to an integer. We no longer treat abstractions as values.
(define-syntax count-ctr (syntax-rules () ((count-ctr exp) (let ((ctr1 (lambda (x) 1)) (op (lambda args (apply + 1 args)))) (xlate (("inj" ctr1) ("*" op) ("sub1" op) ("zero" op) ("lambda" (lambda (f) (+ 1 (f 0)))) ; a variable counts as 0 ("app" op) ("seq" op)) exp))))) (count-ctr (lambda (x) (* x 2))) ; 3 a variable counts as 0, lambda counts as 1 (count-ctr ((lambda (x) (* (x 2) 3)) (lambda (n) (* n n)))) ; 8In the non-standard interpretation, our running example evaluates to 8, because it contains two
lambda
s, two literal constants, two
multiplication operations, and two applications.
Our main example is evaluating a source language expression using call-by-value, call-by-ref, and call-by-name evaluation strategies. We can also do partial evaluation and CPS conversion, as described in the tagless-typed paper.
call-by-any.scm [8K]
The R5RS source code and the tests.
This site's top page is http://okmij.org/ftp/
Converted from SXML by SXML->HTML