From posting-system@google.com Tue Oct 14 18:53:15 2003
Date: Tue, 14 Oct 2003 11:53:09 -0700
From: oleg@pobox.com (oleg@pobox.com)
Newsgroups: comp.lang.scheme
Subject: A good assert macro, now as a syntax-rule
Message-ID: <7eb8ac3e.0310141053.382fa0e9@posting.google.com>

An article posted on this newsgroup in September 2001 presented a good
assert macro. The macro is distinguished by its reporting
capabilities. The message printed upon an assertion failure shows,
among other things, the bindings for interesting variables occurring
within the asserted conditions. A programmer can also specify any
strings or other expressions to print at that point. Entering a REPL
or a debugger after the failure might seem like the most informative
approach. However, it is too heavyweight, precludes automated
(regression) testing, and is platform-specific.  The assert macro
discussed in that article turns out to be informative, lightweight, and
portable -- as far as low-level macros go.

	     http://pobox.com/~oleg/ftp/Scheme/util.html#assert

The present article announces an implementation of the good assert
macro as a syntax-rule (something that was previously claimed
impossible). As such, the assert macro can run on Petite Chez Scheme,
Scheme48, SCM -r5, and other R5RS Scheme systems. The syntax-rule
implementation of the good assert significantly improves the
portability of that special form.

Most of the September 2001 article still applies, in particular the
discussion of assertion checking in various systems, and the
implementation details. The functionality of the good assert macro
remains unchanged. It's quoted below for reference.

A better assert macro

	syntax: assert ?expr ?expr ... [report: ?r-exp ...]

If (and ?expr ?expr ...) evaluates to anything but #f, the result is
the value of that expression.

If (and ?expr ?expr ...) evaluates to #f, an error is reported.  The
error message will show the failed expressions, as well as the values
of selected variables (or expressions, in general).  The user may
explicitly specify the expressions whose values are to be printed upon
the assertion failure -- as ?r-exp that follow the identifier
'report:'.  The identifier report: is an ordinary symbol whose name
happens to end in a colon.

Typically, ?r-exp is either a variable or a string constant; in
general, it's an arbitrary expression.  If the user specified no
?r-exp, the values of interesting variables that are referenced in
?expr will be printed upon the assertion failure.

Examples

  (let ((n (begin (display "Enter a positive integer:")
		  (newline) (read))))
    (assert (integer? n) (> n 0)
            report: "Domain error" n
		    "You should've entered a positive value" #\!)
    (fact n))

If you run this example and enter -1 at the prompt, you'll see

failed assertion: ((integer? n) (> n 0))
Domain error
n: -1
You should've entered a positive value!
*** ERROR IN (stdin)@3.5 -- assertion failure

We don't have to write a poem to the user, however. A simpler assertion

  (let ((n (begin (display "Enter a positive integer:")
		  (newline) (read))))
    (assert (integer? n) (> n 0))
    (fact n))

will do just as well. It will print, under the same circumstances:

failed assertion: ((integer? n) (> n 0))
bindings
n: -1
*** ERROR IN (stdin)@10.5 -- assertion failure

The assert macro determines that 'n' is an interesting variable and
prints its binding. The binding along with the text of the expression
in question make the cause of the failure obvious.

Assert is especially useful in regression tests. Numerous built-in
regression tests in the SSAX XML parser code all have the form:
       (let ((expected expected-result)
	     (computed (computation)))
	  (assert (equal? expected computed)))

If the expected result differs from the computed, the assert macro
will print them both. Such a behavior has proved highly useful in the
development of SSAX.


The old article claimed that the assert macro must be implemented via
low-level (aka Lisp-style) macros: "To determine the set of
interesting variables we need to check if an object in a form is _an_
identifier. R5RS macros can't do that." The recent discovery of the
syntax-rule macro symbol? has changed all that.

The new implementation is given in the Appendix. The implementation
includes the code for a few general-purpose syntax-rules, such as
id-memv??, symbol?? and k!reverse. We rely on light-weight CPS
syntax-rule macros throughout.

The assert syntax-rule is a part of a standard Prelude for SCM:
	 http://pobox.com/~oleg/ftp/Scheme/lib/myenv-scm.scm
which now exclusively uses syntax-rules. 
The corresponding validation code is
         http://pobox.com/~oleg/ftp/Scheme/tests/vmyenv.scm
The Makefile in that directory shows how to run the tests.
All tests (including SSAX) pass. The SSAX code includes a run-test
macro that supports '"Aa" as a notation for portable case-sensitive
symbols. That macro is now implemented as a syntax-rule, too. But this
is a subject for another article.


Appendix: The implementation of the good assert syntax-rule. It is
tested on SCM 5d6, Scheme48 and Petite Chez Scheme. For the latter two
systems, one may need to adjust the call of the function 'error' to
account for a platform-specific interface to that function.

; Frequently-occurring syntax-rule macros

; A symbol? predicate at the macro-expand time
;	symbol?? FORM KT KF
; FORM is an arbitrary form or datum
; Expands into KT if FORM is a symbol (identifier); otherwise, expands into KF.

(define-syntax symbol??
  (syntax-rules ()
    ((symbol?? (x . y) kt kf) kf)	; It's a pair, not a symbol
    ((symbol?? #(x ...) kt kf) kf)	; It's a vector, not a symbol
    ((symbol?? maybe-symbol kt kf)
      (let-syntax
	((test
	   (syntax-rules ()
	     ((test maybe-symbol t f) t)
	     ((test x t f) f))))
	(test abracadabra kt kf)))))
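
; Test cases (an illustrative sketch, not part of the original post).
; The last clause works because a pattern that is an identifier matches
; any form, so (test abracadabra kt kf) selects kt; a pattern that is a
; literal datum such as 42 or "abc" matches only an equal datum, so
; abracadabra falls through to the second rule and selects kf.
; (symbol?? x 'yes 'no)		;==> yes
; (symbol?? (a b) 'yes 'no)	;==> no
; (symbol?? #(a b) 'yes 'no)	;==> no
; (symbol?? "abc" 'yes 'no)	;==> no
; (symbol?? 42 'yes 'no)	;==> no
; (symbol?? () 'yes 'no)	;==> no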

; A macro-expand-time memv function for identifiers
;	id-memv?? FORM (ID ...) KT KF
; FORM is an arbitrary form or datum, ID is an identifier.
; The macro expands into KT if FORM is an identifier, which occurs
; in the list of identifiers supplied by the second argument.
; All the identifiers in that list must be unique.
; Otherwise, id-memv?? expands to KF.
; Two identifiers match if both refer to the same binding occurrence, or
; if both are undefined and have the same spelling.

; (define-syntax id-memv??	; old code.
;   (syntax-rules ()
;     ((_ x () kt kf) kf)
;     ((_ x (y . rest) kt kf)
;       (let-syntax
; 	((test 
; 	   (syntax-rules (y)
; 	     ((test y _x _rest _kt _kf) _kt)
; 	     ((test any _x _rest _kt _kf)
; 	       (id-memv?? _x _rest _kt _kf)))))
; 	(test x x rest kt kf)))))


(define-syntax id-memv??
  (syntax-rules ()
    ((id-memv?? form (id ...) kt kf)
      (let-syntax
	((test
	   (syntax-rules (id ...)
	     ((test id _kt _kf) _kt) ...
	     ((test otherwise _kt _kf) _kf))))
	(test form kt kf)))))

; Test cases
; (id-memv?? x (a b c) #t #f)
; (id-memv?? a (a b c) 'OK #f)
; (id-memv?? () (a b c) #t #f)
; (id-memv?? (x ...) (a b c) #t #f)
; (id-memv?? "abc" (a b c) #t #f)
; (id-memv?? x () #t #f)
; (let ((x 1))
;   (id-memv?? x (a b x) 'OK #f))
; (let ((x 1))
;   (id-memv?? x (a x b) 'OK #f))
; (let ((x 1))
;   (id-memv?? x (x a b) 'OK #f))

; Commonly-used CPS macros
; The following macros follow the convention that a continuation argument
; has the form (k-head ! args ...)
; where ! is a dedicated symbol (placeholder).
; When a CPS macro invokes its continuation, it expands into
; (k-head value args ...)
; To distinguish such calling conventions, we prefix the names of
; such macros with k!
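;
; For illustration only, a minimal macro obeying this convention.
; The name k!cons is hypothetical: it is not part of the original code
; and nothing else in this file depends on it.

(define-syntax k!cons			; conses A onto D, passes the result to K
  (syntax-rules (!)
    ((k!cons a d (k-head ! . k-args))
      (k-head (a . d) . k-args))))

; (k!cons 1 (2 3) '!) ;==> '(1 2 3)
; Here '! reads as (quote !), so k-head is quote and k-args is empty.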

(define-syntax k!id			; Just the identity. Useful in CPS
  (syntax-rules ()
    ((k!id x) x)))

; k!reverse ACC (FORM ...) K
; reverses the second argument, appends it to the first and passes
; the result to K

(define-syntax k!reverse
  (syntax-rules (!)
    ((k!reverse acc () (k-head ! . k-args))
      (k-head acc . k-args))
    ((k!reverse acc (x . rest) k)
      (k!reverse (x . acc) rest k))))


; (k!reverse () (1 2 () (4 5)) '!) ;==> '((4 5) () 2 1)
; (k!reverse (x) (1 2 () (4 5)) '!) ;==> '((4 5) () 2 1 x)
; (k!reverse (x) () '!) ;==> '(x)


; assert the truth of an expression (or of a sequence of expressions)
;
; syntax: assert ?expr ?expr ... [report: ?r-exp ?r-exp ...]
;
; If (and ?expr ?expr ...) evaluates to anything but #f, the result
; is the value of that expression.
; If (and ?expr ?expr ...) evaluates to #f, an error is reported.
; The error message will show the failed expressions, as well
; as the values of selected variables (or expressions, in general).
; The user may explicitly specify the expressions whose
; values are to be printed upon assertion failure -- as ?r-exp that
; follow the identifier 'report:'
; Typically, ?r-exp is either a variable or a string constant.
; If the user specified no ?r-exp, the values of variables that are
; referenced in ?expr will be printed upon the assertion failure.


(define-syntax assert
  (syntax-rules ()
    ((assert _expr . _others)
     (letrec-syntax
       ((write-report
	  (syntax-rules ()
			; given the list of expressions or vars,
			; create a cerr form
	    ((_ exprs prologue)
	      (k!reverse () (cerr . prologue)
		(write-report* ! exprs #\newline)))))
	 (write-report*
	   (syntax-rules ()
	     ((_ rev-prologue () prefix)
	       (k!reverse () (nl . rev-prologue) (k!id !)))
	     ((_ rev-prologue (x . rest) prefix)
	       (symbol?? x
		 (write-report* (x ": " 'x #\newline . rev-prologue) 
		   rest #\newline)
		 (write-report* (x prefix . rev-prologue) rest "")))))
	  
			; return the list of all unique "interesting"
			; variables in the expr. Variables that are certain
			; to be bound to procedures are not interesting.
	 (vars-of 
	   (syntax-rules (!)
	     ((_ vars (op . args) (k-head ! . k-args))
	       (id-memv?? op 
		 (quote let let* letrec let*-values lambda cond quasiquote
		   case define do assert)
		 (k-head vars . k-args) ; won't go inside
				; ignore the head of the application
		 (vars-of* vars args (k-head ! . k-args))))
		  ; not an application -- ignore
	     ((_ vars non-app (k-head ! . k-args)) (k-head vars . k-args))
	     ))
	 (vars-of*
	   (syntax-rules (!)
	     ((_ vars () (k-head ! . k-args)) (k-head vars . k-args))
	     ((_ vars (x . rest) k)
	       (symbol?? x
		 (id-memv?? x vars
		   (vars-of* vars rest k)
		   (vars-of* (x . vars) rest k))
		 (vars-of vars x (vars-of* ! rest k))))))

	 (do-assert
	   (syntax-rules (report:)
	     ((_ () expr)			; the most common case
	       (do-assert-c expr))
	     ((_ () expr report: . others) ; another common case
	       (do-assert-c expr others))
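			; several exprs: accumulate them in reverse order,
			; seeding the accumulator with 'and'
	     ((_ () expr . others) (do-assert (expr and) . others))
			; no more exprs: reverse to get (and expr ...)
	     ((_ exprs)
	       (k!reverse () exprs (do-assert-c !)))
			; report: ends the exprs; the rest are ?r-exp
	     ((_ exprs report: . others)
	       (k!reverse () exprs (do-assert-c ! others)))
	     ((_ exprs x . others) (do-assert (x . exprs) . others))))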

	 (do-assert-c
	   (syntax-rules ()
	     ((_ exprs)
	       (or exprs
		 (begin (vars-of () exprs
			  (write-report ! 
			    ("failed assertion: " 'exprs nl "bindings")))
		   (error "assertion failure"))))
	     ((_ exprs others)
	       (or exprs
		 (begin (write-report others
			  ("failed assertion: " 'exprs))
		   (error "assertion failure"))))))
	 )
       (do-assert () _expr . _others)
       ))))

(define (cerr . args)
  (for-each (lambda (x)
              (if (procedure? x) (x) (display x)))
    args))
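
; A usage sketch for cerr (illustrative, not part of the original post):
; each argument is displayed, except that a procedure argument is
; called with no arguments instead.
; (cerr "n: " -1 #\newline)	; prints n: -1 followed by a newline
; (cerr "a" newline "b")	; prints a, then a newline, then b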

; If there is (current-error-port), uncomment the following
;(define (cerr . args)
;  (for-each (lambda (x)
;              (if (procedure? x) (x (current-error-port))
;		(display x (current-error-port))))
;    args))

(define nl (string #\newline))
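
; A short usage sketch (illustrative, not part of the original post).
; It assumes all the definitions above have been loaded. The text
; printed by 'error' itself is platform-specific and is not shown.
; (assert (= 1 1))				;==> #t
; (assert (memv 2 '(1 2 3)))			;==> (2 3)
; (let ((x 5)) (assert (> x 0) (< x 10)))	;==> #t
; (let ((n -1)) (assert (integer? n) (> n 0)))
;   reports the failed assertion (and (integer? n) (> n 0)) together
;   with the binding n: -1, and then calls 'error'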