From www@deja.com Mon Jul 5 10:25:28 1999
Message-ID: <7lqpvr$o6r$1@nnrp1.deja.com>
From: oleg@pobox.com
To: oleg@pobox.com
Subject: Announce: Read-time _apply_ and its usual and unusual applications
Date: Mon, 05 Jul 1999 17:28:37 GMT
Reply-To: oleg@pobox.com
Keywords: external representation, printed form, reader-macro, read-time eval, Scheme
Newsgroups: comp.lang.scheme
Organization: Deja.com - Share what you know. Learn what you don't.
Summary: Read-time application as a printed form for any datatype, available now
X-Article-Creation-Date: Mon Jul 05 17:28:37 1999 GMT
Sender: www@deja.com
Content-Length: 7837
Status: OR

A read-time application facility (reader-ctor) discussed previously on
this newsgroup has been implemented and is available for immediate
experimentation. This facility provides an extensible, new old way for
external representation of Scheme values: read-time _application_.

A proposal
    http://www.deja.com/[ST_rn=ps]/viewthread.xp?AN=465207316&search=thread&recnum=%3c7er28h$5i0$1@nnrp1.dejanews.com%3e%231/3
discussed motivation, objections, and issues. Differences between
read-time evaluation (aka sharp-dot in CL parlance) and read-time
application were mentioned as well. Although this topic turned out to
be more inciting than exciting, I nevertheless went ahead and
implemented reader-constructors. This article presents usage examples
and discusses implementation issues. The article also points out
applications of reader-ctors other than giving a readable external
representation to structures (records). In particular, read-time
application can trivially implement a 'cond-expand' construct of SRFI-0
(and by extension, "Feature-based program configuration language",
SRFI-7).

As a reminder, a read-time application occurs whenever a Scheme reader
comes across the following sequence of characters in its input stream:
    #,(tag arg1 ...)
A 'tag' must be an (external representation of an) identifier, and
'arg1' etc.
are external representations of some values, which may be read-time
applications as well. The original proposal used a # ` notation for
this form. As anticipated, this character combination met universal
disapproval, so this article introduces #, to denote read-time
applications instead.

Upon encountering a #, external form, the 'read' procedure should
locate a reader-constructor associated with the 'tag', read the
arguments 'arg1'... and apply the constructor to the arguments. The
result of the application is taken to be the value that corresponds to
the #, external form.

To define an association between a symbolic tag and a reader-ctor
procedure, one should use
    procedure: define-reader-ctor SYMBOL PROC

Examples:

    ; provide alternative readable representations to standard
    ; Scheme datatypes and other _values_
    (define-reader-ctor 'list list)
    (with-input-from-string "#,(list 1 2 #f \"4 5\")" read)
    ==> (1 2 #f "4 5")

    (define-reader-ctor '+ +)
    (with-input-from-string "#,(+ 1 2)" read)
    ==> 3

Given the appropriate "prefix", one can indeed read "(+ 1 2)" as the
number 3, as one participant in this forum wished.

    (with-input-from-string "#,(+ 1 (+ 2 3))" read)
    ==> error: can't add number 1 and a list '(+ 2 3)
    (with-input-from-string "#,(+ 1 #,(+ 2 3))" read)
    ==> 6

    ; provide a readable representation to a structure (record)
    (define-reader-ctor 'my-vector
      (lambda x (apply vector (cons 'my-vector-tag x))))
    (with-input-from-string "#,(my-vector (my-vector 1 2))" read)
    ==> a vector whose second element is a list of a symbol
        my-vector, number 1, and number 2.
    (with-input-from-string "#,(my-vector #,(my-vector 1 2))" read)
    ==> a vector whose second element is a my-vector constructed
        from numbers 1 and 2.
    (with-input-from-string "#,(my-vector #,(my-vector #,(+ 9 -4)))" read)
    ==> '#(my-vector-tag #(my-vector-tag 5))

    ; provide readable interpretation to uniform vectors (per
    ; SRFI-4)
    ; Incidentally this was the initial thrust behind the
    ; reader-ctor proposal
    (define-reader-ctor 'f32 f32vector)
    (with-input-from-string "#,(f32 1.0 2.0 3.0)" read)
    ==> a uniform f32 vector with three elements

Loading a file containing a read-time application:
    (define (temp-proc)
      (let ((v '#,(f32 1.0 2.0 3.0))) (f32vector-ref v 1)))
defines a procedure temp-proc:
    (pretty-print temp-proc)
    ==> (lambda () (let ((v '#f32(1. 2. 3.))) (f32vector-ref v 1)))
    (temp-proc)
    ==> 2.0

    ; provide readable interpretation to other 'complex'
    ; datatypes, e.g., _ports_
    (define-reader-ctor 'file open-input-file)
    (with-input-from-string "#,(file \"/tmp/a\")"
      (lambda () (read-char (read))))
    ==> returns the first character of the file "/tmp/a"

    ; In a #,(tag arg...) form, the tag itself may be a read-time
    ; application
    (define-reader-ctor 'plus-or-list
      (let ((flag #t))
        (lambda ()
          (begin0 (if flag '+ 'list) (set! flag (not flag))))))

    Reading "#,(#,(plus-or-list) 1 2)" => 3 as expected
    Reading "#,(#,(plus-or-list) 1 2)" => (1 2) as expected
    Reading "#,(#,(plus-or-list) 1 2)" => 3 as expected

That is, _sometimes_ "#,(#,(plus-or-list) 1 2)" reads as number 3, and
some other times it reads as a list of two numbers. This shows how to
make an "ambivalent reader".

These examples really work. You can see that for yourselves by running
the verification code
    http://pobox.com/~oleg/ftp/Scheme/vread-apply.scm
(see its title comments for details). This code also checks that syntax
and semantic errors in #, forms are correctly detected and reported.

Currently, read-time application is implemented by a transient,
run-time, localized subversion of a Gambit reader -- interposing on one
of its functions. This approach can easily be extended to other Scheme
systems.
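
The lookup-and-apply step that a #, form triggers can be sketched in
portable Scheme, independently of any particular reader. This is only
an illustration, not the code in read-apply.scm: the names
reader-ctor-table, register-reader-ctor! and apply-reader-ctor are
hypothetical, the table is assumed to be a plain association list, and
'error' is assumed to be available (as in SRFI-23 or R7RS). The sketch
starts from the list that follows the #, prefix, already read as a
datum.

```scheme
;; Hypothetical sketch, not the author's implementation.
;; The tag -> constructor table is a plain association list.
(define reader-ctor-table '())

;; Associate TAG (a symbol) with PROC. A newer association shadows
;; an older one because assq finds the first match.
(define (register-reader-ctor! tag proc)
  (set! reader-ctor-table
        (cons (cons tag proc) reader-ctor-table)))

;; FORM is the datum read after "#,", e.g. (+ 1 2).
;; The arguments are NOT evaluated; they are passed to the
;; constructor exactly as they were read.
(define (apply-reader-ctor form)
  (if (not (and (pair? form) (symbol? (car form))))
      (error "#, must be followed by a list that starts with a tag:" form))
  (let ((entry (assq (car form) reader-ctor-table)))
    (if (not entry)
        (error "no reader-ctor is defined for tag:" (car form)))
    (apply (cdr entry) (cdr form))))

(register-reader-ctor! 'list list)
(register-reader-ctor! '+ +)

(apply-reader-ctor '(list 1 2 #f "4 5"))   ; ==> (1 2 #f "4 5")
(apply-reader-ctor '(+ 1 2))               ; ==> 3

;; A nested #, form is applied first, and its value becomes an
;; argument of the outer form, as in "#,(+ 1 #,(+ 2 3))":
(apply-reader-ctor (list '+ 1 (apply-reader-ctor '(+ 2 3))))  ; ==> 6
```

Because the arguments are data rather than expressions, a plain nested
list such as (+ 2 3) reaches the constructor unchanged, which is why
"#,(+ 1 (+ 2 3))" is an error while "#,(+ 1 #,(+ 2 3))" reads as 6.
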
The goal of the present implementation was to interfere as little as
possible with the reader code, and by all means to avoid rebuilding the
entire Gambit system. In true Lisp spirit, Gambit collects various
parameters pertaining to parsing of input (e.g., the list of named
character entries: #\tab, etc.) in a special structure: a readtable.
The last slot in the readtable is a table of character-handlers. The
latter are called when a specific character (e.g., #\#, #\(, #\\ ) is
encountered in the reader's input stream. The handler is to process
that character (and perhaps a few following characters) and to return
the corresponding Scheme object. I have replaced the sharp-handler with
a modified version that detects and processes a #, character
combination. See
    http://pobox.com/~oleg/ftp/Scheme/read-apply.scm
for complete details.

I also had to decide where to store associations of tags with
reader-ctor procedures. The most appropriate place is a readtable. In a
proper implementation, the readtable structure ought to have a
dedicated slot for reader-ctor associations. However, extending the
readtable changes its size, which would require a complete rebuild of
the Gambit system. This was unacceptable. Therefore, the present
implementation hides the reader-ctor table inside the sharp-handler
itself. The latter is the table's sole consumer anyway. While the
original sharp-handler was a regular procedure, the new handler is a
closure. Obviously we need a way to add new associations to the
reader-ctor table hidden inside the closure. The Gambit reader always
calls a sharp-handler with two arguments. The modified handler takes
two more _optional_ arguments as well, to maintain the reader-ctor
table. Adding optional arguments does _not_ alter the original
interface between the reader and a character-handler; this extension
can be considered a sort of "derivation" or "inheritance".

The subversion of the Gambit reader occurs only when define-reader-ctor
is first called.
Merely loading or including 'read-apply.scm' does not alter any of the
Gambit reader's data or structures. The reader-ctor patch is therefore
lightweight, transient and transparent.

BTW, this follows my intent on global subversion, e.g., of POSIX system
calls. So far I have successfully managed to portably interpose on
open(), close(), stat() and lstat(). No modifications to the kernel,
any of the system libraries, or applications' source code are required.
LD_PRELOAD is not required either.

As was mentioned above, a read-time application trivially implements
'cond-expand' of SRFI-0, as a follow-up article will show.
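
The trick of hiding the reader-ctor table inside the sharp-handler
closure, and maintaining it through extra optional arguments, can be
sketched as follows. This is a guess at the general shape, not the
code from read-apply.scm: make-ctor-handler and lookup-and-apply are
hypothetical names, the optional arguments are modelled with a portable
rest argument, and the two required arguments are stand-ins (a tag and
an argument list) for whatever the Gambit reader really passes to a
character-handler.

```scheme
;; Hypothetical sketch, not the author's implementation.
;; PROCESS does the real work of a two-argument call; it receives the
;; two reader-supplied arguments plus the hidden table.
(define (make-ctor-handler process)
  (let ((ctor-table '()))               ; visible only to the closure
    (lambda (arg1 arg2 . maintenance)
      (cond
        ;; Two arguments: the ordinary call made by the reader.
        ((null? maintenance)
         (process arg1 arg2 ctor-table))
        ;; Four arguments: a maintenance call that adds an
        ;; association of a tag with a constructor procedure.
        ((and (pair? (cdr maintenance)) (null? (cddr maintenance)))
         (set! ctor-table
               (cons (cons (car maintenance) (cadr maintenance))
                     ctor-table))
         (car maintenance))
        (else
         (error "handler expects 2 or 4 arguments"))))))

;; Stand-in for real parsing: treat the two required arguments as a
;; tag and a list of already-read arguments.
(define (lookup-and-apply tag args table)
  (let ((entry (assq tag table)))
    (if entry
        (apply (cdr entry) args)
        (error "no reader-ctor is defined for tag:" tag))))

(define sharp-handler (make-ctor-handler lookup-and-apply))

(sharp-handler #f #f 'list list)   ; maintenance call, 4 arguments
(sharp-handler 'list '(1 2))       ; reader-style call ==> (1 2)
```

A caller that knows only the two-argument protocol never sees the
extra arguments, so the interface between the reader and the handler
stays as it was, while the table remains private to the closure.
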