Scheme Utility Functions

A standard prelude
Special forms and primitives that are used in almost all the code on this site.
An assert macro with informative reports on failure
define-opt: A concise define form allowing optional arguments
lookup-def: Look up a value in alists of various formats and handle the look-up failure
A catcher of errors
Native implementations of the exception-handling SRFI (SRFI-12)
Yet another kind of environment, like a pool dictionary in Smalltalk or a wordlist in Forth
string-split
String search functions: substring?, string-prefix?, etc.
Miscellaneous string utilities: string->integer, make-char-quotator
Helpful list manipulation utilities: list-intersperse and list-tail-diff
AND-LET*: an AND with local bindings, a guarded LET* special form
Enumerator ANY?
Parsing of input [in a separate document]
Reading IEEE binary floats [in a separate document]
File, directory, network, and other OS/POSIX interfaces [in a separate document]

A standard prelude

This standard prelude is a collection of special forms and primitives that are used in nearly all the code on this site.

assert
call-with-input-string, call-with-output-string: if not otherwise supported by a Scheme system
cout, cerr, nl
++, --, ++!, --!, inc, dec, inc!, dec!: the successor and the predecessor
when, whennot, begin0: convenient control operators
string-null?: from Olin Shiver's Underground String Library
cons*: from SRFI-1
let*-values: from SRFI-11
lookup-def
cond-expand: SRFI-0, as a low- or high-level macro, on platforms that do not support SRFI-0 natively
define-opt
handle-exceptions: SRFI-12
and-let*: SRFI-2

Besides convenience, the purpose of the standard prelude is to free the rest of my code from peculiarities of Scheme systems. The source code for the prelude itself therefore has to be rather platform-specific.

Version

The current version is 1.23, Oct 22, 2004.

References

myenv.scm [18K]
The prelude for a Gambit-C 3.0 Scheme system.

myenv-bigloo.scm [16K]
The prelude for a Bigloo 2.4b Scheme system.

myenv-chez.scm [22K]
The prelude for a Petite Chez 6.0a Scheme system. All special forms are realized as syntax-rule macros.

myenv-scm.scm [20K]
The prelude for a SCM 5d6 Scheme system. All special forms are realized as syntax-rule macros.

myenv-mit.scm [11K]
The prelude for a MIT 7.5.2 Scheme system.

vmyenv.scm [16K]
A thorough verification code.

An `assert` macro with informative reports on failure

We describe the design and the implementation of a portable assert macro with improved reporting capabilities. A message printed upon assertion failure shows, among other things, the bindings for interesting variables occurring within the asserted conditions. A programmer can also specify any strings or other expressions to evaluate and print at that point. Entering a REPL or a debugger after a failure might seem like the most informative approach. However, it is too heavyweight, precludes automated regression testing, and is platform-specific. The assert macro discussed in this article turns out informative, lightweight, and portable.

The good assert macro is implemented in SCM, Scheme48 and Petite Chez Scheme as a syntax-rule macro, and in Gambit-C and Bigloo Scheme systems as a low-level macro. An article posted in September 2001 claimed that assert cannot be written as a syntax-rule. The recent discovery of the syntax-rule macro symbol? that can reliably detect identifiers shows that the good assert form can, after all, be implemented in R5RS Scheme.

Version

The current version is 2.0, Oct 14, 2003.

References

A good assert macro [plain text file]
A USENET article that gives the motivation, discusses the assertion checking in various systems, introduces the better assert macro, gives examples of its usage, and comments on its implementation.
The article was posted on a newsgroup comp.lang.scheme on Sun, 23 Sep 2001 13:23:29 -0700.

A good assert macro, now as a syntax-rule [plain text file]
The article illustrates the assert macro on several examples and gives the complete R5RS implementation. The code includes several general-purpose syntax-rules, such as id-memv??, symbol??, and k!reverse -- as well as syntax-rules to compute a list of all identifiers (without duplicates) that occur in an arbitrary Scheme expression. We rely on light-weight CPS syntax-rule macros throughout.
The article was posted on a newsgroup comp.lang.scheme on Tue, 14 Oct 2003 11:53:09 -0700.

How to write symbol? with syntax-rules

A standard prelude
The assert macro is a part of the standard prelude.

vmyenv.scm [16K]
A thorough verification code.

Native implementations of the exception-handling SRFI (SRFI-12)

We describe complete native implementations of the exception handling SRFI (SRFI-12) on Gambit and Bigloo. Although SRFI-12 is officially withdrawn, it is still a well thought-out SRFI. Its native implementations help make Scheme code more portable.

Both Bigloo and Gambit implementations support the full SRFI-12, including continuable exceptions (signals), nested exceptions, and exceptional conditions. All examples given in SRFI-12 pass. Both implementations are native: a SRFI-12 application can capture, describe, and recover from native run-time exceptions, such as (/ 1 0) and (car '()). Native also means an ability to mix and match SRFI-12 and platform-specific exception-handling forms.

The Bigloo and Gambit implementations of SRFI-12 are in public domain and can be used for any suitable purpose.

Version

The current version is 1.3, Nov 16, 2001.

Platforms

Bigloo 2.4a and Gambit-C 3.0

References

< http://srfi.schemers.org/srfi-12/>
SRFI-12: Exception Handling. By William Clinger, R. Kent Dybvig, Matthew Flatt, and Marc Feeley

srfi-12.scm [10K]
The code for the procedures, with cond-expand for Gambit and Bigloo

A standard prelude
Implementations of a special form handle-exceptions

vsrfi-12.scm [5K]
Validation code: all examples from SRFI-12 and then some

Resurrect exception handling SRFI (SRFI-12) [plain text file]
An article posted on a newsgroup comp.lang.scheme on Sun, 18 Nov 2001 13:00:16 -0800. The article gives several examples of SRFI-12 usage. The article also shows the degree of integration of SRFI-12 into the Scheme systems.

`define-opt`: A concise `define` form allowing optional arguments

The form define-opt defines a function that may take optional arguments. For each optional argument, the programmer may specify a default value initializer, which is an arbitrary expression to be evaluated when the corresponding optional argument is omitted. The form define-opt is designed to be totally backwards-compatible with define: the meaning and the validity of a piece of code should not change if we rename all define forms into define-opt. OTH, define-opt is designed to be as compatible with DSSSL's extended define as possible -- while avoiding the non-standard lexical token #!optional. On systems that do support DSSSL (e.g., Gambit, Bigloo, Kawa) our define-opt expands into DSSSL's extended define, which is implemented efficiently on these systems.

Example:

     (define-opt (foo arg1 arg2 (optional arg3 (arg4 init4))) body)

Given below is the relevant part of the DSSSL specification, lifted from Gambit's online documentation:

     define-formals = formal-argument-list | r4rs-define-formals
     formal-argument-list = reqs opts rest keys
     reqs = required-formal-argument*
     required-formal-argument = variable
     opts = #!optional optional-formal-argument* | empty
     optional-formal-argument = variable | ( variable initializer )
     rest = #!rest rest-formal-argument | empty
     rest-formal-argument = variable
     keys = #!key keyword-formal-argument* | empty
     keyword-formal-argument = variable | ( variable initializer )
     initializer = expression
     r4rs-lambda-formals = ( variable* ) | ( variable+ . variable ) | variable
     r4rs-define-formals = variable* | variable* . variable

Variables in required-formal-arguments are bound to successive actual arguments starting with the first actual argument. It shall be an error if there are fewer actual arguments than required-formal-arguments.
Next variables in optional-formal-arguments are bound to remaining actual arguments. If there are fewer remaining actual arguments than optional-formal-arguments, then the variables are bound to the result of evaluating initializer, if one was specified, and otherwise to #f. The initializer is evaluated in an environment in which all previous formal arguments have been bound.
It shall be an error for a variable to appear more than once in a formal-argument-list.
It is unspecified whether variables receive their value by binding or by assignment.

Our define-opt does not currently support rest and keys arguments. Also, instead of #optional optional-formal-argument ... we write (optional optional-formal-argument ...).

Our define-opt is similar to PLT Scheme's opt-lambda. However, the very syntax of define-opt guarantees that optional arguments are really at the very end of the argument list.

Version

The current version is 1.1, Oct 30, 2003..

References

myenv-chez.scm [22K]
A portable implementation of define-opt using syntax-rules. That implementation should work on any R5RS Scheme system.

myenv.scm [18K]
An implementation of define-opt as a low-level macro that expands into DSSSL's extended define. Gambit-C supports DSSSL extensions.

vmyenv.scm [16K]
Thorough validation code, as one of the test cases in that file.

`lookup-def`: Look up a value in alists of various formats and handle the look-up failure

The form lookup-def looks up the value associated with a symbolic key in alist ((key value) ...) or ((key . value) ...). If the association has the form

(key . value) where value is not a pair: we return value
(key value): we return value
(key value1 value2 value3 ...): we return (value1 value2 value3 ...)

The procedure tries to do the right thing for both kinds of associative lists.

The form lookup-def is a special form rather than a regular procedure. Its first two arguments are evaluated exactly once. The default-value argument, if given, is evaluated only if the desired key is not found. I have not seen any need to pass lookup-def as an argument to other functions. If the latter is desired, it is not difficult to accomplish by explicitly wrapping lookup-def into a lambda form.

We use a pseudo-keyword argument warn: as a modifier. This is not really a keyword argument (although it may be, if the Scheme system turns out DSSSL-compatible).

Specification:

(lookup-def key alist): looks up key in alist and returns the associated value. Raises an error if the key is not found.
(lookup-def key alist default-exp): looks up key in alist and returns the associated value. If the the key is not found, we evaluate default-exp and return its result.
(lookup-def key alist warn: default-exp): the same as above. In addition, we write a warning (using cerr) if the key is not found.

Version

The current version is 1.1, Oct 22, 2004.

References

myenv-chez.scm [22K]
A portable implementation of lookup-def using syntax-rules. That implementation should work on any R5RS Scheme system.

myenv.scm [18K]
An implementation of lookup-def as a low-level macro.

vmyenv.scm [16K]
Thorough validation code, as one of the test cases in that file.

A catcher of errors

A form that makes sure a test case that was supposed to fail fails indeed. The catcher has been used extensively in the validation code on this site, to check that a function under test correctly detects errors of its invocation. An expected error does not abort the execution nor does it enter the debug mode. Therefore, the regression test code can go on without interruption. The article below also discusses a challenging topic of validating the error detection behavior of special forms.

Version

The current version is 1.4, Feb 27, 2001.

References

catch-error.scm [2K]
The commented source code.

Testing Scheme functions and special forms: catching expected errors [plain text file]
An article posted on a newsgroup comp.lang.scheme on Sun, 10 Jan 1999 23:29:44 GMT. The article describes tools to validate error reporting code of Scheme functions and special forms.

Yet another environment

The environment is an ordered list of associations of keys to values. Keys are Scheme symbols (identifiers). The environment does not generally follow any static scope of Scheme expressions, nor a dynamic scope of Scheme procedure activations. At any time we can add a new association to the environment, which will shadow the existing one for the same key, if any. An environment can be reified to a first-class value, either by exporting it into an associative list, or by capturing all or a part of it, see env.capture!. Therefore, an environment itself can be associated with a key. We can always extend the current environment with the previously exported or captured environment. Procedural values, as any other first-class values, may be placed in the environment. The execution of these procedures may push additional associations into the environment. The environment thus closely resembles dictionaries in Postscript or wordlists in Forth.

The following is a summary of the environment interface, which is fully documented in the title comments to the source code. The validation code provides many usage examples.

env.find KEY [syntax]
env.demand KEY [syntax]
env.print
env.bind KEY VALUE [syntax]
env.bind* ASSOC-LIST
env.mark: This procedure places a mark into the environment, and returns the mark
env.flush! MARK: This procedure flushes the environment through the given mark.
env.capture! MARK NAME: This procedure captures the environment through the MARK, creating a captured (i.e., a difference) environment with a given NAME (a string). The captured portion is removed from the environment. The captured environment can be put back to the environment via env.extend, env.with and env.with-exclusive.
env.extend CAPTURED-ENV: This procedure extends the environment with a previously CAPTURED-ENV. In other words, we extend the base environment with a difference environment.
env.with CAPTURED-ENV THUNK: This procedure executes THUNK in the environment extended by CAPTURED-ENV. After the thunk is finished, the original environment is restored. The result of THUNK is returned.
env.with-exclusive CAPTURED-ENV THUNK: The same is above but the CAPTURED-ENV temporarily replaces the current environment rather than extends it. Again, after THUNK is over, the original environment is restored.
env.->alist: This procedure exports the environment into an associative list, skipping special slots (which carry environments' names, for example). The list will maintain the same order of bindings as was in the original environment.

In the future, we plan to introduce virtual slots: slots whose values are (re-)computed the first time or every time the slot is accessed (cf. Dylan). Also, env.with and env.with-exclusive ought to use dynamic-wind.

Version

The current version is 2.4, Feb 27, 2001.

References

env.scm [9K]
The commented source code.

vmyenv.scm [16K]
A thorough validation/verification code.

Vocabulary package
A more ambitious project: (poly/homo)morphic dictionaries with a dynamic inheritance path, an embedded OO system.

Various string search functions

These functions are now part of SRFI-13. We provide optimized implementations in case a particular Scheme system does not offer SRFI-13 natively or conveniently.

substring? PATTERN STRING -> BOOL

This function searches STRING to see if it contains a substring PATTERN. The procedure returns the index of the first such occurrence; or #f if STRING does not contain PATTERN.

     (substring? "rat" "pirate")       =>  2
     (substring? "rat" "outrage")      => #f
     (substring? "" any-string)        =>  0

This is not an advanced search function: its worst-case time complexity is (n-m+1)*m achieved at (substring? "aaac" "aa....aac"). On the other hand, the implementation is not so dumb as to rely on substring. An inner comparison loop is entered only when first two characters of the pattern matched.

Note that (substring? pat str) is functionally identical to
(call-with-input-string str (lambda (port) (find-string-from-port? pat port)))

More specialized substring searches

     string-prefix? PATTERN STRING    -> BOOL
     string-prefix-ci? PATTERN STRING -> BOOL

These procedures test if PATTERN is a prefix of STRING.

     string-suffix? PATTERN STRING    -> BOOL
     string-suffix-ci? PATTERN STRING -> BOOL

These procedures test if PATTERN is a suffix of STRING.

     string-index STRING CHAR  -> INTEGER or #f
     string-rindex STRING CHAR -> INTEGER or #f

These procedures return the index of the first or the last occurrence of a character CHAR within STRING, or #f if STRING does not contain CHAR.

More advanced string searches can be done by applying input parsing functions to a string port.

Version

The current version is Apr 2003.

References

srfi-13-local.scm [9K]
The commented source code.

vinput-parse.scm [18K]
A thorough verification code.

Input parsing functions

`string-split`

     string-split STRING              -> STRINGS
     string-split STRING '()          -> STRINGS
     string-split STRING '() MAXSPLIT -> STRINGS

These procedures return a list of whitespace-delimited words in STRING. Leading and trailing whitespaces of the words are trimmed. If STRING is empty or contains only whitespace, the empty list is returned.

     string-split STRING CHARSET          -> STRINGS
     string-split STRING CHARSET MAXSPLIT -> STRINGS

These procedures return a list of words in STRING delimited by the characters in CHARSET. The latter is a list of characters to be treated as delimiters. Leading or trailing delimiters of the words are not trimmed. That is, the resulting list will have as many initial empty string elements as there are leading delimiters in STRING.

If MAXSPLIT is specified and positive, the resulting list will contain at most MAXSPLIT elements, the last of which is the string remaining after (MAXSPLIT - 1) splits. If MAXSPLIT is specified and non-positive, the empty list is returned. "In time critical applications it behooves you not to split into more fields than you really need."

These string-split procedures are based on a split function in Python or Perl.

     (string-split " abc d e f  ")       ==> ("abc" "d" "e" "f")
     (string-split " abc d e f  " '() 1) ==> ("abc d e f  ")
     (string-split " abc d e f  " '() 0) ==> ()
     (string-split ":" '(#\:))           ==> ("" "")
     (string-split ":abc:d:e::f:" '(#\:))
           ==> ("" "abc" "d" "e" "" "f" "")
     (string-split "root:x:0:0:Lord" '(#\:) 2)
           ==> ("root" "x:0:0:Lord")
     (string-split "/usr/local/bin:/usr/bin:/usr/ucb/bin" '(#\:))
           ==> ("/usr/local/bin" "/usr/bin" "/usr/ucb/bin")
     (string-split "/usr/local/bin" '(#\/))
           ==> ("" "usr" "local" "bin")

Version

The current version is 2.0, Sep 2, 1999.

References

util.scm [12K]
The commented source code.

vinput-parse.scm [18K]
A thorough verification code.

Miscellaneous string utilities

string->integer STRING START END -> INTEGER or #f

This procedure checks to see if a substring of STRING from START (inclusive) till END (exclusive) is a representation of a non-negative integer in decimal notation. If this is the case, this integer is returned. Otherwise -- when the substring contains non-decimal characters, or when the range from START till END is not within STRING, the result is #f.

This procedure is a specialization of the standard string->number. The latter is far more general: for example, it will try to read strings like "1/2", "1S2", "1.34" and even "1/0" (the latter causing a zero-divide error). Note that to string->number, "1S2" is a valid representation of an inexact integer 100. Oftentimes we want to be more restrictive about what we consider a number: we want merely to read an integral label.

make-char-quotator QUOT-RULES -> PROC

Given QUOT-RULES, an associative list of (char . string) pairs, return a quotation procedure. The returned quotation procedure takes a string and yields either a string or a list of strings. The quotation procedure checks to see if its argument string contains any occurrence of a character that needs to be encoded (quoted). If the argument string is "clean", it is returned unchanged. Otherwise, the quotation procedure will yield a list of string fragments. The input string will be broken at the places where the special characters occur. The special characters will be replaced by the corresponding encoding strings.

For example, the following expression yields a procedure that quotes special HTML characters:

     (make-char-quotator
       '((#\< . "&lt;") (#\> . "&gt;") (#\& . "&amp;") (#\" . "&quot;")))

String case modification functions

The following functions are now part of SRFI-13. We provide their optimized implementations in case a particular Scheme system does not offer SRFI-13 natively or conveniently.

     string-downcase  STRING -> STRING
     string-downcase! STRING -> UNSPECIFIED
     string-upcase    STRING -> STRING
     string-upcase!   STRING -> UNSPECIFIED

String concatenation functions from SRFI-13

We provide their optimized implementations in case a particular Scheme system does not offer SRFI-13 natively or conveniently. Some of these functions are specialized, by omitting optional arguments, for the sake of efficiency.

     string-xcopy! TARGET TSTART S SFROM STO      -> UNSPECIFIED
     string-concatenate-reverse STRINGS FINAL END -> STRING
     string-concatenate/shared STRINGS            -> STRING
     string-concatenate-reverse/shared STRINGS    -> STRING

Version

The current version is Apr 2003.

References

util.scm [12K]
The commented source code.

srfi-13-local.scm [9K]
The commented source code.

vinput-parse.scm [18K]
A thorough verification code.

Helpful list manipulation utilities

     list-intersperse SRC-L VAL -> LST

This procedure inserts a VAL between elements of the list SRC-L, returning a freshly allocated list.

     list-intersperse! SRC-L VAL -> UNSPECIFIED

This procedure inserts a VAL between elements of the list SRC-L inplace.

     list-tail-diff L1 L2 -> LST

This procedure returns a freshly allocated list LST such that the following holds:

(equal? L1 (append LST L2)) if L2 is a tail of L1
(equal? L1 LST) otherwise

Version

The current version is 1.7, Jan 26, 1999.

References

util.scm [12K]
The commented source code.

vmyenv.scm [16K]
A thorough verification code.

`AND-LET*`: an `AND` with local bindings

AND-LET* is an AND with local bindings, or a guarded LET* special form. The form and-let* takes a sequence of bindings (var init) or test expressions (test-expr). The form evaluates init and test-expr expressions in order. The results of init expressions are bound to the corresponding variables and can be used in subsequent init and test-expr. The first init or test-expr that returns a failure indicator #f triggers the "guard". The form and-let* then skips the rest of the bindings/test expressions and immediately returns #f.

The form AND-LET* can be considered a combination of LET* and AND , or a generalization of COND's send operator =>. AND-LET* closely resembles a do form in Haskell in the context of a Maybe Monad. The form AND-LET* is a light-weight, efficient and lexically-scoped exception mechanism.

AND-LET* is a SRFI-2, a Final Scheme Request for Implementation #2. AND-LET* used to be called LAND*; it was renamed on Jim Blandy's suggestion.

Examples:

     (and-let* ((my-list (compute-list)) ((not (null? my-list))))
            (do-something my-list))
     
     (define (look-up key alist)
       (and-let* ((x (assq key alist))) (cdr x)))
     
     (or
       (and-let* ((c (read-char)) ((not (eof-object? c))))
         (string-set! some-str i c) (inc! i))
       (begin (do-process-eof)))
     
           ; an excerpt from a term unifier
     (and-let* (((pair? tree2)) ; recursively match tree1 (a pair) and tree2
                 (env-new (match-tree (car tree1) (car tree2) env)))
       (match-tree (cdr tree1) (cdr tree2) env-new))

Version

The current version is 2.0, June 28, 2002.

References

< http://srfi.schemers.org/srfi-2/>
Final Scheme Request for Implementation #2

A standard prelude
AND-LET* is a part of the prelude.

vland.scm [10K]
An extensively commented validation code. This code shows an example of validating special forms, a rather challenging task.

LAND*: an AND with local bindings [plain text file]
An article posted on a newsgroup comp.lang.scheme on Sun Feb 22 21:53:52 1998 GMT.
The article gives the motivation, sample applications, the syntax and the denotational semantics of AND-LET*. The article also discusses programming with guards, compares the form with =>, and mentions challenges of writing test cases for special forms.

Enumerator `ANY?`

This generic enumerator searches for the first element in a collection satisfying a given predicate. By collection we mean a list, a vector, a string, or even an input port.

     any? PRED COLLECTION -> VALUE or #f

The enumerator applies PRED to every element of the COLLECTION in turn. The first element for which PRED returns anything but #f stops the iteration; the result of the predicate is returned. If none of the elements of the COLLECTION satisfy the predicate, the return value from the procedure is #f.

Version

The current version is 1.1, Feb 28, 1996.

References

util.scm [12K]
The commented source code -- a part of the miscellaneous utilities package

vmyenv.scm [16K]
Validation tests of any? and of other utility functions.

Last updated November 5, 2004

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from SXML by SXML->HTML

Scheme Utility Functions

A standard prelude

An assert macro with informative reports on failure

Native implementations of the exception-handling SRFI (SRFI-12)

define-opt: A concise define form allowing optional arguments

lookup-def: Look up a value in alists of various formats and handle the look-up failure

A catcher of errors

Yet another environment

Various string search functions

string-split

Miscellaneous string utilities

Helpful list manipulation utilities

AND-LET*: an AND with local bindings

Enumerator ANY?

Last updated November 5, 2004

An `assert` macro with informative reports on failure

`define-opt`: A concise `define` form allowing optional arguments

`lookup-def`: Look up a value in alists of various formats and handle the look-up failure

`string-split`

`AND-LET*`: an `AND` with local bindings

Enumerator `ANY?`