From www@deja.com Wed Mar  8 03:18:27 2000
Message-ID: <8a4h56$oqu$1@nnrp1.deja.com>
From: oleg@pobox.com
Subject: Reading IEEE binary floats in R5RS Scheme
Date: Wed, 08 Mar 2000 03:24:25 GMT
Newsgroups: comp.lang.scheme
References: <38bf00f8.78618577@news.berkeley.edu> <bpxln40xakb.fsf@neon.rchland.ibm.com> <8a1v19$ica$1@bob.news.rcn.net> <bpxput6qz1o.fsf@neon.rchland.ibm.com>
X-Article-Creation-Date: Wed Mar 08 03:24:25 2000 GMT
X-Comment: added a correction to run the code under Bigloo.
Status: OR

In article <bpxput6qz1o.fsf@neon.rchland.ibm.com>,
  Daniel Ortmann <ortmann@rchland.ibm.com> wrote:
>
> After reading a *lot* of functional papers and installing most scheme's and
> ML's that would compile on AIX ... the VERY FIRST practical program I
> attempted to write needed to
> - read some lines of text from a simulator graphics output file
> - read some of the numbers as either 4-byte IEEE single precision and/or as
>   8-byte double precision floating point numbers
>
> WIPEOUT in guile and scsh.  WIPEOUT even in the amazing O'Caml.  :-(
>
> The only way this would be practically possible would be to compile special C
> code into the language implementations.

It appears there is a way to read IEEE binary floating-point numbers in
R5RS Scheme. The only assumption is that char->integer returns an integer
with the same bit pattern as its argument, a single 8-bit character
read from the file. The assumption holds for many Schemes, including Gambit
(which supports Unicode and UTF-8, btw).

Listing 2 is C code that writes a set of representative IEEE floats
to a binary file. The sample includes (+/-)minfloat, (+/-)maxfloat and
the epsilon. Note that even writing floating-point numbers in a portable
format is more challenging than it appears. Listing 1 is R5RS Scheme
code that reads the binary floating-point numbers from a file and prints
them. The code doesn't handle +Inf, -Inf and NaNs, although that is trivial
to rectify.

One can twiddle bits in Scheme after all: it's just arithmetic...
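
To make that arithmetic concrete outside of Scheme, here is a minimal
sketch in C (the language of Listing 2) that decodes four big-endian
bytes using comparisons, additions, multiplications and divisions only.
The names decode_float_be and scale2 are mine, invented for this
illustration. The sketch also returns infinities and NaNs when the
biased exponent is 255, which is one possible way to cover the cases
Listing 1 leaves out.

```c
#include <math.h>   /* INFINITY and NAN macros (C99) */

/* Multiply x by 2^n with repeated doubling or halving (hypothetical helper). */
static double scale2(double x, int n)
{
  for (; n > 0; n--) x *= 2.0;
  for (; n < 0; n++) x /= 2.0;
  return x;
}

/* Decode four big-endian bytes (each 0..255) as an IEEE 754 single,
   without bit operators or pointer casts. */
static double decode_float_be(int b1, int b2, int b3, int b4)
{
  const int negative = (b1 > 127);
  int exponent, mantissa;
  double value;

  if (negative) b1 -= 128;                 /* strip the sign bit */
  exponent = 2 * b1 + (b2 > 127 ? 1 : 0);  /* 8-bit biased exponent */
  if (b2 > 127) b2 -= 128;                 /* strip the exponent's low bit */
  mantissa = (b2 * 256 + b3) * 256 + b4;   /* 23-bit fraction */

  if (exponent == 255)                     /* the cases Listing 1 skips */
    value = (mantissa == 0) ? INFINITY : NAN;
  else if (exponent == 0)                  /* zero and denormals */
    value = scale2((double)mantissa, -149);
  else                                     /* normalized: restore the hidden bit */
    value = scale2((double)(mantissa + 8388608), exponent - 150);

  return negative ? -value : value;
}
```

For example, the bytes 3F 80 00 00 decode to 1.0, and 40 49 0F DB decode
to the single-precision approximation of pi.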

The C code was compiled with gcc 2.95.2 on SunSparc/Solaris 2.6; the
Scheme code was interpreted by Gambit-C 3.0 on the same platform
and by Bigloo 2.4b on i686 (Pentium III). Please note that to run
the code under Bigloo and perhaps other Scheme systems, you must
change (expt 2 full-exp) into (expt 2.0 full-exp) as indicated
in the source below.

Please see the 'binary-parse.scm' file (in the same directory as the
present file) for binary parsing, bit streams, and reading a
(widely) variable number of bits from a file.


Listing 1

; Read a set of IEEE 754 float numbers from a file.
; The numbers are written in a big-endian format.
; See, for example, http://www.cs.auckland.ac.nz/~jham1/07.211/floats.html
; as a reference to IEEE 754

(define (read-byte port)
  (let ((c (read-char port)))
    (and (not (eof-object? c)) (char->integer c))))

(define (combine . bytes)
  (let loop ((bytes bytes) (accum 0))
    (if (null? bytes) accum
        (loop (cdr bytes)
              (+ (* 256 accum) (car bytes))))))

(define (read-float port)
  (let* ((b1 (read-byte port))
         (b2 (read-byte port))
         (b3 (read-byte port))
         (b4 (read-byte port)))
    (and b1 b2 b3 b4
         (let* ((sign-neg
                (and (> b1 127) (begin (set! b1 (- b1 128)) #t)))
                (full-exp (+ (* 2 b1)
                             (if (> b2 127)
                                 (begin (set! b2 (- b2 128)) 1) 0)))
                (num
                 (if (= 255 full-exp)
                     #f         ; NaN, +Inf and -Inf are not handled;
                                ; note that #f also ends the main loop
                     ; For Bigloo, change (expt 2 full-exp)
                     ; to read: (expt 2.0 full-exp)
                     (* (expt 2 full-exp)
                        (combine
                         (if (zero? full-exp) b2 (+ b2 128))
                         b3 b4)
                        (if (zero? full-exp) 1.401298464324817e-45
                            7.006492321624085e-46)))))
           (if sign-neg (and num (- num)) num)))))

; The main loop
(call-with-input-file "/tmp/a"
  (lambda (port)
    (let loop ((num (read-float port)))
      (cond
       ((not num) (display "Done") (newline)) ; R5RS strings have no \n escape
       (else
        (display num) (newline)
        (loop (read-float port)))))))

; When interpreting the output of the main loop, please keep in mind
; that single-precision floating-point numbers are accurate only to 7
; decimal places.
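
As a small illustration of that limit, here is a sketch in C (the
language of Listing 2). The function name show_single is hypothetical.
It prints a single with 9 significant digits, which is enough to tell
any two distinct singles apart, so the digits past the seventh show
where the stored value departs from the decimal one.

```c
#include <stdio.h>

/* Format x with 9 significant digits into buf (n bytes, at least 16). */
static void show_single(float x, char *buf, size_t n)
{
  snprintf(buf, n, "%.9g", (double)x);
}
```

Calling show_single(1.0f/3.0f, buf, sizeof buf) leaves "0.333333343" in
buf: the first seven decimals agree with 1/3 and the rest do not.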

Listing 2

/* Dump a sample set of FP numbers into a file, in a big-endian format */

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#define OUT_FNAME "/tmp/a"

/* Dump 'datum' into a FILE, in a _big_ endian format */
/* Note! if you think that you can do
        fwrite(&datum,sizeof(datum),1,fp);
   then you probably haven't considered endianness and other
   platform-specific data layout and representation issues.
   In the following we assume that endianness of FP numbers
   is the same as endianness of integers. This is not generally
   true for all platforms.
*/
static void write_long(const unsigned int datum, FILE * fp)
{
  const unsigned char b4 = datum;
  unsigned int accum = datum >> 8;
  const unsigned char b3 = accum;
  const unsigned char b2 = (accum >>= 8);
  const unsigned char b1 = (accum >>= 8);
  putc(b1,fp); putc(b2,fp); putc(b3,fp); putc(b4,fp);
}

static const float patterns [] = {
  0.0, -1.0, 1.0/3.0, (float)1.192092896E-07, 1+1.192092896E-07F,
  1e-23, -1e-23,
  (float)3.40282346638528860e+38, (float)-3.40282346638528860e+38,
  (float)1.40129846432481707e-45, (float)-1.40129846432481707e-45,
  (float)3.14159265358979323846};

int main(void)
{
  FILE * fp = fopen(OUT_FNAME,"wb");
  const float * pp;
  if( fp == (FILE*)0 )
    perror("Couldn't open " OUT_FNAME), exit(4);

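  for(pp=patterns; pp<patterns+sizeof(patterns)/sizeof(patterns[0]); pp++)
  {
    /* Reinterpret the float's bits through a union rather than an
       (int *) cast, which breaks the aliasing rules and drops const.
       Assumes float and unsigned int have the same size. */
    union { float f; unsigned int u; } bits;
    bits.f = *pp;
    write_long(bits.u,fp);
  }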
  fclose(fp);
  printf("\nCreated sample file '%s'\n", OUT_FNAME);
  return 0;
}
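
The original question also asked about 8-byte doubles. The same
arithmetic should carry over to the double layout (1 sign bit, 11
exponent bits with bias 1023, 52 fraction bits). The following is only a
sketch in C under that assumption; decode_double_be and scale2 are
hypothetical names, and the bytes are taken to be in big-endian order as
in the listings above.

```c
#include <math.h>   /* INFINITY and NAN macros (C99) */

/* Multiply x by 2^n with repeated doubling or halving (hypothetical helper). */
static double scale2(double x, int n)
{
  for (; n > 0; n--) x *= 2.0;
  for (; n < 0; n++) x /= 2.0;
  return x;
}

/* Decode eight big-endian bytes as an IEEE 754 double, arithmetic only. */
static double decode_double_be(const unsigned char b[8])
{
  const int negative = (b[0] > 127);
  const int exponent = (b[0] % 128) * 16 + b[1] / 16; /* 11-bit biased exponent */
  double mantissa = b[1] % 16;                        /* top 4 of the 52 fraction bits */
  double value;
  int i;

  for (i = 2; i < 8; i++)                  /* remaining 48 fraction bits */
    mantissa = mantissa * 256.0 + b[i];

  if (exponent == 2047)                    /* infinities and NaNs */
    value = (mantissa == 0.0) ? INFINITY : NAN;
  else if (exponent == 0)                  /* zero and denormals */
    value = scale2(mantissa, -1074);
  else                                     /* normalized: add the hidden bit, 2^52 */
    value = scale2(mantissa + 4503599627370496.0, exponent - 1075);

  return negative ? -value : value;
}
```

The fraction is accumulated in a double rather than an integer because
52 bits do not fit in a 32-bit int, while every integer below 2^53 is
exactly representable as a double.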