From www@deja.com Wed Mar 8 03:18:27 2000
Message-ID: <8a4h56$oqu$1@nnrp1.deja.com>
From: oleg@pobox.com
Subject: Reading IEEE binary floats in R5RS Scheme
Date: Wed, 08 Mar 2000 03:24:25 GMT
Newsgroups: comp.lang.scheme
References: <38bf00f8.78618577@news.berkeley.edu> <8a1v19$ica$1@bob.news.rcn.net>
X-Article-Creation-Date: Wed Mar 08 03:24:25 2000 GMT
X-Comment: added a correction to run the code under Bigloo.
Status: OR

Daniel Ortmann wrote:
>
> After reading a *lot* of functional papers and installing most scheme's and
> ML's that would compile on AIX ... the VERY FIRST practical program I
> attempted to write needed to
>   - read some lines of text from a simulator graphics output file
>   - read some of the numbers as either 4-byte IEEE single precision and/or as
>     8-byte double precision floating point numbers
>
> WIPEOUT in guile and scsh.  WIPEOUT even in the amazing O'Caml.  :-(
>
> The only way this would be practically possible would be to compile special C
> code into the language implementations.

It appears there is a way to read IEEE binary floating-point numbers
in R5RS Scheme. The only assumption is that char->integer returns an
integer with the same bit pattern as the function's argument, a single
8-bit character. The assumption holds for many Schemes including
Gambit (which supports Unicode and UTF-8, btw).

Listing 2 is C code that writes a set of representative IEEE floats
to a binary file. The sample includes (+/-)minfloat, (+/-)maxfloat and
the epsilon. Note that even writing floating-point numbers in a
portable format is more challenging than it appears.

Listing 1 is R5RS Scheme code that reads the binary floating-point
numbers from a file and prints them. The code doesn't handle +Inf,
-Inf and NaNs, although that is trivial to rectify. One can twiddle
bits in Scheme after all: it's just arithmetic...

The C code was compiled with gcc 2.95.2 on SunSparc/Solaris 2.6; the
Scheme code was interpreted by Gambit-C 3.0 on the same platform and
by Bigloo 2.4b on i686 (Pentium III). Please note that to run the code
under Bigloo and perhaps other Scheme systems, you must change
(expt 2 full-exp) into (expt 2.0 full-exp) as indicated in the source
below.

Please see 'binary-parse.scm' file (in the same directory as the
present file) for binary parsing, bit streams, and reading of (widely)
variable number of bits from a file.


Listing 1

; Read a set of IEEE 754 float numbers from a file.
; The numbers are written in a big-endian format.
; See, for example, http://www.cs.auckland.ac.nz/~jham1/07.211/floats.html
; as a reference to IEEE 754

; Return the next byte from the port as an integer, or #f at EOF
(define (read-byte port)
  (let ((c (read-char port)))
    (and (not (eof-object? c)) (char->integer c))))

; Combine bytes, most significant first, into one non-negative integer
(define (combine . bytes)
  (let loop ((bytes bytes) (accum 0))
    (if (null? bytes) accum
        (loop (cdr bytes) (+ (* 256 accum) (car bytes))))))

; Return the next single-precision number from the port,
; or #f at EOF (and also for NaN, +Inf, -Inf)
(define (read-float port)
  (let* ((b1 (read-byte port)) (b2 (read-byte port))
         (b3 (read-byte port)) (b4 (read-byte port)))
    (and b1 b2 b3 b4
      (let* ((sign-neg (and (> b1 127) (begin (set! b1 (- b1 128)) #t)))
             (full-exp (+ (* 2 b1)
                          (if (> b2 127) (begin (set! b2 (- b2 128)) 1) 0)))
             (num
              (if (= 255 full-exp) #f ; won't handle NaN and +Inf, -Inf
                  ; For Bigloo, change (expt 2 full-exp)
                  ; to read: (expt 2.0 full-exp)
                  (* (expt 2 full-exp)
                     ; restore the hidden leading bit for normal numbers
                     (combine (if (zero? full-exp) b2 (+ b2 128)) b3 b4)
                     ; 2^-149 for denormals, 2^-150 for normal numbers
                     (if (zero? full-exp) 1.401298464324817e-45
                         7.006492321624085e-46)))))
        (if sign-neg (and num (- num)) num)))))

; The main loop
(call-with-input-file "/tmp/a"
  (lambda (port)
    (let loop ((num (read-float port)))
      (cond
        ((not num) (display "Done\n"))
        (else (display num) (newline)
              (loop (read-float port)))))))

; When interpreting the output of the main loop, please keep in mind
; that single-precision floating-point numbers are accurate only to 7
; decimal places.

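As a cross-check on the arithmetic in Listing 1, here is a minimal C
sketch of the same decoding. It is not part of the original post: the
names decode_be_float and two_to_the are hypothetical, and it assumes
the four bytes arrive most significant first, as Listing 2 writes them.
It also fills in the +Inf, -Inf and NaN case that Listing 1 leaves out.

```c
#include <math.h>   /* only for the INFINITY and NAN macros */

/* Hypothetical helper: 2 raised to the integer power e, by plain
   repeated multiplication or division (exact for the range used here) */
static double two_to_the(int e)
{
  double r = 1.0;
  for (; e > 0; --e) r *= 2.0;
  for (; e < 0; ++e) r /= 2.0;
  return r;
}

/* Hypothetical helper: decode four bytes, most significant first,
   as an IEEE 754 single-precision number */
double decode_be_float(const unsigned char b[4])
{
  const int sign_neg = (b[0] & 0x80) != 0;
  /* 8-bit exponent: low 7 bits of b[0], then the top bit of b[1] */
  const int full_exp = ((b[0] & 0x7F) << 1) | (b[1] >> 7);
  /* 23-bit fraction: low 7 bits of b[1], then b[2] and b[3] */
  const long frac = ((long)(b[1] & 0x7F) << 16) | ((long)b[2] << 8)
                    | (long)b[3];
  double num;

  if (full_exp == 255)          /* the case Listing 1 does not handle */
    num = (frac == 0) ? INFINITY : NAN;
  else if (full_exp == 0)       /* zero or denormal: frac * 2^-149 */
    num = (double)frac * two_to_the(-149);
  else                          /* normal: (2^23 + frac) * 2^(exp-150) */
    num = (double)(frac + 0x800000L) * two_to_the(full_exp - 150);

  return sign_neg ? -num : num;
}
```

To use it on a file such as the one Listing 2 produces, one would read
four bytes at a time with getc() into an unsigned char array and pass
the array to decode_be_float. The result is returned as a double, so
every single-precision value is represented exactly.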
Listing 2

/* Dump a sample set of FP numbers into a file, in a big-endian format */

#include <stdio.h>
#include <stdlib.h>

#define OUT_FNAME "/tmp/a"

/* Dump 'datum' into a FILE, in a _big_ endian format */
/* Note! if you think that you can do
      fwrite(&datum,sizeof(datum),1,fp);
   then you probably haven't considered endianness and other
   platform-specific data layout and representation issues.
   In the following we assume that endianness of FP numbers
   is the same as endianness of integers. This is not generally
   true for all platforms.
*/
static void write_long(const unsigned int datum, FILE * fp)
{
  const unsigned char b4 = datum;
  unsigned int accum = datum >> 8;
  const unsigned char b3 = accum;
  const unsigned char b2 = (accum >>= 8);
  const unsigned char b1 = (accum >>= 8);
  putc(b1,fp); putc(b2,fp); putc(b3,fp); putc(b4,fp);
}

static const float patterns [] = {
  0.0, -1.0, 1.0/3.0,
  (float)1.192092896E-07, 1+1.192092896E-07F,
  1e-23, -1e-23,
  (float)3.40282346638528860e+38, (float)-3.40282346638528860e+38,
  (float)1.40129846432481707e-45, (float)-1.40129846432481707e-45,
  (float)3.14159265358979323846};

int main(void)
{
  FILE * fp = fopen(OUT_FNAME,"wb");
  const float * pp;
  if( fp == (FILE*)0 )
    perror("Couldn't open " OUT_FNAME), exit(4);

  for(pp=patterns; pp