From oleg@pobox.com Sun Feb 21 14:55:49 1999
Subject: Way extended handling of files and "files"
Date: Sun, 21 Feb 1999 22:57:26 GMT
Reply-To: oleg@pobox.com
Keywords: POSIX, file, pipe, bidirectional pipe, TCP pipe, C, C++, Scheme
Newsgroups: comp.lang.scheme
Organization: Deja News - The Leader in Internet Discussion
Summary: What's in the file name?
X-Article-Creation-Date: Sun Feb 21 22:57:26 1999 GMT

This article deals with reading, writing, and "updating" of files and
_"files"_. Although this post contains Scheme code, the topic is not
specifically Scheme, and I apologize for that. The following Scheme
code hopefully speaks more clearly than I do:

(cerr "The current date is: "
  (lambda (port)
    (with-input-from-file "tcp://localhost:13"
      (lambda ()
        (port-copy (current-input-port) port)))))

(cerr "A particularly wicked way of doing a string-substitution...\n")
(let ((orig-string "Foo foofs here...")
      (subst-string "bar barfs here...")
      (file-name
        "| while read i; do echo $i | sed 's/[Ff]oo/bar/g'; done "))
  (cerr "\n\tPerforming a string substitution via sed ..."
        "\n\t\tOpening a file '" file-name "'\n")
  (let ((io-port (##open-input-output-file file-name)))
    (cerr "\t\tsending pattern: " orig-string nl)
    (display orig-string io-port)
    (display #\newline io-port)
    (flush-output io-port)
    (cerr "\n\t\tdone sending...\n")
    (with-input-from-port io-port
      (lambda ()
        (let ((read-str (next-token '() '(#\newline *eof*))))
          (cerr "\n\t\tread: " read-str nl)
          (assert (equal? read-str subst-string))
          (assert-curr-char '(#\newline) ""))))
    (cerr "\n\t\tone more time... [elided]\n")))

(define (validate-url url)
  (define (split-url url handler)
    (with-input-from-string url
      (lambda ()
        (assert (equal? "http:" (next-token '() '(#\/))))
        (assert-curr-char '(#\/) "")
        (assert-curr-char '(#\/) "")
        (let* ((hostname (next-token '() '(#\/ *eof*)))
               (resource (next-token '() '(*eof*))))
          (handler hostname resource)))))
  (define (validate hostname resource)
    (let ((io-port (##open-input-output-file
                     (string-append "tcp://" hostname ":80"))))
      (display "HEAD " io-port)
      (display resource io-port)
      (display " HTTP/1.0\r\n\r\n" io-port)
      (flush-output io-port)
      (with-input-from-port io-port
        (lambda ()
          (let* ((proto-id (read))
                 (resp-code (read)))
            (cond
              ((= resp-code 200)
               (cerr "resource " resource " found at " hostname
                     " length "
                     (let loop ((token (next-token '(#\return #\newline)
                                         '(#\return #\newline #\: *eof*))))
                       (cond
                         ((eof-object?
                            (skip-while '(#\return #\newline #\: *eof*)))
                          "unknown")
                         ((string-ci=? token "content-length") (read))
                         (else
                           (loop (next-token '()
                                   '(#\return #\newline #\: *eof*))))))
                     nl))
              (else
                (cerr "\nA bummer:")
                (port-copy (current-input-port)
                           (current-output-port)))))))))
  (split-url url validate))

(validate-url "http://www.iro.umontreal.ca/~gambit/")
==> resource /~gambit/ found at www.iro.umontreal.ca length 11598

The last example is slightly simplified for the sake of clarity.

Although "tcp://localhost:13" and
"| while read i; do echo $i | sed 's/[Ff]oo/bar/g'; done "
look like something else, they are ordinary file names, at least as far
as Gambit is concerned. Of course these file names do indeed look like
something else, and feel accordingly.

http://pobox.com/~oleg/ftp/Scheme/vext-io.scm
has more examples.

Note, Gambit has a special function "##open-input-output-pipe". A quote
from its implementation (.../gambc30/lib/os.c) reads:

    default: /* ___IO_INPUT_OUTPUT */
      ___free_mem (stream);
      return 0; /* bidirectional pipes not allowed */

As the Scheme snippet above shows, fate has been gracious enough to
allow bidirectional pipes after all. Also note that there does not
appear to be any need for a separate function 'open-input-output-pipe'.
The regular "with-input-from-file" etc. will suffice. Many thanks to
Marc Feeley for Gambit and its extended i/o functions, and for making
the source available.

The extended file names at the beginning of the code are not
Scheme-specific (hence the apology). Here is similar code in the other
language, C++:

     // creating fstream over a file name "tcp://hostname:7" and
     // testing echoing through this stream
  strstream filename;
  filename << "tcp://" << hostname << ":7" << ends;
  cout << "\tEchoing: opening a fstream '" << filename.str() << "'"
       << endl;
  fstream fp(filename.str(),ios::in|ios::out);
  if( !fp.good() )
    perror("Opening failed"), _error("Failure");
  const char pattern [] = "1234567\r\n\007\r\n";
  char reply [sizeof(pattern)];
  cout <<"\t\tsending " << pattern << "..." << endl;
  fp << pattern; fp.flush();
  assert( fp.good() );
  cout << "\t\treceiving...";
  [elided, see http://pobox.com/~oleg/ftp/packages/vendian_io.cc]

This code works with a _virgin_ (unmodified) GNU libstdc++.a, which was
statically linked. Neither the system libraries nor the OS kernel were
changed in any way. No special privileges were required. The source for
the C++ stream library or for system libraries was not needed. I have
run the code on HP-PA B10.10, Sun/Solaris 2.6 and Linux 2.0.33.

The whole trick is based on a single function, sys_open.c:
     http://pobox.com/~oleg/ftp/packages/sys_open.c
The function works and feels like the regular open(2), but allows (way)
"extended" file names: unidirectional, bidirectional and TCP pipes. One
can call this function instead of the regular open() -- in implementing
Scheme or any other language of one's choice. Yet no source
modification is actually necessary.
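
For illustration only, and not taken from sys_open.c: the following is a
minimal C sketch of how an open()-like function might recognize a
"tcp://host:port" name and hand back a connected descriptor, deferring
to the real open(2) for everything else. The names ext_open and
parse_tcp_name are hypothetical, pipe-style names are not handled, and
the use of getaddrinfo() is an assumption of this sketch rather than
something the post states.

```c
/* Sketch only: NOT the code of sys_open.c. ext_open() and
   parse_tcp_name() are hypothetical names made up for this example. */
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Split "tcp://host:port" into host and port strings.
   Returns 1 on success, 0 if the name is not of that form
   or a part does not fit into the supplied buffer. */
int parse_tcp_name(const char *name, char *host, size_t host_len,
                   char *port, size_t port_len)
{
    static const char prefix[] = "tcp://";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *rest, *colon;
    size_t hlen, plen;

    assert(name != NULL && host != NULL && port != NULL);
    if (strncmp(name, prefix, prefix_len) != 0)
        return 0;
    rest = name + prefix_len;
    colon = strrchr(rest, ':');         /* last colon precedes the port */
    if (colon == NULL)
        return 0;
    hlen = (size_t)(colon - rest);
    plen = strlen(colon + 1);
    if (hlen == 0 || plen == 0 || hlen >= host_len || plen >= port_len)
        return 0;
    memcpy(host, rest, hlen);
    host[hlen] = '\0';
    memcpy(port, colon + 1, plen + 1);  /* copies the terminating NUL too */
    return 1;
}

/* open(2) look-alike: connect if the name is a TCP "file name",
   otherwise defer to the real open(). Returns a descriptor or -1. */
int ext_open(const char *name, int flags, mode_t mode)
{
    char host[256], port[32];
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    if (!parse_tcp_name(name, host, sizeof host, port, sizeof port))
        return open(name, flags, mode); /* an ordinary file name */

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                      /* connected: keep this fd */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

Under these assumptions a caller would use
ext_open("tcp://localhost:13", O_RDONLY, 0) exactly as it would use
open(), and then read() from the returned descriptor.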
Even the source code isn't needed -- courtesy of a "Patch-free
User-level Link-time intercepting of system calls and interposing on
library functions"
     http://pobox.com/~oleg/ftp/syscall-interpose.html
Although the page doesn't mention Sun/Solaris, the trick works on that
platform too: I haven't updated the page yet.

Note, on Linux an even bigger trick is possible: for example,
fopen("tcp://hostname:13") as in

  strstream filename;
  filename << "tcp://" << hostname << ":13" << ends;
  cout << "\tReading from a datetime port: opening a FILE '"
       << filename.str() << "'" << endl;
  FILE * fp = fopen(filename.str(),"r");
  if( fp == 0 )
    perror("Opening failed"), _error("Failure");
  cout << "\t\tthe result is: ";
  int c;
  while( (c = fgetc(fp)) != EOF )
    cout << (char)c;
  fclose(fp);

Not the fastest way to query a system date of course; nevertheless,
this example illustrates that once opened, the "file" named
"tcp://hostname:13" looks like any ordinary FILE* -- to C, to Scheme,
and to any other language. Again, no system or user library and no
source code was modified.

I have used the TCP pipes disguised as file names in my Scheme code to
communicate with a remote CGI script (also written in Scheme) -- to do
a remote method invocation of sorts.

I can talk about extended file names almost incessantly, as this topic
appears fascinating. Yet as my past mistakes teach me, I'd better stop.