This is an explanation of a technique to effectively "substitute" ("rename", "interpose on") the open(2) libc function, or other similar functions and system calls. This technique
- does not rely on LD_PRELOAD tricks;
- does not patch the kernel, libc, or any other system or user file.
Yet all of an application's attempts to open files are routed to my version of open(), which may call the regular open() in some cases.
This technique has been tested to work under GNU/Linux 2.0.27+ and 2.2.10+, with gcc 2.7.2.3 and egcs 2.91.66 (Slackware and S.u.S.E. distributions). By the same token, the trick will work with GNU ld on any other platform. The technique also works on SunSparc/Solaris 2.6 with Sun's native ld, and on HP-UX 9000/770 B.10.10 using HP's native ld (HP-PA does not permit GNU ld). The method will work on other UNIX platforms as well, although the precise set of ld flags necessary to accomplish the trick may vary from one UNIX/ld vendor to another.
Suppose a file mc_open.o implements a function mc_open(), which has the same interface as the standard POSIX function open(2). Function mc_open() may, for example, examine the name of the file to open and the opening modes, consult "access control lists", and eventually invoke open(2), passing the same or altered or substituted file name and opening modes.
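To make the role of mc_open() concrete, here is a minimal sketch of one possible implementation. It is not the mc_open.o discussed in this document: the denied_prefixes table is a hypothetical stand-in for an "access control list", and the policy (refuse with EACCES) is an assumption made for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>

/* Hypothetical "access control list": path prefixes that may not be opened. */
static const char *const denied_prefixes[] = { "/etc/secret/", "/private/" };

/* Same interface as open(2): examine the name, then forward the request. */
int mc_open(const char *filename, const int mode, const int mask)
{
    size_t i;
    const size_t n = sizeof denied_prefixes / sizeof denied_prefixes[0];

    for (i = 0; i < n; i++) {
        const size_t plen = strlen(denied_prefixes[i]);
        if (strncmp(filename, denied_prefixes[i], plen) == 0) {
            errno = EACCES;          /* refuse: the name is on the list */
            return -1;
        }
    }
    /* Compiled on its own, this is the ordinary open(2). In the linked
       setup described in this document, the ld flags decide which
       symbol this reference finally resolves to. */
    return open(filename, mode, mask);
}
```

Compiled on its own, this function simply filters names before calling open(2). The linking steps described in the rest of this document are what route every other open() call through it.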
Suppose an object vendian_io.o invokes open(2), either directly or indirectly: for example, vendian_io.o may use fopen() or C++ fstreams.
We want to link vendian_io.o and mc_open.o in such a way that every time vendian_io.o calls open(2) -- either directly or indirectly -- the call is forwarded to mc_open(). mc_open() may then examine its arguments and eventually invoke the true open(2) itself.
Solution 1: GNU ld
Note that the source code of neither vendian_io.o nor mc_open.o is needed: we deal solely with already compiled object code.
Step 1. Create a file mc_open_glue.c:
/*
 ************************************************************
 * This is glue code that "renames" our mc_open() into open(),
 * so that our mc_open() takes the place of (interposes on)
 * the ordinary open(2) system call.
 ************************************************************/
#if defined(DEBUG)
#include <stdio.h>      /* printf() and perror() for the tracing branch */
#endif

extern int mc_open(const char *filename, const int mode, const int mask);
int open();     /* Suppress errors as open(2) is declared with varargs
                   on GNU/Linux and FreeBSD... */

int open(const char *filename, const int mode, const int mask)
{
#if !defined(DEBUG)
  return mc_open(filename,mode,mask);
#else
  int result;
  printf("\nOpening '%s'...",filename);
  result = mc_open(filename,mode,mask);
  printf("\nresult... %d",result);
  if( result < 0 )
    perror("opening error");
  return result;
#endif
}
Unless DEBUG is defined, this glue function merely forwards the call to mc_open().
Step 2. Compile mc_open_glue.c above, obtaining mc_open_glue.o.
Step 3. Make an object file open_ext.o as follows:
ar xv /usr/lib/libc.a open.o sysdep.o
ld -r -x -wrap open -defsym __wrap_open=__libc_open \
   -defsym __open=mc_open \
   mc_open.o open.o sysdep.o -o open_ext.o
Step 4. Link vendian_io.o with mc_open_glue.o and open_ext.o made in the above two steps. This gives us the final executable:
gcc vendian_io.o mc_open_glue.o open_ext.o -o vendian_io -lm
Note that libc is linked dynamically, as it usually is by default; mc_open.o may be linked in dynamically as well.
Solution 2: HP-UX native ld
The only difference from Solution 1 above is Step 3, which now should read:
Make an object file open_ext.o as follows:
ar xv /usr/lib/libc.a t_open.o
ld -r -h open -v -B immediate mc_open.o t_open.o -o open_ext.o
The rest of the procedure is identical.
Solution 3: Solaris native ld
The only difference from Solution 1 above is Step 3, which now should read:
Make an object file open_ext.o as follows:
ar xv /usr/lib/libc.a libc_open.o open.o open64.o
ld -r -B local -z redlocsym mc_open.o libc_open.o open.o open64.o -o open_ext.o
The rest of the procedure is identical.
Solution 4: FreeBSD ld
Steps 1 and 3 from Solution 1 above have to be altered as follows:
Create a file mc_open_glue.c:
/*
 ************************************************************
 * This is glue code that "renames" our mc_open() into open(),
 * so that our mc_open() takes the place of (interposes on)
 * the ordinary open(2) system call.
 ************************************************************/
#if defined(DEBUG)
#include <stdio.h>      /* printf() and perror() for the tracing branch */
#endif

extern int mc_open(const char *filename, const int mode, const int mask);
int open();     /* Suppress errors as open(2) is declared with varargs
                   on GNU/Linux and FreeBSD... */

#if defined(__FreeBSD__)
int _open();    /* Declared the same way as open() above, to avoid an
                   implicit declaration */
/* Redirecting a wrapped open() in mc_open() to _open(), which
   actually traps into the kernel. */
int __wrap_open(const char *filename, const int mode, const int mask)
{ return _open(filename, mode, mask); }
#endif

int open(const char *filename, const int mode, const int mask)
{
#if !defined(DEBUG)
  return mc_open(filename,mode,mask);
#else
  int result;
  printf("\nOpening '%s'...",filename);
  result = mc_open(filename,mode,mask);
  printf("\nresult... %d",result);
  if( result < 0 )
    perror("opening error");
  return result;
#endif
}
Make an object file open_ext.o according to the following:
ld -r -x -wrap open mc_open.o -o open_ext.o
The other steps are the same.
Extended file names are the ones that may have "pipes" in them, for example,
"gunzip < /tmp/aa.gz |"
" | gzip -best > file.gz "
or even
" cat file-name | tee transcript |"
Another example of an extended file name is "tcp://localhost:13". These are the names of "files", and as such, can be passed to open(), fopen(), fstream(), with-input-from-file, etc.
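The forms listed above can be told apart by looking only at the two ends of the name. The following is a minimal sketch of such a classification, not the code from sys_open.c. The enum, the function name classify_name(), the skipping of surrounding blanks, and the rule that a leading "|" is checked before a trailing one are all assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical categories of "file" names. */
enum ext_kind {
    EXT_PLAIN,      /* an ordinary file name            */
    EXT_PIPE_FROM,  /* "cmd |": read the command output */
    EXT_PIPE_TO,    /* "| cmd": write to the command    */
    EXT_TCP         /* "tcp://hostname:port"            */
};

enum ext_kind classify_name(const char *name)
{
    size_t len;

    while (*name == ' ')                       /* skip leading blanks   */
        name++;
    len = strlen(name);
    while (len > 0 && name[len - 1] == ' ')    /* ignore trailing ones  */
        len--;

    if (len >= 6 && strncmp(name, "tcp://", 6) == 0)
        return EXT_TCP;
    if (len > 0 && name[0] == '|')
        return EXT_PIPE_TO;
    if (len > 0 && name[len - 1] == '|')
        return EXT_PIPE_FROM;
    return EXT_PLAIN;
}
```

A pipe character in the middle of a name, as in " cat file-name | tee transcript |", does not matter here: only the first and the last non-blank characters are examined.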
This file name extension is implemented at the lowest possible level, right before a request to open a file is passed to the kernel by the system call open(2). A function sys_open() (in the source file sys_open.c) acts as a "patch": that is, if you call sys_open() instead of open() to open a file, you get all the open() functionality plus the extended file names. Note that neither the kernel, nor libc, nor any user or system files and libraries are actually patched or modified in any way.
The function sys_open() is a preprocessor to open(2) that can handle extended file names like "cmd |", "| cmd", or "tcp://hostname:port", where cmd is anything that can be passed to /bin/sh. The shell /bin/sh is launched in a subprocess to interpret the cmd; the shell's stdin, stdout or both become the file that is "opened" by this function. In all other respects sys_open() is equivalent to open(2).
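The "cmd |" case described above can be sketched with pipe(), fork() and execl(). This is an illustration under stated assumptions, not the code from sys_open.c: the function name open_cmd_for_reading() is hypothetical, only the read-from-command form is handled, and the child is never reaped.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: if the name ends in '|', run the preceding text
   through /bin/sh and return a descriptor connected to the shell's
   stdout; otherwise fall through to the ordinary open(2). */
int open_cmd_for_reading(const char *name, int mode, int mask)
{
    size_t len = strlen(name);
    char *cmd;
    int fds[2];
    pid_t pid;

    while (len > 0 && name[len - 1] == ' ')   /* ignore trailing blanks */
        len--;
    if (len == 0 || name[len - 1] != '|')
        return open(name, mode, mask);        /* a regular file name */

    cmd = malloc(len);                        /* command without the '|' */
    if (cmd == NULL)
        return -1;
    memcpy(cmd, name, len - 1);
    cmd[len - 1] = '\0';

    if (pipe(fds) < 0) { free(cmd); return -1; }
    pid = fork();
    if (pid < 0) {
        close(fds[0]); close(fds[1]); free(cmd);
        return -1;
    }
    if (pid == 0) {                           /* child: becomes the shell */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            dup2(fds[1], STDOUT_FILENO);      /* shell's stdout -> pipe */
            close(fds[1]);
        }
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(127);                           /* exec failed */
    }
    close(fds[1]);                            /* parent keeps the read end */
    free(cmd);
    return fds[0];
}
```

The returned descriptor behaves like any other descriptor obtained from open(2), so it can be handed to read() or fdopen(). A fuller version would also cover the "| cmd" direction and wait for the child when the descriptor is closed.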
It has to be stressed that with this substitution in place, no matter how one opens a file -- with open(), fopen(), ofstream(), etc. -- one can submit the extended file names and enjoy their functionality. You don't need Perl or Expect: the piped file names may appear anywhere files are opened.
One can "extend" file names even further, by allowing an http:// prefix or a host:port|| suffix. In the latter case, "open" will do listen() first.
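Whatever network form is supported, the name first has to be split into a host and a port. Here is a minimal sketch for the "tcp://hostname:port" form only. The function name split_tcp_name() and its buffer-based interface are assumptions made for illustration, not part of sys_open.c.

```c
#include <string.h>

/* Hypothetical helper: split "tcp://hostname:port" into its two parts.
   Returns 0 on success, -1 if the name is not of that form or a part
   does not fit into the caller's buffer. */
int split_tcp_name(const char *name,
                   char *host, size_t host_size,
                   char *port, size_t port_size)
{
    static const char prefix[] = "tcp://";
    const size_t prefix_len = sizeof prefix - 1;
    const char *rest, *colon;
    size_t host_len, port_len;

    if (strncmp(name, prefix, prefix_len) != 0)
        return -1;
    rest = name + prefix_len;
    colon = strrchr(rest, ':');               /* last ':' separates the port */
    if (colon == NULL || colon == rest || colon[1] == '\0')
        return -1;                            /* empty host or empty port */

    host_len = (size_t)(colon - rest);
    port_len = strlen(colon + 1);
    if (host_len >= host_size || port_len >= port_size)
        return -1;

    memcpy(host, rest, host_len);
    host[host_len] = '\0';
    memcpy(port, colon + 1, port_len + 1);    /* includes the final '\0' */
    return 0;
}
```

The two strings can then be passed to getaddrinfo() and the resulting address to connect(), or to bind() and listen() for a listening variant.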
The extended file names and the open() substitution are part of a C++ "advanced" i/o and arithmetic compression classlib. Its Makefile runs verification code that tests that the interposition really works.
FILE * fp = fdopen(sys_open(" | gzip -best > file.gz ",
                            O_WRONLY | O_CREAT | O_TRUNC,0777),"wb");

(with-input-from-file
  (string-append " cat " file-name "| tee transcript |") ...)

cout << "\tReading from a datetime port" << endl;
FILE * fp = fopen("tcp://localhost:13","r");
if( fp == 0 )
  perror("Opening failed"), _error("Failure");
cout << "\t\tthe result is: ";
int c;
while( (c = fgetc(fp)) != EOF )
  cout << (char)c;
fclose(fp);

(let ((io-port (##open-input-output-file
         "| while read i; do echo $i | sed 's/[Ff]oo/bar/g'; done ")))
  (cerr "\t\tsending pattern: " orig-string nl)
  (display orig-string io-port)
  (display #\newline io-port)
  (flush-output io-port)
  (cerr "\n\t\tdone sending; receiving the result\n")
  (with-input-from-port io-port
    (lambda () [elided])))
oleg-at-okmij.org