From oleg Tue Jun 14 11:00:22 CDT 1994
Newsgroups: comp.lang.c++
Subject: Opaque bases/variables in a class: kludges, suggestions, and dreams
Summary: Emulating opaque private variables / base classes
Organization: University of North Texas, Denton
Keywords: opaque class variables, opaque base classes, kludges/emulation

There has been some discussion recently about access to private
variables.  Here's something on a related topic (sorry, not *so*
related).

Consider the following toy class 'foo' declared in file 'foo.h', with
the guts in the file 'foo.cc' and used in the file 'main.cc':

>>> Definition file 'foo.h':
class foo
{
  private:
	FILE *fp;
	int item;

  public:
	foo(const char * file_name);
	~foo(void) {}
	void write(void) const;
};

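>>> Implementation file 'foo.cc':
#include <assert.h>
#include <stdio.h>
#include "foo.h"

foo::foo(const char * file_name) : item(0)
{
  fp = fopen(file_name,"w");	// side effects are kept out of assert(),
  assert( fp != 0 );		// which vanishes when NDEBUG is defined
}

void foo::write(void) const
{
  size_t n_written = fwrite((void *)&item,sizeof(item),1,fp);
  assert( n_written == 1 );
}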

>>> File that uses it, 'main.cc':
#include <stdio.h>		// must precede foo.h, which uses FILE
#include "foo.h"

int main(void)
{
  foo a_foo("foo_file");
  a_foo.write();
  return 0;
}

As you see, fp is a private variable of the class foo, all i/o for the
class is done internally, and main.cc does not use files by
itself.  However, main.cc must #include <stdio.h> just to define FILE,
which is used in the private declaration of foo.  Note that stdio.h
contains quite a bit of stuff, which takes some time to process, and
which is absolutely useless for main.cc since it uses none of it.

Well, since FILE is referenced only by pointer, I can do the
following trick:

>>> File foo.h, version 2:
class foo
{
  private:
#ifdef _stdio_h
	FILE *fp;
#else
	void * fp;
#endif
	int item;

  public:
	foo(const char * file_name);
	void write(void) const;
};

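File foo.cc remains the same, but I don't need to #include <stdio.h>
in main.cc any more.  So, if main.cc doesn't use stdio, it doesn't
even need to include stdio.h.  This results in faster compilation,
and looks cooler: don't include stuff we don't use/need.  (The test
relies on _stdio_h being the macro that stdio.h defines to guard
against multiple inclusion; that name varies from system to system.)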
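
Why does version 2 get away with it?  Here is a small sketch, not a
guarantee: it compares the class layout as main.cc sees it (void *)
with the layout as foo.cc sees it (FILE *).  The struct and function
names are made up for the illustration.  The language itself does not
promise that two differing declarations of one class are
interchangeable; the trick merely works on the usual implementations,
where every object pointer has the same size and alignment.

```cpp
#include <stddef.h>   // offsetof
#include <stdio.h>    // FILE, needed only for the second view

// The class as main.cc sees it (no stdio.h, so fp is a void *).
struct foo_without_stdio { void *fp; int item; };

// The class as foo.cc sees it (stdio.h included, so fp is a FILE *).
struct foo_with_stdio { FILE *fp; int item; };

// Returns true when both views agree on the total size and on where
// the member 'item' is placed.
bool views_agree()
{
    return sizeof(foo_without_stdio) == sizeof(foo_with_stdio)
        && offsetof(foo_without_stdio, item) == offsetof(foo_with_stdio, item);
}
```

If views_agree() ever returned false on some platform, main.cc and
foo.cc would disagree about where 'item' lives, and the kludge would
break silently.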

The situation gets complicated in the following example:

>>> File foo.h, version 3:
class foo: private ofstream
{
	int item;
  public:
	foo(const char * name);
	void write(void);
};

>>> File foo.cc, version 3:
#include <fstream.h>
#include "foo.h"

foo::foo(const char * name) : ofstream(name), item(0) {}
void foo::write(void) { *this << item; }

Here again, all i/o for the class foo is done internally, that is,
through an explicitly defined method write(); no function that uses
foo can get at its ofstream.  But still, main.cc has *got* to include
fstream.h (and it's *a lot* to process), because in order to construct
an object of class foo, the compiler needs to know the size of the
ofstream object, even if the object itself is not used in the code.
Moreover, there are far more serious ramifications than just the
overhead of including unnecessary stuff and general coolness: if
someone one day tinkers with the class ofstream and, say, adds a new
method to it, we ought to recompile main.cc (because it depends on
"fstream.h", formally speaking).  But in reality, main.cc can't use
this new method of ofstream anyway, so there is no real ("behavioral")
dependency.  Of course, we would still need to recompile main.cc if
the size of ofstream changed (as the result of someone's messing with
the stream library).
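
To make the size dependency concrete, here is a toy sketch.  Bar is a
hypothetical stand-in for a library class like ofstream; Bar<16> plays
the old release of the library and Bar<4096> the new one.  All the
names are invented for the illustration.

```cpp
// Hypothetical stand-in for a library class whose size changes between
// releases: Bar<16> plays the old release, Bar<4096> the new one.
template <int N> struct Bar { char guts[N]; };

// As in version 3: Bar is a base, so sizeof(foo_by_base) includes sizeof(Bar).
template <int N> struct foo_by_base : private Bar<N> { int item; };

// As in version 2: only a pointer is stored, so Bar's size never enters
// the layout.
template <int N> struct foo_by_pointer { Bar<N> *bar; int item; };

// True: the derived class changes size when its base does.
bool base_size_leaks()
{ return sizeof(foo_by_base<16>) != sizeof(foo_by_base<4096>); }

// True on the usual implementations, where all object pointers have one size.
bool pointer_size_is_fixed()
{ return sizeof(foo_by_pointer<16>) == sizeof(foo_by_pointer<4096>); }
```

Any file that creates a foo_by_base object has the base's size baked
into it, which is exactly why main.cc must see fstream.h in version 3.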

Well, one can bend over backwards and write the following:

>>> File foo.h, version 4:
class foo
{
#ifdef _fstream_h
  ofstream& my_fstream;
#else
  char& my_fstream;
#endif

  int item;
  public:
	foo(const char * name);
	~foo(void);
	void write(void) const;
};

>>> File foo.cc, version 4:
#include <fstream.h>
#include "foo.h"

foo::foo(const char * name) : my_fstream(*(new ofstream(name))), item(0) {}
void foo::write(void) const { my_fstream << item; }
foo::~foo(void) { delete &my_fstream; }

Here again we refer to the ofstream only through a reference, which
always takes only the size of a pointer (4 bytes on a 32-bit UNIX).
So, should one change the definition of ofstream, we still don't need
to recompile main.cc because we don't use ofstream directly in
main.cc.  Happily, the size of the reference does not depend on the
size of the object it refers to.  We don't even need to #include
<fstream.h> in main.cc any more.  Well, that code above, version 4, is
an obvious kludge (and everybody would boo at that).  Yet it's a
working kludge: I once wrote something along these lines, and it
worked perfectly.  Though uncool.
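
For comparison, here is a sketch of version 4 without the #ifdef,
assuming a standard library that ships the <iosfwd> header (which only
forward-declares the stream classes, std::ofstream among them).  Both
halves are shown in one listing; the std:: prefix and the disabled
copy operations are additions for this sketch, everything else
follows version 4.

```cpp
// --- what would go into foo.h: no <fstream> needed ---
#include <iosfwd>      // forward declarations only; std::ofstream is incomplete here

class foo
{
    std::ofstream& my_fstream;    // a reference to an incomplete type is fine
    int item;
    foo(const foo&);              // copying is left out of this sketch
    foo& operator=(const foo&);
  public:
    foo(const char * name);
    ~foo();
    void write() const;
};

// --- what would go into foo.cc: the only place that sees the full class ---
#include <fstream>

foo::foo(const char * name) : my_fstream(*(new std::ofstream(name))), item(0) {}
void foo::write() const { my_fstream << item; }
foo::~foo() { delete &my_fstream; }   // closes the file as well
```

A file like main.cc would include only the first half, so a change to
the stream library touches foo.cc alone.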

	Here is my wish: imagine I didn't have to resort to these
ploys with the cpp, because the compiler would do a similar kludge
for me.  Imagine there were a keyword 'opaque' in C++.

So, if I write
class foo {
	opaque XYZ * bar;
};

and XYZ is undefined, then the compiler wouldn't spit at me, provided
'bar' is not referenced in that particular compilation unit.  Why
should the compiler even care about the exact type of the pointer if
it's not used anyway?  Stretching the suggestion further, why not wish
that the compiler interpreted
	class foo: opaque private Bar
	{
	};
as
	class foo {
		opaque Bar& bar;
	};

Well, a bit more tinkering is actually necessary, but it's all
feasible: in fact, virtual base classes are sometimes implemented in
a similar way.  True, we lose a bit in efficiency and we can't have
inline methods that use Bar (or bar), but we reduce dependencies and
cut down on processing "unnecessary" #include's (in the real example
where I came across this situation, the #including really takes a
while, and sometimes conflicts arise and the compiler chokes).

	In a far-fetched wish, I dream of a day when C++ would get rid
of .h files and would talk about standard contexts (or even levels of
standard contexts/environments), something similar to what was done in
Algol 68.  What I mean is an efficient database of classes (and sundry
functions).  The compiler queries it whenever it needs a
declaration/description of a particular class/object; the compiler
doesn't have to load the whole thing.  Unlike regular precompiled
headers, one doesn't have to rebuild the database if one declaration
changes; the database can allow duplication (with some mechanism, be
it a PATH or a project tag, to resolve ambiguity); and the database can
distinguish between changes that affect the size of an object and
changes (like adding/altering a member function) that essentially do
not "change" the object per se, only its behavior.  [ disclaimer: it
was just a dream, it doesn't have to be logical or feasible ].

						Oleg

P.S. I'm not a frequent reader of this newsgroup; please send comments,
if any, to oleg@ponder.csci.unt.edu or oleg@unt.edu