;; -*-mode: Outline; -*-

TOC
General
Awards
Keynote: The Internet's coming silent spring
Interactive 3D Graphics applications for Tcl
The AGFL Grammar Work Lab
Plan9 and Inferno
Introduction to Air Traffic Management Systems
The Future Is Coming: Where the X Window System Should Go
XCL: An Xlib Compatibility Layer for XCB
Biglook: A Widget Library for the Scheme Programming Language
BSD BoF
An Implementation of Scheduler Activations on the NetBSD Operating System
Authorization and Charging in Public WLANs Using FreeBSD and 802.1x
Design and Performance of the OpenBSD Stateful Packet Filter (pf)
Enhancing NFS: Cross-Administrative Domain Access
Ningaui: A Linux Cluster for Business
CPCMS: A Configuration Management System Based on Cryptographic Names
X Meets Z: Verifying Correctness in the Presence of POSIX Threads
Cyclone: A Safe Dialect of C
Cooperative Tasking Without Manual Stack Management
Improving Wait-Free Algorithms for Interprocess Communication in Embedded Real-Time Systems
Special Closing Session: How Flies Fly?

* General

June 14-16, 2002, Monterey, CA. Attendance: ca. 1100 people, down from
last year's 1500. The slowdown in the economy was noticeable. The
conference is selective: in the general track, 25 of the 105 submitted
papers were accepted. In the FREENIX track, 26 of the 53 submitted
papers were accepted.

* Awards

James Gosling received the USENIX Lifetime Achievement Award (aka "The
Flame"). Accepting the award, James Gosling told a story of his first
USENIX conference. It was in 1978, at Columbia U (?). On the first day
of the conference, in the morning, he took a stroll through a park
near the conference center. He enjoyed the park and the walk. When he
later told that to several people, they stared at him in disbelief,
touched him to make sure, and exclaimed: "You're alive!" That, James
Gosling said, epitomizes his programming career: "I was just too dumb
to know better."

The Apache project received the Software Tools User Group (STUG)
award.

* The Internet's coming silent spring

Lawrence Lessig, Stanford University. The keynote address.

Lawrence Lessig is the director of the Center for Internet and Society
at Stanford Law School. He is a board member of the Electronic
Frontier Foundation (EFF). The style of his presentation was most
unusual: his slides flashed a word or a sentence in white against a
black background. The first slide read: "".

Lawrence Lessig started by telling three stories. The first was the
story of Armstrong's fight against RCA. RCA hired Armstrong to improve
its AM technology. Armstrong invented FM radio instead. When he
started to promote it, RCA fought tooth and nail. One of RCA's
internal memos said, "we cannot _allow_ that damn radio technology to
compete". After twenty years of litigation, Armstrong gave up and
committed suicide. "FM radio was not allowed," Lessig said.

The second story: the Internet vs. AT&T. When Paul Baran first
proposed packet-switching technology, the Department of Defense asked
the opinion of _the_ telephone company. AT&T said it would never
work. Internally, an AT&T manager wrote, "we cannot _allow_ that damn
technology to suck the blood out of our veins". The Internet was not
allowed.

The third story was about an Aibo dog, a web site that taught an Aibo
dog to dance jazz, and Sony's attempt to shut that web site down.

The point of these stories was that societies or environments can
either allow or forbid. End-to-end architectures allow innovation.
Centralized architectures (like the old AT&T network) exercise tight
control over who can access the network and what kind of protocols
they are permitted to run. The Internet is an end-to-end architecture:
a neutral network that lets many different clients connect and
communicate with each other. The key innovations of the Internet (the
TCP/IP protocol, USENET, HTTP, Hotmail, IRC) were made either by
graduate students ("kids") or by non-Americans. By design, the
Internet encourages innovation by outsiders. Another example of an
end-to-end network that fostered innovation (802.11, Bluetooth) is
_unlicensed_ spectrum.

The open, end-to-end architecture of the Internet is threatened. The
physical layer (the owners of the wires and fibers) and the content
layer are corrupting the core. With the help of lawyers, the owners of
the networks and of the content have managed to frame the debate in a
very restrictive way.

First, they say: "It's our property." The cable companies say: the
cable and the fiber are our property; therefore, we can dictate what
kind of services and protocols are allowed on our networks. Running
web servers on client computers is not allowed. Compare that with the
dawn of the Internet and USENET: AT&T owned the networks, and charged
a great deal for their use. However, AT&T was not permitted to
discriminate against users. As long as the users paid, they were free
to talk about anything over the phone, and they could send all kinds
of noises, including data-modulated ones.

The second issue that frames the current debate: "It's just theft."
Napster is the case in point. Curiously, the Napster situation has
occurred many times before. Early in the 20th century, piano rolls
"Napsterized" sheet music. After much acrimony over compensation to
composers and copyright holders, a solution was found: the
U.S. Congress instituted compulsory licensing. Radio broadcasting
permitted distribution of music to wide audiences at little cost,
which rekindled the "napsterization" debate. Again, the solution was
compulsory licensing. Copyright holders were compensated for the uses
of their content; however, they could not dictate how the content
should or should not be played. Cable TV brought up the napsterization
issue again, and again compulsory licensing was the solution.

The issue arose once more in VCR vs. the film industry. Jack Valenti
(the president of the Motion Picture Association of America) said that
the VCR is a "Boston strangler of the film industry". [Nowadays, Jack
Valenti equates DeCSS with terrorism. Some people never change.] After
eight years, the case went to the Supreme Court, which said: as long
as there is a _potential_ significant non-infringing use of a new
technology, the technology cannot be enjoined on the basis of
copyright law. In the case of the VCR, no compulsory licensing was
necessary: it turned out that the "robbed" film industry found ways to
make a lot of money on videotapes.

These examples show that (big) distributors of content have always
resisted new ways of using the content. However, they never managed to
suppress the innovation and the innovators -- until now. The DMCA (the
Digital Millennium Copyright Act) and the courts in the Napster case
let the media conglomerates regulate the technology. The old content
distribution networks are trying to use the law and the courts to
prevent any new ways of distributing content.
The unfortunate fact, Lawrence Lessig said, is that the current debate
has been pushed to extreme positions: absolute control of the content
vs. communism. We need a balance. As the compulsory licensing examples
show, there are many ways people can share content without being
thieves. Consider a book. If we copy a book entirely and sell the
copy, we clearly infringe on the copyright. If we make a copy of a
chapter to mark it up, we exercise our fair-use right. Besides these
two uses, there are many other, _unregulated_ uses. We can sell the
book to a used-book store, we can give the book away to a library, we
can highlight sentences in the book, we can write comments in the
margins, we can burn the book or use it as a doorstop. All these uses
are unregulated; indeed, the majority of the uses are unregulated.
When it comes to digital content, however, most of the uses turn out
to be regulated: everything is either an infringement or fair use. The
debate becomes polarized, and the middle ground is lost. We need to
quash the bad rhetoric, reframe the debate, flee from the extremes and
seek the balance.

The presentation certainly struck a chord. At the end of his
presentation, Lawrence Lessig received a standing ovation.

* Interactive 3D Graphics applications for Tcl

Oliver Kersting and Juergen Doellner, Hasso-Plattner-Institute,
University of Potsdam. Freenix track.

Common tasks in programming a 3D application are configuring a scene
and its properties, and exploration. These tasks are most conveniently
performed in a domain-specific, interpreted, _scripting_ language. The
authors have built a higher-level Virtual Rendering System (VRS) on
top of OpenGL. The main object in the VRS library (which is written in
C++) is a _node_. A node can be a shape, a property of a shape (color,
reflection coefficient, etc.), or a transformation. Nodes can be
assembled into a scene graph.

The authors have built Tcl bindings to the VRS library. Therefore,
scene manipulation -- an often repeated and crucial step -- can be
done in a scripting language. A scene graph can be built incrementally
and iteratively. The graph is rendered by the VRS library in real
time. The presenter gave a demonstration. Indeed, zooming, rotation,
and associating a zooming factor with a slider can all be done
interactively.

The authors have built a tool that semi-automatically maps the C++ API
into Tcl. Wrapper classes are generated automatically. Moreover,
overloaded C++ methods are turned into "generic" Tcl functions. This
is a distinguishing feature of the tool. Interestingly, the tool even
exports an "overloaded" delete operator into the corresponding Tcl
function: scene nodes in Tcl scripts must be explicitly deleted. The
authors have also built a nice Tcl IDE, with a class browser, property
sheets and an editor/shell window with keyword completion and syntax
coloring.

Unlike VTK, VRS is a general-purpose rendering library. VRS supports
advanced shading schemes, RenderMan, FreeType fonts, 2D imaging, etc.
	http://www.vrs3d.org
The library is under the LGPL.

* The AGFL Grammar Work Lab

Cornelis H.A. Koster and Erik Verbruggen, KUN. Freenix track.

The talk described the first _free_ parser generator for natural
languages, released under the GNU licenses (GPL/LGPL). The parser
generator takes a grammar and a lexicon and yields a parser. The
latter reads text and converts it into a Head-Modifier (H-M)
form. Generated parsers can be used commercially, royalty-free.
The H-M form is a set of pairs [head, modifiers], where 'head' and
'modifiers' are (possibly empty) strings of words. For example, the
phrase "new walking shoes" is translated into
	[N:shoes, A:walking] [N:shoes, A:new]
Both the head and the modifiers are typed: N (noun), P (pronoun), A
(adjective), V (verb form). H-M pairs represent the _major_ relations
expressed in the text; "unimportant" words are eliminated. The parser
can also normalize the tense of verbs. Here's a more elaborate
example: "Every PhD student gets a reduced price for the ETAP
conference" is parsed into:
	[N:student, N:PhD] [N:student, V:gets] [V:gets, N:price]
	[P:it, V:reduced] [V:reduced, N:price]
	[N:price, for N:conference] [N:conference, N:ETAP]
The conversion from the passive participle phrase "a reduced price" to
an active verb construction [P:it, V:reduced] [V:reduced, N:price] is
remarkable. The original phrase did not say who authorized the
reduction of the price for the conference. The parser determined that
the actor is missing, and inserted a dummy actor 'it': [P:it,
V:reduced].

H-M pairs produced by the parser can be used for text classification,
information retrieval, and question-answering systems. The parser can
also be used in educational software. In fact, the parser has been
used by the authors in a text (patent application) classification
system called Peking.

The AGFL package comes with a sample grammar and lexicon of
English. AGFL stands for "affix grammar over a finite lattice", which
is a _context-free_ grammar with set-valued attributes. The parser is
non-deterministic, and therefore not fast. It is, however, robust: it
does not crash when it is presented with unknown words or an incorrect
phrase. Again, the package is free. The AGFL package is available at
	http://www.cs.kun.nl/agfl/
The grammar of English is very illuminating:
	http://www.cs.kun.nl/agfl/npx/grammar.html

Given an ambiguous phrase, the parser currently returns only one
parsing alternative (chosen more or less arbitrarily and
heuristically). Version 2.0 will incorporate statistics and
probabilities. The parser is more scalable and faster than DCG in
Prolog. The system is written in the CDL3 language. The CDL system is
also a parser generator, which yields fast _deterministic_ parsers.

When I asked the speaker to compare his approach with the "dominant"
purely statistical approach, he said that extremes are rarely
sufficient. Statistical approaches need a lot of statistics to
represent the basic facts of a language (e.g., the Subject-Verb-Object
order of English). The latter can be more succinctly and conveniently
expressed by rules. It's more productive to use rules for most of the
grammar, and statistics to resolve ambiguities.

Cornelis H.A. Koster conducted a BoF on natural language
processing. The main topic of the BoF was again statistical
vs. rule-based classification and retrieval of documents. He mentioned
an interesting fact: stemming and singular/plural normalization do not
improve classification much. One of the drawbacks of the purely
statistical approach to information retrieval is that the relevance
score is highly non-linear. A relevance score of 99% tells us that the
document in question is most likely highly relevant to a query. OTOH,
a relevance score of 70% often means that the document has nothing to
do with the topic. The precision and the recall of purely statistical
classification and retrieval reach a plateau. You need to add the
rules of the grammar to get over the plateau.
Dr. Koster briefly described his document classification system
Peking. The documents to classify are patent applications. First the
system parses a document into H-M pairs as outlined above. It then
collects the statistics of the H-M pairs and winnows out stiff and
noisy terms. What's left are the significant pairs, which describe
"the essence" of a set of documents.

* Plan9 and Inferno

As in the year before, Vita Nuova had a booth at the USENIX Expo. Two
exhibitors were constantly busy demonstrating the two remarkable
operating systems. Inferno now runs natively on an iPAQ.

Inferno supports only one language: Limbo. Because the language is
safe, Inferno does not need a kernel/user-space boundary. This makes
the system fast. The Limbo virtual machine, DIS, is rather similar to
the JVM. It's possible to map (all?) JVM codes to DIS. Alas, DIS does
not support tail recursion. Roger Peppe, one of the exhibitors, showed
a Tk emulation in Limbo: essentially, he wrote a Limbo/Tk.

He also showed the amazingly transparent networking in
Inferno/Plan9. A serial port in Plan9 is represented by several
files. You can control the port with the regular 'echo' command. He
demonstrated a driver for a digital camera, written in "shell" (in
Limbo). The interface to the driver is a file system again. In
Plan9/Inferno, it's possible to create 'virtual files' which are
associated with a pair of handlers rather than with storage. When you
load the camera driver, you see a set of files in a directory you
designate. Doing "echo zoom 0.5 > kodak/ctl" changes the zoom
factor. You can see the pictures by executing "ls -l kodak/img/" at
the shell prompt. Roger Peppe showed a GUI, again written in shell
(Limbo/Tk to be precise), that talks to the camera by reading and
writing the files in the kodak/ directory. You can mount the kodak
directory on another computer: an iPAQ that was connected to the Plan9
laptop via a wireless link. You can indeed control the camera from the
iPAQ. The transparency of networking in Plan9/Inferno is astounding.
The underlying protocol, Styx, is well thought out and concise: it
takes fewer than 10 pages to describe it completely. OS developers
could greatly enhance their systems if they implemented Styx.

Distributed programming in Limbo is based on channels -- as in
communicating sequential processes. Threads are synchronized through
rendezvous, as in Ada. From the distributed-systems point of view,
Limbo programming looks surprisingly similar to Erlang.

Plan9 and Inferno binaries are freely available, even for commercial
use. The source, however, costs money. The source cannot be freely
distributed, but it can be modified and used in a commercial
application royalty-free. Changes do not have to be reported back to
Vita Nuova. See the FAQs at www.vitanuova.com

* Introduction to Air Traffic Management Systems

Ron Reisman and James Murphy, NASA Ames Research Center; Rob Savoye,
Seneca Software. Invited talk.

Trajectory synthesis is the hottest topic; it involves physics,
scheduling, even expert systems and neural nets. The flight routes are
fixed. Every two hours, the FAA command center in VA reviews the
weather and other conditions and replaces the routes with
alternatives, according to the rule book. All _strategic_ systems run
on UNIX. A tidbit: the PowerPC, which was developed at IBM, was
financed by the FAA as a part of the Advanced Automation System (AAS)
project. The FAA needed a platform to run UNIX reliably enough for air
traffic controllers.
The AAS project was probably the biggest software engineering disaster
ever. It was budgeted at $3.5 billion; by the time AAS was canceled,
$9 _billion_ had been spent.

At one point, Congress and the FAA discussed a free-flight
system. Under free flight, there are no mid-flight traffic
controllers, and pilots have the freedom to choose the best route
during the cruising phase. This turns out not to be a good idea: even
the authors of the free-flight proposal do not believe in it any
more. Ground controllers and fixed routes will remain.

* The Future Is Coming: Where the X Window System Should Go

James Gettys, Compaq Computer Corp. Freenix track.

The talk focused on migration and replication of X window sessions. A
user should be able to disconnect from an X window application and
reconnect to it from another terminal. A user may want to replicate
his "screen" so other users can interact with an application, possibly
in parallel. When a session is moved to a different terminal, however,
the new X server may have a different screen size, depth and
resolution, different fonts and other resources.

There are a few tools that support screen replication and migration to
some extent: xmove, VNC, pseudo-servers, Emacs. VNC shuffles bitmaps
around and doesn't handle screens of different sizes and depths. Emacs
actually supports transparent session replication and migration: see
make-frame-on-display. The feature has been implemented for quite some
time and works well. The only drawback is the remarkable obscurity of
the feature and the lack of advertising.

Server-side fonts: "a 15-year migraine". Any server-based fonts will
fail: there is no guarantee that an application will get its fonts;
serious applications need a lot of details about font metrics and
therefore overload the server with font requests; startup times are
long (while the server rasterizes all glyphs in application-requested
fonts, even if the application will use only a few of them).

A toolkit is the best place to handle migration and replication. Alas,
most toolkits are not built with these issues in mind. Jim Gettys
reported that he is slowly overcoming the problems, by changing a
toolkit (GTK) to be aware that the screen resolution, size and depth
may change during an application's lifetime. He has already managed to
move a simple widget from one screen to another. Still, there is a lot
of work to do.

* XCL: An Xlib Compatibility Layer for XCB

Jamey Sharp and Bart Massey, Portland State University. Freenix track.

XCB is a C binding for the X protocol. Unlike Xlib, XCB deals only
with communication, marshaling and similar issues. XCB offers a far
simpler and more consistent interface to the X protocol than Xlib
does. XCB is very small (27 KB compiled) and is therefore particularly
suitable for embedded systems. XCB can be used in a multi-threaded
environment. The code for XCB is mainly _automatically_ generated from
the specification of the X protocol!

Xlib contains much more functionality than a mere client side of the X
protocol. XCL is an attempt to implement some of that functionality,
using XCB to communicate with the server. XCL+XCB is meant as a
replacement for Xlib (a lighter version of Xlib, to be precise). The
goal is largely achieved: XCL+XCB is 55 KB compiled, whereas Xlib is
665 KB (some of which could be trimmed by replacing pervasive macros
with procedure calls). Some of the XCL code is automatically
generated; most of the rest is copied from the Xlib source code and
adjusted to use XCB. XCL does not implement all of Xlib.
In particular, XCL does not support X color management, i18n and
caches. As the presenter showed, these features, although accounting
for more than half of Xlib, are hardly used at all. Most toolkits do
i18n and color management on their own. The author reported success in
relinking several X window applications (rxvt and gw) to use XCL
rather than Xlib.
	http://xcb.cs.pdx.edu/

* Biglook: A Widget Library for the Scheme Programming Language

Erick Gallesio, University of Nice; and Manuel Serrano, INRIA. Freenix
track.

Like Tcl, Biglook relies on a programming language for building
graphical user interfaces. It uses a CLOS-like object layer [generic
functions rather than class methods] to represent widgets, and Scheme
closures to handle events. Biglook is a (Bigloo) Scheme
library. Biglook code can be compiled by the Bigloo compiler into C or
JVM byte code. Biglook has been used to implement a full-fledged IDE
for Scheme: Bee.

Biglook is a thin wrapper on top of native back-end
libraries. Currently, two native widget libraries are supported: GTK+
and Swing. The wrapping overhead is quite low: about 15% in additional
allocations.

The paper stresses several features that greatly contribute to
Biglook's expressiveness: 'virtual slots' in records, keyword
arguments, and closures as callbacks. Virtual slots are the mechanism
that relays actions on widget classes to the back-end widget
library. Virtual slots also make the API concise and expressive. For
example, three classical functions 'iconify', 'de-iconify' and
'window-iconified' are represented in Biglook by a single virtual slot
'visible' in a window class.

Variable numbers of arguments and keyword arguments are essential to
declarative GUI programming. Without them, the creation of a widget
must be split into two steps: the instantiation of the widget and the
setting of its attributes. The alternative, passing a list of
attributes to a widget constructor, involves a run-time overhead and
disables compile-time checking of attributes and their values. In
Biglook, keyword parameters are resolved at macro-expansion (i.e.,
compile) time. A macro instantiate::widget (which is automatically
generated from the widget's class declaration) does all the work of
resolving keyword attributes. Other automatically generated macros are
getters and setters, and a 'with-access::widget' binding form.

During the talk, Manuel showed several GUIs and their complete code,
each only ten lines long or shorter. Of special note is a comparison
of Biglook, Tcl/Tk, Swing and GTK on the example of a widget with
several buttons. In Biglook, the creation of a button, its packing and
the setting of the event handler are all part of the same
expression. In Tcl/Tk, the code that deals with a button is spread
over three places. In Swing, the same code is split across four
places; in GTK, five. The paper notes that in the experience of the
authors, Biglook source code is about half the size of the same
interface implemented in Tcl/Tk, and about a quarter of the size of
the interface implemented in C with GTK+.
	http://kaolin.unice.fr/Biglook

* BSD BoF

Like the previous two conferences, this year's USENIX had a SuperBSD
BoF. Leaders of the three non-commercial BSD projects and of BSD/OS
and Mac OS X/Darwin came to talk about their projects' achievements
and future plans. It seems that every year the SuperBSD BoF is
attended by more and more people.

** FreeBSD

FreeBSD now includes a dramatically faster driver for the Intel
Gigabit Ethernet card.
The driver was written by Intel and is supported by Intel in the
FreeBSD CVS tree. FreeBSD 5.0 will include many parts of TrustedBSD
(which is being supported by DARPA). FreeBSD 5.0 will include UFS2:
64-bit block pointers, native extended attributes (ACLs, security ids,
etc.), transparent hashed directories. Marshall Kirk McKusick said
that hashed directories (which hash file names into i-node numbers as
directory blocks are read into memory) are more efficient than
B-tree-based directories.

** Darwin

bash as /bin/sh, SAMBA, Python and _Ruby_ are part of the base
system. WebDAV will be used heavily, as a distributed
filesystem. netinfo will be phased out and replaced with
LDAP/OpenDirectory. MacOS X version 10.0 used a mixture of NetBSD and
FreeBSD user-level applications. The upcoming and future versions of
MacOS X and Darwin will closely track FreeBSD.

* An Implementation of Scheduler Activations on the NetBSD Operating System

Nathan J. Williams, Wasabi Systems Inc. Freenix track.

The talk started with an overview of three implementation models of
threads. In one model, threads exist only at the user level. All
threads run within the same kernel process. The implementation is
fast; however, when one thread enters the kernel or blocks, all other
threads within the same process are blocked too. At the other extreme
are kernel threads. Like processes, kernel threads are scheduled and
blocked independently. Unlike processes, different threads may share
the same address space, job identification, user credentials,
etc. Although thread switching is faster than process switching, it is
still relatively expensive. More importantly, each kernel thread takes
up kernel resources, which are limited. Kernel thread synchronization
involves a system call and is therefore expensive.

The third model -- the topic of the talk -- is a two-level
implementation: N kernel threads run M (M>=N) user threads. Thread
synchronization and switching occur at the user level. When a kernel
thread executing a user thread blocks on i/o or another system call,
the kernel "sends a signal" to the application, telling it that a user
thread has blocked. The signal will be handled by the remaining kernel
threads assigned to the application. If there are no other threads to
handle the "signal", the kernel may launch a new thread and give it to
the application. The "signal" handler (called an upcall handler) will
mark the user thread that got itself blocked as non-runnable and will
switch to another runnable user thread, if any.

This approach implies co-operation between the kernel and the user
application. The kernel scheduler notifies the application of
scheduling decisions that affect it: creation and deletion of kernel
threads, and a thread becoming blocked, unblocked, or pre-empted. In
the latter case, the pre-emption event is delivered to the application
_after_ it is resumed (so the application cannot stave off the
pre-emption, but it can still learn that it was off the CPU for some
time).

The NetBSD implementation described in the talk uses so-called
scheduler activations. Scheduling event notifications are called
upcalls. They behave pretty much like POSIX signals. An application
registers its intent to be notified of scheduling events by
registering an upcall handler. The application should also allocate a
set of stack frames on which to run the handler. The implementation
described in the talk looks surprisingly similar to call/cc-based
threads/co-routines.
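
The upcall protocol is easy to picture in code. Below is a minimal
sketch of the user side, in C; the event codes, the toy run queue and
all other names are invented for illustration and are not NetBSD's
actual scheduler-activations API:

    #include <stdio.h>

    enum sa_event { SA_BLOCKED, SA_UNBLOCKED, SA_PREEMPTED, SA_NEWPROC };
    typedef void (*sa_upcall_t)(enum sa_event, int thread_id);

    /* toy run queue: runnable[i] != 0 means user thread i may run */
    static int runnable[16];

    static void switch_to_next_runnable(void)
    {
        for (int i = 0; i < 16; i++)
            if (runnable[i]) { printf("resuming user thread %d\n", i); return; }
        printf("no runnable user threads; idle\n");
    }

    /* The upcall handler is the "signal" handler described above: it
       records the scheduling event and picks another user thread.  A
       real handler would restore that thread's registers and never
       return -- compare call/cc. */
    static void upcall(enum sa_event ev, int tid)
    {
        if (ev == SA_BLOCKED)   runnable[tid] = 0;  /* stuck in the kernel */
        if (ev == SA_UNBLOCKED) runnable[tid] = 1;  /* its i/o completed   */
        switch_to_next_runnable();
    }

    int main(void)
    {
        runnable[0] = runnable[1] = 1;
        upcall(SA_BLOCKED, 0);     /* kernel: thread 0 blocked ...       */
        upcall(SA_UNBLOCKED, 0);   /* ... and later: thread 0 unblocked  */
        return 0;
    }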
When the kernel executes an upcall, it runs special trampoline code in
user space. The trampoline saves the stack pointer and the context of
the previously running code, sets up the stack frame and jumps to the
upcall handler. Upcall handlers are supposed to "switch" to an
appropriate user thread; hence, they do not return. To avoid running
out of stack frames for upcall handlers, an application must determine
which stack frames are no longer "relevant" and manually "garbage
collect" them (i.e., mark them as free). The talk clearly showed
kernel hackers re-discovering call/cc.

Scheduler activations will be in the next release of NetBSD. The
maximum number of concurrent threads that ran without encountering
resource limitations was 7000-8000. After the talk, Clem Cole
commented that they had to support more threads, and had to spend an
enormous amount of time on debugging. Clem Cole said that DEC/Compaq
thought they had got threads right in their Tru64 OS -- until they
tried to run a Netcenter, which launched 10,000-100,000 threads. A lot
of bugs showed up then.

* Authorization and Charging in Public WLANs Using FreeBSD and 802.1x

Pekka Nikander, Ericsson Research NomadicLab. Freenix track.

The topic of the talk is how to do authentication in a public wireless
network. This research started from a business proposal for collecting
micro-payments for access to a wireless LAN and other resources. Alas,
the funding for the author's start-up dried up, so he decided to do
research instead.

The talk gave a good overview of possible threats in wireless LANs: a
laptop may pretend to be a router; a laptop may run DHCP and advertise
itself as a default gateway; a laptop can launch denial-of-service
attacks against cheap access points and thus force clients to use
expensive access points. Incidentally, the latter threat has already
been realized, in cellular networks. The presenter said that Russia
has powerful jamming antennas on the border with Finland. They
suppress base stations in the border regions and force cell phones to
connect to base stations in Russia -- which charge 10 times as much
for access.

The rest of the talk explained an implementation of LAN-level
authentication mechanisms (EAP and 802.1x) as Netgraph nodes. Some of
the NG nodes -- e.g., MAC-address-based filtering, tag-based filtering
at the frame level -- are rather generic and can be used in other
projects. Although EAP and 802.1x solve the problem of bilateral
authentication, these techniques do not help much with
multicast/broadcast. Resource discovery and similar protocols are
inherently multicast.

* Design and Performance of the OpenBSD Stateful Packet Filter (pf)

Daniel Hartmeier, Systor AG. Freenix track.

The pf filter chains filtering rules in a linear linked list. For each
packet, the rules are evaluated one after another. As an optimization,
the filters are grouped into "skip steps" according to the interface
number and the source address. If the first filter within a group
fails (e.g., does not match the source address), then the other
filters in the group are skipped.

The topic of the talk is the implementation of _stateful_ rules. The
state of a TCP connection is an (ip-address, port, sequence-number)
tuple. The filter can easily check if a TCP packet belongs to an
already established session. If it does, the packet is allowed
through, without any need to consult the rules.
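
The stateful shortcut is easy to sketch in code. What follows is a
rough illustration in C, not pf's actual implementation (pf keeps its
states in a red-black tree, as noted below, and tracks full
sequence-number windows); every name here is invented:

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    struct state_key {
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
    };

    struct state {
        struct state_key key;
        uint32_t seq_lo, seq_hi;   /* acceptable sequence-number window */
        struct state *next;        /* list as a stand-in for the tree   */
    };

    static struct state *states;   /* the session table */

    /* If the packet matches a recorded session, pass it without
       evaluating the (expensive) rule list. */
    static bool pass_by_state(const struct state_key *k, uint32_t seq)
    {
        for (struct state *s = states; s; s = s->next)
            if (memcmp(&s->key, k, sizeof *k) == 0 &&
                seq >= s->seq_lo && seq <= s->seq_hi)
                return true;       /* established session */
        return false;              /* fall through to rule evaluation */
    }

    int main(void)
    {
        struct state s1 = { {0x0a000001, 0x0a000002, 1234, 80}, 100, 200, 0 };
        states = &s1;
        printf("%s\n", pass_by_state(&s1.key, 150) ? "pass (state)"
                                                   : "check rules");
        return 0;
    }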
A UDP "session" is defined as a flow of packets within a short (e.g., 30 seconds) interval between the packets. The state of sessions is maintained in a red-black tree. The PF filter reassembles a packet from fragments and applies rules to whole packets. The filter may optionally 'randomize' TCP sequence numbers for outgoing packets. The talk compared the pf filter with IPFilter and with iptables on several benchmarks. IPtables is a stateless filter. It turns out that maintaining sessions' state is cheap but evaluating rules is expensive. With IPTables, 200 rules halve the bandwidth. Therefore, stateful filters (which need fewer rules to express a policy) perform better than stateless filters. * Enhancing NFS. Cross-Administrative Domain Access Joseph Spadavecchia and Erez Zadok, Stony Brook University. Freenix track. The talk presented two attempts to tinker with an NFS server to make NFS a bit more secure. The attempts are: range-mapping for UIDs and file cloaking. The latter prevents certain files within an exported file system from being visible to NFS clients. However, a rogue client (a user with a root access) can easily defeat the guards described in the talk. Peter Honeyman clearly (if somewhat emotionally) pointed that fact out. NFS is just an insecure protocol. * Ningaui: A Linux Cluster for Business Andrew Hume, AT&T Labs-Research; and Scott Daniels, EDS. Freenix track. Ningaui is a tiny Australian mouse-like beast. Ningaui is a Linux cluster -- but it's not a Beowulf. The latter is designed for high-performance scientific computations. In contrast, Ningaui is aimed at business applications and stresses high-availability, fault tolerance, and a very intensive i/o. A node in a Ningaui cluster can be brought down, upgraded in software and hardware, and plugged back into the cluster. At that point the node becomes a part of the cluster without any additional intervention. Although currently Ningaui runs on Linux boxes, the cluster software is largely platform independent. In a response to a question, Andrew Hume said that it's easy to plug FreeBSD nodes into the cluster (in fact, he was about to do that). Currently the cluster is based on Linux (RedHat 6.2) boxes inter-connected by a cLAN. The nodes are loosely coupled. The cluster software consists of a job scheduler, a file replicator, IP address takeover software, and the code to run a leader election protocol. Job scheduling, the central service, is implemented as an auction with nodes bidding for jobs. The winning node gains a lease to the job; the node should compute the result within the lease time, or apply for the lease extension. The winning node also loads all necessary files into its local file system. Files are loaded via a 'ccp' (cluster copy command), which effectively replicates the data. No NFS is used. The bidding/leasing implementation automatically guarantees load-balancing, fail-over and recovery. The job scheduler is run by a leader. An election (or the confirmation) of a leader happens every 5 seconds, by a single UDP broadcast by each voter. The leader is determined in 10 ms after the broadcast. The election protocol was designed at AT&T and is rather efficient. Currently the cluster runs an At&T billing auditing application. AT&T call centers upload (ftp) their billing files to the cluster, which uncompresses the files, checks the records, computes statistics and watches for anomalies. One of the files they processed was 140 GB long! 
The job of being the incoming FTP site is itself a "job" subject to
load balancing and bidding. Nodes in the cluster bid for becoming the
FTP site; the winning node gets a lease, acquires the IP address and
starts accepting files. If the node fails, its lease eventually
expires, and another node takes over. No heartbeat monitoring is
necessary.

The talk briefly enumerated problems the authors encountered when
working in such a data-intensive environment. For example, a pipeline
"gzip -d file.gz | a.out > output-file" fails 2-5% of the time on a
heavily loaded system. TCP storms happen: the TCP stack would
unexpectedly accept connections that it does not have the resources to
process. The lack of direct i/o under Linux is a nuisance: all file
i/o goes through the buffer cache, so a single read of an extremely
large file pollutes the cache and evicts all other file
blocks. ReiserFS turns out not to be very good at dealing with few but
humongous files: every couple of weeks the kernel panics. OTOH, thanks
to the logging of updates, rebooting is quick. The super-daemon inetd
has an undocumented connection throttle set to 40 processes/min; the
author had to examine the source code, reset the throttle to 5000
processes/min and recompile inetd. The benefit of having the source
code available is clear.

All the problems encountered with Ningaui so far are merely
nuisances. The authors run a similar billing-auditing application on a
Sun E10K and have encountered more, and more serious, problems. Thus
the E10K is no more reliable or available than a Linux/i386 cluster --
but vastly more expensive.

* CPCMS: A Configuration Management System Based on Cryptographic Names

Jonathan S. Shapiro, Johns Hopkins University. Freenix track. Awarded
Best Paper!

CPCMS (now called OpenCM) is intended to be a replacement for
CVS. Unlike CVS, which keeps track of files, OpenCM keeps track of
collections of objects, their meta-data, and their history. Objects
are things with attributes rather than merely files. A version number
is assigned to a change rather than to the content.

OpenCM was built as the configuration management system for the EROS
project, an open-source, _secure_ OS (www.eros-os.org). Therefore,
OpenCM was designed to be an assured configuration-management system,
which tracks changes not only to files but to sets of files. The
system should provide assurances that file content hasn't been
tampered with. Consequently, access control should extend to
configurations. CVS does not even track file renames.

OpenCM intentionally looks and feels like CVS. Like CVS, OpenCM
includes a command-line client. OpenCM, however, supports rename
operations, assigns versions to configurations rather than to files,
and supports IDE-centric workspaces and cryptographic
authentication. Most interestingly, OpenCM supports disconnected
operation: a user can commit and roll back while disconnected from the
repository.

Objects in a repository are named by a cryptographic hash of the
object's content. This gives the client assurance that the retrieved
configuration has not been tampered with. OpenCM is scalable thanks to
replication; the uniqueness of object names and the immutability of
content objects greatly facilitate the replication. Access control
relies on X.509 certificates (which have to be distributed in a
trusted way, because the certificates are generally self-signed to
avoid the complexities and cost of a CA). Privileges are assigned to
groups, which are made up of users and other groups; the latter
feature supports delegation.
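
Naming an object by a cryptographic hash of its content is simple to
illustrate. Here is a minimal sketch in C using OpenSSL's SHA-1;
whether OpenCM uses SHA-1 specifically is an assumption here, and the
function names are mine. The principle is what matters: equal name
implies equal content, so a client can verify whatever a replica hands
it.

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>   /* link with -lcrypto */

    /* Compute an object's name: the hex form of the SHA-1 digest of
       its content. */
    static void object_name(const unsigned char *content, size_t len,
                            char out[2*SHA_DIGEST_LENGTH + 1])
    {
        unsigned char digest[SHA_DIGEST_LENGTH];
        SHA1(content, len, digest);
        for (int i = 0; i < SHA_DIGEST_LENGTH; i++)
            sprintf(out + 2*i, "%02x", digest[i]);
    }

    int main(void)
    {
        const char *blob = "int main(void){return 0;}\n";
        char name[2*SHA_DIGEST_LENGTH + 1];
        object_name((const unsigned char *)blob, strlen(blob), name);
        /* Any tampering with the blob changes its name, so a client
           that asked for this name can check what it received. */
        printf("object %s\n", name);
        return 0;
    }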
	http://www.opencm.org

* X Meets Z: Verifying Correctness in the Presence of POSIX Threads

Bart Massey, Portland State University; and Robert T. Bauer, Rational
Software Corp. Freenix track.

The paper showed how formal methods helped solve a really hard
multi-threaded problem. The problem arose during the development of
XCB, a client library for the X protocol (see the XCL/XCB talk
above). The library is supposed to be _transparently_ usable in a
multi-threaded environment. Making the library thread-safe is hard,
because the X protocol does not, by its design, prevent
deadlocks. Therefore, the client library (XCB in our case) must be
careful to avoid these potential deadlocks. Alas, such a careful
design is hard. The author tried a brute-force approach and failed:
his algorithm would prevent a deadlock in one case and allow it in
another. Bart Massey would correct the algorithm, only to find another
fatal flaw later. After four attempts, he realized that there must be
a better way.

As a professor at Portland State University, Bart Massey teaches
Z. The textbook in his class is "The Way of Z: Practical Programming
with Formal Methods" (interestingly, this book was written by a
radiologist). Bart Massey decided to follow what he teaches. He wanted
to specify POSIX threads in Z, specify the X server behavior in Z,
describe the properties of the desired client algorithm in Z, and then
formally prove that the whole system is deadlock-free. The latter part
-- the formal proof -- is still wanting. Unfortunately, the semantics
of POSIX threads is not well specified. Still, the mere fact of
describing the client behavior in Z gave a crucial insight that led to
the correct algorithm. That algorithm is implemented in the current
version of XCB. Although there is no formal proof of its correctness
(yet), extensive testing and use have revealed no problems.

Thus the lesson of this talk is that formal methods are indeed
useful. Good ideas come up just from writing down a formal
specification. Engineers use math and logic extensively in their work;
software engineers and open-source developers should do that too.

* Cyclone: A Safe Dialect of C

Trevor Jim, AT&T Labs-Research; and Greg Morrisett, Dan Grossman,
Michael Hicks, James Cheney, and Yanling Wang, Cornell. General track.

The talk made a flimsy case for Cyclone. The presenter never justified
his assertions that developers really need to know the precise details
of data representation and to access the bits and bytes of complex
data structures. Incidentally, if these assertions were true, the
developers should not be using Cyclone instead of C.

Cyclone introduces several kinds of pointers, including a fat
pointer. A fat pointer carries the range information needed for
run-time bounds checks. A fat pointer is not merely a pointer, yet its
details are opaque to developers. A dereference operation via a fat
pointer is not atomic (which creates problems for using Cyclone in
multi-threaded programs).

The idea of a never-null pointer, and of static reasoning about such
pointers, is interesting. Consider
	int @f = x;
where '@' marks a never-null pointer, similar to '*' marking a regular
pointer. An assignment or an initialization of a never-null pointer
entails an _implicit_ run-time null check. However, if the value to
assign is known to the compiler to be a never-null pointer, no
run-time check will be inserted.
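
What that implicit check amounts to can be rendered in plain C. A
conceptual sketch only: Cyclone's actual compilation scheme may
differ, and the helper function below is invented:

    #include <stdlib.h>

    static int cell = 42;
    /* stand-in for any expression of type int* (possibly null) */
    static int *x_maybe_null(void) { return rand() & 1 ? &cell : NULL; }

    int main(void)
    {
        int *x = x_maybe_null();
        /* Cyclone source:   int @f = x;
           conceptually compiles to: */
        if (x == NULL) abort();  /* Cyclone throws an exception instead */
        int *f = x;              /* f is henceforth known non-null ...  */
        return *f != 42;         /* ... so dereferences need no check   */
    }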
Note that the Cyclone compiler adds implicit run-time checks -- which
make the language safer but remove the programmer from "the bare
metal." If a programmer doesn't mind an abstraction over the bare
metal, the programmer would be more comfortable with a more expressive
language such as OCaml or Java.

Performance: from zero overhead up to a 3x slowdown in
pointer-intensive applications. After the talk, several people noted
that a 3x slowdown is unacceptable. Cyclone still supports neither
multi-threading nor completely manual memory management (Cyclone
supports arena management and GC?). One person asked a funny and witty
question: with never-null pointers (marked by '@' rather than the
regular '*'), fat pointers (marked by '?') and contemplated
forward-only pointers, won't the designers of Cyclone run out of
characters to mark all these pointers? Have they considered Unicode?
	http://www.cs.cornell.edu/projects/cyclone/

* Cooperative Tasking Without Manual Stack Management

Atul Adya, Jon Howell, Marvin Theimer, Bill Bolosky, and John Douceur,
Microsoft Research. General track.

The talk stirred up a lively debate, which continued after the talk
and all through the lunch. There are two major concurrency models:
multi-threading and event-driven computation. As Ousterhout pointed
out, threads are a bad idea for most purposes. The acrimony between
the two models is caused by conflating several concepts. The talk took
time to explain the different axes of a concurrent application.

** task management

*** serial task management

At one end is serial task management: a task executes on a CPU until
it is finished, and no other task is scheduled until the first task
completes. This strategy is the easiest to reason about; yet it is
inefficient, as the CPU idles while the task waits for i/o completion.

*** preemptive task management

Execution of tasks can interleave on a uniprocessor or overlap on
multiprocessors. This strategy seems efficient, until we consider the
communication between tasks. If tasks are completely separate and do
not share any data, there is no danger in their being interleaved or
overlapped [see the data partitioning axis below]. However, tasks do
communicate. In the threaded model, tasks share the same address space
and communicate by updating shared variables. With pre-emptive
multi-threading, a thread is generally never sure that it is working
with consistent data structures: a thread can be pre-empted at any
moment, and a newly scheduled thread may alter the data structures. To
be sure of the consistency of its data, a thread must lock the data
structures and then verify (and, if needed, restore) the
invariants. However, a thread must not hold locks for a long
time. Furthermore, to avoid deadlocks, the thread must release locks
in certain situations (for example, when it has failed to acquire all
the locks it needs, or before doing a slow i/o). Once the thread has
released the locks and then re-acquired them, it must verify the
invariants all over again.

*** co-operative task management

This is a compromise between serial and pre-emptive management. Unlike
serial task management, a task may surrender the CPU before completion
and let another task (thread) run in its address space. Unlike
pre-emptive management, a thread yields control of its own accord, at
well-defined points in its execution. Cooperative multi-threading does
not eliminate the problem of consistency of data structures.
After a thread yields control and then regains it, the thread has to
verify all its invariants. However, a thread yields at points defined
by its programmer; between these points, the computation can be
assumed serial, and no locks are needed to guarantee the consistency
of the thread's data. Therefore, the cooperative multitasking model is
significantly less error-prone than the pre-emptive one. Cooperative
multitasking often performs better, too, due to the elimination of the
locking overhead. The work on the Flash web server and last year's
USENIX presentation ["High-performance memory-based web servers:
kernel and user-space performance" by Philippe Joubert et al.] showed
that multi-threaded web servers are not competitive against a finite
state machine with efficient event notification.

** stack management

Stack management is making sure a thread can access its local data
after the thread has lost and then regained control. Automatic stack
management is what is performed, for example, by C thread libraries: a
thread library saves the stack and other context of the thread that
has lost control, and restores the context of the thread scheduled to
run next. Manual stack management is when the programmer must pack a
callback and all the data for it into a closure. The programmer thus
manually creates a continuation frame -- to be executed when an event
arrives.

The other axes are: i/o management (synchronous vs. asynchronous),
conflict management (optimistic vs. pessimistic; semaphores
vs. monitors), and data partitioning. The paper explicitly considers
only the task and stack management axes.

In a multi-threaded application, pre-emption of a thread is
transparent to the programmer. The thread library or the OS takes care
of saving and restoring the thread's stack. OTOH, an event-driven
application often must explicitly specify which function to execute
when an event arrives and which data to pass to it. These data must be
manually packed into a closure. Manual stack management rips a C
function apart and leads to code that is difficult to maintain. The
talk gave several good illustrations of the horrors of event-based
programming in a language that does not natively support closures and
continuations (see the sketch below). The paper also mentions that the
transparency of a thread's pre-emption and resumption is actually a
drawback, which makes multi-threaded programs difficult to reason
about.

The paper contends that the immensely convoluted style of typical
event-passing programs has nothing to do with cooperative task
management. The convoluted style is the consequence of the manual
_stack_ management. The presenter specifically mentioned that Scheme
doesn't suffer from that style problem: in Scheme, event-based
programming looks elegant and natural. As the paper says, "some
languages have a built-in facility for transparently constructing
closures; Scheme's call-with-current-continuation is an obvious
example. Such a facility obviates the idea of manual stack management
altogether. The paper focuses on the stack management problem in
conventional languages without elegant closures."

The rest of the paper and of the talk showed techniques to make
cooperative-multitasking programming, specifically in C, more
pleasant. The paper demonstrated adapters that allow mixing of
cooperative-thread and preemptive-thread code (written by disparate
programmers). Note also the USENIX '01 talk "A toolkit for user-level
file systems" by David Mazieres, who used C++ templates to automate
the building of continuations.
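
To make the "ripping apart" concrete, here is a minimal sketch in C
(all names invented): one logical step, "read a request, then reply",
must be split into two functions, with the locals that would have
lived on the stack packed into a closure struct by hand:

    #include <stdio.h>
    #include <stdlib.h>

    struct closure {                 /* the hand-made continuation frame */
        int fd;                      /* which connection we were serving */
        char buf[128];               /* locals that must survive the wait */
        void (*resume)(struct closure *);
    };

    static void step2_reply(struct closure *c)
    {
        printf("replying on fd %d: %s\n", c->fd, c->buf);
        free(c);                     /* the "stack frame" is freed by hand */
    }

    static void step1_request(int fd, void (*k)(struct closure *))
    {
        struct closure *c = malloc(sizeof *c);
        c->fd = fd;
        snprintf(c->buf, sizeof c->buf, "hello");
        c->resume = k;
        /* Real code would register c with the event loop and return;
           when the i/o event arrives, the loop calls c->resume(c). */
        c->resume(c);                /* simulated event delivery */
    }

    int main(void) { step1_request(7, step2_reply); return 0; }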
The solution advocated in the talk is to use co-routines based on
"fibers". I think, though, that subliminally the message of the talk
was: use Scheme.

*** Additional note about the drawbacks of threads

Another USENIX '02 paper,
	"Ninja: A Framework for Network Services", J. Robert von
	Behren, Eric A. Brewer, Nikita Borisov, Michael Chen, Matt
	Welsh, Josh MacDonald, Jeremy Lau, Steve Gribble, and David
	Culler, University of California at Berkeley. General Track
	proceedings, pp. 87-102,
includes the following passage (the top of the first column on p. 91):
    The scalability limits of threads are well-known and have been
    studied in several contexts, including Internet services [PDZ99]
    and parallel computing [RV89]. Generally, as the number of threads
    grows, OS overhead (scheduling and aggregate memory footprint)
    increases, which leads to a decrease in overall
    performance. Direct use of threads presents several other
    correctness and tuning challenges, including race conditions and
    lock contention.

    Principle: Concurrency is implicit in the programming model;
    threads are managed by the runtime system. Since we must avoid
    excess threads to achieve graceful degradation, we simply prevent
    services from creating threads directly. Instead, services only
    define what could be concurrent, via (explicitly) concurrent
    stages. Conceptually, each stage has a dedicated but bounded
    thread pool, but the allocation and scheduling of threads is
    handled by SEDA. Thus the system as a whole is event-driven, but
    stages may block internally (for example, by invoking a library
    routine or blocking I/O call), and use multiple threads for
    concurrency. The size of the stage's thread pool must be balanced
    between obtaining sufficient concurrency and limiting the total
    number of threads; SEDA uses a feedback loop to manage thread
    pools automatically. The particular policies are beyond the scope
    of this paper, as they only affect the node performance. Roughly,
    allocation is based on effective use of threads (non idle) and
    priorities, while scheduling is based on queue size and tries to
    batch tasks for better locality and amortization (as in [Lar00]).
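
The staged model in the quoted passage is easy to caricature in
code. A sketch under invented names; this is not SEDA's actual API:

    #include <stdio.h>

    #define MAX_WORKERS 4

    struct stage {
        const char *name;
        int queued;        /* events waiting in the stage's queue  */
        int workers;       /* threads currently busy, <= bound     */
        int bound;         /* thread-pool bound, tuned by feedback */
    };

    /* The runtime, not the service, decides whether to dispatch a
       worker: concurrency is implicit in the programming model. */
    static void schedule(struct stage *s)
    {
        while (s->queued > 0 && s->workers < s->bound) {
            s->workers++;  /* run a (possibly blocking) handler */
            s->queued--;
            printf("%s: dispatch; %d worker(s) busy\n", s->name, s->workers);
        }
    }

    int main(void)
    {
        struct stage parse = { "parse", 6, 0, MAX_WORKERS };
        schedule(&parse);  /* only 4 of the 6 events get a thread */
        return 0;
    }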
Mentioned references:

[PDZ99] V. S. Pai, P. Druschel, and W. Zwaenepoel. "Flash: An
efficient and portable Web server." Proc. of the 1999 Annual USENIX
Technical Conference, June 1999.

[RV89] E. Roberts and M. Vandevoorde. "Work crews: An abstraction for
controlling parallelism." DEC SRC Technical Report 42, Palo Alto,
California, 1989. Back in 1989 people had started to realize that
threads aren't good for parallel computing!

[Lar00] J. Larus. "Enhancing server performance with StagedServer."
http://www.research.microsoft.com/~larus/Talks/StagedServer.ppt,
October 2000.

* Improving Wait-Free Algorithms for Interprocess Communication in Embedded Real-Time Systems

Hai Huang, Padmanabhan Pillai, and Kang G. Shin, University of
Michigan. General track.

This talk, perhaps unwittingly, confirmed the previous talk's
observation that threads are really a bad idea. The gist of the
threading model is a (conceptually) parallel execution of computations
that communicate via shared data. This model is fault-prone,
deadlock-prone and priority-inversion-prone; the model exhibits poor
scalability due to locking. The model is also hard to reason about:
e.g., it's hard to compute the worst-case execution time of a
multi-threaded task. The latter is a particular drawback in real-time
(RT) systems.

A better model is wait-free inter-process communication: concurrently
running tasks communicate through shared state, but in a way that
requires no locking and still guarantees race-free execution. The
drawback is bigger space requirements.

The talk described a memory-efficient wait-free algorithm for the
single-writer multiple-reader problem. The algorithm is a combination
of two kinds of non-blocking protocols. One is the non-blocking write
(NBW) protocol, which is as fast as a simple write to shared memory
without any synchronization. The other class of lock-free write/read
protocols relies on the presence of certain atomic operations:
compare-and-set, or atomic increment and decrement. Both classes of
lock-free protocols trade space for safety. The interesting result of
the paper is that by combining the two classes, it's possible to
design a fast lock-free protocol that takes less memory than either of
the original algorithms separately. The insight is an appropriate
partitioning of the reader tasks into slow and fast readers. The fast
readers and the writer interact via the NBW protocol; the slow readers
interact via one of the protocols of the second class (e.g., a
double-buffering protocol). The hybrid protocol takes less space than
either NBW or the double-buffering protocol applied to all readers,
and it is faster than the double-buffering algorithm alone.

The paper and the talk showed the results of several benchmarks. The
graphs demonstrated that the best locking solution to the
single-writer multiple-reader problem has the smallest space
requirements but absolutely horrible worst-case performance.

* Special Closing Session: How Flies Fly?

Michael H. Dickinson, Williams Professor, UC Berkeley.

A fascinating talk about flies: how a small fruit fly maneuvers at
high speed, how it avoids obstacles, and how it lands. To understand
the flight of insects, the presenter contended, we need to take a
holistic approach. We need to investigate how the movement of the
wings produces lift; how a fly moves and controls its wings; how the
visual system modulates these controls; and how behavior (e.g.,
sensing a smell) modulates the reflexive control of the visual system.
Professor Dickinson described an experimental flight arena built in
their lab at Berkeley. Using the arena, they have been investigating
aerodynamic forces, the muscle groups exercised during various flight
stages, the firings of the neurons controlling the muscles, and the
effect of visual cues and other sensors. Flies move in complex
patterns made of straight lines and very sharp 90-degree turns. The
turns are reflexive and governed by the visual system. When a fly
approaches an obstacle on the left, it makes a sharp turn to the
right.

How do flies fly? Folklore says that flies defy aerodynamics. In a
sense, it's true. If we take a wing of a fly and place it in an
aerodynamic tube, we can determine the lift and drag coefficients as
functions of the angle of attack. If we take rapid snapshots of the
wing movements of a fly, we can determine the speed of flight and the
angles of attack during one wing cycle. We can use these experimental
data to compute the average lift -- which turns out to be less than
the fly's weight. So flies can't fly.

A closer inspection shows that flies take advantage of unstable flight
conditions. In all insects, the wings move backward and forward rather
than up and down. When a wing sharply begins its forward motion, it
creates a vortex over the wing. The air in the vortex moves at high
speed and therefore has low pressure; this adds to the lift. At the
end of the forward stroke, the wing _turns_. This enlarges the vortex
and again adds to the lift. The wing then moves backwards and crosses
the wake it created during the forward motion. This interaction again
adds to the lift. With all these effects considered, the lift is
enough for flight. It is interesting to note that planes avoid
unstable flight conditions as much as possible. The wings of a plane
are designed to make the air flow over and above the wing as smoothly
as possible, without building vortices. The wing of a plane never
crosses its own wake. Insects, on the other hand, delight in unstable
flight. They have mastered the unstable flight and thus defeated the
conventional, "stable" aerodynamics.

The wings of a fly are controlled by two pairs of strong power
muscles: two muscles go along the thorax (the body) and two muscles go
across it. During the flight, the power muscles work "asynchronously",
_without_ direct stimulation by the nervous system; the muscles
stimulate each other. The contraction of one pair of muscles moves the
wings forward and stretches the other pair of muscles. This stretching
causes the second pair to contract and to move the wings backwards --
and also to stretch the first pair of muscles. The muscles thus
oscillate in a resonance mode. The brain controls the flight via
steering muscles: small muscles that move a "hinge" connecting the
wings to the body and change the angle of attack and the amplitude of
the wing movement. Thus the nervous system finely modulates the wing
movement.

Flies have two pairs of wings. The second (rear) pair of wings is
almost degenerate. Although the rear wings play no aerodynamic role,
they are very important: the small wings are equipped with sensors,
which are directly connected to the nervous cells that control the
muscles modulating the movement of the main wings. This reflexive
feedback system makes the flight stable even in the presence of wind.

If the nervous system does not directly stimulate the power muscles,
how does the flight start? First the nervous system leaks Ca into the
body fluid.
The presence of calcium brings the power muscles to the brink of the
oscillation mode. To start the oscillations, a fly uses a special
starter muscle. This muscle contracts exactly once during the flight:
it moves the wings and thus pushes the power muscles into the
resonance. When a fly is about to land, the steering muscles disengage
the wings. The power muscles get oxygen not from the blood or the body
fluid but from a number of small pipes that deliver oxygen directly to
the cells.

A robotics lab at UC Berkeley is building an artificial insect, based
on what has been learned from the fruit fly.

The talk made it very difficult to swat flies any more.