Notes on the 2010 LISA conference

Norman Wilson
Electrical and Computer Engineering
University of Toronto

Boilerplate

LISA is an annual system administration conference held by SAGE, the system administrators' special interest group within the USENIX Association. The 2010 conference took place in San Jose, CA, 7-12 November 2010.

The event comprises a lot of half- and full-day tutorials and three days of technical sessions. The latter run in parallel tracks: `refereed papers,' usually three talks to a 90-minute session; `invited talks,' 90 minutes each; and a few other events. There are also a number of one- and two-hour informal birds-of-a-feather (BOF) sessions each evening on a variety of special topics.

Proceedings are published only electronically. Those registered for the technical sessions get a USB drive containing all the refereed papers in PDF format. Those registered for at least one extra-cost tutorial get a USB drive with additional tutorial-specific material. The papers, other material (often copies of slides) from the invited talks, and sometimes other stuff including audio and video recordings are made available on the conference web site. Initially only USENIX and SAGE members and registered attendees can get to the conference material on the web; after the conference, access is open to anyone.

What follows are my notes on the parts of the conference I found most interesting. My tastes may not be the same as yours. Normally I attend only the technical sessions, since the tutorials rarely seem interesting enough to justify the extra cost. This year I attended most of a full-day presentation about Oracle Solaris 11 as well (it didn't cost extra).

Solaris futures

Engineers and system administrators from Oracle (formerly from Sun) gave an all-day series of talks lauding Solaris 11 Express, the field-test version of the next major release of Solaris, which was then about to be made generally available. (It was indeed released a few days later.)

Here are some highlights. Much of it is not news to avid followers of OpenSolaris, from which (as expected) Solaris 11 evolved.

Software packages

The format and tools for packaging software, whether parts of Solaris or third-party additions, have been redone from scratch.

The new scheme bundles fewer things into each package. Packages are stored in a repository, which may be in a local file system or accessed over the network. A package records the other packages it depends on, somewhat like the old system, except that since packages are more specific, the dependencies are finer-grained; it is believed that the dependency graph is now correct and complete. (The dependencies recorded in Sun's old packaging scheme were neither.)

Because packages are finer-grained, and because dependencies are explicit, updating an individual package is much easier. Patches are no longer needed; new versions of the affected packages are issued, and updating one package automatically updates any others depending on it (and any depending on those others, and so on). It is no longer necessary to patch a system up to date after installing; the default is to install the newest version of each package, so to stay current one just keeps the package repository current.
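
As a purely conceptual sketch (this is not how the new packaging system is actually implemented), the update-propagation rule amounts to walking the reverse dependency graph; the package names below are invented.

    # Conceptual sketch only: updating one package pulls in every package
    # that depends on it, directly or indirectly.
    deps = {
        # hypothetical dependency graph: package -> packages it depends on
        "mail-server": ["ssl-lib", "db-lib"],
        "web-server":  ["ssl-lib"],
        "ssl-lib":     ["crypto-lib"],
        "db-lib":      [],
        "crypto-lib":  [],
    }

    def dependents_to_update(updated, deps):
        """Return every package that (directly or indirectly) depends on 'updated'."""
        # invert the graph: package -> packages that depend on it
        rdeps = {}
        for pkg, needs in deps.items():
            for need in needs:
                rdeps.setdefault(need, []).append(pkg)
        result, frontier = set(), [updated]
        while frontier:
            pkg = frontier.pop()
            for dependent in rdeps.get(pkg, []):
                if dependent not in result:
                    result.add(dependent)
                    frontier.append(dependent)
        return result

    print(sorted(dependents_to_update("crypto-lib", deps)))
    # -> ['mail-server', 'ssl-lib', 'web-server']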

All this is rather like what some Linux distributions have been doing for some time: e.g. the rpm/yum mechanism for the Red Hat/Fedora family, or apt for the Debian family.

Old-style SVr4 packages (pkgadd(1M) et al) are still allowed, but are not used by Solaris itself. Third-party vendors are urged to adopt the new stuff.

System installation

The mechanism for installing Solaris has been completely redone as well. The manual installers are much simpler, offering two basic choices (workstation or server) rather than inviting you to select from a lot of different options. If you want to customize further by hand, you are expected just to use the new packaging tools to fetch and install whatever else you need or want.

Jumpstart is replaced by a new Automatic Installer (AI). The new design takes a rather higher-level approach, with a cleaner separation between work that must be done while the installer is running (creating the root file system if necessary, installing software packages) and work better done by the newly-installed system. Hence one is no longer required, or even expected, to use the same release of Solaris in the installation environment as is being installed. It is also meant to make it less tricky to install a new virtual machine or zone.

Conceptually, AI can be booted over the network or from local media like a CD-ROM. AI then figures out network configuration from DHCP (RARP and bootparams are no longer used); locates or generates a manifest describing what is to be installed where, similar to a Jumpstart profile; locates a package repository, which may be on local media or somewhere out on the network; and installs everything and reboots.

There are no more Jumpstart begin and finish shell scripts. In place of begin commands there is some mechanism to allow the manifest to be generated on-the-fly (or at least such a mechanism is planned). In place of finish commands, one is expected either to supply software packages (which can include commands to be executed after installation), or to install first-boot self-disabling SMF services to finish configuration when the newly-installed system is first booted. All this affords a different and mostly higher-level way of doing things; experience will show whether it suffices. (I can think of one or two things I do with Jumpstart that I'm not sure how to fit into the new scheme.)

The root file system is always expected (probably required) to be a ZFS pool. The installer can use this to advantage; for example, the clumsy Live Upgrade mechanism is replaced by one that stores different bootable roots in different ZFS snapshots, making it easy to switch back and forth.

Campus license and support agreements

The all-day session was about the code, not about licensing policy, but I spoke later to an Oracle rep in their booth in the exhibit hall about campus software licensing. Sun had a reasonably-priced campus-wide support agreement for universities; Oracle has no such thing yet for Solaris. This will be a big problem next year, when our university's existing agreement expires; we can't afford to license our systems one-by-one, especially the older ones that were originally very expensive. (Oracle presently charges 8% of the original price paid to license a Sun system.)

I was told that the Sun people at Oracle understand the need for university campus agreements and are working on how to do such a thing within the framework of Oracle. The bad news is that they think it may be as long as another year before it can happen. It is not clear where that leaves us.

Other Solaris tidbits

The Crossbow virtual-network-interface subsystem discussed at last year's LISA is now part of the system. This allows one or more virtual NICs to share zero or more physical network interfaces, with traffic divided in various ways (by destination MAC address, by VLAN) and configurable rate limiting to share the available bandwidth fairly.

ZFS gains more features, including encryption, deduplication, and a new zfs diff command to list files that differ between two snapshots.

Something that is already in Solaris 10 but that I hadn't noticed: not only is it possible to make a Solaris-8-branded zone on a Solaris 10 system (so that programs running within that zone think they're on Solaris 8), it is possible to specify an arbitrary hostid for a zone. Thus, for example, if one has an ancient software license that can no longer be updated and is tied to a specific hostid, one can at least get rid of the corresponding ancient hardware by moving the OS image into a zone.

Postfix

Wietse Venema of IBM Research gave an interesting and entertaining invited talk about the past, present, and future of the Postfix mailer.

For those who don't know it, Postfix is a mail-transfer agent (MTA) written from scratch, partly in response to all the serious flaws found in sendmail in the late 1980s and early 1990s, partly just as an example of how to write a complex subsystem without undue security risk. It is designed as a collection of relatively-modest-sized programs, most of them running without special permissions, rather than as a single enormous privileged program. Postfix is freely available, and is included as an option in many Linux distributions.

I hadn't realized that Postfix runs on quite a lot of SMTP servers now. A survey someone made in 2006, which tried to limit itself to official second-level domain gateways, found about 12.3% of servers using sendmail, 8.6% Postfix, 8.5% Postini, 7.6% Microsoft Exchange.

Postfix has grown over the years, mostly by adding new pieces rather than by enlarging the existing ones. The evolution of the e-mail world, in particular the enormous growth of junk mail, has invalidated some early assumptions and provoked some changes and rearrangements, but in general the design has evolved gracefully (or so it appears from the outside, anyway). For example, Postfix now has both its own SMTP-based mechanism for vetting incoming mail and a way to plug in filters designed for the sendmail Milter protocol. Wietse was tickled to receive a Sendmail Innovation Award for the latter.

One change to reduce junk-mail calls particularly caught my fancy. The SMTP protocol requires that the client wait for a greeting message from the server, then send one command at a time, waiting for a response before sending another. Many junk-mail robots don't bother with these rules: they just send a block of commands all at once, expecting the server to read them one by one and ignoring any errors. Some SMTP servers, including sendmail and the simpler one I use for my home systems, can be told to wait a few seconds before sending the initial greeting; the idea is that if the client sends something in those few seconds, it's a rule-breaker and the connection might as well be dropped. The trouble is that this catches hardly anyone. Wietse suspects that the robots do wait for the greeting before sending their block of commands, but that they don't bother to parse it correctly. So he changed Postfix to send the greeting in two lines (using the SMTP continuation-line syntax, so there is no protocol violation) and to drop the connection if the client sends anything before the second line goes out. Wietse says this catches robots much better. I look forward to trying this out in my own code.
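
Here is a rough sketch of the trick, not Postfix's actual code; the host name, port, and timing are made up for illustration. The server sends the first line of a two-line greeting, pauses, and drops any client that speaks before the second line goes out.

    import select
    import socket

    BANNER_1 = b"220-mail.example.org ESMTP\r\n"   # continuation line of the greeting
    BANNER_2 = b"220 mail.example.org ESMTP\r\n"   # final line of the greeting

    def greet_and_screen(conn, pause=2.0):
        """Send the greeting in two lines; return False if the client talks too soon."""
        conn.sendall(BANNER_1)
        # A well-behaved client stays silent until the final 220 line arrives.
        ready, _, _ = select.select([conn], [], [], pause)
        if ready:
            return False          # rule-breaker: caller just drops the connection
        conn.sendall(BANNER_2)
        return True

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("", 2525))     # a test port, not the real SMTP port 25
    listener.listen(5)
    while True:
        conn, peer = listener.accept()
        if greet_and_screen(conn):
            pass                  # a real server would run the SMTP dialogue here
        conn.close()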

IPv6

Richard Jimmerson of ARIN gave an invited talk encouraging everyone to start using IPv6 because the supply of IPv4 addresses is running short. I've become somewhat jaded on this topic, having heard a number of strident speeches about it over the past 15 years or so, so I went to another talk. Later I listened to the MP3 recording of Jimmerson's talk. I'm glad I did; his presentation was practical rather than strident.

As of the middle of November 2010, when Jimmerson spoke, only 12 /8 IPv4 address blocks remained unallocated. That was the entire remaining supply for all five regional address authorities worldwide. Since 14 /8s had been allocated worldwide to date in 2010, it seems likely the supply will be exhausted sometime in 2011.

There is no immediate disaster when the address reserve runs out. There is an agreement among the world's five regional address-allocation authorities that when the reserve reaches five /8s, each authority gets one more, so no region will be immediately starved. No organization gets addresses from the authorities one at a time: they get chunks large enough to last for a while, so existing organizations have some breathing room too. Organizations occasionally surrender address space, sometimes out of the goodness of their hearts, sometimes just when an organization ceases to exist. ARIN expects a market to develop: you need some addresses; I have a /24 block I don't need; how much will you pay me? But things will continue to get tighter. IPv6 is probably the only way out in the long term.

Nobody plans to discard or deprecate the existing IPv4 network any time soon. The trouble is just that we're running out of room to add new IPv4 hosts. It becomes more and more likely that some host you want to talk to, or some that wants to talk to you, will speak only IPv6. Everyone will be happier if that doesn't mean the two of you can't talk.

Jimmerson suggests we think not of `IPv6 conversion,' but of `adoption:' adding IPv6 support gradually to our existing IPv4 networks. There's no immediate reason to toss out an existing network that works just fine, but it would be prudent to plan for communicating with hosts that can speak only IPv6. It would make sense to teach external-facing services like HTTP servers and SMTP gateways to speak both IPv4 and IPv6, and to look into IPv4/IPv6 address-translation boxes so your existing IPv4 hosts can talk to others' IPv6-only sites.
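
As a small illustration of the dual-protocol idea (not anything Jimmerson presented), here is one common way to make a simple TCP service answer both IPv4 and IPv6 clients from a single socket; the port number is arbitrary, and some operating systems insist on separate v4 and v6 sockets instead.

    import socket

    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Turn off IPV6_V6ONLY so IPv4 clients show up on the same socket
    # as IPv4-mapped IPv6 addresses (::ffff:a.b.c.d).
    s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    s.bind(("::", 8080))          # "::" means all interfaces, v4 and v6
    s.listen(5)

    conn, peer = s.accept()
    print("connection from", peer[0])   # e.g. 2001:db8::1 or ::ffff:192.0.2.7
    conn.close()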

This sort of thinking works in the other direction too. New organizations ought to consider running (mostly) IPv6 internally, but will need a few IPv4 addresses for external-facing services to communicate with the existing IPv4 network world. There is a rule that each address authority must set aside a quarter of its final /8 IPv4 block for this and other transition needs.

There are still substantial stumbling blocks in the way of wholesale IPv4-to-v6 conversion. It will take a while for hard-to-convert legacy systems to die off. There are many embedded-system devices like network switches and print servers that currently understand only IPv4 and must be upgraded or replaced. (This is a big concern for retail ISPs, many of whom have deployed thousands of such devices in customers' homes.) Some of the less-formal aspects of Internet infrastructure have yet to catch up: IPv6 RBLs, for example. Perhaps most important: we all have a lot of experience deploying and troubleshooting IPv4 networks, but not so much for IPv6.

So although IPv4 addresses will henceforth be in very short supply, and it is smart to begin phasing in IPv6, there's no need for panic. (Strident doomsayers—they are still around—notwithstanding.) The IPv4 network will be around for many years yet. But it may not be so long before some resources can be accessed only with IPv6, so we shouldn't sit complacently on our hands either.

(Update: the remaining supply of unallocated /8 blocks dropped to five at the beginning of February 2011. As planned, the remaining five were then allocated, one to each of the five worldwide address authorities.)

Odds and ends

Delaet, Joosen, and Vanbrabant of the Katholieke Universiteit Leuven (in Belgium) have attempted to construct a taxonomy of configuration management systems, assessing specification models, language properties, type and availability of support, scalability, and other metrics. The idea is not to declare absolute winners and losers, but to suggest which systems might best fit a given environment. The classification of specific systems is a work in progress; see their web site for the most-current info.

Adam J. Oliner of Stanford presented the notion of `influence' as a way to understand events in complex systems. What he means is something familiar to any experienced debugger: don't just look at the causes you know about, look for any events that might be correlated with the bug, as such correlations often hint at surprising chains of causality. The experienced debugger does this through intuition; Oliner has a formal approach. His motivating example was a self-driving automobile that kept swerving unexpectedly, nearly running itself off the road. This was an invited talk, so no paper was published; I hope to look later for a written version.
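
Since no paper was published, the following is emphatically not Oliner's method, just a toy sketch of the underlying idea: score candidate event streams by how often they occur shortly before the problem you are chasing, and look hardest at the ones that score high. All names and numbers below are invented.

    def influence_score(problem_times, event_times, window=5.0):
        """Fraction of problem occurrences preceded by the event within 'window' seconds."""
        hits = sum(
            any(0 <= p - e <= window for e in event_times)
            for p in problem_times
        )
        return hits / len(problem_times) if problem_times else 0.0

    swerves = [102.0, 250.0, 407.0]                 # times the car swerved
    candidates = {
        "gps-resync": [100.5, 248.0, 405.5, 600.0],
        "disk-flush": [50.0, 300.0],
    }
    for name, times in candidates.items():
        print(name, influence_score(swerves, times))
    # gps-resync scores 1.0, disk-flush 0.0: worth a closer look at the GPS.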

There were several talks about techniques and engines for analyzing system logs to try to spot errors. None of them struck me as especially interesting, but of late I have become more interested in real-time monitoring than in log analysis, so perhaps I just didn't pay enough attention. I ought to go back and read the papers.

Bill Cheswick of AT&T Research gave a talk I've seen him give before. The gist is that we've completely lost our heads about passwords: we obsess about making them hard to guess, forcing people to change them frequently, encouraging people to have a different password for each of the 73 Internet services they use daily. But these are not today's problems. Guessing passwords (except trivial ones like one's own name or birth date) ceased to be a real problem years ago. It has been understood for decades that the more often passwords must be changed, and the more passwords one must have, the more likely one will pick simple silly ones or store them where they can easily be snooped (on pieces of paper, in configuration files).

A more serious concern is that passwords can be intercepted by tapping the network (encryption makes that harder but not impossible) or by keystroke-logging and other malware. We have known for decades how to do one-time and challenge-response passwords, and the near-universality of GUIs opens up even more possibilities (my password? it's when I click on this specific map location, or that interesting little knob in the Julia set) that are far more resistant to real threats and far easier on the end user, especially when the end user is Grandma. Maybe we should be looking into such alternatives to passwords rather than obsessing about past issues that either don't matter or make things worse.
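
To make the idea concrete, here is a minimal sketch of one familiar flavor of challenge-response (an HMAC over a server-chosen nonce); it is only an illustration, not anything Cheswick proposed, and the shared secret and names are made up.

    import hashlib, hmac, os

    shared_secret = b"set-up-out-of-band"    # hypothetical; never sent over the network

    def make_challenge():
        return os.urandom(16)                # server picks a fresh random nonce

    def respond(secret, challenge):
        # client proves knowledge of the secret without revealing it
        return hmac.new(secret, challenge, hashlib.sha256).hexdigest()

    def verify(secret, challenge, response):
        expected = respond(secret, challenge)
        return hmac.compare_digest(expected, response)

    challenge = make_challenge()                        # server -> client
    response = respond(shared_secret, challenge)        # client -> server
    print(verify(shared_secret, challenge, response))   # True; a replayed
    # response fails, because the next login uses a different challenge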

Bill doesn't expect to get his point across any time soon, but he keeps trying.