Subject: Combinatorics and software support
From: Rich Morin <rdm@cfcl.com>
Date: Tue, 5 Nov 2002 14:26:01 -0800

This note was inspired by, and ties in with, Tom Lord's "how to create
21,780 new free software jobs (2,530 in R&D)", but it goes off in a new
direction.  Hope you all find it interesting and not too OT...

The Problem

When I started programming, back in 1970, most software was written
from scratch.  Examine the problem, pick a language, write some code.
Any operating system interaction was likely to be quite minimal: open
a file, mount a tape, etc.  As a result, I could bill myself as a
"scientific programmer", based on my wits and my Fortran experience.

These days, job listings include "laundry lists" of skills, including
multiple OSes, languages, packages, frameworks, etc.  This is not the
product of confused HR staff, as some would like to think.  Rather, it
is the result of individual projects selecting "one from column A, ..."
from a broad menu of technologies.

My current project, for example, is targeted at end users on Mac OS X,
but it will use a bit of AI and P2P technology along the way.  At this
point, the probable technology list includes CamelBones, Cocoa, CycL,
Interface Builder, Perl, Project Builder, SOAP, XML, etc.

Each of these brings something to the party (or I wouldn't be using it),
but the chance of finding someone with this skill set is laughably small.
And, although some of these tools are easy to learn, others are not.  The
Mac OS X Frameworks, for example, are HUGE; finding the needed method and
figuring out how to invoke it can be a real challenge.

So much for the travails of programmers; now, let's consider the poor
sysadmin.  S/he is faced with a variety of operating systems, tools,
and hardware interfaces, stemming from the design choices made in a
number of different development projects.  Said developers have, BTW,
long since moved on, and didn't leave particularly readable notes
behind.

As in the case of the programmer, the sysadmin's learning curve for
each of the technologies involved can be immense.  To pick one example,
the Mac OS X 10.2 distribution (including the Developer Tools) contains
a bit over 120K files and 40K directories.

In short, we have created a monster.  We build systems which no single
human can understand, then deploy them in assorted combinations at sites
which may have no local administrators (let alone programmers).  The
consequences of this include maintenance headaches, instability, and a
great deal of frustration for all concerned.

Current Approaches

Apple has a pretty nifty update service for Mac OS X, but it only handles
base OS.  If Joe Sikspak adds in a bunch of software, interactions may
emerge which Apple cannot predict, let alone handle.  Thus, if things go
wonky, Joe is largely on his own.

Assorted folks have designed systems that allow fleets of machines to be
administered from a central location.  These can be very effective, as
long as local modifications are kept under control.  But, in many cases,
this is not politically acceptable: users want to be able to install the
software they want, when they want it.

As the degree of local autonomy goes up, the amount of centralized control
(and consequent stability) diminishes.  In fact, given much variation in
local systems, the central administrators will lose the ability to track
the implications of particular local changes.  Instead, they'll have to
fall back on distributing patch kits and putting out brush fires.

A Model Solution

I'd like to propose a partial solution to this conundrum, based on tracking,
feedback, analysis, and modeling.  Let's say that Joe Sikspak's system kept
track of its operational status (e.g., file checksums and time stamps, events
gathered from log files) and reported back to a centralized server.
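To give the flavor of the client side, here's a rough sketch in Perl.
(The report format, the directory scanned, and the choice of MD5 are
all placeholders of my own; a real reporter would cover far more
ground and would actually ship its results off to the server.)

    #!/usr/bin/perl -w
    use strict;
    use Digest::MD5;
    use File::Find;

    my @report;

    find(sub {
        return unless -f $_;                 # regular files only
        open(my $fh, '<', $_) or return;
        binmode($fh);
        my $sum   = Digest::MD5->new->addfile($fh)->hexdigest;
        my $mtime = (stat($_))[9];           # modification time
        push(@report, join("\t", $File::Find::name, $sum, $mtime));
    }, '/usr/local');

    # A real client would send this off (via SOAP, say); for now,
    # just print it.
    print "$_\n" for @report;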

Given enough such reports, the server could look for common patterns, detect
exceptions, etc.  The human administrators would look through these reports,
dealing with them as appropriate.  In some cases, Joe should be told that he
has a problem.  In other cases, a rule might be added to the model to define
how this variation should be handled.
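A toy version of the exception-finding step might look like this,
assuming (purely for illustration) that each report line carries a
host name, a file path, and a checksum, tab-separated:

    #!/usr/bin/perl -w
    use strict;

    my %sums;    # $sums{$path}{$checksum} = [ host, host, ... ]

    while (<>) {
        chomp;
        my ($host, $path, $sum) = split /\t/;
        push(@{ $sums{$path}{$sum} }, $host);
    }

    for my $path (sort keys %sums) {
        my $seen = $sums{$path};
        next if keys(%$seen) == 1;     # everyone agrees; move on
        # Call the checksum reported by the most hosts "common
        # practice"; anything else is an exception.
        my ($common) = sort { @{$seen->{$b}} <=> @{$seen->{$a}} }
                            keys %$seen;
        for my $sum (grep { $_ ne $common } keys %$seen) {
            print("$path: @{$seen->{$sum}} differ from the crowd\n");
        }
    }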

Modeling a single computer system is an overwhelming task; it's impossible
to predict what might happen, what a given program or user will do, etc.  My
suspicion, however, is that modeling sets of systems may actually get a lot
easier.  If the model can recognize "common practice", the exceptions will
stand out and can be analyzed separately.  If administrative rules for "best
practice" can be formalized, they can be folded back into the model.

Most of the system I'm describing is pretty prosaic: tracking local system
metadata and reporting it to a server, for instance, are "trivial" (lots of
work to get right, but no fundamental problems :-).  The tricky part lies in
the analysis and modeling portions.

Ideally, the system I have in mind would also support a lot of introspection;
a human should be able to ask pretty arbitrary questions and have the system
"figure out" the answer from its collected data, rules, etc.  This isn't the
sort of thing I want to program in an imperative language such as Perl;
that would be both painful to write and brittle to use.

At this point, I'm looking hopefully at OpenCyc (www.opencyc.org), an
inference engine built on a form of predicate calculus.  OpenCyc has shown
that it can handle very large sets (e.g., millions) of facts and rules.  It
also has interfaces for SQL and XML, allowing some useful extensibility in
these directions.
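To give a rough idea of the flavor, a "best practice" rule like the
one mentioned above might be written in CycL-ish notation something
like this (the predicates and constants below are my own inventions,
not part of the actual Cyc knowledge base):

    ;; fact: host17's copy of /bin/ls differs from common practice
    (#$checksumMismatch #$Host17 "/bin/ls")

    ;; rule: any host with a checksum mismatch in a tracked file
    ;; should be flagged for administrative review
    (#$implies
      (#$checksumMismatch ?HOST ?FILE)
      (#$flaggedForReview ?HOST ?FILE))

Asking the engine for bindings of (#$flaggedForReview ?HOST ?FILE)
would then be just the sort of ad hoc, introspective query I described
above.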

Thanks for listening; comments and suggestions are welcome.

-r
-- 
email: rdm@cfcl.com; phone: +1 650-873-7841
http://www.cfcl.com/rdm    - my home page, resume, etc.
http://www.cfcl.com/Meta   - The FreeBSD Browser, Meta Project, etc.
http://www.ptf.com/dossier - Prime Time Freeware's DOSSIER series
http://www.ptf.com/tdc     - Prime Time Freeware's Darwin Collection