Subject: Re: FSBs and mechanized documentation
From: Rich Morin <rdm@cfcl.com>
Date: Tue, 7 Mar 2006 16:20:09 -0800

At 1:30 PM +0900 3/6/06, Stephen J. Turnbull wrote:
>>>>>> "Rich" == Rich Morin <rdm@cfcl.com> writes:
>
>     Rich>   *  The raw information is available for inspection.
>
>     Rich>      Open Source development tools (e.g., Bugzilla, CVS)
>     Rich> have accessible code and data, so tools can easily extract
>     Rich> relationship information, etc.  Also, the information is
>     Rich> free of proprietary restrictions.
>
> This is generally not true of bug databases.

I'm not sure of the referent for "this", above, so I'll make a general
response.  Bugzilla is Open Source, as is MySQL (which it uses).  I've
dug into Bugzilla's tables to extract information.  It wasn't as easy
as it might have been, because I couldn't find any documentation on the
schema, etc.  Nonetheless, I was able to extract the data I needed.
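
For the curious, the digging looked roughly like this, in Python with
the MySQLdb driver.  The table and column names (bugs, dependencies,
bug_id, short_desc, bug_status, blocked, dependson) are from memory
and may vary by Bugzilla version; the host, credentials, and database
name are placeholders:

  # Pull bug summaries and dependency links out of Bugzilla's tables.
  import MySQLdb

  conn = MySQLdb.connect(host="localhost", user="reader",
                         passwd="secret", db="bugs")
  cur = conn.cursor()

  # One row per bug: id, one-line summary, current status.
  cur.execute("SELECT bug_id, short_desc, bug_status FROM bugs")
  for bug_id, summary, status in cur.fetchall():
      print(bug_id, status, summary)

  # Relationship data: which bugs depend on which others.
  cur.execute("SELECT blocked, dependson FROM dependencies")
  for blocked, dependson in cur.fetchall():
      print("bug %d depends on bug %d" % (blocked, dependson))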

I'd consider it unfortunate for an Open Source project to restrict read
access to its bug report data, but that is an administrative issue, not
a technical limitation.  It's also possible that some Open Source bug
reporting systems use storage formats that aren't easy to parse, etc.
Again, that's unfortunate, but easily resolvable if need be.


> ...  The documentation is copious and pretty unusable---I found myself
> referring to source a lot ...  Both are due to the fact that it's
> automatically generated. ... My conclusion is that automatic generation
> by itself is not very useful.

Bingo.  Machines are good at some things; people are good at others.  If
you want text that explains things, try to find a knowledgeable person
who can write it.  Failing that, try to get a knowledgeable person to
give a brain dump to a writer, then look over the results.  However, if
you want to keep track of detailed information, use a computer!

The better documentation generators, IMHO, take advantage of both human
and mechanized input.  They allow (nay, encourage!) programmers to write
down the purpose and usage information for each substantial element (e.g.,
data structure, file, function).  Even if programmers (and managers!) are
willing to play along, convenience is a critical factor.
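
As a toy illustration (in Python, with invented names), the purpose
and usage prose can sit right next to the code, where it's convenient
to write, and a trivial harvester can pull it back out mechanically:

  import inspect

  def parse_config(path):
      """Read a key=value configuration file and return it as a dict.

      Usage:  settings = parse_config("/etc/myapp.conf")
      """
      settings = {}
      with open(path) as f:
          for line in f:
              line = line.strip()
              if line and not line.startswith("#"):
                  key, _, value = line.partition("=")
                  settings[key.strip()] = value.strip()
      return settings

  # The mechanized half: pull the human-written text back out.
  print(inspect.getdoc(parse_config))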

Mechanical harvesting, however, can supplement this text by adding details
that might otherwise be inconvenient to discover.  For example, static
analysis of files (e.g., code, documents, libraries) and dynamic analysis
of activities (e.g., using DTrace to log which files a program touches)
can be used to find a wealth of relationships.
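
Here's a rough sketch of the static half, assuming a C source tree and
a deliberately naive include-scanning regex; a real harvester would
handle many more cases:

  import collections
  import os
  import re

  INCLUDE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')

  def harvest(tree):
      """Map each .c/.h file under `tree` to the headers it includes."""
      uses = collections.defaultdict(set)
      for dirpath, _dirs, filenames in os.walk(tree):
          for name in filenames:
              if not name.endswith((".c", ".h")):
                  continue
              with open(os.path.join(dirpath, name), errors="replace") as f:
                  for line in f:
                      m = INCLUDE.match(line)
                      if m:
                          uses[name].add(m.group(1))
      return uses

  for src, headers in sorted(harvest("src").items()):
      print(src, "->", ", ".join(sorted(headers)))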

This information can be used to generate indexes and other navigational
aids (e.g., image-mapped dot(1) diagrams with pop-up text), etc.  Indeed,
one of the great strengths of mechanically-generated web pages is their
ability to provide a completeness and consistency that a human could not
(and certainly would not) attempt.
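
For instance, relationship data like the above can be dumped straight
into dot(1) input; the URL attribute on each node is what makes dot's
image-map output (-Tcmapx) link each box to its own page, and the
tooltip attribute supplies the pop-up text.  The page-naming scheme
below is invented:

  # Example relationship data, e.g. from a harvester like the one
  # sketched above.
  uses = {
      "main.c": {"config.h", "util.h"},
      "util.c": {"util.h"},
  }

  print("digraph deps {")
  print('  node [shape=box, fontsize=10];')
  for src, headers in sorted(uses.items()):
      # URL drives the image map; tooltip supplies the pop-up text.
      print('  "%s" [URL="%s.html", tooltip="source file"];' % (src, src))
      for hdr in sorted(headers):
          print('  "%s" -> "%s";' % (src, hdr))
  print("}")

  # Render with something like:
  #   dot -Tpng -o deps.png -Tcmapx -o deps.map deps.dot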


As noted in my responses to Quinn, an integrated documentation system
can answer a wide range of questions.  Some of these answers might be
interesting to developers, others might aid in project administration.

Given that many software projects passed "enormous" some time back,
I'd think that we all might like a bit of automated assistance.  The
question is what it should look (and act) like.  My current thinking,
BTW, is that a cross between a wiki and a mechanized documentation
system might have some really useful characteristics.  See my blog:

  http://www.cfcl.com/weblog/archives/001002.html

-r
-- 
http://www.cfcl.com/rdm            Rich Morin
http://www.cfcl.com/rdm/resume     rdm@cfcl.com
http://www.cfcl.com/rdm/weblog     +1650-873-7841

Technical editing and writing, programming, and web development