Subject: Re: mechanised documentation and my business model solution
From: Rich Morin
Date: Fri, 24 Mar 2006 02:36:03 -0800

When I speak of mechanized documentation, I have in mind
documentation generators (eg, Doxygen, JavaDoc, POD) and
other tools, as described in my Model-based Documentation
essays.  As I can't expect
folks to wade through these pages, here's a precis and a
loose "cost summary".

Mechanized documentation systems can harvest three basic
sources of information:

  *  explicit notations (eg, specially-formatted comments)
     made for the use of the documentation system

  *  existing documentation (eg, man pages, PDF files)

  *  static (eg, source code, libraries) and dynamic
     (eg, log files, DTrace output) system information
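
As a rough illustration of the first source, here is a
minimal Python sketch of a harvester for explicit
notations.  The "#: key: value" comment syntax and the
names harvest_notes and SAMPLE are invented for this
example; they are not taken from Doxygen, JavaDoc, POD,
or the MBD essays.

```python
import re

# Hypothetical notation (an assumption for this sketch): a comment
# line of the form "#: key: value", starting at the beginning of a line.
NOTE_RE = re.compile(r"^\s*#:\s*(\w+):\s*(.+?)\s*$")


def harvest_notes(source: str) -> list[tuple[int, str, str]]:
    """Return (line_number, key, value) for each explicit notation."""
    notes = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = NOTE_RE.match(line)
        if match:
            notes.append((lineno, match.group(1), match.group(2)))
    return notes


SAMPLE = """\
#: purpose: rotate the log files nightly
def rotate():
    # ordinary comment, ignored by the harvester
    compress()  #: trailing notation, also ignored in this sketch
"""

for note in harvest_notes(SAMPLE):
    print(note)
```

Only the first line of SAMPLE is harvested; ordinary and
trailing comments are skipped, which keeps the notation
cheap to write and unambiguous to parse.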

However, in order to do this, they must first embody some
"model" of the system under study.  This is basically a
loose ontology (ie, collection of entity and relationship
classes) that defines the sorts of things we wish to track.

This model allows the documentation system to have places
to put harvested data, ways to vet the appropriateness of
collected relationship instances, etc. (minor hand wave;
see the MBD essays for details, case studies, etc. :-)
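
To make the "model" idea a bit more concrete, here is a
minimal Python sketch of a loose ontology that stores
harvested entities and vets relationship instances.  The
Model class, its method names, and the "calls"/"reads"
relationship classes are all assumptions made up for this
example; the MBD essays may structure things differently.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """A loose ontology: entity classes plus relationship classes."""

    entity_classes: set[str]
    # relationship class -> (source entity class, target entity class)
    relationship_classes: dict[str, tuple[str, str]]
    entities: dict[str, str] = field(default_factory=dict)  # name -> class
    relationships: list[tuple[str, str, str]] = field(default_factory=list)

    def add_entity(self, name: str, cls: str) -> None:
        """Record a harvested entity, rejecting unknown entity classes."""
        if cls not in self.entity_classes:
            raise ValueError(f"unknown entity class: {cls}")
        self.entities[name] = cls

    def add_relationship(self, rel: str, src: str, dst: str) -> bool:
        """Store the instance only if it fits the model; report the result."""
        expected = self.relationship_classes.get(rel)
        actual = (self.entities.get(src), self.entities.get(dst))
        if expected is None or actual != expected:
            return False
        self.relationships.append((rel, src, dst))
        return True


# Hypothetical model for a small code base.
model = Model(
    entity_classes={"function", "file"},
    relationship_classes={
        "calls": ("function", "function"),
        "reads": ("function", "file"),
    },
)
model.add_entity("main", "function")
model.add_entity("parse", "function")
model.add_entity("config.txt", "file")

print(model.add_relationship("calls", "main", "parse"))
print(model.add_relationship("calls", "main", "config.txt"))
```

The second call is rejected because a "calls" instance must
link two functions; that check is the vetting step.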

Given this scenario, we have two major costs to consider:

  *  creation/maintenance of the explicit notations

  *  creation/maintenance of the documentation system

The cost of the first part can be zero, if no explicit
notations are made.  However, this will yield poor
results, as there won't be any explanation of the intent
of the programmers (OK; this calls that, but why?).

So, we should expect some small percentage (eg, 1%) of
the programming effort to be spent on adding comments.
The lurking detail here is that this percentage may not
have been spent on an existing system.  So, the cost of
"catching up" may be substantial.  However, some of it
can be paid over time, as part of system maintenance.

The cost of the second part is much higher, initially,
but it only has to be paid once.  That is, if the doc.
system "understands" how to harvest and present given
sorts of information, any code base can serve as input.
For example, Doxygen reads fairly arbitrary C/C++ code.
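
The shape of this cost summary can be sketched as simple
arithmetic.  In the Python below, the function name, the
person-day units, and every figure except the 1% notation
rate are invented purely for illustration; they are not
measurements of any real project.

```python
def total_cost(system_cost: float, code_bases: list[float],
               notation_percent: float = 1.0) -> float:
    """Total documentation effort, in (hypothetical) person-days.

    system_cost      one-time cost of building the documentation system
    code_bases       programming effort spent on each code base
    notation_percent share of that effort spent on explicit notations
    """
    notation_cost = sum(e * notation_percent / 100 for e in code_bases)
    return system_cost + notation_cost


# Made-up figures: a 200-day documentation system, applied first to
# one code base and then to three.
one = total_cost(200, [1000])
three = total_cost(200, [1000, 5000, 2000])

print(one, three)
print(one / 1, three / 3)  # average cost per documented code base
```

The one-time system cost dominates at first, but its share
per code base shrinks as more code bases are fed through
the same documentation system.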

Of course, there are also some other costs.  If the doc.
system provides a mechanism for feedback, some work may
be needed to read and respond to comments, etc.  Folks
may come up with new kinds of entities and relationships
to track, or more devious ways to harvest information.
In short, a minor maintenance burden should be assumed.

--            Rich Morin     +1 650-873-7841

Technical editing and writing, programming, and web development