Subject: RE: business case for mechanized documentation
From: Rich Morin <rdm@cfcl.com>
Date: Wed, 12 Apr 2006 13:54:04 -0800

At 12:26 PM -0600 4/12/06, Anderson, Kelly wrote:
>> Similar notions apply to refactoring.  Indeed, if you move all of
>> your comments into your method names, a documentation generator will
>> have lots of material to work with.  And, given that Agile Languages
>> tend to be big on introspection, it may be possible to harvest useful
>> information from running programs.  Of course, the usual caveats
>> about code coverage metrics apply...
>
> It's possible, but why would you need to if the code were readable?
> (Kind of a rhetorical question)

Granted, but I'll take it seriously.

I should probably begin by saying that I'm not a big fan of
ExtremelyLongNamesThatTryToExplainEverything.  They are hard
to remember and hard to type, and they make code formatting
awkward.  That said, global names need to be unique and should
be more meaningful than local loop counters (e.g., $i).


Generally speaking, there are two kinds of comments: header
comments (e.g., at the start of a file, before a function)
and body comments (e.g., tacked onto the ends of code lines,
interspersed with lines of code).

File header comments should give information on the origin,
purpose, and organization of the rest of the file.  If the
file is large, they might include a "table of contents" which
lists the functions and one-line summaries.

Function header comments serve similar purposes.  They should
include usage information, expected return values, and notes
on any unusual aspects (e.g., algorithms, constraints, data
structures) that may be involved.

Body comments indicate the purpose of the upcoming code and
highlight unusual aspects of it.  I also like to use these to
give examples of the input data format, etc.
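
As a rough illustration of the three kinds of comments described
above, here is a small Ruby sketch.  The file name, function names,
and log format are all invented for the example; only the comment
layout is the point.

```ruby
# log_summary.rb - hypothetical example file; all names are invented.
#
# Purpose:  Count the requests per status code in a web server log.
# Contents:
#   parse_log_line   - split one log line into a Hash of fields
#   count_by_status  - tally parsed lines by HTTP status code

# parse_log_line(line) -> Hash or nil
#
# Usage:    fields = parse_log_line('GET /index.html 200')
# Returns:  { method:, path:, status: } on success, nil if the line
#           does not match the expected format.
# Notes:    status is returned as an Integer, not a String.
def parse_log_line(line)
  # Input lines look like this (method, path, status):
  #   GET /index.html 200
  match = /\A(\S+)\s+(\S+)\s+(\d{3})\z/.match(line.strip)
  return nil unless match

  { method: match[1], path: match[2], status: match[3].to_i }
end

# count_by_status(lines) -> Hash
#
# Usage:    counts = count_by_status(File.readlines('access.log'))
# Returns:  a Hash mapping status code to the number of lines seen.
# Notes:    malformed lines are skipped silently.
def count_by_status(lines)
  counts = Hash.new(0)
  lines.each do |line|
    fields = parse_log_line(line)
    next if fields.nil?        # skip lines we could not parse
    counts[fields[:status]] += 1
  end
  counts
end
```

Note that the example input line in the body comment tells the
reader something the regular expression alone does not make obvious.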


None of these purposes can be met particularly well by adding
more functions, with or without ExtremelyLongNames.  So, moving
comments into code is insufficient, in general.

<aside>
There's a legend about Bill Joy, to the effect that:

  *  He would write down a set of comments, describing
     what the code needed to do.

  *  He would then replace the comments with code.

  *  When all the comments were gone, he was done.
</aside>


Even the clearest code and the most apposite and revealing
comments are limited, however, to what is visible at the point
where they are written.  For example, a function may list the
data structures and functions that it uses, but those data
structures and functions contain no indication of which
functions use _them_.

Documentation generators harvest this sort of information by
examining the entire collection of code.  They may also look
into other resources (e.g., include files, libraries, man
pages) to find out where external items are defined.
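
To make the "which functions use _them_" idea concrete, here is a
deliberately crude Ruby sketch of a "used by" index.  It assumes
that methods start with "def name" and finish with "end", both at
column 0, and it matches names with regular expressions; a real
documentation generator would use a proper parser.  The names
used_by_index and SAMPLE are invented for the example.

```ruby
# Crude cross-reference sketch: build a "used by" index from Ruby source.
# Assumption: each method starts with "def name" and finishes with "end",
# both at column 0.  This is an illustration, not a real parser.
def used_by_index(source)
  bodies = {}            # method name => text of its body
  current = nil
  source.each_line do |line|
    if (m = /\Adef\s+(\w+)/.match(line))
      current = m[1]
      bodies[current] = String.new
    elsif line.start_with?('end')
      current = nil
    elsif current
      bodies[current] << line
    end
  end

  # Invert the relationship: for each method, list the methods whose
  # bodies mention it by name.
  index = Hash.new { |hash, key| hash[key] = [] }
  bodies.each do |caller_name, body|
    bodies.each_key do |callee|
      next if callee == caller_name
      index[callee] << caller_name if body =~ /\b#{Regexp.escape(callee)}\b/
    end
  end
  index
end

SAMPLE = <<~RUBY
  def load_config
    read_file('config.yml')
  end

  def read_file(path)
    File.read(path)
  end

  def main
    load_config
    read_file('data.txt')
  end
RUBY

used_by_index(SAMPLE).each do |callee, callers|
  puts "#{callee} is used by: #{callers.join(', ')}"
end
```

Nothing in read_file itself says that load_config and main depend
on it; that fact only emerges from looking at the whole collection.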

In some cases, however, even these techniques won't be able
to find out where things are defined.  For example, Rails
takes extensive advantage of Ruby's method_missing facility:

  http://www.rubycentral.com/ref/ref_c_object.html#method_missing

It parses the name of the missing method, then provides the
needed behavior dynamically.  In these cases, examination of
the code won't indicate where the method is defined, because
no definition with that name exists.  So, some other way should
be found to lead the programmer to this information.
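
Here is a minimal sketch of the technique, using a hypothetical
Roster class.  It is loosely modeled on the dynamic-finder idea
and is not how Rails itself is implemented.  It also assumes a
reasonably recent Ruby (respond_to_missing? and match? arrived
well after this style became popular).  The dynamic_methods hook
is just one invented way of leaving a trail for programmers and
tools.

```ruby
# Hypothetical class; not Rails' actual implementation.
class Roster
  # Pattern for the method names that are handled dynamically.
  FINDER = /\Afind_by_(\w+)\z/

  def initialize(rows)
    @rows = rows          # Array of Hashes, e.g. { name: 'Ann', city: 'Oslo' }
  end

  # Called by Ruby when no method with the given name is defined.
  # We parse the name and supply the behavior on the fly.
  def method_missing(name, *args, &block)
    match = FINDER.match(name.to_s)
    return super unless match   # unknown names still raise NoMethodError

    field = match[1].to_sym
    @rows.find { |row| row[field] == args.first }
  end

  # Keeps respond_to? consistent with what method_missing handles.
  def respond_to_missing?(name, include_private = false)
    FINDER.match?(name.to_s) || super
  end

  # One possible way to lead a programmer (or a documentation tool)
  # to the dynamic behavior: describe it somewhere that can be queried.
  def self.dynamic_methods
    { 'find_by_<field>(value)' =>
        'first row whose <field> equals value (see Roster#method_missing)' }
  end
end

roster = Roster.new([{ name: 'Ann', city: 'Oslo' },
                     { name: 'Bob', city: 'Rome' }])
p roster.find_by_city('Rome')
p roster.respond_to?(:find_by_name)
p Roster.dynamic_methods
```

A search of the source for "def find_by_city" finds nothing, which
is exactly the problem; the dynamic_methods table gives a generator
something to harvest instead.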


In summary:

  Comments do things that code should not be expected to do.

  Documentation generators and similar tools can harvest
  information from both code and (properly formatted) comments,
  making it available in a variety of useful presentation formats.

-r
-- 
http://www.cfcl.com/rdm            Rich Morin
http://www.cfcl.com/rdm/resume     rdm@cfcl.com
http://www.cfcl.com/rdm/weblog     +1 650-873-7841

Technical editing and writing, programming, and web development