Subject: Re: Estimating investments in Linux and other libre software
From: Ian Lance Taylor <>
Date: 3 Nov 1998 10:26:20 -0500

   Date: Mon, 02 Nov 1998 16:29:58 -0500
   From: (Frank Hecker)

   Mike estimated
   (based on the amount of code added to the kernel in a year, divided by a
   nominal 10 lines of completed code per day per developer) that the
   amount of programmer time put into Linux kernel development was
   equivalent to approximately 120 full-time developers....

I'm not comfortable with this sort of calculation.

For whatever reason, computer programming is a discipline with very
wide variation in productivity between good programmers and mediocre
programmers.  I believe I've seen estimates of a ratio of 30 to 1 in
amount of code produced.

My experience with free software is that most of the unpaid work is
done by highly productive programmers.  After all, they're the ones
with the skill to contribute on a part-time basis.

Therefore, I think that in any effort to compare programmer time
between free software projects and funded software projects, you have
to consider that the odds are that the programmers on the free
software projects are significantly more productive.  That makes me
feel that any comparison based on the amount of code developed in a
particular period of time, such as the one above, which uses the amount
of code added to the kernel in one year, runs a real risk of comparing
quantities which sound similar but are really incommensurable.
The ``nominal 10 lines of completed code per day per developer'' may
simply have nothing to do with the actual code and the actual
developers in question.
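
To make the sensitivity concrete, here is a minimal sketch in Python.
The 250 working days per year and the productivity multipliers are
assumptions chosen purely for illustration.  The lines-per-year figure
is back-computed from the quoted 120-developer estimate and is not a
measured number.

```python
WORKING_DAYS_PER_YEAR = 250   # assumption; not stated in the original estimate
NOMINAL_LOC_PER_DAY = 10      # the quoted nominal rate per developer


def equivalent_developers(loc_per_year, loc_per_day,
                          working_days=WORKING_DAYS_PER_YEAR):
    """Full-time developers needed to write loc_per_year at loc_per_day."""
    return loc_per_year / (loc_per_day * working_days)


def headcount_by_multiplier(loc_per_year, multipliers):
    """Recompute the headcount for several assumed productivity multipliers."""
    return {m: equivalent_developers(loc_per_year, NOMINAL_LOC_PER_DAY * m)
            for m in multipliers}


# Back-compute the yearly code volume implied by "120 full-time developers".
IMPLIED_LOC_PER_YEAR = 120 * NOMINAL_LOC_PER_DAY * WORKING_DAYS_PER_YEAR

if __name__ == "__main__":
    # The multipliers are hypothetical; 30 echoes the 30-to-1 spread
    # mentioned earlier, used here only as an upper bound.
    for m, devs in headcount_by_multiplier(IMPLIED_LOC_PER_YEAR,
                                           (1, 3, 10, 30)).items():
        print(f"{m:>2}x nominal rate -> {devs:6.1f} equivalent developers")
```

Under these assumptions the same year of code corresponds to anything
from 120 developers down to 4, depending only on which productivity
figure one picks.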

To put it another way, while one can reasonably try to make guesses
like ``equivalent to approximately 120 full-time developers,'' to go
beyond that into such things as speculation about resources invested
in Linux vs. NT with implications for viability, or into
considerations such as the size of the free software programming
talent pool, seems meaningless.  For better or worse, programmers are
not commodities.  Free software development driven by volunteers is
not the same as funded software development driven by paid employees.

   One aspect of being more sophisticated is having better metrics about
   actual and potential developer resources; then you can begin to think about
   implementing formal strategies for recruitment and retention and being
   able to evaluate their relative success or failure.  This is analogous
   to what more sophisticated non-profit organizations do in terms of
   measuring the success of direct-mail campaigns or volunteer
   initiatives.  (In fact I would contend that the analogy is actually
   fairly exact, given that in both cases you're leveraging unpaid
   volunteer resources and have to deal with all the issues that entails:
   managing the role of volunteers vs. that of in-house paid staff, paying
   attention to the ideals that motivate volunteers, etc.)

In principle I agree that this sort of thing would be a good idea.  In
practice I think the free software development pool is relatively
small and relatively idiosyncratic compared with the pool of people
who contribute to non-profit organizations.

For example, the GNU autoconf package languished for a couple of years
until it was recently picked up by Ben Elliston.  This was not due to
a lack of available free programming talent, nor to a feeling that it
was unimportant.  If it had been a charity, I'm sure people would have
contributed.  However, as a programming project, it had to wait until
one individual had the time, interest, ability, and opportunity.

My point is that a theoretical argument about potential developer
resources can easily founder on the reality of the actual set of
people interested and available to do the work.  That doesn't make it
a bad idea, of course.  But I think it would be inadvisable for an FSB
to make plans on this basis.

   This would be a great area for someone in academia to do a study and
   come up with some publishable data on past and present investments in
   libre software both in toto and by project.  Most if not all the
   information you'd need for such a study is publicly available; for
   example, you could use old tarballs or CVS snapshots to do source line
   counts and counts of the number of unique contributors for particular
   projects over time.
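
A rough sketch of the contributor-count half of such a study, in
Python, assuming the project keeps a GNU-style ChangeLog whose entry
headers carry a date, a name, and an address.  The two header patterns
below are my approximation of the common layouts, and the sample names,
addresses, and entries are hypothetical.

```python
import re
from collections import defaultdict

# Approximate header layouts (assumptions, not a full ChangeLog parser):
#   1998-11-03  A. Hacker  <hacker@example.org>
#   Tue Dec  2 09:15:00 1997  A. Hacker  <hacker@example.org>
HEADER_NEW = re.compile(r"^(\d{4})-\d{2}-\d{2}\s+(.+?)\s+[<(]")
HEADER_OLD = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} (\d{4})\s+(.+?)\s+[<(]"
)


def contributors_per_year(changelog_text):
    """Count distinct names appearing in entry headers, grouped by year."""
    names_by_year = defaultdict(set)
    for line in changelog_text.splitlines():
        match = HEADER_NEW.match(line) or HEADER_OLD.match(line)
        if match:
            names_by_year[int(match.group(1))].add(match.group(2).strip())
    return {year: len(names) for year, names in sorted(names_by_year.items())}


# Hypothetical sample data, for illustration only.
SAMPLE = """\
1998-11-03  A. Hacker  <hacker@example.org>

\t* configure.in: Hypothetical entry.

1998-10-20  B. Coder  <coder@example.org>

\t* Makefile.in: Hypothetical entry.

Tue Dec  2 09:15:00 1997  A. Hacker  <hacker@example.org>

\t* acgeneral.m4: Hypothetical entry.
"""

if __name__ == "__main__":
    print(contributors_per_year(SAMPLE))
```

A count like this is only a rough approximation: the same person may
appear under several spellings, and the name in a header is not always
the person who wrote the change.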

I would certainly find this interesting.  For the GNU project, you
could get a rough approximation simply by examining the ChangeLog